Commit message | Author | Age | Files | Lines
It was possible for bucket ownership to change between the condition-check
and operation-write phases in a way that was not implicitly detected by the
"has the replica set changed?" test. This could trigger an assertion failure
caused by a daisy-chain of events:
1. The distributor tries to update the bucket DB with a bucket that it no longer
owns. This is a no-op, as the DB insertion silently drops updates to
non-owned buckets; this avoids reinserting stale state when replies arrive
for operations started on the old side of a state-transition edge.
2. The distributor reads back the bucket it previously tried to write, with the
code assuming it must have a valid, present state (it just wrote it, after
all).
3. The distributor sets a flag in the read state and tries to write it back.
4. An invariant violation is detected when the invalid state is about to be
written, and the process aborts.
This commit adds explicit checking of ownership in the current and, if present,
pending cluster states iff the cluster state version has changed between phases.
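The added safety check could be sketched roughly as follows. This is a minimal illustration of the logic described above; `ClusterState`, `owns_bucket`, and `write_is_still_safe` are hypothetical stand-ins, not Vespa's actual distribution API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the real cluster state / distribution logic.
struct ClusterState {
    uint32_t version;
    bool     owns_bucket; // would be computed from the distribution config
};

// Re-verify ownership only when the cluster state version changed between
// the condition-check phase and the write phase; otherwise the implicit
// "has the replica set changed?" test is assumed to be sufficient.
bool write_is_still_safe(uint32_t version_at_check,
                         const ClusterState& current,
                         const std::optional<ClusterState>& pending)
{
    if (current.version == version_at_check) {
        return true; // no state transition between phases
    }
    if (!current.owns_bucket) {
        return false; // the bucket DB would silently drop this write
    }
    // If a state transition is in flight, ownership must also hold
    // in the pending state.
    return !pending || pending->owns_bucket;
}
```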
Follows the same pattern as that used for sub-operation metrics for
write-repair during Update processing.
Pretty much a mirror image of the logic in `PutOperation`.
|
|\
| |
| |
| |
| | |
vespa-engine/havardpe/write-repair-for-conditional-remove
Added write-repair logic to RemoveOperation
`CheckCondition::Outcome` now exposes the resulting trace (if any)
of the operations that were sent as part of the condition probe.
The outcome-accessor is changed to return a non-const reference to
allow for moving away the trace payload and into a higher-level
reply.
It turns out GetOperation did not aggregate reply traces as expected,
and PersistenceMessageTrackerImpl did not transfer trace state upon
failure (only in the success case). This makes the commit bigger than
initially expected, but trace coverage should now be improved across
the board, not just for conditional Put operations.
vespa-engine/vekterli/default-enable-fast-update-restart-when-consistent
Default-enable fast path updates when document versions are consistent
This feature is always _implicitly_ enabled when 3-phase updates are
enabled (which is the case by default). Flip the config default to
true as well to be more in line with what's actually the case in
production.
vespa-engine/vekterli/add-and-wire-condition-probe-config
Add and wire in condition probing configuration
vespa-engine/vekterli/write-repair-for-conditional-puts
Support write-repair for conditional Put operations
This adds a generic `CheckCondition` utility class that can be
explicitly invoked by operations requiring a distinct condition
evaluation phase. For now, this is only used by `PutOperation`.
Distributed condition checking is only used _iff_ the target
bucket's replica set is inconsistent and all involved content
nodes support condition probing.
Write-repair is currently disabled by default, as no config that
enables it has yet been wired in (coming as part of a follow-up).
Also currently missing are proper metric wiring and MBus reply
trace propagation for condition-checking sub-operations.
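The gating described above boils down to a simple conjunction, which could be sketched as follows. All three inputs and the function name are assumptions about how the real distributor code is structured, not its actual API.

```cpp
// Whether a distributed condition-check phase should run for a conditional
// operation: probing is pointless when the replicas already agree, and
// impossible when some content node does not support the probe protocol.
bool should_run_condition_probe(bool write_repair_enabled,
                                bool replica_set_consistent,
                                bool all_nodes_support_probing)
{
    return write_repair_enabled
        && !replica_set_consistent
        && all_nodes_support_probing;
}
```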
from unique pointer to shared pointer.
When a storage API command is created internally on a node, it is
always assigned a strictly increasing message ID that is guaranteed
to be unique within the process. Some parts of the code use this
as a way to distinguish messages from one another. However, uniqueness
(prior to this commit) did not necessarily hold, as the underlying
wire protocol would inherit message IDs _from other nodes_ and override
the locally generated ID with them, i.e. uniqueness no longer held.
This had exciting consequences when the stars aligned and a remote
node sent the same ID as one generated at the same time internally
on the receiver node. Luckily, in practice this would only be used
in a potentially ambiguous context when sanity-checking shared read
lock sets for the _same bucket_ in the persistence threads. Invariant
checks would detect this as an attempted duplicate lock acquisition
and abort the process. This has been latent for many, many years,
but we've seen it happen exactly once.
This commit introduces an explicit domain separation between the
node-internal (locally unique) IDs and the ID used by the originator.
The originator ID is maintained and returned over the wire to the
caller when sending a response to the incoming request.
Curiously, we don't actually need this originator ID at all, since
the caller maintains explicit state containing the sent command.
Unfortunately we can't simply remove it, since versions prior to
this commit will still use whatever's on the wire.
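The domain separation could be sketched along these lines: a process-locally unique internal ID that the wire protocol can never overwrite, plus a separate originator ID that is only echoed back in responses. Names and shapes here are illustrative, not the actual storage API.

```cpp
#include <atomic>
#include <cstdint>

// Hedged sketch: internal ID vs. originator ID as two distinct domains.
class StorageCommand {
public:
    explicit StorageCommand(uint64_t originator_msg_id)
        : _internal_id(next_internal_id()),
          _originator_id(originator_msg_id) {}

    uint64_t internal_id()   const { return _internal_id; }   // unique in-process
    uint64_t originator_id() const { return _originator_id; } // echoed over the wire

private:
    static uint64_t next_internal_id() {
        static std::atomic<uint64_t> counter{0};
        return ++counter; // strictly increasing within this process only
    }
    uint64_t _internal_id;
    uint64_t _originator_id;
};
```

Even if two remote nodes send the same message ID, the locally generated `internal_id()` used for lock bookkeeping remains unique.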
Improve error message
vespa-engine/toregge/remove-searchresult-and-documentsummary-documentapi-messages
Remove (SearchResult|DocumentSummary)(Command|Reply) storage and docu…
messages.
Minor cleanups in Put/Remove operation code
No changes in semantics, but removes an unneeded deep copy of the
current cluster state for every processed Put.
a range
Add a new `matchers` directory in vespalib which can be used as a repository for
generic, reusable GMock matchers.
Move distributor tests from using an explicit gtest runner to using
`GTest::gmock_main`, which serves the same purpose. We need to depend on a
gmock target (not just gtest) to be able to link in the components required
by matchers.
vespa-engine/vekterli/add-condition-match-metadata-aggregation-for-newest-replica
Add condition support to distributor `GetOperation`
This involves two things:
* Propagate input condition to sent Get requests when present
* Add condition match status to newest replica metadata aggregation
Lets the "test" part of a test-and-set condition be evaluated
locally on individual content nodes. Piggybacks on top of metadata-only
Get operations, adding a new condition field to the request and a
boolean match result to the response.
Decouples the existing TaS utility code from being command-oriented,
allowing it to be used in other contexts as well.
Not yet wired through any protocols.
This is a pragmatic workaround for our current optimistic signalling
mechanisms seemingly being susceptible to lost wakeups under certain
conditions. We should redesign how persistence queue signalling works
on a more fundamental level to avoid this scenario altogether, but
for now this will at least remove the possibility that a thread may
be stalled if the persistence queues are completely quiescent over
longer periods of time.
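One common shape of such a workaround is to replace an indefinite condition-variable wait with a bounded one that re-checks the queue predicate, so a lost wakeup can delay a thread but never stall it permanently. The sketch below is this generic pattern under that assumption; it is not the actual persistence queue implementation.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Generic bounded-wait queue: a missing notification costs at most
// `max_wait` before the predicate is re-evaluated.
template <typename T>
class PollingQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> guard(_mutex);
            _queue.push_back(std::move(item));
        }
        _cond.notify_one();
    }

    // Returns true and fills `out` if an item is available within `max_wait`.
    bool pop(T& out, std::chrono::milliseconds max_wait) {
        std::unique_lock<std::mutex> lock(_mutex);
        _cond.wait_for(lock, max_wait, [this] { return !_queue.empty(); });
        if (_queue.empty()) {
            return false; // timed out; caller loops and re-checks
        }
        out = std::move(_queue.front());
        _queue.pop_front();
        return true;
    }

private:
    std::mutex              _mutex;
    std::condition_variable _cond;
    std::deque<T>           _queue;
};
```

A worker thread calling `pop` in a loop wakes up periodically even if every `notify_one` were lost, which is exactly the stall-avoidance property the commit describes.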
Flushing the distributor stripes happens as part of process shutdown
and therefore takes place in the main thread. At this point we should
no longer receive any messages from the RPC/messaging subsystems, but
there may still be lingering replies to be purged. By then the stripe
threads have been joined, but the communication manager thread has not
(that happens slightly later in the shutdown sequence).
We therefore have to explicitly form a mutex acquire/release pair to
ensure proper memory visibility between anything touched by the comm.
manager thread (within the mutex) and our main thread.
This should resolve a ThreadSanitizer error.
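The acquire/release pairing works because unlocking a mutex in one thread synchronizes-with a later lock of the same mutex in another, carrying all prior writes along. A minimal sketch of the idea, with assumed names rather than the actual shutdown code:

```cpp
#include <mutex>
#include <thread>

std::mutex reply_mutex;        // assumed: same mutex the comm. manager locks
int        purged_replies = 0; // stand-in for state that thread writes

// Called from the main thread during shutdown. The critical section is
// intentionally empty: acquiring the lock synchronizes-with the other
// thread's release, making its prior writes visible to this thread.
void fence_against_comm_manager() {
    std::lock_guard<std::mutex> guard(reply_mutex);
}
```

An empty critical section like this is a legitimate memory-visibility fence precisely because of the happens-before edge the lock/unlock pair establishes.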
This covers both the entry points from the `storagenode` and
`searchnode` HTTP servers, though the former is mostly in the
name of legacy support.
Ideally, capability checking would exist as a property of the
HTTP server (Portal) bindings, but the abstractions for the
JSON request handling are sufficiently leaky that it ended up
making more sense to push things further down the hierarchy.
It's always a good thing to move away from using strings with
implicit semantics as return types anyway.
The `searchnode` state API handler mapping supports fine-grained
capabilities. The legacy `storagenode` state API forwarding does
not; it uses a sledgehammer that expects the union of all possible
API capability requirements.
vespa-engine/geirst/less-document-without-type-repo
Reduce creation of Document instances without DocumentTypeRepo.
The DB API was rather coy about whether `forEach` had lower- or
upper-bound semantics with regard to the bucket ID passed in as
a starting point. Be explicit and add a lower-bound variant.
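The distinction the commit makes explicit is whether iteration starts *at* the given key (lower bound, inclusive) or strictly *after* it (upper bound, exclusive). An illustrative contrast, using a `std::map` as a stand-in for the bucket DB; the function names are assumptions, not the real API:

```cpp
#include <cstdint>
#include <functional>
#include <map>

using BucketId = uint64_t;
using BucketDB = std::map<BucketId, int>; // hypothetical bucket DB stand-in

// Visits every bucket from `start` inclusive (lower-bound semantics).
void for_each_lower_bound(const BucketDB& db, BucketId start,
                          const std::function<void(BucketId)>& fn) {
    for (auto it = db.lower_bound(start); it != db.end(); ++it) {
        fn(it->first);
    }
}

// Visits every bucket strictly after `start` (upper-bound semantics).
void for_each_upper_bound(const BucketDB& db, BucketId start,
                          const std::function<void(BucketId)>& fn) {
    for (auto it = db.upper_bound(start); it != db.end(); ++it) {
        fn(it->first);
    }
}
```

Being explicit about which variant a caller gets matters most when `start` itself exists in the DB: only the lower-bound variant will visit it.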
Plus some additional minor cleanup.
Also get rid of some cleanup functions on reference-counted classes.
Enable specifying low-level parameters to addref/subref (cnt/reserve).