| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
Remnants of the "file per bucket on spinning disks" days and no
longer used for anything.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Checking just the current state's version does not work as expected
if the distributor is currently _within_ a cluster state transition.
In this case, the pending state's version must be used instead.
Failure to do so will make the code erroneously believe the cluster
state has not changed since the start of the operation.
|
| |
|
|\
| |
| |
| |
| | |
vespa-engine/vekterli/do-not-block-comm-mgr-thread-when-aborting-ops
Avoid blocking CommunicationManager thread during cluster state transitions
|
| | |
|
| |
| |
| |
| | |
Also clarify/update some comments.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Incoming cluster state versions are not applied locally on a content
node until all potentially conflicting operations running in the persistence
threads have completed and all potentially conflicting operations in
the persistence queues have been aborted.
This can take a relatively long time when running LID space compactions
etc via the persistence threads, and we'd risk blocking the main
CommunicationManager thread (which handles all cluster controller
communication) for prolonged periods of time.
Move state blocking and internal state propagation to a dedicated
task executor. The executor only has 1 thread, effectively turning
it into an asynchronous FIFO executor.
|
|/ |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was possible for bucket ownership to change between the condition check
and operation write phases in a way that was not implicitly detected by the
"has the replica set changed?"-test. This could trigger an assertion failure
caused by a daisy-chain of events:
1. Distributor tries to update the bucket DB with a bucket that it no longer
owns. This is a no-op, as the DB insertion silently drops updates to
non-owned buckets. This is to avoid reinsertion of stale state when replies
arrive to operations started on the old side of a state transition edge.
2. Distributor reads back the bucket it previously tried to write, with the
code assuming it must have a valid, present state (it just wrote it, after
all).
3. The distributor sets a flag in the read state and tries to write it back.
4. An invariant violation is detected when invalid state is attempted written
and the process aborts.
This commit adds explicit checking of ownership in the current and, if present,
pending cluster states iff the cluster state version has changed between phases.
|
|
|
|
|
| |
Follows the same pattern as that used for sub-operation metrics for
write-repair during Update processing.
|
|
|
|
| |
Pretty much a mirror image of the logic in `PutOperation`.
|
|\
| |
| |
| |
| | |
vespa-engine/havardpe/write-repair-for-conditional-remove
added write repair logic to RemoveOperation
|
| | |
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
| |
`CheckCondition::Outcome` now exposes the resulting trace (if any)
of the operations that were sent as part of the condition probe.
The outcome-accessor is changed to return a non-const reference to
allow for moving away the trace payload and into a higher-level
reply.
Turns out GetOperation did not aggregate reply traces as expected,
and PersistenceMessageTrackerImpl did not transfer trace state upon
failures (only success case). This makes this commit bigger than
initially expected, but trace coverage should now be improved on a
general basis, not just for conditional Put operations.
|
| |
|
|\
| |
| |
| |
| | |
vespa-engine/vekterli/default-enable-fast-update-restart-when-consistent
Default-enable fast path updates when document versions are consistent
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
This feature is always _implicitly_ enabled when 3-phase updates are
enabled (which is the case by default). Flip the config default to
true as well to be more in line with what's actually the case in
production.
|
|/ |
|
|\
| |
| |
| |
| | |
vespa-engine/vekterli/add-and-wire-condition-probe-config
Add and wire in condition probing configuration
|
| | |
|
|/ |
|
|\
| |
| |
| |
| | |
vespa-engine/vekterli/write-repair-for-conditional-puts
Support write-repair for conditional Put operations
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This adds a generic `CheckCondition` utility class that can be
explicitly invoked by operations requiring a distinct condition
evaluation phase. For now, this is only used by `PutOperation`.
Distributed condition checking is only used _iff_ the target
bucket's replica set is inconsistent and all involved content
nodes support condition probing.
Write-repair is currently disabled by default, as there does not
yet exist any config wired in that enables it (coming as part of
a follow-up).
Also currently missing is proper metric wiring and MBus reply
trace propagation for condition-checking sub-operations.
|
|/
|
|
| |
from unique pointer to shared pointer.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a storage API command is created internally on a node it is
always assigned a strictly increasing message ID that is guaranteed
to be unique within the process. Some parts of the code use this
as a way to distinguish messages from another. However, uniqueness
(prior to this commit) did not necessarily hold, as the underlying
wire protocol would inherit message IDs _from other nodes_ and override
the generated ID with this. I.e. uniqueness no longer holds.
This had exciting consequences when the stars aligned and a remote
node sent the same ID as one generated at the same time internally
on the receiver node. Luckily, in practice this would only be used
in a potentially ambiguous context when sanity checking shared read
lock sets for the _same bucket_ in the persistence threads. Invariant
checks would detect this is as an attempted duplicate lock acquisition
and abort the process. This has been latent for many, many years,
but we've seen it happen exactly once.
This commit introduces an explicit domain separation between the
node-internal (locally unique) IDs and the ID used by the originator.
The originator ID is maintained and returned over the wire to the
caller when sending a response to the incoming request.
Curiously, we don't actually need this originator ID at all since
the caller maintains explicit state containing the sender command.
Unfortunately we can't simply remove it, since versions prior to
this commit will still use whatever's on the wire.
|
| |
|
|\
| |
| | |
Improve error message
|
| | |
|
|\ \
| | |
| | |
| | |
| | | |
vespa-engine/toregge/remove-searchresult-and-documentsummary-documentapi-messages
Remove (SearchResult|DocumentSummary)(Command|Reply) storage and docu…
|
| | |
| | |
| | |
| | | |
messages.
|
|\ \ \
| | | |
| | | | |
Minor cleanups in Put/Remove operation code
|
| |/ /
| | |
| | |
| | |
| | | |
No changes in semantics, but removes an unneeded deep copy of the
current cluster state for every processed Put.
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
a range
Add a new `matchers` directory in vespalib which can be used as a repository for
generic, reusable GMock matchers.
Move distributor tests from using an explicit gtest runner to using `GTest::gmock_main`
which serves the same purpose. Need to depend on a gmock target (not just gtest) to
be able to link with stuff required for matchers.
|
|\ \
| | |
| | |
| | |
| | | |
vespa-engine/vekterli/add-condition-match-metadata-aggregation-for-newest-replica
Add condition support to distributor `GetOperation`
|
| | |
| | |
| | |
| | |
| | |
| | | |
This involves two things:
* Propagate input condition to sent Get requests when present
* Add condition match status to newest replica metadata aggregation
|
|/ / |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Lets the "test" part of a test-and-set condition be evaluated
locally on individual content nodes. Piggybacks on top of metadata-only
Get operations, adding a new condition field to the request and a
boolean match result to the response.
Decouples the existing TaS utility code from being command-oriented,
allowing it to be used in other contexts as well.
Not yet wired through any protocols.
|