| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids a potential starvation issue caused by the existing
implementation, which is bucket ID ordered within a given priority
class. That ordering has the unfortunate effect that frequently reinserting
buckets which sort before buckets already in the queue may starve the
latter, preventing them from ever being popped.
Move to a composite key that first sorts on priority, then on a strictly
increasing sequence number. Add a secondary index into this structure
that allows for lookups on bucket IDs as before.
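A minimal sketch of the idea, using illustrative names rather than the actual Vespa scheduler types: entries are ordered on (priority, sequence number), and a secondary index keyed on bucket ID supports the existing lookups:

```cpp
#include <cstdint>
#include <map>

// Illustrative sketch only; the actual Vespa scheduler types differ.
struct BucketId { uint64_t raw = 0; };

struct QueueKey {
    uint8_t  priority = 0;   // lower value is popped first
    uint64_t sequence = 0;   // strictly increasing insertion counter
    bool operator<(const QueueKey& rhs) const {
        if (priority != rhs.priority) return priority < rhs.priority;
        return sequence < rhs.sequence;
    }
};

class PendingBucketQueue {
    std::map<QueueKey, BucketId> _queue;         // defines pop order
    std::map<uint64_t, QueueKey> _bucket_index;  // secondary index: bucket ID -> key
    uint64_t _next_sequence = 0;
public:
    void insert(BucketId bucket, uint8_t priority) {
        auto existing = _bucket_index.find(bucket.raw);
        if (existing != _bucket_index.end()) {
            _queue.erase(existing->second);  // reinsertion replaces the old entry
        }
        QueueKey key{priority, _next_sequence++};
        _queue[key] = bucket;
        _bucket_index[bucket.raw] = key;
    }
    bool contains(BucketId bucket) const {
        return _bucket_index.find(bucket.raw) != _bucket_index.end();
    }
    BucketId pop() {  // highest priority first, FIFO within a priority class
        auto it = _queue.begin();
        BucketId bucket = it->second;
        _bucket_index.erase(bucket.raw);
        _queue.erase(it);
        return bucket;
    }
};
```

Because a reinserted bucket always gets a new, higher sequence number, it lands behind existing entries of the same priority, which is what removes the starvation.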
|
|
|
|
|
|
|
| |
This is to help debug a very rare edge case where the theory is that
an update operation may race with the implicit removal of said document
by asynchronous GC. By dumping the pending GC state for the bucket+node
we can get a good indication of whether this theory holds.
|
|
|
|
| |
[run-systemtest]"
|
|\
| |
| | |
GCC 9 does not support std::ranges::all_of.
|
| | |
|
|/ |
|
| |
|
| |
|
|\
| |
| |
| |
| | |
vespa-engine/vekterli/treat-empty-replica-subset-as-inconsistent-for-get-operations
Treat empty replica subset as inconsistent for GetOperation [run-systemtest]
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
`GetOperation` document-level consistency checks are used by the multi-phase
update logic to see if we can fall back to a fast path even though not all
replicas are in sync. Empty replicas are not considered part of the send-set,
so only looking at replies from replicas _sent_ to will not detect this case.
If we haphazardly treat empty replicas as implicitly being in sync we risk
triggering undetectable inconsistencies at the document level. This can
happen if we send create-if-missing updates to an empty replica as well as a
non-empty replica, and the document exists in the latter replica.
The document would then be implicitly created on the empty replica with the
same timestamp as that of the non-empty one, even though their contents would
almost certainly differ.
With this change we initially tag all `GetOperation`s with at least one empty
replica as having inconsistent replicas. This will trigger the full
write-repair code path for document updates.
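A rough sketch of the conservative check described above, using hypothetical types rather than the actual `GetOperation` code:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical replica summary; field names are illustrative only.
struct ReplicaInfo {
    uint32_t doc_count = 0;
    uint32_t checksum  = 0;
    bool empty() const { return doc_count == 0; }
};

// Conservative check: an empty replica cannot be proven to be in sync with a
// non-empty one, so its mere presence marks the replica set as inconsistent
// and forces updates onto the write-repair path.
bool replicas_are_consistent(const std::vector<ReplicaInfo>& replicas) {
    for (const auto& replica : replicas) {
        if (replica.empty() || replica.checksum != replicas.front().checksum) {
            return false;
        }
    }
    return true;
}
```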
|
|/
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we'd implicitly assume a failed CreateBucket reply meant the
bucket replica was not created, but this does not hold in the general case.
A failure may just as well be due to connection failures etc. between the
distributor and the content node. To tell for sure, we now send an explicit
RequestBucketInfo to the node in the case of CreateBucket failures. If
it _was_ created, the replica will be reintroduced into the bucket DB.
We still implicitly delete the bucket replica from the DB to avoid
transiently routing client write load to a bucket that most likely does
not exist.
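A sketch of the new reply handling, with simplified stand-in types rather than the real distributor code:

```cpp
#include <cstdint>
#include <iostream>

// All types below are simplified stand-ins, not the actual storage API.
struct BucketId { uint64_t raw = 0; };

struct CreateBucketReply {
    BucketId bucket;
    uint16_t node = 0;
    bool     success = false;
};

struct Distributor {
    void remove_replica_from_db(BucketId bucket, uint16_t node) {
        std::cout << "drop replica of bucket " << bucket.raw << " on node " << node << "\n";
    }
    void send_request_bucket_info(uint16_t node, BucketId bucket) {
        std::cout << "request bucket info for " << bucket.raw << " from node " << node << "\n";
    }
    void handle_create_bucket_reply(const CreateBucketReply& reply) {
        if (reply.success) {
            return;  // replica exists and stays in the bucket DB
        }
        // A failure does not prove the bucket was never created (it may be a
        // transient connection error), so remove the replica to avoid routing
        // client writes to it, then ask the node what it actually has. If the
        // bucket does exist, the reply reintroduces the replica into the DB.
        remove_replica_from_db(reply.bucket, reply.node);
        send_request_bucket_info(reply.node, reply.bucket);
    }
};
```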
|
| |
|
|\
| |
| |
| |
| | |
vespa-engine/vekterli/decrement-merge-counter-when-sync-merge-handling-complete
Decrement persistence thread merge counter when synchronous processing is complete [run-systemtest]
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
complete
Add a generic interface for letting an operation know that the synchronous
parts of its processing in the persistence thread are complete. This allows
a potentially longer-running async operation to free up any limits that
were put in place while it was taking up synchronous thread resources.
Currently this is only used by merge-related operations (which may dispatch
many async ops). Since we have an upper bound on how many threads in a stripe
may be processing merge ops at the same time (to avoid blocking client ops),
we could previously stall the pipelining of merges by hitting the
concurrency limit even when all persistence threads were otherwise
idle (waiting for prior async merge ops to complete).
We now explicitly decrease the merge concurrency counter once the synchronous
processing is done, allowing us to take on further merges immediately.
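One way such an interface could look, as a sketch with hypothetical names:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names; the actual Vespa interface differs. An operation calls
// syncPhaseDone() once its synchronous persistence-thread work is finished,
// even if asynchronous follow-up work is still in flight.
struct SyncPhaseDoneListener {
    virtual ~SyncPhaseDoneListener() = default;
    virtual void syncPhaseDone() = 0;
};

// Per-stripe bookkeeping: the counter caps how many merges may occupy
// persistence threads synchronously at the same time.
class MergeConcurrencyLimiter : public SyncPhaseDoneListener {
    std::atomic<int> _active_sync_merges{0};
    const int        _max_concurrent;
public:
    explicit MergeConcurrencyLimiter(int max_concurrent) : _max_concurrent(max_concurrent) {}

    bool try_start_merge() {
        int current = _active_sync_merges.load();
        while (current < _max_concurrent) {
            if (_active_sync_merges.compare_exchange_weak(current, current + 1)) {
                return true;
            }
        }
        return false;  // at the limit; the merge has to wait
    }

    // Frees the slot as soon as the synchronous part is done, instead of when
    // the (possibly long-running) async part completes.
    void syncPhaseDone() override {
        int previous = _active_sync_merges.fetch_sub(1);
        assert(previous > 0);
        (void)previous;
    }
};
```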
|
|\ \
| |/
|/|
| |
| | |
vespa-engine/vekterli/cap-merge-delete-bucket-pri-to-default-feed-pri
Cap merge-induced DeleteBucket priority to that of default feed priority
|
| |
| |
| |
| |
| |
| |
| | |
This lets DeleteBucket operations be FIFO-ordered with client operations
that use the default feed priority (120).
Not doing this risks preempting feed ops with deletes, elevating latencies.
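The clamping itself is trivial; a sketch assuming that lower numeric priority values are more urgent, as in the storage API (the constant name is illustrative):

```cpp
#include <algorithm>
#include <cstdint>

// Sketch only; the constant name is illustrative. A lower numeric priority
// value means a more urgent operation.
constexpr uint8_t DEFAULT_FEED_PRIORITY = 120;

uint8_t cap_delete_bucket_priority(uint8_t inherited_priority) {
    // Never let a merge-induced DeleteBucket be more urgent than regular
    // client feed; at 120 it simply FIFOs with the feed operations.
    return std::max(inherited_priority, DEFAULT_FEED_PRIORITY);
}
```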
|
|\ \
| | |
| | |
| | |
| | | |
vespa-engine/arnej/config-class-should-not-be-public
Arnej/config class should not be public
|
| | | |
|
| |/
| |
| |
| |
| |
| |
| | |
* For C++ code this introduces a "document::config" namespace, which will
sometimes conflict with the global "config" namespace.
* Move all forward-declarations of the types DocumenttypesConfig and
DocumenttypesConfigBuilder to a common header file.
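A hypothetical sketch of what such a common forward-declaration header could look like; the actual file name and contents in the Vespa tree may differ:

```cpp
// Hypothetical shared forward-declaration header.
#pragma once

namespace document::config {
    class DocumenttypesConfig;
    class DocumenttypesConfigBuilder;
}

// Code that only needs the names includes this single header instead of
// repeating its own forward declarations (and risking a clash with the
// global ::config namespace).
```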
|
| | |
|
| | |
|
| | |
|
|/
|
|
|
|
|
|
|
|
| |
If a reply arrives for a preempted cluster state it will be ignored.
To avoid it being automatically sent further down the storage chain
we still have to treat it as handled. Otherwise a scary-looking but
benign "unhandled message" warning will be emitted in the Vespa log.
Also move an existing test to the correct test fixture.
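A self-contained sketch of the idea, with stand-in types rather than the actual storage link API:

```cpp
#include <cstdint>
#include <iostream>

// Stand-in types only; not the actual storage link API.
struct SetSystemStateReply { uint32_t state_version = 0; };

class StateManager {
    uint32_t _pending_state_version = 42;
public:
    // Returning true marks the reply as handled, so nothing further down the
    // storage chain logs an "unhandled message" warning, even though the
    // reply's contents are ignored.
    bool on_set_system_state_reply(const SetSystemStateReply& reply) {
        if (reply.state_version < _pending_state_version) {
            return true;  // preempted by a newer cluster state: swallow silently
        }
        std::cout << "applying cluster state version " << reply.state_version << "\n";
        return true;
    }
};
```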
|
|
|
|
|
|
|
|
| |
Since deletes are now async in the backend, make them FIFO-ordered with
client feed by default to avoid swamping the queues with deletes.
Also, explicitly inherit priority for bucket deletion triggered by
bucket merging. This was actually missing previously and meant such
deletes got the default, very low priority.
|
|\
| |
| | |
Reduce log level from error to warning for update inconsistency message
|
| |
| |
| |
| |
| |
| | |
`error` level is for the "node is on fire and falling down an elevator
shaft" type of messages, not for operation inconsistencies. Reduce
to `warning` accordingly.
|
|/ |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Only skip deactivating buckets if the entire _node_ is marked as being in
maintenance state, i.e. the node has maintenance state across all
bucket spaces provided in the bundle. Otherwise treat the state
transition as if the node goes down, deactivating all buckets.
Also ensure that the bucket deactivation logic above the SPI is
identical to that within Proton. This avoids bucket DBs getting
out of sync between the two.
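A simplified sketch of the check, with illustrative types standing in for the real cluster state bundle:

```cpp
#include <string>
#include <unordered_map>

// Illustrative types; the real cluster state bundle API differs.
enum class NodeState { Up, Down, Maintenance };

// Maps bucket space name -> this node's state in that space.
using ClusterStateBundle = std::unordered_map<std::string, NodeState>;

// Skip bucket deactivation only when the node is in maintenance in *every*
// bucket space of the bundle; otherwise treat the transition like the node
// going down and deactivate all buckets.
bool should_skip_deactivation(const ClusterStateBundle& bundle) {
    if (bundle.empty()) {
        return false;
    }
    for (const auto& space_and_state : bundle) {
        if (space_and_state.second != NodeState::Maintenance) {
            return false;
        }
    }
    return true;
}
```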
|
| |
|
| |
|
|
|
|
| |
[run-systemtest]"
|
|
|
|
|
|
|
|
|
|
|
| |
Only skip deactivating buckets if the entire _node_ is marked as being in
maintenance state, i.e. the node has maintenance state across all
bucket spaces provided in the bundle. Otherwise treat the state
transition as if the node goes down, deactivating all buckets.
Also ensure that the bucket deactivation logic above the SPI is
identical to that within Proton. This avoids bucket DBs getting
out of sync between the two.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
bucket before waiting for the replies.
Prepare RemoveResult to contain more replies.
|
| |
|
| |
|
|
|
|
|
|
| |
* Add `from_distributor()` utility function to `MergeBucketCommand`
* Simplify boolean expression by moving sub-expression to own statement
* Improve wording of config parameter
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Historically the MergeThrottler component has required a deterministic
forwarding of merges between nodes in strictly increasing distribution
key order. This is to avoid distributed deadlocks caused by ending up
with two or more nodes waiting for each other to release merge resources,
where releasing one depends on releasing the other. This works well,
but has the downside that there's an inherent pressure of merges towards
nodes with lower distribution keys. These often become a bottleneck.
This commit lifts this ordering restriction by allowing forwarded,
unordered merges to immediately enter the active merge window. By doing
this we remove the deadlock potential, since nodes will no longer be waiting
on resources freed by other nodes.
Since the legacy MergeThrottler has a lot of invariant checking around
strictly increasing merge chains, we only allow unordered merges to be
scheduled towards node sets where _all_ nodes are on a Vespa version
that explicitly understands unordered merges (and thus do not self-
obliterate upon seeing one). To communicate this, full bucket fetches
will now piggy-back version-specific feature sets as part of the response
protocol. Distributors then aggregate this information internally.
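A sketch of the distributor-side gating, with illustrative names for the aggregated feature information:

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Illustrative representation of per-node feature sets; the actual Vespa
// structures and names differ.
struct NodeFeatures {
    bool unordered_merge_chaining = false;  // reported via full bucket info fetches
};

using NodeFeatureMap = std::unordered_map<uint16_t, NodeFeatures>;

// An unordered merge may only be scheduled when *every* node in the node set
// is known to understand it; otherwise fall back to the legacy ordered chain.
bool can_schedule_unordered_merge(const NodeFeatureMap& features,
                                  const std::vector<uint16_t>& merge_nodes) {
    for (uint16_t node : merge_nodes) {
        auto it = features.find(node);
        if (it == features.end() || !it->second.unordered_merge_chaining) {
            return false;
        }
    }
    return true;
}
```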
|
|
|
|
|
|
|
|
|
|
|
| |
This was a feature tracing its lineage back to ye olde VDS days where we
had a dusty root drive holding the OS and Vespa and a dozen spinning rust
drives storing the actual data. The health ping would confirm that the
root drive was still functioning properly by triggering a write to it,
as the node would swiftly bail if any of the I/O operations failed.
Today's access patterns are wildly different, so we'll detect disk problems
quickly anyway; not to mention that Vespa Cloud has disk health monitoring built in.
|
|
|
|
|
| |
- Include the <atomic> header where it is needed.
- Use a normal function template instead of an abbreviated function template
  (see the sketch below).
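A tiny illustration of the second point, using a hypothetical function rather than code from the change itself:

```cpp
#include <atomic>

// Abbreviated function template (terse C++20 form):
//   void bump(auto& counter) { counter.fetch_add(1); }
// Rewritten as a normal function template, as the bullet above describes:
template <typename CounterT>
void bump(CounterT& counter) {
    counter.fetch_add(1);
}

int main() {
    std::atomic<int> hits{0};  // <atomic> included explicitly where it is used
    bump(hits);
}
```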
|
| |
|
|
|
|
| |
longer needed.
|
|\
| |
| |
| |
| | |
vespa-engine/toregge/handover-tracker-to-apply-bucket-diff-state-on-exceptions
Handover tracker to ApplyBucketDiffState on exceptions.
|
| | |
|