- Add explicit implementations for the types needed.
vespa-engine/vekterli/gc-unused-persistence-metrics: GC old and unused persistence-level metrics
These metrics are from the age of VDS and a dozen spinning disks
per node.
This is to help catch an unknown edge case that can happen if
distributor operation cancellation is enabled.
Tracks invocation count, latency and failures (although we don't
expect to see any failures in this pipeline, since the remove ops
logically happen atomically with the bucket iteration).
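
Below is a framework-agnostic sketch of the metrics in question; the struct, member names, and RAII timer are illustrative assumptions rather than the actual Vespa metrics wiring.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Illustrative stand-in for per-remove pipeline metrics; names are
// hypothetical, not the actual Vespa metric definitions.
struct PendingRemoveMetrics {
    std::atomic<uint64_t> invocations{0};
    std::atomic<uint64_t> failures{0};
    std::atomic<uint64_t> total_latency_us{0};

    // RAII helper that counts one invocation and records its latency.
    class Timer {
        PendingRemoveMetrics& _m;
        std::chrono::steady_clock::time_point _start;
    public:
        explicit Timer(PendingRemoveMetrics& m)
            : _m(m), _start(std::chrono::steady_clock::now()) {}
        ~Timer() {
            using namespace std::chrono;
            _m.invocations.fetch_add(1, std::memory_order_relaxed);
            _m.total_latency_us.fetch_add(
                static_cast<uint64_t>(duration_cast<microseconds>(
                    steady_clock::now() - _start).count()),
                std::memory_order_relaxed);
        }
    };
};
```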
This plugs the hole where merges could enter the active window even
if doing so would exceed the total memory limit, since dequeueing is
a separate code path from when a merge is initially evaluated for
inclusion in the active window.
There is a theoretical head-of-line blocking/queue starvation issue
if the merge at the front of the queue has an unrealistically large
footprint and the memory limit is unrealistically low. In practice
this is not expected to be a problem, and it should never cause merging
to stop (at least one merge is always guaranteed to be able to execute).
As such, no heuristics have been added to deal with this for now.
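
A minimal sketch of sharing the admission check between the initial-evaluation and dequeue paths, assuming hypothetical `QueuedMerge` and `MergeWindow` types (the real scheduler differs):

```cpp
#include <cstdint>
#include <deque>

// Hypothetical queued-merge representation; only the estimated memory
// footprint matters for this sketch.
struct QueuedMerge { uint64_t estimated_memory_bytes; };

struct MergeWindow {
    std::deque<QueuedMerge> queue;
    uint64_t active_memory_bytes = 0;
    uint64_t active_merge_count  = 0;
    uint64_t soft_memory_limit   = 0; // 0 == unlimited

    // Same admission predicate on both code paths (initial evaluation and
    // dequeue): an empty window always admits, so at least one merge can
    // always run even if it alone exceeds the soft limit.
    bool can_admit(const QueuedMerge& m) const {
        if (soft_memory_limit == 0 || active_merge_count == 0) return true;
        return active_memory_bytes + m.estimated_memory_bytes <= soft_memory_limit;
    }

    void try_dequeue() {
        while (!queue.empty() && can_admit(queue.front())) {
            active_memory_bytes += queue.front().estimated_memory_bytes;
            ++active_merge_count;
            queue.pop_front(); // starting the merge itself is elided
        }
    }
};
```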
By default, the legacy behavior is used.
Previous (legacy) behavior was to immediately async-schedule a
full bucket deletion in the persistence backend, which incurs a very
disproportionate cost when documents are backed by many and/or
heavy indexes (such as HNSW). This risked swamping the backend with
tens to hundreds of thousands of concurrent document deletes.
New behavior splits deletion into three phases:
1. Metadata enumeration for all documents present in the bucket.
2. Persistence-throttled async remove _per individual document_
   returned in the iteration result. This blocks the persistence
   thread (by design) if the throttling window is not sufficiently
   large to accommodate all pending deletes.
3. Once all async removes have been ACKed, schedule the actual
   `DeleteBucket` operation towards the backend. This cleans up any
   remaining (cheap) tombstone entries as well as the metadata store.
   The operation reply is sent as before once the delete has completed.
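
A condensed sketch of the three phases, using hypothetical stand-ins for the persistence SPI (the real types, signatures, and throttling machinery differ):

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the persistence SPI; the real interfaces differ.
struct DocumentId { std::string id; };

struct PersistenceBackend {
    // Phase 1: enumerate metadata for all documents in the bucket (stubbed).
    std::vector<DocumentId> enumerate(uint64_t /*bucket*/) { return {}; }
    // Phase 2: throttled async remove of one document; on_ack fires when the
    // remove is ACKed. Acquiring a throttle slot may block the persistence
    // thread by design (stubbed here as immediate completion).
    void throttled_async_remove(uint64_t, const DocumentId&,
                                std::function<void()> on_ack) { on_ack(); }
    // Phase 3: delete the now-cheap remainder: tombstones + metadata store.
    void delete_bucket(uint64_t, std::function<void()> on_done) { on_done(); }
};

void three_phase_delete(PersistenceBackend& backend, uint64_t bucket,
                        std::function<void()> send_reply) {
    auto docs = backend.enumerate(bucket);                    // phase 1
    if (docs.empty()) {
        backend.delete_bucket(bucket, std::move(send_reply)); // nothing to remove
        return;
    }
    auto pending = std::make_shared<std::atomic<size_t>>(docs.size());
    auto on_ack = [&backend, bucket, send_reply, pending] {
        if (pending->fetch_sub(1) == 1) {                     // last remove ACKed
            backend.delete_bucket(bucket, send_reply);        // phase 3
        }
    };
    for (const auto& doc : docs) {
        backend.throttled_async_remove(bucket, doc, on_ack);  // phase 2
    }
}
```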
Add config for min/max capping of the deduced limit, as well as a
scaling factor based on the memory available to the process. Defaults
have been chosen based on empirical observations over many years,
but having these as config means we can tune things live if
it should ever be required.
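
The deduction itself amounts to a scale-then-clamp computation; a sketch with hypothetical parameter names (the defaults and config names here are placeholders, not the actual chosen values):

```cpp
#include <algorithm>
#include <cstdint>

// Sketch of deducing the memory soft limit from the memory available to the
// process, with a configurable scaling factor and min/max capping.
uint64_t deduced_soft_limit(uint64_t process_memory_bytes,
                            double scale_factor,   // fraction of available memory
                            uint64_t min_cap_bytes,
                            uint64_t max_cap_bytes) {
    auto scaled = static_cast<uint64_t>(process_memory_bytes * scale_factor);
    return std::clamp(scaled, min_cap_bytes, max_cap_bytes);
}
```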
The distributor only knows a limited amount of metadata per
bucket replica (roughly: checksum, doc count, doc size). It
therefore has no way to know whether two replicas with different
checksums, both with 1000 documents, have 999 or 0 documents
in common. We therefore have to assume the worst and estimate
the worst-case memory usage as the _sum_ of mutually
divergent replica sizes.
Estimates are bounded by the expected bucket merge chunk size,
as we make the simplifying assumption that memory usage for
a particular node is (roughly) limited to this value for any
given bucket.
One special-cased exception to this is single-document replicas,
as one document cannot be split across multiple chunks by
definition. Here we track the largest single-document replica.
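
A sketch of the estimation logic under these assumptions; `ReplicaInfo` and the function shape are illustrative, not the distributor's actual data structures:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical per-replica metadata visible to the distributor.
struct ReplicaInfo {
    uint32_t checksum;
    uint32_t doc_count;
    uint64_t total_doc_size;
};

// Worst-case memory estimate for merging a set of replicas: replicas with
// identical checksums share data, while mutually divergent ones are assumed
// fully disjoint and therefore summed. Each term is bounded by the merge
// chunk size, except single-document replicas, which cannot be chunked;
// for those we track the largest single document.
uint64_t estimate_merge_memory(const std::vector<ReplicaInfo>& replicas,
                               uint64_t merge_chunk_bytes) {
    uint64_t sum = 0;
    uint64_t max_single_doc = 0;
    std::vector<uint32_t> seen;
    for (const auto& r : replicas) {
        if (std::find(seen.begin(), seen.end(), r.checksum) != seen.end()) {
            continue; // identical replica; contributes no divergent data
        }
        seen.push_back(r.checksum);
        if (r.doc_count == 1) {
            max_single_doc = std::max(max_single_doc, r.total_doc_size);
        } else {
            sum += std::min(r.total_doc_size, merge_chunk_bytes);
        }
    }
    return sum + max_single_doc;
}
```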
If configured, the active merge window is limited so that the
sum of estimated memory usage for its merges does not go
beyond the configured soft memory limit. The window can
always fit a minimum of 1 merge regardless of its size to
ensure progress in the cluster (thus this is a soft limit,
not a hard limit).
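
The admission rule itself is small; a sketch with illustrative names, where an empty window always admits (which is precisely what makes the limit soft):

```cpp
#include <cstdint>

// Soft-limit admission rule: the window always admits when empty, which
// guarantees cluster-wide merge progress even if a single merge's estimate
// exceeds the configured limit.
bool may_enter_active_window(uint64_t active_merges,
                             uint64_t window_memory_bytes,
                             uint64_t merge_estimate_bytes,
                             uint64_t soft_limit_bytes) {
    if (soft_limit_bytes == 0 || active_merges == 0) {
        return true; // unlimited, or guaranteed minimum of one merge
    }
    return window_memory_bytes + merge_estimate_bytes <= soft_limit_bytes;
}
```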
This provides a strict upper bound for the number of concurrently
executing DeleteBucket operations, and ensures that no persistence
thread stripe can have all its threads allocated to processing
bucket deletions.
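
A sketch of such a cap as a simple counting gate; the class and sizing are illustrative, with the maximum chosen strictly below the per-stripe thread count:

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Sketch of a cap on concurrently executing DeleteBucket operations,
// sized below the per-stripe thread count so deletions can never occupy
// every thread in a stripe. Names are illustrative.
class DeleteBucketGate {
    std::mutex _mtx;
    std::condition_variable _cv;
    uint32_t _in_flight = 0;
    const uint32_t _max_in_flight; // strictly less than threads per stripe
public:
    explicit DeleteBucketGate(uint32_t max_in_flight)
        : _max_in_flight(max_in_flight) {}
    void acquire() {
        std::unique_lock lock(_mtx);
        _cv.wait(lock, [&] { return _in_flight < _max_in_flight; });
        ++_in_flight;
    }
    void release() {
        { std::lock_guard lock(_mtx); --_in_flight; }
        _cv.notify_one();
    }
};
```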
Print Bouncer state within lock to ensure visibility
This code path is only encountered when debug logging is explicitly
enabled for the parent `StorageLink` component. Turns out an old
system test did just that.
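
A minimal sketch of the fix, with illustrative members: the state is read and formatted while holding the same lock that guards mutations, so the debug output observes a coherent value:

```cpp
#include <mutex>
#include <sstream>
#include <string>

// Illustrative members, not the actual Bouncer internals.
class Bouncer {
    mutable std::mutex _lock;
    int _state = 0;
public:
    std::string describe_state() const {
        std::lock_guard guard(_lock); // read under lock for visibility
        std::ostringstream os;
        os << "Bouncer state: " << _state;
        return os.str();
    }
};
```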
Identifiers of the form `_Uppercased` are considered reserved by
the standard. Not likely to cause ambiguity in practice, but it's
preferable to stay on the good side of the standard-gods.
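
For illustration (the names here are made up):

```cpp
// Identifiers like `_Uppercased` are reserved for the implementation
// ([lex.name] in the C++ standard).
struct _Queued {};   // reserved: underscore followed by an uppercase letter
struct QueuedTag {}; // conforming replacement
```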
Removes the Bouncer's own `ConfigFetcher` in favor of pushing
reconfiguration responsibilities onto the components owning the
Bouncer instance. The current "superclass calls into subclass"
approach isn't ideal, but the longer-term plan is to hoist all config
subscriptions out of `StorageNode` and into the higher-level `Process`
structure.
Removes the need to duplicate locking and explicit config
propagation handling per config type.
Also removes unused upgrade-config wiring.
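
A sketch of the kind of generic holder that removes the per-type duplication; `ConfigSlot` is a hypothetical name, not the actual class introduced:

```cpp
#include <memory>
#include <mutex>

// Reusable holder centralizing the locking and propagation that was
// previously duplicated per config type.
template <typename ConfigT>
class ConfigSlot {
    mutable std::mutex _lock;
    std::shared_ptr<const ConfigT> _current;
public:
    void set(std::shared_ptr<const ConfigT> cfg) {
        std::lock_guard guard(_lock);
        _current = std::move(cfg);
    }
    std::shared_ptr<const ConfigT> get() const {
        std::lock_guard guard(_lock);
        return _current;
    }
};
```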
|
|\
| |
| |
| |
| | |
vespa-engine/vekterli/remove-unused-document-config-handler: Remove unused document config update logic
Actual document config changes are propagated in from the
top-level `Process` via an entirely different call chain.
Having the unused one around is just confusing, so remove it.
for GCC false positives.
vespa-engine/vekterli/make-operation-priority-mapping-static: Remove unused configurability of operation priorities
As far as I know, this config has not been used by anyone for
at least a decade (if it ever was used for anything truly useful).
Additionally, operation priorities are a foot-gun at the best of
times; the ability to dynamically change the meaning of priority
enums even more so.
This commit entirely removes configuration of Document API
priority mappings in favor of a fixed mapping that is equal to
the default config, i.e. what everyone's been using anyway.
This removes a thread per distributor/storage node process, as
well as 1 mutex and 1 (presumably entirely unneeded) `seq_cst`
atomic load in the message hot path. It also precomputes a LUT for
the priority reverse mapping to avoid needing to lower-bound seek
an explicit map.
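
A sketch of the LUT precomputation; the enum values are illustrative placeholders, not the actual Document API priority set:

```cpp
#include <array>
#include <cstdint>

// Illustrative priority enum; the real set has more members.
enum class Priority : uint8_t { HIGHEST = 50, NORMAL = 120, LOWEST = 200 };

// Precompute the reverse mapping as a 256-entry LUT so the hot path indexes
// an array instead of lower-bound-seeking a map. Each wire byte maps to the
// nearest enum at or above it (clamping to LOWEST at the top end), mirroring
// what a std::map::lower_bound lookup used to compute.
constexpr std::array<Priority, 256> make_reverse_lut() {
    std::array<Priority, 256> lut{};
    for (int i = 0; i < 256; ++i) {
        if (i <= 50)       lut[i] = Priority::HIGHEST;
        else if (i <= 120) lut[i] = Priority::NORMAL;
        else               lut[i] = Priority::LOWEST;
    }
    return lut;
}

inline constexpr auto kReverseLut = make_reverse_lut();

inline Priority to_priority(uint8_t wire) { return kReverseLut[wire]; }
```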
This moves the responsibility for bootstrapping and updating config
for the `CommunicationManager` component to its owner. By doing this,
a dedicated `ConfigFetcher` can be removed. Since this is a
component used by both the distributor and storage nodes, this
reduces total thread count by 2 on a host.
This is required to allow messages to be bounced during the
final chain flushing step, where the `CommunicationManager` is
shutting down the RPC subsystem and waiting for all RPC threads
to complete. At this point the Bouncer component below it has
completed its transition into the final `CLOSED` state.
This is symmetrical to allowing the `CommunicationManager` to
send messages down while in a `FLUSHINGUP` state.
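
A sketch of the resulting down-path rule, with state names taken from the commit text (the actual `StorageLink` implementation differs in detail):

```cpp
// Chain states as named in the commit messages; illustrative only.
enum class State { OPENED, FLUSHINGDOWN, FLUSHINGUP, CLOSED };

// During FLUSHINGUP only the CommunicationManager may still send messages
// down; the Bouncer below it then rejects or swallows them even though it
// has already reached its final CLOSED state.
bool may_send_down(State state, bool is_communication_manager) {
    return state == State::OPENED ||
           (state == State::FLUSHINGUP && is_communication_manager);
}
```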
Since we now shut down the RPC server as the last step during flushing,
it's possible for incoming RPCs to arrive before we get to this point.
These will be immediately bounced (or swallowed) by the Bouncer
component that lies directly below the `CommunicationManager`, but to
actually get there we need to allow messages down in the `StorageLink`
`FLUSHINGUP` state.
This commit allows this explicitly for the `CommunicationManager` and
disallows it for everyone else. It also adds stack trace dumping to the
log in case a violation is detected.
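
A sketch of the violation check with stack trace dumping; the trace helper is a stand-in for whatever backtrace utility the codebase provides:

```cpp
#include <cassert>
#include <cstdio>

enum class State { OPENED, FLUSHINGDOWN, FLUSHINGUP, CLOSED };

// Stand-in: the real code would walk and symbolize the stack here.
void dump_stack_to_log() {
    std::fprintf(stderr, "(stack trace would be dumped here)\n");
}

// Enforce the FLUSHINGUP exception: only the CommunicationManager may send
// messages down in that state; anyone else gets their call chain logged.
void verify_down_allowed(State state, bool is_communication_manager) {
    if (state == State::FLUSHINGUP && !is_communication_manager) {
        std::fprintf(stderr, "sendDown() called in FLUSHINGUP state\n");
        dump_stack_to_log(); // make the offending call chain visible in the log
        assert(false);
    }
}
```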
This moves RPC shutdown from being the _first_ thing that happens
to being the _last_ thing that happens during storage chain shutdown.
To prevent concurrent client requests from the outside from reaching
internal components during the flushing phases, the Bouncer component
will now explicitly and immediately reject incoming RPCs after closing,
and all replies will be silently swallowed (no one is listening for
them at that point anyway).
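
A sketch of the post-close Bouncer behavior with illustrative types: new requests are rejected immediately, replies are swallowed:

```cpp
#include <string>

// Illustrative message type; only the request/reply distinction matters here.
struct Message {
    bool is_reply = false;
    std::string description;
};

class Bouncer {
    bool _closed = false;
public:
    void on_close() { _closed = true; }

    // Returns true if the message was consumed here instead of being
    // forwarded to the components below.
    bool handle_incoming(const Message& msg) {
        if (!_closed) return false;     // normal forwarding path (elided)
        if (msg.is_reply) return true;  // swallow: no one is listening anymore
        send_abort_reply(msg);          // reject new requests right away
        return true;
    }
private:
    void send_abort_reply(const Message&) { /* elided */ }
};
```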
subsystem, take 2"
| |
Since we now shut down the RPC server as the last step during flushing,
it's possible for incoming RPCs to arrive before we get to this point.
These will be immediately bounced (or swallowed) by the Bouncer
component that lies directly below the CommunicationManager, but to
actually get there we need to allow messages down in the StorageLink
`FLUSHINGUP` state.
This commit allows this explicitly for the CommunicationManager and
disallows it for everyone else. Also added stack trace dumping to the
log in the case that a violation is detected.
| |
This moves RPC shutdown from being the _first_ thing that happens
to being the _last_ thing that happens during storage chain shutdown.
To avoid concurrent client requests from the outside reaching internal
components during the flushing phases, the Bouncer component will now
explicitly and immediately reject incoming RPCs after closing and all
replies will be silently swallowed (no one is listening for them at that
point anyway).