path: root/storage
Commit message  (Author, Date, Files changed, -deleted/+added lines)
...
| | * | Never block statecheckers  (Henning Baldersheim, 2024-02-02, 5 files, -61/+2)
| | |/
| * / Always prioritize_global_bucket_merges  (Henning Baldersheim, 2024-02-02, 7 files, -81/+11)
| |/
* / GC unused methods and members in distributormanager config, part 1  (Henning Baldersheim, 2024-02-02, 2 files, -25/+2)
|/
* Merge pull request #30130 from vespa-engine/balder/gc-void-config-from-stor-bouncer  (Henning Baldersheim, 2024-02-01, 4 files, -129/+15)
|\      GC void config from stor-bouncer.def
| * GC void config from stor-bouncer.def  (Henning Baldersheim, 2024-02-01, 4 files, -129/+15)
| |
* | NORMAL_3 is the old normal  (Henning Baldersheim, 2024-02-01, 1 file, -7/+7)
| |
* | GC void config from stor-visitor.def  (Henning Baldersheim, 2024-02-01, 6 files, -119/+26)
|/
* GC chunklevel from bucketdb config.  (Henning Baldersheim, 2024-01-30, 1 file, -1/+0)
|
* GC unused stor-bucketdb and stor-opslogger config.  (Henning Baldersheim, 2024-01-30, 6 files, -20/+0)
|
* GC completely unused parameters from the days of VDS  (Henning Baldersheim, 2024-01-30, 3 files, -29/+0)
|
* GC completely unused parameters from the days of VDS  (Henning Baldersheim, 2024-01-30, 1 file, -21/+0)
|
* GC unused async_operation_dynamic_throttling_window_increment and async_operation_throttler_type  (Henning Baldersheim, 2024-01-30, 1 file, -2/+1)
|
* GC leftovers from use_per_document_throttled_delete_bucket  (Henning Baldersheim, 2024-01-30, 6 files, -94/+41)
|
* Expose content node entry+document counts via host info  (Tor Brede Vekterli, 2024-01-25, 4 files, -7/+24)
|       This adds two new fields to the host info payload sent to the cluster controller;
        entry count (documents + tombstones) and visible document count (i.e. sans
        tombstones). To preserve symmetry, entry count has also been added to the metric
        set, as the host info fields originally referred to raw metric names.
* GC control of use-per-document-delete and max-merge-memory from config production side in java.  (Henning Baldersheim, 2024-01-23, 1 file, -1/+1)
|
* - Avoid inefficient generic template.  (Henning Baldersheim, 2023-12-29, 4 files, -40/+27)
|       - Add explicit implementations for the types needed.
* Fix format strings.  (Tor Egge, 2023-11-20, 2 files, -3/+4)
|
* Merge pull request #29342 from vespa-engine/vekterli/gc-unused-persistence-metrics  (Tor Brede Vekterli, 2023-11-15, 3 files, -43/+1)
|\      GC old and unused persistence-level metrics
| * GC old and unused persistence-level metrics  (Tor Brede Vekterli, 2023-11-15, 3 files, -43/+1)
| |     These metrics are from the age of VDS and a dozen spinning disks per node.
* | Explicitly print backtrace on bucket space invariant violation  (Tor Brede Vekterli, 2023-11-15, 1 file, -2/+9)
|/      This is to help catch an unknown edge case that can happen if distributor
        operation cancellation is enabled.
* Add fundamental metrics for async "remove by GID" SPI operation  (Tor Brede Vekterli, 2023-11-15, 4 files, -4/+25)
|       Tracks invocation count, latency and failures (although we don't expect to see
        any failures in this pipeline, since the remove ops logically happen atomically
        with the bucket iteration).
* Also memory limit throttle enqueued merges  (Tor Brede Vekterli, 2023-11-13, 3 files, -29/+65)
|       This plugs the hole where merges could enter the active window even if doing so
        would exceed the total memory limit, as dequeueing is a separate code path from
        when a merge is initially evaluated for inclusion in the active window. There is
        a theoretical head-of-line blocking/queue starvation issue if the merge at the
        front of the queue has an unrealistically large footprint and the memory limit
        is unrealistically low. In practice this is not expected to be a problem, and it
        should never cause merging to stop (at least one merge is always guaranteed to
        be able to execute). As such, no heuristics have been added to deal with this
        for now.
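        A minimal sketch of the admission rule described here, with hypothetical names
        rather than the actual MergeThrottler code, applied both when a merge first
        arrives and when it is dequeued:

            #include <cstddef>
            #include <cstdint>

            // Admit a merge into the active window only if its estimated memory usage
            // still fits under the soft limit. An empty window always admits one merge
            // so that merging can never stall completely (hence "soft" limit).
            bool can_admit_merge(uint64_t estimated_merge_bytes,
                                 uint64_t active_window_bytes,
                                 uint64_t soft_limit_bytes,
                                 std::size_t active_merge_count)
            {
                if (active_merge_count == 0) {
                    return true;
                }
                return (active_window_bytes + estimated_merge_bytes) <= soft_limit_bytes;
            }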
* Add and wire live config for selecting `DeleteBucket` behavior  (Tor Brede Vekterli, 2023-11-10, 4 files, -9/+45)
|       By default the legacy behavior is used.
* Implement DeleteBucket with throttled per-document async removal  (Tor Brede Vekterli, 2023-11-10, 7 files, -43/+129)
|       Previous (legacy) behavior was to immediately async schedule a full bucket
        deletion in the persistence backend, which incurs a very disproportionate cost
        when documents are backed by many and/or heavy indexes (such as HNSW). This
        risked swamping the backend with tens to hundreds of thousands of concurrent
        document deletes.

        New behavior splits deletion into three phases:
          1. Metadata enumeration for all documents present in the bucket.
          2. Persistence-throttled async remove _per individual document_ returned in
             the iteration result. This blocks the persistence thread (by design) if the
             throttling window is not sufficiently large to accommodate all pending
             deletes.
          3. Once all async removes have been ACKed, schedule the actual `DeleteBucket`
             operation towards the backend. This will clean up any remaining (cheap)
             tombstone entries as well as the meta data store.

        Operation reply is sent as before once the delete has completed.
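        A very condensed sketch of the three phases, using made-up helper types; the
        real code drives this through the SPI and the shared persistence throttler:

            #include <functional>
            #include <vector>

            struct DocEntry { /* GID, timestamp, ... */ };

            void delete_bucket_in_phases(
                    const std::function<std::vector<DocEntry>()>& iterate_metadata,
                    const std::function<void(const DocEntry&)>&   throttled_async_remove,
                    const std::function<void()>&                  await_all_acks,
                    const std::function<void()>&                  schedule_backend_delete)
            {
                // Phase 1: enumerate metadata for every document in the bucket.
                const std::vector<DocEntry> entries = iterate_metadata();

                // Phase 2: one throttled async remove per document; acquiring a throttle
                // token blocks the persistence thread when the window is full (by design).
                for (const auto& entry : entries) {
                    throttled_async_remove(entry);
                }

                // Phase 3: only after all removes are ACKed is the real DeleteBucket
                // scheduled, cleaning up remaining tombstones and the meta data store.
                await_all_acks();
                schedule_backend_delete();
            }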
* Use int64_t constants.  (Tor Egge, 2023-11-07, 1 file, -2/+2)
|
* Add removeByGidAsync() to spi.  (Tor Egge, 2023-11-06, 4 files, -0/+23)
|
* Specify metric unit in description string  (Tor Brede Vekterli, 2023-11-02, 1 file, -2/+2)
|
* Less confusing naming  (Tor Brede Vekterli, 2023-11-02, 2 files, -3/+3)
|
* Wire HwInfo into MergeThrottler and use for auto-deduction of memory limits  (Tor Brede Vekterli, 2023-11-02, 7 files, -58/+196)
|       Add config for min/max capping of the deduced limit, as well as a scaling factor
        based on the memory available to the process. Defaults have been chosen based on
        empirical observations over many years, but having these as config means we can
        tune things live if it should ever be required.
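        The deduction described here boils down to a scale-then-clamp; a rough sketch
        with invented parameter names (the real values come from config and HwInfo, and
        treating a max of 0 as "no upper cap" is an assumption of the sketch):

            #include <algorithm>
            #include <cstdint>

            uint64_t deduce_merge_memory_limit(uint64_t process_memory_bytes,
                                               double   scale_factor,
                                               uint64_t min_limit_bytes,
                                               uint64_t max_limit_bytes)
            {
                // Scale the memory visible to the process, then cap to [min, max].
                auto limit = static_cast<uint64_t>(
                        static_cast<double>(process_memory_bytes) * scale_factor);
                limit = std::max(limit, min_limit_bytes);
                if (max_limit_bytes != 0) {
                    limit = std::min(limit, max_limit_bytes);
                }
                return limit;
            }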
* Heuristically compute expected merge memory usage upper bound  (Tor Brede Vekterli, 2023-11-02, 3 files, -10/+106)
|       The distributor only knows a limited amount of metadata per bucket replica
        (roughly: checksum, doc count, doc size). It therefore has no way to know if two
        replicas with different checksums, both with 1000 documents, have 999 or 0
        documents in common. We therefore have to assume the worst and estimate the
        worst case memory usage as being the _sum_ of mutually divergent replica sizes.
        Estimates are bounded by the expected bucket merge chunk size, as we make the
        simplifying assumption that memory usage for a particular node is (roughly)
        limited to this value for any given bucket. One special-cased exception to this
        is single-document replicas, as one document can not be split across multiple
        chunks by definition. Here we track the largest single document replica.
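        In code form, the estimate sketched here might look roughly like this (made-up
        types; the real estimator also decides which replicas actually count as
        mutually divergent):

            #include <algorithm>
            #include <cstdint>
            #include <vector>

            struct ReplicaMeta {
                uint32_t doc_count;
                uint64_t byte_size;
            };

            // Worst case: divergent replicas share no documents, so their sizes are
            // summed. Each replica's contribution is bounded by the merge chunk size,
            // except single-document replicas, which cannot be split across chunks.
            uint64_t estimate_merge_memory(const std::vector<ReplicaMeta>& divergent_replicas,
                                           uint64_t merge_chunk_size_bytes)
            {
                uint64_t estimate = 0;
                for (const auto& replica : divergent_replicas) {
                    if (replica.doc_count == 1) {
                        estimate += replica.byte_size;   // one doc can't be chunked
                    } else {
                        estimate += std::min(replica.byte_size, merge_chunk_size_bytes);
                    }
                }
                return estimate;
            }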
* Add content node soft limit on max memory used by merges  (Tor Brede Vekterli, 2023-11-01, 9 files, -66/+264)
|       If configured, the active merge window is limited so that the sum of estimated
        memory usage for its merges does not go beyond the configured soft memory limit.
        The window can always fit a minimum of 1 merge regardless of its size to ensure
        progress in the cluster (thus this is a soft limit, not a hard limit).
* Remove unneeded and code-bloating test macro  (Tor Brede Vekterli, 2023-10-26, 1 file, -40/+46)
|
* Use same concurrency inhibition for DeleteBucket as for merge ops  (Tor Brede Vekterli, 2023-10-26, 1 file, -1/+6)
|       This provides a strict upper bound for the number of concurrently executing
        DeleteBucket operations, and ensures that no persistence thread stripe can have
        all its threads allocated to processing bucket deletions.
* Merge pull request #29098 from vespa-engine/vekterli/print-state-inside-lock  (Henning Baldersheim, 2023-10-25, 2 files, -10/+11)
|\      Print Bouncer state within lock to ensure visibility
| * Print Bouncer state within lock to ensure visibility  (Tor Brede Vekterli, 2023-10-25, 2 files, -10/+11)
| |     This code path is only encountered when debug logging is explicitly enabled for
        the parent `StorageLink` component. Turns out an old system test did just that.
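        The fix amounts to reading and formatting the state while the lock is held; a
        minimal sketch of the pattern with hypothetical member names:

            #include <mutex>
            #include <ostream>
            #include <string>

            struct BouncerLikeSketch {
                void print(std::ostream& out) const {
                    // Hold the lock while reading _state so the printed value is the
                    // one other threads have published, not a stale or torn read.
                    std::lock_guard<std::mutex> guard(_lock);
                    out << "state: " << _state << '\n';
                }
                mutable std::mutex _lock;
                std::string        _state;
            };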
* | Avoid using a reserved identifier naming format  (Tor Brede Vekterli, 2023-10-25, 5 files, -80/+80)
|/      Identifiers of the form `_Uppercased` are considered reserved by the standard.
        Not likely to cause ambiguity in practice, but it's preferable to stay on the
        good side of the standard-gods.
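        For reference, the rule in question ([lex.name]): an identifier beginning with
        an underscore followed by an uppercase letter is reserved to the implementation
        in any scope. Illustrative example with made-up member names:

            struct Example {
                int _Count;   // reserved form ("_Uppercased"); avoid declaring this
                int _count;   // underscore + lowercase is fine for class members
                int count_;   // trailing underscore sidesteps the question entirely
            };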
* Purge additional config instances not needed after bootstrap  (Tor Brede Vekterli, 2023-10-24, 1 file, -0/+2)
|
* Simplify and reuse utility config function  (Tor Brede Vekterli, 2023-10-24, 2 files, -13/+8)
|
* Rewire `FileStorManager` config  (Tor Brede Vekterli, 2023-10-24, 8 files, -54/+68)
|
* Rewire `ModifiedBucketChecker` config  (Tor Brede Vekterli, 2023-10-24, 6 files, -27/+33)
|
* Propagate `VisitorManager` config from outside  (Tor Brede Vekterli, 2023-10-24, 7 files, -37/+59)
|
* Provide explicit bootstrap config to `BucketManager`  (Tor Brede Vekterli, 2023-10-24, 4 files, -22/+20)
|
* Pull up and out config of `ChangedBucketOwnershipHandler` component  (Tor Brede Vekterli, 2023-10-24, 9 files, -55/+86)
|
* Wire config to MergeThrottler in from the outside  (Tor Brede Vekterli, 2023-10-24, 6 files, -52/+66)
|
* Explicitly de-inline `BootstrapConfigs` ctor/dtor  (Tor Brede Vekterli, 2023-10-23, 2 files, -0/+10)
|
* Propagate existing StorageNode config from main Process reconfig loop  (Tor Brede Vekterli, 2023-10-23, 6 files, -106/+61)
|
* Rewire Bouncer configuration flow  (Tor Brede Vekterli, 2023-10-19, 9 files, -55/+82)
|       Removes own `ConfigFetcher` in favor of pushing reconfiguration responsibilities
        onto the components owning the Bouncer instance. The current "superclass calls
        into subclass" approach isn't ideal, but the longer term plan is to hoist all
        config subscriptions out of `StorageNode` and into the higher-level `Process`
        structure.
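        A minimal sketch of the ownership pattern described here (hypothetical names):
        the component exposes an explicit reconfigure entry point, and the owner, which
        holds the only config subscription, pushes new config in.

            #include <memory>
            #include <mutex>

            struct BouncerConfigSketch {
                int max_queue_size = 0;   // stand-in for the real stor-bouncer values
            };

            class BouncerSketch {
            public:
                // Called by the owning component when new config arrives; the Bouncer
                // itself no longer runs its own ConfigFetcher/subscription.
                void on_configure(std::unique_ptr<BouncerConfigSketch> cfg) {
                    std::lock_guard<std::mutex> guard(_lock);
                    _config = std::move(cfg);
                }
            private:
                std::mutex                           _lock;
                std::unique_ptr<BouncerConfigSketch> _config;
            };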
* De-dupe `StorageNode` config propagation  (Tor Brede Vekterli, 2023-10-18, 4 files, -125/+109)
|       Removes the need to duplicate locking and explicit config propagation handling
        per config type. Also removes unused upgrade-config wiring.
* Merge pull request #29003 from vespa-engine/vekterli/remove-unused-document-config-handler  (Henning Baldersheim, 2023-10-18, 2 files, -23/+0)
|\      Remove unused document config update logic
| * Remove unused document config update logic  (Tor Brede Vekterli, 2023-10-18, 2 files, -23/+0)
| |     Actual document config changes are propagated in from the top-level `Process`
        via an entirely different call chain. Having the unused one around is just
        confusing, so remove it.