| Commit message (Collapse) | Author | Age | Files | Lines |
| |
If a maintenance operation reply comes from a node that went down after
its original request was sent _and_ cancelling is enabled, there is a
discrepancy where the distributor stripe message tracker does not know
about the operation, but the maintenance operation owner _does_. This
must be handled explicitly, as the message tracker returns a
default-constructed bucket when the message is not known. This default
bucket must not be propagated out of the function, or we would
transitively trigger an invariant check failure.
|
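The guard described above could be sketched roughly as follows. All names here are illustrative stand-ins, not the actual Vespa classes; the point is that an unknown reply yields `std::nullopt` rather than a default-constructed bucket that could leak out of the function:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical sketch: a tracker that maps in-flight message IDs to buckets.
// A reply for an unknown message (node went down after send + cancellation)
// must not surface a default-constructed bucket to callers.
struct BucketId {
    uint64_t raw = 0;                      // default-constructed => invalid sentinel
    bool valid() const { return raw != 0; }
};

class MessageTracker {
public:
    void track(uint64_t msg_id, BucketId bucket) { _in_flight[msg_id] = bucket; }

    // Returns the tracked bucket, or nullopt if the message is unknown,
    // so the caller handles the "unknown operation" case explicitly.
    std::optional<BucketId> bucket_for_reply(uint64_t msg_id) const {
        auto it = _in_flight.find(msg_id);
        if (it == _in_flight.end() || !it->second.valid()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::unordered_map<uint64_t, BucketId> _in_flight;
};
```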
| |
| |
The cluster state should specify N nodes and the distribution config
should specify fewer than N nodes, in order to properly check that
attempting to use node N fails due to the _config_ and not the _state_.
|
| |
If a cluster state bundle contains distribution config, this is
internally propagated via the `StateManager` component to all registered
state listeners. One such state listener is `FileStorManager`, which
updates the content node-internal bucket space repository.
`SetSystemStateCommand` handling and the internal config-aware
components (`StateManager` and `ChangedBucketOwnershipHandler`) now
explicitly track whether the cluster controller provides distribution
config, or whether the internally provided config should be used
(including falling back to internal config if necessary).
|
|\
| |
| |
| |
| | |
vespa-engine/vekterli/decode-cluster-state-bundle-distribution-config-cpp
Support decoding distribution config as part of cluster state bundles in C++
|
| |
| |
| |
| |
| |
| | |
Actually _encoding_ config in the same format as that used for
decoding config payloads is not directly supported, so we do our
own roundabout conversion as part of testing.
|
|\ \
| | |
| | |
| | |
| | | |
vespa-engine/toregge/adjust-storage-server-document-api-converter-unit-test-for-out-of-source-builds
Adjust storage server document api converter unit test for out of source builds.
|
|/ |
| |
| |
This has been nothing more than a no-op at best, as there is no (and,
to the best of my knowledge, never has been an) init state for
distributors.
|
| |
If a received cluster state bundle contains an embedded distribution
config, the distributor will act _as if_ it had atomically received
and processed a distribution config change followed by a new cluster
state, but where no bucket info requests were sent for the config
change.
If a distributor observes a state bundle containing distribution
config it will note this internally and explicitly ignore any further
received distribution configs _not_ arriving from the cluster
controller. To handle downgrades and rollbacks, it undoes this
toggle if it later observes a state bundle _without_ distribution
config, reverting to using internal node config instead.
|
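The toggle behavior described above (ignore non-controller config once a bundle carries distribution config; revert on a later bundle without it) could be sketched like this. Class and method names are hypothetical; config payloads are modeled as plain strings for brevity:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Illustrative sketch of the distribution config source toggle.
// Once a state bundle carries distribution config, internally provided
// configs are ignored; a bundle WITHOUT config reverts to internal
// config (handles downgrades and rollbacks).
class DistributionConfigSource {
public:
    void on_state_bundle(const std::optional<std::string>& bundle_config) {
        if (bundle_config) {
            _active = *bundle_config;
            _controller_provides_config = true;
        } else {
            _controller_provides_config = false;
            _active = _internal; // rollback path: use internal node config
        }
    }

    void on_internal_config(const std::string& config) {
        _internal = config;
        if (!_controller_provides_config) {
            _active = config; // only applied while the controller is silent
        }
    }

    const std::string& active() const { return _active; }

private:
    bool _controller_provides_config = false;
    std::string _internal;
    std::string _active;
};
```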
| |
Move distribution config transforms to vdslib so that the distribution
config bundle can contain derived configs for all bucket spaces in
one central place.
This is part of the prerequisite work needed before we can start
pushing distribution config from the cluster controller and rewiring
how distribution config is propagated and used in the backends.
Also, rename `Distribution::serialize()` to `Distribution::serialized()`
since it returns a const ref to a cached serialized form and does not do
on-demand serialization.
|
| |
| |
Adds a (live) config that specifies whether content nodes and
distributors shall reject cluster state versions that are lower than
the one currently active on the node. This prevents "last write wins"
ordering problems when multiple cluster controllers have partially
overlapping leadership periods.
In the name of pragmatism, we try to auto-detect the case where
ZooKeeper state must have been lost on the cluster controller cluster,
and accept the state even with lower version number. Otherwise,
the content cluster would be effectively stalled until all its
processes had been manually restarted.
Adds wiring of live config to the `StateManager` component on both
content and distributor nodes.
|
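The acceptance rule described above can be condensed to a small predicate. This is a hedged sketch: the actual heuristic for detecting lost ZooKeeper state is not spelled out in the commit message, so it is modeled here as a boolean supplied by the caller:

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the version-gating logic. `zk_state_likely_lost` stands in
// for whatever detection the real implementation performs.
bool should_accept_state(uint32_t active_version, uint32_t received_version,
                         bool reject_lower_versions, bool zk_state_likely_lost) {
    if (!reject_lower_versions) {
        return true; // legacy "last write wins" behavior
    }
    if (received_version >= active_version) {
        return true;
    }
    // Lower version: only accept if cluster controller state was likely
    // lost, so the content cluster is not stalled until manual restarts.
    return zk_state_likely_lost;
}
```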
| |
It's possible for a diff to contain multiple versions of the same
document, persisted at different timestamps. To avoid asynchronously
scheduling multiple operations per distinct document ID (technically a
violation of the SPI invariants), do a pre-pass over the diff to find
the highest timestamp present per document. Only schedule an operation
if it is the newest one for its document.
|
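The pre-pass described above could look roughly like this; the types are simplified stand-ins for the real diff entries:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Sketch: find the highest timestamp per document ID, then keep only the
// entries matching that timestamp, so at most one async operation is
// scheduled per distinct document.
struct DiffEntry {
    std::string doc_id;
    uint64_t timestamp;
};

std::vector<DiffEntry> newest_per_document(const std::vector<DiffEntry>& diff) {
    std::unordered_map<std::string, uint64_t> newest;
    for (const auto& e : diff) { // pre-pass: highest timestamp per doc ID
        auto& ts = newest[e.doc_id];
        if (e.timestamp > ts) {
            ts = e.timestamp;
        }
    }
    std::vector<DiffEntry> to_schedule;
    for (const auto& e : diff) { // second pass: keep only the newest version
        if (newest[e.doc_id] == e.timestamp) {
            to_schedule.push_back(e);
        }
    }
    return to_schedule;
}
```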
| |
The legacy Put replica selection behavior may route new versions
of a document to replicas that are not considered optimal for
activation. This is not normally an issue, but can manifest itself
as missing coverage when the system is in flux with replicas
moving away from Retired nodes containing ready replicas, as the
existing replicas on the Retired node would be preferred for
activation (and thus be used for searches) but incoming Puts would
instead be sent to non-retired nodes due to being in the ideal state.
The new replica ordering (and, transitively, selection) behavior is
identical between Puts and activation. This should help ensure
that new versions of a document are routed to the replica(s)
most likely to be visible as part of searches.
New selection behavior for Puts is config-gated and defaults to
the legacy behavior.
This also subtly changes the fallback ordering criteria for replica
activation to consider the replica's existing DB _entry_ order
instead of its node index. Since DB entries are always ordered by
their ideal state order (with Retired nodes included), this will
evenly distribute fallback activations rather than skewing them
towards lower indexes. It is not expected that this has any negative
effects in practice, and is therefore _not_ a config-gated change.
|
| |
Was only used by `DirConfig`.
|
| |
Introduce a distinct `StorageConfigSet` which wraps the actual
underlying config objects and exposes them through a unified
`ConfigUri`.
|
| |
Once upon a time, VDS roamed the lands. It used real disk IO as
part of tests. Then came the meteor and in-memory dummy persistence
took over. Now it is time for the fossils to be moved into a museum
where they belong.
Also make PID file writing conditional on a config that is set to
`false` during unit testing (but `true` by default).
|
| |
| |
Avoids potentially having to deserialize the entire update just
to get to a single bit of information that is technically
metadata existing orthogonally to the document update itself.
To ensure backwards/forwards compatibility, the flag is
propagated as a Protobuf `enum` where the default value is
a special "unspecified" sentinel, implying an old sender.
Since the Java protocol implementation always eagerly
deserializes messages, it unconditionally assigns the
`create_if_missing` field when sending and completely ignores
it when receiving.
The C++ protocol implementation observes and propagates the
field iff set. Otherwise the flag is deferred to the update
object as before. This applies to both the DocumentAPI and
StorageAPI protocols.
|
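The tri-state propagation described above can be illustrated with a plain enum. In the real protocol this is a Protobuf `enum`; the names and the resolution helper below are hypothetical:

```cpp
#include <cassert>

// Sketch of the sentinel semantics: the zero value means "field not set",
// implying an old sender, in which case the receiver defers to the
// (possibly expensively deserialized) document update object itself.
enum class CreateIfMissing {
    UNSPECIFIED = 0, // default value on the wire => old sender
    FALSE_      = 1,
    TRUE_       = 2,
};

// Receiver-side resolution: observe the wire flag iff set, otherwise
// fall back to the flag stored in the update object.
bool resolve_create_if_missing(CreateIfMissing wire_flag,
                               bool update_object_flag) {
    switch (wire_flag) {
        case CreateIfMissing::TRUE_:       return true;
        case CreateIfMissing::FALSE_:      return false;
        case CreateIfMissing::UNSPECIFIED: return update_object_flag;
    }
    return update_object_flag; // unreachable; keeps compilers happy
}
```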
| |
Can't initialize members in constructor that depend on objects
that are subsequently reset by the superclass' `SetUp()` method.
|
| |
| |
Avoids the need for barriers to keep from stepping on the thread's toes.
|
| |
| |
Bucket operations require either exclusive (single writer) or
shared (multiple readers) access. Prior to this commit, this
meant that many enqueued feed operations to the same bucket
introduced pipeline stalls, since each operation had to wait
for all prior operations to the bucket to complete entirely
(including fsync of the WAL append). This is a likely scenario when
feeding a document set that was previously acquired through
visiting, as such documents will inherently be output in
bucket order.
With this commit, a configurable number of feed operations
(put, remove and update) bound for the exact same bucket may
be sent asynchronously to the persistence provider in the
context of the _same_ write lock. This mirrors how merge
operations work for puts and removes.
Batching is fairly conservative, and will _not_ batch across
further messages when any of the following holds:
* A non-feed operation is encountered
* More than one mutating operation is encountered for the
same document ID
* No more persistence throttler tokens can be acquired
* Max batch size has been reached
Updating the bucket DB, assigning bucket info and sending
replies is deferred until _all_ batched operations complete.
Max batch size is (re-)configurable live and defaults to a
batch size of 1, which shall have the exact same semantics as
the legacy behavior.
Additionally, clock sampling for persistence threads has been
abstracted away to allow for mocking in tests (no need for sleep!).
|
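The batching stop-conditions listed above could be sketched as a single scan over the head of a bucket's queue. All names are illustrative, and the real implementation's token accounting is certainly richer than a plain counter:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Sketch: given a queue of operations bound for one bucket, count how
// many consecutive head operations may share a single write lock.
// Returns 0 if the head operation is not a batchable feed op.
struct Op {
    bool is_feed_op;    // put, remove or update
    std::string doc_id;
};

size_t batchable_prefix(const std::vector<Op>& queue,
                        size_t max_batch_size,
                        size_t available_throttle_tokens) {
    std::unordered_set<std::string> seen_doc_ids;
    size_t n = 0;
    for (const auto& op : queue) {
        if (!op.is_feed_op) {
            break; // non-feed operation stops batching
        }
        if (!seen_doc_ids.insert(op.doc_id).second) {
            break; // more than one mutating op for the same document ID
        }
        if (n >= available_throttle_tokens) {
            break; // no more persistence throttler tokens
        }
        if (n >= max_batch_size) {
            break; // max batch size reached
        }
        ++n;
    }
    return n;
}
```

With `max_batch_size == 1` this degenerates to one operation per lock, matching the legacy default semantics.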
| |
The document API has long since had a special field for update operations
where an optional expected _existing_ backend timestamp can be specified,
and where the update should only go through iff there is a timestamp
match.
This has been supported on the distributor all along, but only when
write-repair is taking place (i.e. rarely), but the actual backend
support has been lacking. No one has complained yet since this is
very much not an advertised feature, but if we want to e.g. use this
feature for improvements to batch updates we should ensure that it
works as expected.
With this commit, a non-zero "old timestamp" field is cross-checked
against the existing document, and the update is only applied if the
actual and expected timestamps match.
|
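The cross-check described above boils down to a small predicate; the function name is hypothetical, and zero denotes "field not set" as in the commit message:

```cpp
#include <cassert>
#include <cstdint>

// Sketch: a non-zero expected "old timestamp" must match the stored
// document's timestamp for the update to be applied; zero means the
// field was not set, so the update applies unconditionally as before.
bool update_applies(uint64_t expected_old_timestamp, uint64_t stored_timestamp) {
    if (expected_old_timestamp == 0) {
        return true; // no expected timestamp given
    }
    return expected_old_timestamp == stored_timestamp;
}
```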
| |
| |
tests.
- Reduce penetration of generated StorFilestorConfig.
|
|\
| |
| |
| |
| | |
vespa-engine/balder/hardcode-enable_metadata_only_fetch_phase_for_inconsistent_updates
- Hardcode enable_metadata_only_fetch_phase_for_inconsistent_updates …
|
| | |
|
| |
| |
| |
| |
| |
| | |
restart_with_fast_update_path_if_all_get_timestamps_are_consistent to true.
- The tests depending on these flags specify these values explicitly.
|
|\ \
| | |
| | | |
Balder/gc unused distribution config
|
| | | |
|
| |/ |
|
|\ \
| | |
| | |
| | |
| | | |
vespa-engine/balder/disable_queue_limits_for_chained_merges-always-true
disable_queue_limits_for_chained_merges has long been true, GC
|
| |/ |
|
|/ |
|
| |
|
| |
|
|\
| |
| | |
Balder/always unordered merging
|
| | |
|
|\ \
| |/
|/|
| |
| | |
vespa-engine/balder/gc-maxpendingidealstateoperations
GC maxpendingidealstateoperations which has not been wired in for a l…
|