vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	Ensure parameter evaluation order does not have side effects	Tor Brede Vekterli	2024-04-10	3	-4/+7
\|
*	Low-level message fetch routine must not implicitly unlock mutex	Tor Brede Vekterli	2024-04-09	1	-1/+1
\| \| \| \| \| \| \|	Implicitly unlocking messes up higher level assumptions about when locks are held and thus cannot be safely done. Lock will be unlocked immediately after anyway, so this does not seem like a useful optimization.
*	Use `static_cast` instead of `dynamic_cast`	Tor Brede Vekterli	2024-04-09	1	-3/+3
\| \| \| \| \|	Downcast-safe type invariant shall be maintained by the message's own type ID tracking. If it's not, we have bigger problems.
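A minimal sketch of the invariant this relies on, assuming a message hierarchy where every subclass fixes its own type ID at construction time. `Message`, `PutMessage`, `RemoveMessage`, `MessageType` and `as_put` are hypothetical names used for illustration only; they are not the actual storage API types.

```cpp
#include <cassert>

// Hypothetical type IDs; the real message classes track their own.
enum class MessageType { Put, Remove };

class Message {
public:
    explicit Message(MessageType type) : _type(type) {}
    virtual ~Message() = default;
    MessageType type() const noexcept { return _type; }
private:
    MessageType _type;
};

class PutMessage : public Message {
public:
    // The subclass constructor is the only place the type ID is chosen.
    explicit PutMessage(int doc_size) : Message(MessageType::Put), _doc_size(doc_size) {}
    int doc_size() const noexcept { return _doc_size; }
private:
    int _doc_size;
};

class RemoveMessage : public Message {
public:
    RemoveMessage() : Message(MessageType::Remove) {}
};

// The type ID check documents the invariant in debug builds; the cast itself
// needs no RTTI lookup.
inline const PutMessage& as_put(const Message& msg) {
    assert(msg.type() == MessageType::Put);
    return static_cast<const PutMessage&>(msg);
}
```

Calling `as_put` on a `PutMessage` seen through a `Message` reference yields the original object; calling it on any other type is a bug in the type ID tracking, not something the cast should paper over.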
*	Support pipelining (batching) of mutating ops to same bucket	Tor Brede Vekterli	2024-04-09	14	-122/+381
\|	Bucket operations require either exclusive (single writer) or shared (multiple readers) access. Prior to this commit, this meant that many enqueued feed operations to the same bucket introduced pipeline stalls, because each operation had to wait for all prior operations to the bucket to complete entirely (including fsync of WAL append). This is a likely scenario when feeding a document set that was previously acquired through visiting, as such documents will inherently be output in bucket-order.
\|
\|	With this commit, a configurable number of feed operations (put, remove and update) bound for the exact same bucket may be sent asynchronously to the persistence provider in the context of the _same_ write lock. This mirrors how merge operations work for puts and removes.
\|
\|	Batching is fairly conservative, and will _not_ batch across further messages when any of the following holds:
\|	  * A non-feed operation is encountered
\|	  * More than one mutating operation is encountered for the same document ID
\|	  * No more persistence throttler tokens can be acquired
\|	  * Max batch size has been reached
\|
\|	Updating the bucket DB, assigning bucket info and sending replies is deferred until _all_ batched operations complete.
\|
\|	Max batch size is (re-)configurable live and defaults to a batch size of 1, which shall have the exact same semantics as the legacy behavior.
\|
\|	Additionally, clock sampling for persistence threads has been abstracted away to allow for mocking in tests (no need for sleep!).
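The four stop conditions can be sketched as a single scan over the queued operations for one bucket. This is an illustration under simplifying assumptions, not the code from this commit: `QueuedOp`, `OpKind`, `is_feed_op` and `batch_length` are hypothetical names, and the persistence throttler is reduced to a plain token count.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical operation model; not the actual message types.
enum class OpKind { Put, Remove, Update, Get };

struct QueuedOp {
    OpKind      kind;
    std::string doc_id;
};

inline bool is_feed_op(OpKind kind) noexcept {
    return kind == OpKind::Put || kind == OpKind::Remove || kind == OpKind::Update;
}

// Returns how many ops from the front of `queue` (all assumed to target the
// same bucket) may be dispatched under one write lock. `available_tokens`
// stands in for the persistence throttler (assumption).
inline std::size_t batch_length(const std::vector<QueuedOp>& queue,
                                std::size_t max_batch_size,
                                std::size_t available_tokens)
{
    assert(max_batch_size >= 1); // a batch size of 1 is the legacy behavior
    std::unordered_set<std::string> seen_doc_ids;
    std::size_t n = 0;
    for (const auto& op : queue) {
        if (n >= max_batch_size)   break; // max batch size has been reached
        if (n >= available_tokens) break; // no more throttler tokens
        if (!is_feed_op(op.kind))  break; // non-feed operation encountered
        if (!seen_doc_ids.insert(op.doc_id).second) {
            break;                        // second mutation of the same document ID
        }
        ++n;
    }
    return n;
}
```

In this sketch a result of 0 means the first queued operation is not a feed operation and would take the ordinary non-batched path. For a queue of put(a), update(b), remove(a), put(c) the scan stops at the second operation on `a`, so only the first two are batched.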
*	Update to protobuf 5.26.1 (C++ API).	Tor Egge	2024-04-05	3	-2/+3
\|
*	Wire Prometheus metric export to state V1 APIs	Tor Brede Vekterli	2024-03-21	2	-11/+24
\| \| \| \| \| \| \| \| \| \|	Extends metric producer classes with the requested exposition format. As a consequence, the State API server has been changed to allow emitting other content types than just `application/json`. Add custom Prometheus rendering for Slobrok, as it does its own domain-specific metric tracking. However, since it has non-destructive sampling properties, we can actually use proper `counter` types.
*	Support internal metric rendering in Prometheus text format in C++	Tor Brede Vekterli	2024-03-19	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \|	Maps all internal metrics to one or more labelled time series. Due to poor compatibility between the data model (and sampling strategy) of the legacy metrics framework and that of Prometheus, all time series are emitted as `untyped` metrics. This is a stop-gap solution on the way to "properly" supporting Prometheus exposition, and the output of this renderer should therefore only be used for internal purposes.
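For reference, a labelled time series emitted as `untyped` in the Prometheus text format looks roughly like what the sketch below produces. The `Sample` struct and the `render_untyped` and `escape_label_value` functions are hypothetical and not part of the renderer added here; the sketch also assumes the metric and label names are already valid Prometheus identifiers.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical in-memory sample; the legacy metrics framework has its own model.
struct Sample {
    std::string                        name;
    std::map<std::string, std::string> labels;
    double                             value;
};

// Label values must escape backslash, double quote and line feed.
inline std::string escape_label_value(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
        case '\\': out += "\\\\"; break;
        case '"':  out += "\\\""; break;
        case '\n': out += "\\n";  break;
        default:   out += c;
        }
    }
    return out;
}

// Renders one sample as a TYPE line followed by the sample line.
inline std::string render_untyped(const Sample& s) {
    std::ostringstream os;
    os << "# TYPE " << s.name << " untyped\n" << s.name;
    if (!s.labels.empty()) {
        os << '{';
        bool first = true;
        for (const auto& kv : s.labels) {
            if (!first) {
                os << ',';
            }
            os << kv.first << "=\"" << escape_label_value(kv.second) << '"';
            first = false;
        }
        os << '}';
    }
    os << ' ' << s.value << '\n';
    return os.str();
}
```

A sample named `requests` with label `node="0"` and value 3 renders as a `# TYPE requests untyped` line followed by `requests{node="0"} 3`.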
*	Enforce document timestamp requirements for updates in backend	Tor Brede Vekterli	2024-03-05	2	-4/+21
\|	The document API has long had a special field for update operations where an optional expected _existing_ backend timestamp can be specified, and where the update should only go through iff there is a timestamp match. The distributor has supported this all along, though only when write-repair is taking place (i.e. rarely), while actual backend support has been lacking. No one has complained yet since this is very much not an advertised feature, but if we want to e.g. use this feature for improvements to batch updates we should ensure that it works as expected.
\|
\|	With this commit, a non-zero "old timestamp" field is cross-checked against the existing document, and the update is only applied if the actual and expected timestamps match.
*	use const where possible	Henning Baldersheim	2024-02-05	1	-2/+2
\|
*	Simpler to just use false directly.	Henning Baldersheim	2024-02-05	1	-3/+1
\|
*	- Remove multibit_split from config, as it is always off, but leave it for ↵	Henning Baldersheim	2024-02-05	7	-21/+35
\| \| \| \| \| \|	tests. - Reduce penetration of generated StorFilestorConfig.
*	Merge pull request #30164 from ↵	Henning Baldersheim	2024-02-05	5	-30/+11
\|\ \| \| \| \| \| \| \| \|	vespa-engine/balder/hardcode-enable_metadata_only_fetch_phase_for_inconsistent_updates - Hardcode enable_metadata_only_fetch_phase_for_inconsistent_updates …
\| *	- Hardcode enable_metadata_only_fetch_phase_for_inconsistent_updates and ↵	Henning Baldersheim	2024-02-03	5	-30/+11
\| \| \| \| \| \| \| \| \| \| \| \|	restart_with_fast_update_path_if_all_get_timestamps_are_consistent to true. - The tests depending on these flags specify these values explicitly.
* \|	Merge pull request #30165 from vespa-engine/balder/gc-unused-distribution-config	Henning Baldersheim	2024-02-05	2	-8/+2
\|\ \ \| \| \| \| \| \|	Balder/gc unused distribution config
\| * \|	Followup on review comments and initialize members explicitly.	Henning Baldersheim	2024-02-05	1	-1/+1
\| \| \|
\| * \|	GC unused distributor_auto_ownership_transfer_on_whole_group_down	Henning Baldersheim	2024-02-03	2	-4/+0
\| \| \|
\| * \|	GC unused disk_distribution config.	Henning Baldersheim	2024-02-03	1	-5/+3
\| \|/
* \|	Merge pull request #30158 from ↵	Henning Baldersheim	2024-02-05	4	-23/+3
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	vespa-engine/balder/disable_queue_limits_for_chained_merges-always-true disable_queue_limits_for_chained_merges has long been true, GC
\| * \|	Add comment	Henning Baldersheim	2024-02-02	1	-0/+1
\| \| \|
\| * \|	disable_queue_limits_for_chained_merges has long been true, GC	Henning Baldersheim	2024-02-02	3	-23/+2
\| \|/
* \|	Merge pull request #30161 from ↵	Henning Baldersheim	2024-02-05	5	-136/+56
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	vespa-engine/balder/throttle_individual_merge_feed_ops_and_common_merge_chain_optimalization Balder/throttle individual merge feed ops and common merge chain optimalization
\| * \|	common_merge_chain_optimalization_minimum_size hardcoded at 64	Henning Baldersheim	2024-02-03	3	-97/+48
\| \| \|
\| * \|	throttle_individual_merge_feed_ops has long been enabled, cleaning up	Henning Baldersheim	2024-02-03	5	-39/+8
\| \|/
* /	Temporarily add back use_btree_database until some zombies are laid to rest.	Henning Baldersheim	2024-02-05	1	-0/+4
\|/
*	Condition probing has long been default	Henning Baldersheim	2024-02-02	4	-19/+0
\|
*	two_phase_garbage_collection is always enabled	Henning Baldersheim	2024-02-02	4	-20/+1
\|
*	Merge pull request #30146 from vespa-engine/balder/always-unordered-merging	Henning Baldersheim	2024-02-02	6	-31/+16
\|\ \| \| \| \|	Balder/always unordered merging
\| *	Keep priority_merge_out_of_sync_copies until it can be safely cleaned out.	Henning Baldersheim	2024-02-02	1	-0/+3
\| \|
\| *	Only include what you need	Henning Baldersheim	2024-02-02	4	-6/+8
\| \|
\| *	Always use use_unordered_merge_chaining	Henning Baldersheim	2024-02-02	4	-25/+5
\| \|
* \|	Merge pull request #30145 from ↵	Henning Baldersheim	2024-02-02	3	-10/+2
\|\ \ \| \|/ \|/\| \| \| \| \|	vespa-engine/balder/gc-maxpendingidealstateoperations GC maxpendingidealstateoperations which has not been wired in for a l…
\| *	GC maxpendingidealstateoperations which has not been wired in for a long time.	Henning Baldersheim	2024-02-02	3	-10/+2
\| \|
* \|	Merge pull request #30142 from ↵	Henning Baldersheim	2024-02-02	24	-100/+125
\|\ \ \| \|/ \|/\| \| \| \| \|	vespa-engine/balder/always-inhibit_default_merges_when_global_merges_pending - Always inhibit_default_merges_when_global_merges_pending
\| *	- Always inhibit_default_merges_when_global_merges_pending	Henning Baldersheim	2024-02-02	24	-100/+125
\| \| \| \| \| \| \| \| \| \|	- Only show config to the code that needs it. - Avoid spreading autogenerated config internals around the code.
* \|	Always clear_bucket_priority_on_schedule.	Henning Baldersheim	2024-02-02	6	-34/+7
\|/
*	Always sequence mutating operations.	Henning Baldersheim	2024-02-02	4	-20/+0
\|
*	Merge pull request #30137 from vespa-engine/balder/always-report-host-info	Henning Baldersheim	2024-02-02	6	-50/+3
\|\ \| \| \| \|	Always report hostinfo
\| *	Always report hostinfo	Henning Baldersheim	2024-02-02	6	-50/+3
\| \|
* \|	Merge branch 'master' into balder/cleanup-distributormanagerconfig-1	Henning Baldersheim	2024-02-02	6	-144/+11
\|\\|
\| *	Merge pull request #30136 from vespa-engine/balder/gc-priority-control-by-config	Henning Baldersheim	2024-02-02	4	-91/+7
\| \|\ \| \| \| \| \| \|	GC priority control in config. Correct priority is essential to conte…
\| \| *	GC priority control in config. Correct priority is essential to content ↵	Henning Baldersheim	2024-02-02	4	-91/+7
\| \| \| \| \| \| \| \| \| \| \| \|	layer, and should not be reconfigured.
\| * \|	Merge pull request #30135 from vespa-engine/balder/never-block-state-checkers	Henning Baldersheim	2024-02-02	4	-34/+2
\| \|\ \ \| \| \| \| \| \| \| \|	Never block statecheckers
\| \| * \|	Never block statecheckers	Henning Baldersheim	2024-02-02	4	-34/+2
\| \| \|/
\| * /	Always prioritize_global_bucket_merges	Henning Baldersheim	2024-02-02	5	-23/+2
\| \|/
* /	GC unused methods and members in distributormanager config, part 1	Henning Baldersheim	2024-02-02	2	-25/+2
\|/
*	Merge pull request #30130 from ↵	Henning Baldersheim	2024-02-01	3	-64/+4
\|\ \| \| \| \| \| \| \| \|	vespa-engine/balder/gc-void-config-from-stor-bouncer GC void config from stor-bouncer.def
\| *	GC void config from stor-bouncer.def	Henning Baldersheim	2024-02-01	3	-64/+4
\| \|
* \|	NORMAL_3 is the old normal	Henning Baldersheim	2024-02-01	1	-7/+7
\| \|
* \|	GC void config from stor-visitor.def	Henning Baldersheim	2024-02-01	5	-118/+25
\|/
*	GC unused stor-bucketdb and stor-opslogger config.	Henning Baldersheim	2024-01-30	5	-18/+0
\|