vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	Remove athenz-provider-service.def	Valerij Fredriksen	2024-05-10	2	-39/+0
\|
*	maxtokenlength units are characters.	Tor Egge	2024-05-06	1	-1/+1
\|
*	Add max token length to ilscripts config.	Tor Egge	2024-05-06	1	-0/+2
\|
*	Merge pull request #31011 from vespa-engine/marius/update-significance-model-fields	Marius Arhaug	2024-04-30	1	-2/+0
\|\ \| \| \| \| \| \| \| \|	Update significance model field and logic from architect meeting
\| *	Update significance model field and logic from architect meeting	MariusArhaug	2024-04-24	1	-2/+0
\| \|
* \|	add prepend support	Jo Kristian Bergum	2024-04-25	1	-0/+2
\|/
*	add vespa-otelcol-start	Arne Juul	2024-04-12	2	-1/+6
\|
*	Otel on logserver WIP	Ola Aunronning	2024-04-12	1	-0/+5
\|
*	Support pipelining (batching) of mutating ops to same bucket	Tor Brede Vekterli	2024-04-09	1	-1/+8
\|	Bucket operations require either exclusive (single writer) or shared (multiple readers) access. Prior to this commit, this means that many enqueued feed operations to the same bucket introduce pipeline stalls due to each operation having to wait for all prior operations to the bucket to complete entirely (including fsync of WAL append). This is a likely scenario when feeding a document set that was previously acquired through visiting, as such documents will inherently be output in bucket-order.
\|
\|	With this commit, a configurable number of feed operations (put, remove and update) bound for the exact same bucket may be sent asynchronously to the persistence provider in the context of the _same_ write lock. This mirrors how merge operations work for puts and removes.
\|
\|	Batching is fairly conservative, and will _not_ batch across further messages when any of the following holds:
\|	* A non-feed operation is encountered
\|	* More than one mutating operation is encountered for the same document ID
\|	* No more persistence throttler tokens can be acquired
\|	* Max batch size has been reached
\|
\|	Updating the bucket DB, assigning bucket info and sending replies is deferred until _all_ batched operations complete.
\|
\|	Max batch size is (re-)configurable live and defaults to a batch size of 1, which shall have the exact same semantics as the legacy behavior.
\|
\|	Additionally, clock sampling for persistence threads have been abstracted away to allow for mocking in tests (no need for sleep!).
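The stop conditions listed in this commit message can be illustrated with a small sketch. This is not the Vespa implementation. The names `Op`, `take_batch`, `max_batch_size` and `extra_tokens` are hypothetical, and the sketch assumes the head operation already holds a throttler token and that `extra_tokens` is the number of further tokens that can still be acquired.

```python
from dataclasses import dataclass

# Feed operation kinds named in the commit message.
FEED_OPS = {"put", "remove", "update"}


@dataclass(frozen=True)
class Op:
    kind: str    # "put", "remove", "update", or a non-feed kind such as "get"
    bucket: int  # hypothetical bucket identifier
    doc_id: str


def take_batch(queue, max_batch_size, extra_tokens):
    """Return the prefix of `queue` that may share one bucket write lock.

    Assumption: the head operation is always dispatched (it already holds
    a throttler token); `extra_tokens` limits how many more may join it.
    """
    if not queue:
        return []
    head = queue[0]
    batch = [head]
    if head.kind not in FEED_OPS:
        return batch  # non-feed operations are never batched
    seen_ids = {head.doc_id}
    for op in queue[1:]:
        if len(batch) >= max_batch_size:
            break  # max batch size has been reached
        if op.kind not in FEED_OPS:
            break  # a non-feed operation is encountered
        if op.bucket != head.bucket:
            break  # only ops bound for the exact same bucket qualify
        if op.doc_id in seen_ids:
            break  # second mutating op for the same document ID
        if extra_tokens <= 0:
            break  # no more throttler tokens can be acquired
        extra_tokens -= 1
        seen_ids.add(op.doc_id)
        batch.append(op)
    return batch
```

With `max_batch_size=1` the function always returns only the head operation, which corresponds to the legacy one-at-a-time behavior described above.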
*	Add config for significance models	MariusArhaug	2024-04-03	3	-0/+14
\|
*	bump maxtermoccurrences 1000 => 10000	Tor Egge	2024-02-20	1	-1/+1
\|
*	- Remove multibit_split from config, as it is always off, but leave it for tests.	Henning Baldersheim	2024-02-05	1	-6/+0
\| \| \| \| \| \|	- Reduce penetration of generated StorFilestorConfig.
*	Merge pull request #30165 from vespa-engine/balder/gc-unused-distribution-config	Henning Baldersheim	2024-02-05	1	-20/+0
\|\ \| \| \| \|	Balder/gc unused distribution config
\| *	GC unused distributor_auto_ownership_transfer_on_whole_group_down	Henning Baldersheim	2024-02-03	1	-8/+0
\| \|
\| *	GC unused disk_distribution config.	Henning Baldersheim	2024-02-03	1	-13/+1
\| \|
* \|	common_merge_chain_optimalization_minimum_size hardcoded at 64	Henning Baldersheim	2024-02-03	1	-6/+0
\| \|
* \|	throttle_individual_merge_feed_ops has long been enabled, cleaning up	Henning Baldersheim	2024-02-03	1	-5/+2
\|/
*	GC completely unused parameters from the days of VDS	Henning Baldersheim	2024-01-30	1	-17/+0
\|
*	GC unused async_operation_dynamic_throttling_window_increment and async_operation_throttler_type	Henning Baldersheim	2024-01-30	1	-21/+0
\|
*	GC leftovers from use_per_document_throttled_delete_bucket	Henning Baldersheim	2024-01-30	1	-8/+0
\|
*	GC control of use-per-document-delete and max-merge-memory from config production side in java.	Henning Baldersheim	2024-01-23	1	-1/+1
\|
*	bump maxtermoccurrences 100 => 1000	Henning Baldersheim	2024-01-15	1	-1/+1
\|
*	handle multilingual models better	Jo Kristian Bergum	2024-01-06	1	-0/+3
\|
*	Add a splade embedder implementation	Jo Kristian Bergum	2023-12-15	2	-0/+30
\|
*	Add and wire live config for selecting `DeleteBucket` behavior	Tor Brede Vekterli	2023-11-10	1	-0/+8
\| \| \| \|	By default the legacy behavior is used.
*	add config for normalizers	Arne Juul	2023-10-11	1	-0/+11
\|
*	Update copyright	Jon Bratseth	2023-10-09	67	-70/+73
\|
*	- Reduce max lids per file and max file size to 4M and 256M during unit testing.	Henning Baldersheim	2023-10-05	1	-1/+1
\| \| \| \|	- Reduce max lids from 40M to 8M as default configuration.
*	Install config definition	Bjørn Christian Seime	2023-09-21	1	-0/+1
\|
*	Add ColBERT embedder	Jo Kristian Bergum	2023-09-21	1	-0/+36
\|
*	Add token endpoints to proxy config	Morten Tokle	2023-09-08	1	-0/+3
\|
*	Add numProxiesAllowedDown fields to orchestrator def	Håkon Hallingstad	2023-07-31	1	-0/+8
\|
*	Add port for token connector to nginx config	Bjørn Christian Seime	2023-07-19	1	-1/+2
\|
*	Split token authz into dedicated filter `CloudTokenDataPlaneFilter`	Bjørn Christian Seime	2023-07-19	3	-5/+11
\|
*	Add expiration concept to data plane tokens	Bjørn Christian Seime	2023-07-12	1	-0/+1
\|
*	Add parameters for tokens to config definition	Bjørn Christian Seime	2023-06-14	1	-0/+4
\|
*	Install config definition	Bjørn Christian Seime	2023-06-14	1	-1/+1
\|
*	DataplaneProxyConfig does not contain endpoints	Ola Aunronning	2023-06-13	1	-6/+0
\|
*	Prefer truncation configuration from tokenizer model	Bjørn Christian Seime	2023-06-12	1	-3/+10
\| \| \| \| \| \| \|	Only override truncation if not specified or max length exceeds max tokens accepted by model. Use JNI wrapper directly to determine existing truncation configuration (JSON format is not really documented). Simplify configuration for pure tokenizer embedder. Disable DJL usage telemetry.
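One possible reading of the truncation rule in this commit message is sketched below. The function name and parameters are hypothetical and do not come from the Vespa code. The sketch assumes that "max length" refers to the truncation length found in the tokenizer model, and that the override value is the maximum number of tokens the model accepts.

```python
from typing import Optional


def resolve_truncation(tokenizer_max_length: Optional[int],
                       model_max_tokens: int) -> int:
    """Pick the truncation length to use.

    Assumption: prefer the tokenizer model's own setting, and override it
    only when it is absent or larger than what the model accepts.
    """
    if tokenizer_max_length is None:
        return model_max_tokens  # truncation not specified in the tokenizer
    if tokenizer_max_length > model_max_tokens:
        return model_max_tokens  # tokenizer setting exceeds the model limit
    return tokenizer_max_length  # keep the tokenizer model's configuration
```

For example, under these assumptions a tokenizer configured for 256 tokens keeps its setting with a 512-token model, while a tokenizer with no truncation configured falls back to 512.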
*	Merge pull request #27349 from vespa-engine/bjorncs/disable-padding	Bjørn Christian Seime	2023-06-08	1	-2/+3
\|\ \| \| \| \|	Bjorncs/disable padding
\| *	Disable padding and make it configurable	Bjørn Christian Seime	2023-06-08	1	-2/+3
\| \|
* \|	Merge branch 'master' into olaa/dataplane-proxy-config	Ola Aunrønning	2023-06-08	8	-1/+104
\|\\|
\| *	Fix typo	Bjørn Christian Seime	2023-06-07	1	-1/+1
\| \|
\| *	Ensure config definitions are installed on configserver	Bjørn Christian Seime	2023-06-07	3	-0/+3
\| \|
\| *	Merge pull request #27297 from vespa-engine/bjorncs/bert-embedder-services-xml	Bjørn Christian Seime	2023-06-06	2	-0/+34
\| \|\ \| \| \| \| \| \|	Bjorncs/bert embedder services xml
\| \| *	Make pooling strategy configurable for Huggingface embedder	Bjørn Christian Seime	2023-06-05	1	-0/+2
\| \| \|
\| \| *	Move config definition to `configdefinitions`	Bjørn Christian Seime	2023-06-05	1	-0/+32
\| \| \|
\| * \|	Add necessary options to use failOnWarnings	gjoranv	2023-06-05	1	-0/+1
\| \|/
\| *	Introduce services.xml syntax for configuring HuggingFace embedders	Bjørn Christian Seime	2023-06-02	4	-0/+60
\| \|
\| *	Remove use of stateGatherCount config, simplify and deprecate config field	Harald Musum	2023-05-30	1	-0/+1
\| \|