vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	All embedders are the same	Jon Bratseth	2024-02-09	1	-2/+2
\| \| \| \| \|	This is to avoid a validation override from changed indexing expression when embedder details are changed.
*	Support embedding into rank 3 tensors	Jon Bratseth	2024-02-02	3	-29/+42
\|
*	- Add alternative sparsify implementation using generic tensor.reduce/map.	Henning Baldersheim	2024-01-31	2	-9/+52
\| \| \| \| \| \| \|	- Add options for specifying which one to use in tests and performance benchmark. Based on original implementation prior to custom reduce with the following improvements. - Apply Math.log after reduction, which is the same optimization as done in the custom implementation. - Join the 2 separate single dimension reduce statements into a single 2 dimensional reduce.
*	- Put the inner loops in separate methods. This improves ability to inline.	Henning Baldersheim	2024-01-20	2	-54/+52
\| \| \| \| \| \| \|	- Use Buffer.get(int index) instead of Buffer.get(). That avoids a write. - Use int as loop variable. - This brings the splade performance test down from 8s to 7s - TensorConverter.toVespaTensor more than doubled speed.
*	Rename getIndex => getDirectIndex	Henning Baldersheim	2024-01-20	1	-1/+1
\|
*	Add a class to assist efficient traversal of dimensions in an IndexedTensor.	Henning Baldersheim	2024-01-19	2	-4/+9
\|
*	Cache sizes.totalSize() in variable to prevent recomputation.	Henning Baldersheim	2024-01-18	1	-20/+19
\|
*	Since both value and log(value) are monotonically increasing for value >= 1,	Henning Baldersheim	2024-01-18	1	-8/+8
\| \| \| \| \|	we can just gather max(value) and do log at the end. Avoiding the general Math.max, which seems to have very costly NaN handling, was quite beneficial.
*	Construct array right away instead of going via a single element list and ↵	Henning Baldersheim	2024-01-18	1	-5/+15
\| \| \| \|	the java stream api.
*	Avoid generic reduce and keep PAD token embedding	Jo Kristian Bergum	2024-01-15	2	-24/+47
\|
*	remove extra space	Jo Kristian Bergum	2024-01-11	1	-1/+1
\|
*	address review	Jo Kristian Bergum	2024-01-11	2	-43/+25
\|
*	Avoid generic reduce to reduce gc pressure	Jo Kristian Bergum	2024-01-11	2	-19/+61
\|
*	final	Jo Kristian Bergum	2024-01-06	1	-1/+1
\|
*	handle multilingual models better	Jo Kristian Bergum	2024-01-06	3	-65/+147
\|
*	Allow mapped 1d tensor for embed expressions	Jo Kristian Bergum	2023-12-17	2	-13/+13
\|
*	Add a splade embedder implementation	Jo Kristian Bergum	2023-12-15	5	-0/+30962
\|
*	Move Jackson util from vespajlib to container-core.	Henning Baldersheim	2023-11-24	3	-3/+3
\|
*	jackson 2.16 changes some of its default settings so we consolidate our use ↵	Henning Baldersheim	2023-11-23	3	-8/+7
\| \| \| \| \| \|	of the ObjectMapper. Unless special options are used, use a common instance, or create via factory method.
*	unpack_bits_from_int8 -> unpack_bits	Arne Juul	2023-11-10	1	-2/+2
\|
*	add simple expandBitTensor function	Arne Juul	2023-11-10	2	-9/+35
\|
*	Add support and upgrade opset	Jo Kristian Bergum	2023-10-26	4	-7/+31
\|
*	Add support for bfloat16 and float16	Jo Kristian Bergum	2023-10-26	4	-0/+82
\|
*	Less verbose logging when failing to find CUDA and it is optional	Jo Kristian Bergum	2023-10-26	2	-2/+53
\|
*	Disable CPU arena allocator for ONNX	Bjørn Christian Seime	2023-10-19	1	-0/+1
\| \| \| \| \|	The arena memory allocator pre-allocates an excessive amount of memory up front. Disabling matches the existing configuration in ONNX integration for backend.
*	Update copyright	Jon Bratseth	2023-10-09	122	-122/+131
\|
*	Don't index PAD and re-factoring	Jo Kristian Bergum	2023-09-26	2	-41/+37
\|
*	Add config options + license	Jo Kristian Bergum	2023-09-21	2	-0/+2
\|
*	Ensure Onnx/Huggingface resources are cleaned up on deconstruction	Bjørn Christian Seime	2023-09-21	1	-0/+6
\|
*	Add ColBERT embedder	Jo Kristian Bergum	2023-09-21	4	-0/+599
\|
*	- Use equals when comparing Optional<Long>	Henning Baldersheim	2023-09-13	2	-4/+4
\| \| \| \|	- Minor cleanup
*	Use thread safe hash map	Bjørn Christian Seime	2023-08-31	1	-2/+2
\|
*	Merge pull request #27969 from vespa-engine/bjorncs/embedder-metrics	Jon Bratseth	2023-08-31	5	-8/+94
\|\ \| \| \| \|	Add generic metrics for embedders
\| *	Allow sampling of fractional millis	Bjørn Christian Seime	2023-08-25	3	-15/+10
\| \|
\| *	Add generic metrics for embedders	Bjørn Christian Seime	2023-08-04	5	-8/+99
\| \|
* \|	Better error message when importing models with illegal names	Lester Solbakken	2023-08-29	1	-0/+25
\|/
*	Log when GPU configuration is successful	Martin Polden	2023-07-19	1	-3/+8
\|
*	Log warning when failing to use GPU	Martin Polden	2023-07-19	1	-1/+6
\|
*	update onnx.proto	Arne Juul	2023-06-23	4	-80/+453
\| \| \| \| \|	* use latest version from https://github.com/onnx/onnx/blob/main/onnx/onnx.proto * track API changes (enum -> int32)
*	Prefer truncation configuration from tokenizer model	Bjørn Christian Seime	2023-06-12	1	-6/+19
\| \| \| \| \| \| \|	Only override truncation if not specified or max length exceeds max tokens accepted by model. Use JNI wrapper directly to determine existing truncation configuration (JSON format is not really documented). Simplify configuration for pure tokenizer embedder. Disable DJL usage telemetry.
*	Add missing wiring of pooling strategy	Bjørn Christian Seime	2023-06-08	1	-11/+1
\|
*	Disable padding and make it configurable	Bjørn Christian Seime	2023-06-08	1	-0/+1
\|
*	Merge pull request #27297 from vespa-engine/bjorncs/bert-embedder-services-xml	Bjørn Christian Seime	2023-06-06	4	-49/+54
\|\ \| \| \| \|	Bjorncs/bert embedder services xml
\| *	Make pooling strategy configurable for Huggingface embedder	Bjørn Christian Seime	2023-06-05	3	-17/+54
\| \|
\| *	Move config definition to `configdefinitions`	Bjørn Christian Seime	2023-06-05	1	-32/+0
\| \|
* \|	Add necessary options to use failOnWarnings	gjoranv	2023-06-05	1	-0/+4
\|/
*	Introduce services.xml syntax for configuring HuggingFace embedders	Bjørn Christian Seime	2023-06-02	2	-29/+6
\|
*	Properly ignore token type ids from tokenizer if disabled	Bjørn Christian Seime	2023-05-30	1	-2/+2
\|
*	Remove dead code	Bjørn Christian Seime	2023-05-26	2	-43/+0
\|
*	Make truncation and max length configurable	Bjørn Christian Seime	2023-05-26	1	-12/+3
\|