vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	Update copyright	Jon Bratseth	2023-10-09	5	-6/+6
*	Prefer truncation configuration from tokenizer model	Bjørn Christian Seime	2023-06-12	1	-1/+8
*	Test padding with truncation	Bjørn Christian Seime	2023-06-08	1	-2/+3
*	Verify presence of special token	Bjørn Christian Seime	2023-06-08	1	-2/+7
*	Disable padding and make it configurable	Bjørn Christian Seime	2023-06-08	1	-9/+23
*	Make truncation and max length configurable	Bjørn Christian Seime	2023-05-26	1	-2/+28
*	Revert "Revert "Bjorncs/huggingface tokenizer""	Bjørn Christian Seime	2023-05-12	1	-0/+88
*	Revert "Bjorncs/huggingface tokenizer"	Arnstein Ressem	2023-05-12	1	-88/+0
*	Disable special tokens by default	Bjørn Christian Seime	2023-05-11	1	-0/+1
*	Make HF tokenizer a separate embedder	Bjørn Christian Seime	2023-05-11	1	-0/+87
*	Add skipping of control tokens	Lester Solbakken	2023-02-10	2	-1/+12
*	Add decoding of sentencepiece token sequence to text	Lester Solbakken	2023-02-10	2	-0/+16
*	Test segmentation with subwords	Jon Bratseth	2021-12-17	2	-4/+13
*	BERT -> WordPiece, make subword prefix configurable	Jon Bratseth	2021-12-17	6	-129/+119
*	Add a BERT embedder	Jon Bratseth	2021-12-16	2	-6/+54
*	Encapsulate in a context	Jon Bratseth	2021-10-01	1	-2/+3
*	Update linguisticvs-components	Jon Bratseth	2021-09-30	1	-2/+2
*	encode -> embed	Jon Bratseth	2021-09-28	3	-23/+23
*	Separate component from linguistics	Jon Bratseth	2021-09-25	3	-0/+197