vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	Replace all usages of Arrays.asList with List.of where possible.	Henning Baldersheim	2024-04-12	6	-30/+25
*	Merge pull request #30816 from vespa-engine/marius/add-significance-model-reg...	Marius Arhaug	2024-04-09	4	-0/+110
|\
| *	add illegal arg exception for languages not registered	MariusArhaug	2024-04-09	1	-0/+3
| *	fix cr failures	MariusArhaug	2024-04-09	4	-10/+17
| *	add significance model registry to linguistics	MariusArhaug	2024-04-04	4	-0/+100
* |	Add SimpleTokenScript to SimpleTokenizer	MariusArhaug	2024-04-03	2	-0/+39
|/
*	Update copyright	Jon Bratseth	2023-10-09	20	-20/+20
*	Don't remove indexable symbols when stemming	Jon Bratseth	2023-06-02	1	-2/+2
*	Always treat each symbol as a separate token	Jon Bratseth	2023-05-22	2	-3/+25
*	Threat 'other symbols' as letters	Jon Bratseth	2023-05-22	1	-0/+8
*	Compute code points in whole string only when needed	jonmv	2022-12-06	1	-1/+14
*	Split out opennlp-linguistics	Henning Baldersheim	2022-11-26	3	-365/+0
*	No functional changes	Jon Bratseth	2022-09-11	1	-2/+2
*	Determine token types considering all characters	Jon Bratseth	2022-08-16	2	-11/+44
*	Expand test case for language detection	Jon Marius Venstad	2021-12-20	1	-3/+28
*	Revert "Merge pull request #20578 from vespa-engine/revert-20568-jonmv/replac...	Jon Marius Venstad	2021-12-20	3	-35/+82
*	Revert "Replace optimaize with OpenNLP language detector [run-systemtest]"	Jon Marius Venstad	2021-12-18	3	-82/+35
*	Re-add files	Jon Marius Venstad	2021-12-18	2	-0/+82
*	Replace optimaize with OpenNLP language detector	Jon Marius Venstad	2021-12-17	1	-35/+0
*	Time out requests after 200s	Jon Marius Venstad	2021-12-13	1	-1/+0
*	Update 2020 Oath copyrights.	gjoranv	2021-10-27	1	-1/+1
*	Update Verizon Media copyright notices.	gjoranv	2021-10-07	1	-1/+1
*	Update 2017 copyright notices.	gjoranv	2021-10-07	20	-20/+20
*	Separate component from linguistics	Jon Bratseth	2021-09-25	5	-197/+0
*	Refactor to separate classes	Jon Bratseth	2021-09-17	1	-2/+1
*	Encode to sparse tensor	Jon Bratseth	2021-09-16	1	-0/+6
*	Encode to dense tensor	Jon Bratseth	2021-09-16	2	-0/+23
*	Make SentencePieceEncoder configurable	Jon Bratseth	2021-09-16	3	-30/+102
*	More unit tests	Jon Bratseth	2021-09-14	1	-1/+20
*	Pure Java sentencepiece implementation	Jon Bratseth	2021-09-13	3	-0/+78
*	Revert "Merge pull request #17754 from vespa-engine/revert-17747-bratseth/spe...	Jon Bratseth	2021-05-05	1	-0/+40
*	Revert "Reapply "Bratseth/special tokens""	Jon Bratseth	2021-05-05	1	-40/+0
*	Revert "Merge pull request #17746 from vespa-engine/revert-17738-revert-17737...	Jon Bratseth	2021-05-05	1	-0/+40
*	Revert "Revert "Revert "Bratseth/special tokens"""	Jon Bratseth	2021-05-05	1	-40/+0
*	Revert "Revert "Bratseth/special tokens""	Jon Bratseth	2021-05-04	1	-0/+40
*	Revert "Bratseth/special tokens"	Jon Bratseth	2021-05-04	1	-40/+0
*	Expose tokens as map	Jon Bratseth	2021-05-04	1	-5/+3
*	Move specialtokens to linguistics	Jon Bratseth	2021-05-04	1	-0/+42
*	No functional changes	Jon Bratseth	2021-04-14	1	-37/+26
*	No functional changes	Jon Bratseth	2021-04-14	8	-19/+16
*	No functional changes	Jon Bratseth	2021-02-03	1	-0/+19
*	handle plugin tokenizer returning tokens with empty original string	Arne Juul	2020-08-24	1	-0/+51
*	Minor unification of tests.	Henning Baldersheim	2020-08-12	2	-20/+36
*	Surrogate aware gram splitting	Jon Bratseth	2020-06-25	1	-9/+37
*	Add/corect copyright headers	Jon Bratseth	2020-01-03	1	-0/+1
*	Remove deprecated apis in linguistics.	gjoranv	2019-01-21	1	-27/+0
*	Deprecated methods and add OptimaizeDetector	Jon Bratseth	2018-11-01	2	-6/+36
*	use com.optimaize.langdetect for lang detection	Jefim Matskin	2018-07-24	1	-0/+5
*	add opennlp stemmers - revert previous changes	Jefim Matskin	2018-07-18	3	-5/+238
*	add lang detection and opennlp stemmers	Jefim Matskin	2018-07-17	2	-0/+6