path: root/linguistics/src/main/java
Commit message  Author  Age  Files  Lines
* Determine token types considering all characters  Jon Bratseth  2022-08-16  4  -108/+89
* Remove on Vespa 8  Jon Bratseth  2022-06-08  1  -8/+0
* Use '@Inject' from 'annotations' in multiple bundles  Bjørn Christian Seime  2022-05-06  2  -2/+2
* Resolve rank profile inputs  Jon Bratseth  2022-04-21  1  -1/+1
* Rename defaultEmbedderName to defaultEmbedderId  Lester Solbakken  2022-03-22  1  -2/+2
* Add convenience function to represent embedder as map  Lester Solbakken  2022-03-21  1  -3/+26
* Stem by linguistics in rule bases  Jon Bratseth  2022-01-10  1  -3/+20
* annotate intentional switch fallthrough  Arne H Juul  2022-01-06  1  -0/+1
* Specify how the class is actually loaded  Jon Marius Venstad  2021-12-21  1  -1/+1
* Provide array of correct size.  Jon Marius Venstad  2021-12-20  1  -1/+1
* Override ngram creation with something less silly  Jon Marius Venstad  2021-12-20  2  -1/+32
* Use smaller chunks for faster detection  Jon Marius Venstad  2021-12-20  1  -2/+2
* Upper bound on input size, and use opennlp before simple detector  Jon Marius Venstad  2021-12-20  1  -6/+3
* Avoid putting nulls in languange map  Jon Marius Venstad  2021-12-20  1  -2/+5
* Revert "Merge pull request #20578 from vespa-engine/revert-20568-jonmv/replac...  Jon Marius Venstad  2021-12-20  7  -131/+163
* Revert "Replace optimaize with OpenNLP language detector [run-systemtest]"  Jon Marius Venstad  2021-12-18  7  -163/+131
* Re-add files  Jon Marius Venstad  2021-12-18  2  -0/+60
* Move model to module where it is needed, to simplify, at the cost of larger b...  Jon Marius Venstad  2021-12-18  3  -22/+21
* Add some javadoc, and no need to handle null return for model  Jon Marius Venstad  2021-12-17  2  -2/+4
* Replace optimaize with OpenNLP language detector  Jon Marius Venstad  2021-12-17  6  -131/+102
* Add a BERT embedder  Jon Bratseth  2021-12-16  1  -2/+3
* Update 2020 Oath copyrights.  gjoranv  2021-10-27  1  -1/+1
* Update Verizon Media copyright notices.  gjoranv  2021-10-07  2  -2/+2
* Update 2018 copyright notices.  gjoranv  2021-10-07  2  -2/+2
* Update 2017 copyright notices.  gjoranv  2021-10-07  49  -49/+49
* Encapsulate in a context  Jon Bratseth  2021-10-01  1  -12/+46
* Pass destination  Jon Bratseth  2021-09-30  1  -4/+10
* encode -> embed  Jon Bratseth  2021-09-28  2  -56/+56
* Separate component from linguistics  Jon Bratseth  2021-09-25  8  -490/+0
* Linguistics cleanup  Jon Bratseth  2021-09-21  17  -34/+29
* Add 'encode' expression  Jon Bratseth  2021-09-19  1  -0/+17
* Provide a (non-working) encoder by default  Jon Bratseth  2021-09-17  1  -1/+1
* Cleanup  Jon Bratseth  2021-09-17  5  -9/+2
* Refactor to separate classes  Jon Bratseth  2021-09-17  7  -201/+278
* Encoder interface  Jon Bratseth  2021-09-16  3  -4/+55
* Encode to sparse tensor  Jon Bratseth  2021-09-16  1  -0/+10
* Encode to dense tensor  Jon Bratseth  2021-09-16  1  -4/+13
* Use a result builder  Jon Bratseth  2021-09-16  1  -21/+53
* Make SentencePieceEncoder configurable  Jon Bratseth  2021-09-16  1  -6/+30
* Merge pull request #19130 from vespa-engine/bratseth/sp-export  Jo Kristian Bergum  2021-09-14  1  -0/+7
|\
| * Make public  Jon Bratseth  2021-09-14  1  -0/+7
* | Slight algorithm simplification  Jon Bratseth  2021-09-14  1  -6/+4
* | Slight algorithm simplification  Jon Bratseth  2021-09-14  1  -6/+3
* | Slight algorithm simplification  Jon Bratseth  2021-09-14  1  -11/+10
|/
* Pure Java sentencepiece implementation  Jon Bratseth  2021-09-13  2  -2/+335
* we want to compare Linguistics objects for equivalence  Arne Juul  2021-08-04  3  -0/+7
* Require replacements to be applied during tokenization  Jon Bratseth  2021-06-15  3  -12/+11
* Revert "Merge pull request #17754 from vespa-engine/revert-17747-bratseth/spe...  Jon Bratseth  2021-05-05  6  -13/+245
* Revert "Reapply "Bratseth/special tokens""  Jon Bratseth  2021-05-05  6  -245/+13
* Revert "Merge pull request #17746 from vespa-engine/revert-17738-revert-17737...  Jon Bratseth  2021-05-05  6  -13/+245