path: root/linguistics/src
Commit message    Author    Age    Files    Lines
* Stem by linguistics in rule bases    Jon Bratseth    2022-01-10    1    -3/+20
|   Also add a @language directive to stem in other languages than english.
* annotate intentional switch fallthrough    Arne H Juul    2022-01-06    1    -0/+1
|
* Specify how the class is actually loaded    Jon Marius Venstad    2021-12-21    1    -1/+1
|
* Provide array of correct size.    Jon Marius Venstad    2021-12-20    1    -1/+1
|
* Override ngram creation with something less silly    Jon Marius Venstad    2021-12-20    2    -1/+32
|
* Use smaller chunks for faster detection    Jon Marius Venstad    2021-12-20    1    -2/+2
|
* Expand test case for language detection    Jon Marius Venstad    2021-12-20    1    -3/+28
|
* Upper bound on input size, and use opennlp before simple detector    Jon Marius Venstad    2021-12-20    1    -6/+3
|
* Avoid putting nulls in languange map    Jon Marius Venstad    2021-12-20    1    -2/+5
|
* Revert "Merge pull request #20578 from vespa-engine/revert-20568-jonmv/replace-optimaize-with-lingua"    Jon Marius Venstad    2021-12-20    12    -172/+245
|   This reverts commit 5476504932cd90eb2dad82dbab633e3ffa2034c3, reversing changes made to 235a78cc4707f78d18c6818a577de1b7507f5e40.
* Revert "Replace optimaize with OpenNLP language detector [run-systemtest]"    Jon Marius Venstad    2021-12-18    12    -245/+172
|
* Re-add files    Jon Marius Venstad    2021-12-18    5    -0/+142
|
* Move model to module where it is needed, to simplify, at the cost of larger bundles    Jon Marius Venstad    2021-12-18    3    -22/+21
|
* Replace UrlcharSequenceNormalizer with one with an improved regex    Jon Marius Venstad    2021-12-17    1    -6/+0
|
* Add some javadoc, and no need to handle null return for model    Jon Marius Venstad    2021-12-17    2    -2/+4
|
* Replace optimaize with OpenNLP language detector    Jon Marius Venstad    2021-12-17    7    -166/+102
|
* Add a BERT embedder    Jon Bratseth    2021-12-16    1    -2/+3
|
* Time out requests after 200s    Jon Marius Venstad    2021-12-13    1    -1/+0
|
* Update 2020 Oath copyrights.    gjoranv    2021-10-27    2    -2/+2
|
* Update Verizon Media copyright notices.    gjoranv    2021-10-07    3    -3/+3
|
* Update 2018 copyright notices.    gjoranv    2021-10-07    3    -3/+3
|
* Update 2017 copyright notices.    gjoranv    2021-10-07    69    -69/+69
|
* Encapsulate in a context    Jon Bratseth    2021-10-01    1    -12/+46
|
* Pass destination    Jon Bratseth    2021-09-30    1    -4/+10
|   This allows embedders to switch on it to enable bucket testing and similar.
* encode -> embed    Jon Bratseth    2021-09-28    2    -56/+56
|
* Separate component from linguistics    Jon Bratseth    2021-09-25    15    -1015/+0
|
* Linguistics cleanup    Jon Bratseth    2021-09-21    17    -34/+29
|
* Add 'encode' expression    Jon Bratseth    2021-09-19    1    -0/+17
|
* Provide a (non-working) encoder by default    Jon Bratseth    2021-09-17    1    -1/+1
|
* Cleanup    Jon Bratseth    2021-09-17    5    -9/+2
|
* Refactor to separate classes    Jon Bratseth    2021-09-17    8    -203/+279
|
* Encoder interface    Jon Bratseth    2021-09-16    3    -4/+55
|
* Encode to sparse tensor    Jon Bratseth    2021-09-16    2    -0/+16
|
* Encode to dense tensor    Jon Bratseth    2021-09-16    3    -4/+36
|
* Use a result builder    Jon Bratseth    2021-09-16    1    -21/+53
|
* Make SentencePieceEncoder configurable    Jon Bratseth    2021-09-16    5    -36/+150
|
* Merge pull request #19130 from vespa-engine/bratseth/sp-export    Jo Kristian Bergum    2021-09-14    1    -0/+7
|\    Make public
| * Make public    Jon Bratseth    2021-09-14    1    -0/+7
| |
* | Merge pull request #19131 from vespa-engine/bratseth/sp-simplify    Jon Bratseth    2021-09-14    1    -13/+7
|\ \    Slight algorithm simplification
| * | Slight algorithm simplification    Jon Bratseth    2021-09-14    1    -6/+4
| | |
| * | Slight algorithm simplification    Jon Bratseth    2021-09-14    1    -6/+3
| | |
| * | Slight algorithm simplification    Jon Bratseth    2021-09-14    1    -11/+10
| |/
* / More unit tests    Jon Bratseth    2021-09-14    1    -1/+20
|/
* Pure Java sentencepiece implementation    Jon Bratseth    2021-09-13    6    -2/+723
|
* we want to compare Linguistics objects for equivalence    Arne Juul    2021-08-04    3    -0/+7
|
* Require replacements to be applied during tokenization    Jon Bratseth    2021-06-15    3    -12/+11
|
* Revert "Merge pull request #17754 from vespa-engine/revert-17747-bratseth/special-tokens-take-2"    Jon Bratseth    2021-05-05    7    -13/+285
|   This reverts commit a2c9cd4bc04f1a3eaa31524b3970b96be5c2eda9, reversing changes made to 8c61a373af0066fbdf1cca354c24b197c7347321.
* Revert "Reapply "Bratseth/special tokens""    Jon Bratseth    2021-05-05    7    -285/+13
|
* Revert "Merge pull request #17746 from vespa-engine/revert-17738-revert-17737-revert-17736-bratseth/special-tokens"    Jon Bratseth    2021-05-05    7    -13/+285
|   This reverts commit 491856b396d003885e159345fe3f533f0fa35933, reversing changes made to 3720186303f4aef1d185525eaf61092097a64ec9.
* Revert "Revert "Revert "Bratseth/special tokens"""    Jon Bratseth    2021-05-05    7    -285/+13
|