path: root/linguistics
Commit message (Author, Age, Files, Lines)
* Encapsulate in a context (Jon Bratseth, 2021-10-01, 2, -16/+65)
* Pass destination (Jon Bratseth, 2021-09-30, 2, -8/+14)
* encode -> embed (Jon Bratseth, 2021-09-28, 3, -64/+64)
* Separate component from linguistics (Jon Bratseth, 2021-09-25, 17, -1206/+0)
* Linguistics cleanup (Jon Bratseth, 2021-09-21, 18, -45/+29)
* Add 'encode' expression (Jon Bratseth, 2021-09-19, 2, -1/+35)
* Provide a (non-working) encoder by default (Jon Bratseth, 2021-09-17, 1, -1/+1)
* Update ABI spec (Jon Bratseth, 2021-09-17, 1, -20/+46)
* Cleanup (Jon Bratseth, 2021-09-17, 5, -9/+2)
* Refactor to separate classes (Jon Bratseth, 2021-09-17, 8, -203/+279)
* Encoder interface (Jon Bratseth, 2021-09-16, 3, -4/+55)
* Encode to sparse tensor (Jon Bratseth, 2021-09-16, 3, -0/+17)
* Encode to dense tensor (Jon Bratseth, 2021-09-16, 3, -4/+36)
* Use a result builder (Jon Bratseth, 2021-09-16, 1, -21/+53)
* Make SentencePieceEncoder configurable (Jon Bratseth, 2021-09-16, 6, -56/+283)
* Merge pull request #19130 from vespa-engine/bratseth/sp-export (Jo Kristian Bergum, 2021-09-14, 2, -0/+79)
|\
| * Add to abi spec (Jon Bratseth, 2021-09-14, 1, -0/+72)
| * Make public (Jon Bratseth, 2021-09-14, 1, -0/+7)
* | Merge pull request #19131 from vespa-engine/bratseth/sp-simplify (Jon Bratseth, 2021-09-14, 1, -13/+7)
|\ \
| * | Slight algorithm simplification (Jon Bratseth, 2021-09-14, 1, -6/+4)
| * | Slight algorithm simplification (Jon Bratseth, 2021-09-14, 1, -6/+3)
| * | Slight algorithm simplification (Jon Bratseth, 2021-09-14, 1, -11/+10)
| |/
* / More unit tests (Jon Bratseth, 2021-09-14, 1, -1/+20)
|/
* Pure Java sentencepiece implementation (Jon Bratseth, 2021-09-13, 7, -2/+731)
* we want to compare Linguistics objects for equivalence (Arne Juul, 2021-08-04, 4, -1/+9)
* Require replacements to be applied during tokenization (Jon Bratseth, 2021-06-15, 3, -12/+11)
* Revert "Merge pull request #17754 from vespa-engine/revert-17747-bratseth/spe... (Jon Bratseth, 2021-05-05, 8, -13/+336)
* Revert "Reapply "Bratseth/special tokens"" (Jon Bratseth, 2021-05-05, 8, -336/+13)
* Revert "Merge pull request #17746 from vespa-engine/revert-17738-revert-17737... (Jon Bratseth, 2021-05-05, 8, -13/+336)
* Revert "Revert "Revert "Bratseth/special tokens""" (Jon Bratseth, 2021-05-05, 8, -336/+13)
* Revert "Revert "Bratseth/special tokens"" (Jon Bratseth, 2021-05-04, 8, -13/+336)
* Revert "Bratseth/special tokens" (Jon Bratseth, 2021-05-04, 8, -336/+13)
* Avoid config in simple tokenizer (Jon Bratseth, 2021-05-04, 1, -7/+4)
* Expose tokens as map (Jon Bratseth, 2021-05-04, 4, -12/+17)
* Wire in (but don't use) SpecialTokens (Jon Bratseth, 2021-05-04, 6, -18/+40)
* Move specialtokens to linguistics (Jon Bratseth, 2021-05-04, 4, -0/+299)
* No functional changes (Jon Bratseth, 2021-04-14, 1, -37/+26)
* No functional changes (Jon Bratseth, 2021-04-14, 13, -128/+84)
* No functional changes (Jon Bratseth, 2021-02-03, 2, -1/+20)
* Add a test (Jon Bratseth, 2020-11-12, 1, -3/+0)
* Allow no argument to install_config_definitions (Harald Musum, 2020-09-12, 1, -1/+1)
* Use full name in config definition file names (Harald Musum, 2020-09-10, 2, -1/+1)
* handle plugin tokenizer returning tokens with empty original string (Arne Juul, 2020-08-24, 2, -1/+55)
* Minor unification of tests. (Henning Baldersheim, 2020-08-12, 3, -25/+36)
* Update ABI spec (Jon Bratseth, 2020-06-26, 1, -1/+2)
* Surrogate aware gram splitting (Jon Bratseth, 2020-06-25, 2, -33/+122)
* SpareCapacityMaintainer sketch (Jon Bratseth, 2020-06-12, 6, -66/+35)
* variables in lambdas must be final (Arne Juul, 2020-04-24, 2, -10/+16)
* Apply suggestions from code review (Arne H Juul, 2020-04-24, 3, -7/+7)
* add more tracing and debug logging of stemming (Arne Juul, 2020-04-24, 4, -1/+25)