path: root/linguistics/src/main/java/com/yahoo/language
Commit message    Author    Date    Files    Lines
* Separate component from linguistics    Jon Bratseth    2021-09-25    8    -490/+0
* Linguistics cleanup    Jon Bratseth    2021-09-21    17    -34/+29
* Add 'encode' expression    Jon Bratseth    2021-09-19    1    -0/+17
* Provide a (non-working) encoder by default    Jon Bratseth    2021-09-17    1    -1/+1
* Cleanup    Jon Bratseth    2021-09-17    5    -9/+2
* Refactor to separate classes    Jon Bratseth    2021-09-17    7    -201/+278
* Encoder interface    Jon Bratseth    2021-09-16    3    -4/+55
* Encode to sparse tensor    Jon Bratseth    2021-09-16    1    -0/+10
* Encode to dense tensor    Jon Bratseth    2021-09-16    1    -4/+13
* Use a result builder    Jon Bratseth    2021-09-16    1    -21/+53
* Make SentencePieceEncoder configurable    Jon Bratseth    2021-09-16    1    -6/+30
* Merge pull request #19130 from vespa-engine/bratseth/sp-export    Jo Kristian Bergum    2021-09-14    1    -0/+7
|\
| * Make public    Jon Bratseth    2021-09-14    1    -0/+7
* | Slight algorithm simplification    Jon Bratseth    2021-09-14    1    -6/+4
* | Slight algorithm simplification    Jon Bratseth    2021-09-14    1    -6/+3
* | Slight algorithm simplification    Jon Bratseth    2021-09-14    1    -11/+10
|/
* Pure Java sentencepiece implementation    Jon Bratseth    2021-09-13    2    -2/+335
* we want to compare Linguistics objects for equivalence    Arne Juul    2021-08-04    3    -0/+7
* Require replacements to be applied during tokenization    Jon Bratseth    2021-06-15    3    -12/+11
* Revert "Merge pull request #17754 from vespa-engine/revert-17747-bratseth/spe...    Jon Bratseth    2021-05-05    6    -13/+245
* Revert "Reapply "Bratseth/special tokens""    Jon Bratseth    2021-05-05    6    -245/+13
* Revert "Merge pull request #17746 from vespa-engine/revert-17738-revert-17737...    Jon Bratseth    2021-05-05    6    -13/+245
* Revert "Revert "Revert "Bratseth/special tokens"""    Jon Bratseth    2021-05-05    6    -245/+13
* Revert "Revert "Bratseth/special tokens""    Jon Bratseth    2021-05-04    6    -13/+245
* Revert "Bratseth/special tokens"    Jon Bratseth    2021-05-04    6    -245/+13
* Avoid config in simple tokenizer    Jon Bratseth    2021-05-04    1    -7/+4
* Expose tokens as map    Jon Bratseth    2021-05-04    2    -6/+13
* Wire in (but don't use) SpecialTokens    Jon Bratseth    2021-05-04    6    -18/+40
* Move specialtokens to linguistics    Jon Bratseth    2021-05-04    2    -0/+206
* No functional changes    Jon Bratseth    2021-04-14    5    -109/+68
* No functional changes    Jon Bratseth    2021-02-03    1    -1/+1
* Add a test    Jon Bratseth    2020-11-12    1    -3/+0
* handle plugin tokenizer returning tokens with empty original string    Arne Juul    2020-08-24    1    -1/+4
* Surrogate aware gram splitting    Jon Bratseth    2020-06-25    1    -24/+85
* SpareCapacityMaintainer sketch    Jon Bratseth    2020-06-12    6    -66/+35
* variables in lambdas must be final    Arne Juul    2020-04-24    2    -10/+16
* Apply suggestions from code review    Arne H Juul    2020-04-24    3    -7/+7
* add more tracing and debug logging of stemming    Arne Juul    2020-04-24    4    -1/+25
* Add/corect copyright headers    Jon Bratseth    2020-01-03    1    -0/+1
* Build tensors purely with floats    Jon Bratseth    2019-04-26    1    -1/+1
* Move bound builder double array into double subclass    Jon Bratseth    2019-04-26    1    -1/+1
* Allow destructive changes in manually deployed zones    Jon Bratseth    2019-04-01    1    -1/+1
* Nonfunctional changes only    Jon Bratseth    2019-01-24    1    -1/+1
* Generate html5 javadoc    gjoranv    2019-01-21    1    -7/+7
* Remove deprecated method (again)    Jon Bratseth    2019-01-21    2    -16/+0
* Make SimpleLinguistics simple again    Jon Bratseth    2019-01-21    3    -97/+32
* Remove deprecated apis in linguistics.    gjoranv    2019-01-21    3    -41/+0
* Deprecated methods and add OptimaizeDetector    Jon Bratseth    2018-11-01    4    -0/+118
* Prepare for removal of deprecated members    Jon Bratseth    2018-10-16    3    -4/+8
* Reduce code duplication    Henning Baldersheim    2018-10-05    2    -15/+14