path: root/linguistics/src/main/java/com/yahoo/language/process
Commit message | Author | Age | Files | Lines
* Update linguistics/src/main/java/com/yahoo/language/process/Segmenter.java | Jon Bratseth | 2024-05-15 | 1 | -1/+1
* Improve javadoc | Jon Bratseth | 2024-05-14 | 1 | -9/+10
* Key by embedder id and don't recompute inputs | Jon Bratseth | 2024-04-07 | 1 | -7/+8
* Add equivalent to `Map.computeIfAbsent()` to simplify typical usage of the cache | Bjørn Christian Seime | 2024-04-04 | 1 | -1/+7
* Expose cache to embedders | Jon Bratseth | 2024-04-01 | 1 | -0/+23
* Pass context when resolving properties | Jon Bratseth | 2024-02-15 | 1 | -9/+0
* ChainedMap can't be copied | Jon Bratseth | 2024-01-20 | 1 | -1/+1
* Revert "Merge pull request #29905 from vespa-engine/revert-29884-bratseth/par... | Jon Bratseth | 2024-01-20 | 1 | -0/+10
* Revert "Support parameter references in embed" | Henning Baldersheim | 2024-01-15 | 1 | -10/+0
* Support parameter references in embed | Jon Bratseth | 2024-01-12 | 1 | -0/+10
* Revert "Merge pull request #29328 from vespa-engine/revert-29314-bratseth/cas... | Jon Bratseth | 2023-11-14 | 1 | -2/+1
* Revert "Bratseth/casing take 2" | Harald Musum | 2023-11-13 | 1 | -1/+2
* Revert "Revert "Don't lowercase linguistics annotations"" | Jon Bratseth | 2023-11-09 | 1 | -2/+1
* Revert "Don't lowercase linguistics annotations" | Jon Bratseth | 2023-11-09 | 1 | -1/+2
* Don't lowercase linguistics annotations | Jon Bratseth | 2023-11-09 | 1 | -2/+1
* Update copyright | Jon Bratseth | 2023-10-09 | 19 | -19/+19
* Allow sampling of fractional millis | Bjørn Christian Seime | 2023-08-25 | 1 | -3/+2
* Add generic metrics for embedders | Bjørn Christian Seime | 2023-08-04 | 1 | -0/+37
* Don't remove indexable symbols when stemming | Jon Bratseth | 2023-06-02 | 1 | -4/+5
* Always treat each symbol as a separate token | Jon Bratseth | 2023-05-22 | 2 | -17/+31
* Threat 'other symbols' as letters | Jon Bratseth | 2023-05-22 | 1 | -2/+2
* Add decoding of sentencepiece token sequence to text | Lester Solbakken | 2023-02-10 | 1 | -0/+11
* Compute code points in whole string only when needed | jonmv | 2022-12-06 | 1 | -5/+3
* Remove on Vespa 8 | Jon Bratseth | 2022-06-08 | 1 | -8/+0
* Resolve rank profile inputs | Jon Bratseth | 2022-04-21 | 1 | -1/+1
* Rename defaultEmbedderName to defaultEmbedderId | Lester Solbakken | 2022-03-22 | 1 | -2/+2
* Add convenience function to represent embedder as map | Lester Solbakken | 2022-03-21 | 1 | -3/+26
* Add a BERT embedder | Jon Bratseth | 2021-12-16 | 1 | -2/+3
* Update Verizon Media copyright notices. | gjoranv | 2021-10-07 | 2 | -2/+2
* Update 2017 copyright notices. | gjoranv | 2021-10-07 | 16 | -16/+16
* Encapsulate in a context | Jon Bratseth | 2021-10-01 | 1 | -12/+46
* Pass destination | Jon Bratseth | 2021-09-30 | 1 | -4/+10
* encode -> embed | Jon Bratseth | 2021-09-28 | 2 | -56/+56
* Linguistics cleanup | Jon Bratseth | 2021-09-21 | 6 | -14/+12
* Add 'encode' expression | Jon Bratseth | 2021-09-19 | 1 | -0/+17
* Encoder interface | Jon Bratseth | 2021-09-16 | 2 | -2/+41
* Require replacements to be applied during tokenization | Jon Bratseth | 2021-06-15 | 2 | -12/+7
* Revert "Merge pull request #17754 from vespa-engine/revert-17747-bratseth/spe... | Jon Bratseth | 2021-05-05 | 3 | -1/+214
* Revert "Reapply "Bratseth/special tokens"" | Jon Bratseth | 2021-05-05 | 3 | -214/+1
* Revert "Merge pull request #17746 from vespa-engine/revert-17738-revert-17737... | Jon Bratseth | 2021-05-05 | 3 | -1/+214
* Revert "Revert "Revert "Bratseth/special tokens""" | Jon Bratseth | 2021-05-05 | 3 | -214/+1
* Revert "Revert "Bratseth/special tokens"" | Jon Bratseth | 2021-05-04 | 3 | -1/+214
* Revert "Bratseth/special tokens" | Jon Bratseth | 2021-05-04 | 3 | -214/+1
* Expose tokens as map | Jon Bratseth | 2021-05-04 | 2 | -6/+13
* Wire in (but don't use) SpecialTokens | Jon Bratseth | 2021-05-04 | 3 | -5/+5
* Move specialtokens to linguistics | Jon Bratseth | 2021-05-04 | 2 | -0/+206
* No functional changes | Jon Bratseth | 2021-02-03 | 1 | -1/+1
* handle plugin tokenizer returning tokens with empty original string | Arne Juul | 2020-08-24 | 1 | -1/+4
* Surrogate aware gram splitting | Jon Bratseth | 2020-06-25 | 1 | -24/+85
* SpareCapacityMaintainer sketch | Jon Bratseth | 2020-06-12 | 5 | -66/+34