path: root/linguistics
Commit message | Author | Age | Files | Lines
* Merge pull request #31194 from vespa-engine/bratseth/stemming-trace | Jon Bratseth | 2024-05-16 | 1 | -1/+1
|\
| * Trace no stemming due to language=UNKNOWN | Jon Bratseth | 2024-05-12 | 1 | -1/+1
* | Merge pull request #31098 from vespa-engine/marius/add-significance-model-tool | Bjørn Christian Seime | 2024-05-15 | 2 | -4/+7
|\ \
| * | Add significance model generator cli | MariusArhaug | 2024-05-14 | 2 | -4/+7
| |/
* | Update linguistics/src/main/java/com/yahoo/language/process/Segmenter.java | Jon Bratseth | 2024-05-15 | 1 | -1/+1
* | Improve javadoc | Jon Bratseth | 2024-05-14 | 1 | -9/+10
|/
* Fix CR comments | MariusArhaug | 2024-04-30 | 7 | -46/+34
* Update significance model field and logic from architect meeting | MariusArhaug | 2024-04-24 | 11 | -117/+246
* Merge pull request #30871 from vespa-engine/marius/add-significance-searcher | Marius Arhaug | 2024-04-24 | 4 | -11/+18
|\
| * update abi-spec | MariusArhaug | 2024-04-16 | 1 | -1/+1
| * fix cr failures | MariusArhaug | 2024-04-16 | 3 | -10/+17
* | Replace all usages of Arrays.asList with List.of where possible. | Henning Baldersheim | 2024-04-12 | 6 | -30/+25
* | Merge pull request #30809 from vespa-engine/jobergum/add-context-caching | Jo Kristian Bergum | 2024-04-10 | 2 | -9/+17
|\ \
| |/
|/|
| * Key by embedder id and don't recompute inputs | Jon Bratseth | 2024-04-07 | 2 | -10/+11
| * Add equivalent to `Map.computeIfAbsent()` to simplify typical usage of the cache | Bjørn Christian Seime | 2024-04-04 | 2 | -2/+9
* | Merge pull request #30816 from vespa-engine/marius/add-significance-model-reg... | Marius Arhaug | 2024-04-09 | 12 | -1/+385
|\ \
| * | add missing beta annotation | MariusArhaug | 2024-04-09 | 1 | -0/+4
| * | add illegal arg exception for languages not registered | MariusArhaug | 2024-04-09 | 2 | -1/+8
| * | fix cr failures | MariusArhaug | 2024-04-09 | 12 | -52/+104
| * | add significance model registry to linguistics | MariusArhaug | 2024-04-04 | 10 | -1/+322
| |/
* | add comment for intention in determineScript function | MariusArhaug | 2024-04-04 | 1 | -0/+1
* | Add SimpleTokenScript to SimpleTokenizer | MariusArhaug | 2024-04-03 | 4 | -1/+124
|/
* Expose cache to embedders | Jon Bratseth | 2024-04-01 | 2 | -1/+27
* Update ABI spec | Jon Bratseth | 2024-02-16 | 1 | -3/+1
* Pass context when resolving properties | Jon Bratseth | 2024-02-15 | 1 | -9/+0
* ChainedMap can't be copied | Jon Bratseth | 2024-01-20 | 1 | -1/+1
* Revert "Merge pull request #29905 from vespa-engine/revert-29884-bratseth/par... | Jon Bratseth | 2024-01-20 | 2 | -1/+13
* Revert "Support parameter references in embed" | Henning Baldersheim | 2024-01-15 | 2 | -13/+1
* Support parameter references in embed | Jon Bratseth | 2024-01-12 | 2 | -1/+13
* Revert "Merge pull request #29328 from vespa-engine/revert-29314-bratseth/cas... | Jon Bratseth | 2023-11-14 | 4 | -13/+30
* Revert "Bratseth/casing take 2" | Harald Musum | 2023-11-13 | 4 | -30/+13
* Prefer first stem to original if non equal | Jon Bratseth | 2023-11-10 | 2 | -11/+28
* Revert "Revert "Don't lowercase linguistics annotations"" | Jon Bratseth | 2023-11-09 | 2 | -2/+2
* Revert "Don't lowercase linguistics annotations" | Jon Bratseth | 2023-11-09 | 2 | -2/+2
* Don't lowercase linguistics annotations | Jon Bratseth | 2023-11-09 | 2 | -2/+2
* Avoid cutting surrogate pairs when tokenising | jonmv | 2023-10-20 | 1 | -1/+1
* Update copyright | Jon Bratseth | 2023-10-09 | 73 | -73/+73
* Use Guice 6.0 | Bjørn Christian Seime | 2023-09-04 | 1 | -1/+1
* Allow sampling of fractional millis | Bjørn Christian Seime | 2023-08-25 | 2 | -4/+3
* Add generic metrics for embedders | Bjørn Christian Seime | 2023-08-04 | 2 | -1/+56
* Add necessary options to use failOnWarnings | gjoranv | 2023-06-05 | 1 | -0/+1
* Don't remove indexable symbols when stemming | Jon Bratseth | 2023-06-02 | 5 | -8/+17
* Add bundle type to all CORE bundles. | gjoranv | 2023-05-25 | 1 | -0/+3
* Update ABI spec | Jon Bratseth | 2023-05-22 | 1 | -0/+1
* Always treat each symbol as a separate token | Jon Bratseth | 2023-05-22 | 4 | -20/+56
* Threat 'other symbols' as letters | Jon Bratseth | 2023-05-22 | 2 | -2/+10
* Use dollar and hour base units | Jon Bratseth | 2023-05-19 | 1 | -2/+2
* Use metric enums everywhere | Jon Bratseth | 2023-03-06 | 1 | -1/+1
* Add abi spec | Lester Solbakken | 2023-02-10 | 1 | -0/+1
* Add decoding of sentencepiece token sequence to text | Lester Solbakken | 2023-02-10 | 1 | -0/+11