path: root/linguistics
Commit message (Author, Age, Files, Lines)
* Fix CR comments (MariusArhaug, 13 days, 7, -46/+34)
* Update significance model field and logic from architect meeting (MariusArhaug, 2024-04-24, 11, -117/+246)
* Merge pull request #30871 from vespa-engine/marius/add-significance-searcher (Marius Arhaug, 2024-04-24, 4, -11/+18)
|\
| * update abi-spec (MariusArhaug, 2024-04-16, 1, -1/+1)
| * fix cr failures (MariusArhaug, 2024-04-16, 3, -10/+17)
* | Replace all usages of Arrays.asList with List.of where possible. (Henning Baldersheim, 2024-04-12, 6, -30/+25)
* | Merge pull request #30809 from vespa-engine/jobergum/add-context-caching (Jo Kristian Bergum, 2024-04-10, 2, -9/+17)
|\ \
| |/
|/|
| * Key by embedder id and don't recompute inputs (Jon Bratseth, 2024-04-07, 2, -10/+11)
| * Add equivalent to `Map.computeIfAbsent()` to simplify typical usage of the cache (Bjørn Christian Seime, 2024-04-04, 2, -2/+9)
* | Merge pull request #30816 from vespa-engine/marius/add-significance-model-reg... (Marius Arhaug, 2024-04-09, 12, -1/+385)
|\ \
| * | add missing beta annotation (MariusArhaug, 2024-04-09, 1, -0/+4)
| * | add illegal arg exception for languages not registered (MariusArhaug, 2024-04-09, 2, -1/+8)
| * | fix cr failures (MariusArhaug, 2024-04-09, 12, -52/+104)
| * | add significance model registry to linguistics (MariusArhaug, 2024-04-04, 10, -1/+322)
| |/
* | add comment for intention in determineScript function (MariusArhaug, 2024-04-04, 1, -0/+1)
* | Add SimpleTokenScript to SimpleTokenizer (MariusArhaug, 2024-04-03, 4, -1/+124)
|/
* Expose cache to embedders (Jon Bratseth, 2024-04-01, 2, -1/+27)
* Update ABI spec (Jon Bratseth, 2024-02-16, 1, -3/+1)
* Pass context when resolving properties (Jon Bratseth, 2024-02-15, 1, -9/+0)
* ChainedMap can't be copied (Jon Bratseth, 2024-01-20, 1, -1/+1)
* Revert "Merge pull request #29905 from vespa-engine/revert-29884-bratseth/par... (Jon Bratseth, 2024-01-20, 2, -1/+13)
* Revert "Support parameter references in embed" (Henning Baldersheim, 2024-01-15, 2, -13/+1)
* Support parameter references in embed (Jon Bratseth, 2024-01-12, 2, -1/+13)
* Revert "Merge pull request #29328 from vespa-engine/revert-29314-bratseth/cas... (Jon Bratseth, 2023-11-14, 4, -13/+30)
* Revert "Bratseth/casing take 2" (Harald Musum, 2023-11-13, 4, -30/+13)
* Prefer first stem to original if non equal (Jon Bratseth, 2023-11-10, 2, -11/+28)
* Revert "Revert "Don't lowercase linguistics annotations"" (Jon Bratseth, 2023-11-09, 2, -2/+2)
* Revert "Don't lowercase linguistics annotations" (Jon Bratseth, 2023-11-09, 2, -2/+2)
* Don't lowercase linguistics annotations (Jon Bratseth, 2023-11-09, 2, -2/+2)
* Avoid cutting surrogate pairs when tokenising (jonmv, 2023-10-20, 1, -1/+1)
* Update copyright (Jon Bratseth, 2023-10-09, 73, -73/+73)
* Use Guice 6.0 (Bjørn Christian Seime, 2023-09-04, 1, -1/+1)
* Allow sampling of fractional millis (Bjørn Christian Seime, 2023-08-25, 2, -4/+3)
* Add generic metrics for embedders (Bjørn Christian Seime, 2023-08-04, 2, -1/+56)
* Add necessary options to use failOnWarnings (gjoranv, 2023-06-05, 1, -0/+1)
* Don't remove indexable symbols when stemming (Jon Bratseth, 2023-06-02, 5, -8/+17)
* Add bundle type to all CORE bundles. (gjoranv, 2023-05-25, 1, -0/+3)
* Update ABI spec (Jon Bratseth, 2023-05-22, 1, -0/+1)
* Always treat each symbol as a separate token (Jon Bratseth, 2023-05-22, 4, -20/+56)
* Threat 'other symbols' as letters (Jon Bratseth, 2023-05-22, 2, -2/+10)
* Use dollar and hour base units (Jon Bratseth, 2023-05-19, 1, -2/+2)
* Use metric enums everywhere (Jon Bratseth, 2023-03-06, 1, -1/+1)
* Add abi spec (Lester Solbakken, 2023-02-10, 1, -0/+1)
* Add decoding of sentencepiece token sequence to text (Lester Solbakken, 2023-02-10, 1, -0/+11)
* Compute code points in whole string only when needed (jonmv, 2022-12-06, 2, -6/+17)
* Split out opennlp-linguistics (Henning Baldersheim, 2022-11-26, 14, -783/+0)
* Update ABI spec format, and update all specs (jonmv, 2022-10-25, 1, -198/+198)
* much simpler CharSequenceNormalizer (Arne Juul, 2022-10-06, 3, -9/+100)
* Merge pull request #24007 from vespa-engine/bratseth/cleanup-082 (Jon Bratseth, 2022-09-25, 2, -13/+11)
|\
| * No functional changes (Jon Bratseth, 2022-09-11, 2, -13/+11)