path: root/linguistics-components/src/main
Commit message | Author | Age | Files | Lines
* Construct array right away instead of going via a single element list and the... | Henning Baldersheim | 2024-01-18 | 1 | -2/+3
* Update copyright | Jon Bratseth | 2023-10-09 | 18 | -21/+21
* HuggingFace Tokenizer expects path to be a directory | Bjørn Christian Seime | 2023-08-31 | 1 | -2/+18
* Prefer truncation configuration from tokenizer model | Bjørn Christian Seime | 2023-06-12 | 2 | -13/+104
* Test padding with truncation | Bjørn Christian Seime | 2023-06-08 | 1 | -1/+1
* Disable padding and make it configurable | Bjørn Christian Seime | 2023-06-08 | 1 | -3/+7
* Introduce services.xml syntax for configuring HuggingFace embedders | Bjørn Christian Seime | 2023-06-02 | 2 | -13/+1
* Make truncation and max length configurable | Bjørn Christian Seime | 2023-05-26 | 2 | -5/+17
* Implement deconstruct | Bjørn Christian Seime | 2023-05-16 | 1 | -0/+1
* Change parameter type to 'model' | Bjørn Christian Seime | 2023-05-12 | 1 | -1/+1
* Revert "Revert "Bjorncs/huggingface tokenizer"" | Bjørn Christian Seime | 2023-05-12 | 4 | -0/+183
* Revert "Bjorncs/huggingface tokenizer" | Arnstein Ressem | 2023-05-12 | 4 | -183/+0
* Disable special tokens by default | Bjørn Christian Seime | 2023-05-11 | 2 | -12/+9
* Mark HF integration as beta | Bjørn Christian Seime | 2023-05-11 | 2 | -0/+5
* Make HF tokenizer a separate embedder | Bjørn Christian Seime | 2023-05-11 | 4 | -0/+181
* Add skipping of control tokens | Lester Solbakken | 2023-02-10 | 2 | -6/+22
* Add decoding of sentencepiece token sequence to text | Lester Solbakken | 2023-02-10 | 3 | -2/+24
* Revert "Revert collect(Collectors.toList())" | Henning Baldersheim | 2022-12-04 | 1 | -1/+1
* Revert collect(Collectors.toList()) | Henning Baldersheim | 2022-12-04 | 1 | -1/+1
* collect(Collectors.toList()) -> toList() | Henning Baldersheim | 2022-12-02 | 1 | -1/+1
* Use '@Inject' from 'annotations' in multiple bundles | Bjørn Christian Seime | 2022-05-06 | 2 | -2/+2
* BERT -> WordPiece, make subword prefix configurable | Jon Bratseth | 2021-12-17 | 5 | -33/+65
* Add a BERT embedder | Jon Bratseth | 2021-12-16 | 6 | -37/+305
* Add custom `@Beta` annotation | Bjørn Christian Seime | 2021-12-03 | 1 | -1/+1
* Correct copyright headers | Jon Bratseth | 2021-10-20 | 1 | -1/+0
* Add missiung copyrights | Jon Bratseth | 2021-10-20 | 1 | -0/+1
* Encapsulate in a context | Jon Bratseth | 2021-10-01 | 1 | -9/+7
* Update linguisticvs-components | Jon Bratseth | 2021-09-30 | 1 | -3/+9
* encode -> embed | Jon Bratseth | 2021-09-28 | 2 | -16/+15
* Use full filename | Jon Bratseth | 2021-09-27 | 1 | -0/+0
* Separate component from linguistics | Jon Bratseth | 2021-09-25 | 10 | -0/+818