path: root/linguistics-components/src
Commit message | Author | Age | Files | Lines
* Construct array right away instead of going via a single element list and the... | Henning Baldersheim | 2024-01-18 | 1 | -2/+3
* Update copyright | Jon Bratseth | 2023-10-09 | 23 | -27/+27
* HuggingFace Tokenizer expects path to be a directory | Bjørn Christian Seime | 2023-08-31 | 1 | -2/+18
* Prefer truncation configuration from tokenizer model | Bjørn Christian Seime | 2023-06-12 | 3 | -14/+112
* Test padding with truncation | Bjørn Christian Seime | 2023-06-08 | 2 | -3/+4
* Verify presence of special token | Bjørn Christian Seime | 2023-06-08 | 1 | -2/+7
* Disable padding and make it configurable | Bjørn Christian Seime | 2023-06-08 | 2 | -12/+30
* Introduce services.xml syntax for configuring HuggingFace embedders | Bjørn Christian Seime | 2023-06-02 | 2 | -13/+1
* Make truncation and max length configurable | Bjørn Christian Seime | 2023-05-26 | 3 | -7/+45
* Implement deconstruct | Bjørn Christian Seime | 2023-05-16 | 1 | -0/+1
* Change parameter type to 'model' | Bjørn Christian Seime | 2023-05-12 | 1 | -1/+1
* Revert "Revert "Bjorncs/huggingface tokenizer"" | Bjørn Christian Seime | 2023-05-12 | 7 | -0/+271
* Revert "Bjorncs/huggingface tokenizer" | Arnstein Ressem | 2023-05-12 | 7 | -271/+0
* Disable special tokens by default | Bjørn Christian Seime | 2023-05-11 | 3 | -12/+10
* Mark HF integration as beta | Bjørn Christian Seime | 2023-05-11 | 2 | -0/+5
* Make HF tokenizer a separate embedder | Bjørn Christian Seime | 2023-05-11 | 7 | -0/+268
* Add skipping of control tokens | Lester Solbakken | 2023-02-10 | 4 | -7/+34
* Add decoding of sentencepiece token sequence to text | Lester Solbakken | 2023-02-10 | 5 | -2/+40
* Revert "Revert collect(Collectors.toList())" | Henning Baldersheim | 2022-12-04 | 1 | -1/+1
* Revert collect(Collectors.toList()) | Henning Baldersheim | 2022-12-04 | 1 | -1/+1
* collect(Collectors.toList()) -> toList() | Henning Baldersheim | 2022-12-02 | 1 | -1/+1
* Use '@Inject' from 'annotations' in multiple bundles | Bjørn Christian Seime | 2022-05-06 | 2 | -2/+2
* Test segmentation with subwords | Jon Bratseth | 2021-12-17 | 2 | -4/+13
* BERT -> WordPiece, make subword prefix configurable | Jon Bratseth | 2021-12-17 | 12 | -162/+184
* Add a BERT embedder | Jon Bratseth | 2021-12-16 | 9 | -43/+30881
* Add custom `@Beta` annotation | Bjørn Christian Seime | 2021-12-03 | 1 | -1/+1
* Correct copyright headers | Jon Bratseth | 2021-10-20 | 1 | -1/+0
* Add missiung copyrights | Jon Bratseth | 2021-10-20 | 1 | -0/+1
* Encapsulate in a context | Jon Bratseth | 2021-10-01 | 2 | -11/+10
* Update linguisticvs-components | Jon Bratseth | 2021-09-30 | 2 | -5/+11
* encode -> embed | Jon Bratseth | 2021-09-28 | 5 | -39/+38
* Use full filename | Jon Bratseth | 2021-09-27 | 1 | -0/+0
* Separate component from linguistics | Jon Bratseth | 2021-09-25 | 15 | -0/+1015