path: root/linguistics-components
| Commit message | Author | Date | Files | Lines (-/+) |
|---|---|---|---|---|
| Construct array right away instead of going via a single element list and the... | Henning Baldersheim | 2024-01-18 | 1 | -2/+3 |
| Update copyright | Jon Bratseth | 2023-10-09 | 25 | -29/+29 |
| Use version tag | Henning Baldersheim | 2023-10-04 | 1 | -1/+0 |
| Update dependency ai.djl.huggingface:tokenizers to v0.24.0 | renovate[bot] | 2023-10-04 | 1 | -1/+1 |
| Use Guice 6.0 | Bjørn Christian Seime | 2023-09-04 | 1 | -1/+1 |
| HuggingFace Tokenizer expects path to be a directory | Bjørn Christian Seime | 2023-08-31 | 1 | -2/+18 |
| Update dependency ai.djl.huggingface:tokenizers to v0.23.0 | renovate[bot] | 2023-08-30 | 1 | -1/+1 |
| Update abi-specs after making config class Builders final | gjoranv | 2023-07-17 | 1 | -4/+8 |
| Prefer truncation configuration from tokenizer model | Bjørn Christian Seime | 2023-06-12 | 3 | -14/+112 |
| Test padding with truncation | Bjørn Christian Seime | 2023-06-08 | 2 | -3/+4 |
| Verify presence of special token | Bjørn Christian Seime | 2023-06-08 | 1 | -2/+7 |
| Disable padding and make it configurable | Bjørn Christian Seime | 2023-06-08 | 2 | -12/+30 |
| Introduce services.xml syntax for configuring HuggingFace embedders | Bjørn Christian Seime | 2023-06-02 | 3 | -13/+7 |
| Make truncation and max length configurable | Bjørn Christian Seime | 2023-05-26 | 3 | -7/+45 |
| Implement deconstruct | Bjørn Christian Seime | 2023-05-16 | 1 | -0/+1 |
| Change parameter type to 'model' | Bjørn Christian Seime | 2023-05-12 | 1 | -1/+1 |
| Revert "Revert "Bjorncs/huggingface tokenizer"" | Bjørn Christian Seime | 2023-05-12 | 8 | -2/+303 |
| Revert "Bjorncs/huggingface tokenizer" | Arnstein Ressem | 2023-05-12 | 8 | -303/+2 |
| Disable special tokens by default | Bjørn Christian Seime | 2023-05-11 | 3 | -12/+10 |
| Mark HF integration as beta | Bjørn Christian Seime | 2023-05-11 | 2 | -0/+5 |
| Make HF tokenizer a separate embedder | Bjørn Christian Seime | 2023-05-11 | 8 | -2/+300 |
| Add skipping of control tokens | Lester Solbakken | 2023-02-10 | 5 | -7/+35 |
| Add abi spec | Lester Solbakken | 2023-02-10 | 1 | -1/+3 |
| Add decoding of sentencepiece token sequence to text | Lester Solbakken | 2023-02-10 | 5 | -2/+40 |
| Revert "Revert collect(Collectors.toList())" | Henning Baldersheim | 2022-12-04 | 1 | -1/+1 |
| Revert collect(Collectors.toList()) | Henning Baldersheim | 2022-12-04 | 1 | -1/+1 |
| collect(Collectors.toList()) -> toList() | Henning Baldersheim | 2022-12-02 | 1 | -1/+1 |
| Split out opennlp-linguistics | Henning Baldersheim | 2022-11-26 | 1 | -0/+6 |
| Update ABI spec format, and update all specs | jonmv | 2022-10-25 | 1 | -102/+102 |
| Set project version to 8-SNAPSHOT | gjoranv | 2022-06-08 | 1 | -2/+2 |
| Remove config version on Vespa 8 | Jon Bratseth | 2022-06-08 | 1 | -4/+0 |
| install_jar CMake function | Håkon Hallingstad | 2022-05-20 | 1 | -1/+1 |
| Use '@Inject' from 'annotations' in multiple bundles | Bjørn Christian Seime | 2022-05-06 | 2 | -2/+2 |
| Don't embed annotations in osgi bundles | Bjørn Christian Seime | 2022-05-04 | 1 | -0/+6 |
| unify java warnings (use compiler args from parent) | Arne H Juul | 2022-01-06 | 1 | -8/+0 |
| Test segmentation with subwords | Jon Bratseth | 2021-12-17 | 2 | -4/+13 |
| BERT -> WordPiece, make subword prefix configurable | Jon Bratseth | 2021-12-17 | 13 | -276/+306 |
| Add a BERT embedder | Jon Bratseth | 2021-12-16 | 10 | -43/+31009 |
| update ABI for generated builders | Arne H Juul | 2021-12-09 | 1 | -0/+1 |
| Add custom `@Beta` annotation | Bjørn Christian Seime | 2021-12-03 | 1 | -1/+1 |
| Update 2019 Oath copyrights. | gjoranv | 2021-10-27 | 1 | -1/+1 |
| Correct copyright headers | Jon Bratseth | 2021-10-20 | 1 | -1/+0 |
| Add missiung copyrights | Jon Bratseth | 2021-10-20 | 1 | -0/+1 |
| Update 2017 copyright notices. | gjoranv | 2021-10-07 | 1 | -1/+1 |
| Encapsulate in a context | Jon Bratseth | 2021-10-01 | 3 | -13/+12 |
| Update linguisticvs-components | Jon Bratseth | 2021-09-30 | 3 | -7/+13 |
| encode -> embed | Jon Bratseth | 2021-09-28 | 6 | -49/+48 |
| Use full filename | Jon Bratseth | 2021-09-27 | 1 | -0/+0 |
| Separate component from linguistics | Jon Bratseth | 2021-09-25 | 21 | -0/+1300 |