| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | Construct array right away instead of going via a single element list and the... | Henning Baldersheim | 2024-01-18 | 1 | -2/+3 |
* | Update copyright | Jon Bratseth | 2023-10-09 | 25 | -29/+29 |
* | Use version tag | Henning Baldersheim | 2023-10-04 | 1 | -1/+0 |
* | Update dependency ai.djl.huggingface:tokenizers to v0.24.0 | renovate[bot] | 2023-10-04 | 1 | -1/+1 |
* | Use Guice 6.0 | Bjørn Christian Seime | 2023-09-04 | 1 | -1/+1 |
* | HuggingFace Tokenizer expects path to be a directory | Bjørn Christian Seime | 2023-08-31 | 1 | -2/+18 |
* | Update dependency ai.djl.huggingface:tokenizers to v0.23.0 | renovate[bot] | 2023-08-30 | 1 | -1/+1 |
* | Update abi-specs after making config class Builders final | gjoranv | 2023-07-17 | 1 | -4/+8 |
* | Prefer truncation configuration from tokenizer model | Bjørn Christian Seime | 2023-06-12 | 3 | -14/+112 |
* | Test padding with truncation | Bjørn Christian Seime | 2023-06-08 | 2 | -3/+4 |
* | Verify presence of special token | Bjørn Christian Seime | 2023-06-08 | 1 | -2/+7 |
* | Disable padding and make it configurable | Bjørn Christian Seime | 2023-06-08 | 2 | -12/+30 |
* | Introduce services.xml syntax for configuring HuggingFace embedders | Bjørn Christian Seime | 2023-06-02 | 3 | -13/+7 |
* | Make truncation and max length configurable | Bjørn Christian Seime | 2023-05-26 | 3 | -7/+45 |
* | Implement deconstruct | Bjørn Christian Seime | 2023-05-16 | 1 | -0/+1 |
* | Change parameter type to 'model' | Bjørn Christian Seime | 2023-05-12 | 1 | -1/+1 |
* | Revert "Revert "Bjorncs/huggingface tokenizer"" | Bjørn Christian Seime | 2023-05-12 | 8 | -2/+303 |
* | Revert "Bjorncs/huggingface tokenizer" | Arnstein Ressem | 2023-05-12 | 8 | -303/+2 |
* | Disable special tokens by default | Bjørn Christian Seime | 2023-05-11 | 3 | -12/+10 |
* | Mark HF integration as beta | Bjørn Christian Seime | 2023-05-11 | 2 | -0/+5 |
* | Make HF tokenizer a separate embedder | Bjørn Christian Seime | 2023-05-11 | 8 | -2/+300 |
* | Add skipping of control tokens | Lester Solbakken | 2023-02-10 | 5 | -7/+35 |
* | Add abi spec | Lester Solbakken | 2023-02-10 | 1 | -1/+3 |
* | Add decoding of sentencepiece token sequence to text | Lester Solbakken | 2023-02-10 | 5 | -2/+40 |
* | Revert "Revert collect(Collectors.toList())" | Henning Baldersheim | 2022-12-04 | 1 | -1/+1 |
* | Revert collect(Collectors.toList()) | Henning Baldersheim | 2022-12-04 | 1 | -1/+1 |
* | collect(Collectors.toList()) -> toList() | Henning Baldersheim | 2022-12-02 | 1 | -1/+1 |
* | Split out opennlp-linguistics | Henning Baldersheim | 2022-11-26 | 1 | -0/+6 |
* | Update ABI spec format, and update all specs | jonmv | 2022-10-25 | 1 | -102/+102 |
* | Set project version to 8-SNAPSHOT | gjoranv | 2022-06-08 | 1 | -2/+2 |
* | Remove config version on Vespa 8 | Jon Bratseth | 2022-06-08 | 1 | -4/+0 |
* | install_jar CMake function | Håkon Hallingstad | 2022-05-20 | 1 | -1/+1 |
* | Use '@Inject' from 'annotations' in multiple bundles | Bjørn Christian Seime | 2022-05-06 | 2 | -2/+2 |
* | Don't embed annotations in osgi bundles | Bjørn Christian Seime | 2022-05-04 | 1 | -0/+6 |
* | unify java warnings (use compiler args from parent) | Arne H Juul | 2022-01-06 | 1 | -8/+0 |
* | Test segmentation with subwords | Jon Bratseth | 2021-12-17 | 2 | -4/+13 |
* | BERT -> WordPiece, make subword prefix configurable | Jon Bratseth | 2021-12-17 | 13 | -276/+306 |
* | Add a BERT embedder | Jon Bratseth | 2021-12-16 | 10 | -43/+31009 |
* | update ABI for generated builders | Arne H Juul | 2021-12-09 | 1 | -0/+1 |
* | Add custom `@Beta` annotation | Bjørn Christian Seime | 2021-12-03 | 1 | -1/+1 |
* | Update 2019 Oath copyrights. | gjoranv | 2021-10-27 | 1 | -1/+1 |
* | Correct copyright headers | Jon Bratseth | 2021-10-20 | 1 | -1/+0 |
* | Add missiung copyrights | Jon Bratseth | 2021-10-20 | 1 | -0/+1 |
* | Update 2017 copyright notices. | gjoranv | 2021-10-07 | 1 | -1/+1 |
* | Encapsulate in a context | Jon Bratseth | 2021-10-01 | 3 | -13/+12 |
* | Update linguisticvs-components | Jon Bratseth | 2021-09-30 | 3 | -7/+13 |
* | encode -> embed | Jon Bratseth | 2021-09-28 | 6 | -49/+48 |
* | Use full filename | Jon Bratseth | 2021-09-27 | 1 | -0/+0 |
* | Separate component from linguistics | Jon Bratseth | 2021-09-25 | 21 | -0/+1300 |