| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | Make HF tokenizer a separate embedder | Bjørn Christian Seime | 2023-05-11 | 3 | -0/+87 |
* | Add skipping of control tokens | Lester Solbakken | 2023-02-10 | 2 | -1/+12 |
* | Add decoding of sentencepiece token sequence to text | Lester Solbakken | 2023-02-10 | 2 | -0/+16 |
* | Test segmentation with subwords | Jon Bratseth | 2021-12-17 | 2 | -4/+13 |
* | BERT -> WordPiece, make subword prefix configurable | Jon Bratseth | 2021-12-17 | 7 | -129/+119 |
* | Add a BERT embedder | Jon Bratseth | 2021-12-16 | 3 | -6/+30576 |
* | Encapsulate in a context | Jon Bratseth | 2021-10-01 | 1 | -2/+3 |
* | Update linguisticvs-components | Jon Bratseth | 2021-09-30 | 1 | -2/+2 |
* | encode -> embed | Jon Bratseth | 2021-09-28 | 3 | -23/+23 |
* | Separate component from linguistics | Jon Bratseth | 2021-09-25 | 5 | -0/+197 |