Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | ChainedMap can't be copied | Jon Bratseth | 2024-01-20 | 1 | -1/+1 |
| | |||||
* | Revert "Merge pull request #29905 from vespa-engine/revert-29884-bratseth/param-refs-in-embed" | Jon Bratseth | 2024-01-20 | 2 | -1/+13 |
| | | | | This reverts commit c6b547c0c2898a324983356aa677ea3082533f7d, reversing changes made to 8c7f8c17ad5e1de5adcc71ee34f2a3c1cd36d6bd. | ||||
* | Revert "Support parameter references in embed" | Henning Baldersheim | 2024-01-15 | 2 | -13/+1 |
| | |||||
* | Support parameter references in embed | Jon Bratseth | 2024-01-12 | 2 | -1/+13 |
| | | | | Support embed(@myParameter) in addition to embed('text to embed') | ||||
* | Revert "Merge pull request #29328 from vespa-engine/revert-29314-bratseth/casing-take-2" | Jon Bratseth | 2023-11-14 | 4 | -13/+30 |
| | | | | This reverts commit a72e949533a46d665440a9c72ca2b8fb58f3a9c3, reversing changes made to 944d635d00e165166508ef23399e9ed65a87a9c8. | ||||
* | Revert "Bratseth/casing take 2" | Harald Musum | 2023-11-13 | 4 | -30/+13 |
| | |||||
* | Prefer first stem to original if non equal | Jon Bratseth | 2023-11-10 | 2 | -11/+28 |
| | |||||
* | Revert "Revert "Don't lowercase linguistics annotations"" | Jon Bratseth | 2023-11-09 | 2 | -2/+2 |
| | | | | This reverts commit 0dfd4fe4c6ddbded490da36e71f27c4b70aa4226. | ||||
* | Revert "Don't lowercase linguistics annotations" | Jon Bratseth | 2023-11-09 | 2 | -2/+2 |
| | |||||
* | Don't lowercase linguistics annotations | Jon Bratseth | 2023-11-09 | 2 | -2/+2 |
| | | | | | | Tokens are already lowercased by our bundled linguistics components. Lowercasing again when annotating precludes plugging in a linguistics component which preserves casing. | ||||
* | Avoid cutting surrogate pairs when tokenising | jonmv | 2023-10-20 | 1 | -1/+1 |
| | |||||
* | Update copyright | Jon Bratseth | 2023-10-09 | 73 | -73/+73 |
| | |||||
* | Use Guice 6.0 | Bjørn Christian Seime | 2023-09-04 | 1 | -1/+1 |
| | | | | | | https://github.com/google/guice/wiki/Guice600 We cannot upgrade to 7.x as we export javax.inject from container. 6.x supports both the old javax.inject and the new jakarta.inject replacement. | ||||
* | Allow sampling of fractional millis | Bjørn Christian Seime | 2023-08-25 | 2 | -4/+3 |
| | |||||
* | Add generic metrics for embedders | Bjørn Christian Seime | 2023-08-04 | 2 | -1/+56 |
| | |||||
* | Add necessary options to use failOnWarnings | gjoranv | 2023-06-05 | 1 | -0/+1 |
| | |||||
* | Don't remove indexable symbols when stemming | Jon Bratseth | 2023-06-02 | 5 | -8/+17 |
| | |||||
* | Add bundle type to all CORE bundles. | gjoranv | 2023-05-25 | 1 | -0/+3 |
| | |||||
* | Update ABI spec | Jon Bratseth | 2023-05-22 | 1 | -0/+1 |
| | |||||
* | Always treat each symbol as a separate token | Jon Bratseth | 2023-05-22 | 4 | -20/+56 |
| | |||||
* | Treat 'other symbols' as letters | Jon Bratseth | 2023-05-22 | 2 | -2/+10 |
| | | | | | The unicode class 'other symbols' contains emojis, math symbols, etc. Treat these as letter characters to support searching for them. | ||||
* | Use dollar and hour base units | Jon Bratseth | 2023-05-19 | 1 | -2/+2 |
| | |||||
* | Use metric enums everywhere | Jon Bratseth | 2023-03-06 | 1 | -1/+1 |
| | |||||
* | Add abi spec | Lester Solbakken | 2023-02-10 | 1 | -0/+1 |
| | |||||
* | Add decoding of sentencepiece token sequence to text | Lester Solbakken | 2023-02-10 | 1 | -0/+11 |
| | |||||
* | Compute code points in whole string only when needed | jonmv | 2022-12-06 | 2 | -6/+17 |
| | |||||
* | Split out opennlp-linguistics | Henning Baldersheim | 2022-11-26 | 14 | -783/+0 |
| | |||||
* | Update ABI spec format, and update all specs | jonmv | 2022-10-25 | 1 | -198/+198 |
| | |||||
* | much simpler CharSequenceNormalizer | Arne Juul | 2022-10-06 | 3 | -9/+100 |
| | |||||
* | Merge pull request #24007 from vespa-engine/bratseth/cleanup-082 | Jon Bratseth | 2022-09-25 | 2 | -13/+11 |
|\ | | | | | No functional changes | ||||
| * | No functional changes | Jon Bratseth | 2022-09-11 | 2 | -13/+11 |
| | | |||||
* | | Make validation messages clearer given multiple instances | Jon Bratseth | 2022-09-15 | 1 | -2/+0 |
|/ | |||||
* | bump protoc version | Arne Juul | 2022-08-27 | 1 | -4/+0 |
| | |||||
* | Determine token types considering all characters | Jon Bratseth | 2022-08-16 | 6 | -119/+133 |
| | |||||
* | Set project version to 8-SNAPSHOT | gjoranv | 2022-06-08 | 1 | -2/+2 |
| | |||||
* | Remove on Vespa 8 | Jon Bratseth | 2022-06-08 | 2 | -10/+1 |
| | |||||
* | Use '@Inject' from 'annotations' in multiple bundles | Bjørn Christian Seime | 2022-05-06 | 2 | -2/+2 |
| | |||||
* | Resolve rank profile inputs | Jon Bratseth | 2022-04-21 | 1 | -1/+1 |
| | |||||
* | Update abi-spec | Lester Solbakken | 2022-03-22 | 1 | -1/+1 |
| | |||||
* | Rename defaultEmbedderName to defaultEmbedderId | Lester Solbakken | 2022-03-22 | 1 | -2/+2 |
| | |||||
* | Add convenience function to represent embedder as map | Lester Solbakken | 2022-03-21 | 2 | -3/+30 |
| | |||||
* | Stem by linguistics in rule bases | Jon Bratseth | 2022-01-10 | 2 | -3/+21 |
| | | | | Also add a @language directive to stem in languages other than English. | ||||
* | unify java warnings (use compiler args from parent) | Arne H Juul | 2022-01-06 | 1 | -8/+0 |
| | |||||
* | annotate intentional switch fallthrough | Arne H Juul | 2022-01-06 | 1 | -0/+1 |
| | |||||
* | Specify how the class is actually loaded | Jon Marius Venstad | 2021-12-21 | 1 | -1/+1 |
| | |||||
* | Provide array of correct size. | Jon Marius Venstad | 2021-12-20 | 1 | -1/+1 |
| | |||||
* | Override ngram creation with something less silly | Jon Marius Venstad | 2021-12-20 | 2 | -1/+32 |
| | |||||
* | Use smaller chunks for faster detection | Jon Marius Venstad | 2021-12-20 | 1 | -2/+2 |
| | |||||
* | Expand test case for language detection | Jon Marius Venstad | 2021-12-20 | 1 | -3/+28 |
| | |||||
* | Upper bound on input size, and use opennlp before simple detector | Jon Marius Venstad | 2021-12-20 | 1 | -6/+3 |
| |