Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | - Put the inner loops in separate methods. This improves ability to inline. | Henning Baldersheim | 2024-01-20 | 2 | -54/+52 |
| | | | | | | | - Use Buffer.get(int index) instead of Buffer.get(). That avoids a write. - Use int as loop variable. - This brings the splade performance test down from 8s to 7s - TensorConverter.toVespaTensor more than doubled speed. | ||||
* | Rename getIndex => getDirectIndex | Henning Baldersheim | 2024-01-20 | 1 | -1/+1 |
| | |||||
* | Add a class to assist efficient traversal of dimensions in an IndexedTensor. | Henning Baldersheim | 2024-01-19 | 2 | -4/+9 |
| | |||||
* | Cache sizes.totalSize() in variable to prevent recomputation. | Henning Baldersheim | 2024-01-18 | 1 | -20/+19 |
| | |||||
* | Since both value and log(value) are monotonically increasing for value >= 1, | Henning Baldersheim | 2024-01-18 | 1 | -8/+8 |
| | | | | | we can just gather max(value) and do log at the end. Avoiding general Math.max, which seems to have very costly NaN handling, was quite beneficial. | ||||
* | Construct array right away instead of going via a single element list and ↵ | Henning Baldersheim | 2024-01-18 | 1 | -5/+15 |
| | | | | the java stream api. | ||||
* | Avoid generic reduce and keep PAD token embedding | Jo Kristian Bergum | 2024-01-15 | 2 | -24/+47 |
| | |||||
* | remove extra space | Jo Kristian Bergum | 2024-01-11 | 1 | -1/+1 |
| | |||||
* | address review | Jo Kristian Bergum | 2024-01-11 | 2 | -43/+25 |
| | |||||
* | Avoid generic reduce to reduce gc pressure | Jo Kristian Bergum | 2024-01-11 | 2 | -19/+61 |
| | |||||
* | final | Jo Kristian Bergum | 2024-01-06 | 1 | -1/+1 |
| | |||||
* | handle multilingual models better | Jo Kristian Bergum | 2024-01-06 | 3 | -65/+147 |
| | |||||
* | Allow mapped 1d tensor for embed expressions | Jo Kristian Bergum | 2023-12-17 | 2 | -13/+13 |
| | |||||
* | Add a splade embedder implementation | Jo Kristian Bergum | 2023-12-15 | 5 | -0/+30962 |
| | |||||
* | Move Jackson util from vespajlib to container-core. | Henning Baldersheim | 2023-11-24 | 3 | -3/+3 |
| | |||||
* | jackson 2.16 changes some of its default settings so we consolidate our use ↵ | Henning Baldersheim | 2023-11-23 | 3 | -8/+7 |
| | | | | | | of the ObjectMapper. Unless special options are used, use a common instance, or create via factory method. | ||||
* | unpack_bits_from_int8 -> unpack_bits | Arne Juul | 2023-11-10 | 1 | -2/+2 |
| | |||||
* | add simple expandBitTensor function | Arne Juul | 2023-11-10 | 2 | -9/+35 |
| | |||||
* | Add support and upgrade opset | Jo Kristian Bergum | 2023-10-26 | 4 | -7/+31 |
| | |||||
* | Add support for bfloat16 and float16 | Jo Kristian Bergum | 2023-10-26 | 4 | -0/+82 |
| | |||||
* | Less verbose logging when failing to find CUDA and it is optional | Jo Kristian Bergum | 2023-10-26 | 2 | -2/+53 |
| | |||||
* | Disable CPU arena allocator for ONNX | Bjørn Christian Seime | 2023-10-19 | 1 | -0/+1 |
| | | | | | The arena memory allocator pre-allocates an excessive amount of memory up front. Disabling matches the existing configuration in ONNX integration for backend. | ||||
* | Update copyright | Jon Bratseth | 2023-10-09 | 122 | -122/+131 |
| | |||||
* | Don't index PAD and re-factoring | Jo Kristian Bergum | 2023-09-26 | 2 | -41/+37 |
| | |||||
* | Add config options + license | Jo Kristian Bergum | 2023-09-21 | 2 | -0/+2 |
| | |||||
* | Ensure Onnx/Huggingface resources are cleaned up on deconstruction | Bjørn Christian Seime | 2023-09-21 | 1 | -0/+6 |
| | |||||
* | Add ColBERT embedder | Jo Kristian Bergum | 2023-09-21 | 4 | -0/+599 |
| | |||||
* | - Use equals when comparing Optional<Long> | Henning Baldersheim | 2023-09-13 | 2 | -4/+4 |
| | | | | - Minor cleanup | ||||
* | Use thread safe hash map | Bjørn Christian Seime | 2023-08-31 | 1 | -2/+2 |
| | |||||
* | Merge pull request #27969 from vespa-engine/bjorncs/embedder-metrics | Jon Bratseth | 2023-08-31 | 5 | -8/+94 |
|\ | | | | | Add generic metrics for embedders | ||||
| * | Allow sampling of fractional millis | Bjørn Christian Seime | 2023-08-25 | 3 | -15/+10 |
| | | |||||
| * | Add generic metrics for embedders | Bjørn Christian Seime | 2023-08-04 | 5 | -8/+99 |
| | | |||||
* | | Better error message when importing models with illegal names | Lester Solbakken | 2023-08-29 | 1 | -0/+25 |
|/ | |||||
* | Log when GPU configuration is successful | Martin Polden | 2023-07-19 | 1 | -3/+8 |
| | |||||
* | Log warning when failing to use GPU | Martin Polden | 2023-07-19 | 1 | -1/+6 |
| | |||||
* | update onnx.proto | Arne Juul | 2023-06-23 | 4 | -80/+453 |
| | | | | | * use latest version from https://github.com/onnx/onnx/blob/main/onnx/onnx.proto * track API changes (enum -> int32) | ||||
* | Prefer truncation configuration from tokenizer model | Bjørn Christian Seime | 2023-06-12 | 1 | -6/+19 |
| | | | | | | | Only override truncation if not specified or max length exceeds max tokens accepted by model. Use JNI wrapper directly to determine existing truncation configuration (JSON format is not really documented). Simplify configuration for pure tokenizer embedder. Disable DJL usage telemetry. | ||||
* | Add missing wiring of pooling strategy | Bjørn Christian Seime | 2023-06-08 | 1 | -11/+1 |
| | |||||
* | Disable padding and make it configurable | Bjørn Christian Seime | 2023-06-08 | 1 | -0/+1 |
| | |||||
* | Merge pull request #27297 from vespa-engine/bjorncs/bert-embedder-services-xml | Bjørn Christian Seime | 2023-06-06 | 4 | -49/+54 |
|\ | | | | | Bjorncs/bert embedder services xml | ||||
| * | Make pooling strategy configurable for Huggingface embedder | Bjørn Christian Seime | 2023-06-05 | 3 | -17/+54 |
| | | |||||
| * | Move config definition to `configdefinitions` | Bjørn Christian Seime | 2023-06-05 | 1 | -32/+0 |
| | | |||||
* | | Add necessary options to use failOnWarnings | gjoranv | 2023-06-05 | 1 | -0/+4 |
|/ | |||||
* | Introduce services.xml syntax for configuring HuggingFace embedders | Bjørn Christian Seime | 2023-06-02 | 2 | -29/+6 |
| | |||||
* | Properly ignore token type ids from tokenizer if disabled | Bjørn Christian Seime | 2023-05-30 | 1 | -2/+2 |
| | |||||
* | Remove dead code | Bjørn Christian Seime | 2023-05-26 | 2 | -43/+0 |
| | |||||
* | Make truncation and max length configurable | Bjørn Christian Seime | 2023-05-26 | 1 | -12/+3 |
| | |||||
* | Use GPU by default if available | Bjørn Christian Seime | 2023-05-22 | 2 | -2/+4 |
| | |||||
* | Revert "Revert "Bjorncs/huggingface tokenizer"" | Bjørn Christian Seime | 2023-05-12 | 6 | -210/+29 |
| | | | | This reverts commit 2bb74878879b3acb1919fd658b8f2c476d8129d6. | ||||
* | Revert "Bjorncs/huggingface tokenizer" | Arnstein Ressem | 2023-05-12 | 6 | -29/+210 |
| |