path: root/model-integration
Commit message (Author, Age, Files, Lines)
* Add support and upgrade opset (Jo Kristian Bergum, 2023-10-26, 4 files, -7/+31)
|
* Add support for bfloat16 and float16 (Jo Kristian Bergum, 2023-10-26, 4 files, -0/+82)
|
* Less verbose logging when failing to find CUDA and it is optional (Jo Kristian Bergum, 2023-10-26, 2 files, -2/+53)
|
* Disable CPU arena allocator for ONNX (Bjørn Christian Seime, 2023-10-19, 1 file, -0/+1)
|   The arena memory allocator pre-allocates an excessive amount of memory up front. Disabling it matches the existing configuration in the ONNX integration for the backend.
* Update copyright (Jon Bratseth, 2023-10-09, 122 files, -122/+131)
|
* Don't index PAD and re-factoring (Jo Kristian Bergum, 2023-09-26, 2 files, -41/+37)
|
* Add config options + license (Jo Kristian Bergum, 2023-09-21, 2 files, -0/+2)
|
* Ensure Onnx/Huggingface resources are cleaned up on deconstruction (Bjørn Christian Seime, 2023-09-21, 1 file, -0/+6)
|
* Add ColBERT embedder (Jo Kristian Bergum, 2023-09-21, 4 files, -0/+599)
|
* - Use equals when comparing Optional<Long> (Henning Baldersheim, 2023-09-13, 2 files, -4/+4)
|   - Minor cleanup
* Use thread safe hash map (Bjørn Christian Seime, 2023-08-31, 1 file, -2/+2)
|
* Merge pull request #27969 from vespa-engine/bjorncs/embedder-metrics (Jon Bratseth, 2023-08-31, 5 files, -8/+94)
|\    Add generic metrics for embedders
| * Allow sampling of fractional millis (Bjørn Christian Seime, 2023-08-25, 3 files, -15/+10)
| |
| * Add generic metrics for embedders (Bjørn Christian Seime, 2023-08-04, 5 files, -8/+99)
| |
* | Better error message when importing models with illegal names (Lester Solbakken, 2023-08-29, 1 file, -0/+25)
|/
* Log when GPU configuration is successful (Martin Polden, 2023-07-19, 1 file, -3/+8)
|
* Log warning when failing to use GPU (Martin Polden, 2023-07-19, 1 file, -1/+6)
|
* update onnx.proto (Arne Juul, 2023-06-23, 4 files, -80/+453)
|   * use latest version from https://github.com/onnx/onnx/blob/main/onnx/onnx.proto
|   * track API changes (enum -> int32)
* Prefer truncation configuration from tokenizer model (Bjørn Christian Seime, 2023-06-12, 1 file, -6/+19)
|   Only override truncation if not specified or max length exceeds max tokens accepted by model. Use JNI wrapper directly to determine existing truncation configuration (JSON format is not really documented). Simplify configuration for pure tokenizer embedder. Disable DJL usage telemetry.
* Add missing wiring of pooling strategy (Bjørn Christian Seime, 2023-06-08, 1 file, -11/+1)
|
* Disable padding and make it configurable (Bjørn Christian Seime, 2023-06-08, 1 file, -0/+1)
|
* Merge pull request #27297 from vespa-engine/bjorncs/bert-embedder-services-xml (Bjørn Christian Seime, 2023-06-06, 4 files, -49/+54)
|\    Bjorncs/bert embedder services xml
| * Make pooling strategy configurable for Huggingface embedder (Bjørn Christian Seime, 2023-06-05, 3 files, -17/+54)
| |
| * Move config definition to `configdefinitions` (Bjørn Christian Seime, 2023-06-05, 1 file, -32/+0)
| |
* | Add necessary options to use failOnWarnings (gjoranv, 2023-06-05, 1 file, -0/+4)
|/
* Introduce services.xml syntax for configuring HuggingFace embedders (Bjørn Christian Seime, 2023-06-02, 2 files, -29/+6)
|
* Properly ignore token type ids from tokenizer if disabled (Bjørn Christian Seime, 2023-05-30, 1 file, -2/+2)
|
* Remove dead code (Bjørn Christian Seime, 2023-05-26, 2 files, -43/+0)
|
* Make truncation and max length configurable (Bjørn Christian Seime, 2023-05-26, 1 file, -12/+3)
|
* Use GPU by default if available (Bjørn Christian Seime, 2023-05-22, 2 files, -2/+4)
|
* Revert "Revert "Bjorncs/huggingface tokenizer"" (Bjørn Christian Seime, 2023-05-12, 6 files, -210/+29)
|   This reverts commit 2bb74878879b3acb1919fd658b8f2c476d8129d6.
* Revert "Bjorncs/huggingface tokenizer" (Arnstein Ressem, 2023-05-12, 6 files, -29/+210)
|
* Handle models requiring token type ids (Bjørn Christian Seime, 2023-05-11, 2 files, -13/+20)
|
* Don't lower case (Bjørn Christian Seime, 2023-05-11, 1 file, -1/+1)
|
* Disable special tokens by default (Bjørn Christian Seime, 2023-05-11, 1 file, -0/+1)
|
* Mark HF integration as beta (Bjørn Christian Seime, 2023-05-11, 1 file, -0/+2)
|
* Make HF tokenizer a separate embedder (Bjørn Christian Seime, 2023-05-11, 5 files, -197/+6)
|
* Don't specify both package and namespace (Bjørn Christian Seime, 2023-05-11, 2 files, -1/+1)
|
* Upgrade HF Tokenizer to 0.22.1 (Bjørn Christian Seime, 2023-05-08, 1 file, -1/+1)
|
* Handle nulls (Bjørn Christian Seime, 2023-05-08, 1 file, -0/+4)
|
* fixup! Require GPU when requested and available for Bert + HF embedders (Bjørn Christian Seime, 2023-05-08, 1 file, -1/+1)
|
* Require GPU when requested and available for Bert + HF embedders (Bjørn Christian Seime, 2023-05-08, 5 files, -5/+6)
|
* Require GPU when available for ONNX evaluation in global-phase and embedders (Bjørn Christian Seime, 2023-05-08, 3 files, -5/+42)
|
* Make thread pool size configurable (Bjørn Christian Seime, 2023-05-05, 5 files, -17/+24)
|
* Make normalization optional (Bjørn Christian Seime, 2023-05-05, 2 files, -2/+8)
|
* Allow for manual configuration of GPU (Bjørn Christian Seime, 2023-05-05, 2 files, -1/+8)
|
* Move config to same package as component (Bjørn Christian Seime, 2023-05-05, 2 files, -1/+1)
|
* Split out HF Tokenizer (Bjørn Christian Seime, 2023-05-05, 4 files, -23/+174)
|
* Put the openai client in a separate component (Jon Bratseth, 2023-04-25, 15 files, -482/+20)
|
* Export API (Jon Bratseth, 2023-04-20, 4 files, -1/+218)
|