path: root/model-integration/src/main
Commit message (Author, Age, Files, Lines)
* Use chat template in llama if available (Lester Solbakken, 2024-05-30, 1, -0/+2)
* Use var type to handle renaming of jllama interface classes. (Henning Baldersheim, 2024-05-16, 1, -1/+1)
* Avoid methods deprecated in jackson 2.17.1 (Henning Baldersheim, 2024-05-06, 1, -2/+2)
* Revert "Update jackson2.vespa.version to v2.17.0" (Henning Baldersheim, 2024-05-06, 1, -2/+2)
* Merge pull request #31120 from vespa-engine/lesters/local-llm-timeout (Harald Musum, 2024-05-06, 2, -7/+38)
|\
| * Add timeout for requests waiting to start local llm inference (Lester Solbakken, 2024-05-06, 2, -7/+38)
* | Avoid deprecated methods. (Henning Baldersheim, 2024-05-06, 1, -2/+2)
|/
* Merge pull request #31049 from vespa-engine/jobergum/add-prepend-embedder-sup... (Bjørn Christian Seime, 2024-04-26, 1, -1/+17)
|\
| * add prepend support (Jo Kristian Bergum, 2024-04-25, 1, -1/+17)
* | Update defaults for local LLM config (Lester Solbakken, 2024-04-24, 1, -3/+3)
|/
* Remove unneccessary import (Lester Solbakken, 2024-04-22, 1, -1/+0)
* Set minimum number of threads to 1 (Lester Solbakken, 2024-04-22, 1, -1/+1)
* Reapply "Lesters/add local llms 2" (Lester Solbakken, 2024-04-16, 6, -0/+293)
* Revert "Lesters/add local llms 2" (Harald Musum, 2024-04-15, 6, -293/+0)
* Reapply "Lesters/add local llms" (Lester Solbakken, 2024-04-15, 6, -0/+293)
* Revert "Lesters/add local llms" (Lester Solbakken, 2024-04-15, 6, -293/+0)
* Merge branch 'master' into lesters/add-local-llms (Lester Solbakken, 2024-04-12, 8, -20/+13)
|\
| * Unify on List.of (Henning Baldersheim, 2024-04-11, 7, -17/+11)
| * Unify on Map.of (Henning Baldersheim, 2024-04-11, 1, -3/+2)
* | Move LLM client stuff from container-search to model-integration (Lester Solbakken, 2024-04-12, 6, -0/+293)
|/
* cache more and re-factor (Jo Kristian Bergum, 2024-04-08, 1, -55/+64)
* Key by embedder id and don't recompute inputs (Jon Bratseth, 2024-04-07, 1, -40/+33)
* Add equivalent to `Map.computeIfAbsent()` to simplify typical usage of the cache (Bjørn Christian Seime, 2024-04-04, 2, -20/+3)
* Add caching of onnx inference output using Context cache (Jo Kristian Bergum, 2024-04-04, 1, -14/+35)
* Support for dimensionality flexbility and caching onnx inference output using... (Jo Kristian Bergum, 2024-04-04, 1, -26/+34)
* Add some more tests on the binarization (Jo Kristian Bergum, 2024-03-30, 1, -1/+1)
* fix unwanted import (Jo Kristian Bergum, 2024-03-29, 1, -1/+0)
* Add support for binarization and matryoshka for hf-embedder (Jo Kristian Bergum, 2024-03-29, 1, -5/+56)
* All embedders are the same (Jon Bratseth, 2024-02-09, 1, -2/+2)
* Support embedding into rank 3 tensors (Jon Bratseth, 2024-02-02, 2, -20/+26)
* - Add alternative sparsify implementation using generic tensor.reduce/map. (Henning Baldersheim, 2024-01-31, 1, -3/+44)
* - Put the inner loops in separate methods. This improves ability to inline. (Henning Baldersheim, 2024-01-20, 1, -53/+51)
* Rename getIndex => getDirectIndex (Henning Baldersheim, 2024-01-20, 1, -1/+1)
* Add a class for assist efficient traversal of dimensions in an IndexedTensor. (Henning Baldersheim, 2024-01-19, 1, -2/+7)
* Cache sizes.totalSize() in variable to prevent recomputation. (Henning Baldersheim, 2024-01-18, 1, -20/+19)
* Since both value and log(value) are monotonically increasing for value >= 1, (Henning Baldersheim, 2024-01-18, 1, -8/+8)
* Construct array right away instead of going via a single element list and the... (Henning Baldersheim, 2024-01-18, 1, -5/+15)
* Avoid generic reduce and keep PAD token embedding (Jo Kristian Bergum, 2024-01-15, 1, -11/+16)
* address review (Jo Kristian Bergum, 2024-01-11, 1, -42/+23)
* Avoid generic reduce to reduce gc pressure (Jo Kristian Bergum, 2024-01-11, 1, -18/+47)
* final (Jo Kristian Bergum, 2024-01-06, 1, -1/+1)
* handle multilingual models better (Jo Kristian Bergum, 2024-01-06, 1, -60/+62)
* Allow mapped 1d tensor for embed expressions (Jo Kristian Bergum, 2023-12-17, 1, -10/+12)
* Add a splade embedder implementation (Jo Kristian Bergum, 2023-12-15, 1, -0/+168)
* Move Jackson util from vespajlib to container-core. (Henning Baldersheim, 2023-11-24, 3, -3/+3)
* jackson 2.16 changes some of its default settings so we consolidate our use o... (Henning Baldersheim, 2023-11-23, 3, -8/+7)
* unpack_bits_from_int8 -> unpack_bits (Arne Juul, 2023-11-10, 1, -2/+2)
* add simple expandBitTensor function (Arne Juul, 2023-11-10, 1, -6/+17)
* Add support and upgrade opset (Jo Kristian Bergum, 2023-10-26, 1, -1/+23)
* Less verbose logging when failing to find CUDA and it is optional (Jo Kristian Bergum, 2023-10-26, 1, -2/+2)