path: root/searchlib
Commit message (Author, Age, Files, Lines)
* Add a 5x faster handcoded detection of legal feature names that does not require quoting. (Henning Baldersheim, 2021-09-05, 1, -3/+0)
|
* Merge pull request #18922 from vespa-engine/toregge/enable-alternate-visited-nodes-trackers-for-hnsw-index (Henning Baldersheim, 2021-08-31, 1, -1/+1)
|\    Enable alternate visited nodes trackers for HNSW index.
| * Enable alternate visited nodes trackers for HNSW index. (Tor Egge, 2021-08-31, 1, -1/+1)
| |
* | Lower limit for selecting BitVectorVisistedTracker. (Tor Egge, 2021-08-31, 1, -1/+1)
|/
* Add class comments. Fix typo. (Tor Egge, 2021-08-31, 3, -1/+12)
|
* Prepare for alternate visited nodes trackers for HNSW index. (Tor Egge, 2021-08-30, 9, -9/+186)
|
* Merge pull request #18898 from vespa-engine/geirst/avoid-global-filter-calculation-when-not-needed (Henning Baldersheim, 2021-08-30, 2, -18/+45)
|\    The global filter is only needed when having a nearest neighbor index…
| * The global filter is only needed when having a nearest neighbor index (hnsw) and doing approximate calculation. (Geir Storli, 2021-08-30, 2, -18/+45)
| |    This avoids costly calculation of the global filter in cases it is not needed.
* | Handle when priorityQ goes from not full to full. (Henning Baldersheim, 2021-08-30, 1, -1/+5)
| |
* | As doSeek is called a lot more frequently than doUnpack, just use locking of the heap in unpack. (Henning Baldersheim, 2021-08-30, 1, -8/+6)
|/    In addition to adjusting the priority Q, also update the distance_threshold with a relaxed store to an atomic variable. On read, the distance threshold can be read cheaply with a relaxed load.
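The locking scheme described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the searchlib code: the class `DistanceHeap` and its members are hypothetical names, and only the pattern (mutex on the rarely called update path, relaxed atomic on the frequently called read path) comes from the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>

// Hypothetical stand-in for the shared state; not the actual searchlib class.
class DistanceHeap {
public:
    DistanceHeap(size_t capacity, double initial_threshold)
        : _capacity(capacity), _threshold(initial_threshold)
    {
        assert(capacity > 0);
    }

    // Hot path (seek): read the threshold without taking the lock.
    double threshold() const {
        return _threshold.load(std::memory_order_relaxed);
    }

    // Cold path (unpack): adjust the heap under the lock, then publish the
    // new threshold with a relaxed store.
    void insert(double distance) {
        std::lock_guard<std::mutex> guard(_lock);
        if (_heap.size() < _capacity) {
            _heap.push(distance);
        } else if (distance < _heap.top()) {
            _heap.pop();
            _heap.push(distance);
        }
        // The threshold only becomes meaningful once the heap is full.
        if (_heap.size() == _capacity) {
            _threshold.store(_heap.top(), std::memory_order_relaxed);
        }
    }

private:
    size_t                      _capacity;
    std::mutex                  _lock;
    std::priority_queue<double> _heap;      // max-heap: top() is the worst kept distance
    std::atomic<double>         _threshold;
};
```

A relaxed load may observe a slightly stale threshold. In this sketch that only means a reader may be a little less selective than it could be, which is why no stronger memory ordering is used.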
* Report address space usage for shared string repo for non-dense tensor attributes. (Geir Storli, 2021-08-23, 5, -2/+18)
|
* Report address space usage for components in tensor attributes. (Geir Storli, 2021-08-20, 12, -2/+67)
|
* Merge pull request #18783 from vespa-engine/toregge/compact-hnsw-index (Geir Storli, 2021-08-20, 10, -6/+300)
|\    Compact HNSW index when ratio of dead bytes / address space is too high
| * Factor out common code. (Tor Egge, 2021-08-18, 1, -17/+21)
| |
| * Compact HNSW index when ratio of dead bytes / address space is too high relative to used bytes / address space. (Tor Egge, 2021-08-18, 10, -6/+296)
| |
* | Track max address space usage among components in attribute vectors in all sub databases. (Geir Storli, 2021-08-20, 1, -0/+1)
| |
* | Include limits when needed. (Tor Egge, 2021-08-18, 1, -0/+1)
|/
* Merge pull request #18755 from vespa-engine/havardpe/move-feature-name-symbol-extractor (Håvard Pettersen, 2021-08-16, 7, -192/+4)
|\    move FeatureNameExtractor
| * move FeatureNameExtractor (Håvard Pettersen, 2021-08-16, 7, -192/+4)
| |    to make it available for use in vespa-eval-expr
* | Merge pull request #18752 from vespa-engine/toregge/use-4096-buffers-for-hnsw-index-link-array-store (Henning Baldersheim, 2021-08-16, 2, -4/+7)
|\ \    Use 4096 buffers for HNSW link array store.
| * | Use 4096 buffers for HNSW link array store. (Tor Egge, 2021-08-16, 2, -4/+7)
| | |    Configure link array store to handle arrays of 193 elements or less without indirect storage.
* | | Improve naming and readability (Henning Baldersheim, 2021-08-16, 1, -7/+8)
| | |
* | | Instead of having one large array of individually allocated vectors use 2 large, optionally mmapped, vectors where the first just points into the second. (Henning Baldersheim, 2021-08-16, 2, -15/+46)
|/ /    In order to avoid resizing, count first.
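The layout described in the commit above (one vector of offsets pointing into one vector of payload, sized by a counting pass) can be sketched as follows. The names `FlatArrays` and `build_flat_arrays` are hypothetical, the element types are placeholders, and the optional mmap backing mentioned in the commit is left out; only the two-pass idea is taken from the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical flattened layout: `offsets` points into `values`.
struct FlatArrays {
    std::vector<size_t>   offsets; // values[offsets[k]] .. values[offsets[k + 1]) belong to key k
    std::vector<uint32_t> values;

    size_t size(size_t key) const { return offsets[key + 1] - offsets[key]; }
    const uint32_t* begin(size_t key) const { return values.data() + offsets[key]; }
};

FlatArrays build_flat_arrays(size_t num_keys,
                             const std::vector<std::pair<size_t, uint32_t>>& entries)
{
    FlatArrays out;
    // Pass 1: count entries per key so both vectors are sized exactly once.
    out.offsets.assign(num_keys + 1, 0);
    for (const auto& entry : entries) {
        assert(entry.first < num_keys);
        ++out.offsets[entry.first + 1];
    }
    // Turn the per-key counts into start offsets (prefix sum).
    for (size_t i = 0; i < num_keys; ++i) {
        out.offsets[i + 1] += out.offsets[i];
    }
    out.values.resize(entries.size());
    // Pass 2: place each value at the next free slot of its key.
    std::vector<size_t> cursor(out.offsets.begin(), out.offsets.end() - 1);
    for (const auto& entry : entries) {
        out.values[cursor[entry.first]++] = entry.second;
    }
    return out;
}
```

Compared with one heap-allocated vector per key, this keeps all payload contiguous and needs two allocations in total, at the cost of an extra pass over the input.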
* | Minor code layout (Henning Baldersheim, 2021-08-15, 1, -2/+1)
| |
* | Better naming. (Henning Baldersheim, 2021-08-15, 1, -2/+2)
| |
* | Provide more details on memory usage. (Henning Baldersheim, 2021-08-15, 1, -1/+9)
| |
* | Add a time budget of 100ms. If counting is not complete by then, abort, and let the count be incomplete. (Henning Baldersheim, 2021-08-15, 2, -7/+17)
| |
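A time-budgeted counting pass of the kind described in the commit above might look like the sketch below. The function `count_with_budget`, the `CountResult` struct, and the choice to check the clock every `check_interval` items are assumptions made for illustration; only the 100 ms budget and the "abort and keep the partial count" behaviour come from the commit message.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

struct CountResult {
    size_t count;     // number of non-zero items seen before stopping
    bool   complete;  // false if the budget ran out first
};

// Hypothetical helper: counts non-zero items until done or out of time.
CountResult count_with_budget(const std::vector<int>& items,
                              std::chrono::steady_clock::duration budget = std::chrono::milliseconds(100),
                              size_t check_interval = 1024)
{
    assert(check_interval > 0);
    const auto deadline = std::chrono::steady_clock::now() + budget;
    CountResult result{0, true};
    for (size_t i = 0; i < items.size(); ++i) {
        // Reading the clock is comparatively expensive, so only do it periodically.
        if ((i % check_interval) == 0 && std::chrono::steady_clock::now() >= deadline) {
            result.complete = false;
            break;
        }
        if (items[i] != 0) {
            ++result.count;
        }
    }
    return result;
}
```

Callers have to treat an incomplete count as a lower bound, for example by falling back to growing the target containers on demand.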
* | Use a simple std::vector<bool> for visited marking as most bits will be set. (Henning Baldersheim, 2021-08-15, 1, -4/+4)
| |
* | Avoid starting a separate thread for completing index insert. (Henning Baldersheim, 2021-08-13, 1, -34/+73)
| |    Use a queue and do completion in the foreground. That ensures only a single thread modifies the attribute.
* | Notify when _pending reaches zero. (Henning Baldersheim, 2021-08-13, 1, -3/+6)
| |
* | Refactor for readability and maintenance. (Henning Baldersheim, 2021-08-13, 2, -30/+89)
| |
* | Use the executor for the part that can be parallel when rebuilding index on load. (Henning Baldersheim, 2021-08-13, 2, -7/+64)
|/
* Add an executor to the AttributeVector::load/onLoad interface so attributes can use multithread load if feasible. (Henning Baldersheim, 2021-08-12, 32, -39/+46)
|
* swappable -> paged (Henning Baldersheim, 2021-08-12, 4, -7/+7)
|
* A swappable attribute will use a file backed memory allocator. (Henning Baldersheim, 2021-08-12, 6, -12/+55)
|
* swapable -> swappable (Henning Baldersheim, 2021-08-12, 2, -3/+3)
|
* Control swappable (Henning Baldersheim, 2021-08-12, 2, -3/+3)
|
* Add swapable attribute option. (Henning Baldersheim, 2021-08-12, 3, -85/+42)
|
* Merge pull request #18716 from vespa-engine/havardpe/avoid-crash-on-runtime-onnx-errors (Henning Baldersheim, 2021-08-11, 4, -10/+20)
|\    avoid crash on run-time onnx errors
| * avoid crash on run-time onnx errors (Håvard Pettersen, 2021-08-11, 4, -10/+20)
| |    - warn about onnx model dry-run being disabled
| |    - catch and report onnx errors during ranking
| |    - zero-fill failed results to avoid re-using previous results
| |    - use explicit output size in fragile model (output became float[2] instead of float[batch] anyways)
* | Unify on using hex for hash values. (Henning Baldersheim, 2021-08-11, 1, -1/+1)
| |
* | Remove outdated comment (Henning Baldersheim, 2021-08-11, 1, -4/+4)
| |
* | Properly access the feature name for hashed edges. (Henning Baldersheim, 2021-08-11, 2, -5/+7)
| |
* | Add unit test with comment of what is incorrect with hashed partition edges and feature generation. (Henning Baldersheim, 2021-08-11, 1, -2/+30)
| |
* | Refactor to avoid multiple hash lookups and code bloat. (Henning Baldersheim, 2021-08-11, 2, -22/+26)
| |
* | Unify code layout. (Henning Baldersheim, 2021-08-10, 2, -40/+30)
| |
* | Unify on 'using' (Henning Baldersheim, 2021-08-10, 2, -7/+5)
| |
* | Minor cleanup. (Henning Baldersheim, 2021-08-10, 1, -24/+14)
| |
* | Simplify (Henning Baldersheim, 2021-08-10, 2, -11/+6)
|/
* Split current global_filter_limit into global_filter.lower_limit/upper_limit. (Henning Baldersheim, 2021-08-04, 4, -11/+47)
|    If estimated_hits < lower_limit, no filter is set, which will cause fallback to brute force.
|    If estimated_hits is in [lower_limit, upper_limit], apply the global filter.
|    If estimated_hits > upper_limit, an empty filter is set. This will avoid the filter setup cost.
|    So if the filter has a huge setup cost, you can reduce upper_limit to a number below 1.0 and instead increase target_num_hits similarly.
|    Setting target_num_hits to 1.0/upper_limit * 1.2 should give similar recall. This adds a 20% safety margin to handle correlation between the filter and the NearestNeighbor calculation.
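The three-way decision in the commit above can be written out as a small sketch. The enum `FilterDecision` and both function names are hypothetical, and two readings are assumptions rather than facts from the source: that estimated_hits is a ratio in [0, 1] comparable to the limits, and that "1.0/upper_limit * 1.2" is a multiplier applied to the original target_num_hits.

```cpp
#include <cassert>

// Hypothetical names; the three outcomes mirror the commit message.
enum class FilterDecision {
    NoFilter,     // below lower_limit: no filter set, falls back to brute force
    ApplyFilter,  // within [lower_limit, upper_limit]: compute and apply the global filter
    EmptyFilter   // above upper_limit: empty filter, avoids the filter setup cost
};

// Assumes estimated_hit_ratio is the estimated hits as a fraction in [0, 1].
FilterDecision decide_global_filter(double estimated_hit_ratio,
                                    double lower_limit,
                                    double upper_limit)
{
    assert(lower_limit <= upper_limit);
    if (estimated_hit_ratio < lower_limit) {
        return FilterDecision::NoFilter;
    }
    if (estimated_hit_ratio > upper_limit) {
        return FilterDecision::EmptyFilter;
    }
    return FilterDecision::ApplyFilter;
}

// Assumed reading of the rule of thumb: scale the original target by
// 1.0/upper_limit and add the 20% safety margin.
double adjusted_target_num_hits(double target_num_hits, double upper_limit)
{
    assert(upper_limit > 0.0 && upper_limit <= 1.0);
    return target_num_hits * (1.0 / upper_limit) * 1.2;
}
```

Under that reading, lowering upper_limit to 0.5 with an original target of 100 hits would suggest asking for about 240 hits.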