path: root/searchlib
Commit message (Author, Age, Files, Lines)
* Update copyright (Jon Bratseth, 2023-10-09, 2456 files, -2458/+2458)
|
* - Avoid holding a bucketizer guard. Just get it every time you need it. (Henning Baldersheim, 2023-10-05, 2 files, -25/+3)
|   - Max hold time is often above 2-3 seconds. This makes it very likely that a sudden buildup might add a lot of memory to onhold.
* Use ConstBufferRef and add some noexcept (Henning Baldersheim, 2023-10-05, 16 files, -76/+79)
|
* Merge pull request #28801 from vespa-engine/balder/disable-cache-for-removed-subdb (Henning Baldersheim, 2023-10-05, 2 files, -0/+6)
|\
| |   Disable cache for removed only docsubdb.
| * Add test for disabling of cache in removed db (Henning Baldersheim, 2023-10-05, 2 files, -0/+5)
| |
| * Disable cache for removed only docsubdb. (Henning Baldersheim, 2023-10-05, 1 file, -0/+1)
| |
* | Merge pull request #28800 from vespa-engine/balder/reduce-max-number-of-lids-2-8m (Henning Baldersheim, 2023-10-05, 2 files, -6/+6)
|\ \
| |/
|/|   - Reduce max lids per file and max file size to 4M and 256M during un…
| * - Reduce max lids per file and max file size to 4M and 256M during unit testing. (Henning Baldersheim, 2023-10-05, 2 files, -6/+6)
| |   - Reduce max lids from 40M to 8M as default configuration.
* | Merge branch 'master' into balder/refactor-for-clarity (Henning Baldersheim, 2023-10-05, 4 files, -20/+35)
|\ \
| * | - Instead of keeping a map of bucketId => lids, just append everything to a vector and sort when complete. (Henning Baldersheim, 2023-10-04, 4 files, -21/+36)
| |/
| |   - This significantly improves memory usage during compaction. Instead of many heap allocations, you now get fewer mmapped allocations that are dropped when done.
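The append-then-sort idea in the commit above can be sketched roughly as follows. This is a minimal illustration, not the searchlib code: `BucketLidCollector`, `BucketId`, and `Lid` are hypothetical names, and plain integers stand in for the real bucket and lid types.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real bucket id and local document id types.
using BucketId = uint64_t;
using Lid = uint32_t;

class BucketLidCollector {
public:
    // Appending is a plain emplace_back into one contiguous buffer; there is
    // no per-bucket node allocation as there would be with a map.
    void add(BucketId bucket, Lid lid) { _entries.emplace_back(bucket, lid); }

    // Sort once when everything has been collected. Entries then group by
    // bucket, with lids ascending inside each bucket.
    void complete() { std::sort(_entries.begin(), _entries.end()); }

    const std::vector<std::pair<BucketId, Lid>>& entries() const { return _entries; }

private:
    std::vector<std::pair<BucketId, Lid>> _entries;
};
```

A single large vector tends to be served by a few big allocations that can be released in one go, which matches the memory behaviour the commit message describes.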
* / - Number of partitions is fixed at compile time => use std::array. (Henning Baldersheim, 2023-10-05, 4 files, -22/+25)
|/
|   - Use unique_ptr on outer object instead of unique_ptr on multiple non-movable inner objects.
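A rough sketch of the layout change described above, under assumed names: `Partition`, `PartitionSet`, `NUM_PARTITIONS`, and `make_partitions` are hypothetical, and the partition count of 4 is picked only for illustration.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <mutex>

// Assumed value; the real partition count is not given in the log.
constexpr size_t NUM_PARTITIONS = 4;

struct Partition {
    std::mutex lock;      // a mutex member makes Partition non-movable
    size_t     count = 0;
};

// The count is known at compile time, so the partitions live inline in a
// std::array instead of a dynamically sized container.
struct PartitionSet {
    std::array<Partition, NUM_PARTITIONS> parts;
};

// One heap allocation owns the whole set, rather than one
// std::unique_ptr<Partition> per element.
inline std::unique_ptr<PartitionSet> make_partitions() {
    return std::make_unique<PartitionSet>();
}
```

Moving the `unique_ptr` to the outer object keeps the owner movable even though each `Partition` is not.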
* GC unused include (Henning Baldersheim, 2023-10-04, 1 file, -2/+0)
|
* Process idx file in streaming fashion instead of first reading all and then processing. (Henning Baldersheim, 2023-10-04, 2 files, -73/+48)
* GC unused and non-computed return value. (Henning Baldersheim, 2023-10-04, 4 files, -46/+53)
|   Refactor to prepare for streaming read.
* Use large allocator and control size of TmpChunkMeta. (Henning Baldersheim, 2023-10-04, 1 file, -1/+2)
|
* Merge pull request #28776 from vespa-engine/toregge/avoid-unaligned-read-while-decoding-serialized-query-stack-dump (Tor Egge, 2023-10-03, 1 file, -3/+4)
|\
| |   Avoid unaligned read while decoding serialized query stack dump.
| * Avoid unaligned read while decoding serialized query stack dump. (Tor Egge, 2023-10-03, 1 file, -3/+4)
| |
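A common way to avoid an unaligned read when decoding a byte buffer is to copy the bytes into a properly aligned local with `std::memcpy`. The helper below is a generic sketch of that technique; `read_u32` is a hypothetical name and the log does not say which approach the commit actually took.

```cpp
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from a byte buffer at any offset. Dereferencing
// reinterpret_cast<const uint32_t*>(p) is undefined behaviour when p is not
// suitably aligned; std::memcpy has no alignment requirement.
inline uint32_t read_u32(const char* p) {
    uint32_t value;
    std::memcpy(&value, p, sizeof(value));
    return value;   // bytes are interpreted in host byte order
}
```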
* | Merge pull request #28773 from vespa-engine/geirst/dfa-table-as-default-fuzzy-matching-algorithm (Henning Baldersheim, 2023-10-03, 3 files, -5/+6)
|\ \
| |/
|/|   Use DfaTable as default fuzzy matching algorithm for maxEditDistance …
| * Use DfaTable as default fuzzy matching algorithm for maxEditDistance <= 2. (Geir Storli, 2023-10-03, 3 files, -5/+6)
| |
* | Prevent eternal loop if bit vectors are shorter than docid limit (Henning Baldersheim, 2023-10-03, 3 files, -8/+8)
| |
* | Add disabled test to prove eternal loop. (Henning Baldersheim, 2023-10-03, 1 file, -4/+35)
| |
* | Add test counting seeks (Henning Baldersheim, 2023-10-03, 1 file, -0/+16)
| |
* | Refactor test (Henning Baldersheim, 2023-10-03, 1 file, -127/+90)
|/
* Revert "Use DfaTable as default fuzzy matching algorithm for maxEditDistance …" (Henning Baldersheim, 2023-10-02, 2 files, -2/+2)
* Merge pull request #28765 from vespa-engine/geirst/dfa-table-as-default-fuzzy-matching-algorithm (Geir Storli, 2023-10-02, 2 files, -2/+2)
|\
| |   Use DfaTable as default fuzzy matching algorithm for maxEditDistance …
| * Use DfaTable as default fuzzy matching algorithm for maxEditDistance <= 2. (Geir Storli, 2023-10-02, 2 files, -2/+2)
| |
* | Merge pull request #28736 from vespa-engine/balder/use-as-bitvector-api-instead-of-casting (Henning Baldersheim, 2023-10-02, 4 files, -16/+34)
|\ \
| | |   - Use asBitVectorIterator instead of isBitVector + casting to present…
| * | Expose only necessary meta information for bitvector, not the iterator interface (Henning Baldersheim, 2023-10-02, 4 files, -19/+32)
| | |
| * | - Use asBitVectorIterator instead of isBitVector + casting to present a BitVectorIterator interface. (Henning Baldersheim, 2023-09-29, 4 files, -10/+15)
| | |   - Allow Filter wrapper to expose underlying BitVector.
| | |   - This ensures that the bitvectors are handled first during termwise evaluation, as they have a constant cost and will reduce the cost for the ones coming later on.
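The accessor-instead-of-cast pattern from these commits might look roughly like the sketch below. The classes are simplified and hypothetical: the method name `asBitVector`, the `BitVectorMeta` fields, and `FilterWrapper` are illustrative and may not match the searchlib declarations.

```cpp
#include <cstdint>

// Hypothetical, simplified metadata; only the accessor pattern is the point.
struct BitVectorMeta {
    uint32_t docid_limit = 0;
    bool     inverted = false;
};

class SearchIterator {
public:
    virtual ~SearchIterator() = default;
    // Default answer: this iterator is not backed by a bit vector.
    virtual const BitVectorMeta* asBitVector() const noexcept { return nullptr; }
};

class BitVectorIterator : public SearchIterator {
public:
    explicit BitVectorIterator(BitVectorMeta meta) : _meta(meta) {}
    const BitVectorMeta* asBitVector() const noexcept override { return &_meta; }
private:
    BitVectorMeta _meta;
};

// A wrapper forwards the question to the iterator it wraps. A boolean
// isBitVector() check followed by a downcast of the wrapper itself could
// not reach the underlying bit vector this way.
class FilterWrapper : public SearchIterator {
public:
    explicit FilterWrapper(const SearchIterator& wrapped) : _wrapped(wrapped) {}
    const BitVectorMeta* asBitVector() const noexcept override { return _wrapped.asBitVector(); }
private:
    const SearchIterator& _wrapped;
};
```

Callers test the returned pointer instead of casting, so wrapped bit vectors can be recognised and ordered first during termwise evaluation.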
* | | Merge pull request #28723 from vespa-engine/balder/lift-out-single-leaf-iterators-from-ws (Henning Baldersheim, 2023-10-02, 4 files, -36/+73)
|\ \ \
| |_|/
|/| |   Lift out single iterators if they are leafs and tfmd is not needed.
| * | Use new scoped if syntax. (Henning Baldersheim, 2023-10-02, 1 file, -2/+1)
| | |
| * | If there is a single child in the ws that is also a leaf, it will be lifted out directly. (Henning Baldersheim, 2023-09-29, 3 files, -5/+19)
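The lifting optimisation can be illustrated with the sketch below. All names here (`Iterator`, `WeightedSetIterator`, `create_weighted_set`, `match_data_needed`) are hypothetical, and the condition is a simplified reading of the commit messages rather than the actual blueprint logic.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical iterator types; not the searchlib classes.
struct Iterator {
    virtual ~Iterator() = default;
    virtual bool is_leaf() const { return true; }
};

struct WeightedSetIterator : Iterator {
    explicit WeightedSetIterator(std::vector<std::unique_ptr<Iterator>> c)
        : children(std::move(c)) {}
    bool is_leaf() const override { return false; }
    std::vector<std::unique_ptr<Iterator>> children;
};

// With exactly one leaf child and no need for term field match data, the
// wrapper adds only overhead, so the child itself is returned.
inline std::unique_ptr<Iterator>
create_weighted_set(std::vector<std::unique_ptr<Iterator>> children, bool match_data_needed) {
    if (children.size() == 1 && children[0]->is_leaf() && !match_data_needed) {
        return std::move(children[0]);
    }
    return std::make_unique<WeightedSetIterator>(std::move(children));
}
```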
| * | Add test for single term wsets (Henning Baldersheim, 2023-09-29, 1 file, -12/+32)
| | |
| * | Use braced initializers (Henning Baldersheim, 2023-09-29, 1 file, -21/+19)
| | |
| * | Lift out single iterators if they are leafs and tfmd is not needed. (Henning Baldersheim, 2023-09-29, 2 files, -3/+9)
| |/
* | Normalize class names in attribute weighted set blueprint test. (Tor Egge, 2023-09-29, 1 file, -4/+27)
| |
* | Merge pull request #28737 from vespa-engine/geirst/fuzzy-posting-list-fallback (Geir Storli, 2023-09-29, 2 files, -3/+43)
|\ \
| |/
|/|   Add fallback to using posting list when fuzzy and being non-strict.
| * Add fallback to using posting list when fuzzy and being non-strict. (Geir Storli, 2023-09-29, 2 files, -3/+43)
| |
* | Reduce code duplication between fillArray and fillBitVector in PostingListFoldedSearchContextT. (Tor Egge, 2023-09-29, 2 files, -23/+35)
|/
* - Resolve (!field_is_filter && !_tmd.isNotNeeded()) once upfront. (Henning Baldersheim, 2023-09-29, 1 file, -5/+5)
|   - Lift out single items if filter or match data not needed.
* Lift out single iterators if either field is filter, or termfieldmatchdata is not needed. (Henning Baldersheim, 2023-09-28, 1 file, -1/+1)
|
* Add noexcept (Henning Baldersheim, 2023-09-28, 1 file, -32/+34)
|
* Merge pull request #28687 from vespa-engine/toregge/avoid-unneeded-counting-of-hits (Geir Storli, 2023-09-28, 4 files, -45/+153)
|\
| |   Avoid counting hits in range multiple times.
| * Store a limited number of posting list indexes in countHits() to reduce amount of dictionary entry filtering in fillArray() and fillBitVector() for regexp search and fuzzy search. (Tor Egge, 2023-09-27, 4 files, -10/+70)
| |
| * Avoid counting hits in range multiple times. (Tor Egge, 2023-09-27, 2 files, -43/+91)
| |
* | Merge pull request #28691 from vespa-engine/vekterli/preserve-successor-prefix-during-matching (Henning Baldersheim, 2023-09-27, 2 files, -5/+3)
|\ \
| | |   Preserve prefix of input DFA successor string
| * | Preserve prefix of input DFA successor string (Tor Brede Vekterli, 2023-09-27, 2 files, -5/+3)
| |/
| |   If a non-empty string is passed as a successor to the DFA, the contents of the string will be preserved, i.e. the successor will always be _appended_ to any existing data. This allows for less manual fiddling when implementing prefix locking by the caller (no need to concatenate a prefix with the generated successor string).
| |
| |   Note: this has some added cognitive cost where the caller now has the entire responsibility of resetting the successor between calls. The existing fuzzy matcher has been updated to no longer require a separation between successor prefix and suffix; it can now safely reuse the successor prefix between calls.
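The append-only contract and the caller-side reset described above can be sketched as follows. `append_successor` is a hypothetical stand-in for the DFA match call and `SuccessorBuffer` is an assumed helper; neither is the actual Vespa API.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a DFA match call: it only ever appends to
// 'successor', so any prefix the caller placed there is preserved.
inline void append_successor(std::string_view generated, std::string& successor) {
    successor.append(generated);
}

// Caller-side prefix locking. The caller owns the reset between calls and
// truncates back to the prefix length before each new match, which lets it
// reuse the same buffer and prefix every time.
class SuccessorBuffer {
public:
    explicit SuccessorBuffer(std::string prefix)
        : _buf(std::move(prefix)), _prefix_len(_buf.size()) {}
    std::string& reset() {
        _buf.resize(_prefix_len);
        return _buf;
    }
private:
    std::string _buf;
    size_t      _prefix_len;
};
```

Forgetting the `reset()` call would leave the previous successor in place and the next result would be appended after it, which is the added caller responsibility the commit note warns about.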
* / Split MultiBitVectorIterator into implementation and Iterator interface for reuse. (Henning Baldersheim, 2023-09-27, 2 files, -99/+158)
|/
* Factor out fallback_to_approx_num_hits() member function in posting list search contexts. (Tor Egge, 2023-09-27, 2 files, -32/+16)
* Merge pull request #28670 from vespa-engine/balder/use-DocumentWeightOrFilterSearch-for-iterator-packs (Henning Baldersheim, 2023-09-26, 8 files, -31/+58)
|\
| |   - Make iterator pack template argument to handle both AttributeIterat…