vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	Merge branch 'master' into balder/add-noexcept	Henning Baldersheim	2024-04-26	4	-170/+118
\|\
\| *	Merge pull request #31044 from vespa-engine/balder/add-noexcept-4	Henning Baldersheim	2024-04-26	2	-163/+116
\| \|\
\| \| \|	Add noexcept
\| \| *	Add noexcept	Henning Baldersheim	2024-04-25	2	-163/+116
\| \| \|
\| * \|	Use std::vector	Henning Baldersheim	2024-04-25	3	-8/+3
\| \|/
* \|	Add override to destructor.	Henning Baldersheim	2024-04-25	1	-1/+1
\| \|
* \|	Add noexcept and remove outdated comment	Henning Baldersheim	2024-04-25	2	-24/+23
\|/
*	Add noexcept	Henning Baldersheim	2024-04-25	4	-58/+51
\|
*	Add noexcept	Henning Baldersheim	2024-04-24	2	-90/+84
\|
*	Merge pull request #31016 from vespa-engine/balder/use-std-vector	Henning Baldersheim	2024-04-24	17	-159/+146
\|\
\| \|	Use std::vector instead of vespalib::Array
\| *	Remove incorrect noexcept	Henning Baldersheim	2024-04-24	1	-14/+6
\| \|
\| *	Add noexcept	Henning Baldersheim	2024-04-24	3	-29/+29
\| \|
\| *	Add noexcept	Henning Baldersheim	2024-04-24	4	-52/+48
\| \|
\| *	Add noexcept	Henning Baldersheim	2024-04-24	2	-29/+29
\| \|
\| *	Use std::vector instead of vespalib::Array	Henning Baldersheim	2024-04-24	8	-36/+35
\| \|
* \|	Merge pull request #31020 from vespa-engine/toregge/disable-thread-sanitizer-instrumentation-for-accelerated-and128-and-or128	Henning Baldersheim	2024-04-24	1	-0/+9
\|\ \
\| \|/
\|/\|
\| \|	Disable thread sanitizer instrumentation for anonymous get function
\| *	Add comment describing why instrumentation is turned off.	Tor Egge	2024-04-24	1	-0/+3
\| \|
\| *	Disable thread sanitizer instrumentation for anonymous get function	Tor Egge	2024-04-24	1	-0/+6
\| \|	used by accelerated bitwise and/or. Source bitvectors might be modified due to feeding during search.
* \|	Add noexcept	Henning Baldersheim	2024-04-24	2	-4/+4
\| \|
* \|	Add noexcept	Henning Baldersheim	2024-04-24	8	-49/+49
\|/
*	Fix format string in hamming benchmark.	Tor Egge	2024-04-23	1	-1/+2
\|
*	Use explicit and do not expose nbostream in headerfile.	Henning Baldersheim	2024-04-22	1	-4/+4
\|
*	Wire fuzzy prefix matching support through the query stack	Tor Brede Vekterli	2024-04-19	4	-13/+41
\|	Adds `prefix:[true\|false]` annotation support to the `fuzzy` query operator in the YQL and JSON query languages. Fuzzy prefix matching semantics are wired through to the matcher implementations for both indexed and streaming search.
\|
\|	Example usage:
\|
\|	  {maxEditDistance:1,prefix:true}fuzzy("foo")
\|
\|	Will match `foo`, `foobar`, `foxtrot`, `zookeeper` and so on.
\|
\|	It can be combined with the existing prefix locking feature:
\|
\|	  {maxEditDistance:1,prefixLength:2,prefix:true}fuzzy("foo")
\|
\|	Which will match `foo`, `foobar`, `foxtrot` etc, but _not_ `zookeeper` since the locked prefix (`fo`) does not match.
\|
\|	Due to the complexities involved with extending the legacy binary query stack representation, signalling prefix matching for the fuzzy term is done by pragmatically adding a new, generic "prefix matching" term-level flag. This is currently ignored for everything except fuzzy query items.
\|
\|	Modernizing the query stack format to make it more extensible (i.e. move encoding to Protobuf) is on the backlog...!
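The matching semantics described in this commit message can be modelled with a small, self-contained sketch. This is not Vespa's matcher: `min_prefix_edits` and `fuzzy_prefix_match` are hypothetical names, the sketch compares raw bytes rather than Unicode code points, and it assumes that the locked prefix must match exactly while the remainder of the term is fuzzy-prefix matched against the remainder of the candidate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Minimum number of edits needed to turn *some* prefix of `candidate` into the
// whole of `term`. Plain dense dynamic programming, for illustration only.
uint32_t min_prefix_edits(std::string_view candidate, std::string_view term) {
    std::vector<uint32_t> prev(term.size() + 1), cur(term.size() + 1);
    for (size_t j = 0; j < prev.size(); ++j) {
        prev[j] = static_cast<uint32_t>(j); // empty candidate prefix: j insertions
    }
    uint32_t best = prev.back();
    for (size_t i = 1; i <= candidate.size(); ++i) {
        cur[0] = static_cast<uint32_t>(i);
        for (size_t j = 1; j <= term.size(); ++j) {
            uint32_t subst = prev[j - 1] + (candidate[i - 1] == term[j - 1] ? 0u : 1u);
            cur[j] = std::min({subst, prev[j] + 1u, cur[j - 1] + 1u});
        }
        best = std::min(best, cur.back()); // last column: whole term consumed
        std::swap(prev, cur);
    }
    return best;
}

// Hypothetical model of {maxEditDistance:k,prefixLength:p,prefix:true}fuzzy(term).
// Assumption: the first `prefix_length` characters of the term are "locked" and
// must match the candidate exactly; only the rest is matched fuzzily.
bool fuzzy_prefix_match(std::string_view term, std::string_view candidate,
                        uint32_t max_edits, size_t prefix_length) {
    std::string_view locked = term.substr(0, prefix_length);
    if (candidate.substr(0, locked.size()) != locked) {
        return false;
    }
    return min_prefix_edits(candidate.substr(locked.size()),
                            term.substr(locked.size())) <= max_edits;
}

// Mirrors the examples given in the commit message.
void check_examples() {
    for (std::string_view s : {"foo", "foobar", "foxtrot", "zookeeper"}) {
        assert(fuzzy_prefix_match("foo", s, 1, 0));
    }
    for (std::string_view s : {"foo", "foobar", "foxtrot"}) {
        assert(fuzzy_prefix_match("foo", s, 1, 2));
    }
    assert(!fuzzy_prefix_match("foo", "zookeeper", 1, 2));
}
```

Calling `check_examples()` exercises both query forms from the message: without prefix locking `zookeeper` is accepted (one substitution turns `zoo` into `foo`), while with `prefixLength:2` it is rejected because it does not start with `fo`.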
*	Merge pull request #30932 from vespa-engine/vekterli/levenshtein-prefix-matching-algo-support	Tor Brede Vekterli	2024-04-19	14	-69/+379
\|\
\| \|	Add prefix match support to Levenshtein algorithm implementations
\| *	Add prefix match support to Levenshtein algorithm implementations	Tor Brede Vekterli	2024-04-16	14	-69/+379
\| \|	Adds support for matching the _prefix_ of a source string against a target string (the prefix query) within a bounded maximum of `k` edits. Iff the number of edits required is within the specified bound, returns the _minimum_ number of edits required to transform the source string prefix to the full target string.
\| \|
\| \|	By convention, we treat the target string as the "columnar" string in the Levenshtein matrix (i.e. its characters are column-indexed, whereas the source string is row-indexed). This matters for prefix matching, because unlike regular Levenshtein matching it is _not_ symmetric between source and target strings.
\| \|
\| \|	By definition, the Levenshtein matrix cell at row `i`, column `j` provides the minimum number of edits required to transform a prefix of source string S (up to and including length `i`) into a prefix of target string T (up to and including length `j`). Since we want to match against the entire target (prefix query) string of length `n`, the problem is reduced to finding the minimum value of the `n`th column that is `<= k`.
\| \|
\| \|	Example: matching the source string `abcdef` against the target `acd` with `k` == 2:
\| \|
\| \|	        a c d
\| \|	      0 1 2 3
\| \|	    a 1 0 1 2
\| \|	    b 2 1 1 2
\| \|	    c 3 2 1 2
\| \|	    d 4 3 2 1
\| \|	    e 5 4 3 2
\| \|	    f 6 5 4 3
\| \|
\| \|	In this case, the _shortest_ matching prefix is simply `a`, but the _minimum edits_ prefix is `abcd`. The latter prefix's distance is what we return.
\| \|
\| \|	For our generalized (i.e. arbitrary `k`) Levenshtein implementation, this is fairly straightforward to implement since it operates on matrix rows already (sparsely populated around the diagonal).
\| \|
\| \|	For the DFA implementation(s), transitioning between states in a Levenshtein DFA is equivalent to explicitly computing the (sparse) matrix columns around the diagonal for the source character being currently matched, so the same principle applies directly to it. Since we don't have any explicit notion of the value of matrix columns in the abstract DFA, a source string in prefix mode will be considered a match when _any_ DFA state is a match. By definition, this is as-if the matrix row represented by the state has an `n`th column that is <= `k`. We then follow the DFA until it can no longer match, keeping track of the lowest state edit distance encountered. This mirrors finding the row whose `n`th column holds the minimum edit distance.
\| \|
\| \|	Prefix matching support has been added to the core DFA matching loop algorithms, with only very minor changes to the underlying DFA implementations.
\| \|
\| \|	Successor generation upon mismatch should work as expected with no algorithmic changes introduced for prefix matching. Prefix match mode has been added as a dimension to the exhaustive successor checking unit test.
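The column-minimum rule described above can be sketched as follows. This is an illustrative dense dynamic-programming version, not the sparse or DFA-based implementations the commit refers to; the function name `prefix_edit_distance` is hypothetical, and the sketch works on bytes rather than code points. It keeps the target string column-indexed and the source row-indexed, and tracks the minimum value seen in the last (`n`th) column across all rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Returns the minimum number of edits required to transform some prefix of
// `source` into the full `target`, or std::nullopt if that minimum exceeds k.
// Hypothetical helper for illustration; not part of Vespa's API.
std::optional<uint32_t>
prefix_edit_distance(std::string_view source, std::string_view target, uint32_t k) {
    const size_t n = target.size();
    std::vector<uint32_t> prev(n + 1), cur(n + 1);
    // Row 0: the empty source prefix needs j insertions to become target[0, j).
    for (size_t j = 0; j <= n; ++j) {
        prev[j] = static_cast<uint32_t>(j);
    }
    uint32_t best = prev[n]; // minimum of the n-th column seen so far
    for (size_t i = 1; i <= source.size(); ++i) {
        cur[0] = static_cast<uint32_t>(i);
        for (size_t j = 1; j <= n; ++j) {
            uint32_t subst = prev[j - 1] + (source[i - 1] == target[j - 1] ? 0u : 1u);
            cur[j] = std::min({subst, prev[j] + 1u, cur[j - 1] + 1u});
        }
        best = std::min(best, cur[n]);
        std::swap(prev, cur);
    }
    if (best <= k) {
        return best;
    }
    return std::nullopt;
}

// Reproduces the worked example: source `abcdef`, target `acd`, k == 2.
void check_matrix_example() {
    auto result = prefix_edit_distance("abcdef", "acd", 2);
    assert(result.has_value());
    assert(*result == 1); // the minimum-edits prefix is `abcd`
}
```

The shortest matching prefix (`a`, at distance 2) is deliberately not what is reported; the sketch keeps scanning rows and returns the smallest last-column value, which is 1 for the row of `abcd`.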
* \|	Roll out binary_hamming 3 => 4	Henning Baldersheim	2024-04-19	1	-1/+1
\| \|
* \|	Increase roll out of hamming distance from 2 to 3.	Henning Baldersheim	2024-04-18	1	-1/+1
\| \|
* \|	Merge pull request #30933 from vespa-engine/balder/optimize-single-subspace-with-fast-path	Henning Baldersheim	2024-04-17	2	-4/+4
\|\ \
\| \| \|	- Optimize distance calculation for tensors with single dense subspace.
\| * \|	- Optimize distance calculation for tensors with single dense subspace.	Henning Baldersheim	2024-04-16	2	-4/+4
\| \|/
\| \|	- Let EmptySubspace be invalid.
\| \|	- Add noexcept to get_tensor(s).
* \|	Use WORD_SZ instead of the constant 8	Henning Baldersheim	2024-04-17	1	-1/+1
\| \|
* \|	Add micro benchmark for binary hamming distance.	Henning Baldersheim	2024-04-17	4	-13/+69
\|/
*	Add Prometheus support to simple-metrics snapshot rendering	Tor Brede Vekterli	2024-03-22	6	-11/+241
\|
*	Wire Prometheus metric export to state V1 APIs	Tor Brede Vekterli	2024-03-21	10	-70/+168
\|	Extends metric producer classes with the requested exposition format. As a consequence, the State API server has been changed to allow emitting other content types than just `application/json`.
\|
\|	Add custom Prometheus rendering for Slobrok, as it does its own domain-specific metric tracking. However, since it has non-destructive sampling properties, we can actually use proper `counter` types.
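For readers unfamiliar with the Prometheus text exposition format mentioned here, the following is a minimal sketch of how a single `counter` sample could be rendered. It is not the rendering code added by this commit: `escape_label_value`, `render_counter`, and the metric and label names used below are hypothetical, and only the `# TYPE` line plus one sample line are produced.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Escapes a label value for the Prometheus text format, where backslash,
// double quote and newline must be written as \\, \" and \n respectively.
std::string escape_label_value(const std::string& value) {
    std::string out;
    for (char c : value) {
        switch (c) {
        case '\\': out += "\\\\"; break;
        case '"':  out += "\\\""; break;
        case '\n': out += "\\n";  break;
        default:   out += c;      break;
        }
    }
    return out;
}

// Renders one counter sample, preceded by its TYPE line.
// Hypothetical helper; names and signature are illustrative only.
std::string render_counter(const std::string& name,
                           const std::map<std::string, std::string>& labels,
                           uint64_t value) {
    assert(!name.empty());
    std::ostringstream os;
    os << "# TYPE " << name << " counter\n" << name;
    if (!labels.empty()) {
        os << '{';
        bool first = true;
        for (const auto& [key, val] : labels) {
            if (!first) {
                os << ',';
            }
            os << key << "=\"" << escape_label_value(val) << '"';
            first = false;
        }
        os << '}';
    }
    os << ' ' << value << '\n';
    return os.str();
}
```

A `counter` only makes sense when the underlying value is monotonically increasing between scrapes, which is why the non-destructive sampling noted in the commit message matters: a metric that is reset on every snapshot would have to be exposed as a different type instead.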
*	Move normalize_class_name to vespalib.	Tor Egge	2024-03-14	3	-0/+47
\|
*	Merge pull request #30550 from vespa-engine/toregge/early-exit-on-fatal-failure-in-sharded-hash-map-unit-test	Geir Storli	2024-03-11	1	-3/+3
\|\
\| \|	Early exit on fatal failure in sharded hash map unit test.
\| *	Early exit on fatal failure in sharded hash map unit test.	Tor Egge	2024-03-09	1	-3/+3
\| \|
* \|	Merge pull request #30549 from vespa-engine/toregge/early-exit-on-fatal-failure-in-array-store-unit-test	Geir Storli	2024-03-11	1	-12/+12
\|\ \
\| \| \|	Early exit on fatal failure in array store unit test.
\| * \|	Early exit on fatal failure in array store unit test.	Tor Egge	2024-03-09	1	-12/+12
\| \|/
* \|	Merge pull request #30548 from vespa-engine/toregge/early-exit-on-fatal-error-in-frozen-btree-unit-test	Geir Storli	2024-03-11	1	-26/+11
\|\ \
\| \| \|	Early exit on fatal error in frozen btree unit test.
\| * \|	Early exit on fatal error in frozen btree unit test.	Tor Egge	2024-03-09	1	-26/+11
\| \|/
* /	Early exit on fatal failure in unique store test.	Tor Egge	2024-03-09	1	-3/+3
\|/
*	Merge pull request #30532 from vespa-engine/toregge/rewrite-random-unit-test-to-gtest	Henning Baldersheim	2024-03-08	2	-45/+19
\|\
\| \|	Rewrite random unit test to gtest.
\| *	Rewrite random unit test to gtest.	Tor Egge	2024-03-08	2	-45/+19
\| \|
* \|	Merge pull request #30533 from vespa-engine/toregge/rewrite-ptr-holder-unit-test-to-gtest	Henning Baldersheim	2024-03-08	2	-25/+5
\|\ \
\| \| \|	Rewrite PtrHolder unit test to gtest.
\| * \|	Rewrite PtrHolder unit test to gtest.	Tor Egge	2024-03-08	2	-25/+5
\| \|/
* \|	Merge pull request #30534 from vespa-engine/toregge/rewrite-asciistream-unit-test-to-gtest	Henning Baldersheim	2024-03-08	2	-242/+201
\|\ \
\| \| \|	Rewrite asciistream unit test to gtest.
\| * \|	Rewrite asciistream unit test to gtest.	Tor Egge	2024-03-08	2	-242/+201
\| \|/
* \|	Merge pull request #30531 from vespa-engine/toregge/rewrite-program-options-unit-test-to-gtest	Henning Baldersheim	2024-03-08	2	-137/+87
\|\ \
\| \| \|	Rewrite program options unit test to gtest.
\| * \|	Rewrite program options unit test to gtest.	Tor Egge	2024-03-08	2	-137/+87
\| \|/
* \|	Merge pull request #30535 from vespa-engine/toregge/rewrite-xml-serializable-unit-test-to-gtest	Geir Storli	2024-03-08	2	-37/+11
\|\ \
\| \| \|	Rewrite XmlSerializable unit test to gtest.
\| * \|	Rewrite XmlSerializable unit test to gtest.	Tor Egge	2024-03-08	2	-37/+11
\| \|/