| Commit message (Collapse) | Author | Age | Files | Lines |
|
Add noexcept
|
Use std::vector instead of vespalib::Array
|
vespa-engine/toregge/disable-thread-sanitizer-instrumentation-for-accelerated-and128-and-or128
Disable thread sanitizer instrumentation for anonymous get function
|
used by accelerated bitwise and/or. Source bitvectors might be modified
due to feeding during search.
|
Adds `prefix:[true|false]` annotation support to the `fuzzy`
query operator in the YQL and JSON query languages. Fuzzy
prefix matching semantics are wired through to the matcher
implementations for both indexed and streaming search.
Example usage:
    {maxEditDistance:1,prefix:true}fuzzy("foo")
This will match `foo`, `foobar`, `foxtrot`, `zookeeper` and so on.
It can be combined with the existing prefix locking feature:
    {maxEditDistance:1,prefixLength:2,prefix:true}fuzzy("foo")
This will match `foo`, `foobar`, `foxtrot` etc., but _not_
`zookeeper`, since the locked prefix (`fo`) does not match.
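The two example queries above can be illustrated with a minimal sketch. It assumes that the locked prefix must match exactly and that the edit budget applies only to the remainder of the term. The names `min_prefix_edits` and `fuzzy_prefix_match` are hypothetical and are not part of the Vespa code base; the actual matcher implementations may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <string_view>
#include <vector>

// Illustrative helper: minimum number of edits needed to turn any prefix of
// `source` into the whole of `target` (plain row-by-row Levenshtein).
uint32_t min_prefix_edits(std::string_view source, std::string_view target) {
    const size_t n = target.size();
    std::vector<uint32_t> prev(n + 1), cur(n + 1);
    for (size_t j = 0; j <= n; ++j) {
        prev[j] = static_cast<uint32_t>(j);
    }
    uint32_t best = prev[n]; // the empty source prefix
    for (size_t i = 1; i <= source.size(); ++i) {
        cur[0] = static_cast<uint32_t>(i);
        for (size_t j = 1; j <= n; ++j) {
            uint32_t subst = prev[j - 1] + (source[i - 1] == target[j - 1] ? 0 : 1);
            cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
        }
        best = std::min(best, cur[n]);
        std::swap(prev, cur);
    }
    return best;
}

// Hypothetical predicate mirroring
// {maxEditDistance:max_edits,prefixLength:prefix_length,prefix:true}fuzzy(query).
// Assumption: the first `prefix_length` characters are compared exactly.
bool fuzzy_prefix_match(std::string_view candidate, std::string_view query,
                        uint32_t max_edits, size_t prefix_length) {
    prefix_length = std::min(prefix_length, query.size());
    if (candidate.size() < prefix_length) {
        return false; // candidate is too short to contain the locked prefix
    }
    if (candidate.substr(0, prefix_length) != query.substr(0, prefix_length)) {
        return false; // locked prefix mismatch
    }
    return min_prefix_edits(candidate.substr(prefix_length),
                            query.substr(prefix_length)) <= max_edits;
}
```

With `prefix_length` set to 0 the sketch accepts `zookeeper` for the query `foo`; with `prefix_length` set to 2 it rejects it, because `zo` differs from `fo`.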
Due to the complexities involved with extending the legacy binary
query stack representation, signalling prefix matching for the
fuzzy term is done by pragmatically adding a new, generic "prefix
matching" term-level flag. This is currently ignored for
everything except fuzzy query items.
Modernizing the query stack format to make it more extensible
(i.e. move encoding to Protobuf) is on the backlog...!
|
vespa-engine/vekterli/levenshtein-prefix-matching-algo-support
Add prefix match support to Levenshtein algorithm implementations
|
Adds support for matching the _prefix_ of a source string against
a target string (the prefix query) within a bounded maximum of `k`
edits. Iff the number of edits required is within the specified
bound, returns the _minimum_ number of edits required to transform
a source string prefix into the full target string.
By convention, we treat the target string as the "columnar" string
in the Levenshtein matrix (i.e. its characters are column-indexed,
whereas the source string is row-indexed). This matters for prefix
matching, because unlike regular Levenshtein matching it is _not_
symmetric between source and target strings.
By definition, the Levenshtein matrix cell at row `i`, column `j`
provides the minimum number of edits required to transform a prefix
of source string S (up to and including length `i`) into a prefix of
target string T (up to and including length `j`). Since we want to
match against the entire target (prefix query) string of length `n`,
the problem is reduced to finding the minimum value of the `n`th
column that is `<= k`.
Example: matching the source string `abcdef` against the target `acd`
with `k` == 2:
        a c d
      0 1 2 3
    a 1 0 1 2
    b 2 1 1 2
    c 3 2 1 2
    d 4 3 2 1
    e 5 4 3 2
    f 6 5 4 3
In this case, the _shortest_ matching prefix is simply `a`, but the
_minimum edits_ prefix is `abcd`. The latter prefix's distance is
what we return.
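The column-minimum rule described above can be sketched as follows. This is a minimal illustration that computes every matrix row in full, not the sparse implementation the commit refers to. The function name `prefix_match` is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Returns the minimum number of edits needed to turn some prefix of `source`
// into the full `target`, or std::nullopt if that minimum exceeds `k`.
// The target is column-indexed and the source is row-indexed.
std::optional<uint32_t> prefix_match(std::string_view source,
                                     std::string_view target, uint32_t k) {
    const size_t n = target.size();
    std::vector<uint32_t> prev(n + 1), cur(n + 1);
    for (size_t j = 0; j <= n; ++j) {
        prev[j] = static_cast<uint32_t>(j); // row 0: empty source prefix
    }
    uint32_t best = prev[n]; // n-th column of row 0
    for (size_t i = 1; i <= source.size(); ++i) {
        cur[0] = static_cast<uint32_t>(i);
        for (size_t j = 1; j <= n; ++j) {
            uint32_t subst = prev[j - 1] + (source[i - 1] == target[j - 1] ? 0 : 1);
            cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
        }
        best = std::min(best, cur[n]); // track the minimum of the n-th column
        std::swap(prev, cur);
    }
    if (best <= k) {
        return best;
    }
    return std::nullopt;
}
```

For the example matrix, `prefix_match("abcdef", "acd", 2)` yields 1, which is the value in row `d` of the last column.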
For our generalized (i.e. arbitrary `k`) Levenshtein implementation,
this is fairly straightforward to implement since it already operates
on matrix rows (sparsely populated around the diagonal).
For the DFA implementation(s), transitioning between states in a
Levenshtein DFA is equivalent to explicitly computing the (sparse)
matrix columns around the diagonal for the source character being
currently matched, so the same principle applies directly to it.
Since we don't have any explicit notion of the value of matrix columns
in the abstract DFA, a source string in prefix mode will be considered
a match when _any_ DFA state is a match.
By definition, this is as if the matrix row represented by the state
has an `n`th column that is `<= k`. We then follow the DFA until it
can no longer match, keeping track of the lowest state edit distance
encountered. This mirrors finding the row whose `n`th column holds
the minimum value that is `<= k`.
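The matching loop described above can be sketched with a stand-in automaton whose state is simply one full matrix row. This is an assumption made for illustration: a real Levenshtein DFA encodes states far more compactly, and the class `RowAutomaton` and function `match_prefix` are hypothetical names, not the actual Vespa API.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a Levenshtein DFA: each state is a matrix row.
class RowAutomaton {
    std::string_view _target;
    uint32_t         _k;
public:
    using State = std::vector<uint32_t>;
    RowAutomaton(std::string_view target, uint32_t k) : _target(target), _k(k) {}
    State start() const {
        State s(_target.size() + 1);
        for (size_t j = 0; j < s.size(); ++j) {
            s[j] = static_cast<uint32_t>(j);
        }
        return s;
    }
    // Transition on one source character, i.e. compute the next matrix row.
    State step(const State& from, char c) const {
        State to(from.size());
        to[0] = from[0] + 1;
        for (size_t j = 1; j < to.size(); ++j) {
            uint32_t subst = from[j - 1] + (c == _target[j - 1] ? 0 : 1);
            to[j] = std::min({subst, from[j] + 1, to[j - 1] + 1});
        }
        return to;
    }
    // A state matches when its n-th column is within the bound.
    bool is_match(const State& s) const { return s.back() <= _k; }
    // Row minima never decrease, so a row with no cell <= k is a dead end.
    bool can_match(const State& s) const {
        return *std::min_element(s.begin(), s.end()) <= _k;
    }
    uint32_t match_distance(const State& s) const { return s.back(); }
};

// Prefix mode: any matching state counts, and the lowest distance seen wins.
std::optional<uint32_t> match_prefix(const RowAutomaton& dfa, std::string_view source) {
    std::optional<uint32_t> best;
    auto observe = [&](const RowAutomaton::State& s) {
        if (dfa.is_match(s) && (!best || dfa.match_distance(s) < *best)) {
            best = dfa.match_distance(s);
        }
    };
    auto state = dfa.start();
    observe(state);
    for (char c : source) {
        state = dfa.step(state, c);
        if (!dfa.can_match(state)) {
            break; // the automaton can no longer match
        }
        observe(state);
    }
    return best;
}
```

The early exit is what distinguishes this loop from scanning the whole source string: once no cell in the current row is within `k`, no later row can be either.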
Prefix matching support has been added to the core DFA matching
loop algorithms, with only very minor changes to the underlying DFA
implementations. Successor generation upon mismatch should work
as expected with no algorithmic changes introduced for prefix
matching. Prefix match mode has been added as a dimension to the
exhaustive successor checking unit test.
|
vespa-engine/balder/optimize-single-subspace-with-fast-path
- Optimize distance calculation for tensors with single dense subspace.
|
- Let EmptySubspace be invalid.
- Add noexcept to get_tensor(s).
|
Extends metric producer classes with the requested exposition format.
As a consequence, the State API server has been changed to allow
emitting other content types than just `application/json`.
Add custom Prometheus rendering for Slobrok, as it does its own
domain-specific metric tracking. However, since it has non-destructive
sampling properties, we can actually use proper `counter` types.
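As a rough illustration of what a `counter` looks like in the Prometheus text exposition format, the sketch below renders a single cumulative value. The function name `render_counter` and the metric name used in the usage note are hypothetical and are not taken from the Slobrok code. The content type constant is the one commonly used for this text format.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Content type commonly used for the Prometheus text exposition format.
constexpr const char* prometheus_content_type = "text/plain; version=0.0.4";

// Renders one cumulative (never reset on sampling) value as a counter.
// Hypothetical helper for illustration only.
std::string render_counter(const std::string& name, const std::string& help,
                           uint64_t value) {
    std::ostringstream out;
    out << "# HELP " << name << ' ' << help << '\n';
    out << "# TYPE " << name << " counter\n";
    out << name << ' ' << value << '\n';
    return out.str();
}
```

Calling `render_counter("example_requests_total", "Requests handled", 42)` produces three lines: a `# HELP` line, a `# TYPE ... counter` line, and the sample `example_requests_total 42`. A counter is only appropriate here because the underlying value keeps growing across samples rather than being reset when read.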
|
vespa-engine/toregge/early-exit-on-fatal-failure-in-sharded-hash-map-unit-test
Early exit on fatal failure in sharded hash map unit test.
|
vespa-engine/toregge/early-exit-on-fatal-failure-in-array-store-unit-test
Early exit on fatal failure in array store unit test.
|
vespa-engine/toregge/early-exit-on-fatal-error-in-frozen-btree-unit-test
Early exit on fatal error in frozen btree unit test.
|
vespa-engine/toregge/rewrite-random-unit-test-to-gtest
Rewrite random unit test to gtest.
|
vespa-engine/toregge/rewrite-ptr-holder-unit-test-to-gtest
Rewrite PtrHolder unit test to gtest.
|
vespa-engine/toregge/rewrite-asciistream-unit-test-to-gtest
Rewrite asciistream unit test to gtest.
|
vespa-engine/toregge/rewrite-program-options-unit-test-to-gtest
Rewrite program options unit test to gtest.
|
vespa-engine/toregge/rewrite-xml-serializable-unit-test-to-gtest
Rewrite XmlSerializable unit test to gtest.
|