Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | vespalib::stringref => std::string_view | Henning Baldersheim | 2 days | 3 | -3/+3 |
| | |||||
* | Merge pull request #31660 from vespa-engine/havardpe/remove-testapp | Henning Baldersheim | 2024-06-20 | 1 | -1/+1 |
|\ | | | | | remove TEST_APPHOOK, TEST_INIT, TEST_DONE and TestApp | ||||
| * | remove TEST_APPHOOK, TEST_INIT, TEST_DONE and TestApp | Håvard Pettersen | 2024-06-20 | 1 | -1/+1 |
| | | |||||
* | | Rename streamingvisitors library to vespa_streamingvisitors. | Tor Egge | 2024-06-20 | 13 | -13/+13 |
|/ | |||||
* | Rename searchlib library to vespa_searchlib. | Tor Egge | 2024-06-20 | 2 | -2/+2 |
| | |||||
* | Read searchvisitor unit test config from source directory. | Tor Egge | 2024-06-18 | 1 | -3/+8 |
| | |||||
* | Merge pull request #31511 from vespa-engine/toregge/rewrite-vsm-document-unit-test-to-gtest | Henning Baldersheim | 2024-06-10 | 2 | -54/+33 |
|\ | | | | | | | | Rewrite vsm document unit test to gtest. | ||||
| * | Use string literals directly. | Tor Egge | 2024-06-10 | 1 | -11/+11 |
| | | |||||
| * | Rewrite vsm document unit test to gtest. | Tor Egge | 2024-06-10 | 2 | -54/+33 |
| | | |||||
* | | Merge pull request #31513 from vespa-engine/toregge/rewrite-query-wrapper-unit-test-to-gtest | Henning Baldersheim | 2024-06-10 | 2 | -26/+7 |
|\ \ | | | | | | | | | | | | Rewrite query wrapper unit test to gtest. | ||||
| * | | Rewrite query wrapper unit test to gtest. | Tor Egge | 2024-06-10 | 2 | -26/+7 |
| | | | |||||
* | | | Merge pull request #31514 from vespa-engine/toregge/rewrite-textutil-unit-test-to-gtest | Henning Baldersheim | 2024-06-10 | 2 | -50/+30 |
|\ \ \ | | | | | | | | | | | | | | | | Rewrite textutil unit test to gtest. | ||||
| * | | | Rewrite textutil unit test to gtest. | Tor Egge | 2024-06-10 | 2 | -50/+30 |
| |/ / | |||||
* | | | Merge pull request #31510 from vespa-engine/toregge/rewrite-vsm-docsum-unit-test-to-gtest | Henning Baldersheim | 2024-06-10 | 2 | -54/+53 |
|\ \ \ | |/ / |/| | | | | | | | Rewrite vsm docsum unit test to gtest. | ||||
| * | | Rewrite vsm docsum unit test to gtest. | Tor Egge | 2024-06-10 | 2 | -54/+53 |
| |/ | |||||
* / | Rewrite vsm::CharBuffer unit test to gtest. | Tor Egge | 2024-06-10 | 2 | -63/+49 |
|/ | |||||
* | Use explicit and do not expose nbostream in header file. | Henning Baldersheim | 2024-04-22 | 1 | -0/+2 |
| | |||||
* | Wire fuzzy prefix matching support through the query stack | Tor Brede Vekterli | 2024-04-19 | 1 | -27/+87 |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds `prefix:[true|false]` annotation support to the `fuzzy` query operator in the YQL and JSON query languages. Fuzzy prefix matching semantics are wired through to the matcher implementations for both indexed and streaming search. Example usage: {maxEditDistance:1,prefix:true}fuzzy("foo") Will match `foo`, `foobar`, `foxtrot`, `zookeeper` and so on. It can be combined with the existing prefix locking feature: {maxEditDistance:1,prefixLength:2,prefix:true}fuzzy("foo") Which will match `foo`, `foobar`, `foxtrot` etc, but _not_ `zookeeper` since the locked prefix (`fo`) does not match. Due to the complexities involved with extending the legacy binary query stack representation, signalling prefix matching for the fuzzy term is done by pragmatically adding a new, generic "prefix matching" term-level flag. This is currently ignored for everything except fuzzy query items. Modernizing the query stack format to make it more extensible (i.e. move encoding to Protobuf) is on the backlog...! | ||||
* | Add streaming mode version of tokens document field writer. | Tor Egge | 2024-03-27 | 2 | -0/+101 |
| | |||||
* | Use filter settings from rank profiles and query terms in streaming search. | Tor Egge | 2024-03-15 | 1 | -6/+10 |
| | |||||
* | Take ownership of the stuff you provide. Do not rely on the caller. | Henning Baldersheim | 2024-02-13 | 2 | -11/+7 |
| | |||||
* | Also test size of heap and number of hits kept. | Henning Baldersheim | 2024-02-13 | 1 | -0/+6 |
| | |||||
* | Test that all hits are kept. | Henning Baldersheim | 2024-02-13 | 1 | -0/+17 |
| | |||||
* | - Add all hits to the hit collector. | Henning Baldersheim | 2024-02-13 | 1 | -14/+21 |
| | | | | | - Maintain a heap on the side, and keep heap property when producing results and features. - Drop the pointer to the document once it drops off the heap. | ||||
* | Revert "Revert "- Use explicit given wanted hit count."" | Henning Baldersheim | 2024-02-12 | 2 | -12/+12 |
| | |||||
* | Revert "- Use explicit given wanted hit count." | Henning Baldersheim | 2024-02-12 | 2 | -12/+12 |
| | |||||
* | It is known up front whether we sort by rank or by sortblob. | Henning Baldersheim | 2024-02-11 | 2 | -12/+12 |
| | | | | | | So instead of detecting by first hit, and hoping the rest are the same, set expectations ahead and assert all hits are correct. | ||||
* | Handle search::streaming::PhraseQueryNode as a leaf in the query tree. | Tor Egge | 2024-02-06 | 2 | -37/+10 |
| | |||||
* | Move Normalization from search::streaming => search | Henning Baldersheim | 2024-02-05 | 1 | -1/+1 |
| | |||||
* | Add unpack_match_data member function to search::streaming::QueryTerm. | Tor Egge | 2024-02-05 | 2 | -8/+2 |
| | |||||
* | Start with position 0 for each element in streaming search. | Tor Egge | 2024-01-25 | 1 | -90/+92 |
| | |||||
* | Track element length in streaming mode. | Tor Egge | 2024-01-25 | 1 | -0/+15 |
| | |||||
* | Treat regex and fuzzy whole-field matching as 1 logical word | Tor Brede Vekterli | 2024-01-22 | 1 | -0/+16 |
| | | | | | | We have concluded that this is the most semantically correct way of reporting the count, and as a bonus it avoids having to do a separate pass over the string buffer. | ||||
* | Adjust search::streaming::Hit to better match | Tor Egge | 2024-01-22 | 2 | -2/+3 |
| | | | | search::fef::TermFieldMatchDataPosition. | ||||
* | Support fuzzy term matching in streaming search | Tor Brede Vekterli | 2024-01-18 | 1 | -28/+192 |
| | | | | | | | | | | Uses a DFA-based matcher for max edits in {1, 2} and falls back to the legacy non-DFA matcher for all other values (including 0). Currently only supports fuzzy matching across the full field string, i.e. there's no implicit tokenization or whitespace removal. This matches the semantics we currently have for fuzzy search over attributes in a non-streaming case | ||||
* | Propagate normalizing mode and max field length to new searcher | Tor Brede Vekterli | 2024-01-16 | 1 | -0/+12 |
| | | | | | Needed to avoid default normalizing mode/max field length being used in the reconfigured searcher instance. | ||||
* | Add regular expression support to streaming search | Tor Brede Vekterli | 2024-01-15 | 1 | -2/+47 |
| | | | | | | | | | | | | | | Introduces an explicit regex query term node (which wraps an RE2 regex instance internally) and extends the existing UTF-8 flexible string searcher to use this query node. Regex matching is optionally case (in)sensitive depending on the normalization mode used. Note on `searcher/searcher_test.cpp`: this adds a magic sentinel `#` char prefix to query term parsing in the test to let a query term be interpreted as a regex rather than exact/prefix/suffix/substring match. | ||||
* | Just use normalize_mode directly from searcher. | Henning Baldersheim | 2024-01-12 | 1 | -2/+2 |
| | |||||
* | Split out tokenizer and test it explicitly. | Henning Baldersheim | 2024-01-11 | 1 | -0/+21 |
| | |||||
* | Use the normalize_mode config. | Henning Baldersheim | 2024-01-10 | 1 | -13/+13 |
| | |||||
* | Simplify ancient carefully hand optimized code in favour of simple readable code | Henning Baldersheim | 2024-01-10 | 1 | -4/+9 |
| | |||||
* | Code cleanup | Henning Baldersheim | 2024-01-10 | 2 | -6/+5 |
| | |||||
* | - Fold query for streaming search based on either query item type or field definition. | Henning Baldersheim | 2024-01-05 | 1 | -8/+11 |
| | | | | | | | | - This ensures that query processing and document processing are symmetric for streaming search. No longer rely on Java query processing being symmetric with the backend C++ variant. - Indexed search does no normalization in the backend and uses the query as is. | ||||
* | - Modernize code | Henning Baldersheim | 2024-01-04 | 1 | -24/+24 |
| | | | | - Unify some conversion tables. | ||||
* | Revert "Revert "Balder/only rewrite numeric terms for text fields"" | Henning Baldersheim | 2024-01-03 | 1 | -1/+1 |
| | |||||
* | Revert "Balder/only rewrite numeric terms for text fields" | Henning Baldersheim | 2024-01-03 | 1 | -1/+1 |
| | |||||
* | Only rewrite numeric terms when searching text fields. | Henning Baldersheim | 2024-01-02 | 1 | -1/+1 |
| | |||||
* | Standard plural of leaf is leaves. | Tor Egge | 2023-11-30 | 1 | -1/+1 |
| | |||||
* | Update copyright | Jon Bratseth | 2023-10-09 | 26 | -24/+26 |
| | |||||
* | Use "_test" suffix for unit test cpp files. | Geir Storli | 2023-08-30 | 8 | -4/+4 |
| |
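
The fuzzy prefix matching semantics described in the 2024-04-19 commit ("Wire fuzzy prefix matching support through the query stack") can be illustrated with a small sketch. This is not the Vespa implementation (the 2024-01-18 commit says streaming search uses a DFA-based matcher for max edits in {1, 2}); it is a plain dynamic-programming Levenshtein variant, and the names `min_prefix_edit_distance` and `fuzzy_prefix_match` are hypothetical. It also compares raw bytes and does no normalization, which is a simplifying assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper: smallest Levenshtein distance between `term` and
// any prefix of `target` (including the empty prefix).
std::size_t min_prefix_edit_distance(std::string_view term, std::string_view target) {
    // prev[j] holds the distance between term[0..i) and target[0..j).
    std::vector<std::size_t> prev(target.size() + 1);
    std::vector<std::size_t> cur(target.size() + 1);
    for (std::size_t j = 0; j <= target.size(); ++j) {
        prev[j] = j;  // empty term vs. target prefix of length j
    }
    for (std::size_t i = 1; i <= term.size(); ++i) {
        cur[0] = i;  // term prefix of length i vs. empty target prefix
        for (std::size_t j = 1; j <= target.size(); ++j) {
            std::size_t subst = prev[j - 1] + (term[i - 1] == target[j - 1] ? 0 : 1);
            cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    // Prefix semantics: the term may match any prefix of the target,
    // so take the minimum over the final row instead of its last cell.
    return *std::min_element(prev.begin(), prev.end());
}

// Hypothetical matcher mirroring the annotations in the commit message:
// max_edits ~ maxEditDistance, prefix_length ~ prefixLength, prefix:true implied.
bool fuzzy_prefix_match(std::string_view term, std::string_view target,
                        std::size_t max_edits, std::size_t prefix_length) {
    prefix_length = std::min(prefix_length, term.size());
    // The locked prefix must match exactly.
    if (target.substr(0, prefix_length) != term.substr(0, prefix_length)) {
        return false;
    }
    return min_prefix_edit_distance(term.substr(prefix_length),
                                    target.substr(prefix_length)) <= max_edits;
}
```

Under these assumptions, `fuzzy_prefix_match("foo", "zookeeper", 1, 0)` accepts the candidate because the prefix `zoo` is one substitution away from `foo`, while a locked prefix of length 2 rejects it because `zo` differs from `fo`, in line with the examples in the commit message.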