Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Adjust search::streaming::Hit to better match | Tor Egge | 2024-01-22 | 5 | -6/+7 |
| | | | | search::fef::TermFieldMatchDataPosition. | ||||
* | Merge pull request #29969 from vespa-engine/vekterli/support-fuzzy-matching-in-streaming-search | Tor Brede Vekterli | 2024-01-19 | 6 | -33/+215 |
|\ | | | | | Support fuzzy term matching in streaming search | ||||
| * | Support fuzzy term matching in streaming search | Tor Brede Vekterli | 2024-01-18 | 6 | -33/+215 |
| | | | | | Uses a DFA-based matcher for max edits in {1, 2} and falls back to the legacy non-DFA matcher for all other values (including 0). Currently only supports fuzzy matching across the full field string, i.e. there's no implicit tokenization or whitespace removal. This matches the semantics we currently have for fuzzy search over attributes in the non-streaming case. | ||||
* | | Rename search::streaming::Hit member function context() to field_id(). | Tor Egge | 2024-01-18 | 1 | -1/+1 |
|/ | |||||
* | refactor for re-use | Arne Juul | 2024-01-17 | 2 | -16/+36 |
| | |||||
* | Propagate normalizing mode and max field length to new searcher | Tor Brede Vekterli | 2024-01-16 | 3 | -5/+25 |
| | | | | | Needed to avoid default normalizing mode/max field length being used in the reconfigured searcher instance. | ||||
* | Merge pull request #29913 from vespa-engine/vekterli/streaming-search-regex-support | Henning Baldersheim | 2024-01-16 | 5 | -9/+75 |
|\ | | | | | Add regular expression support to streaming search | ||||
| * | Add regular expression support to streaming search | Tor Brede Vekterli | 2024-01-15 | 5 | -9/+75 |
| | | | | | Introduces an explicit regex query term node (which wraps an RE2 regex instance internally) and extends the existing UTF-8 flexible string searcher to use this query node. Regex matching is case-sensitive or case-insensitive depending on the normalization mode used. Note on `searcher/searcher_test.cpp`: this adds a magic sentinel `#` char prefix to query term parsing in the test to let a query term be interpreted as a regex rather than an exact/prefix/suffix/substring match. | ||||
* | | Support matched-elements-only for WeightedSetTerm. | Tor Egge | 2024-01-15 | 1 | -0/+9 |
|/ | |||||
* | Just use normalize_mode directly from searcher. | Henning Baldersheim | 2024-01-12 | 2 | -5/+3 |
| | |||||
* | Also handle different normalization during query time. | Henning Baldersheim | 2024-01-12 | 3 | -14/+21 |
| | |||||
* | Revert "Revert "Balder/unify attributes over streaming indexed"" | Henning Baldersheim | 2024-01-12 | 4 | -6/+5 |
| | |||||
* | Revert "Balder/unify attributes over streaming indexed" | Henning Baldersheim | 2024-01-12 | 4 | -5/+6 |
| | |||||
* | local include | Henning Baldersheim | 2024-01-11 | 4 | -6/+5 |
| | |||||
* | Add brief class documentation. | Henning Baldersheim | 2024-01-11 | 1 | -0/+4 |
| | |||||
* | Split out tokenizer and test it explicit. | Henning Baldersheim | 2024-01-11 | 8 | -57/+96 |
| | |||||
* | Use the normalize_mode config. | Henning Baldersheim | 2024-01-10 | 8 | -59/+45 |
| | |||||
* | Simplify ancient carefully hand optimized code in favour of simple readable code | Henning Baldersheim | 2024-01-10 | 12 | -186/+180 |
| | |||||
* | Code cleanup | Henning Baldersheim | 2024-01-10 | 24 | -90/+81 |
| | |||||
* | - Fold query for streaming search based on either query item type, or field definition. | Henning Baldersheim | 2024-01-05 | 6 | -14/+43 |
| | | | | - This ensures that query processing and document processing are symmetric for streaming search. No longer rely on Java query processing being symmetric with the backend C++ variant. | ||||
| | | | | - Indexed search does no normalization in the backend and uses the query as is. | ||||
* | GC unused data members. | Henning Baldersheim | 2024-01-04 | 2 | -56/+31 |
| | |||||
* | - Modernize code | Henning Baldersheim | 2024-01-04 | 7 | -155/+100 |
| | | | | - Unify some conversion tables. | ||||
* | - Must resolve index and check all fields if any require text matching. | Henning Baldersheim | 2024-01-03 | 4 | -115/+90 |
| | | | | - Make methods const if possible. | ||||
| | | | | - Return results instead of modifying a reference. | ||||
| | | | | - Various code unification. | ||||
* | Revert "Revert "Balder/only rewrite numeric terms for text fields"" | Henning Baldersheim | 2024-01-03 | 5 | -15/+48 |
| | |||||
* | Revert "Balder/only rewrite numeric terms for text fields" | Henning Baldersheim | 2024-01-03 | 5 | -48/+15 |
| | |||||
* | Only rewrite numeric terms when searching text fields. | Henning Baldersheim | 2024-01-02 | 5 | -15/+48 |
| | |||||
* | - Avoid inefficient generic template. | Henning Baldersheim | 2023-12-29 | 1 | -1/+1 |
| | | | | - Add explicit implementations for the types needed. | ||||
* | - Separate methods for lowercasing, and lowercasing and folding. | Henning Baldersheim | 2023-12-21 | 1 | -13/+13 |
| | | | | - Hide implementations and use accessors. | ||||
| | | | | - Minor code cleanup. | ||||
* | Add MultiTerm and InTerm for streaming search. | Tor Egge | 2023-12-07 | 2 | -2/+19 |
| | |||||
* | Use emplace_back. | Tor Egge | 2023-12-04 | 4 | -4/+4 |
| | |||||
* | Use templated getRange() member function to get range. | Tor Egge | 2023-12-04 | 2 | -8/+4 |
| | |||||
* | Don't switch lower and upper bound. | Tor Egge | 2023-12-04 | 2 | -2/+2 |
| | |||||
* | Standard plural of leaf is leaves. | Tor Egge | 2023-11-30 | 5 | -10/+10 |
| | |||||
* | Add linguistics tokens document field writer. | Tor Egge | 2023-10-16 | 1 | -0/+7 |
| | |||||
* | Enable passing search::docsummary::IStringFieldConverter pointer to | Tor Egge | 2023-10-12 | 2 | -7/+9 |
| | | | | | search::docsummary::IDocsumStoreDocument::insert_summary_field member function. | ||||
* | Update copyright | Jon Bratseth | 2023-10-09 | 121 | -119/+121 |
| | |||||
* | Use "_test" suffix for unit test cpp files. | Geir Storli | 2023-08-30 | 8 | -4/+4 |
| | |||||
* | Use uint32_t as ucs4_t | Henning Baldersheim | 2023-07-25 | 1 | -1/+1 |
| | |||||
* | Use WordFolder as helper instead of inheriting static stuff. | Henning Baldersheim | 2023-07-25 | 9 | -29/+31 |
| | |||||
* | Unpack interleaved features for streaming search. | Tor Egge | 2023-07-19 | 3 | -3/+78 |
| | |||||
* | Modernize C++ code with auto and range-based loops. | Geir Storli | 2023-07-06 | 9 | -37/+32 |
| | |||||
* | Handle sorting on multivalue attributes. | Tor Egge | 2023-07-04 | 1 | -14/+10 |
| | |||||
* | Add flag for controlling nested multivalue grouping. | Henning Baldersheim | 2023-06-28 | 1 | -1/+1 |
| | |||||
* | Setup distance metrics for streaming search. | Tor Egge | 2023-06-05 | 4 | -4/+27 |
| | | | | Add range checks when converting to internal distance threshold. | ||||
* | Use DistanceMetricUtils for converting string value to distance metric. | Tor Egge | 2023-05-24 | 1 | -14/+14 |
| | |||||
* | GC unused assert includes | Henning Baldersheim | 2023-05-17 | 1 | -0/+1 |
| | |||||
* | Remove unused field/attribute access hinting. | Tor Egge | 2023-05-13 | 2 | -8/+0 |
| | |||||
* | Add attribute access recorder for streaming search mode. Use it to | Tor Egge | 2023-05-12 | 10 | -37/+121 |
| | | | | determine which attributes to populate during a streaming search. | ||||
* | Setup ranking assets repo for streaming search. | Tor Egge | 2023-05-10 | 6 | -40/+124 |
| | |||||
* | Setup search visitor without proton process. | Tor Egge | 2023-05-10 | 5 | -12/+10 |
| |
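
The fuzzy matching commit above (2024-01-18) describes two behaviors that are easy to misread: the matcher is chosen by the max edits value, and the match runs against the whole field string with no tokenization. The following is a minimal Python sketch of those semantics only, not the C++ implementation. It assumes plain Levenshtein distance and ignores normalization; the names `pick_matcher`, `edit_distance_at_most` and `fuzzy_field_match` are hypothetical and do not come from the repository.

```python
def pick_matcher(max_edits: int) -> str:
    """Mirror the dispatch described in the commit message:
    DFA-based matcher for max edits 1 or 2, legacy matcher otherwise
    (including 0). The returned labels are illustrative only."""
    return "dfa" if max_edits in (1, 2) else "legacy"


def edit_distance_at_most(a: str, b: str, max_edits: int) -> bool:
    """Return True if the Levenshtein distance between a and b is at
    most max_edits. Assumption: plain Levenshtein (insert, delete,
    substitute), computed with a row-by-row dynamic program."""
    # A length difference larger than the budget can never be repaired.
    if abs(len(a) - len(b)) > max_edits:
        return False
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free match
            ))
        # Row minima never decrease, so we can stop once over budget.
        if min(cur) > max_edits:
            return False
        prev = cur
    return prev[-1] <= max_edits


def fuzzy_field_match(term: str, field_value: str, max_edits: int) -> bool:
    """Match the term against the full field string: no tokenization
    and no whitespace removal, as stated in the commit message."""
    return edit_distance_at_most(term, field_value, max_edits)


if __name__ == "__main__":
    # One substitution away from the full field value.
    print(fuzzy_field_match("hello", "hallo", 1))
    # The field is not tokenized, so a multi-word field is compared whole.
    print(fuzzy_field_match("hello", "hello world", 2))
```

Because the comparison covers the entire field value, a term such as `hello` does not match a field containing `hello world` at max edits 2, even though one of its words is an exact match.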