Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Treat regex and fuzzy whole-field matching as 1 logical word | Tor Brede Vekterli | 2024-01-22 | 1 | -0/+16 |
| | We have concluded that this is the most semantically correct way of reporting the count, and as a bonus it avoids having to do a separate pass over the string buffer. | ||||
* | Adjust search::streaming::Hit to better match search::fef::TermFieldMatchDataPosition. | Tor Egge | 2024-01-22 | 2 | -2/+3 |
| | |||||
* | Support fuzzy term matching in streaming search | Tor Brede Vekterli | 2024-01-18 | 1 | -28/+192 |
| | Uses a DFA-based matcher for max edits in {1, 2} and falls back to the legacy non-DFA matcher for all other values (including 0). Currently only supports fuzzy matching across the full field string, i.e. there is no implicit tokenization or whitespace removal. This matches the semantics we currently have for fuzzy search over attributes in the non-streaming case. | ||||
* | Propagate normalizing mode and max field length to new searcher | Tor Brede Vekterli | 2024-01-16 | 1 | -0/+12 |
| | Needed to avoid default normalizing mode/max field length being used in the reconfigured searcher instance. | ||||
* | Add regular expression support to streaming search | Tor Brede Vekterli | 2024-01-15 | 1 | -2/+47 |
| | Introduces an explicit regex query term node (which wraps an RE2 regex instance internally) and extends the existing UTF-8 flexible string searcher to use this query node. Regex matching is case-sensitive or case-insensitive depending on the normalization mode used. Note on `searcher/searcher_test.cpp`: this adds a magic sentinel `#` char prefix to query term parsing in the test to let a query term be interpreted as a regex rather than an exact/prefix/suffix/substring match. | ||||
* | Just use normalize_mode directly from searcher. | Henning Baldersheim | 2024-01-12 | 1 | -2/+2 |
| | |||||
* | Split out tokenizer and test it explicitly. | Henning Baldersheim | 2024-01-11 | 1 | -0/+21 |
| | |||||
* | Use the normalize_mode config. | Henning Baldersheim | 2024-01-10 | 1 | -13/+13 |
| | |||||
* | Simplify ancient carefully hand optimized code in favour of simple readable code | Henning Baldersheim | 2024-01-10 | 1 | -4/+9 |
| | |||||
* | Code cleanup | Henning Baldersheim | 2024-01-10 | 2 | -6/+5 |
| | |||||
* | - Fold query for streaming search based on either query item type or field definition. | Henning Baldersheim | 2024-01-05 | 1 | -8/+11 |
| | - This ensures that query processing and document processing are symmetric for streaming search. We no longer rely on the Java query processing being symmetric with the backend C++ variant. - Indexed search does no normalization in the backend and uses the query as is. | ||||
* | - Modernize code | Henning Baldersheim | 2024-01-04 | 1 | -24/+24 |
| | - Unify some conversion tables. | ||||
* | Revert "Revert "Balder/only rewrite numeric terms for text fields"" | Henning Baldersheim | 2024-01-03 | 1 | -1/+1 |
| | |||||
* | Revert "Balder/only rewrite numeric terms for text fields" | Henning Baldersheim | 2024-01-03 | 1 | -1/+1 |
| | |||||
* | Only rewrite numeric terms when searching text fields. | Henning Baldersheim | 2024-01-02 | 1 | -1/+1 |
| | |||||
* | Standard plural of leaf is leaves. | Tor Egge | 2023-11-30 | 1 | -1/+1 |
| | |||||
* | Update copyright | Jon Bratseth | 2023-10-09 | 26 | -24/+26 |
| | |||||
* | Use "_test" suffix for unit test cpp files. | Geir Storli | 2023-08-30 | 8 | -4/+4 |
| | |||||
* | Use WordFolder as helper instead of inheriting static stuff. | Henning Baldersheim | 2023-07-25 | 1 | -1/+1 |
| | |||||
* | Unpack interleaved features for streaming search. | Tor Egge | 2023-07-19 | 1 | -0/+59 |
| | |||||
* | Setup search visitor without proton process. | Tor Egge | 2023-05-10 | 1 | -4/+2 |
| | |||||
* | Pass transport and file distributor connection spec to SearchEnvironment in preparation for using RankingAssetsBuilder when handling config in streaming search. | Tor Egge | 2023-05-10 | 1 | -2/+5 |
| | |||||
* | Add SearchEnvironmentSnapshot for streaming search. | Tor Egge | 2023-05-05 | 1 | -2/+6 |
| | |||||
* | Test match features returned in streaming search result. | Geir Storli | 2023-04-28 | 5 | -28/+59 |
| | |||||
* | Merge pull request #26893 from vespa-engine/arnej/remove-unused-distance-functions-3 | Arne H Juul | 2023-04-27 | 1 | -3/+1 |
|\ | remove unused distance functions | ||||
| * | remove unused distance functions | Arne Juul | 2023-04-27 | 1 | -3/+1 |
| | | |||||
* | | Merge pull request #26897 from vespa-engine/geirst/search-visitor-query-execution-test | Geir Storli | 2023-04-27 | 20 | -560/+352 |
|\ \ | Test basic query execution in streaming search visitor. | ||||
| * | | Test basic query execution in search visitor. | Geir Storli | 2023-04-27 | 20 | -560/+352 |
| |/ | |||||
* / | Populate match features in search result for streaming search. | Tor Egge | 2023-04-27 | 1 | -0/+60 |
|/ | |||||
* | Rewrite searchvisitor test to GTest. | Geir Storli | 2023-04-27 | 2 | -35/+17 |
| | |||||
* | Rewrite streamingvisitors hit collector unit test to use gtest. | Tor Egge | 2023-04-26 | 2 | -49/+27 |
| | |||||
* | Merge pull request #26850 from vespa-engine/geirst/nearest-neighbor-target-hits-in-streaming | Geir Storli | 2023-04-25 | 2 | -14/+59 |
|\ | Use targetHits in nearestNeighbor streaming searcher. | ||||
| * | Use targetHits in nearestNeighbor streaming searcher. | Geir Storli | 2023-04-25 | 2 | -14/+59 |
| | | A distance heap is used to limit the number of produced document matches. | ||||
* | | Move search::FeatureValues to vespalib::FeatureValues in preparation for extending vdslib::SearchResult. | Tor Egge | 2023-04-25 | 1 | -1/+1 |
|/ | |||||
* | Provide FieldPathMap and IQueryEnvironment when preparing streaming searchers. | Geir Storli | 2023-04-20 | 5 | -41/+40 |
| | This is required to prepare the NearestNeighborFieldSearcher. | ||||
* | Add exact nearest neighbor searcher over the streamed values of a tensor field. | Geir Storli | 2023-04-20 | 2 | -0/+164 |
| | Note: Integration into the searchvisitor remains. | ||||
* | Unpack match data for nearest neighbor query node in streaming search. | Tor Egge | 2023-04-19 | 2 | -0/+104 |
| | |||||
* | Reduce creation of Document instances without DocumentTypeRepo. | Geir Storli | 2023-03-13 | 3 | -4/+4 |
| | |||||
* | Rename KeywordExtractor to QueryTermFilter. | Tor Egge | 2023-01-25 | 4 | -132/+132 |
| | |||||
* | Empty index name means default index. | Tor Egge | 2023-01-25 | 1 | -0/+7 |
| | |||||
* | Add new KeywordExtractor with two factories (one each for indexed search and streaming search). | Tor Egge | 2023-01-24 | 2 | -0/+125 |
| | |||||
* | Expose SameElement query terms to ranking. | Geir Storli | 2023-01-12 | 1 | -1/+1 |
| | A TermFieldMatchData is allocated per SameElement term, and this is used to signal matching docids in doUnpack() on the SameElement search iterator. This allows using the matches() rank feature on a (virtual) field that is searched using a SameElement term. | ||||
* | Change from typedef to using in streamingvisitors C++ code. | Geir Storli | 2022-12-21 | 3 | -16/+16 |
| | |||||
* | fix typo invokation -> invocation | Thinh Bui | 2022-11-14 | 2 | -3/+3 |
| | |||||
* | Use SlimeFiller instead of SlimeFieldWriter for streaming search. | Tor Egge | 2022-09-19 | 1 | -155/+1 |
| | |||||
* | Reduce usage of RawBuf. | Henning Baldersheim | 2022-08-29 | 1 | -5/+17 |
| | Remove some unused code. | ||||
* | perform feature renaming in streaming also | Arne H Juul | 2022-06-22 | 1 | -2/+4 |
| | |||||
* | Revert "Revert "Collapse vsm into streamingvisitors"" | Henning Baldersheim | 2022-05-15 | 23 | -4/+2131 |
| | |||||
* | Revert "Collapse vsm into streamingvisitors" | Henning Baldersheim | 2022-05-15 | 23 | -2131/+4 |
| | |||||
* | Collapse vsm into streamingvisitors | Henning Baldersheim | 2022-05-14 | 23 | -4/+2131 |
| |
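
The 2023-04-25 commit "Use targetHits in nearestNeighbor streaming searcher" notes that a distance heap is used to limit the number of produced document matches. The following is a minimal sketch of that general idea, not the code from the commit: a max-heap bounded to `target_hits` entries that keeps only the closest candidates seen so far. The class name `BoundedDistanceHeap` and all of its methods are hypothetical and chosen for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical illustration type; this is NOT the class used in streamingvisitors.
// Keeps at most target_hits (distance, docid) pairs with the smallest distances.
class BoundedDistanceHeap {
public:
    explicit BoundedDistanceHeap(std::size_t target_hits)
        : _target_hits(target_hits) {}

    // Offers a candidate. Returns true if the candidate was kept,
    // false if it was farther away than everything already kept.
    bool offer(double distance, std::uint32_t docid) {
        if (_target_hits == 0) {
            return false;
        }
        if (_heap.size() < _target_hits) {
            _heap.emplace(distance, docid);
            return true;
        }
        if (distance < _heap.top().first) {
            _heap.pop();                    // evict the current worst candidate
            _heap.emplace(distance, docid);
            return true;
        }
        return false;
    }

    std::size_t size() const { return _heap.size(); }

    // Empties the heap and returns the kept docids ordered by ascending distance.
    std::vector<std::uint32_t> drain_closest_first() {
        std::vector<std::uint32_t> result(_heap.size());
        for (std::size_t i = result.size(); i > 0; --i) {
            result[i - 1] = _heap.top().second; // top is the largest remaining distance
            _heap.pop();
        }
        return result;
    }

private:
    std::size_t _target_hits;
    // std::priority_queue is a max-heap by default, so top() is the worst kept distance.
    std::priority_queue<std::pair<double, std::uint32_t>> _heap;
};
```

With a bound like this, a streaming searcher could reject a document as soon as its distance is no better than the current worst kept distance, which is one plausible way to cap the number of matches at targetHits.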