Search API Reference
All the search request parameters listed below can be set in query profiles. The first four blocks of properties are also modeled as query profile types. These types can be referenced from query profiles (and inheriting types) to provide type checking on the parameters.
These parameters often have both a full name - which includes the path from the root query profile - and one or more abbreviated names. Both names can be used in search requests, while only full names can be used in query profiles. The full names are case sensitive, while the abbreviated names are case insensitive.
The parameters modeled as query profile types are also available to Searcher components as Java objects, through get methods on the Query.
Index
- Native Execution Parameters
- hits [count]
- offset [start]
- queryProfile
- nocache
- groupingSessionCache
- searchChain
- timeout
- tracelevel
- trace.timestamps
- Query Model Parameters
- model.defaultIndex [default-index]
- model.encoding [encoding]
- model.filter [filter]
- model.language [lang, language]
- model.queryString [query]
- model.restrict [restrict]
- model.searchPath [path]
- model.sources [search, sources]
- model.type [type]
- Ranking
- ranking.location [location]
- ranking.features [rankfeature]
- ranking.listFeatures [rankfeatures]
- ranking.profile [ranking]
- ranking.properties [rankproperty]
- ranking.sorting [sorting]
- ranking.freshness
- ranking.queryCache
- ranking.matchPhase
- Presentation
- presentation.bolding [bolding]
- presentation.format [format]
- presentation.template
- presentation.summary [summary]
- presentation.timing
- Grouping
- Geographical Searches
- Streaming Search
- Semantic Rules
- Other
Query
yql
Alias | |
Values | String |
Default | None |
The YQL query will be parsed and executed in the backend. Only simple YQL programs are supported; refer to YQL for details.
select
A select query is equivalent to YQL, written in JSON. It contains the subparameters where and grouping.
where
Alias | |
Values | JSON |
Default | None |
grouping
Alias | |
Values | JSON |
Default | None |
The where and grouping queries will be parsed and executed in the backend. Refer to Select Reference for details.
Native Execution Parameters
These parameters are defined in the native query profile type.
hits
Alias | count |
Values | A positive integer, or 0. The sum of offset and hits should be lower than the configured maxoffset value, and will be adjusted to fit. See also comment at offset. |
Default | 10 |
The maximum number of hits to return from the result set. Must be lower than maxHits, which is either set in a query profile or defaults to 400.
offset
Alias | start |
Values | A non-negative integer. |
Default | 0 |
The index of the first hit to return from the result set. Must be lower than maxOffset, which is either set in a query profile or defaults to 1000.
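As a rough illustration of how hits and offset combine for paging, the Python sketch below builds the two parameters for a given page. The function name, the zero-based page arithmetic and the query text are assumptions made for illustration. Only the parameter names and the default limits of 400 and 1000 come from the descriptions above.

```python
from urllib.parse import urlencode


def paging_params(page: int, page_size: int = 10,
                  max_hits: int = 400, max_offset: int = 1000) -> dict:
    """Return hits/offset parameters for a zero-based page number.

    max_hits and max_offset mirror the defaults stated above; a query
    profile may configure other values.
    """
    if page < 0 or page_size < 0:
        raise ValueError("page and page_size must be non-negative")
    offset = page * page_size
    # Following the wording above, values must be strictly lower than the limits.
    if page_size >= max_hits or offset >= max_offset:
        raise ValueError("requested page is outside the configured limits")
    return {"hits": page_size, "offset": offset}


# Query string for the third page of 10 hits (hypothetical query text).
query_string = urlencode({"query": "foo", **paging_params(page=2)})
```

The resulting string can be appended to the search request URL. The abbreviated names count and start could be used in place of hits and offset.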
queryProfile
Alias | None |
Values | A query profile id - name:version, where version can be omitted or partially specified, e.g. "myprofile:2.1" |
Default | default |
A query profile has default properties for a query. The default query profile is named default - example:
<query-profile id="default">
    <field name="maxHits">10</field>
    <field name="maxOffset">1000</field>
</query-profile>
nocache
Alias | |
Values | True or false |
Default | false |
Set to true to avoid the result being fetched from cache, and avoid writing the result to cache after fetching it.
groupingSessionCache
Alias | |
Values | True or false |
Default | false |
Set to true to store intermediate grouping results in the search backends when using multi-level grouping expressions. This speeds up grouping at a potential loss of accuracy. See the grouping reference for more details.
searchChain
Alias | |
Values | A search chain id - name:version, where version can be omitted or partially specified, e.g. "mychain:2.1.3". |
Default | default |
The search chain initially invoked when processing this query. This search chain may invoke other chains.
timeout
Alias | |
Values | Positive floating point number with an optional unit. Default unit is seconds (s), valid unit strings are e.g. ms and s. To set a timeout of one minute, the argument could be set to 60 s. Space between the number and the unit is optional. |
Default | Undefined, but guaranteed to be at least 5000 milliseconds. This default can be overridden by configuring timeout in a query profile. |
The query timeout.
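The value format described above can be illustrated with a small Python sketch that converts a timeout argument to seconds. It is not Vespa's parser. The function name is hypothetical, and only the units s and ms named above are handled, although other unit strings may be valid.

```python
import re

# Units named in the description above; other unit strings may exist and
# are rejected by this sketch.
_UNIT_TO_SECONDS = {"s": 1.0, "ms": 0.001}


def timeout_seconds(value: str) -> float:
    """Convert a timeout argument such as '60 s' or '500ms' to seconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-z]*)\s*", value)
    if match is None:
        raise ValueError(f"not a valid timeout: {value!r}")
    number, unit = match.groups()
    if unit == "":
        unit = "s"  # the default unit is seconds
    if unit not in _UNIT_TO_SECONDS:
        raise ValueError(f"unsupported unit in this sketch: {unit!r}")
    seconds = float(number) * _UNIT_TO_SECONDS[unit]
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    return seconds
```

With this reading, "60 s", "60s" and "60" all denote one minute.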
tracelevel
Alias | |
Values | Any positive number |
Default | No tracing |
Set to a positive number to collect trace information for debugging when running a query. Higher numbers give progressively more detail on query transformations and searcher execution.
trace.timestamps
Alias | |
Values | true or false |
Default | No timestamps in trace |
Enable to get timing information already at tracelevel=1. This is useful for debugging the latency spent in different components of the search chain without rendering the large amounts of string data associated with higher trace levels.
Query Model Parameters
model.defaultIndex [default-index]
Alias | default-index |
Values | An index name |
Default | default |
The field which is searched for query terms that do not explicitly specify an index.
model.encoding [encoding]
Alias | encoding |
Values | Encoding names or aliases defined in the IANA character sets |
Default | utf-8 |
Sets the encoding to use when returning a result. The encodings big5, euc-jp, euc-kr, gb2312, iso-2022-jp and shift-jis also influence how tokenization is done in the absence of an explicit language setting.
The query is always encoded as UTF-8, independently of how the result will be encoded.
model.filter [filter]
Alias | filter |
Values | Any allowed collection of filter terms |
Default | Not set |
Sets a filter to be combined with the query. A typical use of a filter is to add machine-generated or preference-based filter terms to a raw user query. The filter is parsed the same way as a query of type any, so the full syntax is available. Positive terms (preceded by +) and phrases act as AND filters, negative terms (preceded by -) act as NOT filters, and unprefixed terms are used to RANK the results. Unless the query has no positive terms, the filter will only restrict and influence ranking of the result set, never cause more matches than the query.
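To make the three roles of filter terms concrete, the Python sketch below sorts the terms of a filter string by prefix. It only illustrates the rules stated above and is not the actual query parser. The function name is hypothetical, and the sketch assumes that phrases are written in double quotes.

```python
import re
from typing import Dict, List


def classify_filter_terms(filter_string: str) -> Dict[str, List[str]]:
    """Group filter terms by the role described above."""
    roles: Dict[str, List[str]] = {"AND": [], "NOT": [], "RANK": []}
    # A token is an optionally prefixed quoted phrase, or a run of
    # non-space characters (assumption: phrases use double quotes).
    for token in re.findall(r'[+-]?"[^"]*"|\S+', filter_string):
        if token.startswith("-"):
            roles["NOT"].append(token[1:])
        elif token.startswith("+"):
            roles["AND"].append(token[1:])
        elif token.startswith('"'):
            roles["AND"].append(token)  # phrases act as AND filters
        else:
            roles["RANK"].append(token)
    return roles
```

For a filter such as `+type:news -lang:de sports`, the first term restricts the result, the second excludes documents, and the third only influences ranking.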
model.language [lang, language]
Alias | language, lang |
Values | Ref. RFC 3066 |
Default | Unspecified |
Informs Vespa about the natural language of the query. Please see linguistics for details. This parameter should always be set when the language is known. If it is not set, the language will be guessed from the query and encoding, and default to English if it cannot be guessed.
model.queryString [query]
Alias | query |
Values | Any HTTP encoded legal Vespa query language string |
Default | Not set |
The Simple Vespa Query Language query string specifying which documents to match in this query.
model.restrict [restrict]
Alias | restrict |
Values | A comma delimited list of document type names. |
Default | Search unrestricted |
The document types to restrict the search to when different document types share the same search cluster.
model.searchPath [path]
Alias | searchpath |
Values |
|
Default | Whole cluster |
Specification of which path to send the query to. Used to select which set of search nodes in the cluster should be used. Only meant for debugging/monitoring.
Examples: Note that in an indexed content cluster with flat distribution we have 1 implicit row and each search node represents a part.
- '7/3' = part 7, row 3.
- '7/' = part 7, any row.
- '7,1,9/0' = parts 1,7 and 9, row 0.
- '1,[3,9>/0' = parts 1,3,4,5,6,7,8, row 0.
In a cluster with a multi-level dispatch setup we must specify a search path element for each level. Let's say we have a setup with 2 mid-level dispatch groups, each containing 3 search nodes (and 3 dispatchers):
- '0/;2/' = dispatch group (part) 0, any of the dispatchers (row); search node (part) 2, any row (of 1 present).
- '0/1;2/0' = dispatch group (part) 0, dispatcher (row) 1; search node (part) 2, row 0 (of 1 present).
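The notation in the examples above can be read mechanically. The following Python sketch expands a search path into one (parts, row) pair per dispatch level, where a row of None stands for "any row". It is inferred from the examples only and is not Vespa's own parser; the function and type names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

Level = Tuple[List[int], Optional[int]]


def expand_parts(spec: str) -> List[int]:
    """Expand a part list such as '1,[3,9>' into sorted part numbers."""
    parts = set()
    for start, end, single in re.findall(r"\[(\d+),(\d+)>|(\d+)", spec):
        if single:
            parts.add(int(single))
        else:
            # '[a,b>' is read as a half-open range: a included, b excluded.
            parts.update(range(int(start), int(end)))
    return sorted(parts)


def parse_search_path(path: str) -> List[Level]:
    """Split a search path into one (parts, row) pair per dispatch level."""
    levels = []
    for element in path.split(";"):
        part_spec, _, row = element.partition("/")
        levels.append((expand_parts(part_spec), int(row) if row else None))
    return levels
```

Levels are separated by semicolons, so a path for a flat cluster yields a single pair and a path for a two-level dispatch setup yields two.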
model.sources [search, sources]
Alias | search, sources |
Values | A comma separated list of search cluster names or other source names |
Default | Search unrestricted |
The names of the sources to search, e.g. one or more search clusters and/or federated sources.
model.type [type]
Alias | type |
Values | web, all, any, phrase, yql, adv (deprecated) - refer to simple query language reference |
Default | all |
Selects the query language syntax of the query parameter.
Ranking
ranking.location [location]
Alias | location |
Values | See Geo search |
Default | None |
Point (one or two dimensional) location to use as base for location ranking. For geographical locations, it is recommended to add the location using pos.ll.
ranking.features.featurename [rankfeature.featurename]
Alias | rankfeature.featurename |
Values | Any string |
Default | None |
Set a rank feature to a value. This works for any key name query(anyname) (query features), and also as a way to override all existing (match and document) features.
Example: query=foo&ranking.features.query(userage)=42&ranking.features.fieldMatch(title)=0.65
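Feature names contain parentheses, which should be percent-encoded when the request is built programmatically. The Python sketch below reproduces the example above with the standard library; the helper name is hypothetical.

```python
from urllib.parse import urlencode


def rank_feature_params(features: dict) -> dict:
    """Prefix each feature name with 'ranking.features.'."""
    return {f"ranking.features.{name}": value for name, value in features.items()}


params = {"query": "foo"}
params.update(rank_feature_params({"query(userage)": 42, "fieldMatch(title)": 0.65}))
# urlencode percent-encodes the parentheses in the feature names.
query_string = urlencode(params)
```

The same pattern applies with the abbreviated name rankfeature in place of ranking.features.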
ranking.listFeatures [rankfeatures]
Alias | rankfeatures |
Values | boolean |
Default | false |
Set to true to request all rank features to be calculated and returned. The rank features will be returned in the summary field rankfeatures. This option is typically used for MLR training and should not be used in production.
ranking.profile [ranking]
Alias | ranking |
Values | Any rank profile name |
Default | default |
Sets the name of the rank profile to use for assigning relevancy scores. The default rank profile will be used for backends that do not have the given rank profile.
ranking.properties.propertyname [rankproperty.propertyname]
Alias | rankproperty.propertyname |
Values | Any string |
Default | None |
Set a rank property that is passed to, and used by a feature executor for this query. Example: query=foo&ranking.properties.dotProduct.X={a:1,b:2}
ranking.sorting [sorting]
Alias | sorting |
Values | A valid sort specification |
Default | None - order by relevance |
A specification of how to sort the result. Fields you want to sort on must be stored as document attributes in the index structure by adding attribute to the indexing statement.
ranking.freshness
Alias | |
Values | [integer], an absolute time in seconds since epoch; now-[integer], to use a time [integer] seconds into the past; or now, to use the current time |
Default | None - use the current time on each node. |
Sets the time which will be used as now during execution.
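The three accepted value forms can be illustrated with a short Python sketch that resolves a freshness value to seconds since epoch. This shows how the value is interpreted as described above and is not Vespa's implementation. The function name and the current_time argument are hypothetical.

```python
import re
import time
from typing import Optional


def resolve_freshness(value: str, current_time: Optional[int] = None) -> int:
    """Return the time, in seconds since epoch, that a freshness value denotes."""
    if current_time is None:
        current_time = int(time.time())
    if value == "now":
        return current_time
    relative = re.fullmatch(r"now-(\d+)", value)
    if relative:
        return current_time - int(relative.group(1))
    if re.fullmatch(r"\d+", value):
        return int(value)  # absolute time in seconds since epoch
    raise ValueError(f"not a valid freshness value: {value!r}")
```

For example, now-3600 denotes the time one hour before the query is executed.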
ranking.queryCache
Alias | |
Values | boolean |
Default | false |
Turns query cache on or off. Search is a two-phase process. If the query cache is on, the query is stored on the search nodes between the first and second phase, saving network bandwidth and also query setup time, at the expense of using more memory.
ranking.matchPhase
Settings which control Vespa's behavior during the match phase. If these are set in the query they will override any match-phase setting in the rank profile.
- ranking.matchPhase.maxHits the max number of hits that should be generated during the match phase
- ranking.matchPhase.attribute the attribute to limit matches by if more than maxHits hits will be generated
- ranking.matchPhase.ascending whether to keep the documents having the highest (default) or lowest values of the attribute
- ranking.matchPhase.diversity.attribute the attribute to use to guarantee diversity.
- ranking.matchPhase.diversity.minGroups the minimum number of groups grouped by the diversity attribute.
ranking.matchPhase.maxHits
Alias | |
Values | long |
Default | If sorting and not ranking: max(10000, maxhits+maxoffset). Otherwise: none. |
The max hits the engine should attempt to produce in the match phase on each partition. If it is determined during matching that many more hits than this will be generated, the matching will fall back to take the best (highest or lowest) values of the attribute given by ranking.matchPhase.attribute.
By default, this will be turned on only when sorting is used and grouping is not. If sorting is used, the primary sort attribute will be used as the match phase attribute if it has fast-search set. In that case the default can be overridden by setting this value explicitly.
ranking.matchPhase.attribute
Alias | |
Values | An attribute name |
Default | none |
The attribute to decide which documents are a match if the match phase estimates that there will be more than maxHits matches. This attribute should have fast-search set and should correlate with the order which would be produced by a full evaluation.
ranking.matchPhase.ascending
Alias | |
Values | boolean |
Default | false |
Whether the attribute should be sorted in ascending or descending (default) order to determine which documents to keep as matches.
ranking.matchPhase.diversity.attribute
Alias | |
Values | An attribute name |
Default | none. |
The attribute to be used for producing the desired diversity. Also see attribute.
ranking.matchPhase.diversity.minGroups
Alias | |
Values | long |
Default | none |
The minimum number of groups that should be returned from the match phase grouped by the diversity attribute. Also see min-groups.
Presentation
presentation.bolding [bolding]
Alias | bolding |
Values | boolean |
Default | true |
Whether or not to bold search terms in search definition fields defined with bolding: on or summary: dynamic.
presentation.format [format]
Alias | format |
Values |
|
Default | default |
presentation.summary [summary]
Alias | summary |
Values | The name of the summary class used to select fields in results. |
Default | The default summary class of the search definition. |
presentation.template
Alias | |
Values | Any id specification of a deployed page template. |
Default |
The id of the page template to use for this result. This should be used with the page result format.
presentation.timing
Alias | |
Values | boolean |
Default | false |
Whether a result renderer should try to add optional timing information to the rendered page.
Grouping and Aggregation
select
Alias | |
Values | A valid grouping specification. |
Default | No grouping |
Requests specific multi-level result set statistics and/or hit groups to be returned in the result. Fields you want to retrieve statistics or hit groups for must be stored as document attributes in the index structure by adding attribute to the indexing statement. See the grouping guide.
collapsefield
Alias | |
Values | Any document summary field name |
Default | No field collapsing |
Collapse (i.e. aggregate) results using this field. Collapsing is run in the container, not at the content node level. Define a collapsefield to remove duplicates if the corpus has few duplicates - this is more efficient than using grouping. Otherwise, use grouping.
collapsesize
Alias | |
Values | A positive integer |
Default | 1 |
The number of hits to keep in each collapsed bucket.
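The combined effect of collapsefield and collapsesize can be pictured with the Python sketch below, which keeps at most a given number of hits per distinct field value. It models the described behavior on plain dictionaries and is not the container's implementation. The function name, the hit representation and the handling of hits that lack the field are assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, List


def collapse(hits: List[Dict[str, Any]], field: str, size: int = 1) -> List[Dict[str, Any]]:
    """Keep at most `size` hits per distinct value of `field`, preserving order."""
    kept_per_value: Dict[Any, int] = defaultdict(int)
    result = []
    for hit in hits:
        if field not in hit:
            result.append(hit)  # assumption: hits lacking the field are kept as-is
            continue
        value = hit[field]
        if kept_per_value[value] < size:
            kept_per_value[value] += 1
            result.append(hit)
    return result
```

Because the input order is preserved, the highest-ranked hits of each bucket are the ones that survive when the input is ordered by relevance.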
collapse.summary
Alias | |
Values | A valid name of a document summary class. |
Default | Use default summary or attributes. |
Use this summary class to fetch the field used for collapsing.
Geographical Searches
pos.ll
Alias | |
Values | Position given in latitude and longitude - example: S22.4532;W123.9887 Refer to position field for format specification. |
Default | None |
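When coordinates are held as signed decimal degrees, they need to be converted to the hemisphere-prefixed form shown in the example above. The Python sketch below does this for that notation only; the function name is hypothetical, and the position field reference remains the authority on the format.

```python
def format_pos_ll(latitude: float, longitude: float) -> str:
    """Format signed decimal degrees as e.g. 'S22.4532;W123.9887'."""
    if not -90 <= latitude <= 90 or not -180 <= longitude <= 180:
        raise ValueError("latitude or longitude out of range")
    # Negative latitude is south of the equator, negative longitude is west.
    ns = "N" if latitude >= 0 else "S"
    ew = "E" if longitude >= 0 else "W"
    return f"{ns}{abs(latitude)};{ew}{abs(longitude)}"
```

The semicolon in the result must be percent-encoded when the value is placed in a request URL.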
pos.radius
Alias | |
Values | Radius of the circle used for filtering. Valid units of measurement are km, m and mi. |
Default | 50km |
pos.bb
Alias | |
Values | Bounding box for positions, given as latitude and longitude boundaries. The four boundaries must be specified as N, S, E, W, with degrees as a decimal fraction. Degrees south of the equator or west of Greenwich are input as negative numbers. |
Default | None |
pos.attribute
Alias | |
Values | Any attribute that has zcurve encoded positions as a long attribute. |
Default | Random choice among the ones declared as position in the search definition. |
Which attribute to use for the position. Can be both single- or multi-value.
Streaming Search
The features in this section apply to streaming search only.
streaming.userid
Alias | |
Values | An integer in decimal notation in the range [0, 2^64> |
Default | None |
Restricts streaming search to only stream through documents whose document ids have the n=<number> modifier and whose userid part matches the supplied value. This can be used for grouping documents on a 64-bit integer.
streaming.groupname
Alias | |
Values | A string |
Default | None |
Restricts streaming search to only stream through documents whose document ids have the g=<groupname> modifier and whose groupname part matches the supplied value. This can be used for grouping documents on a string.
streaming.selection
Alias | |
Values | A string |
Default | None |
Restricts streaming search using a document selection. This can be used for selecting a subset of documents based on an advanced expression.
streaming.priority
Alias | |
Values | Priority class |
Default | VERY_HIGH |
Priority of the streaming search visitor. Having a high priority visitor helps maintain low latencies even when the system is under load.
streaming.maxbucketspervisitor
Alias | |
Values | int |
Default | 1 (if ordering is set), or infinite |
If set, visit only this many buckets at a time. Combine with ordering to reduce visiting time for large users/groups.
Semantic Rules
Refer to semantic rules.
rules.off
Alias | |
Values | Boolean |
Default | True |
Turn rule evaluation off for this query.
rules.rulebase
Alias | |
Values | String |
Default | A rule base name |
The name of the rule base to use for these queries.
tracelevel.rules
Alias | |
Values | int |
Default | 1-5 (?) |
The amount of rule evaluation trace output to show; a higher number means more detail. This is useful to see a trace from rule evaluation without having to see trace from all other searchers at the same time.
Other
recall
Alias | |
Values | Any allowed collection of recall terms |
Default | No recall |
Sets a recall parameter to be combined with the query. This is identical to filter, except that recall terms are not exposed to the ranking framework and thus not ranked. As such, one cannot use unprefixed terms; they must be either positive or negative.
user
Alias | |
Values | A string |
Default | None |
The id of the user making the query. The contents of the argument are made available to the search chain, but it triggers no features in Vespa apart from being propagated to the access log.
nocachewrite
Alias | |
Values | Boolean |
Default | False |
Set to true to avoid the result being written to cache when fetched.
hitcountestimate
Alias | |
Values | Boolean |
Default | False |
Make this an estimation query. No hits will be returned, and total hit count will be set to an estimate of what executing the query as a normal query would give.
metrics.ignore
Alias | |
Values | Boolean |
Default | False |
Ignore metric collection for this query request. This is useful for warm-up queries.