author | Henning Baldersheim <balder@yahoo-inc.com> | 2020-04-15 11:58:29 +0000 |
---|---|---|
committer | Henning Baldersheim <balder@yahoo-inc.com> | 2020-04-15 11:58:29 +0000 |
commit | a4e565c808f7999f561b0dad881f3a34040ab5d7 (patch) | |
tree | a4f30fd3fd32d3d18494b47a61e58f95bfc16fb3 /configdefinitions | |
parent | 2cebe16d47872f08472ac52cbfc9e1102fb695da (diff) |
Make SearchCluster.TopKEstimator a top level class.
Diffstat (limited to 'configdefinitions')
-rw-r--r-- | configdefinitions/src/vespa/dispatch.def | 10 |
1 files changed, 6 insertions, 4 deletions
diff --git a/configdefinitions/src/vespa/dispatch.def b/configdefinitions/src/vespa/dispatch.def
index 3f553b5b8ba..0776e648ad7 100644
--- a/configdefinitions/src/vespa/dispatch.def
+++ b/configdefinitions/src/vespa/dispatch.def
@@ -23,11 +23,13 @@ distributionPolicy enum { ROUNDROBIN, ADAPTIVE } default=ROUNDROBIN
 ## don't use it if you don't (really) mean it.
 maxHitsPerNode int default=2147483647
 
-## Probability for getting the correct topK documents.
-## A value of 1.0 will ask all partitions for topK documents.
-## Any value between <0, 1> will use a Student T fith 30 degrees freedom and compute a K value that
-## will give you the topK documents according to this formulae.
+## Probability for getting the K best hits (topK).
+## A value of 1.0 will ask all N partitions for K hits.
+## Any value between <0, 1> will use a Student T with 30 degrees freedom and compute a value Q that
+## will give you the globally K best hits according to this formula with the desired probability.
 ## q = k/n + qT (p',30) x √(k × (1/n) × (1 − 1/n))
+## With a probability of 0.999 and K=200 and N=10 will give a Q of 38, meaning that you only need to fetch 19% compared to
+## default setting of 1.0. This is a significant optimisation with with very little loss in presicion.
 topKProbability double default=1.0
 
 # Is multi-level dispatch configured for this cluster
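The formula in the topKProbability comment can be checked with a short Python sketch. This is an illustration only, not the Vespa implementation. The function names are hypothetical, and the definition of p' is an assumption: the comment does not define it, so the sketch uses p' = 1 - (1 - p)/n, which happens to reproduce the Q of 38 quoted for p=0.999, K=200, N=10. The Student T quantile is computed with the standard library only (Simpson integration of the density plus bisection) so the block has no third-party dependencies.

```python
import math

DEGREES_OF_FREEDOM = 30  # "Student T with 30 degrees freedom" in dispatch.def


def student_t_pdf(x, dof):
    """Density of the Student T distribution with `dof` degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(x * x / dof))


def student_t_cdf(x, dof, steps=2000):
    """CDF for x >= 0: 0.5 plus a composite Simpson integral of the density on [0, x]."""
    if x == 0.0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = student_t_pdf(0.0, dof) + student_t_pdf(x, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, dof)
    return 0.5 + total * h / 3


def student_t_quantile(p, dof):
    """Inverse CDF for 0.5 <= p < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if student_t_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def estimate_hits_per_partition(k, n, p):
    """q = k/n + qT(p', 30) * sqrt(k * (1/n) * (1 - 1/n)) from the config comment."""
    # A probability of 1.0 asks every partition for all K hits. Treating any
    # value outside (0, 1) the same way is an assumption of this sketch.
    if n <= 1 or not (0.0 < p < 1.0):
        return float(k)
    # ASSUMPTION: p' spreads the allowed miss probability over the n partitions.
    p_adjusted = 1.0 - (1.0 - p) / n
    t = student_t_quantile(p_adjusted, DEGREES_OF_FREEDOM)
    variance = k * (1.0 / n) * (1.0 - 1.0 / n)
    return k / n + t * math.sqrt(variance)


if __name__ == "__main__":
    q = estimate_hits_per_partition(200, 10, 0.999)
    print(math.ceil(q), math.ceil(q) / 200)
```

Under these assumptions the estimate for K=200, N=10 lands just below 38, and 38/200 is the 19% figure the comment mentions: each partition returns about 38 hits instead of 200.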