path: root/container-search/src/test/java/com/yahoo/search/dispatch/searchcluster
Commit message  Author  Age  Files  Lines
* Rename back to HostName, and merge the value class and utilities  Jon Marius Venstad  2022-03-31  1  -5/+5
|
* Move HostName -> Hostnames, and DomainName and Hostname to com.yahoo.net  Jon Marius Venstad  2022-03-31  1  -5/+5
|
* Update 2020 Oath copyrights.  gjoranv  2021-10-27  1  -1/+1
|
* Update Verizon Media copyright notices.  gjoranv  2021-10-07  2  -2/+2
|
* Merge branch 'master' into balder/do-not-depend-on-clusterinfo  Henning Baldersheim  2021-09-30  3  -2/+234
|\
| * Separate balanced and sparse  Jon Bratseth  2021-07-02  1  -2/+22
| |
| * Infer group  Jon Bratseth  2021-07-02  2  -14/+14
| |
| * Allow deviation of at least 1 document  Harald Musum  2021-06-30  1  -0/+18
| |     Let content be well-balanced when there are few docs in a cluster
| * Revert "Revert "Don't consider number of working nodes in coverage""  Jon Bratseth  2021-05-11  1  -2/+16
| |
| * Revert "Don't consider number of working nodes in coverage"  Jon Bratseth  2021-05-10  1  -16/+2
| |
| * Don't consider number of working nodes in coverage  Jon Bratseth  2021-05-10  1  -2/+16
| |     Trying to figure out the right groups to send queries to based on the number of nodes in the group has many potential issues at times of topology changes.
| |
| |     Since we count the number of documents available in each group by summing documents in working nodes, we do not need to also separately consider the number of working nodes in the group for correctness.
| |
| |     Since we use adaptive dispatching by default, we also do not need to consider it to avoid overloading groups with fewer resources available but enough documents.
| * Use median not average document count to determine group coverage  Jon Bratseth  2021-04-15  3  -1/+123
| |     If a group has too many nodes, all others will have less than average.
| * Revert "Revert "Disable topk optimisation on dispatch when content ↵  Henning Baldersheim  2021-01-08  1  -1/+44
| |     distribution is se…""
| * Revert "Disable topk optimisation on dispatch when content distribution is ↵  Henning Baldersheim  2021-01-08  1  -44/+1
| |     se…"
| * Disable topk optimisation on dispatch when content distribution is severely ↵  Henning Baldersheim  2021-01-07  1  -1/+44
| |     skewed.
| |
| |     When the skew is too large, the assumption that docs are evenly and randomly distributed no longer holds. The impact is larger on smaller systems. In large systems, where this optimisation is more important, the probability of large skew will be less.
| * Make SearchCluster.TopKEstimator a top level class.  Henning Baldersheim  2020-04-15  1  -12/+1
| |
| * Introduce top-k-probability and use it to fetch the correct amount of ↵  Henning Baldersheim  2020-04-15  1  -0/+12
| |     hits from each partition
| * Revert "Revert "Revert "Revert "Don't take combined clusters of size 1 down""""  Jon Bratseth  2020-03-26  1  -1/+15
| |
| * Revert "Revert "Revert "Don't take combined clusters of size 1 down"""  Jon Bratseth  2020-03-26  1  -15/+1
| |
| * Revert "Revert "Don't take combined clusters of size 1 down""  Jon Bratseth  2020-03-25  1  -1/+15
| |
| * Revert "Don't take combined clusters of size 1 down"  Harald Musum  2020-03-25  1  -15/+1
| |
| * Don't take combined clusters of size 1 down  Jon Bratseth  2020-03-25  1  -1/+15
| |     This can lead to a deadlock:
| |     - host-admin needs to suspend node before it reduces the CPU allocation
| |     - suspension means setting storage node in maintenance, distributor down
| |     - cluster controller figures this means the cluster is down
| |     - the container on the same node (being a combined cluster) receives report from the downstream storage node of being offline, and changes its /state/v1/health to down
| |     - being a combined cluster node w/container, the host-admin must verify /health/v1/status is UP before allowing resume, which it isn't
| |
| |     We have no good options when the content node is down and size is 1, and do not much care about availability in this case by definition, so keeping the container in rotation should be fine.
| * Revert "Revert "Revert "Revert "Create a resourcepool so that we do not need ↵  Henning Baldersheim  2020-02-19  1  -1/+0
| |     to reconnect to content …""""
| * Revert "Revert "Revert "Create a resourcepool so that we do not need to ↵  Harald Musum  2020-02-19  1  -0/+1
| |     reconnect to content …"""
| * Revert "Revert "Create a resourcepool so that we do not need to reconnect to ↵  Henning Baldersheim  2020-02-14  1  -1/+0
| |     content …""
| * Revert "Create a resourcepool so that we do not need to reconnect to content ↵  Harald Musum  2020-02-14  1  -0/+1
| |     …"
| * Create a resourcepool so that we do not need to reconnect to content cluster ↵  Henning Baldersheim  2020-02-13  1  -1/+0
| |     on changes to container cluster.
* | Do not depend on ClusterInfo config as it changes too often and causes an ↵  Henning Baldersheim  2020-02-13  1  -2/+1
|/      instant cluster-wide hiccup on any container cluster change, such as node retirement. The corner case it was used for is not worth the cost.
* Decouple so ClusterMonitor is on the outside of the searchcluster and can be ↵  Henning Baldersheim  2020-02-04  1  -13/+5
|     provided.
* Move pingfactory to constructor.  Henning Baldersheim  2020-02-04  1  -2/+3
|
* Do not start cluster monitor thread in test as it will race with explicit ↵  Henning Baldersheim  2020-02-04  1  -2/+2
|     ping in test.
* Add another ping round to avoid racing with the builtin ping thread that ↵  Henning Baldersheim  2020-02-03  1  -1/+1
|     operates at 1hz.
* Wait until Pong has returned before saying you are done.  Henning Baldersheim  2020-02-03  1  -1/+1
|
* Provide pongHandler in constructor to avoid needing an AtomicReference.  Henning Baldersheim  2020-02-03  1  -5/+7
|
* Use sequence numbers and check on Pong reception instead.  Henning Baldersheim  2020-02-03  1  -0/+15
|
* Send ping every second truly async to all nodes that do not have any ↵  Henning Baldersheim  2020-01-31  1  -7/+7
|     pending pings.
* Fix unstable test  Martin Polden  2020-01-30  1  -19/+20
|
* Close state in requireThatVipStatusIsDefaultDownButComesUpAfterPinging  Jon Bratseth  2020-01-20  1  -7/+13
|
* Shutdown search cluster monitoring after use  Harald Musum  2020-01-13  1  -92/+105
|
* Remove unused executor, log if we get InterruptedException  Harald Musum  2020-01-13  1  -24/+12
|
* Wait longer  Jon Bratseth  2020-01-08  1  -5/+17
|
* Add/correct copyright headers  Jon Bratseth  2020-01-03  1  -0/+1
|
* - Shut down monitoring thread.  Henning Baldersheim  2019-10-04  1  -2/+2
|     - Remove fs4 cleanup.
|     - Add some more debug information for group status.
* false is false, and true is true, can not be both  Henning Baldersheim  2019-09-20  1  -0/+14
|
* Revert "Revert "Bratseth/vip logic take 2""  Henning Baldersheim  2019-09-20  1  -0/+279
|
* Revert "Bratseth/vip logic take 2"  Harald Musum  2019-09-20  1  -279/+0
|
* Drive the ping ourselves to avoid waiting for the monitor thread.  Henning Baldersheim  2019-09-19  1  -3/+24
|
* Transition from down to up initially  Jon Bratseth  2019-09-19  1  -18/+33
|     - Use tri-state logic for working/failing/unknown
|     - Be initially down in test and verify we come up
* Revert "Merge pull request #10736 from ↵  Jon Bratseth  2019-09-19  1  -0/+243
|     vespa-engine/revert-10727-balder/add-searchcluster-test-with-local"
|
|     This reverts commit 992b73092f0d14beb3ae380904d27886fe4dbc89, reversing changes made to 925ad2648e24ca0db15054beb7450f209712e404.
* Revert "Add test for in and out of vip and fix bug."  Håkon Hallingstad  2019-09-19  1  -243/+0
|