Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Rename back to HostName, and merge the value class and utilities | Jon Marius Venstad | 2022-03-31 | 1 | -5/+5 |
| | |||||
* | Move HostName -> Hostnames, and DomainName and Hostname to com.yahoo.net | Jon Marius Venstad | 2022-03-31 | 1 | -5/+5 |
| | |||||
* | Update 2020 Oath copyrights. | gjoranv | 2021-10-27 | 1 | -1/+1 |
| | |||||
* | Update Verizon Media copyright notices. | gjoranv | 2021-10-07 | 2 | -2/+2 |
| | |||||
* | Merge branch 'master' into balder/do-not-depend-on-clusterinfo | Henning Baldersheim | 2021-09-30 | 3 | -2/+234 |
|\ | |||||
| * | Separate balanced and sparse | Jon Bratseth | 2021-07-02 | 1 | -2/+22 |
| | | |||||
| * | Infer group | Jon Bratseth | 2021-07-02 | 2 | -14/+14 |
| | | |||||
| * | Allow deviation of at least 1 document | Harald Musum | 2021-06-30 | 1 | -0/+18 |
| | | Let content be well-balanced when there are few docs in a cluster | ||||
| * | Revert "Revert "Don't consider number of working nodes in coverage"" | Jon Bratseth | 2021-05-11 | 1 | -2/+16 |
| | | |||||
| * | Revert "Don't consider number of working nodes in coverage" | Jon Bratseth | 2021-05-10 | 1 | -16/+2 |
| | | |||||
| * | Don't consider number of working nodes in coverage | Jon Bratseth | 2021-05-10 | 1 | -2/+16 |
| | | Trying to figure out the right groups to send queries to based on the number of nodes in the group has many potential issues at times of topology changes. Since we count the number of documents available in each group by summing documents in working nodes, we do not need to also separately consider the number of working nodes in the group for correctness. Since we use adaptive dispatching by default, we also do not need to consider it to avoid overloading groups with fewer resources available but enough documents. | ||||
| * | Use median not average document count to determine group coverage | Jon Bratseth | 2021-04-15 | 3 | -1/+123 |
| | | If a group has too many nodes, all others will have less than average. | ||||
| * | Revert "Revert "Disable topk optimisation on dispatch when content distribution is se…"" | Henning Baldersheim | 2021-01-08 | 1 | -1/+44 |
| | | |||||
| * | Revert "Disable topk optimisation on dispatch when content distribution is se…" | Henning Baldersheim | 2021-01-08 | 1 | -44/+1 |
| | | |||||
| * | Disable topk optimisation on dispatch when content distribution is severely skewed. | Henning Baldersheim | 2021-01-07 | 1 | -1/+44 |
| | | When the skew is too large, the assumption that docs are evenly and randomly distributed no longer holds. The impact is larger on smaller systems. In large systems, where this optimisation is more important, the probability of large skew will be lower. | ||||
| * | Make SearchCluster.TopKEstimator a top level class. | Henning Baldersheim | 2020-04-15 | 1 | -12/+1 |
| | | |||||
| * | Introduce top-k-probability and use it to fetch the correct amount of hits from each partition | Henning Baldersheim | 2020-04-15 | 1 | -0/+12 |
| | | |||||
| * | Revert "Revert "Revert "Revert "Don't take combined clusters of size 1 down"""" | Jon Bratseth | 2020-03-26 | 1 | -1/+15 |
| | | |||||
| * | Revert "Revert "Revert "Don't take combined clusters of size 1 down""" | Jon Bratseth | 2020-03-26 | 1 | -15/+1 |
| | | |||||
| * | Revert "Revert "Don't take combined clusters of size 1 down"" | Jon Bratseth | 2020-03-25 | 1 | -1/+15 |
| | | |||||
| * | Revert "Don't take combined clusters of size 1 down" | Harald Musum | 2020-03-25 | 1 | -15/+1 |
| | | |||||
| * | Don't take combined clusters of size 1 down | Jon Bratseth | 2020-03-25 | 1 | -1/+15 |
| | | This can lead to a deadlock: | ||||
| | | - host-admin needs to suspend node before it reduces the CPU allocation | ||||
| | | - suspension means setting storage node in maintenance, distributor down | ||||
| | | - cluster controller figures this means the cluster is down | ||||
| | | - the container on the same node (being a combined cluster) receives report from the downstream storage node of being offline, and changes its /state/v1/health to down | ||||
| | | - being a combined cluster node w/container, the host-admin must verify /health/v1/status is UP before allowing resume, which it isn't | ||||
| | | We have no good options when the content node is down and size is 1, and do not much care about availability in this case by definition, so keeping the container in rotation should be fine. | ||||
| * | Revert "Revert "Revert "Revert "Create a resourcepool so that we do not need to reconnect to content …"""" | Henning Baldersheim | 2020-02-19 | 1 | -1/+0 |
| | | |||||
| * | Revert "Revert "Revert "Create a resourcepool so that we do not need to reconnect to content …""" | Harald Musum | 2020-02-19 | 1 | -0/+1 |
| | | |||||
| * | Revert "Revert "Create a resourcepool so that we do not need to reconnect to content …"" | Henning Baldersheim | 2020-02-14 | 1 | -1/+0 |
| | | |||||
| * | Revert "Create a resourcepool so that we do not need to reconnect to content …" | Harald Musum | 2020-02-14 | 1 | -0/+1 |
| | | |||||
| * | Create a resourcepool so that we do not need to reconnect to content cluster on changes to container cluster. | Henning Baldersheim | 2020-02-13 | 1 | -1/+0 |
| | | |||||
* | | Do not depend on ClusterInfo config as it changes too often and causes an instant clusterwide hiccup on any container cluster changes like node retirement. | Henning Baldersheim | 2020-02-13 | 1 | -2/+1 |
|/ | The corner case it was used for is not worth the cost. | ||||
* | Decouple so ClusterMonitor is on the outside of the searchcluster and can be provided. | Henning Baldersheim | 2020-02-04 | 1 | -13/+5 |
| | |||||
* | Move pingfactory to constructor. | Henning Baldersheim | 2020-02-04 | 1 | -2/+3 |
| | |||||
* | Do not start cluster monitor thread in test as it will race with explicit ping in test. | Henning Baldersheim | 2020-02-04 | 1 | -2/+2 |
| | |||||
* | Add another ping round to avoid racing with the builtin ping thread that operates at 1 Hz. | Henning Baldersheim | 2020-02-03 | 1 | -1/+1 |
| | |||||
* | Wait until Pong has returned before saying you are done. | Henning Baldersheim | 2020-02-03 | 1 | -1/+1 |
| | |||||
* | Provide pongHandler in constructor to avoid needing an AtomicReference. | Henning Baldersheim | 2020-02-03 | 1 | -5/+7 |
| | |||||
* | Use sequence numbers and check on Pong reception instead. | Henning Baldersheim | 2020-02-03 | 1 | -0/+15 |
| | |||||
* | Send ping every second truly async to all nodes that do not have any pending pings. | Henning Baldersheim | 2020-01-31 | 1 | -7/+7 |
| | |||||
* | Fix unstable test | Martin Polden | 2020-01-30 | 1 | -19/+20 |
| | |||||
* | Close state in requireThatVipStatusIsDefaultDownButComesUpAfterPinging | Jon Bratseth | 2020-01-20 | 1 | -7/+13 |
| | |||||
* | Shutdown search cluster monitoring after use | Harald Musum | 2020-01-13 | 1 | -92/+105 |
| | |||||
* | Remove unused executor, log if we get InterruptedException | Harald Musum | 2020-01-13 | 1 | -24/+12 |
| | |||||
* | Wait longer | Jon Bratseth | 2020-01-08 | 1 | -5/+17 |
| | |||||
* | Add/correct copyright headers | Jon Bratseth | 2020-01-03 | 1 | -0/+1 |
| | |||||
* | - Shut down monitoring thread. | Henning Baldersheim | 2019-10-04 | 1 | -2/+2 |
| | - Remove fs4 cleanup. | ||||
| | - Add some more debug information for group status. | ||||
* | false is false, and true is true, can not be both | Henning Baldersheim | 2019-09-20 | 1 | -0/+14 |
| | |||||
* | Revert "Revert "Bratseth/vip logic take 2"" | Henning Baldersheim | 2019-09-20 | 1 | -0/+279 |
| | |||||
* | Revert "Bratseth/vip logic take 2" | Harald Musum | 2019-09-20 | 1 | -279/+0 |
| | |||||
* | Drive the ping ourselves to avoid waiting for the monitor thread. | Henning Baldersheim | 2019-09-19 | 1 | -3/+24 |
| | |||||
* | Transition from down to up initially | Jon Bratseth | 2019-09-19 | 1 | -18/+33 |
| | - Use tri-state logic for working/failing/unknown | ||||
| | - Be initially down in test and verify we come up | ||||
* | Revert "Merge pull request #10736 from vespa-engine/revert-10727-balder/add-searchcluster-test-with-local" | Jon Bratseth | 2019-09-19 | 1 | -0/+243 |
| | This reverts commit 992b73092f0d14beb3ae380904d27886fe4dbc89, reversing changes made to 925ad2648e24ca0db15054beb7450f209712e404. | ||||
* | Revert "Add test for in and out of vip and fix bug." | Håkon Hallingstad | 2019-09-19 | 1 | -243/+0 |
| |