Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
* | Disallow >1 group to suspend | Håkon Hallingstad | 2021-04-16 | 5 | -27/+277 | |
| | | | | | | | If there is more than one group, disallow suspending a node if there is a node in another group that has a user wanted state != UP. If there is 1 group, disallow suspending more than 1 node. | |||||
* | No longer allow suspension if in maintenance | Håkon Hallingstad | 2021-04-15 | 3 | -17/+14 | |
| | | | | | | If a storage node falls out of Slobrok, it will change from UP to Maintenance after 60s, then after further 30s go to Down. Avoid allowing suspension in the 30s grace period just because it is Maintenance mode. | |||||
* | Merge branch 'master' into hmusum/cleanup-7 | Harald Musum | 2021-04-08 | 6 | -22/+32 | |
|\ | ||||||
| * | Add remote task queue size metric in cluster controller | Håkon Hallingstad | 2021-04-01 | 6 | -22/+32 | |
| | | ||||||
* | | Cleanup tests a bit | Harald Musum | 2021-04-08 | 3 | -43/+49 | |
| | | ||||||
* | | Fix typo in class name | Harald Musum | 2021-04-08 | 1 | -1/+1 | |
|/ | ||||||
* | Log when transitioning out of CC moratorium | Håkon Hallingstad | 2021-03-26 | 1 | -6/+3 | |
| | ||||||
* | Make default deadline to first broadcast 30s | Håkon Hallingstad | 2021-03-24 | 3 | -3/+5 | |
| | ||||||
* | Revert "Revert "Avoid safe mutations in master moratorium and increase first cluster state broadcast deadline [run-systemtest]"" | Håkon Hallingstad | 2021-03-24 | 13 | -17/+71 | |
| | ||||||
* | Revert "Avoid safe mutations in master moratorium and increase first cluster state broadcast deadline [run-systemtest]" | Håkon Hallingstad | 2021-03-24 | 13 | -71/+17 | |
| | ||||||
* | Merge pull request #17085 from vespa-engine/hakonhall/increase-the-minimum-time-before-first-cluster-state-broadcast-run-systemtest | Håkon Hallingstad | 2021-03-24 | 13 | -17/+71 | |
|\ | | | | | | | | | Avoid safe mutations in master moratorium and increase first cluster state broadcast deadline [run-systemtest] | |||||
| * | Avoid safe-set-node-state in master moratorium | Håkon Hallingstad | 2021-03-24 | 12 | -16/+68 | |
| | | ||||||
| * | Increase the minimum time before first cluster state broadcast [run-systemtest] | Håkon Hallingstad | 2021-03-19 | 1 | -1/+3 | |
| | | ||||||
* | | Revert deferred ZK connectivity for now | Tor Brede Vekterli | 2021-03-22 | 3 | -21/+2 | |
| | | | | | | | | | | | | Instead, we'll want to create a more generalized solution that considers all sources of node information (Slobrok _and_ explicit health check RPCs) before potentially publishing a state or processing tasks. | |||||
* | | Make sure to reset any election shortcuts if we go from !ZK -> ZK | Tor Brede Vekterli | 2021-03-19 | 1 | -5/+13 | |
| | | ||||||
* | | Use local leader state for decisions rather than election handler | Tor Brede Vekterli | 2021-03-19 | 1 | -5/+7 | |
| | | | | | | | | | | | | | | | | | | Avoids potentially publishing cluster states _before_ we have triggered our own leadership election edge handling code. Could happen if code called prior to the election edge logic checked the election handler state and erroneously thought we had performed the prerequisite actions we're supposed to do when assuming leadership (such as reading back current state from ZK). | |||||
* | | Don't allow short-circuiting election phase if only one node configured if using ZK | Tor Brede Vekterli | 2021-03-19 | 2 | -2/+10 | |
| | | ||||||
* | | Inhibit ZooKeeper connections until our local Slobrok mirror is ready. | Tor Brede Vekterli | 2021-03-19 | 6 | -2/+41 | |
|/ | | | | | | | | Otherwise, if there are transient Slobrok issues during CC startup and we end up winning the election, we risk publishing a cluster state where the entire cluster appears down (since we do not have any knowledge of Slobrok node mapping state). This will adversely affect availability for all the obvious reasons. | |||||
* | Guard against ever accidentally publishing a default constructed state | Tor Brede Vekterli | 2021-03-19 | 3 | -20/+20 | |
| | | | | | Since version 0 states were ambiguous with the sentinel values for "not written to ZK/not tagged as official", this could be mis-interpreted. | |||||
* | use US locale | Kristian Aune | 2021-03-19 | 2 | -7/+9 | |
| | ||||||
* | Revert "Inhibit ZooKeeper connections until our local Slobrok mirror is ready." | Tor Brede Vekterli | 2021-03-18 | 6 | -41/+2 | |
| | ||||||
* | Merge pull request #17029 from vespa-engine/vekterli/inhibit-db-connectivity-until-slobrok-is-ready | Tor Brede Vekterli | 2021-03-18 | 6 | -2/+41 | |
|\ | | | | | | | | | Inhibit ZooKeeper connections until our local Slobrok mirror is ready. | |||||
| * | Guard against Slobrok mirror not yet being configured | Tor Brede Vekterli | 2021-03-18 | 2 | -6/+2 | |
| | | ||||||
| * | Inhibit ZooKeeper connections until our local Slobrok mirror is ready. | Tor Brede Vekterli | 2021-03-18 | 6 | -2/+45 | |
| | | | | | | | | | | | | | | | | Otherwise, if there are transient Slobrok issues during CC startup and we end up winning the election, we risk publishing a cluster state where the entire cluster appears down (since we do not have any knowledge of Slobrok node mapping state). This will adversely affect availability for all the obvious reasons. | |||||
* | | Merge pull request #16935 from vespa-engine/revert-16934-revert-16932-balder/move-metrics-from-partition-to-node-level | Henning Baldersheim | 2021-03-15 | 7 | -39/+45 | |
|\ \ | | | | | | | | | | | | | Revert "Revert "GC unused DiskState and add the partition metrics to node level."" | |||||
| * | | Include metrics always. | Henning Baldersheim | 2021-03-12 | 1 | -27/+0 | |
| | | | ||||||
| * | | Revert "Revert "GC unused DiskState and add the partition metrics to node level."" | Henning Baldersheim | 2021-03-12 | 7 | -12/+45 | |
| | | | ||||||
* | | | Ensure Import-Package for javax packages are included in bundle's manifest | Bjørn Christian Seime | 2021-03-15 | 1 | -0/+7 | |
|/ / | ||||||
* | | Revert "GC unused DiskState and add the partition metrics to node level." | Harald Musum | 2021-03-12 | 7 | -45/+12 | |
| | | ||||||
* | | GC unused DiskState and add the partition metrics to node level. | Henning Baldersheim | 2021-03-12 | 7 | -12/+45 | |
| | | ||||||
* | | GC unused import | Henning Baldersheim | 2021-03-12 | 2 | -2/+0 | |
|/ | ||||||
* | Merge pull request #16926 from vespa-engine/vekterli/dont-store-full-bundle-objects-in-state-history | Tor Brede Vekterli | 2021-03-12 | 6 | -98/+177 | |
|\ | | | | | | | | | Don't store full bundle objects in state history | |||||
| * | Add missing copyright | Tor Brede Vekterli | 2021-03-12 | 1 | -0/+1 | |
| | | ||||||
| * | Move config output further down on the status page | Tor Brede Vekterli | 2021-03-12 | 1 | -2/+2 | |
| | | | | | | | | | | Always print regardless of leader eligibility state; config is not predicated on this. | |||||
| * | Move ZK/election-related info away from top of CC status page | Tor Brede Vekterli | 2021-03-12 | 1 | -2/+2 | |
| | | | | | | | | | | Much less immediately interesting than the actual cluster node information. Move it just above the general event log instead. | |||||
| * | Don't store full bundle objects in cluster state history | Tor Brede Vekterli | 2021-03-12 | 6 | -94/+172 | |
| | | | | | | | | | | | | | | | | | | | | | | | | | | Bundles have a lot of sub-objects per state, so in systems with a high amount of node entries, this adds unnecessary pressure on the heap. Instead, store the string representations of the bundle and the string representation of the diff to the previous state version (if any). This is also inherently faster than computing the diffs on-demand on every status page render. Also remove mutable `official` field from `ClusterState`. Not worth violating immutability of an object just to get some prettier (but with high likelihood actually more confusing) status page rendering. | |||||
* | | Revert "GC unused DiskState" | Arnstein Ressem | 2021-03-12 | 10 | -7/+180 | |
|/ | ||||||
* | Merge pull request #16911 from vespa-engine/balder/gc-disk-states | Henning Baldersheim | 2021-03-12 | 10 | -180/+7 | |
|\ | | | | | GC unused DiskState | |||||
| * | GC Partition | Henning Baldersheim | 2021-03-11 | 1 | -12/+0 | |
| | | ||||||
| * | GC unused DiskState | Henning Baldersheim | 2021-03-11 | 9 | -168/+7 | |
| | | ||||||
* | | Merge pull request #16900 from vespa-engine/bjorncs/zookeeper-client-common | Bjørn Christian Seime | 2021-03-12 | 2 | -10/+16 | |
|\ \ | |/ |/| | Add shared ZK client config generator for zkfacade and vespa-zkcli [run-systemtest] | |||||
| * | Construct ZKClientConfig from ZkClientConfigBuilder | Bjørn Christian Seime | 2021-03-11 | 2 | -10/+16 | |
| | | | | | | | | Use ZKClientConfig builder from Curator and ZooKeeperDatabase | |||||
* | | GC use of void DiskState. | Henning Baldersheim | 2021-03-11 | 5 | -65/+9 | |
| | | ||||||
* | | GC use of NodeState.getDiskCount and NodeState.getDiskStates. | Henning Baldersheim | 2021-03-11 | 5 | -74/+11 | |
| | | ||||||
* | | GC long gone disk state checks. | Henning Baldersheim | 2021-03-11 | 1 | -1/+6 | |
| | | ||||||
* | | Shrink the size of the NodeState object by using float over double for initProgress and capacity. | Henning Baldersheim | 2021-03-11 | 4 | -29/+29 | |
|/ | | | | Also gc unused 'reliability' member. | |||||
* | Remove com.yahoo.vespa.jdk8compat | Bjørn Christian Seime | 2021-03-10 | 1 | -5/+4 | |
| | | | | These types are often accidentally imported, and the JDK8 replacement is typically a one-liner. | |||||
* | Merge pull request #16856 from vespa-engine/vekterli/immediately-exit-cc-if-node-index-changed-live | Tor Brede Vekterli | 2021-03-09 | 1 | -3/+20 | |
|\ | | | | | | | | | Immediately exit cluster controller if node index config is changed live | |||||
| * | Immediately exit cluster controller if node index config is changed live | Tor Brede Vekterli | 2021-03-09 | 1 | -3/+20 | |
| | | | | | | | | | | We do not support live reconfigs of CC index, so swiftly exit if we detect this, allowing the config sentinel to restart the service. | |||||
* | | Merge pull request #16843 from vespa-engine/bjorncs/upgrade-zk-client | Bjørn Christian Seime | 2021-03-09 | 1 | -0/+6 | |
|\ \ | |/ |/| | Bjorncs/upgrade zk client [run-systemtest] |