Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Guard critical cluster state version ZK writes with an atomic CaS | Tor Brede Vekterli | 2019-07-11 | 5 | -68/+170 |
| | | | | | | | | | Lets a controller discover that another controller also believes it is the leader by tracking the expected znode versions for cluster state version and bundle znodes. If a CaS failure is triggered, the controller will drop its database and election state, forcing a state refresh. | ||||
* | Do not allow states to be published when they have pending ZK writes | Tor Brede Vekterli | 2019-07-05 | 2 | -0/+15 |
| | | | | | | Avoids a race condition where a bundle ZK write fails but we have not yet detected that ZK connectivity has been lost. This could lead to violating the invariant that published state versions are strictly increasing. | ||||
* | Don't use fake timers for master election tests | Tor Brede Vekterli | 2019-06-20 | 1 | -14/+14 |
| | | | | | | | | When using fake timers there's a catch-22 edge case during transient ZooKeeper disconnects. The test will not progress and increase the fake timer value until ZK is connected, but ZK will not attempt to reconnect until the reconnection cooldown period has passed (which depends on the faked timer). | ||||
* | Increase ZK client session timeout from 10s to 30s | Tor Brede Vekterli | 2019-06-19 | 1 | -1/+1 |
| | | | | | | | | Have a suspicion that 10s ends up being too short when ZK's write-ahead log flushing is taking a long time due to a heavily loaded CI container. Should still be short enough for a network hiccup to hopefully trigger a reconnect before the test itself times out. | ||||
* | Reduce ZK session timeout for master election unit tests | Tor Brede Vekterli | 2019-06-17 | 1 | -4/+4 |
| | | | | | | | | | Setup methods would previously override explicit timeouts set by the tests with a default of 2 minutes. It's suspected that this long timeout combined with spurious ZK client failures is a source of test failures. Reduce to 10 seconds for now. Let's see how well it fares on a heavily loaded CI container. | ||||
* | Keep the spec final. | Henning Baldersheim | 2019-05-28 | 1 | -1/+1 |
| | | | | | | Create the address when needed in the async connect thread. Implement hash/equal/compareTo for Spec to avoid toString. Use Spec as key and avoid creating it every time. | ||||
* | Remove usage of deprecated Method constructor | Bjørn Christian Seime | 2019-05-23 | 2 | -21/+21 |
| | |||||
* | mockito-all => mockito-core | Henning Baldersheim | 2019-04-29 | 1 | -1/+1 |
| | |||||
* | Change interface from Mirror.Entry[] to List<Mirror.Entry> as you already have a list. | Henning Baldersheim | 2019-04-22 | 1 | -4/+9 |
| | | | | | | Avoid having to do an array copy that is not necessary. | ||||
* | Address code review feedback for cluster controller changes | Tor Brede Vekterli | 2019-03-26 | 3 | -36/+42 |
| | |||||
* | Work around some Java generics snags in test mocks | Tor Brede Vekterli | 2019-03-22 | 1 | -0/+7 |
| | |||||
* | Activation reply processing must inspect actual version returned | Tor Brede Vekterli | 2019-03-21 | 13 | -35/+113 |
| | | | | | | | | Version mismatches in backend do not return explicit RPC errors, so actual vs. desired versions must be checked in order to avoid potentially spurious activation of other versions. Also do some minor code cleanup. | ||||
* | Add activated state version to node status page row | Tor Brede Vekterli | 2019-03-20 | 4 | -10/+18 |
| | | | | | Only displayed if not equal to published state version and if two-phase transitions are enabled. | ||||
* | Break rendering MegaFunction(tm) into separate functions | Tor Brede Vekterli | 2019-03-20 | 1 | -112/+142 |
| | |||||
* | Explicitly enable two-phase transitions in tests, disable in default options | Tor Brede Vekterli | 2019-03-20 | 16 | -70/+80 |
| | | | | Mirrors the default values in the actual underlying config definitions. | ||||
* | Add explicit tests of `SystemStateBroadcaster` behavior | Tor Brede Vekterli | 2019-03-20 | 2 | -14/+177 |
| | |||||
* | Bring default state of ClusterStateBundle deferred activation flag in line with C++ impl | Tor Brede Vekterli | 2019-03-20 | 4 | -25/+21 |
| | | | | | | | I.e. disabled by default. Also reduce log level for logging used during development. | ||||
* | Print deferred activation flag in `ClusterStateBundle.toString` | Tor Brede Vekterli | 2019-03-15 | 2 | -4/+36 |
| | |||||
* | Bind deferred activation decision to concrete bundle instance, not global config | Tor Brede Vekterli | 2019-03-15 | 5 | -9/+38 |
| | | | | Ensure that deferred activation flags are propagated during building and cloning. | ||||
* | Support configurable two-phase state transitions in cluster controller | Tor Brede Vekterli | 2019-03-14 | 16 | -108/+382 |
| | |||||
* | Initial groundwork for cluster state version activation RPC | Tor Brede Vekterli | 2019-03-14 | 10 | -62/+206 |
| | |||||
* | Include deferred activation flag with cluster state bundles | Tor Brede Vekterli | 2019-03-14 | 7 | -18/+114 |
| | | | | | | | Bundles including this flag from the cluster controller indicate to receiver nodes that an explicit activation RPC will follow. When it is not present, nodes must activate the cluster state at their own leisure as they have done historically. | ||||
* | Reduce log spam | Håkon Hallingstad | 2019-01-22 | 1 | -3/+4 |
| | |||||
* | 6-SNAPSHOT -> 7-SNAPSHOT | Arnstein Ressem | 2019-01-21 | 1 | -2/+2 |
| | |||||
* | Remove experimental enable-multiple-bucket-spaces flag. | Geir Storli | 2018-11-23 | 3 | -14/+3 |
| | | | | The feature has been default on since late May 2018. | ||||
* | Revert "Revert "Revert "Revert "Enforce CC timeouts in Orchestrator 4"""" | Håkon Hallingstad | 2018-11-01 | 5 | -21/+64 |
| | |||||
* | Revert "Revert "Revert "Enforce CC timeouts in Orchestrator 4""" | Håkon Hallingstad | 2018-11-01 | 5 | -64/+21 |
| | |||||
* | Revert "Revert "Enforce CC timeouts in Orchestrator [4]"" | Håkon Hallingstad | 2018-11-01 | 5 | -21/+64 |
| | |||||
* | Revert "Enforce CC timeouts in Orchestrator [4]" | Harald Musum | 2018-10-31 | 5 | -64/+21 |
| | |||||
* | Revert "Revert "Revert "Revert "Enforce CC timeouts in Orchestrator 2"""" | Håkon Hallingstad | 2018-10-30 | 5 | -21/+64 |
| | |||||
* | Revert "Revert "Revert "Enforce CC timeouts in Orchestrator 2""" | Håkon Hallingstad | 2018-10-30 | 5 | -64/+21 |
| | |||||
* | Revert "Revert "Enforce CC timeouts in Orchestrator 2"" | Håkon Hallingstad | 2018-10-29 | 5 | -21/+64 |
| | |||||
* | Revert "Enforce CC timeouts in Orchestrator 2" | Håkon Hallingstad | 2018-10-29 | 5 | -64/+21 |
| | |||||
* | Fixes after review round | Håkon Hallingstad | 2018-10-26 | 1 | -1/+1 |
| | |||||
* | set-node-state probing in CC | Håkon Hallingstad | 2018-10-24 | 5 | -21/+64 |
| | |||||
* | Minor fixes | Jon Bratseth | 2018-10-14 | 2 | -38/+39 |
| | |||||
* | Add copyright header | Jon Bratseth | 2018-10-01 | 2 | -0/+2 |
| | |||||
* | set-node-state timeout in CC | Håkon Hallingstad | 2018-06-22 | 6 | -5/+45 |
| | |||||
* | Warn on timeout | Håkon Hallingstad | 2018-06-15 | 1 | -1/+1 |
| | |||||
* | Do not wait for version ack for failed set-node-state | Håkon Hallingstad | 2018-06-13 | 5 | -6/+82 |
| | |||||
* | Remove support for ancient legacy node state protocol versions | Tor Brede Vekterli | 2018-06-11 | 3 | -83/+17 |
| | | | | | Protocol versions 0 and 1 haven't been in use for years. No point in maintaining complexity to support automatic downgrades to these. | ||||
* | Merge pull request #5766 from vespa-engine/vekterli/only-derive-default-space-node-states-when-global-doc-types-present | Geir Storli | 2018-05-03 | 3 | -3/+31 |
|\ | | | | | | | | | Only derive default bucket space node states when cluster has global docs | ||||
| * | Only derive default bucket space node states when cluster has global docs | Tor Brede Vekterli | 2018-05-02 | 3 | -3/+31 |
| | | | | | | | | | | | | | | Lets cluster controller use new protocols for sending compressed cluster state bundles, but without triggering implicit Maintenance edges for nodes in the default bucket space. Also allows for easy live reconfiguration when global document types are added or removed. | ||||
* | | Revert "Revert "Gjoranv/java9 prep 05"" | gjoranv | 2018-05-02 | 1 | -1/+1 |
|/ | |||||
* | Revert "Gjoranv/java9 prep 05" | gjoranv | 2018-05-02 | 1 | -1/+1 |
| | |||||
* | Java 9: Replace 'new Integer' with 'Integer.valueOf' | gjoranv | 2018-04-30 | 1 | -1/+1 |
| | |||||
* | Avoid candidate state racing with published state in tests | Tor Brede Vekterli | 2018-04-27 | 1 | -0/+15 |
| | | | | | | | | Since the tests using `StateWaiter` expect to observe _both_ versioned and unversioned (candidate) states, we ignore candidate states iff they are equal to the versioned state we have already observed. Otherwise, tests waiting for a _versioned_ state risk never observing the version number itself (only a candidate following it) and hang until they time out. | ||||
* | Merge pull request #5710 from vespa-engine/gjoranv/java9-prep-01 | Bjørn Christian Seime | 2018-04-25 | 1 | -9/+0 |
|\ | | | | | Gjoranv/java9 prep 01 | ||||
| * | Remove explicit maven-compiler-plugin config. Inherit from parent. | gjoranv | 2018-04-25 | 1 | -9/+0 |
| | | |||||
* | | Remove redundant task processing step | Tor Brede Vekterli | 2018-04-25 | 1 | -1/+0 |
| | | | | | | | | Already implicitly called by saveLatestClusterStateBundle() |
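
The CaS guard described in the 2019-07-11 commit above can be illustrated with a small, self-contained sketch. It deliberately does not use the ZooKeeper client: `FakeZnode` and `ControllerSketch` are hypothetical names invented for this illustration, and the actual cluster controller code may be organized quite differently. With a real ZooKeeper client, the corresponding conditional write is `ZooKeeper.setData(path, data, version)`, which fails with `KeeperException.BadVersionException` when the expected version does not match.

```java
/** In-memory stand-in for a znode with a data version (hypothetical, not the ZooKeeper API). */
final class FakeZnode {
    private int version = 0;
    private int data = 0;

    /** Conditional write: succeeds only if the caller's expected version is current. */
    synchronized boolean setData(int newData, int expectedVersion) {
        if (expectedVersion != version) {
            return false; // another writer got in first
        }
        data = newData;
        version++;
        return true;
    }

    synchronized int version() { return version; }
    synchronized int data() { return data; }
}

/** Hypothetical controller that tracks the znode version it expects to overwrite. */
final class ControllerSketch {
    private final FakeZnode znode;
    private int expectedZnodeVersion;
    private Integer cachedStateVersion; // null means the local database was dropped
    private boolean believesLeader = true;

    ControllerSketch(FakeZnode znode) {
        this.znode = znode;
        refresh();
    }

    /** Re-reads the znode. Re-running the election is out of scope for this sketch. */
    void refresh() {
        expectedZnodeVersion = znode.version();
        cachedStateVersion = znode.data();
    }

    /** Attempts a guarded write of a new cluster state version. */
    boolean storeStateVersion(int newStateVersion) {
        if (!believesLeader) {
            return false; // must win a new election before writing again
        }
        if (znode.setData(newStateVersion, expectedZnodeVersion)) {
            expectedZnodeVersion++;
            cachedStateVersion = newStateVersion;
            return true;
        }
        // CaS failure: some other controller also acted as leader.
        // Drop cached database and election state to force a refresh.
        believesLeader = false;
        cachedStateVersion = null;
        return false;
    }

    boolean believesLeader() { return believesLeader; }
    Integer cachedStateVersion() { return cachedStateVersion; }
}
```

If two `ControllerSketch` instances share one `FakeZnode` and both try to store a version, the first write succeeds and the second fails the version check, after which the second instance no longer considers itself leader. Tracking the expected version locally, rather than re-reading it before every write, is what makes a concurrent writer detectable in this sketch.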