author | Tor Brede Vekterli <vekterli@verizonmedia.com> | 2021-02-24 17:07:17 +0100 |
---|---|---|
committer | Tor Brede Vekterli <vekterli@verizonmedia.com> | 2021-02-26 12:32:04 +0100 |
commit | 20627141c1f57c0c199193b07abce885508152f3 (patch) | |
tree | 87717c7794b751fcd6d786556248dad77e076aaa /clustercontroller-core/src/test/java/com/yahoo/vespa/clustercontroller/core/StateChangeTest.java | |
parent | 38beecb43c8dfc15af8f3b14a62ee29430689713 (diff) |
Enforce that no cluster state can be published unless confirmed written to ZooKeeper
This avoids a subtle edge case where the underlying ZK integration code
may silently fail a write, leaving the core controller logic to think that
it had actually durably persisted a particular state version.
In case of reelections racing with broadcasts, it would be possible for
leader-edge readbacks from ZK to retrieve a _lower_ version than one
that had already been published. This would cause the cluster controller
to get very confused about which cluster states nodes had already observed.
If a newly produced state version overlapped with a previously broadcast
state, the controller would not push the updated state to the nodes, as it
would (with good reason) assume the node had already observed it, seeing
that it had already ACKed the particular version number.
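The invariant described above can be illustrated with a minimal sketch. This is not the actual cluster controller code: `VersionStore` and `PublishGate` are hypothetical names, and the boolean return value stands in for whatever confirmation the real ZooKeeper integration provides. The point is only that a version is broadcast if, and only if, the store reports that the write was durably persisted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the ZooKeeper-backed version store (not the real API). */
interface VersionStore {
    /** Returns true only if the write was confirmed as durably persisted. */
    boolean storeVersion(int version);
}

/** Hypothetical gate that refuses to publish any version lacking a confirmed write. */
class PublishGate {
    private final VersionStore store;
    private final List<Integer> published = new ArrayList<>();

    PublishGate(VersionStore store) {
        this.store = store;
    }

    /** Publishes the version only after the store confirms the write. */
    boolean tryPublish(int version) {
        if (!store.storeVersion(version)) {
            // A silently failed write must never lead to a broadcast. Otherwise a
            // later leader could read back a lower version than nodes have ACKed.
            return false;
        }
        published.add(version); // Stand-in for broadcasting the state to the nodes.
        return true;
    }

    List<Integer> published() {
        return published;
    }
}
```

With a store that always reports failure, `tryPublish` returns false and nothing is recorded as published, so a newly elected leader cannot read back a version lower than one the nodes have already seen.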
Diffstat (limited to 'clustercontroller-core/src/test/java/com/yahoo/vespa/clustercontroller/core/StateChangeTest.java')
-rw-r--r-- | clustercontroller-core/src/test/java/com/yahoo/vespa/clustercontroller/core/StateChangeTest.java | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/clustercontroller-core/src/test/java/com/yahoo/vespa/clustercontroller/core/StateChangeTest.java b/clustercontroller-core/src/test/java/com/yahoo/vespa/clustercontroller/core/StateChangeTest.java
index 15778846ca6..cf2b151e55a 100644
--- a/clustercontroller-core/src/test/java/com/yahoo/vespa/clustercontroller/core/StateChangeTest.java
+++ b/clustercontroller-core/src/test/java/com/yahoo/vespa/clustercontroller/core/StateChangeTest.java
@@ -954,7 +954,7 @@ public class StateChangeTest extends FleetControllerTest {
         options.minTimeBeforeFirstSystemStateBroadcast = 3 * 60 * 1000;
         setUpSystem(true, options);
         setUpVdsNodes(true, new DummyVdsNodeOptions(), true);
-        // Leave one node down to avoid sending cluster state due to having seen all node states.
+        // Leave one node down to avoid sending cluster state due to having seen all node states.
         for (int i=0; i<nodes.size(); ++i) {
             if (i != 3) {
                 nodes.get(i).connect();
@@ -973,7 +973,7 @@ public class StateChangeTest extends FleetControllerTest {
             @Override int expectedMessageCount(final DummyVdsNode node) { return 0; }
         };
-        // Pass time and see that the nodes get state
+        // Pass time and see that the nodes get state
         timer.advanceTime(3 * 60 * 1000);
         waiter.waitForState("version:\\d+ distributor:10 storage:10 .1.s:d", timeoutMS);