author Håkon Hallingstad <hakon@verizonmedia.com> 2019-11-13 23:06:23 +0100
committer Håkon Hallingstad <hakon@verizonmedia.com> 2019-11-13 23:06:23 +0100
commit b2be7d18f2b540294db374a4740500cdb24650a1
tree 7047da3ca0fce9a10094141d94a2a7cee44f1d4a /flags
parent 506cfee050f40cc29595f14c07a201193b9fcf89
Read reboot-interval-in-days dynamically
But also: this changes the distribution of the scheduling past 1x the reboot interval. Hosts will be scheduled for reboot evenly distributed over the whole 1x-2x range, and are thereby guaranteed to be scheduled at 2x at the latest. With the old algorithm, the expected time before a reboot was scheduled was 1.33 reboot intervals, with no guaranteed upper bound. The new algorithm has an expected time before reboot of 1.5 reboot intervals, bounded by 2x. The old algorithm had a higher probability of scheduling a reboot just past the 1x boundary, but a lower probability than the new one as one nears 2x.

So I think the new algorithm also has the nice property of avoiding a thundering herd, perhaps even more so than the old one: for instance, when most hosts in a zone are rebooted at the same time, they would tend to be rescheduled for reboot closer to each other with the old algorithm than with the new. Enabling the new algorithm should also not lead to too many hosts suddenly having to reboot, or at least that's what I hope. I can sanity-check this before merge; I guess it would be dominated by the number of hosts in west/east that are beyond 2x.
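To illustrate the 1.5x figure, here is a minimal sketch of a uniform draw over the 1x-2x range. It is not the NodeRebooter code: the class RebootScheduleSketch and the method sampleRebootDelay are hypothetical names, and the real maintainer may well realize the same distribution differently, for instance with a per-run reboot probability rather than a single up-front draw.

```java
import java.time.Duration;
import java.util.Random;

// Hypothetical illustration only; not taken from NodeRebooter.
public class RebootScheduleSketch {

    /**
     * Returns a delay since the previous reboot, drawn uniformly from
     * [rebootInterval, 2 * rebootInterval). Assumes a single up-front draw.
     */
    static Duration sampleRebootDelay(Duration rebootInterval, Random random) {
        long intervalMillis = rebootInterval.toMillis();
        // nextDouble() is in [0.0, 1.0), so the extra delay is strictly below 1x.
        long extraMillis = (long) (random.nextDouble() * intervalMillis);
        return Duration.ofMillis(intervalMillis + extraMillis);
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofDays(30); // default of reboot-interval-in-days
        Random random = new Random(42);
        int samples = 100_000;
        double sumInIntervals = 0;
        for (int i = 0; i < samples; i++) {
            Duration delay = sampleRebootDelay(interval, random);
            sumInIntervals += delay.toMillis() / (double) interval.toMillis();
        }
        // The mean should come out close to 1.5 reboot intervals.
        System.out.printf("mean delay: %.3f reboot intervals%n", sumInIntervals / samples);
    }
}
```

With a 30-day interval, every sampled delay falls between 30 and 60 days, and the mean is about 45 days.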
Diffstat (limited to 'flags')
-rw-r--r-- flags/src/main/java/com/yahoo/vespa/flags/Flags.java 5
1 files changed, 3 insertions, 2 deletions
diff --git a/flags/src/main/java/com/yahoo/vespa/flags/Flags.java b/flags/src/main/java/com/yahoo/vespa/flags/Flags.java
index 78aec5285cf..272e96903f8 100644
--- a/flags/src/main/java/com/yahoo/vespa/flags/Flags.java
+++ b/flags/src/main/java/com/yahoo/vespa/flags/Flags.java
@@ -113,8 +113,9 @@ public class Flags {
public static final UnboundIntFlag REBOOT_INTERVAL_IN_DAYS = defineIntFlag(
"reboot-interval-in-days", 30,
- "The reboot interval in days.",
- "Takes effect on start of config server / controller");
+ "No reboots are scheduled 0x-1x reboot intervals after the previous reboot, while reboot is " +
+ "scheduled evenly distributed in the 1x-2x range (and naturally guaranteed at the 2x boundary).",
+ "Takes effect on next run of NodeRebooter");
public static final UnboundBooleanFlag ENABLE_DYNAMIC_PROVISIONING = defineFeatureFlag(
"enable-dynamic-provisioning", false,