author | Håkon Hallingstad <hakon@verizonmedia.com> | 2019-11-13 23:06:23 +0100 |
---|---|---|
committer | Håkon Hallingstad <hakon@verizonmedia.com> | 2019-11-13 23:06:23 +0100 |
commit | b2be7d18f2b540294db374a4740500cdb24650a1 (patch) | |
tree | 7047da3ca0fce9a10094141d94a2a7cee44f1d4a /flags | |
parent | 506cfee050f40cc29595f14c07a201193b9fcf89 (diff) |
Read reboot-interval-in-days dynamically
But also:
Changes the distribution of the scheduling past 1x reboot interval: hosts will
be scheduled for reboot evenly distributed over the whole 1x-2x range, and are
thereby guaranteed to be scheduled at 2x at the latest.
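One way a periodic maintainer could produce such an even distribution is sketched below. This is a minimal illustration under stated assumptions, not the actual NodeRebooter code: the class `RebootScheduleSketch`, the method `rebootProbability`, and the `runInterval` parameter are hypothetical names. Each run schedules a reboot with probability 1 / (1 + runs remaining before 2x), so every run in the 1x-2x window schedules roughly the same share of hosts, and any host reaching 2x is scheduled with probability 1.

```java
import java.time.Duration;

/** Hypothetical sketch of per-run reboot scheduling; names are illustrative only. */
final class RebootScheduleSketch {

    /**
     * Probability that a single maintenance run schedules a reboot for a host.
     *
     * @param sinceLastReboot time since the host's previous reboot
     * @param rebootInterval  the 1x reboot interval (e.g. reboot-interval-in-days)
     * @param runInterval     assumed time between maintenance runs
     */
    static double rebootProbability(Duration sinceLastReboot, Duration rebootInterval, Duration runInterval) {
        Duration overdue = sinceLastReboot.minus(rebootInterval);
        if (overdue.isNegative()) return 0.0; // 0x-1x: never schedule

        // Time left until the 2x boundary, clamped at zero once the boundary is passed.
        long secondsRemaining = Math.max(0, rebootInterval.getSeconds() - overdue.getSeconds());
        double runsRemaining = secondsRemaining / (double) runInterval.getSeconds();

        // 1/(1+n) with n runs remaining spreads scheduling evenly over the runs,
        // and reaches probability 1 at (and beyond) the 2x boundary.
        return 1.0 / (1.0 + runsRemaining);
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofDays(30);
        Duration run = Duration.ofDays(15);
        for (int days : new int[] {15, 30, 45, 60, 75}) {
            System.out.println(days + " days: " + rebootProbability(Duration.ofDays(days), interval, run));
        }
    }
}
```

A caller would compare the returned probability against a random draw on each run, for instance `random.nextDouble() < rebootProbability(...)`.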
With the old algorithm, the expected time before a reboot was scheduled was 1.33
reboot intervals, with no guaranteed upper bound. The new algorithm has an
expected time before reboot of 1.5 reboot intervals, bounded at 2x. The old
algorithm had a higher probability of reboot just past the 1x boundary, but a
lower probability than the new one as one nears 2x.
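The 1.5 figure for the new algorithm follows from the uniform distribution. Here T is a symbol introduced only for illustration, denoting the time from the previous reboot until a reboot is scheduled, measured in reboot intervals:

```latex
T \sim \mathrm{Uniform}(1, 2)
\quad\Longrightarrow\quad
\mathbb{E}[T] = \int_{1}^{2} t \, dt = \frac{2^{2} - 1^{2}}{2} = 1.5
```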
So I think the new algorithm also has the nice property of avoiding a thundering
herd, perhaps even more so than the old: for instance, when most hosts in a zone
are rebooted at the same time, they would tend to be rescheduled for reboot
closer to each other with the old algorithm than with the new.
Enabling the new algorithm should also not lead to too many hosts suddenly
having to reboot, or at least that's what I hope. I can sanity-check this
before merge - I guess it would be dominated by the number of hosts in
west/east that are beyond 2x.
Diffstat (limited to 'flags')
-rw-r--r-- | flags/src/main/java/com/yahoo/vespa/flags/Flags.java | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/flags/src/main/java/com/yahoo/vespa/flags/Flags.java b/flags/src/main/java/com/yahoo/vespa/flags/Flags.java
index 78aec5285cf..272e96903f8 100644
--- a/flags/src/main/java/com/yahoo/vespa/flags/Flags.java
+++ b/flags/src/main/java/com/yahoo/vespa/flags/Flags.java
@@ -113,8 +113,9 @@ public class Flags {
     public static final UnboundIntFlag REBOOT_INTERVAL_IN_DAYS = defineIntFlag(
             "reboot-interval-in-days", 30,
-            "The reboot interval in days.",
-            "Takes effect on start of config server / controller");
+            "No reboots are scheduled 0x-1x reboot intervals after the previous reboot, while reboot is " +
+            "scheduled evenly distributed in the 1x-2x range (and naturally guaranteed at the 2x boundary).",
+            "Takes effect on next run of NodeRebooter");

     public static final UnboundBooleanFlag ENABLE_DYNAMIC_PROVISIONING = defineFeatureFlag(
             "enable-dynamic-provisioning", false,