Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Move compatible check to enums | Jon Bratseth | 2019-11-14 | 1 | -3/+23 |
| | |||||
* | Merge pull request #11284 from vespa-engine/hakonhall/allow-overriding-noderepositorymaintenance-durations-with-flag | Håkon Hallingstad | 2019-11-14 | 4 | -70/+123 |
|\ | | | | | | | | | Add flag to control reboot interval | ||||
| * | Do not use discrete probabilities | Håkon Hallingstad | 2019-11-14 | 1 | -6/+3 |
| | | |||||
| * | Read reboot-interval-in-days dynamically | Håkon Hallingstad | 2019-11-13 | 4 | -51/+112 |
| | But also: this changes the distribution of the scheduling past 1x the reboot interval: hosts will be scheduled for reboot evenly distributed over the whole 1x-2x range, and are thereby guaranteed to be scheduled at 2x at the latest. With the old algorithm, the expected time before a reboot was scheduled was 1.33 reboot intervals, with no guaranteed upper bound. The new algorithm has an expected time before reboot of 1.5 reboot intervals, bounded at 2x. The old algorithm had a higher probability of scheduling a reboot on passing the 1x boundary, and a lower probability than the new one as a host nears 2x. So I think the new algorithm also has the nice property of avoiding a thundering herd, perhaps even more so than the old: for instance, when most hosts in a zone are rebooted at the same time, they would tend to be rescheduled for reboot closer to each other with the old algorithm than with the new. Enabling the new algorithm should also not lead to too many hosts suddenly having to reboot, or at least that's what I hope. I can sanity-check this before merge - I guess it would be dominated by the number of hosts in west/east that are beyond 2x. | ||||
| * | Make flag only for NodeRebooter, and remove fetching from environment | Håkon Hallingstad | 2019-11-13 | 1 | -25/+15 |
| | | |||||
| * | Allow overriding NodeRepositoryMaintenance durations with flag | Håkon Hallingstad | 2019-11-13 | 1 | -17/+22 |
| | | |||||
* | | Explicit NodeResources defaults | Jon Bratseth | 2019-11-13 | 22 | -23/+24 |
| | | |||||
* | | Pass and receive remoteStorage | Jon Bratseth | 2019-11-13 | 1 | -1/+0 |
| | | |||||
* | | Compute free resources using just numbers | Jon Bratseth | 2019-11-13 | 7 | -18/+15 |
| | | |||||
* | | Add NodeResources.storageType | Jon Bratseth | 2019-11-13 | 32 | -58/+142 |
|/ | |||||
* | Merge pull request #11251 from vespa-engine/hmusum/log-parent-hosts-not-ready | Harald Musum | 2019-11-08 | 1 | -3/+10 |
|\ | | | | | Log parent hosts that are not ready | ||||
| * | Fix logger name | Harald Musum | 2019-11-08 | 1 | -1/+1 |
| | | |||||
| * | Log parent hosts that are not ready | Harald Musum | 2019-11-08 | 1 | -3/+10 |
| | | |||||
* | | Merge pull request #11249 from vespa-engine/olaa/use-requested-resources-for-allocation-failures | Andreas Eriksen | 2019-11-08 | 1 | -1/+3 |
|\ \ | |/ |/| | | | | Use requested resources when finding allocation failures | ||||
| * | Use requested resources when finding allocation failures | Ola Aunrønning | 2019-11-08 | 1 | -1/+3 |
| | | |||||
* | | make required disk speed patchable | andreer | 2019-11-08 | 1 | -0/+12 |
|/ | |||||
* | Pass requestedResources through HostResources | Jon Bratseth | 2019-11-06 | 1 | -2/+2 |
| | |||||
* | add metrics for application allocations | andreer | 2019-11-05 | 2 | -0/+44 |
| | |||||
* | Preserve resources decided implicitly by policies as requested | Jon Bratseth | 2019-11-04 | 2 | -6/+6 |
| | |||||
* | Remove debug log | Valerij Fredriksen | 2019-10-31 | 1 | -2/+0 |
| | |||||
* | -1 typo | Jon Bratseth | 2019-10-31 | 21 | -21/+21 |
| | |||||
* | Fix typos | Jon Bratseth | 2019-10-31 | 21 | -22/+22 |
| | |||||
* | Output requested node resources | Jon Bratseth | 2019-10-31 | 22 | -0/+40 |
| | |||||
* | Merge pull request #11174 from vespa-engine/andreer/lower-capacity-report-interval | Andreas Eriksen | 2019-10-31 | 1 | -1/+1 |
|\ | | | | | | | | | lower capacity report interval | ||||
| * | lower capacity report interval | andreer | 2019-10-31 | 1 | -1/+1 |
| | | | | | | | | mainly in order to clear the alert faster once we get it fixed | ||||
* | | Revert "Revert "Add devhost node type"" | Martin Polden | 2019-10-31 | 2 | -0/+4 |
| | | |||||
* | | Check capacity by requested, not assigned resources | Jon Bratseth | 2019-10-30 | 1 | -16/+24 |
| | | |||||
* | | Remember requested resources on nodes | Jon Bratseth | 2019-10-30 | 18 | -72/+148 |
| | | | | | | | | | | | | This may be different from assigned resources, e.g. in that requested resources may specify DiskSpeed.any while assigned resources always have a definite disk speed. | ||||
* | | use node repo disk speed as-is | kkraune | 2019-10-25 | 1 | -7/+2 |
| | | |||||
* | | Don't rebalance in AWS | Jon Bratseth | 2019-10-24 | 4 | -3/+11 |
| | | |||||
* | | Add metric hostedVespa.docker.skew to measure average host skew | Jon Bratseth | 2019-10-22 | 4 | -2/+31 |
| | | |||||
* | | Merge pull request #11031 from vespa-engine/hakonhall/return-504-gateway-timeout-on-lock-timeout-from-orchestrator | Harald Musum | 2019-10-21 | 1 | -0/+4 |
|\ \ | | | | | | | | | | | | | Return 504 Gateway Timeout on lock timeout from Orchestrator | ||||
| * | | Return 504 Gateway Timeout on lock timeout from Orchestrator | Håkon Hallingstad | 2019-10-21 | 1 | -0/+4 |
| | | | |||||
* | | | Merge pull request #11020 from vespa-engine/hakonhall/use-mockito-core-310 | Martin Polden | 2019-10-21 | 7 | -18/+14 |
|\ \ \ | |/ / |/| | | Use mockito-core 3.1.0 | ||||
| * | | Use mockito-core 3.1.0 | Håkon Hallingstad | 2019-10-18 | 7 | -18/+14 |
| | | | |||||
* | | | Only consider tenant nodes | Jon Bratseth | 2019-10-17 | 1 | -2/+2 |
| | | | |||||
* | | | Check node just once per iteration | Jon Bratseth | 2019-10-17 | 1 | -1/+1 |
| | | | |||||
* | | | Fix indentation | Jon Bratseth | 2019-10-17 | 1 | -2/+2 |
| | | | |||||
* | | | Only consider allocatable hosts | Jon Bratseth | 2019-10-17 | 3 | -9/+11 |
| | | | |||||
* | | | Remove obsoleted argument | Jon Bratseth | 2019-10-17 | 3 | -19/+18 |
| | | | |||||
* | | | Schedule node balancing acts | Jon Bratseth | 2019-10-17 | 17 | -46/+344 |
| | | | |||||
* | | | Refactor: Skew computation independent of node prioritization | Jon Bratseth | 2019-10-17 | 5 | -27/+43 |
|/ / | |||||
* | | Make final | Jon Bratseth | 2019-10-16 | 1 | -2/+2 |
| | | |||||
* | | Pick hosts for nodes to reduce resource allocation skew | Jon Bratseth | 2019-10-16 | 16 | -86/+304 |
| | | |||||
* | | Count OS upgrade event as a reboot in NodeRebooter | Martin Polden | 2019-10-15 | 2 | -2/+38 |
| | | | | | | | | | | Avoids unnecessary rebooting of nodes that recently upgraded their OS (and thus already rebooted). | ||||
* | | Always prepare and activate in InfraDeployerImpl | Valerij Fredriksen | 2019-10-10 | 2 | -158/+56 |
| | | |||||
* | | Move optimization up | Valerij Fredriksen | 2019-10-10 | 1 | -1/+2 |
| | | |||||
* | | Ignore unhealthy nodes when activating OS upgrades | Martin Polden | 2019-10-09 | 3 | -25/+40 |
| | | |||||
* | | Reduce interval of OsUpgradeActivator in CD | Martin Polden | 2019-10-04 | 1 | -1/+1 |
|/ | |||||
* | Merge pull request #10816 from vespa-engine/mpolden/container-core-restapi-types | Martin Polden | 2019-09-30 | 6 | -159/+9 |
|\ | | | | | Use response classes from container-core |
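
The message of the "Read reboot-interval-in-days dynamically" commit describes scheduling reboots evenly over the 1x-2x range, with an expected time of 1.5 reboot intervals and a hard bound at 2x. The following is a minimal sketch of one way to get that behaviour, not the code from the commit. It assumes a maintainer that runs at a fixed period and schedules each overdue host with probability 1 / (1 + runs remaining before 2x). The function names, the run period and the simulation are all hypothetical, chosen for illustration only.

```python
import random


def reboot_probability(overdue_s, reboot_interval_s, run_interval_s):
    """Chance of scheduling a reboot in this maintenance run (hypothetical helper).

    overdue_s is the host's time since last reboot minus one reboot interval.
    """
    if overdue_s < 0:
        return 0.0  # younger than 1x the reboot interval: never schedule
    seconds_remaining = max(0.0, reboot_interval_s - overdue_s)
    runs_remaining = seconds_remaining / run_interval_s
    # Each run in the 1x-2x window gets the same unconditional share of hosts;
    # at 2x there are no runs remaining, so the probability becomes 1.
    return 1.0 / (1.0 + runs_remaining)


def simulate_scheduling_time(reboot_interval_s, run_interval_s, rng):
    """Return the age, in reboot intervals, at which one host gets scheduled."""
    age_s = 0.0
    while True:
        overdue_s = age_s - reboot_interval_s
        p = reboot_probability(overdue_s, reboot_interval_s, run_interval_s)
        if rng.random() < p:
            return age_s / reboot_interval_s
        age_s += run_interval_s


# Assumed numbers: reboot interval of 100 time units, one maintenance run per unit.
rng = random.Random(42)
ages = [simulate_scheduling_time(100.0, 1.0, rng) for _ in range(5000)]
mean_age = sum(ages) / len(ages)
print(f"mean={mean_age:.3f} min={min(ages):.2f} max={max(ages):.2f}")
```

Under these assumptions every simulated host is scheduled between 1x and 2x, and the mean lands close to 1.5 reboot intervals, which matches the figures quoted in the commit message.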