path: root/node-repository
Commit message | Author | Age | Files | Lines
* Move compatible check to enumsJon Bratseth2019-11-141-3/+23
|
* Merge pull request #11284 from ↵Håkon Hallingstad2019-11-144-70/+123
|\
| |   vespa-engine/hakonhall/allow-overriding-noderepositorymaintenance-durations-with-flag
| |   Add flag to control reboot interval
| * Do not use discrete probabilitiesHåkon Hallingstad2019-11-141-6/+3
| |
| * Read reboot-interval-in-days dynamicallyHåkon Hallingstad2019-11-134-51/+112
| |   But also: Changes the distribution of the scheduling past 1x reboot
| |   interval: hosts will be scheduled for reboot evenly distributed over the
| |   whole 1x-2x range, and are thereby guaranteed to be scheduled at the
| |   latest at 2x.
| |
| |   With the old algorithm, the expected time before a reboot was scheduled
| |   was 1.33 reboot intervals, with no guaranteed upper bound. The new
| |   algorithm has an expected time before reboot of 1.5 reboot intervals,
| |   bounded at 2x. The old algorithm had a higher probability of reboot when
| |   passing the 1x boundary, but a lower probability than the new one as one
| |   nears 2x.
| |
| |   So I think the new algorithm also has the nice property of avoiding a
| |   thundering herd, perhaps even more so than the old: for instance, when
| |   most hosts in a zone are rebooted at the same time, they would tend to be
| |   rescheduled for reboot closer to each other with the old algorithm than
| |   with the new.
| |
| |   Enabling the new algorithm should also not lead to too many hosts
| |   suddenly having to reboot, or at least that's what I hope. I can
| |   sanity-check this before merge - I guess it would be dominated by the
| |   number of hosts in west/east that are beyond 2x.
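The scheduling property described in the commit message above (even distribution over the 1x-2x range, expected 1.5 reboot intervals, hard bound at 2x) can be illustrated with a small sketch. This is not taken from the commit itself: the class name `RebootScheduleSketch`, both method names, and the assumption of a maintainer that runs at a fixed `runInterval` are hypothetical. The sketch uses one possible technique, a per-run probability of 1 / (1 + runsRemaining), which makes every remaining run equally likely to be the one that schedules the reboot.

```java
import java.time.Duration;

// Hypothetical sketch, not the NodeRebooter implementation.
final class RebootScheduleSketch {

    /**
     * Probability of scheduling a reboot in the current maintenance run.
     * 0 before 1x the reboot interval, 1 at or past 2x, and otherwise
     * 1 / (1 + runsRemaining), where runsRemaining counts the runs left
     * before the 2x deadline.
     */
    static double rebootProbability(Duration sinceLastReboot,
                                    Duration rebootInterval,
                                    Duration runInterval) {
        Duration overdue = sinceLastReboot.minus(rebootInterval);
        if (overdue.isNegative()) return 0.0; // not yet past 1x
        long secondsRemaining =
                Math.max(0, rebootInterval.getSeconds() - overdue.getSeconds());
        double runsRemaining = secondsRemaining / (double) runInterval.getSeconds();
        return 1.0 / (1.0 + runsRemaining);
    }

    /**
     * Exact expected time until a reboot is scheduled, in units of the
     * reboot interval, assuming the first run happens exactly at 1x.
     */
    static double expectedRebootIntervals(Duration rebootInterval, Duration runInterval) {
        double stillUnscheduled = 1.0; // probability that no earlier run scheduled it
        double expected = 0.0;
        for (Duration since = rebootInterval; ; since = since.plus(runInterval)) {
            double p = rebootProbability(since, rebootInterval, runInterval);
            double intervals = since.getSeconds() / (double) rebootInterval.getSeconds();
            expected += stillUnscheduled * p * intervals;
            stillUnscheduled *= 1.0 - p;
            if (p >= 1.0) return expected; // reached the 2x bound
        }
    }

    public static void main(String[] args) {
        Duration rebootInterval = Duration.ofDays(30); // assumed value for illustration
        Duration runInterval = Duration.ofDays(1);     // assumed value for illustration
        System.out.println(expectedRebootIntervals(rebootInterval, runInterval));
    }
}
```

With N runs remaining at the 1x boundary, the chance of reaching run k unscheduled is (N - k + 1) / (N + 1), and the chance of scheduling there is 1 / (N - k + 1), so each run has the same overall weight 1 / (N + 1). That is what gives the even spread and the 1.5x mean.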
| * Make flag only for NodeRebooter, and remove fetching from environmentHåkon Hallingstad2019-11-131-25/+15
| |
| * Allow overriding NodeRepositoryMaintenance durations with flagHåkon Hallingstad2019-11-131-17/+22
| |
* | Explicit NodeResources defaultsJon Bratseth2019-11-1322-23/+24
| |
* | Pass and receive remoteStorageJon Bratseth2019-11-131-1/+0
| |
* | Compute free resources using just numbersJon Bratseth2019-11-137-18/+15
| |
* | Add NodeResources.storageTypeJon Bratseth2019-11-1332-58/+142
|/
* Merge pull request #11251 from vespa-engine/hmusum/log-parent-hosts-not-readyHarald Musum2019-11-081-3/+10
|\
| |   Log parent hosts that are not ready
| * Fix logger nameHarald Musum2019-11-081-1/+1
| |
| * Log parent hosts that are not readyHarald Musum2019-11-081-3/+10
| |
* | Merge pull request #11249 from ↵Andreas Eriksen2019-11-081-1/+3
|\ \
| |/
|/|
| |   vespa-engine/olaa/use-requested-resources-for-allocation-failures
| |   Use requested resources when finding allocation failures
| * Use requested resources when finding allocation failuresOla Aunrønning2019-11-081-1/+3
| |
* | make required disk speed patchableandreer2019-11-081-0/+12
|/
* Pass requestedResources through HostResourcesJon Bratseth2019-11-061-2/+2
|
* add metrics for application allocationsandreer2019-11-052-0/+44
|
* Preserve resources decided implicitly by policies as requestedJon Bratseth2019-11-042-6/+6
|
* Remove debug logValerij Fredriksen2019-10-311-2/+0
|
* -1 typoJon Bratseth2019-10-3121-21/+21
|
* Fix typosJon Bratseth2019-10-3121-22/+22
|
* Output requested node resourcesJon Bratseth2019-10-3122-0/+40
|
* Merge pull request #11174 from ↵Andreas Eriksen2019-10-311-1/+1
|\
| |   vespa-engine/andreer/lower-capacity-report-interval
| |   lower capacity report interval
| * lower capacity report intervalandreer2019-10-311-1/+1
| |   mainly in order to clear the alert faster once we get it fixed
* | Revert "Revert "Add devhost node type""Martin Polden2019-10-312-0/+4
| |
* | Check capacity by requested, not assigned resourcesJon Bratseth2019-10-301-16/+24
| |
* | Remember requested resources on nodesJon Bratseth2019-10-3018-72/+148
| |   This may be different from assigned resources, e.g. in that requested
| |   resources may specify DiskSpeed.any while assigned resources always have
| |   a definite disk speed.
* | use node repo disk speed as-iskkraune2019-10-251-7/+2
| |
* | Don't rebalance in AWSJon Bratseth2019-10-244-3/+11
| |
* | Add metric hostedVespa.docker.skew to measure average host skewJon Bratseth2019-10-224-2/+31
| |
* | Merge pull request #11031 from ↵Harald Musum2019-10-211-0/+4
|\ \
| | |   vespa-engine/hakonhall/return-504-gateway-timeout-on-lock-timeout-from-orchestrator
| | |   Return 504 Gateway Timeout on lock timeout from Orchestrator
| * | Return 504 Gateway Timeout on lock timeout from OrchestratorHåkon Hallingstad2019-10-211-0/+4
| | |
* | | Merge pull request #11020 from vespa-engine/hakonhall/use-mockito-core-310Martin Polden2019-10-217-18/+14
|\ \ \
| |/ /
|/| |   Use mockito-core 3.1.0
| * | Use mockito-core 3.1.0Håkon Hallingstad2019-10-187-18/+14
| | |
* | | Only consider tenant nodesJon Bratseth2019-10-171-2/+2
| | |
* | | Check node just once per iterationJon Bratseth2019-10-171-1/+1
| | |
* | | Fix indentationJon Bratseth2019-10-171-2/+2
| | |
* | | Only consider allocatable hostsJon Bratseth2019-10-173-9/+11
| | |
* | | Remove obsoleted argumentJon Bratseth2019-10-173-19/+18
| | |
* | | Schedule node balancing actsJon Bratseth2019-10-1717-46/+344
| | |
* | | Refactor: Skew computation independent of node prioritizationJon Bratseth2019-10-175-27/+43
|/ /
* | Make finalJon Bratseth2019-10-161-2/+2
| |
* | Pick hosts for nodes to reduce resource allocation skewJon Bratseth2019-10-1616-86/+304
| |
* | Count OS upgrade event as a reboot in NodeRebooterMartin Polden2019-10-152-2/+38
| |   Avoids unnecessary rebooting of nodes that recently upgraded their OS
| |   (and thus already rebooted).
* | Always prepare and activate in InfraDeployerImplValerij Fredriksen2019-10-102-158/+56
| |
* | Move optimization upValerij Fredriksen2019-10-101-1/+2
| |
* | Ignore unhealthy nodes when activating OS upgradesMartin Polden2019-10-093-25/+40
| |
* | Reduce interval of OsUpgradeActivator in CDMartin Polden2019-10-041-1/+1
|/
* Merge pull request #10816 from vespa-engine/mpolden/container-core-restapi-typesMartin Polden2019-09-306-159/+9
|\
| |   Use response classes from container-core