path: root/node-repository/src/main/java/com/yahoo/vespa/hosted/provision/maintenance/NodeFailer.java
Commit message [Author, Age, Files, Lines]
* Reset downtime at resume, 2. try [Håkon Hallingstad, 2024-01-10, 1, -6/+23]
|
* Revert "Reset downtime at resume" [Harald Musum, 2024-01-06, 1, -23/+6]
|
* Reset downtime at resume [Håkon Hallingstad, 2024-01-05, 1, -6/+23]
|
* Add javadoc [Martin Polden, 2023-10-16, 1, -12/+13]
|
* Update copyright [Jon Bratseth, 2023-10-09, 1, -1/+1]
|
* Add enums for infrastructure and add to vespametricsset as needed for infrastructure services. [yngveaasheim, 2023-07-31, 1, -2/+3]
|
* Ensure correct lock order when failing tenant hosts [jonmv, 2023-07-14, 1, -25/+40]
|
* Add two TODOs about locks taken in the wrong order [jonmv, 2023-07-12, 1, -1/+1]
|
* Don't fail nodes undergoing CMR (#26743) [Ola Aunrønning, 2023-04-14, 1, -1/+15]
|
* maintainer success factor baseline deviation [bjormel, 2023-03-29, 1, -1/+1]
|
* Do not hold application lock while replacing failing node [Martin Polden, 2023-03-10, 1, -27/+36]
|
* Reduce NodeFailer activate timeout [Håkon Hallingstad, 2022-12-21, 1, -2/+4]
|
* Revert "Revert collect(Collectors.toList())" [Henning Baldersheim, 2022-12-04, 1, -1/+1]
|
* Revert collect(Collectors.toList()) [Henning Baldersheim, 2022-12-04, 1, -1/+1]
|
* Merge branch 'master' into bratseth/discard-warmup-metrics [Jon Bratseth, 2022-12-03, 1, -1/+1]
|\
| * collect(Collectors.toList()) -> toList() [Henning Baldersheim, 2022-12-02, 1, -1/+1]
| |
* | Discard metrics right after restart [Jon Bratseth, 2022-12-03, 1, -10/+5]
|/
* Allow 4% of nodes to fail before throttling [Martin Polden, 2022-12-01, 1, -1/+1]
|
* Reapply "Remove HostLivenessTracker" [Valerij Fredriksen, 2022-10-14, 1, -60/+3]
|   This reverts commit a5ed12b351806b187613457b58982ca67f537594.
* Revert "Remove HostLivenessTracker" [Valerij Fredriksen, 2022-10-13, 1, -3/+60]
|
* Remove node failing for ready nodes [Valerij Fredriksen, 2022-10-13, 1, -60/+3]
|
* Only use wantToFail just before activate [Håkon Hallingstad, 2022-07-11, 1, -5/+10]
|
* Revert "Revert "Avoid the host lock while failing the children"" [Håkon Hallingstad, 2022-07-11, 1, -46/+54]
|
* Revert "Avoid the host lock while failing the children" [Håkon Hallingstad, 2022-07-11, 1, -54/+46]
|   This reverts commit 2cdaef56e18ace2ee2269d28f959f5a534bd68ee.
* Revert update of comment [Håkon Hallingstad, 2022-07-11, 1, -2/+2]
|
* Define main-chain-graph flag [Håkon Hallingstad, 2022-07-08, 1, -2/+2]
|
* Avoid the host lock while failing the children [Håkon Hallingstad, 2022-07-05, 1, -46/+54]
|
* Read nodes less [Jon Bratseth, 2022-04-22, 1, -7/+4]
|
* Keep a chronological log of events per node [Martin Polden, 2022-04-19, 1, -1/+1]
|
* Revert "Preserve all node events" [Jon Bratseth, 2022-04-12, 1, -9/+4]
|
* Fix after review feedback [Martin Polden, 2022-04-12, 1, -4/+4]
|
* Preserve all node events [Martin Polden, 2022-04-12, 1, -1/+6]
|   Node events are now limited by a total size limit, instead of one per type.
* Increase node failure throttling from 2 to 3 % [Jon Bratseth, 2022-04-08, 1, -1/+1]
|
* Do not allocate nodes to suspended hosts [Valerij Fredriksen, 2022-02-03, 1, -14/+3]
|
* Add Orchestrator to NodeRepository [Valerij Fredriksen, 2022-02-03, 1, -7/+3]
|
* Merge pull request #20938 from vespa-engine/bratseth/modular-profiles [Jon Bratseth, 2022-01-26, 1, -1/+1]
|\  Bratseth/modular profiles
| * No functional changes [Jon Bratseth, 2022-01-25, 1, -1/+1]
| |
* | Increase down grace time while nodes are suspended [Jon Bratseth, 2022-01-25, 1, -28/+42]
| |
* | No functional changes [Jon Bratseth, 2022-01-25, 1, -48/+50]
|/
* Remove dead code [Martin Polden, 2021-10-25, 1, -16/+0]
|
* Update 2017 copyright notices. [gjoranv, 2021-10-07, 1, -1/+1]
|
* Update node-repository/src/main/java/com/yahoo/vespa/hosted/provision/maintenance/NodeFailer.java [Håkon Hallingstad, 2021-08-16, 1, -3/+0]
|   Co-authored-by: Valerij Fredriksen <freva@users.noreply.github.com>
* Do not fail ready nodes w/o recent config requests [Håkon Hallingstad, 2021-08-16, 1, -11/+10]
|   This code was used to support non-Docker tenant hosts, but now only affects ready cfg and proxy containers which may not even exist and cannot possibly issue config requests (when in ready).
* Revert "Revert "Emit a success factor from maintainers"" [Jon Bratseth, 2021-06-06, 1, -3/+13]
|   This reverts commit cd1b747b4f65fa3a6ed6aace23235db7591638c5.
* Revert "Emit a success factor from maintainers" [Arnstein Ressem, 2021-06-04, 1, -13/+3]
|
* Return success factor [Jon Bratseth, 2021-06-04, 1, -3/+13]
|
* Never throttle failing of children on failed hosts [Martin Polden, 2021-04-27, 1, -14/+15]
|
* Node failing improvements [Jon Bratseth, 2021-04-12, 1, -4/+21]
|   - Fail hosts that wants to fail and do not have active children
|   - Clear want to fail on failing in case the nodes are later reactivated
* Move nodes to 'failed' during activate [Jon Bratseth, 2021-04-08, 1, -7/+10]
|
* Less Docker [Martin Polden, 2021-02-18, 1, -1/+1]
|