path: root/node-repository/src/main/java/com/yahoo/vespa/hosted/provision/maintenance/NodeFailer.java
Commit message | Author | Age | Files | Lines
* Handle node disappearing after taking lock | Martin Polden | 2020-05-27 | 1 | -28/+21
* only throttle node failures when nodes are still in state "failed" | andreer | 2020-05-22 | 1 | -0/+1
* Use vespajlib maintenance package in node-repository | Martin Polden | 2020-04-29 | 1 | -2/+1
* LogLevel.INFO -> Level.INFO | gjoranv | 2020-04-25 | 1 | -1/+1
* Import java.util.logging.Level instead of com.yahoo.log.LogLevel | gjoranv | 2020-04-25 | 1 | -1/+1
* Avoid building lots of ApplicationInstances | Håkon Hallingstad | 2020-03-08 | 1 | -1/+1
* Moved to more specific methods on ServiceMonitor | Håkon Hallingstad | 2020-02-28 | 1 | -2/+1
* Prepare for setting PERMANENTLY_DOWN | Håkon Hallingstad | 2020-01-30 | 1 | -1/+1
* Record the specific change agent in the node history | Jon Bratseth | 2020-01-23 | 1 | -1/+1
* Unreserve hosts with allocations | Jon Bratseth | 2020-01-22 | 1 | -1/+1
* Remove mitigation for "NodeFailer" agent | Harald Musum | 2020-01-08 | 1 | -5/+5
* Use static factory method instead of constructor to signal copying | Martin Polden | 2020-01-03 | 1 | -1/+1
* Remove hardwareFailure and hardwareDivergence from node-repo maintainers | Valerij Fredriksen | 2019-09-19 | 1 | -20/+8
* Fail readying a node with a hard fail report | Håkon Hallingstad | 2019-09-11 | 1 | -1/+1
* Add throttled host metric | Valerij Fredriksen | 2019-08-07 | 1 | -4/+11
* Nonfunctional changes only | Jon Bratseth | 2019-08-05 | 1 | -4/+4
* Revert "Return 409 with error code TRANSIENT_ERROR when getting TransientExce... | Harald Musum | 2019-08-01 | 1 | -1/+1
* Move some exceptions to its own package (making them not part of public API) | Harald Musum | 2019-08-01 | 1 | -1/+1
* Ignore TransientException in NodeFailer and RetiredExpirer | Valerij Fredriksen | 2019-06-29 | 1 | -2/+8
* Remove nodeAdminInContainer from configserver.def | Valerij Fredriksen | 2019-06-01 | 1 | -6/+2
* Require lock reference for all write operations | Martin Polden | 2019-05-15 | 1 | -5/+5
* Disallow failing config/controller(hosts) | Valerij Fredriksen | 2019-05-09 | 1 | -5/+13
* Non-functional cleanup | Valerij Fredriksen | 2019-05-06 | 1 | -4/+4
* Remove unused variable | Valerij Fredriksen | 2019-05-06 | 1 | -4/+0
* Move JobControl and InfrastructureVersions to NodeRepository | Valerij Fredriksen | 2019-05-06 | 1 | -2/+1
* Use the type of the node report | Håkon Hallingstad | 2019-02-28 | 1 | -21/+8
* Merge pull request #8545 from vespa-engine/hakonhall/stop-using-agentnodefail... | Jon Bratseth | 2019-02-18 | 1 | -5/+5
|\
| * Stop using Agent.NodeFailer until v6 is gone | Håkon Hallingstad | 2019-02-18 | 1 | -5/+5
* | Require all child nodes to be suspended in NodeFailer | Håkon Hallingstad | 2019-02-18 | 1 | -1/+11
|/
* Only fail tenant host nodes with failure reports | Håkon Hallingstad | 2019-02-18 | 1 | -6/+8
* Remove hardwareDivergence from node-admin | Håkon Hallingstad | 2019-02-18 | 1 | -0/+1
* Fail instead of retire on failure report in NodeFailer | Håkon Hallingstad | 2019-02-15 | 1 | -64/+14
* Use valerijs super stream | Håkon Hallingstad | 2019-02-13 | 1 | -14/+18
* 10s timeout | Håkon Hallingstad | 2019-02-13 | 1 | -1/+1
* Rename to activeNodes | Håkon Hallingstad | 2019-02-13 | 1 | -3/+3
* Max 1 active host with wantToRetire, and fix NodeFailer.hasHardwareIssue | Håkon Hallingstad | 2019-02-12 | 1 | -11/+18
* Also fail on badDiskType, badInterfaceSpeed, badCpuCount | Håkon Hallingstad | 2019-02-12 | 1 | -1/+7
* Retire/fail hosts with failure reports | Håkon Hallingstad | 2019-02-12 | 1 | -10/+101
* Nonfunctional changes only | Jon Bratseth | 2019-02-05 | 1 | -1/+1
* Implement Iterable in NodeList | Valerij Fredriksen | 2019-01-30 | 1 | -1/+1
* Remove duplicated child node filtering | Martin Polden | 2019-01-14 | 1 | -1/+1
* Clarify physical nodes | Martin Polden | 2019-01-03 | 1 | -1/+1
* Increase allowed to fail fraction | Martin Polden | 2019-01-03 | 1 | -1/+1
* Always allow 2 parent hosts to fail in a 24 hour period | Martin Polden | 2019-01-03 | 1 | -19/+29
* Include throttled active nodes | Martin Polden | 2018-12-06 | 1 | -6/+15
* Emit metric for throttled node failures | Martin Polden | 2018-12-06 | 1 | -5/+17
* Fail nodes because of hardware failure | Valerij Fredriksen | 2018-08-21 | 1 | -0/+14
* Return active nodes that should be failed with reason | Valerij Fredriksen | 2018-08-21 | 1 | -31/+38
* Simplify common history check | Valerij Fredriksen | 2018-08-21 | 1 | -15/+8
* Nonfunctional changes only | Jon Bratseth | 2018-03-19 | 1 | -2/+3