Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Revert "Bjorncs/rewrite config convergence checker client" | Jon Marius Venstad | 2020-11-07 | 2 | -2/+0 |
| | |||||
* | Deprecate VespaClientBuilderFactory + VespaJerseyJaxRsClientFactory | Bjørn Christian Seime | 2020-11-06 | 2 | -0/+2 |
| | |||||
* | Remove locating code | Håkon Hallingstad | 2020-10-20 | 1 | -0/+3 |
| | |||||
* | Close orchestrator locks | Håkon Hallingstad | 2020-10-20 | 3 | -60/+99 |
| | |||||
* | Remove hack to listen to ephemeral port | Bjørn Christian Seime | 2020-10-16 | 1 | -15/+1 |
| | |||||
* | Move lock metrics to MetricsReporter | Håkon Hallingstad | 2020-10-03 | 1 | -1/+1 |
| | Adds two new metrics: (1) The load of acquiring each lock path: the average number of threads waiting to acquire the lock within the last minute (or unit of time), aka the lock queue (depth). (2) The load of the lock for each lock path: the average number of threads holding the lock within the last minute (or unit of time), which is always <= 1, aka the lock utilization. Changes LockCounters to LockMetrics, and exports those once every minute through MetricReporter, which is designed for this. | ||||
* | Add metrics to lock attempts | Håkon Hallingstad | 2020-10-01 | 2 | -21/+5 |
| | |||||
* | 30s down-moratorium before allowing suspension | Håkon Hallingstad | 2020-09-18 | 17 | -75/+310 |
| | |||||
* | Orchestrator should assume 3 controllers | Håkon Hallingstad | 2020-06-22 | 1 | -1/+1 |
| | |||||
* | Ignore missing children from optimistic read of suspended hosts | Håkon Hallingstad | 2020-05-15 | 1 | -13/+26 |
| | Also: Remove test for existence of path, which would normally turn up in the negative, and instead catch NoNodeException on the next if path does not exist, at the expense of exception thrown/caught. This should be cheaper than actually hitting ZK. | ||||
* | Replace remaining LogLevel.<level> with corresponding Level | gjoranv | 2020-04-25 | 1 | -1/+1 |
| | |||||
* | LogLevel.ERROR -> Level.SEVERE | gjoranv | 2020-04-25 | 2 | -3/+3 |
| | |||||
* | LogLevel.WARNING -> Level.WARNING | gjoranv | 2020-04-25 | 2 | -2/+2 |
| | |||||
* | LogLevel.INFO -> Level.INFO | gjoranv | 2020-04-25 | 1 | -6/+6 |
| | |||||
* | LogLevel.DEBUG -> Level.FINE | gjoranv | 2020-04-25 | 5 | -19/+19 |
| | |||||
* | Import java.util.logging.Level instead of com.yahoo.log.LogLevel | gjoranv | 2020-04-25 | 9 | -9/+9 |
| | |||||
* | Revert "Reduce host admin suspension concurrency from 20% to 10%" | Håkon Hallingstad | 2020-04-08 | 2 | -2/+2 |
| | |||||
* | Reduce logging in service-monitor and orchestrator | Håkon Hallingstad | 2020-03-23 | 4 | -17/+0 |
| | |||||
* | Merge pull request #12520 from vespa-engine/hakonhall/always-do-status-service-cleanup | Valerij Fredriksen | 2020-03-21 | 6 | -51/+26 |
|\ | Always do status service cleanup | ||||
| * | Always do status service cleanup | Håkon Hallingstad | 2020-03-09 | 6 | -51/+26 |
| | | |||||
* | | Silence orchestrator | Håkon Hallingstad | 2020-03-16 | 1 | -11/+11 |
| | | |||||
* | | Reduce host admin suspension concurrency from 20% to 10% | Håkon Hallingstad | 2020-03-13 | 2 | -2/+2 |
|/ | |||||
* | Avoid building lots of ApplicationInstances | Håkon Hallingstad | 2020-03-08 | 3 | -18/+23 |
| | Avoid building a full ApplicationInstance for each node... (1) for all nodes in the node repo when reporting metrics repo every minute, and (2) for all nodes in any /nodes/v1/node response | ||||
* | Throw if accessing duper model while holding status service application lock | Håkon Hallingstad | 2020-03-07 | 8 | -21/+113 |
| | When the duper model is updated, ZooKeeper is atomically updated to e.g. remove extraneous hosts. This is done by acquiring the duper model lock first, then the relevant application lock in the status service. Acquiring these two locks in the reverse order may lead to a deadlock. This PR throws an IllegalStateException when detecting the current thread is about to acquire the duper model lock when the current thread has acquired the application lock. | ||||
* | Remove service model cache | Håkon Hallingstad | 2020-03-06 | 2 | -6/+2 |
| | |||||
* | Support cleanup of status service | Håkon Hallingstad | 2020-03-05 | 14 | -60/+313 |
| | |||||
* | Remove InstanceLookupService | Håkon Hallingstad | 2020-03-03 | 14 | -191/+122 |
| | The lower-level methods on ServiceMonitor have removed the need for InstanceLookupService. | ||||
* | Rename MutableStatusService to ApplicationLock | Håkon Hallingstad | 2020-03-02 | 15 | -158/+136 |
| | The result of acquiring the application lock in the status service is now named ApplicationLock instead of MutableStatusService. | ||||
* | Align names with status service | Håkon Hallingstad | 2020-03-02 | 17 | -208/+221 |
| | |||||
* | Merge pull request #12390 from vespa-engine/hakonhall/extract-HostInfo-ops | Håkon Hallingstad | 2020-03-02 | 2 | -94/+129 |
|\ | Extract low-level ZK HostInfo ops to HostInfosServiceImpl | ||||
| * | Update orchestrator/src/main/java/com/yahoo/vespa/orchestrator/status/HostInfosServiceImpl.java | Håkon Hallingstad | 2020-03-02 | 1 | -2/+2 |
| | Co-Authored-By: Valerij Fredriksen <freva@users.noreply.github.com> | ||||
| * | Extract low-level ZK HostInfo ops to HostInfosServiceImpl | Håkon Hallingstad | 2020-02-27 | 2 | -94/+129 |
| | | |||||
* | | Only build part of application instance for host resource | Håkon Hallingstad | 2020-02-28 | 5 | -20/+16 |
| | | |||||
* | | Moved to more specific methods on ServiceMonitor | Håkon Hallingstad | 2020-02-28 | 8 | -13/+39 |
|/ | |||||
* | Improve suspension denied reason | Håkon Hallingstad | 2020-02-24 | 6 | -59/+64 |
| | |||||
* | Remove use of fest-assert | Bjørn Christian Seime | 2020-02-24 | 1 | -11/+11 |
| | Motivation: reduce the number of 'fluent test assertion' libraries in use. | ||||
* | Merge pull request #12275 from vespa-engine/hakonhall/remove-unnecessary-gethostinfo | Valerij Fredriksen | 2020-02-24 | 4 | -11/+3 |
|\ | Remove unnecessary getHostInfo | ||||
| * | Remove unnecessary getHostInfo | Håkon Hallingstad | 2020-02-20 | 4 | -11/+3 |
| | | |||||
* | | Return 409 on Orchestrator timeout instead of 504 | Håkon Hallingstad | 2020-02-23 | 3 | -8/+8 |
| | | |||||
* | | Remove large-orchestrator-locks flag | Håkon Hallingstad | 2020-02-23 | 4 | -27/+6 |
|/ | |||||
* | Add host info to orchestrator REST API | Håkon Hallingstad | 2020-02-17 | 7 | -24/+55 |
| | |||||
* | Support setting PERMANENTLY_DOWN at end of retirement | Håkon Hallingstad | 2020-02-07 | 7 | -27/+68 |
| | |||||
* | Skip removing status nodes at old zk paths | Håkon Hallingstad | 2020-02-06 | 1 | -46/+20 |
| | |||||
* | Reduce access logging | Håkon Hallingstad | 2020-02-05 | 1 | -0/+1 |
| | Avoids writing access logs in various tests. 1. Disables by-default access logging with Application, since it is used in unit tests. 2. However, many tests create an additional DeployState, which renders this ineffective, so this PR also explicitly disables access logging in services.xml of some tests. (1) might be unnecessary if we have to do (2) everywhere anyway, but this is not clear to me. | ||||
* | Remove stray test | Håkon Hallingstad | 2020-02-05 | 1 | -10/+0 |
| | |||||
* | Keep track of locks in OrchestratorContext | Håkon Hallingstad | 2020-02-04 | 5 | -47/+138 |
| | |||||
* | Support large orchestrator lock | Håkon Hallingstad | 2020-02-03 | 9 | -50/+351 |
| | |||||
* | Avoid wrapping timeout exception (causing 504) with batch internal error (causing 500) | Håkon Hallingstad | 2020-02-03 | 1 | -0/+2 |
| | |||||
* | Remove unused suspendedHostnames | Håkon Hallingstad | 2020-01-30 | 5 | -33/+7 |
| | |||||
* | Prepare for setting PERMANENTLY_DOWN | Håkon Hallingstad | 2020-01-30 | 14 | -43/+95 |
| |
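
The lock-ordering guard described in the 2020-03-07 commit ("Throw if accessing duper model while holding status service application lock") can be illustrated with a minimal sketch. This is not the orchestrator's code: the class `LockOrderGuard`, its two methods, and the use of plain in-process `ReentrantLock`s in place of the real duper model and status service locks are all assumptions made for illustration. It only shows the general idea of refusing to take the outer lock while the inner one is already held by the current thread.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: names and structure are illustrative, not Vespa's API.
class LockOrderGuard {
    // Stand-ins for the duper model lock (outer) and an application lock (inner).
    private final ReentrantLock duperModelLock = new ReentrantLock();
    private final ReentrantLock applicationLock = new ReentrantLock();

    // Runs the action while holding the application lock.
    void withApplicationLock(Runnable action) {
        applicationLock.lock();
        try {
            action.run();
        } finally {
            applicationLock.unlock();
        }
    }

    // Runs the action while holding the duper model lock, but fails fast if
    // the current thread already holds the application lock, since that is
    // the reverse of the required order and could deadlock with another thread.
    void withDuperModelLock(Runnable action) {
        if (applicationLock.isHeldByCurrentThread()) {
            throw new IllegalStateException(
                    "Duper model lock requested while holding the application lock");
        }
        duperModelLock.lock();
        try {
            action.run();
        } finally {
            duperModelLock.unlock();
        }
    }

    public static void main(String[] args) {
        LockOrderGuard guard = new LockOrderGuard();
        // Allowed order: duper model lock first, then application lock.
        guard.withDuperModelLock(() -> guard.withApplicationLock(() -> {}));
        try {
            // Reverse order: rejected before any blocking can occur.
            guard.withApplicationLock(() -> guard.withDuperModelLock(() -> {}));
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing fast turns a potential deadlock, which is intermittent and hard to diagnose, into an immediate exception at the call site that violates the ordering.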