path: root/orchestrator
Commit message (Author, Date, Files changed, Lines -removed/+added)
* Avoid stacktrace in log on timeouts (Håkon Hallingstad, 2018-11-09, 3 files, -8/+66)
|
* Fix probe message (Håkon Hallingstad, 2018-11-05, 1 file, -2/+2)
|
* Send probe when suspending many nodes (Håkon Hallingstad, 2018-11-05, 16 files, -121/+157)
|
|     When suspending all nodes on a host, first do a suspend-all probe that
|     will try to suspend the nodes as normal in Orchestrator and cluster
|     controller, but actually not commit anything. A probe failure will result
|     in the same failure as a non-probe failure: A 409 response with
|     description is sent back to the client.
|
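The probe described in the commit above can be pictured as a dry run followed by the real operation. The sketch below is not the Orchestrator's code: `Conflict`, `FakeOrchestrator`, and `suspend_all` are hypothetical names invented for illustration, and `Conflict` merely stands in for the 409 response with a description.

```python
class Conflict(Exception):
    """Hypothetical stand-in for a 409 response carrying a description."""


class FakeOrchestrator:
    """Toy backend (an assumption): refuses to suspend the nodes in `refuse`."""

    def __init__(self, refuse=()):
        self.refuse = set(refuse)
        self.suspended = set()

    def suspend(self, node, probe):
        # The same checks run for probe and non-probe requests.
        if node in self.refuse:
            raise Conflict(f"cannot suspend {node}")
        # Only a non-probe request commits the new state.
        if not probe:
            self.suspended.add(node)


def suspend_all(orchestrator, nodes):
    """Probe every node first; commit only if every probe succeeded."""
    for node in nodes:
        orchestrator.suspend(node, probe=True)
    for node in nodes:
        orchestrator.suspend(node, probe=False)
```

With this shape, a refusal for any node surfaces before any node has been suspended, so a failed suspend-all leaves no host half-suspended in the toy model.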
* Wrap CC HTTP failures in 409 (Håkon Hallingstad, 2018-11-01, 9 files, -159/+91)
|
* Revert "Revert "Revert "Revert "Enforce CC timeouts in Orchestrator 4"""" (Håkon Hallingstad, 2018-11-01, 19 files, -308/+496)
|
* Revert "Revert "Revert "Enforce CC timeouts in Orchestrator 4""" (Håkon Hallingstad, 2018-11-01, 19 files, -496/+308)
|
* Revert "Revert "Enforce CC timeouts in Orchestrator [4]"" (Håkon Hallingstad, 2018-11-01, 19 files, -308/+496)
|
* Revert "Enforce CC timeouts in Orchestrator [4]" (Harald Musum, 2018-10-31, 19 files, -496/+308)
|
* Retry twice if only 1 CC (Håkon Hallingstad, 2018-10-30, 5 files, -20/+22)
|
|     Caught in the systemtests: If there's only one CC, there will only be 1
|     request with timeout ~5s, whereas today the real timeout is 10s. This
|     appears to make a difference to the systemtests as converging to a
|     cluster state may take several seconds.
|
|     There are 2 solutions:
|
|     1. Allocate ~10s to CC call, or
|     2. Make another ~5s call to CC if the first one fails.
|
|     (2) is simpler to implement for now. To implement (1), the timeout
|     calculation could receive the number of backends as a parameter, but
|     that would make the already complex logic here even worse. Or, we could
|     only reserve enough time for 1 call (abandon 2 calls logic). TBD later.
|
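Option (2) from the commit above amounts to planning a second attempt against the same cluster controller when it is the only one. A minimal sketch under that reading follows; `planned_attempts` and `first_success` are hypothetical helpers, not names from the Orchestrator, and the real change may be structured differently.

```python
def planned_attempts(controller_count):
    """Return the controller indices to try, in order.

    Assumption: with a single CC the same index is tried twice;
    with several CCs each one is tried once.
    """
    if controller_count == 1:
        return [0, 0]
    return list(range(controller_count))


def first_success(controller_count, call):
    """Try each planned attempt until one does not time out.

    `call(index)` is a hypothetical callable that raises TimeoutError
    when the request to that controller times out.
    """
    last_error = None
    for index in planned_attempts(controller_count):
        try:
            return call(index)
        except TimeoutError as error:
            last_error = error
    raise last_error
```

The per-call timeout is left out here on purpose; the commit only states that each call gets roughly 5s.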
* Revert "Revert "Revert "Revert "Enforce CC timeouts in Orchestrator 2"""" (Håkon Hallingstad, 2018-10-30, 19 files, -306/+492)
|
* Revert "Revert "Revert "Enforce CC timeouts in Orchestrator 2""" (Håkon Hallingstad, 2018-10-30, 19 files, -492/+306)
|
* Revert "Revert "Enforce CC timeouts in Orchestrator 2"" (Håkon Hallingstad, 2018-10-29, 19 files, -306/+492)
|
* Revert "Enforce CC timeouts in Orchestrator 2" (Håkon Hallingstad, 2018-10-29, 19 files, -492/+306)
|
* Merge branch 'master' into hakonhall/enforce-cc-timeouts-in-orchestrator-2 (Håkon Hallingstad, 2018-10-26, 3 files, -38/+58)
|\
| * Use dynamic port in tests (Harald Musum, 2018-10-24, 2 files, -38/+56)
| |
| * Add GET suspended status to application/v2 (Jon Bratseth, 2018-10-22, 1 file, -0/+2)
| |
* | Fixes after review round (Håkon Hallingstad, 2018-10-26, 5 files, -34/+30)
| |
* | Enforce CC timeouts in Orchestrator 2 (Håkon Hallingstad, 2018-10-23, 19 files, -306/+496)
|/
* Replace 'tonytv' with full name in author tags (Bjørn Christian Seime, 2018-07-05, 6 files, -6/+6)
|
* Correct share-remaining-time (Håkon Hallingstad, 2018-06-25, 1 file, -1/+1)
|
* Remove unused method (Håkon Hallingstad, 2018-06-25, 1 file, -4/+0)
|
* Avoid fatal first CC request timeout (Håkon Hallingstad, 2018-06-25, 4 files, -9/+36)
|
|     If the first setNodeState to the "first" cluster controller times out,
|     then we'd like to leave enough time to try the second CC. This avoids
|     making a single CC a single point of failure.
|
|     The strategy is to set a timeout of 50% of the remaining time, so if
|     everything times out the timeouts would roughly be 50%, 25%, and 12.5%
|     of original timeout. An alternative strategy would be to use 33% for
|     each, which would be more democratic.
|
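The arithmetic behind the two strategies in the commit above can be checked with a few lines. This is only a worked example of the stated percentages, assuming every attempt uses its full timeout; `halving_timeouts` and `equal_timeouts` are hypothetical names, not the Orchestrator's time-budget code.

```python
def halving_timeouts(total_seconds, attempts):
    """Give each attempt half of the time still remaining.

    Worst case only: assumes every attempt runs until it times out.
    """
    remaining = total_seconds
    timeouts = []
    for _ in range(attempts):
        timeout = remaining / 2
        timeouts.append(timeout)
        remaining -= timeout
    return timeouts


def equal_timeouts(total_seconds, attempts):
    """The alternative from the commit message: an equal share per attempt."""
    return [total_seconds / attempts] * attempts


# A 10 s budget over three cluster controllers:
print(halving_timeouts(10.0, 3))  # [5.0, 2.5, 1.25], i.e. 50%, 25%, 12.5%
print(equal_timeouts(9.0, 3))     # [3.0, 3.0, 3.0]
```

Note that the halving strategy always leaves an unused remainder (12.5% of the budget after three attempts), while equal shares consume the whole budget.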
* set-node-state timeout in CC (Håkon Hallingstad, 2018-06-22, 4 files, -12/+8)
|
* Revert "Revert "Move TimeBudget to vespajlib and use Clock"" (Håkon Hallingstad, 2018-06-22, 8 files, -36/+49)
|
* Revert "Move TimeBudget to vespajlib and use Clock" (Harald Musum, 2018-06-21, 8 files, -49/+36)
|
* Use UncheckedTimeoutException from guava (Håkon Hallingstad, 2018-06-21, 2 files, -4/+5)
|
* Use ManualClock and remove Unchecked prefix (Håkon Hallingstad, 2018-06-21, 2 files, -4/+4)
|
* Move TimeBudget to vespajlib and use Clock (Håkon Hallingstad, 2018-06-21, 8 files, -36/+48)
|
* Add timeout to set-node-state calls from Orchestrator (Håkon Hallingstad, 2018-06-19, 18 files, -101/+188)
|
* Avoid set-node-state retry (Håkon Hallingstad, 2018-06-14, 1 file, -1/+7)
|
|     Today, the Orchestrator will call each cluster controller twice, e.g.
|     indices 1, 2, 0, 1, 2, 0, if each time out. This is unnecessary. The
|     minimum number of calls is 2:
|
|     - Either the first CC is up and will redirect to master if necessary, or
|     - the second is up and will redirect to master if necessary, or
|     - the third won't have quorum.
|
|     This PR changes the current strategy to call all CCs once, e.g. indices
|     1, 2, and 0.
|
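The index sequences quoted in the commit above follow from walking the controllers round-robin from a starting index. The sketch below reproduces both sequences; `call_order` and its `rounds` parameter are hypothetical, introduced only to contrast the old two-round behaviour with the new single round.

```python
def call_order(start_index, controller_count, rounds=1):
    """Indices of the cluster controllers to call, in order.

    Starts at `start_index` and wraps around; `rounds=2` models the
    earlier behaviour of calling every controller twice.
    """
    return [
        (start_index + step) % controller_count
        for step in range(controller_count * rounds)
    ]


print(call_order(1, 3, rounds=2))  # [1, 2, 0, 1, 2, 0]  (before the change)
print(call_order(1, 3))            # [1, 2, 0]           (after the change)
```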
* Use RuntimeException instead of Error (Håkon Hallingstad, 2018-06-07, 1 file, -1/+1)
|
* Remove usage of junit.framework (Jon Bratseth, 2018-04-30, 1 file, -11/+13)
|
* Nonfunctional changes (Jon Bratseth, 2018-04-06, 2 files, -0/+3)
|
* Remove deprecated suspend API (Martin Polden, 2018-03-14, 1 file, -18/+0)
|
* New path for suspend all API (Martin Polden, 2018-03-01, 3 files, -32/+44)
|
|     This is required to allow authorization of these requests.
|
* Support reporting UP for node admin outside zone app (Håkon Hallingstad, 2018-02-26, 2 files, -14/+24)
|
|     If the nodeAdminInContainer ConfigserverConfig has been set, with this
|     PR, the service monitor will always report the node admin container
|     service as UP, thereby avoiding issues related to standalone node admin
|     seemingly being down when not running as part of the application.
|
|     This postpones checking /status/v1/health for later.
|
* Roll out node admin with 20% (Håkon Hallingstad, 2018-01-25, 12 files, -21/+69)
|
* Some Curator clients require ensemble connect string (Håkon Hallingstad, 2018-01-11, 1 file, -3/+1)
|
* Split parent + container-dependency-versions from root pom. (gjoranv, 2017-12-01, 1 file, -0/+1)
|
|     - Add missing dependencies so that all provided non-yahoo jars are
|       listed in container-dependency-versions.
|     - Add relativePath for all child poms of parent.
|
* Revert "Gjoranv/split parent2" (gjoranv, 2017-11-30, 1 file, -1/+0)
|
* Split parent + container-dependency-versions from root pom. (gjoranv, 2017-11-30, 1 file, -0/+1)
|
|     - Add missing dependencies so that all provided non-yahoo jars are
|       listed in container-dependency-versions.
|     - Add relativePath for all child poms of parent.
|
* Revert "Gjoranv/split parent" (gjoranv, 2017-11-29, 1 file, -1/+0)
|
* Split parent + container-dependency-versions from root pom. (gjoranv, 2017-11-29, 1 file, -0/+1)
|
|     - Add missing dependencies so that all provided non-yahoo jars are
|       listed in container-dependency-versions.
|     - Add relativePath for all child poms of parent.
|
* Set scheme parameter for all us of jaxrs client (Bjørn Christian Seime, 2017-11-21, 3 files, -4/+10)
|
* Revert "Temporarily ignore unstable orchestrator test" (Bjørn Christian Seime, 2017-11-16, 1 file, -1/+0)
|
* Revert "Avoid changing API before all clients handle it" (Håkon Hallingstad, 2017-11-13, 2 files, -15/+8)
|
* Avoid changing API before all clients handle it (Håkon Hallingstad, 2017-11-01, 2 files, -8/+15)
|
* REST API for service status (Håkon Hallingstad, 2017-10-27, 3 files, -42/+91)
|
* Merge pull request #3917 from vespa-engine/hakonhall/add-rest-api-to-query-slobrok (Bjørn Christian Seime, 2017-10-27, 3 files, -11/+112)
|\
| |   Add REST API to query Slobrok
| |
| * Add REST API to query Slobrok (Håkon Hallingstad, 2017-10-27, 3 files, -11/+112)
| |