Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Split parent + container-dependency-versions from root pom. | gjoranv | 2017-12-01 | 1 | -0/+1 |
| | | | | | | - Add missing dependencies so that all provided non-yahoo jars are listed in container-dependency-versions. - Add relativePath for all child poms of parent. | ||||
* | Revert "Gjoranv/split parent2" | gjoranv | 2017-11-30 | 1 | -1/+0 |
| | |||||
* | Split parent + container-dependency-versions from root pom. | gjoranv | 2017-11-30 | 1 | -0/+1 |
| | | | | | | - Add missing dependencies so that all provided non-yahoo jars are listed in container-dependency-versions. - Add relativePath for all child poms of parent. | ||||
* | Revert "Gjoranv/split parent" | gjoranv | 2017-11-29 | 1 | -1/+0 |
| | |||||
* | Split parent + container-dependency-versions from root pom. | gjoranv | 2017-11-29 | 1 | -0/+1 |
| | | | | | | - Add missing dependencies so that all provided non-yahoo jars are listed in container-dependency-versions. - Add relativePath for all child poms of parent. | ||||
* | Log when a cluster state version is published | Tor Brede Vekterli | 2017-10-30 | 1 | -0/+1 |
| | | | | | | Makes it much easier to reason about which state transitions have been made visible in the cluster, and which ones have just been internal state transitions in the controller. | ||||
* | Update wanted state on description changes, and fix method names | Håkon Hallingstad | 2017-10-24 | 1 | -13/+17 |
| | |||||
* | Also set the distributor wanted state when safe-setting the storage node state | Håkon Hallingstad | 2017-10-21 | 4 | -12/+247 |
| | | | | | | | | | | This is done as part of the SAFE REST API call to set the node state of a storage node to ensure atomicity of the state change, reduce the number of state changes, and minimize the time to complete the state changes. The right way to think about the safe-set is then: In order to safely set a storage node to (e.g.) maintenance, the distributor will also have to be set to down. And so on for the various permutations of state transitions. | ||||
* | Ignore current wanted state when safely setting state to up | Håkon Hallingstad | 2017-10-20 | 2 | -10/+5 |
| | |||||
* | Merge pull request #3525 from vespa-engine/vekterli/re-enable-synchronous-set-node-state | Tor Brede Vekterli | 2017-10-12 | 8 | -21/+165 |
|\ | | | | | | | | | Re-enable synchronous set node state with additional safeguards | ||||
| * | Add configurable deadline for cluster controller tasks | Tor Brede Vekterli | 2017-09-25 | 7 | -14/+113 |
| | | | | | | | | | | | | Prevents an unstable cluster from potentially holding up all container request processing threads indefinitely. Deadline errors are translated into HTTP 504 errors to REST API clients. | ||||
| * | Immediately complete failed remote tasks | Tor Brede Vekterli | 2017-09-25 | 6 | -8/+53 |
| | | | | | | | | | | | | We check both for master status and task failure, as we otherwise place a potentially dangerous silent dependency on the task always failing itself if the controller is not a master. | ||||
* | | Config-retired should not override explicit Down or Maintenance states | Tor Brede Vekterli | 2017-10-12 | 4 | -6/+62 |
| | | | | | | | | | | | | Previously, a config-retired node marked as Down by the Orchestrator would remain as Retired in the cluster state until the node was actually taken down entirely. | ||||
* | | Avoid busy-looping when distributors fail to ACK state version | Tor Brede Vekterli | 2017-10-11 | 2 | -7/+3 |
| | | |||||
* | | Revert "Revert "Aressem/remove post install script"" | Arnstein Ressem | 2017-09-27 | 1 | -0/+2 |
| | | |||||
* | | Revert "Aressem/remove post install script" | Arnstein Ressem | 2017-09-27 | 1 | -2/+0 |
| | | |||||
* | | Remove global install of files and put this in the modules that owns them. | Arnstein Ressem | 2017-09-25 | 1 | -0/+2 |
|/ | |||||
* | Merge branch 'master' into bratseth/nonfunctional-changes-4 | Arne Juul | 2017-09-22 | 4 | -4/+23 |
|\ | | | | | | | | | Conflicts: vespajlib/src/main/java/com/yahoo/concurrent/lock/Locks.java | ||||
| * | Temporarily disable set-node-state version ACK dependency | Tor Brede Vekterli | 2017-09-20 | 2 | -3/+6 |
| | | | | | | | | | | | | Effectively reverts to legacy behavior while some more thinking is done on how to deal with blocking requests during leader elections and non-converging clusters. | ||||
| * | Immediately complete remote tasks when not leader | Tor Brede Vekterli | 2017-09-19 | 2 | -1/+17 |
| | | | | | | | | | | | | Avoids edge case where set-node-state requests sent to followers would have their response delayed indefinitely due to controller not publishing any versions that the task's ACK barrier could be released by. | ||||
* | | Merge with master | Jon Bratseth | 2017-09-15 | 50 | -57/+117 |
|/ | |||||
* | Refactor deferred version task completion to take in version explicitly | Tor Brede Vekterli | 2017-09-12 | 1 | -13/+14 |
| | |||||
* | Change wording for operations without observable side-effects | Tor Brede Vekterli | 2017-09-12 | 3 | -22/+23 |
| | |||||
* | Break node version ACK check out into separately called logic | Tor Brede Vekterli | 2017-09-12 | 3 | -14/+49 |
| | | | | | | | | Removes dependency on having to invoke broadcastNewState before being able to observe that all distributors are in sync. Invocations of broadcastNewState are gated by a grace period between each time, so unless this is done we get artificial delays before a synchronous task can be considered complete. | ||||
* | Test multiple scheduled synchronous tasks | Tor Brede Vekterli | 2017-09-11 | 2 | -1/+19 |
| | |||||
* | Move leadership test code into fixture | Tor Brede Vekterli | 2017-09-11 | 1 | -11/+22 |
| | |||||
* | Test automatic task failing on controller leadership loss | Tor Brede Vekterli | 2017-09-11 | 3 | -11/+54 |
| | |||||
* | Add support for version ACK-dependent tasks in cluster controller | Tor Brede Vekterli | 2017-09-11 | 9 | -18/+426 |
| | | | | | | | | | Used to enable synchronous operation for set-node-state calls, which ensure that side-effects of the call are visible when the response returns. If controller leadership is lost before state is published, tasks will be failed back to the client. | ||||
* | Do not use import x.y.*; | Henning Baldersheim | 2017-09-04 | 1 | -1/+10 |
| | |||||
* | Try to shut down fleetcontroller in a controlled manner without relying on the infamous thread.interrupt. | Henning Baldersheim | 2017-09-04 | 1 | -21/+29 |
| | |||||
* | Update copyright headers | Jon Bratseth | 2017-06-14 | 158 | -156/+158 |
| | |||||
* | Revert "Update copyright headers" | Jon Bratseth | 2017-06-14 | 158 | -158/+156 |
| | |||||
* | Update copyright headers | Jon Bratseth | 2017-06-14 | 158 | -156/+158 |
| | |||||
* | Remove carriage return | Jon Bratseth | 2017-06-14 | 4 | -4/+4 |
| | |||||
* | Revert "Copyright header" | Jon Bratseth | 2017-06-13 | 158 | -163/+160 |
| | |||||
* | Copyright header | Jon Bratseth | 2017-06-13 | 158 | -160/+163 |
| | |||||
* | Ignore test that hangs on mac. | gjoranv | 2017-06-12 | 1 | -0/+1 |
| | |||||
* | Merge pull request #2494 from yahoo/hakon/adds-safe-setting-of-wanted-state-down | hakonhall | 2017-05-23 | 5 | -46/+256 |
|\ | | | | | Safely set storage node to DOWN | ||||
| * | Extract common check | Håkon Hallingstad | 2017-05-21 | 2 | -23/+31 |
| | | |||||
| * | Dedup test code | Håkon Hallingstad | 2017-05-21 | 1 | -64/+46 |
| | | |||||
| * | Verify version and reported state | Håkon Hallingstad | 2017-05-21 | 2 | -11/+76 |
| | | |||||
| * | Safely set storage node to DOWN | Håkon Hallingstad | 2017-05-18 | 5 | -33/+188 |
| | | | | | | | | | | | | | | Setting a storage node to DOWN is considered safe if it can be permanently set down (e.g. removed from the application): - The node is RETIRED - There are no managed buckets | ||||
* | | Don't reset interrupt flag | Tor Brede Vekterli | 2017-05-22 | 1 | -1/+0 |
| | | |||||
* | | Write to ZooKeeper must be timing invariant | Tor Brede Vekterli | 2017-05-22 | 2 | -4/+17 |
| | | | | | | | | | | | | Previously could risk that state transition grace period would elide write to ZooKeeper if state changes happened within previous grace period. | ||||
* | | Merge pull request #2506 from yahoo/arnej/remove-extra-gitignore | Jon Bratseth | 2017-05-19 | 1 | -0/+0 |
|\ \ | | | | | | | Arnej/remove extra gitignore | ||||
| * | | remove old unused ignores | Arne Juul | 2017-05-19 | 1 | -0/+0 |
| |/ | |||||
* / | Always write new cluster state versions to ZooKeeper | Tor Brede Vekterli | 2017-05-12 | 6 | -52/+95 |
|/ | | | | | | | | | | | Previously, the controller would not write the version to ZK unless the version was published to at least one node. This could lead to problems due to un-written version numbers being visible via the controller's REST APIs. External observers could see versions that were not present in ZK and that would not be stable across reelections. As a consequence, invariants for strictly increasing version numbers would be violated from the perspective of these external observers (in particular, our system test framework). | ||||
* | Log ZooKeeper cluster state version reads and writes with INFO level | Tor Brede Vekterli | 2017-05-09 | 1 | -2/+4 |
| | |||||
* | Improve Spec API | Håkon Hallingstad | 2017-02-22 | 6 | -20/+26 |
| | | | | | | - Removes Spec.getLocalHostName - Removes distinction between listening- and connect- address for Spec - Makes all usage of connect w/Spec specify hostname | ||||
* | Makes clustercontroller-core work on WiFi | Håkon Hallingstad | 2017-02-20 | 5 | -45/+79 |
| |