Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Revert "Revert "Aressem/remove post install script"" | Arnstein Ressem | 2017-09-27 | 1 | -0/+2 |
| | |||||
* | Revert "Aressem/remove post install script" | Arnstein Ressem | 2017-09-27 | 1 | -2/+0 |
| | |||||
* | Remove global install of files and put this in the modules that owns them. | Arnstein Ressem | 2017-09-25 | 1 | -0/+2 |
| | |||||
* | Merge branch 'master' into bratseth/nonfunctional-changes-4 | Arne Juul | 2017-09-22 | 4 | -4/+23 |
|\ | | | | | | | | | Conflicts: vespajlib/src/main/java/com/yahoo/concurrent/lock/Locks.java | ||||
| * | Temporarily disable set-node-state version ACK dependency | Tor Brede Vekterli | 2017-09-20 | 2 | -3/+6 |
| | | | | | | | | | | | | Effectively reverts to legacy behavior while some more thinking is done on how to deal with blocking requests during leader elections and non-converging clusters. | ||||
| * | Immediately complete remote tasks when not leader | Tor Brede Vekterli | 2017-09-19 | 2 | -1/+17 |
| | | | | | | | | | | | | Avoids edge case where set-node-state requests sent to followers would have their response delayed indefinitely due to controller not publishing any versions that the task's ACK barrier could be released by. | ||||
* | | Merge with master | Jon Bratseth | 2017-09-15 | 50 | -57/+117 |
|/ | |||||
* | Refactor deferred version task completion to take in version explicitly | Tor Brede Vekterli | 2017-09-12 | 1 | -13/+14 |
| | |||||
* | Change wording for operations without observable side-effects | Tor Brede Vekterli | 2017-09-12 | 3 | -22/+23 |
| | |||||
* | Break node version ACK check out into separately called logic | Tor Brede Vekterli | 2017-09-12 | 3 | -14/+49 |
| | | | | | | | | Removes dependency on having to invoke broadcastNewState before being able to observe that all distributors are in sync. Invocations of broadcastNewState are gated by a grace period between each time, so unless this is done we get artificial delays before a synchronous task can be considered complete. | ||||
* | Test multiple scheduled synchronous tasks | Tor Brede Vekterli | 2017-09-11 | 2 | -1/+19 |
| | |||||
* | Move leadership test code into fixture | Tor Brede Vekterli | 2017-09-11 | 1 | -11/+22 |
| | |||||
* | Test automatic task failing on controller leadership loss | Tor Brede Vekterli | 2017-09-11 | 3 | -11/+54 |
| | |||||
* | Add support for version ACK-dependent tasks in cluster controller | Tor Brede Vekterli | 2017-09-11 | 9 | -18/+426 |
| | | | | | | | | | Used to enable synchronous operation for set-node-state calls, which ensure that side-effects of the call are visible when the response returns. If controller leadership is lost before state is published, tasks will be failed back to the client. | ||||
* | Do not use import x.y.*; | Henning Baldersheim | 2017-09-04 | 1 | -1/+10 |
| | |||||
* | Try to shut down fleetcontroller in a controlled manner without relying on the infamous thread.interrupt. | Henning Baldersheim | 2017-09-04 | 1 | -21/+29 |
| | |||||
* | Update copyright headers | Jon Bratseth | 2017-06-14 | 158 | -156/+158 |
| | |||||
* | Revert "Update copyright headers" | Jon Bratseth | 2017-06-14 | 158 | -158/+156 |
| | |||||
* | Update copyright headers | Jon Bratseth | 2017-06-14 | 158 | -156/+158 |
| | |||||
* | Remove carriage return | Jon Bratseth | 2017-06-14 | 4 | -4/+4 |
| | |||||
* | Revert "Copyright header" | Jon Bratseth | 2017-06-13 | 158 | -163/+160 |
| | |||||
* | Copyright header | Jon Bratseth | 2017-06-13 | 158 | -160/+163 |
| | |||||
* | Ignore test that hangs on mac. | gjoranv | 2017-06-12 | 1 | -0/+1 |
| | |||||
* | Merge pull request #2494 from yahoo/hakon/adds-safe-setting-of-wanted-state-down | hakonhall | 2017-05-23 | 5 | -46/+256 |
|\ | | | | | Safely set storage node to DOWN | ||||
| * | Extract common check | Håkon Hallingstad | 2017-05-21 | 2 | -23/+31 |
| | | |||||
| * | Dedup test code | Håkon Hallingstad | 2017-05-21 | 1 | -64/+46 |
| | | |||||
| * | Verify version and reported state | Håkon Hallingstad | 2017-05-21 | 2 | -11/+76 |
| | | |||||
| * | Safely set storage node to DOWN | Håkon Hallingstad | 2017-05-18 | 5 | -33/+188 |
| | | | | | | | | | | | | | | Setting a storage node to DOWN is considered safe if it can be permanently set down (e.g. removed from the application): - The node is RETIRED - There are no managed buckets | ||||
* | | Don't reset interrupt flag | Tor Brede Vekterli | 2017-05-22 | 1 | -1/+0 |
| | | |||||
* | | Write to ZooKeeper must be timing invariant | Tor Brede Vekterli | 2017-05-22 | 2 | -4/+17 |
| | | | | | | | | | | | | Previously could risk that state transition grace period would elide write to ZooKeeper if state changes happened within previous grace period. | ||||
* | | Merge pull request #2506 from yahoo/arnej/remove-extra-gitignore | Jon Bratseth | 2017-05-19 | 1 | -0/+0 |
|\ \ | | | | | | | Arnej/remove extra gitignore | ||||
| * | | remove old unused ignores | Arne Juul | 2017-05-19 | 1 | -0/+0 |
| |/ | |||||
* / | Always write new cluster state versions to ZooKeeper | Tor Brede Vekterli | 2017-05-12 | 6 | -52/+95 |
|/ | | | | | | | | | | | Previously, the controller would not write the version to ZK unless the version was published to at least one node. This could lead to problems due to un-written version numbers being visible via the controller's REST APIs. External observers could see versions that were not present in ZK and that would not be stable across reelections. As a consequence, invariants for strictly increasing version numbers would be violated from the perspective of these external observers (in particular, our system test framework). | ||||
* | Log ZooKeeper cluster state version reads and writes with INFO level | Tor Brede Vekterli | 2017-05-09 | 1 | -2/+4 |
| | |||||
* | Improve Spec API | Håkon Hallingstad | 2017-02-22 | 6 | -20/+26 |
| | | | | | | - Removes Spec.getLocalHostName - Removes distinction between listening- and connect- address for Spec - Makes all usage of connect w/Spec specify hostname | ||||
* | Makes clustercontroller-core work on WiFi | Håkon Hallingstad | 2017-02-20 | 5 | -45/+79 |
| | |||||
* | Use relative URLs in Cluster Controller status page | Håkon Hallingstad | 2017-02-17 | 4 | -19/+17 |
| | |||||
* | Add/improve READMEs | Jon Bratseth | 2017-01-19 | 1 | -0/+5 |
| | |||||
* | Merge pull request #1301 from yahoo/bratseth/indexed-tensor | Jon Bratseth | 2016-12-13 | 1 | -0/+1 |
|\ | | | | | Bratseth/indexed tensor | ||||
| * | MapTensor -> MappedTensor | Jon Bratseth | 2016-12-12 | 1 | -0/+1 |
| | | |||||
* | | Use latest candidate cluster state when comparing against reported node states | Tor Brede Vekterli | 2016-12-09 | 3 | -1/+52 |
|/ | | | | | | | | | | | Using just the versioned cluster state instead can cause the code to erroneously believe that it is seeing repeated reported state changes for the first time. This happens when the diffs in the reported node states are not in and by themselves enough to trigger a new cluster state version containing the changes. This can in turn spam the logs and event buffers until a new cluster state has been versioned. | ||||
* | Reduce disconnect errors to warning as they are likely during shutdown. | Henning Baldersheim | 2016-10-12 | 1 | -3/+2 |
| | |||||
* | Rewrite and refactor core cluster controller state generation logic | Tor Brede Vekterli | 2016-10-05 | 38 | -1378/+3626 |
| | | | Cluster controller will now generate the new cluster state on-demand in a "pure functional" way instead of conditionally patching a working state over time. This makes understanding (and changing) the state generation logic vastly easier than it previously was. | ||||
* | Yahoo sets up mac wireless networks such that the local hostname points to an | Jon Bratseth | 2016-09-29 | 1 | -2/+1 |
| | | | | | | ip which does not resolve. This works around that problem by finding a resolvable address (while still falling back to localhost if we only get ipv6 addresses, as that causes other problems in docker containers). | ||||
* | Need to figure out what to do with the tests using DockerOperations | Håkon Hallingstad | 2016-09-01 | 1 | -0/+2 |
| | |||||
* | Less verbose/duplicate logging per fetched child node from ZooKeeper | Tor Brede Vekterli | 2016-07-04 | 1 | -2/+2 |
| | |||||
* | Always request data for all znodes on master election dir watch callback | Tor Brede Vekterli | 2016-07-01 | 1 | -24/+9 |
| | | | | | | | | | | | | | The previous version of the code attempted to optimize by only requesting node data for nodes that had changed, but there existed an edge case where it would mistakenly fail to request new data for nodes that _had_ changed. This could happen if the callback was invoked when nextMasterData already contained entries for the same set of node indices returned as part of the directory callback. Always clearing our internal state and requesting all znodes is a more robust option. The number of cluster controllers should always be so low that the expected added overhead is negligible. | ||||
* | Merge pull request #56 from yahoo/vekterli/configurable-group-auto-takedown | Tor Brede Vekterli | 2016-06-27 | 16 | -176/+1410 |
|\ | | | | | Add configurable automatic group up/down feature based on node availability | ||||
| * | Clarify predicate on isRpcAddressOutdated() for clearing node state | Tor Brede Vekterli | 2016-06-22 | 1 | -4/+14 |
| | | | | | | | | | | | | Logic is unchanged, but added comment with rationale and cross-reference to other method that we're trying to be symmetrical with in terms of state transition behavior. | ||||
| * | Don't reintroduce already observed timestamps in cluster state | Tor Brede Vekterli | 2016-06-17 | 2 | -9/+60 |
| | | | | | | | | Also address code review comments. |