Commit message
Safely set storage node to DOWN
Setting a storage node to DOWN is considered safe if the node can be permanently taken down (e.g. removed from the application), which holds when (see the sketch below):
- The node is RETIRED
- The node has no managed buckets
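This safety condition can be expressed as a small predicate. A minimal sketch in Java, where NodeInfo, isRetired, and managedBucketCount are hypothetical names for illustration, not the actual cluster controller API:

    /** Sketch of the safety check above; all names are hypothetical. */
    final class NodeDownSafetyCheck {

        interface NodeInfo {
            boolean isRetired();
            long managedBucketCount();
        }

        /** A node may be permanently set DOWN only when both conditions hold. */
        static boolean isSafeToSetDown(NodeInfo node) {
            return node.isRetired()                  // node is RETIRED
                && node.managedBucketCount() == 0;   // node owns no buckets
        }
    }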
Previously, there was a risk that the state transition grace period would elide a write to ZooKeeper if state changes happened within the previous grace period.
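The failure mode is the classic debounce pitfall: a change arriving while a grace period is still active gets coalesced with the previous write and then silently dropped. A hypothetical sketch of the pattern and a safer variant (all names are illustrative, not the actual implementation):

    /** Hypothetical illustration of the grace-period pitfall described above. */
    final class GracePeriodWriter {
        private final long gracePeriodMillis;
        private long lastWriteMillis = 0; // epoch start, so the first write always passes
        private boolean dirty = false;

        GracePeriodWriter(long gracePeriodMillis) { this.gracePeriodMillis = gracePeriodMillis; }

        /** Buggy variant: a change inside the grace period is lost forever. */
        void onStateChangeBuggy(Runnable writeToZooKeeper, long nowMillis) {
            if (nowMillis - lastWriteMillis < gracePeriodMillis) return; // write elided
            writeToZooKeeper.run();
            lastWriteMillis = nowMillis;
        }

        /** Safer variant: remember the change and flush once the period expires. */
        void onStateChangeSafe(Runnable writeToZooKeeper, long nowMillis) {
            dirty = true;
            if (dirty && nowMillis - lastWriteMillis >= gracePeriodMillis) {
                writeToZooKeeper.run();   // a periodic tick would also trigger this flush
                lastWriteMillis = nowMillis;
                dirty = false;
            }
        }
    }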
Arnej/remove extra gitignore
Previously, the controller would not write the version to ZK unless the version was published to at least one node. This could lead to problems because unwritten version numbers were visible via the controller's REST APIs. External observers could see versions that were not present in ZK and that would not be stable across reelections. As a consequence, the invariant of strictly increasing version numbers would be violated from the perspective of these external observers (in particular, our system test framework).
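The underlying invariant is write-ahead publication: a version number must be durable in ZooKeeper before any observer can see it, so the number survives reelection. A hypothetical sketch of the ordering (VersionStore and the method names are illustrative):

    /** Hypothetical sketch of persisting a version before exposing it. */
    final class VersionPublisher {
        interface VersionStore { void writeVersion(int version); } // e.g. backed by ZooKeeper

        private final VersionStore durableStore;
        private volatile int visibleVersion; // what the REST APIs may report

        VersionPublisher(VersionStore durableStore) { this.durableStore = durableStore; }

        int publishNewVersion() {
            int next = visibleVersion + 1;
            durableStore.writeVersion(next); // 1. persist first, so the number is stable
            visibleVersion = next;           // 2. only then make it externally visible
            return next;
        }
    }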
- Removes Spec.getLocalHostName
- Removes the distinction between listening and connect addresses for Spec
- Makes all usage of connect with Spec specify a hostname
Bratseth/indexed tensor
Using just the versioned cluster state instead can cause the code to erroneously believe that it is seeing repeated reported state changes for the first time. This happens when the diffs in the reported node states are not, in and of themselves, enough to trigger a new cluster state version containing the changes. This can in turn spam the logs and event buffers until a new cluster state has been versioned.
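In other words, deduplication must compare against the last reported state seen per node, not against the last versioned state, because not every reported change produces a new version. A hypothetical sketch (not the actual data structures):

    import java.util.HashMap;
    import java.util.Map;

    /** Hypothetical sketch: deduplicate reported state changes per node. */
    final class ReportedStateDeduplicator {
        private final Map<Integer, String> lastReportedPerNode = new HashMap<>();

        /** True only the first time a node reports a given state. */
        boolean isNewReportedState(int nodeIndex, String reportedState) {
            // Comparing against the versioned cluster state here would keep
            // re-reporting the same change until a new version is generated.
            return !reportedState.equals(lastReportedPerNode.put(nodeIndex, reportedState));
        }
    }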
The cluster controller will now generate the new cluster state on demand in a "pure functional" way, instead of conditionally patching a working state over time. This makes the state generation logic vastly easier to understand (and change) than it previously was.
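The shape of that change, as a hypothetical sketch: state derivation becomes a pure function of its inputs rather than a mutation of a long-lived working state, so the same inputs always yield the same cluster state (all types here are illustrative):

    import java.util.List;

    /** Hypothetical sketch of deriving the cluster state as a pure function. */
    final class StateGeneration {
        record ReportedState(int nodeIndex, boolean up) {}
        record ClusterState(int version, List<String> nodeStates) {}

        /** Same inputs, same output; nothing is patched in place. */
        static ClusterState generate(List<ReportedState> reported, int version) {
            List<String> nodeStates = reported.stream()
                    .map(r -> "storage." + r.nodeIndex() + ": " + (r.up() ? "up" : "down"))
                    .toList();
            return new ClusterState(version, nodeStates);
        }
    }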
…ip which does not resolve. This works around that problem by finding a resolvable address (while still falling back to localhost if we only get IPv6 addresses, as that causes other problems in Docker containers).
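A hedged sketch of that kind of fallback, using the standard java.net API (this illustrates the approach, not the actual implementation):

    import java.net.Inet4Address;
    import java.net.InetAddress;
    import java.net.NetworkInterface;
    import java.net.SocketException;
    import java.util.Collections;

    final class ResolvableAddress {
        /** Pick a non-loopback IPv4 address; IPv6-only hosts fall back to localhost. */
        static String find() throws SocketException {
            for (NetworkInterface nic : Collections.list(NetworkInterface.getNetworkInterfaces())) {
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    if (!addr.isLoopbackAddress() && addr instanceof Inet4Address)
                        return addr.getHostAddress();
                }
            }
            return "localhost"; // avoids the IPv6-in-Docker issues mentioned above
        }
    }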
The previous version of the code attempted to optimize by only requesting
node data for nodes that had changed, but there existed an edge case where
it would mistakenly fail to request new data for nodes that _had_ changed.
This could happen if the callback was invoked when nextMasterData already
contained entries for the same set of node indices returned as part of the
directory callback.
Always clearing our internal state and requesting all znodes is a more
robust option. The number of cluster controllers should always be so low
that the expected added overhead is negligible.
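As a hypothetical sketch of the robust variant (the client interface below is a stand-in, not the real ZooKeeper API): on every directory callback, discard the cached entries and request data for every child again, rather than diffing against the previous set.

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    /** Hypothetical sketch: refetch all znodes on every directory change. */
    final class MasterDataFetcher {
        interface ZkClient { byte[] readNode(String path); } // stand-in client

        private final ZkClient zk;
        private final Map<String, byte[]> nextMasterData = new HashMap<>();

        MasterDataFetcher(ZkClient zk) { this.zk = zk; }

        /** Directory callback: clear internal state, then re-request every child. */
        void onDirectoryChanged(List<String> childNodes) {
            nextMasterData.clear(); // never trust stale entries for "unchanged" nodes
            for (String child : childNodes)
                nextMasterData.put(child, zk.readNode("/master-data/" + child));
        }
    }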
Add configurable automatic group up/down feature based on node availability
The logic is unchanged, but this adds a comment with the rationale and a cross-reference to the other method whose state transition behavior this one is meant to stay symmetrical with.
Also address code review comments.
Available under the content cluster tuning tag; the feature is currently disabled by default (we need production experience with it first). Also improves handling of nodes removed from config by ensuring these are taken out of the core working cluster state instead of just being patched away before each state publish.
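A hypothetical sketch of the availability rule such a feature implies: if the fraction of available nodes in a group drops below a configured threshold, the whole group is taken down, and it comes back up once enough nodes are available again (the threshold name below is illustrative, not the actual config key):

    import java.util.List;

    /** Hypothetical sketch of automatic group up/down based on availability. */
    final class GroupAvailability {
        private final double minNodeRatioPerGroup; // illustrative config threshold

        GroupAvailability(double minNodeRatioPerGroup) {
            this.minNodeRatioPerGroup = minNodeRatioPerGroup;
        }

        /** The whole group goes down when too few of its nodes are available. */
        boolean groupShouldBeDown(List<Boolean> nodeIsAvailable) {
            if (nodeIsAvailable.isEmpty()) return false;
            long up = nodeIsAvailable.stream().filter(a -> a).count();
            return (double) up / nodeIsAvailable.size() < minNodeRatioPerGroup;
        }
    }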