The existing state unification logic presumably existed to ensure that the
various distributor availability states were treated as if they were simply
Up. However, the distributor has not been able to even _be_ in any available
state other than Up for many years, so the logic is effectively pointless.
Remove unification entirely and instead require the distributor and content
node to be mutually in sync on the exact cluster state version.
It was possible for a distributor bucket fetch request to be processed
_after_ a cluster state was enabled (and internally propagated) on the
content node, but _before_ all side effects of this enabling were complete
and fully visible. This could cause inconsistent information to be returned
to the distributor, leaving nodes with out-of-sync bucket metadata.
This commit handles such transition periods by introducing an implicit
barrier between observing the incoming command and the outgoing reply for a
particular cluster state version. Upon observing the reply for a version,
all side effects must already be visible, since the reply is only sent
once internal state processing is complete (both above and below the SPI).
Until the initiated and completed versions converge, requests are rejected
and are transparently retried by the distributors.
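
A rough sketch of the convergence check described above; this is
illustrative only, and ClusterStateBarrier and its method names are
hypothetical rather than the actual Vespa classes:

    #include <atomic>
    #include <cstdint>

    // Hypothetical sketch: tracks the state version whose enabling has been
    // initiated vs. the version whose side effects are fully visible (i.e.
    // the version for which the outgoing reply has been observed).
    class ClusterStateBarrier {
        std::atomic<uint32_t> _initiated{0};
        std::atomic<uint32_t> _completed{0};
    public:
        void on_set_cluster_state(uint32_t version) {
            _initiated.store(version, std::memory_order_release);
        }
        // Called when the reply for a version is observed; by then all side
        // effects must be visible (both above and below the SPI).
        void on_state_reply(uint32_t version) {
            _completed.store(version, std::memory_order_release);
        }
        // Bucket fetch requests are processed only once the versions have
        // converged; otherwise the request is rejected and the distributor
        // transparently retries.
        bool may_process(uint32_t request_version) const {
            uint32_t initiated = _initiated.load(std::memory_order_acquire);
            uint32_t completed = _completed.load(std::memory_order_acquire);
            return initiated == completed && request_version == completed;
        }
    };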
Was used to handle rolling upgrades between versions with different
semantics, a long time ago on the 7 branch.
proton.
* For C++ code this introduces a "document::config" namespace, which will
sometimes conflict with the global "config" namespace.
* Move all forward-declarations of the types DocumenttypesConfig and
DocumenttypesConfigBuilder to a common header file.
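
A small illustration of the kind of conflict this refers to (sketch only;
ConfigGetter is a made-up class name):

    namespace config { class ConfigGetter; }  // pre-existing global namespace
    namespace document { namespace config { class DocumenttypesConfig; } }

    namespace document {
    // Inside namespace document, an unqualified "config" now resolves to
    // document::config, so code referring to the global namespace must
    // qualify it explicitly with "::":
    using Getter = ::config::ConfigGetter;
    }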
version
There's a tiny window of time between when the bucket manager observes a new
state version and when that state version is actually visible in the rest of
the process. We must ensure that we don't process requests while these two
differ, or we might erroneously process requests for version X using a state
that is only valid for version Y < X.
Adds a configurable max number of groups (default 0) whose replica
activation is inhibited if a replica's bucket info is out of sync with a
majority of the other replicas.
Intended for the case where a group comes back up after transient
unavailability with its nodes out of sync; such replicas should preferably
not be activated until merging has completed.
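
A hedged sketch of the majority comparison this implies; the names are
illustrative, and the accounting of inhibited groups against the configured
max is elided:

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct Replica {
        uint32_t checksum; // stand-in for the replica's bucket info
    };

    // Illustrative: a candidate replica whose bucket info disagrees with a
    // majority of the other replicas should not be activated yet; it should
    // first be brought in sync by merging.
    bool majority_in_sync(const Replica& candidate,
                          const std::vector<Replica>& others) {
        std::size_t in_sync = static_cast<std::size_t>(
            std::count_if(others.begin(), others.end(),
                [&](const Replica& r) { return r.checksum == candidate.checksum; }));
        return in_sync * 2 >= others.size();
    }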
Use an array of buffer types in the array class.
that require it.
- Clean up some old members and code that are no longer used.
Abstracts away multiple underlying B-tree DBs that each hold a subset
of the super bucket space. Offers ordered iteration via a priority-queue
based view over the sub DBs.
Not yet ready for prime time, as the striping inherently requires an
absolute lower bound on the bucket bits used in the system, which is
currently not enforced.
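
The ordered iteration can be sketched as a classic k-way merge over the
per-stripe iterators; this is illustrative only, with std::map standing in
for the B-tree sub DBs:

    #include <cstdint>
    #include <functional>
    #include <map>
    #include <queue>
    #include <vector>

    // Each stripe holds a disjoint subset of the bucket key space and is
    // internally ordered (std::map as a stand-in for a B-tree here).
    using SubDb = std::map<uint64_t, int>;

    // Visit all entries across all stripes in globally ascending key order.
    void ordered_iteration(const std::vector<SubDb>& stripes,
                           const std::function<void(uint64_t, int)>& visit) {
        struct Cursor { SubDb::const_iterator pos; SubDb::const_iterator end; };
        auto greater = [](const Cursor& a, const Cursor& b) {
            return a.pos->first > b.pos->first; // min-heap on current key
        };
        std::priority_queue<Cursor, std::vector<Cursor>, decltype(greater)> heap(greater);
        for (const auto& db : stripes) {
            if (!db.empty()) heap.push({db.begin(), db.end()});
        }
        while (!heap.empty()) {
            Cursor c = heap.top();
            heap.pop();
            visit(c.pos->first, c.pos->second);
            if (++c.pos != c.end) heap.push(c); // re-insert advanced cursor
        }
    }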
The legacy bucket DB initialization logic was designed for the case
where bucket information was spread across potentially millions of
files residing on spinning rust drives. It was therefore asynchronous and
ran in parallel with client operations, adding much complexity in order to
deal with a myriad of concurrency edge cases.
Replace this with a very simple, synchronous init method that expects the
provider to have the required information readily and cheaply available.
This effectively removes the concept of a node's "initializing" state,
moving directly from reported state Down to Up.
Even though a node still technically starts up in Initializing state,
we never end up reporting this to the Cluster Controller as the DB init
completes before the RPC server stack is set up.
Legacy bucket DB initializer code will be removed in a separate pass.
Also simplify the bucket DB interface contract for mutating iteration,
stating that buckets are visited in an unspecified order.
Merge: vespa-engine/vekterli/basic-snapshot-support-for-content-node-bucket-db
Vekterli/basic snapshot support for content node bucket db
* Add working B-tree snapshot read guard impl
* Add placeholder wrapper read guard for legacy DB
* Enforce value const-ness of existing for_each_chunked iteration API
* Return read guard entries by value instead of modifying ref argument
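
A hedged sketch of the read guard idea, assuming the underlying B-tree can
hand out a frozen, immutable view; all names here are illustrative:

    #include <cstdint>
    #include <map>
    #include <memory>

    // Readers pin an immutable snapshot for the duration of the guard, so
    // concurrent writers can install new DB versions without invalidating
    // in-flight reads.
    class BucketDbReadGuard {
        std::shared_ptr<const std::map<uint64_t, uint32_t>> _frozen;
    public:
        explicit BucketDbReadGuard(
                std::shared_ptr<const std::map<uint64_t, uint32_t>> frozen)
            : _frozen(std::move(frozen)) {}

        template <typename Func>
        void for_each_chunked(Func&& f) const {
            for (const auto& kv : *_frozen) {
                f(kv.first, kv.second); // entries observed by value, as const
            }
        }
    };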
Merge: vespa-engine/vekterli/btree-bucket-db-support-on-content-node
Create generic B-tree bucket DB and content node DB implementation
Also rewrite some GMock macros that triggered Valgrind warnings
due to default test object printers accessing uninitialized memory.
This is the first stage of removing the legacy DB implementation.
Support for B-tree specific functionality such as lock-free snapshot
reads will be added soon. This commit is just for feature parity.
Abstract away the actual database implementation to allow it to be chosen
dynamically at startup. This abstraction does incur some overhead via call
indirection and type erasure of callbacks, so it will likely be removed once
the transition to the new B-tree DB has been completed.
Since the algorithms used for bucket key operations are so similar between
the content node and the distributor, a generic B-tree backed bucket
database has been created. The distributor DB will be rewritten around this
code very soon.
Due to the strong coupling between bucket locking and actual DB
implementation details, the new bucket DB has a fairly significant
code overlap with the legacy implementation. This is to avoid
spending time abstracting away and factoring out code for a
legacy implementation that is to be removed entirely anyway.
Remove existing LockableMap functionality that is unused or only used by
tests.
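
A hedged sketch of what such a dynamically chosen DB abstraction can look
like; the interface and names are illustrative, not the actual Vespa code:

    #include <cstdint>
    #include <functional>
    #include <map>
    #include <memory>

    struct BucketEntry { uint64_t key; uint32_t info; }; // illustrative entry

    // Abstract interface so the concrete DB (legacy vs. B-tree) can be
    // chosen at startup. Iteration takes a type-erased callback, which is
    // where the call-indirection overhead mentioned above comes from.
    class BucketDatabase {
    public:
        virtual ~BucketDatabase() = default;
        virtual void update(const BucketEntry& entry) = 0;
        virtual void for_each(
                const std::function<void(const BucketEntry&)>& fn) const = 0;
    };

    // Toy stand-in for the B-tree backed implementation.
    class BTreeBucketDatabase : public BucketDatabase {
        std::map<uint64_t, BucketEntry> _entries; // ordered, like a B-tree
    public:
        void update(const BucketEntry& e) override { _entries[e.key] = e; }
        void for_each(
                const std::function<void(const BucketEntry&)>& fn) const override {
            for (const auto& kv : _entries) fn(kv.second);
        }
    };

    // Chosen once at startup; callers only ever see the interface.
    std::unique_ptr<BucketDatabase> make_bucket_db(bool use_btree) {
        if (use_btree) return std::make_unique<BTreeBucketDatabase>();
        return std::make_unique<BTreeBucketDatabase>(); // legacy impl elided
    }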
- Let default bucket iteration work in smaller chunks with shorter waits.
Needs to be rewritten or discarded.
Simulates added request latency caused by the BucketManager computing bucket
ownership for a very large number of buckets.
Fetched at BucketManager init only, so not a dynamic config. This is only
meant for internal testing, so it should not have any practical
consequences.
Move test config helpers out of cppunit submodule.
Move base message sender stub out to common test module to
avoid artificial dependency from persistence tests to the
distributor tests.
Remove convoluted thread stress test which didn't actually _verify_
any kind of correctness (aside from the test not outright crashing).
Still some residual vdstestlib CppUnit traces that will need
cleaning up later.
This makes it possible to run storage tests in parallel.