| Commit message | Author | Age | Files | Lines |
vespa-engine/vekterli/btree-bucket-db-support-on-content-node
Create generic B-tree bucket DB and content node DB implementation
Also rewrite some GMock macros that triggered Valgrind warnings
due to default test object printers accessing uninitialized memory.
This is the first stage of removing the legacy DB implementation.
Support for B-tree specific functionality such as lock-free snapshot
reads will be added soon. This commit is just for feature parity.
Abstract away the actual database implementation so that it can
be chosen dynamically at startup. This abstraction does incur
some overhead via call indirection and type erasure of callbacks,
so it will likely be removed once the transition to the
new B-tree DB has been completed.
Since the algorithms used for bucket key operations are so similar
between the content node and distributor, a generic B-tree backed
bucket database has been created. The distributor DB will be rewritten
around this code very soon.
Due to the strong coupling between bucket locking and actual DB
implementation details, the new bucket DB has a fairly significant
code overlap with the legacy implementation. This is to avoid
spending time abstracting away and factoring out code for a
legacy implementation that is to be removed entirely anyway.
Remove existing LockableMap functionality that is unused or
only used by tests.
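As a rough illustration of the abstraction described above, the sketch below shows a backend-agnostic interface whose concrete implementation is picked at startup. All names (`BucketDatabase`, `BTreeBucketDatabase`, `make_bucket_db`) are hypothetical stand-ins, not the actual Vespa classes; the `std::function` parameter shows the kind of callback type erasure that adds call overhead.

```cpp
// Minimal sketch only; all names are hypothetical, not the actual Vespa classes.
#include <cstdint>
#include <functional>
#include <map>
#include <memory>

struct BucketInfo {
    uint32_t doc_count = 0; // stand-in for real replica metadata
};

// Backend-agnostic interface; the concrete DB is selected once at startup.
// The std::function parameter is the type-erased callback that costs an extra
// indirection compared to a statically typed call.
class BucketDatabase {
public:
    virtual ~BucketDatabase() = default;
    virtual void update(uint64_t bucket_key, const BucketInfo& info) = 0;
    virtual void remove(uint64_t bucket_key) = 0;
    virtual void for_each(const std::function<void(uint64_t, const BucketInfo&)>& fn) const = 0;
};

// Stand-in for the B-tree backed implementation (an ordered map for brevity).
class BTreeBucketDatabase final : public BucketDatabase {
    std::map<uint64_t, BucketInfo> _entries;
public:
    void update(uint64_t key, const BucketInfo& info) override { _entries[key] = info; }
    void remove(uint64_t key) override { _entries.erase(key); }
    void for_each(const std::function<void(uint64_t, const BucketInfo&)>& fn) const override {
        for (const auto& [key, info] : _entries) {
            fn(key, info);
        }
    }
};

// Chosen dynamically at startup, e.g. from config. The legacy implementation
// would be returned from the other branch while it still exists.
std::unique_ptr<BucketDatabase> make_bucket_db(bool use_btree) {
    if (use_btree) {
        return std::make_unique<BTreeBucketDatabase>();
    }
    return nullptr; // placeholder: legacy LockableMap-backed implementation
}
```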
If the newest document version is a tombstone, behave
as if the document was not found at all. Since we still
track replica consistency, this should work as expected
for multi-phase update operations as well.
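A minimal sketch of that lookup rule, with hypothetical names (`VersionedDoc`, `resolve_get`) rather than the actual content node code:

```cpp
// Illustrative only; not the real content node API.
#include <cstdint>
#include <optional>

enum class DocStatus { Found, NotFound };

struct VersionedDoc {
    uint64_t timestamp = 0;
    bool is_tombstone = false; // newest version is a remove entry
};

// The newest version being a tombstone is reported exactly like "not found";
// its timestamp can still be used for replica consistency tracking.
DocStatus resolve_get(const std::optional<VersionedDoc>& newest) {
    if (!newest.has_value() || newest->is_tombstone) {
        return DocStatus::NotFound;
    }
    return DocStatus::Found;
}
```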
- Let default bucket iteration work in smaller chunks with shorter waits.
Implement async remove.
vekterli/remove-deprecated-bucket-disk-move-functionality
Conflicts:
storage/src/tests/persistence/diskmoveoperationhandlertest.cpp
The notion of multiple disks hasn't been supported since we
removed VDS, and likely won't be in the future either.
- Move result processing to MessageTracker
- Wire putAsync through the provider error wrapper too (see the sketch below).
- Handle both sync and async replies in tests.
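A sketch of how one reply path can serve both synchronous and asynchronous completion; `MessageTracker` and `putAsync` here are simplified stand-ins, not the real classes or signatures:

```cpp
// Simplified stand-ins; not the actual MessageTracker or provider API.
#include <functional>
#include <memory>
#include <string>
#include <utility>

struct Reply {
    bool success = true;
    std::string message;
};

// Owns the operation context and produces exactly one reply, regardless of
// whether the provider completed synchronously or via a callback.
class MessageTracker {
    std::function<void(const Reply&)> _send;
    Reply _reply;
public:
    explicit MessageTracker(std::function<void(const Reply&)> send) : _send(std::move(send)) {}
    void fail(std::string msg) { _reply = Reply{false, std::move(msg)}; }
    void sendReply() { _send(_reply); }
};

// The provider (wrapped by an error-translating layer in the real code)
// invokes the completion function when the write is done.
void putAsync(std::shared_ptr<MessageTracker> tracker,
              const std::function<void(std::function<void(bool)>)>& provider_put) {
    provider_put([tracker](bool ok) {
        if (!ok) {
            tracker->fail("put failed in persistence provider");
        }
        tracker->sendReply();
    });
}
```

A synchronous provider can simply invoke the completion function before returning, which is how tests can exercise both sync and async replies through the same code path.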
Not in use after VDS was removed.
- Use MessageTracker for keeping context.
- Implement putAsync, but still use it synchronously.
vespa-engine/vekterli/optimize-btree-find-parents-with-fix
Optimize B-tree bucket DB lookup with used-bits aggregation
By tracking the minimum used bits count across all buckets in
the database we can immediately start seeking at that implicit
level in the tree, as we know no parent buckets can exist above
that level.
Local synthetic benchmarking shows the following results with a
DB size of 917504 buckets and performing getParents for all
buckets in sequence:
Before optimization:
- B-tree DB: 0.593321 seconds
- Legacy DB: 0.227947 seconds
After optimization:
- B-tree DB: 0.191971 seconds
- Legacy DB: (unchanged)
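The gist of the optimization, in a heavily simplified form; the real DB packs used bits and bucket bits into a single 64-bit key, so the names and key layout below are illustrative only:

```cpp
// Illustrative sketch; not the actual Vespa B-tree bucket DB code.
#include <algorithm>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Key a bucket by (used_bits, masked id) for clarity; used_bits is assumed
// to be in [1, 58] as for real bucket ids.
using BucketKey = std::pair<uint32_t, uint64_t>;

struct BucketDb {
    std::map<BucketKey, int> entries;
    uint32_t min_used_bits = 58; // maintained on every insert

    void insert(uint32_t used_bits, uint64_t raw_id) {
        min_used_bits = std::min(min_used_bits, used_bits);
        entries[{used_bits, raw_id & ((1ULL << used_bits) - 1)}] = 0;
    }

    // Find all buckets that contain (are equal to or parents of) the target.
    std::vector<BucketKey> getParents(uint32_t target_used_bits, uint64_t raw_id) const {
        std::vector<BucketKey> result;
        // Start seeking at min_used_bits instead of at the lowest possible
        // level: no bucket can exist above it, so those probes are wasted.
        for (uint32_t bits = min_used_bits; bits <= target_used_bits; ++bits) {
            BucketKey candidate{bits, raw_id & ((1ULL << bits) - 1)};
            if (entries.count(candidate) != 0) {
                result.push_back(candidate);
            }
        }
        return result;
    }
};
```

When all buckets are split to roughly the same level, starting at the tracked minimum removes most of the wasted probes, which is what the benchmark numbers above reflect.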
operations easier to implement.
Avoid state in the thread.
By tracking the minimum used bits count across all buckets in
the database we can immediately start seeking at that implicit
level in the tree, as we know no parent buckets can exist above
that level.
Local synthetic benchmarking shows the following results with a
DB size of 917504 buckets and performing getParents for all
buckets in sequence:
Before optimization:
- B-tree DB: 0.593321 seconds
- Legacy DB: 0.227947 seconds
After optimization:
- B-tree DB: 0.213738 seconds
- Legacy DB: (unchanged)
If requests or responses from external sources are constantly being
processed as part of the distributor tick, allow maintenance scanning
to be skipped for up to N consecutive ticks, where N is configurable.
This reduces the amount of CPU time spent on maintenance operations
when the node has a lot of incoming data to deal with.
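A rough sketch of the resulting tick logic; the member names and the way N is configured are made up for illustration:

```cpp
// Illustrative only; names and the handling of "N" are not from the actual
// distributor code.
#include <cstdint>

class DistributorTickLoop {
    uint32_t _max_consecutive_skips;  // "N" from the description, configurable
    uint32_t _consecutive_skips = 0;
    uint32_t _pending_external = 0;   // stand-in for queued requests/responses

    // Returns true if any external request or response was processed this tick.
    bool processExternalWork() {
        if (_pending_external == 0) {
            return false;
        }
        --_pending_external;
        return true;
    }

    void scanBucketsForMaintenance() {
        // ... iterate part of the bucket space and schedule maintenance ops ...
    }

public:
    explicit DistributorTickLoop(uint32_t max_skips) : _max_consecutive_skips(max_skips) {}

    void enqueueExternal() { ++_pending_external; }

    void tick() {
        const bool did_external_work = processExternalWork();
        // Prioritize externally generated load: skip the maintenance scan for
        // up to N consecutive ticks while external work keeps arriving.
        if (did_external_work && _consecutive_skips < _max_consecutive_skips) {
            ++_consecutive_skips;
            return;
        }
        _consecutive_skips = 0;
        scanBucketsForMaintenance();
    }
};
```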
Bucket DB updating happened unconditionally anyway; this was
only used for failing operations in an overly pessimistic way.
Removing this lookup has two benefits:
- Less CPU spent in DB
- Less expected impact on feeding during node state transitions,
since fewer operations will have to be needlessly retried by the client.
Rationale: an operation towards a given bucket completes (i.e.
is ACKed by all its replica nodes) at time t and the bucket is
removed from the DB at time T. There is no fundamental change
in correctness or behavior from the client's perspective if
the order of events is tT or Tt. Both are equally valid, as
the state transition edge happens independently of any reply
processing.
B-tree/datastore stats can be expensive to sample, so don't do
this after every full DB iteration. For now, wait at least 30s.
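The throttling itself can be as simple as the sketch below, using std::chrono in place of Vespa's own clock types; the class and method names are illustrative:

```cpp
// Illustrative sketch; not the actual metric updater code.
#include <chrono>

class DbStatsSampler {
    using Clock = std::chrono::steady_clock;
    Clock::time_point _last_sample{};
    std::chrono::seconds _min_interval{30};

public:
    // Called after each full DB iteration; only the (expensive) B-tree and
    // datastore statistics sampling is gated behind this check.
    bool shouldSampleNow(Clock::time_point now = Clock::now()) {
        if (now - _last_sample < _min_interval) {
            return false;
        }
        _last_sample = now;
        return true;
    }
};
```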
vespa-engine/vekterli/add-distributor-bucket-db-memory-usage-metrics
Add distributor bucket db memory usage metrics
Reuses the old update-get metric for the single full Get sent
after the initial metadata-only phase. Adds a new metric set for
the initial metadata Gets.
If bucket replicas are inconsistent, the common case is that only
a small subset of the documents contained in the buckets is actually
mutually out of sync. The added metadata phase optimizes for
such a case by initially sending Get requests to all divergent
replicas that only ask for the timestamps (and no fields). This
is a very cheap and fast operation. If all returned timestamps
are in sync, the update can be restarted in the fast path.
Otherwise, a full Get will _only_ be sent to the newest replica,
and its result will be used for performing the update on the
distributor itself, before pushing out the result as Puts.
This is in contrast to today's behavior where full Gets are
sent to all replicas. For users with large documents this
can be very expensive.
In addition, the metadata Get operations are sent with weak
internal read consistency (as they do not need to read any
previously written, possibly in-flight fields). This lets them
bypass the main commit queue entirely.
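A simplified sketch of the decision made once the metadata-only Gets have returned; the types and names are illustrative, not the distributor's real API, and at least one reply is assumed:

```cpp
// Illustrative only; not the actual distributor update operation code.
#include <cstdint>
#include <vector>

struct MetadataReply {
    uint16_t content_node = 0;
    uint64_t newest_timestamp = 0; // 0 if the document is absent on the replica
};

enum class NextPhase {
    RestartInFastPath, // all replicas agree; run the ordinary update path
    FullGetFromNewest  // fetch the whole document from the newest replica only
};

struct PhaseDecision {
    NextPhase phase = NextPhase::RestartInFastPath;
    uint16_t node_with_newest = 0;
};

PhaseDecision decide_next_phase(const std::vector<MetadataReply>& replies) {
    PhaseDecision decision;
    decision.node_with_newest = replies.front().content_node;
    uint64_t newest = 0;
    bool all_equal = true;
    for (const auto& reply : replies) {
        if (reply.newest_timestamp != replies.front().newest_timestamp) {
            all_equal = false;
        }
        if (reply.newest_timestamp > newest) {
            newest = reply.newest_timestamp;
            decision.node_with_newest = reply.content_node;
        }
    }
    decision.phase = all_equal ? NextPhase::RestartInFastPath
                               : NextPhase::FullGetFromNewest;
    return decision;
}
```

Only the FullGetFromNewest branch pays the cost of a full document read, and then only against a single replica.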
New distributor metric available as:
```
vds.idealstate.garbage_collection.documents_removed
```
Add documents removed statistics to `RemoveLocation` responses,
which is what GC is currently built around. Could technically have
been implemented as a diff of before/after BucketInfo, but GC is
very low priority, so many other mutating ops may have changed the
bucket document set in the time span between sending the GC ops
and receiving the replies.
This relates to issue #12139
Send spec for client connections to enable SNI as well as server name
verification.
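For reference, enabling SNI together with server name verification on a client connection looks roughly like this in plain OpenSSL terms; Vespa uses its own TLS stack, so this is only an illustration of the concept, not the actual implementation:

```cpp
// Conceptual illustration using OpenSSL; not Vespa's TLS implementation.
#include <openssl/ssl.h>

bool configure_client_tls(SSL* ssl, const char* host_name) {
    // Send the server name in the TLS ClientHello (SNI).
    if (SSL_set_tlsext_host_name(ssl, host_name) != 1) {
        return false;
    }
    // Require that the certificate presented by the server matches that name.
    if (SSL_set1_host(ssl, host_name) != 1) {
        return false;
    }
    SSL_set_verify(ssl, SSL_VERIFY_PEER, nullptr);
    return true;
}
```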
Balder/reduce bytebuffer exposure