path: root/storage
Commit message    Author    Age    Files    Lines
* Re-enable cluster state transition optimization    Tor Brede Vekterli    2019-05-21    1    -1/+11
|   Made side effects of replica resurrection bug more likely to be observed, but not in itself a believed root cause of any issues.
* Avoid resurrecting replicas for nodes that are unavailable in pending state    Tor Brede Vekterli    2019-05-21    10    -332/+278
|   We previously only checked for node availability in the _active_ state without looking at the pending state. This opened up for a race condition where a reply for a previously DB-pruned node could bring a replica back in the DB iff received during a pending state window.
|   Consider Maintenance as unavailable for this case, not just Down.
|   Also move all `PutOperation` tests to GTest.
* Avoid recomputing bucket keys during sorting step    Tor Brede Vekterli    2019-05-15    3    -18/+68
|   Buckets are sorted in increasing key order before being merged into the database. Instead of repeatedly calling `toKey()` on entries, just store the key verbatim.
|   Add a (by default disabled) benchmarking test for this case. With the bucket key change, running this locally brings total merge time for 917K buckets down from 2 seconds to 1.3 seconds.
* Disable bucket DB pruning elision optimization for now    Tor Brede Vekterli    2019-05-14    2    -11/+4
|   There may be a currently unknown edge case where some valid edges do not trigger pruning as they should, so disable optimization entirely until we know for sure.
* Merge pull request #9341 from vespa-engine/vekterli/add-distributor-btree-bucket-database-foundations    Tor Brede Vekterli    2019-05-13    19    -180/+670
|\  Add initial B+tree distributor bucket database
| * Add initial B+tree distributor bucket database    Tor Brede Vekterli    2019-05-09    19    -180/+670
| |   Still uses legacy `BucketDatabase` API, which is not optimized for bulk loading or updating. Focus for this iteration is functional correctness rather than API redesign. Legacy DB is still the one wired in for all production logic.
| |   Unit tests have been expanded to cover discovered edge cases that were not properly tested for.
| |   Also move distributor bucket DB tests to GTest. Use value-parameterized test fixture instead of ad-hoc CppUnit approach.
* | Simplify the supervisor responsibility    Henning Baldersheim    2019-05-10    3    -22/+27
|/
* Elide bucket DB pruning scans when possible    Tor Brede Vekterli    2019-04-29    8    -10/+275
|   Only do a (potentially expensive) pruning pass over the bucket DB when the cluster state transition indicates that one or more nodes are not in the same availability-state as the currently active state. For instance, if a cluster state change is for a content node going from Maintenance to Down, no pruning action is required since that shall already have taken place during the processing of the initial Down edge (and no buckets shall have been created for it in the meantime).
|   We do not currently attempt to elide distribution config changes, as these happen much more rarely than cluster state changes.
* Cache super bucket ownership decisions when processing bucket DB    Tor Brede Vekterli    2019-04-25    2    -12/+46
|   Distributor bucket ownership is assigned on a per superbucket basis, so all buckets with the same superbucket can use the same decision. The bucket DB is explicitly ordered in such a way that all buckets belonging to the same superbucket are ordered after each other, so we need only maintain O(1) extra state for this.
* Add metrics around bucket DB pruning and merging phases of state transitions    Tor Brede Vekterli    2019-04-12    3    -3/+21
|
* Limit number of persistence threads that can process merges in parallel    Tor Brede Vekterli    2019-04-03    5    -8/+50
|   Avoids starving other operations when there is a lot of merge activity taking place. For now, 1/2 of the total persistence thread pool may process merges.
* Convert BucketDBUpdaterTest from CppUnit to GTest    Tor Brede Vekterli    2019-03-27    3    -896/+597
|
* Merge pull request #8882 from vespa-engine/vekterli/add-read-only-support-during-cluster-state-transitions    Tor Brede Vekterli    2019-03-26    32    -151/+962
|\  Add read-only support during cluster state transitions
| * Address code review feedback for distributor changes    Tor Brede Vekterli    2019-03-26    2    -7/+15
| |
| * Minor C++ cleanups    Tor Brede Vekterli    2019-03-22    6    -7/+8
| |
| * Always allow activation commands through bouncer component    Tor Brede Vekterli    2019-03-20    2    -0/+15
| |   Otherwise we'd miss activation commands sent for a cluster state in which our own node is marked down.
| * Test more BucketDBUpdater two-phase transition edge cases    Tor Brede Vekterli    2019-03-20    3    -58/+98
| |
| * Properly handle non-owned vs. missing buckets    Tor Brede Vekterli    2019-03-15    9    -52/+250
| |   Bonus: no more spurious "we have removed buckets" log messages caused by ownership changes.
| |   Also ensure that we BUSY-bounce operations in `ExternalOperationHandler` when there is no actual state to send back in a `WrongDistributionReply`.
| * WIP on BucketDBUpdater explicit activation support    Tor Brede Vekterli    2019-03-14    6    -7/+115
| |
| * Basic handling of activate_cluster_state_version RPC in backend    Tor Brede Vekterli    2019-03-14    7    -14/+122
| |
| * Move non-owned buckets to read-only DB and allow use for read-only ops    Tor Brede Vekterli    2019-03-14    12    -66/+321
| |
| * Add read-only bucket space repo and wire it through distributor components    Tor Brede Vekterli    2019-03-14    15    -51/+129
| |
* | include content length in http response    Håvard Pettersen    2019-03-26    1    -0/+3
| |
* | Revert "include content length in http response"    Harald Musum    2019-03-25    1    -3/+0
| |
* | include content length in http response    Håvard Pettersen    2019-03-25    1    -0/+3
| |
* | Revert typecasting of variables sent to JsonStream, instead assume that    Tor Egge    2019-03-15    1    -2/+2
| |   JsonStream will get overloads for the relevant fundamental types.
* | Adjust types in storage module.    Tor Egge    2019-03-14    6    -20/+20
| |
* | Adjust build setup for Darwin.    Tor Egge    2019-03-14    1    -1/+1
|/
* cinttypes must be included before Jydy.h.    Tor Egge    2019-03-13    1    -0/+1
|
* Fix format strings in storage module.    Tor Egge    2019-03-12    16    -39/+39
|
* Add '()' to macro definition.    Geir Storli    2019-03-01    1    -1/+1
|
* Simplify.    Geir Storli    2019-03-01    1    -1/+0
|
* Merge pull request #8616 from vespa-engine/vekterli/log-bucket-info-before-and-after-on-update-inconsistency    Geir Storli    2019-02-27    2    -6/+20
|\  Log before/after bucket info for when update operation inconsistency is discovered
| * Log before/after bucket info for when update operation inconsistency is discovered    Tor Brede Vekterli    2019-02-26    2    -6/+20
| |   Makes it more obvious if the inconsistency is likely due to e.g. a checksum collision.
* | Eliminate some gcc 9 warnings.    Tor Egge    2019-02-25    1    -0/+2
|/
* Merge pull request #8588 from vespa-engine/vekterli/do-not-bruteforce-abort-client-ops-during-orchestrated-down    Tor Brede Vekterli    2019-02-25    6    -36/+70
|\  Fail client ops gracefully when distributor is marked down
| * Fail client ops gracefully when distributor is marked down    Tor Brede Vekterli    2019-02-22    6    -36/+70
| |   Previously, clients would only receive `ABORTED` when the distributor was marked down by orchestration. This would simply cause the client to resend until either the `StoragePolicy` would discard the cluster state entirely and retry against a working distributor, or the operations would time out.
| |   Now they will receive a `WrongDistributionReply` that shall immediately update the `StoragePolicy` to avoid sending to the distributor that has been marked down.
| |   Also add a separate metric for number of operations aborted by `Bouncer`.
| |   This fixes #8448.
* | Reduce code duplication in gtest runners.    Geir Storli    2019-02-22    1    -8/+2
|/
* Add workarounds for legacy global distribution hash handling    Tor Brede Vekterli    2019-02-21    9    -21/+277
|   This addresses a regression introduced as part of #8479, which in turn was intended to serve as a fix for issue #8475. This regression would stall cluster state convergence when a subset of nodes contained the fix and another subset did not.
|   With the workarounds present, nodes gracefully handle the case where different distribution hashes are expected for the global bucket space. `BucketManager` will now fall back to comparing the new incoming hash to that of the legacy derived distribution config if it mismatches. `PendingClusterState` will try to send a subset of bucket info requests with legacy hash format for the global bucket space iff there has been at least 1 rejected request.
|   All these workarounds will be removed on Vespa 8.
* Stop running storage unit tests in parallel, the tests can interfere    Tor Egge    2019-02-20    1    -3/+1
|   with each other, cf. PersistenceTestUtils::setupDisks().
* Use ASSERT_EQ when checking vector sizes.    Geir Storli    2019-02-18    1    -4/+4
|
* Add gtest runner in storage and migrate bucketmovertest from CppUnit to gtest.    Geir Storli    2019-02-18    7    -60/+61
|
* Derive correct distribution partition spec for grouped clusters    Tor Brede Vekterli    2019-02-12    2    -27/+31
|   Simplify code by emitting wildcards for all groups instead of using explicit leaf counts. Distribution code will distribute replicas evenly across all wildcarded groups.
|   This fixes #8475
* Merge pull request #8443 from vespa-engine/vekterli/add-per-bucket-space-data-metrics-on-content-node    Tor Brede Vekterli    2019-02-11    6    -45/+159
|\  Expose data metrics per bucket space on content node
| * Use non-generic dimension name for bucket spaces    Tor Brede Vekterli    2019-02-11    1    -1/+1
| |
| * Expose data metrics per bucket space on content node    Tor Brede Vekterli    2019-02-08    6    -45/+159
| |   Legacy metrics that cover all bucket spaces remain unchanged.
* | Eliminate some clang warnings in storage.    Tor Egge    2019-02-10    13    -48/+8
|/
* Update metric descriptions    Tor Brede Vekterli    2019-02-07    1    -2/+2
|
* Rename and restructure C++ TLS metrics    Tor Brede Vekterli    2019-02-07    2    -20/+31
|   - Use dashes instead of underscores
|   - Explicitly separate client/server metrics in metric path
* Append node identity to response messages sent by Bouncer component    Tor Brede Vekterli    2019-02-04    3    -6/+16
|