path: root/storage
Commit log (each entry lists commit message, author, date, files changed, and lines removed/added):

* Promote log level for DB pruning elision decisions
  Tor Brede Vekterli, 2019-06-28 (1 file, -1/+3)

* Emit single-shot warning for phantom bucket replicas
  Tor Brede Vekterli, 2019-06-28 (2 files, -15/+27)
  Adding this to see if it triggers on any of our internal tests.

* Remove CppUnit dependencies in modules
  Tor Brede Vekterli, 2019-06-26 (9 files, -44/+7)
  Move test config helpers out of the cppunit submodule.

* Don't require gmock library linkage
  Tor Brede Vekterli, 2019-06-25 (2 files, -2/+4)

* Convert remaining CppUnit tests to GTest
  Tor Brede Vekterli, 2019-06-25 (63 files, -5505/+3421)
  Move base message sender stub out to common test module to avoid artificial
  dependency from persistence tests to the distributor tests.
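
The CppUnit-to-GTest conversions above and below all follow the same mechanical pattern. A minimal sketch of what one such conversion ends up looking like, using a hypothetical type and test name rather than code from the repository:

```cpp
#include <gtest/gtest.h>

// Hypothetical type under test; stands in for whatever the converted suite covers.
struct BucketInfo {
    unsigned checksum = 42;
    unsigned getChecksum() const { return checksum; }
};

// CppUnit required a fixture class plus CPPUNIT_TEST_SUITE()/CPPUNIT_TEST()
// registration macros and CPPUNIT_ASSERT_EQUAL(); in GTest a free-standing
// TEST() with EXPECT_/ASSERT_ macros replaces all of that boilerplate.
TEST(BucketInfoTest, checksum_is_reported) {
    BucketInfo info;
    EXPECT_EQ(42u, info.getChecksum());
}
```

GTest needs no suite registration macros, so each converted file tends to shrink, which is roughly what the large negative line counts in these commits reflect.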

* Convert storageserver and visiting tests from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-14 (21 files, -2422/+1587)

* Convert persistence tests from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-12 (25 files, -2573/+1254)

* Convert BucketOwnershipNotifierTest from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-12 (2 files, -42/+18)

* Convert StatusTest from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-12 (3 files, -52/+21)

* Convert tests in 'common' module from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-12 (9 files, -188/+105)
  Still builds shared non-test library.

* Add missing includes.
  Tor Egge, 2019-06-11 (12 files, -0/+12)

* Convert LockableMapTest from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-07 (3 files, -629/+199)
  Remove convoluted thread stress test which didn't actually _verify_ any kind
  of correctness (aside from the test not outright crashing).

* Convert JudyMultiMapTest from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-07 (3 files, -72/+51)

* Convert JudyArrayTest from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-07 (3 files, -166/+96)

* Convert BucketManagerTest and InitializerTest to GTest
  Tor Brede Vekterli, 2019-06-07 (4 files, -769/+250)
  Still some residual vdstestlib CppUnit traces that will need cleaning up later.

* Convert BucketInfoTest from CppUnit to GTest
  Tor Brede Vekterli, 2019-06-06 (2 files, -106/+46)

* Do not block scanning all buckets on first cluster state edge
  Tor Brede Vekterli, 2019-06-06 (3 files, -10/+16)
  Scanning all buckets is expensive and blocks the main distributor thread
  during the entire process. Instead, use the standard "recovery mode"
  functionality that is triggered for all subsequent state transitions.
  Recovery mode scans allow client operations to be scheduled alongside the
  scanning, but still try to scan the DB as quickly as possible. There
  shouldn't be anything that special about the first cluster state that
  implies a full scan is inherently required.

* Merge pull request #9657 from vespa-engine/vekterli/more-efficient-bucket-db-bulk-apis
  Tor Brede Vekterli, 2019-06-05 (21 files, -604/+1119)
  Add new DB merging API to distributor BucketDatabase.

  * Add comments to new DB merge functionality
    Tor Brede Vekterli, 2019-06-04 (1 file, -2/+88)

  * Remove pointless typedef alias
    Tor Brede Vekterli, 2019-06-03 (1 file, -2/+0)

  * Add new DB merging API to distributor BucketDatabase
    Tor Brede Vekterli, 2019-06-03 (21 files, -604/+1035)
    Abstracts away how an ordered merge may be performed with the database and
    an arbitrary sorted bucket sequence, with any number of buckets skipped,
    updated or inserted as part of the merge. Such an API is required to allow
    efficient bulk updates of a B-tree backed database, as it is suboptimal to
    require constant tree mutations.

    Other changes:
    - Removed legacy mutable iteration API. Not needed with new merge API.
    - Const-iteration of bucket database now uses an explicit const reference
      entry type to avoid needing to construct a temporary entry when we can
      instead just point directly into the backing ArrayStore.
    - Micro-optimizations of node remover pass to avoid going via cluster
      state's node state std::map for each bucket replica entry. Now uses a
      precomputed bit vector. Also avoid BucketId bit reversing operations as
      much as possible by using raw bucket keys in more places.
    - Changed wording and contents of log message that triggers when buckets
      are removed from the DB due to no remaining nodes containing replicas
      for the bucket. Now more obvious what the message actually means.
    - Added several benchmark tests (disabled by default).
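
The bulk-merge API described above amounts to a single ordered pass in which a caller-supplied processor decides, per existing entry, whether to keep, update or remove it, and can insert buckets that are missing. A simplified sketch of that shape, with illustrative names that do not mirror the actual Vespa interfaces:

```cpp
#include <cstdint>
#include <map>

// Illustrative stand-ins for the real bucket DB entry and key types.
struct Entry { uint64_t key; int replicas; };

struct MergingProcessor {
    enum class Result { KeepUnchanged, Update, Remove };
    // Invoked once per existing DB entry, in ascending key order; the
    // processor may mutate the entry in place when returning Update.
    virtual Result merge(Entry& existing) = 0;
    // Invoked once at the end so the processor can add buckets that were not
    // already present in the database.
    virtual void insert_remaining(std::map<uint64_t, Entry>& db) = 0;
    virtual ~MergingProcessor() = default;
};

// Single ordered pass over the DB; a B-tree backed implementation can batch
// these decisions instead of mutating the tree once per bucket.
inline void merge(std::map<uint64_t, Entry>& db, MergingProcessor& proc) {
    for (auto it = db.begin(); it != db.end();) {
        if (proc.merge(it->second) == MergingProcessor::Result::Remove) {
            it = db.erase(it);
        } else {
            ++it;
        }
    }
    proc.insert_remaining(db);
}
```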

* Create gtest runner per test sub-module.
  Geir Storli, 2019-06-04 (31 files, -17/+226)
  This makes it possible to run storage tests in parallel.

* Remove storage dependency on searchlib
  Tor Brede Vekterli, 2019-05-28 (1 file, -1/+0)
  Not needed now that B-tree code has been moved to vespalib.

* Move datastore and btree code from searchlib to vespalib
  Tor Brede Vekterli, 2019-05-27 (2 files, -12/+12)
  Namespace is still `search` and not `vespalib` due to the massive amount of
  code that would need to be modified for such a change.

  Other changes:
  - Move `BufferWriter` from searchlib to vespalib
  - Move assertion and rand48 utilities from staging_vespalib to vespalib
  - Move gtest utility code from staging_vespalib to vespalib

* Bounce Puts when a node is unavailable in the pending cluster state
  Tor Brede Vekterli, 2019-05-24 (3 files, -0/+41)
  Avoids scheduling Put operations towards nodes that are available in the
  current cluster state, but not in the one that is being async processed in
  the background. Not only are operations to such nodes highly likely to be
  doomed in the first place, we also risk triggering assertion failures by
  code that does not expect bucket DB inserts to said nodes to ever be
  rejected. Since we recently added explicit rejection of inserts to nodes
  that are unavailable in the pending state, these assertions have the
  potential for triggering in certain edge case scenarios.
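
A minimal sketch of the routing decision this describes, with hypothetical types standing in for the distributor's cluster states and reply codes:

```cpp
#include <cstdint>
#include <optional>
#include <unordered_set>

// Hypothetical, heavily simplified cluster state: just the set of content
// nodes currently considered available for writes.
struct ClusterState {
    std::unordered_set<uint16_t> available;
    bool node_available(uint16_t node) const { return available.count(node) != 0; }
};

enum class PutDisposition { Send, Bounce };

PutDisposition route_put(uint16_t node,
                         const ClusterState& current,
                         const std::optional<ClusterState>& pending) {
    if (!current.node_available(node)) {
        return PutDisposition::Bounce;
    }
    // The node is up in the current state but unavailable in the state being
    // applied in the background: the write is likely doomed, and a later
    // bucket DB insert for that node would be rejected, so bounce it up front.
    if (pending.has_value() && !pending->node_available(node)) {
        return PutDisposition::Bounce;
    }
    return PutDisposition::Send;
}
```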

* Re-enable cluster state transition optimization
  Tor Brede Vekterli, 2019-05-21 (1 file, -1/+11)
  The optimization made side effects of the replica resurrection bug more
  likely to be observed, but is not in itself believed to be a root cause of
  any issues.

* Avoid resurrecting replicas for nodes that are unavailable in pending state
  Tor Brede Vekterli, 2019-05-21 (10 files, -332/+278)
  We previously only checked for node availability in the _active_ state
  without looking at the pending state. This opened up a race condition where
  a reply for a previously DB-pruned node could bring a replica back into the
  DB iff received during a pending state window. Consider Maintenance as
  unavailable for this case, not just Down. Also move all `PutOperation`
  tests to GTest.
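
A sketch of the availability predicate implied here, consulting both the active and the pending state and treating Maintenance like Down; the enum and function names are illustrative, not the storage API:

```cpp
enum class NodeState { Up, Retired, Initializing, Maintenance, Down };

// Maintenance is treated as unavailable here, not just Down, so a reply
// received inside a pending-state window cannot resurrect a pruned replica.
inline bool node_unavailable(NodeState s) {
    return s == NodeState::Down || s == NodeState::Maintenance;
}

// A replica reported back to the distributor may only be (re)inserted into
// the bucket DB if the node is usable in the active state and, when a state
// transition is in flight, also in the pending state.
inline bool may_reinsert_replica(NodeState in_active, const NodeState* in_pending) {
    if (node_unavailable(in_active)) return false;
    if (in_pending != nullptr && node_unavailable(*in_pending)) return false;
    return true;
}
```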

* Avoid recomputing bucket keys during sorting step
  Tor Brede Vekterli, 2019-05-15 (3 files, -18/+68)
  Buckets are sorted in increasing key order before being merged into the
  database. Instead of repeatedly calling `toKey()` on entries, just store the
  key verbatim. Add a (by default disabled) benchmarking test for this case.
  With the bucket key change, running this locally brings total merge time for
  917K buckets down from 2 seconds to 1.3 seconds.
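
The change boils down to paying for the key conversion once per entry instead of once per comparison. A sketch under that assumption (the `toKey()` body here is a trivial placeholder for the real bit-reversing conversion):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct BucketId {
    uint64_t raw;
    uint64_t toKey() const { return raw; }  // placeholder for the real bit reversal
};

struct KeyedEntry {
    uint64_t key;      // computed once when the entry is built
    BucketId bucket;
};

inline void sort_for_merge(std::vector<KeyedEntry>& entries) {
    // The comparator only reads the stored key, so toKey() is called n times
    // in total instead of O(n log n) times inside the sort.
    std::sort(entries.begin(), entries.end(),
              [](const KeyedEntry& a, const KeyedEntry& b) { return a.key < b.key; });
}
```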

* Disable bucket DB pruning elision optimization for now
  Tor Brede Vekterli, 2019-05-14 (2 files, -11/+4)
  There may be a currently unknown edge case where some valid edges do not
  trigger pruning as they should, so disable the optimization entirely until
  we know for sure.

* Merge pull request #9341 from vespa-engine/vekterli/add-distributor-btree-bucket-database-foundations
  Tor Brede Vekterli, 2019-05-13 (19 files, -180/+670)
  Add initial B+tree distributor bucket database.

  * Add initial B+tree distributor bucket database
    Tor Brede Vekterli, 2019-05-09 (19 files, -180/+670)
    Still uses the legacy `BucketDatabase` API, which is not optimized for
    bulk loading or updating. Focus for this iteration is functional
    correctness rather than API redesign. The legacy DB is still the one wired
    in for all production logic. Unit tests have been expanded to cover
    discovered edge cases that were not properly tested for. Also move
    distributor bucket DB tests to GTest. Use a value-parameterized test
    fixture instead of the ad-hoc CppUnit approach.
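
A minimal sketch of a value-parameterized GTest fixture of the kind referred to, here parameterized over which database implementation to construct; all type and factory names are hypothetical:

```cpp
#include <gtest/gtest.h>
#include <cstddef>
#include <memory>

// Hypothetical common interface implemented by both database variants.
struct BucketDatabaseApi {
    virtual size_t size() const = 0;
    virtual ~BucketDatabaseApi() = default;
};
struct LegacyDb : BucketDatabaseApi { size_t size() const override { return 0; } };
struct BTreeDb  : BucketDatabaseApi { size_t size() const override { return 0; } };

using DbFactory = std::unique_ptr<BucketDatabaseApi> (*)();

std::unique_ptr<BucketDatabaseApi> make_legacy() { return std::make_unique<LegacyDb>(); }
std::unique_ptr<BucketDatabaseApi> make_btree()  { return std::make_unique<BTreeDb>(); }

// Every TEST_P below runs once per factory, so both implementations are
// exercised by the same test body.
class BucketDatabaseTest : public ::testing::TestWithParam<DbFactory> {
protected:
    std::unique_ptr<BucketDatabaseApi> _db{GetParam()()};
};

TEST_P(BucketDatabaseTest, new_database_is_empty) {
    EXPECT_EQ(0u, _db->size());
}

INSTANTIATE_TEST_SUITE_P(DatabaseImpls, BucketDatabaseTest,
                         ::testing::Values(&make_legacy, &make_btree));
```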

* Simplify the supervisor responsibility
  Henning Baldersheim, 2019-05-10 (3 files, -22/+27)

* Elide bucket DB pruning scans when possible
  Tor Brede Vekterli, 2019-04-29 (8 files, -10/+275)
  Only do a (potentially expensive) pruning pass over the bucket DB when the
  cluster state transition indicates that one or more nodes are not in the
  same availability-state as in the currently active state. For instance, if
  a cluster state change is for a content node going from Maintenance to
  Down, no pruning action is required since that shall already have taken
  place during the processing of the initial Down edge (and no buckets shall
  have been created for it in the meantime). We do not currently attempt to
  elide distribution config changes, as these happen much more rarely than
  cluster state changes.
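
A sketch of the elision check itself: compare each node's availability in the old and new state and only prune when something actually changed. The types are simplified placeholders, not the actual ClusterState API:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum class NodeState { Up, Retired, Initializing, Maintenance, Down };

// "Available" here means the node may hold bucket replicas in the DB.
inline bool node_available(NodeState s) {
    return s != NodeState::Down && s != NodeState::Maintenance;
}

// One entry per content node, indexed by distribution key. A pruning pass is
// only required if some node's availability differs between the two states.
inline bool pruning_required(const std::vector<NodeState>& before,
                             const std::vector<NodeState>& after) {
    const size_t n = std::max(before.size(), after.size());
    for (size_t i = 0; i < n; ++i) {
        const NodeState b = (i < before.size()) ? before[i] : NodeState::Down;
        const NodeState a = (i < after.size())  ? after[i]  : NodeState::Down;
        if (node_available(b) != node_available(a)) {
            return true;   // e.g. Up -> Down: replicas for node i must be pruned
        }
    }
    return false;          // e.g. Maintenance -> Down: pruning already happened
}
```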

* Cache super bucket ownership decisions when processing bucket DB
  Tor Brede Vekterli, 2019-04-25 (2 files, -12/+46)
  Distributor bucket ownership is assigned on a per-superbucket basis, so all
  buckets with the same superbucket can use the same decision. The bucket DB
  is explicitly ordered in such a way that all buckets belonging to the same
  superbucket are ordered after each other, so we need only maintain O(1)
  extra state for this.
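
A sketch of the O(1) caching this enables during an in-order DB pass; the super-bucket extraction and the ownership callback are hypothetical simplifications of the distributor's distribution logic:

```cpp
#include <cstdint>
#include <optional>

// Assumed for illustration: the key ordering keeps all buckets sharing a
// super bucket adjacent, with the super bucket in the top key bits.
constexpr unsigned SUPER_BUCKET_BITS = 16;

inline uint64_t super_bucket_of(uint64_t bucket_key) {
    return bucket_key >> (64 - SUPER_BUCKET_BITS);
}

class CachedOwnershipChecker {
    std::optional<uint64_t> _cached_super_bucket;
    bool _cached_owned = false;
public:
    // `compute` is the (comparatively expensive) ownership decision, e.g. a
    // distribution/ideal-state calculation. It is invoked at most once per
    // super bucket when the DB is processed in key order.
    template <typename OwnershipFn>
    bool owned_by_this_distributor(uint64_t bucket_key, OwnershipFn&& compute) {
        const uint64_t super = super_bucket_of(bucket_key);
        if (!_cached_super_bucket || *_cached_super_bucket != super) {
            _cached_super_bucket = super;
            _cached_owned = compute(bucket_key);
        }
        return _cached_owned;
    }
};
```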

* Add metrics around bucket DB pruning and merging phases of state transitions
  Tor Brede Vekterli, 2019-04-12 (3 files, -3/+21)

* Limit number of persistence threads that can process merges in parallel
  Tor Brede Vekterli, 2019-04-03 (5 files, -8/+50)
  Avoids starving other operations when there is a lot of merge activity
  taking place. For now, 1/2 of the total persistence thread pool may process
  merges.
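
A sketch of one way such a cap can be enforced, assuming a simple non-blocking counter rather than whatever mechanism the persistence layer actually uses:

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>

// Hypothetical throttle; the commit bounds merge processing to half of the
// persistence thread pool, which is the ratio used here.
class MergeThrottle {
    std::atomic<uint32_t> _active{0};
    const uint32_t _max_active;
public:
    explicit MergeThrottle(uint32_t persistence_threads)
        : _max_active(std::max(1u, persistence_threads / 2)) {}

    // Non-blocking: returns true if the calling persistence thread may start
    // working on a merge now; otherwise the merge should be requeued so the
    // thread can pick up other operations instead.
    bool try_start_merge() {
        uint32_t cur = _active.load();
        while (cur < _max_active) {
            if (_active.compare_exchange_weak(cur, cur + 1)) {
                return true;
            }
        }
        return false;
    }
    void merge_done() { _active.fetch_sub(1); }
};
```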

* Convert BucketDBUpdaterTest from CppUnit to GTest
  Tor Brede Vekterli, 2019-03-27 (3 files, -896/+597)

* Merge pull request #8882 from vespa-engine/vekterli/add-read-only-support-during-cluster-state-transitions
  Tor Brede Vekterli, 2019-03-26 (32 files, -151/+962)
  Add read-only support during cluster state transitions.

  * Address code review feedback for distributor changes
    Tor Brede Vekterli, 2019-03-26 (2 files, -7/+15)

  * Minor C++ cleanups
    Tor Brede Vekterli, 2019-03-22 (6 files, -7/+8)

  * Always allow activation commands through bouncer component
    Tor Brede Vekterli, 2019-03-20 (2 files, -0/+15)
    Otherwise we'd miss activation commands sent for a cluster state in which
    our own node is marked down.

  * Test more BucketDBUpdater two-phase transition edge cases
    Tor Brede Vekterli, 2019-03-20 (3 files, -58/+98)

  * Properly handle non-owned vs. missing buckets
    Tor Brede Vekterli, 2019-03-15 (9 files, -52/+250)
    Bonus: no more spurious "we have removed buckets" log messages caused by
    ownership changes. Also ensure that we BUSY-bounce operations in
    `ExternalOperationHandler` when there is no actual state to send back in a
    `WrongDistributionReply`.

  * WIP on BucketDBUpdater explicit activation support
    Tor Brede Vekterli, 2019-03-14 (6 files, -7/+115)

  * Basic handling of activate_cluster_state_version RPC in backend
    Tor Brede Vekterli, 2019-03-14 (7 files, -14/+122)

  * Move non-owned buckets to read-only DB and allow use for read-only ops
    Tor Brede Vekterli, 2019-03-14 (12 files, -66/+321)

  * Add read-only bucket space repo and wire it through distributor components
    Tor Brede Vekterli, 2019-03-14 (15 files, -51/+129)

* include content length in http response
  Håvard Pettersen, 2019-03-26 (1 file, -0/+3)

* Revert "include content length in http response"
  Harald Musum, 2019-03-25 (1 file, -3/+0)

* include content length in http response
  Håvard Pettersen, 2019-03-25 (1 file, -0/+3)