path: root/storage
Commit message (Author, Age, Files, Lines)
* Fix off-by-one assertion for intra-second timestamp overflow sanity check (Tor Brede Vekterli, 2021-09-27, 1 file, -1/+1)
|
* Add grace period inhibiting maintenance after state transitions with bucket ownership transfer (Tor Brede Vekterli, 2021-09-27, 11 files, -25/+149)
|   Avoids the case where different distributors can start merges with a max timestamp that is lower than timestamps generated intra-second by other distributors used for feed bound to the same bucket.
|
|   This is analogous to the existing "safe time period" functionality used for handling external feed, and uses the same max clock skew config. Correctness of this grace period is therefore inherently dependent on actual cluster clock skew being less than this configured number.
|
|   Bucket activations are still allowed to take place during the grace period time window, as these do not mutate the bucket contents and are therefore safe.
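The grace period described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the type `OwnershipTransferGracePeriod`, its fields, and the choice of `>=` at the boundary are all assumptions made for the example. It only shows the idea that mutating maintenance waits until the configured max clock skew has elapsed, while non-mutating operations such as bucket activations pass through.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical type for illustration only; not part of Vespa's API.
struct OwnershipTransferGracePeriod {
    // When the state transition with bucket ownership transfer took effect.
    Clock::time_point transfer_time;
    // Assumed to mirror the configured max clock skew.
    std::chrono::milliseconds max_clock_skew;

    // Returns true if an operation may be started at `now`.
    bool allows(Clock::time_point now, bool mutates_bucket) const {
        assert(max_clock_skew.count() >= 0);
        if (!mutates_bucket) {
            return true;  // e.g. bucket activations, which do not change contents
        }
        // Mutating maintenance (e.g. merges) waits out the grace period.
        return (now - transfer_time) >= max_clock_skew;
    }
};
```

As the commit message notes, a check like this is only as sound as the assumption that real clock skew in the cluster stays below the configured value.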
* Add high-level test that maintenance is inhibited during pending state transitions (Tor Brede Vekterli, 2021-09-24, 1 file, -0/+19)
|   Maintenance inhibition is already present, but it happens at a much lower level. Add a high-level test to ensure that the wiring works as expected.
* Merge pull request #19267 from vespa-engine/vekterli/use-max-of-current-and-pending-distribution-bit-counts (Tor Brede Vekterli, 2021-09-23, 1 file, -1/+1)
|\
| |   Use max instead of min from current and pending cluster states' distribution bit counts [run-systemtest]
| * Use max instead of min from current and pending cluster states' distribution bit counts (Tor Brede Vekterli, 2021-09-23, 1 file, -1/+1)
| |   Using min() has an unfortunate (very rare) edge case if a cluster goes _down_ in distribution bit counts (whether we really want to support this at all is a different discussion, since it has some other unfortunate implications).
| |
| |   If the current state has e.g. 14 bits and the pending state has 8 bits, using 8 bits for `_distribution_bits` will trigger a `TooFewBucketBitsInUse` exception when computing a cached ideal state for a bucket in the bucket DB. This is because the ideal state algorithm is not defined for buckets using fewer bits than the state's distribution bit count.
| |
| |   The cluster controller shall never push a cluster state with a distribution bit count higher than the least split bucket across all nodes in the cluster, so the cache lookup code should theoretically(tm) never be invoked with a bucket that has fewer used bits than what's present in the pending state.
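The 14-bit versus 8-bit case can be walked through with a small model. This is one plausible reading of the failure mode, not the actual cache code: `require_ideal_state_defined`, `lookup_bits_min`, `lookup_bits_max`, and `lookup_is_safe` are hypothetical names, and the exception type is a local stand-in that only borrows the name from the commit message. The model assumes the chosen bit count is the number of bits a lookup bucket uses, and that the ideal state must be computable against both the current and the pending state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Local stand-in that borrows the exception name from the commit message.
struct TooFewBucketBitsInUse : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Stand-in for the ideal state computation: it is not defined for buckets
// using fewer bits than the state's distribution bit count.
void require_ideal_state_defined(std::uint8_t bucket_used_bits,
                                 std::uint8_t state_distribution_bits) {
    if (bucket_used_bits < state_distribution_bits) {
        throw TooFewBucketBitsInUse("bucket uses fewer bits than the distribution bit count");
    }
}

// Old choice (min) and new choice (max) of the bit count used for lookups.
std::uint8_t lookup_bits_min(std::uint8_t current, std::uint8_t pending) {
    return std::min(current, pending);
}
std::uint8_t lookup_bits_max(std::uint8_t current, std::uint8_t pending) {
    return std::max(current, pending);
}

// True if a bucket with `bits` used bits is valid against both states.
bool lookup_is_safe(std::uint8_t bits, std::uint8_t current, std::uint8_t pending) {
    try {
        require_ideal_state_defined(bits, current);
        require_ideal_state_defined(bits, pending);
        return true;
    } catch (const TooFewBucketBitsInUse&) {
        return false;
    }
}
```

Under these assumptions, `lookup_is_safe(lookup_bits_min(14, 8), 14, 8)` is false because 8 used bits is below the current state's 14, while the `max` variant yields 14 bits and satisfies both states.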
* | use path in config includes (Arne H Juul, 2021-09-22, 1 file, -2/+2)
| |
* | allow generated PB files outside source tree (Arne H Juul, 2021-09-22, 1 file, -1/+1)
| |
* | Remove TODO that we won't fix. (Geir Storli, 2021-09-21, 1 file, -1/+0)
| |
* | Remove unused use_bucket_db parameter. (Geir Storli, 2021-09-21, 4 files, -9/+7)
|/
* Use BucketSpaceStateMap to track cluster state and distribution in the top-level distributor. (Geir Storli, 2021-09-20, 19 files, -128/+112)
|   This replaces the previous hack (needed in legacy mode) that used DistributorBucketSpaceRepo to achieve the same.
* Remove most traces of distributor legacy mode. (Tor Brede Vekterli, 2021-09-20, 21 files, -6066/+151)
|   Some assorted legacy bits and pieces still remain on the factory floor; these will be cleaned up in follow-ups.
* Address low-hanging TODO fruit and remove stuff that's either done or won't be done (Tor Brede Vekterli, 2021-09-16, 11 files, -33/+20)
|
* Merge pull request #19164 from vespa-engine/vekterli/aggregate-pending-operation-stats-across-stripes (Tor Brede Vekterli, 2021-09-16, 3 files, -3/+30)
|\
| |   Aggregate pending operation stats across all stripes in stripe guard
| * Aggregate pending operation stats across all stripes in stripe guard (Tor Brede Vekterli, 2021-09-16, 3 files, -3/+30)
| |
* | Merge pull request #19162 from vespa-engine/geirst/flip-to-new-distributor-stripe-code-path (Geir Storli, 2021-09-16, 3 files, -2/+33)
|\ \
| |/
|/|
| |   Flip to always use the new distributor stripe code path.
| * Flip to always use the new distributor stripe code path. (Geir Storli, 2021-09-16, 3 files, -2/+33)
| |   If the number of stripes is not configured, we tune it based on the sampled number of CPU cores.
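The fallback from a configured stripe count to a tuned one might look like the sketch below. The log does not say how the count is derived from the sampled CPU cores, so everything here is an assumption: the function name `effective_stripe_count`, the convention that `0` means "not configured", the one-stripe-per-four-cores rule, and the cap `kMaxStripes` are placeholders chosen only to make the example concrete.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed upper bound on stripes; a placeholder, not a value from the log.
constexpr std::uint32_t kMaxStripes = 16;

// Hypothetical helper. `configured == 0` is assumed to mean "not configured".
std::uint32_t effective_stripe_count(std::uint32_t configured,
                                     std::uint32_t sampled_cpu_cores) {
    if (configured != 0) {
        return configured;  // an explicit configuration always wins
    }
    // Placeholder tuning rule: one stripe per four cores, at least one stripe,
    // at most kMaxStripes. The real rule is not given in the commit message.
    const std::uint32_t tuned = std::max(sampled_cpu_cores / 4, std::uint32_t{1});
    return std::min(tuned, kMaxStripes);
}
```

Taking the sampled core count as a parameter, rather than querying the hardware inside the function, keeps the rule easy to unit test.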
* | Use ideal state cache when populating StateChecker context (Tor Brede Vekterli, 2021-09-15, 1 file, -3/+2)
|/
* Move initializing handling to top-level distributor (Tor Brede Vekterli, 2021-09-14, 11 files, -20/+98)
|   Add a listener interface that lets the top-level distributor intercept cluster state activations and use this for triggering the node init edge. This happens when all stripes are paused so this is safe from data races. Legacy code in the DistributorStripe remains for now.
* Port final batch of BucketDBUpdater tests from legacy to top-level code paths (Tor Brede Vekterli, 2021-09-13, 5 files, -19/+698)
|
* Merge pull request #19076 from vespa-engine/vekterli/port-more-bucketdbupdater-tests (Tor Brede Vekterli, 2021-09-13, 8 files, -38/+1201)
|\
| |   Port more BucketDBUpdater tests from legacy to new code path
| * Port more BucketDBUpdater tests from legacy to new code path (Tor Brede Vekterli, 2021-09-10, 8 files, -38/+1201)
| |
* | Rename test functions to be aligned with distributor stripe functionality. (Geir Storli, 2021-09-09, 1 file, -4/+4)
|/
* Merge pull request #19023 from vespa-engine/vekterli/port-additional-tests-and-fix-regression (Geir Storli, 2021-09-09, 5 files, -0/+173)
|\
| |   Port additional DB updater tests and fix delayed sending regression
| * Port additional DB updater tests and fix delayed sending regression (Tor Brede Vekterli, 2021-09-08, 5 files, -0/+173)
| |   Addresses a missing piece of functionality in the new code path where queued bucket rechecks during a pending cluster state time window would not be sent as expected when the pending state has been completed and activated.
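The behavior this fix restores, holding bucket rechecks while a cluster state is pending and sending them once it is activated, can be modelled in a few lines. This is an illustrative sketch, not the DB updater's real structure: `RecheckQueue` and its member names are hypothetical, and bucket identifiers are reduced to plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of delayed recheck sending; not Vespa's actual classes.
class RecheckQueue {
public:
    using Sender = std::function<void(std::uint64_t)>;

    explicit RecheckQueue(Sender send) : _send(std::move(send)) {}

    // A new cluster state has been received but is not yet activated.
    void begin_pending_state() { _pending = true; }

    // Sends immediately unless a state is pending, in which case it queues.
    void request_recheck(std::uint64_t bucket_id) {
        if (_pending) {
            _queued.push_back(bucket_id);
        } else {
            _send(bucket_id);
        }
    }

    // The pending state is complete and activated: flush everything queued.
    void activate_pending_state() {
        assert(_pending);
        _pending = false;
        std::vector<std::uint64_t> to_send;
        to_send.swap(_queued);  // leave the queue empty before sending
        for (std::uint64_t id : to_send) {
            _send(id);
        }
    }

    std::size_t queued() const { return _queued.size(); }

private:
    Sender _send;
    bool _pending = false;
    std::vector<std::uint64_t> _queued;
};
```

The regression described above corresponds to the flush step in `activate_pending_state` being missing, which would leave queued rechecks unsent after activation.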
* | Merge pull request #19022 from vespa-engine/geirst/main-distributor-thread-tick-wait-duration (Geir Storli, 2021-09-09, 1 file, -1/+3)
|\ \
| |/
|/|
| |   Increase tick wait duration for main distributor thread when running …
| * Increase tick wait duration for main distributor thread when running with multiple stripes. (Geir Storli, 2021-09-08, 1 file, -1/+3)
| |   This is because it will no longer be running background maintenance jobs (non-event tick will instead be used primarily for resending full bucket fetches etc).
* | Merge pull request #19016 from vespa-engine/vekterli/port-first-batch-of-bucketdbupdater-tests-to-top-level (Tor Brede Vekterli, 2021-09-08, 6 files, -19/+952)
|\ \
| |/
|/|
| |   Port first batch of BucketDBUpdater tests from legacy to top-level
| * Port first batch of BucketDBUpdater tests from legacy to top-level (Tor Brede Vekterli, 2021-09-08, 6 files, -19/+952)
| |
* | Use distributor stripe index when setting up reporter for PendingMessageTracker. (Geir Storli, 2021-09-08, 5 files, -14/+17)
|/
|   This ensures we can access each individual reporter, instead of just one of them.
* Rename BucketDBUpdater to TopLevelBucketDBUpdater. (Geir Storli, 2021-09-06, 14 files, -76/+76)
|
* Rename Distributor to TopLevelDistributor. (Geir Storli, 2021-09-06, 39 files, -128/+128)
|
* Port remaining legacy distributor tests to top-level test suite (Tor Brede Vekterli, 2021-09-02, 14 files, -26/+188)
|   Also fix a minor regression caused by the stripe cluster state change code path consulting a now unused part of the `StripeBucketDBUpdater` on whether a cluster state change implies bucket ownership change. This was used for adding a configurable safe period for client mutations to ensure distributors can't step on each other's toes. The responsibility for telling stripes that state changes imply ownership changes has now been moved to the top-level DB updater and happens through the stripe guard interface instead.
* Migrate more unit tests to top-level distributor test suite (Tor Brede Vekterli, 2021-08-31, 5 files, -20/+220)
|
* Add info on number of distributor stripes to status page. (Geir Storli, 2021-08-31, 1 file, -2/+5)
|
* Add todos and remove todos that are done or no longer relevant. (Geir Storli, 2021-08-31, 1 file, -5/+5)
|
* Migrate stale reads tests to DistributorStripeTest. (Geir Storli, 2021-08-30, 2 files, -0/+69)
|
* Migrate config propagation tests to DistributorStripeTest. (Geir Storli, 2021-08-30, 2 files, -0/+140)
|
* Rewrite state checkers tests to not use legacy test util. (Geir Storli, 2021-08-27, 3 files, -93/+115)
|
* Rewrite ideal state manager tests to not use legacy test util. (Geir Storli, 2021-08-27, 6 files, -36/+38)
|
* Rewrite external operation handler tests to not use legacy test util. (Geir Storli, 2021-08-27, 2 files, -35/+41)
|
* Add test of top-level distributor functionality (Tor Brede Vekterli, 2021-08-27, 15 files, -27/+800)
|   This is a subset of the legacy distributor tests, adapted to explicitly test cross-stripe functionality. Once all relevant tests have been ported to be cross-stripe, the legacy test code will be removed.
* Rewrite per stripe tests that need to set pending cluster state to use DistributorStripeTestUtil. (Geir Storli, 2021-08-25, 4 files, -93/+103)
|
* Rewrite per stripe tests to use DistributorStripeTestUtil instead of DistributorTestUtil. (Geir Storli, 2021-08-25, 16 files, -240/+288)
|
* Remove unused variables. (Tor Egge, 2021-08-23, 1 file, -2/+0)
|
* Report max address space used in attribute vector components from content nodes (proton) to the cluster controller. (Geir Storli, 2021-08-20, 2 files, -43/+22)
|   This is more generic than explicit address space values for enum store and multi value. This is used in the cluster controller to determine whether to block external feed.
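Reporting a single maximum instead of per-component values can be illustrated as below. This is a hedged sketch rather than the reporting code itself: `AddressSpaceUsage`, `max_address_space_usage`, `should_block_feed`, the component names, and the threshold parameter are all assumptions introduced for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical per-component usage record; field names are illustrative.
struct AddressSpaceUsage {
    std::string component;  // e.g. "enum_store" or "multi_value"
    std::uint64_t used;
    std::uint64_t limit;

    double ratio() const {
        return limit == 0 ? 0.0
                          : static_cast<double>(used) / static_cast<double>(limit);
    }
};

// Picks the component with the highest used/limit ratio, if any exist.
std::optional<AddressSpaceUsage>
max_address_space_usage(const std::vector<AddressSpaceUsage>& components) {
    if (components.empty()) {
        return std::nullopt;
    }
    return *std::max_element(
        components.begin(), components.end(),
        [](const AddressSpaceUsage& a, const AddressSpaceUsage& b) {
            return a.ratio() < b.ratio();
        });
}

// Assumed consumer-side check: block feed when the worst ratio exceeds a limit.
bool should_block_feed(const std::vector<AddressSpaceUsage>& components,
                       double limit_ratio) {
    const auto worst = max_address_space_usage(components);
    return worst.has_value() && worst->ratio() > limit_ratio;
}
```

A consumer that only needs to decide whether to block feed can then work from the single worst-case value, whichever component it comes from.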
* Unify on std::_Exit (Henning Baldersheim, 2021-08-03, 1 file, -1/+1)
|
* Use std::quick_exit instead of abort to avoid coredump when it will not provide any more information. (Henning Baldersheim, 2021-08-02, 1 file, -1/+1)
|
* - Control tcp-no-delay explicitly. (Henning Baldersheim, 2021-07-02, 2 files, -2/+9)
|   - Wire in configuration of number of rpc targets.
* Support multiple network threads in mbus. (Henning Baldersheim, 2021-07-02, 2 files, -5/+6)
|
* Migrate distributor stripe unit tests that configure using config builder. (Geir Storli, 2021-06-29, 2 files, -5/+116)
|