path: root/storage
Commit message (Author, Age, Files, Lines)
...
* Reduce direct use of DistributorStripeComponent. (Geir Storli, 2021-05-14, 37 files, -104/+129)
|
* Merge pull request #17834 from vespa-engine/geirst/impl-bucket-space-state-map (Geir Storli, 2021-05-12, 4 files, -2/+143)
|\
| |     Implement class that provides mapping from bucket space to state for …
| * Implement class that provides mapping from bucket space to state for that space. (Geir Storli, 2021-05-12, 4 files, -2/+143)
| |
* | Merge pull request #17830 from vespa-engine/vekterli/add-initial-multi-stripe-support-to-access-guard (Tor Brede Vekterli, 2021-05-12, 7 files, -64/+178)
|\ \
| |/
|/|
| |     Add initial multi stripe support to access guard [run-systemtest]
| * Add initial multi stripe support to access guard (Tor Brede Vekterli, 2021-05-12, 7 files, -64/+178)
| |
| |     Still missing functionality for:
| |       - Merging bucket entries across stripes
| |       - Aggregating pending operation stats across stripes
* | Stop all stripe threads before starting shutdown (and closing) of the storage link chain. (Geir Storli, 2021-05-11, 7 files, -19/+31)
|/
|     This is required to avoid stripe threads being able to send up messages while the communication manager is being closed. Such messages will fail at the RPC layer (already closed) and an error reply is sent down from the communication manager. This triggers an assert in StorageLink::sendDown() which is already CLOSED.
* Extend TickableStripe interface to avoid direct access to DistributorStripe internals (Tor Brede Vekterli, 2021-05-11, 8 files, -27/+224)
|
|     Also lets us test guard functionality much more easily since its target is now fully mockable.
* Explicitly signal locking requirements in function signature (Tor Brede Vekterli, 2021-05-10, 2 files, -3/+3)
|
* Add timed batching of explicit host info sends triggered by stripes (Tor Brede Vekterli, 2021-05-10, 8 files, -12/+193)
|
|     Since distributor stripes may independently reach a conclusion that a `GetNodeState` reply containing new host info should be sent back to the cluster controller, implement basic rate limiting/batching of concurrent sends.
|
|     Batching has two separate modes of operation:
|       - If the node is initializing, host info will be sent immediately after _all_ stripes have reported in (they will always do this post-init). This is not timed, in order to minimize latency of bucket info being visible to the cluster controller.
|       - If the node has already initialized, have a grace period of up to 1 second from the time the first stripe signals its intent to send host info until it's actually sent. This allows several stripes to complete their recovery mode and signal host info intents during this second.
|
|     Batch time period is currently not configurable, may be done later if deemed useful or necessary.
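The two batching modes described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `HostInfoSendBatcher`, `notify_send_intent` and `should_send_now` are hypothetical names that do not come from the commit, time is injected by the caller to keep the logic deterministic, and the sketch assumes that the first complete round of stripe reports is what ends the initializing mode.

```cpp
#include <chrono>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical sketch; class and method names are illustrative, not Vespa's.
class HostInfoSendBatcher {
public:
    using Clock     = std::chrono::steady_clock;
    using TimePoint = Clock::time_point;

    HostInfoSendBatcher(std::size_t num_stripes, Clock::duration grace_period)
        : _reported(num_stripes, false),
          _grace_period(grace_period)
    {}

    // A stripe signals that it wants host info to be sent.
    void notify_send_intent(std::size_t stripe_idx, TimePoint now) {
        _reported.at(stripe_idx) = true;
        if (!_first_intent_at) {
            _first_intent_at = now; // grace period starts at the first intent
        }
    }

    // Polled by the owning thread. Returns true when a send is due and
    // resets the batch so that later intents start a new one.
    bool should_send_now(TimePoint now) {
        if (!_first_intent_at) {
            return false; // nothing pending
        }
        const bool due = _initializing
                ? all_stripes_reported()                       // untimed mode
                : (now - *_first_intent_at >= _grace_period);  // timed mode
        if (due) {
            _initializing = false; // assumption: first full report ends init
            _first_intent_at.reset();
            _reported.assign(_reported.size(), false);
        }
        return due;
    }

private:
    bool all_stripes_reported() const {
        for (bool reported : _reported) {
            if (!reported) return false;
        }
        return true;
    }

    std::vector<bool>        _reported;
    Clock::duration          _grace_period;
    std::optional<TimePoint> _first_intent_at;
    bool                     _initializing = true;
};
```

With two stripes and a one-second grace period, the first send in this sketch fires as soon as both stripes have reported, regardless of elapsed time; later sends fire one second after the first new intent.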
* Propagate config to underlying bucket repos when storage distribution changes. (Geir Storli, 2021-05-10, 2 files, -0/+14)
|
|     This fixes the following system tests when running with new distributor stripe mode:
|       Capacity::test_capacity
|       FlatToHierarchicTransitionTest::test_transition_implicitly_indexes_and_activates_docs_per_group
|       HierarchDistr::test_app_change__PROTON
* Remove remains of "maintenance" status page. (Geir Storli, 2021-05-06, 2 files, -12/+4)
|
* Remove IdealStateManager as an explicit status reporter. (Geir Storli, 2021-05-06, 2 files, -11/+2)
|
|     This is wrongly implemented by not using a delegator to the right thread. The system test framework is not using the "idealstateman" page. The same information is present in the status page "distributor?page=buckets".
* Remove assert for a scenario that might occur in wait_until_unparked(). (Geir Storli, 2021-05-06, 1 file, -1/+0)
|
|     A stripe thread is parked as part of another thread calling DistributorStripePool::park_all_threads(). The stripe thread will then be inside DistributorStripePool::park_thread_until_released(), just waiting to call DistributorStripeThread::wait_until_unparked(). Before this is called, the other thread can call DistributorStripePool::unpark_all_threads(), and the _should_park variable in DistributorStripeThread is set to false again. When the stripe thread calls DistributorStripeThread::wait_until_unparked(), it is already unparked.
|
|     This is a scenario that might occur when the parking / unparking loop is short.
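The interleaving described above can be illustrated with a small, single-flag sketch. It is not the Vespa class: `ParkableThread` and its methods are hypothetical stand-ins, and only the `_should_park` flag name is taken from the commit message. The point is that a predicate-based wait returns immediately when the flag was already cleared, so asserting that the thread is still parked on entry would be wrong.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical sketch of a per-thread park flag; not the Vespa class.
class ParkableThread {
public:
    // Called by the controlling thread to ask this thread to park.
    void request_park() {
        std::lock_guard<std::mutex> guard(_mutex);
        _should_park = true;
    }

    // Called by the controlling thread to release this thread again.
    void request_unpark() {
        {
            std::lock_guard<std::mutex> guard(_mutex);
            _should_park = false;
        }
        _cond.notify_all();
    }

    // Called by the stripe thread itself. There is deliberately no
    // assert(_should_park) here: the unpark request may already have
    // arrived, in which case the predicate is true and we return at once.
    void wait_until_unparked() {
        std::unique_lock<std::mutex> lock(_mutex);
        _cond.wait(lock, [this] { return !_should_park; });
    }

    bool should_park() {
        std::lock_guard<std::mutex> guard(_mutex);
        return _should_park;
    }

private:
    std::mutex              _mutex;
    std::condition_variable _cond;
    bool                    _should_park = false;
};
```

Calling `request_park()`, then `request_unpark()`, then `wait_until_unparked()` from one thread reproduces the ordering from the commit message without blocking.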
* Make status reporting from distributor and bucket db updater work when running in new stripe mode. (Geir Storli, 2021-05-05, 12 files, -44/+197)
|
* Remove no longer used class. (Geir Storli, 2021-05-05, 5 files, -170/+0)
|
* Dispatch messages to be handled by BucketDBUpdater to main distributor thread (Tor Brede Vekterli, 2021-05-05, 2 files, -7/+50)
|
|     Required to ensure no race conditions can happen from processing such messages from arbitrary RPC/CommunicationManager threads.
* Run single stripe in its own thread when not using legacy mode (Tor Brede Vekterli, 2021-05-05, 11 files, -43/+187)
|
|     The (currently single) stripe is now run as part of the distributor stripe pool instead of being transitively invoked by the main thread.
|
|     Introduce an explicit message mutex per stripe that is used for external messages and status requests when not using legacy mode. Use per-stripe wakeup mechanisms instead of the framework-global mutex used in the legacy code path.
|
|     Additional work remains to bring back a dedicated message run-queue for the top-level distributor, so this is not yet thread safe for operations to the main `BucketDBUpdater`.
* Merge pull request #17713 from vespa-engine/vekterli/make-more-distributor-internals-private-2 (Geir Storli, 2021-05-03, 10 files, -89/+141)
|\
| |     Make more distributor internals private 2; the much anticipated sequel [run-systemtest]
| * Ensure we do not call legacy `getConfig()` in common code paths (Tor Brede Vekterli, 2021-05-03, 1 file, -3/+3)
| |
| |     Also unconditionally update top-level Distributor's own config snapshot so that it can be used for legacy code paths as well.
| |
| |     Would ideally remove all usages of legacy `getConfig()`, but we need to refactor how unit tests sneakily inject config changes first.
| * Make more Distributor internals only available to friended tests (Tor Brede Vekterli, 2021-05-03, 10 files, -86/+138)
| |
| |     Also add more assertions that such functions are only called in legacy mode.
| * Revert "Make more Distributor internals only available to friended tests" (Harald Musum, 2021-05-03, 10 files, -138/+86)
| |
| * Make more Distributor internals only available to friended tests (Tor Brede Vekterli, 2021-04-30, 10 files, -86/+138)
| |
| |     Also add more assertions that such functions are only called in legacy mode.
* | Use noexcept. (Geir Storli, 2021-05-03, 1 file, -1/+1)
| |
* | Move function for getting storage node up states to a common place. (Geir Storli, 2021-04-30, 11 files, -32/+28)
| |
* | Add option on whether to use the bucket database in DistributorBucketSpace. (Geir Storli, 2021-04-30, 6 files, -11/+15)
|/
|     Don't use bucket databases in the top-level distributor component and bucket db updater. It is only distributor stripes that should have a bucket database.
* Merge pull request #17661 from vespa-engine/vekterli/distributor-stripe-pool-and-thread-coordination (Geir Storli, 2021-04-29, 14 files, -2/+704)
|\
| |     Add DistributorStripe thread pool with thread park/unpark support
| * Fix typo and add some TODOs for follow-ups (Tor Brede Vekterli, 2021-04-29, 3 files, -1/+4)
| |
| * Add DistributorStripe thread pool with thread park/unpark support (Tor Brede Vekterli, 2021-04-29, 14 files, -2/+701)
| |
| |     To enable safe and well-defined access to underlying stripe data structures from the main distributor thread, the pool has functionality for "parking" and "unparking" all stripe threads:
| |
| |       * Parking makes all threads go into a blocked holding pattern where it is guaranteed that they may not race with any other threads.
| |       * Unparking releases all threads from their holding pattern, allowing them to continue their event processing loop.
| |
| |     Also adds a custom run loop for distributor threads that largely emulates the waiting semantics found in the current framework ticking thread pool run loop. But unlike the framework pool, there is no global mutex that must be acquired by all threads in the pool. All stripe event handling uses per-thread mutexes and condition variables. Global state is only accessed when thread parking is requested, which happens very rarely.
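A minimal sketch of the park/unpark idea, assuming a single controlling thread, might look like the following. All names (`StripePoolSketch`, `Stripe`, `total_ticks`) are hypothetical and the run loop is a busy loop that merely counts "ticks"; a real run loop would block on a per-thread condition variable instead. What the sketch tries to show is that worker threads only touch the shared mutex when a park has been requested.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of park/unpark coordination; not the Vespa implementation.
class StripePoolSketch {
public:
    explicit StripePoolSketch(std::size_t num_stripes) {
        for (std::size_t i = 0; i < num_stripes; ++i) {
            _stripes.push_back(std::make_unique<Stripe>());
        }
        for (auto& s : _stripes) {
            Stripe* stripe = s.get();
            stripe->thread = std::thread([this, stripe] { run_loop(*stripe); });
        }
    }

    ~StripePoolSketch() {
        _stopping.store(true);
        unpark_all_threads(); // release anything still parked so it can exit
        for (auto& s : _stripes) {
            s->thread.join();
        }
    }

    // Blocks until every stripe thread sits in its holding pattern.
    void park_all_threads() {
        std::unique_lock<std::mutex> lock(_mutex);
        for (auto& s : _stripes) s->should_park.store(true);
        _parker_cond.wait(lock, [this] { return _parked == _stripes.size(); });
    }

    // Releases all stripe threads and waits until they have left the pattern.
    void unpark_all_threads() {
        std::unique_lock<std::mutex> lock(_mutex);
        for (auto& s : _stripes) s->should_park.store(false);
        _release_cond.notify_all();
        _parker_cond.wait(lock, [this] { return _parked == 0; });
    }

    long total_ticks() const {
        long sum = 0;
        for (const auto& s : _stripes) sum += s->ticks.load();
        return sum;
    }

private:
    struct Stripe {
        std::atomic<bool> should_park{false};
        std::atomic<long> ticks{0};
        std::thread       thread;
    };

    void run_loop(Stripe& stripe) {
        while (!_stopping.load()) {
            if (stripe.should_park.load()) {
                park_until_released(stripe); // only here is shared state touched
                continue;
            }
            stripe.ticks.fetch_add(1); // stand-in for processing one event
            std::this_thread::yield();
        }
    }

    void park_until_released(Stripe& stripe) {
        std::unique_lock<std::mutex> lock(_mutex);
        ++_parked;
        _parker_cond.notify_all();
        _release_cond.wait(lock, [&stripe] { return !stripe.should_park.load(); });
        --_parked;
        _parker_cond.notify_all();
    }

    std::mutex                           _mutex;
    std::condition_variable              _parker_cond;  // parked count changed
    std::condition_variable              _release_cond; // unpark requested
    std::vector<std::unique_ptr<Stripe>> _stripes;
    std::size_t                          _parked = 0;
    std::atomic<bool>                    _stopping{false};
};
```

Between `park_all_threads()` and `unpark_all_threads()` the tick counters in this sketch cannot change, which is the property the main thread relies on when it reads or mutates stripe state.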
* | Make the top-level BucketDBUpdater independent of the single distributor stripe. (Geir Storli, 2021-04-29, 5 files, -11/+24)
| |
* | Split DistributorMessageSender into two parts. (Geir Storli, 2021-04-29, 43 files, -172/+176)
| |
| |     DistributorStripeMessageSender is used for all stripe related operations.
* | Make the top-level BucketDBUpdater less dependent on the single existing distributor stripe. (Geir Storli, 2021-04-29, 10 files, -57/+209)
|/
* Remove unused top-level code that is handled per stripe instead. (Geir Storli, 2021-04-28, 2 files, -185/+1)
|
* Rename DistributorOperationContext to DistributorStripeOperationContext. (Geir Storli, 2021-04-27, 24 files, -38/+38)
|
* Remove processing of single bucket info replies. (Geir Storli, 2021-04-26, 2 files, -138/+1)
|
|     These are per stripe and handled by StripeBucketDBUpdater instead.
* Merge pull request #17579 from vespa-engine/toregge/remove-unused-variables-and-arguments (Henning Baldersheim, 2021-04-23, 3 files, -9/+3)
|\
| |     Remove unused variables and arguments.
| * Remove unused variables and arguments. (Tor Egge, 2021-04-23, 3 files, -9/+3)
| |
* | Fix forward declaration. (Tor Egge, 2021-04-23, 1 file, -1/+1)
|/
* Propagate distributor config via internal snapshots (Tor Brede Vekterli, 2021-04-23, 10 files, -30/+94)
|
|     Add functionality for updating config for stripes via accessor guard interface. Stripes will now keep an explicit, immutable config snapshot internally (and get explicit knowledge of when this changes) instead of accessing config via magical component interface.
|
|     Introduce notion of internal config generation (not directly related to generations received from underlying config system) to be able to cheaply know if new config should be propagated.
|
|     Since a lot of unit tests use dirty "modify in place" tricks to alter config for tests, have to do some extra work per tick to ensure non-generational config changes are applied to the distributor code. This should be fixed as a later follow-up.
|
|     Also remove `TickingThread` inheritance from `DistributorStripe` to begin preparing for redesign of stripe threading model.
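The combination of immutable snapshots and an internal generation counter could be sketched along these lines. Everything here is an assumption made for illustration: `ConfigSnapshot`, its `max_pending_ops` field, `TopLevelConfigOwner` and `StripeSketch` are hypothetical names, and the real propagation goes through the stripe accessor guard rather than a direct call.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical config payload; the field is made up for the example.
struct ConfigSnapshot {
    int max_pending_ops;
};

// Hypothetical owner of the current config on the top-level distributor.
class TopLevelConfigOwner {
public:
    // Every config change produces a fresh immutable snapshot and bumps the
    // internal generation (unrelated to the config system's own generations).
    void set_config(ConfigSnapshot cfg) {
        _current = std::make_shared<const ConfigSnapshot>(cfg);
        ++_generation;
    }
    std::uint64_t generation() const { return _generation; }
    std::shared_ptr<const ConfigSnapshot> snapshot() const { return _current; }

private:
    std::shared_ptr<const ConfigSnapshot> _current;
    std::uint64_t                         _generation = 0;
};

// Hypothetical stripe that keeps its own snapshot.
class StripeSketch {
public:
    // Cheap check: only copy the snapshot pointer when the generation moved.
    // Returns true if a new snapshot was taken.
    bool maybe_update_config(const TopLevelConfigOwner& owner) {
        if (owner.generation() == _seen_generation) {
            return false;
        }
        _config          = owner.snapshot();
        _seen_generation = owner.generation();
        return true;
    }
    const ConfigSnapshot* config() const { return _config.get(); }

private:
    std::shared_ptr<const ConfigSnapshot> _config;
    std::uint64_t                         _seen_generation = 0;
};
```

Because the snapshot is immutable and shared by pointer, a stripe in this sketch keeps seeing its old values until it explicitly picks up the new generation, which is what gives it explicit knowledge of when config changes.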
* Make DistributorStripe aware of whether it uses legacy mode or not and add asserts. (Geir Storli, 2021-04-23, 5 files, -14/+35)
|
* Propagate num_distributor_stripes cfg to Distributor ctor and instantiate BucketDBUpdater if needed. (Geir Storli, 2021-04-22, 9 files, -32/+18)
|
|     At the same time remove the manageActiveBucketCopies flag, which has been true since the Storage provider was removed years ago.
* Remove unused function. (Geir Storli, 2021-04-22, 2 files, -12/+0)
|
* Decouple DistributorStripe from StorageLink. (Geir Storli, 2021-04-22, 3 files, -33/+16)
|
* Initial implementation and wiring of cross-stripe state and DB handling (Tor Brede Vekterli, 2021-04-21, 27 files, -750/+2281)
|
|     Introduces the concept of stripe access guards, that ensure safe and non-concurrent access to the underlying state of all running distributor stripes.
|
|     Also bring back a top-level `BucketDBUpdater` component responsible for managing cluster state/distribution config and all related bucket info fetching for the entire node as a whole. This component abstracts away all stripe-specific operations via the new guard interface.
|
|     For now, only a single stripe can be used via the new code path, and by default the legacy code path (single stripe acts as an entire distributor) is used. New path may be enabled via (non-live) config, but is not yet production ready.
* Add config and feature flag for the number of distributor stripes. (Geir Storli, 2021-04-20, 1 file, -0/+4)
|
* Simplify Distributor class to remove now unused functionality (Tor Brede Vekterli, 2021-03-23, 4 files, -258/+25)
|
|     In particular, remove `DistributorStripeInterface` inheritance and subclassed `DistributorComponent` usage.
* Merge pull request #17113 from vespa-engine/geirst/distributor-stripe-refactor-2 (Tor Brede Vekterli, 2021-03-23, 15 files, -151/+111)
|\
| |     Distributor stripe refactor 2
| * Stop exposing DistributorStripeComponent from IdealStateManager. (Geir Storli, 2021-03-23, 14 files, -102/+109)
| |
| * Remove unused functions. (Geir Storli, 2021-03-23, 2 files, -41/+0)
| |
| * Remove functions from DistributorStripeComponent that are part of DistributorOperationContext interface. (Geir Storli, 2021-03-22, 2 files, -8/+2)
| |
* | Fix forward declarations. (Tor Egge, 2021-03-22, 1 file, -2/+2)
|/