path: root/storage
Commit message (Author, Age, Files, Lines)
* Use c++11 for loops and use std::move. (Henning Baldersheim, 2020-03-23, 9 files, -80/+70)
|
* Inhibit all merge-related commands and replies (Tor Brede Vekterli, 2020-03-19, 1 file, -3/+20)
|     Just inhibiting MergeCommand itself does not inhibit all message variants that may cause significant read or write load for any given bucket. This should lessen the competition between merge backend activity and client operations.
* Merge pull request #12586 from vespa-engine/vekterli/add-cheap-metadata-fetch-phase-for-updates (Tor Brede Vekterli, 2020-03-19, 14 files, -423/+894)
|\
| |     Add initial metadata-only phase to inconsistent update handling
| * Add comments and some extra safety handling of single Get command (Tor Brede Vekterli, 2020-03-17, 4 files, -2/+26)
| |
| * Add initial metadata-only phase to inconsistent update handling (Tor Brede Vekterli, 2020-03-16, 14 files, -422/+869)
| |     If bucket replicas are inconsistent, the common case is that only a small subset of documents contained in the buckets are actually mutually out of sync. The added metadata phase optimizes for such a case by initially sending Get requests to all divergent replicas that only ask for the timestamps (and no fields). This is a very cheap and fast operation.
| |
| |     If all returned timestamps are in sync, the update can be restarted in the fast path. Otherwise, a full Get will _only_ be sent to the newest replica, and its result will be used for performing the update on the distributor itself, before pushing out the result as Puts. This is in contrast to today's behavior where full Gets are sent to all replicas. For users with large documents this can be very expensive.
| |
| |     In addition, the metadata Get operations are sent with weak internal read consistency (as they do not need to read any previously written, possibly in-flight fields). This lets them bypass the main commit queue entirely.
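
The decision step described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the distributor's actual code: `MetadataReply`, `Decision` and `decide_after_metadata_phase` are hypothetical names, and a timestamp of zero is assumed to mean "document not found on replica".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical reply to a metadata-only Get (timestamp only, no fields).
struct MetadataReply {
    uint16_t node;       // content node holding the replica
    uint64_t timestamp;  // newest document timestamp; 0 assumed to mean "not found"
};

// Hypothetical outcome of the metadata phase.
struct Decision {
    bool restart_in_fast_path;                // true if all replicas agree
    std::optional<uint16_t> full_get_target;  // node to send the single full Get to
};

// Decide the next step once all metadata-only replies have been received.
Decision decide_after_metadata_phase(const std::vector<MetadataReply>& replies) {
    assert(!replies.empty());
    const MetadataReply* newest = &replies.front();
    bool all_equal = true;
    for (const auto& r : replies) {
        if (r.timestamp != replies.front().timestamp) {
            all_equal = false;
        }
        if (r.timestamp > newest->timestamp) {
            newest = &r;
        }
    }
    if (all_equal) {
        return {true, std::nullopt};  // timestamps in sync: fast path is safe
    }
    return {false, newest->node};     // fetch the full document from the newest replica only
}
```

With replies `{node 0, ts 100}`, `{node 1, ts 250}` and `{node 2, ts 0}`, the sketch selects node 1 as the only target of a full Get.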
* | Account for stripe index when performing a multi-lock (Tor Brede Vekterli, 2020-03-18, 2 files, -9/+33)
|/
|     Legacy locking code had a hidden assumption that each disk had only a single queue associated with it and locking requests could therefore be deduped at the disk level. Since we now only have a single logical disk with a number of queues striped over it, this would introduce race conditions when splits/joins would cross stripe boundaries for their target/source buckets.
|
|     Use a composite disk/stripe multi lock key to ensure we only dedupe locking requests at the stripe-level.
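
A composite key of this kind might look like the sketch below. The type and function names (`LockRequest`, `StripeLockKey`, `stripes_to_lock`) are assumptions made for illustration and do not come from the Vespa source.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical lock request for one bucket involved in a split or join.
struct LockRequest {
    uint16_t disk;
    uint32_t stripe;
    uint64_t bucket;
};

// Composite key: requests are deduplicated per (disk, stripe), not per disk.
struct StripeLockKey {
    uint16_t disk;
    uint32_t stripe;
    bool operator<(const StripeLockKey& rhs) const {
        return std::tie(disk, stripe) < std::tie(rhs.disk, rhs.stripe);
    }
};

// Collapse a set of lock requests into the distinct stripes that must be locked.
std::set<StripeLockKey> stripes_to_lock(const std::vector<LockRequest>& requests) {
    std::set<StripeLockKey> keys;
    for (const auto& r : requests) {
        keys.insert(StripeLockKey{r.disk, r.stripe});
    }
    return keys;
}
```

With a disk-only key, two requests on disk 0 that hit stripes 1 and 3 would collapse into one lock; with the composite key both stripes are retained.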
* Limit merges per stripe, not globally (Tor Brede Vekterli, 2020-03-04, 2 files, -36/+32)
|     With a sufficient, even thread count this will ensure that no stripes end up completely blocked on processing merges, which can starve client operations. Having a global limit means that it was possible for stripes to completely fill up with merges.
|
|     As an added bonus, moving the limit tracking to individual stripes means that we no longer have to track this as an atomic, since all access already happens under the Stripe lock.
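
The per-stripe accounting could be sketched as below. The `Stripe` class and its member names are hypothetical and only show why a plain counter is enough once every access happens under the stripe's own lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stripe with its own merge limit (illustrative names only).
class Stripe {
public:
    explicit Stripe(std::size_t max_active_merges)
        : _max_active_merges(max_active_merges) {}

    // Returns true if a merge may start on this stripe right now.
    bool try_start_merge() {
        std::lock_guard<std::mutex> guard(_lock);
        if (_active_merges >= _max_active_merges) {
            return false;  // this stripe is at its limit; other stripes are unaffected
        }
        ++_active_merges;
        return true;
    }

    // Called when a previously started merge has completed.
    void merge_done() {
        std::lock_guard<std::mutex> guard(_lock);
        assert(_active_merges > 0);
        --_active_merges;
    }

private:
    std::mutex  _lock;
    std::size_t _max_active_merges;
    std::size_t _active_merges = 0;  // plain counter: only touched while holding _lock
};
```

Because each stripe tracks its own count, one stripe reaching its limit does not stop another stripe from starting a merge.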
* Use max on stripes, instead of threads (Henning Baldersheim, 2020-03-04, 1 file, -2/+2)
|
* Use 2 threads per stripe. (Henning Baldersheim, 2020-03-03, 1 file, -2/+2)
|
* Add count metric for number of documents garbage collected (Tor Brede Vekterli, 2020-02-24, 14 files, -48/+100)
|     New distributor metric available as:
|     ```
|     vds.idealstate.garbage_collection.documents_removed
|     ```
|     Add documents removed statistics to `RemoveLocation` responses, which is what GC is currently built around. Could technically have been implemented as a diff of before/after BucketInfo, but GC is very low priority so many other mutating ops may have changed the bucket document set in the time span between sending the GC ops and receiving the replies.
|
|     This relates to issue #12139
* Avoid copying BucketState, when you only need BucketInfo. (Henning Baldersheim, 2020-02-15, 2 files, -3/+3)
|
* Twice as many stripes. (Henning Baldersheim, 2020-02-14, 6 files, -64/+31)
|
* extend crypto engine api (Håvard Pettersen, 2020-02-13, 1 file, -1/+1)
|     send spec for client connections to enable SNI as well as server name verification
* track the total number of connection objects (Håvard Pettersen, 2020-02-04, 5 files, -2/+50)
|
* Add include statements needed by newer build environments. (Tor Egge, 2020-01-26, 1 file, -0/+1)
|
* Include stdexcept before using std::runtime_error (Tor Egge, 2020-01-26, 1 file, -0/+1)
|
* Followup on code comments. (Henning Baldersheim, 2020-01-23, 1 file, -1/+1)
|
* Use a single chunk (Henning Baldersheim, 2020-01-23, 1 file, -27/+26)
|
* Merge pull request #11822 from vespa-engine/balder/reduce-bytebuffer-exposure (Henning Baldersheim, 2020-01-21, 16 files, -150/+60)
|\
| |     Balder/reduce bytebuffer exposure
| * Add stream method and use memcpy over casting. (Henning Baldersheim, 2020-01-21, 8 files, -13/+12)
| |
| * Add TODO for next commit. (Henning Baldersheim, 2020-01-20, 1 file, -0/+1)
| |
| * Make it known that getting serialized size will always be expensive. (Henning Baldersheim, 2020-01-20, 2 files, -3/+4)
| |
| * GC a load of unused code. ByteBuffer towards read only. (Henning Baldersheim, 2020-01-20, 9 files, -58/+35)
| |
| * GC unused code and simplify StructFieldValue. (Henning Baldersheim, 2020-01-17, 2 files, -65/+3)
| |
| * Remove complicated option for slicing as it is not used anywhere. (Henning Baldersheim, 2020-01-16, 3 files, -5/+7)
| |
| * Unify towards nbostream (Henning Baldersheim, 2020-01-16, 3 files, -19/+11)
| |
* | Merge pull request #11830 from vespa-engine/vekterli/support-weak-internal-read-consistency-for-client-gets (Tor Brede Vekterli, 2020-01-17, 14 files, -7/+154)
|\ \
| |/
|/|
| |     Vekterli/support weak internal read consistency for client gets
| * Add configurable support for weakly consistent client Gets (Tor Brede Vekterli, 2020-01-17, 13 files, -6/+153)
| |     If configured, Get operations initiated by the client are flagged with weak internal consistency. This allows the backend to bypass certain internal synchronization mechanisms, which minimizes latency at the cost of possibly not observing a consistent view of the document.
| |
| |     This config should only be used in a very restricted set of cases where the document set is effectively read-only, or cross-field consistency or freshness does not matter.
| |
| |     To enable the weak consistency, use an explicit config override:
| |     ```
| |     <config name="vespa.config.content.core.stor-distributormanager">
| |         <use_weak_internal_read_consistency_for_client_gets>
| |             true
| |         </use_weak_internal_read_consistency_for_client_gets>
| |     </config>
| |     ```
| |     This closes #11811
| * Add internal read consistency enum to storage protocol Get requests (Tor Brede Vekterli, 2020-01-16, 1 file, -1/+1)
| |
* | Remove an indirection for document id, for less memory footprint and better generated code. (Henning Baldersheim, 2020-01-16, 1 file, -11/+3)
|/
* Merge pull request #11782 from vespa-engine/balder/bring-you-backing-buffer-along (Henning Baldersheim, 2020-01-16, 4 files, -16/+16)
|\
| |     Balder/bring your backing buffer along
| * Just use the stream method. (Henning Baldersheim, 2020-01-16, 4 files, -9/+9)
| |
| * Remove virtuality of DocumentId. (Henning Baldersheim, 2020-01-14, 3 files, -21/+21)
| |
* | Avoid inconsistent auto-created document versions taking precedence (Tor Brede Vekterli, 2020-01-13, 5 files, -21/+105)
|/
|     Create-if-missing updates have a rather finicky behavior in the backend, wherein they'll set the timestamp of the previous document to that of the _new_ document timestamp if the update ended up creating a document from scratch. This particular behavior confuses the "after the fact" timestamp consistency checks, since it will seem like the document that was created from scratch is a better candidate to force convergence towards rather than the ones that actually updated an existing document.
|
|     With this change we therefore detect this case specially and treat the received timestamps as if the document updated had a timestamp of zero. This matches the behavior of regular (non auto-create) updates.
|
|     Note that another avenue for solving this would be to alter the returned timestamp in the backend to be zero instead, but this would cause issues during rolling upgrades since some of the content nodes would be returning zero timestamps while others would be returning non-zero. This would in turn trigger false positives for the inconsistency sanity checks.
|
|     Also note that this is a fallback path that should not be hit unless the a-priori inconsistency checks in the two-phase update operation somehow fail to recognize that the document versions may be out of sync.
|
|     This relates to issue #11686
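
The special-casing described above amounts to a small normalization step. The sketch below is an assumption-laden illustration: `effective_previous_timestamp` is a hypothetical helper, and detecting the auto-create case by comparing the reported previous timestamp with the new one is inferred from the message, not taken from the code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: normalize the "previous document" timestamp reported by a replica.
// For create-if-missing updates, a previous timestamp equal to the new timestamp is
// assumed to mean that the document was created from scratch on that replica.
uint64_t effective_previous_timestamp(uint64_t reported_previous,
                                      uint64_t new_timestamp,
                                      bool create_if_missing) {
    if (create_if_missing && reported_previous == new_timestamp) {
        return 0;  // treat as "no prior document", as regular updates would report
    }
    return reported_previous;
}
```

After normalization, a replica that created the document from scratch no longer looks newer than replicas that updated an existing document.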
* Merge pull request #11704 from vespa-engine/vekterli/support-config-disabling-of-merges (Tor Brede Vekterli, 2020-01-09, 7 files, -7/+57)
|\
| |     Support config disabling of merges
| * Upgrade log level to error for detected update inconsistencies (Tor Brede Vekterli, 2020-01-08, 1 file, -7/+7)
| |
| * Add distributor configuration for disabling merges for testing (Tor Brede Vekterli, 2020-01-08, 6 files, -0/+50)
| |     If config is set, all merges will be completely inhibited. This is useful for letting system tests operate against a bucket replica state that is deterministically out of sync.
| |
| |     This relates to issue #11686
* | Merge pull request #11692 from vespa-engine/toregge/system-time-and-steady-time-might-have-different-duration-types (Henning Baldersheim, 2020-01-08, 1 file, -1/+1)
|\ \
| | |     std::chrono::system_clock and std::chrono::steady_clock might have different duration types.
| * | Use default constructor for time point when duration since epoch is zero. (Tor Egge, 2020-01-08, 1 file, -1/+1)
| | |
| * | system_time and steady_time might have different duration types. (Tor Egge, 2020-01-08, 1 file, -1/+1)
| |/
* / Fix format strings. (Tor Egge, 2020-01-07, 2 files, -4/+4)
|/
* Ensure missing documents on replicas are not erroneously considered consistent (Tor Brede Vekterli, 2019-12-20, 4 files, -6/+38)
|     Introducing a new member in the category "stupid bugs that I should have added explicit tests for when adding the feature initially".
|
|     There was ambiguity in the GetOperation code where a timestamp sentinel value of zero was used to denote not having received any replies yet, but where a timestamp of zero also means "document not found on replica". This means that if the first reply was from a replica _without_ a document and the second reply was from a replica _with_ a document, the code would act as if the first reply effectively did not exist. Consequently the Get operation would be tagged as consistent.
|
|     This had very bad consequences for the two-phase update operation logic that relied on this information to be correct.
|
|     This change ensures there is no ambiguity between not having received a reply and having received a reply with a missing document.
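
One way to remove this kind of sentinel ambiguity is to keep "no reply yet" out of the timestamp's value range altogether. The sketch below uses `std::optional` for that; it is an illustration with a hypothetical `ConsistencyTracker` class and is not claimed to be how GetOperation was actually changed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical tracker for reply timestamps of a Get sent to several replicas.
class ConsistencyTracker {
public:
    // A timestamp of 0 means "document not found on replica" and is a valid reply.
    void on_reply(uint64_t timestamp) {
        if (!_first_timestamp.has_value()) {
            _first_timestamp = timestamp;  // first reply, even if it is 0
        } else if (*_first_timestamp != timestamp) {
            _consistent = false;           // replicas disagree
        }
    }

    bool consistent() const { return _consistent; }

private:
    std::optional<uint64_t> _first_timestamp;  // empty until the first reply arrives
    bool _consistent = true;
};
```

A first reply of 0 followed by a reply of 100 is now flagged as inconsistent, whereas with 0 doubling as the "no reply" sentinel the first reply would have been ignored.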
* Disable fast update path restarts by default (Tor Brede Vekterli, 2019-12-20, 4 files, -25/+30)
|     Even with the fix in #11561 we are still observing replica divergence warnings in the logs. Disabling this feature entirely until the issue has been fully investigated and a complete fix has been implemented.
|
|     Also emit a log message when the distributor has forced convergence of a detected inconsistent update.
* Merge branch 'master' into balder/reduce-timestamp-usage (Henning Baldersheim, 2019-12-20, 1 file, -1/+1)
|\
| * Multiple slashes in include paths mess up the mechanism in rpmbuild when extracting debuginfo. (Arnstein Ressem, 2019-12-17, 1 file, -1/+1)
| |
* | Drop timestamp.h (Henning Baldersheim, 2019-12-16, 8 files, -75/+27)
|/
* Avoid fast path update restart race with concurrently created replica (Tor Brede Vekterli, 2019-12-13, 5 files, -3/+58)
|     After the recent change to allow safe path updates to be restarted as fast path updates iff all observed document timestamps are equal, a race condition regression was introduced.
|
|     If the bucket that the update operation was scheduled towards got a new replica concurrently created _between_ the time that safe path Gets were sent and received, it was possible for updates to be sent to inconsistent replicas. This is because the Get and Update operations use the current database state at _their_ start time, not a stable snapshot state from the start time of the two phase update operation itself.
|
|     Add an explicit check that the replica state between sending Gets and Updates is unchanged. If it has changed, a fast path restart is _not_ permitted.
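
The added check can be pictured as comparing two snapshots of the bucket's replica set. In the sketch below, `ReplicaState` and `may_restart_in_fast_path` are hypothetical names, and representing a replica as a (node, checksum) pair is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical snapshot of a bucket's replicas: (content node index, bucket checksum).
using ReplicaState = std::vector<std::pair<uint16_t, uint32_t>>;

// A fast path restart is only allowed if all observed timestamps were equal AND
// the replica set seen when the Gets were sent is the same one the Updates would target.
bool may_restart_in_fast_path(const ReplicaState& when_gets_sent,
                              const ReplicaState& when_updates_sent,
                              bool all_timestamps_equal) {
    return all_timestamps_equal && (when_gets_sent == when_updates_sent);
}
```

If a third replica appears between the two snapshots, the comparison fails and the operation stays on the safe path.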
* Merge pull request #11507 from vespa-engine/balder/use-duration-in-messagebus-and-storageapi-rebased-1 (Henning Baldersheim, 2019-12-05, 27 files, -194/+170)
|\
| |     timeout as duration
| * Merge branch 'master' into balder/use-duration-in-messagebus-and-storageapi-rebased-1 (Henning Baldersheim, 2019-12-05, 12 files, -29/+46)
| |\
| * | Use getMessageNow (Henning Baldersheim, 2019-12-04, 1 file, -2/+0)
| | |