vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	Re-enable cluster state transition optimization	Tor Brede Vekterli	2019-05-21	1	-1/+11
\| \| \| \| \|	This optimization made side effects of the replica resurrection bug more likely to be observed, but it is not itself believed to be a root cause of any issues.
*	Avoid resurrecting replicas for nodes that are unavailable in pending state	Tor Brede Vekterli	2019-05-21	10	-332/+278
\| \| \| \| \| \| \| \| \| \| \|	We previously only checked for node availability in the _active_ state without looking at the pending state. This opened up a race condition where a reply from a previously DB-pruned node could bring a replica back into the DB iff received during a pending state window. Consider Maintenance as unavailable for this case, not just Down. Also move all `PutOperation` tests to GTest.
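The check described above can be sketched roughly as follows. This is only an illustration and not the distributor's actual code: `State`, `ClusterState`, `available_in` and `may_insert_replica` are hypothetical names, and the sketch assumes that only `Up` counts as available.

```cpp
#include <cstdint>
#include <map>

// Hypothetical, simplified node states for this sketch.
enum class State { Up, Maintenance, Down };

// Hypothetical cluster state: node index -> state.
using ClusterState = std::map<uint16_t, State>;

// A node is available only if it is known and Up.
// Maintenance is treated as unavailable, not just Down.
inline bool available_in(const ClusterState& cs, uint16_t node) {
    auto it = cs.find(node);
    return it != cs.end() && it->second == State::Up;
}

// A replica from `node` may be inserted only if the node is available in the
// active state and, when a transition is in progress, in the pending state too.
inline bool may_insert_replica(const ClusterState& active,
                               const ClusterState* pending,
                               uint16_t node) {
    if (!available_in(active, node)) {
        return false;
    }
    if (pending != nullptr && !available_in(*pending, node)) {
        return false;
    }
    return true;
}
```

With no pending state the active state alone decides; during a pending state window a node in Maintenance or Down in the pending state is rejected even if it is still Up in the active state.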
*	Avoid recomputing bucket keys during sorting step	Tor Brede Vekterli	2019-05-15	3	-18/+68
\| \| \| \| \| \| \| \| \| \|	Buckets are sorted in increasing key order before being merged into the database. Instead of repeatedly calling `toKey()` on entries, just store the key verbatim. Add a (by default disabled) benchmarking test for this case. With the bucket key change, running this locally brings total merge time for 917K buckets down from 2 seconds to 1.3 seconds.
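As a rough illustration of storing the key verbatim instead of recomputing it in the comparator, consider the sketch below. `compute_key`, `KeyedEntry` and `sort_by_stored_key` are hypothetical names, and the bit reversal only stands in for whatever `toKey()` actually does.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical key derivation (a 64-bit bit reversal); a stand-in for toKey().
inline uint64_t compute_key(uint64_t raw_id) {
    uint64_t key = 0;
    for (int i = 0; i < 64; ++i) {
        key = (key << 1) | (raw_id & 1);
        raw_id >>= 1;
    }
    return key;
}

// Entry that carries its precomputed key alongside the original identifier.
struct KeyedEntry {
    uint64_t key;     // computed once, stored verbatim
    uint64_t raw_id;  // original identifier
};

// Computes each key exactly once, then sorts in increasing key order
// using only the stored keys.
inline std::vector<KeyedEntry> sort_by_stored_key(const std::vector<uint64_t>& raw_ids) {
    std::vector<KeyedEntry> entries;
    entries.reserve(raw_ids.size());
    for (uint64_t id : raw_ids) {
        entries.push_back({compute_key(id), id});
    }
    std::sort(entries.begin(), entries.end(),
              [](const KeyedEntry& a, const KeyedEntry& b) { return a.key < b.key; });
    return entries;
}
```

The comparator runs O(n log n) times during the sort, so moving the key derivation out of it reduces the number of derivations to n.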
*	Disable bucket DB pruning elision optimization for now	Tor Brede Vekterli	2019-05-14	2	-11/+4
\| \| \| \| \| \|	There may be a currently unknown edge case where some valid edges do not trigger pruning as they should, so disable optimization entirely until we know for sure.
*	Merge pull request #9341 from ↵	Tor Brede Vekterli	2019-05-13	19	-180/+670
\|\ \| \| \| \| \| \| \| \|	vespa-engine/vekterli/add-distributor-btree-bucket-database-foundations Add initial B+tree distributor bucket database
\| *	Add initial B+tree distributor bucket database	Tor Brede Vekterli	2019-05-09	19	-180/+670
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Still uses legacy `BucketDatabase` API, which is not optimized for bulk loading or updating. Focus for this iteration is functional correctness rather than API redesign. Legacy DB is still the one wired in for all production logic. Unit tests have been expanded to cover discovered edge cases that were not properly tested for. Also move distributor bucket DB tests to GTest. Use value-parameterized test fixture instead of ad-hoc CppUnit approach.
* \|	Simplify the supervisor responsibility	Henning Baldersheim	2019-05-10	3	-22/+27
\|/
*	Elide bucket DB pruning scans when possible	Tor Brede Vekterli	2019-04-29	8	-10/+275
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Only do a (potentially expensive) pruning pass over the bucket DB when the cluster state transition indicates that one or more nodes are not in the same availability-state as the currently active state. For instance, if a cluster state change is for a content node going from Maintenance to Down, no pruning action is required since that shall already have taken place during the processing of the initial Down edge (and no buckets shall have been created for it in the meantime). We do not currently attempt to elide distribution config changes, as these happen much more rarely than cluster state changes.
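The elision rule above might look something like the following sketch. `NodeState`, `is_available` and `pruning_required` are hypothetical names, and which states count as available is an assumption of this sketch rather than something stated in the commit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified node states.
enum class NodeState { Up, Initializing, Retired, Maintenance, Down };

// Assumption for this sketch: Maintenance and Down are both unavailable.
inline bool is_available(NodeState s) {
    return s != NodeState::Maintenance && s != NodeState::Down;
}

// Returns true if at least one node differs in availability between the
// currently active state and the incoming state, i.e. a pruning pass is needed.
inline bool pruning_required(const std::vector<NodeState>& active,
                             const std::vector<NodeState>& incoming) {
    if (active.size() != incoming.size()) {
        return true;  // conservative choice: the node set itself changed
    }
    for (std::size_t i = 0; i < active.size(); ++i) {
        if (is_available(active[i]) != is_available(incoming[i])) {
            return true;
        }
    }
    return false;
}
```

A Maintenance to Down edge leaves availability unchanged, so the scan is skipped; an Up to Down edge changes it, so the scan runs.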
*	Cache super bucket ownership decisions when processing bucket DB	Tor Brede Vekterli	2019-04-25	2	-12/+46
\| \| \| \| \| \| \| \|	Distributor bucket ownership is assigned on a per superbucket basis, so all buckets with the same superbucket can use the same decision. The bucket DB is explicitly ordered in such a way that all buckets belonging to the same superbucket are ordered after each other, so we need only maintain O(1) extra state for this.
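Because buckets sharing a superbucket are adjacent in DB order, remembering only the most recent decision is enough. A minimal sketch of that idea follows; `OwnershipCache` and `count_owned` are hypothetical names, and the ownership callback stands in for the real (more expensive) ownership computation.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-entry cache: keeps only the last superbucket seen
// and the ownership decision made for it (O(1) extra state).
class OwnershipCache {
public:
    explicit OwnershipCache(std::function<bool(uint32_t)> owns_superbucket)
        : _owns_superbucket(std::move(owns_superbucket)) {}

    // Recomputes the decision only when the superbucket changes.
    bool owns(uint32_t superbucket) {
        if (!_has_last || _last_superbucket != superbucket) {
            _last_owned = _owns_superbucket(superbucket);
            _last_superbucket = superbucket;
            _has_last = true;
        }
        return _last_owned;
    }

private:
    std::function<bool(uint32_t)> _owns_superbucket;
    bool _has_last = false;
    uint32_t _last_superbucket = 0;
    bool _last_owned = false;
};

// Counts owned buckets given the superbucket of each bucket in DB order.
inline int count_owned(const std::vector<uint32_t>& superbuckets_in_db_order,
                       OwnershipCache& cache) {
    int owned = 0;
    for (uint32_t sb : superbuckets_in_db_order) {
        if (cache.owns(sb)) {
            ++owned;
        }
    }
    return owned;
}
```

The cache only pays off when the input is grouped by superbucket; with an arbitrary ordering it would recompute on nearly every bucket.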
*	Add metrics around bucket DB pruning and merging phases of state transitions	Tor Brede Vekterli	2019-04-12	3	-3/+21
\|
*	Limit number of persistence threads that can process merges in parallel	Tor Brede Vekterli	2019-04-03	5	-8/+50
\| \| \| \| \| \|	Avoids starving other operations when there is a lot of merge activity taking place. For now, 1/2 of the total persistence thread pool may process merges.
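One way to cap concurrent merge processing at half the pool is a simple counting limiter, sketched below. `MergeLimiter` is a hypothetical name, and both the rounding behaviour and the minimum of one slot are assumptions of this sketch, not details taken from the commit.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical limiter: at most half of the persistence threads
// may process merges at the same time.
class MergeLimiter {
public:
    explicit MergeLimiter(std::size_t total_threads)
        : _max_merges(total_threads / 2 > 0 ? total_threads / 2 : 1),  // assumed minimum of 1
          _active(0) {}

    // Tries to claim a merge slot; returns false if the cap is reached.
    bool try_acquire() {
        std::size_t current = _active.load();
        while (current < _max_merges) {
            // On failure, `current` is reloaded and the cap is re-checked.
            if (_active.compare_exchange_weak(current, current + 1)) {
                return true;
            }
        }
        return false;
    }

    // Releases a previously claimed slot.
    void release() { _active.fetch_sub(1); }

    std::size_t max_merges() const { return _max_merges; }

private:
    const std::size_t _max_merges;
    std::atomic<std::size_t> _active;
};
```

A thread that fails `try_acquire()` would pick up a non-merge operation instead, which is what keeps other operations from starving under heavy merge load.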
*	Convert BucketDBUpdaterTest from CppUnit to GTest	Tor Brede Vekterli	2019-03-27	3	-896/+597
\|
*	Merge pull request #8882 from ↵	Tor Brede Vekterli	2019-03-26	32	-151/+962
\|\ \| \| \| \| \| \| \| \|	vespa-engine/vekterli/add-read-only-support-during-cluster-state-transitions Add read-only support during cluster state transitions
\| *	Address code review feedback for distributor changes	Tor Brede Vekterli	2019-03-26	2	-7/+15
\| \|
\| *	Minor C++ cleanups	Tor Brede Vekterli	2019-03-22	6	-7/+8
\| \|
\| *	Always allow activation commands through bouncer component	Tor Brede Vekterli	2019-03-20	2	-0/+15
\| \| \| \| \| \| \| \| \| \|	Otherwise we'd miss activation commands sent for a cluster state in which our own node is marked down.
\| *	Test more BucketDBUpdater two-phase transition edge cases	Tor Brede Vekterli	2019-03-20	3	-58/+98
\| \|
\| *	Properly handle non-owned vs. missing buckets	Tor Brede Vekterli	2019-03-15	9	-52/+250
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bonus: no more spurious "we have removed buckets" log messages caused by ownership changes. Also ensure that we BUSY-bounce operations in `ExternalOperationHandler` when there is no actual state to send back in a `WrongDistributionReply`.
\| *	WIP on BucketDBUpdater explicit activation support	Tor Brede Vekterli	2019-03-14	6	-7/+115
\| \|
\| *	Basic handling of activate_cluster_state_version RPC in backend	Tor Brede Vekterli	2019-03-14	7	-14/+122
\| \|
\| *	Move non-owned buckets to read-only DB and allow use for read-only ops	Tor Brede Vekterli	2019-03-14	12	-66/+321
\| \|
\| *	Add read-only bucket space repo and wire it through distributor components	Tor Brede Vekterli	2019-03-14	15	-51/+129
\| \|
* \|	include content length in http response	Håvard Pettersen	2019-03-26	1	-0/+3
\| \|
* \|	Revert "include content length in http response"	Harald Musum	2019-03-25	1	-3/+0
\| \|
* \|	include content length in http response	Håvard Pettersen	2019-03-25	1	-0/+3
\| \|
* \|	Revert typecasting of variables sent to JsonStream, instead assume that	Tor Egge	2019-03-15	1	-2/+2
\| \| \| \| \| \| \| \|	JsonStream will get overloads for the relevant fundamental types.
* \|	Adjust types in storage module.	Tor Egge	2019-03-14	6	-20/+20
\| \|
* \|	Adjust build setup for Darwin.	Tor Egge	2019-03-14	1	-1/+1
\|/
*	cinttypes must be included before Judy.h.	Tor Egge	2019-03-13	1	-0/+1
\|
*	Fix format strings in storage module.	Tor Egge	2019-03-12	16	-39/+39
\|
*	Add '()' to macro definition.	Geir Storli	2019-03-01	1	-1/+1
\|
*	Simplify.	Geir Storli	2019-03-01	1	-1/+0
\|
*	Merge pull request #8616 from ↵	Geir Storli	2019-02-27	2	-6/+20
\|\ \| \| \| \| \| \| \| \|	vespa-engine/vekterli/log-bucket-info-before-and-after-on-update-inconsistency Log before/after bucket info for when update operation inconsistency is discovered
\| *	Log before/after bucket info for when update operation inconsistency is ↵	Tor Brede Vekterli	2019-02-26	2	-6/+20
\| \| \| \| \| \| \| \| \| \| \| \|	discovered Makes it more obvious if the inconsistency is likely due to e.g. a checksum collision.
* \|	Eliminate some gcc 9 warnings.	Tor Egge	2019-02-25	1	-0/+2
\|/
*	Merge pull request #8588 from ↵	Tor Brede Vekterli	2019-02-25	6	-36/+70
\|\ \| \| \| \| \| \| \| \|	vespa-engine/vekterli/do-not-bruteforce-abort-client-ops-during-orchestrated-down Fail client ops gracefully when distributor is marked down
\| *	Fail client ops gracefully when distributor is marked down	Tor Brede Vekterli	2019-02-22	6	-36/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, clients would only receive `ABORTED` when the distributor was marked down by orchestration. This would simply cause the client to resend until either the `StoragePolicy` would discard the cluster state entirely and retry against a working distributor, or the operations would time out. Now they will receive a `WrongDistributionReply` that shall immediately update the `StoragePolicy` to avoid sending to the distributor that has been marked down. Also add a separate metric for number of operations aborted by `Bouncer`. This fixes #8448.
* \|	Reduce code duplication in gtest runners.	Geir Storli	2019-02-22	1	-8/+2
\|/
*	Add workarounds for legacy global distribution hash handling	Tor Brede Vekterli	2019-02-21	9	-21/+277
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This addresses a regression introduced as part of #8479, which in turn was intended to serve as a fix for issue #8475. This regression would stall cluster state convergence when a subset of nodes contained the fix and another subset did not. With the workarounds present, nodes gracefully handle the case where different distribution hashes are expected for the global bucket space. `BucketManager` will now fall back to comparing the new incoming hash to that of the legacy derived distribution config if it mismatches. `PendingClusterState` will try to send a subset of bucket info requests with legacy hash format for the global bucket space iff there has been at least 1 rejected request. All these workarounds will be removed on Vespa 8.
*	Stop running storage unit tests in parallel, the tests can interfere	Tor Egge	2019-02-20	1	-3/+1
\| \| \| \|	with each other, cf. PersistenceTestUtils::setupDisks().
*	Use ASSERT_EQ when checking vector sizes.	Geir Storli	2019-02-18	1	-4/+4
\|
*	Add gtest runner in storage and migrate bucketmovertest from CppUnit to gtest.	Geir Storli	2019-02-18	7	-60/+61
\|
*	Derive correct distribution partition spec for grouped clusters	Tor Brede Vekterli	2019-02-12	2	-27/+31
\| \| \| \| \| \| \| \|	Simplify code by emitting wildcards for all groups instead of using explicit leaf counts. Distribution code will distribute replicas evenly across all wildcarded groups. This fixes #8475
*	Merge pull request #8443 from ↵	Tor Brede Vekterli	2019-02-11	6	-45/+159
\|\ \| \| \| \| \| \| \| \|	vespa-engine/vekterli/add-per-bucket-space-data-metrics-on-content-node Expose data metrics per bucket space on content node
\| *	Use non-generic dimension name for bucket spaces	Tor Brede Vekterli	2019-02-11	1	-1/+1
\| \|
\| *	Expose data metrics per bucket space on content node	Tor Brede Vekterli	2019-02-08	6	-45/+159
\| \| \| \| \| \| \| \|	Legacy metrics that cover all bucket spaces remain unchanged.
* \|	Eliminate some clang warnings in storage.	Tor Egge	2019-02-10	13	-48/+8
\|/
*	Update metric descriptions	Tor Brede Vekterli	2019-02-07	1	-2/+2
\|
*	Rename and restructure C++ TLS metrics	Tor Brede Vekterli	2019-02-07	2	-20/+31
\| \| \| \| \|	- Use dashes instead of underscores - Explicitly separate client/server metrics in metric path
*	Append node identity to response messages sent by Bouncer component	Tor Brede Vekterli	2019-02-04	3	-6/+16
\|