vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	- Control tcp-no-delay explicitly.	Henning Baldersheim	2021-07-02	2	-2/+9
\| \| \| \|	- Wire in configuration of number of rpc targets.
*	Support multiple network threads in mbus.	Henning Baldersheim	2021-07-02	2	-5/+6
\|
*	Migrate distributor stripe unit tests that configure using config builder.	Geir Storli	2021-06-29	2	-5/+116
\|
*	Properly configure the stripe instead of accessing the underlying config ↵	Geir Storli	2021-06-29	3	-12/+39
\| \| \| \|	instance.
*	Update todos.	Geir Storli	2021-06-29	1	-20/+21
\|
*	Rename functions to be more stripe centric.	Geir Storli	2021-06-28	3	-65/+66
\|
*	Migrate stripe tests from LegacyDistributorTest to DistributorStripeTest.	Geir Storli	2021-06-28	4	-21/+528
\| \| \| \|	They are still present in LegacyDistributorTest as long as legacy mode exists.
*	Merge pull request #18409 from ↵	Geir Storli	2021-06-25	9	-1/+821
\|\ \| \| \| \| \| \| \| \|	vespa-engine/geirst/baseline-utils-and-tests-for-distributor-stripe Prepare baseline utils and tests for a single distributor stripe.
\| *	Prepare baseline utils and tests for a single distributor stripe.	Geir Storli	2021-06-25	9	-1/+821
\| \| \| \| \| \| \| \| \| \|	This is copied from DistributorTestUtil and LegacyDistributorTest, and adjusted to work with one distributor stripe.
* \|	Avoid race condition regression introduced in #18179	Tor Brede Vekterli	2021-06-24	4	-5/+131
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We would previously check for the presence of pending null-bucket `RequestBucketInfoCommand`s to determine if a pending cluster state was present. We would also attempt to block all bucket delete operations from starting if _any_ operation was pending towards that bucket on a given node, including bucket info requests. The former was rewritten to explicitly consider pending cluster state checks instead, as checking null buckets no longer works when using stripes. Unfortunately, due to a long-standing bug with message tracking of `RequestBucketInfoCommand`s, these would _always_ be marked as pending towards the null bucket. Since all ideal state ops would be blocked by null-bucket info requests, this would avoid starting any ideal state op as long as _any_ other op had an info request pending for the target node. This had the desirable (but not explicitly coded for) side effect of inhibiting bucket deletions from racing with half-finished merge operations. It also had the undesirable effect of needlessly blocking ops for completely unrelated buckets. With these changes, we now explicitly handle bucket info requests for single buckets in the `PendingMessageTracker`, allowing inhibition of deletions to work as expected. Also add an explicit check for pending info requests for all ideal state ops to mirror the old behavior (but now per-bucket instead of globally...!).
*	Merge pull request #18287 from ↵	Geir Storli	2021-06-16	1	-10/+2
\|\ \| \| \| \| \| \| \| \|	vespa-engine/geirst/fix-stripe-dispatch-of-request-bucket-info-reply Dispatch RequestBucketInfoReply for non-existing buckets to correct d…
\| *	Dispatch RequestBucketInfoReply for non-existing buckets to correct ↵	Geir Storli	2021-06-16	1	-10/+2
\| \| \| \| \| \| \| \|	distributor stripe.
* \|	Rename bucket db updater test that is only testing legacy mode.	Geir Storli	2021-06-16	4	-96/+98
\| \|
* \|	Rename distributor test that is only testing legacy mode.	Geir Storli	2021-06-16	5	-70/+72
\|/
*	Assert that the storage message has a valid bucket id for striping.	Geir Storli	2021-06-15	2	-9/+11
\|
*	Add per stripe handling of ideal state metrics with aggregation on top.	Geir Storli	2021-06-15	9	-36/+127
\| \| \| \|	This is handled similarly to per stripe distributor metrics.
*	Merge pull request #18247 from ↵	Geir Storli	2021-06-15	2	-4/+20
\|\ \| \| \| \| \| \| \| \|	vespa-engine/toregge/aggregate-metrics-on-the-fly-when-adding-to-snapshot Aggregate distributor metrics when adding to snapshot.
\| *	Factor out common code to private helper member function.	Tor Egge	2021-06-14	2	-8/+12
\| \|
\| *	Aggregate distributor metrics when adding to snapshot.	Tor Egge	2021-06-14	2	-0/+12
\| \|
* \|	Measure queue size after elements have been inserted, and stabilize test by ↵	Henning Baldersheim	2021-06-14	2	-2/+3
\|/ \| \| \|	waiting for full Q
*	Only support legacy mode in getActiveIdealStateOperations().	Geir Storli	2021-06-11	2	-23/+2
\| \| \| \|	This function is only used by idealstatemanagertest in the context of testing a single stripe.
*	Merge pull request #18219 from ↵	Geir Storli	2021-06-11	5	-10/+88
\|\ \| \| \| \| \| \| \| \|	vespa-engine/toregge/aggregate-metrics-from-distributor-stripes-pass-2 Aggregate metrics from distributor stripes.
\| *	Remove stale comments.	Tor Egge	2021-06-11	2	-5/+3
\| \| \| \| \| \| \| \|	Reorder member variables.
\| *	Aggregate metrics from distributor stripes.	Tor Egge	2021-06-11	5	-6/+86
\| \|
* \|	Merge pull request #18218 from ↵	Geir Storli	2021-06-11	1	-0/+1
\|\ \ \| \|/ \|/\| \| \| \| \|	vespa-engine/geirst/getnodestate-command-in-distributor-main-thread Handle GetNodeStateCommand in distributor main thread when running in…
\| *	Handle GetNodeStateCommand in distributor main thread when running in new ↵	Geir Storli	2021-06-11	1	-0/+1
\| \| \| \| \| \| \| \|	stripe mode.
* \|	Provide correct stripe index to notify_stripe_wants_to_send_host_info.	Tor Egge	2021-06-11	1	-1/+1
\|/
*	MinReplica counts the minimum bucket replication factor, so use std::min ↵	Geir Storli	2021-06-11	2	-2/+4
\| \| \| \|	instead of sum.
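The fix above can be illustrated with a minimal sketch, assuming per-stripe statistics are kept as a map from node index to the lowest replication factor seen. The type `MinReplicaMap` and the function `aggregate_min_replica` are hypothetical names for illustration, not the actual Vespa code. Summing the per-stripe values would overstate the result; taking `std::min` keeps the lowest factor any stripe observed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical type: node index -> lowest replication factor observed
// for any bucket with a replica on that node.
using MinReplicaMap = std::unordered_map<uint16_t, uint32_t>;

// Merge the per-stripe maps into one. When several stripes report a value
// for the same node, keep the smallest one instead of adding them up.
MinReplicaMap aggregate_min_replica(const std::vector<MinReplicaMap>& per_stripe) {
    MinReplicaMap result;
    for (const auto& stripe : per_stripe) {
        for (const auto& [node, min_replica] : stripe) {
            auto [it, inserted] = result.try_emplace(node, min_replica);
            if (!inserted) {
                it->second = std::min(it->second, min_replica);
            }
        }
    }
    return result;
}
```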
*	Implement aggregation across distributor stripes for min replica, bucket ↵	Geir Storli	2021-06-10	12	-15/+316
\| \| \| \|	spaces, and pending maintenance stats.
*	Block ideal state ops when a pending cluster state is present	Tor Brede Vekterli	2021-06-09	29	-139/+223
\| \| \| \| \| \| \| \| \| \| \| \|	Since distributor stripes no longer have access to the top-level pending message tracking info, it's no longer possible to infer if a pending cluster state is happening by looking at the sent messages. Instead, do this more generally (and efficiently) by looking at the potential pending cluster state directly. Rewire the `isBlocked` logic to take in an operation context instead of just a `PendingMessageTracker`, giving it access to a lot more relevant information.
*	Route CreateVisitorCommand with too few used bits in the super bucket id to ↵	Geir Storli	2021-06-09	2	-5/+26
\| \| \| \| \| \|	a random distributor stripe. Such commands will eventually be bounced with WRONG_DISTRIBUTION when handled by the stripe.
*	addValue -> set for gauge metric.	Henning Baldersheim	2021-06-07	1	-1/+1
\|
*	Add queue size metric	Henning Baldersheim	2021-06-06	3	-1/+5
\|
*	Merge pull request #18102 from ↵	Geir Storli	2021-06-03	1	-0/+7
\|\ \| \| \| \| \| \| \| \|	vespa-engine/geirst/dispatch-get-and-visitor-messages-to-stripe Dispatch get and visitor messages to correct distributor stripe.
\| *	Dispatch get and visitor messages to correct distributor stripe.	Geir Storli	2021-06-03	1	-0/+7
\| \|
* \|	Use a hash map for specs. If the request is a point lookup then just use a ↵	Henning Baldersheim	2021-06-02	2	-4/+3
\|/ \| \| \| \| \| \| \|	hash lookup. If it is a wildcard lookup iterate as earlier on. Also use vespalib::stringref in interface to avoid conversion. Use vespalib::string in the hash map to locate string in object as we are still on old abi.
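A rough sketch of the lookup strategy described above, using standard library types in place of `vespalib::stringref` and `vespalib::string`. The names `SpecMap`, `glob_match` and `lookup` are hypothetical, and the wildcard rule (`*` matches any run of characters except `/`) is an assumption made for this example. A pattern without `*` is a point lookup and costs one hash probe; only wildcard patterns scan the whole map.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>
#include <vector>

using SpecMap = std::unordered_map<std::string, std::string>; // name -> spec

// Assumed wildcard rule: '*' matches zero or more characters, never '/'.
bool glob_match(std::string_view pattern, std::string_view name) {
    if (pattern.empty()) {
        return name.empty();
    }
    if (pattern.front() == '*') {
        for (size_t i = 0; ; ++i) {
            if (glob_match(pattern.substr(1), name.substr(i))) {
                return true;
            }
            if (i == name.size() || name[i] == '/') {
                return false; // '*' cannot consume past end of name or a '/'
            }
        }
    }
    return !name.empty() && pattern.front() == name.front()
           && glob_match(pattern.substr(1), name.substr(1));
}

std::vector<std::pair<std::string, std::string>>
lookup(const SpecMap& specs, std::string_view pattern) {
    std::vector<std::pair<std::string, std::string>> result;
    if (pattern.find('*') == std::string_view::npos) {
        // Point lookup: a single hash probe.
        auto it = specs.find(std::string(pattern));
        if (it != specs.end()) {
            result.emplace_back(it->first, it->second);
        }
        return result;
    }
    // Wildcard lookup: iterate over every entry, as before.
    for (const auto& [name, spec] : specs) {
        if (glob_match(pattern, name)) {
            result.emplace_back(name, spec);
        }
    }
    return result;
}
```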
*	Add proof of concept support for multiple distributor stripes.	Geir Storli	2021-06-01	6	-23/+83
\| \| \| \| \| \| \| \| \|	The most basic functionality is now supported using multiple distributor stripes (and threads). Note that the following is (at least) still missing:
\|	* Stripe-separate metrics with top-level aggregation.
\|	* Aggregation over all stripes in misc functions in Distributor that currently is using the first stripe.
\|	* Handling of messages without bucket id in the top-level Distributor instead of using the first stripe.
*	Merge pull request #18050 from ↵	Geir Storli	2021-06-01	6	-7/+39
\|\ \| \| \| \| \| \| \| \|	vespa-engine/geirst/validate-distributor-stripes-config Add validation of the number of distributor stripes from config and a…
\| *	Add validation of the number of distributor stripes from config and add more ↵	Geir Storli	2021-06-01	6	-7/+39
\| \| \| \| \| \| \| \| \| \| \| \|	asserts. This ensures the number of stripes is a power of 2 and within MaxStripes boundary.
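A minimal sketch of the validation described above. The limit `kAssumedMaxStripes` is a placeholder value chosen for this example (the actual MaxStripes boundary is not stated here), and `validated_num_stripes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Assumption: placeholder upper bound, not the real MaxStripes value.
constexpr size_t kAssumedMaxStripes = 256;

// A power of two has exactly one bit set; zero is rejected explicitly.
constexpr bool is_power_of_two(size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Returns the configured stripe count if valid, otherwise throws.
size_t validated_num_stripes(size_t configured) {
    if (!is_power_of_two(configured) || configured > kAssumedMaxStripes) {
        throw std::invalid_argument(
            "Number of distributor stripes must be a power of 2 and at most "
            + std::to_string(kAssumedMaxStripes) + ", got "
            + std::to_string(configured));
    }
    return configured;
}
```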
* \|	Minor code cleanup	Tor Brede Vekterli	2021-05-31	2	-34/+5
\| \|
* \|	Do not block global merges to nodes tagged as busy	Tor Brede Vekterli	2021-05-31	3	-5/+39
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \|	To avoid starvation of high priority global bucket merges, we do not consider these for blocking due to a node being "busy" (usually caused by a full merge throttler queue). This is for two reasons:
\|	1. When an ideal state op is blocked, it is still removed from the internal maintenance priority queue. This means a blocked high pri operation will not be retried until the next DB pass (at which point the node is likely to still be marked as busy when there's heavy merge traffic).
\|	2. Global bucket merges have high priority and will most likely be allowed to enter the merge throttler queues, displacing lower priority merges.
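The blocking rule above can be sketched roughly as follows. `BucketSpace`, `MergeOp` and `is_blocked_by_busy_node` are hypothetical names for illustration and do not mirror the real distributor classes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

enum class BucketSpace { Default, Global };

// Hypothetical, simplified view of a merge operation.
struct MergeOp {
    BucketSpace space;
    std::vector<uint16_t> target_nodes;
};

// A merge is blocked if any target node is tagged as busy, except that
// merges in the global bucket space are never blocked for this reason.
bool is_blocked_by_busy_node(const MergeOp& op,
                             const std::unordered_set<uint16_t>& busy_nodes) {
    if (op.space == BucketSpace::Global) {
        return false;
    }
    return std::any_of(op.target_nodes.begin(), op.target_nodes.end(),
                       [&](uint16_t node) { return busy_nodes.count(node) != 0; });
}
```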
*	Make merge_entries_into_db() work across multiple stripes by handling each ↵	Geir Storli	2021-05-28	5	-12/+123
\| \| \| \|	stripe in sequence.
*	Add common utils to map from bucket key to stripe and calculate number of ↵	Geir Storli	2021-05-28	7	-12/+112
\| \| \| \|	stripe bits.
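One way such utilities could look, as a sketch only. It assumes the stripe count is a power of two and that the stripe is selected by the most significant bits of a 64-bit bucket key; the function names and the bit layout are assumptions for this example rather than a statement of the actual implementation.

```cpp
#include <cassert>
#include <cstdint>

// Number of key bits needed to address n_stripes stripes.
// Precondition (assumed): n_stripes is a non-zero power of two.
uint8_t calc_num_stripe_bits(uint32_t n_stripes) {
    assert(n_stripes != 0 && (n_stripes & (n_stripes - 1)) == 0);
    uint8_t bits = 0;
    while ((uint32_t{1} << bits) < n_stripes) {
        ++bits;
    }
    return bits;
}

// Assumed mapping: the top n_stripe_bits bits of the key pick the stripe.
// With zero stripe bits there is only one stripe, so everything maps to 0
// (this also avoids an undefined 64-bit shift).
uint32_t stripe_of_bucket_key(uint64_t key, uint8_t n_stripe_bits) {
    if (n_stripe_bits == 0) {
        return 0;
    }
    return static_cast<uint32_t>(key >> (64 - n_stripe_bits));
}
```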
*	admin/slobrok.0 does not always exist ....... anymore.	Henning Baldersheim	2021-05-27	1	-1/+1
\|
*	Add a couple of stripe work TODOs	Tor Brede Vekterli	2021-05-26	2	-0/+4
\| \| \| \| \| \| \| \|	- Ideal state ops cannot look at null-bucket messages for determining if full bucket checks are pending when running in striped mode, as these are not handled by stripes when not in legacy mode.
\|	- State checker context should use ideal state cache instead of recomputing for every checked bucket (observed via `perf` in production).
*	Remove extra braces from initializer list.	Tor Egge	2021-05-25	1	-1/+1
\|
*	Merge pull request #17943 from vespa-engine/vekterli/minor-code-cleanup	Geir Storli	2021-05-21	15	-72/+41
\|\ \| \| \| \|	Minor cleanups in distributor maintenance handling code
\| *	Minor cleanups in distributor maintenance handling code	Tor Brede Vekterli	2021-05-21	15	-72/+41
\| \| \| \| \| \| \| \|	No functional changes
* \|	GC unused code	Henning Baldersheim	2021-05-21	1	-26/+0
\|/
*	Make distributor timestamp generation thread safe	Tor Brede Vekterli	2021-05-18	5	-23/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	New behavior:
\|	- Only allow time to travel forwards within a given distributor process' lifetime. This is a change from the old behavior, which would emit a warning to the logs and happily continue from a previously used second, possibly causing the distributor to reuse timestamps.
\|	- Try to detect cases where the wall clock has been transiently set far into the future--only to bounce back--by aborting the process if the current observed time is more than 120 seconds older than the highest observed wall clock time. This is an attempt to avoid generating _too_ many bogus future timestamps, as the distributor would otherwise continue generating timestamps within the highest observed second.
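The behavior described above might be sketched as below. This is an illustration under several assumptions: the class name `MonotonicTimestampGenerator` is hypothetical, the timestamp layout (seconds times one million plus a per-second counter) is assumed, the wall clock is passed in as a parameter, and the sketch throws where the real process would abort so that it can be exercised in a test. Counter overflow within a single second is not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <stdexcept>

class MonotonicTimestampGenerator {
public:
    // Limit taken from the commit message: 120 seconds.
    static constexpr uint64_t kMaxBackwardsSeconds = 120;

    // Returns a timestamp that is strictly greater than any previously
    // returned one. Safe to call from multiple threads.
    uint64_t next(uint64_t wall_clock_seconds) {
        std::lock_guard<std::mutex> guard(_lock);
        if (wall_clock_seconds > _highest_seen_seconds) {
            // Time moved forwards: start a fresh second.
            _highest_seen_seconds = wall_clock_seconds;
            _counter = 0;
        } else {
            // Time stood still or went backwards: stay in the highest
            // observed second, unless the clock jumped back too far.
            if (_highest_seen_seconds - wall_clock_seconds > kMaxBackwardsSeconds) {
                // Assumption: stands in for aborting the process.
                throw std::runtime_error("wall clock moved too far backwards");
            }
            ++_counter;
        }
        return _highest_seen_seconds * 1000000 + _counter;
    }

private:
    std::mutex _lock;
    uint64_t   _highest_seen_seconds = 0;
    uint64_t   _counter = 0;
};
```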