vespa - An engine for low-latency computation over large data sets

	Commit message	Author	Age	Files	Lines
*	No functional changes	Jon Bratseth	2021-06-01	1	-1/+1
\|
*	GC some unused methods and simplify	Henning Baldersheim	2021-05-23	1	-2/+5
\|
*	Let the supervisor owner set the small buffer option	Jon Marius Venstad	2021-05-03	1	-0/+1
\|
*	One more lazy	Jon Marius Venstad	2021-04-28	1	-1/+2
\|
*	More lazy debug log message generation	Jon Marius Venstad	2021-04-28	12	-108/+98
\|
*	Disallow >1 group to suspend	Håkon Hallingstad	2021-04-16	3	-12/+101
\|	If there is more than one group, disallow suspending a node if there is a node in another group that has a user wanted state != UP. If there is 1 group, disallow suspending more than 1 node.
*	No longer allow suspension if in maintenance	Håkon Hallingstad	2021-04-15	1	-4/+2
\|	If a storage node falls out of Slobrok, it will change from UP to Maintenance after 60s, then after a further 30s go to Down. Avoid allowing suspension in the 30s grace period just because the node is in Maintenance mode.
*	Add remote task queue size metric in cluster controller	Håkon Hallingstad	2021-04-01	2	-16/+24
\|
*	Log when transitioning out of CC moratorium	Håkon Hallingstad	2021-03-26	1	-6/+3
\|
*	Make default deadline to first broadcast 30s	Håkon Hallingstad	2021-03-24	3	-3/+5
\|
*	Revert "Revert "Avoid safe mutations in master moratorium and increase first cluster state broadcast deadline [run-systemtest]""	Håkon Hallingstad	2021-03-24	9	-12/+45
\|
*	Revert "Avoid safe mutations in master moratorium and increase first cluster state broadcast deadline [run-systemtest]"	Håkon Hallingstad	2021-03-24	9	-45/+12
\|
*	Merge pull request #17085 from vespa-engine/hakonhall/increase-the-minimum-time-before-first-cluster-state-broadcast-run-systemtest	Håkon Hallingstad	2021-03-24	9	-12/+45
\|\	Avoid safe mutations in master moratorium and increase first cluster state broadcast deadline [run-systemtest]
\| *	Avoid safe-set-node-state in master moratorium	Håkon Hallingstad	2021-03-24	8	-11/+42
\| \|
\| *	Increase the minimum time before first cluster state broadcast [run-systemtest]	Håkon Hallingstad	2021-03-19	1	-1/+3
\| \|
* \|	Revert deferred ZK connectivity for now	Tor Brede Vekterli	2021-03-22	2	-9/+1
\| \|	Instead, we'll want to create a more generalized solution that considers all sources of node information (Slobrok _and_ explicit health check RPCs) before potentially publishing a state or processing tasks.
* \|	Make sure to reset any election shortcuts if we go from !ZK -> ZK	Tor Brede Vekterli	2021-03-19	1	-5/+13
\| \|
* \|	Use local leader state for decisions rather than election handler	Tor Brede Vekterli	2021-03-19	1	-5/+7
\| \|	Avoids potentially publishing cluster states _before_ we have triggered our own leadership election edge handling code. Could happen if code called prior to the election edge logic checked the election handler state and erroneously thought we had performed the prerequisite actions we're supposed to do when assuming leadership (such as reading back current state from ZK).
* \|	Don't allow short-circuiting election phase if only one node configured if using ZK	Tor Brede Vekterli	2021-03-19	2	-2/+10
\| \|
* \|	Inhibit ZooKeeper connections until our local Slobrok mirror is ready.	Tor Brede Vekterli	2021-03-19	4	-1/+24
\|/	Otherwise, if there are transient Slobrok issues during CC startup and we end up winning the election, we risk publishing a cluster state where the entire cluster appears down (since we do not have any knowledge of Slobrok node mapping state). This will adversely affect availability for all the obvious reasons.
*	Guard against ever accidentally publishing a default constructed state	Tor Brede Vekterli	2021-03-19	2	-4/+4
\|	Since version 0 states were ambiguous with the sentinel values for "not written to ZK/not tagged as official", this could be misinterpreted.
*	use US locale	Kristian Aune	2021-03-19	1	-4/+5
\|
*	Revert "Inhibit ZooKeeper connections until our local Slobrok mirror is ready."	Tor Brede Vekterli	2021-03-18	4	-24/+1
\|
*	Merge pull request #17029 from vespa-engine/vekterli/inhibit-db-connectivity-until-slobrok-is-ready	Tor Brede Vekterli	2021-03-18	4	-1/+24
\|\	Inhibit ZooKeeper connections until our local Slobrok mirror is ready.
\| *	Guard against Slobrok mirror not yet being configured	Tor Brede Vekterli	2021-03-18	2	-6/+2
\| \|
\| *	Inhibit ZooKeeper connections until our local Slobrok mirror is ready.	Tor Brede Vekterli	2021-03-18	4	-1/+28
\| \|	Otherwise, if there are transient Slobrok issues during CC startup and we end up winning the election, we risk publishing a cluster state where the entire cluster appears down (since we do not have any knowledge of Slobrok node mapping state). This will adversely affect availability for all the obvious reasons.
* \|	Revert "Revert "GC unused DiskState and add the partition metrics to node level.""	Henning Baldersheim	2021-03-12	6	-12/+35
\| \|
* \|	Revert "GC unused DiskState and add the partition metrics to node level."	Harald Musum	2021-03-12	6	-35/+12
\| \|
* \|	GC unused DiskState and add the partition metrics to node level.	Henning Baldersheim	2021-03-12	6	-12/+35
\|/
*	Merge pull request #16926 from vespa-engine/vekterli/dont-store-full-bundle-objects-in-state-history	Tor Brede Vekterli	2021-03-12	5	-89/+154
\|\	Don't store full bundle objects in state history
\| *	Add missing copyright	Tor Brede Vekterli	2021-03-12	1	-0/+1
\| \|
\| *	Move config output further down on the status page	Tor Brede Vekterli	2021-03-12	1	-2/+2
\| \|	Always print regardless of leader eligibility state; config is not predicated on this.
\| *	Move ZK/election-related info away from top of CC status page	Tor Brede Vekterli	2021-03-12	1	-2/+2
\| \|	Much less immediately interesting than the actual cluster node information. Move it just above the general event log instead.
\| *	Don't store full bundle objects in cluster state history	Tor Brede Vekterli	2021-03-12	5	-85/+149
\| \|	Bundles have a lot of sub-objects per state, so in systems with a large number of node entries, this adds unnecessary pressure on the heap. Instead, store the string representations of the bundle and the string representation of the diff to the previous state version (if any). This is also inherently faster than computing the diffs on-demand on every status page render.
\| \|	Also remove mutable `official` field from `ClusterState`. Not worth violating immutability of an object just to get some prettier (but with high likelihood actually more confusing) status page rendering.
* \|	Revert "GC unused DiskState"	Arnstein Ressem	2021-03-12	5	-7/+105
\|/
*	Merge pull request #16911 from vespa-engine/balder/gc-disk-states	Henning Baldersheim	2021-03-12	5	-105/+7
\|\	GC unused DiskState
\| *	GC Partition	Henning Baldersheim	2021-03-11	1	-12/+0
\| \|
\| *	GC unused DiskState	Henning Baldersheim	2021-03-11	4	-93/+7
\| \|
* \|	Merge pull request #16900 from vespa-engine/bjorncs/zookeeper-client-common	Bjørn Christian Seime	2021-03-12	1	-10/+10
\|\ \ \| \|/ \|/\|	Add shared ZK client config generator for zkfacade and vespa-zkcli [run-systemtest]
\| *	Construct ZKClientConfig from ZkClientConfigBuilder	Bjørn Christian Seime	2021-03-11	1	-10/+10
\| \|	Use ZKClientConfig builder from Curator and ZooKeeperDatabase
* \|	GC use of void DiskState.	Henning Baldersheim	2021-03-11	2	-2/+9
\| \|
* \|	GC use of NodeState.getDiskCount and NodeState.getDiskStates.	Henning Baldersheim	2021-03-11	3	-27/+11
\| \|
* \|	GC long gone disk state checks.	Henning Baldersheim	2021-03-11	1	-1/+6
\|/
*	Immediately exit cluster controller if node index config is changed live	Tor Brede Vekterli	2021-03-09	1	-3/+20
\|	We do not support live reconfigs of CC index, so swiftly exit if we detect this, allowing the config sentinel to restart the service.
*	Fix typo	Tor Brede Vekterli	2021-03-08	1	-1/+1
\|
*	Better handling of ZK connectivity issues concurrent with elections	Tor Brede Vekterli	2021-03-08	3	-28/+59
\|	Adds the following safeguards/improvements:
\|	- Do not clear pending (non-persisted) writes over a `connect()` edge. Avoids having the controller eternally wait for a doomed pending write to be completed when it has no other events that can trigger a new write.
\|	- Trigger `lostDatabaseConnection()` whenever ZK is reconfigured to ensure we reload the newest state before trying to compute/publish any new states.
\|	- Explicitly drop leadership in `lostDatabaseConnection()` to immediately prevent controller from trying any funny leader-related business since it no longer can depend on ZK watches triggering.
\|	- When falling back to default state/cluster bundle, ensure that any subsequent dependent znode write is predicated on the pre-existing znode version being 0, i.e. did not previously exist.
*	Print node index without parentheses as done elsewhere	Harald Musum	2021-03-03	1	-6/+6
\|
*	Merge pull request #16747 from vespa-engine/geirst/cluster-controller-resouce-usage-limits-metrics	Geir Storli	2021-03-02	2	-4/+26
\|\	Expose resource usage metrics for disk and memory limits for feed blo…
\| *	Expose resource usage metrics for disk and memory limits for feed blocked.	Geir Storli	2021-03-02	2	-4/+26
\| \|
* \|	Only force CC metric update when you are the boss	Jon Marius Venstad	2021-03-02	1	-1/+1
\|/