vespa - An engine for low-latency computation over large data sets

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge pull request #16935 from ↵	Henning Baldersheim	2021-03-15	7	-39/+45
\|\ \| \| \| \| \| \| \| \|	vespa-engine/revert-16934-revert-16932-balder/move-metrics-from-partition-to-node-level Revert "Revert "GC unused DiskState and add the partition metrics to node level.""
\| *	Include metrics always.	Henning Baldersheim	2021-03-12	1	-27/+0
\| \|
\| *	Revert "Revert "GC unused DiskState and add the partition metrics to node ↵	Henning Baldersheim	2021-03-12	7	-12/+45
\| \| \| \| \| \| \| \|	level.""
* \|	Ensure Import-Package for javax packages are included in bundle's manifest	Bjørn Christian Seime	2021-03-15	1	-0/+7
\|/
*	Revert "GC unused DiskState and add the partition metrics to node level."	Harald Musum	2021-03-12	7	-45/+12
\|
*	GC unused DiskState and add the partition metrics to node level.	Henning Baldersheim	2021-03-12	7	-12/+45
\|
*	GC unused import	Henning Baldersheim	2021-03-12	2	-2/+0
\|
*	Merge pull request #16926 from ↵	Tor Brede Vekterli	2021-03-12	6	-98/+177
\|\ \| \| \| \| \| \| \| \|	vespa-engine/vekterli/dont-store-full-bundle-objects-in-state-history Don't store full bundle objects in state history
\| *	Add missing copyright	Tor Brede Vekterli	2021-03-12	1	-0/+1
\| \|
\| *	Move config output further down on the status page	Tor Brede Vekterli	2021-03-12	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Always print regardless of leader eligibility state; config is not predicated on this.
\| *	Move ZK/election-related info away from top of CC status page	Tor Brede Vekterli	2021-03-12	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Much less immediately interesting than the actual cluster node information. Move it just above the general event log instead.
\| *	Don't store full bundle objects in cluster state history	Tor Brede Vekterli	2021-03-12	6	-94/+172
\| \|	Bundles have a lot of sub-objects per state, so in systems with a high amount of node entries, this adds unnecessary pressure on the heap. Instead, store the string representations of the bundle and the string representation of the diff to the previous state version (if any). This is also inherently faster than computing the diffs on-demand on every status page render.
\| \|	Also remove mutable `official` field from `ClusterState`. Not worth violating immutability of an object just to get some prettier (but with high likelihood actually more confusing) status page rendering.
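As a rough illustration of the idea in this commit body, the sketch below keeps only strings in the history: the rendered state and a diff computed once when the entry is added. It is not the cluster controller's Java code. `HistoryEntry`, `StateHistory`, `token_diff` and the state string layout are hypothetical names and formats chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HistoryEntry:
    """Immutable history record holding only strings (hypothetical type)."""
    state_text: str
    diff_text: Optional[str]  # None for the first entry, which has no predecessor


def token_diff(old: str, new: str) -> str:
    """Naive whitespace-token diff, used here only to have something to store."""
    old_tokens, new_tokens = old.split(), new.split()
    removed = [t for t in old_tokens if t not in new_tokens]
    added = [t for t in new_tokens if t not in old_tokens]
    return " ".join(["-" + t for t in removed] + ["+" + t for t in added])


class StateHistory:
    """Stores rendered states and precomputed diffs instead of full objects."""

    def __init__(self) -> None:
        self.entries: List[HistoryEntry] = []

    def add(self, state_text: str) -> None:
        previous = self.entries[-1].state_text if self.entries else None
        # The diff is computed once here, not on every status page render.
        diff = None if previous is None else token_diff(previous, state_text)
        self.entries.append(HistoryEntry(state_text, diff))
```

Rendering a status page from such a history only needs to read the stored strings.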
* \|	Revert "GC unused DiskState"	Arnstein Ressem	2021-03-12	10	-7/+180
\|/
*	Merge pull request #16911 from vespa-engine/balder/gc-disk-states	Henning Baldersheim	2021-03-12	10	-180/+7
\|\ \| \| \| \|	GC unused DiskState
\| *	GC Partition	Henning Baldersheim	2021-03-11	1	-12/+0
\| \|
\| *	GC unused DiskState	Henning Baldersheim	2021-03-11	9	-168/+7
\| \|
* \|	Merge pull request #16900 from vespa-engine/bjorncs/zookeeper-client-common	Bjørn Christian Seime	2021-03-12	2	-10/+16
\|\ \ \| \|/ \|/\|	Add shared ZK client config generator for zkfacade and vespa-zkcli [run-systemtest]
\| *	Construct ZKClientConfig from ZkClientConfigBuilder	Bjørn Christian Seime	2021-03-11	2	-10/+16
\| \| \| \| \| \| \| \|	Use ZKClientConfig builder from Curator and ZooKeeperDatabase
* \|	GC use of void DiskState.	Henning Baldersheim	2021-03-11	5	-65/+9
\| \|
* \|	GC use of NodeState.getDiskCount and NodeState.getDiskStates.	Henning Baldersheim	2021-03-11	5	-74/+11
\| \|
* \|	GC long gone disk state checks.	Henning Baldersheim	2021-03-11	1	-1/+6
\| \|
* \|	Shrink the size of the NodeState object by using float over double for ↵	Henning Baldersheim	2021-03-11	4	-29/+29
\|/ \| \| \|	initProgress and capacity. Also gc unused 'reliability' member.
*	Remove com.yahoo.vespa.jdk8compat	Bjørn Christian Seime	2021-03-10	1	-5/+4
\| \| \| \|	These types are often accidentally imported, and the JDK8 replacement is typically a one-liner.
*	Merge pull request #16856 from ↵	Tor Brede Vekterli	2021-03-09	1	-3/+20
\|\ \| \| \| \| \| \| \| \|	vespa-engine/vekterli/immediately-exit-cc-if-node-index-changed-live Immediately exit cluster controller if node index config is changed live
\| *	Immediately exit cluster controller if node index config is changed live	Tor Brede Vekterli	2021-03-09	1	-3/+20
\| \| \| \| \| \| \| \| \| \|	We do not support live reconfigs of CC index, so swiftly exit if we detect this, allowing the config sentinel to restart the service.
* \|	Merge pull request #16843 from vespa-engine/bjorncs/upgrade-zk-client	Bjørn Christian Seime	2021-03-09	1	-0/+6
\|\ \ \| \|/ \|/\|	Bjorncs/upgrade zk client [run-systemtest]
\| *	Depend on zookeeper-server-common	Bjørn Christian Seime	2021-03-09	1	-0/+6
\| \| \| \| \| \| \| \|	Ensure extra required ZK dependencies are present on test classpath
* \|	Fix typo	Tor Brede Vekterli	2021-03-08	1	-1/+1
\| \|
* \|	Better handling of ZK connectivity issues concurrent with elections	Tor Brede Vekterli	2021-03-08	3	-28/+59
\|/	Adds the following safeguards/improvements:
\|	- Do not clear pending (non-persisted) writes over a `connect()` edge. Avoids having the controller eternally wait for a doomed pending write to be completed when it has no other events that can trigger a new write.
\|	- Trigger `lostDatabaseConnection()` whenever ZK is reconfigured to ensure we reload the newest state before trying to compute/publish any new states.
\|	- Explicitly drop leadership in `lostDatabaseConnection()` to immediately prevent controller from trying any funny leader-related business since it no longer can depend on ZK watches triggering.
\|	- When falling back to default state/cluster bundle, ensure that any subsequent dependent znode write is predicated on the pre-existing znode version being 0, i.e. did not previously exist.
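The last safeguard above is a conditional write. A minimal sketch of that pattern follows, using an in-memory stand-in and not the ZooKeeper client API. `VersionedStore` and `write_if_version` are hypothetical, and treating version 0 as "did not previously exist" follows the wording of the commit body.

```python
from typing import Dict, Optional, Tuple


class VersionedStore:
    """In-memory stand-in for a versioned key/value store (hypothetical)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, Optional[str]]] = {}

    def version(self, path: str) -> int:
        # Convention assumed for this sketch: a missing path reports version 0.
        return self._data.get(path, (0, None))[0]

    def write_if_version(self, path: str, value: str, expected_version: int) -> bool:
        """Write only if the stored version matches what the caller expects."""
        current = self.version(path)
        if current != expected_version:
            return False  # someone else wrote first; caller must reload state
        self._data[path] = (current + 1, value)
        return True
```

A controller that fell back to a default state would pass `expected_version=0`, so the write is rejected if a real state had already been persisted.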
*	Print node index without parentheses as done elsewhere	Harald Musum	2021-03-03	2	-7/+7
\|
*	Merge pull request #16747 from ↵	Geir Storli	2021-03-02	4	-5/+39
\|\ \| \| \| \| \| \| \| \|	vespa-engine/geirst/cluster-controller-resouce-usage-limits-metrics Expose resource usage metrics for disk and memory limits for feed blo…
\| *	Expose resource usage metrics for disk and memory limits for feed blocked.	Geir Storli	2021-03-02	4	-5/+39
\| \|
* \|	Only force CC metric update when you are the boss	Jon Marius Venstad	2021-03-02	1	-1/+1
\|/
*	Use (override, really) the clusterid dimension in CCs content metrics	Jon Marius Venstad	2021-03-02	2	-2/+4
\|
*	Do not use vespamalloc for clustercontroller as it is pure java.	Henning Baldersheim	2021-03-01	1	-2/+10
\|
*	Only compute feed blocked state from available nodes	Tor Brede Vekterli	2021-02-26	3	-2/+45
\| \| \| \| \|	Available nodes here mean nodes that are reported as Up/Initializing and where the wanted state is Up/Retired.
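The availability rule stated above can be sketched as a simple filter. This assumes a hypothetical `Node` record with string state names and a per-node `resource_exhausted` flag. The real controller derives these from its own state types.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Node:
    """Hypothetical node record for this sketch."""
    index: int
    reported_state: str      # state the node itself reports
    wanted_state: str        # state requested for the node
    resource_exhausted: bool  # True if some resource is above its feed block limit


def is_available(node: Node) -> bool:
    # Available: reported as Up/Initializing and wanted state is Up/Retired.
    return (node.reported_state in {"Up", "Initializing"}
            and node.wanted_state in {"Up", "Retired"})


def cluster_feed_blocked(nodes: Iterable[Node]) -> bool:
    # Only available nodes contribute to the feed blocked decision.
    return any(n.resource_exhausted for n in nodes if is_available(n))
```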
*	Pass current ZK-persisted version directly to broadcast method instead of ↵	Tor Brede Vekterli	2021-02-26	3	-39/+22
\| \| \| \|	indirectly
*	Enforce that no cluster state can be published unless confirmed written to ↵	Tor Brede Vekterli	2021-02-26	5	-4/+73
\|	ZooKeeper
\|	This avoids a subtle edge case where the underlying ZK integration code may silently fail a write, leaving the core controller logic to think that it had actually durably persisted a particular state version. In case of reelections racing with broadcasts, it would be possible for leader-edge readbacks from ZK to retrieve a _lower_ version than one that had already been published. This would cause the cluster controller to get very confused about which cluster states nodes had already observed. If a newly produced state version overlapped with a previously broadcast state, the controller would not push the updated state to the nodes, as it would (with good reason) assume the node had already observed it, seeing that it had already ACKed the particular version number.
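The invariant described here, that nothing is broadcast until its version is known to be persisted, can be sketched as below. `StatePublisher` and its method names are hypothetical and stand in for the controller's actual broadcast and database code.

```python
from typing import List


class StatePublisher:
    """Refuses to broadcast versions not yet confirmed as persisted (hypothetical)."""

    def __init__(self) -> None:
        self.last_persisted_version = 0
        self.broadcast_versions: List[int] = []

    def on_write_confirmed(self, version: int) -> None:
        # Called only once the store has acknowledged the write.
        self.last_persisted_version = max(self.last_persisted_version, version)

    def try_broadcast(self, version: int) -> bool:
        if version > self.last_persisted_version:
            return False  # not durably written yet, so do not publish
        self.broadcast_versions.append(version)
        return True
```

With this guard, a write that failed silently leaves `last_persisted_version` unchanged, so the matching broadcast is refused and no node can observe a version that a later readback would not find.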
*	Remove unused arguments and methods	Harald Musum	2021-02-21	12	-112/+32
\|
*	Minor cleanup, no functional changes	Harald Musum	2021-02-21	44	-247/+174
\|
*	Remove unused code, and fix doc	Jon Marius Venstad	2021-02-19	1	-2/+2
\|
*	Other nodes not being up should not hinder permanently down	Håkon Hallingstad	2021-02-19	1	-5/+0
\|
*	Fail safe maintenance if other nodes are not up	Håkon Hallingstad	2021-02-19	4	-88/+52
\|
*	Avoid stripping index of node that was missing	Jon Marius Venstad	2021-02-17	1	-1/+1
\|
*	Merge pull request #16494 from ↵	Håkon Hallingstad	2021-02-12	3	-14/+55
\|\ \| \| \| \| \| \| \| \|	vespa-engine/hakonhall/also-deny-maintenance-when-another-node-is-in-maintenance Also deny maintenance when another node is in maintenance
\| *	Add test	Håkon Hallingstad	2021-02-12	2	-11/+49
\| \|
\| *	Also deny maintenance when another node is in maintenance	Håkon Hallingstad	2021-02-12	2	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cluster controller today already denies setting a node X safely to maintenance M, if there is another node Y in another group that has wanted state M. Which means that if Y is in M but wanted state is not M, X is allowed to be set in M. This is an edge case which is rare.
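A sketch of the check after this change is shown below. `NodeInfo` and `may_set_maintenance` are hypothetical names. The sketch also assumes that the added check on a node actually being in maintenance uses the same "another group" scope as the existing wanted-state check, which the commit body does not spell out.

```python
from dataclasses import dataclass
from typing import Iterable

MAINTENANCE = "Maintenance"


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical node record for this sketch."""
    index: int
    group: int
    current_state: str
    wanted_state: str


def may_set_maintenance(target: NodeInfo, nodes: Iterable[NodeInfo]) -> bool:
    """Deny if a node in another group wants, or is already in, maintenance."""
    for other in nodes:
        if other.index == target.index or other.group == target.group:
            continue  # only nodes in other groups are considered (assumption)
        if other.wanted_state == MAINTENANCE:
            return False  # pre-existing rule
        if other.current_state == MAINTENANCE:
            return False  # rule added by this change
    return True
```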
* \|	enableSmallBuffers -> useSmallBuffers	Henning Baldersheim	2021-02-12	2	-2/+2
\| \|
* \|	Use small buffers where size matters more than speed.	Henning Baldersheim	2021-02-12	2	-2/+3
\| \|
* \|	Support configurable feed block hysteresis on the cluster controller	Tor Brede Vekterli	2021-02-10	8	-11/+171
\|/	Adds an absolute number delta that is subtracted from the feed block limit when a node has a resource already in feed blocked state. This means that there's a lower watermark threshold that must be crossed before feeding can be unblocked. Avoids flip-flopping between block states.
\|	Default is currently 0.0, i.e. effectively disabled. To be modified later for system tests and trial roll-outs.
\|	A couple of caveats with the current implementation:
\|	* The cluster state is not recomputed automatically when just the hysteresis threshold is crossed, so the description will be out of date on the content nodes. However, if any other feed block event happens (or the hysteresis threshold is crossed), the state will be recomputed as expected. This does not affect correctness, since the feed is still to be blocked.
\|	* A node event remove/add pair is emitted for feed block status when the hysteresis threshold is crossed and there's a cluster state recomputation.
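The hysteresis rule described in this commit body can be sketched for a single resource on a single node as follows. The function name, the parameter names and the strict `>` comparison are assumptions made for illustration. Only the "subtract a delta from the limit while already blocked" behavior comes from the commit message.

```python
def feed_blocked(usage: float, limit: float, hysteresis: float,
                 currently_blocked: bool) -> bool:
    """Return True if feed should be blocked for one resource on one node.

    While already blocked, the effective limit is lowered by the hysteresis
    delta, so usage must drop below this lower watermark to unblock.
    """
    effective_limit = limit - hysteresis if currently_blocked else limit
    # Strict comparison is an assumption; the source does not specify it.
    return usage > effective_limit
```

With a limit of 0.8 and a delta of 0.05, a node that is already blocked stays blocked at a usage of 0.78, while a node that was not blocked is unaffected at the same usage. A delta of 0.0 reduces to a plain limit check, matching the "effectively disabled" default.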