| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
| |
Uses the same "out of sync ratio" value as is currently exposed
as a metric and through the State V2 REST API, but rendered as
a percentage to be more human-friendly.
|
|
|
|
|
|
| |
The cluster controller has no notion of the nature of these events
(they could just be part of a benign upgrade cycle), so don't paint them with a
scary red color that implies something is wrong in the cluster.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With these changes the cluster controller continuously maintains
a global aggregate across all content nodes that represents the
number of pending and total buckets per bucket space. This aggregate
can be sampled in O(1) time.
An explicit metric `cluster-buckets-out-of-sync-ratio` has been
added, and the value is also emitted as part of the cluster state
REST API. Note: only emitted when statistics have been received
from _all_ distributors for a particular cluster state, as it would
otherwise potentially represent an arbitrary state somewhere between
two or more distinct states.
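As a rough sketch of the bookkeeping described above (the class and field names
here are hypothetical, not the actual cluster controller code), the idea is to
keep per-bucket-space running totals that are adjusted by deltas as each
distributor's statistics arrive, so the ratio can be read without scanning anything:
```
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an incrementally maintained pending/total bucket
// aggregate per bucket space. Only illustrates the O(1) sampling idea; the
// real cluster controller bookkeeping differs.
class BucketSyncAggregate {
    private static final class Totals { long pending; long total; }
    private final Map<String, Totals> perBucketSpace = new HashMap<>();

    // Apply the delta between a distributor's previously reported and
    // newly reported statistics for one bucket space.
    void applyDelta(String bucketSpace, long pendingDelta, long totalDelta) {
        Totals t = perBucketSpace.computeIfAbsent(bucketSpace, k -> new Totals());
        t.pending += pendingDelta;
        t.total += totalDelta;
    }

    // O(1): no iteration over nodes or buckets when the metric is sampled.
    double outOfSyncRatio(String bucketSpace) {
        Totals t = perBucketSpace.get(bucketSpace);
        return (t == null || t.total == 0) ? 0.0 : (double) t.pending / t.total;
    }
}
```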
|
| |
|
| |
|
| |
|
|
|
|
| |
spotbugs SuppressWarning annotation.
|
| |
|
|
|
|
| |
Bucket count should have been pre-verified as present by the caller.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
node down edge
The stored entry count encompasses both visible documents and
tombstones. Using this count rather than bucket count avoids any
issues where a node only containing empty buckets (i.e. no actual
data) is prohibited from being marked as permanently down.
Entry count is cross-checked with the visible document count;
if the former is zero, the latter should always be zero as well.
Since entry/doc counts were only recently introduced as part of
the HostInfo payload, we have to handle the case where these do
not exist. If entry count is not present, the decision to allow
or disallow the transition falls back to the bucket count check.
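A hypothetical helper capturing the fallback logic described above (the real
checker and its host info types differ; this only shows the shape of the decision):
```
// Hypothetical sketch of the decision described above; counts are null
// when an older node does not report them in its host info.
final class NodeDownEdgeCheck {
    static boolean mayMarkNodePermanentlyDown(Long entryCount, Long visibleDocCount,
                                              long bucketCount) {
        if (entryCount == null) {
            // Entry/doc counts missing from host info; fall back to the bucket count check.
            return bucketCount == 0;
        }
        if (entryCount == 0 && visibleDocCount != null && visibleDocCount != 0) {
            // Cross-check: zero entries must imply zero visible documents.
            // Treat inconsistent host info conservatively and disallow the transition.
            return false;
        }
        return entryCount == 0;
    }
}
```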
|
|
|
|
|
|
|
| |
vespa-engine/revert-29678-jonmv/reapply-zk-3.9.1"
This reverts commit c8ece8b229362c7bf725e4433ef4fec86024cd29, reversing
changes made to d42b67f0fe821d122548a345f27fda7f9c9c9d10.
|
| |
|
|
|
|
|
|
|
| |
vespa-engine/revert-29671-jonmv/reapply-zk-3.9.1"
This reverts commit 28f8cf3e298d51ca703ceee36a992297d38637cc, reversing
changes made to 3a9f89fe60e3420eed435daee435a4f8534c9512.
|
| |
|
|
|
|
|
|
|
| |
vespa-engine/revert-29662-revert-29661-revert-29658-jonmv/zk-3.9.1-clients-2"
This reverts commit 9c8ba2608384ee79e143babd1e5a18a62166541f, reversing
changes made to 954785e4eb91286bd166c304e98042ec63b7eb84.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
The fake impl acts "as if" a single-node ZK quorum is present, so it
cannot be directly used with most multi-node tests that require
multiple nodes to actually participate in leader elections.
|
|
|
|
|
|
|
|
|
|
| |
ZK will by default preallocate 65536 * 1024 bytes for its
write-ahead log file. This will happen for every test instantiation
of the ZooKeeper database. Now, I like wearing out SSD flash cells
as much as the next guy, but this just feels silly.
The input number is always multiplied by 1024, so reduce it to
64 to get a 64 KiB preallocation instead.
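ZooKeeper reads this preallocation size from the `zookeeper.preAllocSize` system
property and interprets it in kilobytes, so a test setup could shrink it roughly
as below (whether Vespa's test harness sets it exactly this way is an assumption):
```
// Illustrative only: "64" is interpreted as 64 * 1024 bytes, i.e. a 64 KiB
// transaction log preallocation instead of the 64 MiB default.
final class ZkTestPrealloc {
    static void shrinkTxnLogPrealloc() {
        System.setProperty("zookeeper.preAllocSize", "64");
    }
}
```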
|
| |
|
|
|
|
|
|
| |
of the ObjectMapper.
Unless special options are needed, use a common instance, or create one via a factory method.
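Jackson's ObjectMapper is thread-safe once configured and comparatively expensive
to construct, so the usual pattern looks roughly like the sketch below (the class
name and factory method are illustrative, not the project's actual helper):
```
import com.fasterxml.jackson.databind.ObjectMapper;

// Illustrative pattern only: share a single default-configured ObjectMapper,
// and go through a factory method when special options are required.
final class Jackson {
    private static final ObjectMapper DEFAULT = new ObjectMapper();

    static ObjectMapper mapper() { return DEFAULT; }

    static ObjectMapper createCustomMapper() {
        return new ObjectMapper(); // apply special configuration here
    }
}
```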
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Messages now prefixed with content cluster name to help disambiguate
which cluster is exceeding its limits in multi-cluster deployments.
Example message:
```
in content cluster 'my-cool-cluster': disk on node 1 [my-node-1.example.com] is 81.0% full
(the configured limit is 80.0%). See https://docs.vespa.ai/en/operations/feed-block.html
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Messages are generated centrally by the cluster controller and pushed
to content nodes as part of a cluster state bundle; the distributor
nodes merely repeat back what they have been told. This changes the
cluster controller feed block error message code to be less ambiguous
and to include a URL to our public documentation about feed blocks.
Example of _old_ message:
```
disk on node 1 [storage.1.local] (0.510 > 0.500)
```
Same feed block with _new_ message:
```
disk on node 1 [storage.1.local] is 51.0% full (the configured limit is 50.0%).
See https://docs.vespa.ai/en/operations/feed-block.html
```
|
| |
|
| |
|
|\
| |
| | |
New parent pom
|
| | |
|
|/ |
|
| |
|
|
|
|
|
| |
This means we will also check distributors that are on the same node as
a retired storage node.
|
| |
|
| |
|
|
|
|
|
|
|
| |
When we allow several groups to go down for maintenance, we should check
whether the nodes in the groups that are still up have the required redundancy.
They might be up but might not yet have synced all buckets after coming back up.
We want to wait before allowing more nodes to be taken down until that is done.
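A hypothetical sketch of the kind of gate this implies (the data shape, and using
"no buckets pending merge" as a proxy for the required redundancy being restored,
are assumptions, not the actual checker):
```
import java.util.Map;

// Hypothetical sketch: only allow taking another group down for maintenance
// once every group that is currently up reports zero buckets pending merge,
// i.e. it has finished syncing after coming back up.
final class GroupMaintenanceGate {
    static boolean mayTakeAnotherGroupDown(Map<String, Long> pendingBucketsPerUpGroup) {
        return pendingBucketsPerUpGroup.values().stream()
                .allMatch(pending -> pending == 0);
    }
}
```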
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|