| Commit message | Author | Age | Files | Lines |

The fake impl acts "as if" a single-node ZK quorum is present, so it
cannot be directly used with most multi-node tests that require
multiple nodes to actually participate in leader elections.
Not used; reintroduce using the JUnit TestInfo class if needed.
loaded
initProgress and capacity. Also gc unused 'reliability' member.
ZooKeeper
This avoids a subtle edge case where the underlying ZK integration code
may silently fail a write, leaving the core controller logic to believe that
it had actually durably persisted a particular state version.
If re-elections race with broadcasts, it would be possible for
leader-edge readbacks from ZK to retrieve a _lower_ version than one
that had already been published. This would cause the cluster controller
to get very confused about which cluster states nodes had already observed.
If a newly produced state version overlapped with a previously broadcast
state, the controller would not push the updated state to the nodes, as it
would (with good reason) assume the node had already observed it, seeing
that it had already ACKed the particular version number.
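A minimal sketch of the invariant this implies (class and method names here are illustrative, not the actual controller code): a version read back from ZooKeeper must never be allowed to regress below the highest version the controller has already published.

```java
// Illustrative sketch only: guards a locally tracked latest-published version
// against a stale (lower) version read back from ZooKeeper after re-election.
class PublishedVersionGuard {
    private int latestPublishedVersion;

    PublishedVersionGuard(int initialVersion) {
        this.latestPublishedVersion = initialVersion;
    }

    // Returns the version the controller should continue from: a ZK readback
    // that is lower than what we have already published must not win, or nodes
    // would be asked to re-ACK versions they have already observed.
    int reconcileWithZkReadback(int zkVersion) {
        latestPublishedVersion = Math.max(latestPublishedVersion, zkVersion);
        return latestPublishedVersion;
    }
}
```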
The flag controlled config read by the Cluster Controller. I have therefore
left the ModelContextImpl.Properties method and its implementation (now always
returning true), but the model no longer uses that method internally, and
the config is no longer used in the CC.
The field in the fleetcontroller.def is left unchanged and documented as
deprecated.
This makes the Cluster Controller use the
vds.datastored.bucket_space.buckets_total metric, with dimension
bucketSpace=default, to determine whether a content node manages zero buckets,
and if so, allows the node to go permanently down. This is used when a node is
retiring and is to be removed from the application.
The change is guarded by the use-bucket-space-metric flag, default true. If the
new metric doesn't work as expected, we can revert to the current/old metric
by flipping the flag. The flag can be controlled per application.
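A hedged sketch of the decision described above; the method and parameter names are hypothetical, only the metric name and flag come from the commit message:

```java
// Illustrative sketch: decide whether a retiring content node manages zero
// buckets, selecting the metric source based on the use-bucket-space-metric
// flag. Names other than the flag and metric semantics are hypothetical.
class ZeroBucketsCheck {
    static boolean mayGoPermanentlyDown(boolean useBucketSpaceMetric,
                                        long bucketsTotalDefaultSpace,
                                        long legacyBucketCount) {
        // Flag set (default true): consult vds.datastored.bucket_space
        // .buckets_total with dimension bucketSpace=default.
        // Flag cleared: fall back to the current/old metric.
        long buckets = useBucketSpaceMetric ? bucketsTotalDefaultSpace
                                            : legacyBucketCount;
        // Zero buckets means the node may be taken permanently down.
        return buckets == 0;
    }
}
```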
Makes it easier for an external observer to understand which set of nodes
is preventing the cluster state from converging.
Ensures that the event is only emitted when we're actually publishing a
state containing the state transition. Emitting events from the timer
code is fragile when it does not modify any state, as it risks emitting
the same event an unbounded number of times if the condition keeps
holding each timer cycle.
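A minimal sketch of the edge-triggered pattern this describes, with illustrative names: tying the event to a published state version makes emission idempotent, unlike a level-triggered check in a periodic timer.

```java
// Illustrative sketch of edge-triggered event emission: the event is bound to
// the publishing of a state version containing the transition, so the same
// transition cannot be reported an unbounded number of times.
class TransitionEventEmitter {
    private int lastEmittedForVersion = -1;
    private int emittedCount = 0;

    // Invoked only when a new state version is actually published,
    // never from the periodic timer loop.
    void onStatePublished(int version, boolean containsTransition) {
        if (containsTransition && version != lastEmittedForVersion) {
            lastEmittedForVersion = version;
            emittedCount++; // stands in for emitting the real event
        }
    }

    int emittedCount() { return emittedCount; }
}
```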
Version mismatches in backend do not return explicit RPC errors, so
actual vs. desired versions must be checked in order to avoid potentially
spurious activation of other versions.
Also do some minor code cleanup.
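A hedged sketch of the check this implies (names are hypothetical): since a backend version mismatch does not surface as an explicit RPC error, the returned version must be compared against the requested one before the activation is considered successful.

```java
// Illustrative sketch: a response carrying some other version must not count
// as a successful activation of the version that was actually requested.
class ActivationCheck {
    static boolean activationSucceeded(int desiredVersion, int actualVersion) {
        // No explicit error is returned on mismatch, so the comparison is
        // the only way to detect a potentially spurious activation.
        return actualVersion == desiredVersion;
    }
}
```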
Mirrors the default values in the actual underlying config definitions.
Store synchronously upon each new versioned state, load whenever the controller
is elected master. Effectively carries over visible node states from one
controller's lifetime to the next. This removes the edge case where default
bucket space content nodes are marked as in Maintenance until their global
merge status is known.
To avoid the controller tripping over its own feet, state bundles are now _not_
versioned at all until the initial send time period has passed. This prevents
overwriting the state persisted from a previous controller with a transient
state where all nodes are down due to not having Slobrok contact yet.
A new cluster state recompute+send edge has been added when the master passes
its initial state send time period.
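A minimal sketch of the deferred-versioning rule described above, with hypothetical names and a simplified clock: no version is assigned while the initial state send time period is still in effect, so a transient all-nodes-down state cannot overwrite the version persisted by the previous controller.

```java
// Illustrative sketch: state bundles receive no version until the initial
// state send time period has passed after this controller was elected master.
class StateVersioner {
    private final long electedAtMillis;
    private final long initialSendPeriodMillis;
    private int version;

    StateVersioner(long electedAtMillis, long initialSendPeriodMillis,
                   int persistedVersion) {
        this.electedAtMillis = electedAtMillis;
        this.initialSendPeriodMillis = initialSendPeriodMillis;
        this.version = persistedVersion; // loaded from the previous master
    }

    // Returns -1 (no version assigned) while still inside the initial period;
    // afterwards, versions continue from the persisted value.
    int maybeVersionNewState(long nowMillis) {
        if (nowMillis - electedAtMillis < initialSendPeriodMillis) return -1;
        return ++version;
    }
}
```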
Prevents an unstable cluster from potentially holding up all
container request processing threads indefinitely.
Deadline errors are translated into HTTP 504 errors to REST API clients.
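A hedged sketch of the bounded wait this describes, using a plain CompletableFuture in place of the controller's actual task machinery: the deadline overrun is translated into HTTP 504 instead of blocking a container thread indefinitely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative sketch: wait for cluster convergence with a deadline, mapping
// a timeout to HTTP 504 (Gateway Timeout) for the REST API client.
class DeadlineWait {
    static int awaitOrGatewayTimeout(CompletableFuture<Void> converged,
                                     long deadlineMillis) {
        try {
            converged.get(deadlineMillis, TimeUnit.MILLISECONDS);
            return 200; // converged within the deadline
        } catch (TimeoutException e) {
            return 504; // deadline exceeded; thread is released, not held
        } catch (Exception e) {
            return 500; // interrupted or failed for another reason
        }
    }
}
```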
We check both for master status and task failure, as we otherwise place
a potentially dangerous silent dependency on the task always failing itself
if the controller is not the master.
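A minimal sketch of making both conditions explicit (names are illustrative):

```java
// Illustrative sketch: resolve a task outcome by checking master status and
// the task's own failure state explicitly, instead of silently assuming a
// task always fails itself when the controller is not the master.
class TaskOutcome {
    static String resolve(boolean isMaster, boolean taskFailed) {
        if (!isMaster) return "fail: not master"; // explicit, not implied
        if (taskFailed) return "fail: task error";
        return "ok";
    }
}
```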
Avoids an edge case where set-node-state requests sent to followers would
have their responses delayed indefinitely due to the controller not
publishing any versions that could release the task's ACK barrier.
Removes the dependency on having to invoke broadcastNewState before being
able to observe that all distributors are in sync. Invocations of
broadcastNewState are gated by a grace period between each invocation,
so without this change we get artificial delays before a synchronous
task can be considered complete.
Used to enable synchronous operation for set-node-state calls, which
ensures that side effects of the call are visible when the response
returns.
If controller leadership is lost before state is published, tasks will
be failed back to the client.
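A hedged sketch of the task lifecycle this describes, with hypothetical names and a plain CompletableFuture standing in for the real response plumbing: the response is held until a state version carrying the change is ACKed, and the task is failed back if leadership is lost first.

```java
import java.util.concurrent.CompletableFuture;

// Illustrative sketch of a synchronous set-node-state task: completes its
// response only once the version containing the change has been ACKed, or
// fails it if this controller loses leadership before publishing.
class SetNodeStateTask {
    final int awaitsVersion; // state version that carries the requested change
    final CompletableFuture<String> response = new CompletableFuture<>();

    SetNodeStateTask(int awaitsVersion) { this.awaitsVersion = awaitsVersion; }

    // Called when a published state version has been ACKed by all nodes.
    void onVersionAcked(int version) {
        if (version >= awaitsVersion) response.complete("ok");
    }

    // Called if controller leadership is lost before the state is published.
    void onLeadershipLost() {
        response.completeExceptionally(
                new IllegalStateException("leadership lost"));
    }
}
```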