Tracks invocation count, latency and failures (although we don't
expect to see any failures in this pipeline, since the remove ops
logically happen atomically with the bucket iteration).
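
As a rough sketch of the tracked triple (hypothetical struct and field names, not the actual metrics classes):

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical sketch of the metric triple described above: one counter
// per invocation, one per failure, and aggregated latency.
struct RemoveOpMetrics {
    std::atomic<uint64_t> invocations{0};
    std::atomic<uint64_t> failures{0};
    std::atomic<uint64_t> latency_us_total{0}; // a histogram in a real system

    void observe(std::chrono::microseconds latency, bool failed) {
        invocations.fetch_add(1, std::memory_order_relaxed);
        latency_us_total.fetch_add(static_cast<uint64_t>(latency.count()),
                                   std::memory_order_relaxed);
        if (failed) {
            failures.fetch_add(1, std::memory_order_relaxed);
        }
    }
};
```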

---

By default the legacy behavior is used.

---

Previous (legacy) behavior was to immediately schedule an async full
bucket deletion in the persistence backend, which incurs a very
disproportionate cost when documents are backed by many and/or
heavy indexes (such as HNSW). This risked swamping the backend with
tens to hundreds of thousands of concurrent document deletes.
New behavior splits deletion into three phases:
 1. Metadata enumeration for all documents present in the bucket.
 2. Persistence-throttled async remove _per individual document_
    returned in the iteration result. This blocks the persistence
    thread (by design) if the throttling window is not sufficiently
    large to accommodate all pending deletes.
 3. Once all async removes have been ACKed, schedule the actual
    `DeleteBucket` operation towards the backend. This cleans up any
    remaining (cheap) tombstone entries as well as the metadata store.
    The operation reply is sent, as before, once the delete has
    completed.
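
A rough sketch of the three-phase flow, under the assumption of simple Backend/Throttler stand-ins (the real persistence SPI differs):

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <latch>
#include <memory>
#include <vector>

using BucketId = uint64_t;
using DocId    = uint64_t;

// Stand-ins for illustration only; the real interfaces differ.
struct ThrottleToken { /* frees its window slot when destroyed */ };
struct Throttler {
    ThrottleToken acquire_token() { return {}; } // may block when the window is full
};
struct Backend {
    std::vector<DocId> enumerate_metadata(BucketId) { return {1, 2, 3}; }
    void async_remove(BucketId, DocId, std::function<void()> on_ack) { on_ack(); }
    void delete_bucket(BucketId) {}
};

void delete_bucket_in_phases(Backend& backend, Throttler& throttler, BucketId bucket) {
    // Phase 1: enumerate metadata for every document in the bucket.
    auto docs = backend.enumerate_metadata(bucket);

    // Phase 2: one throttled async remove per document. Acquiring a token
    // blocks the persistence thread (by design) when the window cannot
    // accommodate more pending deletes.
    std::latch acked(static_cast<std::ptrdiff_t>(docs.size()));
    for (DocId doc : docs) {
        auto token = std::make_shared<ThrottleToken>(throttler.acquire_token());
        backend.async_remove(bucket, doc, [token, &acked] {
            acked.count_down(); // token's slot freed once the ACK callback is dropped
        });
    }
    acked.wait();

    // Phase 3: DeleteBucket now only cleans up cheap tombstones and the
    // metadata store; the operation reply is sent when it completes.
    backend.delete_bucket(bucket);
}
```

The key property is that backend concurrency in phase 2 is bounded by the throttle window rather than by the bucket's document count.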

---

Identifiers of the form `_Uppercased` are considered reserved by
the standard. Not likely to cause ambiguity in practice, but it's
preferable to stay on the good side of the standard-gods.
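
For reference, the relevant rule from [lex.name]: an identifier beginning with an underscore followed by an uppercase letter is reserved everywhere, while a plain leading underscore is reserved only in the global namespace:

```cpp
struct Metrics {
    int _Count;  // reserved everywhere: underscore + uppercase letter
    int _count;  // OK as a member (reserved only in the global namespace)
    int count_;  // trailing underscore is always safe
};
```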

---

Remnants of the "file per bucket on spinning disks" days and no
longer used for anything.

---

use std::thread directly instead

---

also add very simple ThreadPool class to run multiple threads at once
make an effort to only join once
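
A minimal sketch of such a pool (a hypothetical shape, not the committed class); the joinable() check is what makes repeated joins safe:

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

class ThreadPool {
    std::vector<std::thread> _threads;
public:
    // Start n threads at once, each running fn(thread_index).
    ThreadPool(size_t n, const std::function<void(size_t)>& fn) {
        _threads.reserve(n);
        for (size_t i = 0; i < n; ++i) {
            _threads.emplace_back(fn, i);
        }
    }
    // Safe to call multiple times; each thread is joined at most once.
    void join() {
        for (auto& t : _threads) {
            if (t.joinable()) {
                t.join();
            }
        }
    }
    ~ThreadPool() { join(); }
};
```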

---

- also stop using std::jthread
- remove Active and Joinable interfaces
- remove stop, stopped and slumber
- remove currentThread
- make start function static
- override start for Runnable w/init or custom function
- explicit stop/slumber where needed
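
A hedged sketch of the resulting shape, with hypothetical names approximating the bullet points above: start becomes a static factory over plain std::thread, with an overload for Runnable:

```cpp
#include <functional>
#include <thread>

struct Runnable {
    virtual ~Runnable() = default;
    virtual void run() = 0;
};

// start() as a static function returning a plain std::thread, rather
// than a member of a thread wrapper with Active/Joinable interfaces.
struct thread_starter {
    static std::thread start(Runnable& runnable) {
        return std::thread([&runnable] { runnable.run(); });
    }
    static std::thread start(std::function<void()> fn) {
        return std::thread(std::move(fn));
    }
};
```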

---

After initialization, the node will immediately start communicating with the cluster
controller, exchanging host info. This host info contains a subset snapshot of the active
metrics, which includes the total bucket count, doc count, etc. It is critical that
we never report host info _prior_ to having run at least one full sweep of
the bucket database, lest we risk transiently reporting zero buckets held by the
content node. Doing so could cause orchestration logic to perform operations based
on erroneous assumptions.
To avoid this, we explicitly force a full DB sweep and metric update prior to reporting
the node as up. Since this function is called prior to the CommunicationManager thread
being started, any CC health pings should also always happen after this init step.
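
The ordering constraint, sketched with hypothetical stand-in types:

```cpp
// Hypothetical stand-ins; the real bucket manager and communication
// manager interfaces in the storage node differ.
struct BucketManager {
    void force_db_sweep_and_update_metrics() { /* full bucket DB sweep */ }
};
struct CommunicationManager {
    void start() { /* spawn the thread that talks to the cluster controller */ }
};

void complete_node_init(BucketManager& buckets, CommunicationManager& comm) {
    // Metrics must reflect at least one full bucket DB sweep before any
    // host info can leave the node; otherwise we could transiently
    // report zero buckets/documents to the cluster controller.
    buckets.force_db_sweep_and_update_metrics();
    // Only now may health pings and host info exchange begin.
    comm.start();
}
```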

---

Remove '.sum' from metric names for the storage node and remove the corresponding average metrics.
Remove '.sum' from the distributor metric set and remove the distributor average metrics.
GC '.sum' from distributor metric names.
Remove '.alldisks' from metric names and update tests.
GC '.alldisks' from filestor metrics.

---

proton.

---

Adds an operation throttler that is intended to provide global throttling
of async operations across all persistence stripe threads. A throttler
maintains a logical window of maximum pending in-flight operations.
Depending on the throttler implementation, the window size may expand and
shrink dynamically; exactly how and when this happens is unspecified.
The commit adds two throttler implementations:
 * An unlimited throttler that is a no-op and never blocks.
 * A throttler built around the mbus `DynamicThrottlePolicy` that defers
   all window decisions to it.
The current config default is to use the unlimited throttler. Config
changes require a process restart.
The throttler offers both polling and blocking (timed and untimed) calls
for acquiring a throttle token. If the returned token is valid, the caller
may proceed to invoke the asynchronous operation. The window slot taken up
by a valid throttle token is implicitly freed when the token is destroyed.
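
A sketch of the intended call pattern; the names here approximate the description above rather than the exact interface:

```cpp
#include <chrono>
#include <optional>

// Hypothetical token: owns one slot in the pending window and frees it
// implicitly when destroyed.
class ThrottleToken {};

class OperationThrottler {
public:
    // Polling acquire: an empty optional means the window is full.
    std::optional<ThrottleToken> try_acquire() { return ThrottleToken{}; }
    // Blocking acquires, without and with a timeout.
    ThrottleToken acquire_blocking() { return {}; }
    std::optional<ThrottleToken> acquire_for(std::chrono::milliseconds) { return ThrottleToken{}; }
};

void start_one_async_op(OperationThrottler& throttler) {
    if (auto token = throttler.try_acquire()) {
        // Valid token: we may invoke the async operation. The window slot
        // is freed when the token, typically moved into the operation's
        // completion context, is destroyed.
        // async_op_start(std::move(*token), ...);
    } else {
        // Window full: back off, or use one of the blocking variants.
    }
}
```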

---

Only skip deactivating buckets if the entire _node_ is marked as being in
maintenance state, i.e., the node has maintenance state across all
bucket spaces provided in the bundle. Otherwise, treat the state
transition as if the node goes down, deactivating all buckets.
Also ensure that the bucket deactivation logic above the SPI is
identical to that within Proton. This avoids the bucket DBs getting
out of sync between the two.
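
The decision rule, sketched with hypothetical types:

```cpp
#include <algorithm>
#include <vector>

enum class NodeState { Up, Down, Maintenance };

// Deactivate all buckets unless the node is in maintenance state in
// every bucket space provided in the state bundle.
bool should_deactivate_all_buckets(const std::vector<NodeState>& per_bucket_space) {
    bool entire_node_in_maintenance =
        !per_bucket_space.empty() &&
        std::all_of(per_bucket_space.begin(), per_bucket_space.end(),
                    [](NodeState s) { return s == NodeState::Maintenance; });
    return !entire_node_in_maintenance;
}
```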

---

[run-systemtest]"

---

Only skip deactivating buckets if the entire _node_ is marked as being in
maintenance state, i.e., the node has maintenance state across all
bucket spaces provided in the bundle. Otherwise, treat the state
transition as if the node goes down, deactivating all buckets.
Also ensure that the bucket deactivation logic above the SPI is
identical to that within Proton. This avoids the bucket DBs getting
out of sync between the two.

---

bucket doesn't exist:
* getBucketInfo() returns success with empty bucket info
* createIterator() returns success
* iterate() returns an empty, completed result.
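
Expressed as a tiny test against a fake provider; the result types are illustrative stand-ins, not the real SPI signatures:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative stand-ins only.
struct BucketInfoResult   { bool success = true; size_t doc_count = 0; };
struct CreateIteratorResult { bool success = true; };
struct IterateResult      { bool success = true; bool completed = true; std::vector<int> entries; };

struct FakeProvider {
    // Behavior for a bucket that does not exist:
    BucketInfoResult     getBucketInfo()  { return {}; } // success, empty info
    CreateIteratorResult createIterator() { return {}; } // success
    IterateResult        iterate()        { return {}; } // empty + completed
};

void test_nonexistent_bucket_contract() {
    FakeProvider spi;
    assert(spi.getBucketInfo().success && spi.getBucketInfo().doc_count == 0);
    assert(spi.createIterator().success);
    IterateResult r = spi.iterate();
    assert(r.success && r.completed && r.entries.empty());
}
```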

---

there are.

---

nodes (proton) to the cluster controller.
This is more generic than explicit address space values for the enum store and multi-value.
This is used in the cluster controller to determine whether to block external feed.

---

be reported back from the bucket executor.
- Treat remapping as an error.
- For the lidspace compaction job, the iterator is reset and will be recreated on the next invocation.
- For bucketmove, the bucket is rechecked and either discarded or restarted.

---

configurable.

---

order… "

---

controller via the host info API.