author Tor Brede Vekterli <vekterli@vespa.ai> 2023-11-09 15:56:30 +0000
committer Tor Brede Vekterli <vekterli@vespa.ai> 2023-11-10 13:10:59 +0000
commit b4ca69ae534534f4f3c36b96aa2423f93001b05f
tree 5d636274dcfaf5b27e10baa52ba1661637a21ac3 /application
parent d1a69ad4cf19eae5efb7ff5ba3854d33551221bc
Implement DeleteBucket with throttled per-document async removal

Previous (legacy) behavior was to immediately async schedule a full bucket deletion in the persistence backend, which incurs a very disproportionate cost when documents are backed by many and/or heavy indexes (such as HNSW). This risked swamping the backend with tens to hundreds of thousands of concurrent document deletes.

New behavior splits deletion into three phases:
  1. Metadata enumeration for all documents present in the bucket.
  2. Persistence-throttled async remove _per individual document_ that was returned in the iteration result. This blocks the persistence thread (by design) if the throttling window is not sufficiently large to accommodate all pending deletes.
  3. Once all async removes have been ACKed, schedule the actual `DeleteBucket` operation towards the backend. This will clean up any remaining (cheap) tombstone entries as well as the metadata store.

Operation reply is sent as before once the delete has completed.
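
The three phases above can be illustrated with a minimal sketch. This is not the code from this commit: `InMemoryBackend`, `delete_bucket_throttled`, and the `window_size` parameter are hypothetical names, and a bounded semaphore stands in for the persistence throttler. It only aims to show how a fixed throttling window blocks the scheduling thread until earlier removes have been ACKed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


class InMemoryBackend:
    """Hypothetical stand-in for the persistence backend (not Vespa's API)."""

    def __init__(self, buckets):
        self._lock = threading.Lock()
        self._buckets = {b: set(ids) for b, ids in buckets.items()}
        self.in_flight = 0
        self.max_in_flight = 0  # highest number of concurrent removes observed

    def iterate_metadata(self, bucket):
        with self._lock:
            return sorted(self._buckets.get(bucket, ()))

    def remove(self, bucket, doc_id):
        with self._lock:
            self.in_flight += 1
            self.max_in_flight = max(self.max_in_flight, self.in_flight)
        # A real backend would do the expensive index cleanup here.
        with self._lock:
            self._buckets[bucket].discard(doc_id)
            self.in_flight -= 1

    def delete_bucket(self, bucket):
        with self._lock:
            self._buckets.pop(bucket, None)

    def has_bucket(self, bucket):
        with self._lock:
            return bucket in self._buckets


def delete_bucket_throttled(backend, bucket, window_size, executor):
    """Delete a bucket in three phases; returns the number of removed documents."""
    throttle = threading.BoundedSemaphore(window_size)

    def remove_one(doc_id):
        try:
            backend.remove(bucket, doc_id)
        finally:
            throttle.release()  # the "ACK" frees one slot in the window

    # Phase 1: enumerate metadata for all documents in the bucket.
    doc_ids = backend.iterate_metadata(bucket)

    # Phase 2: throttled async remove per document. acquire() blocks the
    # calling thread whenever the window is full.
    pending = []
    for doc_id in doc_ids:
        throttle.acquire()
        pending.append(executor.submit(remove_one, doc_id))
    wait(pending)
    for future in pending:
        future.result()  # re-raise any remove failure before phase 3

    # Phase 3: all removes are ACKed, so the bucket itself can be dropped.
    backend.delete_bucket(bucket)
    return len(doc_ids)
```

In this sketch, calling `delete_bucket_throttled(backend, "b1", 4, pool)` with a `ThreadPoolExecutor` never has more than four removes in flight at once, regardless of how many documents the bucket holds.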
Diffstat (limited to 'application')
0 files changed, 0 insertions, 0 deletions