| Commit message | Author | Age | Files | Lines |
| |
enable_metadata_only_fetch_phase_for_inconsistent_updates=true as default.
|
Neither distributors nor content nodes ever report their state as
Initializing as part of their startup sequence; they go straight
from Down to Up. Remove complicated init progress delta reporting
that is no longer needed.
|
The existing state unification logic was likely intended to ensure that
various distributor availability states were treated as if they were
simply Up, but the distributor has not been able to even _be_ in any
available state other than Up for many years. So it's effectively pointless.
Remove unification entirely and instead require both the distributor
and content node to be mutually in sync with the exact cluster state
version.
|
|\
vespa-engine/vekterli/avoid-bucket-db-race-during-cluster-state-transition
Avoid bucket DB race during content node cluster state transition [run-systemtest]
|
It was possible for a distributor bucket fetch request to be processed
_after_ a cluster state was enabled (and internally propagated) on the content
node, but _before_ all side effects of this enabling were complete and fully
visible. This could cause inconsistent information to be returned to the
distributor, causing bucket metadata to get out of sync across nodes.
This commit handles such transition periods by introducing an implicit
barrier between observing the incoming command and outgoing reply for a
particular cluster state version. Upon observing the reply for a version,
all side effects must already be visible since the reply is only sent
once internal state processing is complete (both above and below the SPI).
Until initiated and completed versions converge, requests are rejected and
will be transparently retried by the distributors.
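
A minimal sketch of the barrier idea described above, not the actual
implementation. The class and method names are hypothetical; the only
assumption is that the node tracks the version of the last observed
state command and the version of the last observed state reply:

```python
class ClusterStateBarrier:
    """Hypothetical tracker for initiated vs. completed cluster state versions."""

    def __init__(self):
        self.initiated_version = 0  # version of the last incoming state command
        self.completed_version = 0  # version of the last outgoing state reply

    def on_state_command(self, version):
        # The state is about to be enabled; side effects may not be visible yet.
        self.initiated_version = version

    def on_state_reply(self, version):
        # The reply is only sent once internal processing is complete.
        self.completed_version = version

    def may_serve_bucket_fetch(self):
        # Requests are rejected until the two versions converge.
        return self.initiated_version == self.completed_version


barrier = ClusterStateBarrier()
barrier.on_state_command(5)
rejected_mid_transition = not barrier.may_serve_bucket_fetch()
barrier.on_state_reply(5)
accepted_after_reply = barrier.may_serve_bucket_fetch()
```

A rejected request is expected to be retried by the distributor, so the
check only needs to be cheap and conservative.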
|
|\ \
vespa-engine/vekterli/do-not-inhibit-activation-under-maintenance-mode
Do not inhibit bucket replica activation under maintenance mode [run-systemtest]
|
Let naming better reflect underlying semantics
Co-authored-by: Geir Storli <geirst@yahooinc.com>
|
Adds an internal feature support flag which communicates that the
content node will not implicitly index a non-ideal replica marked
explicitly as active. Activation during maintenance is performed
iff a content node has this flag set.
|
This lets status pages that today are served at the root `/` path
be aliased under `/contentnode-status/v1/`. Legacy paths continue
working as before. Change existing absolute status page paths to
relative paths to avoid having to care about this particular detail internally.
Note: both distributor and search/storage node process status pages
use the `/contentnode-status/v1/` prefix, as they're both technically
processes that are part of a _logical_ content node.
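
As a rough illustration of the aliasing (a sketch only; the function name
and the `bucketdb` page name in the usage are hypothetical), a request
path under the new prefix can be mapped back onto the legacy root path,
while legacy paths pass through untouched:

```python
STATUS_PREFIX = "/contentnode-status/v1/"


def normalize_status_path(path):
    """Map an aliased status path onto the legacy root-relative path."""
    if path.startswith(STATUS_PREFIX):
        # Strip the alias prefix but keep a leading slash.
        return "/" + path[len(STATUS_PREFIX):]
    # Legacy paths continue working as before.
    return path


aliased_root = normalize_status_path("/contentnode-status/v1/")
aliased_page = normalize_status_path("/contentnode-status/v1/bucketdb")
legacy_page = normalize_status_path("/bucketdb")
```

With relative links inside the pages themselves, the same page content
works under either path without knowing which one was requested.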
|
vespa-engine/vekterli/make-gc-work-with-parent-child-with-subset-indexed
Make two-phase GC work for parent-child with subset of replicas indexed [run-systemtest]
|
The previous iteration of GC 1st phase candidate set computation required
_all_ replicas to agree that a particular document should be removed
for it to be passed on to the second phase, i.e. the candidate set was the
intersection of all nodes' document sets. This does not work as expected when the GC
expression references imported fields _and_ `searchable-copies` is
less than `redundancy`, as the required index structures are not present
across all replicas. The result was that eligible documents were never
removed.
This commit changes the candidate set semantics to instead use a
union of document IDs, using the maximum observed timestamp in the case
of conflicts for the same ID. This mirrors the end result of the legacy
behavior, but does not require merging in order to propagate tombstones
from the indexed replicas to those without. It also greatly simplifies
the candidate computation code.
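
The difference between the two semantics can be sketched as follows. This
is an illustration under assumptions, not the actual code: each replica's
first phase result is modelled as a plain dict from document ID to
timestamp, and both function names are hypothetical.

```python
def intersect_candidates(replica_results):
    """Previous semantics: keep only IDs that every replica reported."""
    if not replica_results:
        return set()
    common = set(replica_results[0])
    for result in replica_results[1:]:
        common &= set(result)
    return common


def union_candidates(replica_results):
    """New semantics: union of IDs, keeping the maximum observed timestamp."""
    merged = {}
    for result in replica_results:
        for doc_id, timestamp in result.items():
            merged[doc_id] = max(timestamp, merged.get(doc_id, timestamp))
    return merged


# Two indexed replicas and one non-indexed replica that cannot evaluate
# the imported field and therefore reports no candidates.
indexed_a = {"id:ns:child::1": 100, "id:ns:child::2": 250}
indexed_b = {"id:ns:child::1": 120}
not_indexed = {}

old_candidates = intersect_candidates([indexed_a, indexed_b, not_indexed])
new_candidates = union_candidates([indexed_a, indexed_b, not_indexed])
```

In this example the intersection is empty, so nothing would ever be
removed, whereas the union yields both documents with their newest
timestamps.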
|
After initialization, the node will immediately start communicating with the cluster
controller, exchanging host info. This host info contains a subset snapshot of the active
metrics, which includes the total bucket count, doc count etc. It is critical that
we never report back host info _prior_ to having run at least one full sweep of
the bucket database, lest we risk transiently reporting zero buckets held by the
content node. Doing so could cause orchestration logic to perform operations based
on erroneous assumptions.
To avoid this, we explicitly force a full DB sweep and metric update prior to reporting
the node as up. Since this function is called prior to the CommunicationManager thread
being started, any CC health pings should also always happen after this init step.
|
|\
vespa-engine/vekterli/capability-filtering-of-content-status-pages
Add capability filtering for content layer status pages and metrics [run-systemtest]
|
This currently only applies to the port exposed for the content node
or distributor specific status pages and metrics export, not state V1.
|
Refactor existing request access filter creation to use these.
|
|\
vespa-engine/vekterli/add-cc-api-server-capability-filter
Add capability filter to cluster controller API RPCs on content nodes
|
Minor cleanup sweep.
|
|\
Add support for two-phase document garbage collection [run-systemtest]
|
If enabled, garbage collection is performed in two phases (metadata
gathering and deletion) instead of just a single phase. Two-phase GC
allows for ensuring the same set of documents is deleted across all
nodes and explicitly takes write locks on the distributor to prevent
concurrent feed ops to GC'd documents from potentially creating
inconsistencies.
Two-phase GC is used _iff_ all replica content nodes support
the feature _and_ it's enabled in config. An additional field has
been added to the feature negotiation functionality to communicate
support from content nodes to distributors.
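
The enabling condition can be sketched like this (hypothetical names
throughout; the real feature negotiation structure is not shown in this
log). The sketch assumes each replica node advertises a boolean support
flag:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeFeatures:
    """Hypothetical per-node feature set learned via feature negotiation."""
    two_phase_gc: bool = False


def use_two_phase_gc(enabled_in_config, replica_features):
    # Both conditions must hold: config enables it AND every replica supports it.
    return enabled_in_config and all(f.two_phase_gc for f in replica_features)


all_support = [NodeFeatures(True), NodeFeatures(True)]
mixed_support = [NodeFeatures(True), NodeFeatures(False)]

chosen_when_all_support = use_two_phase_gc(True, all_support)
chosen_when_mixed = use_two_phase_gc(True, mixed_support)
chosen_when_disabled = use_two_phase_gc(False, all_support)
```

Falling back to single-phase GC whenever any replica lacks support keeps
mixed-version clusters working during rolling upgrades.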
|
Move ref away to avoid an unneeded refcount bump and avoid leaving
behind a lingering strong reference to the last generated operation.
|
This should always succeed today, as authz rules by default grant
all capabilities. But since this is a very hot call path, we'll
learn very quickly if the capability check incurs a measurable
overhead; it is not expected to do so in practice (really just a
virtual function call and a few bitwise ops).
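
To illustrate why such a check is expected to be cheap, here is a sketch
of a capability set represented as a bitmask. The capability names are
hypothetical and this is not the actual capability API; it only shows
that the check reduces to a few bitwise operations:

```python
from enum import IntFlag


class Capability(IntFlag):
    # Hypothetical capability names, for illustration only.
    STORAGE_API = 1
    DOCUMENT_API = 2
    STATUS_PAGES = 4


def has_capabilities(granted, required):
    # True only if every required bit is also present in the granted set.
    return (granted & required) == required


ALL_CAPABILITIES = (
    Capability.STORAGE_API | Capability.DOCUMENT_API | Capability.STATUS_PAGES
)

default_grant_ok = has_capabilities(ALL_CAPABILITIES, Capability.STORAGE_API)
restricted_ok = has_capabilities(
    Capability.STATUS_PAGES,
    Capability.STORAGE_API | Capability.STATUS_PAGES,
)
```

When the default rules grant all capabilities, the first check always
passes, which matches the expectation stated above.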
|
|\
vespa-engine/vekterli/add-separate-id-and-timestamp-wrapper
Add wrapper for <doc id, timestamp> tuple and update APIs to use this
|
Feels more intuitive to have a tuple that implies "document foo at timestamp bar"
rather than the current inverse of "timestamp bar with document foo".
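
The ordering can be illustrated with a small named tuple. The type name
and the document ID below are hypothetical and do not reflect the actual
wrapper:

```python
from typing import NamedTuple


class DocIdAndTimestamp(NamedTuple):
    """Hypothetical wrapper: "document foo at timestamp bar"."""
    doc_id: str
    timestamp: int


entry = DocIdAndTimestamp("id:ns:type::foo", 1234)
doc_id, timestamp = entry  # unpacks in the intuitive <doc id, timestamp> order
```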
|