Commit message (Author, Age, Files, Lines)
* Revert changes to config generation (Håkon Hallingstad, 2021-10-20, 10 files, -30/+46)
|
* Fixes after review round (Håkon Hallingstad, 2021-10-19, 12 files, -95/+90)
|
* Improve logging of FleetController and DatabaseHandler (Håkon Hallingstad, 2021-10-15, 21 files, -256/+371)
|
* Merge pull request #19566 from vespa-engine/hakonhall/some-optimizations-of-rpcservertest (Håkon Hallingstad, 2021-10-14, 4 files, -22/+31)
|\
| |   Some optimizations of RpcServerTest
| * Some optimizations of RpcServerTest (Håkon Hallingstad, 2021-10-14, 4 files, -22/+31)
| |
* | Merge pull request #19564 from vespa-engine/freva/prepare-container-fs (Valerij Fredriksen, 2021-10-14, 27 files, -175/+166)
|\ \
| | |   Prepare to use ContainerPaths
| * | Update default container storage root (Valerij Fredriksen, 2021-10-14, 4 files, -11/+11)
| | |
| * | Use String instead of Path where possible (Valerij Fredriksen, 2021-10-14, 5 files, -33/+28)
| | |
| * | Simplify with UnixPath (Valerij Fredriksen, 2021-10-14, 8 files, -64/+61)
| | |
| * | Create factory method for NodeAgentContext builder (Valerij Fredriksen, 2021-10-14, 12 files, -29/+33)
| | |
| * | Create factory method for ContainerFileSystem (Valerij Fredriksen, 2021-10-14, 5 files, -38/+33)
| | |
* | | Merge pull request #19544 from vespa-engine/container-config-improvements (gjoranv, 2021-10-14, 10 files, -142/+163)
|\ \ \
| | | |   Container config improvements [run-systemtest]
| * | | Init the config generation to 1 instead of 0. (gjoranv, 2021-10-14, 1 file, -1/+1)
| | | |
| | | |   - An initial value of 0 generated the config generation sequence 1,1,2,3,..., causing an exception in Container.getConfigAndCreateGraph when it got bootstrap configs with generation=1 twice.
| * | | Rename config retriever field. (gjoranv, 2021-10-14, 1 file, -7/+7)
| | | |
| * | | Allow exceptions from the config system to propagate up. (gjoranv, 2021-10-13, 1 file, -9/+2)
| | | |
| * | | Simplify and improve config retrieval. (gjoranv, 2021-10-13, 1 file, -34/+27)
| | | |
| | | |   - Retrieve bootstrap snapshot first when the system is in the stable state. When bootstrap is newer than components, retrieve the new components generation. This avoids getting exceptions from the config system when a component that takes a config with missing default value has been removed.
| | | |   - Do not close set up empty component subscriber after bootstrap, should be unnecessary as it's always done when component config keys are changed.
| | | |   - Declare getConfigsOnce private.
| | | |   - Improve debug logging
| * | | Improve debug logging. (gjoranv, 2021-10-13, 1 file, -4/+5)
| | | |
| * | | minor: rearrange fields. (gjoranv, 2021-10-08, 1 file, -1/+2)
| | | |
| * | | Improve debugging of CloudSubscriber by adding a name. (gjoranv, 2021-10-08, 7 files, -18/+21)
| | | |
| * | | Simplify by taking a SubscriberFactory instead of a Function. (gjoranv, 2021-10-08, 3 files, -11/+10)
| | | |
| * | | Move CloudSubscriber to separate class file. (gjoranv, 2021-10-08, 2 files, -75/+101)
| | | |
| * | | Add more debug log for config generations. (gjoranv, 2021-10-08, 2 files, -3/+7)
| | | |
| * | | Use correct method name in log message. (gjoranv, 2021-10-08, 1 file, -1/+1)
| | | |
| * | | Improve comment (gjoranv, 2021-10-08, 1 file, -1/+2)
| | | |
* | | | Merge pull request #19565 from vespa-engine/hmusum/config-cleanup-1 (Henning Baldersheim, 2021-10-14, 3 files, -25/+7)
|\ \ \ \
| | | | |   Cleanup, no functional changes
| * | | | Cleanup, no functional changes (Harald Musum, 2021-10-14, 3 files, -25/+7)
| | | | |
* | | | | Merge pull request #19559 from vespa-engine/hmusum/upgrade-to-curator-5.2.0 (Håkon Hallingstad, 2021-10-14, 13 files, -13/+24)
|\ \ \ \ \
| | | | | |   Upgrade to Curator 5.2.0 [run-systemtest]
| * | | | | Upgrade to Curator 5.2.0 (Harald Musum, 2021-10-14, 13 files, -13/+24)
| | |_|/ /
| |/| | |
* | | | | Merge pull request #19554 from vespa-engine/hmusum/application-package-maintainer-changes (Henning Baldersheim, 2021-10-14, 4 files, -5/+9)
|\ \ \ \ \
| |_|/ / /
|/| | | |
| | | | |   Improve download of application package in maintainer [run-systemtest]
| * | | | Improve download of application package in maintainer (Harald Musum, 2021-10-14, 4 files, -5/+9)
| | | | |
| | | | |   Set downloadFromOtherSourceIfNotFound to false, so that the config server receiving the request doesn't try to download a file reference. This will be done by the ApplicationPackageMaintainer on the other server anyway.
* | | | | Merge pull request #19562 from vespa-engine/balder/prevent-division-by-zero (Jon Bratseth, 2021-10-14, 2 files, -11/+18)
|\ \ \ \ \
| | | | | |   Prevent division by zero
| * | | | | Update metrics-proxy/src/main/java/ai/vespa/metricsproxy/service/SystemPoller.java (Jon Bratseth, 2021-10-14, 1 file, -1/+1)
| | | | | |
| * | | | | Prevent division by zero (Henning Baldersheim, 2021-10-14, 2 files, -11/+18)
| | | | | |
* | | | | | Merge pull request #19560 from vespa-engine/vekterli/add-distributor-enhanced-maintenance-scheduling-feature-flag (Tor Brede Vekterli, 2021-10-14, 6 files, -2/+43)
|\ \ \ \ \ \
| |/ / / / /
|/| | | | |
| | | | | |   Add feature flag for enhanced distributor maintenance scheduling
| * | | | | Add feature flag for enhanced distributor maintenance scheduling (Tor Brede Vekterli, 2021-10-14, 6 files, -2/+43)
| | |/ / /
| |/| | |
* | | | | Merge pull request #19556 from vespa-engine/vekterli/add-metric-for-max-time-since-bucket-gc (Tor Brede Vekterli, 2021-10-14, 9 files, -16/+89)
|\ \ \ \ \
| | | | | |   Add metric for max time since bucket GC was last run
| * | | | | Add metric for max time since bucket GC was last run (Tor Brede Vekterli, 2021-10-14, 9 files, -16/+89)
| | |_|_|/
| |/| | |
| | | | |   Max time is aggregated across all buckets. If this metric value grows substantially larger than the configured GC period it indicates that GC is being starved.
* | | | | Merge pull request #19547 from vespa-engine/toregge/add-detailed-metrics-for-failed-merge-operations (Geir Storli, 2021-10-14, 5 files, -33/+111)
|\ \ \ \ \
| |_|/ / /
|/| | | |
| | | | |   Add detailed metrics for failed merge operations.
| * | | | Use ASSERT_NO_FATAL_FAILURE() to propagate fatal failures. (Tor Egge, 2021-10-14, 1 file, -6/+6)
| | | | |
| * | | | Add detailed metrics for failed merge operations. (Tor Egge, 2021-10-14, 5 files, -33/+111)
| | | | |
* | | | | Merge pull request #19379 from vespa-engine/vekterli/avoid-stalling-maintenance-scheduling-if-single-op-blocked (Tor Brede Vekterli, 2021-10-14, 13 files, -81/+252)
|\ \ \ \ \
| |_|/ / /
|/| | | |
| | | | |   Don't let a blocked maintenance operation inhibit remaining maintenance queue [run-systemtest]
| * | | | Use blocking scheduling semantics for bucket activation maintenance (Tor Brede Vekterli, 2021-10-14, 3 files, -4/+23)
| | | | |
| | | | |   We consider bucket maintenance so latency critical that we'll prefer to stall scheduling of subsequent buckets instead of risking having to re-scan the DB to encounter the bucket again.
| | | | |
| * | | | Make implicit bucket priority DB clearing on scheduling configurable (Tor Brede Vekterli, 2021-10-14, 8 files, -16/+90)
| | | | |
| * | | | Don't let a blocked maintenance operation inhibit remaining maintenance queue (Tor Brede Vekterli, 2021-10-14, 9 files, -72/+150)
|/ / / /
| | | |   The old maintenance scheduler behavior is to only remove a bucket from the priority DB if its maintenance operation was successfully started. Failing to start an operation could be caused by max pending throttling as well as by operation/bucket-specific blocking behavior. Since the scheduler would encounter the same bucket as the one previously blocked upon its next tick invocation, a single blocked bucket would run the risk of head-of-line stalling the rest of the remaining maintenance queue (assuming the ongoing DB scan did not encounter any higher priority buckets).
| | | |
| | | |   This commit changes the following aspects of maintenance scheduling:
| | | |
| | | |   * Always clear entries from the priority DB before trying to start an operation. A blocked operation will be retried the next time the regular bucket DB scan encounters the bucket.
| | | |   * Avoid trying to start (and clear) inherently doomed operations by _not_ trying to schedule any operations if they would be blocked due to too many pending maintenance operations anyway. Introduces a new `PendingWindowChecker` interface for this purpose.
| | | |   * Explicitly inhibit all maintenance scheduling if a pending cluster state is present. Operations are already _implicitly_ blocked from starting if there's a pending cluster state, but this would cause the priority DB to be pointlessly cleared.
* | | | Merge pull request #19553 from vespa-engine/balder/test-system-metrics (Jon Bratseth, 2021-10-14, 4 files, -55/+210)
|\ \ \ \
| |_|/ /
|/| | |
| | | |   Balder/test system metrics
| * | | cpu.util -> cpu_util (Henning Baldersheim, 2021-10-14, 2 files, -5/+9)
| | | |
| * | | Make system metrics testable. (Henning Baldersheim, 2021-10-14, 4 files, -53/+204)
| | | |
* | | | Merge pull request #19548 from vespa-engine/hakonhall/reduce-running-time-of-masterelectiontest-from-28-to-12s (Håkon Hallingstad, 2021-10-14, 2 files, -1/+11)
|\ \ \ \
| |/ / /
|/| | |
| | | |   Reduce running time of MasterElectionTest from 28 to 12s
| * | | Reduce running time of MasterElectionTest from 28 to 12s (Håkon Hallingstad, 2021-10-14, 2 files, -1/+11)
| | | |
* | | | Merge pull request #19551 from vespa-engine/mpolden/image-selection-cleanup (Martin Polden, 2021-10-14, 17 files, -202/+102)
|\ \ \ \
| | | | |   Stop reading container images from ZK