path: root/application-model
Commit message | Author | Age | Files | Lines
* Revert "Revert "Use InfrastructureApplication"" | Håkon Hallingstad | 2022-01-05 | 1 | -0/+9
|
* Revert "Use InfrastructureApplication" | Harald Musum | 2022-01-05 | 1 | -9/+0
|
* Methods for getting all infrastructure applications in hosted | Håkon Hallingstad | 2022-01-04 | 1 | -0/+9
|
* Add InfrastructureApplication in application-model | Håkon Hallingstad | 2022-01-03 | 6 | -19/+81
|
* Update 2019 Oath copyrights. | gjoranv | 2021-10-27 | 1 | -1/+1
|
* Update 2017 copyright notices. | gjoranv | 2021-10-07 | 15 | -15/+15
|
* Disallow cfg suspension based solely on being down | Håkon Hallingstad | 2021-09-23 | 1 | -1/+9
|
* Add ServiceStatus.UNKNOWN | Håkon Hallingstad | 2021-09-13 | 1 | -1/+6
|
* Revert "Revert "Pass around orchestration parameters"" | Håkon Hallingstad | 2021-07-29 | 3 | -0/+21
|
* Revert "Pass around orchestration parameters" | Håkon Hallingstad | 2021-07-29 | 3 | -21/+0
|
* Use OrchestrationParams | Håkon Hallingstad | 2021-07-28 | 2 | -0/+19
|
* OrchestrationParams | Håkon Hallingstad | 2021-07-28 | 1 | -0/+2
|
* Allow Jackson deserialization of model types | Bjørn Christian Seime | 2021-04-12 | 1 | -0/+10
|
* Avoid serialization of utility methods | Håkon Hallingstad | 2021-04-02 | 1 | -0/+7
|
* Require 3 config server (and controller) hosts | Håkon Hallingstad | 2021-03-23 | 1 | -0/+14
|   We already require 3 config server (and controller) nodes, but that is not sufficient to protect the hosts from being left with only 1 healthy host. Say the config server host application contains 2 nodes. An upgrade of host-admin on one of those nodes is allowed, since only the host is suspended and none of the 2 nodes are down.
|
|   This is fixed by handling config server hosts similarly to config servers: assume 3 nodes.
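
The reasoning in this commit message can be illustrated with a minimal Java sketch. It is not the Orchestrator's implementation: the class `AssumedClusterSize`, its methods, and the `maxDown` parameter are hypothetical names chosen for illustration. The assumption is that any member missing from the assumed size of 3 is counted as already down.

```java
// Hypothetical sketch; these names do not come from the Vespa code base.
class AssumedClusterSize {
    // The commit message says to "assume 3 nodes".
    static final int ASSUMED_MINIMUM_SIZE = 3;

    // Treat a cluster that reports fewer members as if it had the assumed size.
    static int effectiveSize(int observedSize) {
        return Math.max(observedSize, ASSUMED_MINIMUM_SIZE);
    }

    // A suspension is allowed only if, after taking one more member down,
    // the number of down members stays within maxDown. Members missing from
    // the assumed size count as down.
    static boolean maySuspendOne(int observedSize, int membersUp, int maxDown) {
        int downNow = effectiveSize(observedSize) - membersUp;
        return downNow + 1 <= maxDown;
    }

    public static void main(String[] args) {
        // Two observed hosts, both up, at most one allowed down.
        System.out.println(maySuspendOne(2, 2, 1));
        // Three observed hosts, all up, at most one allowed down.
        System.out.println(maySuspendOne(3, 3, 1));
    }
}
```

With only 2 observed hosts the third is treated as down, so a further suspension is denied; with 3 healthy hosts one suspension is allowed.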
* Support delegating content node suspension to cluster controller | Håkon Hallingstad | 2021-01-22 | 1 | -0/+6
|   This PR introduces a new flag, group-suspension, which if true enables the following:
|
|   - Instead of allowing at most one storagenode to suspend at any given time, it now ignores storagenode, searchnode, and distributor service clusters, and relies on the cluster controller to allow or deny the request to suspend. This will increase the load on the cluster controllers.
|
|   Combined with earlier changes to the cluster controller, this new flag effectively guards the feature of allowing all nodes within a hierarchical group to suspend concurrently.
|
|   I also took the opportunity to tune related policies:
|
|   - Allow at most one config server and controller to be down at any given time. This is actually a no-op, since it was effectively equal to the older policy of 10% down.
|   - Allow 20% of all host-admins to be down, not just tenant host-admins. This is effectively equal to the old policy of 10%, except that it may allow 2 proxy host-admins to go down at the same time. Should be fine.
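
The two kinds of limits mentioned above (a fixed "at most one down" limit and a percentage limit) can be sketched as follows. This is a simplified illustration only: the class `SuspensionLimits` and its methods are hypothetical, and the real policy code may round or special-case small clusters differently.

```java
// Hypothetical sketch of a fixed limit and a percentage limit; not Vespa's policy code.
class SuspensionLimits {
    // Allow one more member to go down only if the resulting number of down
    // members is at most percentLimit percent of the cluster.
    // Integer arithmetic avoids floating-point rounding.
    static boolean withinPercentLimit(int clusterSize, int downNow, int percentLimit) {
        return (downNow + 1) * 100 <= clusterSize * percentLimit;
    }

    // Allow one more member to go down only if the resulting number of down
    // members is at most maxDown.
    static boolean withinFixedLimit(int downNow, int maxDown) {
        return downNow + 1 <= maxDown;
    }

    public static void main(String[] args) {
        // Assumed example: 10 host-admins, 1 already down, 20% limit.
        System.out.println(withinPercentLimit(10, 1, 20));
        // Same cluster under a 10% limit.
        System.out.println(withinPercentLimit(10, 1, 10));
        // Config servers: at most one down at any given time.
        System.out.println(withinFixedLimit(1, 1));
    }
}
```

In this sketch a 20% limit on an assumed cluster of 10 host-admins admits a second concurrent suspension, while a 10% limit on the same cluster does not, which matches the difference the message describes for proxy host-admins.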
* Update application-model/src/main/java/com/yahoo/vespa/applicationmodel/ClusterId.java | Håkon Hallingstad | 2020-09-18 | 1 | -1/+1
|   Co-authored-by: Harald Musum <musum@verizonmedia.com>
* 30s down-moratorium before allowing suspension | Håkon Hallingstad | 2020-09-18 | 5 | -16/+79
|
* Orchestrator should assume 3 controllers | Håkon Hallingstad | 2020-06-22 | 3 | -3/+18
|
* Moved to more specific methods on ServiceMonitor | Håkon Hallingstad | 2020-02-28 | 1 | -13/+15
|
* Unit test 1-d map short form modify update | Jon Bratseth | 2020-01-14 | 1 | -0/+1
|
* Assume at least 3 config server in Orchestrator | Håkon Hallingstad | 2019-08-13 | 4 | -0/+11
|
* Health rest API | Håkon Hallingstad | 2019-01-31 | 1 | -4/+12
|   Adds a new REST API, /orchestrator/v1/health/<ApplicationId>, that shows the list of services that are monitored for health. This information is currently a bit difficult to infer from /orchestrator/v1/instances/<ApplicationInstanceReference>, since that is the combined view of health and Slobrok. There are already APIs for Slobrok.
|
|   Example content:
|
|       $ curl -s localhost:19071/orchestrator/v1/health/hosted-vespa:zone-config-servers:default | jq .
|       {
|         "services": [
|           {
|             "clusterId": "zone-config-servers",
|             "serviceType": "configserver",
|             "configId": "zone-config-servers/cfg6",
|             "status": {
|               "serviceStatus": "UP",
|               "lastChecked": 1548939111.708718,
|               "since": 1548939051.686223,
|               "endpoint": "http://cfg4.prod.cd-us-central-1.vespahosted.ne1.yahoo.com:19071/state/v1/health"
|             }
|           },
|           ...
|         ]
|       }
|
|   This view is slightly different from the application model view, because that is exactly how the health monitoring is structured (individual monitors against endpoints).
|
|   The "endpoint" information will also be added to /instances if the status comes from health and not Slobrok.
* Revert "Preserve serviceStatus in service instance for backwards compatibility" | Jon Marius Venstad | 2019-01-28 | 1 | -2/+0
|
* Preserve serviceStatus in service instance for backwards compatibility | Håkon Hallingstad | 2019-01-25 | 1 | -0/+2
|
* Metadata about /state/v1/health status | Håkon Hallingstad | 2019-01-25 | 2 | -7/+108
|   The service monitor uses /state/v1/health to monitor config servers and the host admins (but not yet tenant host admins). This commit adds some metadata about the status of a service:
|
|   - The time the status was last checked
|   - The time the status changed to its current value
|
|   This can be used to make more intelligent decisions in the Orchestrator, for example only allowing a service to suspend if it has been DOWN longer than X seconds (to keep a spurious DOWN from breaking redundancy and uptime guarantees).
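
The "DOWN longer than X seconds" idea can be sketched in Java using the two timestamps described above. The types here (`DownMoratorium`, `StatusInfo`, `Status`) are hypothetical stand-ins, not the classes added by the commit; the status names UP, DOWN, NOT_CHECKED and UNKNOWN are the ones that appear elsewhere in this log.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch; these types are illustrative and not taken from application-model.
class DownMoratorium {
    enum Status { UP, DOWN, NOT_CHECKED, UNKNOWN }

    // A status together with the two pieces of metadata from the commit message.
    static final class StatusInfo {
        final Status status;
        final Instant since;        // when the status changed to its current value
        final Instant lastChecked;  // when the status was last checked

        StatusInfo(Status status, Instant since, Instant lastChecked) {
            this.status = status;
            this.since = since;
            this.lastChecked = lastChecked;
        }
    }

    // True only if the service is DOWN and has been DOWN for at least the
    // given moratorium, so that a brief, spurious DOWN is not acted upon.
    static boolean downLongEnough(StatusInfo info, Instant now, Duration moratorium) {
        if (info.status != Status.DOWN) return false;
        return !info.since.plus(moratorium).isAfter(now);
    }

    public static void main(String[] args) {
        Instant since = Instant.ofEpochSecond(1_000);
        StatusInfo down = new StatusInfo(Status.DOWN, since, since.plusSeconds(10));
        System.out.println(downLongEnough(down, since.plusSeconds(10), Duration.ofSeconds(30)));
        System.out.println(downLongEnough(down, since.plusSeconds(30), Duration.ofSeconds(30)));
    }
}
```

A later entry in this log mentions a 30-second down-moratorium; the 30-second duration in `main` only mirrors that number and is otherwise an arbitrary choice for the example.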
* Nonfunctional changes only | Jon Bratseth | 2019-01-21 | 1 | -0/+1
|
* 6-SNAPSHOT -> 7-SNAPSHOT | Arnstein Ressem | 2019-01-21 | 1 | -2/+2
|
* Support monitoring health of tenant hosts | Håkon Hallingstad | 2019-01-16 | 1 | -5/+0
|
* Revert "Revert "Add infrastructure applications to DuperModel"" | Håkon Hallingstad | 2018-12-03 | 1 | -0/+3
|
* Revert "Add infrastructure applications to DuperModel" | Harald Musum | 2018-12-03 | 1 | -3/+0
|
* Add infrastructure applications to DuperModel | Håkon Hallingstad | 2018-11-30 | 1 | -0/+3
|   DuperModel is (will be) responsible for both active tenant applications (through SuperModel) and infrastructure applications. This PR is one step in that direction:
|
|   - All infrastructure applications (config, confighost, controller, controllerhost, and proxyhost) are owned and managed by DuperModel.
|   - The InfrastructureProvisioner retrieves all possible infra apps from the DuperModel (through a reduced API), and "activates" each of them if target is set and there are any nodes etc.
|   - The InfrastructureProvisioner then notifies the DuperModel which apps have been activated, and with which hosts.
|   - The DuperModel can then artificially create ApplicationInfo, which gets translated into the application model, and finally the service model.
|   - The resulting service model has NOT_CHECKED for each hostadmin service instance. This is sufficient for goal 1 of this sprint.
|   - The config server application currently has health, so that's kept as-is for now.
|   - Feature flags have been tried and work. They allow (1) disabling the addition of the infra apps in the DuperModel, and (2) enabling the infra configserver instead of the currently created configserver w/health.
* Remove explicit maven-compiler-plugin config. Inherit from parent. | gjoranv | 2018-04-25 | 1 | -10/+0
|
* Support reporting UP for node admin outside zone app | Håkon Hallingstad | 2018-02-26 | 2 | -0/+7
|   If nodeAdminInContainer has been set in ConfigserverConfig, then with this PR the service monitor will always report the node admin container service as UP, thereby avoiding issues related to a standalone node admin seemingly being down when not running as part of the application. This defers checking /status/v1/health until later.
* Split parent + container-dependency-versions from root pom. | gjoranv | 2017-12-01 | 1 | -0/+1
|   - Add missing dependencies so that all provided non-yahoo jars are listed in container-dependency-versions.
|   - Add relativePath for all child poms of parent.
* Revert "Gjoranv/split parent2" | gjoranv | 2017-11-30 | 1 | -1/+0
|
* Split parent + container-dependency-versions from root pom. | gjoranv | 2017-11-30 | 1 | -0/+1
|   - Add missing dependencies so that all provided non-yahoo jars are listed in container-dependency-versions.
|   - Add relativePath for all child poms of parent.
* Revert "Gjoranv/split parent" | gjoranv | 2017-11-29 | 1 | -1/+0
|
* Split parent + container-dependency-versions from root pom. | gjoranv | 2017-11-29 | 1 | -0/+1
|   - Add missing dependencies so that all provided non-yahoo jars are listed in container-dependency-versions.
|   - Add relativePath for all child poms of parent.
* Avoid recursive toString | Håkon Hallingstad | 2017-10-25 | 2 | -4/+0
|
* Avoid recursive hashCode and equals | Håkon Hallingstad | 2017-10-25 | 2 | -6/+4
|
* Provide more info in host Orchestrator REST API | Håkon Hallingstad | 2017-10-25 | 2 | -6/+36
|
* Remove status type parameter in application model classes | Håkon Hallingstad | 2017-10-22 | 4 | -15/+26
|
* Include orchestrator and service-model fat jars | Håkon Hallingstad | 2017-10-19 | 1 | -0/+2
|
* Nonfunctional changes | Jon Bratseth | 2017-08-30 | 11 | -0/+17
|
* Update copyright headers | Jon Bratseth | 2017-06-14 | 13 | -2/+13
|
* Revert "Update copyright headers" | Jon Bratseth | 2017-06-14 | 13 | -13/+2
|
* Update copyright headers | Jon Bratseth | 2017-06-14 | 13 | -2/+13
|
* Revert "Copyright header" | Jon Bratseth | 2017-06-13 | 13 | -13/+2
|
* Copyright header | Jon Bratseth | 2017-06-13 | 13 | -2/+13
|