author | Jack Lucas <jflucas@research.att.com> | 2020-06-12 11:44:49 -0400 |
---|---|---|
committer | Jack Lucas <jflucas@research.att.com> | 2020-06-15 16:05:52 -0400 |
commit | ef3183f5ae33b63be4aca2dd95e52e009f15f4c3 (patch) | |
tree | 7a68dea0f092371fb7d8843ff08f264fa4e234ef /healthcheck-container | |
parent | ec3410a8478bedba8a06efb02f1610c2ebfdf130 (diff) |
Make healthcheck fully dynamic (3.0.0)
Support health checks for DCAE and DCAE MOD
Issue-ID: DCAEGEN2-1864
Signed-off-by: Jack Lucas <jflucas@research.att.com>
Change-Id: Idcf127a591ff3b926a5af0281c591d8da18355f1
Diffstat (limited to 'healthcheck-container')
-rw-r--r-- | healthcheck-container/README.md | 19 |
-rw-r--r-- | healthcheck-container/get-status.js | 66 |
-rw-r--r-- | healthcheck-container/healthcheck.js | 52 |
-rw-r--r-- | healthcheck-container/package.json | 2 |
-rw-r--r-- | healthcheck-container/pom.xml | 2 |
5 files changed, 58 insertions, 83 deletions
diff --git a/healthcheck-container/README.md b/healthcheck-container/README.md
index fb8d87c..f4185bd 100644
--- a/healthcheck-container/README.md
+++ b/healthcheck-container/README.md
@@ -1,16 +1,17 @@
-# DCAE Healthcheck Service
+# DCAE and DCAE MOD Healthcheck Service

-The DCAE Healthcheck service provides a simple HTTP API to check the status of DCAE components running in the Kubernetes environment. When it receives any incoming HTTP request, the service makes queries to the Kubernetes API to determine the current status of the DCAE components, as seen by Kubernetes. Most components have defined a "readiness probe" (an HTTP healthcheck endpoint or a healthcheck script) that Kubernetes uses to
-determine readiness.
+The Healthcheck service provides a simple HTTP API to check the status of DCAE or DCAE MOD components running in the Kubernetes environment. When it receives any incoming HTTP request, the service makes queries to the Kubernetes API to determine the current status of the DCAE or DCAE MOD components, as seen by Kubernetes. Most components have defined a "readiness probe" (an HTTP healthcheck endpoint or a healthcheck script) that Kubernetes uses to determine readiness.

-The Healthcheck service has three sources for identifying components that should be running:
-1. A hardcoded list of components that are expected to be deployed by Helm as part of the ONAP installation.
-2. A hardcoded list of components thar are expected to be deployed with blueprints using Cloudify Manager during DCAE bootstrapping, which is part of ONAP installation.
-3. Components labeled in Kubernetes as having been deployed by Cloudify Manager. These are identified by a query to the Kubernetes API. The query is made each time an incoming HTTP request is made.
+Two instances of the Healthcheck service are deployed in ONAP: one for DCAE and one for DCAE MOD.

-Note that by "component", we mean a Kubernetes Deployment object associated with the component.
+The Healthcheck service has two sources for identifying components that should be running:
+1. A list of components that are expected to be deployed by Helm as part of the ONAP installation, specified in a JSON array stored in a file at `/opt/app/expected-components.json`.

-Sources 2 and 3 are likely to overlap (the components in source 2 are labeled in Kubernetes as having been deployed by Cloudify Manager, so they will show up as part of source 3 if the bootstrap process progressed to the point of attempting deployments). The code de-duplicates these sources.
+    DCAE and DCAE MOD have configurable deployments. By setting flags in the `values.yaml` file or in an override file, a user can select which components are deployed. The`/opt/app/expected-components.json` file is generated at deployment time based on which components have been selected for deployment. The file is stored in a Kubernetes ConfigMap that is mounted on the healthcheck container at `/opt/app/expected-components.json`. See the Helm charts for DCAE and DCAEMOD in the OOM repository for details on how the ConfigMap is created.
+
+2. Components whose Kubernetes deployments have been marked with the labeled specified by the environment variable `DEPLOY_LABEL`. These are identified by a query to the Kubernetes API requesting a list of all the deployments with the label. The query is made each time an incoming HTTP request is made, so that as new deployments are created, they will be detected and included in the health check.
+
+    For the DCAE instance of the Healthcheck service, the `DEPLOY_LABEL` variable is set to `cfydeployment`. This is the label that the DCAE k8s Cloudify plugin uses to mark every deployment that it creates. The DCAE Healthcheck instance therefore includes all components deployed by the DCAE k8s plugin in its health check. For the DCAE MOD instance of the Healthcheck service, the `DEPLOY_LABEL` is not set, so the DCAE MOD health check does not make any checks based on a label.

 The Healthcheck service returns an HTTP status code of 200 if Kubernetes reports that all of the components that should be running are in a ready state. It returns a status code of 500 if some of the components are not ready. It returns a status code of 503 if some kind of error prevented it from completing a query.

diff --git a/healthcheck-container/get-status.js b/healthcheck-container/get-status.js
index 565c1e0..282b71e 100644
--- a/healthcheck-container/get-status.js
+++ b/healthcheck-container/get-status.js
@@ -25,7 +25,6 @@ const K8S_CREDS = '/var/run/secrets/kubernetes.io/serviceaccount';
 const K8S_HOST = 'kubernetes.default.svc.cluster.local'; // Full name to match cert for TLS
 const K8S_PATH = 'apis/apps/v1beta2/namespaces/';

-const CFY_LABEL = 'cfydeployment'; // All k8s deployments created by Cloudify--and only k8s deployments created by Cloudify--have this label
 const MAX_DEPS = 1000; // Maximum number of k8s deployments to return from a query to k8s

 //Get token and CA cert
@@ -60,13 +59,6 @@ const summarizeDeploymentList = function(list) {
     return ret;
 };

-const summarizeDeployment = function(deployment) {
-    // deployment is a Deployment object returned by k8s
-    // we make it look enough like a DeploymentList object to
-    // satisfy summarizeDeploymentList
-    return summarizeDeploymentList({items: [deployment]});
-};
-
 const queryKubernetes = function(path, callback) {
     // Make GET request to Kubernetes API
     const options = {
@@ -95,17 +87,6 @@ const queryKubernetes = function(path, callback) {
     req.end();
 };

-const getStatus = function(path, extract, callback) {
-    // Get info from k8s and extract readiness info
-    queryKubernetes(path, function(error, res, body) {
-        let ret = body;
-        if (!error && res && res.statusCode === 200) {
-            ret = extract(body);
-        }
-        callback (error, res, ret);
-    });
-};
-
 const getStatusSinglePromise = function (item) {
     // Expect item to be of the form {namespace: "namespace", deployment: "deployment_name"}
     return new Promise(function(resolve, reject){
@@ -130,17 +111,6 @@ const getStatusSinglePromise = function (item) {
         });
     });
 }
-exports.getStatusNamespace = function (namespace, callback) {
-    // Get readiness information for all deployments in namespace
-    const path = K8S_PATH + namespace + '/deployments';
-    getStatus(path, summarizeDeploymentList, callback);
-};
-
-exports.getStatusSingle = function (namespace, deployment, callback) {
-    // Get readiness information for a single deployment
-    const path = K8S_PATH + namespace + '/deployments/' + deployment;
-    getStatus(path, summarizeDeployment, callback);
-};

 exports.getStatusListPromise = function (list) {
     // List is of the form [{namespace: "namespace", deployment: "deployment_name"}, ... ]
@@ -150,24 +120,32 @@ exports.getStatusListPromise = function (list) {
     });
 }

-exports.getDCAEDeploymentsPromise = function (namespace) {
-    // Return list of the form [{namespace: "namespace"}, deployment: "deployment_name"].
+exports.getLabeledDeploymentsPromise = function (namespace, label) {
+    // Return list of the form [{namespace: "namespace", deployment: "deployment_name"}].
     // List contains all k8s deployments in the specified namespace that were deployed
+    // with the specified 'label'. (The check is for the presence of the label--its
+    // values is not important.) In DCAE, this is used to find deployments created
     // by Cloudify, based on Cloudify's use of a "marker" label on each k8s deployment that
     // the k8s plugin created.
+    // If 'label' is unspecified or has zero length, returns an empty list.

     return new Promise(function(resolve, reject) {
-        const path = K8S_PATH + namespace + '/deployments?labelSelector=' + CFY_LABEL + '&limit=' + MAX_DEPS
-        queryKubernetes(path, function(error, res, body){
-            if (error) {
-                reject(error);
-            }
-            else if (res.statusCode !== 200) {
-                reject(body);
-            }
-            else {
-                resolve(body.items.map(function(i) {return {namespace : namespace, deployment: i.metadata.name};}));
-            }
-        });
+        if (!label || label.length < 1) {
+            resolve([]);
+        }
+        else {
+            const path = K8S_PATH + namespace + '/deployments?labelSelector=' + label + '&limit=' + MAX_DEPS
+            queryKubernetes(path, function(error, res, body){
+                if (error) {
+                    reject(error);
+                }
+                else if (res.statusCode !== 200) {
+                    reject(body);
+                }
+                else {
+                    resolve(body.items.map(function(i) {return {namespace : namespace, deployment: i.metadata.name};}));
+                }
+            });
+        }
     });
 };
diff --git a/healthcheck-container/healthcheck.js b/healthcheck-container/healthcheck.js
index ed5aad3..574859f 100644
--- a/healthcheck-container/healthcheck.js
+++ b/healthcheck-container/healthcheck.js
@@ -19,31 +19,31 @@ const ONAP_NS = process.env.ONAP_NAMESPACE || 'default';
 const DCAE_NS = process.env.DCAE_NAMESPACE || process.env.ONAP_NAMESPACE || 'default';
 const HELM_REL = process.env.HELM_RELEASE || '';

+// If the healthcheck should include k8s deployments that are marked with a specific label,
+// the DEPLOY_LABEL environment variable will be set to the name of the label.
+// Note that the only the name of label is important--the value isn't used by the
+// the healthcheck. If a k8s deployment has the label, it is included in the check.
+// For DCAE (dcaegen2), this capability is used to check for k8s deployments that are
+// created by Cloudify using the k8s plugin.
+const DEPLOY_LABEL = process.env.DEPLOY_LABEL || '';
+
 const HEALTHY = 200;
 const UNHEALTHY = 500;
 const UNKNOWN = 503;

+const EXPECTED_COMPONENTS='/opt/app/expected-components.json'
+
+const fs = require('fs');
+
 // List of deployments expected to be created via Helm
-const helmDeps =
-    [
-        'dcae-cloudify-manager',
-        'dcae-config-binding-service',
-        'dcae-inventory-api',
-        'dcae-servicechange-handler',
-        'dcae-deployment-handler',
-        'dcae-policy-handler',
-        'dcae-dashboard'
-    ];
-
-// List of deployments expected to be created by CM at boot time
-const bootDeps =
-    [
-        'dep-dcae-tca-analytics',
-        'dep-dcae-tcagen2',
-        'dep-dcae-prh',
-        'dep-dcae-hv-ves-collector',
-        'dep-dcae-ves-collector'
-    ];
+let helmDeps = [];
+try {
+    helmDeps = JSON.parse(fs.readFileSync(EXPECTED_COMPONENTS, {encoding: 'utf8'}));
+}
+catch (error) {
+    console.log(`Could not access ${EXPECTED_COMPONENTS}: ${error}`);
+    console.log ('Using empty list of expected components');
+}

 const status = require('./get-status');
 const http = require('http');
@@ -55,7 +55,7 @@ const helmList = helmDeps.map(function(name) {

 const isHealthy = function(summary) {
     // Current healthiness criterion is simple--all deployments are ready
-    return summary.count && summary.ready && summary.count === summary.ready;
+    return summary.hasOwnProperty('count') && summary.hasOwnProperty('ready') && summary.count === summary.ready;
 };

 const checkHealth = function (callback) {
@@ -65,15 +65,11 @@ const checkHealth = function (callback) {
     // If we get responses from k8s and all deployments are ready, health status is HEALTHY (200)
     // This could be a lot more nuanced, but what's here should be sufficient for R2 OOM healthchecking

-    // Query k8s to find all the deployments launched by CM (they all have a 'cfydeployment' label)
-    status.getDCAEDeploymentsPromise(DCAE_NS)
+    // Query k8s to find all the deployments with specified DEPLOY_LABEL
+    status.getLabeledDeploymentsPromise(DCAE_NS, DEPLOY_LABEL)
     .then(function(fullDCAEList) {
-        // Remove any expected boot-time CM deployments from the list to avoid duplicates
-        dynamicDCAEDeps = fullDCAEList.filter(function(i) {return !(bootDeps.includes(i.deployment));})
-        // Create full list of CM deployments to check: boot deployments and anything else created by CM
-        dcaeList = (bootDeps.map(function(name){return {namespace: DCAE_NS, deployment: name}})).concat(dynamicDCAEDeps);
         // Now get status for Helm deployments and CM deployments
-        return status.getStatusListPromise(helmList.concat(dcaeList));
+        return status.getStatusListPromise(helmList.concat(fullDCAEList));
     })
     .then(function(body) {
         callback({status: isHealthy(body) ? HEALTHY : UNHEALTHY, body: body});
diff --git a/healthcheck-container/package.json b/healthcheck-container/package.json
index cc20578..6b91448 100644
--- a/healthcheck-container/package.json
+++ b/healthcheck-container/package.json
@@ -1,7 +1,7 @@
 {
   "name": "k8s-healthcheck",
   "description": "DCAE healthcheck server",
-  "version": "1.2.4",
+  "version": "2.0.0",
   "main": "healthcheck.js",
   "author": "author",
   "license": "(Apache-2.0)"
diff --git a/healthcheck-container/pom.xml b/healthcheck-container/pom.xml
index a01022c..3b41ab9 100644
--- a/healthcheck-container/pom.xml
+++ b/healthcheck-container/pom.xml
@@ -27,7 +27,7 @@ limitations under the License.
   <groupId>org.onap.dcaegen2.deployments</groupId>
   <artifactId>healthcheck-container</artifactId>
   <name>dcaegen2-deployments-healthcheck-container</name>
-  <version>1.3.1</version>
+  <version>2.0.0</version>
   <url>http://maven.apache.org</url>
   <properties>
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>