summaryrefslogtreecommitdiffstats
path: root/docs/oom_user_guide.rst
diff options
context:
space:
mode:
Diffstat (limited to 'docs/oom_user_guide.rst')
-rw-r--r--docs/oom_user_guide.rst581
1 files changed, 581 insertions, 0 deletions
diff --git a/docs/oom_user_guide.rst b/docs/oom_user_guide.rst
new file mode 100644
index 0000000000..b8e5d1bb9d
--- /dev/null
+++ b/docs/oom_user_guide.rst
@@ -0,0 +1,581 @@
+.. This work is licensed under a Creative Commons Attribution 4.0 International License.
+.. http://creativecommons.org/licenses/by/4.0
+.. Copyright 2018 Amdocs, Bell Canada
+
+.. Links
+.. _Curated applications for Kubernetes: https://github.com/kubernetes/charts
+.. _Services: https://kubernetes.io/docs/concepts/services-networking/service/
+.. _ReplicaSet: https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
+.. _StatefulSet: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
+.. _Helm Documentation: https://docs.helm.sh/helm/
+.. _Helm: https://docs.helm.sh/
+.. _Kubernetes: https://Kubernetes.io/
+
+.. _user-guide-label:
+
+OOM User Guide
+##############
+
+The ONAP Operations Manager (OOM) provide the ability to manage the entire
+life-cycle of an ONAP installation, from the initial deployment to final
+decommissioning. This guide provides instructions for users of ONAP to
+use the Kubernetes_/Helm_ system as a complete ONAP management system.
+
+This guide provides many examples of Helm command line operations. For a
+complete description of these commands please refer to the `Helm
+Documentation`_.
+
+.. figure:: oomLogoV2-medium.png
+ :align: right
+
+The following sections describe the life-cycle operations:
+
+- Deploy_ - with built-in component dependency management
+- Configure_ - unified configuration across all ONAP components
+- Monitor_ - real-time health monitoring feeding to a Consul UI and Kubernetes
+- Heal_- failed ONAP containers are recreated automatically
+- Scale_ - cluster ONAP services to enable seamless scaling
+- Upgrade_ - change-out containers or configuration with little or no service impact
+- Delete_ - cleanup individual containers or entire deployments
+
+.. figure:: oomLogoV2-Deploy.png
+ :align: right
+
+Deploy
+======
+
+The OOM team with assistance from the ONAP project teams, have built a
+comprehensive set of Helm charts, yaml files very similar to TOSCA files, that
+describe the composition of each of the ONAP components and the relationship
+within and between components. Using this model Helm is able to deploy all of
+ONAP this simple command::
+
+ > helm install osn/onap
+
+.. note::
+ The osn repo is not currently available so creation of a local repository is
+ required.
+
+Helm is able to use charts served up from a repository and comes setup with a
+default CNCF provided `Curated applications for Kubernetes`_ repository called
+stable which should be removed to avoid confusion::
+
+ > helm repo remove stable
+
+.. To setup the Open Source Networking Nexus repository for helm enter::
+.. > helm repo add osn 'https://nexus3.onap.org:10001/helm/helm-repo-in-nexus/master/'
+
+To prepare your system for an installation of ONAP, you'll need to::
+
+ > git clone http://gerrit.onap.org/r/oom
+ > cd kubernetes
+
+Then build your local Helm repository::
+
+ > make all
+
+To setup a local Helm server to server up the ONAP charts::
+
+ > helm serve &
+
+Note the port number that is listed and use it in the Helm repo add as follows::
+
+ > helm repo add local http://127.0.0.1:8879
+
+To get a list of all of the available Helm chart repositories::
+
+ > helm repo list
+ NAME URL
+ local http://127.0.0.1:8879
+
+The Helm search command reads through all of the repositories configured on the
+system, and looks for matches::
+
+ > helm search -l
+ NAME VERSION DESCRIPTION
+ local/appc 2.0.0 Application Controller
+ local/clamp 2.0.0 ONAP Clamp
+ local/common 2.0.0 Common templates for inclusion in other charts
+ local/onap 2.0.0 Open Network Automation Platform (ONAP)
+ local/robot 2.0.0 A helm Chart for kubernetes-ONAP Robot
+ local/so 2.0.0 ONAP Service Orchestrator
+
+In any case, setup of the Helm repository is a one time activity.
+
+Once the repo is setup, installation of ONAP can be done with a single command::
+
+ > helm install local/onap -name development
+
+This will install ONAP from a local repository in a 'development' Helm release.
+As described below, to override the default configuration values provided by
+OOM, an environment file can be provided on the command line as follows::
+
+ > helm install local/onap -name development -f onap-development.yaml
+
+To get a summary of the status of all of the pods (containers) running in your
+deployment::
+
+ > kubectl get pods --all-namespaces -o=wide
+
+.. note::
+ The Kubernetes namespace concept allows for multiple instances of a component
+ (such as all of ONAP) to co-exist with other components in the same
+ Kubernetes cluster by isolating them entirely. Namespaces share only the
+ hosts that form the cluster thus providing isolation between production and
+ development systems as an example. The OOM deployment of ONAP in Beijing is
+ now done within a single Kubernetes namespace where in Amsterdam a namespace
+ was created for each of the ONAP components.
+
+.. note::
+ The Helm `-name` option refers to a release name and not a Kubernetes namespace.
+
+
+To install a specific version of a single ONAP component (`so` in this example)
+with the given name enter::
+
+ > helm install onap/so --version 2.0.1 -n so
+
+To display details of a specific resource or group of resources type::
+
+ > kubectl describe pod so-1071802958-6twbl
+
+where the pod identifier refers to the auto-generated pod identifier.
+
+.. figure:: oomLogoV2-Configure.png
+ :align: right
+
+Configure
+=========
+
+Each project within ONAP has its own configuration data generally consisting
+of: environment variables, configuration files, and database initial values.
+Many technologies are used across the projects resulting in significant
+operational complexity and an inability to apply global parameters across the
+entire ONAP deployment. OOM solves this problem by introducing a common
+configuration technology, Helm charts, that provide a hierarchical
+configuration configuration with the ability to override values with higher
+level charts or command line options.
+
+The structure of the configuration of ONAP is shown in the following diagram.
+Note that key/value pairs of a parent will always take precedence over those
+of a child. Also note that values set on the command line have the highest
+precedence of all.
+
+.. graphviz::
+
+ digraph config {
+ {
+ node [shape=folder]
+ oValues [label="values.yaml"]
+ demo [label="onap-demo.yaml"]
+ prod [label="onap-production.yaml"]
+ oReq [label="requirements.yaml"]
+ soValues [label="values.yaml"]
+ soReq [label="requirements.yaml"]
+ mdValues [label="values.yaml"]
+ }
+ {
+ oResources [label="resources"]
+ }
+ onap -> oResources
+ onap -> oValues
+ oResources -> environments
+ oResources -> oReq
+ oReq -> so
+ environments -> demo
+ environments -> prod
+ so -> soValues
+ so -> soReq
+ so -> charts
+ charts -> mariadb
+ mariadb -> mdValues
+
+ }
+
+The top level onap/values.yaml file contains the values required to be set
+before deploying ONAP. Here is the contents of this file:
+
+.. include:: onap_values.yaml
+ :code: yaml
+
+One may wish to create a value file that is specific to a given deployment such
+that it can be differentiated from other deployments. For example, a
+onap-development.yaml file may create a minimal environment for development
+while onap-production.yaml might describe a production deployment that operates
+independently of the developer version.
+
+For example, if the production OpenStack instance was different from a
+developer's instance, the onap-production.yaml file may contain a different
+value for the vnfDeployment/openstack/oam_network_cidr key as shown below.
+
+.. code-block:: yaml
+
+ nsPrefix: onap
+ nodePortPrefix: 302
+ apps: consul msb mso message-router sdnc vid robot portal policy appc aai
+ sdc dcaegen2 log cli multicloud clamp vnfsdk aaf kube2msb
+ dataRootDir: /dockerdata-nfs
+
+ # docker repositories
+ repository:
+ onap: nexus3.onap.org:10001
+ oom: oomk8s
+ aai: aaionap
+ filebeat: docker.elastic.co
+
+ image:
+ pullPolicy: Never
+
+ # vnf deployment environment
+ vnfDeployment:
+ openstack:
+ ubuntu_14_image: "Ubuntu_14.04.5_LTS"
+ public_net_id: "e8f51956-00dd-4425-af36-045716781ffc"
+ oam_network_id: "d4769dfb-c9e4-4f72-b3d6-1d18f4ac4ee6"
+ oam_subnet_id: "191f7580-acf6-4c2b-8ec0-ba7d99b3bc4e"
+ oam_network_cidr: "192.168.30.0/24"
+ <...>
+
+
+To deploy ONAP with this environment file, enter::
+
+ > helm install local/onap -n beijing -f environments/onap-production.yaml
+
+.. include:: environments_onap_demo.yaml
+ :code: yaml
+
+When deploying all of ONAP a requirements.yaml file control which and what
+version of the ONAP components are included. Here is an excerpt of this
+file:
+
+.. code-block:: yaml
+
+ # Referencing a named repo called 'local'.
+ # Can add this repo by running commands like:
+ # > helm serve
+ # > helm repo add local http://127.0.0.1:8879
+ dependencies:
+ <...>
+ - name: so
+ version: ~2.0.0
+ repository: '@local'
+ condition: so.enabled
+ <...>
+
+The ~ operator in the `so` version value indicates that the latest "2.X.X"
+version of `so` shall be used thus allowing the chart to allow for minor
+upgrades that don't impact the so API; hence, version 2.0.1 will be installed
+in this case.
+
+The onap/resources/environment/onap-dev.yaml (see the excerpt below) enables
+for fine grained control on what components are included as part of this
+deployment. By changing this `so` line to `enabled: false` the `so` component
+will not be deployed. If this change is part of an upgrade the existing `so`
+component will be shut down. Other `so` parameters and even `so` child values
+can be modified, for example the `so`'s `liveness` probe could be disabled
+(which is not recommended as this change would disable auto-healing of `so`).
+
+.. code-block:: yaml
+
+ #################################################################
+ # Global configuration overrides.
+ #
+ # These overrides will affect all helm charts (ie. applications)
+ # that are listed below and are 'enabled'.
+ #################################################################
+ global:
+ <...>
+
+ #################################################################
+ # Enable/disable and configure helm charts (ie. applications)
+ # to customize the ONAP deployment.
+ #################################################################
+ aaf:
+ enabled: false
+ <...>
+ so: # Service Orchestrator
+ enabled: true
+
+ replicaCount: 1
+
+ liveness:
+ # necessary to disable liveness probe when setting breakpoints
+ # in debugger so K8s doesn't restart unresponsive container
+ enabled: true
+
+ <...>
+
+.. figure:: oomLogoV2-Monitor.png
+ :align: right
+
+Monitor
+=======
+
+All highly available systems include at least one facility to monitor the
+health of components within the system. Such health monitors are often used as
+inputs to distributed coordination systems (such as etcd, zookeeper, or consul)
+and monitoring systems (such as nagios or zabbix). OOM provides two mechanims
+to monitor the real-time health of an ONAP deployment:
+
+- a Consul GUI for a human operator or downstream monitoring systems and
+ Kubernetes liveness probes that enable automatic healing of failed
+ containers, and
+- a set of liveness probes which feed into the Kubernetes manager which
+ are described in the Heal section.
+
+Within ONAP Consul is the monitoring system of choice and deployed by OOM in two parts:
+
+- a three-way, centralized Consul server cluster is deployed as a highly
+ available monitor of all of the ONAP components,and
+- a number of Consul agents.
+
+The Consul server provides a user interface that allows a user to graphically
+view the current health status of all of the ONAP components for which agents
+have been created - a sample from the ONAP Integration labs follows:
+
+.. figure:: consulHealth.png
+ :align: center
+
+To see the real-time health of a deployment go to: http://<kubernetes IP>:30270/ui/
+where a GUI much like the following will be found:
+
+
+.. figure:: oomLogoV2-Heal.png
+ :align: right
+
+Heal
+====
+
+The ONAP deployment is defined by Helm charts as mentioned earlier. These Helm
+charts are also used to implement automatic recoverability of ONAP components
+when individual components fail. Once ONAP is deployed, a "liveness" probe
+starts checking the health of the components after a specified startup time.
+
+Should a liveness probe indicate a failed container it will be terminated and a
+replacement will be started in its place - containers are ephemeral. Should the
+deployment specification indicate that there are one or more dependencies to
+this container or component (for example a dependency on a database) the
+dependency will be satisfied before the replacement container/component is
+started. This mechanism ensures that, after a failure, all of the ONAP
+components restart successfully.
+
+To test healing, the following command can be used to delete a pod::
+
+ > kubectl delete pod [pod name] -n [pod namespace]
+
+One could then use the following command to monitor the pods and observe the
+pod being terminated and the service being automatically healed with the
+creation of a replacement pod::
+
+ > kubectl get pods --all-namespaces -o=wide
+
+.. figure:: oomLogoV2-Scale.png
+ :align: right
+
+Scale
+=====
+
+Many of the ONAP components are horizontally scalable which allows them to
+adapt to expected offered load. During the Beijing release scaling is static,
+that is during deployment or upgrade a cluster size is defined and this cluster
+will be maintained even in the presence of faults. The parameter that controls
+the cluster size of a given component is found in the values.yaml file for that
+component. Here is an excerpt that shows this parameter:
+
+.. code-block:: yaml
+
+ # default number of instances
+ replicaCount: 1
+
+In order to change the size of a cluster, an operator could use a helm upgrade
+(described in detail in the next section) as follows::
+
+ > helm upgrade --set replicaCount=3 onap/so/mariadb
+
+The ONAP components use Kubernetes provided facilities to build clustered,
+highly available systems including: Services_ with load-balancers, ReplicaSet_,
+and StatefulSet_. Some of the open-source projects used by the ONAP components
+directly support clustered configurations, for example ODL and MariaDB Galera.
+
+The Kubernetes Services_ abstraction to provide a consistent access point for
+each of the ONAP components, independent of the pod or container architecture
+of that component. For example, SDN-C uses OpenDaylight clustering with a
+default cluster size of three but uses a Kubernetes service to and change the
+number of pods in this abstract this cluster from the other ONAP components
+such that the cluster could change size and this change is isolated from the
+other ONAP components by the load-balancer implemented in the ODL service
+abstraction.
+
+A ReplicaSet_ is a construct that is used to describe the desired state of the
+cluster. For example 'replicas: 3' indicates to Kubernetes that a cluster of 3
+instances is the desired state. Should one of the members of the cluster fail,
+a new member will be automatically started to replace it.
+
+Some of the ONAP components many need a more deterministic deployment; for
+example to enable intra-cluster communication. For these applications the
+component can be deployed as a Kubernetes StatefulSet_ which will maintain a
+persistent identifier for the pods and thus a stable network id for the pods.
+For example: the pod names might be web-0, web-1, web-{N-1} for N 'web' pods
+with corresponding DNS entries such that intra service communication is simple
+even if the pods are physically distributed across multiple nodes. An example
+of how these capabilities can be used is described in the Running Consul on
+Kubernetes tutorial.
+
+.. figure:: oomLogoV2-Upgrade.png
+ :align: right
+
+Upgrade
+=======
+
+Helm has built-in capabilities to enable the upgrade of pods without causing a
+loss of the service being provided by that pod or pods (if configured as a
+cluster). As described in the OOM Developer's Guide, ONAP components provide
+an abstracted 'service' end point with the pods or containers providing this
+service hidden from other ONAP components by a load balancer. This capability
+is used during upgrades to allow a pod with a new image to be added to the
+service before removing the pod with the old image. This 'make before break'
+capability ensures minimal downtime.
+
+Prior to doing an upgrade, determine of the status of the deployed charts::
+
+ > helm list
+ NAME REVISION UPDATED STATUS CHART NAMESPACE
+ so 1 Mon Feb 5 10:05:22 2018 DEPLOYED so-2.0.1 default
+
+When upgrading a cluster a parameter controls the minimum size of the cluster
+during the upgrade while another parameter controls the maximum number of nodes
+in the cluster. For example, SNDC configured as a 3-way ODL cluster might
+require that during the upgrade no fewer than 2 pods are available at all times
+to provide service while no more than 5 pods are ever deployed across the two
+versions at any one time to avoid depleting the cluster of resources. In this
+scenario, the SDNC cluster would start with 3 old pods then Kubernetes may add
+a new pod (3 old, 1 new), delete one old (2 old, 1 new), add two new pods (2
+old, 3 new) and finally delete the 2 old pods (3 new). During this sequence
+the constraints of the minimum of two pods and maximum of five would be
+maintained while providing service the whole time.
+
+Initiation of an upgrade is triggered by changes in the Helm charts. For
+example, if the image specified for one of the pods in the SDNC deployment
+specification were to change (i.e. point to a new Docker image in the nexus3
+repository - commonly through the change of a deployment variable), the
+sequence of events described in the previous paragraph would be initiated.
+
+For example, to upgrade a container by changing configuration, specifically an
+environment value::
+
+ > helm upgrade beijing onap/so --version 2.0.1 --set enableDebug=true
+
+Issuing this command will result in the appropriate container being stopped by
+Kubernetes and replaced with a new container with the new environment value.
+
+To upgrade a component to a new version with a new configuration file enter::
+
+ > helm upgrade beijing onap/so --version 2.0.2 -f environments/demo.yaml
+
+To fetch release history enter::
+
+ > helm history so
+ REVISION UPDATED STATUS CHART DESCRIPTION
+ 1 Mon Feb 5 10:05:22 2018 SUPERSEDED so-2.0.1 Install complete
+ 2 Mon Feb 5 10:10:55 2018 DEPLOYED so-2.0.2 Upgrade complete
+
+Unfortunately, not all upgrades are successful. In recognition of this the
+lineup of pods within an ONAP deployment is tagged such that an administrator
+may force the ONAP deployment back to the previously tagged configuration or to
+a specific configuration, say to jump back two steps if an incompatibility
+between two ONAP components is discovered after the two individual upgrades
+succeeded.
+
+This rollback functionality gives the administrator confidence that in the
+unfortunate circumstance of a failed upgrade the system can be rapidly brought
+back to a known good state. This process of rolling upgrades while under
+service is illustrated in this short YouTube video showing a Zero Downtime
+Upgrade of a web application while under a 10 million transaction per second
+load.
+
+For example, to roll-back back to previous system revision enter::
+
+ > helm rollback so 1
+
+ > helm history so
+ REVISION UPDATED STATUS CHART DESCRIPTION
+ 1 Mon Feb 5 10:05:22 2018 SUPERSEDED so-2.0.1 Install complete
+ 2 Mon Feb 5 10:10:55 2018 SUPERSEDED so-2.0.2 Upgrade complete
+ 3 Mon Feb 5 10:14:32 2018 DEPLOYED so-2.0.1 Rollback to 1
+
+.. note::
+
+ The description field can be overridden to document actions taken or include
+ tracking numbers.
+
+Many of the ONAP components contain their own databases which are used to
+record configuration or state information. The schemas of these databases may
+change from version to version in such a way that data stored within the
+database needs to be migrated between versions. If such a migration script is
+available it can be invoked during the upgrade (or rollback) by Container
+Lifecycle Hooks. Two such hooks are available, PostStart and PreStop, which
+containers can access by registering a handler against one or both. Note that
+it is the responsibility of the ONAP component owners to implement the hook
+handlers - which could be a shell script or a call to a specific container HTTP
+endpoint - following the guidelines listed on the Kubernetes site. Lifecycle
+hooks are not restricted to database migration or even upgrades but can be used
+anywhere specific operations need to be taken during lifecycle operations.
+
+OOM uses Helm K8S package manager to deploy ONAP components. Each component is
+arranged in a packaging format called a chart - a collection of files that
+describe a set of k8s resources. Helm allows for rolling upgrades of the ONAP
+component deployed. To upgrade a component Helm release you will need an
+updated Helm chart. The chart might have modified, deleted or added values,
+deployment yamls, and more. To get the release name use::
+
+ > helm ls
+
+To easily upgrade the release use::
+
+ > helm upgrade [RELEASE] [CHART]
+
+To roll back to a previous release version use::
+
+ > helm rollback [flags] [RELEASE] [REVISION]
+
+For example, to upgrade the onap-so helm release to the latest SO container
+release v1.1.2:
+
+- Edit so values.yaml which is part of the chart
+- Change "so: nexus3.onap.org:10001/openecomp/so:v1.1.1" to
+ "so: nexus3.onap.org:10001/openecomp/so:v1.1.2"
+- From the chart location run::
+
+ > helm upgrade onap-so
+
+The previous so pod will be terminated and a new so pod with an updated so
+container will be created.
+
+.. figure:: oomLogoV2-Delete.png
+ :align: right
+
+Delete
+======
+
+Existing deployments can be partially or fully removed once they are no longer
+needed. To minimize errors it is recommended that before deleting components
+from a running deployment the operator perform a 'dry-run' to display exactly
+what will happen with a given command prior to actually deleting anything. For
+example::
+
+ > helm delete --dry-run beijing
+
+will display the outcome of deleting the 'beijing' release from the deployment.
+To completely delete a release and remove it from the internal store enter::
+
+ > helm delete --purge beijing
+
+One can also remove individual components from a deployment by changing the
+ONAP configuration values. For example, to remove `so` from a running
+deployment enter::
+
+ > helm upgrade beijing osn/onap --set so.enabled=false
+
+will remove `so` as the configuration indicates it's no longer part of the
+deployment. This might be useful if a one wanted to replace just `so` by
+installing a custom version.