From 97ff534fdf663ba97a4ba02fec6ef4d2c9368081 Mon Sep 17 00:00:00 2001 From: Saryu Shah Date: Tue, 15 May 2018 18:07:54 +0000 Subject: SRM, Locking, Pooling documentation SRM, Locking, Pooling documentation ------------------------------------------------------------- Change-Id: Id66463fc89f2c2c466ad92936816be878cfd77b7 Issue-ID: POLICY-536 Signed-off-by: Saryu Shah --- docs/platform/feature_locking.rst | 47 +++ docs/platform/feature_pooling.rst | 100 ++++++ docs/platform/index.rst | 3 + docs/platform/poolingBuckets.png | Bin 0 -> 24978 bytes docs/platform/poolingDesign.png | Bin 0 -> 61828 bytes docs/platform/poolingPdps.png | Bin 0 -> 110422 bytes docs/platform/srmEditor.png | Bin 0 -> 107740 bytes docs/platform/srmNexus.png | Bin 0 -> 129754 bytes docs/platform/srmPdpxPdpMgmt.png | Bin 0 -> 76009 bytes docs/platform/srmPdpxResiliencyPdpMgmt1.png | Bin 0 -> 58164 bytes docs/platform/srmPdpxResiliencyPdpMgmt2.png | Bin 0 -> 57400 bytes docs/platform/swarch_pdp.rst | 2 + docs/platform/swarch_srm.rst | 495 ++++++++++++++++++++++++++++ docs/release-notes.rst | 4 +- 14 files changed, 648 insertions(+), 3 deletions(-) create mode 100644 docs/platform/feature_locking.rst create mode 100644 docs/platform/feature_pooling.rst create mode 100644 docs/platform/poolingBuckets.png create mode 100644 docs/platform/poolingDesign.png create mode 100644 docs/platform/poolingPdps.png create mode 100644 docs/platform/srmEditor.png create mode 100644 docs/platform/srmNexus.png create mode 100644 docs/platform/srmPdpxPdpMgmt.png create mode 100644 docs/platform/srmPdpxResiliencyPdpMgmt1.png create mode 100644 docs/platform/srmPdpxResiliencyPdpMgmt2.png create mode 100644 docs/platform/swarch_srm.rst (limited to 'docs') diff --git a/docs/platform/feature_locking.rst b/docs/platform/feature_locking.rst new file mode 100644 index 000000000..1236c93f2 --- /dev/null +++ b/docs/platform/feature_locking.rst @@ -0,0 +1,47 @@ + +.. This work is licensed under a Creative Commons Attribution 4.0 International License. +.. http://creativecommons.org/licenses/by/4.0 + +**************************** +Feature: Distributed Locking +**************************** + +Summary +^^^^^^^ + +The Distributed Locking Feature provides locking of resources across a pool of PDP-D hosts. The list of locks is maintained in a database, where each record includes a resource identifier, an owner identifier, and an expiration time. Typically, a drools application will unlock the resource when it's operation completes. However, if it fails to do so, then the resource will be automatically released when the lock expires, thus preventing a resource from becoming permanently locked. + +Usage +^^^^^ + + .. code-block:: bash + :caption: Enable Feature Distributed Locking + + policy stop + + features enable distributed-locking + + The configuration is located at: + + * $POLICY_HOME/config/feature-distributed-locking.properties + + + .. code-block:: bash + :caption: Start the PDP-D using pooling + + policy start + + + .. code-block:: bash + :caption: Disable the Distributed Locking feature + + policy stop + features disable distributed-locking + policy start + + +End of Document + +.. SSNote: Wiki page ref. https://wiki.onap.org/display/DW/Feature+Distributed+Locking + + diff --git a/docs/platform/feature_pooling.rst b/docs/platform/feature_pooling.rst new file mode 100644 index 000000000..5ed3de18a --- /dev/null +++ b/docs/platform/feature_pooling.rst @@ -0,0 +1,100 @@ + +.. This work is licensed under a Creative Commons Attribution 4.0 International License. +.. http://creativecommons.org/licenses/by/4.0 + +**************** +Feature: Pooling +**************** + +Summary +^^^^^^^ + +The Pooling feature provides the ability to load-balance work across a “pool” of active-active Drools-PDP hosts. This particular implementation uses a DMaaP topic for communication between the hosts within the pool. + +The pool is adjusted automatically, with no manual intervention when: + * a new host is brought online + * a host goes offline, whether gracefully or due to a failure in the host or in the network + +Assumptions and Limitations +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + * Session persistence is not required + * Data may be lost when processing is moved from one host to another + * The entire pool may shut down if the inter-host DMaaP topic becomes inaccessible + + .. image:: poolingDesign.png + + +Key Points +^^^^^^^^^^ + * Requests are received on a common DMaaP topic + * DMaaP distributes the requests randomly to the hosts + * The request topic should have at least as many partitions as there are hosts + * Uses a single, internal DMaaP topic for all inter-host communication + * Allocates buckets to each host + * Requests are assigned to buckets based on their respective “request IDs” + * No session persistence + * No objects copied between hosts + * Requires feature(s): distributed-locking + * Precludes feature(s): session-persistence, active-standby, state-management + +Example Scenario +^^^^^^^^^^^^^^^^ + 1. Incoming DMaaP message is received on a topic — all hosts are listening, but only one random host receives the message + 2. Decode message to determine “request ID” key (message-specific operation) + 3. Hash request ID to determine the bucket number + 4. Look up host associated with hash bucket (most likely remote) + 5. Publish “forward” message to internal DMaaP topic, including remote host, bucket number, DMaaP topic information, and message body + 6. Remote host verifies ownership of bucket, and routes the DMaaP message to its own rule engine for processing + + The figure below shows several different hosts in a pool. Each host as a copy of the bucket assignments, which specifies which buckets are assigned to which hosts. Incoming requests are mapped to a bucket, and a bucket is mapped to a host, to which the request is routed. The host table includes an entry for each active host in the pool, to which one or more buckets are mapped. + + .. image:: poolingPdps.png + +Bucket Reassignment +^^^^^^^^^^^^^^^^^^^ + * When a host goes up or down, buckets are rebalanced + * Attempts to maintain an even distribution + * Leaves buckets with their current owner, where possible + * Takes a few buckets from each host to assign to new hosts + + For example, in the diagram below, the left side shows how 32 buckets might be assigned among four different hosts. When the first host fails, the buckets from host 1 would be reassigned among the remaining hosts, similar to what is shown on the right side of the diagram. Any requests that were being processed by host 1 will be lost and must be restarted. However, the buckets that had already been assigned to the remaining hosts are unchanged, thus requests associated with those buckets are not impacted by the loss of host 1. + + .. image:: poolingBuckets.png + +Usage +^^^^^ + +For pooling to be enabled, the distributed-locking feature must be also be enabled. + + .. code-block:: bash + :caption: Enable Feature Pooling + + policy stop + + features enable distributed-locking + features enable pooling-dmaap + + The configuration is located at: + + * $POLICY_HOME/config/feature-pooling-dmaap.properties + + + .. code-block:: bash + :caption: Start the PDP-D using pooling + + policy start + + + .. code-block:: bash + :caption: Disable the pooling feature + + policy stop + features disable pooling-dmaap + policy start + + +End of Document + +.. SSNote: Wiki page ref. https://wiki.onap.org/display/DW/Feature+Pooling + + diff --git a/docs/platform/index.rst b/docs/platform/index.rst index 9ce1c27fd..4baf57e90 100644 --- a/docs/platform/index.rst +++ b/docs/platform/index.rst @@ -28,6 +28,7 @@ Policy Software Architecture .. toctree:: :maxdepth: 1 + swarch_srm.rst swarch_pdp.rst feature_eelf.rst feature_testtransaction.rst @@ -35,6 +36,8 @@ Policy Software Architecture feature_sesspersist.rst feature_statemgmt.rst feature_activestdbymgmt.rst + feature_locking.rst + feature_pooling.rst swarch_pdpx.rst swarch_pap.rst swarch_brmsgw.rst diff --git a/docs/platform/poolingBuckets.png b/docs/platform/poolingBuckets.png new file mode 100644 index 000000000..8b43a7de9 Binary files /dev/null and b/docs/platform/poolingBuckets.png differ diff --git a/docs/platform/poolingDesign.png b/docs/platform/poolingDesign.png new file mode 100644 index 000000000..8040e8099 Binary files /dev/null and b/docs/platform/poolingDesign.png differ diff --git a/docs/platform/poolingPdps.png b/docs/platform/poolingPdps.png new file mode 100644 index 000000000..e05bad3d0 Binary files /dev/null and b/docs/platform/poolingPdps.png differ diff --git a/docs/platform/srmEditor.png b/docs/platform/srmEditor.png new file mode 100644 index 000000000..0910f04a6 Binary files /dev/null and b/docs/platform/srmEditor.png differ diff --git a/docs/platform/srmNexus.png b/docs/platform/srmNexus.png new file mode 100644 index 000000000..f0ffea458 Binary files /dev/null and b/docs/platform/srmNexus.png differ diff --git a/docs/platform/srmPdpxPdpMgmt.png b/docs/platform/srmPdpxPdpMgmt.png new file mode 100644 index 000000000..5a998f080 Binary files /dev/null and b/docs/platform/srmPdpxPdpMgmt.png differ diff --git a/docs/platform/srmPdpxResiliencyPdpMgmt1.png b/docs/platform/srmPdpxResiliencyPdpMgmt1.png new file mode 100644 index 000000000..84d468fe3 Binary files /dev/null and b/docs/platform/srmPdpxResiliencyPdpMgmt1.png differ diff --git a/docs/platform/srmPdpxResiliencyPdpMgmt2.png b/docs/platform/srmPdpxResiliencyPdpMgmt2.png new file mode 100644 index 000000000..6b5ae8510 Binary files /dev/null and b/docs/platform/srmPdpxResiliencyPdpMgmt2.png differ diff --git a/docs/platform/swarch_pdp.rst b/docs/platform/swarch_pdp.rst index 9b087b53b..7783156ac 100644 --- a/docs/platform/swarch_pdp.rst +++ b/docs/platform/swarch_pdp.rst @@ -57,6 +57,8 @@ The current extensions supported are: - `Feature Healthcheck `_ (enabled by default) - `Feature Session Persistence `_ (disabled by default) - `Feature Active/Standby Management `_ (disabled by default) +- `Feature Distributed Locking `_ (disabled by default) +- `Feature Pooling `_ (disabled by default) .. seealso:: Click on the individual feature links for more information diff --git a/docs/platform/swarch_srm.rst b/docs/platform/swarch_srm.rst new file mode 100644 index 000000000..2de6baebc --- /dev/null +++ b/docs/platform/swarch_srm.rst @@ -0,0 +1,495 @@ + +.. This work is licensed under a Creative Commons Attribution 4.0 International License. +.. http://creativecommons.org/licenses/by/4.0 + +***************************************** +Scalability, Resiliency and Manageability +***************************************** + +.. contents:: + :depth: 3 + +The new Beijing release scalability, resiliency, and manageablity are described here. These capabilities apply to the OOM/Kubernetes installation. + +Installation +^^^^^^^^^^^^ +Follow the OOM installation instructions at http://onap.readthedocs.io/en/latest/submodules/oom.git/docs/index.html + +Overview of the running system +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Upon initialization, you should see pools of 4 PDP-Ds and 2 PDP-Xs: + +.. code-block:: bash + :caption: verify pods + + kubectl get pods --all-namespaces -o=wid + + onap dev-brmsgw-5dbc4c8dc4-llk5s 1/1 Running 0 18m 10.42.120.43 k8sx + onap dev-drools-0 1/1 Running 0 18m 10.42.60.27 k8sx + onap dev-drools-1 1/1 Running 0 16m 10.42.105.190 k8sx + onap dev-drools-2 1/1 Running 0 15m 10.42.139.82 k8sx + onap dev-drools-3 1/1 Running 0 15m 10.42.128.4 k8sx + onap dev-nexus-7d96568f5f-qp5td 1/1 Running 0 18m 10.42.172.8 k8sx + onap dev-pap-8587696769-vwj6k 2/2 Running 0 18m 10.42.19.137 k8sx + onap dev-pdp-0 2/2 Running 0 18m 10.42.144.218 k8sx + onap dev-pdp-1 2/2 Running 0 15m 10.42.233.111 k8sx + onap dev-policydb-587d55bdff-4f5dz 1/1 Running 0 18m 10.42.12.242 k8sx + + +and a service for every component: + +.. code-block:: bash + :caption: verify services + + kubectl get services --all-namespaces + + onap brmsgw NodePort 10.43.209.173 9989:30216/TCP 24m + onap drools NodePort 10.43.27.92 6969:30217/TCP,9696:30221/TCP 24m + onap nexus NodePort 10.43.19.171 8081:30236/TCP 24m + onap pap NodePort 10.43.9.166 8443:30219/TCP,9091:30218/TCP 24m + onap pdp ClusterIP None 8081/TCP 24m + onap policydb ClusterIP None 3306/TCP 24m + +Config and Decision policy requests will be distributed across PDP-Xs through the *pdp* service. PDP-X clients (such as DCAE) should configure their URLs to go through the *pdp* service. Their requests will be distributed across the available PDP-X replicas. The PDP-Xs can be also accessed individually (dev-pdp-0 and dev-pdp-1 above), but is preferable to that external clients use the service. + +PDP-Ds are also accessible on a group fashion by using the service IP. Nevertheless, as DMaaP is the main means of communication with other ONAP components, the service interface is not used heavily. + + +Healthchecks +^^^^^^^^^^^^ + +Verify that the policy healtcheck passes by the robot framework: + +.. code-block:: bash + :caption: ~/oom/kubernetes/robot/ete-k8s.sh onap health 2> /dev/null | grep PASS + + Basic Policy Health Check | PASS | + +A policy healthcheck (with more detailed output) can be done directly to the drools service in the policy VM. + +.. code-block:: none + :caption: Healtcheck on the PDP-D service + + curl --silent --user ': -X GET http://localhost:30217/healthcheck | python -m json.tool + + { + "details": [ + { + "code": 200, + "healthy": true, + "message": "alive", + "name": "PDP-D", + "url": "self" + }, + { + "code": 200, + "healthy": true, + "message": "", + "name": "PAP", + "url": "http://pap:9091/pap/test" + }, + { + "code": 200, + "healthy": true, + "message": "", + "name": "PDP", + "url": "http://pdp:8081/pdp/test" + } + ], + "healthy": true + } + + +PDP-X active/active pool +^^^^^^^^^^^^^^^^^^^^^^^^ + +The policy engine UI (console container in the pap pod) can be used to check that the the 2 individual PDP-Xs are synchronized. +The console URL is accessible at ``http://:30219/onap/login.htm``. Select the PDP tab. + + .. image:: srmPdpxPdpMgmt.png + +After initialization, there will be no policies loaded into the policy subsystem. You can verify it, by accessing the Editor tab in the UI. + + +PDP-D Active/Active Pool +^^^^^^^^^^^^^^^^^^^^^^^^ + +The PDP-Ds replicas will come up with the amsterdam controller installed in brainless mode (no maven coordinates) since the controller has not been associated with a set of drools rules to run (control loop rules). + +The following command can be issued on each of the PDP-D replicas IPs: + +.. code-block:: bash + :caption: Querying the rules association for a PDP-D replica + + curl --silent --user ':' -X GET http://:9696/policy/pdp/engine/controllers/amsterdam/drools | python -m json.tool + + { + "alive": false, + "artifactId": "NO-ARTIFACT-ID", + "brained": false, + "canonicalSessionNames": [], + "container": null, + "groupId": "NO-GROUP-ID", + "locked": false, + "recentSinkEvents": [], + "recentSourceEvents": [], + "sessionNames": [], + "version": "NO-VERSION" + } + +Installing Policies +^^^^^^^^^^^^^^^^^^^ + +The OOM default installation will come with no policies pre-configured. There is a sample script used by integration teams to load policies to support all 4 use cases at: /tmp/policy-install/config/push-policies.sh in the pap container within the pap pod. This script can be modified for your own particular installation, for example if only interested in vCPE use cases, remove those vCPE related API REST calls. For the vFW use case, you may want to edit the encoded operational policy to point to the proper resourceID in your installation. + +The above mentioned push-policies.sh script can be executed as follows: + +.. code-block:: bash + :caption: Installing the default policies + + kubectl exec -it dev-pap-8587696769-vwj6k -c pap -n onap -- bash -c "export PRELOAD_POLICIES=true; /tmp/policy-install/config/push-policies.sh" + + + .. + Create BRMSParam Operational Policies + .. + Create BRMSParamvFirewall Policy + .. + Transaction ID: ef08cc65-9950-4478-a4ab-0f3bc2519f60 --Policy with the name com.Config_BRMS_Param_BRMSParamvFirewall.1.xml was successfully created.Create BRMSParamvDNS Policy + .. + Transaction ID: 52e33efe-ba66-47de-b404-8d441107d8a9 --Policy with the name com.Config_BRMS_Param_BRMSParamvDNS.1.xml was successfully created.Create BRMSParamVOLTE Policy + .. + Transaction ID: f13072b7-6258-4c16-99da-f908d29363ec --Policy with the name com.Config_BRMS_Param_BRMSParamVOLTE.1.xml was successfully created.Create BRMSParamvCPE Policy + .. + Transaction ID: 616f970a-b45e-40f7-88cd-d63000d22cca --Policy with the name com.Config_BRMS_Param_BRMSParamvCPE.1.xml was successfully created.Create MicroService Config Policies + Create MicroServicevFirewall Policy + .. + Transaction ID: 4c143a15-20af-408a-9285-bc7940261829 --Policy with the name com.Config_MS_MicroServicevFirewall.1.xml was successfully created.Create MicroServicevDNS Policy + .. + Transaction ID: 1e54ae73-509b-490e-bf62-1fea7989fd5f --Policy with the name com.Config_MS_MicroServicevDNS.1.xml was successfully created.Create MicroServicevCPE Policy + .. + Transaction ID: 32239868-bab2-4e12-9fd9-81a0ed4a6b1c --Policy with the name com.Config_MS_MicroServicevCPE.1.xml was successfully created.Creating Decision Guard policy + .. + Transaction ID: b43cb9d5-42c7-4654-aacf-d4898c4d13bb --Policy with the name com.Decision_AllPermitGuard.1.xml was successfully created.Push Decision policy + .. + Transaction ID: 3c1e4ae6-6991-415b-9f2d-c665a8c5a026 --Policy 'com.Decision_AllPermitGuard.1.xml' was successfully pushed to the PDP group 'default'.Pushing BRMSParam Operational policies + .. + Transaction ID: 58d26d03-b5b8-4fd3-b2df-1411a1c36420 --Policy 'com.Config_BRMS_Param_BRMSParamvFirewall.1.xml' was successfully pushed to the PDP group 'default'.pushPolicy : PUT : com.BRMSParamvDNS + .. + Transaction ID: 0854e54a-504b-4f06-bc2f-30f491cb9f5a --Policy 'com.Config_BRMS_Param_BRMSParamvDNS.1.xml' was successfully pushed to the PDP group 'default'.pushPolicy : PUT : com.BRMSParamVOLTE + .. + Transaction ID: d33c7dde-5c99-4dab-b4ff-9988473cd88d --Policy 'com.Config_BRMS_Param_BRMSParamVOLTE.1.xml' was successfully pushed to the PDP group 'default'.pushPolicy : PUT : com.BRMSParamvCPE + .. + Transaction ID: e8c8a73e-127c-4318-9e59-3cae9dcbe011 --Policy 'com.Config_BRMS_Param_BRMSParamvCPE.1.xml' was successfully pushed to the PDP group 'default'.Pushing MicroService Config policies + .. + Transaction ID: ec0429d7-e35f-4978-8a6c-40d2b5b3be61 --Policy 'com.Config_MS_MicroServicevFirewall.1.xml' was successfully pushed to the PDP group 'default'.pushPolicy : PUT : com.MicroServicevDNS + .. + Transaction ID: f7072f05-7b74-45b5-9bd3-99b7f8023e3e --Policy 'com.Config_MS_MicroServicevDNS.1.xml' was successfully pushed to the PDP group 'default'.pushPolicy : PUT : com.MicroServicevCPE + .. + Transaction ID: 6d47db63-7956-4f5f-ab34-aeb5a124a90d --Policy 'com.Config_MS_MicroServicevCPE.1.xml' was successfully pushed to the PDP group 'default'. + + +The policies pushed can be viewed through the Policy UI: + + .. image:: srmEditor.png + +As a consequence of pushing the policies, the brmsgw component will compose drools rules artifacts and publish them to the nexus respository at ``http://:30236/nexus/`` + + .. image:: srmNexus.png + +At the same time each replica of the PDP-Ds will receive notifications for each new version of the policies to run for the amsterdam controller. You can run the following command to see how the amsterdam controller is associated with the latest rules version. The following command can be used for verification for each replica: + + +.. code-block:: none + :caption: Querying the rules association of a PDP-D replica + + curl --silent --user ' -X GET http://:9696/policy/pdp/engine/controllers/amsterdam/drools | python -m json.tool + { + "alive": true, + "artifactId": "policy-amsterdam-rules", + "brained": true, + "groupId": "org.onap.policy-engine.drools.amsterdam", + "locked": false, + "modelClassLoaderHash": 1223551265, + "recentSinkEvents": [], + "recentSourceEvents": [], + "sessionCoordinates": [ + "org.onap.policy-engine.drools.amsterdam:policy-amsterdam-rules:0.4.0:closedloop-amsterdam" + ], + "sessions": [ + "closedloop-amsterdam" + ], + "version": "0.4.0" + } + +Likewise, for verification purposes, each PDP-X replica can be queried directly to retrieve policy information. The following commands can be used to query a policy through the pdp service: + + +.. code-block:: bash + :caption: Querying the "pdp" service for the vFirewal policy + + ubuntu@k8sx:~$ kubectl exec -it dev-pap-8587696769-vwj6k -c pap -n onap bash + policy@dev-pap-8587696769-vwj6k:/tmp/policy-install$ curl --silent -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' --header 'ClientAuth: cHl0aG9uOnRlc3Q=' --header 'Authorization: Basic dGVzdHBkcDphbHBoYTEyMw==' --header 'Environment: TEST' -d '{"policyName": ".*vFirewall.*"}' http://pdp:8081/pdp/api/getConfig | python -m json.tool + [ + { + "config": "{\"service\":\"tca_policy\",\"location\":\"SampleServiceLocation\",\"uuid\":\"test\",\"policyName\":\"MicroServicevFirewall\",\"description\":\"MicroService vFirewall Policy\",\"configName\":\"SampleConfigName\",\"templateVersion\":\"OpenSource.version.1\",\"version\":\"1.1.0\",\"priority\":\"1\",\"policyScope\":\"resource=SampleResource,service=SampleService,type=SampleType,closedLoopControlName=ControlLoop-vFirewall-d0a1dfc6-94f5-4fd4-a5b5-4630b438850a\",\"riskType\":\"SampleRiskType\",\"riskLevel\":\"1\",\"guard\":\"False\",\"content\":{\"tca_policy\":{\"domain\":\"measurementsForVfScaling\",\"metricsPerEventName\":[{\"eventName\":\"vFirewallBroadcastPackets\",\"controlLoopSchemaType\":\"VNF\",\"policyScope\":\"DCAE\",\"policyName\":\"DCAE.Config_tca-hi-lo\",\"policyVersion\":\"v0.0.1\",\"thresholds\":[{\"closedLoopControlName\":\"ControlLoop-vFirewall-d0a1dfc6-94f5-4fd4-a5b5-4630b438850a\",\"version\":\"1.0.2\",\"fieldPath\":\"$.event.measurementsForVfScalingFields.vNicUsageArray[*].receivedTotalPacketsDelta\",\"thresholdValue\":300,\"direction\":\"LESS_OR_EQUAL\",\"severity\":\"MAJOR\",\"closedLoopEventStatus\":\"ONSET\"},{\"closedLoopControlName\":\"ControlLoop-vFirewall-d0a1dfc6-94f5-4fd4-a5b5-4630b438850a\",\"version\":\"1.0.2\",\"fieldPath\":\"$.event.measurementsForVfScalingFields.vNicUsageArray[*].receivedTotalPacketsDelta\",\"thresholdValue\":700,\"direction\":\"GREATER_OR_EQUAL\",\"severity\":\"CRITICAL\",\"closedLoopEventStatus\":\"ONSET\"}]}]}}}", + "matchingConditions": { + "ConfigName": "SampleConfigName", + "Location": "SampleServiceLocation", + "ONAPName": "DCAE", + "service": "tca_policy", + "uuid": "test" + }, + "policyConfigMessage": "Config Retrieved! ", + "policyConfigStatus": "CONFIG_RETRIEVED", + "policyName": "com.Config_MS_MicroServicevFirewall.1.xml", + "policyType": "MicroService", + "policyVersion": "1", + "property": null, + "responseAttributes": {}, + "type": "JSON" + }, + { + "config": ..... + "matchingConditions": { + "ConfigName": "BRMS_PARAM_RULE", + "ONAPName": "DROOLS" + }, + "policyConfigMessage": "Config Retrieved! ", + "policyConfigStatus": "CONFIG_RETRIEVED", + "policyName": "com.Config_BRMS_Param_BRMSParamvFirewall.1.xml", + "policyType": "BRMS_PARAM", + "policyVersion": "1", + "property": null, + "responseAttributes": { + "controller": "amsterdam" + }, + "type": "OTHER" + } + ] + + +while the following commands could be used to query an specific PDP-X replica: + + +.. code-block:: bash + :caption: Querying PDP-X 0 for the vCPE policy + + curl --silent -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' --header 'ClientAuth: cHl0aG9uOnRlc3Q=' --header 'Authorization: Basic dGVzdHBkcDphbHBoYTEyMw==' --header 'Environment: TEST' -d '{"policyName": ".*vCPE.*"}' http://10.42.144.218:8081/pdp/api/getConfig | python -m json.tool + [ + { + "config": ..., + "matchingConditions": { + "ConfigName": "BRMS_PARAM_RULE", + "ONAPName": "DROOLS" + }, + "policyConfigMessage": "Config Retrieved! ", + "policyConfigStatus": "CONFIG_RETRIEVED", + "policyName": "com.Config_BRMS_Param_BRMSParamvCPE.1.xml", + "policyType": "BRMS_PARAM", + "policyVersion": "1", + "property": null, + "responseAttributes": { + "controller": "amsterdam" + }, + "type": "OTHER" + }, + { + "config": "{\"service\":\"tca_policy\",\"location\":\"SampleServiceLocation\",\"uuid\":\"test\",\"policyName\":\"MicroServicevCPE\",\"description\":\"MicroService vCPE Policy\",\"configName\":\"SampleConfigName\",\"templateVersion\":\"OpenSource.version.1\",\"version\":\"1.1.0\",\"priority\":\"1\",\"policyScope\":\"resource=SampleResource,service=SampleService,type=SampleType,closedLoopControlName=ControlLoop-vCPE-48f0c2c3-a172-4192-9ae3-052274181b6e\",\"riskType\":\"SampleRiskType\",\"riskLevel\":\"1\",\"guard\":\"False\",\"content\":{\"tca_policy\":{\"domain\":\"measurementsForVfScaling\",\"metricsPerEventName\":[{\"eventName\":\"Measurement_vGMUX\",\"controlLoopSchemaType\":\"VNF\",\"policyScope\":\"DCAE\",\"policyName\":\"DCAE.Config_tca-hi-lo\",\"policyVersion\":\"v0.0.1\",\"thresholds\":[{\"closedLoopControlName\":\"ControlLoop-vCPE-48f0c2c3-a172-4192-9ae3-052274181b6e\",\"version\":\"1.0.2\",\"fieldPath\":\"$.event.measurementsForVfScalingFields.additionalMeasurements[*].arrayOfFields[0].value\",\"thresholdValue\":0,\"direction\":\"EQUAL\",\"severity\":\"MAJOR\",\"closedLoopEventStatus\":\"ABATED\"},{\"closedLoopControlName\":\"ControlLoop-vCPE-48f0c2c3-a172-4192-9ae3-052274181b6e\",\"version\":\"1.0.2\",\"fieldPath\":\"$.event.measurementsForVfScalingFields.additionalMeasurements[*].arrayOfFields[0].value\",\"thresholdValue\":0,\"direction\":\"GREATER\",\"severity\":\"CRITICAL\",\"closedLoopEventStatus\":\"ONSET\"}]}]}}}", + "matchingConditions": { + "ConfigName": "SampleConfigName", + "Location": "SampleServiceLocation", + "ONAPName": "DCAE", + "service": "tca_policy", + "uuid": "test" + }, + "policyConfigMessage": "Config Retrieved! ", + "policyConfigStatus": "CONFIG_RETRIEVED", + "policyName": "com.Config_MS_MicroServicevCPE.1.xml", + "policyType": "MicroService", + "policyVersion": "1", + "property": null, + "responseAttributes": {}, + "type": "JSON" + } + ] + +PDP-X Resiliency +^^^^^^^^^^^^^^^^ + +A PDP-X container failure can be simulated by performing a"policy.sh stop" operation within the PDP-X container, this in fact will shutdown the PDP-X service. The kubernetes liveness operation will detect that the ports are down, inferring there's a problem with the service, and in turn, will restart the container. In the following example will cause PDP-X 1 to fail. + +.. code-block:: bash + :caption: Causing PDP-X 1 service to fail + + ubuntu@k8sx:~$ kubectl exec -it dev-pdp-1 --container pdp -n onap -- bash -c "source /opt/app/policy/etc/profile.d/env.sh; policy.sh stop;" + pdplp: STOPPING .. + pdp: STOPPING .. + +Upon detection of the service being down through the liveness check, the container will be restarted. Note the restart count when querying the status of the pods: + +.. code-block:: bash + :caption: Checking PDP-X 1 restart count + + ubuntu@k8sx:~$ kubectl get pods --all-namespaces -o=wide + + NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE + + onap dev-brmsgw-5dbc4c8dc4-llk5s 1/1 Running 0 3d 10.42.120.43 k8sx + onap dev-drools-0 1/1 Running 0 3d 10.42.60.27 k8sx + onap dev-drools-1 1/1 Running 0 3d 10.42.105.190 k8sx + onap dev-drools-2 1/1 Running 0 3d 10.42.139.82 k8sx + onap dev-drools-3 1/1 Running 0 3d 10.42.128.4 k8sx + onap dev-nexus-7d96568f5f-qp5td 1/1 Running 0 3d 10.42.172.8 k8sx + onap dev-pap-8587696769-vwj6k 2/2 Running 0 3d 10.42.19.137 k8sx + onap dev-pdp-0 2/2 Running 0 3d 10.42.144.218 k8sx + onap dev-pdp-1 2/2 Running 1 3d 10.42.233.111 k8sx <--- ** + onap dev-policydb-587d55bdff-4f5dz 1/1 Running 0 3d 10.42.12.242 k8sx + + +During the restart process, the PAP component, will detect that PDP-X 1 is down and therefore its state being reflected in the PDP-X screen: + + .. image:: srmPdpxResiliencyPdpMgmt1.png + +This screen will be updated to reflect PDP-X 1 is back alive, after PDP-X 1 synchronizes itself with the PAP. + + .. image:: srmPdpxResiliencyPdpMgmt2.png + +At that point, PDP-X is usable either directly or through the service to query for policies. + + +.. code-block:: bash + :caption: Query PDP-X 1 for vCPE policy + + ubuntu@k8sx:~$ curl --silent -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' --header 'ClientAuth: cHl0aG9uOnRlc3Q=' --header 'Authorization: Basic dGVzdHBkcDphbHBoYTEyMw==' --header 'Environment: TEST' -d '{"policyName": ".*vCPE.*"}' http://10.42.233.111:8081/pdp/api/getConfig | python -m json.tool + [ + { + "config": "..", + "matchingConditions": { + "ConfigName": "BRMS_PARAM_RULE", + "ONAPName": "DROOLS" + }, + "policyConfigMessage": "Config Retrieved! ", + "policyConfigStatus": "CONFIG_RETRIEVED", + "policyName": "com.Config_BRMS_Param_BRMSParamvCPE.1.xml", + "policyType": "BRMS_PARAM", + "policyVersion": "1", + "property": null, + "responseAttributes": { + "controller": "amsterdam" + }, + "type": "OTHER" + }, + { + "config": "{\"service\":\"tca_policy\",\"location\":\"SampleServiceLocation\",\"uuid\":\"test\",\"policyName\":\"MicroServicevCPE\",\"description\":\"MicroService vCPE Policy\",\"configName\":\"SampleConfigName\",\"templateVersion\":\"OpenSource.version.1\",\"version\":\"1.1.0\",\"priority\":\"1\",\"policyScope\":\"resource=SampleResource,service=SampleService,type=SampleType,closedLoopControlName=ControlLoop-vCPE-48f0c2c3-a172-4192-9ae3-052274181b6e\",\"riskType\":\"SampleRiskType\",\"riskLevel\":\"1\",\"guard\":\"False\",\"content\":{\"tca_policy\":{\"domain\":\"measurementsForVfScaling\",\"metricsPerEventName\":[{\"eventName\":\"Measurement_vGMUX\",\"controlLoopSchemaType\":\"VNF\",\"policyScope\":\"DCAE\",\"policyName\":\"DCAE.Config_tca-hi-lo\",\"policyVersion\":\"v0.0.1\",\"thresholds\":[{\"closedLoopControlName\":\"ControlLoop-vCPE-48f0c2c3-a172-4192-9ae3-052274181b6e\",\"version\":\"1.0.2\",\"fieldPath\":\"$.event.measurementsForVfScalingFields.additionalMeasurements[*].arrayOfFields[0].value\",\"thresholdValue\":0,\"direction\":\"EQUAL\",\"severity\":\"MAJOR\",\"closedLoopEventStatus\":\"ABATED\"},{\"closedLoopControlName\":\"ControlLoop-vCPE-48f0c2c3-a172-4192-9ae3-052274181b6e\",\"version\":\"1.0.2\",\"fieldPath\":\"$.event.measurementsForVfScalingFields.additionalMeasurements[*].arrayOfFields[0].value\",\"thresholdValue\":0,\"direction\":\"GREATER\",\"severity\":\"CRITICAL\",\"closedLoopEventStatus\":\"ONSET\"}]}]}}}", + "matchingConditions": { + "ConfigName": "SampleConfigName", + "Location": "SampleServiceLocation", + "ONAPName": "DCAE", + "service": "tca_policy", + "uuid": "test" + }, + "policyConfigMessage": "Config Retrieved! ", + "policyConfigStatus": "CONFIG_RETRIEVED", + "policyName": "com.Config_MS_MicroServicevCPE.1.xml", + "policyType": "MicroService", + "policyVersion": "1", + "property": null, + "responseAttributes": {}, + "type": "JSON" + } + ] + +PDP-D Resiliency +^^^^^^^^^^^^^^^^ + +A PDP-D container failure can be simulated by performing a"policy stop" operation within the PDP-D container, this in fact will shutdown the PDP-D service. The kubernetes liveness operation will detect that the ports are down, inferring there's a problem with the service, and in turn, will restart the container. In the following example will cause PDP-D 3 to fail. + +.. code-block:: bash + :caption: Causing PDP-D 3 to fail + + ubuntu@k8sx:~/oom/kubernetes$ kubectl exec -it dev-drools-3 --container drools -n onap -- bash -c "source /opt/app/policy/etc/profile.d/env.sh; policy stop" + [drools-pdp-controllers] + L []: Stopping Policy Management... Policy Management (pid=3284) is stopping... Policy Management has stopped. + + +Upon detection of the service being down through the liveness check, the container will be restarted. Note the restart count when querying the status of the pods: + +.. code-block:: bash + :caption: Checking PDP-D 3 restart count + + ubuntu@k8sx:~/oom/kubernetes$ kubectl get pods --all-namespaces -o=wide + + NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE + + onap dev-brmsgw-5549d99466-7989k 1/1 Running 0 1h 10.42.252.245 k8sx + onap dev-drools-0 1/1 Running 0 1h 10.42.30.52 k8sx + onap dev-drools-1 1/1 Running 0 1h 10.42.9.245 k8sx + onap dev-drools-2 1/1 Running 0 1h 10.42.95.0 k8sx + onap dev-drools-3 1/1 Running 1 1h 10.42.224.52 k8sx + onap dev-nexus-6558979c95-xlxcc 1/1 Running 0 1h 10.42.142.36 k8sx + onap dev-pap-64b67f66b9-lc8vl 2/2 Running 0 1h 10.42.187.255 k8sx + onap dev-pdp-0 2/2 Running 0 1h 10.42.164.57 k8sx + onap dev-pdp-1 2/2 Running 0 1h 10.42.155.145 k8sx + onap dev-policydb-7d4b75869-qd8n5 1/1 Running 0 1h 10.42.148.37 k8sx + + +PDP-X Scaling +^^^^^^^^^^^^^ + +To scale a new PDP-X, set the replica count appropriately. In our scenario below, we are going to scale the PDP-X with a new replica, PDP-X 2, to have a pool of 3 PDP-X. + +.. code-block:: bash + :caption: Scaling a PDP-X + + helm upgrade -i dev local/onap --namespace onap --set policy.pdp.replicaCount=3 + + Release "dev" has been upgraded. Happy Helming! + LAST DEPLOYED: Mon May 14 01:37:03 2018 + NAMESPACE: onap + STATUS: DEPLOYED + .. + + kubectl get pods --all-namespaces -o=wide + + NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE + .. + onap dev-pdp-0 2/2 Running 0 1h 10.42.164.57 k8sx + onap dev-pdp-1 2/2 Running 0 1h 10.42.155.145 k8sx + onap dev-pdp-2 2/2 Running 0 1m 10.42.47.58 k8sx + .. + + +PDP-D Scaling +^^^^^^^^^^^^^ + +To scale a new PDP-D, set the replica count appropriately. In our scenario below, we are going to scale the PDP-D service with a new replica, PDP-D 4, to have a pool of 5 PDP-D. + +.. code-block:: bash + :caption: Scaling a PDP-D + + helm upgrade -i dev local/onap --namespace onap --set policy.drools.replicaCount=5 + Release "dev" has been upgraded. Happy Helming! + LAST DEPLOYED: Mon May 14 01:45:19 2018 + NAMESPACE: onap + STATUS: DEPLOYED + + ubuntu@k8sx:~/oom/kubernetes$ kubectl get pods --all-namespaces -o=wide + NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE + .. + onap dev-drools-0 1/1 Running 0 1h 10.42.30.52 k8sx + onap dev-drools-1 1/1 Running 0 1h 10.42.9.245 k8sx + onap dev-drools-2 1/1 Running 0 1h 10.42.95.0 k8sx + onap dev-drools-3 1/1 Running 1 1h 10.42.224.52 k8sx + onap dev-drools-4 1/1 Running 0 1m 10.42.237.251 k8sx + .. + + + + +End of Document + +.. SSNote: Wiki page ref. https://wiki.onap.org/display/DW/Scalability%2C+Resiliency+and+Manageability + + diff --git a/docs/release-notes.rst b/docs/release-notes.rst index 284145c98..548931bec 100644 --- a/docs/release-notes.rst +++ b/docs/release-notes.rst @@ -99,10 +99,8 @@ The Beijing release for POLICY delivered the following Epics. For a full list of * POLICY-734 Fix Fortify Header Manipulation Issue * POLICY-743 Fixed data name since its name was changed on server side * POLICY-753 Policy Health Check failed with multi-node cluster - * POLICY-763 PDP-D throwing NullPointerException for multiple vDNS and VOLTE messages injected in parallel * POLICY-765 junit test for guard fails intermittently - * POLICY-773 brmsgw failure pushing notification when executing update-vfw-op-policy.sh - + * POLICY-795 PDP-D allow configuration on OOM install to survive upgrades **Security Issues** -- cgit 1.2.3-korg