Age | Commit message (Collapse) | Author | Files | Lines |
|
- in R4 Dublin the policy-engine introduced a totally new API
- policy-handler now has a startup option to either use the new PDP API
or the old PDP API that was created-updated before the end of 2018
- see README.md and README_pdp_api_v0.md for instructions on how to
setup the policy-handler running either with the new PDP API
or the old (pdp_api_v0) PDP API
- this is a massive refactoring that changed almost all the source files,
but kept the old logic when using the old (pdp_api_v0) PDP API
- all the code related to PDP API version is split into two subfolders
= pdp_api/ contains the new PDP API source code
= pdp_api_v0/ contains the old (2018) PDP API source code
= pdp_client.py imports from either pdp_api or pdp_api_v0
= the rest of the code is only affected when it needs to branch
the logic
- logging to policy_handler.log now shows the path of the source file to
allow tracing which PDP API is actually used
- when the new PDP API is used, the policy-update flow is disabled
= passive mode of operation
= no web-socket
= no periodic catch_up
= no policy-filters
= reduced web-API - only a single /policy_latest endpoint is available
/policies_latest returns 404
/catch_up request is accepted, but ignored
- on new PDP API: http /policy_latest returns the new data from the
new PDP API with the following fields added by the policy-handler
to keep other policy related parts intact in R4
(see pdp_api/policy_utils.py)
= "policyName" = policy_id + "." + "policyVersion" + ".xml"
= "policyVersion" = str("metadata"."policy-version")
= "config" - is the renamed "properties" from the new PDP API response
- unit tests are split into two subfolders as well
= main/ for the new PDP API testing
= pdp_api_v0/ for the old (2018) PDP API
- removed the following line from the license text of changed files
ECOMP is a trademark and service mark of AT&T Intellectual Property.
- the new PDP API is expected to be extended and redesigned in R5 El Alto
- on retiring the old PDP API - the intention is to be able to remove
the pdp_api_v0/ subfolder and minimal related cleanup of the code
that imports that as well as the cleanup of the config.py, etc.
Change-Id: Ief9a2ae4541300308caaf97377f4ed051535dbe4
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-1128
|
|
DCAEGEN2-931:
- exposed POST /reconfigure endpoint on the web-server
that initiates the reconfigure process right away
DCAEGEN2-932:
- mode_of_operation: active or passive
= active is as before this change
= in passive mode the policy-handler
* closes the web-socket to PDP
* skips the periodic catch_ups
* still periodically checks for reconfigure
* still allows usig the web-server to retrieve policies from PDP
- default is active
- when mode_of_operation changes from passive to active,
the policy-handler invokes the catch_up right away
- config-kv contains the optional override field mode_of_operation
= changing the mode_of_operation in config-kv and invoking
POST /reconfigure will bring the new value and change the
mode of operation of the policy-handler if no service_activator
section is provided in consul-kv record
- if config-kv contains the service_activator section,
= the policy-handler registers with service_activator - untested
= and receives the mode_of_operation - untested
= service_activator can POST-notify the policy-handler to
initiate the /reconfigure
- reduced the default web-socket ping interval from 180 to 30
seconds because PDP changed its default timeout on the web-socket
from 400 seconds to 50 seconds
Change-Id: If7dd21c008d9906aca97939be65dfa9c2f007535
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-931
Issue-ID: DCAEGEN2-932
|
|
DCAEGEN2-853:
- stop reporting the absence of policies or updates
as error - this is an expected result == INFO or WARNING
DCAEGEN2-903: preparation for TLS on the web-server of policy-handler
DCAEGEN2-930:
- configurable timeouts for http requests from policy-handler
- added configurable pinging on the web-socket to PDP
- added healthcheck info on the web-socket
- upgraded websocket-client lib to 0.53.0
DCAEGEN2-1017: fixed a bug on policy-filter matching
by filter_config_name
- refactored and enhanced the unit-tests
Change-Id: I111ddc57bb978554ef376cbf916965b6667dad9b
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-853
Issue-ID: DCAEGEN2-903
Issue-ID: DCAEGEN2-930
Issue-ID: DCAEGEN2-1017
|
|
- tls to policy-engine
- tls on web-socket to policy-engine
- tls to deployment-handler
- no tls on the web-server side
= that is internal API
= will add TLS in R4
- policy-handler expecting the deployment process
to mount certs at /opt/app/policy_handler/etc/tls/certs/
- blueprint for policy-handler will be updated to contain
cert_directory : /opt/app/policy_handler/etc/tls/certs/
- the matching local etc/config.json has new part tls with:
= cert_directory : etc/tls/certs/
= cacert : cacert.pem
- new optional fields tls_ca_mode in config on consul that
specify where to find the cacert.pem for tls per each https/web-socket
values are:
"cert_directory" - use the cacert.pem stored locally in cert_directory
this is the default if cacert.pem file is found
"os_ca_bundle" - use the public ca_bundle provided by linux system.
this is the default if cacert.pem file not found
"do_not_verify" - special hack to turn off the verification by cacert
and hostname
- config on consul now has 2 new fields for policy_engine
= "tls_ca_mode" : "cert_directory"
= "tls_wss_ca_mode" : "cert_directory"
- config on consul now has 1 new field for deploy_handler
= "tls_ca_mode" : "cert_directory"
- removed customization for verify -- it is now a built-in feature
Change-Id: Ibe9120504ed6036d1ed4c84ff4cd8ad1d9e80f17
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-611
|
|
- reconfigure == periodically retrieve the policy-handler config
from consul-kv and compare to previous config and subconfigs.
If changed, reconfigure the subunits
- selectively change one or any settings for the following
= catch_up timer interval
= reconfigure timer interval
= deployment-handler url and params (thread-safe)
= policy-engine url and params (thread-safe)
= web-socket url to policy-engine (through a callback)
- each subunit has its own Settings that keep track of changes
- try-catch and metrics around discovery - consul API
- hidden the secrets from logs
- froze the web-socket version to 0.49.0 because 0.50.0
and 0.51.0 are broken - looking around for stable alternatives
- fixed-adapted the callbacks passed to the web-socket lib
that changed its API in 0.49.0 and later
- log the stack on the exception occurring in the web-socket lib
- unit test refactoring
Change-Id: Id53bad59660a197f59d9aeb7c05ab761d1060cd0
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-470
|
|
- pass cfy_tenant_name in query from policy-handler
to deployment-handler
- new config "query":{"cfy_tenant_name": "default_tenant"}
- limits the single policy-handler to a single cfy_tenant_name
in cloudify under the deployment-handler
Change-Id: I257a9b74be6ddcde77a2b4fceabd4aa628890466
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-704
|
|
- changed API and functionality - new dataflow
- new dataflow between policy-handler and deployment-handler
on policy-update and catchup
= GETting policy_ids+versions and policy-filters from
deployment-handler
= PUTting policy-update and catchup in the new message format
= data segmenting the policy-update/catchup messages to
deployment-handler to avoid 413 on deployment-handler side
= matching policies from policy-engine to policies
and policy-filters from deployment-handler
= coarsening the policyName filter received from deployment-handler
to reduce the number messages passed to policy-engine on catchup
= consolidating sequential policy-updates into a single request
when the policy-update is busy
- removed policy scope-prefixes from config and logic -
it is not needed anymore because
= the policy matching happens directly to policies and
policy-filters received from deployment-handler
= on catchup - the policy scope-prefix equivalents are calculated
based on the data received from deployment-handler
- API - GET /policies_latest now returns the info on deployed
policy_ids+versions and policy-filters, rather than policies
of the scope-prefixes previously found in config (obsolete)
- not sending an empty catch_up message to deployment-handler
when nothing changed
- send policy-removed to deployment-handler when getting
404-not found from PDP on removal of policy
- config change: removed catch_up.max_skips - obsolete
- brought the latest CommonLogger.py
- minor refactoring - improved naming of variables
Change-Id: I36b3412eefd439088cb693703a6e5f18f4238b00
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-492
|
|
- no change of functionality or API
- removed the unused enum34>=1.1.6 from requirements.txt and setup.py
- refactored run_policy.sh to redirect the stdout+stderr only once
- refactoring to remove smells+vulnerability reported by sonar
-- renamed Config.config to Config.settings
-- removed the commented out code in customizer.py
-- renamed StepTimer.NEXT to StepTimer.STATE_NEXT to avoid the
naming confusion with the method StepTimer.next.
Also renamed the related StepTimer.STATE_* constants
-- refactored several functions by extracting methods to eliminate
4 out of 5 "brain-overload" smells reported by sonar
-- moved the literal string for the socket_host "0.0.0.0" to a
constant on the web-server to avoid the reported vulnerability
Change-Id: I4c7d47d41c6ecd7cb28f6704f5dad2053c1ca7d6
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-515
|
|
- migrated from python 2.7 to 3.6
- brought up the latest versions of dependencies
-- Cherrypy 15.0.0, requests 2.18.4, websocket-client 0.48.0
- fixed migration errors
-- renamed the standard package Queue to queue
-- dict.items() instead of dict.iteritems()
-- dict.keys() instead of dict.viewkeys()
-- range() instead of xrange()
-- subprocess.check_output(..., universal_newlines=True) to
get str instead of byte-stream from stdout
- cleaned up migration warnings
-- super() instead of super(A, self)
-- logger.warning() instead of .warn()
- moved main() from policy_handler.py to __main__.py
- getting the policy_handler version directly from setup.py
instead of the env var on init of the audit
Change-Id: I0fc4ddc51c08a64f3cfdc5d2f010b1c6a1ae92f0
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-515
|
|
- turned off test_gc unit-test on policy-handler to avoid
get /gc/stats after shutdown of the web-server
- made rougher comparison between execution time and timer interval
Change-Id: Idcf6caae6f2a934dc2dc2d5a0fddd06543abd48a
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-532
|
|
- in search of the memory leak that is falsely reported
by docker stats, the following runtime logging was added
= process_memory - rss and other memory of the current process
= virtual_memory - the memory info of the whole system
= thread_stacks - the active threads with the full stack on each
Change-Id: I5f5ab3a477bfba3aecc5963547aa82da6269670b
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-514
|
|
- added try-except for top level Exception into all threads
of policy-handler to avoid losing the thread and tracking
the unexpected crashes
- rediscover the deployment-handler if not found before
and after each catchup
- refactored audit - separated metrics from audit
- added more stats and runtime info to healthcheck
= gc counts and garbage info if any detected
= memory usage - to detect the potential memory leaks
= request_id to all stats
= stats of active requests
- avoid reallocating the whole Queue of policy-updates after catchup
= clear of the internal queue under proper lock
Change-Id: I3fabcaac70419a68bd070ff7d591a75942f37663
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-483
|
|
- fixed the bug of unpredictably stopping of the periodic catch-up
step-timer due to thread race condition in policy-handler
= added critical sections under the reentrant lock on every group
of local var change in step-timer
- added more stats for healthcheck to track each type of
job-operation separately
= that helps narrowing down identifying the potential problems
- unit test coverage 76%
Change-Id: I92ddf6c92a3d225d9b87427e3edfb7f80669501a
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-472
|
|
- improved step-timer due to unit tests
-- fixed events
-- better logging
- audit - collect list of package thru subprocess pip freeze
- unit tests coverage 76%
Change-Id: Ib1cb5f687612ecf18aa7414b1ff7dbf5774345b4
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-389
|
|
- policy-handler uses dns based discovery of
deployment-handler - driven by config
- new data structure for deploy_handler section of config
-- changed from string "deployment_handler" in 2.3.1
to structure in 2.4.0
deploy_handler :
# name of deployment-handler service
# used by policy-handler for logging
target_entity : "deployment_handler"
# url of the deployment-handler service
# for policy-handler to direct the policy-updates to
# - expecting dns to resolve the name
# deployment_handler to ip address
url : "http://deployment_handler:8188"
- logic is backwards compatible with 2.3.1 format
- removed import pip from audit
-- import pip broken in pip 9.0.2 (2018-03-19)
-- import pip conflicts with requests
-- pip API is not officially supported
-- see links for more
https://github.com/pypa/pip/issues/5079
https://github.com/pypa/pip/issues/5081
Change-Id: Ifcaba6cfd714f3099ab7a25fe979a3696a6460fc
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-404
|
|
- periodically catchup - interval is configurable
= max_skips defines the number of times the catch_up
message that is identical to previous one can be skipped
- do not catchup more often than the interval
even between the manual catchup and auto catchup
- do not send the same catchup message twice in a row
to the deployment-handler but not exceed a hard limit
on catchup max_skips
- catchup if the deployment-handler instance is changed
Change-Id: I9a3fcc941e8a9e553abb3952dd882c37e0f5fdfe
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-389
|
|
- added etc_customize/ folder and customize.sh script
= customize.sh script is expected to be overridden by company
to customize Docker image build
= the whole etc_customize/ folder is copied into docker image
= it is up to the company what to put into that folder - any files
- added customize/ folder with CustomizeBase and Customize classes
= CustomizeBase defines the interface and the default=ONAP behavior
= CustomizeBase is owned by ONAP and should not be changed
by the company
= Customize inherits CustomizeBase
= policy-handler instantiates Customize
to get the customized behavior
= Customize is owned by the company and should be changed
by the company = ONAP is not going to change Customize
= the methods of Customize are expected to be overridden
by the company to change the behavior of the policy-handler
= sample Customize class can be found in README.md
= Company is allowed to add more files to customize/ folder
if that is required for better structuring of their code
as soon as it is invoked by the methods of Customize
Change-Id: I46f8170afaaa48e1005e4398a768a781db0a0e6c
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-379
|
|
- removed #org.onap.dcae from license text
Change-Id: I07f11e60c4677109ccb826c4e969b47acb4c498a
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-347
|
|
- minor list comparison bug - not affecting much
Change-Id: I2f3a51ce2064601f0a08547d7c250eee551f6721
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-347
|
|
Change-Id: I853af6b231b4b2f4eff7492d4770ce6a3a7fd786
Signed-off-by: Lusheng Ji <lji@research.att.com>
Issue-ID: DCAEGEN2-355
|
|
Change-Id: I685c63758b7ce22766885d399f06e9ba14ca59f2
Issue-ID: DCAEGEN2-249
Signed-off-by: Alex Shatov <alexs@att.com>
|
|
* added errored_scopes and scope_prefixes to the message
to deployment-handler - to prevent erroneous
removal of policies
* hardcoded condition for scope not found 404 at policy-engine
to separate it from error on the scope retrieval 400
* adjusting the web API message in sync with notification
to deployment-handler
* unit test coverage 74%
Change-Id: Ie736a1b7aee0631b6785669c6b765bd240dd77b8
Issue-ID: DCAEGEN2-249
Signed-off-by: Alex Shatov <alexs@att.com>
|
|
Change-Id: I2a3628cb67d15ab2828f6818764d111df13e795a
Issue-ID: DCAEGEN2-249
Signed-off-by: Alex Shatov <alexs@att.com>
|
|
* new feature variable collection of policies per component in DCAE
* massive refactoring
* dissolved the external PolicyEngine.py into policy_receiver.py
- kept only the web-socket communication to PolicyEngine
* new /healthcheck - shows some stats of service running
* Unit Test coverage 75%
Change-Id: I816b7d5713ae0dd88fa73d3656f272b4f3e7946e
Issue-ID: DCAEGEN2-249
Signed-off-by: Alex Shatov <alexs@att.com>
|
|
usage on local run:
tox -c tox-local.ini
usage on ONAP run:
tox
Change-Id: Ic455f0f44f5b3bee92b60ea282851e72c3a12b7e
Issue-Id: DCAEGEN2-62
Signed-off-by: Alex Shatov <alexs@att.com>
|