Age | Commit message (Collapse) | Author | Files | Lines |
|
- reconfigure == periodically retrieve the policy-handler config
from consul-kv and compare to previous config and subconfigs.
If changed, reconfigure the subunits
- selectively change one or any settings for the following
= catch_up timer interval
= reconfigure timer interval
= deployment-handler url and params (thread-safe)
= policy-engine url and params (thread-safe)
= web-socket url to policy-engine (through a callback)
- each subunit has its own Settings that keep track of changes
- try-catch and metrics around discovery - consul API
- hidden the secrets from logs
- froze the web-socket version to 0.49.0 because 0.50.0
and 0.51.0 are broken - looking around for stable alternatives
- fixed-adapted the callbacks passed to the web-socket lib
that changed its API in 0.49.0 and later
- log the stack on the exception occurring in the web-socket lib
- unit test refactoring
Change-Id: Id53bad59660a197f59d9aeb7c05ab761d1060cd0
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-470
|
|
- turned off test_gc unit-test on policy-handler to avoid
get /gc/stats after shutdown of the web-server
- made rougher comparison between execution time and timer interval
Change-Id: Idcf6caae6f2a934dc2dc2d5a0fddd06543abd48a
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-532
|
|
- in search of the memory leak that is falsely reported
by docker stats, the following runtime logging was added
= process_memory - rss and other memory of the current process
= virtual_memory - the memory info of the whole system
= thread_stacks - the active threads with the full stack on each
Change-Id: I5f5ab3a477bfba3aecc5963547aa82da6269670b
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-514
|
|
- added try-except for top level Exception into all threads
of policy-handler to avoid losing the thread and tracking
the unexpected crashes
- rediscover the deployment-handler if not found before
and after each catchup
- refactored audit - separated metrics from audit
- added more stats and runtime info to healthcheck
= gc counts and garbage info if any detected
= memory usage - to detect the potential memory leaks
= request_id to all stats
= stats of active requests
- avoid reallocating the whole Queue of policy-updates after catchup
= clear of the internal queue under proper lock
Change-Id: I3fabcaac70419a68bd070ff7d591a75942f37663
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-483
|
|
- fixed the bug of unpredictably stopping of the periodic catch-up
step-timer due to thread race condition in policy-handler
= added critical sections under the reentrant lock on every group
of local var change in step-timer
- added more stats for healthcheck to track each type of
job-operation separately
= that helps narrowing down identifying the potential problems
- unit test coverage 76%
Change-Id: I92ddf6c92a3d225d9b87427e3edfb7f80669501a
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-472
|
|
- improved step-timer due to unit tests
-- fixed events
-- better logging
- audit - collect list of package thru subprocess pip freeze
- unit tests coverage 76%
Change-Id: Ib1cb5f687612ecf18aa7414b1ff7dbf5774345b4
Signed-off-by: Alex Shatov <alexs@att.com>
Issue-ID: DCAEGEN2-389
|