From 8b1223dae89dc5c81c576048b448f12b9d860abb Mon Sep 17 00:00:00 2001 From: kjaniak Date: Wed, 19 Dec 2018 22:41:40 +0100 Subject: Update HV-VES documentation - metrics Update of documentation to cover metrics provided with use of Prometheus library. Change-Id: Ib81208ce80a3bfb4000781be84d1465731055551 Issue-ID: DCAEGEN2-1046 Signed-off-by: kjaniak --- docs/sections/services/ves-hv/deployment.rst | 9 --- .../services/ves-hv/healthcheck-and-monitoring.rst | 89 ++++++++++++++++++++++ docs/sections/services/ves-hv/index.rst | 1 + .../services/ves-hv/metrics_sample_response.txt | 67 ++++++++++++++++ 4 files changed, 157 insertions(+), 9 deletions(-) create mode 100644 docs/sections/services/ves-hv/healthcheck-and-monitoring.rst create mode 100644 docs/sections/services/ves-hv/metrics_sample_response.txt diff --git a/docs/sections/services/ves-hv/deployment.rst b/docs/sections/services/ves-hv/deployment.rst index 9fdbceb1..07d26b94 100644 --- a/docs/sections/services/ves-hv/deployment.rst +++ b/docs/sections/services/ves-hv/deployment.rst @@ -117,12 +117,3 @@ Result: .. code-block:: bash kubectl get pods --namespace ${ONAP_NAMESPACE} --selector app=dcae-hv-ves-collector - -Healthcheck -=========== - -Inside HV-VES docker container runs small http service for healthcheck - exact port for this service can be configured -at deployment using `--health-check-api-port` command line option. - -This service exposes single endpoint **GET /health/ready** which returns **HTTP 200 OK** in case HV-VES is healthy -and ready for connections. Otherwise it returns **HTTP 503 Service Unavailable** with short reason of unhealthiness. diff --git a/docs/sections/services/ves-hv/healthcheck-and-monitoring.rst b/docs/sections/services/ves-hv/healthcheck-and-monitoring.rst new file mode 100644 index 00000000..8437180d --- /dev/null +++ b/docs/sections/services/ves-hv/healthcheck-and-monitoring.rst @@ -0,0 +1,89 @@ +.. This work is licensed under a Creative Commons Attribution 4.0 International License. +.. http://creativecommons.org/licenses/by/4.0 + +.. _healthcheck_and_monitoring: + +Healthcheck and Monitoring +========================== + +Healthcheck +----------- +Inside HV-VES docker container runs a small HTTP service for healthcheck. Port for healthchecks can be configured +at deployment using ``--health-check-api-port`` command line option or via `VESHV_HEALTHCHECK_API_PORT` environment variable (for details see :ref:`deployment`). + +This service exposes endpoint **GET /health/ready** which returns a **HTTP 200 OK** when HV-VES is healthy +and ready for connections. Otherwise it returns a **HTTP 503 Service Unavailable** message with a short reason of unhealthiness. + + +Monitoring +---------- +HV-VES collector allows to collect metrics data at runtime. To serve this purpose HV-VES application exposes an endpoint **GET /monitoring/prometheus** +which returns a **HTTP 200 OK** message with a specific data in its body. Returned data is in a format readable by Prometheus service. +Prometheus endpoint shares a port with healthchecks. + +Metrics provided by HV-VES metrics: + ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| Name of metric | Unit | Description | ++===============================================+==============+==========================================================================================+ +| hvves_clients_rejected_cause_total | cause/piece | number of rejected clients grouped by cause | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_clients_rejected_total | piece | total number of rejected clients | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_connections_active | piece | number of currently active connections | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_connections_total | piece | total number of connections | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_data_received_bytes_total | bytes | total number of received bytes | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_disconnections_total | piece | total number of disconnections | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_messages_dropped_cause_total | cause/piece | number of dropped messages grouped by cause | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_messages_dropped_total | piece | total number of dropped messages | ++-----------------------------------------------+--------------+-----------------------------------+------------------------------------------------------+ +| hvves_messages_latency_seconds_count | piece | latency is a time between | counter for number of latency occurance | ++-----------------------------------------------+--------------+ message.header.lastEpochMicrosec +------------------------------------------------------+ +| hvves_messages_latency_seconds_max | seconds | and time when data has been sent | maximal observed latency | ++-----------------------------------------------+--------------+ from HV-VES to Kafka +------------------------------------------------------+ +| hvves_messages_latency_seconds_sum | seconds | | sum of latency parameter from each message | ++-----------------------------------------------+--------------+-----------------------------------+------------------------------------------------------+ +| hvves_messages_processing_time_seconds_count | piece | processing time is time meassured | counter for number of processing time occurance | ++-----------------------------------------------+--------------+ between decoding of WTP message +------------------------------------------------------+ +| hvves_messages_processing_time_seconds_max | seconds | and time when data has been sent | maximal processing time | ++-----------------------------------------------+--------------+ From HV-VES to Kafka +------------------------------------------------------+ +| hvves_messages_processing_time_seconds_sum | seconds | | sum of processing time from each message | ++-----------------------------------------------+--------------+-----------------------------------+------------------------------------------------------+ +| hvves_messages_received_payload_bytes_total | bytes | total number of received payload bytes | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_messages_received_total | piece | total number of received messages | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_messages_sent_topic_total | topic/piece | number of sent messages grouped by topic | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ +| hvves_messages_sent_total | piece | number of sent messages | ++-----------------------------------------------+--------------+------------------------------------------------------------------------------------------+ + +JVM metrics: + +- jvm_buffer_memory_used_bytes +- jvm_classes_unloaded_total +- jvm_gc_memory_promoted_bytes_total +- jvm_buffer_total_capacity_bytes +- jvm_threads_live +- jvm_classes_loaded +- jvm_gc_memory_allocated_bytes_total +- jvm_threads_daemon +- jvm_buffer_count +- jvm_gc_pause_seconds_count +- jvm_gc_pause_seconds_sum +- jvm_gc_pause_seconds_max +- jvm_gc_max_data_size_bytes +- jvm_memory_committed_bytes +- jvm_gc_live_data_size_bytes +- jvm_memory_max_bytes +- jvm_memory_used_bytes +- jvm_threads_peak + +Sample response for **GET monitoring/prometheus**: + +.. literalinclude:: metrics_sample_response.txt diff --git a/docs/sections/services/ves-hv/index.rst b/docs/sections/services/ves-hv/index.rst index 5cfaed3f..5bb83ddc 100644 --- a/docs/sections/services/ves-hv/index.rst +++ b/docs/sections/services/ves-hv/index.rst @@ -36,3 +36,4 @@ High Volume VES Collector overview and functions HV-VES Offered APIs <../../apis/ves-hv/index> authorization example-event + healthcheck-and-monitoring diff --git a/docs/sections/services/ves-hv/metrics_sample_response.txt b/docs/sections/services/ves-hv/metrics_sample_response.txt new file mode 100644 index 00000000..da54e3ea --- /dev/null +++ b/docs/sections/services/ves-hv/metrics_sample_response.txt @@ -0,0 +1,67 @@ +jvm_threads_live 26.0 +system_cpu_count 4.0 +jvm_gc_memory_promoted_bytes_total 6740576.0 +process_cpu_usage 5.485463521667581E-4 +jvm_buffer_count{id="mapped",} 0.0 +jvm_buffer_count{id="direct",} 14.0 +jvm_threads_peak 26.0 +jvm_classes_unloaded_total 0.0 +hvves_clients_rejected_cause_total{cause="too_big_payload",} 7.0 +hvves_messages_sent_topic_total{topic="HV_VES_PERF3GPP",} 20000.0 +jvm_threads_daemon 25.0 +jvm_buffer_memory_used_bytes{id="mapped",} 0.0 +jvm_buffer_memory_used_bytes{id="direct",} 1.34242446E8 +jvm_gc_max_data_size_bytes 4.175429632E9 +hvves_messages_received_total 32000.0 +hvves_messages_dropped_total 12000.0 +hvves_clients_rejected_total 7.0 +system_cpu_usage 0.36204059243006037 +jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 5832704.0 +jvm_memory_max_bytes{area="nonheap",id="Metaspace",} -1.0 +jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'",} 1.22912768E8 +jvm_memory_max_bytes{area="nonheap",id="Compressed Class Space",} 1.073741824E9 +jvm_memory_max_bytes{area="heap",id="G1 Eden Space",} -1.0 +jvm_memory_max_bytes{area="heap",id="G1 Old Gen",} 4.175429632E9 +jvm_memory_max_bytes{area="heap",id="G1 Survivor Space",} -1.0 +jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 1.22912768E8 +hvves_messages_dropped_cause_total{cause="invalid",} 12000.0 +hvves_messages_received_payload_bytes_total 1.0023702E7 +jvm_buffer_total_capacity_bytes{id="mapped",} 0.0 +jvm_buffer_total_capacity_bytes{id="direct",} 1.34242445E8 +system_load_average_1m 2.77 +hvves_data_received_bytes_total 1.052239E7 +jvm_gc_pause_seconds_count{action="end of minor GC",cause="Metadata GC Threshold",} 2.0 +jvm_gc_pause_seconds_sum{action="end of minor GC",cause="Metadata GC Threshold",} 0.087 +jvm_gc_pause_seconds_count{action="end of minor GC",cause="G1 Evacuation Pause",} 8.0 +jvm_gc_pause_seconds_sum{action="end of minor GC",cause="G1 Evacuation Pause",} 0.218 +jvm_gc_pause_seconds_max{action="end of minor GC",cause="Metadata GC Threshold",} 0.03 +jvm_gc_pause_seconds_max{action="end of minor GC",cause="G1 Evacuation Pause",} 0.031 +hvves_messages_processing_time_seconds_max 0.114395 +hvves_messages_processing_time_seconds_count 20000.0 +hvves_messages_processing_time_seconds_sum 280.282544 +hvves_disconnections_total 11.0 +jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 1312640.0 +jvm_memory_used_bytes{area="nonheap",id="Metaspace",} 3.624124E7 +jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'",} 1.1602304E7 +jvm_memory_used_bytes{area="nonheap",id="Compressed Class Space",} 4273752.0 +jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 1.38412032E8 +jvm_memory_used_bytes{area="heap",id="G1 Old Gen",} 7638112.0 +jvm_memory_used_bytes{area="heap",id="G1 Survivor Space",} 7340032.0 +jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 4083712.0 +jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 2555904.0 +jvm_memory_committed_bytes{area="nonheap",id="Metaspace",} 3.7486592E7 +jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'",} 1.1730944E7 +jvm_memory_committed_bytes{area="nonheap",id="Compressed Class Space",} 4587520.0 +jvm_memory_committed_bytes{area="heap",id="G1 Eden Space",} 1.58334976E8 +jvm_memory_committed_bytes{area="heap",id="G1 Old Gen",} 9.8566144E7 +jvm_memory_committed_bytes{area="heap",id="G1 Survivor Space",} 7340032.0 +jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 4128768.0 +jvm_gc_memory_allocated_bytes_total 1.235222528E9 +hvves_connections_total 12.0 +jvm_classes_loaded 7120.0 +hvves_messages_sent_total 20000.0 +hvves_connections_active 1.0 +jvm_gc_live_data_size_bytes 7634496.0 +hvves_messages_latency_seconds_max 1.5459828692292638E9 +hvves_messages_latency_seconds_count 20000.0 +hvves_messages_latency_seconds_sum 2.91400110035487E9 \ No newline at end of file -- cgit 1.2.3-korg