Version: 14.x

Metrics

The Kango service exposes Prometheus metrics for monitoring the health and performance of its Kafka producer, consumer, and message processing operations.

All metrics are exposed via OpenTelemetry and are compatible with Prometheus scraping. They are available at the /-/metrics endpoint.
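As a rough illustration of reading that endpoint by hand, the sketch below fetches /-/metrics and parses the Prometheus text format into tuples. The helper names `parse_metrics` and `scrape`, and the base URL passed by the caller, are assumptions made for this example; the parser is deliberately simplified and does not unescape label values or handle exemplars.

```python
import re
import urllib.request

# Matches `name{labels} value` or `name value`; a trailing timestamp is ignored.
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and `# HELP` / `# TYPE` comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def scrape(base_url):
    """Fetch and parse the metrics page; base_url is deployment-specific."""
    with urllib.request.urlopen(base_url + "/-/metrics", timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

In a real deployment a Prometheus server would normally do the scraping; a helper like this is mainly useful for quick checks or smoke tests.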

Kafka metric values are refreshed periodically by an internal librdkafka process. The refresh period defaults to 10 seconds and can be customized through the statistics.interval.ms property of librdkafka.
However, we recommend leaving this property unset so that the default value applies.

Service Internal Metrics

These metrics are specific to the service functionality and provide insights into data processing and stream operations.

| Metric name | Metric type | Description | Labels | Status |
|---|---|---|---|---|
| ka_flushed_messages | Counter | Number of messages that have been written to the persistence layer | result=<ok\|err> | STABLE |
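Because ka_flushed_messages is a counter split by the result label, a failure ratio can be derived from the difference between two scrapes. The following is a minimal sketch under that assumption; `counter_delta` and `flush_error_ratio` are hypothetical helper names, and the inputs are plain dictionaries mapping the result label value to the counter reading.

```python
def counter_delta(previous, current):
    """Increase of a counter between two readings.

    A counter that went backwards is assumed to have been reset
    (for example by a service restart), so its current value is
    taken as the increase since the reset.
    """
    return current - previous if current >= previous else current


def flush_error_ratio(previous, current):
    """Share of flushed messages with result=err between two scrapes.

    Both arguments map the `result` label value ("ok" or "err")
    to the counter reading taken at that scrape.
    """
    ok = counter_delta(previous.get("ok", 0.0), current.get("ok", 0.0))
    err = counter_delta(previous.get("err", 0.0), current.get("err", 0.0))
    total = ok + err
    # No flushes in the interval: report 0 instead of dividing by zero.
    return err / total if total else 0.0
```

For example, readings of ok=100, err=0 followed by ok=190, err=10 correspond to 100 flushed messages in the interval, 10 of which failed.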

Kafka Consumer Metrics

These metrics are collected from the underlying Kafka consumer client and provide insights into Kafka connectivity, performance, and consumption behavior.

| Metric name | Type | Help | Labels |
|---|---|---|---|
| kafka_consumer_replyq | Gauge | Number of ops (callbacks, events, etc) waiting in queue for application to serve with rd_kafka_poll() | consumer_group=<consumer-group> |
| kafka_consumer_tx_total | Gauge | Total number of requests sent to Kafka brokers | consumer_group=<consumer-group> |
| kafka_consumer_tx_bytes_total | Gauge | Total number of bytes transmitted to Kafka brokers | consumer_group=<consumer-group> |
| kafka_consumer_rx_total | Gauge | Total number of responses received from Kafka brokers | consumer_group=<consumer-group> |
| kafka_consumer_rx_bytes_total | Gauge | Total number of bytes received from Kafka brokers | consumer_group=<consumer-group> |
| kafka_consumer_rx_msgs_total | Gauge | Total number of messages consumed, not including ignored messages (due to offset, etc), from Kafka brokers | consumer_group=<consumer-group> |
| kafka_consumer_rx_msgs_bytes_total | Gauge | Total number of message bytes (including framing) received from Kafka brokers | consumer_group=<consumer-group> |
| kafka_consumer_group_state | Gauge | Local consumer group handler's state | consumer_group=<consumer-group> |
| kafka_consumer_group_state_age | Gauge | Time elapsed since last state change | consumer_group=<consumer-group> |
| kafka_consumer_group_rebalance_age | Gauge | Time elapsed since last rebalance (assign or revoke) | consumer_group=<consumer-group> |
| kafka_consumer_group_rebalance_count_total | Gauge | Total number of rebalances (assign or revoke) | consumer_group=<consumer-group> |
| kafka_consumer_group_assigned_partition_count | Gauge | Current assignment's partition count | consumer_group=<consumer-group> |
| kafka_consumer_broker_rtt_avg | Gauge | Broker latency / round-trip time (average) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_rtt_std | Gauge | Broker latency / round-trip time (standard deviation) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_rtt_min | Gauge | Broker latency / round-trip time (minimum) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_rtt_max | Gauge | Broker latency / round-trip time (maximum) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_rtt_p50 | Gauge | Broker latency / round-trip time (50th percentile) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_rtt_p90 | Gauge | Broker latency / round-trip time (90th percentile) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_rtt_p95 | Gauge | Broker latency / round-trip time (95th percentile) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_rtt_p99 | Gauge | Broker latency / round-trip time (99th percentile) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_throttle_avg | Gauge | Broker throttling time (average) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_throttle_std | Gauge | Broker throttling time (standard deviation) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_throttle_min | Gauge | Broker throttling time (minimum) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_throttle_max | Gauge | Broker throttling time (maximum) | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_tx_errs | Gauge | Total number of transmission errors | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_broker_rx_errs | Gauge | Total number of receive errors | consumer_group=<consumer-group> node_id=<node-id> node_name=<node-name> |
| kafka_consumer_group_lag | Gauge | Number of messages that the consumer needs to read | topic=<topic-name> partition=<partition-id> |
| kafka_consumer_fetch_queue_count | Gauge | Number of pre-fetched messages in consumer fetch queue | topic=<topic-name> partition=<partition-id> |
| kafka_consumer_fetch_queue_size | Gauge | Bytes in consumer fetch queue | topic=<topic-name> partition=<partition-id> |
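Since kafka_consumer_group_lag is reported per topic and partition, a per-topic total is often the more useful number for alerting. The sketch below sums the gauge across partitions. It assumes samples shaped as (name, labels, value) tuples, and the function name `lag_by_topic` is hypothetical. Negative readings are skipped on the assumption that they mark a lag that is not yet known; verify this against your deployment before relying on it.

```python
from collections import defaultdict


def lag_by_topic(samples):
    """Sum kafka_consumer_group_lag over all partitions of each topic.

    `samples` is an iterable of (name, labels, value) tuples, where
    `labels` is a dict holding at least the `topic` label.
    """
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name != "kafka_consumer_group_lag":
            continue
        # Assumption: a negative value means the lag is unknown.
        if value < 0:
            continue
        totals[labels.get("topic", "")] += value
    return dict(totals)
```

A topic whose total keeps growing between scrapes indicates that the consumer is falling behind the producers.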

Usage Notes

  • All metrics are collected via OpenTelemetry and can be scraped by Prometheus
  • Consumer metrics are automatically collected from librdkafka statistics
  • Queue metrics help monitor internal buffer utilization
  • Consumer lag metrics are essential for monitoring processing delays

Grafana Dashboard

A Grafana dashboard is released alongside the service to provide a standardized way of monitoring Kango. The dashboard can be found here, next to previous releases.

| Name | Version | Link |
|---|---|---|
| Kafka To Mongo Overview | v1.0.0 | download |