Peep - Efficient TelemetryMetrics reporter supporting Prometheus and StatsD

Peep is a new TelemetryMetrics reporter that supports both StatsD (and Dogstatsd) and Prometheus.

While load testing a new WebSocket-based API gateway written in Elixir, I ran into performance issues with TelemetryMetricsPrometheus.Core and TelemetryMetricsStatsd. This prompted me to write Peep, which makes different choices about how TelemetryMetrics data is stored and sent:

  1. Instead of sampling or on-demand aggregation, Peep uses histograms (backed by :ets.update_counter/*) to store distributions, copying the approach taken by DDSketch.
  2. Instead of sending StatsD packets for each telemetry event, StatsD data is periodically sent in a small(er) number of large(r) packets.
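
To make point 1 concrete, here is a minimal sketch of a log-scale histogram kept in ETS. This is not Peep's actual code: the module name HistogramSketch, the gamma value of 1.1, and the bucket formula are all assumptions picked for illustration, loosely in the spirit of DDSketch's relative-error buckets.

```elixir
defmodule HistogramSketch do
  # Hypothetical module for illustration only; not part of Peep.

  # Assumed growth factor: bucket i covers values in (gamma^(i-1), gamma^i].
  @gamma 1.1
  @log_gamma :math.log(@gamma)

  # Create an unnamed public table that tolerates concurrent writers.
  def new do
    :ets.new(:histogram_sketch, [:set, :public, write_concurrency: true])
  end

  # Record one observation by bumping the counter of the bucket it falls in.
  def observe(table, value) when is_number(value) and value > 0 do
    key = bucket_for(value)
    # {2, 1} adds 1 to the second tuple element; {key, 0} is inserted first
    # if the bucket has not been seen before.
    :ets.update_counter(table, key, {2, 1}, {key, 0})
    :ok
  end

  # Index of the bucket whose upper bound is the smallest gamma^i >= value.
  def bucket_for(value), do: ceil(:math.log(value) / @log_gamma)

  # Upper bound of a bucket, recovered from its index.
  def upper_bound(bucket), do: :math.pow(@gamma, bucket)

  # All {bucket, count} pairs, sorted by bucket index.
  def counts(table), do: table |> :ets.tab2list() |> Enum.sort()
end
```

The point of this layout is that storage grows with the number of distinct buckets rather than the number of observations, and each observation is a single atomic counter update.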

This library is currently running in production, in a service handling more than 1 million requests per minute. With a moderate number of metrics defined, the service emits StatsD data at a rate of 4KiB/s, with no observed packet drops. (We use Unix domain sockets to send Dogstatsd lines to Datadog agents, so it’s possible for :gen_udp to return :eagain when attempting to send packets.)
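
The batching idea from point 2 can be pictured roughly like this. It is a sketch under my own assumptions rather than what Peep does internally: StatsdBatchSketch and pack/2 are hypothetical names, and the size limit is whatever your socket can take in one datagram.

```elixir
defmodule StatsdBatchSketch do
  # Hypothetical helper for illustration only; not part of Peep.

  # Groups StatsD lines into newline-joined packets of at most max_bytes each.
  # A single line longer than max_bytes still gets a packet of its own.
  def pack(lines, max_bytes) do
    lines
    |> Enum.reduce([], fn line, acc ->
      case acc do
        # The line still fits into the packet being built (+1 for the newline).
        [current | rest] when byte_size(current) + 1 + byte_size(line) <= max_bytes ->
          [current <> "\n" <> line | rest]

        # No packet yet, or the current one is full: start a new packet.
        _ ->
          [line | acc]
      end
    end)
    |> Enum.reverse()
  end
end
```

Each resulting binary would then be written to the socket in a single send, so the number of sends depends on the volume of data rather than the number of telemetry events.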

Here’s an image showing a drop in CPU use after replacing TelemetryMetricsPrometheus.Core and TelemetryMetricsStatsd with Peep:

Here’s another dashboard for the same period of time, showing a slight (but not unwelcome!) drop in memory usage:

Feedback and contributions welcome!


Peep v2.0.0 has been released!

This version fixes an issue with exposing data for Prometheus. If you use Peep with Prometheus, you should upgrade to this version.

Changes

  • Fix an issue with Prometheus exposition where zero-valued bucket time series were not shown.
  • Add support for custom bucket boundaries. As part of this change, the distribution_bucket_variability option was removed.

Custom bucket boundaries

With Peep 2.0.0, the default log-linear bucketing strategy becomes an implementation of the new Peep.Buckets behaviour.

You can use the Peep.Buckets.Custom module to define your own bucket boundaries. This compiles to efficient pattern matching with function heads, which ought to scale better than traversing a list.
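
For anyone curious what "pattern matching with function heads" means in practice, here is a rough sketch of the technique. It is not the Peep.Buckets.Custom source: FunctionHeadBucketsSketch, SketchBuckets and bucket_for/1 are hypothetical names, and the use of `<=` for the boundary comparison is my assumption.

```elixir
defmodule FunctionHeadBucketsSketch do
  # Hypothetical macro for illustration only; not part of Peep.
  # Expects a literal list of ascending boundaries under the :buckets key.
  defmacro __using__(opts) do
    boundaries = Keyword.fetch!(opts, :buckets)

    # One guarded clause per boundary, generated at compile time.
    clauses =
      for {bound, index} <- Enum.with_index(boundaries) do
        quote do
          def bucket_for(value) when value <= unquote(bound), do: unquote(index)
        end
      end

    quote do
      unquote_splicing(clauses)
      # Anything above the last boundary lands in the overflow bucket.
      def bucket_for(_value), do: unquote(length(boundaries))
    end
  end
end

defmodule SketchBuckets do
  use FunctionHeadBucketsSketch, buckets: [10, 100, 1_000]
end
```

With this, SketchBuckets.bucket_for(50) picks the second clause and returns 1, and no list of boundaries is walked at runtime.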

Here’s an example of using Peep.Buckets.Custom:

defmodule MyBuckets do
  use Peep.Buckets.Custom, buckets: [
    1, 2, 5,
    10, 20, 50,
    100, 200, 500,
    1_000, 2_000, 5_000,
    10_000, 20_000, 50_000,
    100_000, 200_000, 500_000,
    1_000_000, 2_000_000, 5_000_000
  ]
end

distribution("my.dist.with.custom.buckets", [reporter_options: [peep_bucket_calculator: MyBuckets]])

If you want something more involved, you can implement the callbacks for the Peep.Buckets behaviour. For an example, look at Peep.Buckets.Exponential.
