What do you use to monitor your Elixir app?

9mm · November 25, 2018, 4:00am

I’m close to deploying my app.

I can imagine that a realtime view of things like the number of processes, exceptions and whatnot would be really helpful.

I know there are some extremely bloated tools for Ruby; I’d rather avoid stuff like that.

So what are your ‘must have’ monitoring / performance tuning / metrics libraries?

lpil · November 25, 2018, 12:49pm

I’ve been using Appsignal https://appsignal.com/

The product itself is excellent for both exception and performance monitoring, and the custom metrics are handy too. It’s also well priced: switching over from New Relic saved us a lot of money.

I’ve pestered the Appsignal team plenty of times for help (or to obnoxiously request features!) and they’ve always been extremely helpful and responsive. Very happy with the service.

It currently has a bug where with cowboy 2 it reports 404s as errors, but that should be fixed soon.

idi527 · November 25, 2018, 1:44pm

I use prometheus for metrics. And an “in-house” contraption on top of tantivy for logs / exceptions.

fulmicoton · November 30, 2018, 12:25am

@idi527 Any plans to open-source the tantivy contraption?

keathley · November 30, 2018, 1:12am

Currently we use a combination of statsd for raw metrics and open tracing through spandex: https://github.com/spandex-project. All of this gets sent over to Datadog. In the past I’ve used Prometheus and Grafana and both are good. I actually prefer Prometheus to statsd but both are fine. I don’t think I could live without open tracing at this point. It’s probably not necessary if you’re just running 1 or 2 services. But if you end up with a lot of different services it’s pretty invaluable for observability.

Off-the-shelf APM tools are at their strongest when you only need to monitor / observe 1 application or service. As systems grow I tend to want more control over my monitoring and alerting rules, and a lot of off-the-shelf APM tools don’t give you that sort of power. In the end I prefer tools like Datadog or Grafana.

idi527 · November 30, 2018, 8:07am

It would need some cleaning up, but yes, I can try. It is somewhat similar to https://github.com/KodrAus/tantivy-log

Thank you so much for tantivy!

nburkley · November 30, 2018, 9:53am

We’re using a combination of the following for our Elixir services:

Statsd via the statix library for stats and metrics
Bugsnag via the bugsnag elixir library for capturing exceptions
Sending logs to an ELK stack via logstash-json. We’ve started using elastalert to trigger alerts on some specific errors in our Kibana logs. It’s nice to be able to configure alerts on existing logs, without touching a line of code
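
For anyone unfamiliar with statsd: a metric is just a short plain-text datagram of the form `name:value|type` sent over UDP. The sketch below shows that wire format using only the standard library. It is not how statix is implemented; the module name `TinyStatsd`, the metric names and the `127.0.0.1:8125` target are assumptions for illustration.

```elixir
defmodule TinyStatsd do
  # Hypothetical module for illustration only; a real app would use a
  # library such as statix instead of opening a socket per metric.

  # Builds one StatsD line: "<name>:<value>|<type>", where the type is
  # c (counter), g (gauge) or ms (timer).
  def format(name, value, type) when type in [:c, :g, :ms] do
    "#{name}:#{value}|#{type}"
  end

  # Sends a single metric as a UDP datagram. Host and port are assumptions
  # (8125 is the conventional statsd port).
  def send_metric(name, value, type, host \\ {127, 0, 0, 1}, port \\ 8125) do
    {:ok, socket} = :gen_udp.open(0)
    result = :gen_udp.send(socket, host, port, format(name, value, type))
    :gen_udp.close(socket)
    result
  end
end

IO.puts(TinyStatsd.format("api.requests", 1, :c))
```

Because it is UDP, sending is fire-and-forget: the app does not wait for the collector, which is a large part of why this style of metrics is cheap to add.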

We also built a ‘synthetic monitoring’ service in-house to smoke-test our API every 5 minutes. It raises alerts in OpsGenie through HTTP calls if any of our endpoints return unexpected responses.
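
The general shape of such a periodic smoke test can be sketched as a GenServer that reschedules itself. This is not our actual service: the module name `SmokeTest`, the `{name, fun}` check format and the use of `Logger.error/1` in place of a real alerting call are all assumptions for illustration.

```elixir
defmodule SmokeTest do
  # Hypothetical sketch of a periodic smoke test; not a real service.
  use GenServer
  require Logger

  @interval :timer.minutes(5)

  # `checks` is a list of {name, fun} pairs. Each fun is assumed to call
  # one endpoint and return :ok or {:error, reason}.
  def start_link(checks), do: GenServer.start_link(__MODULE__, checks)

  # Runs every check once and returns the names of the ones that failed.
  def failures(checks) do
    for {name, check} <- checks, check.() != :ok, do: name
  end

  @impl true
  def init(checks) do
    # Trigger the first run immediately after startup.
    send(self(), :run)
    {:ok, checks}
  end

  @impl true
  def handle_info(:run, checks) do
    # Placeholder alert: a real service would call its alerting API here.
    Enum.each(failures(checks), fn name ->
      Logger.error("smoke test failed: #{name}")
    end)

    # Schedule the next run.
    Process.send_after(self(), :run, @interval)
    {:noreply, checks}
  end
end
```

Keeping `failures/1` as a plain function makes the checks easy to test without starting the process. Note that a check that raises will crash the GenServer, so it should run under a supervisor.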

fmcgeough · November 30, 2018, 1:38pm

These are all good suggestions. Additionally we use:

distillery for releases
pid_file (https://hex.pm/packages/pid_file) along with systemd on CentOS to restart the app if the process exits on any node.
observer_cli (https://hex.pm/packages/observer_cli) to allow some additional debug capabilities on each machine

I’d use (or pick) something to store metrics in that can provide visualizations. Start small and pump some general data into that store. Add to it over time. We use telegraf currently for that.

sneako · November 30, 2018, 2:07pm

I have been using Statix with Elixometer which all gets sent to Datadog (via their agent/dogstatsd), where we have our monitoring dashboards.

Also Honeybadger for error monitoring, and VictorOps for alerting.

However, just yesterday, I replaced Elixometer with vmstats in one of our applications (following a similar strategy to the one laid out in this blog post). I think I will be using vmstats going forward. It was much easier to set up than elixometer and requires much less configuration.
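
To give a rough idea of the kind of VM-level numbers involved, here is a minimal sketch that reads a few of them using only built-in Erlang functions. It is not how vmstats itself works; the module name `VMSnapshot` and the choice of fields are assumptions for illustration.

```elixir
defmodule VMSnapshot do
  # Hypothetical helper: collects a few BEAM-level gauges using only
  # functions that ship with Erlang/OTP.
  def collect do
    %{
      # Number of processes currently alive on this node.
      process_count: :erlang.system_info(:process_count),
      # Maximum number of processes the node allows.
      process_limit: :erlang.system_info(:process_limit),
      # Total memory currently allocated by the VM, in bytes.
      total_memory_bytes: :erlang.memory(:total),
      # Total length of the scheduler run queues.
      run_queue: :erlang.statistics(:run_queue)
    }
  end
end

IO.inspect(VMSnapshot.collect())
```

Calling something like this on a timer and forwarding each value as a gauge is the basic idea; a sustained rise in `run_queue` or `process_count` is usually the first thing worth graphing.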

binaryseed · December 5, 2018, 4:29am

Over at New Relic, we use the open source agent we have built. It comes with Plug transaction tracing, distributed tracing for micro-services, errors, BEAM stats, function tracing, custom attributes, alerting, etc.

A few framework integrations are in progress: Phoenix and Absinthe. Long term we want to align with the Elixir Telemetry project so we don’t need a bunch of vendor-specific instrumentation packages in the ecosystem.