How do you monitor / profile your live Elixir / Phoenix apps?

Hey everyone,

Having a great time building v2 of our app in Elixir (previously Node). We’re getting to that stage now: how does everyone here monitor their Elixir apps live in production? Are you using New Relic or something else?

13 Likes

Hey there,

Erlang comes with Observer, which you can use to monitor your Elixir app.

A few weeks ago I wrote an article on how to use it and also wrote a Hex package called remote_monitor for it. At least that’s what you can use to monitor your app on demand.

I’m still looking for a service that monitors the app in the background and sends alerts if something goes wrong. I haven’t found anything like that yet.

12 Likes

Very nice! – Switching back and forth between Elixir and Node feels like travelling in time.

3 Likes

We’ve been trying out AppSignal. They have a beta client that works pretty well (use the GitHub master version): https://github.com/appsignal/appsignal-elixir/

It includes error tracking and response times per controller. So far so good, but our app isn’t in production yet.

2 Likes

A few months ago I was using this guide to monitor a Phoenix app with Exometer, StatsD and DataDog.

2 Likes

We have used a combination of elixometer, StatsD, and DataDog with pretty good success. Error notifications with Honeybadger have proven useful as well.

1 Like

Hey!

You might also want to try Prometheus + AlertManager.

Client package on hex.pm: https://hex.pm/packages/prometheus. You’ll also find a list of integrations here (plugs, ecto, phoenix).

3 Likes

Looks interesting, is this production ready?

3 Likes

@madshargreave Yep, I’ve been using it for a while now and I know others use it too (cc @talentdeficit).

If you want to give it a try, please start with the current alpha versions/branches. I don’t expect breaking changes there and everything is covered with tests. Also feel free to ask questions!

1 Like

Awesome, I’ll give it a look.

If possible, you should definitely write a blog post about it. I know lots of people in the Elixir community are desperately looking for a sound monitoring/error reporting solution.

1 Like

I plan a 3-part series (Implementation details, Instrumenting a Phoenix app, and Deploying).

Right now I’m busy writing docs: https://hexdocs.pm/prometheus_ex/1.0.0-alpha3/Prometheus.html. Any help would be appreciated (esp. proofreading)!

5 Likes

How about Graphite (with Grafana, Collectd)? It’s a pretty popular monitoring solution.
I can’t find anything related.

2 Likes

Here are two guides that might help you:

2 Likes

Thanks!
I actually tried it, quick report:
Configuring the exometer integration is hell :frowning:
Incompatible versions, different repos, dozens of version-locked dependencies, very vague documentation. E.g. it took a while to figure out that I need to specify api_key as a charlist ('' instead of "") even though I don’t need it at all.

I fear it’s going to be unmaintainable.

Going to give Prometheus a try.

1 Like

Although I haven’t tried it yet, Wombat sponsored ElixirConf and advertises itself as a monitoring tool that understands BEAM because it runs on BEAM.

1 Like

My Prometheus stuff runs on BEAM too :slight_smile: It even has built-in collectors for VM state introspection (based on erlang:system_info|statistics|memory). Actually no different from Exometer or Folsom.

2 Likes

How’s it going so far?

1 Like

Setup is non-trivial due to the lack of docs, but overall it’s much, much better than the exometer stuff. Once things are put in the right places, stuff just works.
Sidenote 1: It suddenly needs Elixir 1.3 (wtf, even the Erlang module). Works perfectly with 1.3, though maybe something is wrong with my setup.
Sidenote 2: I’m trying to configure Grafana to display these metrics in some useful way. Unfortunately the Prometheus logic is a bit strange here, e.g. getting something as simple as requests per second is hard :frowning: I actually haven’t figured it out yet (rate(metric[1m]) produces a nice chart measured in fictional values).
P.S. If you’re going to write some posts, please include a Grafana configuration example for the related metric(s).

1 Like

Actually, documentation for the Elixir stuff is on hexdocs. It may be, well, short :-), but “lack” is too broad a term here :slight_smile:

  1. I use mix to publish prometheus.erl. I like the experience much more than rebar3_hex_plugin.
  2. After applying rate the units aren’t changed, so if you apply rate to, say, the http_requests_total counter, you’ll get the rate of HTTP requests. However, as noted in the respective docs, you’ll want to use irate here because http_requests_total is rapidly changing.
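
To make the rate vs. irate distinction concrete, here is a minimal Python sketch of the arithmetic, not the actual Prometheus implementation. It assumes that rate is roughly the average per-second increase of a counter over the window and that irate uses only the last two samples; counter resets and Prometheus’ extrapolation at window boundaries are ignored. The function names, the 15-second scrape interval and the sample values are all hypothetical.

```python
from typing import List, Tuple

# (timestamp in seconds, counter value); hypothetical scrape format
Sample = Tuple[float, float]


def window_rate(samples: List[Sample], window_s: float) -> float:
    """rate-like: average per-second increase over the last `window_s` seconds."""
    end_ts = samples[-1][0]
    in_window = [s for s in samples if s[0] >= end_ts - window_s]
    if len(in_window) < 2:
        return 0.0
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


def instant_rate(samples: List[Sample]) -> float:
    """irate-like: per-second increase between the last two samples only."""
    if len(samples) < 2:
        return 0.0
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Assumed scrapes every 15 s: about 1 req/s at first, then a burst of
# 7500 requests during the final 15-second interval.
samples = [(0, 0), (15, 15), (30, 30), (45, 45), (60, 7545)]

print(window_rate(samples, 60))  # burst averaged over the whole minute
print(instant_rate(samples))     # burst measured over the last interval only
```

With these made-up samples the window average is far lower than the value from the last two samples, which is why a short burst looks smaller under rate than under irate.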

I’ve already started on a sample Grafana dashboard with Erlang VM, Plugs, Ecto & Phoenix metrics :-). Stay tuned!

3 Likes

Thanks for your hard work!

Re: rate - both rate and irate produce very similar results. I’ll try to describe what’s wrong.

  1. Do some load testing (e.g. with wrk) to get a nice spike in usage, let’s say 500 req/sec for a specific endpoint.
  2. Wait a bit.
  3. Go to the Grafana dashboard. I’m using the irate(http_requests_total[1m]) metric (tried both rate and irate).
  4. Zoom in and see a nice spike, with the highest value somewhere around 500 req/sec. So far so good.
  5. Zoom out a bit. The spike is ~5 req/sec.
  6. Zoom out a bit more. The spike is ~0.05 req/sec.
    The values in (5) and (6) are some kind of interpolation over the interval. Still, these values don’t make any sense and aren’t useful by any means.
    If you know how to display the real value, it would be really, really helpful.
    Exactly the same story goes for “table view” etc.
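
If the displayed value really is an average over a wider and wider interval, as guessed above, the shrinking spike is plain arithmetic. The Python sketch below only illustrates that guess and is not a diagnosis of this Grafana setup. The 10-second burst duration and the interval lengths are assumptions chosen to show the effect.

```python
def average_rate(total_requests: float, interval_s: float) -> float:
    """Per-second rate when a fixed number of requests is spread over an interval."""
    return total_requests / interval_s


# Assumed load test: 500 req/s sustained for 10 seconds.
burst_requests = 500 * 10

# The same burst averaged over progressively wider (hypothetical) intervals.
for interval_s in (10, 1_000, 100_000):
    print(interval_s, average_rate(burst_requests, interval_s))
```

Under these assumptions the same 5000 requests come out as 500, 5 and 0.05 req/sec. The load itself is unchanged; only the interval it is averaged over grows.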

1 Like