Recommendations for Phoenix/Ecto app metrics platform (alternative to AppSignal)

We’ve been using AppSignal for a while, but we’re now looking for an alternative. I get the impression that rolling something custom is getting easier now with Telemetry, but I’m looking for a (mostly) plug-and-play solution that captures exceptions and gathers performance data. We can allocate time to roll out something of our own, but I would leave that as the last option.

I know of a few services myself, but I’m curious to find out what people recommend.

We are using NewRelic and DataDog at work.

1 Like

I have been using Sentry and Honeybadger with varying amounts of success. They are okay, not amazing, but they get the error reporting job done. I’m not sure they can even report performance metrics.

NewRelic seems to be the norm for many. It’s very feature-rich.

If you’re on Heroku their management tools are quite nice.

1 Like

If it’s not private, could you please share why you’re looking for an alternative to AppSignal?

I will simply say that we’re not happy with the quality of the product and particularly with their customer support; that said, the service is very cheap (probably the cheapest). If you need more details, send me a private message and I can share my personal experience.

2 Likes

I usually end up hosting my own instances of Prometheus and Grafana and then use https://github.com/deadtrickster/prometheus.ex along with all the relevant collectors for out-of-the-box Phoenix+Ecto metrics. I then add my own metrics where appropriate to keep track of business metrics and whatnot.

4 Likes

Can you two (with @EskiMag) include me in the conversation? I’m evaluating options like Svilen does, for several personal projects.

This is worthy of a blog post with a step-by-step tutorial. Consider doing it in the future!

3 Likes

I’ll have to add that to my blog topic list :). I covered error monitoring with Sentry in my latest post if that is of interest https://akoutmos.com/post/error-monitoring-phoenix-with-sentry/

@dimitarvp I took your feedback and posted Part 1 of my 2 part series on monitoring with Prometheus+Grafana:

Part 2 should be out in a couple weeks.

1 Like

@svilen if you’re still looking, and open to trying something new … I’d love it if you check out Logflare. I’ve been using our Logger backend for a while now, which powers the dashboard I use to manage our infrastructure.

From a logging perspective, we’re pretty much on par with the typical solutions out there. From the metrics side, you just log structured data and create dashboards from that.

I’d love to give you a rundown. If you’re interested, perhaps you guys can play a role in shaping the future of the product. I’m working closely with engaged early adopters.

Currently handling about a billion events a week…

Appreciate it!

</pitch>

4 Likes

Hey @svilen, did you find anything interesting? I am quite interested in your findings and decision, as we’re also trying to set up some metrics collection & visualisation and also exception tracking.
To be honest, I am very confused by the whole logger, telemetry, opentelemetry, prometheus etc. topic, so I am looking for any useful information. :slight_smile:

2 Likes

Ok, I will try to straighten it out for you:

  • logger is a new Erlang module in the kernel application that is meant as a general-purpose logging frontend. Right now (Elixir 1.9.x) its messages are intercepted by Elixir’s Logger application and displayed alongside Logger’s own messages. This will change slightly in the next release (Elixir 1.10): the Logger module will become a wrapper over Erlang’s logger and the two will be unified (with a compatibility layer for legacy code).
  • telemetry is a simple event dispatcher. Think of it as a message bus that is meant to dispatch events about applications. In theory we could use logger for that, but telemetry is a much simpler solution (no levels, no formatters, no filters, etc., just messages and handlers).
  • OpenTelemetry is a CNCF project that is meant to bring an inter-technology solution for monitoring (metrics and traces, no logs). The idea is to make it a backend for Erlang’s telemetry, so you will be able to introduce it into your project seamlessly and most things will work almost out of the box.
  • Prometheus is another CNCF project; it is meant to be a TSDB (time-series database) and query language for metrics.

So in the end:

  • If you are writing a library, logger and telemetry are your friends
  • If you are building an application, OpenTelemetry is something you should look into
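
To make the library side of that advice a bit more concrete, here is a minimal sketch of a library function instrumented with `:telemetry.span/3`. It is an illustration under assumptions, not anything from a real project: `MyLib`, `Collector`, the `[:my_lib, :fetch]` event prefix and the metadata keys are all made-up names, and the script assumes Elixir 1.12+ so that `Mix.install/1` is available.

```elixir
# Assumes Elixir 1.12+ (Mix.install) and the :telemetry package from Hex.
Mix.install([{:telemetry, "~> 1.0"}])

# Hypothetical library module: it only emits events and knows nothing
# about who (if anyone) is listening.
defmodule MyLib do
  def fetch(key) do
    # span/3 emits [:my_lib, :fetch, :start] before running the function
    # and [:my_lib, :fetch, :stop] (with a :duration measurement) after it.
    :telemetry.span([:my_lib, :fetch], %{key: key}, fn ->
      result = Map.get(%{a: 1}, key)
      # The function returns {result, metadata_for_the_stop_event}.
      {result, %{key: key, found: result != nil}}
    end)
  end
end

# Hypothetical application-side handler: forwards the event to a process.
defmodule Collector do
  def handle_event(event, measurements, metadata, pid) do
    send(pid, {:event, event, measurements, metadata})
  end
end

:ok =
  :telemetry.attach(
    "collector",
    [:my_lib, :fetch, :stop],
    &Collector.handle_event/4,
    self()
  )

value = MyLib.fetch(:a)

# The handler has already run by the time fetch/1 returns, so the message
# is in the mailbox and a zero timeout is enough.
duration =
  receive do
    {:event, [:my_lib, :fetch, :stop], %{duration: native}, _metadata} -> native
  after
    0 -> nil
  end

IO.inspect(value, label: "value")
IO.inspect(System.convert_time_unit(duration, :native, :microsecond), label: "duration (µs)")
```

The library never decides where the numbers go; an application can attach a handler like `Collector` that forwards them to whatever backend it uses.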

Prometheus is for operations. There are Prometheus backends for telemetry, and I am pretty sure OpenTelemetry will have such a feature as well, so in the end OpenTelemetry will be all you need. However, OpenTelemetry for Erlang is currently still in the pre-alpha phase, so if you want something you can use here and now, check out OpenCensus (the predecessor project; all the people who worked on it are now working on OpenTelemetry).

About error tracking - check out Sentry, it has official support for Elixir and it is pretty decent software. I have used it in the past and it was bliss.

About visualisation - Grafana is the best, unless you want a unified hosted solution, in which case most of the above can be achieved via DataDog.

7 Likes

Excellent. Thanks a lot for your detailed answer. :+1:

Not sure I understand this still. So is telemetry sort of like a lot of people use Redis, RabbitMQ, Kafka? Or Java’s JMS implementations? Basically a stream of events, sort of like a GenServer’s mailbox?

The easiest way that I can explain telemetry is that it is a function dispatcher. Libraries emit events with a given name (a list of atoms), measurements, and metadata. Functions that subscribed to that name are then invoked. The dispatcher also handles errors: a function that fails is detached, so a bad function won’t be invoked again and again.

The functions can do anything you want. Many people do things like emit to StatsD or Prometheus at that point. It all happens inside your Elixir application though, not in an external service like Redis or Kafka. Further, the functions are called inline in the process that emitted the event, so it’s not async by default either.
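
A small sketch of those two properties (inline dispatch and removal of failing handlers) might look like the following. The module name `Handlers`, the handler ids and the event name `[:my_app, :request, :done]` are invented for illustration, and the script assumes Elixir 1.12+ for `Mix.install/1`.

```elixir
# Assumes Elixir 1.12+ (Mix.install) and the :telemetry package from Hex.
Mix.install([{:telemetry, "~> 1.0"}])

defmodule Handlers do
  # Reports which process executed the handler, to show dispatch is inline.
  def report(event, measurements, metadata, reply_to) do
    send(reply_to, {:handled, event, measurements, metadata, self()})
  end

  # Always raises; telemetry should detach it after the first failure.
  def broken(_event, _measurements, _metadata, _config) do
    raise "boom"
  end
end

event = [:my_app, :request, :done]

:ok = :telemetry.attach("report", event, &Handlers.report/4, self())
:ok = :telemetry.attach("broken", event, &Handlers.broken/4, nil)

# Both handlers run here, synchronously, in the calling process. The failure
# in "broken" is caught and logged by telemetry rather than crashing us.
:ok = :telemetry.execute(event, %{duration_ms: 12}, %{path: "/users"})

# No waiting needed: the handler already ran before execute/3 returned.
handler_pid =
  receive do
    {:handled, ^event, _measurements, _metadata, pid} -> pid
  after
    0 -> nil
  end

# Ids of the handlers still attached to the event after the failure.
remaining = for %{id: id} <- :telemetry.list_handlers(event), do: id

IO.inspect(handler_pid == self(), label: "ran in the emitting process?")
IO.inspect(remaining, label: "handlers still attached")
```

Because handlers run in the emitting process, anything slow (a network call to a metrics backend, for instance) is usually better handed off to another process from inside the handler.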

2 Likes