The product itself is excellent for both exception and performance monitoring, and the custom metrics are handy too. It's also well priced: switching over from New Relic saved us a lot of money.
I've pestered the Appsignal team plenty of times for help (or to obnoxiously request features!) and they've always been extremely helpful and responsive. Very happy with the service.
It currently has a bug where with cowboy 2 it reports 404s as errors, but that should be fixed soon.
Currently we use a combination of statsd for raw metrics and open tracing through spandex (https://github.com/spandex-project). All of this gets sent over to datadog. In the past I've used prometheus and grafana and both are good. I actually prefer prometheus to statsd but both are fine. I don't think I could live without open tracing at this point. It's probably not necessary if you're just running 1 or 2 services. But if you end up with a lot of different services, it's invaluable for observability.
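For anyone unfamiliar with what "raw metrics" over statsd look like, here is a minimal sketch of the plain-text datagram format (`name:value|type`, with an optional `|@rate` suffix). It is written in Python only because the wire format is language-agnostic. The metric names, the helper names `format_metric` and `send_metric`, and the localhost:8125 address are assumptions for illustration, not anything from the setup described above.

```python
import socket
from typing import Optional, Union


def format_metric(
    name: str,
    value: Union[int, float],
    metric_type: str,
    sample_rate: Optional[float] = None,
) -> str:
    """Build one statsd line, e.g. 'api.requests:1|c'.

    metric_type is 'c' (counter), 'g' (gauge) or 'ms' (timing).
    """
    line = f"{name}:{value}|{metric_type}"
    # The sample rate is only written out when the metric is actually sampled.
    if sample_rate is not None and sample_rate < 1:
        line += f"|@{sample_rate}"
    return line


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget a single statsd line over UDP (assumed local agent)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names.
    print(format_metric("api.requests", 1, "c"))
    print(format_metric("api.response_time", 320, "ms"))
```

Because the transport is UDP, a missing or slow agent does not block the application, which is a large part of why statsd-style metrics are cheap to sprinkle around.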
Off the shelf apm tools are at their strongest when you only need to monitor / observe 1 application or service. As systems grow I tend to want more control over my monitoring and alerting rules, and a lot of off the shelf apm tools don't give you that sort of power. In the end I prefer tools like datadog or grafana.
We're using a combination of the following for our elixir services:
Statsd via the statix library for stats and metrics
Bugsnag via the bugsnag elixir library for capturing exceptions
Sending logs to an ELK stack via logstash-json. We've started using elastalert to trigger alerts on some specific errors in our Kibana logs. It's nice to be able to configure alerts on existing logs, without touching a line of code
We also built a "synthetic monitoring" service in-house to smoke-test our API every 5 minutes. It raises alerts in OpsGenie through HTTP calls if any of our endpoints return unexpected responses.
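The core of a smoke-test service like that can be quite small. Below is a rough sketch of the checking step, under stated assumptions: the `Check` type, `run_checks`, the endpoint names and the example.com URLs are all hypothetical, the only expectation compared is the HTTP status code, and the scheduling and alerting parts (the 5-minute loop and the OpsGenie call) are deliberately left out.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    name: str
    url: str
    expected_status: int = 200


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status code is still useful.
        return err.code


def run_checks(
    checks: List[Check], fetch: Callable[[str], int] = fetch_status
) -> List[str]:
    """Run every check and return one message per failing endpoint."""
    failures = []
    for check in checks:
        try:
            status = fetch(check.url)
        except Exception as exc:  # DNS failure, timeout, refused connection...
            failures.append(f"{check.name}: request failed ({exc})")
            continue
        if status != check.expected_status:
            failures.append(
                f"{check.name}: expected {check.expected_status}, got {status}"
            )
    return failures


if __name__ == "__main__":
    # Hypothetical endpoints; a real service would run this on a schedule
    # and forward any failures to its alerting system.
    checks = [
        Check("health", "https://example.com/health"),
        Check("orders", "https://example.com/api/orders"),
    ]
    for failure in run_checks(checks):
        print("ALERT:", failure)
```

Passing the `fetch` function in as a parameter keeps the comparison logic testable without making real HTTP calls.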
I'd pick something to store metrics in that can also provide visualizations. Start small and pump some general data into that store. Add to it over time. We use telegraf currently for that.
However, just yesterday, I replaced Elixometer with vmstats in one of our applications (following a similar strategy to the one laid out in this blog post). I think I will be using vmstats going forward. It was much easier to set up than Elixometer and requires much less configuration.
Over at New Relic, we use the open source agent we have built. It comes with Plug transaction tracing, distributed tracing for micro-services, errors, BEAM stats, function tracing, custom attributes, alerting, etc.
In progress are a few framework integrations - Phoenix and Absinthe. Long term we want to align with the Elixir Telemetry project so we don't need a bunch of vendor specific instrumentation packages in the ecosystem.