Any movement on, or interest in, other tracing adapters for Spandex or Telemetry?

Hello!

This is surprisingly my first topic despite lurking for a very long time.

I find myself wanting to add tracing and some form of Application Performance Monitoring (APM) service. The complication is that we are a healthcare company and subject to regulation (HIPAA) that limits what services we can use. For example, New Relic is not really an option. DataDog is a “maybe”, but they charge quite a lot to sign the contracts the regulation requires.

We already have a contract with Sentry and use them for their originally advertised purpose of error monitoring, but they’ve “recently” added general Application Monitoring with tracing support that seems to follow the OpenTelemetry spec. The “official” Sentry library for Elixir does not support this or the “new” way Sentry collects data, and there doesn’t seem to be much movement on that front.

We also use GCP (Google Cloud) and have the contracts needed for them. They have an OpenTelemetry-based product called Cloud Trace (formerly Stackdriver?).

In both cases there aren’t really any Spandex adapters beyond what I’ve seen for DataDog, although it looks like there are some auto-generated GCP clients for Cloud Trace in their elixir-google-api library.

Has anyone seen any Spandex adapters for either of these? Or suggestions for a way to get an APM-like experience? I know about prom_ex since I used to work with akoutmos, but we’re a pretty small team and I’d like to try for a paid service first before spinning up Prometheus and Grafana for this. I’m also not opposed to writing the adapters, it just becomes a harder sell, and I’m not sure yet what Spandex gives you as far as process tracking, and what I’d need to build myself.

Long time no see!

I am currently working on incorporating Grafana Agent into PromEx, which opens the door to using Grafana Cloud, for example, to host both Grafana and Prometheus. No need to have Prometheus poll metrics over the public internet :). Grafana Agent pushes the Prometheus metrics via remote_write. I’m currently using the experimental version at work with Grafana Cloud and it is working beautifully. Doesn’t immediately address your tracing dilemma…but it’s something hah.

Nice :) I’m excited that your library is moving along well. I definitely plan to look at implementing it once we get that far in our monitoring needs.

Ok, after reading up more, I think I maybe answered my own question:

:telemetry is not built with open-telemetry in mind. It’s just a pub-sub-like setup for emitting events to handlers

Spandex looks like it is roughly close to the open-telemetry spec but is not explicitly built for it, does not consume :telemetry information, and at the moment only has an adapter for DataDog.

:opentelemetry and :opentelemetry_api are built with the open-telemetry spec in mind by that community and are in beta for tracing, but it looks like not all of the opentelemetry plugins/handlers/etc. use :telemetry, only the Ecto and Phoenix ones.
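
To make the pub-sub point concrete, here is a minimal sketch of what :telemetry does on its own: a handler is attached to an event name and is called with whatever measurements and metadata the emitter passes. The event name `[:my_app, :request, :stop]`, the handler id, and the `DemoHandler` module are made up for illustration, and `Mix.install/1` assumes Elixir 1.12+ with network access to fetch the package.

```elixir
# Standalone script: fetches and starts the :telemetry package.
Mix.install([{:telemetry, "~> 1.0"}])

defmodule DemoHandler do
  # Handlers are 4-arity functions: event name, measurements, metadata, config.
  # This one just forwards the event to the pid stored in the config.
  def handle_event(event, measurements, metadata, %{reply_to: pid}) do
    send(pid, {:telemetry_event, event, measurements, metadata})
  end
end

# Subscribe: the handler id must be unique, the event name is a list of atoms.
:ok =
  :telemetry.attach(
    "demo-handler",
    [:my_app, :request, :stop],
    &DemoHandler.handle_event/4,
    %{reply_to: self()}
  )

# Publish: handlers run synchronously in the process that calls execute/3.
:telemetry.execute([:my_app, :request, :stop], %{duration: 42}, %{route: "/health"})

received =
  receive do
    {:telemetry_event, event, measurements, metadata} -> {event, measurements, metadata}
  after
    0 -> :nothing_received
  end

IO.inspect(received)
```

Nothing in this is tracing-specific: there are no spans, no trace context, and no exporter. A tracing library has to attach its own handlers and build spans from the events.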


So I guess a more accurate question would be:

If I wanted to do the tracing part of an APM for Google Cloud or Sentry, which both supposedly support OpenTelemetry, does it seem like Spandex or opentelemetry would be the right way to go?

Looks like Spandex still only supports DataDog. Have you come across anything else good for the GCP integration with Profiling or Tracing?

open_telemetry doesn’t have to use telemetry. As you said, telemetry is only PubSub and anyone can hook into it to publish events, including open telemetry.

In any case, there is opentelemetry-beam/opentelemetry_telemetry on GitHub, a bridge library from Telemetry to OpenTelemetry, in case you want to hook both together.

I just saw this post, and though you may have made a decision already, I wanted to mention that an option for your particular case may be Splunk APM (see “Introducing Splunk APM” on the Splunk site), especially because of your HIPAA requirements.

Full disclosure: I work for Splunk on OpenTelemetry.

If anyone comes across this later on, open_telemetry is the route we’ll likely go. New Relic is now offering HIPAA-compliant services, but their minimum spend seems to be USD $25k/yr, which we only found out once we actually got hold of someone knowledgeable about it.

Cloud Trace should work, you’d just need to either run the collector yourself, or figure out if the stackdriver container that runs on the Container Optimized VMs is actually just a collector that you could use instead.

As far as I can tell, Google does not provide an OpenTelemetry Protocol (OTLP) endpoint, but the opentelemetry collector that’s out there for Google should work with it. I haven’t verified that though. I’ve only gotten as far as sending my opentelemetry traces to stdout right now.
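
For reference, the two setups mentioned here (spans printed to stdout, and spans sent over OTLP to a collector you run yourself) look roughly like this in config. This is a sketch assuming the `opentelemetry` and `opentelemetry_exporter` packages at a 1.x version; the key names have changed between releases, so check the docs for the version you have installed. The `localhost:4318` address is an assumption about where your collector listens, and the collector itself would still need a Google Cloud exporter configured on its side.

```elixir
import Config

# Option 1 (e.g. in config/dev.exs): print finished spans to stdout.
# Handy for checking that instrumentation produces spans at all.
config :opentelemetry,
  traces_exporter: {:otel_exporter_stdout, []}

# Option 2 (e.g. in config/prod.exs or config/runtime.exs): batch spans and
# send them over OTLP/HTTP to a collector. Use one option per environment;
# if both are in the same file, the later value for a key wins.
config :opentelemetry,
  span_processor: :batch,
  traces_exporter: :otlp

config :opentelemetry_exporter,
  otlp_protocol: :http_protobuf,
  # Assumed collector address; 4318 is the conventional OTLP/HTTP port.
  otlp_endpoint: "http://localhost:4318"
```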

Another option is Grafana Tempo, which would not require the collector at all, since it has an OTLP endpoint. You’d need to self-host if you have compliance needs though. It’s new enough that other cloud providers don’t offer it, so you’d need to see if Grafana, the company, would offer HIPAA compliance in their hosting.

1 year update:

We used Google Cloud Trace. It’s fine but it’s not an APM and we didn’t feel like running Prometheus.

We were interested in using Honeycomb, but the HIPAA pricing was originally very high. We checked back in with them a while later and they had revised their pricing to sane levels for our volume of data.

We’ve been happily using Honeycomb for the last 5 months or so and no longer need the otel collector like we did with Cloud Trace.
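
For anyone wondering what dropping the collector involves: when the vendor accepts OTLP directly, the exporter is pointed at the vendor instead of a local collector. A rough sketch, not our exact config. It assumes the `opentelemetry_exporter` 1.x config keys, and the endpoint and `x-honeycomb-team` header as I understand them from Honeycomb’s docs, so verify both there. The `HONEYCOMB_API_KEY` environment variable name is made up.

```elixir
# config/runtime.exs (sketch): export straight to the vendor's OTLP endpoint.
import Config

config :opentelemetry,
  span_processor: :batch,
  traces_exporter: :otlp

config :opentelemetry_exporter,
  otlp_protocol: :grpc,
  # Assumed Honeycomb OTLP endpoint; confirm against their current docs.
  otlp_endpoint: "https://api.honeycomb.io:443",
  # API key is read at boot; the env var name here is hypothetical.
  otlp_headers: [{"x-honeycomb-team", System.fetch_env!("HONEYCOMB_API_KEY")}]
```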
