Episode 129 of Thinking Elixir. Tracking, analyzing, and visualizing time series data can add a lot of business value to a project! We met up with Dave Lucia to learn more about TimescaleDB, a PostgreSQL extension that adds time series tools to our regular database. Dave also created the timescale hex package to make it easier to work with hypertables and hyperfunctions. We learn why TimescaleDB makes sense over other options, how to get started with it, example use cases, helpful resources, and more!
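For anyone curious what getting started looks like in practice, a hypertable is just a regular Postgres table that you hand over to TimescaleDB's create_hypertable function, which you can do from a plain Ecto migration. A rough sketch (the readings table and its columns are hypothetical, not from the episode):

```elixir
defmodule MyApp.Repo.Migrations.CreateReadings do
  use Ecto.Migration

  def up do
    # A hypertable generally skips the serial primary key; TimescaleDB
    # partitions rows into chunks by the time column instead.
    create table(:readings, primary_key: false) do
      add :sensor_id, :integer, null: false
      add :temperature, :float
      add :recorded_at, :utc_datetime_usec, null: false
    end

    # Turn the plain table into a TimescaleDB hypertable,
    # chunked on the recorded_at time column.
    execute("SELECT create_hypertable('readings', 'recorded_at')")
  end

  def down do
    # Dropping the table also drops its chunks.
    drop table(:readings)
  end
end
```

From the application's point of view nothing changes: you query the readings table through Ecto as usual, and the chunking happens underneath.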
We’ve been using TimescaleDB at CargoSense for a couple of years now, and it has worked really well for us. In particular, the compression feature gives us a 96%+ compression rate, which makes a huge difference. Concretely, we have roughly a terabyte’s worth of data taking up 55 gigabytes, and for our query patterns we see negligible performance impact on queries against compressed chunks.
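For reference, enabling that compression is a couple of statements of TimescaleDB DDL, which you can run from an Ecto migration. A hedged sketch, assuming the hypothetical readings hypertable from above and a seven-day compression window (both are illustrative choices, not CargoSense's actual settings):

```elixir
defmodule MyApp.Repo.Migrations.CompressReadings do
  use Ecto.Migration

  def up do
    # Enable native compression on the hypertable. Segmenting by
    # sensor_id keeps queries that filter on sensor_id efficient
    # even against compressed chunks.
    execute("""
    ALTER TABLE readings SET (
      timescaledb.compress,
      timescaledb.compress_segmentby = 'sensor_id'
    )
    """)

    # Background policy: compress chunks once their data
    # is older than 7 days.
    execute("SELECT add_compression_policy('readings', INTERVAL '7 days')")
  end

  def down do
    execute("SELECT remove_compression_policy('readings')")
    execute("ALTER TABLE readings SET (timescaledb.compress = false)")
  end
end
```

The segmentby column is worth choosing deliberately: it should match the column you most often filter on, since rows sharing a segmentby value are compressed together.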
For us it’s a sweet spot in the trade-offs: we keep the simplicity of running “just” Postgres, we keep all our transactional guarantees alongside the rest of our code, and we also get to store a bunch of time series data.
The only “gotcha” is that, as with all partitioned tables, you need to keep “relevant time range” values around so that you can constrain queries to the right range. E.g. if you had a sensor and wanted to query its temperature readings for a shipment, you’d want a query like:
Repo.all(
  from r in Reading,
    where: r.sensor_id == ^sensor.id,
    where: r.recorded_at >= ^shipment.starts_at and r.recorded_at <= ^shipment.ends_at
)
The r.recorded_at >= ^shipment.starts_at and r.recorded_at <= ^shipment.ends_at part is important even if a sensor is never used in multiple shipments, because it tells the query planner which chunks to go looking in for readings.
I’m glad to hear you’ve had such a good experience with it as well. And thanks for the query tip. Very helpful!