sabri
Which database for time-series data?
Hi,
I am building a channels-based app. Users will send their latitude and longitude as a message every second, and I need to store every message. There could be hundreds of thousands of users sending their locations, and each message is smaller than 100 bytes.
The question is: which database is recommended? I will be storing all of these messages in real time. Each process (channel) will store a message as soon as it is received, via Task.async.
Please advise!
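For a sense of scale, here is a rough back-of-envelope sketch of the write load described above. The user counts of 100,000 and 500,000 are assumptions standing in for "hundreds of thousands", and 100 bytes is taken as the upper bound on message size; real storage overhead (indexes, replication) is not included.

```python
def ingest_load(users, msgs_per_user_per_sec=1.0, msg_bytes=100):
    """Return (messages per second, bytes per second, bytes per day)."""
    msgs_per_sec = users * msgs_per_user_per_sec
    bytes_per_sec = msgs_per_sec * msg_bytes
    bytes_per_day = bytes_per_sec * 86_400  # seconds in a day
    return msgs_per_sec, bytes_per_sec, bytes_per_day


# Assumed user counts, not figures from the original post.
for users in (100_000, 500_000):
    msgs, per_sec, per_day = ingest_load(users)
    print(f"{users} users: {msgs:,.0f} msg/s, "
          f"{per_sec / 1e6:.0f} MB/s, {per_day / 1e9:,.0f} GB/day")
```

At these rates the raw payload alone is on the order of hundreds of gigabytes per day, which is why the answers below focus on buffering writes and on reducing what gets stored.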
Most Liked Responses
uranther
InfluxDB came to mind because it focuses on storing time-series data. There is also TrailDB which was recently released by AdRoll but may not fit your use case as well.
In general, I would think about appending these messages to a log like Apache Kafka or a message broker like RabbitMQ, and then creating subscribers/consumers which store that data in special-purpose databases like InfluxDB. This allows for more flexibility as your application grows.
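To illustrate the log-plus-consumers idea, here is a minimal in-memory sketch. It is not Kafka or RabbitMQ client code; `AppendOnlyLog`, `Consumer`, and the record fields are hypothetical names chosen for illustration. The point it shows is that each consumer tracks its own offset, so a new destination can be added later without changing the producers.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyLog:
    """Stand-in for a durable log: records are only ever appended."""
    records: list = field(default_factory=list)

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the new record

    def read_from(self, offset, max_records=100):
        return self.records[offset:offset + max_records]


@dataclass
class Consumer:
    """Reads the log at its own pace and copies records to its own sink."""
    name: str
    sink: list = field(default_factory=list)  # stand-in for a database
    offset: int = 0

    def poll(self, log, max_records=100):
        batch = log.read_from(self.offset, max_records)
        self.sink.extend(batch)
        self.offset += len(batch)
        return len(batch)


log = AppendOnlyLog()
for second in range(3):
    log.append({"user_id": 1, "lat": 52.0 + second * 0.001, "lon": 4.0, "ts": second})

# Two independent consumers, e.g. one feeding a time-series store
# and one feeding a search index.
timeseries = Consumer("timeseries-writer")
search = Consumer("search-indexer")
timeseries.poll(log)
search.poll(log, max_records=2)  # a slower consumer simply lags behind
```

A slow or failed consumer does not block the producers or the other consumers; it resumes from its stored offset.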
Qqwy
Are you going to need all that information forever? I think probably not. (Rather, you'd want to create some smoothing algorithm that only stores a location once it is far enough from a user's previous location.)
If I am correct in that assumption, I think you could store the initial, real-time information using Mnesia (Erlang's built-in in-memory database, which is made for concurrent reads and writes) and, after processing this information, store the smoothed values in a more conventional relational database such as PostgreSQL.
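One possible reading of "only store a location once it is far enough from the previous one" is a simple distance threshold, sketched below. The 25 m threshold, the `MovementFilter` name, and the use of the haversine formula are assumptions for illustration; a real smoothing step might also account for GPS accuracy or elapsed time.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


class MovementFilter:
    """Keeps a point only if it is far enough from the last kept point."""

    def __init__(self, min_distance_m=25.0):
        self.min_distance_m = min_distance_m
        self._last_kept = {}  # user_id -> (lat, lon)

    def should_store(self, user_id, lat, lon):
        last = self._last_kept.get(user_id)
        if last is not None and haversine_m(*last, lat, lon) < self.min_distance_m:
            return False
        self._last_kept[user_id] = (lat, lon)
        return True


f = MovementFilter(min_distance_m=25.0)
f.should_store("u1", 52.0, 4.0)      # first point for a user is always kept
f.should_store("u1", 52.00001, 4.0)  # about 1 m away, dropped
f.should_store("u1", 52.001, 4.0)    # about 111 m away, kept
```

Comparing against the last kept point, rather than the last received point, means slow drift still gets recorded once it adds up to the threshold.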
mkunikow
Yes, I agree. Apache Kafka is the best message queue right now, and it scales very easily by adding more nodes if needed.
But the main question is what you will do with this data: how will you process it, and how do you want to access it later?
With Kafka you can have many consumers that process the data and put it into different destinations.
There was an interesting article on Ars Technica: Power tools: Sorting through the crowded specialized database toolbox
You could also put the data into Apache Solr or Elasticsearch if you want it optimized for search.
If you are only interested in querying data over time, maybe InfluxDB is a good fit for you.
Overview of InfluxDB
Summary:
Always optimize how you store data for how you will access it later.