Handling "Firehose" Data Streams within Elixir

I don’t want to come off as incompetent, but “he who asks a question is a fool for five minutes, but he who does not remains a fool forever”.

I’ve never handled “streaming” data before. I know the basics of making an async request and keeping the connection open. However, I decided to use a library, HTTPoison, to handle my async requests. It seems that as soon as I initiate the request, I get completely flooded with data, and if I delegate it to another process, that process ends up dying. Is there a way to buffer the incoming data (or dish it out from one process to, say, n processes) for data ingestion and storage (the thing I eventually want to do)?

I know I could use RabbitMQ, Kafka, Apache Spark, etc. to do this… but I want this as a “learning” experience, just to get a sense of how the underlying technology works and how to implement it. This is step one in a long series of steps in my learning. So if anyone could point me to a resource/book that discusses buffering/batching incoming requests (or just handling a large number of requests and “splitting” them), that would be awesome.

Hi, you might want to search for GenStage examples. There’s a blog post (and similar forum posts) about using it for Twitter’s Firehose, which sounds exactly like your use case.
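
To give a rough idea of what that looks like, here is a minimal sketch, assuming the gen_stage package (pulled in with Mix.install, so Elixir 1.12 or later). The module names Firehose.Producer and Firehose.Consumer, the push/1 helper, the three consumers and the max_demand of 10 are all made up for illustration. The producer keeps incoming events in a queue and only hands them out when a consumer asks for more, so a slow consumer is never flooded.

```elixir
Mix.install([{:gen_stage, "~> 1.2"}])

defmodule Firehose.Producer do
  use GenStage

  def start_link(_opts), do: GenStage.start_link(__MODULE__, :ok, name: __MODULE__)

  # Hypothetical entry point: the process that receives the raw HTTP chunks
  # would call this once per event.
  def push(event), do: GenStage.cast(__MODULE__, {:push, event})

  @impl true
  def init(:ok), do: {:producer, {:queue.new(), 0}}

  # A new event arrived: buffer it, then emit whatever demand allows.
  @impl true
  def handle_cast({:push, event}, {queue, demand}) do
    dispatch(:queue.in(event, queue), demand, [])
  end

  # A consumer asked for more events: remember the demand and try to fill it.
  @impl true
  def handle_demand(incoming, {queue, demand}) do
    dispatch(queue, demand + incoming, [])
  end

  # Emit buffered events until the demand or the queue runs out.
  defp dispatch(queue, 0, events), do: {:noreply, Enum.reverse(events), {queue, 0}}

  defp dispatch(queue, demand, events) do
    case :queue.out(queue) do
      {{:value, event}, queue} -> dispatch(queue, demand - 1, [event | events])
      {:empty, queue} -> {:noreply, Enum.reverse(events), {queue, demand}}
    end
  end
end

defmodule Firehose.Consumer do
  use GenStage

  def start_link(id), do: GenStage.start_link(__MODULE__, id)

  # Each consumer asks the producer for at most 10 events at a time.
  @impl true
  def init(id) do
    {:consumer, id, subscribe_to: [{Firehose.Producer, max_demand: 10}]}
  end

  # This is where parsing and storage would go; here we only log the batch size.
  @impl true
  def handle_events(events, _from, id) do
    IO.puts("consumer #{id} received #{length(events)} events")
    {:noreply, [], id}
  end
end

{:ok, _producer} = Firehose.Producer.start_link([])
for id <- 1..3, do: {:ok, _consumer} = Firehose.Consumer.start_link(id)

# Simulate a burst of incoming data.
Enum.each(1..100, &Firehose.Producer.push/1)
Process.sleep(500)
```

Note that the queue in this sketch is unbounded, so if the firehose stays faster than your consumers for long enough, memory will still grow. For that you would also need to slow down the source (I believe HTTPoison has an `async: :once` option together with `HTTPoison.stream_next/1` for pulling one message at a time, but check its docs) or drop events once the queue passes some size.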



Just wondering what you ended up using for streaming data?


  • fellow newbie =)