Handling "Firehose" Data Streams within Elixir

Hey Everyone,

I don’t want to come off as incompetent, but “he who asks a question is a fool for five minutes, but he who does not remains a fool forever”.

Anyways, continuing:

I’ve never handled “streaming” data before. I know the basics of making an async request and keeping the connection open. However, I decided to use a library, HTTPoison, to handle my async requests. It seems that as soon as I initiate the request, I get completely flooded with data, and if I delegate it to another process, that process ends up dying. Is there a way to buffer the incoming data (or dish it out from one process to, say, n processes) for data ingestion and storage (the thing I eventually want to do)?

I know I could use RabbitMQ, Kafka, Apache Spark, etc. to do this… but I want this as a “learning” experience, just to know how the underlying technology works and how to implement it. This is step one in a long series of steps of me learning. So if anyone could point me to a resource/book that discusses buffering/batching the incoming requests (or just handling a large amount of requests and “splitting it”), that would be awesome.

I’ll fully admit… Web Developer newbie. Never had to deal with streaming data… so take it easy on me

Thanks in advance!

Hi, you might want to search for GenStage examples. There’s a blog post (and similar forum posts) about using it for Twitter’s Firehose, which sounds exactly like your use case.
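
To give a feel for the idea before you dig into GenStage, here is a minimal sketch using only a plain GenServer. It is not GenStage's API and not what any particular blog post does: the module names `Firehose.Buffer` and `Firehose.Worker`, the drop-oldest policy when the buffer is full, and the sizes are all assumptions for illustration. The point is that workers pull batches when they are ready, instead of having data pushed at them faster than they can handle it.

```elixir
defmodule Firehose.Buffer do
  @moduledoc "Hypothetical bounded buffer: the stream receiver pushes, workers pull."
  use GenServer

  def start_link(max) when is_integer(max) and max > 0,
    do: GenServer.start_link(__MODULE__, max)

  # Called by the process receiving the stream. Never blocks the caller.
  def push(buf, event), do: GenServer.cast(buf, {:push, event})

  # Called by workers. Returns up to `n` events, oldest first (possibly []).
  def take(buf, n), do: GenServer.call(buf, {:take, n})

  @impl true
  def init(max), do: {:ok, %{queue: :queue.new(), size: 0, max: max}}

  @impl true
  def handle_cast({:push, event}, %{size: size, max: max} = state) when size >= max do
    # Buffer full: drop the oldest event so memory stays bounded (assumed policy).
    {_dropped, queue} = :queue.out(state.queue)
    {:noreply, %{state | queue: :queue.in(event, queue)}}
  end

  def handle_cast({:push, event}, state) do
    {:noreply, %{state | queue: :queue.in(event, state.queue), size: state.size + 1}}
  end

  @impl true
  def handle_call({:take, n}, _from, state) do
    count = min(n, state.size)
    {batch, rest} = :queue.split(count, state.queue)
    {:reply, :queue.to_list(batch), %{state | queue: rest, size: state.size - count}}
  end
end

defmodule Firehose.Worker do
  # Pulls batches until the buffer is empty and returns everything it handled.
  # A long-running worker would wait and retry on [] instead of stopping.
  def run(buf, batch_size, acc \\ []) do
    case Firehose.Buffer.take(buf, batch_size) do
      [] -> acc
      batch -> run(buf, batch_size, acc ++ batch)
    end
  end
end

# Usage: one buffer, four workers pulling batches of 10.
{:ok, buf} = Firehose.Buffer.start_link(1_000)
Enum.each(1..100, &Firehose.Buffer.push(buf, &1))

results =
  1..4
  |> Enum.map(fn _ -> Task.async(fn -> Firehose.Worker.run(buf, 10) end) end)
  |> Task.await_many()

IO.inspect(Enum.map(results, &length/1), label: "events handled per worker")
```

In a real setup, the process receiving the HTTP chunks would call `push/2`, and each worker would do the parsing and storage for its batch. GenStage gives you this pull-based flow with the demand tracking already done for you, so treat the sketch as a way to understand the mechanism rather than something to ship.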

Hi!

Just wondering what you ended up using for streaming data?

Thanks!

- fellow newbie =)