I’m doing some computation on loads of Postgres data, and I have two streams of data (Repo.stream/2) coming from different tables that I’d like to merge into a single output stream containing the computed values.
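For concreteness, here is a minimal sketch of that starting point. The repo, table, and column names are all invented; what’s real is that Repo.stream/2 on Postgres has to be consumed inside a transaction, and a long run needs a raised timeout:

```elixir
import Ecto.Query

MyApp.Repo.transaction(
  fn ->
    # Both queries are ordered by their date column; this matters later.
    readings =
      MyApp.Repo.stream(
        from(r in "readings", order_by: r.date, select: %{date: r.date, value: r.value})
      )

    events =
      MyApp.Repo.stream(
        from(e in "events", order_by: e.date, select: %{date: e.date, kind: e.kind})
      )

    # Both streams must be consumed before this function returns,
    # because the underlying cursors close with the transaction.
    :ok
  end,
  timeout: :infinity
)
```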
No, they don’t; the two streams return different numbers of records.
But all entries have a date key, and since I need to compute day by day, my idea was to chunk both streams by date (using Stream.chunk_by), which should give me approximately the same number of chunks, one per day, although it’s possible that some days have data in only one of the streams.
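As a sketch of that idea, assuming each row is a map with a :date key (as in the queries above) and each stream is ordered by date, so that one day’s rows arrive contiguously:

```elixir
defmodule Daily do
  # Turns a date-ordered row stream into a lazy stream of {date, rows}
  # tuples, one per day present in the stream.
  def by_day(stream) do
    stream
    |> Stream.chunk_by(& &1.date)
    |> Stream.map(fn [%{date: date} | _] = rows -> {date, rows} end)
  end
end
```

The catch is lining the two chunk streams up afterwards: Stream.zip/2 pairs chunks positionally, so it would silently pair the wrong days as soon as one stream skips a day.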
That could be an idea.
But given the complexity of the processing, I would end up with an unmaintainable piece of s…QL.
If possible, I’d prefer to stick with the Elixir Stream APIs.
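Staying within the stdlib, one way to sidestep the alignment problem is to merge the two row streams into a single date-ordered stream first, and only then chunk by day. There is no ready-made Stream.merge, but the Enumerable suspension protocol is enough to sketch one; this is illustrative, not battle-tested:

```elixir
defmodule SortedMerge do
  # Lazily merges two enumerables that are each already sorted. `le?` must
  # return true when its first argument should be emitted before the second.
  def merge(left, right, le?) do
    Stream.resource(
      fn -> {pull({:init, left}), pull({:init, right})} end,
      fn
        {:done, :done} = state -> {:halt, state}
        {{x, l}, :done} -> {[x], {pull(l), :done}}
        {:done, {y, r}} -> {[y], {:done, pull(r)}}
        {{x, l}, {y, r}} ->
          if le?.(x, y),
            do: {[x], {pull(l), {y, r}}},
            else: {[y], {{x, l}, pull(r)}}
      end,
      fn state -> state |> Tuple.to_list() |> Enum.each(&halt_side/1) end
    )
  end

  # Pull a single element, returning {element, handle} or :done.
  defp pull({:init, enum}) do
    enum
    |> Enumerable.reduce({:cont, nil}, fn x, _acc -> {:suspend, x} end)
    |> handle()
  end

  defp pull({:suspended, cont}), do: cont.({:cont, nil}) |> handle()

  defp handle({:suspended, x, cont}), do: {x, {:suspended, cont}}
  defp handle({:done, _acc}), do: :done
  defp handle({:halted, _acc}), do: :done

  # Dispose of a side that still holds a pending continuation.
  defp halt_side({_x, {:suspended, cont}}), do: cont.({:halt, nil})
  defp halt_side(_), do: :ok
end
```

Used together with the Daily.by_day/1 sketch from above (still inside the Repo.transaction, and with compute/2 standing in for the real per-day work):

```elixir
SortedMerge.merge(readings, events, fn a, b ->
  # Date.compare/2 rather than <=, because structural comparison
  # orders %Date{} structs incorrectly.
  Date.compare(a.date, b.date) != :gt
end)
|> Daily.by_day()
|> Stream.map(fn {date, rows} -> compute(date, rows) end)
|> Stream.run()
```

Each {date, rows} chunk then contains whatever either table had for that day, so days present in only one stream come through naturally.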
If I were you, I’d devise a temporary record type in the DB that describes things along the lines of “this piece of data has 7 of the 10 pieces it needs to be complete”, and just process both streams independently.
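A hedged sketch of what that could look like; every name below (MyApp.Repo, daily_progress, pieces_seen, pieces_needed) is invented, and it assumes a unique index on the date column:

```elixir
defmodule MyApp.DailyProgress do
  use Ecto.Schema

  # Scratch table tracking how many of a day's expected pieces have arrived.
  schema "daily_progress" do
    field :date, :date
    field :pieces_seen, :integer, default: 0
    field :pieces_needed, :integer
  end

  # Called from each stream independently; the upsert relies on the
  # unique index on :date.
  def record_piece(day, needed) do
    MyApp.Repo.insert(
      %__MODULE__{date: day, pieces_seen: 1, pieces_needed: needed},
      on_conflict: [inc: [pieces_seen: 1]],
      conflict_target: :date
    )
  end
end
```

A day is ready to compute once pieces_seen reaches pieces_needed, so neither stream ever has to wait for the other.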