I am looking to build a system that can ingest and persist roughly 1,000 events per second from Phoenix channels and compute statistics over the events across different time windows, for an IoT project I’m doing.
For example, given a bunch of temperature sensors scattered around a room, the system would compute, over windows of 1 minute, 1 hour, 6 hours, and 1 day, the minimum, maximum, and opening/closing mean temperatures of each window, and report them to a web dashboard in real time as events come in.
The statistics reported cannot be approximate (so approximation techniques like moving/rolling averages cannot be used for these aggregated statistics).
The events are currently stored in a Postgres table, and I was thinking that upon receiving an event I could recompute the means/mins/openings/closings with a Postgres aggregation query, though that would obviously be very taxing on the system, as the aggregation query would have to scan all events a thousand times per second.
Another method I was considering was keeping a running count of the events (total count) and accumulating the temperature values in a counter on a separate Postgres table (to then compute the mean). I’m worried, however, that this accumulator would eventually exceed the column’s data type limit and become a pain to handle in the database.
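One way around the accumulator-overflow worry (just a sketch, not something I've tried in production) would be to store a running mean instead of a running sum and update it incrementally; the stored mean stays within the range of the readings themselves, so there is no ever-growing sum to overflow a column:

```python
def update_mean(mean: float, count: int, value: float) -> tuple[float, int]:
    """Incremental mean update: new_mean = mean + (value - mean) / (count + 1).

    The state stays bounded by the data's own range, so nothing grows
    without bound the way a raw running sum does.
    """
    count += 1
    mean += (value - mean) / count
    return mean, count

# Usage: fold a stream of readings into (mean, count) state.
mean, count = 0.0, 0
for reading in [20.0, 22.0, 21.0]:
    mean, count = update_mean(mean, count, reading)
```

This is still exact up to floating-point rounding; in Postgres itself a `numeric` column would sidestep the integer-range issue entirely, at some performance cost.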
Timing accuracy and persistence are also major concerns if I were to do everything in memory (Redis, ETS). I’d rather not reach for a time-series database with continuous queries, though if anyone knows a very simple way to plug one in, I’d be very interested.
Does anyone have ideas on how to architect such a system?