Measuring & Tuning GenStage/Flow performance

I have never put a Flow application into production, so I was wondering what others are doing to measure and tune GenStage- and Flow-specific metrics. I’ve never seen anything specifically targeted at this, and in my experience with GenStage/Flow it can be quite challenging to get right. Beyond the usual CPU and memory metrics, what would you use to tune and monitor your Flow application?

Basically, I’d like to know what others are doing to 1) develop and tune their Flow application to get the most performance out of it, and 2) monitor and understand that performance once it is deployed in production.

The metrics I would be interested in are along the lines of:

  • For each producer or producer/consumer:
    • Event count buffer - average, p90 and p100
    • Time to generate new events - average & p90
    • Number of events generated - average
    • Demand count buffer - average, p90 and p100
  • For each consumer or producer/consumer:
    • Event counts processed - average, p90
    • Time to process each event - average & p90
    • Some way to determine the distribution of work (not sure about this one): are one or two workers doing everything, or is the work evenly spread?
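To make the list above a bit more concrete, here is a minimal sketch of how such numbers could be summarised once the raw samples exist. `MetricSummary`, `summarize/1` and `work_share/1` are hypothetical names of my own, not part of GenStage or Flow, and the sketch assumes you have already collected the samples (buffer sizes, per-event durations, or one worker id per processed event) yourself. The p90 uses the nearest-rank method, which is only one of several percentile definitions.

```elixir
defmodule MetricSummary do
  # Hypothetical helper module, not part of GenStage or Flow.

  # Summarise a non-empty list of numeric samples (e.g. buffer sizes or
  # per-event durations) as average, p90 and p100 (the maximum).
  def summarize(samples) when is_list(samples) and samples != [] do
    sorted = Enum.sort(samples)
    count = length(sorted)

    %{
      average: Enum.sum(sorted) / count,
      p90: percentile(sorted, count, 90),
      p100: List.last(sorted)
    }
  end

  # Share of events handled by each worker, given one worker id per event.
  # A heavily skewed map suggests that a few workers are doing most of the work.
  def work_share(worker_ids) when is_list(worker_ids) and worker_ids != [] do
    total = length(worker_ids)

    worker_ids
    |> Enum.frequencies()
    |> Map.new(fn {worker, n} -> {worker, n / total} end)
  end

  # Nearest-rank percentile: the value at 1-based rank ceil(p * n / 100),
  # computed with integer arithmetic to avoid float rounding.
  defp percentile(sorted, count, p) do
    rank = div(p * count + 99, 100)
    Enum.at(sorted, max(rank, 1) - 1)
  end
end

summary = MetricSummary.summarize(Enum.to_list(1..10))
IO.inspect(summary)

IO.inspect(MetricSummary.work_share([:a, :a, :a, :b]))
```

In the second call, worker `:a` handled three of the four events, which is the kind of skew the last bullet is asking about.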

I understand you could simply wrap each of the implementations in :timer.tc(fn -> ... end) (or do something fancier with a macro), but I’m more curious about what people are actually using and which metrics they track to tune and monitor.
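For reference, the `:timer.tc` approach could be sketched roughly as follows. `Timed.measure/2` is a hypothetical wrapper, and `handler` stands in for whatever the body of a stage callback does with a batch of events; nothing here is GenStage or Flow API, only Erlang's `:timer.tc/1`, which returns `{microseconds, result}`.

```elixir
defmodule Timed do
  # Hypothetical wrapper; `handler` stands in for the body of a stage
  # callback that receives a list of events.

  # Runs `handler` on `events` and returns {result, stats}, where stats
  # holds the batch size and the elapsed wall-clock time in microseconds.
  def measure(events, handler) when is_list(events) and is_function(handler, 1) do
    {micros, result} = :timer.tc(fn -> handler.(events) end)
    count = length(events)

    stats = %{
      event_count: count,
      total_us: micros,
      # Guard against division by zero for empty batches.
      per_event_us: micros / max(count, 1)
    }

    {result, stats}
  end
end

{doubled, stats} =
  Timed.measure([1, 2, 3], fn events -> Enum.map(events, &(&1 * 2)) end)

IO.inspect(doubled)
IO.inspect(stats)
```

The stats maps would still need to be collected somewhere and aggregated over time, which is really the part I am asking about.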


I would be very interested to find out what path you took with this. What performance monitoring did you add? Cheers!