I’m expanding my understanding of genstage and how to build systems that are more resilient to overload. The GenStage docs demonstrate use of the Erlang :queue module… however, I am not clear on why that is being demonstrated. The :queue module seems to offer some convenience functions for working with lists, but it also has some inconveniences. E.g. it can’t work with a range like 1..5 – enumerables need to be converted into explicit lists, and certain operations like split fail if the queue is too small, so its use requires some “defensive coding”. Under the hood, an Erlang :queue is just a list, so I would expect the performance characteristics to be pretty much the same, and I would expect both to be equally susceptible to being over-stuffed.
I tried 2 variants of the same GenStage producer just to get a bit more familiar with :queue, and I think the 2 variants are functionally equivalent (although, the “defensive coding” required to use :queue makes me favor the simple list implementation).
Perhaps the conversation should be more about identifying the inputs to your system that could be “3rd rails” and overload it… A simple list (or a :queue) is acceptable for buffering demand or cases where the data stored in-memory is expected to remain tangibly small, but not when the input might be huge and unbounded. Anyhow, using :queue seems to be a distraction.
That’s not the case. Under the hood :queue uses a tuple with two separate lists, which has rather different performance characteristics for the usecase of a fifo queue (hence why :queue exists). With just a plain list either your queuing or dequeuing would be expensive due to how linked lists work. Queue shifts that by making both generally inexpensive at the expense of needing to rebalance head and tail ever so often.
But yes sheer size will eventually become a problem no matter how you store items.
:queue good. Use :queue in some GenServer and Broadway (i.e. GenStage) state.
I don’t feel like it required an excess of “defensive coding”. Do you mean checking if you got something out of it or not? You’re not doing that with a list?
Is it a good idea to use len? You could write yourself a recursive wrapper around out
defmodule Queue do
@doc """
Take `count` items from the front of the queue.
Returns a list of items and the new queue: `{items, queue}`. If there are
less than `count` items in the queue, all of them will be returned. If the
queue is empty, an empty list is returned.
"""
@spec take(count :: pos_integer, :queue.queue()) :: {list, :queue.queue()}
My point was only that with the addition of the relatively unfamiliar (?) :queue module, the GenStage examples may veer into discussions about data structures instead of focusing on how stages should be organized. I would argue that understanding how to properly organize stages is more important to GenStage’s rai·son d’ê·tre, no?
It’s not as black and white. Yes simpler examples are generally a good thing. But using a list in the examples in question is not just making the examples simpler, but also a surefire way for people shooting themselves in the foot, when they copy from the example. Using a list to model a fifo queue is plain not what you want to do if this is meant to hold more than a handful of items.
Thanks for this discussion. I’ve often wondered about the utility of :queue myself. It seems easy enough to use lists for this with the understanding that enqueueing would be expensive.
I don’t think saving a few seconds or minutes is a valid reason to use the wrong data structure. It harms performance, clarity and professional growth.