:queue vs genstage state

I’m expanding my understanding of genstage and how to build systems that are more resilient to overload. The GenStage docs demonstrate use of the Erlang :queue module… however, I am not clear on why that is being demonstrated. The :queue module seems to offer some convenience functions for working with lists, but it also has some inconveniences. E.g. it can’t work with a range like 1..5 – enumerables need to be converted into explicit lists, and certain operations like split fail if the queue is too small, so its use requires some “defensive coding”. Under the hood, an Erlang :queue is just a list, so I would expect the performance characteristics to be pretty much the same, and I would expect both to be equally susceptible to being over-stuffed.

I tried 2 variants of the same GenStage producer just to get a bit more familiar with :queue, and I think the 2 variants are functionally equivalent (although, the “defensive coding” required to use :queue makes me favor the simple list implementation).

Perhaps the conversation should be more about identifying the inputs to your system that could be “3rd rails” and overload it… A simple list (or a :queue) is acceptable for buffering demand or cases where the data stored in-memory is expected to remain tangibly small, but not when the input might be huge and unbounded. Anyhow, using :queue seems to be a distraction.

Thoughts?

That’s not the case. Under the hood :queue uses a tuple with two separate lists, which has rather different performance characteristics for the usecase of a fifo queue (hence why :queue exists). With just a plain list either your queuing or dequeuing would be expensive due to how linked lists work. Queue shifts that by making both generally inexpensive at the expense of needing to rebalance head and tail ever so often.

But yes sheer size will eventually become a problem no matter how you store items.

2 Likes

:queue good. Use :queue in some GenServer and Broadway (i.e. GenStage) state.

I don’t feel like it required an excess of “defensive coding”. Do you mean checking if you got something out of it or not? You’re not doing that with a list?

That makes sense. Thank you for the explanation!

E.g. compare:

  def handle_demand(demand, queue) when demand > 0 do
    {events, new_state} =
      case :queue.len(queue) > demand do
        true ->
          {head_queue, tail_queue} = :queue.split(demand, queue)
          {:queue.to_list(head_queue), tail_queue}

        false ->
          {:queue.to_list(queue), :queue.new()}
      end

    {:noreply, events, new_state}
  end

to

  def handle_demand(demand, state) when demand > 0 do
    {events, new_state} = Enum.split(state, demand)
    {:noreply, events, new_state}
  end

Is it a good idea to use len? You could write yourself a recursive wrapper around out

defmodule Queue do
  @doc """
  Take `count` items from the front of the queue.
  
  Returns a list of items and the new queue: `{items, queue}`.  If there are
  less than `count` items in the queue, all of them will be returned.  If the
  queue is empty, an empty list is returned.
  """
  @spec take(count :: pos_integer, :queue.queue()) :: {list, :queue.queue()}

Well, you see what I mean: it requires additional work.

More code for a more appropriate and well working data structure though. You’re comparing apples to oranges here.

1 Like

My point was only that with the addition of the relatively unfamiliar (?) :queue module, the GenStage examples may veer into discussions about data structures instead of focusing on how stages should be organized. I would argue that understanding how to properly organize stages is more important to GenStage’s rai·son d’ê·tre, no?

It’s not as black and white. Yes simpler examples are generally a good thing. But using a list in the examples in question is not just making the examples simpler, but also a surefire way for people shooting themselves in the foot, when they copy from the example. Using a list to model a fifo queue is plain not what you want to do if this is meant to hold more than a handful of items.

1 Like

Thanks for this discussion. I’ve often wondered about the utility of :queue myself. It seems easy enough to use lists for this with the understanding that enqueueing would be expensive.

#enqueue
q = 1..5 |> Enum.reduce([], fn i, q -> q ++ [i] end)
#dequeue
[hd | new_q] = q

I suspect this only matters beyond a certain size, which is where :queue becomes important.

No it’s a matter of behaviour – what you’re doing in that code example is popping off the top of a stack

Edit: nevermind, didn’t notice you were concatenating lists in your reduce function body.

1 Like

One should practice good habits at all times.

I don’t think saving a few seconds or minutes is a valid reason to use the wrong data structure. It harms performance, clarity and professional growth.

1 Like

FTR, I wasn’t trying to argue FOR my example over using :queue just stating why from a naive perspective it does seem puzzling.