Discussion about uses for Agent Processes

OvermindDL1 · April 11, 2017, 5:04pm

Huh, I had no issues with GenServer either when I hit Erlang a decade ago, and I had been doing heavy C++ for over a decade by then. Calls have reply messages. Casts are fire-and-forget. Etc… etc… It all seemed pretty obvious?

If they can use the process dictionary at all then they should not be using agents and instead should be threading their state through their process calls, like they are ‘already’ doing with the Agent’s PID. This does not seem like a use case for agents at all to me?

josevalim · April 11, 2017, 5:17pm

For you? Sure, it seems it was the case. For everyone, I highly doubt.

We are talking about misuses of Agents. So yes, they should not be using agents nor the process dictionary. I was wondering which one is less problematic.

net · April 11, 2017, 5:36pm

I have never used an Agent either.

To me, “Agent” is a bit of a misnomer. “Agent” conveys to me a sense of doing, rather than holding.

sasajuric · April 11, 2017, 6:08pm

The call/cast dilemma also needs to be understood with agents (because there’s a cast function).

Thanks to Elixir, some GenServer aspects can be ignored initially (I bet you wished there was Elixir when you started learning Erlang). With use GenServer, basic usage requires start_link, call, and cast (and the respective callbacks). In fact, a simple explanation could just stick with call, and that’s already in many ways on par with agents.
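As an illustration of how little is needed, here is a minimal sketch of a GenServer that uses only start_link, call, and cast. The Counter module and its function names are hypothetical and not taken from the thread.

```elixir
defmodule Counter do
  use GenServer

  # Client API
  def start_link(initial \\ 0), do: GenServer.start_link(__MODULE__, initial)

  # cast: fire-and-forget, returns :ok immediately
  def increment(pid), do: GenServer.cast(pid, :increment)

  # call: blocks until the server replies
  def value(pid), do: GenServer.call(pid, :value)

  # Server callbacks
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}

  @impl true
  def handle_call(:value, _from, count), do: {:reply, count, count}
end

{:ok, pid} = Counter.start_link()
:ok = Counter.increment(pid)
Counter.value(pid) # => 1, since messages from one sender arrive in order
```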

If we’re talking about misuses, I think that procdict is a lesser evil here because:

It gives better runtime properties (no copying, no scheduler overhead, no potentially leaked processes)
procdict should be a stronger hint that you’re probably doing something wrong

josevalim · April 11, 2017, 6:51pm

Good points, thanks!

mjadczak · April 11, 2017, 10:31pm

Just on this topic, are there any guidelines on when using the procdict is actually appropriate?

OvermindDL1 · April 11, 2017, 10:33pm

In my opinion, only for hidden ubiquitous state, the way the random module and gettext use it. It saves callers from holding state and passing it into those functions. But that absolutely should be minimized, as holding on to and passing state is superior in almost all cases, except those trivial ones that earn the exception through their ubiquity and ease of use.

sasajuric · April 12, 2017, 8:09am

I agree with everything @OvermindDL1 said, but I’d like to expand on his comment about passing state vs procdict (or agent).

It has been hinted in this thread that agents might be helpful when it’s tedious to thread some piece of info (accumulator) through a deeper chain of nested functions. As I said, I do think that procdict is not as bad as agents for this, but a much better solution is IMO usually to pass the data around.

For example, let’s say you have a bunch of functions which maintain foo and bar, meaning they accept foo and bar, and return a tuple of {foo, bar}. Now, you want to add baz, and it feels like a huge amount of tedious work to add this piece of data, so it might be tempting to use agent or procdict.

The proper course of action would be to bundle these fields into a map, and then you have functions which take a map, and return a transformed version of it. Adding additional pieces of data to this map is then simple. Such a map with its associated functions then captures some concept explicitly. If the logic around this data grows (you find you need more fields and/or functions working on the map), it’s likely time to extract that code into a separate module (and maybe turn the map into a struct).
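A rough sketch of that idea, assuming a hypothetical Stats module whose fields foo, bar, and baz simply mirror the names used above. Every function takes the map and returns a transformed map, so adding baz does not change any existing signature.

```elixir
defmodule Stats do
  # Hypothetical accumulator; the field names are illustrative only.
  def new, do: %{foo: 0, bar: [], baz: nil}

  # Each function accepts the whole map and returns an updated copy.
  def add_foo(acc, n), do: %{acc | foo: acc.foo + n}
  def push_bar(acc, item), do: %{acc | bar: [item | acc.bar]}

  # Added later: callers of add_foo/2 and push_bar/2 are unaffected.
  def set_baz(acc, value), do: %{acc | baz: value}
end

acc =
  Stats.new()
  |> Stats.add_foo(2)
  |> Stats.push_bar(:x)
  |> Stats.set_baz("late addition")
# acc is %{foo: 2, bar: [:x], baz: "late addition"}
```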

There are cases when passing the accumulator is extremely tedious, for example if most of the functions in the stack don’t care about the accumulator at all. A good example of that is RNG. If :random didn’t support procdict, you’d have to explicitly pass the RNG state through all the functions, just because you need a random number somewhere deep in the stack.
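To make the trade-off concrete, here is a small sketch using :rand, the successor to :random (an assumption on my part that you are on a reasonably recent OTP release; the :exsss algorithm needs OTP 22 or later). The first form keeps the generator state in the process dictionary, the second threads it explicitly.

```elixir
# Implicit state: seeding stores the generator state in the process dictionary,
# so :rand.uniform/1 can be called from anywhere in the stack.
_ = :rand.seed(:exsss, {1, 2, 3})
implicit = :rand.uniform(100)

# Explicit state: every call returns the number together with the next state,
# which the caller has to pass along to the next call.
state0 = :rand.seed_s(:exsss, {1, 2, 3})
{explicit, state1} = :rand.uniform_s(100, state0)
{_next, _state2} = :rand.uniform_s(100, state1)

# Same algorithm and seed, so both forms produce the same first number.
true = implicit == explicit
```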

But in most cases, using procdict to accumulate something is a q&d shortcut, rather than a proper solution. Using an agent for this is even worse, for the technical reasons I mentioned.

Qqwy · April 12, 2017, 8:29am

I guess the main reason why using procdicts is considered a dirty hack in most cases, is because it breaks the Referential Transparency that functional programming enjoys. If code is not referentially transparent, it is hard to maintain, because there may be hidden dependencies that you overlook while refactoring, resulting in bugs that are hard to track down.
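A tiny sketch of what that looks like in practice. The Pricing module and the :shipping key are made up for illustration.

```elixir
defmodule Pricing do
  # Hidden dependency: the result depends on the process dictionary,
  # which does not show up in the function signature.
  def total_hidden(amount), do: amount + Process.get(:shipping, 0)

  # Explicit dependency: the same arguments always give the same result.
  def total_explicit(amount, shipping), do: amount + shipping
end

Process.put(:shipping, 5)
a = Pricing.total_hidden(100) # 105

Process.put(:shipping, 7)
b = Pricing.total_hidden(100) # 107: same argument, different result
```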

smpallen99 · April 19, 2017, 7:50am

Interesting discussion… Hard to believe someone would use an Agent for in-process state management. I don’t think that’s the fault of the tool, but of the person’s experience with functional programming.

Personally, I use Agents, ETSs, GenServers and FSMs a lot in my solutions. I’m not that fond of ETS so I normally default to an Agent. However, ETS has some nice query functionality that I reach for if needed.
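For readers who have not seen the ETS query side, a minimal sketch with :ets.select and a match specification. The table name and the {cart_id, total} row shape are hypothetical.

```elixir
table = :ets.new(:carts, [:set, :private])

# Rows are {cart_id, total}.
true = :ets.insert(table, [{1, 30}, {2, 5}, {3, 12}])

# Match spec: bind the id to $1 and the total to $2,
# keep rows where the total is greater than 10, and return only the id.
match_spec = [{{:"$1", :"$2"}, [{:>, :"$2", 10}], [:"$1"]}]

# A :set table has no defined order, so sort the result.
big_carts = table |> :ets.select(match_spec) |> Enum.sort()
# big_carts is [1, 3]
```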

For simple long-running state, Agents are nice. Less setup than GenServers. Agents are also handy for simple serialized “read and do something with the value” operations.
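A short sketch of both uses, assuming a map with a hypothetical :visits key as the state.

```elixir
{:ok, agent} = Agent.start_link(fn -> %{} end)

:ok = Agent.update(agent, &Map.put(&1, :visits, 1))

# get_and_update runs inside the agent process, so the read and the write
# happen as one serialized step. It returns the first tuple element.
previous =
  Agent.get_and_update(agent, fn state ->
    {state.visits, Map.update!(state, :visits, &(&1 + 1))}
  end)

current = Agent.get(agent, & &1.visits)
# previous is 1, current is 2
```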

When I need state and more complex behaviour, I use a GenServer. Both approaches are faster than database access IMHO. Of course, I only use these for transient data and resort to the database when survivability is required.

I personally feel the addition of Agents to the language was a good thing. When they first came out, I didn’t really understand why I wouldn’t just use a GenServer. However, after learning them and a couple of uses, I would miss them.

robconery1 · October 24, 2018, 8:03pm

Well… guilty as charged. I feel as though I have a pretty good understanding of the language and how functional programming works in general, but just yesterday I was coding up some ShoppingCart code and thought “hmmm, should I use Redis or should I use an Agent”. ETS occurred to me as well, but I’m not really sharing any data between processes… yet. So I went with what I considered to be the more idiomatic, “simpler” solution: Agent.

I mean… I dunno it kind of fits the docs:

Often in Elixir there is a need to share or store state that must be accessed from different processes or by the same process at different points in time.

So: the docs suggest this is a reasonable approach, my intuition suggests the same. Does this mean I don’t understand functional programming? Nope. Let me explain why…

I ended up on this thread because something didn’t feel right about this decision. I didn’t know how to kill it aside from using a cron job - I need these carts to expire. So naturally I’m thinking Redis or Mnesia - but carts are kind of transient. I’m not interested in the Cart itself, and I’m only interested short-term in the “state” of the cart. A cart will either be used to execute an order in the near-term, or it won’t.

This is where the functional break happened in my mind: I’m much more interested in the data produced than the cart’s behavior. Logs can tell me much more than the current state of the cart and an arbitrary status flag.

Anyway: I start to wonder how I can use Elixir to terminate a cart after a set amount of time. I’m thinking about a watchdog process and a bunch of other “let’s see what sticks to the wall” kind of stuff and I come back to the same realization: Redis does this better. So I Google it and end up here and damn I’m happy I did because I was asking the very same question: what are Agents good for anyway?

I ended up here, read this great discussion and am very happy to know I’m not the only one having these thoughts. I’m also glad I’m not alone in appreciating :mnesia.

Thanks again - great discussion.

idi527 · October 24, 2018, 8:17pm

I ended up on this thread because something didn’t feel right about this decision. I didn’t know how to kill it aside from using a cron job - I need these carts to expire. So naturally I’m thinking Redis or Mnesia - but carts are kind of transient. I’m not interested in the Cart itself, and I’m only interested short-term in the “state” of the cart. A cart will either be used to execute an order in the near-term, or it won’t.

The GenServer behaviour has a timeout option which can be used to terminate the process after some idle time (no messages). Together with a dynamic supervisor and a registry, it makes ephemeral processes possible.

defmodule Cart.Application do
  use Application

  def start(_type, _args) do
    children = [
      {Registry, keys: :unique, name: Cart.Registry},
      Cart.Supervisor
    ]

    opts = [strategy: :one_for_one, name: __MODULE__.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

defmodule Cart.Supervisor do
  use DynamicSupervisor

  def start_link(opts) do
    DynamicSupervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(_opts) do
    DynamicSupervisor.init(strategy: :one_for_one)
  end

  @spec start_cart(cart_id :: pos_integer) :: DynamicSupervisor.on_start_child()
  @spec start_cart(cart_id :: pos_integer, opts :: Keyword.t()) :: DynamicSupervisor.on_start_child()
  def start_cart(cart_id, opts \\ []) when is_integer(cart_id) do
    DynamicSupervisor.start_child(__MODULE__, {Cart, [{:cart_id, cart_id} | opts]})
  end

  @spec stop_cart(cart_id :: pos_integer) :: :ok | {:error, :not_found}
  def stop_cart(cart_id) when is_integer(cart_id) do
    case Registry.lookup(Cart.Registry, cart_id) do
      [{pid, _}] -> DynamicSupervisor.terminate_child(__MODULE__, pid)
      [] -> {:error, :not_found}
    end
  end
end

defmodule Cart do
  use GenServer, restart: :transient

  require Record
  Record.defrecordp(:state, [:timeout]) # I usually keep a bit more data here ;)

  @timeout 10 * 60 * 1000 # exit after 10 minutes of inactivity

  def start_link(opts) do
    cart_id = opts[:cart_id] || raise("need :cart_id")
    GenServer.start_link(__MODULE__, opts, name: via(cart_id))
  end

  def add(cart_id, item) do
    call(cart_id, {:add, item})
  end

  @doc false
  def via(cart_id) when is_integer(cart_id) do
    {:via, Registry, {Cart.Registry, cart_id}}
  end

  defp call(cart_id, message) when is_integer(cart_id) do
    GenServer.call(via(cart_id), message)
  catch
    :exit, {:noproc, _} -> # a bit of a hack, but it works
      _ = Cart.Supervisor.start_cart(cart_id)
      call(cart_id, message)
  end
  # ^^^ make sure the process can always be started
  # otherwise you might get stack overflow (try/catch is not tail recursive)
  # if there is a possibility of a faulty process, add an attempt counter
  # call(cart_id, message, attempts_left - 1)

  @doc false
  def init(opts) do
    send(self(), :init)
    {:ok, state(timeout: opts[:timeout] || @timeout)} # custom timeouts for tests
  end

  @doc false
  def handle_info(:init, state(timeout: timeout) = state) do
    # init process state
    # previous state can be read from a database
    {:noreply, state, timeout}
  end

  def handle_info(:timeout, state) do
    # the process has been idle for 10 minutes, time to die
    # the current state can be persisted
    {:stop, :normal, state}
  end

  @doc false
  def handle_call({:add, item}, _from, state(timeout: timeout) = state) do
    # add item to the cart, maybe persist it in the database as well
    {:reply, :ok, state, timeout}
  end
end

Usage:

Cart.add(123, %Cart.Item{...}) # will start a cart process if it doesn't yet exist
Cart.add(123, %Cart.Item{...}) # uses the same process, will exit after 10 min of inactivity

It basically works as a very simplified version of Orleans. It also inherits one of Orleans’s state-managing benefits that most caches can’t provide: there are no stale entries / data races, since the only way to update the cart is through interacting with a cart process. This approach also works across nodes.
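The comments under call/2 in the Cart module suggest adding an attempt counter when the process might fail to start. One possible shape for that, as a sketch only: RetryCall, start_fun, and the default of 3 attempts are all hypothetical, and unlike the original call/2 this version wraps the reply in {:ok, reply} so that giving up can be reported as {:error, :noproc}.

```elixir
defmodule RetryCall do
  # start_fun is whatever starts the missing process,
  # e.g. fn -> Cart.Supervisor.start_cart(cart_id) end in the Cart example.
  def call(server, message, start_fun, attempts_left \\ 3)

  # Out of attempts: give up instead of recursing forever.
  def call(_server, _message, _start_fun, 0), do: {:error, :noproc}

  def call(server, message, start_fun, attempts_left) when attempts_left > 0 do
    {:ok, GenServer.call(server, message)}
  catch
    :exit, {:noproc, _} ->
      # The process is not running: try to start it, then retry.
      _ = start_fun.()
      call(server, message, start_fun, attempts_left - 1)
  end
end
```

Because the recursion depth is bounded by the attempt count, the fact that try/catch is not tail recursive stops being a concern.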

This is where the functional break happened in my mind: I’m much more interested in the data produced than the cart’s behavior. Logs can tell me much more than the current state of the cart and an arbitrary status flag.

It’s possible to persist each event in either the call function or in each of handle_calls. In chat bots where I mostly use this approach (for user sessions) I persist almost everything (but in sqlite, each process (user) gets its own database), so that I can replay the events in case of a failure / faulty migration.
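The replay part can be sketched as a plain fold over the stored events. The CartEvents module and the {:added, item} / {:removed, item} event shapes are assumptions for illustration; how the events are actually persisted is left out.

```elixir
defmodule CartEvents do
  # Hypothetical event shapes: {:added, item} and {:removed, item}.
  def apply_event({:added, item}, items), do: [item | items]
  def apply_event({:removed, item}, items), do: List.delete(items, item)

  # Rebuild the current state by folding the persisted events, oldest first.
  def replay(events), do: Enum.reduce(events, [], &apply_event/2)
end

log = [{:added, :book}, {:added, :pen}, {:removed, :book}]
CartEvents.replay(log) # => [:pen]
```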

Moving to ephemeral gen(servers | statems) from ets tables made the code much clearer as well (for me, at least). The message handlers now just call the processes and render the results. Functional core, imperative shell, and all that.

Qqwy · October 24, 2018, 8:20pm

Also, instead of Redis, there are multiple in-memory KV stores available on Hex.pm that can limit the time-to-live of their data!

OvermindDL1 · October 24, 2018, 8:23pm

I’m quite partial to Cachex myself.

robconery1 · October 24, 2018, 8:34pm

I’m using Redis to interop (using pub/sub) with other containers written in various languages. Long story - but Redis is a great choice for this else I might have gone with ETS :).

Qqwy · October 24, 2018, 8:43pm

If you already are using Redis, then of course it is a good choice! What made you use Redis for that, rather than e.g. RabbitMQ?

robconery1 · October 24, 2018, 8:57pm

I know Redis, as do my teammates. Some know Rabbit - I’ve read a few books on it and I think it will work well if we need something more than a super simple pub/sub blast.