Discussion about uses for Agent Processes

accuser · March 22, 2017, 6:02pm

For me it is essentially a separation of concerns. I realise that an Agent is effectively a specialised GenServer, and I could put all the code in one Module, but I find this approach helpful.

OvermindDL1 · March 22, 2017, 6:04pm

I’m actually quite curious as to how it is used. A GenServer passes a single state argument to all callbacks, which seems no different from a pid for an agent: you can pass either one to functions that mutate the data within and return it. It just seems like a lot of overhead and extra code, so I am curious as to the benefit?

accuser · March 22, 2017, 6:17pm

I think the main benefit is to me…

Specifically, I start my GenServer with a unique id, which in turn starts the Agent with an initial state, perhaps a record loaded via Ecto. The Agent contains a number of functions that transform its state via get_and_update/3, returning the changes and the new state. The GenServer manages the logic that invokes those transformations. It checks that the command is valid, etc.

I’ve adopted this approach for a couple of reasons:

The Agent is only concerned with what is needed to be done to transform the state of the struct it wraps. It could be a record from Ecto, but could also be a remote service. Nothing needs to know how it works, only that it does.
The GenServer is only concerned with how the Agent should be transformed.
A complex command might chain several Agent transformations together, and this reduces code duplication.

OvermindDL1 · March 22, 2017, 6:22pm

Hmm, I’d probably have stuffed that record into the state (or ‘as’ the state if nothing else).

Wait, how do you do this? Adding functions to an Agent basically just means a GenServer, so why not just have it all in a GenServer?

Yeah, that should be a GenServer, you only add things to that GenServer that relate to messing with its own state.

That could also be another GenServer, but you would not need to keep the transformation for the agent in ‘this’ genserver but could keep it with the data itself in the other genserver directly.

Hmm, how does this work? Code examples?

accuser · March 22, 2017, 8:11pm

I started with a GenServer having the entity as part of the state, but it soon got complex. I don’t mind writing more code in order to reduce complexity. I then migrated the entity to an Agent, and added some helpers. See this gist - caveat, I have macros for handling parts of this, and have extracted this simple case as best as I can. Mea culpa if it doesn’t serve the purpose!

OvermindDL1 · March 22, 2017, 8:27pm

Just because the GenServer holds the state does not mean all the functionality needs to be within it. Take your User.Entity.activate/1 function: why can it not just take the user instead of the pid? Then you can get rid of lines 13 and 17, and the part at User.handle_call/3 on line 18 can just store the result back as the new state. It then becomes fully functional: code flow is well defined, no weird magical calls that return nothing are sitting around, and so forth. I’m not sure what an Agent is for here other than maybe obfuscating what is happening. It does not seem very functional at all, and it is quite a bit slower than a direct function call to the other module would be, since it has to pass a message, copy data, set new data in the other process, then send a message back, etc.

accuser · March 22, 2017, 8:39pm

I’m glad you said that, because I have been thinking about that recently. I’ve been coding Ruby (on and off Rails) for the past 10 years, and have brought some of that baggage with me, no doubt. I started my current project with a prototype using DRb, and so naturally gravitated to Elixir when I came across it (I’m actually from a Telecoms background, and was well aware of Erlang/OTP, but never got around to using it).

So, back to the issue in hand - you’re saying that I should drop the Agent, stick with just a GenServer, but keep the transformation functionality in a separate Module, passing user as a parameter. I can see the sense in this.

OvermindDL1 · March 22, 2017, 8:52pm

Yes absolutely! That is how my GenServers often are. The code that operates over the data and returns new data is elsewhere; the GenServer itself just handles the synchronization and messages to call the right commands and hold the state.

EDIT: Just keep this in mind, the fundamental unit of Code Organization is not a Process on the BEAM, it is a Module. You should always jump to a new Module first and pass data around instead of a process until you really need a process, like for concurrency or so.
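A minimal sketch of that layout follows. The gist itself is not reproduced in this thread, so the struct fields, the return tuples and the client functions below are assumptions made for illustration; only the names User.Entity.activate/1 and User.handle_call/3 come from the discussion.

```elixir
defmodule User.Entity do
  # Pure module: takes a user and returns a new user. No process involved.
  # The fields are hypothetical; the real entity would likely be an Ecto record.
  defstruct id: nil, active?: false

  def activate(%__MODULE__{active?: true}), do: {:error, :already_active}
  def activate(%__MODULE__{} = user), do: {:ok, %{user | active?: true}}
end

defmodule User do
  use GenServer

  # Client API (hypothetical function names).
  def start_link(id), do: GenServer.start_link(__MODULE__, id)
  def activate(pid), do: GenServer.call(pid, :activate)

  # The GenServer only holds the state and routes messages.
  @impl true
  def init(id), do: {:ok, %User.Entity{id: id}}

  @impl true
  def handle_call(:activate, _from, user) do
    case User.Entity.activate(user) do
      # Store the transformed user back as the new state.
      {:ok, new_user} -> {:reply, {:ok, new_user}, new_user}
      # Leave the state untouched when the command is invalid.
      {:error, _reason} = error -> {:reply, error, user}
    end
  end
end
```

Because User.Entity.activate/1 is a plain function, it can be unit tested without starting any process at all.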

sasajuric · March 22, 2017, 9:13pm

1000x this!

I’ve seen multiple cases where people reach for Agents to organize their code, where a plain functional module would do the job. I even remember seeing somewhere someone stating that such an approach is functional (because presumably Elixir is functional), which is definitely wrong.

My feeling is that since Agents are easy to use, they are also easy to abuse. Moreover, I believe that Agents solve a fairly trivial problem, and are basically just reducing LOC compared to GenServer. Which is why I only use Agents in tests (where they really come in handy), and avoid them otherwise. YMMV of course.

gregvaughn · March 22, 2017, 11:27pm

With all respect to José, even when they were first introduced I found Agents to be uninteresting. I learned GenServer first and that’s still what I tend toward. I even took a detour into Clojure at a prior job, and I learned agents there, but GenServer is so much more.

josevalim · April 6, 2017, 3:30pm

I will take that as a compliment.

I also don’t tend to use agents much but, in the few cases I do, I prefer them to a GenServer. A GenServer, as the name says, is generic so when you only need to keep state around, the intent gets lost. Other than that, a GenServer is likely the way to go and is definitely much more.

Answering the original question: Mix has two or three examples of using agents.

whatyouhide · April 6, 2017, 7:12pm

We also use it in Gettext to store translations being extracted during compilation.

mjadczak · April 6, 2017, 9:38pm

I use it as a simple KV store when parsing things into DAGs, which is notoriously difficult to do in a pure functional style (especially without more hardcore tools like Monads et al.)

I think in general it seems that most of the “good” use cases are short-lived accumulator-like processes, for times when insisting on passing accumulators around in a pure functional style gets too complex. For long-lived computation or processing or state storage, I definitely reach for a GenServer.

OvermindDL1 · April 6, 2017, 9:44pm

A monad is not hardcore, heck if you pass state from function to function (like what |> does already) that is just explicit monad handling instead of implicit, the monad in this case is just a raw value. ^.^

For anything in-process I still think passing the state around is better, after all you have to pass the PID around anyway. ^.^
For out of process I usually hit ETS, why spin up an agent when ETS can do it better and faster?
For out of node, well a GenServer, but an agent might have use here, or Mnesia depending…
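As a rough sketch of the ETS option for out-of-process state, assuming a small key-value cache: the module name EtsCache and the table name :import_cache are made up for this example, while the :ets calls are the standard ones.

```elixir
defmodule EtsCache do
  # Hypothetical wrapper around a public ETS table used as a KV store.
  # The table is owned by the calling process and dies with it.
  def new do
    :ets.new(:import_cache, [:set, :public, read_concurrency: true])
  end

  # Writes go straight to the table; no message passing to another process.
  def put(table, key, value) do
    true = :ets.insert(table, {key, value})
    :ok
  end

  # Reads copy only the matching tuple out of the table.
  def fetch(table, key) do
    case :ets.lookup(table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end
end
```

Any process that holds the table reference can read and write concurrently, which is the main thing an Agent cannot offer, since an Agent serializes every call through one process.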

sasajuric · April 6, 2017, 9:54pm

I don’t think these are good use cases. If you really want that (and I believe in most cases it’s not a good idea), then consider the process dictionary. At least with that, you won’t need to pay the price for the separate process, copy data, and depend on the scheduler. An example of that in practice is :rand.uniform, which uses the process dictionary to implicitly manage the state of the RNG.
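For illustration only, here is what an implicit accumulator in the process dictionary might look like. The module ProcDictCounter and the key :visit_count are invented for this sketch; Process.get/2 and Process.put/2 are the standard calls.

```elixir
defmodule ProcDictCounter do
  # Hypothetical helper: keeps a counter in the calling process's dictionary.
  # No extra process, no message passing, no copying between processes.
  def bump do
    count = Process.get(:visit_count, 0) + 1
    Process.put(:visit_count, count)
    count
  end
end

# :rand works the same way: calling it leaves the generator state behind
# in the caller's process dictionary (under the :rand_seed key).
_ = :rand.uniform()
seeded? = Process.get(:rand_seed) != nil
```

The state is invisible at the call site, which is exactly why this should stay a last resort.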

sasajuric · April 6, 2017, 10:28pm

You might consider :digraph for this. The fact that it’s powered by ETS might confirm that this is indeed hard to do with FP. Never tried implementing a DAG myself, so can’t say. I had good experience with :digraph though
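A small sketch of what using :digraph for a DAG can look like. The module name DagSketch and the vertex names are placeholders; the :digraph and :digraph_utils calls are the ones shipped with Erlang/OTP.

```elixir
defmodule DagSketch do
  # Builds a three-step dependency chain, tries to close a cycle,
  # and returns {topological_order, result_of_the_cyclic_edge}.
  def run do
    # The :acyclic option makes add_edge/3 reject edges that create a cycle.
    graph = :digraph.new([:acyclic])

    for vertex <- [:parse, :resolve, :emit] do
      :digraph.add_vertex(graph, vertex)
    end

    :digraph.add_edge(graph, :parse, :resolve)
    :digraph.add_edge(graph, :resolve, :emit)

    # This edge would close a cycle, so it is refused with an error tuple.
    cyclic = :digraph.add_edge(graph, :emit, :parse)

    order = :digraph_utils.topsort(graph)

    # The graph lives in ETS tables, so it has to be deleted explicitly.
    :digraph.delete(graph)

    {order, cyclic}
  end
end
```

Note that the graph is mutable state behind a handle, so it behaves more like an ETS table than like an ordinary Elixir data structure.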

mjadczak · April 6, 2017, 11:53pm

I have actually looked at :digraph, but what I’m doing isn’t exactly processing arbitrary DAGs and the module isn’t a good fit. But yes, I suppose I’m using an Agent as a lightweight alternative to an ETS table (all it does is keep two maps around), as the structures I’m working with are quite small (a couple hundred entries at most).

mjadczak · April 7, 2017, 12:32am

Here is another of my uses of Agent: I’m using it as a lightweight cache to the database while doing an import of a dataset. There are a bunch of references to things and also some things which I am denormalising, and keeping things in memory instead of going to the database speeds things up when importing thousands of records.

defmodule PFServer.Finder.ImportMeta do
  @moduledoc """
  Keeps track of information during an import, and asynchronously imports program types and focus areas.
  """
  alias PFServer.Finder.{Organization, ProgramFocusArea, ProgramType}
  alias PFServer.Repo
  alias PFServer.TransactionManager, as: TM
  require Logger

  defstruct importing?: false, organization_map: %{}, ptypes: %{}, fareas: %{}

  def start_link do
    Agent.start_link(fn -> %__MODULE__{} end, name: __MODULE__)
  end

  def start_import do
    Agent.get_and_update(__MODULE__, fn state ->
      cond do
        state.importing? ->
          {{:error, :already_importing}, state}
        true ->
          {:ok, put_in(state.importing?, true)}
      end
    end)
  end

  def end_import do
    Agent.update(__MODULE__, fn _ -> %__MODULE__{} end)
  end

  def register_organization(%Organization{id: db_id, airtable_recid: recid}) do
    Agent.update(__MODULE__, fn state -> put_in(state.organization_map[recid], db_id) end)
  end

  def get_organization_id!(recid) do
    case Agent.get(__MODULE__, fn state -> state.organization_map[recid] end) do
      nil -> raise "Organization with record ID #{recid} has not been registered."
      id -> id
    end
  end

  def get_program_type(name) do
    Agent.get_and_update(__MODULE__, fn state ->
      case state.ptypes[name] do
        nil ->
          ptype = TM.execute fn -> Repo.insert!(%ProgramType{name: name}) end
          {ptype, put_in(state.ptypes[name], ptype)}
        ptype -> {ptype, state}
      end
    end)
  end

  def get_focus_area(name) do
    Agent.get_and_update(__MODULE__, fn state ->
      case state.fareas[name] do
        nil ->
          farea = TM.execute fn -> Repo.insert!(%ProgramFocusArea{name: name}) end
          {farea, put_in(state.fareas[name], farea)}
        farea -> {farea, state}
      end
    end)
  end

end

However even looking at it now, a case could be made that it should actually be a GenServer, given that it’s doing a little more than just keeping state. But then, all Agent is is a wrapper for a GenServer with a different API.

Looking at it more, even though usually we have all the logic to manipulate an Agent’s state inside a single module, an Agent is more like a “promiscuous” GenServer, in that in a typical GenServer setup, the logic to manipulate GenServer state is inside the GenServer itself and we have a functional interface to the GenServer. An Agent, on the other hand, allows us to provide it with arbitrary lambdas to modify its state.

This breaks a lot of conventions in OTP/Elixir in general, and I think that’s why there’s such an instinctive dislike of them in this thread.
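To make the "arbitrary lambdas" point concrete, here is a small sketch. The module name LeakyAgentDemo and the state shape are invented for the example.

```elixir
defmodule LeakyAgentDemo do
  # Shows that any caller holding the pid can put any value into the Agent.
  def run do
    {:ok, agent} = Agent.start_link(fn -> %{count: 0} end)

    # The update the module author had in mind.
    Agent.update(agent, fn state -> %{state | count: state.count + 1} end)

    # Nothing stops another caller from replacing the state wholesale.
    Agent.update(agent, fn _state -> :not_even_a_map end)

    final = Agent.get(agent, fn state -> state end)
    Agent.stop(agent)
    final
  end
end
```

With a GenServer, the second update would only be possible if the module chose to expose a message for it.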

It begs the question—should they even be included in Elixir by default at all?

benwilson512 · April 7, 2017, 1:19am

Just judging by your Agent code here it seems like you could just as easily have a cache that was just a map and do it all in a purely functional style.
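A rough sketch of that idea, under the assumption that the import runs in a single process: the cache is a plain map threaded through the loop. The module name ImportCache and the insert_fun argument (standing in for the Repo.insert!/1 call) are hypothetical.

```elixir
defmodule ImportCache do
  # Pure version of get_program_type/1: returns {value, updated_cache}.
  # insert_fun stands in for the database insert and runs only on a miss.
  def fetch_or_insert(cache, name, insert_fun) do
    case cache do
      %{^name => value} ->
        {value, cache}

      _ ->
        value = insert_fun.(name)
        {value, Map.put(cache, name, value)}
    end
  end

  # Threads the cache through a whole list of names.
  def resolve_all(names, insert_fun) do
    Enum.map_reduce(names, %{}, fn name, cache ->
      fetch_or_insert(cache, name, insert_fun)
    end)
  end
end
```

Enum.map_reduce/3 returns both the looked-up values and the final cache, so nothing has to live outside the calling function.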

I don’t understand what this means. They aren’t arbitrary, they take the state and they return very specific values. This is exactly like a GenServer which has particular callbacks that take the state, arbitrary arguments, and return specific values.

sasajuric · April 7, 2017, 7:36am

The issue you mention is indeed one drawback of Agent, but my main concern is about something else. I have an increasing feeling that agents are frequently misused as objects, where in-process structure would work just fine. I share Ben’s impression of your code:

Looking at the interface of your module, it looks like it serves as a dumping ground used by a single process. If that is indeed the case, I think that a plain data structure would work better. We had other examples in this thread where people used agents to organize the code, or maintain some accumulator in a loop.

While I agree Agents offer some benefits over GenServer when properly used (as demonstrated in mix and gettext examples), I’m not sure these benefits are worth the downsides:

Agent is an additional abstraction people need to learn.
People (especially newcomers) seem to be confused about Agent vs GenServer. Over the years, I’ve repeatedly seen people asking which one should be used.
As you mentioned, by default agents break encapsulation.
It seems that agents are easily misused to work around FP and simulate OO. Admittedly that’s also possible with GenServer (I know for a fact because I used to do that myself), but it requires more overhead compared to agents. Wrapping e.g. an integer in an agent is trivial. Doing the same with GenServer will require a dedicated module, so that’s a hint that perhaps this is not the best approach.
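The difference in ceremony from the last point can be sketched like this. Both versions wrap a single integer; the module name IntCounter is made up for the example.

```elixir
# With an Agent, wrapping an integer takes two lines and no module.
{:ok, agent} = Agent.start_link(fn -> 0 end)
:ok = Agent.update(agent, fn n -> n + 1 end)

# With a GenServer, the same thing needs a dedicated module.
defmodule IntCounter do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)
  def increment(pid), do: GenServer.call(pid, :increment)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:increment, _from, n), do: {:reply, n + 1, n + 1}
end
```

The extra boilerplate in the GenServer version is the friction being described: it nudges you to ask whether a process is needed at all.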

Given these downsides, and the fact that I’m personally not super impressed with the supposed benefits of Agents, looking at your question of whether they should be included in Elixir by default at all:

I personally feel that we’d be better off without agents.