Why does starting an Agent require a function?

Why do Agent.start_link/2, get/3 and update/3 take a function to set/get state? Why not just pass/retrieve the state directly, i.e., if I want the state to be 42, just let me pass 42 instead of fn -> 42 end?

This isn’t a complaint, as I am sure there is a good (maybe obvious) reason for this. The question is really just a request for a learning opportunity, if anyone is willing to explain.

:wave:
So I tried to find something in the documentation about it. It’s there, but not very explicit:

start_link/2 Once the agent is spawned, the given function fun is invoked in the server process

So it seems that it uses the Agent process to run your function and gets the initial state from there. Since Agent abstracts the notion of state and doesn’t leak it to you directly, you send it the “action” to perform on the state. The same reasoning applies to get and get_and_update.
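To make "invoked in the server process" concrete, here is a minimal sketch that compares pids. The variable names and the init_pid key are just for illustration:

```elixir
caller = self()

# The function given to start_link/2 runs inside the agent process,
# so self() in there is the agent's pid, not the caller's.
{:ok, agent} = Agent.start_link(fn -> %{init_pid: self()} end)

# The functions given to get/3 run inside the agent process as well.
init_pid = Agent.get(agent, fn state -> state.init_pid end)
get_pid = Agent.get(agent, fn _state -> self() end)

IO.inspect(init_pid == agent)  # true
IO.inspect(get_pid == agent)   # true
IO.inspect(init_pid == caller) # false
```

Only the result of the function is sent back to the caller; the state itself stays in the agent.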

Hope that this makes some sense.

Thanks - I’m sorry but I think I am still kind of puzzled. Why is not “leaking it directly” a useful/good thing? Or put differently, if Agent is just there to hold state and let us access it, what’s the benefit of requiring a function be invoked to do that instead of just handing it over?

When I look at the source, for example, all it does is send the function and arguments to start_link/3 for the underlying GenServer, the init/1 callback for which calls a function that just runs apply/2. So it doesn’t seem like the function is being transformed, evaluated, checked or otherwise “used for anything” so to speak; it’s always just getting passed along until finally it gets applied to the arguments passed along with it, which I could just as easily do myself in the first place before passing in the result to be stored.

Thanks again - I’m not trying to punish anyone for trying to be helpful and I appreciate your explanation.

I suppose it’s so that if the agent has heavy/long initialisation/update work, it is then run in the agent and not the calling process, and the calling process can set a timeout on that work. With start_link, an agent that takes longer than the :timeout option to initialise is terminated.
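A rough sketch of how the timeouts behave, as far as I understand them. The sleeps stand in for hypothetical slow work and the durations are arbitrary. For start_link the timeout is the :timeout option; for get/3 and update/3 it is the third argument, in milliseconds:

```elixir
# Initialisation: the agent is allowed 100 ms, but the work takes 500 ms.
slow_init = fn ->
  Process.sleep(500)
  42
end

start_result = Agent.start_link(slow_init, timeout: 100)
IO.inspect(start_result) # {:error, :timeout}

# Reads: here it is the caller that exits on timeout, so we catch the exit.
{:ok, agent} = Agent.start_link(fn -> 42 end)

get_result =
  try do
    Agent.get(
      agent,
      fn state ->
        Process.sleep(500)
        state
      end,
      100
    )
  catch
    :exit, {:timeout, _details} -> :caller_timed_out
  end

IO.inspect(get_result)            # :caller_timed_out
IO.inspect(Process.alive?(agent)) # true
```

So a timed-out start terminates the agent, while a timed-out get/3 makes the caller exit and leaves the agent running.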

As for the update case specifically, if the workflow was: 1) get value from agent, 2) update value locally, 3) send updated value to agent, then there is a risk of race conditions. When the update is done in the agent process, access is serialised automatically.
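The two workflows can be sketched like this. The helper names (run, racy, safe) are made up for the example, and 100 concurrent tasks is an arbitrary number:

```elixir
# Starts a fresh counter, runs `increment` from 100 concurrent tasks,
# and returns the final count.
run = fn increment ->
  {:ok, counter} = Agent.start_link(fn -> 0 end)

  1..100
  |> Enum.map(fn _ -> Task.async(fn -> increment.(counter) end) end)
  |> Enum.each(&Task.await/1)

  Agent.get(counter, fn state -> state end)
end

# Racy: read in the client, compute locally, write the result back.
# Two clients can read the same value, and one increment is then lost.
racy = fn counter ->
  value = Agent.get(counter, fn state -> state end)
  Agent.update(counter, fn _old -> value + 1 end)
end

# Safe: the agent applies the function to its current state,
# handling one request at a time.
safe = fn counter ->
  Agent.update(counter, fn state -> state + 1 end)
end

IO.inspect(run.(racy), label: "racy") # may be less than 100
IO.inspect(run.(safe), label: "safe") # always 100
```

The racy version is not guaranteed to lose updates on every run, which is exactly what makes this kind of bug hard to spot.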


One thing I thought was a key point about Agents is that all those callbacks you pass in run “server-side”. By that I mean that your client process does not run those functions; the server process does.
I think in some cases you want to run things client-side, while in others it’s better to run them server-side. The way the Agent interface is built, you can easily do both!

If you passed the initial state using plain messages after starting the Agent process, you’d be open to race conditions: updates might be processed before your initial state is received and applied.


Two reasons:

  1. Making sure the initial value gets extracted in the Agent process and not the one you are creating it in. If that analogy is easier for you, think of it in terms of multi-threading: you’re spawning thread B from thread A and want thread B to evaluate the initial value of the held state so that thread A is not blocked evaluating it.

  2. Lazy loading. You might want an Agent to actually fetch stuff from a database, caches, 3rd party APIs, configuration or discovery providers etc. This circles back to the reasoning that if the initial state is expensive to compute then you’d block your original process and you don’t want that.

As for “why don’t they have an alternative API for people who know what they are doing and want to supply a static value”, I’d say that it’s better safe than sorry. Writing fn -> 42 end is not much harder than 42, and you gain 100% confidence that your Agent's state will never be evaluated in the context of the caller, only in the callee (the Agent itself).
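A small sketch of the caller/callee difference. load_settings is a hypothetical stand-in for a database or API call, and the sleep only simulates the cost:

```elixir
# Hypothetical expensive loader; it records which process ran it.
load_settings = fn ->
  Process.sleep(50)
  %{loaded_by: self(), retries: 3}
end

# Evaluated in the caller: the value is computed here first,
# and the agent only receives the finished result.
precomputed = load_settings.()
{:ok, eager} = Agent.start_link(fn -> precomputed end)

# Evaluated in the callee: the loader itself runs inside the agent.
{:ok, lazy} = Agent.start_link(load_settings)

eager_loader = Agent.get(eager, fn state -> state.loaded_by end)
lazy_loader = Agent.get(lazy, fn state -> state.loaded_by end)

IO.inspect(eager_loader == self()) # true
IO.inspect(lazy_loader == lazy)    # true
```

One caveat: start_link/2 itself returns only after the function has finished, so the caller still waits for it. What changes is which process does the work.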


Yep, now I get it - thanks everyone!
