State in Actor Model

I just read this blog post, “An Introduction to the Actor Model” at
http://blog.goodbot.co/an-introduction-to-the-actor-model/
I seem to be missing something basic. I am just beginning to learn Elixir and functional programming, so perhaps it is just my lack of understanding at work.

Two questions:

  1. If we have an actor, as described in this blog post, that calculates the sum of all numbers and stores it in private state, and it then crashes and is restarted by its supervisor, what happens to that stored private state? Is it lost? Should this state be stored somewhere else? A database?

  2. Is this concept of stored private state a poor way of doing this calculation? It seems to go against the idea of immutability, and referential transparency, or pure functions. How would you implement a calculator of the sum of all numbers?

I'll have a go at your first question: yes, the state is lost. That is why you need to consider what kind of state you are dealing with. We can sort state into three categories: “Not important”, Static/Configuration, and Dynamic.

“Not important” state is intermediate calculation, like a stack that is being built on behalf of another process. It is not the end of the world if we lose it, as the owner (the other process) most likely died as well, and the state will get rebuilt when that process is restarted.

Static/Configuration: if the state is just something like the network address of a remote server, we don’t have to do much to recalculate it. Just restart the process and rebuild the state from the initialisation data provided by the supervisor.
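To make the static case concrete, here is a minimal sketch. The module name `RemoteConfig` and the address are made up for illustration. The supervisor keeps the start argument in its child spec, so after a crash the process comes back with the same configuration.

```elixir
defmodule RemoteConfig do
  # Hypothetical module, for illustration only.
  use Agent

  # `address` is the initialisation data handed over by the supervisor.
  def start_link(address) do
    Agent.start_link(fn -> %{address: address} end, name: __MODULE__)
  end

  def address, do: Agent.get(__MODULE__, & &1.address)
end

# The address below is an invented placeholder.
{:ok, _sup} =
  Supervisor.start_link([{RemoteConfig, "10.0.0.5:4000"}], strategy: :one_for_one)

before = Process.whereis(RemoteConfig)

# Simulate a crash, then give the supervisor a moment to restart the child.
Process.exit(before, :kill)
Process.sleep(100)

# A new process is running, with its state rebuilt from the same start argument.
IO.inspect(Process.whereis(RemoteConfig) != before)
IO.inspect(RemoteConfig.address())
```

The short sleep is only there to keep the sketch simple; real code would not depend on timing like this.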

Dynamic state is a different beast. Think of data that is accumulated over time: user-provided data, measurements from sensors, etc. We need to store a backup of this data because we cannot recalculate it. Save it to disk or store it in a database, but make sure you only store data that is in a “working state”. If the process dies, it most likely died because of faulty state, so restarting into that same state will lead to a restart loop. You need to define some constraints on what constitutes a working state before storing it to disk.
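Here is a rough sketch of the "only store a working state" idea, not a production design. The module name `Readings`, the backup file location and the `valid?/1` rule (every reading must be a number) are all assumptions made up for this example. The state is written to disk only when it passes the check, and `init/1` reloads the last good snapshot.

```elixir
defmodule Readings do
  # Hypothetical module, for illustration only.
  use GenServer

  # Assumed backup location, chosen only for this sketch.
  def path, do: Path.join(System.tmp_dir!(), "readings_backup.bin")

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok)
  def add(pid, reading), do: GenServer.call(pid, {:add, reading})
  def all(pid), do: GenServer.call(pid, :all)

  @impl true
  def init(:ok), do: {:ok, load()}

  @impl true
  def handle_call({:add, reading}, _from, readings) do
    new = [reading | readings]
    # Persist only when the state satisfies our "working state" constraint.
    if valid?(new), do: save(new)
    {:reply, :ok, new}
  end

  def handle_call(:all, _from, readings), do: {:reply, readings, readings}

  # Assumed constraint: every reading must be a number.
  defp valid?(readings), do: Enum.all?(readings, &is_number/1)

  defp save(readings), do: File.write!(path(), :erlang.term_to_binary(readings))

  defp load do
    case File.read(path()) do
      {:ok, binary} -> :erlang.binary_to_term(binary)
      {:error, _reason} -> []
    end
  end
end

# Start from a clean slate so the run is repeatable.
File.rm(Readings.path())

{:ok, pid} = Readings.start_link()
:ok = Readings.add(pid, 21.5)
# This value fails the check, so the snapshot on disk is left untouched.
:ok = Readings.add(pid, :garbage)
GenServer.stop(pid)

# A fresh process starts from the last snapshot that passed the check.
{:ok, pid2} = Readings.start_link()
IO.inspect(Readings.all(pid2))
```

In this sketch the faulty value still sits in memory until the process stops; rejecting it outright would be another reasonable choice.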

About the second question. Let me first address the misconception that we only have «pure functions.» We can have side effects in our functions, and it will happen a lot; functions will often send messages to other processes, open and read from files, etc.

The data is immutable, and it can be put into the state of a process running a behaviour such as GenServer. That process is really just a loop passing the state data from one iteration to the next.
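To illustrate the loop idea without any behaviour on top, here is a hand-rolled sketch. The module name `SumLoop` and the message shapes are made up for this example. The "state" is simply the argument of a recursive function.

```elixir
defmodule SumLoop do
  # Hypothetical module, for illustration only.

  # Spawn a process whose state is just the argument of loop/1.
  def start(initial \\ 0) do
    spawn(fn -> loop(initial) end)
  end

  defp loop(total) do
    receive do
      {:add, n} ->
        # Nothing is mutated; the next iteration simply receives a new value.
        loop(total + n)

      {:get, caller} ->
        send(caller, {:total, total})
        loop(total)
    end
  end
end

pid = SumLoop.start()
send(pid, {:add, 5})
send(pid, {:add, 7})
send(pid, {:get, self()})

total =
  receive do
    {:total, value} -> value
  end

IO.inspect(total)
```

GenServer wraps this same pattern and adds the pieces you would otherwise write by hand, such as synchronous calls and supervision hooks.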

I would probably implement it as a process using an Agent, or as a GenServer that receives messages such as {:add, 5}, {:minus, 2}, {:divide, 2} etc. and updates its state accordingly. I don’t think this is a “poor” way of going about this.
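A minimal sketch of the GenServer variant could look like this. The module name `Calculator` and the client functions `apply_op/2` and `value/1` are names I made up; only the message shapes come from the post above.

```elixir
defmodule Calculator do
  # Hypothetical module, for illustration only.
  use GenServer

  # Client API
  def start_link(initial \\ 0), do: GenServer.start_link(__MODULE__, initial)
  def apply_op(pid, op), do: GenServer.cast(pid, op)
  def value(pid), do: GenServer.call(pid, :value)

  # Server callbacks
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast({:add, n}, total), do: {:noreply, total + n}
  def handle_cast({:minus, n}, total), do: {:noreply, total - n}
  # `/` always returns a float, and dividing by zero would crash this process.
  def handle_cast({:divide, n}, total), do: {:noreply, total / n}

  @impl true
  def handle_call(:value, _from, total), do: {:reply, total, total}
end

{:ok, pid} = Calculator.start_link()
Calculator.apply_op(pid, {:add, 5})
Calculator.apply_op(pid, {:minus, 2})
Calculator.apply_op(pid, {:divide, 2})

# (0 + 5 - 2) / 2
IO.inspect(Calculator.value(pid))
```

Each callback is itself a plain function from old state to new state; the process only exists to hold the latest value between messages.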

Can we use the Supervisor spec’s max_restarts option to deal with this case?

@Most yes, you can use that option to stop the restart loop.

@gausby I think most process faults are related to operations on external services rather than to faulty state? Or am I misunderstanding you?

The restart option lets the supervisor restart its child until the maximum number of restarts (within the configured time window) is exceeded, at which point the supervisor terminates itself. This allows the supervisor’s supervisor to restart it, which (hopefully) will solve the problem :)
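A small sketch of that behaviour, assuming a made-up worker called `Flaky` that crashes as soon as it starts. The values given for `max_restarts` and `max_seconds` happen to be the defaults. The calling process traps exits here only so that it can observe the supervisor giving up; in a real tree the parent supervisor would play that role.

```elixir
defmodule Flaky do
  # Hypothetical worker that always crashes right after starting.
  use Task, restart: :permanent

  def start_link(_arg), do: Task.start_link(fn -> exit(:boom) end)
end

# Trap exits so the linked supervisor's termination arrives as a message.
Process.flag(:trap_exit, true)

{:ok, sup} =
  Supervisor.start_link([Flaky],
    strategy: :one_for_one,
    # Give up once more than 3 restarts happen within 5 seconds.
    max_restarts: 3,
    max_seconds: 5
  )

result =
  receive do
    {:EXIT, ^sup, reason} -> reason
  after
    1_000 -> :still_running
  end

IO.inspect(result, label: "supervisor exit reason")
```

Running this logs a crash report for every failed attempt before the supervisor shuts down with reason `:shutdown`.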

@sashaafm dealing with the outside world is definitely a source of errors. I wouldn’t claim that I am an expert (still learning), but the state is the result of computation; if the computation relies on the external world (which it most likely will in most real-world examples), then yes, the corruption would very likely be related to external operations/services.

Please correct me if I am wrong.