How do you determine when to introduce a Process?

I’m trying to boil down the choice of when to introduce a process instead of just using modules and functions. I would love to hear any criteria you use when making this choice.

9 Likes

This is a great question!

My first intuition tells me: a process can/should be added whenever you encounter some part of your computation that is either stateful or parallelizable (or both). In the first case, a GenServer or Agent is usually what you’re looking for. In the latter, a (bunch of) Task(s).
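
To make that concrete, here’s a minimal sketch of both cases (Counter is a made-up module, and expensive_work/1 stands in for whatever computation you want to parallelize):

defmodule Counter do
  use Agent

  # Stateful: the Agent holds a counter that outlives any single call.
  def start_link(initial), do: Agent.start_link(fn -> initial end, name: __MODULE__)
  def increment, do: Agent.update(__MODULE__, &(&1 + 1))
  def value, do: Agent.get(__MODULE__, & &1)
end

# Parallelizable: fan the work out to Tasks and await the results.
# expensive_work/1 is a placeholder for the actual computation.
results =
  [1, 2, 3]
  |> Enum.map(fn x -> Task.async(fn -> expensive_work(x) end) end)
  |> Enum.map(&Task.await/1)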

I’ll definitely give this question some more thought the next few days.

7 Likes

If you are not interested in performance (e.g. parallelising the mapping of an array) but in modelling, I would say a process can be introduced for every new non-deterministic actor in the system.

4 Likes

The line I usually give in my talks is “run different things separately”, meaning power different activities (jobs) by separate processes. As an example, I gave a high-level overview of one component from my first Erlang production system two years ago at ElixirConfEU. The relevant part starts here.

What makes things/jobs different? A simple factor I use is to consider whether they can fail/succeed separately. If I need to do X and Y, and the failure of X doesn’t imply the failure of Y (or of the entire task I’m doing), then they should likely be powered by separate processes.
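
For instance, with a Task.Supervisor already running in your supervision tree (MyApp.TaskSup, do_x/0 and do_y/0 are placeholder names):

# X and Y run and fail independently; a crash in one does not take
# down the other (or the caller).
Task.Supervisor.start_child(MyApp.TaskSup, fn -> do_x() end)
Task.Supervisor.start_child(MyApp.TaskSup, fn -> do_y() end)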

An out-of-the-box example is the supervisor. A supervisor runs in a separate process, so it can do its work even if the workers themselves fail.

Another variation on the supervisor idea: say you want to start some job and report to the user when it finishes (regardless of the outcome). Then the reporter should run separately from the worker. The reporter monitors the worker, and can always know when the worker is done, even if the worker crashes or is brutally killed from the outside.
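
A rough sketch of that reporter, using spawn_monitor (notify_user/1 is just an illustrative stand-in for however you actually report back):

defmodule Reporter do
  def run(work_fun) do
    # Start the worker and monitor it; a :DOWN message arrives no matter
    # how the worker terminates: normal exit, crash, or external kill.
    {pid, ref} = spawn_monitor(work_fun)

    receive do
      {:DOWN, ^ref, :process, ^pid, reason} -> notify_user(reason)
    end
  end

  defp notify_user(reason), do: IO.puts("worker finished: #{inspect(reason)}")
end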

Yet another variant: a periodic job (cron). A separate process (let’s call it the manager) ticks, starts the worker, and monitors it. Therefore, the manager can always do its job, regardless of the worker’s success/failure.
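
A minimal sketch of such a manager as a GenServer (the interval and do_work/0 are placeholders):

defmodule Manager do
  use GenServer

  @interval :timer.minutes(1)

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  def init(_opts) do
    schedule_tick()
    {:ok, nil}
  end

  def handle_info(:tick, state) do
    # Start the worker and monitor it; the manager keeps ticking
    # whether the worker succeeds, crashes, or hangs.
    spawn_monitor(fn -> do_work() end)
    schedule_tick()
    {:noreply, state}
  end

  # The worker's outcome arrives here; log or ignore as needed.
  def handle_info({:DOWN, _ref, :process, _pid, _reason}, state) do
    {:noreply, state}
  end

  defp schedule_tick, do: Process.send_after(self(), :tick, @interval)

  defp do_work, do: :ok # placeholder for the actual periodic job
end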

Another great practical example is Phoenix channels. Each channel represents a separate conversation between the client and the server, and is powered by a separate process. If one conversation crashes, the other conversations keep working properly, and the socket is not closed. It’s also not just about crashes. As I explained in my recent ElixirDaze talk, separate processes guard you from total paralysis of the system. If one of your conversations (channels) is stuck, say due to a logical bug or suboptimal code, all the other conversations (and the socket) keep working properly.

Also, as @Qqwy hinted, separate processes sometimes make sense as an optimization technique. Splitting a large computation into parallelizable chunks might improve the running time, even if the total work is all-or-nothing (all subtasks need to succeed).
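
For example, something along these lines with Task.async_stream (large_list, process_chunk/1, and the chunk size/timeout are all illustrative):

# Each chunk runs in its own task process; by default a crashing or
# timed-out task exits the caller too, i.e. all-or-nothing semantics.
results =
  large_list
  |> Enum.chunk_every(1_000)
  |> Task.async_stream(&process_chunk/1, timeout: :timer.seconds(30))
  |> Enum.map(fn {:ok, result} -> result end)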

Another example is work which needs to allocate a larger amount of temporary memory. You can start a separate process, specify that it starts with a larger heap, do the work there, then send the result to the caller and stop the process. This leads to immediate memory release, without putting pressure on the GC.
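
A minimal sketch of that technique via :erlang.spawn_opt (the min_heap_size value, given in words, is an arbitrary example):

defmodule TempWork do
  def run_with_big_heap(fun) do
    caller = self()
    ref = make_ref()

    # Start with a larger initial heap so the temporary allocations
    # don't trigger GC; the entire heap is reclaimed in one go when
    # the process exits.
    :erlang.spawn_opt(
      fn -> send(caller, {ref, fun.()}) end,
      min_heap_size: 100_000
    )

    receive do
      {^ref, result} -> result
    end
  end
end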

So the usual factor for splitting is IMO error semantics (should failure of X lead to the failure of Y?), or, in some special cases, technical optimization.

19 Likes

I’ve heard Francesco Cesarini say a couple of times to use processes for “each concurrent activity” of your system. When I first heard it, it was too vague to help me. However, having listened to more of his talks, and especially to stories of places where Erlang Solutions has done performance tuning for clients, I’ve seen that there is a tendency for people to make too many things processes, which can serve as singleton bottlenecks.

That “each concurrent activity” is still the best guidance I have, but I look forward to reading this thread and learning other ways to communicate that idea.

5 Likes

there is a tendency for people to make too many things processes, which can serve as singleton bottlenecks.

This was my personal experience. Coming from Ruby, I had a habit of simply moving my unit of modeling from Object to Process, and so I was using Processes for all kinds of “things” in my system whether or not they were actually DOING anything.

My code got a lot simpler and a lot faster when I focused on splitting up what the data was from what I wanted to do with the data, and then I could focus on having Processes that did those actions.
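
A toy illustration of that split (all names are made up): the data stays a plain struct transformed by pure functions, and a process appears only for the action that needs one.

defmodule Order do
  defstruct [:id, :items]

  # A pure function on plain data; no process involved.
  def add_item(%Order{} = order, item),
    do: %Order{order | items: [item | order.items]}
end

# A process appears only for the action (a fire-and-forget side effect).
order = Order.add_item(%Order{id: 42, items: []}, :book)
Task.start(fn -> IO.inspect(order, label: "shipping") end)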

5 Likes

Concurrency/asynchronicity more than anything else. I do split a lot of stuff where I’d normally use objects into modules, but one pattern I’m working on is to have, e.g., a GenServer’s state reserve slots for other modules, and to call those within-process. Hmm, an example is probably in order:

defmodule InProcess do
  defmodule State, do: defstruct [:foo]

  def some_work(args, %State{} = state) do
    # Do some work... (this stub just passes the input through)
    answer = args
    {:ok, answer, %State{state | foo: args}}
  end
end

and then a Real Process™ could do this, say in a GenServer handle_call:

def handle_call(:some_call, _from, state) do
  # Basically delegate to the InProcess module
  {:ok, answer, new_inproc_state} = InProcess.some_work(42, state.inproc)
  {:reply, answer, %State{state | inproc: new_inproc_state}}
end

This way I can keep state+behavior reasonably encapsulated inside the InProcess module without having to make it a GenServer, and therefore I can easily separate the decision of “Does this need separate state/behavior?” versus “Does this need separate process scheduling?”.

3 Likes

I’ve always found a correlation between working with the Actor Model and working with hierarchical management structures within companies and other organizations that involve multiple people. (The main difference being that you can be a little bit ‘meaner’ to your processes than you can to your coworkers :stuck_out_tongue: )

I think that the “singleton bottleneck” that @gregvaughn and @benwilson512 describe could be considered a manifestation of Brooks’s Law, for instance.
Do mind that I am not at all an expert on organizational structures, so I have no idea if there are rules we could learn from management and apply to distill the “each concurrent activity” credo.

5 Likes

I think you mean Conway’s Law, but, yeah, there is likely some of that involved. I think of Conway’s Law at a coarser-grained architectural level, though. This “each concurrent activity” decision needs to be handled even in lower-level software design concerns.

2 Likes

I love this concept. I feel like it almost captures the whole challenge. It seems to imply that the stages of a pipeline shouldn’t each be a separate process, because the result of each step is needed by the next stage. Of course, as you noted, there are exceptions (Task.async_stream() comes to mind).
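
For contrast, a stage-by-stage pipeline can stay plain function composition inside a single process (parse/1, transform/1, and persist/1 are placeholder stages):

input
|> parse()
|> transform()
|> persist()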

3 Likes

Speaking very much as an Elixir newbie, but also as someone who has been around functional programming for a while now (mostly Clojure), I do think there is a definite tendency for people new to functional languages to try to find a replacement for their objects. Until you get your head around it all, it’s just natural to reach for X as the replacement for that object you feel like you need to create. In Elixir/Erlang, the most natural – but still misguided – object replacement is the process.

I realize this is sort of a negative answer to your question, but it’s the only one I have.

2 Likes

You typically use processes for modelling concurrency and for handling state. We don’t do shared global data! :grinning: Another “standard” use is for limiting and controlling errors. I don’t think a process can ever become more of a singleton bottleneck than a function, but sometimes you do have to think about splitting a concurrent activity across multiple processes.

6 Likes