Is an Elixir GenServer so different from an Object of a Class?

mikejm · December 10, 2023, 11:34pm

I am just learning Elixir. The purpose is to build a server API to interact between clients and a separate database server. I am coming from C#/C++.

I have now been reading a variety of tutorials on GenServer, as that seems to be the best point to start. It has been the first thing I’ve read about that truly makes sense in that it explains how to do truly practical tasks.

If I am understanding correctly, something that says use GenServer becomes, once started, not much different from an object of a class. The main differences I see are:

All internal variables must be stored within state, typically as %{} so you can store numerous data points as key-value.
Thus the only mutable variable inside the GenServer is state (but this is really not any big difference as you can store as many variables of any type as you like within that).
Every time one of the major functions of a GenServer type process is called (start_link, cast, call), this just triggers the callback to reference state again and allow you to mutate it or run other functions.
GenServer as a Behavior basically acts as a standard basic Elixir version of a Class you can derive from to accomplish things needing internal state maintained.

Am I wrong? It seems once you start understanding it, it’s just a different way of accomplishing the same thing.

I guess this design of storing all functions inside state and just having very few simple functions to start a process helps keep it more generic or maybe contributes to things being better able to regenerate itself? Plus the Supervisor system to let each run more autonomously?

It also seems like pid is the same as having an object reference. As I understand it every process (ie. object) gets a pid by which it can be referenced and that is how the system tracks these things. You can also give them names which have to be unique, again like objects.

Am I correct that these things are not that different and analogous in the ways I described?

If I am not wrong, I think Elixir would be well served by having an explanation of how its concepts and roles are analogous to those in object oriented programming. For example, I found this tutorial:

https://medium.com/@roydejong/elixir-a-primer-for-object-oriented-programmers-fd5ef0206943

Although he explains some things well, I think he makes it overall even more confusing. For example, he states that modules are “stateless”. At no point then does he explain how it is possible for anything in the language to maintain any type of state (which is obviously necessary to get anything realistic done - how can an application or server or process or object do anything without storing its state?).

It would be much better in my opinion to say that the primary differences in these regards are that in Elixir:

“All functions take arguments as values not references.”
“Elixir at its simplest uses what are essentially derivations of GenServer to handle states in the same way you usually would with an object of a class. All variables of the process/object should be lumped inside state inside there.”

Something like that (if remotely true) makes a lot more sense than just saying it is “stateless” or has nothing analogous to “references” which obviously makes no sense. It obviously must hold/transform variables/state in some way and we must have some way of referring to all our processes (objects) in some way.

Am I understanding the basic analogies? Any thoughts?

kokolegorille · December 10, 2023, 11:46pm

Erlang might be the only object oriented language because the 3 tenets of object oriented programming are that it’s based on message passing, that you have isolation between objects and have polymorphism.

Joe Armstrong, creator of Erlang

sodapopcan · December 11, 2023, 12:23am

In short, they are similar—but different. I’m not an expert but perhaps my layman explanation could help.

The key thing is to let go of the idea that modules and processes are in any way related. While they do work together, modules are indeed just stateless bags of functions, and processes are used to run those functions and store state. While an object has its state encapsulated within its instance and only the methods defined on its class can operate directly on that state, a process can call functions from any module and have its state altered by any of them. So a process is like an instance, yes, but there isn’t a single module that controls its state. The gen server pattern does indeed associate a single module with a process, but there is nothing forcing that—you could put functions in that same module that are called by unrelated processes. All use GenServer does is metaprogram some functions into your module that are called by a process.

For me it helped a lot to build a very basic GenServer-type thing with primitives (spawn, send, and receive). There are lots of blog posts about this (kind of too many, heh).
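
For anyone who wants to try that exercise, here is a minimal sketch of a GenServer-like counter built only from spawn, send, and receive. The module name HandRolledCounter and its functions are made up for illustration, and a real GenServer does far more (monitoring the server during a call, system messages, supervision hooks), so treat this as a toy and not as how GenServer is implemented.

```elixir
defmodule HandRolledCounter do
  # Hypothetical module: a bare-bones stand-in for what GenServer gives you.

  # Start a process that runs the receive loop with an initial count.
  def start(initial \\ 0) do
    spawn(fn -> loop(initial) end)
  end

  # Client side: fire-and-forget, roughly like GenServer.cast/2.
  def increment(pid) do
    send(pid, :increment)
    :ok
  end

  # Client side: request/reply, roughly like GenServer.call/2.
  def value(pid) do
    ref = make_ref()
    send(pid, {:value, self(), ref})

    receive do
      # Only accept the reply tagged with our own reference.
      {^ref, count} -> count
    after
      5_000 -> exit(:timeout)
    end
  end

  # Server side: the "state" is just the argument of a tail-recursive function.
  defp loop(count) do
    receive do
      :increment ->
        loop(count + 1)

      {:value, caller, ref} ->
        send(caller, {ref, count})
        loop(count)
    end
  end
end
```

Calling HandRolledCounter.start(), then increment/1 twice, then value/1 returns 2. Nothing is ever mutated: each message just causes loop/1 to be called again with a new argument.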

I dunno, does that help? It’s a bit mind bending to think about sometimes because they are very similar but ya… basically what @kokolegorille said

benwilson512 · December 11, 2023, 1:57am

This observation is a good one! This article elaborates on some of the key differences in terms of when to use processes vs functional structures: The Erlangelist - To spawn, or not to spawn?

andyleclair · December 11, 2023, 5:27am

https://hexdocs.pm/elixir/main/process-anti-patterns.html#code-organization-by-process

adw632 · December 11, 2023, 9:02am

Not wrong, only slightly inaccurate. The state that a process maintains via the process runloop is never mutated. The runloop is a recursive function, returning a new value into itself each time around the receive “loop”.

Also note that we don’t actually have loops in Elixir, Erlang or BEAM languages since valid bytecode is not allowed to “jump backwards” and therefore it cannot get “stuck” without yielding in a potentially infinite loop, unlike almost all other languages I am aware of. The BEAM bytecode can only call another function and that is also how preemption is meticulously enforced through precise “reduction accounting” and why the BEAM provides such low latency. Even bad code cannot block work and processes can always be killed without cooperating and ruining a global “OO state graph” and leaving locks everywhere.

The important thing is that both state and execution context are bound together with real encapsulation. That is not the case in “OO languages”, as Joe Armstrong famously said in the book Coders at Work:

I think the lack of reusability comes in object-oriented languages, not functional languages. Because the problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.

Yes you are correct. BEAM languages like Erlang and Elixir are the epitome of object oriented.

Alan Kay the father of the term object oriented said that Erlang is the most object oriented language.

The fact is that object oriented languages are not object oriented according to Alan Kay’s definition.

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.

Alan said he imagined objects as biological cells communicating only via messages like computers on a network.

Well that describes an Elixir or Erlang process cluster communicating using only messages. Don’t you think?

Joe Armstrong thought so…

Erlang has got all these things. It’s got isolation, it’s got polymorphism and it’s got pure messaging. From that point of view, we might say it’s the only object oriented language.

I think perhaps there could be some room for object-splaining in order, however…

… it might be confusing, confronting and potentially create cognitive dissonance for many to say Elixir is a true object oriented language to those who have been conditioned to what they believe OO is.

dimitarvp · December 11, 2023, 9:41am

Apart from the excellent answers you already received, I’ll also add that the “mutation” of state (which is not really a mutation, only the observed effect looks like mutation as @adw632 said) is serialized. Which means only one process at a time will have the message they sent to a given GenServer processed (namely the message that changes the GenServer internal state).

Which means that the BEAM VM has been made to make race conditions impossible.
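
A small sketch of what that serialization buys you, assuming a hypothetical SerialCounter module: one hundred tasks hit the same GenServer at once, and because the server handles one message at a time, no increment is lost. Note this only covers the server's own state; a caller that reads a value in one call and writes it back in a second call can still interleave with other callers.

```elixir
defmodule SerialCounter do
  use GenServer

  # Client API (hypothetical names).
  def start_link(initial \\ 0), do: GenServer.start_link(__MODULE__, initial)
  def increment(pid), do: GenServer.call(pid, :increment)
  def value(pid), do: GenServer.call(pid, :value)

  @impl true
  def init(initial), do: {:ok, initial}

  # Each message is handled to completion before the next one is taken
  # from the mailbox, so read-modify-write here cannot interleave.
  @impl true
  def handle_call(:increment, _from, count), do: {:reply, count + 1, count + 1}
  def handle_call(:value, _from, count), do: {:reply, count, count}
end

{:ok, pid} = SerialCounter.start_link()

# 100 concurrent callers, each in its own process.
replies =
  1..100
  |> Enum.map(fn _ -> Task.async(fn -> SerialCounter.increment(pid) end) end)
  |> Task.await_many()

final = SerialCounter.value(pid)
```

Here final is 100 and replies contains every number from 1 to 100 exactly once, in whatever order the tasks happened to be served.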

jhogberg · December 11, 2023, 12:25pm

One thing I don’t see mentioned in this thread is agency: a GenServer (or rather its process) is not a passive entity that is acted upon like in “traditional” OO languages, but an active one that can act entirely on its own.

This is very useful when working in domains where you have lots of objects that act independently of each other. Trying to manually juggle tons of different passive objects to simulate them being active is not an easy task.
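
To make the "active entity" point concrete, here is a rough sketch of a process that does work without anyone calling it. Heartbeat and its function names are invented for the example; it simply sends itself a message on a timer with Process.send_after/3 and counts the ticks in handle_info/2.

```elixir
defmodule Heartbeat do
  use GenServer

  # Client API (hypothetical names).
  def start_link(interval_ms), do: GenServer.start_link(__MODULE__, interval_ms)
  def beats(pid), do: GenServer.call(pid, :beats)

  @impl true
  def init(interval_ms) do
    # Schedule the first tick; nobody outside has to ask for it.
    schedule(interval_ms)
    {:ok, %{interval: interval_ms, beats: 0}}
  end

  # The process reacts to its own timer message and schedules the next one.
  @impl true
  def handle_info(:beat, state) do
    schedule(state.interval)
    {:noreply, %{state | beats: state.beats + 1}}
  end

  @impl true
  def handle_call(:beats, _from, state), do: {:reply, state.beats, state}

  defp schedule(interval_ms), do: Process.send_after(self(), :beat, interval_ms)
end
```

After Heartbeat.start_link(10) and a short sleep, Heartbeat.beats(pid) returns a number that keeps growing on its own, which a passive object would need an external scheduler thread to simulate.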

D4no0 · December 11, 2023, 12:37pm

Indeed, this is the crucial thing when we talk about client side and server side logic from “generic server”.

It is confusing when callback functions are mixed with client functions calling them, and this is a general pattern we tend to implement.

The generic examples specified by __MODULE__ callback don’t help at all, as you can have the following definition for global genservers:

GenServer.start_link(__MODULE__, [], name: __MODULE__)

You literally need to know metaprogramming features in order to understand these definitions.
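
For readers tripped up by that line, a hedged sketch of what the two occurrences of __MODULE__ stand for. The module name Settings and the helper whoami/0 are made up for illustration. __MODULE__ expands at compile time to the name of the enclosing module, so both occurrences are just the atom Settings, but they play different roles.

```elixir
defmodule Settings do
  use GenServer

  # Equivalent to: GenServer.start_link(Settings, %{}, name: Settings)
  #   1st argument: the module whose callbacks (init/1, handle_call/3, ...)
  #                 will run inside the new process.
  #   name: option: the atom the new process is registered under, so callers
  #                 can reach it without holding on to the pid.
  def start_link(_opts \\ []) do
    GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
  end

  # Shows that __MODULE__ is nothing more than the module's own name.
  def whoami, do: __MODULE__

  @impl true
  def init(state), do: {:ok, state}
end
```

Settings.whoami() returns Settings, and after Settings.start_link() the call Process.whereis(Settings) returns the pid of the started process. The callback module and the registered name only happen to be the same atom; they could just as well differ.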

dimitarvp · December 11, 2023, 12:42pm

Oh absolutely, agreed. Though nowadays I wonder if Elixir shouldn’t have made two variants of GenServer i.e. a StateHolder (can only receive messages that change state) and BackgroundTask or something. But that would require much stronger static typing and almost formal-proof level of checking and compiling so I see this as one of the reasons why it wasn’t done like this.

I’ve mentored novices, both free and for a fee, and a good part of them eventually said that a GenServer is too generic a tool and it took them some getting used to.

I am 50/50 about that though, it’s good that we have building blocks on top of which we can make better tools. And, every language and framework has its learning ramp. We can’t make it a 100% smooth and completely painless ride.

dimitarvp · December 11, 2023, 12:42pm

Oh yeah, don’t get me started on this, I had to write two small mini libraries for myself just to make the distinction between GenServer ID and name in my head.

benwilson512 · December 11, 2023, 12:43pm

I mean this is what an Agent is right? It can’t handle_info.
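
A tiny sketch of the state-holder use of Agent, for comparison. The :visits key is just an example value; the Agent functions used (start_link/1, update/2, get/2) are the standard ones.

```elixir
# The agent process holds a map; callers pass functions that read or replace it.
{:ok, agent} = Agent.start_link(fn -> %{} end)

:ok = Agent.update(agent, fn state -> Map.put(state, :visits, 1) end)
:ok = Agent.update(agent, fn state -> Map.update!(state, :visits, &(&1 + 1)) end)

visits = Agent.get(agent, fn state -> Map.get(state, :visits) end)
```

Here visits ends up as 2. Note that the functions doing the work come from Map, not from any module "owned" by the agent, which is the same point made earlier about processes and modules being separate things.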

dimitarvp · December 11, 2023, 12:44pm

True. Though Agent is even more of a confusing name. But again, I don’t object to any of that. It takes a few tries and you learn.

D4no0 · December 11, 2023, 12:49pm

I see nothing difficult about GenServer in general, once you get going. The serialization of messages is a thing you can shoot yourself in the foot with, which is a reason I steer away from doing more complicated things with GenServer.

Nonetheless it beats stone age concurrency from languages like Java, where you get undefined behavior once you introduce concurrency in 95% of cases.

dimitarvp · December 11, 2023, 12:51pm

That’s what I am saying as well – takes a little getting used to and I didn’t mind it as mentioned above. I am pointing out that it can be difficult to some novices is all.

Personally it took me half an afternoon tinkering in iex and I got it all the way to 90%, with the only exceptions being some intricacies about what does use GenServer do exactly, how do you define child_spec, what’s the difference between an ID and a name – and I got those just fine the second time I did a self-training session back in 2016.

Pfff, obviously. That’s why we are all here.

D4no0 · December 11, 2023, 12:54pm

It seems that you were lucky enough to not get memory leaks in production when using huge binaries. The best time to read about how GenServer works is when the prod server is on fire.

dimitarvp · December 11, 2023, 12:57pm

Indeed I was, but that also comes with me depriving myself of sleep in order to learn proactively before prod is on fire. So don’t bet that I was in a better position.

Back to original topic, I think OP got good answers.

Aetherus · December 11, 2023, 1:03pm

Suppose you have such a class in C#:

class Foo
{
    int Bar { get; set; } = 0;
    int Baz { get; set; } = 0;

    int Add(int bar, int baz) {
        this.Bar = bar;
        this.Baz = baz;
        return this.Bar + this.Baz;
    }
}

and an instance of that class

var foo = new Foo();

What will happen when one thread calls foo.Add(1, 2); while another thread calls foo.Add(3, 4); at the same time? You can’t guarantee the first thread gets 3 and the second thread gets 7. That’s because an object in an OOP language gives you an illusion of isolation, but actually it still has to do whatever other threads ask it to do immediately.

On the other hand, when you have a GenServer module like this:

defmodule Foo do
  use GenServer

  def start_link(_) do
    GenServer.start_link(__MODULE__, {0, 0}, name: __MODULE__)
  end

  def add(foo, bar) do
    GenServer.call(__MODULE__, {:add, foo, bar})
  end

  @impl true
  def init(state) do
    {:ok, state}
  end

  @impl true
  def handle_call({:add, foo, bar}, _, _) do
    {:reply, foo + bar, {foo, bar}}
  end
end

When two different processes call Foo.add(a, b), you can guarantee that each process gets the correct result, only that it can be a little bit slow because the calculations and state settings are done sequentially.

A process is like a man working in an office. He has a mailbox to help him queue up his jobs to do. He picks a job and handles it whenever he sees fit. If you (another process) care about the result of the job, you need to wait for that man to finish the job and send the result back to you in a mail message. If you lose your patience, it’s up to you to decide what to do next.
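
The "lose your patience" part maps onto the timeout argument of GenServer.call/3. A rough sketch, assuming a hypothetical SlowWorker that just sleeps: the caller decides how long to wait, and if the reply does not arrive in time the call exits in the caller, which the caller may catch and handle however it likes.

```elixir
defmodule SlowWorker do
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, nil)

  # timeout_ms is how long the caller is willing to wait for the reply.
  def work(pid, duration_ms, timeout_ms) do
    GenServer.call(pid, {:work, duration_ms}, timeout_ms)
  catch
    # On timeout the *caller* exits with {:timeout, _}; the worker is unaffected
    # and keeps going with the job.
    :exit, {:timeout, _} -> {:error, :gave_up}
  end

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:work, duration_ms}, _from, state) do
    # Stand-in for a slow job.
    Process.sleep(duration_ms)
    {:reply, :done, state}
  end
end
```

With a worker pid from SlowWorker.start_link(), SlowWorker.work(pid, 10, 1_000) returns :done, while SlowWorker.work(pid, 500, 50) returns {:error, :gave_up} because the caller stops waiting after 50 ms. Whether catching the exit is the right choice depends on the application; often letting the caller crash and be restarted is simpler.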

D4no0 · December 11, 2023, 1:08pm

At the end of the day this becomes a problem when you have side-effects or global state, because if we talk about the concept of a thread, it cannot run code concurrently (abstractions over threads are out of the question). The biggest problem in those languages becomes synchronization of resources (which the Erlang VM thought about out of the box).

derek-zhou · December 11, 2023, 1:09pm

Although you can make such an analogy to get you going, it is still better to understand what an Erlang process / GenServer is on its own to move up to the next level. If you latch onto the GenServer-is-Object understanding much longer, eventually you will shoot yourself in the foot. A GenServer is much heavier weight than an Object in a typical OO language; so if you try to literally translate an OO program into Elixir, you will be disappointed by the performance.
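
As a sketch of the lighter-weight alternative: when an "object" is only data plus operations on it, with no need for concurrency, shared access, or a lifecycle of its own, a struct and plain functions usually do the job without any process. The Point module below is a made-up example.

```elixir
defmodule Point do
  # Hypothetical example: plain data, no process involved.
  defstruct x: 0, y: 0

  def new(x \\ 0, y \\ 0), do: %__MODULE__{x: x, y: y}

  # "Mutation" is just returning a new value; the old one is untouched.
  def move(%__MODULE__{} = point, dx, dy) do
    %{point | x: point.x + dx, y: point.y + dy}
  end

  def distance_from_origin(%__MODULE__{x: x, y: y}) do
    :math.sqrt(x * x + y * y)
  end
end

origin = Point.new()
moved = Point.move(origin, 3, 4)
distance = Point.distance_from_origin(moved)
```

Here distance is 5.0 and origin is still at (0, 0). No messages are sent and nothing is copied between processes, so this kind of code can be called millions of times cheaply; a process only becomes worth it when the thing needs to run concurrently or be shared.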