The BEAM provides the infrastructure, but you need to write the code that glues things together into a consistent distributed system - and to be clear, once you have TWO GenServers whose state you're trying to change together, you're in distributed-system territory.
For instance, here’s a very basic “table with a lock” GenServer (see below for notes):
```elixir
defmodule TableWithLock do
  use GenServer

  defstruct [:data, :owner]

  def unlock(pid), do: GenServer.call(pid, :unlock)
  def lock(pid), do: GenServer.call(pid, :lock)
  def update(pid, fun), do: GenServer.call(pid, {:update, fun})

  @impl true
  def init(data) do
    {:ok, %__MODULE__{data: data, owner: nil}}
  end

  @impl true
  def handle_call(:lock, {pid, _tag}, state) do
    cond do
      is_nil(state.owner) ->
        # unlocked, pid now owns lock
        {:reply, :ok, %{state | owner: pid}}

      state.owner == pid ->
        # already locked by pid
        {:reply, :ok, state}

      true ->
        # locked by another process
        raise "oh no lock contention"
    end
  end

  def handle_call(:unlock, {pid, _tag}, state) do
    cond do
      is_nil(state.owner) ->
        # already not locked
        raise "somebody already unlocked this???"

      state.owner == pid ->
        # locked by the caller
        {:reply, :ok, %{state | owner: nil}}

      true ->
        # locked by somebody else
        raise "unlocking somebody else's lock"
    end
  end

  def handle_call({:update, fun}, {pid, _tag}, state) do
    cond do
      is_nil(state.owner) ->
        # not locked
        raise "not locked"

      state.owner == pid ->
        # locked by the caller
        result = fun.(state.data)
        {:reply, result, %{state | data: result}}

      true ->
        # locked by somebody else
        raise "updater not holding the lock"
    end
  end
end
```
```elixir
{:ok, pid1} = GenServer.start_link(TableWithLock, [1, 2, 3])
:ok = TableWithLock.lock(pid1)
result = TableWithLock.update(pid1, fn data -> Enum.map(data, &(&1 * 2)) end)
:ok = TableWithLock.unlock(pid1)
IO.inspect(result)
```
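To make the problem concrete: here's what happens if a second client tries to take the lock while it's held. This is illustrative only - the table is linked to whatever started it, so running this in IEx will also take down your shell process:

```elixir
# re-take the lock from the session above so it's held again
:ok = TableWithLock.lock(pid1)

# a second client hits the `raise "oh no lock contention"` branch:
# the table process crashes, this caller's GenServer.call exits,
# and anything linked to the table goes down with it
spawn(fn -> TableWithLock.lock(pid1) end)
```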
There are a LOT of places where this could work better / handle concurrency better:
- crashing the table when a second process tries to take the lock is not realistic. A better implementation would keep a queue of pids that are currently trying to take the lock in `lock` and pick the next one to reply to in `unlock` (a sketch of this appears after the list).
- crashing the table on bogus unlocks isn't realistic either. An alternative would be to return something from `handle_call` that the implementation of `unlock/1` could use to crash the calling process, since unlocking a table that you haven't locked is a logic error.
- if a process dies while holding the lock, it will never be unlocked. Tools like `Process.monitor` can help with this, at the cost of additional complexity.
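For illustration, here's roughly what those three changes could look like. This is a minimal sketch, not a drop-in replacement: it assumes the struct grows `:waiting` (initialized to `:queue.new()`) and `:monitor` fields, and the `grant_next/1` helper is made up for this example:

```elixir
@impl true
def handle_call(:lock, {pid, _tag} = from, state) do
  cond do
    is_nil(state.owner) ->
      # grant the lock and watch the new owner in case it dies
      ref = Process.monitor(pid)
      {:reply, :ok, %{state | owner: pid, monitor: ref}}

    state.owner == pid ->
      # already held by this caller
      {:reply, :ok, state}

    true ->
      # contention: park the caller instead of crashing; it gets its reply later
      {:noreply, %{state | waiting: :queue.in(from, state.waiting)}}
  end
end

def handle_call(:unlock, {pid, _tag}, state) do
  if state.owner == pid do
    Process.demonitor(state.monitor, [:flush])
    {:reply, :ok, grant_next(%{state | owner: nil, monitor: nil})}
  else
    # let the *caller* decide to crash on this logic error, not the table
    {:reply, {:error, :not_owner}, state}
  end
end

@impl true
def handle_info({:DOWN, ref, :process, _pid, _reason}, %{monitor: ref} = state) do
  # the lock holder died: release the lock and hand it to the next waiter
  {:noreply, grant_next(%{state | owner: nil, monitor: nil})}
end

defp grant_next(state) do
  case :queue.out(state.waiting) do
    {{:value, {pid, _tag} = from}, rest} ->
      ref = Process.monitor(pid)
      GenServer.reply(from, :ok)
      %{state | owner: pid, monitor: ref, waiting: rest}

    {:empty, _} ->
      state
  end
end
```

The public `unlock/1` would then pattern-match on the reply (e.g. `:ok = GenServer.call(pid, :unlock)`) so a bogus unlock crashes the caller rather than the table.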
Expanding this setup to TWO tables adds some extra complications:
- if process A takes lock 1 and then tries to take lock 2, while at the same time process B takes lock 2 and tries to take lock 1, the system is in a classic DEADLOCK situation (demonstrated after this list). The default 5s timeout on `GenServer.call` will eventually pick a winner, but real systems will detect this and complain
- coordinating changes to ensure that they either all appear or all do not is still just as tricky as always. You'd need a third process to coordinate the `TableWithLock`s and roll back changes if a future change fails.
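To make the deadlock concrete, here's roughly how it plays out, assuming the queue-based lock sketched above (with the naive version, the second `lock` call would crash the table instead of blocking):

```elixir
{:ok, table1} = GenServer.start_link(TableWithLock, [1, 2, 3])
{:ok, table2} = GenServer.start_link(TableWithLock, [4, 5, 6])

# process A: takes lock 1, then wants lock 2
spawn(fn ->
  :ok = TableWithLock.lock(table1)
  Process.sleep(100)
  # blocks here waiting for B to release table2...
  :ok = TableWithLock.lock(table2)
end)

# process B: takes lock 2, then wants lock 1
spawn(fn ->
  :ok = TableWithLock.lock(table2)
  Process.sleep(100)
  # ...while B blocks here waiting for A to release table1
  :ok = TableWithLock.lock(table1)
end)

# neither process can make progress; after the default 5s timeout each
# GenServer.call exits and takes its caller down with it
```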
Note that even the code in your example does not provide atomicity: if `mutate2` returns `false`, the changes from `mutate1` are still visible.
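As a sketch of the kind of coordination that implies - take both locks in a fixed order, snapshot, apply, and restore the snapshots on failure - something like the following. `TwoTableTransaction`, `TableWithLock.get/1`, and `TableWithLock.put/2` are all made up for this example; they don't exist in the module above:

```elixir
defmodule TwoTableTransaction do
  # Hypothetical coordinator: lock both tables in a fixed order (to dodge the
  # deadlock above), snapshot their data, apply the updates, and write the
  # snapshots back if anything blows up.
  def run(table_a, table_b, fun_a, fun_b) do
    :ok = TableWithLock.lock(table_a)
    :ok = TableWithLock.lock(table_b)

    snapshot_a = TableWithLock.get(table_a)
    snapshot_b = TableWithLock.get(table_b)

    try do
      TableWithLock.update(table_a, fun_a)
      TableWithLock.update(table_b, fun_b)
      :ok
    catch
      kind, reason ->
        # "roll back" by restoring the old data
        TableWithLock.put(table_a, snapshot_a)
        TableWithLock.put(table_b, snapshot_b)
        {:error, {kind, reason}}
    after
      :ok = TableWithLock.unlock(table_b)
      :ok = TableWithLock.unlock(table_a)
    end
  end
end
```

Even this sketch leaks failure modes: if a table process itself dies, the rollback writes and unlocks fail too, and a coordinator crash mid-rollback leaves the tables inconsistent.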
Solving this problem correctly is capital-H Hard and the solutions are highly sensitive to exactly what tradeoffs your particular application can tolerate.