In the book “Elixir in Action” I remember reading somewhere that you should have ETS wrapped in a GenServer because this way the GenServer watches over the ETS and is able to restart it should something fail.
However, I have seen in another book the opposite. People just create ETS tables in the application.ex file together with the app supervisor:
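defmodule MyApp.Application do
  @moduledoc false
  use Application

  def start(_type, _args) do
    # MyApp.A and MyApp.B stand in for the application's real child modules.
    children = [MyApp.A, MyApp.B]

    # The table is created by (and therefore owned by) the process calling start/2.
    :ets.new(:my_table, [:public, :named_table])

    opts = [strategy: :one_for_one]
    Supervisor.start_link(children, opts)
  end
end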
Question
Which approach is better? Isn’t it dangerous to put the ETS table at the app supervisor level (if the table crashes, the whole app follows, right?)
I’d think less in terms of “if ETS fails” or “which approach is better” and more in terms of data locality and data consistency. An ETS table that is, for example, a read-through cache for some external resource should be co-located with the process accessing that resource. Maybe they should even restart together in certain scenarios to prevent stale data in the cache, but also because a root application process shouldn’t really be aware of implementation details of subsystems.
Whenever you’re asking yourself how to structure a supervision tree, it’s about which things belong together and need to be restarted together to bring the system back to a known state while minimizing the amount of disruption to unrelated systems. I’d suggest watching @ferd’s talks or reading his blog posts on the topic.
The parts in the middle of the talk are the most relevant, but I’d suggest watching the whole thing. There’s also a blog post with the same title (and the others are good as well).
If you’re planning to crash ETS then yes, putting it in the root process is not the way to go, but take the argument with a grain of salt, because if you’re crashing ETS I’d imagine it’s not the only thing going badly in the system.
This is the key insight here. By having a GenServer as the owner of an ETS table, and placing that owner in a proper place in the supervision tree, we can ensure the desired stop and restart behaviour on a case-by-case basis. In other words, it’s equally straightforward to ensure that the cache is purged on termination, as well as to ensure that the cached data lives on after the restart.
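A minimal sketch of such an owner process, to make the idea concrete. The module name `MyApp.TableOwner` and the table options are assumptions for illustration, not something taken from either book:

```elixir
defmodule MyApp.TableOwner do
  # Hypothetical module: its only job is to create and own the ETS table.
  use GenServer

  @table :my_table

  def start_link(_opts) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  @impl true
  def init(nil) do
    # The table belongs to this process, so ETS deletes it when this
    # process terminates. A restart runs init/1 again and creates a new one.
    table = :ets.new(@table, [:named_table, :public, read_concurrency: true])
    {:ok, table}
  end
end
```

Where `MyApp.TableOwner` appears in the tree then decides the lifetime of the data: listed next to the processes that use it, the table is recreated whenever that subtree restarts; listed higher up, it outlives restarts further down.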
Occasionally, I find myself using a supervisor as the owner of a table, but more often I tend to have a dedicated GenServer for that purpose. I used to start tables during app start too, but I never do this anymore. The small convenience gain is not worth the confusion that comes with this approach (e.g. the data lives on until the app is stopped, and it’s not clear from the code where the table is created and who owns it).
10.3.2: You might wonder why GenServer is still used in the ETS-based key/value store. The sole purpose of this process is to keep the table alive. Remember, an ETS table is released from memory when the owner process terminates. Therefore, you need to have a distinct, long-running process that creates and owns the table.
The GenServer simply acts as the owner process. When the owner process is killed by the supervisor the ETS table goes with it. Then the supervisor restarts the GenServer which in turn creates a fresh ETS table.
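The restart behaviour described here can be observed with a small script. This is only a sketch: `Demo.TableOwner`, `Demo.Restart` and the table name `:demo_table` are made up for the example, and the polling helper exists purely so the script can wait for the supervisor to finish the restart.

```elixir
defmodule Demo.TableOwner do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(nil), do: {:ok, :ets.new(:demo_table, [:named_table, :public])}
end

defmodule Demo.Restart do
  # Kills the current table owner, then waits until a different
  # process (the restarted GenServer) owns a table with the same name.
  def kill_owner_and_wait do
    old_owner = :ets.info(:demo_table, :owner)
    Process.exit(old_owner, :kill)
    wait_for_new_owner(old_owner)
  end

  defp wait_for_new_owner(old_owner) do
    case :ets.info(:demo_table, :owner) do
      pid when is_pid(pid) and pid != old_owner ->
        pid

      _table_missing_or_still_old_owner ->
        Process.sleep(10)
        wait_for_new_owner(old_owner)
    end
  end
end

{:ok, _sup} = Supervisor.start_link([Demo.TableOwner], strategy: :one_for_one)
:ets.insert(:demo_table, {:answer, 42})

Demo.Restart.kill_owner_and_wait()

# The old table died with its owner; the new one no longer contains :answer.
:ets.lookup(:demo_table, :answer)
```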
if the table crashes
How would a table crash? Granted, the contents can get corrupted, which can cause processes that rely on the table to crash. If that can happen, then it makes sense to run the owner process and the client process under a :one_for_all supervisor, so that a client crash also terminates the owner process and a fresh table is created on restart.
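A sketch of that arrangement, with hypothetical module names (`Cache.Owner`, `Cache.Client`, `Cache.Supervisor`). For the sake of the example, the client treats anything other than exactly one `{key, value}` entry as corrupted data and crashes on it:

```elixir
defmodule Cache.Owner do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(nil), do: {:ok, :ets.new(:cache, [:named_table, :public])}
end

defmodule Cache.Client do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(nil), do: {:ok, nil}

  @impl true
  def handle_call({:fetch, key}, _from, state) do
    # If the lookup does not return exactly one {key, value} tuple,
    # this match fails and the client process crashes.
    [{^key, value}] = :ets.lookup(:cache, key)
    {:reply, value, state}
  end
end

defmodule Cache.Supervisor do
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    # The owner is listed first so the table exists before the client starts.
    # With :one_for_all, a crash of either child restarts both, so a client
    # crash also replaces the table with an empty one.
    children = [Cache.Owner, Cache.Client]
    Supervisor.init(children, strategy: :one_for_all)
  end
end
```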
But there are other use cases where the ETS table acts as a backing store that bridges the gap when one process terminates and is replaced with a new one - in that case it makes sense to have the supervisor own the table (Don’t lose your ETS tables).
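One way to sketch a supervisor-owned table is to create it in the supervisor's `init/1` callback, which runs inside the supervisor process. The names `Store.Supervisor`, `Store.Worker` and `:store` are assumptions for the example, and this is not the technique from the linked article, which as far as I recall relies on the `heir` option and `:ets.give_away/3` instead.

```elixir
defmodule Store.Worker do
  # Hypothetical worker that uses the table but does not own it.
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(nil), do: {:ok, nil}
end

defmodule Store.Supervisor do
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    # init/1 runs in the supervisor process, so the supervisor becomes the
    # owner. The check guards against init/1 being invoked more than once
    # in the same process.
    if :ets.whereis(:store) == :undefined do
      :ets.new(:store, [:named_table, :public])
    end

    Supervisor.init([Store.Worker], strategy: :one_for_one)
  end
end
```

With this layout, `Store.Worker` can crash and be restarted without the table being touched; the data only goes away when `Store.Supervisor` itself terminates.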