Background
Let’s say I have a process that reads from a file and saves data into an ETS table. This ETS table must be publicly available to all processes so they can read it (only read access is necessary).
Issue
The issue here is that reading from a file is not guaranteed to succeed. It may go well, or it may fail. Our app depends on this module reading the file and placing its information in an ETS table.
After reading this article:
https://ferd.ca/it-s-about-the-guarantees.html
It is clear to me that the module that reads the file and fills the ETS table (let’s call it Reader) can’t rely on everything going as expected, nor should MyApp, which uses this module, rely on that assumption.
Solution?
So the best practice here would be to have Reader in MyApp’s supervision tree, as a process that simply returns {:ok, state} from its init function. After that it would try to populate the table, which may or may not succeed. If it succeeds, processes will be able to read from the table; if it fails, it tries again every 30 seconds or so (it retries forever).
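A minimal sketch of that idea, assuming Reader is a GenServer. The module name RetryingReader, the :path, :table and :retry_ms options, and the one-row-per-line file format are all hypothetical and only there to make the example runnable:

```elixir
defmodule RetryingReader do
  use GenServer

  # Hypothetical default; the post only says "every 30 seconds or so".
  @retry_ms 30_000

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.get(opts, :name, __MODULE__))
  end

  @impl true
  def init(opts) do
    table = Keyword.get(opts, :table, :reader_data)
    # :protected lets only the owner (this process) write, while any process can read.
    :ets.new(table, [:named_table, :set, :protected, read_concurrency: true])

    state = %{
      table: table,
      path: Keyword.fetch!(opts, :path),
      retry_ms: Keyword.get(opts, :retry_ms, @retry_ms)
    }

    # init returns right away; the load itself runs afterwards in handle_continue.
    {:ok, state, {:continue, :load}}
  end

  @impl true
  def handle_continue(:load, state), do: {:noreply, try_load(state)}

  @impl true
  def handle_info(:load, state), do: {:noreply, try_load(state)}

  defp try_load(state) do
    case File.read(state.path) do
      {:ok, contents} ->
        # Assumed format: one row per line, keyed by its zero-based line index.
        rows =
          contents
          |> String.split("\n", trim: true)
          |> Enum.with_index()
          |> Enum.map(fn {line, index} -> {index, line} end)

        :ets.insert(state.table, rows)

      {:error, _reason} ->
        # Reading failed: schedule another attempt and stay alive.
        Process.send_after(self(), :load, state.retry_ms)
    end

    state
  end
end
```

Because the supervisor only waits for init, a missing or unreadable file delays the data but does not stop MyApp from booting.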
The challenge here is “How will my client, which depends on the data on the ETS table, know the table is ready?”
Two options come to my mind:
- Ask the Reader process every time you need to make a read. Unfortunately, this makes the process a big bottleneck, because every request has to go through it.
- Have a special value in the ETS table that says whether the table is ready for use. Reader would be the only process able to update this value.
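The second option could look roughly like the sketch below. The module name MarkedTable, the table name and the :__ready__ key are made up for illustration, and the sketch assumes the table has already been created by the time anyone reads from it:

```elixir
defmodule MarkedTable do
  # Hypothetical names, not taken from the post.
  @table :marked_table_demo
  @ready_key :__ready__

  # Must be called by the owning process (Reader in the post),
  # since a :protected table only accepts writes from its owner.
  def create do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    :ok
  end

  # Also owner-only. The marker is written last, so readers never
  # treat a half-filled table as ready.
  def populate(rows) do
    :ets.insert(@table, rows)
    :ets.insert(@table, {@ready_key, true})
    :ok
  end

  # Callable from any process; reads never go through the owner.
  def fetch(key) do
    case :ets.lookup(@table, @ready_key) do
      [{@ready_key, true}] ->
        case :ets.lookup(@table, key) do
          [{^key, value}] -> {:ok, value}
          [] -> {:error, :not_found}
        end

      [] ->
        {:error, :not_ready}
    end
  end
end
```

The cost is one extra lookup per read and a reserved key that real data must never use, which is the mixing of concerns the question worries about.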
Question
Overall, option 2 would be more scalable, but I am not sure about having special values in data tables. It doesn’t feel right to mix accessibility with data.
How do you solve this problem?
What patterns or solutions do you use?
Just have the client not start until the process that populates the ETS table has fully started up? Otherwise I’d probably use a marker value, or a persistent term if it’s global to a single node.
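For the persistent term variant, a small sketch could be the following. The ReadyFlag module and its key are hypothetical; :persistent_term itself ships with Erlang/OTP and is local to one node:

```elixir
defmodule ReadyFlag do
  # Hypothetical key; namespaced with the module to avoid collisions.
  @key {__MODULE__, :table_ready}

  # Called by the process that finished populating the table.
  def mark_ready, do: :persistent_term.put(@key, true)

  # Cheap read from any process; defaults to false until the flag is set.
  def ready?, do: :persistent_term.get(@key, false)

  # Called if the data has to be considered unavailable again.
  def clear do
    :persistent_term.erase(@key)
    :ok
  end
end
```

This keeps the readiness flag out of the data table. Persistent terms are optimized for reads, and changing or erasing one is comparatively expensive, so this fits a flag that flips rarely.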
Let’s say you have a Loader and a Reader. Both are under the same supervisor (one-for-one, e.g.).
Loader
- It starts up, returning {:ok, ...} from init unless something catastrophic occurs. Within its state, it stores a :status value which is set to :initializing.
- It attempts to load data in memory (probably in a handle_continue). It updates the :status according to the result: :data_loaded if successful, or (e.g.) {:error, :reason_1}, {:error, :reason_2}, etc.
Until the :status value is :data_loaded, all calls to Loader will return (e.g.) {:unavailable, reason} where reason is (e.g.) the :status value.
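The Loader described above might be sketched like this. StatusLoader, the :load_fun option and the fetch/1 call are assumptions made for the example, and the data is kept in a plain map in the process state to keep it short, where the original question uses an ETS table:

```elixir
defmodule StatusLoader do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def status, do: GenServer.call(__MODULE__, :status)

  def fetch(key), do: GenServer.call(__MODULE__, {:fetch, key})

  @impl true
  def init(opts) do
    # :load_fun is a hypothetical zero-arity function that returns
    # {:ok, map} or {:error, reason}.
    load_fun = Keyword.fetch!(opts, :load_fun)
    state = %{status: :initializing, data: %{}, load_fun: load_fun}
    {:ok, state, {:continue, :load}}
  end

  @impl true
  def handle_continue(:load, state) do
    case state.load_fun.() do
      {:ok, data} -> {:noreply, %{state | status: :data_loaded, data: data}}
      {:error, reason} -> {:noreply, %{state | status: {:error, reason}}}
    end
  end

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state.status, state}

  # Data is only served once the status is :data_loaded.
  def handle_call({:fetch, key}, _from, %{status: :data_loaded} = state) do
    {:reply, Map.fetch(state.data, key), state}
  end

  # In every other status, callers learn why the data is unavailable.
  def handle_call({:fetch, _key}, _from, state) do
    {:reply, {:unavailable, state.status}, state}
  end
end
```

Since handle_continue runs before the process handles any other message, a caller never observes the :initializing status from outside in this sketch; it would become visible if the load were moved to a later message.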
Reader
- It starts up, returning {:ok, ...} from init unless something catastrophic occurs. Within its state, it stores a :status value which is set to :initializing.
- It checks on the Loader to see if it’s ready or not. If the Loader reports the :data_loaded status, the Reader updates its status to :online. If the Loader returns a different status, Reader can either try to trigger some actions to “fix” the Loader, or choose to wait and check on its state again later (e.g. if it’s some other process’ responsibility to get the Loader to load the data properly).
Until the :status value is :online, all calls to Reader will return (e.g.) {:error, :data_not_loaded}. This assumes of course that you want Reader to be available to callers if the data isn’t loaded. If that’s not the case, you can attempt contacting Loader a few times within handle_continue callbacks (possibly with a back off strategy) before finally giving up and returning {:stop, :unable_to_load_data, new_state} and bringing everything down.
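A sketch of the Reader side with a bounded number of checks and a doubling delay between them. BackoffReader and the :check_fun, :max_attempts and :initial_delay_ms options are hypothetical; :check_fun stands in for whatever call asks the Loader for its status. Note that the retries here are scheduled with Process.send_after and handled in handle_info, a variation on doing everything inside handle_continue, so the Reader can still answer calls between attempts:

```elixir
defmodule BackoffReader do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  def status(pid), do: GenServer.call(pid, :status)

  @impl true
  def init(opts) do
    state = %{
      status: :initializing,
      # Hypothetical zero-arity function returning the Loader's status.
      check_fun: Keyword.fetch!(opts, :check_fun),
      attempts_left: Keyword.get(opts, :max_attempts, 5),
      delay_ms: Keyword.get(opts, :initial_delay_ms, 100)
    }

    {:ok, state, {:continue, :check_loader}}
  end

  @impl true
  def handle_continue(:check_loader, state), do: check(state)

  @impl true
  def handle_info(:check_loader, state), do: check(state)

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state.status, state}

  # Out of attempts: stop, and let the supervisor decide what happens next.
  defp check(%{attempts_left: 0} = state) do
    {:stop, :unable_to_load_data, state}
  end

  defp check(state) do
    case state.check_fun.() do
      :data_loaded ->
        {:noreply, %{state | status: :online}}

      _other ->
        # Not ready yet: check again later, doubling the delay each time.
        Process.send_after(self(), :check_loader, state.delay_ms)

        {:noreply,
         %{state | attempts_left: state.attempts_left - 1, delay_ms: state.delay_ms * 2}}
    end
  end
end
```

If the Reader should stay up in a degraded state instead, the first check/1 clause can be dropped so that it keeps polling.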
Further considerations
In the above, Loader will be restarted if it crashes, and may reload the data. If this isn’t desirable, you’ll obviously need to adapt the above.
I see. So the Reader will only talk to Loader during the initialization phase to check if the data is loaded. Once the Reader knows the Loader is ready, it simply starts accessing the table directly.
Thanks!
Exactly. The nice thing with this concept of “levels” that build on top of each other (e.g. init -> data loaded into cache -> DB available -> external service available) is that by designing your system to provide more and more functionality as its dependencies come online, you’re able to easily downgrade it when things start going wrong (e.g. DB down for maintenance? Downgrade to “cache loaded” service level). And since the system (both your service and its callers) was built to handle the intermediary states (where not everything is working), everything will be more resilient to hiccups in the system.
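One way to picture the levels is a function that walks them in order and stops at the first dependency that is down. The ServiceLevel module, the level atoms and the shape of the deps map are all hypothetical, and List.last/2 with a default needs Elixir 1.12 or later:

```elixir
defmodule ServiceLevel do
  # Ordered from least to most capable, following the example in the post.
  @levels [:init, :cache_loaded, :db_available, :external_service_available]

  # deps is a map such as %{cache_loaded: true, db_available: false}.
  # A level counts only if every level below it is also up.
  def current(deps) do
    @levels
    |> Enum.drop(1)
    |> Enum.take_while(fn level -> Map.get(deps, level, false) end)
    |> List.last(:init)
  end
end
```

With the DB down, ServiceLevel.current(%{cache_loaded: true, db_available: false}) evaluates to :cache_loaded, which matches the downgrade described above.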