ETS race condition?

tjdam · June 10, 2021, 3:57am

Hi, all!

So I’ve been using ETS to cache some data, the result from a series of queries so I don’t have to hit the database so often. The user sends a request, if there is an entry for it on ETS it returns the data and if not, it queries the database, returns it to the user and also adds to an ETS table.

To make sure the user always gets the most up-to-date data, the solution I had in mind was this: every time some of the data changes, the update function calls the GenServer I wrote as an interface to ETS and deletes the cached data. The next user to send a request then gets fresh data from the database, and that fresh data repopulates the ETS table.

def update_content(attrs) do
    Cache.delete(:key, :table_name)

    %Content{}
    |> Content.changeset(attrs)
    |> Repo.update()
end

The GenServer looks like this:

def delete(key, table) do
    GenServer.cast(__MODULE__, {:delete, key, table})
end

def handle_cast({:delete, key, table}, state) do
    :ets.delete(table, key)
    {:noreply, state}
end

What is happening though is that sometimes the table entry is deleted, sometimes it’s not. The only thing I can think of is some sort of race condition going on, but I have no idea how that would be the case. Any ideas?

Also (except for the fact it’s not doing what I wanted it to do), is there anything wrong/inefficient with this approach?

I’m pretty new to ETS as well as writing my own GenServers, so this could be a very silly question…

Thank you!

mudasobwa · June 10, 2021, 6:50am

GenServer.cast/2 is asynchronous, so the delete handled in handle_cast/2 is not guaranteed to have run by the time cast returns. Switching to a synchronous GenServer.call/2, handled in handle_call/3, would probably fix the issue.
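
For illustration, here is a minimal sketch of what the synchronous variant could look like. The module name CallCache and the choice to create a :public named table in init/1 are assumptions made only so the example runs on its own; the original post does not show how the table is created.

```elixir
defmodule CallCache do
  use GenServer

  # Hypothetical module; the table name is passed in so the server can create it.
  def start_link(table) do
    GenServer.start_link(__MODULE__, table, name: __MODULE__)
  end

  # Same argument order as the original delete/2, but GenServer.call/2
  # blocks the caller until the server has replied.
  def delete(key, table) do
    GenServer.call(__MODULE__, {:delete, key, table})
  end

  @impl true
  def init(table) do
    # Assumption: a public named table owned by this server.
    :ets.new(table, [:named_table, :set, :public])
    {:ok, table}
  end

  @impl true
  def handle_call({:delete, key, table}, _from, state) do
    :ets.delete(table, key)
    # The reply is sent only after the entry has been removed.
    {:reply, :ok, state}
  end
end
```

With this shape, any code that runs after Cache.delete/2 returns can rely on the entry already being gone from the table.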

In general, using a GenServer as an interface to :ets is an anti-pattern, because you basically put a bottleneck (the GenServer’s mailbox) in front of :ets, ruining its performance. If you want to stick with a GenServer, I’d suggest using an Agent instead of :ets.

Or, use :ets as is, without wrapping it into GenServer.

dmitrykleymenov · June 10, 2021, 7:43am

You have GenServer.call, but you handle a cast. Either you have another handler or this code is invalid. It’s hard to tell what’s happening, so please check the current implementation and update the post.

tjdam · June 10, 2021, 10:04am

Wouldn’t that cause the data to be garbage collected?
Thank you for your answer! I actually didn’t quite get why I should use a GenServer. I did it because most of the stuff I read on ETS used one and because I wanted to play more with GenServers.

Oh, that was my mistake here, in the actual code it’s cast all through. Thank you!

dimitarvp · June 10, 2021, 10:08am

Playing creates habits that are hard to unlearn after. Do things that make sense in a production context.

GenServer centralizes access to whatever it holds underneath. One of its uses is exactly this: making sure that access to something is linear / centralized, i.e. no more than one actor can access the thing at a time.

:ets in contrast is specifically crafted to be as fast as possible and to tolerate parallel access (if the proper flags are passed to the table when created).
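
As a rough sketch of what "the proper flags" can look like: the :public, read_concurrency and write_concurrency options below are real :ets.new/2 options, while the module name ConcurrentTable and its functions are hypothetical helpers written for this example. Whether these flags actually help depends on the workload.

```elixir
defmodule ConcurrentTable do
  # Creates a named table that any process may read and write,
  # tuned for concurrent access.
  def create(name) do
    :ets.new(name, [
      :named_table,
      :set,
      :public,
      read_concurrency: true,
      write_concurrency: true
    ])
  end

  # Looks up the same key from several processes at once,
  # without going through any single owning process.
  def parallel_lookup(name, key, readers) do
    1..readers
    |> Enum.map(fn _ -> Task.async(fn -> :ets.lookup(name, key) end) end)
    |> Enum.map(&Task.await/1)
  end
end
```

Each task reads the table directly from its own process, so no mailbox sits between the readers and the data.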

So indeed, using both in tandem is rather weird.

tjdam · June 10, 2021, 10:16am

Fair enough. It’s just that I’m at that point where I can’t really make this assessment, though. I follow recipes then I go and try to understand it the best I can. Asking questions in this forum is probably where I learn the most!

Yeah, that makes sense. I’m thinking I call a module creating the tables I need at the Application module then just call :ets functions straight up, yeah?

dimitarvp · June 10, 2021, 10:17am

Even better, you can put an entry inside your application.ex file (where OTP workers of your app are described and then started) and you’ll have any table you want created on your app’s startup. You must give it a name though and then refer to the table by it.

tjdam · June 10, 2021, 10:45am

Oh right! There’s no need for it to be under a supervision tree…!
That’s really cool, thanks for the tip!

dimitarvp · June 10, 2021, 10:51am

To clarify: it’s still going to be under a supervision tree but not one that you manage manually. It’ll be that of your app where Erlang takes care of bootstrapping all workers on your app’s startup. It’s still OTP mechanics but they are being taken care of for you.
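
One way to read this suggestion, sketched under assumptions: a small process that only creates and owns the table is listed among the application's children, so the table exists as soon as the app has started. The names StartupTableOwner, DemoApp.Application, DemoApp.Supervisor and :startup_cache are hypothetical and do not come from the thread.

```elixir
defmodule StartupTableOwner do
  use GenServer

  def start_link(table), do: GenServer.start_link(__MODULE__, table)

  @impl true
  def init(table) do
    # The table is owned by this process and lives as long as it does.
    :ets.new(table, [:named_table, :set, :public, read_concurrency: true])
    {:ok, table}
  end
end

defmodule DemoApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # The owner is just another child of the application's supervisor.
    children = [{StartupTableOwner, :startup_cache}]
    Supervisor.start_link(children, strategy: :one_for_one, name: DemoApp.Supervisor)
  end
end
```

Once the application is up, other code can call :ets functions on :startup_cache by name.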

tjdam · June 10, 2021, 10:53am

As I wrote my answer I thought of it, haha. There’s no code in the project that would not be part of the tree, right?

dimitarvp · June 10, 2021, 10:54am

Yep, that’s correct. That’s what the BEAM VM is about.

tjdam · June 10, 2021, 10:56am

Thank you again. By knowing when NOT to use a GenServer I got to understand it a little better today!

derek-zhou · June 10, 2021, 1:50pm

Not necessarily. I often use a GenServer to own the ETS table and to serialize the writes so my data stays consistent, while allowing other processes to read the ETS table directly (but still via client functions in the same module, for encapsulation).
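
A minimal sketch of that pattern, assuming a single :protected table: writes go through the owning GenServer, while reads hit ETS directly from the caller's process. The module name WriteSerializedCache and its function names are hypothetical.

```elixir
defmodule WriteSerializedCache do
  use GenServer

  @table :write_serialized_cache

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Read path: runs in the caller's process, no message to the server.
  def fetch(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # Write path: serialized through the owning process.
  def put(key, value), do: GenServer.call(__MODULE__, {:put, key, value})
  def delete(key), do: GenServer.call(__MODULE__, {:delete, key})

  @impl true
  def init(_opts) do
    # :protected means only the owner may write, but any process may read.
    table = :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    {:ok, table}
  end

  @impl true
  def handle_call({:put, key, value}, _from, table) do
    true = :ets.insert(table, {key, value})
    {:reply, :ok, table}
  end

  def handle_call({:delete, key}, _from, table) do
    true = :ets.delete(table, key)
    {:reply, :ok, table}
  end
end
```

Callers only ever use fetch/1, put/2 and delete/1, so the split between the direct read path and the serialized write path stays an implementation detail of the module.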

dimitarvp · June 10, 2021, 2:05pm

Yeah, serializing only the writes seems like a perfect use-case for GenServer.

keathley · June 10, 2021, 2:32pm

This is not an anti-pattern in any way. It completely depends on what your use cases are and what you’re trying to achieve. Also, using an Agent is just as much of a bottleneck as using a GenServer, so this idea doesn’t improve anything.

No, it isn’t. It’s very normal to have a GenServer start an ETS table and serialize writes to that table, but allow the callers to read from the table directly. It’s also normal to have a private ETS table in a GenServer that only that GenServer can read from. If that GenServer needs to manage thousands of records, reads from an ETS table will have much higher throughput than reads from, say, a map. In either case, there’s nothing inherently wrong with using ETS tables inside of a process.

ETS tables are attached to the process that creates them. So if the creating process dies or is shut down, the ETS table is collected. By creating the table in the application supervisor, you’re essentially saying, “I want this table to live for as long as the application does.” That is not always correct. You may want to think about the life cycle of the table, and create and supervise it so that the table is cleaned up correctly.
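
The ownership rule can be observed with a small sketch like the one below. The module TableLifetime and the table name :short_lived are hypothetical; :ets.whereis/1 (available since OTP 21) is used to check whether the named table still exists.

```elixir
defmodule TableLifetime do
  # Returns {exists_while_owner_alive, exists_after_owner_exit}.
  def demo do
    parent = self()

    owner =
      spawn(fn ->
        # The spawned process creates, and therefore owns, the table.
        :ets.new(:short_lived, [:named_table, :public])
        send(parent, :created)

        receive do
          :stop -> :ok
        end
      end)

    receive do
      :created -> :ok
    end

    exists_before = :ets.whereis(:short_lived) != :undefined

    # Stop the owner and wait until it is confirmed down.
    ref = Process.monitor(owner)
    send(owner, :stop)

    receive do
      {:DOWN, ^ref, :process, ^owner, _reason} -> :ok
    end

    exists_after = :ets.whereis(:short_lived) != :undefined
    {exists_before, exists_after}
  end
end
```

The same mechanism is why the choice of owning process, and where it sits in the supervision tree, decides how long the cached data survives.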

As for your actual question, the main issue here is that you’re using cast instead of call. In this case, putting an ETS table inside of a GenServer might be the wrong approach. Maybe you don’t need an ETS table or a GenServer or a cache at all. But I hope that you and others don’t assume that putting ETS tables in processes is wrong in general, because it’s not.