What would be the “proper” way to implement a simple persistent key/value store, for settings that are read far more than they are written? In Go I used a single Postgres table and on application start-up I read it into a mutex-backed struct. The write method would persist the setting in the database.
I wouldn’t necessarily need to use Postgres as a backing store, but that way the settings get backed up with the rest of the data. That being said, I’m not against using a non-postgres backing store, if that makes sense.
I’ve looked a bit into ETS and Mnesia, and I also read somewhere that it would be a good idea to build a GenServer around a key/value store. I feel a bit overwhelmed by the choices and would like a pointer in the right direction. Thank you.
ETS is not persistent; DETS and Mnesia are. As you said, there are many choices, so your usage pattern and what you feel more comfortable with play a big part.
If I were you I would just use Postgres, especially if you are going to need a relational DB down the road anyway.
GenServers are great if you don’t need to scale, since they are notorious for becoming a bottleneck. If you only need a single node, the best persistent key/value store is DETS, or persistent_term (which keeps data in memory only, so it does not survive restarts on its own).
The Erlang docs are pretty comprehensive, especially if you know your constraints.
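For reference, the persistent_term API is tiny. A minimal sketch (the `{MyApp, :settings}` key is just an illustrative namespacing convention):

```elixir
# persistent_term is optimized for data that is read constantly and
# written almost never; each update triggers a global scan, so keep
# writes rare.
:persistent_term.put({MyApp, :settings}, %{theme: "dark"})

:persistent_term.get({MyApp, :settings})
# => %{theme: "dark"}

# get/2 takes a default for missing keys instead of raising
:persistent_term.get({MyApp, :missing_key}, nil)
# => nil
```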
Thanks. I think you are right, I feel comfortable with Postgres so I’ll stick to that. So now I just need to figure out how to cache the table in memory and trigger a write to the DB only when the data has been modified.
I see. I saw something about DETS, but assumed it was probably distributed ETS, hence overkill for my needs and still not persistent. I guess I was wrong. Thank you! I’ll look deeper into DETS, this time without prejudice.
If you’re using Postgres, use Postgres. Otherwise use SQLite. Create a table with binary keys/values and then use ETS as a write-through cache.
Always write to the DB first and then the cache before returning from the write function. Always read from the cache.
def put(key, value) do
  # upsert, so writing an existing key updates it instead of raising
  # on the unique constraint
  Repo.insert!(%Row{key: key, value: value},
    on_conflict: [set: [value: value]],
    conflict_target: :key
  )

  :ets.insert(@table, {key, value})
end

def get(key) do
  case :ets.lookup(@table, key) do
    [{^key, value}] -> value
    [] -> nil
  end
end
Load the rows from the DB into the cache at startup. If you want to use arbitrary terms just encode them with term_to_binary().
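For example, a round-trip looks like this:

```elixir
settings = %{retries: 3, hosts: ["a", "b"]}

# encode before writing to the binary value column
bin = :erlang.term_to_binary(settings)

# decode after reading it back; pass [:safe] if the data
# could ever come from an untrusted source
^settings = :erlang.binary_to_term(bin)
```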
Edit: the put() function above is only correct for a single writer, meaning it cannot be used from multiple processes without synchronization. See @Asd’s more thorough answer below for an example that uses a GenServer as a single writer to serialize writes.
Here’s how I’d do it if you want it backed by a Postgres table:
defmodule Table do
  @moduledoc "ETS table backed by a Postgres table"
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  def init(_opts) do
    table = :ets.new(:table_name, [:protected, :set, :named_table])
    # Row is the Ecto schema from the earlier example
    entries = for %{key: k, value: v} <- Repo.all(Row), do: {k, v}
    :ets.insert(table, entries)
    {:ok, %{table: table}}
  end

  def handle_call({:write, key, value}, _from, %{table: table} = state) do
    # upsert, so writing an existing key updates it instead of raising
    Repo.insert!(%Row{key: key, value: value},
      on_conflict: [set: [value: value]],
      conflict_target: :key
    )

    :ets.insert(table, {key, value})
    {:reply, :ok, state}
  end

  def read(key) do
    case :ets.lookup(:table_name, key) do
      [{_, value}] -> {:ok, value}
      _ -> :error
    end
  end

  def write(key, value) do
    GenServer.call(__MODULE__, {:write, key, value})
  end
end
But if you’re okay with just a file on disk, I’d consider using DETS.
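A minimal DETS sketch, in case it helps (the file and table names here are made up):

```elixir
# open (or create) the table file on disk
{:ok, table} = :dets.open_file(:settings, file: ~c"settings.dets", type: :set)

:ok = :dets.insert(table, {:theme, "dark"})

case :dets.lookup(table, :theme) do
  [{:theme, value}] -> value
  [] -> nil
end
# => "dark"

# flush and close cleanly on shutdown
:ok = :dets.close(table)
```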
However, I am going to release a much more performant persistent LSM-based KV db in the upcoming months, so I will reply here again once it’s ready.
Consider two processes calling the put function at the same time. If one does put("key", 1) and the other does put("key", 2), it is possible that the order of operations would be:

1. process A writes ("key", 1) to the DB
2. process B writes ("key", 2) to the DB
3. process B writes ("key", 2) to the cache
4. process A writes ("key", 1) to the cache

Now the DB says 2 but the cache says 1, and every read returns the stale value until the next write.
Aha! Yes, a locking mechanism is a must. In Go I used a mutex to lock during a write. That is what I wanted to be able to achieve in Elixir, but I’m not quite there yet. So GenServer is the best path then?
I’m not against using a file, but I’m quite biased towards using a DB table. Not for any deep philosophical reason, just force of habit, I guess. In the root post, I did mention that the settings are backed up together with a database backup. That may be a form of justification.
Of course you must trust your own application code, so if you legitimately know that there is only ever going to be a single process that changes settings then you don’t need to worry about it. The purpose of the GenServer is to ensure that there is only going to be a single process that changes settings, i.e. the correctness is tautological.
There are other techniques that could improve concurrency, like per-row locks, but I doubt that would be necessary in your case. Better to keep it simple.
To be honest, this all sounds like an XY problem. Changing configuration and settings at runtime is a strange approach, and making those changes persistent is even stranger. Another issue is that such settings may be read only once, then cached or stored in process state, making any change to them partial and inconsistent.
So, what are these settings exactly? How do you use them?
Yeah, you beat me to it. We never got told the concrete problem that made OP think they need a simple persistent key/value store for runtime-changeable configuration.