What would be the “proper” way to implement a simple persistent key/value store?

Is it really that unusual to have user configurable settings in an application? I can’t imagine asking my users to restart the app, that other people may be using at the time, just to change a setting.

It’s not unusual, but neither you nor I know what the topic starter’s intentions are. It might be that they don’t know about the application env, or they are just following a bad practice. Or it might be that they have a completely valid use case, though not exactly like the one you described (because no server-side program wants to keep all settings of all users in memory).


Good point, I should have made that clear. It would mostly be things like the business contact details (address, phone number, etc.) that show up on invoices, receipts and the like. They could change, but probably not frequently (although I did have a client who had to move four times within Accra in a single year).

Since Internet connectivity can be a challenge in some areas and I’d prefer not to sit on a bus for 12 hours just to go to Tamale and change somebody’s configuration file, I think storing the settings in a way that the client can update themselves is not a bad idea. Power outages do happen, so persisting the settings is also required.

This is not a massive application with thousands of users, so caching the settings table would not really be needed. However, don’t fault me for being curious about how it would typically be achieved in Elixir/Phoenix. I do like the idea of caching the settings table in memory if it makes sense and doesn’t introduce other problems.

I do understand environment variables. I don’t mind that solution, but how do I persist changes to the environment? It’s a long distance from here to northern Ghana, and I really wouldn’t want to travel all the way every time the lights have gone off. I could end up doing nothing but sitting on a bus.

The big benefit of the database, for me, is that changes get backed up with everything else. I have cron jobs (or rather systemd timers) to back up the database regularly, and I really want to limit the number of things that need backing up. That being said, if environment variables are the way to go, so be it.

Trust me: I’m a permanent member of the “keep it simple” fan club.

It sounds like user session state, and I personally handle this with Nebulex. Even if you don’t plan to have a lot of users, it’s very nice to have a memory limit on this kind of data, so I’d use a cache instead of an unbounded ETS table. Just set up Nebulex with the multilevel cache adapter, e.g. a local cache level on top of a DB-backed cache level (see the sketch after the list below).

Nebulex will also solve a lot of scaling problems you may encounter in the future:

  1. Memory limits. Your code stays safe once the number of users explodes.
  2. Distributed cache. If you ever run more than one instance of the app against the same database, you will need to sync cache changes on writes. With Nebulex you can just swap the local cache for a replicated one.
  3. Atomicity. You can use transactions to change multiple settings at once on all levels (even in a distributed setup). For example, a user may change their address and postal code and want that change to be atomic.
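
To make that concrete, here is roughly what the multilevel setup could look like with Nebulex v2. The module and app names are made up, and the second level uses the local adapter as a stand-in, since a DB-backed level would need a custom adapter:

```elixir
defmodule MyApp.Cache do
  use Nebulex.Cache,
    otp_app: :my_app,
    adapter: Nebulex.Adapters.Multilevel

  # L1: fast in-memory cache with a size limit.
  defmodule L1 do
    use Nebulex.Cache,
      otp_app: :my_app,
      adapter: Nebulex.Adapters.Local
  end

  # L2: another local cache here, standing in for the DB-backed
  # level described above, which would need a custom adapter.
  defmodule L2 do
    use Nebulex.Cache,
      otp_app: :my_app,
      adapter: Nebulex.Adapters.Local
  end
end

# config/config.exs
config :my_app, MyApp.Cache,
  model: :inclusive,
  levels: [
    {MyApp.Cache.L1, gc_interval: :timer.hours(12), max_size: 10_000},
    {MyApp.Cache.L2, gc_interval: :timer.hours(12)}
  ]
```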

Perfect. It is user session state, with the following bonus features:

  1. a change to the “settings” part of the session acquires a mutex/lock
  2. the changes are persisted somewhere (database or some other place)
  3. the “settings” part of the session is reloaded from storage
  4. the lock is released

That is more or less how my Go implementation worked, and it did what it needed to do. Multiple goroutines (think threads) could access the struct (which was what I used for settings); the write method acquired the lock and updated the settings table before releasing it. It never gave me any problems. I’m curious how that best translates into Elixir. It seems like a GenServer is what I’m looking for?
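
Something along these lines is what I picture, if I understand GenServer correctly (the Repo and schema bits are placeholders for whatever persistence ends up being used):

```elixir
defmodule MyApp.Setting do
  use Ecto.Schema

  # A deliberately simple key/value table; adjust types as needed.
  @primary_key false
  schema "settings" do
    field :key, :string, primary_key: true
    field :value, :string
  end
end

defmodule MyApp.Settings do
  @moduledoc """
  Serializes reads and writes of the settings map through a single
  process, much like a mutex-protected struct in Go: GenServer calls
  are handled one at a time, so the handle_call clauses are the
  critical section.
  """
  use GenServer

  alias MyApp.{Repo, Setting}

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def get(key), do: GenServer.call(__MODULE__, {:get, key})
  def put(key, value), do: GenServer.call(__MODULE__, {:put, key, value})

  @impl true
  def init(_opts), do: {:ok, load_all()}

  @impl true
  def handle_call({:get, key}, _from, settings) do
    {:reply, Map.get(settings, key), settings}
  end

  def handle_call({:put, key, value}, _from, _settings) do
    # Persist first, then reload from the source of truth before
    # replying, so the in-memory copy never drifts from the database.
    Repo.insert!(%Setting{key: key, value: value},
      on_conflict: {:replace, [:value]},
      conflict_target: :key
    )

    {:reply, :ok, load_all()}
  end

  defp load_all do
    Setting |> Repo.all() |> Map.new(&{&1.key, &1.value})
  end
end
```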

Thanks. I’m looking at both of these. Nebulex especially seems interesting.

From what I have read so far, I think I like Nebulex.

Then they are application data, not configuration. I’d suggest keeping them in Postgres and skipping the caching layer. Any caching layer is a data consistency risk, and it appears that you do not need to access that information so frequently that going straight to the database would materially hurt performance.


Agreed, and if you’re not already using Postgres I would say just go with SQLite and Litestream for backups. I doubt an ETS table would even be noticeably faster than SQLite for something like this, and SQLite works with Ecto out of the box.
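
For what it’s worth, wiring SQLite into Ecto is only a few lines. A sketch assuming the ecto_sqlite3 package (app and module names illustrative):

```elixir
# mix.exs — pull in the SQLite adapter for Ecto
defp deps do
  [
    {:ecto_sqlite3, "~> 0.12"}
  ]
end

# lib/my_app/repo.ex
defmodule MyApp.Repo do
  use Ecto.Repo,
    otp_app: :my_app,
    adapter: Ecto.Adapters.SQLite3
end

# config/config.exs — the whole database is one local file
config :my_app, MyApp.Repo,
  database: Path.expand("../my_app.db", __DIR__)
```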


Oh dear, did I call it configuration? I tried so hard to stick to “settings”, to make sure it was NOT configuration I meant. Oh, well, my bad I guess.

Yes, from the example I gave, that is true. My question was more about how it would be done, not whether I need it. I was asked to provide an example, so I did. That seems to have been a mistake, because it shifted the discussion from “how would you achieve X?” to “do you need to do X?”.

I have been given a few pointers along the way and am looking into them.

I know very little about SQLite when it comes to web development. I’ve mostly just used it for smaller projects, like a podcatcher I built for KDE many years ago. IIRC, org-roam for Emacs uses an SQLite database to maintain indexes of documents and tags; I may be wrong, as I don’t use org-roam myself, since I prefer denote. But these are both examples of SQLite being used by a single user at a time. For that, I don’t doubt it makes a lot of sense to use SQLite. But is it really viable for a site where you’d need multiple connections open?

What does “multiple connections” mean? If you just mean multiple (Elixir) processes then it’s fine. There is no write concurrency but you wouldn’t even notice that until you’re in the hundreds of TPS.

The biggest advantage of SQLite is that it’s just way easier to manage out of the box. No installation, no setup, no upgrades, no server, and so on. Backups are easy with Litestream.


Well, they wouldn’t necessarily have to be Elixir processes. psql also spawns an (OS) process. I basically mean the ability for multiple processes, heavy or light, to communicate with the database, with the server ensuring it is done in an orderly manner so that data does not get destroyed.

As I said, the only experience I have with SQLite is a few desktop and command-line utilities I wrote that used it in a single-user context. I never stated that SQLite cannot be used for more complex scenarios, merely that I was not aware of it.

I do remember that the Amarok developers decided to use MySQL instead of SQLite back in the KDE3 days, which kind of sucked because MySQL became a hard requirement for KDE (or at least for Amarok). I don’t remember the exact reasoning, but I’m quite sure it had to do with accessing SQLite from multiple threads.

That being said, this was quite some time ago, and I’m sure things have changed since. Again, no offence meant against SQLite.

SQLite is generally not something you would use from multiple (os) processes. There are some cases where you would; Litestream is a separate process that backs up the database, for example.

On the BEAM, the NIF you’re using (e.g. Exqlite) will serialize the transactions through a “connection” for you. There is definitely a point where this approach will tap out performance-wise, but modern hardware is just so fast that you’re unlikely to get there.

I might be completely out of my depth here, but isn’t an SQLite “database” just a single file? If so, why the need for a backup tool? Isn’t cp all you need?

I come from a different era where hardware was THE problem. Hence, I tend to spend far too much time optimizing.

It is a single file (or a couple of files with WAL mode), but if the database is being actively written to, there is no guarantee that you will be reading a consistent snapshot. As the copy proceeds you will read some old parts and some new parts, and this will corrupt the backup.

Most databases (including SQLite) have built-in functionality to take a consistent backup that reflects a single point in time, and you could of course just use that. Litestream is a fancy tool that will stream a continuous backup to S3 (or another machine) so that you always have an indestructible backup which is no more than a few seconds out of date.
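
As a concrete example of the built-in route, SQLite (3.27+) can write a consistent point-in-time snapshot with VACUUM INTO, which you could run straight through your repo (name hypothetical):

```elixir
# Writes a consistent point-in-time snapshot of the live database
# to a new file, even while writes are going on. The target file
# must not already exist.
MyApp.Repo.query!("VACUUM INTO '/var/backups/my_app-#{Date.utc_today()}.db'")
```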


This makes a lot of sense. I’m just a bit confused about one thing. If I treat the database as the source of truth (which makes sense), why would I bother with a cache at all? If I were to use a cache, I’d assume I would read from the cache, write to the database, and force the cache to be reread. And if I’m reading from the cache, would that not at least be accepting that the cache is a temporary representation of the truth? Or am I just not understanding something very basic?
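
Something like this flow is what I have in mind (all names made up; MyApp.Cache could be Nebulex, MyApp.Settings the GenServer/Repo layer from earlier in the thread):

```elixir
defmodule MyApp.SettingsStore do
  def get(key) do
    case MyApp.Cache.get(key) do
      nil ->
        # Cache miss: read from the database (the source of truth)
        # and warm the cache for next time.
        value = MyApp.Settings.get(key)
        MyApp.Cache.put(key, value)
        value

      value ->
        # Cache hit: accept a possibly slightly stale value.
        value
    end
  end

  def put(key, value) do
    # Write to the database first...
    :ok = MyApp.Settings.put(key, value)
    # ...then invalidate, so the next read reloads the fresh value.
    MyApp.Cache.delete(key)
  end
end
```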