Options for process state backup other than Postgres

Hey,

I'm wondering what good options exist for backing up process state if one does not want to add a database like Postgres to the setup.

In my particular case, I have a bunch of “tracking” processes that should survive an application restart (e.g. from a deployment). Hot code reloading is not an option in our Kubernetes setup.

The state I need to back up does not contain Elixir-specific data structures (e.g. atoms), so it can easily be encoded to JSON and decoded back into lists and maps.

What I have thought of so far:

  1. Writing JSON files (e.g. {identifier}.json) to disk and loading them on startup
  2. Writing state to Mnesia (maybe useful if we add clustering later)

Are there any other options I am not aware of? What would you guys do in my case?

I would favor mnesia and not bother with JSON since you can stuff the entire map/struct into mnesia without worrying about marshalling.

If you can store stuff on the filesystem, I’d rather use :erlang.term_to_binary than JSON.
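A minimal sketch of the term_to_binary approach, assuming one file per tracker. The StateBackup module, its function names and the ".term" file extension are all made up for illustration; only the File and :erlang calls are standard.

```elixir
defmodule StateBackup do
  # Hypothetical helper, not part of any library.

  # Serializes `state` and writes it to "<dir>/<id>.term".
  # Writing to a temp file and renaming avoids leaving a half-written
  # file behind if the process dies mid-write.
  def save(dir, id, state) do
    File.mkdir_p!(dir)
    path = path_for(dir, id)
    tmp = path <> ".tmp"
    File.write!(tmp, :erlang.term_to_binary(state))
    File.rename!(tmp, path)
  end

  # Returns {:ok, state} if a backup exists, :error if there is none.
  def load(dir, id) do
    case File.read(path_for(dir, id)) do
      # :safe refuses to create new atoms while decoding.
      {:ok, binary} -> {:ok, :erlang.binary_to_term(binary, [:safe])}
      {:error, :enoent} -> :error
      {:error, reason} -> {:error, reason}
    end
  end

  defp path_for(dir, id), do: Path.join(dir, "#{id}.term")
end
```

A tracking process could call StateBackup.load/2 in its init callback and StateBackup.save/3 whenever its state changes (or in terminate, if you trust that to run).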

That is also what I thought. I have never worked with Mnesia, though. I hope I can specify a custom filesystem location for the Mnesia data so I can simply mount it into the container.

Good point. I always forget about that.

I think I’ll go for :erlang.term_to_binary in my case.

Good read:

If you don’t need dynamic querying and joins, both mnesia and plain files are fine.

Redis can be good for storing binary blobs for retrieval later, depending on the size.

You can, but it is a global value, i.e. every Mnesia table for that node would be stored there. You can specify it in sys.config or on the command line with -mnesia dir Directory.
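From Elixir, the same setting can be applied at runtime before Mnesia starts. This is only a sketch: the directory, the :tracker_state table and its attributes are made up, and in Kubernetes the directory would be the mounted volume path rather than a temp dir.

```elixir
# Hypothetical location; replace with the path of the mounted volume.
dir = Path.join(System.tmp_dir!(), "mnesia_tracker_demo")
File.mkdir_p!(dir)

# Must be set before Mnesia starts, and must be a charlist.
Application.put_env(:mnesia, :dir, String.to_charlist(dir))

# On later boots the schema already exists, which is fine.
case :mnesia.create_schema([node()]) do
  :ok -> :ok
  {:error, {_node, {:already_exists, _}}} -> :ok
end

:ok = :mnesia.start()

# disc_copies keeps the table in RAM and persists it in the Mnesia dir.
case :mnesia.create_table(:tracker_state,
       attributes: [:id, :state],
       disc_copies: [node()]
     ) do
  {:atomic, :ok} -> :ok
  {:aborted, {:already_exists, :tracker_state}} -> :ok
end

:ok = :mnesia.wait_for_tables([:tracker_state], 5_000)

# Store and read back a plain map, no marshalling needed.
{:atomic, :ok} =
  :mnesia.transaction(fn ->
    :mnesia.write({:tracker_state, "tracker-1", %{"hits" => 3}})
  end)

{:atomic, [{:tracker_state, "tracker-1", restored}]} =
  :mnesia.transaction(fn -> :mnesia.read(:tracker_state, "tracker-1") end)
```

In a Mix project the equivalent would normally go into the config files (config :mnesia, dir: ...), again as a charlist.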

If you are using mnesia or ets or dets there is no need for term_to_binary.

If you’re using it as a straightforward “tracker ID -> stored state” persistent mapping, maybe DETS would be sufficient? It lacks the general transactional behaviors of Mnesia, but reads of a single key are guaranteed to not see partially-written data.
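A rough sketch of that mapping with DETS, assuming a single table file. The table name and file path are invented for the example; the :dets calls are the standard OTP ones.

```elixir
# Hypothetical file location; point this at the mounted volume in production.
path = Path.join(System.tmp_dir!(), "tracker_state_demo.dets")

# Opens the table file, creating it if it does not exist yet.
# The :file option has to be a charlist.
{:ok, table} =
  :dets.open_file(:tracker_state, file: String.to_charlist(path), type: :set)

# Objects are tuples; the first element is the key by default.
:ok = :dets.insert(table, {"tracker-1", %{"hits" => 3}})

# Force buffered writes to disk (also happens on close).
:ok = :dets.sync(table)

# lookup/2 returns a list of matching objects, [] when the key is absent.
[{"tracker-1", restored}] = :dets.lookup(table, "tracker-1")
[] = :dets.lookup(table, "unknown")

:ok = :dets.close(table)
```

Keep in mind that a DETS file is limited to 2 GB, and a table that was not closed properly gets repaired on the next open, which can take a while for large files.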

Postgres is likely the best way to go, and it is what I have used before: https://github.com/erleans/erleans

Mnesia can be fine on a single node but, particularly if running in Kubernetes, don’t expect to be able to use it as a distributed store.

A StatefulSet, which ensures the same volume is mounted for a specific node name, could be useful for storing a local copy. But it is far better to have a stateless app deployment and use Postgres for data.

This point really can’t be emphasized enough. If you’re going to use mnesia you should be aware of its use cases and its limitations. Mnesia might be fine for your use case but you should really prove that first.

I like to use jsonb columns in Postgres for these sorts of things, but since you specifically called that out as a non-option, I might consider something like DynamoDB. Dynamo is a reasonable choice if you just want to store binary blobs somewhere. It’s hard to tell from your example, but if you only need to read in the files once on app start, then you might be able to store the files on S3 and pull them from there when the app is booting. There are a lot of options here. Most importantly, I would highly recommend avoiding creating an ad-hoc database if you can help it.
