I have a simple todo app where users can create and share various todo lists (this thread). It’s a phoenix live view app with default ecto/postgress right now.
I am thinking of using Mnesia mainly for learning purposes, no other strong reasons. (Maybe also to avoid usual limits on free-tier postgres as a bonus.) I have some elementary questions:
in postgres I’m storing all todos in single table and retreive them with select/where list_id (even though I have no lists table at all). in mnesia I am thinking of simply creating separate table for each todo list - does that even make sense and does it bring some additional benefits like loading to RAM only the currently used tables?
every now and then I want to run a task that deletes old unused todo lists (i.e. mnesia tables in my intended schema) - does the mnesia table have some timestamps by default, based on which I could delete it, or do I need to take care of that separately?
does mnesia have some autoincrement unique id, or do I need to handle it myself? I’ve noticed Amnesia wrapper takes care of that, but I’m thinking of going wrapper-free for those learning purposes… And don’t see that in mnesia docs
can I mix different structures in same table? I.e. I would have todos with ids 1.2,3… but then maybe :created_at key with timestamp and :table_owner with user info - then to show just todos I’d do probably some pattern matching, does that make sense and does it work?
I try to start answering my own questions, I have some partial answers, but there are gaps and not sure if my answers are correct, so I’m hoping for advice from someone more experienced, please:
I probably can’t mix different structures in the same table, as I define attributes in create_table, so that’s pretty much the same as sql databases. I would probably create another table “lists” where I’d store also the owner and timestamp – so this would be basically the same as in postgres
generic autoincrement id - not sure how to do that, but I’d probably work around that with some pseudo-unique randomized uid, or I might use a timestamp as id – is there actually a reason why a timestamp is not commonly used as id, when probably the most common way of ordering records is by time created/updated?
the cleanup task should be fairly easy if I do the schema as outlined in the first bullet…
lists as separate tables - I don’t know if there’s a benefit? for example I definitely wouldn’t do it in postgres, doesn’t make sense having a number of tables with same schema. but does it make sense in mnesia? There’s really no relation between individual todo lists, so it just feels kinda natural. (in a way I think of the lists as completely independent documents, kind of mongodb-style). so it feels like while in postgres it would be unnecessary pain to keep each list in separate table, in mnesia it would be unnecessary pain to keep them in one table… can someone shed some light to my gut feelings here, please?
I know you said that you prefer to use Mnesia directly for learning purposes, but if at any time you decided to switch to use a wrapper, then you have:
I can at least answer that: Using ‘just’ a timestamp caps you to a certain number of maximum inserts per second, and requires you to pay good attention to your clock. However, there certainly are common approaches to create timestamp-base IDs. A common one is the Snowflake ID, for which there exists an Elixir implementation.
As to when it makes sense to use UUIDs vs auto-incrementing IDs: personally I agree with the arguments of this blog post that boil down to UUIDs are larger than traditional SQL auto-increment ID fields, but they have several benefits such as better scalability, they still work in a distributed setting, you can move to a different underlying database without having to create new ids, etc.
Yes and no. You cannot vary what fields exist in a Mnesia table. You can however vary what data is stored inside, and this data might be any Erlang/Elixir term including (nested) lists, maps (including structs), tuples, etc. This gives you flexibility to store varying data in a certain field.
The main disadvantage of storing denormalized data is that you cannot (at least not easily) query this data. Depending on what you want to store vs. what you want to search on, this is usually not a problem.
I’d say: start with the simplest thing that works and only make it more complex if you hit against its limits.
Memento wrapper for Mnesia does auto increment in ordered sets only, see it here:
# Get the next numeric key (for ordered sets w/ autoincrement)
#
# It gets a list of all_keys for a table, selects the numeric
# ones, find the maximum value and adds one to it.
#
# NOTE:
# See if this implementation is efficient and does not create
# any kinds of race conditions. Maybe also use some kind of
# counter, so a key that was used for a previously deleted
# record is not used again (like SQL)?
@default_value 0
@increment_by 1
defp autoincrement_key_for(table) do
:all_keys
|> Mnesia.call([table])
|> Enum.filter(&is_number/1)
|> Enum.max(fn -> @default_value end)
|> Kernel.+(@increment_by)
end
In my Todo app I am using a unique id, that is an hash of a field in the changeset. Yes I am using Ecto Changesets with Mnesia
I generally follow this advice. In this case there’s zero technical need to replace existing postgres solution, it’s purely for my educational purposes
Re using timestamps as IDs - I understand your point, it kinda goes against the idea of highly-concurrent real-time systems where everything happens “at the same time”, so going for timestamps as unique IDs is kinda stupid
Re yes and no with mixed data in the same table - I think I understand in principle. So one case would be to have table with attributes key, payload, and maybe record_type where payload could be a map for some records and list for others, etc. The other I can think of is having something like this:
It’s a bit hacky, but should be easy to match and iterate, and maybe makes sense in some scenarios where you really need the in-memory speed of mnesia.