How to efficiently order and filter query results using Mnesia?

This topic is somewhat related to the topic ‘Mnesia vs Cassandra (vs CouchDB vs ...) - your thoughts?

I am currently running/building a chat application using Mnesia (to be more specific, I am using the EctoMnesia Ecto Adapter, but am rapidly approaching its limits, because it contains a couple of unfinished, unstable and non-existent functionalities)

When a user sees the chat screen for the first time, I want to show the 20 newest messages. And when the user scrolls backwards through history, I want to fetch the 10 messages before the oldest message’s datetime.
In other words: I want to select all messages older than a given datetime, order the results by inserted_at (in descending order), and then take only the first ten from them.

AFAIK, EctoMnesia will sort/limit in Elixir-land on the lists that are the results of the :mnesia-call, meaning that all calls will become slower (with O(n log n) time complexity) as more messages are part of the table.

I’d like to know:

  1. Is there a way for Mnesia to perform such a query faster?
  2. I know there is QLC, is it able to do so (faster than the ‘Elixir running on the output’ implementation) by using cursors or the like, or not?
  3. Or is Mnesia not at all able to do this, unless I alter my primary keys to be e.g. Snowflakes (IDs that are a combination of the current datetimestamp and a random number)?

QLC provides cursors for Mnesia.

  1. Since you want newest messages, any increasing integer count is sufficient for the primary key. Grab just the keys, sort and split then read only the recent 20.

  2. You could use a 2-stage table, put messages into a “recent table” and move older ones into a “long-term storage table”

1 Like

You could have another value in your stored object that is an integer based on previous inserts. Then in order to travel the documents 20 at a time you could take the last record from the previous search and use it as the news value to search greater then in a select search.

http://erlang.org/doc/man/mnesia.html#select-2

http://erlang.org/doc/apps/erts/match_spec.html

1 Like

Thank you for your replies!

@tty Incrementing primary keys like that would result in a set-up where running Mnesia in a distributed fashion is not possible (because there is no way to synchronise increasing integers in a Availability-over-Consistency system).

Your second idea is interesting, although it feels like it is very much overkill right now. It also does not really improving the searchability of older messages. But yes, maybe those could be kept inside a more ‘classical’ database set-up instead, because they will not be accessed that frequently. Cool idea!

@jordiee Please do elaborate! At first I thought you mean ‘store the ID of the previously inserted field’ but now I think you might have meant something else.

For now, I ended up using Snowflakes as primary ID, storing the messages in an :ordered_set-type table.
I am still hoping someone can give more information on what QLC and Mnesia are able to do/not do as for the ordering and limiting of results rows, and what the resulting time complexities will be.

Sure, I was talking store the Id much in the way an auto Inc field works in like postgres. And then use mnesia select to sort by only greater then the last returned Id. You can probably do the same thing with a date inserted field but I have normally done it with id’s