Clustering mnesia with libcluster Gossip strategy

I am new to mnesia and I am trying to create a cluster with the libcluster Gossip strategy. It works fine, the nodes auto-connect.
Now I need to also connect the mnesia databases, but I have encountered these problems.

  1. If a new node creates its own schema (database) with its own data, I cannot add it to the cluster: I cannot change :extra_db_nodes because the schema cookie doesn’t match, even if it’s the same table structure.
  2. Also, do you have any suggestions about how I should deal with a network split? The network split happens, the offline node changes the same row as the cluster, and then I get an inconsistent database event. In theory I can use :rpc calls to back up the data on the split node (which is now online again) and then restore it, but I couldn’t find any way to solve the conflict with custom code, deciding per row instead of just choosing between the cluster and the split node (the one that was offline for a while). A rough sketch of that backup/restore idea follows this list.
    Regarding 1, I saw Mnesiac, but it won’t work with the Gossip strategy.
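In code, what I had in mind for point 2 is roughly this; the node name and the file path are made up, and I haven’t verified this end to end:

    # Back up the split node's data over rpc once it is reachable again:
    split_node = :"laptop@192.168.0.10"
    :rpc.call(split_node, :mnesia, :backup, [~c"/tmp/split_node.bak"])

    # Later, restore from that backup, keeping the existing table copies:
    :mnesia.restore(~c"/tmp/split_node.bak", [{:default_op, :keep_tables}])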

Can those two points even be solved in mnesia?


Welcome to the forum :slightly_smiling_face:

  • If a new node creates its own schema (database) with its own data, I cannot add it to the cluster: I cannot change :extra_db_nodes because the schema cookie doesn’t match, even if it’s the same table structure.

Do you create the schema separately on each node? Judging by the docs, you need to initialize it once, with all the nodes you want to cluster passed in. The User Guide does it the same way; a minimal sketch follows.
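Something like this, assuming all nodes are already connected via distribution (the table name and options are just examples):

    # Run once, while mnesia is stopped on every node; the schema is
    # created for all listed nodes at the same time:
    nodes = [node() | Node.list()]
    :mnesia.create_schema(nodes)

    # Then start mnesia everywhere and create tables replicated
    # across those nodes:
    :rpc.multicall(nodes, :mnesia, :start, [])
    :mnesia.create_table(:example, attributes: [:id, :value], disc_copies: nodes)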

Also, do you have any suggestions about how I should deal with a network split? The network split happens, the offline node changes the same row as the cluster, and then I get an inconsistent database event.

I see a few options:

  • you can use the majority table option to tell mnesia to only allow updates to succeed when a majority of the replica nodes are reachable.
  • if you simply want to restore from the “good” side of the split, I think you can use :mnesia.set_master_nodes/2 ahead of starting mnesia. A sketch of both follows this list.
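Roughly like this (the table and node names are made up; majority is a real mnesia table option):

    # A table that rejects writes unless a majority of its replica
    # nodes are reachable:
    :mnesia.create_table(:account,
      attributes: [:id, :balance],
      disc_copies: [node() | Node.list()],
      majority: true
    )

    # On the recovering node, before :mnesia.start/0, declare the
    # "good" side as master so its data wins on load:
    :mnesia.set_master_nodes(:account, [:"server@cluster.host"])
    :mnesia.start()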

Finally, you may be interested in this effort to add eventual consistency to mnesia using CRDTs.

I couldn’t find any way to solve the conflict with custom code, deciding per row instead of just choosing between the cluster and the split node (the one that was offline for a while)

For this, unsplit may be what you’re looking for!

Hi @jchrist, and thanks for your reply!
Yes, unsplit is ALMOST what I need for my case, but with the addition that I need two runs through the conflict list:

  • first I need to collect the whole conflict list and display it to the user (while leaving the conflicts unresolved)
  • the user chooses the “right” versions and those are propagated to the whole cluster

Unsplit will choose the latest write the second time it runs.

So basically what I need is a way to prevent auto-sync after a netsplit,
because right now, when the Erlang nodes reconnect, mnesia will automatically sync using the latest write.

Do you have any idea if there is something similar to what I want?

I looked over the CRDT work you suggested; it’s interesting, but it’s not related to my problem.
Also, regarding the first thing I asked: I understood the solution and solved it by creating the schema of a new node by copying it from the existing cluster.
Thanks!


You could probably do this with a custom unsplit_method similar to the built-in ones, but I’m not sure about keeping the conflict unresolved in this case: how can you later figure out which nodes need to be merged together, especially if another partition happens?
I think one way to do it would be: copy one of the existing unsplit methods and expand it to store both sides of the conflict somewhere, as in the sketch below. The user can then later choose which one to keep, and that write is propagated around the cluster.
This should be preferable to waiting for the user’s decision directly in the unsplit method: what happens if the node goes down while it’s waiting for the user to log on and change it?
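The storage part could be as simple as a dedicated mnesia table holding both sides of each conflict. This is a hypothetical helper (the :conflict table and the function names are made up), not unsplit’s actual API:

    defmodule ConflictStore do
      # Called from a custom unsplit method: keep both versions of a
      # conflicting record so a user can pick the winner later.
      def record(table, key, local_obj, remote_obj) do
        :mnesia.dirty_write({:conflict, {table, key}, local_obj, remote_obj})
      end

      # Called after the user has chosen: write the winning version
      # (a full mnesia record tuple) and drop the stored conflict.
      def resolve(table, key, winner_obj) do
        :mnesia.transaction(fn ->
          :mnesia.write(winner_obj)
          :mnesia.delete({:conflict, {table, key}})
        end)
      end
    end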

Do you mean via Erlang or via the files on disk?


how can you later figure out which nodes need to be merged together, especially if another partition happens?

One node will be on a server and the other will be on a user’s laptop. So when this node (the laptop) starts, it will detect the conflicts, create and store them, and based on that let the user choose the versions. After the user has chosen all the versions, his node (the laptop) will trigger the real conflict resolution.
The partitions will happen between the server and the laptop, so if another partition happens, the other user will solve it.

what happens if the node goes down while it’s waiting for the user to log on and change it?

I was thinking about skipping all the records the first time and then trying to re-trigger unsplit when I have the final versions.

But I will look over the custom unsplit_method, since at the moment I’m not sure how this works, and it may be the actual solution.

Do you mean via Erlang or via the files on disk?

I meant that initially I thought I could create the schemas separately and connect them afterwards without getting a bad-cookie error.


I also tried to “decluster” a node, but it seems once you add a node to the cluster using :mnesia.change_config(:extra_db_nodes, [new_node]), you cannot remove it from the running/stopped db nodes, and it will sync with the rest of the cluster regardless of setting extra_db_nodes = [].
@jchrist can I do that in some way?


I think :erpc.call(node_to_remove, :mnesia, :stop, []) followed by :mnesia.del_table_copy(:schema, node_to_remove) is what you’re looking for.

:mnesia.delete_schema/1 seems to serve a similar purpose, but I didn’t quite get it to fully work in my testing with regard to removing the node from the saved DB nodes locally.
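Put together, the removal sequence would be roughly this (the node name is just an example):

    node_to_remove = :"node2@172.21.0.3"

    # Stop mnesia on the node being removed...
    :erpc.call(node_to_remove, :mnesia, :stop, [])

    # ...then, from a remaining cluster node, drop its schema copy:
    {:atomic, :ok} = :mnesia.del_table_copy(:schema, node_to_remove)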

The method you proposed assumes that both the cluster and the node to remove are up, and :mnesia.del_table_copy(:schema, node_to_remove) will fail when node_to_remove is down.

I know this is a stretch, but couldn’t the cluster node reject the communication with node_to_remove after it goes down, and only let it sync again after manual conflict resolution?


Hm, maybe I misunderstood you, but I’ve tried reproducing this: two running mnesia nodes, both with disc_copies of the schema (via :mnesia.create_schema), then kill -9 on the first one. On the running node, I ran :mnesia.del_table_copy(:schema, downed_node) and was met with {:atomic, :ok}. This post has some additional info: [erlang-questions] Removing mnesia cluster node while mnesia is offline

I think the actual problem is this: the downed node doesn’t actually know it was removed from the schema, and when I restarted it, it synced its previous schema with the new one. Only with a table that resides on one of the nodes, but not the other, did I get the running_partitioned_network event. Perhaps you can subscribe to it and manually disconnect the nodes when it happens, as sketched below … but that is probably uncharted territory :slight_smile:
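Subscribing would look something like this; the event shape is mnesia’s documented system event, but what you do in reaction to it is up to you:

    # Deliver mnesia system events to the calling process:
    :mnesia.subscribe(:system)

    receive do
      {:mnesia_system_event, {:inconsistent_database, context, node}} ->
        # context is :running_partitioned_network or
        # :starting_partitioned_network
        IO.inspect({context, node}, label: "mnesia partition detected")
    end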

I will do some more experiments with this; maybe the official documentation can be improved to cover this clearly.


@jchrist thanks again for your replies and for your patience!
I add my remote_node (laptop) tables using:

# current_node is this (laptop) node, captured before the remote call:
current_node = node()

:erpc.call(cluster_node, fn ->
  # On the cluster node, add a copy of every table except the schema
  # to the laptop node:
  :mnesia.system_info(:tables)
  |> Enum.reject(&(&1 == :schema))
  |> Enum.map(&:mnesia.add_table_copy(&1, current_node, :rocksdb_copies))
end)

and yes, indeed, :mnesia.del_table_copy(:schema, node_to_remove) actually works even if node_to_remove is down.
I was misled by the fact that :mnesia.del_table_copy(:any_other_node, node_to_remove) will fail with {:aborted, {:badarg, :any_other_node, :unknown}}, and I thought that :mnesia.del_table_copy(:schema, node_to_remove) would fail too. And regarding that, I am unsure how del_table_copy works; I originally thought it could be used for any table, not just for the schema.

Moreover, if I just cut the network, use :mnesia.del_table_copy(:schema, node_to_remove), and then restore the network, it works just as intended (yay!), but if I kill node_to_remove, then use :mnesia.del_table_copy(:schema, node_to_remove), and then try to start node_to_remove again, I get this:

[error] Mnesia(:"node2@172.21.0.3"): ** ERROR ** (core dumped to file: '/app/MnesiaCore.node2@172.21.0.3_1685_128021_181853')
 ** FATAL ** Failed to merge schema: {:aborted, :function_clause}

[notice] Application mnesia exited: :mnesia_app.start(:normal, []) returned an error: normal
[notice] Application runtime_tools exited: :stopped
** (Mix) Could not start application mnesia: :mnesia_app.start(:normal, []) returned an error: normal

And even when I use the network-cutting option, I cannot figure out what to do after the manual conflict resolution.
My assumption was that I could do this by using :mnesia.set_master_nodes(table, [remote_node]) (which returns :ok but does nothing) from the cluster to overwrite the remote_node (laptop) and turn the sync back on. Or maybe :mnesia.add_table_copy/3 to make them sync again, but it seems the only way is to use :mnesia.delete_schema on the remote_node and then use add_table_copy like in the first code snippet above.
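That recovery path would look something like this (cluster_node is a placeholder, and :mnesia.delete_schema/1 must be called while mnesia is stopped):

    # Run on the laptop node; my understanding of the sequence, not a
    # verified recipe:
    :mnesia.stop()
    :mnesia.delete_schema([node()])   # wipe the local schema
    :mnesia.start()
    :mnesia.change_config(:extra_db_nodes, [cluster_node])
    # ...then re-add the table copies as in the snippet above.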

If you know about any additional documentation that I could use for my use case, please suggest it.