I’ve been working on making the Mnesia cache in Pow work out of the box in a cluster, and have a PR up at https://github.com/danschultzer/pow/pull/233.
This is a fairly large PR in an area I don’t have much experience with, so I would like some feedback on whether this is the right way to handle clusters in Mnesia. The documentation for Mnesia clusters is lacking, and there are very few examples. Code reviews are very welcome, as are comments in this thread.
All nodes have disk copies. When a node connects to an existing cluster, it’ll purge its own disk data and then initiate replication from the cluster. I think this makes sense since all Pow cache data is ephemeral, and I won’t have to deal with merging data. The worst case is that a session key is lost, so the user has to log in again, or that a reset password token is lost and has to be requested again. An obvious caveat is that the purge also affects anything else you use Mnesia for on that node.
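As a rough illustration of the join flow described above, here is a simplified sketch. It is not the code in the PR: the module name ClusterJoinSketch, the function join/2, and the 15 second timeout are made up for this post, and it assumes the table already exists on cluster_node and that the joining node is already connected to it through distributed Erlang.

```elixir
defmodule ClusterJoinSketch do
  # Hypothetical helper, not part of Pow. Joins `cluster_node` and
  # replicates `table` to this node as a disc copy.
  def join(cluster_node, table) do
    with :stopped <- :mnesia.stop(),
         # Purge local disk data. This deletes ALL Mnesia data in this
         # node's Mnesia directory, not only the cache table.
         :ok <- :mnesia.delete_schema([node()]),
         :ok <- :mnesia.start(),
         # Connect to the running cluster. An empty list means that no
         # node could be reached.
         {:ok, [_ | _]} <- :mnesia.change_config(:extra_db_nodes, [cluster_node]),
         # Keep the schema on disk so the table can be a disc copy.
         {:atomic, :ok} <- :mnesia.change_table_copy_type(:schema, node(), :disc_copies),
         # Start replicating the table to this node.
         {:atomic, :ok} <- :mnesia.add_table_copy(table, node(), :disc_copies),
         :ok <- :mnesia.wait_for_tables([table], 15_000) do
      :ok
    else
      {:ok, []} -> {:error, :cluster_unreachable}
      other -> {:error, other}
    end
  end
end
```

Because the local schema is deleted before the node joins, there is never any local data to merge; the node always ends up with whatever the cluster has.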
As keys can expire, I also let the nodes communicate with each other when a TTL is updated to ensure that a timer is set on the other nodes as well. This ensures that elements will expire even if the node that wrote the cache element goes down. I think I may change this so it’s just handled by a periodic flush instead.
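To make the periodic flush idea concrete, here is a minimal sketch of what I have in mind. It is only an assumption about how it could work, not what the PR does today: the module name ExpirySweepSketch, the 15 second interval, and the record layout {table, key, {value, expires_at}} with expires_at in milliseconds are all hypothetical. Since the expiry timestamp is stored with the record and replicated, every node can sweep on its own without any timer messages between nodes.

```elixir
defmodule ExpirySweepSketch do
  use GenServer

  # Hypothetical sweep interval.
  @sweep_interval :timer.seconds(15)

  def start_link(table), do: GenServer.start_link(__MODULE__, table)

  # Pure helper: returns the keys whose expiry timestamp (in ms) is at
  # or before `now`. Assumes records shaped as
  # {table, key, {value, expires_at}}.
  def expired_keys(records, now) do
    for {_table, key, {_value, expires_at}} <- records, expires_at <= now, do: key
  end

  @impl true
  def init(table) do
    schedule()
    {:ok, table}
  end

  @impl true
  def handle_info(:sweep, table) do
    now = System.system_time(:millisecond)
    # Read every record in the table and delete the expired ones.
    records = :mnesia.dirty_match_object({table, :_, :_})
    for key <- expired_keys(records, now), do: :mnesia.dirty_delete(table, key)
    schedule()
    {:noreply, table}
  end

  defp schedule, do: Process.send_after(self(), :sweep, @sweep_interval)
end
```

The trade-off is that an element can outlive its TTL by up to one sweep interval, so reads would still need to check expires_at before returning a value.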
I would appreciate comments on refactoring and on changes unrelated to the Mnesia logic, but the feedback I want most is whether the cluster logic is sound, or whether there are potential pitfalls that should be taken care of. The init_mnesia/1 function is where that starts.
Also, I would be happy to hear from anyone who is testing this out in their distributed system!
The last thing I’m now looking at for this PR is split-brain recovery.