How do I keep ETS in sync when I have multiple nodes at scale?

I’m not sure if this is a Phoenix or Elixir question. I’m building this all using Phoenix, but I think this relates to Elixir and the OTP generally.

Let’s say that I build and deploy an app on AWS and it gets a tremendous amount of traffic – in order to handle the traffic, multiple “nodes” will run, from my understanding. However, each of these nodes will have its own ETS tables.

I need my app to work as one unit – I cannot have one node with ETS tables that are not perfectly synchronized with all other nodes.

What do I need to do to ensure that if one node updates an ETS table, all other nodes are updated as well?

Thanks much!

5 Likes

ETS is single-machine.

Mnesia is multi-machine.

Mnesia wraps ETS and DETS to add a distributed transaction layer; run it in in-memory mode and it is effectively a distributed ETS. :slight_smile:
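
A minimal setup sketch for the RAM-only case, assuming the nodes are already connected via distributed Erlang – the `:config` table and its attributes are just illustrative:

```elixir
# Sketch: a RAM-only Mnesia table replicated across all connected nodes.
nodes = [node() | Node.list()]

# Start Mnesia on every node; with no disc schema it runs entirely in RAM.
:rpc.multicall(nodes, :mnesia, :start, [])

# Tell the local Mnesia instance about the other nodes.
{:ok, _} = :mnesia.change_config(:extra_db_nodes, Node.list())

# A key-value table kept as ram_copies (backed by ETS) on every node.
{:atomic, :ok} =
  :mnesia.create_table(:config,
    attributes: [:key, :value],
    ram_copies: nodes
  )
```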

4 Likes

I only have three key-value pairs to store. The entire app is tiny, in fact – only about 6 routes – and it does almost nothing. The one thing it will do, however, is receive hundreds of millions of hits per day. Do I need the overhead of Mnesia simply because of the scaling issue? If so, doesn’t that render ETS itself almost useless for anything other than (toy?) apps that never have to worry about scale?

1 Like

@jononomo Your requirement that the ETS tables be perfectly synchronized is the problem - this is a non-trivial problem to solve generally, and is one that every distributed database has to deal with in some way (e.g. CAP theorem and related tradeoffs on guarantees).

Mnesia solves this problem, to some degree, though you have to do extra work to make it resilient to network partitions. Another approach is to have a process which synchronizes data between nodes, using CRDTs to handle conflict resolution, e.g. Phoenix.Tracker. There are some libraries which build on this capability, such as dispatch. You may find that the latter works better for your use case.
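
To make the second approach concrete, here is a simplified sketch without the CRDT layer: a GenServer on each node owns a local ETS table and applies writes broadcast over Phoenix.PubSub. Module, topic, and table names are made up, and a `MyApp.PubSub` server is assumed to be running:

```elixir
defmodule ConfigCache do
  use GenServer

  @table :config_cache
  @topic "config_cache"

  def start_link(_opts), do: GenServer.start_link(__MODULE__, [], name: __MODULE__)

  # Local, lock-free reads straight out of ETS.
  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> value
      [] -> nil
    end
  end

  # Writes are broadcast so every node (including this one) applies them.
  def put(key, value) do
    Phoenix.PubSub.broadcast(MyApp.PubSub, @topic, {:put, key, value})
  end

  @impl true
  def init(_) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    Phoenix.PubSub.subscribe(MyApp.PubSub, @topic)
    {:ok, nil}
  end

  @impl true
  def handle_info({:put, key, value}, state) do
    :ets.insert(@table, {key, value})
    {:noreply, state}
  end
end
```

Note this is last-write-wins with no conflict resolution – exactly the gap that Phoenix.Tracker’s CRDT machinery fills.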

ETS is extremely useful for a broad array of cases, but for your specific need here, it is not the appropriate tool.

16 Likes

AFAIK Mnesia carries a write overhead, not really a read overhead.

What is the payload size, and what is the spike RPS? Are the nodes to be geo-distributed? How do you intend to load balance? What is the max accepted latency? And so on.

FYI, in my benchmarks a $5/month Scaleway (really low-powered) server can do around 800 RPS (~70 million hits per day) – and that is going through the Phoenix browser pipeline, HTML rendering, etc., with the DB call cached in ETS.

I assume a single server can handle your load… so fire up some servers and load test it.

6 Likes

Thanks for your patient explanation, @bitwalker, and your good questions/points @outlog.

The app basically manages three JSON configs. Each of these JSON configs will be updated between 1 and 5 times per day. However, they will be read hundreds of millions of times per day.

In terms of the payload size: the configs themselves are small – just a couple dozen lines of JSON each. All the app will do is issue 302 redirects hundreds of millions of times per day based on the contents of the configs – so it will just constantly read out of the ETS table, perform a quick calculation, and then issue a redirect.
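
In code, I imagine the hot path as something like this (hypothetical module and field names):

```elixir
# Hypothetical hot path: one ETS lookup, a quick calculation, a 302.
defmodule MyAppWeb.RedirectController do
  use MyAppWeb, :controller

  def show(conn, %{"slug" => slug}) do
    case :ets.lookup(:config_cache, slug) do
      [{^slug, config}] ->
        # redirect/2 with :external issues a 302 by default
        redirect(conn, external: choose_target(config))

      [] ->
        send_resp(conn, 404, "not found")
    end
  end

  # Stand-in for the quick calculation over the JSON config.
  defp choose_target(config), do: config["default_url"]
end
```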

I’m not sure what spike RPS means, but traffic ebbs and flows with the time of day and the day of the week, and can rise or fall by a factor of 100.

In terms of load balancing, I haven’t thought about that yet - somehow I thought AWS/OTP handles that under the covers for me…?

Geo distributed… that would probably be an extremely good idea for our use case, but not planning to do this for the initial iteration.

Max accepted latency for the client to receive the 302 redirect would be about 30 ms. In terms of updating ETS and synchronizing ETS between all nodes, I’d be happy with 5 seconds and could live with 5 minutes.

1 Like

Mnesia is perfect for your use case then – writes are expensive because you have to execute a transaction across the cluster, but in your case that doesn’t matter; reads are done against ETS locally and are extremely fast, so your performance requirements should be easy to meet.
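
Concretely, assuming the illustrative `:config` table from earlier in the thread, with `ram_copies` on every node:

```elixir
updated_json = %{"default_url" => "https://example.com"}

# The rare writes (1–5 per day) run as a transaction across the cluster.
{:atomic, :ok} =
  :mnesia.transaction(fn ->
    :mnesia.write({:config, :redirects, updated_json})
  end)

# The hot-path reads are dirty reads served from the node-local ETS copy.
[{:config, :redirects, _json}] = :mnesia.dirty_read(:config, :redirects)
```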

4 Likes

As the payload is close to zero, and the app logic likewise, it’s safe to say that the server will be done in close to no time (sub-1 ms). What you are looking at is network latency…

30 ms is a very difficult (or rather impossible) target to reach without geo-distribution – even light in fiber has speed limits: roughly 200 km per millisecond, so a transatlantic round trip alone costs well over 50 ms before the server does any work…

For simplicity I would look at/consider using Lambda@Edge or similar for this very specific use case (almost no app logic, no persistence, highly beneficial to geo-distribute, huge peak demand, etc.).

But if you can accept, say, cross-Atlantic network latency and keep it all in one data center, then I would also guess one server would suffice, and the cost is more fixed/predictable.

1 Like