Ecto / Phoenix strategies for dealing with read/write from replicas/primary DBs

seva · August 25, 2023, 2:08pm

I’m looking to use database read replicas (Postgres) with my app. Ecto’s official documentation is fairly good at explaining how to set things up, though it doesn’t mention strategies for reading from the primary database immediately after writes, since there’s a high chance the replicas won’t have synced with the primary yet.

If anyone has battle-tested strategies in their Phoenix apps, I would appreciate any code examples.

LordZed · October 30, 2024, 6:23pm

Hey seva, did you find a solution in the past year? I’m sorry for digging up this old post, but I am dealing with the same problem right now:

I have a query that most of the time can and should be performed on the read replica, but at times I do an update just before the query, which is when I need to ensure I go to the write repo.

The way I thought about doing it:

Define MyApp.Repo.Replica as a dynamic repo, with MyApp.Repo.Replica itself being set via default_dynamic_repo.
Whenever I need to ensure a read query is performed on the write instance, I call MyApp.Repo.Replica.put_dynamic_repo(MyApp.Repo) so that it uses that repo in those cases.

As I understand it (and it seems to work in my tests), when I know I am doing a write before a read of the same data (i.e. data I just wrote or data that changed in the same transaction), I do the following:

MyApp.Repo.Replica.put_dynamic_repo(MyApp.Repo)
MyApp.function_that_does_the_insert()
MyApp.function_that_does_the_read()

That way, even though the Repo used for an operation is hidden inside the function, it would use the main repo to read.
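One caveat with calling put_dynamic_repo directly is that the override stays in place for the rest of the process unless it is reset. Below is a minimal sketch of a scoped variant, assuming the repos behave like Ecto repos (get_dynamic_repo/0 and put_dynamic_repo/1 are Ecto.Repo callbacks). The module name MyApp.RepoRouting and the function with_primary/3 are hypothetical and not part of Ecto.

```elixir
defmodule MyApp.RepoRouting do
  @moduledoc """
  Hypothetical helper (not part of Ecto): temporarily points a replica repo
  at the primary and restores the previous setting afterwards.
  """

  # `replica` and `primary` are repo modules, e.g. MyApp.Repo.Replica and
  # MyApp.Repo. Ecto keeps the dynamic repo per process, so the override
  # only affects the calling process.
  def with_primary(replica, primary, fun) when is_function(fun, 0) do
    previous = replica.get_dynamic_repo()

    try do
      replica.put_dynamic_repo(primary)
      fun.()
    after
      # Runs even if `fun` raises, so the override cannot leak.
      replica.put_dynamic_repo(previous)
    end
  end
end

# Intended usage (assumes the repos and functions from the post exist):
#
#   MyApp.RepoRouting.with_primary(MyApp.Repo.Replica, MyApp.Repo, fn ->
#     MyApp.function_that_does_the_insert()
#     MyApp.function_that_does_the_read()
#   end)
```

Since the setting is per process, work handed off to another process inside the function (a Task, for example) would presumably still read from the replica unless it sets the dynamic repo itself.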

For tests to work (as you usually insert a lot of data), you could either set up the dynamic repo in the setup of every test to be MyApp.Repo, or you could check Mix.env() as described in the documentation.

Now I am wondering: is that a good approach? Are there other, better ways?

E.g. I would actually prefer not having to explicitly call the replica repo when I want to use it, but have Repo read functions go to the read replica by default (and then have the option to override the repo used, like I described above).

seva · November 2, 2024, 6:40am

Thank you for sharing your approach @LordZed!

I haven’t gotten around to implementing a solution and setting up read replicas yet, but will need to do it at some point in the near future.

Without battle-testing it, I wouldn’t know if it’s good or not, but I think the Elixir ecosystem definitely needs something as nice and seamless as Rails’ Multiple Databases.

If you launch something in production and it works for you, please share!

codyjroberts · November 2, 2024, 7:32pm

I believe any solution will depend on your application’s tolerance for staleness and, in general, on the domain you’re working with. In some domains you might be able to tolerate replica staleness on quite a bit of the surface area, and then always use the primary for r/w on critical paths where eventual consistency won’t cut it.

Out of curiosity, what does Rails offer in particular that solves the problem for you? Have you considered partitioning? How about a Postgres-compatible solution like CockroachDB?

This may interest you: superfly/fly_postgres_elixir on GitHub, a library for working with local read-replica Postgres databases and performing writes through RPC calls to other nodes in the primary Fly.io region.

seva · November 6, 2024, 4:57pm

Rails lets you easily specify which db is the primary and which are replicas (source).

IIRC if replicas are defined, it always writes to the primary, and then automatically sets a client cookie which tells the backend to make all read queries for the next HTTP request against the primary. All of this happens mostly out of the box for you, and avoids any staleness issues IIRC.

And they make it easy to customise the logic for when the server should switch between replica/primary.
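To make the "read from the primary for a while after a write" idea concrete, here is a minimal sketch of the decision in plain Elixir. It is loosely modeled on the behaviour described above, not on Rails’ actual implementation. The module StickyPrimary, the function read_target/3, and the 2-second window are all hypothetical and chosen only for illustration.

```elixir
defmodule StickyPrimary do
  @moduledoc """
  Hypothetical sketch: pick the primary for reads shortly after a write.
  """

  # How long after a write reads stay on the primary. 2_000 ms is an
  # illustrative value, not a recommendation.
  @default_window_ms 2_000

  # `last_write_ms` is the time of the client's last write (e.g. kept in the
  # session or a cookie), or nil if the client has not written anything.
  def read_target(last_write_ms, now_ms, window_ms \\ @default_window_ms)

  # No recorded write: the replica is fine.
  def read_target(nil, _now_ms, _window_ms), do: :replica

  # A write happened within the window: read from the primary.
  def read_target(last_write_ms, now_ms, window_ms)
      when now_ms - last_write_ms < window_ms,
      do: :primary

  # The write is old enough that the replica is assumed to have caught up.
  def read_target(_last_write_ms, _now_ms, _window_ms), do: :replica
end
```

In a Phoenix app this could presumably be called from a plug that keeps the last write time in the session and compares it against System.system_time(:millisecond). The right window depends on your actual replication lag.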

I’ve checked out the Fly lib, and yes, they do something similar, and it was one of the main reasons I was looking to use Fly for my servers. But their solution is designed to tie you to Fly, which isn’t cool (though makes sense), and I don’t want to use Fly anymore (expensive, flaky deploys, unreliable uptime).

Would be cool if Elixir had the same tooling as Rails when it comes to working with multiple DBs. That Fly library can probably be forked not to rely on Fly.

seva · November 6, 2024, 5:04pm

With Elixir’s big selling point being good for distributed applications, I think it should be an important focus for future versions of Phoenix to support multiple DBs.

Of course, best way to get the features you want is to contribute.

I feel like CockroachDB and partitioning are good solutions, but for a different set of problems. In my case, I simply need to use replicas in different regions. I did look into CockroachDB, and while cool, it’s more expensive, harder to deploy, and doesn’t support all Postgres extensions. You also need to be aware that even PSQL queries might have to be written differently or might behave differently.

Simpler postgres replicas should be enough.

codyjroberts · November 19, 2024, 2:46am

Interesting. I see that now: Multiple Databases with Active Record (Ruby on Rails Guides).

I’ve checked out the Fly lib, and yes, they do something similar, and it was one of the main reasons I was looking to use Fly for my servers. But their solution is designed to tie you to Fly, which isn’t cool (though makes sense), and I don’t want to use Fly anymore (expensive, flaky deploys, unreliable uptime).

Yeah, didn’t mean to suggest you should use Fly, just that the library might be helpful in implementing behavior that meets your needs. We’re on Fly but with Crunchy, coming from GKE + CloudSQL at the last gig. It’s been a bit rocky but I’d still recommend them for the price. That said, GCP was pleasant to work with as well.

Understood. And I’m sorry, I should have said shard rather than partition. E.g. shipping primaries based on region might be an easier lift / better choice depending on the domain you’re working with. Even Intercom at their team size is flashing me stale data. Eventual consistency isn’t something I’d want to deal with for as long as I can avoid it. I wonder if they use that Rails solution; I actually think they have a Rails monolith.

Completely random, but I liked this America reference in the Ecto docs:

Ecto also allows you to start a repository with no name (just like that famous horse).

seva · November 24, 2024, 5:03pm

It’s all good, no need to apologize!
Oh interesting, thank you for sharing your experience with Crunchy / GCP.
I was looking into Crunchy too, but a bit expensive for our budget and scale.

Yeah, primaries based on region is something I looked into as well. Tbh, the data for me is intertwined, so partitioning it by region will be tough.

There’s another potential solution I thought of to solve this, but haven’t tried yet:

Say I have two web+db servers, one each in the EU and AUS regions.
Elect one as primary (e.g. EU) and one as follower (AUS).
Put a Cloudflare load balancer in front of both, and make sure it distributes traffic geographically for best latency, replicating Fly’s behaviour.
From what I read, you can set up custom load-balancer rules, e.g. send POST (i.e. write) requests to the primary pool (e.g. EU), and have the app server set appropriate headers/cookies so that CF directs traffic to the primary for a brief period after a write.
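The routing rule in the last step can be modeled as a small pure function. This is only a sketch of the decision in Elixir, not Cloudflare rule syntax. The module PoolRouting, the cookie name pin_primary, and the pool names :primary and :nearest are all hypothetical.

```elixir
defmodule PoolRouting do
  @moduledoc """
  Hypothetical model of the load-balancer decision: writes and recently
  pinned clients go to the primary pool, everything else to the nearest one.
  """

  @write_methods ~w(POST PUT PATCH DELETE)
  @pin_cookie "pin_primary"

  # `cookies` is a map of cookie name => value as sent by the client.
  def pool_for(method, cookies) when is_binary(method) and is_map(cookies) do
    cond do
      String.upcase(method) in @write_methods -> :primary
      Map.has_key?(cookies, @pin_cookie) -> :primary
      true -> :nearest
    end
  end

  # Set-Cookie value the app server would attach to the response of a write.
  # Once Max-Age elapses the browser stops sending the cookie, which ends
  # the pinning without any server-side state.
  def pin_cookie_header(max_age_seconds)
      when is_integer(max_age_seconds) and max_age_seconds > 0 do
    "#{@pin_cookie}=1; Max-Age=#{max_age_seconds}; Path=/; HttpOnly"
  end
end
```

Whether a given load balancer can actually match on request method and on a cookie in the same rule is something to verify against its documentation before relying on this.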

This won’t work for LiveView, since it runs over WebSockets. But I’m building things in Hotwire.

If it works out, then it avoids the need to build anything custom in Elixir.