Okay, this one may get a bit in the weeds but here goes. First, while my application is a Phoenix app, this is more of an Ecto/Elixir setup question. Please let me know if this should go in the general category.
- My application is multi-tenant.
- We achieve multi-tenancy with Postgres schemas.
- It runs on AWS ECS containers.
- It uses AWS RDS Aurora Postgres as the database backend.
The requirements:
- Multiple read-only replicas are required. This is a very high traffic system that’s heavy on the reads.
- Configuration options must be passed at runtime; we use AWS Secrets Manager to pass in sensitive data.
- Different ECS clusters may connect to different RDS clusters.
So the problem I’m facing is: How do I build a system that can connect to different database clusters with different read replicas depending on the cluster configuration it’s running in?
What I have now is this bunch of bananas - it works but it has one very specific limitation.
First, I’m using a runtime config value to read in a comma-separated list of replica endpoints:
config :snw_bowman,
  pod_replicas: String.split(System.get_env("POD_REPLICAS") || "localhost", ",", trim: true)
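For anyone who wants to check how that expression behaves on edge cases, here is a minimal sketch that wraps the same logic in a function. The module name ReplicaHosts and the example hostnames are hypothetical and not part of the app; the function takes the raw value that System.get_env/1 would return.

```elixir
defmodule ReplicaHosts do
  # Hypothetical helper (not in the app): same expression as the config line,
  # but taking the raw environment value as an argument so it can be tested.
  def parse(value) do
    String.split(value || "localhost", ",", trim: true)
  end
end

# Unset variable (nil) falls back to the single default host.
IO.inspect(ReplicaHosts.parse(nil))

# Two endpoints, one list entry each.
IO.inspect(ReplicaHosts.parse("a.example,b.example"))

# A trailing comma is dropped by trim: true.
IO.inspect(ReplicaHosts.parse("a.example,"))

# A variable that is set but empty does NOT fall back to "localhost":
# "" is truthy in Elixir, so the result is an empty list.
IO.inspect(ReplicaHosts.parse(""))
```

Note that surrounding whitespace is not stripped, so a value like "a, b" would keep the leading space on the second entry.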
Then in my Repo file I have:
defmodule SnwBowman.Repo do
  @moduledoc false

  require Logger

  use Ecto.Repo,
    otp_app: :snw_bowman,
    adapter: Ecto.Adapters.Postgres

  @replicas [
    SnwBowman.Repo.ReadOnly1,
    SnwBowman.Repo.ReadOnly2
  ]

  def replica do
    @replicas
    |> Enum.random()
  end

  @doc """
  Starts the read-only replicas. This function is called by the GenServer
  defined below, and should not be called directly. The GenServer is started
  under the main supervision tree.
  """
  def start_replicas(hosts) do
    conf = Keyword.put(config(), :read_only, true)

    for {repo, index} <- Enum.with_index(@replicas) do
      case repo.start_link(
             Keyword.put(conf, :hostname, Enum.at(hosts, index))
             |> Keyword.put(:name, repo)
           ) do
        {:ok, _} -> :ok
        {:error, reason} -> Logger.error("Failed to start replica #{index}: #{inspect(reason)}")
      end
    end
  end

  for repo <- @replicas do
    defmodule repo do
      use Ecto.Repo,
        otp_app: :snw_bowman,
        adapter: Ecto.Adapters.Postgres
    end
  end
end
defmodule SnwBowman.Repo.Replicas do
  @moduledoc false

  @name :snw_bowman_repo_replicas

  use GenServer

  def start_link(args) do
    GenServer.start_link(__MODULE__, args, name: @name)
  end

  def init(hosts: hosts) do
    SnwBowman.Repo.start_replicas(hosts)
    {:ok, %{}}
  end
end
The key pieces here are start_replicas/1 and the GenServer module at the bottom. Those are used in the application.ex file like this:
defmodule SnwBowman.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children =
      [
        ...
        # Start the Ecto repository
        SnwBowman.Repo,
        # Start the replicas
        {SnwBowman.Repo.Replicas, hosts: Application.get_env(:snw_bowman, :pod_replicas)}
        ...
      ]

    # See https://hexdocs.pm/elixir/Supervisor.html
    # for other strategies and supported options
    opts = [strategy: :one_for_one, name: SnwBowman.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
This allows me to sub in a different :hostname value for each replica, have them configured at runtime, and start them under the main supervision tree.
The biggest problem is that it requires that ALL clusters have the same number of read replicas.
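To make that limitation concrete, here is a minimal sketch, without Ecto, of how the compile-time module list gets paired with the runtime host list. The module ReplicaPairing and the atoms :read_only_1 and :read_only_2 are hypothetical stand-ins for the real replica modules; the pairing logic mirrors the Enum.with_index/Enum.at combination in start_replicas/1.

```elixir
defmodule ReplicaPairing do
  # Stand-ins for SnwBowman.Repo.ReadOnly1 / ReadOnly2 (plain atoms, no Ecto).
  @replicas [:read_only_1, :read_only_2]

  # Same pairing as start_replicas/1: each module looks up its host
  # by its own position in the compile-time list.
  def pair(hosts) do
    for {repo, index} <- Enum.with_index(@replicas) do
      {repo, Enum.at(hosts, index)}
    end
  end
end

# Matching counts: every module gets a host.
IO.inspect(ReplicaPairing.pair(["a", "b"]))

# Fewer hosts than modules: the second module gets nil as its hostname.
IO.inspect(ReplicaPairing.pair(["a"]))

# More hosts than modules: the third host is never used.
IO.inspect(ReplicaPairing.pair(["a", "b", "c"]))
```

So the number of replicas is effectively fixed by the @replicas attribute at compile time, regardless of how many endpoints POD_REPLICAS lists at runtime.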
The second problem is that this just feels… odd. Like there should be some code smells, but I can’t see them. Is this idiomatic Elixir? Is there a better way to handle this?
Would love to hear some feedback and suggestions.
Thanks in advance,
~mike