Phoenix can't resolve DNS inside AKS pod

I’m seeing an odd DNS resolution problem in a Kubernetes pod that runs a Phoenix app.

Azure deployment configuration

I am trying to deploy a Phoenix app using a Postgres DB on Kubernetes on Azure:

  • The Kubernetes cluster and the Postgres DB live in a Virtual Network.
  • The VNet has two subnets, one for Kubernetes, the other for Postgres. The DB also has a Private DNS Zone in Azure, which maps a DNS name to the database's private IP.
  • Temporarily, the DB connection is stored as a Kubernetes secret and injected as the environment variable that Phoenix reads in its configuration to connect to the database.
  • Temporarily, the Postgres DB is not exposed with TLS.

I tried to run the application, but it fails to resolve the database hostname:

19:17:48.821 [error] Postgrex.Protocol (#PID<0.151.0>) failed to connect: ** (DBConnection.ConnectionError) tcp connect (ed6c59338ca8.privee-db.private.postgres.database.azure.com:5432): non-existing domain - :nxdomain

To check whether DNS resolution is broken for the whole pod, I installed psql in the pod’s container. Apparently it is not, since psql resolves the hostname and reaches the server:

$ psql -h ed6c59338ca8.privee-db.private.postgres.database.azure.com -p 5432 -U priveedbadmin -d postgres
Password for user priveedbadmin:
psql (15.10 (Debian 15.10-0+deb12u1), server 16.8)
WARNING: psql major version 15, server major version 16.
         Some psql features might not work.

Here are the ping and telnet outputs:

$ ping ed6c59338ca8.privee-db.private.postgres.database.azure.com
PING ed6c59338ca8.privee-db.private.postgres.database.azure.com (10.0.1.4) 56(84) bytes of data.
64 bytes from 10.0.1.4 (10.0.1.4): icmp_seq=1 ttl=62 time=2.41 ms
64 bytes from 10.0.1.4 (10.0.1.4): icmp_seq=2 ttl=62 time=0.992 ms
$ telnet ed6c59338ca8.privee-db.private.postgres.database.azure.com 5432
Trying 10.0.1.4...
Connected to ed6c59338ca8.privee-db.private.postgres.database.azure.com.
Escape character is '^]'.

nc seems to give more information:

$ nc -zv ed6c59338ca8.privee-db.private.postgres.database.azure.com 5432
Warning: inverse host lookup failed for 10.0.1.4: Unknown host
ed6c59338ca8.privee-db.private.postgres.database.azure.com [10.0.1.4] 5432 (postgresql) open
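
All the tools above resolve the name through the container's system resolver. To see whether the lookup also fails when the Erlang VM performs it, and for which address family, something like the following could be run from an IEx session attached to the running app. This is only a minimal sketch: the module name DnsCheck is hypothetical, and it assumes a remote IEx shell is available inside the pod.

```elixir
defmodule DnsCheck do
  @moduledoc "Hypothetical helper: resolves a hostname from inside the Erlang VM."

  # Returns the result of an IPv4 (:inet) and an IPv6 (:inet6) lookup for `host`.
  # Each value is {:ok, ip_tuple} or {:error, reason}, for example {:error, :nxdomain}.
  def resolve(host) when is_binary(host) do
    # Erlang's :inet functions expect a charlist, not an Elixir binary.
    charlist = String.to_charlist(host)

    %{
      inet: :inet.getaddr(charlist, :inet),
      inet6: :inet.getaddr(charlist, :inet6)
    }
  end
end
```

Calling DnsCheck.resolve("ed6c59338ca8.privee-db.private.postgres.database.azure.com") would show the two lookups side by side, and :inet.get_rc() would print the resolver settings the VM is currently using.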

Here is the content of the runtime.exs configuration:

  import Config

  # The secret key base is used to sign/encrypt cookies and other secrets.
  # A default value is used in config/dev.exs and config/test.exs but you
  # want to use a different value for prod and you most likely don't want
  # to check this value into version control, so we use an environment
  # variable instead.
  secret_key_base =
    System.get_env("SECRET_KEY_BASE") ||
      raise """
      environment variable SECRET_KEY_BASE is missing.
      You can generate one by calling: mix phx.gen.secret
      """

  config :privee_web, PriveeWeb.Endpoint,
    http: [
      # Enable IPv6 and bind on all interfaces.
      # Set it to {0, 0, 0, 0, 0, 0, 0, 1} for local network only access.
      ip: {0, 0, 0, 0, 0, 0, 0, 0},
      port: String.to_integer(System.get_env("PORT") || "4000")
    ],
    secret_key_base: secret_key_base,
    server: true
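
The excerpt above shows only the endpoint settings. For context, the database part of a generated Phoenix runtime.exs usually looks roughly like the sketch below. The names DATABASE_URL, ECTO_IPV6, POOL_SIZE, :privee and Privee.Repo are assumptions based on the Phoenix generator defaults; they are not confirmed by the configuration shown here.

```elixir
import Config

# Assumed variable name: the connection URL injected from the Kubernetes secret.
database_url =
  System.get_env("DATABASE_URL") ||
    raise """
    environment variable DATABASE_URL is missing.
    For example: ecto://USER:PASS@HOST/DATABASE
    """

# Generator default: use IPv6 sockets for the database connection
# only when ECTO_IPV6 is set to "true" or "1".
maybe_ipv6 = if System.get_env("ECTO_IPV6") in ~w(true 1), do: [:inet6], else: []

# Assumed app and repo names (:privee, Privee.Repo).
config :privee, Privee.Repo,
  url: database_url,
  pool_size: String.to_integer(System.get_env("POOL_SIZE") || "10"),
  socket_options: maybe_ipv6
```

The socket_options value is passed on to the TCP connection, so it determines whether the hostname is looked up as an IPv4 or an IPv6 address.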

I’m not sure what additional information could shed light on the problem; I will edit this post to add whatever is needed.

Additional information

I have also included libcluster. Neither it nor DNS Cluster can resolve its respective DNS names.