Unable to serve Phoenix SSL certificate from memory

I’m developing a Phoenix server that needs to terminate SSL for multiple domains, in order to support the custom domains feature in my app. The certificates are acquired via Certbot and stored in a database. I’m passing an sni_fun callback in the https keyword list of the endpoint config like so:

config :my_app, MyAppWeb.Endpoint,
  # ...
  https: [
    # ...
    sni_fun: &MyAppWeb.Certs.sni_fun/1
  ]

This works perfectly fine as long as it relies on the filesystem, i.e. as long as the SNI function returns certfile, keyfile and cacertfile like so:

def sni_fun(domain) do
  domain = List.to_string(domain)
  certs_dir = Path.join(:code.priv_dir(:my_app), "cert")
  certfile = Path.join(certs_dir, "#{domain}.pem")
  cacertfile = Path.join(certs_dir, "#{domain}_chain.pem")
  keyfile = Path.join(certs_dir, "#{domain}_key.pem")

  certs = [certfile: certfile, cacertfile: cacertfile, keyfile: keyfile]

  get_and_write_certs(domain, certs)

  certs
end

Note that certfile, cacertfile and keyfile correspond to the cert.pem, chain.pem and privkey.pem files obtained by Certbot, respectively.

But this means that I have to rely on both the database and the filesystem to deliver certificates. That bites when trying to figure out a proper and efficient caching strategy, considering there’s already a cache in place. I’ve been trying to use the equivalent cert, cacerts and key options that appear in both the Erlang SSL docs and the ranch docs. The code below is what I believe comes closest to the SSL and ranch specs, as well as to the notes in this thread about HTTPoison:

def sni_fun(domain) do
  domain = List.to_string(domain)
  {cert_pem_string, cacerts_pem_string, key_pem_string} = get_certs(domain)

  cert = read_pem(cert_pem_string) |> hd() |> elem(1)
  cacerts = read_pem(cacerts_pem_string) |> Enum.map(&elem(&1, 1))
  key = read_pem(key_pem_string) |> hd()

  [cert: cert, cacerts: cacerts, key: key]
end

defp read_pem(pem_string) do
  pem_string
  |> :public_key.pem_decode()
  |> Enum.map(fn entry ->
    entry = :public_key.pem_entry_decode(entry)
    type = elem(entry, 0)
    {type, :public_key.der_encode(type, entry)}
  end)
end

This doesn’t work and yields the following error in the server log:

[info] TLS :server: In state :hello at tls_connection.erl:1359 generated SERVER ALERT: Fatal - Handshake Failure
 - :malformed_handshake_data

This happens even though, when inspecting the returned values with IO.inspect, everything looks legitimate and consistent with the specs above:

[
  cert: <<48, 130, 5, 99, 48, 130, 4, 75, 160, 3, 2, 1, 2, 2, 19, 0, 250, 215,
    47, 160, 210, 189, 235, 118, 145, 90, 123, 29, 116, 249, 12, 148, 109, 65,
    48, 13, 6, 9, 42, 134, 72, 134, 247, 13, 1, 1, 11, 5, 0, ...>>,
  cacerts: [
    <<48, 130, 5, 91, 48, 130, 3, 67, 160, 3, 2, 1, 2, 2, 16, 77, 244, 43, 149,
      209, 238, 155, 58, 76, 46, 179, 59, 141, 16, 93, 214, 48, 13, 6, 9, 42,
      134, 72, 134, 247, 13, 1, 1, 11, 5, 0, 48, ...>>,
    <<48, 130, 5, 84, 48, 130, 4, 60, 160, 3, 2, 1, 2, 2, 17, 0, 237, 93, 91,
      201, 109, 251, 223, 77, 62, 205, 106, 73, 141, 209, 179, 199, 48, 13, 6,
      9, 42, 134, 72, 134, 247, 13, 1, 1, 11, 5, ...>>
  ],
  key: {:RSAPrivateKey,
   <<48, 130, 4, 163, 2, 1, 0, 2, 130, 1, 1, 0, 172, 232, 46, 48, 27, 142, 154,
     67, 197, 56, 70, 245, 47, 232, 199, 248, 192, 53, 199, 211, 94, 83, 73, 35,
     24, 181, 123, 43, 194, 15, 191, 59, 16, ...>>},
]

Don’t worry: although trimmed, it’s not a production certificate :)

I’ve also tried other variations, e.g. skipping the pem_entry_decode + der_encode calls (since pem_decode already seems to return a valid DER-encoded binary, just with an extra element in the entry tuple). Nothing seems to work unless I return to the filesystem-based implementation…
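For reference, that variation could look roughly like the sketch below. It relies on :public_key.pem_decode/1 returning {type, der, cipher_info} tuples, where der is already the DER binary for unencrypted entries. The module name InMemoryPem and its function names are hypothetical, and the list of accepted key types is an assumption taken from the key option in the Erlang ssl docs.

```elixir
defmodule InMemoryPem do
  # Hypothetical helper: turns PEM strings into in-memory :ssl options
  # without the pem_entry_decode + der_encode round trip.

  # Key entry types assumed to be accepted by the :ssl `key` option.
  @key_types [:RSAPrivateKey, :DSAPrivateKey, :ECPrivateKey, :PrivateKeyInfo]

  # Returns the DER binary of every unencrypted certificate in the PEM, in order.
  def cert_ders(pem) when is_binary(pem) do
    for {:Certificate, der, :not_encrypted} <- :public_key.pem_decode(pem), do: der
  end

  # Returns {type, der} for the first unencrypted private key entry,
  # or nil if the PEM contains none.
  def key(pem) when is_binary(pem) do
    pem
    |> :public_key.pem_decode()
    |> Enum.find_value(fn
      {type, der, :not_encrypted} when type in @key_types -> {type, der}
      _other -> nil
    end)
  end

  # Builds the keyword list an sni_fun would return.
  # Raises if cert_pem contains no certificate.
  def ssl_opts(cert_pem, chain_pem, key_pem) do
    [cert | _] = cert_ders(cert_pem)
    [cert: cert, cacerts: cert_ders(chain_pem), key: key(key_pem)]
  end
end
```

With this, sni_fun would reduce to fetching the three PEM strings and returning InMemoryPem.ssl_opts(cert_pem, chain_pem, key_pem). Certbot’s privkey.pem would come out tagged with whatever type its PEM header indicates, which is why the key type is not hard-coded.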

So has anyone figured out a way to turn certificates (issued by Let’s Encrypt or not) into an in-memory representation that Phoenix / cowboy / ranch / Erlang ssl would consume properly? Or do you see any mistake that I’m making here? This could be useful for many other use cases, like configuring certs in config/runtime.exs from various secret stores without relying on the filesystem.

I checked some code of mine in which I’m doing something similar, and in the key parameter I am passing in a PKCS#8 PrivateKeyInfo DER binary. I don’t remember why, I may have had some issues passing in raw RSA keys, or maybe I just decided it was more portable, as it allows me to pass in EC key binaries as well.

Anyway, if you want to try that I would recommend you use x509 (shameless plug) to do the PEM/DER conversion, otherwise it’s a bit tricky to produce the PKCS#8 wrapper. This seems to work for me: key: {:PrivateKeyInfo, X509.PrivateKey.to_der(key, wrap: true)}, where key is the internal Erlang RSA key record (which you can get using X509.PrivateKey.from_pem(pem)). I guess you could store the PKCS#8 DER in your DB so you don’t have to convert it each time…
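Spelled out as a helper, the conversion could look roughly like this. It is a minimal sketch that assumes x509 is listed in your mix.exs deps; the module and function names are hypothetical, and from_pem!/1 is used here as the raising variant of from_pem so that the key record comes back unwrapped.

```elixir
# Assumes {:x509, "~> 0.8"} (or similar) is in your mix.exs deps.
defmodule MyApp.KeyConversion do
  # Hypothetical helper: converts a PEM private key (RSA or EC) into the
  # {:PrivateKeyInfo, der} tuple for the :ssl `key` option.
  def pkcs8_key_option(pem) when is_binary(pem) do
    # Raises if the PEM does not contain a private key.
    key = X509.PrivateKey.from_pem!(pem)

    # wrap: true produces the PKCS#8 PrivateKeyInfo container.
    {:PrivateKeyInfo, X509.PrivateKey.to_der(key, wrap: true)}
  end
end
```

Storing the resulting DER binary in the database would avoid repeating the conversion on every handshake.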


Hmm, I changed it to key: {:RSAPrivateKey, X509.PrivateKey.to_der(key, wrap: false)} and it still works. Looking at the surrounding code it really was just to support different key types with one code path. So that probably won’t make a difference for you.

Which OTP version are you on?


So it turned out that the code above was perfectly valid, just as it was supposed to be :)

After posting here I decided to move on with the filesystem-based approach, but after getting @voltone’s hint (thank you!) I put the code posted above together again and it worked. I must have made some silly mistake somewhere in 100 iterations with varying cert payloads, or had an inconsistency in the certs table.

Anyway, it took many hours to put this together, so I hope my code helps others. It’s surprising that this subject is so poorly documented, considering Elixir and Phoenix are a perfect match for building scalable servers that terminate SSL for multiple domains. I’ll consider turning this into a library or a blog post (or both).

Again, thanks @voltone for taking the time, and for making me take that last lucky shot :)


I actually do this for a hobby project that uses site_encrypt (GitHub: sasa1977/site_encrypt, integrated certification via Let’s Encrypt for Elixir-powered sites).

I now know where my mistake came from. It turns out that any exception raised in sni_fun is swallowed, and :malformed_handshake_data is logged instead, giving no clue that the error came from the callback. I probably had such an exception in the second-to-last iteration and fixed it by rewriting the code. This is super confusing, and I don’t even know where the error gets swallowed, seeing that the SSL termination part of a Phoenix endpoint falls somewhere between cowboy, ranch and Erlang’s ssl module…
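One way to make such failures visible is to rescue and log inside the callback itself. Below is a minimal sketch; the SafeSni module is hypothetical, and returning :undefined on failure assumes (based on my reading of the Erlang ssl docs) that the server then falls back to its default options.

```elixir
defmodule SafeSni do
  require Logger

  # Runs a 1-arity lookup function and logs any failure, so a bug in the
  # callback shows up in the logs and not only as a TLS handshake alert.
  def call(hostname, lookup) when is_function(lookup, 1) do
    lookup.(hostname)
  rescue
    exception ->
      Logger.error(
        "sni_fun crashed for #{inspect(hostname)}: " <>
          Exception.format(:error, exception, __STACKTRACE__)
      )

      # Assumption: :undefined makes :ssl use the default server options.
      :undefined
  catch
    kind, reason ->
      Logger.error(
        "sni_fun failed for #{inspect(hostname)}: " <>
          Exception.format(kind, reason, __STACKTRACE__)
      )

      :undefined
  end
end
```

The callback configured in the endpoint could then be a thin wrapper, e.g. def sni_fun(hostname), do: SafeSni.call(hostname, &lookup_certs/1), where lookup_certs/1 stands for whatever function actually builds the cert options.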

@axelson Yeah, I saw site_encrypt but wanted a database-centered solution, seeing that as the easiest and most obvious way to store multiple certs in a Phoenix + Ecto + Postgres app. I’m not an SSL expert, so please correct me if I’m wrong, but it seems to me that site_encrypt doesn’t cover this case (i.e. dynamic certificate issuing without server restarts, with certs instantly distributed among multiple server instances).