Erlang SSL problem

yurko · March 2, 2017, 5:07pm

Today I had some fun debugging an issue: an app was suddenly unable to make HTTPS requests with HTTPoison. I got errors like these:

SSL: :hello: ssl_alert.erl:85:Warning: unrecognised name
SSL: :certify: ssl_handshake.erl:1623:Fatal error: handshake failure - {bad_cert,unable_to_match_altnames}

The problem is that the same app, with the same dep versions, worked fine at the same time on my dev computer (almost the same environment as on the server) and on a colleague’s Mac, so it had to be something server-specific (the app is compiled from source on deploy).

What I did:

updated Erlang, Elixir and all other packages
bumped hackney and its deps
bumped httpoison and its deps
removed Erlang and Elixir, cleaned up, rebooted and installed them again

Something above might have fixed the situation, though, weird as it sounds, it only started working a little while after I made the last change.

From what I could google, this could have been a problem with Erlang’s certificate, but I’m not sure. A temporary problem with the remote API server would make the most sense to me, but then it should have thrown errors in every environment.

What I want now is to understand what happened so I can avoid it in the future, so I am asking for the collective mind’s help.

Could someone please explain what exactly the {bad_cert,unable_to_match_altnames} error means?

If anyone had a similar problem and can share some experience that would also be helpful.

OvermindDL1 · March 2, 2017, 5:13pm

That really sounds like either a bad certificate or a MITM attack… The altnames are the valid hostnames that the certificate is for, I do not ‘think’ that’d be related to older Erlang’s SSL deprecations, that really seems like either a bad cert or MITM… o.O
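To make the altnames point concrete, here is a minimal Python sketch of the kind of check that fails with unable_to_match_altnames: the hostname the client asked for is compared against the DNS names listed in the certificate's Subject Alternative Name field. It is a simplified illustration, not the code Erlang's ssl application actually runs. The function names and the example name lists are hypothetical, and the wildcard handling (leftmost label only, exactly one label) is an assumption that covers the common case.

```python
def hostname_matches(hostname: str, pattern: str) -> bool:
    """Compare one requested hostname against one SAN DNS entry."""
    host_labels = hostname.lower().rstrip(".").split(".")
    pat_labels = pattern.lower().rstrip(".").split(".")
    # A wildcard stands for exactly one label, so label counts must agree.
    if len(host_labels) != len(pat_labels):
        return False
    if pat_labels[0] == "*":
        # Assumption: reject overly broad patterns such as "*.org".
        return len(pat_labels) > 2 and host_labels[1:] == pat_labels[1:]
    return host_labels == pat_labels


def match_altnames(hostname: str, san_dns_names: list[str]) -> bool:
    """True if any SAN entry covers the hostname the client wanted."""
    return any(hostname_matches(hostname, name) for name in san_dns_names)


# Hypothetical SAN lists, not taken from any real certificate.
print(match_altnames("api.telegram.org", ["*.telegram.org"]))   # True
print(match_altnames("api.telegram.org", ["www.example.com"]))  # False
```

The second call is the situation behind the error: the server presented a certificate, but none of its names covered the host that was requested.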

No proxies on your network?

yurko · March 2, 2017, 5:18pm

that really seems like either a bad cert

you mean the remote cert, the one from the API server I’m trying to connect to?

No proxies on your network?

None aside from nginx on the same server, which is there to use port 80. Does that leave a MITM possibility? None I can think of, to be honest.

OvermindDL1 · March 2, 2017, 5:20pm

The remote yeah.

Depends on your internet provider, and their provider, and so on. ^.^

It really might have just been a temporary bad certificate though, especially if it is working now. This is definitely why it is important to check the certificates for validity.

yurko · March 2, 2017, 5:25pm

The remote yeah.

it’s Telegram’s API server https://api.telegram.org/bot and at that same time me and a guy I work with were able to use this API locally.

Depends on your internet provider, and their provider, and so on. ^.^

DigitalOcean, didn’t have such problems with it (yet)

Anyway, considering how few reasonable explanations I had, the MITM version sounds good, thank you!

In that case there is nothing I could have done to prevent or fix this issue, right?

voltone · March 2, 2017, 6:13pm

If you get different results depending on whether your client resides on DO or on your own dev machine, and you want to check if you’re getting a different cert depending on where you are, try running the following OpenSSL command:

openssl s_client -connect api.telegram.org:443 -showcerts

You can compare the subjects and issuers, and perhaps a few of the hex bytes of the actual certificates. Or you can copy & paste a certificate from the output (including the BEGIN and END markers) into this command:

openssl x509 -text -noout

The first certificate output by the s_client command should be the server cert, so you should see the hostname(s) in the Subject or the Subject Alternative Name fields.
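If you want to script the same comparison, a rough Python equivalent could look like the sketch below. It uses the standard library's ssl.get_server_certificate, which returns the server's certificate as PEM without validating it (as long as no CA file is passed), so it still works when the cert does not match the hostname. The helper names fetch_pem and fingerprint are made up for this example. Run it on both machines and compare the fingerprints.

```python
import hashlib
import ssl


def fetch_pem(host: str, port: int = 443) -> str:
    """Return the server's certificate as PEM text.

    No CA file is passed, so the certificate is not validated here.
    That is intentional for diagnostics: a mismatching cert is still returned.
    """
    return ssl.get_server_certificate((host, port))


def fingerprint(pem: str) -> str:
    """SHA-256 hex digest of the DER bytes encoded in a PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).hexdigest()


# Usage (needs network access):
#   pem = fetch_pem("api.telegram.org")
#   print(fingerprint(pem))
#   print(pem)  # can be pasted into: openssl x509 -text -noout
```

Different fingerprints are not proof of an attack by themselves, since different edge nodes may serve different valid certificates, but they tell you which cert to inspect for its Subject and Subject Alternative Name.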

If the certificate seems legit, but HTTPoison (Hackney) chokes on it, I’d love to have a look at it. I don’t have any problems connecting to api.telegram.org from here, but you might hit a different Telegram server or CDN edge node depending on geographical location.

idi527 · March 2, 2017, 6:20pm

I don’t know if it helps but I’ve just checked my own bot and it seems it had a lot of :nxdomain errors from 13:06:03.361 to 16:22:27.759 utc.

{:error, %Nadia.Model.Error{reason: :nxdomain}} to be exact. I’m using Nadia to talk to telegram bot api.

voltone · March 2, 2017, 6:25pm

That actually sounds more likely than a MitM attempt: some transient DNS issues that caused connections to hit a wrong server, which was also serving HTTPS but for a different domain. In that case the SSL handshake issues were just SSL doing its job: verifying that we reached the intended server.
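One way to check the DNS theory the next time this happens is to log what the hostname resolves to on each machine. Below is a minimal Python sketch using socket.getaddrinfo; the resolve helper is a hypothetical name, and a failed lookup here is only roughly the counterpart of the :nxdomain error seen on the Erlang side.

```python
import socket


def resolve(host: str, port: int = 443):
    """Return (sorted unique IP addresses, error text or None)."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # Lookup failed, e.g. the name does not exist or DNS is unreachable.
        return [], str(exc)
    # info[4] is the socket address tuple; its first element is the IP.
    return sorted({info[4][0] for info in infos}), None


# Usage (needs working DNS):
#   addresses, error = resolve("api.telegram.org")
#   print(addresses, error)
```

If the server and the dev machine get different address lists during an incident, or the server gets an error, that points at DNS rather than at the certificate itself.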

yurko · March 2, 2017, 6:27pm

Didn’t think about looking into certs then (mostly because I was confused by the unable_to_match_altnames message), will do if it happens the next time.

When I think about it, it actually makes sense that I reached different Telegram API servers from here and from the DO server (both are in Germany, but still), either because of a MITM attack as suggested by @OvermindDL1 or because of a legit temporary problem. That would explain a lot.

I don’t know if it helps but I’ve just checked my own bot and it seems it had a lot of :nxdomain errors from 13:06:03.361 to 16:22:27.759 utc.

It does help, thank you, @idi527, the time frame is very similar.

Our bot has been running 24/7 for about a year and does a lot (so we’d have noticed if something like that had happened before), so this would be the first such problem. Guess I’ll be better prepared the next time it happens.

yurko · March 2, 2017, 7:00pm

I’m using Nadia to talk to telegram bot api.

same here except for the error

yurko · March 2, 2017, 7:21pm

Marking the issue as resolved since it’s as resolved as it gets. Many thanks for all your answers, the explanation makes sense and I have my peace of mind now.