MQTT TLS Debug help

I’m trying to debug a TLS hostname check issue.

I’m using Tortoise to connect to my MQTT broker hosted on HiveMQ Cloud over TLS. Everything works until I set verify: :verify_peer. I’m completely lost, as I can connect from the Phoenix app to other brokers just fine, e.g. to test.mosquitto.org. Moreover, I can connect to my HiveMQ broker with TLS using mosquitto_sub, and it supposedly verifies the cert I’m passing to it with the --cafile option.

Here is an example config that works:

mosquitto_sub -v -h test.mosquitto.org -t "#" -p 8885 -u ro -P readonly --cafile ~/mosquitto_test/mosquitto.org.crt

  config :napos, :mqtt,
    server: {
      Tortoise311.Transport.SSL,
      # TODO: on prod we should remove verify_none and use the server's cert chain
      host: "test.mosquitto.org",
      port: 8885,
      cacertfile: "path/to/mosquitto.org.crt" |> String.to_charlist(),
      verify: :verify_peer
    },
    # ClientID needs to be unique!
    client_id: System.get_env("MQTT_CLIENTID", "random_string_just_to_avoid_collision"),
    user_name: "ro",
    password: "readonly",
    handler: {Napos.DeviceQueueHandler, []},
    subscriptions: [
      {"#", 1},
    ]

According to the Tortoise docs the cacertfile should be passed as a charlist, though it worked with a string as well for me.

The test.mosquitto.org cert can be downloaded from here, while the hivemq cloud cert can be downloaded from here.

My HiveMQ config is exactly the same as above, only replacing the cert file, user, password, URL and port fields with their respective values for HiveMQ. The connection works fine over TLS until I remove verify: :verify_none or set it to verify: :verify_peer. Then I get this:

GenServer {Tortoise.Registry, {Tortoise.Connection, "serverkjbagskjbagbjklagbksajgbsalgfalfa"}} terminating
** (stop) {:tls_alert, {:handshake_failure, ~c"TLS client: In state certify at ssl_handshake.erl:2135 generated CLIENT ALERT: Fatal - Handshake Failure\n {bad_cert,hostname_check_failed}"}}

From what I can tell reading the code, it simply passes all parameters to :ssl.connect/4, so seeing the hostname check failure, I tried to set :sni to the hostname of the broker, though I’m not sure I understand the :ssl docs on :sni correctly.

Right now I’m a bit stuck. Since each part of what I need works either with another client on the same broker or from the same app on another broker, I don’t really know where to look next, or how I could get more verbose info on what might be going wrong.

These are the typical SSL settings I use for connecting to AWS MQTT. My guess is you might need to customize the hostname check for https. Also, what you linked was the SNI type definition, and I’m not sure whether you used the atom :sni, so just for clarity: the whole option name needs to be spelled out:

server_name_indication: ~c"test.mosquitto.org",
customize_hostname_check: [match_fun: :public_key.pkix_verify_hostname_match_fun(:https)],
verify: :verify_peer,
versions: [:"tlsv1.2"]

If that doesn’t work, set log_level: :debug in the SSL options to get more output. It might show the hostname it is trying to validate. It might be that you need to set SNI to just mosquitto.org, depending on their cert.
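To take Tortoise out of the picture while debugging, you could also try the handshake directly with :ssl.connect/4 from iex. This is only a rough sketch: the host, port and CA file path below are placeholders I made up, so swap in your broker's values.

```elixir
# Placeholder values (assumptions): replace with your broker's host, port and CA file.
host = ~c"your-cluster.example.hivemq.cloud"
port = 8883

opts = [
  verify: :verify_peer,
  cacertfile: ~c"path/to/hivemq-ca.pem",
  server_name_indication: host,
  # Ask :ssl to log handshake details
  log_level: :debug
]

# Make sure the :ssl application is started (a no-op if it already is)
{:ok, _apps} = Application.ensure_all_started(:ssl)

case :ssl.connect(host, port, opts, 5_000) do
  {:ok, socket} ->
    IO.puts("handshake ok")
    :ssl.close(socket)

  {:error, reason} ->
    IO.inspect(reason, label: "handshake failed")
end
```

If this fails with the same hostname_check_failed alert, the problem is in the TLS options rather than in anything Tortoise does on top.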

The pitfall with all this is that command-line tools typically have some magic to find all the system SSL bits in standard folders, while Erlang is very explicit and needs every piece provided to it. So it tends to be trial and error :face_exhaling:

Thank you! I didn’t see that I could get debug logs out of :ssl.

Looking at the logs, SNI seems to be set up just fine, or at least the Client Hello section shows the same domain I’m trying to connect to. Then the server sends its hello with its certificate, and then the error is raised.

Regarding the customize_hostname_check, how would that look for MQTT? AFAIK it uses its own protocol and not https.

You can use the same setting for MQTT as well. The https mode for hostname verification enables the use of wildcards in server certificates. The OTP team insists this is only formally defined for HTTPS, so it is not enabled by ssl by default.

Thanks for the snippet! I’ll still need to deploy it, but it seems like the customize_hostname_check: [match_fun: :public_key.pkix_verify_hostname_match_fun(:https)] part was the thing that was missing.
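For anyone landing here later, this is a sketch of how the server tuple might look with that option added to the config from my first post. The host, port and cert path are placeholders, not my real values, and I have left out the other keys (client_id, handler and so on), which stay the same.

```elixir
import Config

# Placeholder host, port and CA path (assumptions): use your own broker's values.
config :napos, :mqtt,
  server: {
    Tortoise311.Transport.SSL,
    host: "your-cluster.example.hivemq.cloud",
    port: 8883,
    cacertfile: ~c"path/to/hivemq-ca.pem",
    verify: :verify_peer,
    server_name_indication: ~c"your-cluster.example.hivemq.cloud",
    # Use the https matching rules so wildcard certificates can match the hostname
    customize_hostname_check: [
      match_fun: :public_key.pkix_verify_hostname_match_fun(:https)
    ]
  }
```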

@voltone Thanks for the explanation. That was the last piece of info I needed, and everything fell into place.