gmile
May 18, 2023, 8:28am
1
After upgrading to OTP26, our app that uses the `amqp` package fails to establish a connection to the RabbitMQ server, and I can't work out why. Connecting from OTP25 works fine.
# OTP25
AMQP.Connection.open("amqp://guest:guest@amqp-issue-rabbitmq")
# => {:ok, %AMQP.Connection{pid: #PID<0.241.0>}}
# OTP26
AMQP.Connection.open("amqp://guest:guest@amqp-issue-rabbitmq")
# => {:error, {:auth_failure, 'Disconnected'}}
I have a hunch this is related to the recent changes to SSL-specific defaults in OTP26. However, my attempts to disable `ssl_options` don't work either: they fail with a different error, so I must be passing the options the wrong way. For example, this fails:
AMQP.Connection.open("amqp://guest:guest@amqp-issue-rabbitmq", [ssl_options: :none])
# => {:error, :econnrefused}
Reproduction details
The reproduction script is this one:
# sample.bash
mix do local.hex --force, local.rebar --force
elixir -e 'Mix.install([:amqp]); AMQP.Connection.open("amqp://guest:guest@amqp-issue-rabbitmq") |> IO.inspect()'
The full reproduction is this one:
docker network create amqp-issue
docker run \
--rm \
--name amqp-issue-rabbitmq \
--detach \
--network amqp-issue \
rabbitmq:3.11.16-alpine
sleep 5 # give rabbitmq enough time to fully initialize
docker run \
--rm \
--name amqp-issue-otp25 \
--network amqp-issue \
--mount type=bind,source=$(realpath sample.bash),target=/tmp/sample.bash \
hexpm/elixir:1.14.1-erlang-25.1.2-alpine-3.16.2 \
ash /tmp/sample.bash
docker run \
--rm \
--name amqp-issue-otp26 \
--network amqp-issue \
--mount type=bind,source=$(realpath sample.bash),target=/tmp/sample.bash \
hexpm/elixir:1.14.4-erlang-26.0-alpine-3.18.0 \
ash /tmp/sample.bash
docker stop amqp-issue-rabbitmq
docker network rm amqp-issue
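The fixed `sleep 5` in the script above can race with RabbitMQ startup on a slow machine. A minimal sketch of a polling alternative follows. `wait_for` is a hypothetical helper name introduced here for illustration, and the sketch assumes the `rabbitmq-diagnostics` CLI is available inside the container:

```shell
#!/bin/sh
# wait_for (hypothetical helper): retry a command until it succeeds
# or the attempt budget is exhausted.
# Usage: wait_for <max_attempts> <delay_seconds> <command...>
wait_for() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  # Run the command quietly; only its exit status matters here.
  while ! "$@" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}
```

With this in place, `sleep 5` could be replaced by something like `wait_for 30 1 docker exec amqp-issue-rabbitmq rabbitmq-diagnostics -q ping`, so the clients start only once the broker answers.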
I haven't dug too deep into `amqp_client` yet. Any ideas what might be preventing the connection and how to fix it?
You probably need a version of the server that supports OTP26.
for reference
opened 01:59PM - 05 Apr 23 UTC
enhancement
erlang-26
Umbrella issue for things related to OTP26 compatibility
https://erlang.org/download/OTP-26.0-rc2.README
https://www.erlang.org/doc/general_info/upcoming_incompatibilities.html
First batch of changes has been merged:
https://github.com/rabbitmq/rabbitmq-server/pull/7900
The most important change is ready and tested (without this, we can't accept a client connection):
https://github.com/rabbitmq/rabbitmq-server/pull/7927
## Dialyzer defaults to `-Wunknown`, which makes it fail with errors such as these:
```
Unknown functions:
compile:file/2 (c.erl:385:10)
compile:forms/2 (escript.erl:647:12)
compile:noenv_forms/2 (erl_abstract_code.erl:10:9)
compile:noenv_forms/2 (qlc_pt.erl:444:14)
compile:output_generated/1 (c.erl:444:10)
crypto:crypto_one_time/5 (beam_lib.erl:987:11)
crypto:start/0 (beam_lib.erl:1023:10)
crypto:strong_rand_bytes/1 (net_kernel.erl:2093:37)
```
We can add all the necessary apps in the right places or use `-Wno_unknown`, at least for now
- [ ] fix it in `bazel`
- [ ] fix it in `erlang.mk`
- [x] fix credentials-obfuscation repo (OTP PR https://github.com/erlang/otp/pull/7103) or a workaround
## `term_to_binary` defaults to `{minor_version, 2}`
- [x] [remove tests that expect version 1 by default](https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbit/test/term_to_binary_compat_prop_SUITE.erl)
- [ ] think through potential incompatibilities (mixed version clusters? reading data written by a different version?)
## different order in `maps:` operations (eg. `maps:to_list` or `maps:keys`); the order was never guaranteed but now it's different :)
- [x] fix tests that fail because of this (eg. [a test in Ra](https://github.com/rabbitmq/ra/commit/d5e43c210783d3f8a082b973dfcd9923365afa49))
- [x] search for places where we might rely on the order but we don't test for this explicitly
## `verify_peer` is set by default
- [x] fix https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbit_common/mk/rabbitmq-dist.mk (it fails to download `mix_task_archive_deps`)
## BACKPORTING
OTP26 compatibility fixes need to be backported to `v3.12.x`, `v3.11.x`.
`v3.10.x` is optional, let's see how it goes
rabbitmq:main ← rabbitmq:otp-26-tcp-send
opened 03:54PM - 18 Apr 23 UTC
Follow up of https://github.com/rabbitmq/rabbitmq-server/pull/7913 and https://github.com/rabbitmq/rabbitmq-server/pull/7921
This commit uses the approach explained in https://github.com/erlang/otp/issues/7130#issuecomment-1512808759
We cannot use the macro `?OTP_RELEASE` since macros are evaluated at compile time. RabbitMQ can be compiled with OTP 25 and executed with OTP 26. Therefore, we use `erlang:system_info(otp_release)` instead. As `erlang:system_info/1` might be costly, we store the send function in persistent_term.
For OTP 25, we use the "old tcp send workaround" (i.e. `erlang:port_command/2`) which avoids expensive selective receives. For OTP 26, we use `gen_tcp:send/2` which uses the optimised selective receive.
Once the minimum required version becomes OTP 26, we can just switch to `gen_tcp:send/2` and delete the `inet_reply` handling code in the various RabbitMQ reader and writer processes.
Note that `rabbit_net:port_command/2` is not only used by RabbitMQ server, but also by the AMQP 0.9.1 client.
Therefore, instead of putting the OTP version (or send function) into persistent_term within the rabbit app, we just do it the first time `rabbit_net:port_command/2` is invoked.
(`rabbit_common` is just a library without supervision hierarchy.)
gmile
May 18, 2023, 1:36pm
3
You probably need a version of the server that supports OTP26.
Would that not be somewhat counter-intuitive? I mean, it's the client that's having issues, and the only connection between client and server is over TCP, using a well-established message queue protocol.
The client implementation and the server share code, according to this note in that second GitHub issue:
Note that `rabbit_net:port_command/2` is not only used by RabbitMQ server, but also by the AMQP 0.9.1 client. Therefore, instead of putting the OTP version (or send function) into persistent_term within the rabbit app, we just do it the first time `rabbit_net:port_command/2` is invoked.
gmile
May 18, 2023, 1:47pm
5
Fair enough. I still can't wrap my head around how that change would affect the client in the connection scenario specifically, though.