gmile
May 18, 2023, 8:28am
1
After upgrading to OTP 26, our app, which uses the amqp package, fails to establish a connection to the RabbitMQ server, and I can't work out why. Connecting on OTP 25 works fine.
# OTP25
AMQP.Connection.open("amqp://guest:guest@amqp-issue-rabbitmq")
# => {:ok, %AMQP.Connection{pid: #PID<0.241.0>}}
# OTP26
AMQP.Connection.open("amqp://guest:guest@amqp-issue-rabbitmq")
# => {:error, {:auth_failure, 'Disconnected'}}
I have a hunch this must be related to recent changes to SSL-specific defaults in OTP 26. However, attempts to disable ssl_options don't work either: the call fails with a different error, so I must be passing the options the wrong way. For example, this fails:
AMQP.Connection.open("amqp://guest:guest@amqp-issue-rabbitmq", [ssl_options: :none])
# => {:error, :econnrefused}
Reproduction details
The reproduction script is this one:
# sample.bash
mix do local.hex --force, local.rebar --force
elixir -e 'Mix.install([:amqp]); AMQP.Connection.open("amqp://guest:guest@amqp-issue-rabbitmq") |> IO.inspect()'
The full reproduction is this one:
docker network create amqp-issue
docker run \
--rm \
--name amqp-issue-rabbitmq \
--detach \
--network amqp-issue \
rabbitmq:3.11.16-alpine
sleep 5 # give rabbitmq enough time to fully initialize
docker run \
--rm \
--name amqp-issue-otp25 \
--network amqp-issue \
--mount type=bind,source=$(realpath sample.bash),target=/tmp/sample.bash \
hexpm/elixir:1.14.1-erlang-25.1.2-alpine-3.16.2 \
ash /tmp/sample.bash
docker run \
--rm \
--name amqp-issue-otp26 \
--network amqp-issue \
--mount type=bind,source=$(realpath sample.bash),target=/tmp/sample.bash \
hexpm/elixir:1.14.4-erlang-26.0-alpine-3.18.0 \
ash /tmp/sample.bash
docker stop amqp-issue-rabbitmq
docker network rm amqp-issue
I haven't dug too deep into amqp_client yet. Any ideas what might be preventing the connection and how to fix it?
You probably need a version of the server that supports OTP 26.
For reference:
opened 01:59PM - 05 Apr 23 UTC
enhancement
erlang-26
Umbrella issue for things related to OTP26 compatibility
https://erlang.org/download/OTP-26.0-rc2.README
https://www.erlang.org/doc/general_info/upcoming_incompatibilities.html
First batch of changes has been merged:
https://github.com/rabbitmq/rabbitmq-server/pull/7900
The most important change is ready and tested (without this, we can't accept a client connection):
https://github.com/rabbitmq/rabbitmq-server/pull/7927
## Dialyzer defaults to `-Wunknown`, which makes it fail with errors such as these:
```
Unknown functions:
compile:file/2 (c.erl:385:10)
compile:forms/2 (escript.erl:647:12)
compile:noenv_forms/2 (erl_abstract_code.erl:10:9)
compile:noenv_forms/2 (qlc_pt.erl:444:14)
compile:output_generated/1 (c.erl:444:10)
crypto:crypto_one_time/5 (beam_lib.erl:987:11)
crypto:start/0 (beam_lib.erl:1023:10)
crypto:strong_rand_bytes/1 (net_kernel.erl:2093:37)
```
We can add all the necessary apps in the right places or use `-Wno_unknown`, at least for now
- [ ] fix it in `bazel`
- [ ] fix it in `erlang.mk`
- [x] fix credentials-obfuscation repo (OTP PR https://github.com/erlang/otp/pull/7103) or a workaround
## `term_to_binary` defaults to `{minor_version, 2}`
- [x] [remove tests that expect version 1 by default](https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbit/test/term_to_binary_compat_prop_SUITE.erl)
- [ ] think through potential incompatibilities (mixed version clusters? reading data written by a different version?)
## different order in `maps:` operations (eg. `maps:to_list` or `maps:keys`); the order was never guaranteed but now it's different :)
- [x] fix tests that fail because of this (eg. [a test in Ra](https://github.com/rabbitmq/ra/commit/d5e43c210783d3f8a082b973dfcd9923365afa49))
- [x] search for places where we might rely on the order but we don't test for this explicitly
## `verify_peer` is set by default
- [x] fix https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbit_common/mk/rabbitmq-dist.mk (it fails to download `mix_task_archive_deps`)
## BACKPORTING
OTP26 compatibility fixes need to be backported to `v3.12.x`, `v3.11.x`.
`v3.10.x` is optional, let's see how it goes
rabbitmq:main ← rabbitmq:otp-26-tcp-send
opened 03:54PM - 18 Apr 23 UTC
Follow up of https://github.com/rabbitmq/rabbitmq-server/pull/7913 and https://github.com/rabbitmq/rabbitmq-server/pull/7921
This commit uses the approach explained in https://github.com/erlang/otp/issues/7130#issuecomment-1512808759
We cannot use the macro `?OTP_RELEASE` since macros are evaluated at compile time. RabbitMQ can be compiled with OTP 25 and executed with OTP 26. Therefore, we use `erlang:system_info(otp_release)` instead. As `erlang:system_info/1` might be costly, we store the send function in persistent_term.
For OTP 25, we use the "old tcp send workaround" (i.e. `erlang:port_command/2`) which avoids expensive selective receives. For OTP 26, we use `gen_tcp:send/2` which uses the optimised selective receive.
Once the minimum required version becomes OTP 26, we can just switch to `gen_tcp:send/2` and delete the `inet_reply` handling code in the various RabbitMQ reader and writer processes.
Note that `rabbit_net:port_command/2` is not only used by RabbitMQ server, but also by the AMQP 0.9.1 client.
Therefore, instead of putting the OTP version (or send function) into persistent_term within the rabbit app, we just do it the first time `rabbit_net:port_command/2` is invoked.
(`rabbit_common` is just a library without supervision hierarchy.)
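To make the approach in that PR description more concrete, here is a rough Elixir sketch of the same idea. It is not RabbitMQ's actual Erlang code: the module name `SendFunSketch` and the cache key are made up for illustration. It picks the send function from the OTP release detected at runtime and caches it in persistent_term on first use.

```elixir
defmodule SendFunSketch do
  # Hypothetical cache key, chosen for this sketch only.
  @key {__MODULE__, :send_fun}

  # Returns a 2-arity send function suited to the OTP release we are running on.
  def send_fun do
    case :persistent_term.get(@key, nil) do
      nil ->
        # First call: decide once, then cache the choice.
        fun = choose_send_fun()
        :persistent_term.put(@key, fun)
        fun

      fun ->
        fun
    end
  end

  defp choose_send_fun do
    # Evaluated at runtime, unlike the compile-time ?OTP_RELEASE macro.
    release = :erlang.system_info(:otp_release) |> List.to_integer()

    if release >= 26 do
      &:gen_tcp.send/2
    else
      &:erlang.port_command/2
    end
  end
end
```

On the OTP 25 path the caller would still have to handle the `inet_reply` message that `erlang:port_command/2` produces, which is the handling code the PR says can be deleted once OTP 26 is the minimum.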
gmile
May 18, 2023, 1:36pm
3
You probably need a version of the server that supports OTP 26.
Would that not be somewhat counter-intuitive? I mean, it's the client that's having issues, and the only connection between client and server is over TCP, using a well-established message queue protocol.
The client implementation and the server share code, according to this note in that second GitHub issue:
Note that rabbit_net:port_command/2 is not only used by RabbitMQ server, but also by the AMQP 0.9.1 client. Therefore, instead of putting the OTP version (or send function) into persistent_term within the rabbit app, we just do it the first time rabbit_net:port_command/2 is invoked.
gmile
May 18, 2023, 1:47pm
5
Fair enough. I still can't wrap my head around how that change would affect the client in the connection scenario specifically, though.
gmile
June 5, 2023, 7:42am
6
After upgrading to amqp_client 3.12.0, this issue is now resolved. There's a chance connecting also works with 3.11.17, but I wasn't able to test that.
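For anyone wanting to try the same fix, a minimal sketch of requiring the newer client in mix.exs. The version requirements are assumptions based on the versions mentioned in this thread (amqp v3.3.0, amqp_client 3.12.0), so check them against your own project.

```elixir
# mix.exs (fragment)
defp deps do
  [
    {:amqp, "~> 3.3"},
    # Explicit requirement so dependency resolution picks a client
    # release that works on OTP 26 (assumed: 3.12 or later).
    {:amqp_client, "~> 3.12"}
  ]
end
```

After changing the requirement, the lockfile has to be updated as well, for example with `mix deps.update amqp_client`. You will likely need to update rabbit_common along with it, since the two are released together.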
Just updating the client worked for you? I've been using CloudAMQP and their latest version is 3.9.X.
gmile
June 5, 2023, 4:05pm
8
Just updating the client worked, yes. I'm using CloudAMQP too, and connecting using the 3.12 client code appears to work now.
However, I've only done some quick tests. I haven't deployed this yet.
woylie
July 10, 2023, 1:59am
9
In case someone else stumbles upon this thread: the documentation of the amqp library has an example that sets fail_if_no_peer_cert to true (see the AMQP.Connection docs for amqp v3.3.0). Starting with OTP 26, setting this option on the client side results in the error {:error, {:option, :server_only, :fail_if_no_peer_cert}}. See Erlang/OTP 26 Highlights on the Erlang/OTP blog.
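As a sketch of what the adjusted client options might look like on OTP 26: keep peer verification but leave out the server-only option. The certificate paths and port below are placeholders, not values from a real setup.

```elixir
# Client-side TLS options for OTP 26+: verify the server certificate,
# but do not pass fail_if_no_peer_cert, which is a server-only option.
ssl_options = [
  cacertfile: ~c"/path/to/testca/cacert.pem",
  certfile: ~c"/path/to/client/cert.pem",
  keyfile: ~c"/path/to/client/key.pem",
  verify: :verify_peer
]

AMQP.Connection.open(port: 5671, ssl_options: ssl_options)
```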