Goose97
Phoenix.PubSubServer messages overloaded
I’m having a hard time debugging an issue with Phoenix.PubSub on our production system. We have several Elixir instances, but one of them is experiencing message overload in the PubSub server. Sometimes the GenServer cannot handle incoming messages fast enough, so they pile up in its message queue. I’m checking message_queue_len at 3-second intervals.
Here’s what I know so far:
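For readers who want to reproduce this kind of measurement, a sampling loop could be sketched as follows. This is a minimal sketch, not the poster's actual code: the `QueueSampler` module and its function names are hypothetical, and only `Process.info/2`, `Process.whereis/1`, `Process.sleep/1` and `System.monotonic_time/1` are real standard-library calls. It records `reductions` next to `message_queue_len` on the assumption that comparing the two helps tell a busy process (reductions climbing) from a stalled one (queue growing while reductions barely move).

```elixir
defmodule QueueSampler do
  # Hypothetical helper, not part of Phoenix or Phoenix.PubSub.

  # Returns {message_queue_len, reductions} for a local pid or a locally
  # registered name, or nil if the process is not alive.
  def sample(name) when is_atom(name) do
    case Process.whereis(name) do
      nil -> nil
      pid -> sample(pid)
    end
  end

  def sample(pid) when is_pid(pid) do
    case Process.info(pid, [:message_queue_len, :reductions]) do
      nil -> nil
      info -> {info[:message_queue_len], info[:reductions]}
    end
  end

  # Takes `count` samples, `interval_ms` apart, and returns a list of
  # {monotonic_timestamp_ms, sample} tuples.
  def run(target, count, interval_ms \\ 3_000) do
    for i <- 1..count do
      if i > 1, do: Process.sleep(interval_ms)
      {System.monotonic_time(:millisecond), sample(target)}
    end
  end
end
```

Each sample is a point-in-time snapshot, so a burst that builds up and drains between two samples will not show up at a 3-second interval; a shorter `interval_ms` narrows that blind spot at the cost of more `Process.info/2` calls.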
- We are using Phoenix.PubSub with PG2 as adapter
- The affected instance is running under a normal workload. Connected sockets are spread fairly evenly across all the instances.
- While the message queue is piling up, we do see a slight increase in the incoming message rate.
- AFAIK, the Phoenix.PubSub server does nothing but receive messages from remote processes, query ETS, and delegate to local socket processes. There are no heavy tasks involved.
So my questions are:
- Is there any flaw in my measurement method? Since message_queue_len does not directly reflect message latency, my assumption could be incorrect.
- What is a production-safe way to debug this issue?
Thanks
Most Liked
chrismccord
We also need to know if other parts of the system are causing contention with the schedulers. Do you have any unbounded message queues in the app (such as in processes subscribed to the incoming pubsub messages)? Are there any other large message queues that appear alongside the pg2 server’s inbox starting to grow? Once you start to get unbounded message queue buildup, other processes can start falling behind or calls can start timing out, even though they are not the real cause, as the scheduler tries to keep up. I’m not completely ruling out the pg2 server being your bottleneck, but we need to know a lot more to say what’s going on or to place the blame there. Our synthetic benchmarks of pubsub have handled 500k msg/sec on my MacBook. What kind of broadcast rate are you pushing through the cluster when you see this?
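One way to look for other large message queues on the node is to rank all local processes by mailbox length. The following is a minimal sketch using only the standard library; the `MailboxTop` module is a hypothetical name, not something shipped with Phoenix. It deliberately asks `Process.info/2` for `:message_queue_len` only and never for `:messages`, since the latter copies the whole mailbox and can be expensive on an overloaded process.

```elixir
defmodule MailboxTop do
  # Hypothetical helper: returns the `n` local processes with the longest
  # message queues as a list of {pid, info} tuples, longest first.
  def top(n \\ 10) do
    Process.list()
    |> Enum.flat_map(fn pid ->
      case Process.info(pid, [:message_queue_len, :registered_name, :current_function]) do
        # The process exited between Process.list/0 and Process.info/2.
        nil -> []
        info -> [{pid, info}]
      end
    end)
    # Sort in descending order of mailbox length.
    |> Enum.sort_by(fn {_pid, info} -> info[:message_queue_len] end, &>=/2)
    |> Enum.take(n)
  end
end
```

Calling `MailboxTop.top(5)` from a remote console while the pubsub server's queue is growing would show whether subscribers or other processes are backing up at the same time. On a node with a very large number of processes the full scan has a cost of its own, so it is probably best run on demand rather than in a tight loop.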
outlog
Would it be possible to use PubSub 2.0, which uses pg instead of pg2?
The pg2 module is deprecated as of OTP 23 and scheduled for removal in OTP 24. You are advised to replace the usage of pg2 with pg. pg has a similar API, but with an implementation that is more scalable. See the documentation of pg for more information about differences. http://erlang.org/doc/man/pg2.html
If nothing else, just to rule it out…
Goose97
We are using OTP 22, so pg is not available yet. But I don’t think we have hit the limit of the pg2 module in terms of scaling.
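Whether pg is available on a given node can be checked from a console. This is a small sketch using standard calls only; the variable names are illustrative.

```elixir
# The OTP release is reported as a charlist such as '22' or '23'.
otp_release =
  :erlang.system_info(:otp_release)
  |> List.to_string()
  |> String.to_integer()

# The :pg module was added in OTP 23, so this is false on OTP 22.
pg_available? = Code.ensure_loaded?(:pg)

IO.puts("OTP #{otp_release}, pg available: #{pg_available?}")
```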







