Recommendations on swap usage and size for a Phoenix webserver

We have around 7 Phoenix servers behind an HAProxy load balancer. These servers have 64 GB of RAM each. However, our memory consumption graph shows a gradually increasing slope. It never goes down, even when no requests are being served, unless we explicitly trigger :erlang.garbage_collect on all processes.
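For context, the manual sweep we trigger looks roughly like this (a minimal sketch run from an attached IEx session, not our exact tooling):

# Force a full GC on every process in the node; this is what finally
# makes the memory graph drop on our boxes.
Process.list()
|> Enum.each(&:erlang.garbage_collect/1)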

We have run into 2 kinds of problems with these servers:

  1. Servers become unresponsive at times and the requests time out.
  2. The Erlang VM is killed by the Linux OOM killer.

I am not sure why the Erlang VM does not release the memory when no requests are being served.

I also wanted to figure out the best swap configuration for our setup. Currently all our servers have 4 GB of swap space, and it keeps getting used up over time. After reading up on how the kernel typically uses swap, I am thinking of removing it and letting the server crash and get restarted by systemd instead of becoming unresponsive.

How do you guys set up swap on your typical servers? Do you do anything differently in the case of Erlang servers?

I really have no experience at that scale and amount of RAM, but I have run into a situation where using cgroups together with swap just didn't work, and the swap usage grew continually. After thinking about it, the only reason I would ever turn swap on again is if I needed extra memory during compiling/building something, never to add RAM for regular usage. Even if you tune the swappiness, swap gets used when it's needed, which means the system is already under stress; if you then start backing RAM with a much more expensive access layer, it will only snowball from there.

In this case I think it would conceptually be better to limit the VM to a cgroup, ensuring some leftover RAM on the system, and kill/restart it, if that's feasible. I would couple this with some query of the system's memory consumption (perhaps the BEAM itself can tell you usable stats for that?) and then trigger GC if that works; a rough sketch follows below.
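Something along these lines is what I have in mind (a rough sketch; the threshold is just an assumption for illustration):

# Poll the VM's own memory counters and force a sweep past a threshold.
binary_bytes = :erlang.memory(:binary)   # size of the shared refc binary pool, in bytes

if binary_bytes > 1_000_000_000 do
  # crude fallback: GC every process to release unreferenced refc binaries
  Enum.each(Process.list(), &:erlang.garbage_collect/1)
end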

Servers should generally not be running in swap. It's reasonable for rarely used processes to be swapped out, e.g. a mail server. But if your Erlang VM is growing like that, then you need to fix that.

It sounds like you are running into a problem of binaries not getting garbage collected, as described in https://www.erlang-in-anger.com/

Periodically forcing garbage collection in that case is reasonable, e.g.

handle_info({gc}, State) ->
    case recon:info(self(), binary_memory) of
        {binary_memory, Binary} when Binary > 500000000 ->
            % Manually trigger garbage collection to clear refc binary memory
            lager:debug("Forcing garbage collection"),
            erlang:garbage_collect(self());
        _ ->
            ok
    end,
    % Reschedule the check for one minute from now
    erlang:send_after(60 * 1000, self(), {gc}),
    {noreply, State};
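In an Elixir GenServer the same idea might look like this (an untested sketch mirroring the Erlang clause above):

def handle_info(:gc, state) do
  # Check this process's refc binary footprint and collect past ~500 MB
  case :recon.info(self(), :binary_memory) do
    {:binary_memory, bytes} when bytes > 500_000_000 ->
      :erlang.garbage_collect(self())

    _ ->
      :ok
  end

  Process.send_after(self(), :gc, 60 * 1000)
  {:noreply, state}
end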

It might also be worth looking at max_heap_size per process, combined with the maximum number of processes allowed on each machine?
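Something like this, roughly (the values here are placeholders, not recommendations):

# Cap a single process's heap; the VM kills the process if it exceeds the limit.
# 12_500_000 words is about 100 MB on a 64-bit system.
Process.flag(:max_heap_size, %{size: 12_500_000, kill: true, error_logger: true})

Plus the usual VM flags in vm.args, e.g. +hmax for the default per-process maximum heap size (in words) and +P for the maximum number of processes.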

Eh, the issue here is almost certainly shared binary memory, which isn’t stored on the process heap anyway.

I agree with the others that the issue is likely a binary leak. This is a known situation which can occur when “large” binaries are referenced by a long-running process which otherwise has low activity and is therefore rarely collected.

The proposal by @jakemorrison can be used as an immediate countermeasure. You could also take a look at :recon.bin_leak. This function also returns information about the leaking processes, so you could log it to figure out which processes are the problem.

Then you can fix the problem in those processes without needing to manually force GCs. Take a look at section 7.2 of Erlang in Anger for more details on that.
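For example, from an attached IEx session (a quick sketch; bin_leak forces a GC everywhere and reports which processes released the most binary references):

# Top 10 candidates, each as {pid, delta, process_info}
:recon.bin_leak(10)
|> Enum.each(fn candidate ->
  IO.inspect(candidate, label: "binary leak candidate")
end)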

Thanks for all the suggestions guys 🙂 We do process a lot of binary data.