Why don't these binary show up in Process.info(pid, :binary)?

Hi all,

I’m working an an app where the end user has significant influence over some GenServer process memory usage. We can enforce some limits on these GenServers, so I need to monitor specific processes and kill processes are over their assigned limits.

I used Process.info(pid, :memory) and Process.info(pid, :binary) to calculate the amount of memory used. But I ran into a scenario where large binaries are not reported at all.

Here is some sample code:

defmodule BinaryTest do
  use GenServer
  def start_link() do
    GenServer.start_link(__MODULE__, nil)
  end

  def init(_) do
    {:ok, %{}}
  end

  def use_mem(pid) do
    GenServer.call(pid, :use_mem, :infinity)
  end

  def copy_binary(pid) do
    GenServer.call(pid, :copy_binary, :infinity)
  end

  def handle_call(:use_mem, _, state) do
    seed_binary = 1024 * 256
    |> :crypto.strong_rand_bytes()

    large_binary = Enum.reduce((0..10), seed_binary, fn count, acc ->
      acc <> acc
    end)

    state = Map.put(state, :large_binary, large_binary)
    {:reply, :ok, state}
  end

  def handle_call(:copy_binary, _, state) do
    state = Map.put(state, :large_binary, :binary.copy(state.large_binary))
    {:reply, :ok, state}
  end
end


{:ok, pid} = BinaryTest.start_link
IO.puts("Before concatenating, memory:#{inspect(Process.info(pid, :memory))}")
IO.puts("                    binaries:#{inspect(Process.info(pid, :binary))}")

# Use all the memory
res = BinaryTest.use_mem(pid)
Process.sleep(5000)
IO.puts("After concatenating,  memory:#{inspect(Process.info(pid, :memory))}")
IO.puts("                    binaries:#{inspect(Process.info(pid, :binary))}")

binary_len = BinaryTest.copy_binary(pid)
Process.sleep(5000)
IO.puts("After :binary.copy ,  memory:#{inspect(Process.info(pid, :memory))}")
IO.puts("                    binaries:#{inspect(Process.info(pid, :binary))}")

This example GenServer starts doing nothing.
The by calling use_mem, it generates a large binary by concatenating a binary to itself a few times.
Finally calling :copy_binary uses Binary.copy on the large binary.

It produces this output:

Before concatenating, memory:{:memory, 2776}
                    binaries:{:binary, []}
After concatenating,  memory:{:memory, 2776}
                    binaries:{:binary, []}
After :binary.copy ,  memory:{:memory, 4640}
                    binaries:{:binary, [{5705928744, 536870912, 1}]}

I would have expected the :binary results to include at least the initial seed_binary (with a refcount of 10 perhaps?), but no binaries show up at all.
What’s even more curious is that the beam process already has 1G allocated before the copy_binary step.

I have also used the recon_alloc and it told me that the binary allocated has this amount of memory.

When using a simple process like this, the binaries do show up.

spawn(fn ->
  seed_binary = 1024 * 256
    |> :crypto.strong_rand_bytes()
  
    large_binary = Enum.reduce((0..10), seed_binary, fn count, acc ->
      acc <> acc
    end)

    receive do 
      :stop -> nil
    end
end)

So I wonder, why is it not reported in the Process binaries before the copy_binary call? Could there be a bug in OTP where GenServer state is reported differently here? Is there a way to find out how much memory is allocated like this, but not reported in the :binary ?

I have tested this on different OTP versions, current version showing this behaviour is
Elixir 1.14.2 (compiled with Erlang/OTP 25)

1 Like

Aren’t large binaries allocated in a shared heap? So it wouldn’t show up in the process memory inspection.

Edit: yep just looked at the code you’ve shared. Any binary >64 bytes is in a shared heap and does not belong to the process

4 Likes

Modified code example:

defmodule BinaryTest do
  use GenServer

  def test() do
    {:ok, pid} = BinaryTest.start_link()
    IO.puts("Init:")
    IO.puts("#{inspect(Process.info(pid, :binary))}")

    BinaryTest.use_mem(pid)
    IO.puts("First binary:")
    IO.puts("#{inspect(Process.info(pid, :binary))}")

    BinaryTest.copy_binary(pid)
    IO.puts("Second (copied) binary:")
    IO.puts("#{inspect(Process.info(pid, :binary))}")

    BinaryTest.touch_binary(pid)
    IO.puts("After touching state binaries by additional 10+10 short running processes (+iex?):")
    IO.puts("#{inspect(Process.info(pid, :binary))}")
    Process.sleep(1100)
    IO.puts("After 10 short running processes termination:")
    IO.puts("#{inspect(Process.info(pid, :binary))}")
    Process.sleep(1000)
    IO.puts("After all short running processes termination:")
    IO.puts("#{inspect(Process.info(pid, :binary))}")
  end

  def start_link() do
    GenServer.start_link(__MODULE__, nil)
  end

  def touch_binary(pid) do
    state = GenServer.call(pid, :get_state, :infinity)

    for _ <- 1..10 do
      spawn(fn ->
        Process.sleep(1000)
        state
      end)
    end

    for _ <- 1..10 do
      spawn(fn ->
        Process.sleep(2000)
        state
      end)
    end
  end

  def init(_) do
    {:ok, %{}}
  end

  def use_mem(pid), do: GenServer.call(pid, :use_mem, :infinity)

  def copy_binary(pid), do: GenServer.call(pid, :copy_binary, :infinity)

  def handle_call(:use_mem, _, state) do
    state = Map.put(state, :large_binary, :crypto.strong_rand_bytes(256))
    {:reply, :ok, state}
  end

  def handle_call(:copy_binary, _, state) do
    state = Map.put(state, :new_binary, :binary.copy(state.large_binary))
    {:reply, :ok, state}
  end

  def handle_call(:get_state, _, state), do: {:reply, state, state}
end

Results:

iex(5)> BinaryTest.test()
Init:
{:binary, []}
First binary:
{:binary, [{5673024728, 256, 1}]}
Second (copied) binary:
{:binary, [{5673024728, 256, 1}, {5673025024, 256, 1}]}
After touching state binaries by additional 10+10 short running processes (+iex?):
{:binary, [{5673024728, 256, 22}, {5673025024, 256, 22}]}
After 10 short running processes termination:
{:binary, [{5673024728, 256, 12}, {5673025024, 256, 12}]}
After all short running processes termination:
{:binary, [{5673024728, 256, 1}, {5673025024, 256, 1}]}
:ok
1 Like

UPDATE: I narrowed it down to concatenated binaries
Yes, that’s exactly how it should work. The problem seems to start only with concatenated binaries. Maybe it’s a bug in beam?

Try this code:

defmodule BinaryTest do
  use GenServer

  def test() do
    {:ok, pid} = BinaryTest.start_link()
    IO.puts("Init:")
    IO.puts("#{inspect(Process.info(pid, :binary))}")

    BinaryTest.use_mem(pid)
    IO.puts("First binary:")
    IO.puts("#{inspect(Process.info(pid, :binary))}")

    BinaryTest.copy_binary(pid)
    IO.puts("Second (copied) binary:")
    IO.puts("#{inspect(Process.info(pid, :binary))}")
  end

  def start_link() do
    GenServer.start_link(__MODULE__, nil)
  end

  def init(_) do
    {:ok, %{}}
  end

  def use_mem(pid), do: GenServer.call(pid, :use_mem, :infinity)

  def copy_binary(pid), do: GenServer.call(pid, :copy_binary, :infinity)

  def handle_call(:use_mem, _, state) do        
    # concatenating any other binary makes the original not appear in Process.info(pid, :binary)
    state = Map.put(state, :large_binary, :crypto.strong_rand_bytes(1024 * 256) <> "a")
    {:reply, :ok, state}
  end

  def handle_call(:copy_binary, _, state) do
    state = Map.put(state, :new_binary, :binary.copy(state.large_binary))
    {:reply, :ok, state}
  end

  def handle_call(:get_state, _, state), do: {:reply, state, state}
end
BinaryTest.test()

That’s correct, but I think these binaries should show up in Process.info(pid, :binary) ?

Nope (from the :erlang.process_info/2 docs):

{binary, BinInfo}

BinInfo is a list containing miscellaneous information about binaries on the heap of this process. This InfoTuple can be changed or removed without prior notice. In the current implementation BinInfo is a list of tuples. The tuples contain; BinaryId, BinarySize, BinaryRefcCount.

Depending on the value of the message_queue_data process flag the message queue may be stored on the heap.

1 Like

I believe that more relevant Erlang documentation excerpt would be:

The binary object can be referenced by any number of ProcBins from any number of processes. The object contains a reference counter to keep track of the number of references, so that it can be removed when the last reference disappears.
All ProcBin objects in a process are part of a linked list, so that the garbage collector can keep track of them and decrement the reference counters in the binary when a ProcBin disappears.

In other words: RefcCount is incremented when new process will “touch” given binary. RefcCount is decremented when given process that “touched” binary is terminated.

2 Likes

After some more experimenting, I discovered that this behavior is not related to GenServer vs the spawned process. Instead, it depends on the actual size of the binary.
It also seems to be consistent.

With this code:


defmodule TestGenServer do
  def init(_) do
    {:ok, %{bin: ""}}
  end

  def handle_call({:add_binary, size}, _, state) do
    bin = :crypto.strong_rand_bytes(size) <> "a"
    state = Map.put(state, :bin, bin)
    Process.sleep(200 + :rand.uniform(1000)) # To rule out internal racing
    check_binaries("self", size, self())
    {:reply, :ok, state}
  end

  def test(from) do
    {:ok, pid} = GenServer.start_link(__MODULE__, nil)
    to = from + 16
    Enum.each((from .. to), fn size ->
      GenServer.call(pid, {:add_binary, size})
      check_binaries("other", size, pid)
    end)
  end

  defp check_binaries(from, size, pid) do
    {:binary, bins} = Process.info(pid, :binary)
    # remove small bins, these are used to call IO.inspect
    bins = Enum.filter(bins, fn {_, size, _} -> size > 100 end)
    case bins do
      [] -> IO.inspect({from, size, "emtpy!"})
      bins -> IO.inspect({from, size, bins})
    end
  end
end

TestGenServer.test(1024 * 1024 * 1024)

The output is always:

{"self", 1073741824, "emtpy!"}
{"other", 1073741824, "emtpy!"}
{"self", 1073741825, [{5626925096, 1073741825, 1}]}
{"other", 1073741825, "emtpy!"}
{"self", 1073741826, "emtpy!"}
{"other", 1073741826, "emtpy!"}
{"self", 1073741827, "emtpy!"}
{"other", 1073741827, "emtpy!"}
{"self", 1073741828, "emtpy!"}
{"other", 1073741828, "emtpy!"}
{"self", 1073741829, "emtpy!"}
{"other", 1073741829, "emtpy!"}
{"self", 1073741830, [{10566144040, 1073741830, 1}]}
{"other", 1073741830, "emtpy!"}
{"self", 1073741831, "emtpy!"}
{"other", 1073741831, "emtpy!"}
{"self", 1073741832, "emtpy!"}
{"other", 1073741832, "emtpy!"}
{"self", 1073741833, [{6700683304, 1073741833, 1}]}
{"other", 1073741833, "emtpy!"}
{"self", 1073741834, "emtpy!"}
{"other", 1073741834, "emtpy!"}
{"self", 1073741835, "emtpy!"}
{"other", 1073741835, "emtpy!"}
{"self", 1073741836, [{6915416104, 1073741836, 1}]}
{"other", 1073741836, "emtpy!"}
{"self", 1073741837, "emtpy!"}
{"other", 1073741837, "emtpy!"}
{"self", 1073741838, [{5626925096, 1073741838, 1}]}
{"other", 1073741838, "emtpy!"}
{"self", 1073741839, "emtpy!"}
{"other", 1073741839, "emtpy!"}
{"self", 1073741840, "emtpy!"}
{"other", 1073741840, "emtpy!"}

Only some of the binaries are reported, and only from within the owning process.
Removing the concatenation on line 7 results in all binaries being reported all of the time, including not-yet-gc’ed, both from within the process and from the test process, which seems like the desired result.

At this point, I’m quite confident this is a bug in Beam. Or maybe I’m missing something?

Seems like the comments from myself and others above answer this?

There is no bug in the BEAM. Small binaries (64 bytes and smaller) are allocated directly on the process heap. Read here for more context: Erlang -- Erlang Garbage Collector

1 Like

Did you even read my post? Responding with some basic knowledge that is somehow related to the issue and then pointing to the docs is in no way helpful. It reflects badly on you and the community as a whole.
Do not assume by default that people you don’t know, don’t know what they are talking about.

No, not really?

Ok good luck on your bug hunt that’s explained by the well documented behaviour linked to in this thread.

Let us know what you find!

1 Like

Hey friends, let’s keep tones as kind as possible :upside_down_face:

@markmeeus without diving too deep in the code examples, reporting a bug to Issues · erlang/otp · GitHub is definitely a viable option and the folks from the OTP team might be able to help out more.

4 Likes

Thanks!
I created an issue there this morning:

My apologies. Indeed, there is a freshly planted bug in OTP 25. Fix is on the way.
Echoing your statement about “people you don’t know”, do not assume they don’t know what they are talking about.

I will however make one more attempt to be helpful. Here is the feature request I made some time ago working on a similar problem, and a PR implementing it in OTP 26 (if it gets merged).

4 Likes

And I apoligize for being snarky like that.

Thanks for getting back.

Your feature request is more than helpful, it solves my use case for 99%. I’m basically trying to implement this in erlang/elixir land.

However, this seems much more performant than iterating all off-heap binaries and then calculating their size async in the background.

if (MAX_HEAP_SIZE_FLAGS_GET(p) & MAX_HEAP_SIZE_INCLUDE_OH_BINS) {
   used += p->bin_old_vheap + p->off_heap.overhead;
}

I’ll try to see if these ‘ghost binaries’ of mine are also problematic in that pr.
Thanks again!