Cowboy memory spike when sending JSON as iodata through Plug

I’m trying to figure out what might cause Cowboy to consume a lot of memory when sending JSON data (about 3.5 megs) as iodata through Plug, while sending the same JSON as a single binary string keeps the memory footprint very small.

I have a schema with around 10 fields and three “has many”-type associations, and I render a list of around 12.2k of these items. I noticed that when I do render(conn, "index.json", fairly_large_list: list) in the controller and make a request, Observer shows cowboy_clear:connection_process/4 consuming almost 50 megs of memory while running cowboy_http:loop/1.
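
For reference, the same reading can be taken from IEx without Observer by listing the top processes by memory (run while the request is in flight):

Process.list()
|> Enum.map(fn pid -> {pid, Process.info(pid, :memory)} end)
|> Enum.filter(fn {_pid, info} -> match?({:memory, _}, info) end)  # skip processes that died in between
|> Enum.sort_by(fn {_pid, {:memory, bytes}} -> -bytes end)
|> Enum.take(5)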

So I started to dig a bit deeper and replaced the standard render with the following

json =
  Phoenix.View.render_to_iodata(MyView, "index.json", %{fairly_large_list: list})

conn
|> Plug.Conn.put_resp_header("content-type", "application/json")
|> Plug.Conn.send_resp(200, json)

Running the request again shows the same memory consumption. However, when I change that to

json =
  Phoenix.View.render(MyView, "index.json", %{fairly_large_list: list})
  |> Jason.encode!()

conn
|> Plug.Conn.put_resp_header("content-type", "application/json")
|> Plug.Conn.send_resp(conn.status || 200, json)

Then the Cowboy process maxes out at around 40 KB of memory.

So basically, Cowboy consumes roughly 1000 times more memory when I send an iolist than when I send a string. I turned response compression off, but it didn’t have any effect. The frontend is connected directly to Cowboy.

Any ideas? Thanks!

Binaries above a certain size (64 bytes) are not stored in the process heap but on a shared binary heap, so the total consumption remains roughly the same; it just isn’t accounted to the process.
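
A minimal sketch of that accounting (the 3.5 MB just mirrors the payload size from the original post):

big = :binary.copy(<<0>>, 3_500_000)  # a ~3.5 MB "refc" binary
Process.info(self(), :memory)         # the owning process stays small
:erlang.memory(:binary)               # the ~3.5 MB shows up here instead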


Thank you! That makes perfect sense.

I still wonder why the memory consumption is that high. I guess lists of structs/maps take up much more memory than the corresponding JSON strings, but it still seems like quite a big difference.

We know nothing about how the iolist is actually structured here, but lists, especially ones with many elements, can blow up pretty quickly.

On a 64-bit machine every list element costs one cons cell of two words (16 bytes): one word for the head, holding the element itself if it is an immediate (such as a small integer) or a pointer to it otherwise, and one word for the tail. So a 10-element list takes 160 bytes for the cells alone, plus the space of any boxed elements; the terminating empty list is an immediate stored in the last tail word and costs nothing extra.

If those elements are lists as well (iolists can be arbitrarily nested), the cost multiplies per level.

So take [[['a']]]: in Elixir, 'a' is the charlist [97], so this is really four nested one-element lists around a single small integer. That is 4 cons cells × 16 bytes = 64 bytes for what becomes a single <<?a>> byte after “writing” it out.
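
The arithmetic is easy to check in a 64-bit IEx session (1 word = 8 bytes):

:erts_debug.size(Enum.to_list(1..10))  # => 20 words = 160 bytes (10 cons cells)
:erts_debug.size([[['a']]])            # => 8 words  = 64 bytes  (4 cons cells)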

Binaries, similarly, have an overhead of 3 to 6 words per binary for the header or fat pointer reaching either into the process heap (small heap binaries) or the shared binary heap (large refc binaries). The actual data might or might not be shared.
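
A rough illustration of the two kinds (the 64-byte cutoff is the current VM default, and exact word counts vary by OTP version):

small = :binary.copy(<<0>>, 64)  # heap binary: header plus data on the process heap
large = :binary.copy(<<0>>, 65)  # refc binary: only a small handle on the process heap
:erts_debug.size(small)          # the data words are included in the result
:erts_debug.size(large)          # only a handful of words; the payload is off-heap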

Thanks, that explains it. I’ll need to look at total memory consumption rather than per-process figures while I work out the best way to render the JSON (which itself can certainly be fine-tuned).

You can use :erts_debug.size/1 and :erts_debug.flat_size/1 to get the actual size in words (size/1 counts shared subterms once, flat_size/1 counts them every time they appear); depending on your architecture, multiply by either 4 or 8 to get byte sizes.
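
For example, to size the iolist from the first snippet (same MyView and list as above):

iodata = Phoenix.View.render_to_iodata(MyView, "index.json", %{fairly_large_list: list})
words = :erts_debug.size(iodata)                # shared subterms counted once
flat = :erts_debug.flat_size(iodata)            # shared subterms counted every time
bytes = words * :erlang.system_info(:wordsize)  # wordsize is 8 on a 64-bit VM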


Alright, thanks a lot for your help!

Just be careful, it can easily go the other way; the place I work at is currently debugging a situation where using binaries causes a memory explosion relative to using iolists.
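
A minimal sketch of how that can happen: an iolist can reference one large binary many times, while flattening it into a binary copies everything.

chunk = :binary.copy(<<"x">>, 100_000)  # one 100 KB refc binary
iolist = List.duplicate(chunk, 10)      # ten references to the same binary
IO.iodata_length(iolist)                # => 1_000_000 bytes once written out
:erts_debug.size(iolist)                # tiny: 10 cons cells plus one shared handle
flat = IO.iodata_to_binary(iolist)      # allocates a fresh 1 MB binary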

Thanks for pointing that out. That’s why I was wondering about the difference in memory consumption in the first place, since I was under the impression that iolists would always be more memory efficient. But as mentioned before, it comes down to how the iolists are structured.