I’m trying to figure out what might cause Cowboy to consume a lot of memory when sending JSON data (about 3.5 MB) as iodata through Plug, whereas sending the same data as a JSON string leaves the memory footprint very small.
I have a schema with around 10 fields and three “many”-type associations, and I render a list of around 12.2k of these items. I noticed that when I do render(conn, "index.json", fairly_large_list: list) in the controller and make a request, Observer shows that cowboy_clear:connection_process/4 consumes almost 50 MB of memory while running cowboy_http:loop/1.
So I started to dig a bit deeper and replaced the standard render with the following
Then the Cowboy process maxes out at around 40k of memory.
So basically, Cowboy consumes roughly 1000 times more memory when I send an iolist than when I send a string. I turned response compression off, but it didn’t have any effect. The frontend is directly connected to Cowboy.
Binaries above a certain size (64 bytes) are not stored on the process heap but on a shared binary heap, so the overall consumption will remain roughly the same; it just won’t be attributed to the process.
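To see that effect for yourself, here is a minimal sketch that compares what is counted against a process with the off-heap binaries it references. The 1 MB binary is just a stand-in for a rendered JSON string, and note that the shape of the :binary info item ({id, size, refcount} tuples) is an implementation detail that the Erlang docs say can change without notice.

```elixir
# Build a ~1 MB binary at runtime. It is far above the 64-byte limit,
# so it is reference-counted and stored outside the process heap.
big = :binary.copy(<<0>>, 1_000_000)

# Memory counted against the current process, in bytes.
# Off-heap binary data is not included in this number.
{:memory, process_bytes} = Process.info(self(), :memory)

# Off-heap binaries referenced by this process.
# Assumption: each entry is an {id, size_in_bytes, refcount} tuple.
{:binary, refs} = Process.info(self(), :binary)

off_heap_bytes =
  refs
  |> Enum.map(fn {_id, size, _refcount} -> size end)
  |> Enum.sum()

IO.puts("process memory:    #{process_bytes} bytes")
IO.puts("off-heap binaries: #{off_heap_bytes} bytes")
IO.puts("byte_size(big):    #{byte_size(big)} bytes")
```

Observer’s per-process memory column corresponds to the first number, which is why a big JSON string barely shows up there.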
I still wonder why the memory consumption is that high. I guess lists of structs/maps take up much more memory than the corresponding JSON strings. But it still sounds like quite a big difference.
We know nothing about how the iolist is actually structured, though lists, especially with many elements, can explode pretty quickly.
On a 64-bit machine, a list with 10 elements takes 8 bytes for the empty list, 80 bytes for the individual “boxes”, plus the space of the elements themselves.
If those elements are lists as well (iolists can be arbitrarily nested), the same overhead applies again at every level.
So [[['a']]], which is 4 lists of length 1, 4 empty lists and a small integer, takes 4 * 8 + 4 * 8 + 8 = 9 * 8 = 72 bytes for a single resulting <<?a>> after “writing” it out.
Similarly, binaries have an overhead of 3 to 6 words per binary for the fat pointer reaching either into the process heap or the binary heap. The actual data may or may not be shared.
Thanks, that explains. I’ll need to figure out the total memory consumption instead of looking at the processes while determining the best way to render the JSON (which itself can certainly be fine-tuned).
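For the VM-wide view, something along these lines might be a starting point. It is only a sketch: the 3.5 MB payload is a stand-in for the real rendered response, and on a busy node the before/after difference will include whatever other processes allocate or free in the meantime.

```elixir
# :erlang.memory/1 reports VM-wide allocation in bytes for one category.
binary_before = :erlang.memory(:binary)

# Stand-in for the rendered JSON (about 3.5 MB, as in the question).
payload = :binary.copy("x", 3_500_000)

binary_after = :erlang.memory(:binary)

IO.puts("payload size:  #{byte_size(payload)} bytes")
IO.puts("binary before: #{binary_before} bytes")
IO.puts("binary after:  #{binary_after} bytes")

# A few other categories that are useful next to the per-process view.
for type <- [:total, :processes, :binary] do
  IO.puts("#{type}: #{:erlang.memory(type)} bytes")
end
```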
You can use :erts_debug.size/1 and :erts_debug.flat_size/1 to get the actual size in words. Depending on your architecture, you need to multiply the result by either 4 or 8 to get the size in bytes.
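A small sketch of how that could look, using the nested list from earlier (written as [[[[?a]]]], since 'a' is the charlist [?a]) and a flat ten-element list. As far as I know these functions count heap words only, so the result can come out one word below a by-hand estimate that also counts the word holding the reference to the term.

```elixir
# Bytes per word on the running VM (8 on a 64-bit system).
word = :erlang.system_info(:wordsize)

nested = [[[[?a]]]]            # four nested one-element lists around a small integer
flat = Enum.to_list(1..10)     # flat list of ten small integers
shared = List.duplicate(flat, 3)  # three references to the same list

# size/1 counts shared subterms once, flat_size/1 counts them every time.
nested_words = :erts_debug.size(nested)
flat_words = :erts_debug.size(flat)
shared_words = :erts_debug.size(shared)
shared_flat_words = :erts_debug.flat_size(shared)

IO.puts("nested:           #{nested_words} words, #{nested_words * word} bytes")
IO.puts("flat:             #{flat_words} words, #{flat_words * word} bytes")
IO.puts("shared (size):    #{shared_words} words")
IO.puts("shared (flat):    #{shared_flat_words} words")
IO.puts("as a binary:      #{byte_size(IO.iodata_to_binary(nested))} byte of payload")
```

The gap between size/1 and flat_size/1 matters when a term is copied to another process, because message passing does not preserve sharing.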
Just be careful. It can easily go the other way; the place I work at is debugging a situation where using binaries is causing a memory explosion relative to using iolists.
Thanks for pointing that out. That’s why I was wondering about the difference in memory consumption in the first place, since I was under the impression that iolists would always be more memory efficient. But as mentioned before, it depends on how the iolists are structured.