Streaming works, but when I want to access a bigger file (for example a ~4 GB movie) my application consumes large amounts of memory, and after ~20 seconds the video stops playing.
It's not beautiful, but I'll improve that later. From my observations, chunk/2 is called too fast (if I insert :timer.sleep(1) before stream_next(resp), memory usage is normal, but it's much slower).
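Roughly, the streaming loop looks like this (a simplified sketch; it assumes HTTPoison's async: :once mode, with `url` standing in for the upstream address):

```elixir
# Ask HTTPoison for one message at a time; stream_next/1 requests the next.
resp = HTTPoison.get!(url, [], stream_to: self(), async: :once)

body_stream =
  Stream.resource(
    fn -> resp end,
    fn resp ->
      receive do
        %HTTPoison.AsyncStatus{} ->
          HTTPoison.stream_next(resp)
          {[], resp}

        %HTTPoison.AsyncHeaders{} ->
          HTTPoison.stream_next(resp)
          {[], resp}

        %HTTPoison.AsyncChunk{chunk: data} ->
          # :timer.sleep(1) here keeps memory flat, but slows everything down
          HTTPoison.stream_next(resp)
          {[data], resp}

        %HTTPoison.AsyncEnd{} ->
          {:halt, resp}
      end
    end,
    fn _resp -> :ok end
  )
```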
This is the memory usage from :observer:
And a very interesting thing: one process has a very long message queue:
╰─➤ /usr/bin/time -f "mem=%K RSS=%M elapsed=%E cpu.sys=%S user=%U" -- mix test
..
Finished in 0.08 seconds
2 tests, 0 failures
Randomized with seed 92121
mem=0 RSS=52428 elapsed=0:01.09 cpu.sys=0.28 user=1.10
I'm running your tests but I'm not seeing high memory usage here. What's the specific test to run that shows the problem?
I'm still thinking the issue is just what @NobbZ said: the data is being received faster than it can be sent back out, and it's being stored without any throttling.
There are no tests for this (I don't have much experience with testing and don't know how to test this without uploading a really big binary somewhere).
I'm testing it this way:
http-server (from npm) is listening on port 8080 and serving my video file named "Bigfile.mp4". Then I open my browser at localhost:4000/api/files/1 (the id doesn't matter) and it starts loading the video.
As you can see, there is a really small amount of code, so I could write something new that is more efficient. But I don't know how.
Ah, no problem then. And to get a "big infinite set of data" you can just read /dev/random or so. ^.^
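For example, a sketch of that idea (using /dev/urandom, which avoids the blocking behaviour /dev/random can have on Linux):

```elixir
# Stream pseudo-random bytes in fixed-size binary chunks instead of lines.
big_stream =
  "/dev/urandom"
  |> File.stream!([], 64 * 1024)
  |> Stream.take(1_000)            # cap at ~64 MB for a test run
```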
I took a look and it seems you are using HTTPoison, which uses hackney behind it, and as I recall hackney does use active: :once on the TCP stack when you pass async: :once to HTTPoison, so that should be fine… Maybe it's some memory growing somewhere rather than something actually being stored…
At this point I'd really use :observer to see which process is allocating that memory, then run a GC on that process. If that lowers the memory, then it's just unused memory that hasn't used enough of the system memory to trigger a GC within its time yet (and there are a few fixes for this, but eh). If it's actually allocated memory, though, then something is holding on to it, which could be hackney, HTTPoison, or Stream from what I see in your code, and I doubt it would be Stream. Let's look at the consumer perhaps…
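That check can be done from IEx with standard APIs alone; a quick sketch (picking the single biggest process by memory is just illustrative):

```elixir
# Find the process using the most memory, force a GC on it, and compare.
{pid, mem_before} =
  Process.list()
  |> Enum.flat_map(fn pid ->
    case Process.info(pid, :memory) do
      {:memory, mem} -> [{pid, mem}]
      nil -> []                       # process died in the meantime
    end
  end)
  |> Enum.max_by(fn {_pid, mem} -> mem end)

:erlang.garbage_collect(pid)
{:memory, mem_after} = Process.info(pid, :memory)
IO.inspect({mem_before, mem_after}, label: "memory before/after GC")
```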
Hmm, as I recall the response body gets accumulated in conn.resp_body, but that's not being accumulated here. I do know that conn can be used as an "into", so the whole streaming part could be replaced with:
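Something along these lines (a sketch; HttpStream.get/1 stands in for however the upstream body is streamed):

```elixir
# Plug.Conn implements Collectable, so collecting into a chunked conn
# sends each element with chunk/2 under the hood.
conn = send_chunked(conn, 200)

HttpStream.get(url)
|> Enum.into(conn)
```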
It's Stream.map! It's storing all the previously sent data!
Just replace Stream.map in your existing code with Stream.each so it doesn't save the result. (Stream.into might work too? I'm not sure whether it accumulates.)
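The suggested change, sketched (the pipeline shape is assumed):

```elixir
HttpStream.get(url)
|> Stream.each(fn data -> chunk(conn, data) end)  # was Stream.map/2
|> Stream.run()
```

Note that this discards the conn returned by each chunk/2 call, which is what the later reply about returning the conn addresses.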
HTTPoison is doing a good job; when I replace chunk with other code, the memory usage is fine.
I used :observer and it shows which process is consuming memory (the screenshot is in the first post). It's something related to Cowboy. I'm really a beginner, but I think the problem is this long message queue (and GC won't help with that?).
Nice, I accidentally deleted my reply, and the forum says "You've performed this action too many times. Please wait 23 hours before trying again." when I try to undelete it. I'm pretty sure I haven't undeleted anything before, and the post will be deleted in 24 hours.
I think that's a temporary solution, because if you have multiple streaming connections that mailbox might be filling up from all of them, and checking the mailbox size might not be a very good solution. Hopefully this gets fixed in Cowboy 2.7. I read that Cowboy 1.x isn't affected because chunk is a synchronous function in 1.x. Maybe you could also try downgrading to 1.x in the meantime, if that's possible with Phoenix.
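For reference, the mailbox-size check being discussed can be written with Process.info/2; a sketch (how to get hold of the connection process pid, and the sleep-based throttle, are illustrative):

```elixir
# Illustrative helper (would live in the controller module): block until the
# given process has drained its mailbox below max_len before sending more.
defp wait_for_mailbox(pid, max_len) do
  case Process.info(pid, :message_queue_len) do
    {:message_queue_len, len} when len > max_len ->
      Process.sleep(10)
      wait_for_mailbox(pid, max_len)

    _ ->
      :ok
  end
end
```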
This ensures that the controller action returns a conn, and specifically the conn that has sent all the chunks, which is what you want. It also avoids accumulating all the data in memory.
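The pattern being described presumably looks something like this (a sketch, with HttpStream.get/1 again standing in for the upstream stream):

```elixir
conn = send_chunked(conn, 200)

# Thread the conn through the reduction so the final, fully-sent conn
# is what the controller action returns.
HttpStream.get(url)
|> Enum.reduce(conn, fn data, conn ->
  {:ok, conn} = chunk(conn, data)
  conn
end)
```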
HttpStream still needs to be implemented in a way that doesn't overload the process with messages, though.
It seems that you didn't read this thread before replying. The problem doesn't have anything to do with that code; there is a bug in Cowboy 2.x that is causing this. Look at my earlier post with links. Calling chunk pushes messages onto cowboy_clear:connection_process's mailbox and returns. That means there is no back pressure, unlike in Cowboy 1.x, where the call to chunk was synchronous. The mailbox keeps filling up because file chunks are coming in fast, the cowboy_clear:connection_process process gets slow, and a huge amount of memory is allocated.
If you want to try it yourself, @no_one provided code on GitHub a few posts back.
Also, are you sure it's OK to use Enum.reduce with streams? I'm pretty sure that will load the whole file into memory, because it will create an intermediate list, as you can see in the examples in the Stream docs (Elixir v1.16.0).
Edited: I think I was wrong; using Enum.reduce doesn't cause any intermediate list to be created, because it returns a single value, so it's probably OK in that code.
I tried your code and downgraded {:plug_cowboy, "~> 2.0"} to {:plug_cowboy, "~> 1.0"} in deps, and it seems to fix this problem; memory doesn't increase anymore.