Very high memory usage when streaming file with Phoenix

Hello, I have discovered Elixir recently and I have decided to do some projects with it (and Phoenix).

I need to fetch a file from one server and stream it to the client through my server. I wanted to do it with streams. My HttpStream looks almost exactly like this one: https://labs.civilcode.io/elixir/2018/05/03/stream-files.html.

Streaming works, but when I access a bigger file (for example a ~4 GB movie) my application consumes a large amount of memory, and after ~20 seconds the video stops playing.

My controller's function looks like this:

  def show(conn, _params) do
    url = "http://localhost:8080/Bigfile.mp4"
    %{headers: proxy_headers} = HTTPoison.head!(url)
    proxy_headers = for {k, v} <- proxy_headers, do: {String.downcase(k), v}, into: %{}

    chunked_conn =
      conn
      |> put_resp_content_type("video/mp4")
      |> put_resp_header("content-length", proxy_headers["content-length"])
      |> send_chunked(200)

    url
    |> HttpStream.stream()
    |> Stream.map(fn n -> chunk(chunked_conn, n) end)
    |> Stream.run()
  end

It's not beautiful, but I'll improve that later. From my observations, chunk/2 is called too fast (if I insert :timer.sleep(1) before stream_next(resp), memory usage is normal but it's much slower).

This is memory usage from :observer (images are from imgur and they appear to be cropped here):
Imgur

A very interesting thing is that one process has a very long message queue:
Imgur

How to fix this?


I have only read your post roughly, not the linked code... But it seems as if data arrives faster than you are able or willing to process it.

Perhaps you need some rate limiting on the reading end? But probably you need to change a lot of code to do so.

My bad, here is the code: https://github.com/Arquanite/phoenix-streaming-app

I hope there is a way to get more data from the original server only after the previous data has been sent to the client (normal HTTP servers must work in a similar way?).
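The "fetch the next piece only after the previous one has been sent" idea can be sketched with plain Stream.resource/3. This is only an illustration under stated assumptions: PullDemo, its log, and both stand-in functions are hypothetical and do not touch HTTPoison or Plug. It also only holds if the sending step really blocks until the data is written.

```elixir
defmodule PullDemo do
  # Hypothetical stand-in for "ask the upstream server for one more chunk".
  # Each request is logged so the order of events can be inspected afterwards.
  def stream(log, total_chunks) do
    Stream.resource(
      fn -> 1 end,
      fn
        n when n > total_chunks ->
          {:halt, n}

        n ->
          Agent.update(log, &[{:fetched, n} | &1])
          {["chunk-#{n}"], n + 1}
      end,
      fn _n -> :ok end
    )
  end

  # Hypothetical stand-in for "send the chunk to the client and wait for it".
  def send_to_client(log, chunk) do
    Agent.update(log, &[{:sent, chunk} | &1])
  end

  # Runs the whole pipeline and returns the events in the order they happened.
  def run(total_chunks) do
    {:ok, log} = Agent.start_link(fn -> [] end)

    log
    |> stream(total_chunks)
    |> Enum.each(&send_to_client(log, &1))

    events = log |> Agent.get(& &1) |> Enum.reverse()
    Agent.stop(log)
    events
  end
end
```

Calling PullDemo.run(2) shows fetches and sends strictly alternating: the second chunk is requested only after the first one has been handed over.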

╰─➤  /usr/bin/time -f "mem=%K RSS=%M elapsed=%E cpu.sys=%S user=%U" -- mix test 
..

Finished in 0.08 seconds
2 tests, 0 failures

Randomized with seed 92121
mem=0 RSS=52428 elapsed=0:01.09 cpu.sys=0.28 user=1.10

I'm running your tests but I'm not seeing high memory usage here. What's the specific test to run that shows the problem?

I still think the issue is just what @NobbZ said: the data is being received and stored, without being throttled, faster than it can be sent back out.

There are no tests for this (I don't have much experience with testing and don't know how to test this without uploading a really big binary somewhere).

I'm testing it this way:
http-server (from npm) is listening on port 8080 and serving my video file named "Bigfile.mp4". Then I open my browser at localhost:4000/api/files/1 (the id doesn't matter) and it starts loading the video.

As you can see there is a really small amount of code, so I can write something new that will be more efficient. But I don't know how.

Ah, no problem then. And to get a 'big infinite set of data' you can just read /dev/random or so. ^.^

I took a look and it seems you are using HTTPoison, which uses hackney behind it, and as I recall it does use active: :once on the TCP stack when you pass in async: :once to HTTPoison, so that should be fine... Maybe it's some growing memory somewhere rather than something actually being stored...

At this point I'd really use :observer to see which process is allocating that memory, then run a GC on that process. If that lowers the memory, then it's just unused memory that hasn't grown enough to trigger a GC within its time yet (and there are a few fixes for this, but eh). If it's actually allocated memory, though, then something is holding on to it, which could be hackney, HTTPoison, or Stream from what I see in your code, and I doubt it would be Stream. Let's look at the consumer perhaps...
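The "run a GC and see if the memory drops" check can also be done from IEx instead of clicking through :observer. A minimal sketch, assuming you already have the pid of the suspicious process; GcCheck is a hypothetical helper name, while Process.info/2 and :erlang.garbage_collect/1 are standard calls.

```elixir
defmodule GcCheck do
  # Returns {bytes_before, bytes_after} for a live process,
  # forcing a garbage collection in between the two measurements.
  def memory_before_and_after_gc(pid) do
    {:memory, before_gc} = Process.info(pid, :memory)
    true = :erlang.garbage_collect(pid)
    {:memory, after_gc} = Process.info(pid, :memory)
    {before_gc, after_gc}
  end
end
```

If the second number is much smaller, the memory was only garbage waiting to be collected. If it stays about the same, something still references the data. Note that messages waiting in a mailbox count as live data, so a GC will not reclaim them.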

Hmm, in your controller:

    chunked_conn =
      conn
      |> put_resp_content_type("video/mp4")
      |> put_resp_header("content-length", proxy_headers["content-length"])
      |> send_chunked(200)

    url
    |> HttpStream.stream()
    |> Stream.map(fn n -> chunk(chunked_conn, n) end)
    |> Stream.run()
  end

Hmm, as I recall the response body gets accumulated in conn.resp_body, but that's not being accumulated here. I do know that conn can be used as an 'into', so the whole streaming part could be replaced with:

    url
    |> HttpStream.stream()
    |> Stream.into(chunked_conn)
    |> Stream.run()

However, I think that might accumulate the body, not sure...

I haven't used chunks in Plug yet outside of trivial things... Hmm...

Oh wait!!!

It's Stream.map! It's storing all past sent data!

Just replace Stream.map in your existing code with Stream.each so it doesn't save the result. (Stream.into might work too? If it doesn't accumulate, I'm not sure)

HTTPoison is doing a good job: when I replace chunk with other code, the memory usage is fine.
I used :observer and it shows which process is consuming memory (the screenshot is in the first post). It's something related to Cowboy. I'm really a beginner, but I think the problem is this long message queue (and a GC won't help with it?).

And Stream.each does not improve anything :<

Well, you definitely want each instead of map there, as map would have included all past sent data, so... ^.^;

If 'each' doesn't help, maybe try into(chunked_conn)? I'm really surprised each didn't fix it...

Still nothing, it loads the whole file into memory (it uses 4.6 GB of RAM, almost the exact size of my video)

EDIT: Maybe I should use something other than conn, but what, and how do I integrate it into a Phoenix app?


You should change your controller function to return conn. Now it returns whatever Stream.run() is returning. So change part of your code to

    conn =
      conn
      |> put_resp_content_type("video/mp4")
      |> put_resp_header("content-length", proxy_headers["content-length"])
      |> send_chunked(200)

    url
    |> HttpStream.stream()
    |> Stream.each(fn n -> chunk(conn, n) end)
    |> Stream.run()

    conn

Not sure it will help with memory problem, but every controller function needs to return Plug.Conn.t()


Nice, I accidentally deleted my reply and the forum says "You've performed this action too many times. Please wait 23 hours before trying again." when I try to undelete it. I'm pretty sure I haven't undeleted anything before, and the post will be deleted in 24 hours :slight_smile:

Here are those links again



I've implemented this as in the linked issue and now it works.


I think that's a temporary solution, because if you have multiple streaming connections the mailbox might be filling up from all of them, and checking the mailbox size might not be a very good solution. Hopefully this gets fixed in Cowboy 2.7. I read that Cowboy 1.x isn't affected because chunk is a synchronous function in 1.x. Maybe you could also try downgrading to 1.x in the meantime, if that's possible with Phoenix.
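For readers without the linked issue at hand, a mailbox-size check of this kind might look roughly like the sketch below. It is not the code from that issue: MailboxGuard, max_len and poll_ms are hypothetical names and values, and how to obtain the pid of the Cowboy connection process is deliberately left out because it depends on the adapter.

```elixir
defmodule MailboxGuard do
  # Blocks the caller until `pid` has at most `max_len` queued messages
  # or has exited. `max_len` and `poll_ms` are illustrative values only.
  def wait_until_drained(pid, max_len \\ 500, poll_ms \\ 10) do
    case Process.info(pid, :message_queue_len) do
      {:message_queue_len, len} when len > max_len ->
        Process.sleep(poll_ms)
        wait_until_drained(pid, max_len, poll_ms)

      _ ->
        # Queue is short enough, or the process is gone (Process.info returned nil).
        :ok
    end
  end
end
```

The producer would call wait_until_drained/3 before pushing each chunk. It is polling, not real back pressure, which is why it reads as a stopgap.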


Once people start with streams it's common to think they need to keep using a stream. The code should be:

    url
    |> HttpStream.stream()
    |> Enum.reduce(conn, fn n, conn ->
      # chunk/2 returns {:ok, conn} on success, so unwrap it for the next step
      {:ok, conn} = chunk(conn, n)
      conn
    end)

This ensures that the controller action returns a conn, and it returns specifically the conn that has sent all the chunks, which is what you want. It also avoids accumulating all the data in memory.

HttpStream still needs to be implemented in a way that doesn't overload the process with messages though.


It seems that you didn't read this thread before replying. The problem doesn't have anything to do with that code; there is a bug in Cowboy 2.x that is causing this. Look at my earlier post with links. Calling chunk pushes a message to the mailbox of the cowboy_clear:connection_process process and returns immediately. That means there is no back pressure like in Cowboy 1.x, where the call to chunk was synchronous. The mailbox keeps filling up because file chunks are coming in fast, the cowboy_clear:connection_process process gets slow, and a huge amount of memory is allocated.

If you want to try it yourself, @no_one provided the code on GitHub a few posts back.
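The difference between a fire-and-forget push and a synchronous one can be reproduced with two plain processes, no Cowboy involved. This is a toy model under stated assumptions: BackpressureDemo and its message shapes are hypothetical, and the slow consumer merely stands in for the connection process.

```elixir
defmodule BackpressureDemo do
  # A deliberately slow consumer, standing in for the connection process.
  def slow_consumer do
    receive do
      {:chunk, _data} ->
        Process.sleep(2)
        slow_consumer()

      {:chunk_sync, from, ref, _data} ->
        Process.sleep(2)
        send(from, {:ack, ref})
        slow_consumer()
    end
  end

  # Fire-and-forget: returns as soon as the message is queued.
  def push_async(pid, data), do: send(pid, {:chunk, data})

  # Synchronous: returns only after the consumer has handled the chunk.
  def push_sync(pid, data) do
    ref = make_ref()
    send(pid, {:chunk_sync, self(), ref, data})

    receive do
      {:ack, ^ref} -> :ok
    end
  end

  # Pushes `count` chunks with `push_fun` and reports the consumer's
  # message queue length right after the last push.
  def queue_len_after(count, push_fun) do
    pid = spawn(&slow_consumer/0)
    Enum.each(1..count, fn n -> push_fun.(pid, "chunk-#{n}") end)
    {:message_queue_len, len} = Process.info(pid, :message_queue_len)
    Process.exit(pid, :kill)
    len
  end
end
```

With push_sync/2 the queue is empty after the last push, because the producer can never run ahead of the consumer. With push_async/2 nearly all chunks are still sitting in the mailbox, which is the memory growth described above.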

Also, are you sure it's OK to use Enum.reduce with streams? I'm pretty sure that will load the whole file into memory, because it will create an intermediate list, as you can see in the examples in the Stream documentation (Elixir v1.16.0).

Edited: I think I was wrong. Using Enum.reduce doesn't cause any intermediate list to be created because it returns a single value, so it's probably OK in that code.
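A small stand-alone sketch of that point: Enum.reduce/3 pulls elements from a lazy stream one at a time and carries only the accumulator forward. ReduceDemo and its fake chunks are hypothetical and only imitate the shape of the streaming code.

```elixir
defmodule ReduceDemo do
  # A lazy stream of `count` binaries of `size` bytes each.
  # Each binary is created only when the consumer asks for it.
  def fake_chunks(count, size) do
    Stream.map(1..count, fn _ -> :binary.copy(<<0>>, size) end)
  end

  # Folds the stream into a running byte count.
  # Only the integer accumulator survives from one step to the next.
  def total_bytes(chunks) do
    Enum.reduce(chunks, 0, fn chunk, acc -> acc + byte_size(chunk) end)
  end
end
```

ReduceDemo.fake_chunks(1000, 1024) |> ReduceDemo.total_bytes() walks a thousand 1 KiB chunks while holding only the current chunk and the running total, so no list of all chunks is ever built.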


I tried your code and downgraded {:plug_cowboy, "~> 2.0"} to {:plug_cowboy, "~> 1.0"} in deps. That seems to fix the problem: memory doesn't increase anymore.
