How to stream a file from AWS to the client through an Elixir backend

Hey there,
I want to make my bucket private, so people can’t get anything from it even if they have the links.
So I want to authenticate through the app (already done) and, if that passes, return the file under a new link. How would one go about this?
I read that chunking can be useful, but I don’t get the whole picture: how will I have a new link for the client?

Are you using ExAws.S3?

In the past I built a WeTransfer-like service on top of S3. If you want to use your application just for authorisation, you can use a presigned URL.

client --> your backend generates a presigned URL --> backend redirects the client to this URL --> client downloads

https://hexdocs.pm/ex_aws_s3/ExAws.S3.html#presigned_url/5

In this way the client downloads directly from S3, without using bandwidth of your servers.
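
A minimal sketch of that flow, assuming the `ex_aws` and `ex_aws_s3` packages. The module name, bucket, key, and credentials below are placeholders for illustration, not anything from your app:

```elixir
# Standalone-script setup; in a Mix project, list these under deps in mix.exs instead.
Mix.install([{:ex_aws, "~> 2.5"}, {:ex_aws_s3, "~> 2.5"}])

defmodule PresignedLink do
  # Hypothetical helper: returns {:ok, url} for a GET that stays valid for `ttl` seconds.
  def for_download(config, bucket, key, ttl \\ 300) do
    ExAws.S3.presigned_url(config, :get, bucket, key, expires_in: ttl)
  end
end

# Placeholder credentials; the URL is signed locally, without a request to AWS.
config =
  ExAws.Config.new(:s3,
    access_key_id: "AKIAEXAMPLE",
    secret_access_key: "example-secret",
    region: "eu-west-1"
  )

{:ok, url} = PresignedLink.for_download(config, "my-private-bucket", "uploads/report.pdf")

# In a Phoenix controller, after your own authentication check, you would then do:
#   redirect(conn, external: url)
```

A short `expires_in` keeps the window in which the link can be passed around small, but it does not remove it.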


I was about to suggest presigned URLs as well, but for as long as they are valid they are indeed shareable. I’m not sure whether @benonymus is fine with that or really needs authentication on each request to the resource.


Thanks for both of the answers. We decided not to use presigned URLs, so I am interested in a different solution.
I was told that it is possible to stream the data from Amazon through the backend using HTTP chunking, but I don’t know how that will give a new URL to the client.
So far I got this:

HTTPoison.get!(x, %{}, stream_to: self())

and this gives this as a result

%HTTPoison.AsyncResponse{id: #Reference<0.528039844.3377725442.160394>}

The only other way I see is to download the file as a stream and pipe the bytes down to the client, but this would mean that you pay the bill twice!

yes, so basically instead of aws -> client we do
aws -> backend -> client
right?

Yes. That’s the way you would do it.

It depends on whether their servers are on AWS or not. Traffic between EC2 instances and S3 in the same region is free. But sure, from EC2 to the client it’s paid (and quite expensive).

@benonymus why do you want to avoid presigned URLs?


Sorry guys for the “withdrawn” posts. I hit reply while I was trying to quote your posts.

Elixir makes this really easy thanks to its concurrency. Do you have everything on AWS? Or do you have the servers with a different cloud provider?

Everything is on AWS, yes. The thing is that I am not sure how to go about it technically. Is there any good article on it?

Presigned URLs require methods to maintain the URLs and refresh them, and can result in weird caching problems when the URLs expire. These were the conclusions we came to about this option.

Like, now I have the URLs to the files, but how will I get the files and then have new URLs for them to return to the client?

For streaming using HTTPoison I wrote an article a few weeks ago: Download Large Files with HTTPoison Async Requests, but it would be better to directly use a library like ExAws.S3. I can’t find a function to get an Elixir stream from an S3 object though…

Update
Am I wrong, or is there a massive overhead here in ExAws.S3: https://github.com/ex-aws/ex_aws_s3/blob/master/lib/ex_aws/s3/download.ex#L76 ?

Each chunk seems to be requested with a separate HTTP request…
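
To make the concern concrete, here is a sketch of what "one request per chunk" would look like if each chunk is fetched with its own ranged GET. This is only an illustration of the idea, not the library's code; the module and function names are made up:

```elixir
defmodule RangeSketch do
  # Splits a file of `size` bytes into inclusive {first, last} byte ranges
  # of at most `chunk_size` bytes each.
  def byte_ranges(size, chunk_size) when size > 0 and chunk_size > 0 do
    0
    |> Stream.iterate(&(&1 + chunk_size))
    |> Stream.take_while(&(&1 < size))
    |> Enum.map(fn first -> {first, min(first + chunk_size, size) - 1} end)
  end

  # The HTTP header that one ranged GET would carry for a given range.
  def range_header({first, last}), do: {"range", "bytes=#{first}-#{last}"}
end

# A 10-byte object fetched in 4-byte chunks needs three separate requests.
headers = RangeSketch.byte_ranges(10, 4) |> Enum.map(&RangeSketch.range_header/1)
IO.inspect(headers)
```

Every range costs a full request/response round trip, which is where the overhead would come from compared to reading one long response body.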


I had the time to write down an idea for a wrapper around HTTPoison that turns the response into an Elixir stream.

defmodule HTTPDownload do

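  # Lazily streams the response body of `url` as a sequence of binary chunks.
  def stream!(url) do
    Stream.resource(
      # start: fire the async request; the messages go to the calling process
      fn -> start_request(url) end,
      # next: wait for one message, emit body chunks, skip status and headers
      fn ref ->
        case receive_response(ref) do
          # returning the chunk to the stream
          {:ok, {:chunk, chunk}} ->
            HTTPoison.stream_next(ref)
            {[chunk], ref}

          # status code or headers: nothing to emit, ask for the next message
          {:ok, _msg} ->
            HTTPoison.stream_next(ref)
            {[], ref}

          {:error, error} ->
            raise "error #{inspect(error)}"

          :done ->
            {:halt, ref}
        end
      end,
      # cleanup: stop_async/1 expects the request reference, not the AsyncResponse struct
      fn ref -> :hackney.stop_async(ref.id) end
    )
  end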


  defp start_request(url) do
    {:ok, ref} = HTTPoison.get(url, %{}, stream_to: self(), async: :once)
    ref
  end

  defp receive_response(ref) do
    id = ref.id
    receive do 
      %HTTPoison.AsyncStatus{code: code, id: ^id} when 200 <= code and code < 300 -> 
        {:ok, {:status_code, code}}
      %HTTPoison.AsyncStatus{code: code, id: ^id} ->
        {:error, {:status_code, code}}

      %HTTPoison.AsyncHeaders{headers: headers, id: ^id}->
        {:ok, {:headers, headers}}
      
      %HTTPoison.AsyncChunk{chunk: chunk, id: ^id}->
        {:ok, {:chunk, chunk}}

      %HTTPoison.AsyncEnd{id: ^id}-> :done
    end
  end
end

So in this way you can get a stream from an HTTP response, and you can use it like this:

HTTPDownload.stream!(url_to_my_s3_file)
|> Enum.each(fn chunk -> send_chunk_to_client(client_conn, chunk) end)

So, if you don’t want to deal with the AWS HTTP API yourself, you can get the presigned URL using the ex_aws_s3 library, and use that URL in your backend to get the stream of chunks to send to the client.
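
The `send_chunk_to_client` call above is left undefined. One way to fill it in, assuming the client connection is a `Plug.Conn` (as in Phoenix), is Plug's chunked response API. A sketch with a made-up module name, taking any enumerable of binaries such as the stream from `HTTPDownload.stream!/1`:

```elixir
# Standalone-script setup; in a Mix project, Plug is already a dependency of Phoenix.
Mix.install([{:plug, "~> 1.14"}])

defmodule ChunkRelay do
  # Starts a chunked HTTP response and forwards every element of `chunks`
  # to the client as soon as it is produced.
  def relay(conn, chunks, content_type \\ "application/octet-stream") do
    conn =
      conn
      |> Plug.Conn.put_resp_content_type(content_type)
      |> Plug.Conn.send_chunked(200)

    Enum.reduce_while(chunks, conn, fn data, conn ->
      case Plug.Conn.chunk(conn, data) do
        # chunk delivered, keep pulling from the upstream stream
        {:ok, conn} -> {:cont, conn}
        # client went away: stop pulling, which also stops the upstream download
        {:error, _reason} -> {:halt, conn}
      end
    end)
  end
end
```

In a controller action this would be called after the authentication check, roughly as `ChunkRelay.relay(conn, HTTPDownload.stream!(presigned_url))`. `Enum.reduce_while/3` is used instead of `Enum.each/2` because each call to `Plug.Conn.chunk/2` returns an updated conn and can fail when the client disconnects.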


And how can I return a link to the user in the JSON response, where they can get the image like this?

Do you need to return the file or a link?

At some point the client will have to download the file, so the example I wrote above works to download the file in chunks while sending the chunks to the client.

Well, right now I am returning JSON with several fields. One of those fields holds the link to AWS, and I need to replace that with a new link, so I would need a link.
This is what I return now:

%{
  attachment_url:
    XD.Utils.get_image_aws_url(
      "post_attachment_url",
      post_attachment.attachment,
      post_attachment
    ),
  type: post_attachment.type,
  caption: post_attachment.caption,
  user: render_one(post_attachment.user, UserView, "user_mini.json")
}

So I would create the AWS URL earlier, do the magic, and replace it with the new one pointing to the file.
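
One way to read this: the link in the JSON does not have to be an AWS URL at all. It can point at a route on the backend itself, and that route authenticates the request and then streams the file from S3. A sketch of the rendering side only, with a hypothetical route path and module name, and assuming the attachment has an `id` field:

```elixir
defmodule AttachmentLinkSketch do
  # Hypothetical backend route; the controller behind it would check
  # authentication and then stream the object from S3 to the client.
  def download_path(%{id: id}), do: "/api/post_attachments/#{id}/download"

  # Same shape as the map in the question, minus the user field, with the
  # AWS link replaced by a link to the backend.
  def render(post_attachment) do
    %{
      attachment_url: download_path(post_attachment),
      type: post_attachment.type,
      caption: post_attachment.caption
    }
  end
end

IO.inspect(AttachmentLinkSketch.render(%{id: 42, type: "image", caption: "hello"}))
```

The S3 URL itself then never leaves the backend, so the bucket can stay private.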

OK, basically I got it. I will show the solution shortly.