(AWS/ex_aws_s3) S3 download_file raises an error when the S3 object's path contains a double slash

Hello,

I posted a bug report for ex_aws_s3 at: https://github.com/ex-aws/ex_aws_s3/issues/58

Could anybody please take a look and let me know whether you have encountered the same issue?
Thank you.

Have you tried escaping the / characters?

Thank you for the review.

I tried to escape the double slash in the S3 path

audit-function-test//dima-elixir/file-to-test

with

ExAws.S3.download_file("audit-function-test", "\/dima-elixir/file-to-test", "/home/user/temp/1/file_1") |> ExAws.request

or

ExAws.S3.download_file("audit-function-test", "//dima-elixir/file-to-test", "/home/user/temp/1/file_1") |> ExAws.request

or

ExAws.S3.download_file("audit-function-test", ~s("/dima-elixir/file-to-test"), "/home/user/temp/1/file_1") |> ExAws.request

or

ExAws.S3.download_file("audit-function-test", ~S("/dima-elixir/file-to-test"), "/home/user/temp/1/file_1") |> ExAws.request

and NONE of these expressions works; each raises this exception:

** (ExAws.Error) ExAws Request Error!

{:error, {:http_error, 404, %{headers: [{"x-amz-request-id", "F9CF634FAF5AA1A8"}, {"x-amz-id-2", "e9A+PgIECsY2yJm/XhJIeLOR1OeNAov2WtO/3hdSn4gWzWppfWYUpIocj8Yy3x5NQbgDVzrHfOo="}, {"Content-Type", "application/xml"}, {"Transfer-Encoding", "chunked"}, {"Date", "Fri, 17 May 2019 17:30:20 GMT"}, {"Server", "AmazonS3"}], status_code: 404}}}

    (ex_aws) lib/ex_aws.ex:66: ExAws.request!/2
    (ex_aws_s3) lib/ex_aws/s3/download.ex:51: ExAws.S3.Download.get_file_size/3
    (ex_aws_s3) lib/ex_aws/s3/download.ex:28: ExAws.S3.Download.build_chunk_stream/2
    (ex_aws_s3) lib/ex_aws/s3/download.ex:68: ExAws.Operation.ExAws.S3.Download.perform/2

Try:

"%2Fdima-elixir/file-to-test"


aws s3 cp s3://audit-function-test//dima-elixir/file-to-test

uses an S3 URI - which isn’t a URL.

When converting to an S3 http URL the duplicated slashes are stripped out as part of URL normalization.
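To illustrate the effect described above, here is a minimal sketch of how collapsing repeated slashes changes the request path. `SlashDemo` and `collapse_slashes/1` are hypothetical names made up for this example and are not part of ex_aws; the regex only assumes that normalization means "replace runs of `/` with a single `/`".

```elixir
defmodule SlashDemo do
  # Hypothetical helper (not an ex_aws function): collapse any run of two or
  # more "/" into a single "/", as a typical URL path normalizer would.
  def collapse_slashes(path) do
    String.replace(path, ~r|/{2,}|, "/")
  end
end

# Bucket "audit-function-test", key "/dima-elixir/file-to-test"
original = "/audit-function-test//dima-elixir/file-to-test"
normalized = SlashDemo.collapse_slashes(original)

IO.puts(normalized)
# The leading "/" of the key is gone, so the request would target the key
# "dima-elixir/file-to-test", which is a different (here nonexistent) object.
```

Under that assumption, the 404 is what you would expect: the request is made for a key that does not exist in the bucket.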

Ultimately I would stay away from keys that don’t easily convert to http paths.

Object Key Naming Guidelines


Thank you for review of my issue.

Sorry to say, but using encoding

Try:
"%2Fdima-elixir/file-to-test"

does not work either (I tried it before); I got the same exception.

When converting to an S3 http URL the duplicated slashes are stripped out as part of URL normalization.

Thank you for sharing the source; at least we know that this is a bug which should be fixed.

Ultimately I would stay away from keys that don’t easily convert to http paths.
Object Key Naming Guidelines

Thank you for the suggestion, but reality is slightly different from what is on paper.
First, a double slash in a URL is a valid sequence (although not recommended) and should be supported.

Example:

use@u38ac28ac89bd5d:~/workplace/Elixir/src/Elixir/poc$ aws s3 cp s3://audit-function-test//dima-elixir/file-to-test ~/temp/test
download: s3://audit-function-test//dima-elixir/file-to-test to ../../../../../temp/test
use@u38ac28ac89bd5d:~/workplace/Elixir/src/Elixir/poc$ echo $?
0

So you can see that the AWS CLI (based on Python) can cope with double slashes (as can the AWS SDK for Java).

Second, I would gladly avoid double slashes in my buckets' naming scheme, but unfortunately we have production buckets with such a naming scheme, and what I am doing now is prototyping existing Java system(s) in Elixir to get an idea of whether we can adopt Elixir. I cannot avoid such buckets because they hold real production data at the volume I need, and if the POC succeeds I will need to use them anyway.

So now the question is whether this bug can be fixed in the ex_aws/ex_aws_s3 library, and when?

Thank you.

The Go AWS SDK needs a configuration of DisableRestProtocolURICleaning to permit object keys with a leading forward slash.

So to become unblocked I’d see if removing the path normalization works so that your POC can progress. How this issue will be addressed, if at all (by removal or by configuration), should be discussed in the issue on GitHub. I’m sure that a PR would be welcome.
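As a rough sketch of the "configuration" variant, normalization could be kept as the default and switched off per request. Everything here is an assumption for illustration: `PathBuilder`, `build_path/3` and the `:normalize_path` option are hypothetical names, not the library's actual API.

```elixir
defmodule PathBuilder do
  # Hypothetical sketch, not ex_aws code. `:normalize_path` is an assumed
  # option name; it defaults to true to preserve the current behaviour.
  def build_path(bucket, key, opts \\ []) do
    path = "/" <> bucket <> "/" <> key

    if Keyword.get(opts, :normalize_path, true) do
      # Default: collapse runs of "/" into a single "/".
      String.replace(path, ~r|/{2,}|, "/")
    else
      # Opt-out: keep the key exactly as given, including a leading "/".
      path
    end
  end
end

IO.puts(PathBuilder.build_path("audit-function-test", "/dima-elixir/file-to-test"))
IO.puts(PathBuilder.build_path("audit-function-test", "/dima-elixir/file-to-test", normalize_path: false))
```

Keeping normalization on by default would leave existing callers unaffected, while buckets with keys that start with a slash could opt out.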

Thank you for your advice. I believe I’ll proceed with a fix that makes normalization configurable and submit a PR.