S3 operations returning HTTP 505 error

Given a particularly devious S3 object key, e.g.

"users/f8db13a0-e2b8-45cd-a902-5eede074eb46/~*\\|:\"<> +`!@#$%^&()-=_[]{};',.?🧪//fakedata~*\\|:\"<> +`!@#$%^&()-=_[]{};',.?🧪.txt"

I'm running into HTTP 505 errors from the ExAws client, e.g.

    bucket
    |> ExAws.S3.head_object(object_key)
    |> ExAws.request()

I can see the objects there if I run ExAws.S3.list_objects/2 or ExAws.S3.list_objects_v2/2

A Python script uploaded the file (probably using the official boto library), but I'm failing to retrieve it.

Frustratingly, the AWS docs about 5xx errors do not mention 505 errors anywhere. There are few other mentions of this error (e.g. this one).

Because I'm able to interact with other "normally named" objects in the same bucket, the culprit is likely the weird object key. Has anyone else encountered this? Is this perhaps a bug in ExAws.S3?


This key contains characters that should be avoided according to the AWS docs: Creating object key names - Amazon Simple Storage Service. I don't expect ExAws.S3 to do a good job handling characters that AWS S3 itself says not to use.

The 505 error code is defined by RFC 2616 as "HTTP Version Not Supported".

My guess is that this error is being triggered by the unescaped space character in that key - the request would look something like

HEAD users/f8db13a0-e2b8-45cd-a902-5eede074eb46/~*\|:"<> +`!@#$%^&()-=_[]{};',.?🧪//fakedata~*\|:"<> +`!@#$%^&()-=_[]{};',.?🧪.txt HTTP/1.1

so the part of the key after the space is being interpreted as the requested HTTP version, and the S3 server is replying with a 505 because it expects the request line to be <method> <url> <version>.

For sure, these are intentionally problematic keys, but they do represent possible values, and we can't control what files the Python users might lob our way.

This seems plausible – some of the other posts I've found describing similar errors seem to have something to do with spaces and encoding. I'm not sure how to escape the space in this case… adding a backslash or replacing it with + didn't seem to work. Thoughts?

I'm definitely feeling like the better option is to just reject any file names like that in our app. Even if they were "legitimately" created elsewhere (e.g. using boto), our app can't handle them in any practical sense.

IIRC the +-as-space escape sequence only works in query strings; try %20 instead
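The difference is easy to see with Elixir's own URI module: URI.encode_www_form/1 produces the +-style form encoding meant for query strings, while URI.encode/2 with the unreserved-character predicate produces the %20-style percent-encoding a path segment needs. A quick sketch (the filename is just an example):

```elixir
# Form encoding, meant for application/x-www-form-urlencoded
# query strings: the space becomes "+"
URI.encode_www_form("fake data.txt")
# => "fake+data.txt"

# Percent-encoding everything except RFC 3986 unreserved characters,
# suitable for a path segment: the space becomes "%20"
URI.encode("fake data.txt", &URI.char_unreserved?/1)
# => "fake%20data.txt"
```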

I fixed some encoding in ex_aws a few months ago around encoding spaces, which didn't work with MinIO. Maybe that's related: Use percent encoding instead of www form for header by LostKobrakai · Pull Request #184 · ex-aws/ex_aws_s3 · GitHub

The proper fix/workaround is to store objects under GUID keys in S3 only (with the actual file name stored in the database) and then use Content-Disposition to emit the correct file name later.
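A minimal sketch of that design, assuming ex_aws_s3 and Ecto are available – the module, bucket name, and persistence step are all hypothetical, but ExAws.S3.presigned_url/5 and its :query_params option are real:

```elixir
defmodule MyApp.Uploads do
  @bucket "my-bucket"  # assumed bucket name

  # Store the object under an opaque UUID key; keep the user-supplied
  # name only as a column in our own database, never in the S3 key.
  def store_upload(binary, _original_filename) do
    key = "uploads/" <> Ecto.UUID.generate()

    {:ok, _} =
      @bucket
      |> ExAws.S3.put_object(key, binary)
      |> ExAws.request()

    # e.g. Repo.insert!(%Upload{key: key, filename: original_filename})
    {:ok, key}
  end

  # Re-attach the original filename only at download time, via the
  # response-content-disposition override on a presigned GET URL.
  def download_url(key, original_filename) do
    disposition = ~s(attachment; filename="#{original_filename}")

    ExAws.Config.new(:s3)
    |> ExAws.S3.presigned_url(:get, @bucket, key,
      query_params: [{"response-content-disposition", disposition}]
    )
  end
end
```

This way the hostile characters never touch the request line at all – they only ever travel inside a percent-encoded query parameter.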

As @LostKobrakai discussed, you can consider sanitising/encoding the characters or changing them to a hex representation; however, there is no standardised way to pass such string literals within JSON, and there may be significant work required to handle these files.

I think the only place in ex_aws to change would be to do this encoding/decoding transparently (as a breaking change) in ExAws.Operation.S3.add_bucket_to_path/2, or perhaps to go a step further and use XML for outgoing requests (so you get proper string literals that can represent the object names you need). However, since that would create such a huge support burden for the maintainers, is already discouraged by AWS, and would encourage further bad application design, I don't think it will be considered AT ALL.

So my best suggestion is to change the design as previously mentioned, to obviate the need for this rigmarole. It is also the right thing to do: if you sanitise the names with some kind of URI-encoded string, you avoid a whole class of security bugs, such as a user supplying a file name that effectively results in a path-traversal exploit. The bigger the attack surface, the bigger your risk.
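If you do decide to reject hostile names at the boundary instead, a small whitelist check is enough. The allowed character set below is an assumption you would tune to your app's policy:

```elixir
defmodule SafeKey do
  # Conservative whitelist: letters, digits, dot, dash, underscore.
  # Everything else (spaces, quotes, "/", backslashes, emoji) is
  # rejected outright; "." / ".." are excluded to block path traversal.
  @allowed ~r/\A[A-Za-z0-9._-]+\z/

  def valid?(name) do
    Regex.match?(@allowed, name) and
      name not in [".", ".."] and
      not String.contains?(name, "..")
  end
end

SafeKey.valid?("report-2024_v1.txt")  # => true
SafeKey.valid?("../../etc/passwd")    # => false
SafeKey.valid?("weird name?.txt")     # => false
```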
