I’ve noticed that when I need to set the Content-Disposition header (see MDN) manually, the usual advice is to use `URI.encode_www_form` or sometimes `URI.encode`. This has usually resulted in users receiving files with mangled filenames and sometimes, when only the `filename=` parameter is used, a filename that the browser may refuse to honor or may truncate, dropping the extension and half the name.
I want to start off by saying that there is already a library for creating Content-Disposition headers, and my work on the formatting side is mostly an extension of it: jeroenvisser101/content_disposition on GitHub, a helper package for Elixir to generate Content-Disposition values (shout-out to @jeroenvisser101).
I also want to give kudos to Julian Reschke, aka greenbytes, whose RFC work was uplifting, and to the maintainers of the jshttp/content-disposition npm package for their thorough tests.
Ok, returning to the topic…
So I’ve also noticed that Plug handles this in `Plug.Parsers.MULTIPART` for the newer, RFC-suggested “filename*=” parameter, but still gives priority to the older, US-ASCII-only “filename=” parameter.
What got me here, looking for an RFC-compliant parser, is a scenario where we have to consume our own downloads (why is not important), as well as another issue where, by following the old ways, users were ending up with mangled download filenames.
What I’m noticing is that each implementation of the formatting and parsing logic, across multiple libraries, has at least one of these problems:
- Using the wrong RFCs (`URI.encode_www_form`), encoding the “filename=” parameter when the RFCs seem to say not to and to use “filename*=” instead (`send_download`, and a few other places and libraries), or other parsing and encoding issues. If you have a file called `"my 'secrets' file.txt"` and encode it in different ways, you get:
  - `"my%20'secrets'%20file.txt"` (from `URI.encode`), which causes issues with “filename*=” since the single quote `'` is the splitting character for parameter extensions
  - `"my+%27secrets%27+file.txt"` (from `URI.encode_www_form`), which works but leaves plus signs in the filename, which even the docs point out as a potential downside
- A couple of places in different libraries where the wrong set of characters is used for the percent-encoding (the RFCs are quite dense at times, so this is understandable)
- `Plug.Conn.Utils.params` doesn’t allow spaces before the “quoted-string” value, but I believe it should
- A number of places where values for `filename*=` are matched against “utf-8” only and not “UTF-8”, or where a language tag is not handled, for example in `filename*=UTF-8'en'some%20file.txt` (which should be RFC-compliant)
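To make the encoding point concrete, here is a minimal sketch of percent-encoding a filename for “filename*=” using the attr-char set from RFC 5987. The module and function names are hypothetical (they are not part of Plug or the content_disposition package), and the sketch only builds the extended value; a real implementation would likely also emit an ASCII-only “filename=” fallback.

```elixir
defmodule ContentDispositionSketch do
  # Hypothetical module for illustration only.

  # attr-char from RFC 5987:
  # ALPHA / DIGIT / "!" / "#" / "$" / "&" / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
  defp attr_char?(byte) when byte in ?a..?z or byte in ?A..?Z or byte in ?0..?9, do: true
  defp attr_char?(byte), do: byte in ~c"!#$&+-.^_`|~"

  # Builds the value that goes after `filename*=`: charset, empty language
  # tag, then the percent-encoded UTF-8 bytes of the filename.
  def encode_ext_value(filename) when is_binary(filename) do
    "UTF-8''" <> URI.encode(filename, &attr_char?/1)
  end
end

# The two encodings from the list above, for comparison:
URI.encode("my 'secrets' file.txt")
#=> "my%20'secrets'%20file.txt"
URI.encode_www_form("my 'secrets' file.txt")
#=> "my+%27secrets%27+file.txt"

# With the attr-char predicate, the single quotes are escaped and spaces stay %20:
ContentDispositionSketch.encode_ext_value("my 'secrets' file.txt")
#=> "UTF-8''my%20%27secrets%27%20file.txt"
```

`URI.encode/2` accepts a predicate that decides which bytes are left unescaped, so no custom percent-encoding loop is needed here.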
What I am proposing, though, is a well-tested library for creating and parsing Content-Disposition headers, one that is used in Plug and suggested in the docs of different libraries for when you need to extract the filename from a response using your HTTP client of choice.
I am thinking something like jshttp/content-disposition but for Elixir, and potentially using greenbytes’ testing XML.
And it seems like it should be configurable for how “strict” it is, defaulting to not decoding the “filename=” parameter, but being a bit more forgiving when parsing with the “filename*=” parameter.
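As a rough illustration of what configurable strictness could look like on the “filename*=” side, here is a sketch of a parser for the extended value. Everything in it is an assumption for discussion: the module name, the function name, and the `:strict` option are hypothetical, and it only accepts the UTF-8 charset. It matches the charset case-insensitively and tolerates a language tag; in the default forgiving mode it also accepts raw characters that should have been percent-encoded.

```elixir
defmodule ExtValueSketch do
  # Hypothetical helper; names and the :strict option are illustrative only.

  # Only attr-char bytes or %XX escapes are allowed in strict mode.
  @strict_value ~r/\A(?:[A-Za-z0-9!#$&+\-.^_`|~]|%[0-9A-Fa-f]{2})*\z/

  @doc "Parses a filename* value such as UTF-8'en'some%20file.txt."
  def parse_ext_value(value, opts \\ []) when is_binary(value) do
    strict? = Keyword.get(opts, :strict, false)

    # The value has the shape charset'language'percent-encoded-bytes.
    with [charset, language, encoded] <- String.split(value, "'", parts: 3),
         true <- String.downcase(charset) == "utf-8",
         {:ok, filename} <- decode(encoded, strict?) do
      {:ok, %{charset: "UTF-8", language: language, filename: filename}}
    else
      _ -> :error
    end
  end

  defp decode(encoded, strict?) do
    if strict? and not Regex.match?(@strict_value, encoded) do
      :error
    else
      try do
        decoded = URI.decode(encoded)
        # Reject byte sequences that are not valid UTF-8 after decoding.
        if String.valid?(decoded), do: {:ok, decoded}, else: :error
      rescue
        # URI.decode/1 raises on malformed escapes such as "%zz".
        ArgumentError -> :error
      end
    end
  end
end

ExtValueSketch.parse_ext_value("UTF-8'en'some%20file.txt")
#=> {:ok, %{charset: "UTF-8", language: "en", filename: "some file.txt"}}
```

Passing `strict: true` makes the same function reject a value like `UTF-8''some file.txt`, where the space was not percent-encoded.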
I have created a gist (link below) of some recent work I’ve done on the parsing. It still needs massive amounts of work, and honestly should probably be a PR to @jeroenvisser101’s repository, but I wanted to start a topic here to get people’s insights, and hopefully raise awareness of how inconsistent the implementations and suggestions I’ve seen appear to be.
What do people think?
p.s. In the gist you’ll see `MyApp.Downloadable`. It is a protocol we use so that we never have to write another bespoke `/download` route again. We just have a controller responsible for each necessary use case (authorized & single-use, shared & unlimited downloads within the TTL, etc.) and then some clever LiveView components that drop a `Downloadable` into an appropriate store with a UUID on click, then finish opening the link using the new UUID. It’s pretty much solved all our download needs for LiveViews and “dead” views. And API users can still use the protocol, just on different controllers than e.g. the “SingleUseController”.