URI.parse and the 7x http:// URL

I’m not sure whether everyone has seen Daniel Stenberg’s post on the validity of http://http://http://@http://http://?http://#http://, but it’s an interesting read.

The Elixir URI.parse implementation parses in the same way as Python (the user info block is nil). Should it change to match the cURL implementation here?


Interesting read indeed!

Elixir implements RFC 3986, so we should parse according to that, not according to curl. A quick check shows we align with RFC 3986 and its reference regex.
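For anyone who wants to repeat that quick check, here is a minimal sketch. It assumes the reference regex from RFC 3986, Appendix B, with names added to the capture groups for readability. The module `Rfc3986Sketch` and its `split/1` function are made up for illustration and are not part of Elixir's `URI` module.

```elixir
defmodule Rfc3986Sketch do
  # Hypothetical helper, not part of Elixir's URI module.
  # The pattern is the RFC 3986 Appendix B regex, with names added to the
  # five capture groups that hold the URI components.
  defp pattern do
    ~r{^((?<scheme>[^:/?#]+):)?(//(?<authority>[^/?#]*))?(?<path>[^?#]*)(\?(?<query>[^#]*))?(#(?<fragment>.*))?}
  end

  # Returns a map with the string keys "scheme", "authority", "path",
  # "query" and "fragment", or nil if the regex does not match.
  def split(uri) when is_binary(uri) do
    Regex.named_captures(pattern(), uri)
  end
end

parts = Rfc3986Sketch.split("http://http://http://@http://http://?http://#http://")
IO.inspect(parts)
```

The authority group stops at the first "/", "?" or "#" after the leading "//", so it only captures "http:". The "@" ends up in the path, which is why the user info comes out as nil under RFC 3986.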

But you did find a related bug:

iex(2)> URI.new "http://http://http://@http://http://?http://#http://"
{:ok,
 %URI{
   scheme: "http",
   userinfo: nil,
   host: "http",
   port: :undefined,
   path: "//http://@http://http://",
   query: "http://",
   fragment: "http://"
 }}

Notice the :undefined in the port field.
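Until that is fixed, one possible stopgap is to normalize the port after calling URI.new. This is only a sketch: `PortFix` and `normalize/1` are hypothetical names, and it assumes the scheme's default port is the value you want whenever the port is not an integer. `URI.default_port/1` is the existing function from Elixir's `URI` module.

```elixir
defmodule PortFix do
  # Hypothetical workaround, not part of Elixir.

  # An integer port is already fine, so return the struct unchanged.
  def normalize(%URI{port: port} = uri) when is_integer(port), do: uri

  # Anything else (nil, :undefined, ...) is replaced by the default port
  # registered for the scheme, or by nil when there is no scheme.
  def normalize(%URI{scheme: scheme} = uri) do
    %{uri | port: scheme && URI.default_port(scheme)}
  end
end

fixed = PortFix.normalize(%URI{scheme: "http", host: "http", port: :undefined})
IO.inspect(fixed.port)
```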


URI.parse/1 does better:

iex(2)> URI.parse("http://http://http://@http://http://?http://#http://")
%URI{
  authority: "http:",
  fragment: "http://",
  host: "http",
  path: "//http://@http://http://",
  port: 80,
  query: "http://",
  scheme: "http",
  userinfo: nil
}