Req — A batteries-included HTTP client for Elixir

wojtekmach · June 21, 2022, 4:01pm

Hey everyone!

Req is an HTTP client for Elixir that I’ve been working on for quite some time. There are already a lot of HTTP clients out there, so why create a new one? Two things: a great out-of-the-box experience and extensibility.

Regarding out of the box experience, let’s first see it in action:

Mix.install([
  {:req, "~> 0.3.0"}
])

Req.get!("https://api.github.com/repos/elixir-lang/elixir").body["description"]
#=> "Elixir is a dynamic, functional language designed for building scalable and maintainable applications"

Req.get!("http://api.github.com").status
# 23:24:11.670 [debug] follow_redirects: redirecting to https://api.github.com/
#=> 200

Req.get!("https://httpbin.org/status/500,200").status
# 19:02:08.463 [error] retry: got response with status 500, will retry in 2000ms, 2 attempts left
# 19:02:10.710 [error] retry: got response with status 500, will retry in 4000ms, 1 attempt left
#=> 200

Req automatically decompresses and decodes the response body, follows redirects, retries in the face of errors, and more. See the “Features” section in the README for the whole list.

Regarding extensibility, virtually all of Req’s functionality is broken down into individual pieces called steps. Req works by running the request struct through these steps. You can easily reuse or rearrange built-in steps or write new ones. Steps are similar to Tesla middleware, although they are very different in implementation. Steps are just regular functions:

debug_url = fn request ->
  IO.inspect(URI.to_string(request.url))
  request
end

req =
  Req.new(base_url: "https://api.github.com")
  |> Req.Request.append_request_steps(debug_url: debug_url)

Req.get!(req, url: "/repos/wojtekmach/req").body["description"]
# Outputs: "https://api.github.com/repos/wojtekmach/req"
#=> "Req is a batteries-included HTTP client for Elixir."

See the Req.Steps module for a list of all built-in steps.
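To make the idea of "running the request through steps" concrete, here is a toy model in plain Elixir. It is only a sketch and not how Req is implemented: the `ToyPipeline` module, the plain-map request, and both step functions are hypothetical names made up for illustration.

```elixir
# Toy model of a step pipeline. NOT Req's implementation; ToyPipeline and
# the plain-map "request" below are hypothetical stand-ins.
defmodule ToyPipeline do
  # Runs the request through each named step, in order.
  def run(request, steps) do
    Enum.reduce(steps, request, fn {_name, step}, acc -> step.(acc) end)
  end
end

# Each step is a regular function that takes a request and returns a request.
put_base_url = fn request ->
  Map.update!(request, :url, &("https://api.github.com" <> &1))
end

put_user_agent = fn request ->
  Map.update!(request, :headers, &[{"user-agent", "toy"} | &1])
end

request = %{url: "/repos/wojtekmach/req", headers: []}

result =
  ToyPipeline.run(request, put_base_url: put_base_url, put_user_agent: put_user_agent)

IO.inspect(result.url)
```

Because steps are named and kept in an ordered list, reusing or rearranging them amounts to editing that list.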

After writing custom Req steps we can make them even easier to use by others by packaging them up into plugins. Here are some examples:

Mix.install([
  {:req, "~> 0.3.0"},
  {:req_easyhtml, github: "wojtekmach/req_easyhtml"},
  {:req_s3, github: "wojtekmach/req_s3"},
  {:req_hex, github: "wojtekmach/req_hex"}
])

req =
  (Req.new(http_errors: :raise)
  |> ReqEasyHTML.attach()
  |> ReqS3.attach()
  |> ReqHex.attach())

Req.get!(req, url: "https://elixir-lang.org").body[".entry-summary h5"]
#=>
# #EasyHTML[<h5>
#    Elixir is a dynamic, functional language for building scalable and maintainable applications.
#  </h5>]

Req.get!(req, url: "s3://ossci-datasets").body
#=>
# [
#   "mnist/",
#   "mnist/t10k-images-idx3-ubyte.gz",
#   "mnist/t10k-labels-idx1-ubyte.gz",
#   "mnist/train-images-idx3-ubyte.gz",
#   "mnist/train-labels-idx1-ubyte.gz"
# ]

Req.get!(req, url: "https://repo.hex.pm/tarballs/req-0.1.0.tar").body["metadata.config"]["links"]
#=> %{"GitHub" => "https://github.com/wojtekmach/req"}

Plugins are nothing more than a convention (there’s no plugin contract) and I’m still figuring out what does and doesn’t make sense as a plugin. See the “Writing Plugins” section in the Req.Request module documentation for a bit more information about plugins.

If you’re new to Req, I hope this post serves as a good introduction. If you have heard about it before, you may want to check the latest v0.3 release.

Any feedback is appreciated. Happy hacking!

tj0 · June 21, 2022, 5:23pm

All praise sensible defaults! I got bitten once by one of the clients not verifying SSL. Now it’s the first thing I check.

Req.get!("https://wrong.host.badssl.com/")
10:19:47.183 [info] TLS :client: In state :certify at ssl_handshake.erl:1990 generated CLIENT ALERT: Fatal - Handshake Failure
 - {:bad_cert, :hostname_check_failed}
10:19:47.184 [error] retry: got exception, will retry in 1000ms, 3 attempts left
10:19:47.184 [error] ** (Mint.TransportError) TLS client: In state certify at ssl_handshake.erl:1990 generated CLIENT ALERT: Fatal - Handshake Failure
 {bad_cert,hostname_check_failed}

sergio · June 21, 2022, 5:28pm

Will swap out my Tesla usage with this. I love sensible defaults and Req seems to need less code. I hate writing code.

stefanchrobot · June 21, 2022, 8:46pm

In my app I’m making requests to user-defined URLs, so errors are expected and I guess my feedback is mostly around that. Have a look at this:

iex> Req.request(url: "https://bad.domain")
22:23:59.915 [error] retry: got exception, will retry in 1000ms, 3 attempts left
22:23:59.915 [error] ** (Mint.TransportError) non-existing domain
22:24:00.932 [error] retry: got exception, will retry in 2000ms, 2 attempts left
22:24:00.933 [error] ** (Mint.TransportError) non-existing domain
22:24:02.945 [error] retry: got exception, will retry in 4000ms, 1 attempt left
22:24:02.945 [error] ** (Mint.TransportError) non-existing domain
{:error, %Mint.TransportError{reason: :nxdomain}}

Would be nice to have Req.get (not only the ! versions),
Would be nice to be able to abstract away the adapter: here I need to know the adapter and the possible errors as defined by the adapter; changing the adapter requires changing the error handling code,
I think the default retries should be more aggressive: something like 2 attempts with just a few milliseconds of delay,
I don’t think that retrying non-existing domain error is a sensible default.

And two general comments:

As a step “consumer” it would be nice to be able to ignore the difference between request, response and error steps,
It’s unclear how request options relate to step options. Personally I’d find it clearer if Req.request(foo: [bar: 1]) meant that [bar: 1] are the options for the :foo step.

wojtekmach · June 21, 2022, 9:25pm

Thanks for the feedback!

Would be nice to have Req.get (not only the ! versions),

I might ultimately do that, but there are no plans at the moment. I think the ! version is what most people would use most of the time, i.e. crash on transient errors (after N retries) or on some serious misconfiguration (e.g. of SSL options). At that point there isn’t much we can do at the call site anyway, so we might as well crash. I’ll definitely keep this in mind though.

Would be nice to be able to abstract away the adapter: here I need to know the adapter and the possible errors as defined by the adapter; changing the adapter requires changing the error handling code,

Do you have specific kinds of errors you want to differentiate between?

I think the default retries should be more aggressive: something like 2 attempts with just a few milliseconds of delay,

On the flip side, if we’re erroring because the service is under load, retrying again in a few milliseconds won’t help and in fact will make things worse. I went with the backoff values (1s, 2s, 4s, 8s, …) that curl has. It is very easy to provide your own retry strategy, fwiw.
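As a minimal sketch of what a custom strategy could look like, the function below reproduces the 1s, 2s, 4s, 8s sequence. It assumes the delay function receives a zero-based retry count and returns milliseconds; the name `backoff` is made up here.

```elixir
# Exponential backoff in milliseconds for retry counts 0, 1, 2, 3, ...
# Assumption: the count passed in starts at 0 for the first retry.
backoff = fn retry_count -> Integer.pow(2, retry_count) * 1000 end

IO.inspect(Enum.map(0..3, backoff))

# Possible usage, assuming the :retry_delay option accepts such a function:
#   Req.get!("https://httpbin.org/status/500,200", retry_delay: backoff)
```

Swapping in a different curve (linear, capped, jittered) only means changing the body of the function.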

I don’t think that retrying non-existing domain error is a sensible default.

Agreed. Looking at it again, retrying on transport errors (which, to your point, are adapter specific) needs rethinking. curl handles some of 4xx/5xx by default plus opts-in to handling ECONNREFUSED with --retry-connrefused and that seems sensible enough. I will look into that.

As a step “consumer” it would be nice to be able to ignore the difference between request, response and error steps,

Not sure what you mean by this, can you elaborate?

It’s unclear how request options relate to step options. Personally I’d find it clearer if Req.request(foo: [bar: 1]) meant that [bar: 1] are the options for the :foo step.

They don’t relate, it is arbitrary. I actually started with having options be named after steps and nested, but found it a bit too verbose at times, e.g.: follow_redirects: [max_redirects: 10]. Besides this, there are options that affect multiple steps, for example raw: true would disable both decompress_body and decode_body. Overall I’m pretty happy with the “flatter” options but totally understand the concern. Curious what others think.

stefanchrobot · June 22, 2022, 6:52am

Not sure really, I just don’t like the fact that it’s an opaque term. That’s an issue that I have with a lot of HTTP clients and I’d like to see an improvement in this area. Can I log the error? It might contain sensitive information. I could pattern match, but then what happens when I switch adapters? Is it always going to be a map with a :reason key?

Makes sense. From my experience the retries usually address random network errors, so short retries are better. But I guess there’s no universal answer to this.

Would be nice to be able to just say append_steps instead of append_*_steps since I guess most of the time the step already implies where it should be.

Unless I’d learn the steps and all the config options more or less by heart, the fact that they are arbitrary makes it confusing. I don’t like when things are arbitrary or too implicit. I think there are ways to address the verbosity - the step name implies what the settings do, so you could accept all of the following:

follow_redirects: [max: 10],
follow_redirects: true,
follow_redirects: 10.
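A rough sketch of how the three shapes above could be normalized into one internal form. This is not something Req does: the `RedirectOptions` module is hypothetical, and the default of 10 is an assumption borrowed from the max_redirects: 10 example earlier in the thread.

```elixir
# Hypothetical normalizer for a per-step option; not part of Req.
defmodule RedirectOptions do
  # Assumed default, taken from the `max_redirects: 10` example in the thread.
  @default_max 10

  # `true` enables the step with defaults, `false` turns it off.
  def normalize(true), do: [max: @default_max]
  def normalize(false), do: :disabled

  # A bare integer is shorthand for the maximum number of redirects.
  def normalize(max) when is_integer(max) and max >= 0, do: [max: max]

  # A keyword list is taken as-is, with the default filled in when missing.
  def normalize(opts) when is_list(opts), do: Keyword.put_new(opts, :max, @default_max)
end

IO.inspect(RedirectOptions.normalize(true))
IO.inspect(RedirectOptions.normalize(3))
IO.inspect(RedirectOptions.normalize(max: 5))
```

The cost of this flexibility is that every step needs its own normalizer, which is part of the verbosity trade-off being discussed.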

As for options that affect multiple steps, I’m not sure that’s the optimal design. Can I decompress the body without decoding it?

hubertlepicki · June 22, 2022, 7:08am

This is very nice. How hard would it be, in your opinion, to add support for streaming request bodies and also streaming responses? I have a pretty specific use case where I generate files on the fly that I need to send out, and I don’t want to store them at all, but they can be fairly large (> 1GB). Currently I generate an Elixir Stream and feed it as the request body to Finch (I added that part to the lib; see the Finch v0.12.0 docs), and when I receive the files on the other end I also turn them into an Elixir stream and consume it line by line. Unfortunately HTTP is all I have to talk between the two apps, but so far it’s been working great with Finch.

wojtekmach · June 22, 2022, 8:14am

What do you mean by opaque term? I believe it is pretty concrete. For Req by default it’s Finch.Error, Mint.HTTPError, or Mint.TransportError. You’re totally right that if you switch adapters you’d get different errors, so if you have error handling code, it would need to be updated. But what I want to figure out is what kind of error handling you actually have. Because in my experience if I get any of these errors I cannot do anything with them anyway, I cannot recover from them, so the sensible thing is just to crash.

Perhaps I should put a bit less emphasis on being able to switch adapters because honestly I don’t see the point. Finch is great. To me switching adapters is only useful in tests.

Would be nice to be able to just say append_steps instead of append_*_steps since I guess most of the time the step already implies where it should be.

Gotcha, sorry, that is not possible. We have three buckets (request, response, and error steps), so when we add something we need to know what type of thing it is; we cannot infer it.

Unless I’d learn the steps and all the config options more or less by heart, the fact that they are arbitrary makes it confusing.

I’m not sure if it addresses your concern but fwiw all the available options for built-in steps are documented in a single place: Req.request/1 options.

Can I decompress the body without decoding it?

Yes, you can set decode_body: false.

wojtekmach · June 22, 2022, 8:22am

Thanks! Streaming the request body is trivial as Finch already does it (thank you for adding it!) but streaming the response is pretty tricky. See https://github.com/wojtekmach/req/issues/82 for some discussion. It’s definitely on my mind and until we have it, I won’t consider Req complete, but there are no concrete plans at the moment, unfortunately.

stefanchrobot · June 22, 2022, 8:42am

You know the errors because you’re the author. As a user, I can only see this:

@spec request(Req.Request.t() | keyword()) :: {:ok, Req.Response.t()} | {:error, Exception.t()}

Not as opaque as hackney:

request(URL::url() | binary() | list()) -> {ok, integer(), list(), client_ref()} | {ok, integer(), list()} | {error, term()}

but still not very useful. Ideally this would be {:error, Req.Error.t()} with a well-defined set of possible values.

In my use case those errors are expected (e.g. mistyped URL), so I don’t want to crash.

Supporting only one adapter is perfectly fine for me - I was pretty happy with HTTPoison at some point. Support for replacing the adapter for tests is very important though.

wojtekmach · June 22, 2022, 9:14am

I’m skeptical about adding a Req.Error because I think it would be inferior to e.g. Mint.TransportError (it has a nice Exception.message/1 callback implementation) and I’m not sure I can reliably keep them in sync. I’m also skeptical about adding a strict error contract, one that would be useful for control flow, because I don’t think errors should be used for control flow. (I’m kind of doing that in the retry step and maybe that’s my mistake.) I don’t have an answer for it but I’ll keep at it. Thanks for bringing this up!

tangui · June 22, 2022, 10:00am

One use case I came across recently in favor of standardized HTTP errors: HTTP caching. When the origin is unreachable (the cache is disconnected), the cache is allowed to serve stale content even if there is no explicit directive allowing it (the max-stale cache control directive, for instance).

I’ve tested with a few libraries, the situation when the host is unreachable is the following:

:httpc: {:error, :econnrefused}
Gun and Hackney: {:error, :timeout} (this one is pretty terrible, as :timeout could be the server taking too long to respond)
Ibrowse: {:error, :nxdomain}
Mint & Finch: {:error, %Mint.TransportError{}}

IMO that’s an argument in favor of keeping errors in sync at the HTTP library level, because otherwise this has to be done by the user, which has even less knowledge about HTTP adapters.
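To illustrate what doing this by hand looks like, here is a sketch that maps the error shapes listed above onto a small shared vocabulary. Everything in it is hypothetical: `ErrorKind`, the `:unreachable`/`:timeout`/`:unknown` atoms, and `StandInTransportError`, which only imitates an exception struct with a `reason` field so the example runs without Mint installed.

```elixir
# Hypothetical stand-in for an adapter exception that carries a `reason` field.
defmodule StandInTransportError do
  defexception [:reason]

  @impl true
  def message(%{reason: reason}), do: "transport error: #{inspect(reason)}"
end

# Hypothetical classifier; the result atoms are made up for this sketch.
defmodule ErrorKind do
  # Exception structs with a `reason` field are unwrapped and classified again.
  def classify(%{__exception__: true, reason: reason}), do: classify(reason)
  def classify(:nxdomain), do: :unreachable
  def classify(:econnrefused), do: :unreachable
  # A bare :timeout stays ambiguous: the host may be down or merely slow.
  def classify(:timeout), do: :timeout
  def classify(_other), do: :unknown
end

IO.inspect(ErrorKind.classify(:econnrefused))
IO.inspect(ErrorKind.classify(StandInTransportError.exception(reason: :nxdomain)))
```

A mapping like this has to be revisited whenever the adapter changes, which is the maintenance burden being argued over.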

hubertlepicki · June 22, 2022, 10:46am

oh I actually have an implementation I use for streaming response body and I can probably share… will post in the thread on GitHub

Edit: posted a link to this gist: http_streamer.ex · GitHub

kwando · June 22, 2022, 9:09pm

I really like Req; the only thing that I feel uneasy about is this:

Note pools are not automatically terminated by default

Which is taken from the finch docs, I have not bothered checking how/if it applies to Req though.

wojtekmach · June 23, 2022, 4:34am

It definitely applies to Req. I believe there were maybe plans to automatically terminate stale pools in Finch. If not, we will consider adding it to Req.

tj0 · June 23, 2022, 7:56am

I was checking out the retry logic and was going to write a one-liner for backoff with jitter, but looks like you already have it. That as a default would be super-cool tbh.

delay = fn n -> trunc(Integer.pow(2, n) * 1000 * (1 - 0.1 * :rand.uniform())) end
Req.get!("https://httpbin.org/status/500,200", retry_delay: delay).status
# 08:43:19.101 [error] retry: got response with status 500, will retry in 941ms, 2 attempts left
# 08:43:22.958 [error] retry: got response with status 500, will retry in 1877ms, 1 attempt left
200

And for those that are unfamiliar, amazon has a nice explanation and results.

wojtekmach · June 23, 2022, 9:56am

Thank you for sharing this. If you’d like to send a PR with jitter enabled by default I’ll happily merge it. If we’d need to store some state around, that’s an option too. I think the contract would be something like this:

retry_delay: {fn attempt, acc -> {delay, new_acc} end, initial_acc}
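A sketch of how such a `{fun, initial_acc}` pair could be driven. The contract above is only a proposal, so nothing here reflects actual Req behaviour; `RetryDelaySketch` and `capped_doubling` are hypothetical names.

```elixir
# Hypothetical driver for the proposed {fun, initial_acc} contract.
defmodule RetryDelaySketch do
  # Returns the delays produced for `attempts` consecutive retries.
  # Assumes attempts >= 1 and a zero-based attempt counter.
  def delays({fun, initial_acc}, attempts) when attempts >= 1 do
    {delays, _final_acc} =
      Enum.map_reduce(0..(attempts - 1), initial_acc, fn attempt, acc ->
        # The function returns {delay, new_acc}, which is exactly the shape
        # Enum.map_reduce/3 expects from its callback.
        fun.(attempt, acc)
      end)

    delays
  end
end

# The accumulator holds the previous delay; each retry doubles it up to a cap.
capped_doubling = fn _attempt, previous ->
  delay = min(previous * 2, 8_000)
  {delay, delay}
end

IO.inspect(RetryDelaySketch.delays({capped_doubling, 500}, 5))
```

Carrying the previous delay in the accumulator is what strategies such as decorrelated jitter would need, since they derive the next delay from the last one rather than from the attempt number alone.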

adamu · June 23, 2022, 1:32pm

One thing I like about Tesla is that you can log out something similar to the raw HTTP request with Tesla.Middleware.Logger. That was the killer for me last time I looked into Req - but it could just be a step that’s not been written yet / I didn’t know about?

wojtekmach · June 23, 2022, 2:14pm

Yeah there’s no built-in feature like that. It is pretty easy to add something basic:

Mix.install([
  {:req, "~> 0.3"}
])

require Logger

Req.new()
|> Req.Request.append_request_steps(
  log_request: fn request ->
    method = request.method |> Atom.to_string() |> String.upcase()
    Logger.debug("#{method} #{request.url}")
    request
  end
)
|> Req.get!(url: "https://elixir-lang.org")

Outputs:

15:52:25.679 [debug] GET https://elixir-lang.org

This is actually a good example of tradeoffs. I’d argue Req steps are conceptually much simpler than Tesla middleware but the latter is definitely more powerful. It wraps the entire request/response lifecycle so it has access to, well, both request and response and timing information.

It is definitely possible to achieve this with a combination of request, response, and error steps. Here’s a sketch (I left out the error step for brevity):

Mix.install([
  {:req, "~> 0.3"}
])

require Logger

Req.new()
|> Req.Request.prepend_request_steps(
  log_request: fn request ->
    method = request.method |> Atom.to_string() |> String.upcase()
    time1 = System.monotonic_time()

    Req.Request.append_response_steps(request,
      log_request: fn {request, response} ->
        time2 = System.monotonic_time()
        ms = System.convert_time_unit(time2 - time1, :native, :millisecond)
        Logger.debug("#{method} #{request.url} => #{response.status} (#{ms}ms)")
        {request, response}
      end
    )
  end
)
|> Req.get!(url: "https://elixir-lang.org")

Outputs:

16:11:21.703 [debug] GET https://elixir-lang.org => 200 (311ms)
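For completeness, a possible shape for the error step that was left out. It assumes an error step receives `{request, exception}` and returns the same tuple. The `fake_request` and `fake_error` values are stand-ins so the function can be tried without Req or a network call.

```elixir
require Logger

# Assumption: an error step takes {request, exception} and returns that tuple.
log_error = fn {request, exception} ->
  method = request.method |> Atom.to_string() |> String.upcase()
  Logger.error("#{method} #{request.url} => #{Exception.message(exception)}")
  {request, exception}
end

# Stand-in data (hypothetical) so the step can be exercised on its own.
fake_request = %{method: :get, url: URI.parse("https://bad.domain")}
fake_error = RuntimeError.exception("non-existing domain")

result = log_error.({fake_request, fake_error})

# With Req loaded, the step could be attached along the lines of:
#   Req.Request.append_error_steps(req, log_error: log_error)
```

The step passes the tuple through unchanged, so any later error steps, such as a retry, would still see the original exception.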

sergio · June 23, 2022, 6:21pm

Amazing, sensible defaults FTW. This would be great!