How to read request body multiple times during request handling?

stefan · February 24, 2017, 10:16pm

Hi,

I’m new to Elixir and Phoenix and want to build a small microservice to provide a small HMAC protected json api. For the HMAC verification I need to calculate the hash of the request body.
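For reference, a minimal sketch of the hash calculation itself, assuming HMAC-SHA256 with a hex-encoded digest (the exact scheme is not stated here). The module name `HmacSketch` and its functions are hypothetical, and `:crypto.mac/4` needs OTP 22.1 or later. In a Plug application, `Plug.Crypto.secure_compare/2` can replace the hand-written comparison.

```elixir
defmodule HmacSketch do
  @moduledoc "Hypothetical helper for illustration; not part of Plug or Phoenix."

  # Hex-encoded HMAC-SHA256 of the raw request body.
  def sign(secret, body) when is_binary(secret) and is_binary(body) do
    :crypto.mac(:hmac, :sha256, secret, body)
    |> Base.encode16(case: :lower)
  end

  # True when `expected` matches the digest computed from `secret` and `body`.
  def valid?(secret, body, expected) when is_binary(expected) do
    actual = sign(secret, body)
    byte_size(actual) == byte_size(expected) and constant_time_equal?(actual, expected)
  end

  # Walks every byte instead of stopping at the first mismatch,
  # so the comparison time does not reveal where the digests differ.
  defp constant_time_equal?(a, b) do
    a
    |> :binary.bin_to_list()
    |> Enum.zip(:binary.bin_to_list(b))
    |> Enum.reduce(0, fn {x, y}, acc -> :erlang.bor(acc, :erlang.bxor(x, y)) end)
    |> Kernel.==(0)
  end
end
```

The important detail is that `body` has to be the exact bytes that were received, which is why the body needs to be read before any parser touches it.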

I can read the body with the plug function “read_body(conn, opts)” and calculate the hash. But in the documentation and several other resources it is said that I can only read the body once (which makes sense to me because the data is read from the underlying tcp connection).

https://hexdocs.pm/plug/Plug.Conn.html#read_body/2

Because the request body can be of any size, reading the body will only work once, as Plug will not cache the result of these operations. If you need to access the body multiple times, it is your responsibility to store it. Finally keep in mind some plugs like Plug.Parsers may read the body, so the body may be unavailable after being accessed by such plugs.

I’m a little confused, because when I invoke “read_body(…)” multiple times in the plug pipeline, I always get the correct body content. Is this correct?

And if calling “read_body” more than once is error-prone, how can I read the body, store it with the connection and call other plugs (e.g. parse-json) which read the body as well? The documentation tells me to do so, but I don’t know how to implement this.

OvermindDL1 · February 24, 2017, 10:22pm

That is just an artifact of the current adapter and request size. Theoretically, if the body is bigger than (probably) the MTU, it will not work, though I’ve not looked at the code to confirm.

The correct way would be to make your own Plug (go to the ‘Examples’ section; the example for the ‘module plug’ is the most reusable way).
Just read the entire body in and store it in a named assign on the conn. You might want to add a few checks to make sure it is below a limited size to prevent blowing memory and such though (people can send you whatever they want, not necessarily what you expect ;-)).
But then later plugs, or the controller, can just read that named conn.assigns.whatever and grab the cached body.

stefan · February 25, 2017, 11:12am

Hi, Thanks for your reply.

I’ve looked into the Plug documentation. As you said, I could manage to write a plug which reads the body and assigns it to the connection. Then all my other plugs can access the body.

But what about third-party plugs, e.g. Plug.Parsers.JSON? I think it’s a great module and I’d like to use it to parse the body content into json (after I’ve done my HMAC check). But unfortunately Plug.Parsers.JSON seems to use “read_body”, too.

https://github.com/elixir-lang/plug/blob/v1.3.0/lib/plug/parsers/json.ex

Of course, I can copy the parser code and replace just the one line where the body is read. But is there a better way?

I’ve seen that the connection holds an adapter to the underlying cowboy server which handles the read_body call. Maybe I could replace this adapter with some sort of proxy adapter (sorry, I’ve a Java background) which caches the body, so that only the first call actually reads the body and the other ones access the cached information…
But with this approach (if it’s even possible?), I’d change private fields of the connection which are not part of the public plug API…

josevalim · February 25, 2017, 11:22am

Since you are reading the body, everything else in Plug.Parsers.JSON you have already handled by definition. All that is left to add to your code are the last 5 lines or so of Plug.Parsers.JSON: https://github.com/elixir-lang/plug/blob/v1.3.0/lib/plug/parsers/json.ex#L47-L52

The reason why Plug does not store the request body in the connection is because Plug excels at holding multiple connections at the same time and that’s not going to be efficient if every connection is holding its request body. So the goal is to parse it and discard it straight away. However, it is straight-forward to add a custom plug or a custom parser that will keep it around.

stefan · February 25, 2017, 5:16pm

Hi,

thanks for the input. I think I now know enough to implement what I need. I agree with you: storing the body in the connection every time uses too many resources. But unfortunately I have to use this HMAC mechanism (company policy)… since it will be an internal service with not too many requests per day (!), this could be feasible.

I’m a little sad that I have to use my own json parser plug (or combine it with my HMAC plug) - even if it’s only five lines - but since my use case is not very common, this is fine for me.

Thank you very much

ryanwinchester · February 24, 2018, 7:07am

I’m hitting this issue trying to verify Stripe webhooks:

…

Step 2: Prepare the signed_payload string

You achieve this by concatenating:

The timestamp (as a string)

The character .

The actual JSON payload (i.e., the request’s body)

…
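The quoted step can be sketched roughly as follows. This assumes the signature is a hex-encoded HMAC-SHA256 over the signed payload, which the excerpt above does not spell out; `SignedPayloadSketch` and its function names are made up for illustration.

```elixir
defmodule SignedPayloadSketch do
  @moduledoc "Hypothetical helper for illustration only."

  # timestamp <> "." <> raw body, exactly as listed in the quoted step.
  # raw_body must be the bytes as received, not JSON that was decoded and re-encoded.
  def signed_payload(timestamp, raw_body) when is_binary(raw_body) do
    "#{timestamp}.#{raw_body}"
  end

  # Assumption: hex-encoded HMAC-SHA256 keyed with the webhook secret.
  def expected_signature(secret, timestamp, raw_body) do
    :crypto.mac(:hmac, :sha256, secret, signed_payload(timestamp, raw_body))
    |> Base.encode16(case: :lower)
  end
end
```

Since the raw body is an input to the signature, it has to be captured before a parser consumes it.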

When I try to read the request body in my verifier plug, it is already empty.

The API routes are getting around 2000 requests per second, so I don’t want to go and do something that’s “not going to be efficient”. Is there a way I can read the request body and isolate any inefficiency to just the webhooks scope?

outlog · February 24, 2018, 7:49am

I used something like this recently (though not for stripe):

though halting in the plug somehow crashed my ngrok tunnel on my dev box - so ended up json decoding and putting a :verified assign on the conn and then passing it to a controller… kinda prefer having things in the controller anyways…

  def some_callback(%{assigns: %{verified: false}} = conn, _params) do
    send_resp(conn, 401, "")
  end

  def some_callback(%{assigns: %{verified: true, json: json}} = conn, params) do
    ..
  end

in the plug:

  defp verify_signature(conn, opts) do
    with [request_signature | _] <- get_req_header(conn, opts[:header]),
         secret when not is_nil(secret) <- opts[:secret],
         {:ok, body, new_conn} <- Plug.Conn.read_body(conn),
         # :crypto.hmac/3 was removed in OTP 24; :crypto.mac/4 replaces it (OTP 22.1+)
         signature =
           :crypto.mac(:hmac, :sha256, secret, body)
           |> Base.encode16(case: :lower),
         true <- Plug.Crypto.secure_compare(signature, request_signature) do
      # handle_webhook(conn, body)
      json = Poison.decode!(body)

      # return the conn from read_body/1 with both assigns set
      new_conn
      |> assign(:verified, true)
      |> assign(:json, json)
    else
      # opts[:secret] was nil
      nil ->
        Logger.error(fn -> "Webhook secret is not set" end)
        assign(conn, :verified, false)

      # secure_compare/2 returned false
      false ->
        Logger.error(fn -> "Received webhook with invalid signature" end)
        assign(conn, :verified, false)

      # missing header, or read_body/1 returned {:more, ...} or {:error, ...}
      _ ->
        assign(conn, :verified, false)
    end
  end

let me know if you have any questions…

ryanwinchester · February 24, 2018, 8:10pm

Is this a good way to do it?

If I add this above Plug.Parsers in my Endpoint?

defmodule MyApp.WebhookPayloads do
  import Plug.Conn, only: [assign: 3, read_body: 1]

  def init(opts), do: opts

  def call(conn, _opts) do
    case conn.path_info do
      ["webhooks" | _] -> add_payload(conn)
      _path_info       -> conn
    end
  end

  defp add_payload(conn) do
    # keep the conn returned by read_body/1 instead of discarding it
    {:ok, payload, conn} = read_body(conn)
    assign(conn, :payload, payload)
  end
end

outlog · February 26, 2018, 10:40am

if this is a good pattern match you should be good to go… (I match on conn.request_path)

stefan · February 26, 2018, 1:29pm

Hi,

since I’m the one who created this thread, maybe you are interested in my solution. Although I’ve written some small microservices in Elixir, I still see myself as an Elixir beginner.

Here is the plug I use to store the request body in the connection:

defmodule PlugStoreBody do
  import Plug.Conn
  @moduledoc false

  @behaviour Plug
  @methods ~w(POST PUT PATCH DELETE)

  def init(options), do: options

  def call(conn, options) do
    if conn.method in @methods do
      case read_body(conn, options) do
        {:error, :timeout} ->
          raise Plug.TimeoutError
        {:error, _} ->
          raise Plug.BadRequestError
        {:more, _, conn} ->
          raise PayloadTooLargeError, conn: conn, router: __MODULE__
        {:ok, "", conn} ->
          conn
          |> assign(:req_body, {:ok, nil})
        {:ok, data, conn} ->
          conn
          |> assign(:req_body, {:ok, data})
      end
    else
      conn
      |> assign(:req_body, {:ok, nil})
    end
  end
end

In the Phoenix pipeline definition, this looks like this:

  pipeline :my_api do
    plug(PlugStoreBody)
    plug(PlugHmacAuthentication)
    plug(
      Plug.Parsers,
      parsers: [JsonParser],
      pass: ["application/my.fancy.type+json"],
      json_decoder: Poison
    )    
  end

The plug “PlugHmacAuthentication” and the “JsonParser” read the body again via

case conn.assigns[:req_body] do
  {:ok, data} ->
    # do something with the body...
    data
  _ ->
    raise ArgumentError, "Body has not been stored in the connection at conn.assigns[:req_body]"
end

This is not the nicest way, because other plugs might depend on calling “read_body(conn)” again, which fails. But for me, it works.

tanweerdev · April 13, 2020, 8:09am

How do you tell Plug.Parsers to read the body from conn.assigns[:req_body]?

stefan · April 13, 2020, 6:03pm

Oh… this was a long time ago. I’ve had to look into the code. To make a long story short: I don’t tell Plug.Parsers anything.

    plug(
      Plug.Parsers,
      parsers: [JsonParser],
      pass: ["application/my.fancy.type+json"],
      json_decoder: Poison
    )

As you see above, I’m passing the module JsonParser to Plug.Parsers… and JsonParser is a local module in my codebase. So I’ve got full control over where JsonParser reads its data, and it uses conn.assigns[:req_body] to access it. It is merely a wrapper around the :json_decoder - in this case Poison.

I remember that I wasn’t very happy that I had to write a few lines of glue code. But IMAO reading the request body multiple times during the request handling is not the best idea in the first place. Every web framework I know has problems dealing with this. I was forced to do so by the HMAC authentication - nowadays I wouldn’t use that anymore.
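For anyone landing here later: newer Plug releases give Plug.Parsers a `:body_reader` option, an MFA that is called in place of `Plug.Conn.read_body/2`, so the stock JSON parser can be kept. Below is a rough sketch modelled on the example in the Plug.Parsers docs. The `:raw_body` assign name, the `raw_body/1` helper and the optional third `reader` argument (there only so the module can be exercised without a real connection) are my own assumptions, not Plug API.

```elixir
defmodule CacheBodyReader do
  @moduledoc "Sketch of a caching body reader for the Plug.Parsers :body_reader option."

  # Same contract as Plug.Conn.read_body/2, but keeps a copy of every chunk read.
  # `reader` defaults to the real Plug function; it is injectable for testing only.
  def read_body(conn, opts, reader \\ &Plug.Conn.read_body/2) do
    case reader.(conn, opts) do
      {:ok, body, conn} -> {:ok, body, cache(conn, body)}
      {:more, chunk, conn} -> {:more, chunk, cache(conn, chunk)}
      {:error, _reason} = error -> error
    end
  end

  # Reassembles the cached chunks into one binary ("" if nothing was cached).
  def raw_body(conn) do
    conn.assigns[:raw_body]
    |> List.wrap()
    |> Enum.reverse()
    |> IO.iodata_to_binary()
  end

  # Chunks are prepended, so the list is in reverse order of arrival.
  defp cache(conn, chunk) do
    update_in(conn.assigns[:raw_body], &[chunk | (&1 || [])])
  end
end

# Wiring it up (in an endpoint or pipeline):
#
#   plug Plug.Parsers,
#     parsers: [:json],
#     pass: ["application/my.fancy.type+json"],
#     body_reader: {CacheBodyReader, :read_body, []},
#     json_decoder: Poison
```

A plug placed after Plug.Parsers can then call `CacheBodyReader.raw_body(conn)` for the HMAC check. The trade-off mentioned earlier in the thread still applies: every connection that goes through this parser holds on to its body, so it is probably best limited to the routes that need it.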
