Not getting response body with :gun

Background

Some time ago someone in this wonderful community suggested I use gun as an HTTP client, since I was having severe issues with HTTPoison and, later on, with HTTPotion as well (they didn’t scale well enough).

Code

To fix it, we moved our solution to the asynchronous HTTP client mentioned above: gun.
gun is an Erlang library that reports events to the process that opened the connection by sending it plain messages. To use gun I have therefore built a primitive GenServer client that prints everything it receives to the console:

defmodule ConnPoolWorker do
  use GenServer
  alias ProcessRegistry
  alias :gun, as: Gun

  @url_domain 'google.com'
  @https_port 443

  def start_link({worker_id}) do
    IO.puts("Starting worker #{worker_id}")

    GenServer.start_link(
      __MODULE__,
      nil,
      name: via_tuple(worker_id)
    )
  end

  ## Public API

  # Makes a GET request via gun
  def fire(worker_id, url) do
    GenServer.cast(via_tuple(worker_id), {:fire, url})
  end
  
  ## Implementation

  defp via_tuple(worker_id) do
    ProcessRegistry.via_tuple({__MODULE__, worker_id})
  end

  @impl GenServer
  def init(_args) do
    {:ok, conn_pid} = Gun.open(@url_domain, @https_port)
    {:ok, _protocol} = Gun.await_up(conn_pid)
    {:ok, conn_pid}
  end

  @impl GenServer
  def handle_cast({:fire, url}, conn_pid) do
    Gun.get(conn_pid, url)
    {:noreply, conn_pid}
  end

  # handle_info everything else
  @impl GenServer
  def handle_info(msg, state) do
    IO.puts("MSG: #{inspect msg}")
    {:noreply, state}
  end
end

This client is registered in the Registry using via_tuples, but I don’t think that is majorly important for now.

Problem

This code works. It opens a connection, waits for it to be up, and if you invoke fire to make a request (get it? because the library is called gun? :smiley: ) you get a :gun_response event that handle_info picks up.

However, that’s precisely the issue. It’s the only event the process ever picks up. This process gets no other events like :gun_data or :gun_trailers, even though the documentation says it should.

The only thing I get every now and then is a :gun_down (connection down) followed by a :gun_up (connection up) which is normal:

MSG: {:gun_response, #PID<0.214.0>, #Reference<0.1474087065.2991849473.15656>, :fin, 302, [{"server", "nginx"}, {"date", "Wed, 20 Feb 2019 11:55:14 GMT"}, {"content-length", "0"}, {"connection", "keep-alive"}, {"location", "http://www.sapo.pt/noticias/"}, {"strict-transport-security", "max-age=31536000"}, {"x-content-type-options", "nosniff"}, {"content-security-policy", "upgrade-insecure-requests; block-all-mixed-content"}, {"x-xss-protection", "1; mode=block"}, {"referrer-policy", "origin-when-cross-origin"}]}
MSG: {:gun_down, #PID<0.221.0>, :http, :closed, [], []}
MSG: {:gun_down, #PID<0.224.0>, :http, :closed, [], []}

Question

  1. Am I doing something wrong with my GenServer client?
  2. Is this working as intended or am I missing something?
  3. How can I get the rest of the data from the request?

Don’t you need to do something with await_body?

https://ninenines.eu/docs/en/gun/1.3/manual/gun.await_body/

I found the issue. Turns out I was getting the full response:

{:gun_response, #PID<0.214.0>, #Reference<0.1474087065.2991849473.15656>, :fin, ....}

The :fin flag means this message completes the response: there is no body, so no :gun_data or :gun_trailers messages will follow. So both :gun and my server are working as intended, as I am getting a 302 redirect.
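For a response that does carry a body, gun sends :gun_response flagged :nofin and then one or more :gun_data messages, the last of which is flagged :fin. Below is a minimal sketch of handle_info clauses that collect such a body. The BodyCollector module and its finished/1 function are hypothetical names made up for illustration, and the gun messages are simulated with send/2 rather than coming from a real connection.

```elixir
defmodule BodyCollector do
  use GenServer

  # Hypothetical helper: state tracks in-flight and completed responses by stream ref.
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Returns %{stream_ref => {status, body}} for every completed response.
  def finished(pid), do: GenServer.call(pid, :finished)

  @impl GenServer
  def init(:ok), do: {:ok, %{pending: %{}, finished: %{}}}

  @impl GenServer
  def handle_call(:finished, _from, state), do: {:reply, state.finished, state}

  # Headers only, no body (like the 302 above): the response is already complete.
  @impl GenServer
  def handle_info({:gun_response, _conn, ref, :fin, status, _headers}, state) do
    {:noreply, %{state | finished: Map.put(state.finished, ref, {status, ""})}}
  end

  # Headers with a body to follow: start collecting chunks for this stream.
  def handle_info({:gun_response, _conn, ref, :nofin, status, _headers}, state) do
    {:noreply, %{state | pending: Map.put(state.pending, ref, {status, []})}}
  end

  # Intermediate body chunk: append it to the accumulated iodata.
  def handle_info({:gun_data, _conn, ref, :nofin, chunk}, state) do
    {status, acc} = Map.fetch!(state.pending, ref)
    {:noreply, %{state | pending: Map.put(state.pending, ref, {status, [acc, chunk]})}}
  end

  # Final body chunk: assemble the body and move the stream to finished.
  def handle_info({:gun_data, _conn, ref, :fin, chunk}, state) do
    {{status, acc}, pending} = Map.pop(state.pending, ref)
    body = IO.iodata_to_binary([acc, chunk])
    {:noreply, %{state | pending: pending, finished: Map.put(state.finished, ref, {status, body})}}
  end

  # Anything else (:gun_up, :gun_down, ...) is ignored in this sketch.
  def handle_info(_other, state), do: {:noreply, state}
end

# Simulated gun messages for a 200 response with a two-chunk body.
{:ok, pid} = BodyCollector.start_link()
ref = make_ref()
send(pid, {:gun_response, self(), ref, :nofin, 200, []})
send(pid, {:gun_data, self(), ref, :nofin, "hello "})
send(pid, {:gun_data, self(), ref, :fin, "world"})
collected = BodyCollector.finished(pid)
```

In the worker above, the same clauses would sit in front of the catch-all handle_info, keyed on the stream reference that Gun.get returns.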


I would strongly encourage moving the connection building out of your init into a handle_continue if you will be spooling off a lot of these in rapid succession, which feels likely for an HTTP client wrapper. init is blocking until it returns, which means anything calling start_link on this module is blocking too.


Excellent idea. I am aware of this flaw, however:

  1. I only create processes when I start the application. I don’t really care if it takes 5 seconds or 5 minutes. It’s a cost I only pay once (unless ALL workers crash, because the supervisor decided to go nuclear. I find this unlikely).
  2. I don’t yet fully understand the consequences of handle_continue. I don’t know how it affects the runtime cycle of init and most importantly I don’t know if it is a good idea to tell the Application that my worker processes are ready when in reality they are still opening the connection. This means my workers would very likely receive requests which would fail because the connections are still not open or not up.

I do have in mind exploring this concept later on, however, as I understand this is actually the standard way of doing costly work in the init function. Right now, though, I am just looking to build a proof of concept that works.

Thanks for the info!
Will surely take your advice to heart and use it!

I think in your case continue could work fine because you’re instantiating independent GenServers that then receive messages to do their work, although doing the bootstrap in init is probably better in this particular case (I say this without knowing the rest of the system).

Still, to understand how {:continue, ...} works you need to look at the process mailbox that Erlang uses.

Say you have a GenServer “Server” and, concurrently, multiple processes in different parts of your system each send “Server” a message:

send(Server, :message_a) # sent by p1
send(Server, :message_b) # sent by p2
send(Server, :message_c) # sent by p3

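Now, the BEAM only guarantees ordering between a pair of processes: messages from one sender arrive in the order that sender sent them. Let’s assume that here they arrive in the order a, b, c (further caveats apply if the processes are spread across nodes rather than running on a single one).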

So “Server” has in its mailbox: [:message_a, :message_b, :message_c]

And it handles them one by one through handle_info(any, state), so the first matching clause will be called with message a, then b, then c.

If, for instance, you have a handler that continues, say for :message_b

def handle_info(:message_b, state), do: {:noreply, state, {:continue, :recalc}}

Going back to the same 3 messages, what happens is:

[:message_a, :message_b, :message_c]
Processed A
[:message_b, :message_c]
Processed B
:continue
[:continue_recalc, :message_c]
Processed handle_continue :recalc
[:message_c]
Processed C
[]

So it basically hijacks the mailbox and puts the continuation at the front of it, which guarantees that it runs before any other message is processed. That continuation handler can itself continue to something else, and if it does, that too is guaranteed to be processed before any other message.
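To see that ordering for yourself, here is a small self-contained sketch. ContinueDemo and its log/1 function are hypothetical names used only for this demo; the server simply records the order in which its callbacks run.

```elixir
defmodule ContinueDemo do
  use GenServer

  # Hypothetical demo module: the state is a (reversed) log of what was processed.
  def start_link(), do: GenServer.start_link(__MODULE__, [])

  # Returns the processing log in chronological order.
  def log(pid), do: GenServer.call(pid, :log)

  @impl GenServer
  def init(log), do: {:ok, log}

  @impl GenServer
  def handle_call(:log, _from, log), do: {:reply, Enum.reverse(log), log}

  # :message_b asks for a continuation, which runs before the next mailbox message.
  @impl GenServer
  def handle_info(:message_b, log) do
    {:noreply, [:message_b | log], {:continue, :recalc}}
  end

  def handle_info(msg, log), do: {:noreply, [msg | log]}

  @impl GenServer
  def handle_continue(:recalc, log), do: {:noreply, [{:continue, :recalc} | log]}
end

{:ok, pid} = ContinueDemo.start_link()

# All three messages come from the same sender, so they arrive in this order.
for msg <- [:message_a, :message_b, :message_c], do: send(pid, msg)

order = ContinueDemo.log(pid)
IO.inspect(order)
# prints [:message_a, :message_b, {:continue, :recalc}, :message_c]
```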

In your case, let’s say you start ConnPoolWorker.start_link({1}) in your supervision tree:

def init(_) do
   {:ok, :not_ready, {:continue, :open}}
end

Here init returns, the supervisor receives the :ok and continues starting the rest of the tree. Somewhere down the line other processes are started and begin sending fire requests, say 5000 of them. The mailbox now holds 5000 requests, but you’re guaranteed that the continue handler will be executed before any of them is processed.

def handle_continue(:open, _) do
   {:ok, conn_pid} = Gun.open(@url_domain, @https_port)
   {:ok, _protocol} = Gun.await_up(conn_pid)
   {:noreply, conn_pid}
end

So only when this handle continue executes are the remaining messages processed.
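Putting the two callbacks together, here is a runnable sketch of that flow. It does not use gun at all: LazyWorker, events/1 and the :fake_conn value are hypothetical, and Process.sleep/1 stands in for the Gun.open plus Gun.await_up calls, so this only demonstrates the ordering and not a real connection.

```elixir
defmodule LazyWorker do
  use GenServer

  # Hypothetical stand-in for ConnPoolWorker, without the registry or gun.
  def start_link(), do: GenServer.start_link(__MODULE__, nil)

  def fire(pid, url), do: GenServer.cast(pid, {:fire, url})

  # Returns what the worker did, in chronological order.
  def events(pid), do: GenServer.call(pid, :events)

  @impl GenServer
  def init(_args) do
    # Returns immediately, so whoever called start_link is not blocked.
    {:ok, %{conn: :not_ready, events: []}, {:continue, :open}}
  end

  @impl GenServer
  def handle_continue(:open, state) do
    # Assumption: this sleep simulates the slow connection setup.
    Process.sleep(100)
    {:noreply, %{state | conn: :fake_conn, events: [:connected | state.events]}}
  end

  @impl GenServer
  def handle_cast({:fire, url}, %{conn: conn} = state) do
    # Records which connection value the request saw.
    {:noreply, %{state | events: [{:fire, url, conn} | state.events]}}
  end

  @impl GenServer
  def handle_call(:events, _from, state) do
    {:reply, Enum.reverse(state.events), state}
  end
end

{:ok, pid} = LazyWorker.start_link()

# These casts are queued while the "connection" is still being opened.
LazyWorker.fire(pid, "/a")
LazyWorker.fire(pid, "/b")

events = LazyWorker.events(pid)
IO.inspect(events)
# prints [:connected, {:fire, "/a", :fake_conn}, {:fire, "/b", :fake_conn}]
```

No request ever sees :not_ready, because the continuation runs before the first cast is taken from the mailbox.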

Of course, if this handle_continue fails (say Gun.open errors out), the GenServer will crash, and the messages already sent and waiting in the process’ mailbox will be lost. If you instead open the connection in init, as you do now, the processes down the line cannot be started, let alone fire requests, unless init succeeds. So it might be better to simply do the bootstrap in init and prevent the app from going any further if the gun connections can’t be opened.

But if you don’t lose the “targets” until the requests have completed successfully, meaning that after a crash and restart the app/processes can fire the same requests again because they were never marked as done, there wouldn’t be a problem in using :continue.

One example where you probably don’t want to use :continue is when your GenServer sets up things that are not accessed through its own mailbox (say, you’re populating a big ETS table that is bootstrapped from a DB) and this ETS table is going to be read directly by other processes. In this case, using :continue to populate the ETS table could mean that other processes try to access the table before it’s fully populated, or even created, while populating it in init would guarantee that it’s ready before the supervision tree continues starting the other bits of your system that might interact with it. But even in this case it might be a reasonable tradeoff; it depends.
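As a rough sketch of the ETS case, assuming a hypothetical EagerCache module and table name, and a plain list of rows in place of the DB: because the table is created and filled inside init, it is readable as soon as start_link returns.

```elixir
defmodule EagerCache do
  use GenServer

  # Hypothetical table name, chosen only for this example.
  @table :eager_cache_demo

  def start_link(rows), do: GenServer.start_link(__MODULE__, rows)

  # Reads go straight to ETS and never touch the GenServer's mailbox.
  def lookup(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  @impl GenServer
  def init(rows) do
    # Created and populated before init returns, so start_link only
    # returns once the table is ready for readers.
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    :ets.insert(@table, rows)
    {:ok, nil}
  end
end

# Assumption: these rows stand in for data loaded from a DB.
{:ok, _pid} = EagerCache.start_link([{:a, 1}, {:b, 2}])

found = EagerCache.lookup(:a)
missing = EagerCache.lookup(:z)
```

Had the :ets.new and :ets.insert calls lived in a handle_continue instead, a lookup issued right after start_link could hit a table that does not exist yet.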


An excellent response that I have bookmarked for future reference!
