We have an app that deals with a considerable amount of requests per second. This app needs to notify an external service, by making a GET call via HTTPS to one of our servers.
Objective
The objective here is to use HTTPoison to make async GET requests. I don’t really care about the response of the requests, all I care is to know if they failed or not, so I can write any possible errors into a logger.
If it succeeds I don’t want to do anything.
Research
I have checked the official documentation for HTTPoison and I see that they support async requests:
They use flush to show the request was completed. I can’t loggin into the app and manually flush to see how the requests are going, that would be insane.
They don’t show any notifications mechanism for when we get the responses or errors.
So, I have a simple question:
How do I get asynchronously notified that my request failed or succeeded?
I assume that the default HTTPoison.get is synchronous, as shown in the documentation.
Interesting idea, but I am a little bit interested in the result, i.e., as I explained I would like to know if the request failed so I can log it.
Furthermore using Task means I wouldn’t be able to take advantage of the hackney pools HTTPoison uses, so I would have to create a Task (process) for each request instead of taking advantage of the pools already in place, which would result in far less efficient alternative.
Unlike in most languages in Elixir or Erlang it’s trivial for a function to be made asynchronous using processes (possibly using an abstraction such as Task). Because of this we don’t need libraries to implement async versions of the functions they offer- and often we would prefer them not to as one size does not fit all here.
The reason I mention why this is confusing is because receive is actually blocking. According to the documentation:
If there is no message in the mailbox matching any of the patterns, the current process will wait until a matching message arrives
I don’t want my process to wait for anything. I want it to go around doing it’s business and if it receives something, to act accordingly. I don’t want to block a process waiting for a response. In reality this would be a perfect case of fire and forget but since I need to know if something failed it is not 100% fire and forget.
So I hope you understand my confusion when you people tell me to use receive, which according to my understanding, actually blocks the process until it receives something.
I guess under this assumption Task.start would make more sense, since I could block the Task instead.
Not sure I agree with the philosophy, but the insight is surely appreciated! It helps me put into context the responses I have been getting thus far!
alias HTTPoison
defmodule Test.Poison do
def async_get(url) do
Task.start(fn ->
case HTTPoison.get(url, [], hackney: [pool: :first_pool], stream_to: self) do
response -> IO.inspect(response)
end
end)
end
end
I do have a question though. What is the difference between this approach of yours, and the one the rest of the people are recommending?
Task.start(fn ->
HTTPoison.get(url, [], hackney: [pool: :first_pool], stream_to: self)
receive do
response -> IO.inspect(response)
end
end)
Are they pretty much the same?
If I read this correctly: Erlang -- Processes new process needs 309 * 64 = 19776 bits of memory which is 2472 bytes (on 64 bit machine). I don’t know how much exactly a Linux or Windows new process or thread uses memory, but I believe it’s at least orders of magnitude more. What it boils down is that Erlang processes are very cheap, isolated, and BEAM cleans them when they finished they run, and OS processes are expensive. The overhead of creating new Erlang process to execute some code that has to be run asynchronously is negligible.
Donald Knuth wrote famous words
“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
I strongly believe that in your case (when we talk about writing in Erlang) having additional process (started by Task.start) when you make HTTPoision requests asynchronously goes into that 97%.
I’m not saying that there is no overhead, nor that you have to agree, but that IMHO in most cases the overhead is negligible, but allow writing much more readable code.
You can do “fire and forget” requests within the current process if you have another process that handles HTTPoison responses using stream_to option like this HTTPoison.get! "your_url", %{}, stream_to: handler_process
Using tasks in no way interferes with Hackney’s connection pools.
I think OP’s reaction has more to do with the fact that Elixir’s concurrency is so simple, that libraries leave it up to users to choose, rather than providing some explicit async call which hides a ton of complexity.
Thank you all for the answers and explanations. There seems to be a misunderstanding that I don’t agree with the path Elixir has taken to deal with Async requests, some even generalizing that I am not a fan of the Actor model inherent to Erlang.
Such is not true. The post that illustrates almost in perfection my reaction is in fact here:
I am not a seasoned Elixir / Erlang developer. I wish I was, but truth is I came into this strange world from a Functional Javascript background. In JS, as some of you may know, you either use Promises or (the better version) Futures (and no, I am not going to mention callbacks… leave me alone!). All libraries offer you a way to handle async behavior by default.
I thought Elixir was similar. I am happy to see it isn’t. Now that I have a better understanding on how to handle async behavior in Elixir I see how much more powerful (and dangerous :D) it is.
My quest for learning hasn’t stopped yet, but I think I got what I needed from here