:gen_tcp.recv(sock, 0) not returning all available bytes

There’s an exercise in the Pragmatic Studio Elixir course in which I need to render a form and post the data back to create some resource.
My API for creating a resource works correctly if I post to it from the command line using curl:

curl -X POST http://localhost:4000/pledges -d 'name=daisy&amount=400'

But whenever I post to it from my form in a browser page, it fails because the request is not parsed correctly. It cannot be parsed correctly because the data received from the client socket is incomplete.

It’s incomplete because I have installed many plugins in my browser and there are many headers and a very long cookie. I can reproduce the error by using a request that’s long enough:

curl 'http://localhost:4000/pledges' -X POST -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' -H 'Accept-Encoding: gzip, deflate, br' -H 'Content-Type: application/x-www-form-urlencoded' -H 'Origin: http://localhost:4000' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Referer: http://localhost:4000/pledges/new' -H 'DFF: 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111123231111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111232311111111111111111111111111111111111111111111111111111111111111111111111111111111111111111232311111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111123231111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111112323111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111232311111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111123231111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111' --data-raw 'name=df&amount=32'

But that’s weird, because we are using the following setup when listening:

:gen_tcp.listen(port, [:binary, packet: :raw, active: false, reuseaddr: true])

And the length is set to 0 when calling recv: :gen_tcp.recv(client_socket, 0).
According to the documentation, all available bytes should be returned when the length is 0 and the socket is in raw mode.

When I check the bit size of the truncated request, it’s always 11680 bits (1460 bytes).

This article seems to explain the limitation: Figuring out a gen_tcp:recv limitation :: Simon Zelazny's Blog

The available bytes in the socket will be returned, but that doesn’t mean the entire message has finished sending. I’m not sure about your use case, but one way to handle this is to send the length of the message in the first packet, so that the receiver knows how many bytes to expect.
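When both ends are Erlang/Elixir programs, gen_tcp can do this length-prefix framing itself through the packet: 4 option. Below is a minimal sketch of that idea; the module name FramedEcho and the function roundtrip/1 are made up for illustration and are not part of the course code.

```elixir
defmodule FramedEcho do
  # Hypothetical demo: both ends opt into 4-byte length-prefix framing.
  # With `packet: 4`, send/2 prepends the message length and recv/2
  # returns exactly one complete message with the prefix stripped.
  def roundtrip(message) do
    opts = [:binary, packet: 4, active: false]

    # Port 0 asks the OS for any free port.
    {:ok, listen} = :gen_tcp.listen(0, [{:reuseaddr, true} | opts])
    {:ok, port} = :inet.port(listen)

    {:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, opts)
    {:ok, server} = :gen_tcp.accept(listen)

    :ok = :gen_tcp.send(client, message)

    # In packet mode the length argument must be 0.
    result = :gen_tcp.recv(server, 0)

    Enum.each([client, server, listen], &:gen_tcp.close/1)
    result
  end
end
```

Calling FramedEcho.roundtrip(String.duplicate("1", 5_000)) should hand back the whole 5000-byte message from a single recv call, because the framing is done below the application code. This only works when the sender also writes the 4-byte prefix.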

Thanks.
Yes, that’s exactly the solution described in the article I shared.
I don’t have a use case like this; it’s just part of the sample code in the course.

When sending/receiving from a socket, you never know how many chunks a message will get split into, and recv(socket, 0) reads just one chunk from the socket, which is considered “all available bytes”. The important point is that “all available bytes” does not mean “the entire message”. Even for short messages, you can’t be sure they will be sent as one chunk.

Sending the length first might be a good idea if the sender were an Elixir/Erlang program, but that is not the case here. In the OP’s case, the sender is a browser (or curl), and a browser doesn’t prepend messages (i.e. HTTP requests) with a length.

Further complicating matters is the header Connection: keep-alive, because the message will not be terminated by the browser closing the socket, which recv(socket, 0) could detect. So how can the OP determine when the end of the message has been reached and when to stop trying to read from the socket?

I think the OP needs to read from the socket until the blank line that signals the end of the headers (in HTTP, the sequence \r\n\r\n), then parse the headers for a Content-Length header to get the length of the body. That tells the OP how many bytes to read from the socket AFTER the blank line. The chunk containing the blank line will most likely contain some bytes of the body as well, so those extra bytes must be subtracted from the Content-Length, e.g. the OP needs to do:

recv(socket, content_length - extra_bytes)

to read to the end of the message. It’s also possible that the chunk containing the blank line already contains the end of the message, i.e. content_length - extra_bytes is 0. In that case another recv() must be skipped, because a recv() with length 0 would block waiting for data that never arrives.
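The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the socket is passive and in raw binary mode (as in the listen call from the question), the request carries a Content-Length header rather than chunked transfer encoding, and only one request is in flight on the connection. The module name RequestReader and its function names are hypothetical.

```elixir
defmodule RequestReader do
  # Hypothetical helper, not part of the course code.
  @header_end "\r\n\r\n"

  # Reads one complete HTTP request from a passive, raw-mode socket.
  # Returns {:ok, request} or the {:error, reason} from :gen_tcp.recv/2.
  def read_request(socket), do: read_headers(socket, "")

  # Keep calling recv/2 until the blank line that ends the headers shows up.
  defp read_headers(socket, buffer) do
    case :binary.split(buffer, @header_end) do
      [headers, body_so_far] ->
        read_body(socket, headers, body_so_far)

      [_incomplete] ->
        with {:ok, chunk} <- :gen_tcp.recv(socket, 0) do
          read_headers(socket, buffer <> chunk)
        end
    end
  end

  # Read only the bytes of the body that have not arrived yet.
  defp read_body(socket, headers, body_so_far) do
    remaining = content_length(headers) - byte_size(body_so_far)

    if remaining > 0 do
      # With a positive length, recv/2 waits for exactly that many bytes.
      with {:ok, rest} <- :gen_tcp.recv(socket, remaining) do
        {:ok, headers <> @header_end <> body_so_far <> rest}
      end
    else
      # The whole body arrived with the headers; another recv would block.
      {:ok, headers <> @header_end <> body_so_far}
    end
  end

  # Header names are case-insensitive; a missing header is treated as length 0.
  def content_length(headers) do
    case Regex.run(~r/^content-length:\s*(\d+)/im, headers) do
      [_, digits] -> String.to_integer(digits)
      nil -> 0
    end
  end
end
```

In the server loop, RequestReader.read_request(client_socket) would take the place of the single :gen_tcp.recv(client_socket, 0) call, and the returned binary can be handed to the existing parser unchanged.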

More info on how gen_tcp works: gen tcp - Erlang client-server example using gen_tcp is not receiving anything - Stack Overflow