How to optimize for latency?

markdev · June 5, 2018, 9:53pm

I am working on a Phoenix project and trying to figure out how best to optimize it.

The app is API-driven. People using the app must connect to it via API; no browser code.
The app serves two kinds of customers: Producers and Consumers. There will only be a few (<10) of each.
Producers create Requests, which Consumers then have to respond to. Each Request has a variable time window during which it can wait for a response, ranging from half a second to several minutes. A Consumer can respond to a given Request only once, and only within the time window (which begins once that Consumer receives the Request).
Producers will create approximately 500-5,000 Requests per day. There will be situations where the Request windows overlap.

I am trying to optimize for the following, in order:
Latency, Persistence, ease-of-use for API connection (for customers).

The best way I can think to do this would be to:

use Phoenix channels to create websocket connections with Producers and Consumers
push Request information with a unique RequestId to all Consumers once created by the Producer
expect responses with the same RequestId to come back from the Consumers within the time window and then notify the Producer.
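A rough sketch of the bookkeeping these three steps imply, written in Python purely as a language-neutral illustration rather than as Phoenix code. The names `RequestTracker`, `mark_delivered`, and `accept_response` are hypothetical, state is held in memory only (so it says nothing about persistence), and the transport that actually pushes Requests is left out.

```python
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class PendingRequest:
    request_id: str
    window_seconds: float
    # consumer_id -> clock reading at which that Consumer received the Request
    delivered_at: dict = field(default_factory=dict)
    # consumer_ids that have already used their single response
    responded: set = field(default_factory=set)


class RequestTracker:
    """Hypothetical in-memory tracker for Requests and their response windows."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the window logic can be tested
        self._pending = {}

    def create(self, window_seconds):
        # Assign a unique RequestId when a Producer creates a Request.
        request_id = str(uuid.uuid4())
        self._pending[request_id] = PendingRequest(request_id, window_seconds)
        return request_id

    def mark_delivered(self, request_id, consumer_id):
        # The window starts when the Consumer receives the Request;
        # only the first delivery time is kept.
        self._pending[request_id].delivered_at.setdefault(consumer_id, self._clock())

    def accept_response(self, request_id, consumer_id):
        # Returns True only for the first response that arrives inside the window.
        req = self._pending.get(request_id)
        if req is None or consumer_id not in req.delivered_at:
            return False
        if consumer_id in req.responded:
            return False
        elapsed = self._clock() - req.delivered_at[consumer_id]
        if elapsed > req.window_seconds:
            return False
        req.responded.add(consumer_id)
        return True
```

In this sketch a response that is accepted would be the point at which the Producer gets notified; a late or duplicate response is simply rejected.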

My questions:

Websockets seem to be much faster than REST in benchmarks (although those estimates vary widely). Is this speed gain significant, and does it justify the additional overhead of a socket API?
Can GraphQL subscriptions provide any additional benefits? (The Request API is pretty firm and all Requests will look the same)
Are there any options other than HTTP, channels and GraphQL subscriptions that Phoenix supports for this purpose?

dimitarvp · June 5, 2018, 10:11pm

Apologies for responding uninformed, but you didn’t specify any of your reasoning. You only specified your almost-decided architecture.

As an outsider, I’d immediately challenge the notion that you need subscriptions at all – what purpose do they serve? What you describe can really easily be done with a very regular, boring background job queue (of which Elixir has plenty of implementations), bundled with GenStage or even supervised workers that reschedule themselves after they are done with their current task (via Process.send_after).

I cannot comment adequately on your entire infrastructure and idea but from the outset, it seems you already started over-engineering.

Mind giving some more context?

markdev · June 5, 2018, 11:28pm

I haven’t decided my architecture yet, but that is implied by my questions. I have revised them.

I’m operating under the belief that, because there will be frequent, real-time communication between the App and the Producers/Consumers, and reducing latency is important, using Phoenix channels instead of REST is a preferred way to go. (I’m not sure which benchmarks should be considered canon, if any, but this seems like common knowledge)

I want to know if there’s something wrong with this reasoning, or if there are other, better options.

dimitarvp · June 5, 2018, 11:56pm

Even at 5000 tasks a day, that still means one task every 17.28 seconds on average. I don’t think a latency of 50 to 300 milliseconds here and there is a big deal.
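For reference, the 17.28-second figure is just the number of seconds in a day divided by the daily Request count. The small Python sketch below reproduces it for both ends of the stated 500-5,000 range; it assumes Requests are spread evenly across the day, which the original post does not promise, and the function name `mean_gap_seconds` is only illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def mean_gap_seconds(requests_per_day):
    # Average spacing between Requests if they arrive evenly over 24 hours.
    return SECONDS_PER_DAY / requests_per_day


print(mean_gap_seconds(5000))  # 17.28
print(mean_gap_seconds(500))   # 172.8
```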

But again, the scenario you describe here lacks details, so I can’t give you a more nuanced answer.

gregvaughn · June 6, 2018, 1:01am

I don’t understand how to parse this. REST is tied to HTTP and its verbs, but Phoenix channels are a bi-directional communication channel usually built atop websockets (but can also use long polling or custom transports).

Please elaborate.

markdev · June 6, 2018, 2:20am

Sorry, that was ambiguous. I’ve edited to say "Phoenix Channels instead of REST."