Bidirectional interaction patterns under HTTP/2

peerreynders · February 18, 2019, 8:38pm

Split from

I’ve recently been trying to understand some of the implications of HTTP/2. This was triggered by my impression that the ServiceWorker spec focuses heavily (perhaps exclusively) on request/response (including streaming) interactions.

I’m curious to hear other opinions on some of the conclusions that I have arrived at.

This seems like a very deliberate choice.

My interpretation is that this strongly encourages designing all client/server interactions in terms of request/response (fetch), while relegating SSE to serve only as a notification source, which fights fat event syndrome.

So at a very general level the intent is to limit SSE traffic to notifications that say “use your request/response machinery to update yourself” rather than pushing actual updates over SSE.

The favoured bidirectional communication pattern seems to be:

Client to server:

POST/PUT data to server

Server to client:

SSE to client “new data available”
GET new data from server

Given that all of this can happen over the same connection, it’s probably less problematic than it was under HTTP/1.1.
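A minimal sketch of the server-to-client half of this pattern, assuming the SSE notification carries only a JSON body with the URL to re-fetch. The `ChangeNotice` shape, the `onChangeNotice` name, and the `/notifications` endpoint are hypothetical, and the GET is injected so the sketch runs outside a browser:

```typescript
// Hypothetical shape of a "light" SSE notification:
// it names what changed, not the changed data itself.
interface ChangeNotice {
  url: string; // resource the client should re-fetch
}

// React to a notification by pulling the data with a GET.
// `get` is injected; in a page it would wrap fetch().
function onChangeNotice<T>(rawData: string, get: (url: string) => T): T {
  const notice = JSON.parse(rawData) as ChangeNotice;
  return get(notice.url);
}

// In a browser the wiring might look like this (not executed here):
//   const source = new EventSource("/notifications");
//   source.onmessage = (e) => onChangeNotice(e.data, (u) => fetch(u));
```

Because the data itself arrives through an ordinary GET, it passes through the same fetch path as every other request.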

Background
The micro-pain of the additional GET in the case of the server-to-client interaction has a desirable macro-effect - all data is forced through the Fetch API.

All browser fetches go through the ServiceWorker, giving it an opportunity to cache the request/response (if so desired).
The ServiceWorker isn’t interested in ephemeral data, e.g. “real time data” that may have a temporary effect on the page as it is currently rendered but wouldn’t leave a trace if the page was reloaded.
However any data that may be necessary to “weave” the page together again, especially when the network connection is poor or non-existent, is non-ephemeral and therefore a candidate for caching.
Forcing all data through the Fetch API simplifies site caching. Non-ephemeral data delivered via SSE:
- Wouldn’t be in a Request/Response object pair that can be stored via the Cache interface
- Would have to be cached by the UI thread in IndexedDB which at this point is much less “ergonomic” than Cache.
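To illustrate why routing data through fetch simplifies caching, here is a simplified, synchronous sketch of a network-first lookup with an offline fallback. The plain object stands in for the browser Cache interface, and `fetchThroughCache` and the injected `network` function are hypothetical names. A real ServiceWorker would do this in its `fetch` event handler with the promise-based `caches.open`, `cache.put`, and `cache.match` calls.

```typescript
// Stand-in for the browser Cache interface: URL -> response body.
type CacheStore = Record<string, string | undefined>;

// Network-first: try the network, remember the answer,
// and fall back to the stored copy when the network fails.
function fetchThroughCache(
  url: string,
  cache: CacheStore,
  network: (url: string) => string, // assumed to throw when offline
): string {
  try {
    const body = network(url);
    cache[url] = body; // every successful GET refreshes the offline copy
    return body;
  } catch (err) {
    const cached = cache[url];
    if (cached === undefined) throw err; // nothing stored yet, surface the failure
    return cached;
  }
}
```

Data pushed inside an SSE event never reaches a function like this, so it would need its own storage logic.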

Live Data in the Service Worker:

For example HTML, CSS, and JS files should be stored in the cache, while JSON data should be stored in IndexedDB. Note that this is only a guideline, not a firm rule.

Crowdhailer · February 19, 2019, 9:04am

Do you have any reference for “fat event syndrome”?

We have a system that puts most event data in the SSEs. They are JSON so already text.
Obviously really large things like images are referenced by a URL but otherwise the contents of the event are the data.

woeye · February 19, 2019, 10:14am

SSE also seems to be a nice complement to mobile apps which use push notifications, because the data flow is typically the same: the push notification simply notifies the mobile app that new data is ready. The mobile app then starts a worker in the background and fetches new data from the server by using some kind of last-seen-id.

peerreynders · February 19, 2019, 10:56am

It seems to be a colloquialism that I’ve run into in connection with WebSockets. For example:

The lack of structure and semantics of an Event (the central concept of WebSockets) leads you naturally to drop the structure on your API — you end up throwing more and more stuff into that event, which keeps on getting fat, and suddenly your endpoints lose the notion of single responsibility (not to mention the notion of “resource” itself).

At the extreme are web apps that basically stop using request/response once they’ve established a WebSocket.

We have a system that puts most event data in the SSEs. They are JSON so already text.

And I can’t find any explicit mandate against doing that. But data travelling in that manner has to be explicitly cached locally for “off-line” situations.

Why HTTP/2 isn’t the answer

SSE is designed to let servers inform clients that something has happened on the server and that the app running in the browser is likely interested in this event.

Compare that to an SSE event that declares that the most recent event data referenced by a non-canonical URL has been updated. When the ServiceWorker sees that URL during the data fetch it can automatically cache that request/response and serve it later in an off-line situation.
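As a rough illustration of such a “light” event, the sketch below parses one block of the `text/event-stream` format whose `data` field is just a URL. It is a simplified reading of the format (it ignores the `retry` field and stream-level state), and `parseSseBlock`, the `updated` event name, and the `/events/recent` URL are made up for the example:

```typescript
interface SseEvent {
  event: string; // "message" when no event field is present
  data: string;
  id?: string;
}

// Parse one event block (the text between two blank lines) of an event stream.
function parseSseBlock(block: string): SseEvent {
  const parsed: SseEvent = { event: "message", data: "" };
  const dataLines: string[] = [];
  for (const line of block.split(/\r?\n/)) {
    if (line === "" || line.charAt(0) === ":") continue; // skip blanks and comments
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.charAt(0) === " ") value = value.slice(1); // one leading space is dropped
    if (field === "event") parsed.event = value;
    else if (field === "data") dataLines.push(value);
    else if (field === "id") parsed.id = value;
  }
  parsed.data = dataLines.join("\n"); // multiple data lines are joined with newlines
  return parsed;
}

// A light notification: the payload is only a pointer to the data.
const notice = parseSseBlock("id: 42\nevent: updated\ndata: /events/recent");
```

Here `notice.data` holds only the URL, so the client still has to GET the actual data, and that GET is what the ServiceWorker sees.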

It almost feels like the web standards are trying to put SSE “in its place” by gimping it (under HTTP/2) and tilting the balance towards request/response.

Hence my differentiation between “ephemeral” and “non-ephemeral” data - non-ephemeral data should be routed through the Fetch API.

That’s what the Notification API is for. It works even when the web site’s pages aren’t active.

Adding Push Notifications to a Web App

woeye · February 19, 2019, 2:10pm

With “mobile apps” I was referring to native mobile apps. I fail to see why push notifications are not a thing in those cases?

sribe · February 19, 2019, 2:15pm

Push notifications are most definitely a thing for native mobile apps, a huge thing in fact. The push services provided by the vendors are the “blessed” way to get an app to wake up and check the network on demand.

peerreynders · February 19, 2019, 2:20pm

Native push notifications are a feature of the device OS. They have to be, as they work even if the app is not running. Using SSE would mean bypassing the device’s inherent capability (reinventing the wheel) and would only work while the app is running.
The Notification API is what gives the browser access to the native push notification capability.

Now it is possible that I misunderstood what you are saying.

josevalim · February 19, 2019, 2:48pm

From my personal perspective, it is only deliberate because it is running on top of HTTP/1.1, which is text centric. So I would not necessarily say it was done in the interest of enforcing a certain pattern but rather because anything else would be unnatural (and many binary-based things in HTTP/1.1 have to be explicitly encoded as text somehow anyway). But of course, I am just guessing as well.

Unfortunately, once you go distributed (i.e. you have more than one node), this approach is very hard to pull off, as you effectively move the problem to the domain of unique registration. Now you need to be able to answer where the process identified by ABC is in the whole cluster and there is no tool I would recommend at the moment that can reliably or efficiently solve this problem.

So if you consider the three different solutions:

Bidirectional and stateless with SSE + POST (watercooler)
Bidirectional and stateful via sessions/connections (phoenix channels)
Bidirectional and stateful in the cluster (what you proposed, which can be found in something like Microsoft Orleans)

Each is more complex than the previous and each is also strictly more powerful than the previous. Still I don’t believe there is anything reliable enough to implement 3 right now.*

*well, I would recommend :global but it does have scalability and performance bottlenecks. Everything else I tried has led to issues in terms of consistency, duplication and loss of data at some point down the road.

woeye · February 19, 2019, 2:49pm

The idea is that native push notifications tend to be small with a payload as small as possible. So typically a native push notification is just a trigger to wake up the app. The app then will fetch new data from the server by using REST calls.

While the app is running in the foreground, it could make use of SSE to get near-realtime notifications as a trigger to fetch new data over the same REST calls as before.

What I am trying to say is that because of the bidirectional nature of WebSockets one might be tempted to drive all communication over WebSockets (fat events) whereas in the case of SSE and unidirectional events (server-to-client) you could use SSE as just a trigger (light events) - like native push notifications.

Of course this is possible with WebSockets as well. But I’ve seen many projects where at some point the API was a weird mixture of REST calls and message-driven events.

Crowdhailer · February 19, 2019, 3:17pm

We have a system with multiple nodes and we solve this by writing messages to the DB before sending over the stream. The DB has optimistic concurrency control and having events persisted before sending allows the client to stream from any point in history using the last event id. Works well for us.
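A minimal sketch of the persist-then-send idea, assuming an in-memory array in place of the database and sequential numeric ids. The `EventLog` class and its method names are hypothetical, and optimistic concurrency control is not modelled:

```typescript
interface StoredEvent {
  id: number; // monotonically increasing, assigned on append
  payload: string;
}

class EventLog {
  private events: StoredEvent[] = [];

  // Persist first; only a stored event may be handed to the stream.
  append(payload: string): StoredEvent {
    const stored: StoredEvent = { id: this.events.length + 1, payload };
    this.events.push(stored);
    return stored;
  }

  // Replay everything after the id the client last saw (0 means "from the start").
  since(lastEventId: number): StoredEvent[] {
    return this.events.filter((e) => e.id > lastEventId);
  }
}

const log = new EventLog();
log.append("a");
log.append("b");
log.append("c");
const missed = log.since(1); // events 2 and 3 for a client that last saw id 1
```

On the browser side, `EventSource` sends the last id it received in a `Last-Event-ID` header when it reconnects, which is what a server could pass to something like `since`.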

josevalim · February 19, 2019, 4:01pm

Right, that is option 1 though according to the scenarios I described. By stateful, I assume that the data is in a process (or another decentralized place) that is not the database.

To be clear, using the database is completely fine and enough for many cases but it is not enough in certain domains such as game servers, analytics, or even LiveView itself (and IIRC the Orleans paper does describe other use cases too).

peerreynders · February 19, 2019, 5:11pm

Until recently I would have gone the same way, i.e. included all the relevant data in the event itself - eliminating the need for further data fetches.

But recently for me the offline first angle is challenging the universality of that approach.

What if it is better to have the last ten (stale) events in the browser than none at all when the network is (temporarily) absent?

1.) Storing of SSE data events would have to be managed individually in detail on the browser and old ones purged when new ones arrive.

2.) There is no point in storing a “new events available” SSE event. Meanwhile the last ten events could be fetched over a “ten_most_recent_events” URL and automatically cached as a single request/response by the ServiceWorker. This approach is cruder and more wasteful than (1) but can make the logic on the browser simpler (especially if it can be simplified down to a server rendered page).
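For comparison, the bookkeeping that option (1) implies could look roughly like the sketch below. It uses a plain array where a browser would more likely use IndexedDB, and `rememberEvent` is a hypothetical helper name:

```typescript
// Keep only the newest `limit` events; older ones are purged as new ones arrive.
function rememberEvent<T>(recent: T[], incoming: T, limit: number): T[] {
  const next = recent.concat([incoming]);
  return next.length > limit ? next.slice(next.length - limit) : next;
}

// Feed twelve events through a store that keeps ten.
let recent: number[] = [];
for (let i = 1; i <= 12; i++) {
  recent = rememberEvent(recent, i, 10);
}
// `recent` now holds events 3 through 12.
```

With option (2) this purge logic disappears from the client, because the cached response for the single URL is simply overwritten on the next fetch.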

It’s the heavy emphasis of the web and browser standards on improving the request/response story that seemed peculiar when product implementations are increasingly embracing WebSockets and SSE over the standard request/response interaction.

Get back to using request/response as your standard interaction and only use a sprinkling of events when absolutely necessary.

… at least that is my impression.

Now the offline first angle doesn’t apply everywhere and it remains to be seen if it will become a design pressure in the same way responsive web design did.

Aside: Orleans: Distributed Virtual Actors for Programmability and Scalability (2014)

Fair enough - but in effect it is already good enough to deliver lightweight notifications to client browsers so there was really no incentive to give it the same type of makeover that request/response has gotten under HTTP/2.

lessless · October 13, 2019, 10:11pm

Just came across this HTTP/2-based communication project https://github.com/dunglas/vulcain and want to share it with the community since it is always interesting to see what we can learn from emerging patterns.

It’s not a stateful communication protocol, but at least it should perform well thanks to connection multiplexing in HTTP/2.

Good thing is that it should be easier to scale horizontally than WebSockets (unless I’m missing something).