Should there be a client-server interaction-adapter API in Elixir?

I was reading through this topic: Nex - A minimalist web framework for indie hackers and startups and I love minimalist web stuff and the “traditional” page-based web of yore, but I also have never liked opinionated frameworks. I always like the pitch, but every time I use one, it makes me sad when I try to write code the way I want and it doesn’t work. I have always stuck with Flask/Quart in Python because it gives all of the difficult-to-implement functionality of a web app, but it doesn’t try to build the legos ahead of time. I’m not limited by however the framework wants the pieces to fit together. It takes care of the hard parts, and I get to just do what I enjoy which is design code that is fun for me personally to work with.

That’s what I love about Phoenix as well. It felt heavy and opinionated at first with the generated code, but after learning it more, I realized I could structure things however I wanted by straying from the defaults.

This message in the linked topic (Nex - A minimalist web framework for indie hackers and startups - #28 by thiagomajesk) brings up an interesting idea related to all that: LiveView as a generic client-server interaction engine.

I’m not sure how feasible it would be to decouple LiveView from Phoenix, but I normally build LiveView-like websockets in my Python apps, and I think it’s feasible to have something that achieves the same things as LiveView as a standalone library.

Carrying on with that idea and my love for frameworks that provide the tools to build the legos myself instead of providing the legos out of the box, I think generalizing the idea of a client-server interaction library might be an interesting thing to explore.

My thinking is that the modern web has been mostly dominated by SPA frameworks, but there’s also been a lot of work going on with the Hypermedia stuff to try to reduce the bloat of the web somewhat. All of these things are basically just different models of client-server interaction. FE-first reactive frameworks use a request-response model for creating a more stateful user experience. Realtime frameworks like LiveView try to accomplish a similar experience with a stateful connection and the ability to push from the server instead of having everything driven by the client.

I think there must be an abstraction hiding here somewhere. Maybe rather than thinking about things in terms of SPA/realtime/hypermedia, we could find a way to abstract pushing/pulling in a bi-directional model. Then we can use that same abstraction on both the client and the server and it doesn’t matter how the actual connection is implemented. This has the advantage that it also works for server-server interactions in a service architecture.

Assuming something like Plug could be implemented for this client-server push/pull interaction abstraction, then the choice of LiveView or HTMX or whatever would be a user decision and not something the framework author needs to decide up front. It would also simplify fallback/failover scenarios when things like websockets or SSE fail in realtime apps. Making it easier to implement both stateful connections with websockets or HTTP2/3 and also stateless connections with the same interface would be super helpful both from a productivity perspective (single mental model for building different kinds of apps) and from a quality perspective (fallback/failover for different networking implementations).

Thoughts?

I think you will find there is little in the way of a shared abstraction to be had between React-likes (incl. LiveView) and HTMX-likes.

The purpose of the React-likes is to enable a declarative programming style, i.e. a style in which you rebuild the application’s entire derived state from scratch on each render. You can do this without React, but React’s engine is a tool for making this efficient through incrementalization as it otherwise does not scale very well.
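To make "rebuild everything, then incrementalize" concrete, here is a minimal sketch. The `Declarative` module, its keyed-map view format, and the patch tuples are all made up for illustration; this is a naive full comparison, not how React or LiveView actually compute diffs.

```elixir
defmodule Declarative do
  # Derive the whole view from state on every render; nothing is patched in place.
  # The view is a hypothetical map of element id => markup.
  def render(%{todos: todos}) do
    todos
    |> Map.new(fn {id, text} -> {"todo-#{id}", "<li>#{text}</li>"} end)
    |> Map.put("count", "<span>#{map_size(todos)}</span>")
  end

  # Naive stand-in for incrementalization: compare two full renders and
  # emit only the entries that changed or disappeared.
  def diff(old, new) do
    changed = for {k, v} <- new, Map.get(old, k) != v, do: {:put, k, v}
    removed = for {k, _v} <- old, not Map.has_key?(new, k), do: {:delete, k}
    Enum.sort(changed ++ removed)
  end
end
```

The application code only ever writes `render/1`; the patches sent to the client fall out of `diff/2`, so the programmer never decides by hand which part of the page to touch.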

I have not used HTMX but it seems to encourage an imperative style. Ironically a full HTML pageload is as declarative as can be, and it’s only by patching parts of the page that you violate that principle. It is exactly this idea of “patching parts of the page” that led to disaster 20 years ago, and it is exactly those disasters which precipitated the rise of React.

LiveView is a stateful React-like on the server, but it does not provide very good tools for declarative programming. In fairness, neither did React in the beginning; it took them many years to figure out what worked and what didn’t. I do worry that the confusion will lead to a self-fulfilling prophecy, where those using LiveView aren’t really aware of what React-likes are for and are just writing imperative code. Which means that there is no pressure for LiveView to “grow up” in the way that React did and develop into a full UI runtime.

There is an abstraction lurking, but I think that abstraction really is just LiveView or something similar. I don’t think the HTMX-likes factor into the picture at all as their own fundamental abstraction is literally just “HTTP request”. There is probably room for something shaped like “React on the server” to be factored out, and I would point to attempts to use LiveView to render native apps as an example of that.

Disclaimer: I hate React and I wish people would stop trying to force it into every shaped hole they can find. :slight_smile:

I don’t think trying to create an abstraction over LiveView, HTMX, React, etc is a good idea. I was thinking more along the lines of the communication semantics.

If we could merge the APIs for handling a route in an HTTP request-response cycle and for handling a stateful persistent connection (e.g. websockets) into a single uniform API, then the programmer wouldn’t have to choose between a framework that’s good at realtime stuff to get stateful interactivity and a framework that’s optimized for incremental stateless updates for component-based DOM-patching, etc.

I’m not sure how feasible it is ofc since if it was a simple problem, it would have been solved a long time ago, but now that we have our choice of well-supported protocols for use in browser-based and service-oriented applications, it would be worth investigating a possible abstraction over all of them.

Not just a library that makes working with each different protocol easier; we’ve got those already.

My initial thoughts were that separating out sending and receiving data into separate functions (either declarative or imperative doesn’t matter) would be a good first step.

For example, we can model a bi-directional websocket connection by implementing two handlers:

# push/1 produces the next outbound value; pull/2 handles an inbound message.
def push(%{updated: val} = state), do: {val, state}
def pull(%{updated: val} = _msg, state), do: {:reply, Map.put(state, :updated, val)}
def pull(_msg, state), do: {:noreply, state}

This same api would work for stateless HTTP request-response communication as well.

The implementation could raise, or return a 404 or some other response, if pull/2 returns {:noreply, state} while running over an HTTP/1 connection.
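Here is a minimal sketch of how one handler could be driven by both kinds of connection. `Counter`, `StatelessDriver` and `StatefulDriver` are hypothetical names, the "transports" are plain functions rather than real HTTP or websocket code, and mapping `:noreply` to a 404 is just one of the options mentioned above.

```elixir
defmodule Counter do
  # Handler in the shape proposed above.
  def push(%{updated: val} = state), do: {val, state}
  def pull(%{updated: val}, state), do: {:reply, Map.put(state, :updated, val)}
  def pull(_msg, state), do: {:noreply, state}
end

defmodule StatelessDriver do
  # Models one request/response cycle: every request needs a response,
  # so :noreply is turned into a 404 here.
  def call(handler, msg, state) do
    case handler.pull(msg, state) do
      {:reply, new_state} ->
        {body, new_state} = handler.push(new_state)
        {200, body, new_state}

      {:noreply, new_state} ->
        {404, nil, new_state}
    end
  end
end

defmodule StatefulDriver do
  # Models a persistent connection: messages arrive over time and only
  # those answered with :reply cause something to be pushed back.
  def run(handler, msgs, state) do
    {sent, final_state} =
      Enum.reduce(msgs, {[], state}, fn msg, {sent, st} ->
        case handler.pull(msg, st) do
          {:reply, st2} ->
            {out, st3} = handler.push(st2)
            {[out | sent], st3}

          {:noreply, st2} ->
            {sent, st2}
        end
      end)

    {Enum.reverse(sent), final_state}
  end
end
```

The handler module never learns which driver is calling it; only the driver decides what "no reply" means for its transport.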

I actually want this kind of API in a web framework regardless of its broader usefulness because I think it makes complex connections more composable. In Python I use the Quart web framework for websocket stuff, and it has a single @websocket decorator for creating a websocket “route”, and the code inside that handler runs both sending and receiving messages, so I have to manually separate sending and receiving to make them run concurrently. Separating them out ends up making the code a lot simpler and easier to test as well though, so I think it’s a good pattern to enforce if possible.
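In Elixir terms, the separation could look roughly like the sketch below: one process owns inbound traffic, another owns outbound traffic, and they only talk through messages. The `Duplex` module and the `{:frame, _}`, `{:state, _}` and `{:out, _}` message shapes are assumptions for illustration, with `peer` standing in for whatever process represents the other end of the connection.

```elixir
defmodule Duplex do
  # Receiver loop: owns inbound traffic only and forwards state changes
  # to the sender process.
  def recv_loop(sender) do
    receive do
      {:frame, %{updated: val}} ->
        send(sender, {:state, val})
        recv_loop(sender)

      {:frame, _other} ->
        recv_loop(sender)

      :close ->
        send(sender, :close)
    end
  end

  # Sender loop: owns outbound traffic only.
  def send_loop(peer) do
    receive do
      {:state, val} ->
        send(peer, {:out, val})
        send_loop(peer)

      :close ->
        send(peer, :closed)
    end
  end

  # Starts both loops and returns the pid that inbound frames go to.
  def start(peer) do
    sender = spawn(fn -> send_loop(peer) end)
    spawn(fn -> recv_loop(sender) end)
  end
end
```

Because each loop is just a function over its own mailbox, either side can be tested alone by sending it messages from the test process.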

P.S. I really think send/1 and recv/2 are better names for the above functions, but that would clash with Elixir’s process utilities, so :person_shrugging:

I see what you mean.

One thing that has long bugged me is the way Phoenix built on top of Plug for its pipeline abstraction but then had to logically duplicate the entire thing to handle navigation over websockets. So you end up with duplicate pipelines for stuff like authorization which is IMO pretty messy API design. I have spent a lot of time thinking about how to do better there.

I think there is room for something like you suggest, but ironically I think it would be useful mainly for client-rendered apps. Server-rendered apps do not need to communicate arbitrarily with the client; they simply receive events and return declarative DOM.

I will certainly not waste any breath defending React proper (or anything else on NPM), but it’s also best to avoid throwing the baby out with the bathwater. React is, design-wise, a very good tool. Of course bad programmers will make a mess no matter what tools they’re using.

HTTP request/response and websocket streaming are very different things in both semantics and targeted applications; the former is more like UDP and the latter is more like TCP. I am afraid a grand unifying scheme will not exploit each to its full potential. An optimized architecture should be built from the ground up for either a connection-oriented or a connectionless networking API. I can cite 2 examples where crossing the two did not work out: HTTP/2 server push and Phoenix Socket over long poll.

:+1: I agree, knowing the underlying transport and its characteristics often matters.

You got me curious :slight_smile: Would you mind expanding on what went wrong with Phoenix Socket over the LongPoll transport?

It is worse than the websocket transport in any network condition. The only legit scenarios are supporting ancient browsers or bypassing crazy corporate firewalls.

To clarify, I don’t think anything “went wrong” with it. Phoenix Socket and LiveView were designed with websocket in mind, and rightly so.

Okay! Yes, LongPoll is there as a fallback, and I imagine it was not meant to be better than WebSocket.

Made me think how long polling was the super clever technique way before WS existed… good old times!

Phoenix.Socket is one such abstraction. It doesn’t unify with regular HTTP requests (the Plug abstraction), but it is nice that it is not exclusively tied to a WebSocket-based implementation.

Phoenix ships with a LongPoll transport, and I can imagine one could write an alternative transport using POST+SSE, for example.

As @derek-zhou wrote, unifying HTTP and WS forces us to pick the lowest common denominator, and settle for the worst of both worlds.

  • HTTP gives us caching, easier debugging, statelessness.
  • WS is a persistent connection, fewer handshakes, harder debugging, stateful.

An abstraction over that would need to be stateless, not support caching/response headers/cookies, …

This difference between the two technologies is not only the cause of different pipelines in Phoenix (plugs vs on_mount), but also has a very deliberate use for certain scenarios that require one over the other, e.g. login/logout in LiveView requires a regular HTTP request to set cookies.

Continuing the discussion based on replies from @derek-zhou and @rhcarvalho: I do understand the complexities of websockets vs HTTP at the protocol level. I’ve actually read both RFCs completely and implemented HTTP 1.1 redirects manually based on the RFC in a C++ HTTP client.

I guess what I really want is something in between the transport protocol and the application state/communication logic. The situation right now with most web frameworks is that there is tight coupling between the UI/UX of the application and the underlying network protocol used due to the application being built around the underlying protocol’s semantics.

I’m a BE developer, so maybe this problem is because I (and my contemporaries) tend to think about applications from a data-first perspective instead of a UI-first perspective like a FE dev might. I do think we can do better at the framework level though for providing a single api for declaring communication channels. From an application development perspective, there isn’t much difference between responding to an HTTP request and sending a message over a websocket based on some pubsub event or something.

I guess the idea posed in the original post probably isn’t realistic since we would probably just end up moving a lot of the code/logic out of the route/websocket declaration/handlers and into configuration or some middleware-like layer below them.

Maybe the solution is actually the opposite? Instead of looking for an abstraction over the proliferation of web tooling we have, go back to the drawing board and figure out a simpler approach that satisfies a broader set of use-cases now that we have the knowledge gained by existing solutions?

I dunno, that’s probably a job for the W3C. :sweat_smile:

Yeah, I don’t mind React as a subject of study. My main problem with React is that it’s forced on me. Same reason I hate JS but not PHP. No one is forcing me to write PHP every day, so I’m okay with it being its weird buggy self over in the corner while I go about my own business. JS on the other hand just can’t leave me tf alone, so I have a special hatred for it and its ecosystem. lol

I agree with your overall point, but I don’t think this part is correct. As hinted by the name, on_mount runs when the connection is mounted. There is little conceptual difference between securing a stateless HTTP request and securing the mount of a socket. In both cases the connection is checked at the beginning and then everything else is authorized by whatever metadata you set on said connection.
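As a rough illustration of "check once at the beginning, then rely on the metadata", here is a sketch where one check backs both entry points. `Auth`, `HttpEntry` and `SocketEntry` are hypothetical, and the `conn` and `socket` values are plain maps standing in for the real structs; none of this is the actual Plug or LiveView API, only the `{:cont, _}` / `{:halt, _}` return shape is borrowed.

```elixir
defmodule Auth do
  # Shared check: works on any map that carries a :session.
  def authorize(%{session: %{"user_id" => id}} = conn_or_socket) when is_integer(id) do
    {:ok, Map.put(conn_or_socket, :current_user_id, id)}
  end

  def authorize(_conn_or_socket), do: {:error, :unauthorized}
end

defmodule HttpEntry do
  # Stand-in for a step in a stateless request pipeline.
  def call(conn) do
    case Auth.authorize(conn) do
      {:ok, conn} -> conn
      {:error, _reason} -> Map.merge(conn, %{status: 401, halted: true})
    end
  end
end

defmodule SocketEntry do
  # Stand-in for a mount-time hook on a stateful connection.
  def on_mount(socket) do
    case Auth.authorize(socket) do
      {:ok, socket} -> {:cont, socket}
      {:error, _reason} -> {:halt, socket}
    end
  end
end
```

The two entry modules differ only in how they report failure, which is the part that a unified pipeline would have to abstract over.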

The split is a pretty blatant case of accidental complexity, where a new abstraction had to be layered on top of an old one because OG Phoenix/Plug were not designed for LiveView. Phoenix had to maintain backwards compatibility so I’m not saying they made the wrong decision, but the outcome is clearly not what you would choose with a clean slate.

Interestingly, there actually isn’t any technical difference between securing a websocket connection and securing a stateless HTTP request either. Authentication on a websocket connection has to happen during the initial handshake, and according to the RFC, the handshake is not required to be performed over HTTP, but since that’s what everything runs, that’s how it’s handled in practice on every server ever. The server can reject the connection before responding with the 101 to switch to the websocket connection, in which case everything stays HTTP.

In practice, handling auth on a websocket connection is a bit different from a regular HTTP request since the browser APIs don’t allow passing arbitrary headers on a websocket handshake (even though the RFC says they are supposed to), but it’s still just a stateless HTTP request until the 101 has been received by the client.

The on_mount/3 callback is a leaky abstraction of the websocket handshake. I actually think that on_mount/3 should be named before_connect/3 and only run once in the disconnected state. handle_params/3 always gets called after the socket is connected anyway, so it makes more sense to load state in that callback and leave on_mount/3 for establishing the connection itself. Worst case scenario, there could be a separate on_connect/3 callback that gets called after the connection is accepted ({:cont, socket} returned from before_connect/3) but before the first call to handle_params/3. I think it probably wouldn’t get used much in practice though since the URL params should hold the state of the page, so “initialization” is just whatever data gets loaded when the params map is empty.

Thinking about this, since we have the connected?/1 helper function, this connection pattern could actually be implemented in LiveView as-is by simply putting an if !connected?(socket) do at the top of the on_mount/3 implementation with no else block. I might try that out. :slight_smile:

That’s wrong. LiveViews are mounted many times within a single websocket connection. Each time live navigation happens a LiveView is mounted and on_mount callbacks execute.

This one is true though. There’s indeed some history and accidental complexity here. Phoenix’s socket macro was created when Cowboy was the only supported webserver and Phoenix integrated channels directly with it. Websocket connections for a long time were not routable on the Plug pipeline, but completely separate and only for channels. Once Bandit came along the WebSock abstraction was added to allow for exactly that, but the socket macro was not yet updated to make use of that additional flexibility (see Phoenix.Router based socket routing by LostKobrakai · Pull Request #6142 · phoenixframework/phoenix · GitHub).

So yes, there is some opportunity to improve things, but no, on_mount is not comparable with the websocket handshake.

This is true, and ironically it’s what supports the OP’s original point. Navigation over the socket is essentially emulating stateless routing over the stateful connection. Live navigation is a unification of stateful and stateless paradigms.
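A toy model of that idea, with made-up names: `Router.route/1` is a stateless path lookup, and `Connection.handle/2` re-runs it for every `{:navigate, path}` message while the connection-level state (here a `:user` field) survives. This is only a sketch of the concept, not how LiveView implements live navigation.

```elixir
defmodule Router do
  # Stateless route table: path in, view out.
  def route("/"), do: {:ok, :home}
  def route("/users/" <> id), do: {:ok, {:user, id}}
  def route(_path), do: :error
end

defmodule Connection do
  # Long-lived connection state. Each navigation "mounts" a new view by
  # running the stateless router again; :mounts counts how often that happened.
  def handle({:navigate, path}, %{mounts: n} = state) do
    case Router.route(path) do
      {:ok, view} -> %{state | view: view, mounts: n + 1}
      :error -> %{state | view: :not_found}
    end
  end
end
```

One connection can therefore see many mounts, which is why a mount-time hook cannot double as a once-per-connection handshake.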

Very interesting. I assumed that a navigate event would reconnect the websocket, but I must have misread or simply missed that detail in the docs. Thanks for the correction!

I guess that makes sense though since nested LiveViews have an on_mount/3 as well.