HTTP-caching LiveView's first render

tangui · October 22, 2023, 9:08pm

Hi all!

Since I released plug_http_cache, I’ve been asked a few times how to use it to cache LiveView’s initial responses. I’m not very well versed in LiveView, and I think there are many ways one could shoot oneself in the foot doing that. However, it seems to me there are some reasonable use cases for it.

One of them is to use HTTP caching to improve SEO ranking for public pages that have little interactivity, whether the user is logged in or not. For instance, a LiveView page could render financial charts in real time as an aside, while the main content is almost static.

I tried plug_http_cache and it almost works out of the box:

initial mount replies with a 200 response, which is cached
second mount replies with a 101 HTTP response, which is not cacheable. The websocket connection is established

So far so good, but LiveView uses a CSRF token that is cached along with the page and sent with the second request, so the LiveView WS connection fails when the same page is opened in another browser, with:

[debug] LiveView session was misconfigured or the user token is outdated.

1) Ensure your session configuration in your endpoint is in a module attribute:

    @session_options [
      ...
    ]

2) Change the `plug Plug.Session` to use said attribute:

    plug Plug.Session, @session_options

3) Also pass the `@session_options` to your LiveView socket:

    socket "/live", Phoenix.LiveView.Socket,
      websocket: [connect_info: [session: @session_options]]

4) Ensure the `protect_from_forgery` plug is in your router pipeline:

    plug :protect_from_forgery

5) Define the CSRF meta tag inside the `<head>` tag in your layout:

    <meta name="csrf-token" content={Plug.CSRFProtection.get_csrf_token()} />

6) Pass it forward in your app.js:

    let csrfToken = document.querySelector("meta[name='csrf-token']").getAttribute("content");
    let liveSocket = new LiveSocket("/live", Socket, {params: {_csrf_token: csrfToken}});

I figured out it’s possible to disable checking of this CSRF token when configuring the live endpoint, at the cost of disabling live sessions:

endpoint.ex before:

socket "/live", Phoenix.LiveView.Socket, websocket: [connect_info: [session: @session_options]]

endpoint.ex after:

socket "/live", Phoenix.LiveView.Socket, websocket: []

My first batch of questions would be:

is it safe to do so?
what are the other side-effects, other than disabling live session?
is it possible to have 2 sockets: one where live sessions are disabled for this specific use case, and another one with session information enabled?

Another thing I noticed is that some information related to LiveView is stored in the HTML upon initial rendering:

<div
data-phx-main
data-phx-session="SFMyNTY.g2gDaAJhBXQAAAAIdwJpZG0AAAAUcGh4LUY1Q0hZRndnM2I0N0lRSkN3B3Nlc3Npb250AAAAAHcKcGFyZW50X3BpZHcDbmlsdwR2aWV3dy1FbGl4aXIuSHR0cENhY2hlV2l0aExpdmV2aWV3V2ViLlVzZXJMaXZlLlNob3d3BnJvdXRlcncmRWxpeGlyLkh0dHBDYWNoZVdpdGhMaXZldmlld1dlYi5Sb3V0ZXJ3DGxpdmVfc2Vzc2lvbmgCdwdkZWZhdWx0bggAt3Fq2ugnkBd3CHJvb3RfcGlkdwNuaWx3CXJvb3Rfdmlld3ctRWxpeGlyLkh0dHBDYWNoZVdpdGhMaXZldmlld1dlYi5Vc2VyTGl2ZS5TaG93bgYA-_cJWYsBYgABUYA.25eSbZuNO6oxbt4AEcignF3HVsHExGqPcoc8BI8E5nI"
data-phx-static="SFMyNTY.g2gDaAJhBXQAAAADdwJpZG0AAAAUcGh4LUY1Q0hZRndnM2I0N0lRSkN3BWZsYXNodAAAAAB3CmFzc2lnbl9uZXdqbgYA-_cJWYsBYgABUYA.k0KyuD5mMq4arpXYUO2poQAdDrnFipHxVJ1D6B32tyI" id="phx-F5CHYFwg3b47IQJC">

After decoding it, it seems it doesn’t contain private session data, which remains stored in the cookie. However, is there any reason this data shouldn’t be served to users other than the one who initiated the request in the first place?

Lastly, I’d like to write guidance on how to handle authenticated LiveViews with static content. The idea is to render only public data on initial mount, and load private, session-based data only when the LiveView goes live:

def mount(params, session, socket) do
  if not connected?(socket) do
    # Initial request: here we render only **public** data.
    # The response can be cached by shared caches
    products = Products.list(params)

    {:ok, assign(socket, products: products)}
  else
    # Check user authentication
    socket =
      case session do
        %{"user_id" => user_id} ->
          assign(socket, user: Accounts.get_user!(user_id))

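        _ ->
          # No user in the session: assign nil so that `socket.assigns.user`
          # below doesn't raise a KeyError
          assign(socket, user: nil)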
      end

    # Perform stateful stuff that doesn't depend on the user
    Phoenix.PubSub.subscribe(MyApp.PubSub, "product-updates")

    if socket.assigns.user do
      # Prepare rendering for authenticated users
      socket =
        socket
        |> assign(:pending_orders, User.get_pending_orders!(socket.assigns.user))
        |> assign(:pending_notifications, User.get_pending_notifications(socket.assigns.user))

      {:ok, socket}
    else
      # Prepare rendering for anonymous users
      {:ok, socket}
    end
  end
end

What do you think? Did I miss something? Any security issue I didn’t anticipate?

Cheers!

LostKobrakai · October 24, 2023, 3:13pm

The CSRF token is there only to sign the session contents used by LV. Without the token these could be tampered with. What data is stored in these session values depends on the application, but I’d always expect them to include user information e.g. about which user is currently logged in.

Removing the CSRF token therefore affects only that session information. It’s safe to do so, but a tradeoff. Once you have some kind of login affecting the page state you usually cannot do that anymore.

LV doesn’t use any of the information stored in the cookie. The thing called session in the context of LV is the blobs of data scattered all over the html, not the data in the cookie. Cookies cannot be used securely over websockets afaik.

tangui · November 5, 2023, 9:25pm

Thanks for your answer, @LostKobrakai!

It seems the CSRF token is used to prevent WebSocket CSRF attacks (called CSWSH), as described in this article (or this other one).

I think LV does use information in the session cookie to populate the session param, but only after verifying the CSRF token is valid. As far as I understand, the WS session is established after a regular HTTP request, having an upgrade: websocket header. Cookies are sent along this HTTP request, and their content transmitted to the LiveView websocket session (unless there’s no valid CSRF token).

Actually my assumption that session (cookie) information is sent to the LV when CSRF token is missing was false (erroneous testing).

However, that might not be a lost cause: it seems that verifying the origin is sufficient to prevent these CSWSH attacks, and Phoenix already supports checking the origin for websockets. For now, the :check_origin option checks the origin only when the header is present. In this case users would have to modify the LiveView’s app.js a bit to stop looking for a CSRF token, but that’s easy to do.

Unless there’s some good security reason not to do it, I’ll take a look at enabling an origin-based security check instead of relying on CSRF tokens to populate the user session in LV.

Any thoughts are welcomed

Schultzer · November 6, 2023, 12:30am

:check_origin - if the transport should check the origin of requests when the origin header is present. May be true, false, a list of hosts that are allowed, or a function provided as MFA tuple. Defaults to :check_origin setting at endpoint configuration.

If true, the header is checked against :host in YourAppWeb.Endpoint.config(:url)[:host].

If false, your app is vulnerable to Cross-Site WebSocket Hijacking (CSWSH) attacks. Only use in development, when the host is truly unknown or when serving clients that do not send the origin header, such as mobile apps.

As the documentation states, the origin header is not guaranteed to be sent by mobile apps.
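To make the MFA variant of :check_origin more concrete, here is a minimal sketch of a custom origin check. The module name OriginCheckSketch, the function name and the allow-listed host are hypothetical and only there for illustration; the sketch assumes Phoenix calls the function with the parsed origin as a %URI{} struct, followed by the extra arguments of the MFA tuple, and expects a boolean back. As quoted above, the check only runs when the origin header is present.

```elixir
defmodule OriginCheckSketch do
  @moduledoc """
  Hypothetical helper, not part of Phoenix: accepts an origin only when it
  uses https and its host is in an explicit allow-list.
  """

  # The origin has an https scheme and a host: accept it only if allow-listed.
  def allowed?(%URI{scheme: "https", host: host}, allowed_hosts) when is_binary(host) do
    host in allowed_hosts
  end

  # Anything else (other scheme, missing host) is rejected.
  def allowed?(%URI{}, _allowed_hosts), do: false
end

# Possible wiring in endpoint.ex (assumption: the extra MFA arguments are
# appended after the URI, so the allow-list is passed as a single list argument):
#
#   socket "/live", Phoenix.LiveView.Socket,
#     websocket: [check_origin: {OriginCheckSketch, :allowed?, [["example.com"]]}]
```

Passing a plain list of allowed hosts to :check_origin is simpler when no custom logic is needed; the function form is mostly useful when the allowed origins are only known at runtime.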

LostKobrakai · November 6, 2023, 6:50am

I don’t think that’s the case. Session data provided to connected LV instances comes from the HTML, not the cookie. The session data might include more keys than what is in the session cookie.

tangui · November 6, 2023, 7:59pm

Indeed. I guess there’s another authentication mechanism at play with these apps, as non-browser apps don’t handle cookies anyway. But indeed we’d possibly need to make the origin check mandatory if we prefer this option to CSRF checking.

It does include LV specific data, but not user session data as far as I can see:

iex(25)> data_phx_session = "SFMyNTY.g2gDaAJhBXQAAAAIZAACaWRtAAAAFHBoeC1GNVVnT1FLSkpDS19zaGpEZAAMbGl2ZV9zZXNzaW9uaAJkAAdkZWZhdWx0bggAw7Rm4fwflRdkAApwYXJlbnRfcGlkZAADbmlsZAAIcm9vdF9waWRkAANuaWxkAAlyb290X3ZpZXdkAC1FbGl4aXIuSHR0cENhY2hlV2l0aExpdmV2aWV3V2ViLlVzZXJMaXZlLlNob3dkAAZyb3V0ZXJkACZFbGl4aXIuSHR0cENhY2hlV2l0aExpdmV2aWV3V2ViLlJvdXRlcmQAB3Nlc3Npb250AAAAAGQABHZpZXdkAC1FbGl4aXIuSHR0cENhY2hlV2l0aExpdmV2aWV3V2ViLlVzZXJMaXZlLlNob3duBgBOLCqmiwFiAAFRgA.9wLt2qujqlGyjQvMSy-h5vORSoJXlJHTI5SBoPWr7EY"
"SFMyNTY.g2gDaAJhBXQAAAAIZAACaWRtAAAAFHBoeC1GNVVnT1FLSkpDS19zaGpEZAAMbGl2ZV9zZXNzaW9uaAJkAAdkZWZhdWx0bggAw7Rm4fwflRdkAApwYXJlbnRfcGlkZAADbmlsZAAIcm9vdF9waWRkAANuaWxkAAlyb290X3ZpZXdkAC1FbGl4aXIuSHR0cENhY2hlV2l0aExpdmV2aWV3V2ViLlVzZXJMaXZlLlNob3dkAAZyb3V0ZXJkACZFbGl4aXIuSHR0cENhY2hlV2l0aExpdmV2aWV3V2ViLlJvdXRlcmQAB3Nlc3Npb250AAAAAGQABHZpZXdkAC1FbGl4aXIuSHR0cENhY2hlV2l0aExpdmV2aWV3V2ViLlVzZXJMaXZlLlNob3duBgBOLCqmiwFiAAFRgA.9wLt2qujqlGyjQvMSy-h5vORSoJXlJHTI5SBoPWr7EY"

iex(26)> data_phx_static = "SFMyNTY.g2gDaAJhBXQAAAADZAAKYXNzaWduX25ld2pkAAVmbGFzaHQAAAAAZAACaWRtAAAAFHBoeC1GNVVnT1FLSkpDS19zaGpEbgYATiwqposBYgABUYA.-GKGKKPNQUqxDLw1f39O1osZEL8V5a7XUCYZWk8jhWA"
"SFMyNTY.g2gDaAJhBXQAAAADZAAKYXNzaWduX25ld2pkAAVmbGFzaHQAAAAAZAACaWRtAAAAFHBoeC1GNVVnT1FLSkpDS19zaGpEbgYATiwqposBYgABUYA.-GKGKKPNQUqxDLw1f39O1osZEL8V5a7XUCYZWk8jhWA"

iex(27)> decode_lv_stuff = fn v -> v |> String.split(".") |> Enum.map(&Base.url_decode64!(&1, padding: false)) |> Enum.map(fn v -> try do :erlang.binary_to_term(v) rescue _ -> :no_erlang_term end end) end
#Function<42.3316493/1 in :erl_eval.expr/6>
                                                                                                                                                            
iex(28)> decode_lv_stuff.(data_phx_session)  
[
  :no_erlang_term,                                                        
  {{5,
    %{
      id: "phx-F5UgOQKJJCK_shjD",
      live_session: {:default, 1699299605376054467},
      parent_pid: nil,
      root_pid: nil,
      root_view: HttpCacheWithLiveviewWeb.UserLive.Show,
      router: HttpCacheWithLiveviewWeb.Router,
      session: %{},
      view: HttpCacheWithLiveviewWeb.UserLive.Show
    }}, 1699299863630, 86400},
  :no_erlang_term
]

iex(29)> decode_lv_stuff.(data_phx_static)
[
  :no_erlang_term,
  {{5, %{assign_new: [], flash: %{}, id: "phx-F5UgOQKJJCK_shjD"}},
   1699299863630, 86400},
  :no_erlang_term
]

(and I’ve set data in the session in this example).

LostKobrakai · November 7, 2023, 11:04am

Interesting, it seems like it only adds additional LV-specific session values in there:

live_session :app,
  session: %{
    "abc" => "def"
  } do
  live "/", Live, :index
end

With that the data-phx-session decodes to:

{{5,
    %{
      id: "phx-F5VSPUb9SFly8eMl",
      session: %{"abc" => "def"},
      parent_pid: nil,
      router: Router,
      view: Live,
      root_pid: nil,
      root_view: Live,
      live_session: {:app, 1699354849671623834}
    }}, 1699354857549, 86400}

Those additional session values could be determined at runtime as well when an MFA is provided.

onnimonni · September 2, 2024, 12:35pm

Thanks @tangui for your great blog post regarding http caching Liveview pages: Caching Liveviews - Part 1: The road to HTTP-caching Liveviews - Tangui's blog

I’m eagerly waiting for the next blog posts already. I saw that you needed to create your own fork of Phoenix. Did you create a pull request to Phoenix core to allow disabling the CSRF tokens?

tangui · September 2, 2024, 1:16pm

I’ll publish the second part (Caching Liveviews - Part 2: Publicly caching private Liveviews) this week.

I tried earlier this year but the PR was unclear and not great TBH. It’d be nice if some devs with security expertise could confirm the hypothesis I’ve made in the post (that is, you can safely rely on origin only). I’ll open the PR this month anyway and keep the community informed in this post

LostKobrakai · September 2, 2024, 1:29pm

Afaik the csrf token is not just used to secure the websocket connection against CSWSH, but also to sign the session data put into the html. Though I guess if you want to cache html you cannot put per user data on the html anyways. At that point removing the csrf token is indeed viable and should be supported without needing to adjust any of the 3rd party codebases involved - e.g. I did that here: Do not use use a session cookie · LostKobrakai/hex-bobs-list@f86dde5 · GitHub

tangui · September 2, 2024, 2:15pm

It’s indeed used to store the additional session data set by the :session option of Phoenix.LiveView.Router.live_session/3. Is this what you are referring to?

Or user data from the session? Both will be discussed in the next blog post

LostKobrakai · September 2, 2024, 2:41pm

LV sessions are merged data between what is in the plug/http session store and what is provided by the :session option on live_session. The merged data is put on the html tags to be sent back by LV. It’s a bit unfortunate that LV calls that data in the html tag session given it’s technically completely unrelated to plug/http level session data.

This basically affects any data you’d get in mount/3’s second parameter. So, no differently from HTTP-level caching, dealing with logged-in users would likely require the usage of the session.

tangui · September 2, 2024, 5:43pm

Unless I’m terribly wrong, this is not the case: only data set via the :session option of a live_session is stored in the HTML, and not the session data from the cookie. I have the feeling you assume the websocket connection doesn’t include the cookie, which is wrong - it does. Hence the CSWSH issue discussed in the blog post.

Let’s check it out after 1) setting session cookie data and 2) setting live_session :default, session: %{"titi" => "toto"} do:

Rendered HTML of the root LiveView div:

<div id="phx-F_F6hHTMdNAciQDC" data-phx-main data-phx-session="SFMyNTY.g2gDaAJhBXQAAAAIdwJpZG0AAAAUcGh4LUZfRjZoSFRNZE5BY2lRREN3B3Nlc3Npb250AAAAAW0AAAAEdGl0aW0AAAAEdG90b3cKcGFyZW50X3BpZHcDbmlsdwZyb3V0ZXJ3I0VsaXhpci5DYWNoZWFibGVMaXZldmlld3NXZWIuUm91dGVydwxsaXZlX3Nlc3Npb25oAncHZGVmYXVsdG4IAIiG5dqDevEXdwR2aWV3dytFbGl4aXIuQ2FjaGVhYmxlTGl2ZXZpZXdzV2ViLk1haW5MaXZlLkluZGV4dwlyb290X3ZpZXd3K0VsaXhpci5DYWNoZWFibGVMaXZldmlld3NXZWIuTWFpbkxpdmUuSW5kZXh3CHJvb3RfcGlkdwNuaWxuBgD5DpazkQFiAAFRgA.mhCxhz3RtfaIh7TRDV1I55h2Ra8h4o5060cFCL_x8RE" data-phx-static="SFMyNTY.g2gDaAJhBXQAAAADdwJpZG0AAAAUcGh4LUZfRjZoSFRNZE5BY2lRREN3BWZsYXNodAAAAAB3CmFzc2lnbl9uZXdqbgYA-g6Ws5EBYgABUYA.M75Zd_M-hrnOsnCrWy15ZFFKwzUJrwkJtX9dXg_iow4"><header class="relative top-0 right-0 left-0 p-1 bg-gray-100">

Decoding both values:

iex(17)> data_phx_session |> String.split(".") |> Enum.map(&Base.url_decode64!(&1, padding: false)) |> Enum.drop(1) |> Enum.take(1) |> Enum.map(&:erlang.binary_to_term/1)
[
  {{5,
    %{
      id: "phx-F_F6hHTMdNAciQDC",
      session: %{"titi" => "toto"},
      parent_pid: nil,
      router: CacheableLiveviewsWeb.Router,
      live_session: {:default, 1725294838991390344},
      view: CacheableLiveviewsWeb.MainLive.Index,
      root_view: CacheableLiveviewsWeb.MainLive.Index,
      root_pid: nil
    }}, 1725294841593, 86400}
]
iex(18)> data_phx_static |> String.split(".") |> Enum.map(&Base.url_decode64!(&1, padding: false)) |> Enum.drop(1) |> Enum.take(1) |> Enum.map(&:erlang.binary_to_term/1)
[
  {{5, %{id: "phx-F_F6hHTMdNAciQDC", flash: %{}, assign_new: []}},
   1725294841594, 86400}
]

There’s no trace of session (cookie) data, only the values set by the :session option of the live session.

LostKobrakai · September 2, 2024, 6:13pm

Seems like I got the actual implementation mixed up with how it works conceptually and what you eventually get in mount/3. But at least in the current implementation of socket you still need a csrf token to get access to the cookie. Iirc earlier versions didn’t allow that at all.

tangui · September 5, 2024, 8:49am

Here we go with the second part: Caching Liveviews - Part 2: Publicly caching private Liveviews

(cc @onnimonni )

Happy hacking!

tangui · October 13, 2024, 5:40pm

PR created: Add `check_csrf` option to socket transport options by tanguilp · Pull Request #5952 · phoenixframework/phoenix · GitHub

Feel free to comment, improve the idea.

tangui · November 4, 2024, 7:31pm

The PR has been merged. Thanks to the Phoenix team! The new option will be available as of Phoenix 1.7.15.

I’ve updated the instructions and the demo app.
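For readers landing here, a minimal sketch of what the endpoint configuration might look like with the new option. This is an assumption based on the PR title (a check_csrf socket transport option), not a copy of the updated instructions: it assumes Phoenix 1.7.15 or later, that check_csrf sits alongside the other websocket options, and the allowed origin is a placeholder.

```elixir
# endpoint.ex (sketch, assuming Phoenix >= 1.7.15)
# CSRF checking is turned off for the websocket transport, so the origin
# check is what protects against CSWSH. The host below is a placeholder.
socket "/live", Phoenix.LiveView.Socket,
  websocket: [
    connect_info: [session: @session_options],
    check_csrf: false,
    check_origin: ["https://example.com"]
  ]
```

With CSRF checking disabled, keeping an explicit :check_origin value is the point of the whole approach discussed in this thread, so it is worth setting it rather than relying on defaults.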

Happy caching!

BartOtten · November 4, 2024, 8:58pm

Nice. I was thinking of a static site generator for the static part so the initial pages can be on a CDN (and socket activated on first interaction?). Seems this lib and the enhancement it brought to the core makes a good base.

tangui · November 5, 2024, 7:31am

I guess if you can tell your CDN to browse your sitemap then you can cache your pages this way. I was thinking of warming the cache locally by simulating user requests using something like Invoke Phoenix endpoint programmatically at runtime. Will probably write a blog post about it.
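A rough sketch of what such local cache warming could look like, under several assumptions: MyApp.CacheWarmer, MyAppWeb.Endpoint and the paths are hypothetical placeholders, the caching plug sits in the pipeline these requests go through, and the cache key does not depend on something the simulated request lacks (for instance, Plug.Test.conn/2 uses a default host that may differ from the one real users hit).

```elixir
defmodule MyApp.CacheWarmer do
  @moduledoc """
  Hypothetical helper: sends simulated GET requests through the endpoint so
  that cacheable responses are stored before real users request them.
  """

  # Returns a list of {path, status} tuples, one per warmed path.
  def warm(paths) do
    opts = MyAppWeb.Endpoint.init([])

    for path <- paths do
      conn =
        :get
        |> Plug.Test.conn(path)
        |> MyAppWeb.Endpoint.call(opts)

      {path, conn.status}
    end
  end
end

# Possible usage, e.g. from a task started after the endpoint is up:
#
#   MyApp.CacheWarmer.warm(["/", "/products"])
```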

By the way, if you’re using a cloud application platform (Gigalixir, render, fly.io…) and your users are always near one of your servers, then maybe you no longer need a CDN to cache pages. It’d be interesting to take measurements and get feedback about it.