How do I reduce RAM usage with Phoenix LiveView socket?

Hey all,

I’m wondering how to reduce some of the RAM usage in my LiveView application (and maybe get some tips for diagnosing which processes and function calls are actually hogging RAM).

My use case: I have a set of geographical features stored in Postgres with the great geo_postgis library. These features are broken up into sectors, each of which is defined by a rectangular polygon. When the user moves the map on the webpage, a hook sends an event to the LiveView process with the bounding box of the map area in view. The event handler then calculates which sectors are in view and gets the features by their associated sector ID from the in-memory Cachex cache (this is obviously a large portion of RAM usage, but the total number of features is only around 20,000, about 17MB of CSV in total).

The event handler prepends the sector IDs to the :loaded_sector_ids list assign in the socket (so it doesn’t send them multiple times) and sends the data to the client with push_event/3, using Enum.reduce over the socket. The code is below:

  def handle_event("load-data", %{"bounds" => bounds}, socket) do
    socket =
      bounds
      |> MyApp.get_intersecting_features() # loads data in [{sector_id, sector_data}, ...] format
      |> Enum.reject(fn {sector_id, _sector_data} -> sector_id in socket.assigns.loaded_sector_ids end)
      |> Enum.reduce(socket, fn {sector_id, sector_data}, s ->
        s
        |> push_event("data", %{sector_id: sector_id, data: sector_data})
        |> update(:loaded_sector_ids, fn sector_ids -> [sector_id | sector_ids] end)
      end)

    {:noreply, socket}
  end

My guess is that the heavy memory usage comes from the large immutable lists of features being copied into the socket in the Enum.reduce call, but I’m not sure how to restructure this to reduce that. Would it make sense to run this in a more recursive manner? On receiving the bounds, I could load the list of sector_ids that need to be sent to the client, then send a message to the current process with the remaining list of sector_ids to load and push, until that list is empty.
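Roughly what I have in mind is something like this (an untested sketch; the :push_sectors message and the :sector_cache Cachex name are just placeholders for illustration):

  # Untested sketch: instead of pushing everything inside one Enum.reduce,
  # queue up the sector IDs and push them one handle_info at a time, so only
  # one sector's data is live in the process at any given moment.
  def handle_event("load-data", %{"bounds" => bounds}, socket) do
    sector_ids =
      bounds
      |> MyApp.get_intersecting_features()
      |> Enum.map(fn {sector_id, _sector_data} -> sector_id end)
      |> Enum.reject(&(&1 in socket.assigns.loaded_sector_ids))

    send(self(), {:push_sectors, sector_ids})
    {:noreply, socket}
  end

  def handle_info({:push_sectors, []}, socket), do: {:noreply, socket}

  def handle_info({:push_sectors, [sector_id | rest]}, socket) do
    # Re-fetch just this sector from the cache (:sector_cache is a placeholder name)
    {:ok, sector_data} = Cachex.get(:sector_cache, sector_id)
    send(self(), {:push_sectors, rest})

    {:noreply,
     socket
     |> push_event("data", %{sector_id: sector_id, data: sector_data})
     |> update(:loaded_sector_ids, fn ids -> [sector_id | ids] end)}
  end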

Looking forward to hearing suggestions, and thanks in advance!
Gus


Are you looking to reduce max RAM usage, or long-term RAM usage? Asked another way: is your current problem that you’re still seeing high RAM usage after the push_event occurs, even though the data shouldn’t need to be in memory anymore?

If looking to reduce max RAM, then I’m not sure what you could do besides possibly streaming the data from server->client and never allowing large chunks to be passed around.

If seeing high RAM usage after the push_event occurs, then you could try a very rough :erlang.garbage_collect() to see if it solves your problem. This is the least elegant way to handle it, but it’s useful for diagnostics. If you see that helps with memory usage, then you could evaluate spawning a process or Task to grab/push the data, or you could leverage a more aggressive garbage collection threshold so that it runs GC more often.
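For the diagnostic side, a rough sketch of what I mean (pid here is whichever process you’re inspecting, e.g. the LiveView or the transport; run it from IEx or wherever is convenient):

  # Rough diagnostic sketch: compare a process's memory before and after
  # forcing a major GC to see how much of it was reclaimable garbage.
  IO.inspect(Process.info(pid, [:memory, :total_heap_size]), label: "before GC")
  :erlang.garbage_collect(pid)
  IO.inspect(Process.info(pid, [:memory, :total_heap_size]), label: "after GC")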

I wrote a (now very old) blog post about memory usage with WebSocket / Channels. A good bit of this is now obsolete due to process hibernation, but it talks about diagnosing and a few various solutions.


send(socket.transport_pid, :garbage_collect)

https://hexdocs.pm/phoenix/Phoenix.Socket.html#module-garbage-collection

:fullsweep_after might be of some help; it needs Phoenix >= 1.6.3 and Erlang/OTP 24.

https://hexdocs.pm/phoenix/Phoenix.Endpoint.html#socket/3-websocket-configuration


Yea, this is a good point. There are 2 processes here, the transport and the LiveView Channel. You would want to try forcing GC to happen in both if you wanted to fully ensure cleanup. The link you provided above is specifically for the transport.


Hi @kartheek and @sb8244, thanks for the quick replies!

I tried adding send(socket.transport_pid, :garbage_collect) to the event handler last night, and it has reduced RAM usage by about 30%. I was also looking into the :fullsweep_after option, but it’s not obvious to me where to set this option for the LiveView. A quick pointer would be appreciated :smiley:

Because max RAM usage is a slight concern, I will also look into streaming methods. Will update with results after doing a comparison, hopefully later this week.

Many thanks for your suggestions!
Gus


Ah I think I found it - in my endpoint.ex file, I added the fullsweep_after: 0 option. Does this look right?

socket "/live", Phoenix.LiveView.Socket, websocket: [connect_info: [session: @session_options], fullsweep_after: 0]

Thanks!


I typically set fullsweep_after at the VM level because it seems like a universally good thing in the environments I’ve deployed to. I do that by adding -env ERL_FULLSWEEP_AFTER 20 to my vm.args file. This would apply to ALL processes in your VM, such as the transport process and the Channel process.

However, setting it at the WebSocket level like you’ve done is completely valid and a good place to start. I’ve found > 0 is a good idea, otherwise you will be running fullsweep after every single message.
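For reference, the vm.args entry is just the following (assuming a standard vm.args / vm.args.eex in a release; 20 is simply the value I tend to use, tune it for your workload):

  # Force a fullsweep (major) GC after at most 20 generational collections, for every process
  -env ERL_FULLSWEEP_AFTER 20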


I’m having an issue with LiveView memory usage, but I believe the solutions provided here aren’t applicable to my case.

I’m building a multiplayer game with Elixir/Phoenix. The state of the game is stored in a genserver, and this gamestate can be around 20MB or more in the worst-case scenario I want to support. The way I display the gamestate to each user is through LiveView: the entire state (20MB) is broadcast over a channel to the LiveView. I’m using the live_json library to push this state to the frontend, so at least my data transfer over the websocket is minimal, and I’m happy on that front.

Background Information

The state is just one big map. The game has a board/grid; if the board size is 50x50, that’s 2,500 tiles. It’s a game of hidden information, where I keep a representation of the board for each player, so a player can explore and only knows about certain things on the board. With 12 players, that’s 2,500 x 12 = 30,000 tiles, and each tile is currently represented by a little less than a kilobyte of information.

The Problem

Where the issue occurs is that each LiveView process has the entire gamestate, which means if 10 people are connected, I have 200MB usage at minimum. I believe the way live_json works is it keeps the data in memory in my LiveView so it can correctly diff what needs to be sent over the wire on subsequent updates.

I don’t mind that the genserver has 20MB of data in it. I’d happily upgrade my server’s infrastructure to support more concurrent games if that happens. The issue is only at the LiveView level, where I’d want to keep memory usage low: 200MB is excessive, and if one user opens 10 tabs of the same game, I’m already at that limit. Ideally the LiveView doesn’t keep any of the game’s data in its process.

Possible Solutions

  1. I can shave off quite a lot of data from this 20MB. I can go this route, but then there’s probably another ceiling somewhere I’ll hit in the future depending on the features and information I want to add to the state. Like I said I don’t really care about the size of the data being stored in a genserver, I don’t expect to have a huge amount of games running at the same time. It’s only on the LiveView level where I’m worried.
  2. Be more clever about what data is being sent to which LiveView. Users don’t need to know about other people’s board representations. I already make sure I don’t send certain information to certain users so no cheating can happen. I do this at the LiveView level though, because that’s where I know which user is being shown. I can also send a diff of the state over the channel so at least the channel has minimal data going over it.

My Question

Before I implement any of the above solutions, can I make it so the LiveView process doesn’t grow in size depending on the size of the game? Is there any way for me to send the data to the frontend through the LiveView but not keep it in memory in the LiveView process?

It’s a very specific scenario, and I’ll probably have to go with the above solutions. But before I do that, I want to make sure I can’t do something simpler at the LiveView level.

Code for reference

The code is a little simplified compared to what it actually is, but it’s the general concept that counts.

defmodule AppWeb.GameLive do
  use AppWeb, :live_view

  alias Phoenix.Socket.Broadcast
  alias App.{GameEngine, Game}

  @impl true
  def render(assigns) do
    ~H"""
    <div>
      <.live_component
        module={AppWeb.GameComponent}
        id={@id}
        player_id={@player_id}
        current_user={@current_user}
        ljgame={@ljgame}
      />
    </div>
    """
  end

  @impl true
  def mount(%{"id" => id}, _session, socket) do
    user = socket.assigns[:current_user]

    if connected?(socket) do
      GameEngine.subscribe(id)
    end

    {:ok, server} = GameEngine.join(id, user)
    player = GameEngine.find_player_by_user(server.game.players, user)

    {
      :ok,
      socket
      |> LiveJson.initialize("game", Game.serialize(server.game, player.id))
      |> assign(
        player_id: player.id,
        id: id
      )
    }
  end

  @impl true
  def handle_info(%Broadcast{event: "game-server:update", payload: server}, socket) do
    {
      :noreply,
      LiveJson.push_patch(socket, "game", Game.serialize(server.game, socket.assigns.player_id))
    }
  end
end

With regard to this, I believe you could potentially use temporary assigns to make sure that the game state is only stored in your GameEngine genserver. I’m not sure if this will work well for you if you need to do validation on actions (such as ensuring an attempted action is valid), but it could help you reduce the memory consumption of each LiveView. You may have already tried this as well, but I can’t tell from your example.
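Roughly what I mean, as a sketch based on your mount (untested, and I’m not sure how it interacts with live_json; the :game assign name is just for illustration):

  # Sketch: mark the big assign as temporary so it is reset to nil after each
  # render and the LiveView process doesn't keep holding the full game state.
  def mount(%{"id" => id}, _session, socket) do
    user = socket.assigns[:current_user]
    {:ok, server} = GameEngine.join(id, user)
    player = GameEngine.find_player_by_user(server.game.players, user)

    {:ok,
     assign(socket, game: Game.serialize(server.game, player.id), player_id: player.id, id: id),
     temporary_assigns: [game: nil]}
  end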

That being said, I suggest you go for the low-hanging fruit now and then optimize your memory usage after you run into (or get close to) some limit or threshold. As they say, premature optimization is the root of all evil :slight_smile:

Hope this helps!


Thank you @gus

I think in my situation I can’t use temporary assigns as I’m using live_json, which uses regular assigns under the hood.

I’ll probably just do the low-hanging fruit like you said when I hit the wall with memory usage. Currently it’s still only a problem in theory, but I expect to hit that wall at some point.

Hmm, it’d be ideal if that diffing could happen in the GameEngine genserver, to avoid duplicating so much shared game state.

I wonder if live_json could be adapted to work at a lower level since LiveViews are built on top of channels and genservers.
