GenServer performance - what would be best for this scenario?

arpan · April 27, 2021, 5:56am

Hi Everyone

I am making a multiplayer game that uses Phoenix as the backend. It’s heavily dependent on Phoenix channels. There is no database; all data is stored in ETS.

Now, currently, I have used a Genserver backed ETS store as shown here.

From what I read ETS is very fast and should not be a bottleneck, however, I also understand that genservers process messages sequentially.

Now all the game state is stored in a single ETS table. There is a genserver GameStore which provides a public API to perform operations on the single-game ETS table.
All game data will be in the same table with the game id as the key and a struct as the value.

When the number of users on the website increases there could be a case when say 100 games are played at once, and every game will frequently change its state which requires updating or reading from the ETS table by making calls to the GameStore genserver.

I am wondering how this setup will perform under heavy load. Will this single genserver be a bottleneck?
Some other approaches that I am thinking of are…

Having a pool of worker genservers who can query the ETS table, so under load, some other genserver can pick up messages.
Spawning a genserver for every game, all managed by a DynamicSupervisor. So we have a genserver for each game that can query that game-specific data from ETS. Also, going by this approach, should this genserver also create a separate ETS table only for that game, which is deleted when the game is over?

Or any other better approach that you can suggest.

But all these things will add complexity to the code and I only want to consider these if the present setup could be a bottleneck.

Can anyone help me with these approaches and also it will be very helpful if you can provide some code reference or links explaining how to manage genserver pools or dynamically spawning genservers if you are suggesting those solutions.

Thanks!

mindok · April 27, 2021, 7:10am

First things first, can you create a load simulator so you can test performance under different loads? That way you will know what needs fixing when, and whether the fixes are good.
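A load simulator does not need to be elaborate. The following is a minimal sketch, not the poster's code: `LoadSim` and `LoadSim.Store` are hypothetical names, and the store is a stand-in GenServer that keeps a map in its state. It starts one concurrent client per game, has each client perform a number of put/get round trips, and reports the wall-clock time.

```elixir
defmodule LoadSim do
  # Hypothetical stand-in for the real store: every call goes through one process.
  defmodule Store do
    use GenServer

    def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
    def put(id, value), do: GenServer.call(__MODULE__, {:put, id, value})
    def get(id), do: GenServer.call(__MODULE__, {:get, id})

    @impl true
    def init(state), do: {:ok, state}

    @impl true
    def handle_call({:put, id, value}, _from, state), do: {:reply, :ok, Map.put(state, id, value)}
    def handle_call({:get, id}, _from, state), do: {:reply, Map.get(state, id), state}
  end

  # Runs `games` concurrent clients, each doing `ops` put/get pairs,
  # and returns the elapsed wall-clock time in microseconds.
  def run(games, ops) do
    {micros, :ok} =
      :timer.tc(fn ->
        1..games
        |> Task.async_stream(
          fn game_id ->
            for n <- 1..ops do
              :ok = Store.put(game_id, n)
              ^n = Store.get(game_id)
            end

            :done
          end,
          max_concurrency: games,
          timeout: :infinity
        )
        |> Stream.run()
      end)

    micros
  end
end

{:ok, _pid} = LoadSim.Store.start_link()
IO.puts("100 games x 1000 ops took #{LoadSim.run(100, 1_000)} microseconds")
```

Swapping the stand-in store for the real one and varying `games` and `ops` gives a rough picture of where throughput stops scaling.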

100 games isn’t really a heavy load (depending on frequency of updates per game), but piping all activity through a single process (particularly one that “owns” the ets table) will end in tears before too long - one bad input will crash all games for all users. OTP is all about isolating “conversations” so one failure doesn’t affect thousands or millions of connections.

Your second option (GenServer per game) makes sense. Whether or not your second option is absolutely the best I don’t know, but you should try coding it anyway - you will learn some key lessons along the way and it shouldn’t introduce much complexity overall in return for improved reliability and responsiveness.

This article will give you some pointers: The Erlangelist - To spawn, or not to spawn?
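As a starting point for the GenServer-per-game option, here is a minimal sketch that assumes a `Registry` for looking processes up by game id and a `DynamicSupervisor` for starting them. All `GameSketch.*` names and the shape of the state are hypothetical, not taken from the poster's project.

```elixir
defmodule GameSketch.Server do
  # :temporary means a finished (or crashed) game is not restarted automatically.
  use GenServer, restart: :temporary

  def start_link(game_id) do
    GenServer.start_link(__MODULE__, game_id, name: via(game_id))
  end

  def add_player(game_id, player_id), do: GenServer.call(via(game_id), {:add_player, player_id})
  def players(game_id), do: GenServer.call(via(game_id), :players)
  def finish(game_id), do: GenServer.stop(via(game_id), :normal)

  # Each game process is registered under its game id.
  defp via(game_id), do: {:via, Registry, {GameSketch.Registry, game_id}}

  @impl true
  def init(game_id), do: {:ok, %{id: game_id, players: MapSet.new()}}

  @impl true
  def handle_call({:add_player, player_id}, _from, state) do
    {:reply, :ok, %{state | players: MapSet.put(state.players, player_id)}}
  end

  def handle_call(:players, _from, state) do
    {:reply, MapSet.to_list(state.players), state}
  end
end

# In a real application these two would be children of the supervision tree.
{:ok, _} = Registry.start_link(keys: :unique, name: GameSketch.Registry)
{:ok, _} = DynamicSupervisor.start_link(strategy: :one_for_one, name: GameSketch.Supervisor)

{:ok, _pid} = DynamicSupervisor.start_child(GameSketch.Supervisor, {GameSketch.Server, "game-1"})
:ok = GameSketch.Server.add_player("game-1", "alice")
IO.inspect(GameSketch.Server.players("game-1"), label: "game-1 players")
```

With this layout a crash in one game process leaves every other game untouched, and calling `finish/1` when a game ends stops only that game's process.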

LostKobrakai · April 27, 2021, 7:18am

Also be sure to understand the difference between sequencing writes by going through the genserver process for writing to ets vs. having the ets table be public and directly writing to it without involving the genserver starting the table.

axelson · April 27, 2021, 7:29am

Here’s another good blog post I’d recommend you read to understand the performance better: The dangers of the Single Global Process

mattbaker · April 27, 2021, 7:30am

I was thinking the same thing as @LostKobrakai.

If it’s a named table you don’t necessarily need these interactions to happen in a “handle_call” for example.

I’m actually not sure I agree with that approach in the thoughtbot article, it’s unclear to me what the advantage is of reads and writes going through the genserver process in their example.

If this genserver is nothing but a wrapper around your ETS operations, you might try building it without a genserver first, then add one later, just as a learning tool.

arpan · April 27, 2021, 7:55am

Thanks for the replies, everyone.

@mindok

but piping all activity through a single process (particularly one that “owns” the ets table) will end in tears before too long - one bad input will crash all games for all users.

Yes, that’s exactly what I am worried about: the genserver might become a bottleneck very fast. Regarding the genserver crashing, I think that won’t be a problem, since it’s under a supervisor which would restart it, and the genserver has no state. Everything is in ETS, so we should be fine.

@LostKobrakai
Yea, currently the genserver creates the ETS table in its init callback like :ets.new(@table_name, [:named_table, :set, :private]).

This means only the genserver process is allowed to access the ETS table, as it’s private. This will have to change if I have a genserver per game.

Making all ETS table access go through the genserver will sequence writes as you mentioned, but I am not sure if there will be problems if the ETS table is public and accessed by multiple genservers (each game has its own genserver). Each genserver should access only its own game and not the data for some other game, so I think there shouldn’t be any problems.

@mattbaker

it’s unclear to me what the advantage is of reads and writes going through the genserver process in their example.

Yes, that is exactly my thinking as well. I think the only use of the genserver here is if we want to make the ETS table accessible only via the genserver process, and starting the genserver also creates the table (but I think we don’t need a genserver just for creating the table).

I am sharing the game genserver code that I have written for reference, there will be many more handle_call added as I make the game

The below genserver is supervised by a Pictionary.StoreSupervisor so it will restart if it crashes for some reason.

defmodule Pictionary.Stores.GameStore do
  use GenServer
  alias Pictionary.Game
  require Logger

  @table_name :game_table
  @custom_word_limit 10000

  @permitted_update_params [
    "id",
    "rounds",
    "time",
    "max_players",
    "custom_words",
    "custom_words_probability",
    "public_game",
    "vote_kick_enabled"
  ]

  ## Public API

  def get_game(game_id) do
    GenServer.call(__MODULE__, {:get, game_id})
  end

  def add_game(game) do
    GenServer.call(__MODULE__, {:set, game})
  end

  def update_game(game_params) do
    GenServer.call(__MODULE__, {:update, game_params})
  end

  def change_admin(game_id, admin_id) do
    GenServer.call(__MODULE__, {:update_admin, %{game_id: game_id, admin_id: admin_id}})
  end

  def add_player(game_id, player_id) do
    GenServer.call(__MODULE__, {:add_player, game_id, player_id})
  end

  def remove_player(game_id, player_id) do
    GenServer.call(__MODULE__, {:remove_player, game_id, player_id})
  end

  ## GenServer callbacks

  def start_link(_opts) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  def init(_args) do
    # Create an ETS table.
    # Private access limits reads and writes to the owner process.
    :ets.new(@table_name, [:named_table, :set, :private])

    {:ok, nil}
  end

  def handle_call({:get, game_id}, _from, state) do
    {:reply, fetch_game(game_id), state}
  end

  def handle_call({:set, %Game{id: game_id} = game_data}, _from, state) do
    # Bind game_data to the struct itself, not to the whole {:set, game} message.
    # The match below crashes (and restarts) the genserver if the ETS insert fails.
    true = :ets.insert(@table_name, {game_id, game_data})

    Logger.info("Create game #{game_id}")

    {:reply, game_data, state}
  end

  def handle_call({:update, %{"id" => id} = game_params}, _from, state) do
    # fetch_game/1 returns the stored game, or nil if the id is unknown
    game = fetch_game(id)

    updated_game =
      if game do
        filtered_params =
          game_params
          |> Enum.filter(fn {key, _val} -> key in @permitted_update_params end)
          |> Enum.map(fn {key, val} -> {String.to_atom(key), val} end)
          |> Enum.into(%{})
          |> handle_custom_words()

        updated_game = struct(game, Map.put(filtered_params, :updated_at, DateTime.utc_now()))

        true = :ets.insert(@table_name, {id, updated_game})

        Logger.info("Update game #{id}")

        updated_game
      end

    {:reply, updated_game || game, state}
  end

  def handle_call({:update_admin, %{game_id: id, admin_id: admin_id}}, _from, state) do
    game = fetch_game(id)

    # Guard against an unknown game id, then check that the new admin is a player
    if game && MapSet.member?(game.players, admin_id) do
      updated_game = struct(game, %{creator_id: admin_id, updated_at: DateTime.utc_now()})

      true = :ets.insert(@table_name, {id, updated_game})

      Logger.info("Change admin for game #{id} to #{admin_id}")

      {:reply, updated_game, state}
    else
      Logger.warn("Could not change game admin")

      {:reply, game, state}
    end
  end

  def handle_call({:add_player, game_id, player_id}, _from, state) do
    game = fetch_game(game_id)

    # Strictly less than, so the game never holds more than max_players
    if game && MapSet.size(game.players) < game.max_players do
      game = %Pictionary.Game{game | players: MapSet.put(game.players, player_id)}
      true = :ets.insert(@table_name, {game_id, game})
      Logger.info("Add player #{player_id} to game #{game_id}")
      {:reply, game, state}
    else
      Logger.warn("Could not add player to game")

      {:reply, :error, state}
    end
  end

  def handle_call({:remove_player, game_id, player_id}, _from, state) do
    game = fetch_game(game_id)

    if game do
      game = %Pictionary.Game{game | players: MapSet.delete(game.players, player_id)}
      true = :ets.insert(@table_name, {game_id, game})
      Logger.info("Removed player #{player_id} from game #{game_id}")

      # Remove game if everyone leaves
      if MapSet.size(game.players) == 0 do
        true = :ets.delete(@table_name, game.id)
        Logger.info("Removed game #{game_id}")
      end

      # Change admin if admin leaves
      if MapSet.size(game.players) > 0 && player_id == game.creator_id do
        Task.start_link(fn ->
          new_admin = get_random_player(game.players)
          change_admin(game_id, new_admin)

          # Broadcast on game channel about admin change
          PictionaryWeb.Endpoint.broadcast!("game:#{game.id}", "game_admin_updated", %{
            creator_id: new_admin
          })
        end)
      end

      {:reply, game, state}
    else
      Logger.warn("Could not remove player from game")

      {:reply, :error, state}
    end
  end

  ## Private helpers

  defp handle_custom_words(%{custom_words: custom_words} = filtered_params) do
    custom_word_list =
      custom_words
      |> String.split(",")
      |> Stream.map(fn word ->
        word
        |> String.downcase()
        |> String.trim()
      end)
      # Keep words of 3 to 29 characters; both bounds must hold
      |> Stream.filter(&(String.length(&1) < 30 && String.length(&1) > 2))
      |> Stream.uniq()
      |> Enum.take(@custom_word_limit)

    Map.put(filtered_params, :custom_words, custom_word_list)
  end

  defp handle_custom_words(filtered_params), do: filtered_params

  defp fetch_game(game_id) do
    case :ets.lookup(@table_name, game_id) do
      [{_id, {:set, game}}] -> game
      [{_game_id, game}] -> game
      _ -> nil
    end
  end

  defp get_random_player(players) do
    players
    |> MapSet.to_list()
    |> Enum.shuffle()
    |> List.first()
  end
end

paulstatezny · April 27, 2021, 2:27pm

I thought ETS was optimized for super fast reads at the cost of slower writes.

Am I correct? If so, ETS may not be the most performant option unless you write to it infrequently. (And I assume this may not be the case for a game.)

a8t · April 27, 2021, 3:03pm

I implemented something extremely similar to the thoughtbot article recently (for generating random slugs for urls, fetch either gets or sets and returns).

The difference is that I’m only using the Genserver to handle the ets table lifecycle, since ets tables are tied to processes.

The functions to get and set to the ets table are defined in the same module, but as regular functions that directly read/write to ets, without handle_call’ing.

I am open to criticism here as a newb (this is in fact my first Genserver! Lol), but it seems to have the best of both worlds.
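The pattern described above could look roughly like the following sketch. It assumes the table is created as `:public`, since otherwise the plain functions would fail when called from other processes; `SlugCache` and the table name are hypothetical.

```elixir
defmodule SlugCache do
  use GenServer

  @table :slug_cache_sketch

  # Plain functions: they run in the caller's process and touch ETS directly,
  # without a round trip through the GenServer.
  def put(key, value), do: :ets.insert(@table, {key, value})

  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> value
      [] -> nil
    end
  end

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(_arg) do
    # The GenServer only owns the table; :public lets any process read and write.
    :ets.new(@table, [:named_table, :set, :public])
    {:ok, nil}
  end
end

{:ok, _pid} = SlugCache.start_link()
true = SlugCache.put("abc", "https://example.com")

# Reading from a different process works because the table is public.
value = Task.await(Task.async(fn -> SlugCache.get("abc") end))
IO.inspect(value, label: "read from another process")
```

The trade-off is that writes from different processes are no longer sequenced, so a read-modify-write on the same key can race.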

krasenyp · April 28, 2021, 6:38am

Your approach is the most appropriate for most use cases. OP should probably do the same as you.

keathley · April 28, 2021, 11:25am

This isn’t quite accurate. ETS tables do have config options to optimize reads or optimize writes, and setting either option to true will tend to slow down the inverse operation (depending on table type). But neither of those configs are set to true by default.
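The options in question are `read_concurrency` and `write_concurrency`, passed to `:ets.new/2`. A small sketch that checks the defaults with `:ets.info/2` (the table names are arbitrary):

```elixir
default = :ets.new(:default_opts_sketch, [:set, :public])

tuned =
  :ets.new(:tuned_opts_sketch, [:set, :public, read_concurrency: true, write_concurrency: true])

# Neither option is enabled unless it is asked for.
IO.inspect(:ets.info(default, :read_concurrency), label: "default read_concurrency")
IO.inspect(:ets.info(default, :write_concurrency), label: "default write_concurrency")
IO.inspect(:ets.info(tuned, :read_concurrency), label: "tuned read_concurrency")
IO.inspect(:ets.info(tuned, :write_concurrency), label: "tuned write_concurrency")
```

Whether either option helps depends on the access pattern, so it is worth measuring both settings under a realistic load.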

arpan · April 28, 2021, 4:54pm

I doubt whether this approach would work. Actually, ETS tables have some rules around accessibility. In my implementation, the ETS table I create is private, which means only the process that created it (the genserver) can access it.

The functions to get and set in ETS tables should be called from the genserver process, otherwise you will not be able to access the table unless you have a less strict access rule like public.

The functions to get and set to the ETS table are defined in the same module, but as regular functions that directly read/write to ETS, without handle_call’ing.

I don’t think having the functions defined in the same module will help if the process trying to access the table is not the genserver process. If you access the ETS table in genserver callbacks like handle_call or handle_cast, it will be the genserver process that accesses the ETS table, so that should work.

I have limited experience with genservers please correct me if I am wrong, but I just shared my idea of how I think it works.
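The access rule for `:private` tables can be checked in a few lines. This sketch creates a private table in the current process and then tries to read it from a task; the table name is arbitrary.

```elixir
private = :ets.new(:private_sketch, [:set, :private])
true = :ets.insert(private, {:k, 1})

# The owner can read its own table.
owner_result = :ets.lookup(private, :k)

# A different process is refused: ETS raises ArgumentError for the non-owner.
other_result =
  Task.async(fn ->
    try do
      :ets.lookup(private, :k)
    rescue
      ArgumentError -> :no_access
    end
  end)
  |> Task.await()

IO.inspect(owner_result, label: "owner")
IO.inspect(other_result, label: "other process")
```

What matters is which process executes the ETS call, not which module the function is defined in.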

derek-zhou · April 28, 2021, 4:59pm

All 3 ways are possible with ETS:

read/write exclusively from a single process
a single process handles all mutations, any process can read
any processes can read and write

I usually choose the middle ground so my ets table is always consistent and reads are parallel
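The middle option corresponds to a `:protected` table: only the owner may write, and every process may read. A minimal sketch, with the hypothetical names `ProtectedStore` and `:protected_store_sketch`:

```elixir
defmodule ProtectedStore do
  use GenServer

  @table :protected_store_sketch

  # Reads run in the caller's process, so they can proceed in parallel.
  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> value
      [] -> nil
    end
  end

  # Writes are funnelled through the owner, so they are applied one at a time.
  def put(key, value), do: GenServer.call(__MODULE__, {:put, key, value})

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(_arg) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    {:ok, nil}
  end

  @impl true
  def handle_call({:put, key, value}, _from, state) do
    true = :ets.insert(@table, {key, value})
    {:reply, :ok, state}
  end
end

{:ok, _pid} = ProtectedStore.start_link()
:ok = ProtectedStore.put(:game_1, %{round: 1})
IO.inspect(ProtectedStore.get(:game_1), label: "direct read")

# A direct write from a non-owner process is rejected by ETS.
direct_write =
  try do
    :ets.insert(:protected_store_sketch, {:game_1, :overwritten})
  rescue
    ArgumentError -> :write_refused
  end

IO.inspect(direct_write, label: "direct write")
```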

whoops · May 3, 2021, 7:04pm

I think you have a major problem waiting to bite you here. If your global GameStore process crashes, it’ll be restarted with a new pid that isn’t the owner of the private table. In fact, the private table will automatically be deleted when the owner terminates, so you’re looking at dumping all your state in the event of a crash at the moment.

Personally I’d probably have the processes write to the ETS tables directly without a gatekeeper (making it public) or have a table per-game depending on what’s easiest for your application.
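The loss of the table on owner termination is easy to reproduce. In this sketch a throwaway process creates a named table and is then killed; the process and table names are arbitrary, and the table is made public only so the parent can read it before the crash.

```elixir
parent = self()

owner =
  spawn(fn ->
    :ets.new(:owned_sketch, [:named_table, :set, :public])
    :ets.insert(:owned_sketch, {:game, "state"})
    send(parent, :ready)
    # Keep the owner alive until it is killed.
    Process.sleep(:infinity)
  end)

receive do
  :ready -> :ok
end

before_crash = :ets.lookup(:owned_sketch, :game)

# Kill the owner and wait until it is really gone.
ref = Process.monitor(owner)
Process.exit(owner, :kill)

receive do
  {:DOWN, ^ref, :process, _pid, _reason} -> :ok
end

# :ets.info/1 returns :undefined when the table no longer exists.
after_crash = :ets.info(:owned_sketch)

IO.inspect(before_crash, label: "before crash")
IO.inspect(after_crash, label: "after crash")
```

A supervisor restarting the owner would create a fresh, empty table, so the previous contents are not recovered.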

arpan · May 3, 2021, 8:33pm

Yea you are right, I haven’t thought about this.
While it is unlikely for the game genserver to crash, it is not impossible.

I will reconsider my approach and maybe use public tables not tied to a genserver.

arpan · June 20, 2021, 8:11am

Thank you guys for all your suggestions. I ended up spawning a new genserver for every game and storing the individual game state in its own genserver. The genserver gets automatically shut down when the game ends.

This strategy ensures that a single genserver will not block processing updates to the games and

This has worked pretty well for me till now.

The game is complete, check it out here. I also wrote a detailed post about it here.