How to manage state for each player in Phoenix

Hi guys,
I am working on a game (a grid-based game) which will be monitored by a server (in Elixir).
The client is in TypeScript (already done). So I have to manage the game state for every player on the server.
Right now I am writing the server code in Elixir as a separate app, where every player has a GenServer that manages and simulates the player’s game remotely.

But when I plug in Phoenix to use websockets, should I spawn a GenServer for each player on connect?
Or, since I read that every channel per user is also a GenServer, should I use that? I have no idea how to implement this.
(This is not a multiplayer game, just a single player game monitored by server).
Thanks

Can you please tell us a bit more about the details and dynamics of the game?

If it’s not a multiplayer game and each player has its own state that doesn’t have to deal with other states of other players/games, you could just have one process (the Phoenix Channel process) and the state of this process could be a %Game{} struct you update at each action/event.

Elixir has great concurrency, but spawning processes adds complexity, so if you can have just one channel process with a game struct/state, it’s easier to manage than another GenServer linked to the channel process.

If you go this way, remember to have a separate Game module which is your interface to alter the %Game{} struct. Separate what is the game logic from what happens in the channel process.
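
A minimal sketch of that separation, assuming a hypothetical `Game` struct with a `grid` map and a `score` (neither field comes from the actual game). The module is pure data in, data out, so it knows nothing about channels:

```elixir
defmodule Game do
  # Hypothetical state: cells keyed by {row, col}, plus a running score.
  defstruct grid: %{}, score: 0

  # Returns a fresh game.
  def new, do: %Game{}

  # Places `piece` on an empty cell and rejects moves onto occupied cells.
  def place(%Game{grid: grid} = game, {_row, _col} = cell, piece) do
    if Map.has_key?(grid, cell) do
      {:error, :occupied}
    else
      {:ok, %{game | grid: Map.put(grid, cell, piece), score: game.score + 1}}
    end
  end
end
```

The channel would then keep the struct in `socket.assigns` and only call functions like `Game.place/3` from `handle_in`, replying with an error when the move is rejected.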

:+1:

But one would need to note that it would also start the game anew for each newly opened tab, since the channels won’t be reused. :point_up_2: And also that the ephemeral game state would be lost on network interruptions, since the channel process would be terminated.

Yes, right! It really depends on whether they want to start a new game for each tab. As you said, with a disconnection the game would be lost, since the channel process would exit, losing the state.
To solve this I would then use a Game GenServer process registered in a Registry with something like the user_id. This way, if there is a disconnection/reconnection, the game process is still there.

The problem at this point is having processes for abandoned games. So there should be some sort of timeout, like: if the GenServer doesn’t receive a message for 30 minutes, it exits. @idi527 what do you think?

@madclaws this is a great book where you see how to build a game with Elixir + Phoenix: Search

To solve this I would then use a Game GenServer process registered in a Registry with something like the user_id. This way, if there is a disconnection/reconnection, the game process is still there.

So you suggest having an additional process with the game state or am I misunderstanding? I think there’s still a way to avoid that and continue with your initial approach of just using channels by pushing the state to the client (it would supply the state on channel join), but have some kind of verification logic in place on the backend (like sha256 of the game state object written to ets on each update) so that the client is not allowed to cheat.
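
A rough sketch of that verification idea, assuming the state travels to and from the client as a serialized binary (for example a JSON string) so the hash can be taken over the exact bytes. `StateSeal` and the table name are made-up names for illustration:

```elixir
defmodule StateSeal do
  # Hypothetical module: remembers a SHA-256 digest of each player's last
  # server-accepted state so a client-supplied state can be checked on join.
  @table :game_state_hashes

  # Creates the table; the calling process becomes its owner.
  def init_table, do: :ets.new(@table, [:set, :public, :named_table])

  # Call after every update the server has validated.
  def remember(user_id, serialized_state) when is_binary(serialized_state) do
    :ets.insert(@table, {user_id, digest(serialized_state)})
    :ok
  end

  # True only if the supplied state matches the last recorded digest.
  def legit?(user_id, serialized_state) when is_binary(serialized_state) do
    case :ets.lookup(@table, user_id) do
      [{^user_id, hash}] -> hash == digest(serialized_state)
      [] -> false
    end
  end

  defp digest(binary), do: :crypto.hash(:sha256, binary)
end
```

On join the channel would call `StateSeal.legit?/2` with whatever the client sent and refuse to resume (or start a new game) when it returns `false`.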

The client can then use browser capabilities to store the state, like local storage and service workers to possibly update the game across the browser tabs. But it still would be limited to a single client machine … So that if they started the game on their laptop and then opened the game from their phone, it would start anew. All of that could be then solved with a database on the backend!

I really don’t know though, you are completely right that it depends on what the purpose of the game is.

I usually try to avoid spawning processes if I can, but yes… it could be a GenServer or an ETS table, but something that survives a disconnection.

I like this idea :blush: If you have to store the SHA of the game in the state, then why not save the game directly?
UPDATE: Maybe I now get what you mean: saving all the SHAs of the games in one table just to check that they are legit. Still, hashes of abandoned games would have to be removed “manually”. Obviously having a hash is better than a process to kill.
The nice thing about a process is that the “auto exit after timeout” would be easier to implement. For example, on every event a delayed message is sent with Process.send_after. When this message is received by the process, the process checks when the last game event was received.
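
Something along these lines, as a sketch only. It is a small variation on the idea: one pending check is kept alive and rescheduled instead of sending a new delayed message per event. `IdleGame` and the `:idle_ms` option are invented names, and the 30 minute default just mirrors the number mentioned earlier:

```elixir
defmodule IdleGame do
  use GenServer

  # Hypothetical default idle limit: 30 minutes.
  @default_idle_ms 30 * 60 * 1000

  def start(opts \\ []), do: GenServer.start(__MODULE__, opts)
  def event(pid, event), do: GenServer.call(pid, {:event, event})

  @impl true
  def init(opts) do
    idle_ms = Keyword.get(opts, :idle_ms, @default_idle_ms)
    Process.send_after(self(), :idle_check, idle_ms)
    {:ok, %{events: [], last_event_at: now(), idle_ms: idle_ms}}
  end

  @impl true
  def handle_call({:event, event}, _from, state) do
    # Every game event refreshes the "last seen" timestamp.
    {:reply, :ok, %{state | events: [event | state.events], last_event_at: now()}}
  end

  @impl true
  def handle_info(:idle_check, state) do
    idle_for = now() - state.last_event_at

    if idle_for >= state.idle_ms do
      # Nothing happened for the whole idle window: the game is abandoned.
      {:stop, :normal, state}
    else
      # Something happened recently: check again when the window would expire.
      Process.send_after(self(), :idle_check, state.idle_ms - idle_for)
      {:noreply, state}
    end
  end

  defp now, do: System.monotonic_time(:millisecond)
end
```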

I think just an ETS table (or a GenServer), as you said, should be enough in the end :blush:. If you register the game under a user_id (and for the login, yes, you need a database), then if you log in on a different device you can resume the same game.

Thanks for the detailed info. So the game is here: https://v1.gamezop.com/p/gamepage/S1Wrpf1v5ym
It’s a drag-and-drop grid-based game. So you guys were saying to use the Phoenix channel process instead of a GenServer per player. In this game I just want to simulate it on the server so that players can’t cheat; that is the whole point of server monitoring.
But during a socket disconnection we will lose the game state. In one of my other multiplayer games I tackled this by storing the game state with the corresponding roomid in Redis. So you are saying to use a GenServer for each player, registered in a Registry. So how will we retrieve the game state on reconnection?

That’s really cool! Do you earn money with the ads of that site?

Yes, instead of redis you can use Registry to register your game process under a key which could be the roomid. Or you could use an ETS table which gives you better performance to update the state (and you can give a name to ETS tables as well)

P.S: remember that Registry works only on a local node, so if you want to scale the app over multiple nodes using Distributed Elixir, you need to use something like the :global registry or, even better, a distributed registry like horde which uses CRDTs for synching instead of locking the nodes at each update

Yeah, basically the company I work for earns money from the ads.
So should I create an ETS table for the supervisor? I read somewhere that ETS is per process.

ETS is single-machine, but a table can be accessed by multiple processes. You can set the permissions (from ets documentation):
:public: Any process can read or write to the table.
:protected: The owner process can read and write to the table. Other processes can only read the table. This is the default setting for the access rights.
:private: Only the owner process can read or write to the table.

The owner is the process which creates the table. It’s really up to you and the kind of security you want to have. I would avoid any intermediate process between the player channel and the ets table, because it could become a bottleneck.

Notice that there is no automatic garbage collection for tables. Even if there are no references to a table from any process, it is not automatically destroyed unless the owner process terminates. To destroy a table explicitly, use function delete/1. The default owner is the process that created the table. To transfer table ownership at process termination, use option heir or call give_away/3.
Erlang -- ets
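
To make the access modes concrete, here is a small sketch of a `:public` named table that any channel process could read and write directly. `GameTable` and `:player_games` are hypothetical names:

```elixir
defmodule GameTable do
  # Hypothetical table name.
  @table :player_games

  # :public lets every process read and write without going through the
  # owner. The process that calls create/0 becomes the owner.
  def create, do: :ets.new(@table, [:set, :public, :named_table])

  def put(user_id, game), do: :ets.insert(@table, {user_id, game})

  def get(user_id) do
    case :ets.lookup(@table, user_id) do
      [{^user_id, game}] -> {:ok, game}
      [] -> :error
    end
  end

  # Entries are never cleaned up automatically, so abandoned games
  # have to be removed explicitly.
  def drop(user_id), do: :ets.delete(@table, user_id)
end
```

Since the table dies with its owner, `create/0` would normally be called from a long-lived process under the supervision tree rather than from a channel.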

But I would start with a normal GenServer process to hold a player’s game state and register it on a Registry using the roomid key. Use the :via tuple for this.
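
A minimal sketch of the `:via` approach, assuming a unique-keys Registry started under the made-up name `GameRegistry` and a placeholder state that is just a list of moves:

```elixir
defmodule PlayerGame do
  use GenServer

  # GameRegistry is a hypothetical name. It must be started first with
  # Registry.start_link(keys: :unique, name: GameRegistry).
  def via(room_id), do: {:via, Registry, {GameRegistry, room_id}}

  def start(room_id), do: GenServer.start(__MODULE__, %{moves: []}, name: via(room_id))
  def move(room_id, move), do: GenServer.call(via(room_id), {:move, move})
  def state(room_id), do: GenServer.call(via(room_id), :state)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:move, move}, _from, state) do
    new_state = %{state | moves: [move | state.moves]}
    {:reply, :ok, new_state}
  end

  def handle_call(:state, _from, state), do: {:reply, state, state}
end
```

After a reconnect the channel only needs the roomid: `PlayerGame.state(room_id)` finds the same process through the Registry, and a second `PlayerGame.start(room_id)` returns `{:error, {:already_started, pid}}` instead of creating a duplicate.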

Then if you really see that you need better performance move to ETS.

So basically, from what I understand:

1. The player connects to a Phoenix channel.
2. On join, we spawn a GenServer which will maintain the state.
3. We register the GenServer in a Registry with the userid.
4. If the player disconnects, on reconnect we retrieve the game state from the GenServer with our userid.

Then what is the point of using the Phoenix channel process state, if we are already using a GenServer for each player?

Or were you talking about creating a single GenServer under the supervisor (on server start) that stores the game state of all players, keyed by userid, in that global kind of GenServer?

What I do is a genserver per game… and after each player’s move, I store the state (or equivalent) into ETS.

That way I can restore state from ETS when a genserver crashes and restarts.

I usually do 2-player board games (go, chess, backgammon), so 1 game (genserver) for 2 players.

Or 1 game for 4 players for bridge card game.

As I can detect login and logout, the game genserver has a list of current players, and detects the idle state (1 user left… the game is paused)

When all players have left I use a grace period, after which I stop the game genserver.

This way, I can ensure games are restarted after crash, and no zombie games are left forever on the server.
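
A stripped-down sketch of that pattern, not the actual implementation described above. `BackedGame` and the `:game_backups` table are invented names, the state is just a list of moves, and the table is assumed to be created by some longer-lived process so it outlives the game process:

```elixir
defmodule BackedGame do
  use GenServer

  # Hypothetical table; it must be created (as :public, :named_table)
  # by a process that outlives the game processes.
  @table :game_backups

  def start(game_id), do: GenServer.start(__MODULE__, game_id)
  def move(pid, move), do: GenServer.call(pid, {:move, move})
  def moves(pid), do: GenServer.call(pid, :moves)

  @impl true
  def init(game_id) do
    # Restore the last saved state if a previous incarnation left one behind.
    moves =
      case :ets.lookup(@table, game_id) do
        [{^game_id, saved}] -> saved
        [] -> []
      end

    {:ok, %{id: game_id, moves: moves}}
  end

  @impl true
  def handle_call({:move, move}, _from, state) do
    moves = [move | state.moves]
    # Persist after every move so a restart can pick up from here.
    :ets.insert(@table, {state.id, moves})
    {:reply, :ok, %{state | moves: moves}}
  end

  def handle_call(:moves, _from, state), do: {:reply, state.moves, state}
end
```

In a real app the process would be started by a supervisor so the restart happens automatically. When the grace period ends and the game is stopped for good, its ETS entry should be deleted as well.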

Cool. I have two doubts:
1. We will be doing this over Phoenix socket channels, each of which is itself a process. Will it be an overhead to use a GenServer on top of that?
2. Ideally we should spawn a GenServer when the client connects (in my case) or when the match starts (in your case), right?

You can use A LOT of processes, and you can read about good practices for processes here…

https://www.theerlangelist.com/article/spawn_or_not

What about this?

Yes it’s true…

So looking at what you said you have:

Websocket connection between a client and your server
A gen_server that holds the game state for that client.

Storing the game in the socket is possible but it has many issues - a disconnect/network/socket problem will throw out the state. No bueno unless you’re saving it somewhere external. You can solve it but I think it will add complexity where it doesn’t need to.

Any process in Erlang can be named, and this is something that you can use here if your players have a unique identifying property (a user id, a token, etc). To use non-atom names you need to register them either globally or through Registry (or any other module that implements the callbacks a registry needs), using the :via option.

You would (on channel join) ask if the server for the player was running, if yes you would request the state from it, if not, you would start a fresh genserver giving the unique identifier to be used as part of its name. This takes care of the multiple channels/tabs issue. If you want to know when all sockets are disconnected to perhaps clean up, you could also monitor the channel pids and set an appropriate timeout in case all channels go down (meaning the user disconnected and didn’t reconnect in a sensible timeframe)
So on your socket you could have something like:

def join("user:" <> id, %{"token" => token}, socket) do
  # do validation on the token to check if it's a valid user etc
  n_id = String.to_integer(id)

  case GenservModule.start_or_state(n_id) do
    {:ok, state} ->
      # store the integer id so it matches the name the genserver is registered under
      {:ok, state, assign(socket, :player, n_id)}

    {:error, reason} ->
      {:error, reason}
  end
end

def handle_in("make_a_move", params, %{assigns: %{player: id}} = socket) do
    case GenservModule.move(id, params) do
      {:ok, response} ->
        {:reply, {:ok, response}, socket}
      {:error, errors} ->
        {:reply, {:error, %{errors: errors}}, socket}
    end
end

Then on your genserver module you would have a function as part of its public api start_or_state/1

def gen_serv_ref(id), do: {:global, {:game, id}}

def start_or_state(id) when is_integer(id) do
  {:ok, pid} =
    case GenServer.whereis(gen_serv_ref(id)) do
      nil ->
        # or start & link it by a dynamic supervisor, or add it to a supervisor tree, etc
        # the process may get started between whereis and start, so accept :already_started too
        case GenServer.start(__MODULE__, {id, self()}, name: gen_serv_ref(id)) do
          {:ok, pid} -> {:ok, pid}
          {:error, {:already_started, pid}} -> {:ok, pid}
        end

      pid ->
        {:ok, pid}
    end

  GenServer.call(pid, :get_state)
end

def move(id, params) do
	# maybe verify params, etc
	GenServer.call(gen_serv_ref(id), {:move, params})
end

def init({_id, channel_pid}) do
  # if you also store the game state somewhere else you could see if it was stored and feed it, otherwise if it's transient, just start fresh, etc
  monitor_ref = Process.monitor(channel_pid)
  monitors_map = Map.put(%{}, channel_pid, monitor_ref)
  {:ok, %{my_game_state: %{}, monitors: monitors_map}}
end

def handle_call(:get_state, {pid, _tag} = _caller, %{monitors: monitors, my_game_state: gs} = state) when :erlang.is_map_key(pid, monitors) do
	# because we have the guard is_map_key we know we don't need to add this channel (the caller) to the monitors
	{:reply, {:ok, gs}, state}	
end

def handle_call(:get_state, {pid, _tag} = _caller, %{monitors: monitors, my_game_state: gs} = state) do
  # here we know this channel pid isn't being monitored (happens if a new tab is opened, as it will only ask for the state and we won't be monitoring that channel unless we add it here)
  monitor_ref = Process.monitor(pid)
  n_monitors = Map.put(monitors, pid, monitor_ref)
  {:reply, {:ok, gs}, %{state | monitors: n_monitors}}
end

def handle_call({:move, params}, _, %{my_game_state: gs} = state) do
	n_game_state = GameEngine.do_stuff(gs, params)
	{:reply, {:ok, n_game_state}, %{state | my_game_state: n_game_state}}
end

# now because you're monitoring the channels we need a handle for any :DOWN messages coming from channels that die
def handle_info({:DOWN, ref, :process, pid, _reason}, %{monitors: monitors} = state) do
  {^ref, n_monitors} = Map.pop(monitors, pid)
  n_state = %{state | monitors: n_monitors}

  if map_size(n_monitors) == 0 do
    # there are no active channels for this user, lets set a timeout
    {:noreply, n_state, 25_000}
  else
    # there's still some active channel no need to set timeout
    {:noreply, n_state}
  end
end

# and the timeout handle - if the user disconnects from all channels and doesn't reconnect in 25_000ms this message will be received and in this case shutdown the server
def handle_info(:timeout, state) do
	{:stop, :normal, state}
end

You could (and maybe should) also have the channels monitor the game gen_server itself. Technically you should make it so that it’s not possible to crash it, but things out of your control might still make it crash, and with a monitor the channel can be notified if the game crashes and do whatever is needed. This also depends on how you design the access to the genserver for the game updates, etc; it might not be needed depending on that…

:wave:

You have a race condition in start_or_state.

Then please clarify! If I knew I wouldn’t have written it racy

The process can get started between GenServer.whereis and GenServer.start. That would cause {:error, {:already_started, pid}} to be returned, which wouldn’t match {:ok, pid} and would crash the channel.
