How do I save a static object to avoid wastefully repeating minor functions? Must it be a GenServer?

As I understand it, the philosophical design of Elixir is that it is function driven. This is partly what makes it so concurrent and efficient. Thus we must go out of our way to save “state”, and do so only when truly needed (i.e. via a GenServer or other modules that support state).

How do we then avoid unnecessary repeated function calls?

For example, here is a module I started for my token (Joken) purposes:

defmodule My.Token do

    def get_custom_claims(user_id, duration_hrs) do
        token_config =
            %{}
            # user id: generated from the argument, validated to match it
            |> Joken.Config.add_claim("userID", fn -> user_id end, &(&1 == user_id))
            # issuer:
            |> Joken.Config.add_claim("iss", fn -> "The Issuer" end, &(&1 == "The Issuer"))
            # expiry: valid only while "exp" is still in the future
            |> Joken.Config.add_claim(
                "exp",
                fn -> Joken.CurrentTime.OS.current_time() + duration_hrs * 60 * 60 end,
                &(&1 > Joken.CurrentTime.OS.current_time())
            )

        {:ok, token_claims} = Joken.generate_claims(token_config)
        token_claims
    end

    def encode_and_sign_claims(token_claims) do
        # custom signer built from the private key
        signer_priv = Joken.Signer.create("RS256", %{"pem" => private_key()})
        {:ok, token, _claims} = Joken.encode_and_sign(token_claims, signer_priv)
        token
    end

    def verify_token_pub(token) do
        # custom signer built from the public key
        signer_pub = Joken.Signer.create("RS256", %{"pem" => public_key()})
        # note: the empty config here means the claim validations above are not re-run
        Joken.verify_and_validate(%{}, token, signer_pub)
    end

    def private_key do
        """
        -----BEGIN PRIVATE KEY-----
        ...
        -----END PRIVATE KEY-----
        """
    end

    def public_key do
        """
        -----BEGIN PUBLIC KEY-----
        ...
        -----END PUBLIC KEY-----
        """
    end

end

This can then be used as:

    token_claims = My.Token.get_custom_claims("Carlos", 1)
    token = My.Token.encode_and_sign_claims(token_claims)
    {status, data} = My.Token.verify_token_pub(token)

However it does not seem exactly ideal.

In reality, what I think I would like to do is store the private_key and public_key strings in .pem files on disk, then reload them from disk once a day (in case I hot-swap the files on the server while it is running).

But this would require storing the strings in state. If not, I would have to load the PEM directly from disk every time (!), which adds a real, unnecessary disk-access bottleneck.

Additionally, above I am constantly re-creating signer_pub and signer_priv, when, as with the key strings, I would rather create them only if they don’t already exist, if the strings change, or on a periodic (say daily) refresh.

In C# or C++ I would:

  • Create an instance or static class to manage the keys & signers.
  • Have that static class or instance monitor the date with a timer, and on date change, run an event/function to update them.
  • In the update function, load the new keys from disk into strings and, if they differ from what we have, recreate signer_pub and signer_priv.
  • This would likely all be done synchronously on the same thread; otherwise, guard the signer creation with a flag and, if any validation attempts occur during the update, retry them after about a second (by which time the update should be long finished).

I presume this could conceptually be done similarly in Elixir with a GenServer, storing the key strings, signer_pub, and signer_priv in its state. This GenServer could presumably also check the date periodically to see whether it must reload the keys from disk (how, if so?). But then we have a potential bottleneck where everything must run through that GenServer (or we must spawn a pool of such worker GenServers).
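(For what it’s worth, the periodic part alone might look something like the sketch below, with the module name and interval as placeholders and the actual reload left as a comment, though I don’t know if this is the idiomatic way.)

defmodule DailyReload do
    use GenServer

    def start_link(_opts), do: GenServer.start_link(__MODULE__, nil)

    def init(nil) do
        schedule()
        {:ok, %{}}
    end

    # fires once per interval; reloading the PEMs and rebuilding the signers would happen here
    def handle_info(:reload, state) do
        schedule()
        {:noreply, state}
    end

    defp schedule, do: Process.send_after(self(), :reload, :timer.hours(24))
end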

So what is the general philosophy here?

What would be the correct general practical solution?

Thanks for any help.

GenServer, ETS, Agent and persistent_term are some (but not all) places to keep state. If you want lots of processes to be able to read the state in parallel, you often have a GenServer that owns an ETS table configured for concurrent reads. The state lives in the ETS table. The GenServer can update the state in the ETS table whenever you want. You’d set the permissions on the ETS table so only the owner can write to it.
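For example (the table name is illustrative, and this borrows the My.Token module from the post above for the key): a :protected table created with read_concurrency: true can be read directly by any process, while only the owning process can write to it:

    # inside the owning GenServer: create the table and publish the signer
    table = :ets.new(:token_keys, [:named_table, :protected, read_concurrency: true])
    signer_pub = Joken.Signer.create("RS256", %{"pem" => My.Token.public_key()})
    :ets.insert(table, {:signer_pub, signer_pub})

    # from any other process: concurrent reads, no message to the GenServer
    [{:signer_pub, signer_pub}] = :ets.lookup(:token_keys, :signer_pub)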

How many times per second are you expecting to retrieve these strings?


Thus we must go out of our way to save “state”

Another option in functional programming for saving state is to partially apply a function:

add_to_counter_function = fn counter, x -> counter + x end

# partially apply: fix the counter argument at 3
# (note: anonymous functions are invoked with a dot)
fixed_counter_function = fn x -> add_to_counter_function.(3, x) end

fixed_counter_function.(1) == 4
fixed_counter_function.(6) == 9

I am pretty sure it isn’t what you are looking for, but I couldn’t resist mentioning it :slight_smile:


That’s a good question. I suppose it is really a performance question so everything must be based on the expected volumes. I am not sure. I have been trying to think this through.

Hypothetically, let’s say you have a social/discussion/chat app. Users log in, presumably via an HTTP request (they submit their username and password, which is validated against the database), and are issued a token and a WebSocket connection. At that point, I think we will have one main GenServer-based module per user just to manage their connection and server requests. The token is sent to the user and can also be stored in that GenServer.

I guess at that point we don’t really need to keep checking or requesting their token (while the WebSocket is maintained, we can trust that they are still “them”, I believe?). But we would want to check the token periodically to see whether it has expired (we can check the copy in the user’s main connection GenServer, since it is the same as the token now held client-side).

WebSockets would then be the server bottleneck to design around. According to this thread, we could hypothetically sustain between 60K and 2 million WebSockets per server.

You could limit the token checks by having the user connection’s GenServer also track a last_token_check_date_time in its state, compare it against the current date and time, and only revalidate the token if the difference exceeds some threshold (say, at most once per hour).
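For instance, a hypothetical helper inside that connection GenServer (the state fields and the one-hour threshold are just my guesses):

    defp maybe_revalidate(%{token: token, last_token_check: checked_at} = state) do
        now = System.system_time(:millisecond)

        if now - checked_at >= :timer.hours(1) do
            # re-verify the stored token (a real version would handle {:error, _} instead of crashing)
            {:ok, _claims} = My.Token.verify_token_pub(token)
            %{state | last_token_check: now}
        else
            state
        end
    end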

If so, on a maximally optimized server under full user load, you could be re-validating 60,000 to 2,000,000 tokens per hour, with more or less load at different points in the hour depending on when people happened to sign in and receive their token.

Am I generally understanding all this correctly? Is this reasonably how it could work? Thanks for any ideas or conjecture.

It is incredibly common for token management to use a GenServer fronted by an ETS table. The :ets table is publicly and concurrently readable, so you avoid bottlenecks, but the GenServer controls write access. That GenServer can expire/refresh tokens based on time, watch the file system for new certs, or use whatever other rotation mechanism you have in mind.

I think your overall idea is dead on; you are just missing the :ets table to make sure that requests for the token scale.
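A rough sketch of that shape (the module name, table name, file paths, and 24-hour interval are all illustrative assumptions):

defmodule My.Token.KeyStore do
    use GenServer

    @table :my_token_keys
    @refresh_interval :timer.hours(24)

    def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

    # read path: any process can call this; it never sends a message to the GenServer
    def signer(kind) when kind in [:signer_priv, :signer_pub] do
        [{^kind, signer}] = :ets.lookup(@table, kind)
        signer
    end

    def init(_opts) do
        :ets.new(@table, [:named_table, :protected, read_concurrency: true])
        reload()
        schedule_refresh()
        {:ok, %{}}
    end

    def handle_info(:refresh, state) do
        reload()
        schedule_refresh()
        {:noreply, state}
    end

    defp schedule_refresh, do: Process.send_after(self(), :refresh, @refresh_interval)

    # read the PEMs from disk and (re)build both signers
    defp reload do
        priv = File.read!("/etc/my_app/private.pem")
        pub = File.read!("/etc/my_app/public.pem")

        :ets.insert(@table, [
            {:signer_priv, Joken.Signer.create("RS256", %{"pem" => priv})},
            {:signer_pub, Joken.Signer.create("RS256", %{"pem" => pub})}
        ])
    end
end

With something like that in place, encode_and_sign_claims/1 and verify_token_pub/1 could fetch their signer via My.Token.KeyStore.signer/1 instead of rebuilding it on every call.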


I think those are two separate things, though, right? What I mean is, there is the question of:

(1) Where to store the tokens once generated

Option 1 - store in :ets table (I have read about these but will need to learn more)
Option 2 - store in user’s connection GenServer (they will need one to manage their connection anyway, so why not just keep it in there)

(2) How to efficiently validate the tokens without constantly reloading pem from disk and/or recreating signers to authenticate

Option 1 - just take the hit and keep reloading and recreating these things
Option 2 - create a scalable pool of GenServers that store the pem strings and signers inside them
Option 3 - save the keys and private/public signers to the :ets table and pull them directly from there

My inclination now is, for question (1), to store the tokens in the user’s personal connection GenServer and let that manage and recheck them.

For question (2), I am not sure the added complexity of managing a pool of GenServers and coordinating requests in and out is worthwhile. It sounds messy. So I suspect the right answer is to use :ets to store the PEM strings and signers once they are created.

Then maybe create a single GenServer to monitor the date, reload those PEMs from disk, and update the :ets entries (like the static class I was referring to in my C#/C++ analogy).

The My.Token module I posted above would thus be unchanged, except that it would pull the signers from the :ets table on each request.
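For example (reusing the illustrative :my_token_keys table name from the sketch above):

    def verify_token_pub(token) do
        [{:signer_pub, signer_pub}] = :ets.lookup(:my_token_keys, :signer_pub)
        Joken.verify_and_validate(%{}, token, signer_pub)
    end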

I think that seems reasonable. :slight_smile: Thoughts?

That’s not ideal either. Keys should live in runtime configuration, passed in via the environment. Check this:

https://hexdocs.pm/mix/Mix.Tasks.Release.html#module-runtime-configuration
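For example, a config/runtime.exs along these lines (the app name and environment variables are placeholders) reads the keys when the release boots instead of baking them into the code:

    import Config

    config :my_app, My.Token,
        private_key: File.read!(System.fetch_env!("PRIVATE_KEY_PATH")),
        public_key: File.read!(System.fetch_env!("PUBLIC_KEY_PATH"))

My.Token would then look the strings up with Application.fetch_env!(:my_app, My.Token) rather than hard-coding the PEMs.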

Why is that more ideal than what I described? With my method (swapping the PEMs on disk), I can do it without even stopping the server.

To enable runtime configuration in your release, all you need to do is to create a file named config/runtime.exs
This file will be executed whenever your Mix project or your release starts.

So in your example, to update the keys we must disrupt service (or work around that with multiple servers, draining users off one before restarting it, which is more effort).

If you are suggesting it as a way to protect the keys from hacking should the server be compromised: these are just token-signing keys, and if the server is completely compromised, I am not sure that matters much anymore. But I am not a security expert.

No. Everything can be triggered from the command line; that’s the beauty of the BEAM. Why poll when you can control precisely when to change the key?

https://hexdocs.pm/elixir/1.17.3/Application.html#put_env/4
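For example, from a remote console attached to the running node (app and key names assumed to match however the config is read):

    Application.put_env(:my_app, My.Token,
        private_key: File.read!("/etc/my_app/private.pem"),
        public_key: File.read!("/etc/my_app/public.pem")
    )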
