I have some public APIs that I want to restrict with tokens. Basically, I want to allow API calls only if the client has an active token, and each token will be restricted to a host or a list of hosts.
I am not sure about the implementation, though.
My idea is a Phoenix plug that verifies the token from a header: it looks the token up in the database, matches the request host, and continues if everything checks out. I would also like to log each request so I can build graphs and statistics later.
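A minimal sketch of such a plug, assuming a hypothetical `MyApp.Tokens.get_by_value/1` DB lookup and an `allowed_hosts` list on the token record (all names are placeholders):

```elixir
defmodule MyAppWeb.Plugs.VerifyApiToken do
  @moduledoc """
  Hypothetical sketch: reads the bearer token from the Authorization
  header, looks it up in the DB, and checks the request host against
  the token's allowed hosts. Module and field names are assumptions.
  """
  import Plug.Conn

  def init(opts), do: opts

  def call(conn, _opts) do
    with ["Bearer " <> token] <- get_req_header(conn, "authorization"),
         %{active: true, allowed_hosts: hosts} <- MyApp.Tokens.get_by_value(token),
         true <- conn.host in hosts do
      conn
    else
      _ ->
        conn
        |> send_resp(401, "invalid or missing token")
        |> halt()
    end
  end
end
```

The request logging could then happen in the same plug (or a later one), since the token record is already in hand at that point.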
My API has a high request rate, though (tens of requests per second or more), and I do not want token validation to become a bottleneck. Since I need to rely on database calls and regex matches, I am not sure about the impact. I also need to count each request in the database after the call.
One optimisation would be to keep a token in memory for a while after loading it from the DB, or to load all tokens at application start (this can lead to stale data, though). I also run a cluster of nodes and want to rate limit across all of them together.
Has anybody encountered the same use case? Can you share your thoughts if you have different ideas?
Not possible if you also want to provide token revocation; sooner or later it will require the DB check. You can use signed tokens for preliminary elimination of obviously invalid tokens. Alternatively, stateless tokens could be used if, and only if, the lifetime of such a token is very short (like half an hour).
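That preliminary check can be sketched with `Phoenix.Token`, assuming the token was issued via `Phoenix.Token.sign/3` against the same endpoint; `check_in_db/1` is a hypothetical function standing in for the revocation lookup:

```elixir
# `max_age: 1800` matches the half-hour lifetime mentioned above.
case Phoenix.Token.verify(MyAppWeb.Endpoint, "api-token", token, max_age: 1800) do
  # Signature and age are valid; the DB check is still needed for revocation.
  {:ok, token_id} -> check_in_db(token_id)
  # Forged, tampered, or expired: rejected without touching the DB.
  {:error, reason} -> {:error, reason}
end
```

The point is that the signature verification is pure CPU work, so forged or expired tokens never cost you a DB round trip.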
A DB of revoked tokens is IMHO a much worse idea than a DB of allowed tokens. The latter gives you useful tools such as letting a user list all their current tokens, review their last usage, etc. Guardian DB is a hack because JWT is a terrible solution for sessions.
Cache them in ETS: either prepopulate and bust the cache, or TTL them. Cachex has a nice TTL feature. If you have lots of API keys with highly variable usage rates, a TTL may not be optimal, since infrequently used keys will still generate a lot of requests to the DB.
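A read-through sketch with Cachex, again assuming a hypothetical `MyApp.Tokens.get_by_value/1` DB lookup; `Cachex.fetch/3` only runs the fallback (and hits the DB) on a cache miss, and the 60-second TTL is an arbitrary choice:

```elixir
import Cachex.Spec

# Start a cache with a default 60-second TTL (normally in your supervision tree).
Cachex.start_link(:api_tokens, expiration: expiration(default: :timer.seconds(60)))

# Read-through lookup: served from ETS when warm, falls back to the DB on a miss.
{_status, token} =
  Cachex.fetch(:api_tokens, token_value, fn key ->
    case MyApp.Tokens.get_by_value(key) do
      nil -> {:ignore, nil}       # don't cache misses
      token -> {:commit, token}   # cached until the TTL expires
    end
  end)
```

The TTL bounds how stale a revoked token can be: with 60 seconds, a revoked token keeps working for at most a minute after revocation.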
Use Erlang's `:counters` module. Persist to the DB as needed.
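A minimal sketch with the `:counters` module (available since OTP 21.2); the single-slot layout and the flush strategy are assumptions:

```elixir
# Lock-free, write-optimised counter array with one slot.
ref = :counters.new(1, [:write_concurrency])

# Bump on every request (cheap, no process mailbox involved).
:counters.add(ref, 1, 1)
:counters.add(ref, 1, 1)

# A periodic task can read the total, persist it to the DB,
# and reset the slot with :counters.put(ref, 1, 0).
:counters.get(ref, 1)
```

With `:write_concurrency`, reads may briefly lag writes, which is fine for batched persistence.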
If you're not showing the rate limit data anywhere in your UI, I would just use a per-node rate limit and not deal with any of this. But…
I managed to get Phoenix Tracker working to send rate limit data around the cluster multiple times per second, but I wouldn't recommend that.
I need to play with this a bit more, but probably persist your counters per node per API token. Or maybe use PubSub to pass the data around and cache each node's counts on every node. Again, only do this if you need cluster-wide totals for some reason; otherwise just do per-node limits.
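For the simpler per-node option, here is a hedged sketch of a fixed-window limiter backed by ETS; the table name, the 100-requests-per-second limit, and the 1-second window are all assumptions, and stale window keys would need periodic cleanup in a real system:

```elixir
defmodule MyApp.NodeRateLimiter do
  @moduledoc """
  Hypothetical per-node fixed-window rate limiter.
  One ETS counter per {token, second} pair.
  """
  @table :rate_limits
  @limit 100   # assumed requests per 1-second window, per node

  def setup do
    :ets.new(@table, [:named_table, :public, write_concurrency: true])
  end

  def allow?(token_id) do
    window = System.system_time(:second)
    key = {token_id, window}
    # Atomically increments the counter, creating it at 0 if absent.
    count = :ets.update_counter(@table, key, 1, {key, 0})
    count <= @limit
  end
end
```

Because `:ets.update_counter/4` is atomic, this is safe to call from every request process concurrently without a serialising GenServer in the hot path.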