What is a good strategy to persistently revoke or blacklist bad user login tokens?

Let’s say you have a system with multiple Elixir nodes handling user websockets and most tasks, plus one separate database server that they communicate with. You do not want to over-message that database server.

Upon login, users send their username and password, which is checked (in some manner) against the stored hash for them in the database. If it passes, they get a token, maybe with a long expiration (to avoid forcing them to log in again too soon).

On further sign-ins, they just send the token to authenticate or to make server requests they are authorized for.
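
For concreteness, a minimal sketch of that flow (this assumes bcrypt_elixir for password hashing and Phoenix.Token for signing; the module and field names are illustrative):

# Illustrative login: verify the password hash once, then issue a signed token
def login(username, password) do
  user = Repo.get_by(User, username: username)

  if user && Bcrypt.verify_pass(password, user.hashed_password) do
    # The token carries the user id; its age is checked at verification time
    {:ok, Phoenix.Token.sign(MyAppWeb.Endpoint, "user auth", user.id)}
  else
    # Burn comparable time on failure to resist timing attacks
    Bcrypt.no_user_verify()
    {:error, :invalid_credentials}
  end
end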

But in the meantime they manually log themselves out, and we want to consider this token invalid from then on. The token itself cannot be forced to expire early. Maybe someone has stolen the token. Or they have changed their password and we don’t want someone who has the old token to use it. We don’t want it being used anymore at all.

So what is the best solution?

ChatGPT suggests:

Invalidating a user token in Elixir typically involves a few steps, depending on how you’ve implemented token management in your application. Here’s a general approach:

1. Use a Token Store

If you’re using a stateless JWT (JSON Web Token) approach, you can’t directly invalidate a token since it’s self-contained. Instead, consider maintaining a blacklist or a revocation list.

2. Implement Token Blacklisting

You can implement a simple in-memory store or a database table to manage invalidated tokens. Here’s a basic outline:

  1. Create a Blacklist Table:
    If you’re using Ecto, you might create a migration for a token_blacklist table.

    defmodule MyApp.Repo.Migrations.CreateTokenBlacklist do
      use Ecto.Migration

      def change do
        create table(:token_blacklist) do
          add :token, :string
          add :inserted_at, :naive_datetime
        end

        create index(:token_blacklist, [:token])
      end
    end
    
  2. Insert Invalidated Token:
    When a user logs out or when you want to invalidate a token, insert it into the blacklist.

    def invalidate_token(token) do
      # Ecto's :naive_datetime type rejects microsecond precision, so truncate
      now = NaiveDateTime.truncate(NaiveDateTime.utc_now(), :second)

      %TokenBlacklist{token: token, inserted_at: now}
      |> Repo.insert()
    end
    
  3. Check Token Validity:
    Before processing requests, check if the token is in the blacklist.

    def token_valid?(token) do
      case Repo.get_by(TokenBlacklist, token: token) do
        nil -> true
        _ -> false
      end
    end
    

3. Handle Token Expiration

To manage the token’s lifecycle better, consider setting an expiration date on your tokens. You can implement token expiration directly in the JWT payload.
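
For instance, a minimal sketch of checking the standard exp claim (this assumes the signature has already been verified elsewhere, since unverified claims cannot be trusted, and that Jason is available for JSON decoding):

def token_expired?(jwt) do
  # A JWT is three base64url segments; the middle one holds the claims
  [_header, payload, _signature] = String.split(jwt, ".")

  claims =
    payload
    |> Base.url_decode64!(padding: false)
    |> Jason.decode!()

  # `exp` is a Unix timestamp in seconds
  claims["exp"] < System.system_time(:second)
end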

4. Middleware for Token Validation

You can create a plug to check if the token is valid before proceeding with requests.

defmodule MyAppWeb.Plugs.Auth do
  import Plug.Conn

  def init(default), do: default

  def call(conn, _opts) do
    # Handle a missing or malformed Authorization header without crashing
    token =
      case get_req_header(conn, "authorization") do
        ["Bearer " <> token | _] -> token
        _ -> nil
      end

    if token && token_valid?(token) do
      conn
    else
      conn
      |> put_resp_content_type("application/json")
      |> send_resp(:unauthorized, ~s({"error": "Unauthorized"}))
      |> halt()
    end
  end
end

5. Clear Old Tokens

You may also want to implement a cleanup mechanism to remove old tokens from the blacklist periodically.
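
As a sketch, a small GenServer can prune expired rows on an interval (this assumes the token_blacklist table above and a known maximum token lifetime; the interval and lifetime are illustrative):

defmodule MyApp.BlacklistCleaner do
  use GenServer
  import Ecto.Query

  @interval :timer.hours(1)
  @max_token_age_days 30

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def init(state) do
    schedule()
    {:ok, state}
  end

  def handle_info(:cleanup, state) do
    # Rows older than the longest possible token lifetime can never match again
    cutoff = NaiveDateTime.add(NaiveDateTime.utc_now(), -@max_token_age_days * 86_400)
    MyApp.Repo.delete_all(from(t in "token_blacklist", where: t.inserted_at < ^cutoff))
    schedule()
    {:noreply, state}
  end

  defp schedule, do: Process.send_after(self(), :cleanup, @interval)
end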

Conclusion

With this approach, you can effectively manage token invalidation in your Elixir application. The specific implementation details may vary based on your application’s requirements and the libraries you are using.

Is this the general approach?

One could store the blacklist of tokens in a table, but you would need persistent storage, because if a server restarts you can’t lose the blacklist. So mnesia?

The problem is also that if you set the token expirations very long (really, really long), you are stuck keeping and checking those blacklisted tokens in your table until they expire.

The alternative for persistent login (if the client is app based) is having the app cache the username and password internally and submit them automatically on app load to get a new token each time. Then you can just give them a short-lived token (1 day, for example), and if you still need to blacklist it, you only must do so for a day.

You can have the client app request a new token whenever the current one is about to expire, as long as the client is still set to “auto-login”. The user doesn’t see that they are continuously getting new login sessions/tokens; it is seamless.
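
A minimal server-side sketch of that seamless renewal (again assuming Phoenix.Token; the one-day max_age is illustrative):

def renew_token(old_token) do
  # Exchange a still-valid token for a fresh one, so the client can renew
  # shortly before expiry without re-sending credentials
  case Phoenix.Token.verify(MyAppWeb.Endpoint, "user auth", old_token, max_age: 86_400) do
    {:ok, user_id} ->
      {:ok, Phoenix.Token.sign(MyAppWeb.Endpoint, "user auth", user_id)}

    {:error, reason} ->
      # :expired or :invalid; the client falls back to a full login
      {:error, reason}
  end
end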

And we no longer have to store long lived token blacklist entries across the whole node system.

What do you think? What might a common way be to handle this? Would mnesia be a good, simple Elixir data store for the “blacklisted” tokens? Worst case, if it goes down, you have a few blacklisted tokens working again for up to a day before they expire anyway.

Alternatively, instead of just a blacklist, we could keep a whitelist: store ALL login tokens given out and consider only the ones we keep as valid (checking against that). This would allow more reliable “log out of all devices” behavior, because then we can just wipe all tokens for that user in storage when needed.

Perhaps that is safer and better.

Thoughts? Thanks.

Well, I agree with ChatGPT’s first point here, which is that once the server generates and gives the token to the user, the token is no longer part of server state. In fact, the “traditional” way logout is handled with a JWT is that when the user hits “log out”, the browser simply deletes the cookie containing their JWT. There is no change in server state.

A JWT is also conventionally assumed to be secure because it is delivered to the user over secure transport, usually HTTPS. So it sounds like you are planning for cases where the user’s machine is already compromised, in which case there is nothing you can do in your app to fix this!

Have you had any real problems in your app with this token expiry problem? If so could you elaborate on what happened more specifically? I only ask because my favorite solution is always “if it ain’t broke, don’t fix it.”

1 Like

Why not just use cookie sessions and short-lived tokens for websockets?

3 Likes

I’d question this assumption: “fetch one row with this ID” is a very efficient operation on most DBs, and easy to cache if you need to go even faster.

3 Likes

Thank you, that is very helpful. I have not built the system yet; I am building it now. I am not used to doing this, so I want to make sure my approach is sound. “Measure twice, cut once,” as they say.

The main consideration is that I don’t want to spam continuous, unnecessary login requests against my database server (I fear the database must not become a bottleneck, and I don’t want a distributed or sharded database, for simplicity).

My users would all be on Windows/iOS/Android applications, where I can cache whatever I need directly in the installed program’s temp folder, i.e. I can cache a JWT, a cookie, or whatever. I am not limited by browser functions, so I don’t specifically need cookies (unless they pose an advantage; I am not sure).

I would also like to be able to do things like “log out of all devices”, or revoke all sessions once a user changes their password. Imagine someone loses their device while logged in. They use another system to change the password, but if the lost device remains logged in and its session is not revoked, that is a security failure.

I asked ChatGPT how “log out of all devices” should work, and here were the main points it gave:

Logging out of all devices is a security feature that ensures your account is no longer accessible from any previously logged-in device. Here’s how it typically works:

1. Token Management:

  • When you log into an account, the server often generates a session token or cookie that authenticates your session. This token is stored on the device.

2. Revocation of Tokens:

  • When you choose to log out of all devices, the server invalidates all active session tokens associated with your account. This could mean deleting them from the server’s database or marking them as expired.

3. Check Session Status:

  • The next time a device tries to access your account using the old token, the server checks its status. If the token is invalid or expired, the user is required to log in again.

4. Implementation Options:

  • Session Store: The server keeps a record of active sessions for each user. When a logout request is received, it clears the sessions.
  • Database Entries: Each session might have a corresponding entry in a database that can be modified or deleted upon logout.

5. User Interface:

  • Users typically access this feature from their account settings, where they can see active sessions and choose to log out of all.

This process helps protect user accounts from unauthorized access, especially if a device has been lost or stolen.

I understand, from my limited knowledge, that cookies can be revoked because their expiration dates can be modified when communicating with the server. But what this suggests to me is that I must store a “whitelist” of cookies/tokens either way if I want this safety. To know when “active sessions” should be revoked, one must keep track of all of them and always be checking sessions.

Either that, or you make the tokens expire rapidly so the compromise is short-lived (but this is still no good, as any duration of account compromise after a user changes their password should be unacceptable).

If so, then a JWT is just as good for me, and I can provide very long session times with it. If I will be checking the JWT or cookie every time against the whitelist, it makes no difference. If I want a user to “log out of all devices”, I simply must store the active permitted sessions (tokens/cookies) and clear them when the user changes their password or logs out.

Correct?

Options To Do So:

  1. Store the whitelist tokens in a separate, very simple key-value persistent database on another server (so as not to risk bottlenecking my main database with frivolous, essentially spam-like requests to confirm tokens). Keeping it separate also makes it easy for unrelated services that share the same login sessions to run read-only queries on whether the tokens they are sent are valid, e.g. to authenticate read access to an unrelated server system. And it means I can maintain login sessions even if the whole rest of the system is shut down or restarted (though this is not too important, as devices will cache username and password and can auto-reauth).
  2. Otherwise, use a distributed data store in Elixir like mnesia.
  3. Or send the tokens to the true main database to store per user (but this may be a lot of reading/writing that doesn’t need to happen there).

I am inclined to just use JWT (since cookies don’t change the need for whitelisting) and store a whitelist of all active tokens, either in mnesia or in a separate simple key-value service on another server that Elixir and other services can communicate with. Elixir could write and read (as it creates the tokens); other services can read.

But I also don’t want to slow things down by requiring requests to wait on another, unrelated server to auth the token constantly. So perhaps mnesia is best after all, and unrelated services can either store a copy there as well or query Elixir. Hmm.

What do you think? Perhaps overkill but I think it is necessary if you want secure password changes and global logouts.

You know, I think there is a simple solution here that you might be able to get some experience from. I recommend you look at the code generated by mix phx.gen.auth in a new Phoenix app. The solution it creates will illustrate a straightforward answer to your problem. Long story short, all tokens are kept in a user_token table in the server database. Every new connection calls a Plug that checks whether a token is provided in the client request, and if so, whether it matches the corresponding session in the user_token table. The user_token table can be considered a whitelist of sorts of active sessions. And it would be very simple to write a function to log a user out of all of their sessions:

def log_out_user_from_all_sessions(user) do
  # Requires `import Ecto.Query` for `from/2`
  # Delete all session tokens for the user
  Repo.delete_all(from(ut in UserToken, where: ut.user_id == ^user.id))
  :ok
end

On all subsequent requests there would be no matching tokens, and thus no matter what has occurred on the client they would not be authenticated.

4 Likes

Yeah, exactly. That’s just the whitelist scenario I described. Having a whitelist seems to be the only way to allow a “log out of every device” system.

The question then is really where you store the whitelist. My database is on a different server, and it is a single server (not sharded). I would rather not spam it, as noted, or waste time going back and forth.

So I am leaning toward keeping the token whitelist in mnesia, which is what I will try first.
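
For what it’s worth, a first sketch of that mnesia whitelist might look like this (disc_copies is what gives persistence across restarts; the table and function names are illustrative):

defmodule MyApp.TokenWhitelist do
  # One-time setup: a disc-backed table so the whitelist survives restarts
  def setup do
    :mnesia.create_schema([node()])
    :ok = :mnesia.start()

    :mnesia.create_table(:token_whitelist,
      attributes: [:token, :user_id, :inserted_at],
      disc_copies: [node()]
    )
  end

  # Whitelist a token at login
  def put(token, user_id) do
    :mnesia.dirty_write({:token_whitelist, token, user_id, System.system_time(:second)})
  end

  # Check a token on each request
  def valid?(token), do: :mnesia.dirty_read(:token_whitelist, token) != []

  # "Log out of all devices": delete every token belonging to the user
  def revoke_all(user_id) do
    for {:token_whitelist, token, _uid, _at} <-
          :mnesia.dirty_match_object({:token_whitelist, :_, user_id, :_}) do
      :mnesia.dirty_delete(:token_whitelist, token)
    end

    :ok
  end
end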

I would only suggest adding in more complex data stores when faced with an actual performance issue. You’re gonna add a lot of complexity and risk by using an unfamiliar data store and that just doesn’t seem worth it when Postgres will happily handle this for a long time.

If you want to “future proof it”, then build a behaviour and have a Postgres implementation for now, and swap to something else in the future.
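
That behaviour could be as small as this sketch (the names are illustrative); the Postgres module implements it today, and a future mnesia or Redis module can replace it via config:

defmodule MyApp.SessionStore do
  # The minimal contract any session-store backend must satisfy
  @callback put(token :: binary(), user_id :: term()) :: :ok | {:error, term()}
  @callback valid?(token :: binary()) :: boolean()
  @callback revoke_all(user_id :: term()) :: :ok
end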

6 Likes

And if you have a deny list, will it not be on one shared server? Will you not “spam it on each request” and “waste time going back and forth”?

(Flowchart image.) Image thanks to http://cryto.net/~joepie91/blog/2016/06/19/stop-using-jwt-for-sessions-part-2-why-your-solution-doesnt-work/

7 Likes

Haha I knew this would come when I read “blacklist” in the first post :smiley:

1 Like

I have no Postgres system. I am using a separate database on a separate server, and I don’t want to message it back and forth just to confirm token viability or re-auth each user potentially dozens of times a day.

So implementing Postgres specifically for session management is an extra step either way. I can implement Postgres or Redis or mnesia or any database for session management, and no matter what, it will be an extra step.

So the best option is what makes the most sense.

What’s interesting is that the graph doesn’t seem to cover the whitelist solution. It talks about why blacklisting is pointless, as we have discussed in this thread.

Whitelisting is still a valid idea. If the whitelist database (mnesia, Redis, whatever) goes down, nothing bad happens: users just have to re-auth. You otherwise save spamming your main database server with potentially millions of unnecessary authentication calls, so it can focus on actual work.

Tokens are easy to re-create when needed but there is also no need to auth against your primary server for the same user 10-50 times in one day as they keep reconnecting. This is foolish.

If you are storing some type of cryptographic hash of the password on the server, you will also have to pay the price of validating it each time, and by its nature as a security measure that validation cannot be too fast to perform. And if you don’t want to prompt the user for their password every time, you must cache it on their system (which is no different from caching their token).

If a user’s app has a websocket connection and they lose it, e.g. when they turn off their phone, then when they turn it back on the client can just send you the token; you check the local mnesia (or other quick storage) directly in Elixir, run a quick check on the token itself to confirm it is valid, and re-establish the socket.
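
A sketch of that reconnect check in a Phoenix socket (this assumes the Phoenix.Token and mnesia-whitelist ideas from earlier in the thread; the one-day max_age is illustrative):

defmodule MyAppWeb.UserSocket do
  use Phoenix.Socket

  # Re-establish the socket only if the token both verifies and is still whitelisted
  def connect(%{"token" => token}, socket, _connect_info) do
    with {:ok, user_id} <-
           Phoenix.Token.verify(MyAppWeb.Endpoint, "user auth", token, max_age: 86_400),
         true <- MyApp.TokenWhitelist.valid?(token) do
      {:ok, assign(socket, :user_id, user_id)}
    else
      _ -> :error
    end
  end

  def connect(_params, _socket, _connect_info), do: :error

  # Socket ids allow broadcasting a forced disconnect on "log out of all devices"
  def id(socket), do: "user_socket:#{socket.assigns.user_id}"
end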

I am not using web browsers but rather client apps. Local storage is protected by the OS in the app’s working directory. Tokens or cached username/password are as secure as the OS makes them.

There is a real data cost to storing the tokens. I just checked, and a simple one is about 800 bytes. So storing 1,000,000 sessions costs about 800 MB. That is still not so much as to be unwieldy, though you might have to be mindful of RAM (if storing them in memory) vs. hard-drive space, and have protections past a certain size (or upgrade servers).

The only other valid criticism I see in that graph is that if someone hacks your server and captures all your tokens, you are in trouble; but this is not much different from them hacking your server and capturing your database. Obviously, if someone takes over your entire server infrastructure, you have a problem. If they can somehow run any operation they want freely on your server, then it is likely everything is compromised.

I find debating these things helpful, as it has led me to believe this is a good solution.

Do you have a better suggestion?

Because exactly the same problems as with a deny list apply there.

Congratulations, you have just reinvented sessions.

Why the actual hell would you want to validate the password on each request? You validate it once, create a session token, and then use that session token to fetch the session from the DB.

Use sessions and use cookie to store session ID on the client side.

6 Likes

So basically we are discussing the same thing then, except using cookies to store the user/session ID information vs. JWT. Agreed?

A “cookie” is a relatively meaningless term if you are speaking of apps, because it is a web browser concept. Definition: “A cookie is a small piece of data stored on a user’s computer by a web browser while browsing a website.”

The function of a “cookie” for practical purposes in an app can be anything (including a JWT or a traditional web cookie). It is whatever you make it to be.

So it is just the format of “cookie” or “token” under debate, since we agree on the need to whitelist and track the valid ones. Correct?

For example, you can have a “cookie” carrying the same information you would place into a JWT. Only the JWT is also signed, which comes with one single cost and one single benefit.
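
To illustrate the equivalence with a sketch (Phoenix.Token here is just one convenient signer; the values are made up):

# The same session data as a plain value and as a signed token
session = %{session_id: "7f9c0e2a", user_id: 42}

# Plain "cookie" value: readable and forgeable by anyone who holds it
plain = Jason.encode!(session)

# Signed token: same information, plus a signature the server can verify
signed = Phoenix.Token.sign(MyAppWeb.Endpoint, "session", session)
{:ok, ^session} = Phoenix.Token.verify(MyAppWeb.Endpoint, "session", signed, max_age: 604_800)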

The benefit of JWT over a plain text cookie (i.e. the plain text info that went into the JWT) is that it can provide some direct authentication value. E.g., if you have a 7-day JWT duration, you can let users send it to an unrelated service that has the public key, and that service doesn’t have to do any database checks at all to verify that the request is at least likely valid (or at least was valid within the past 7 days). You can cheaply and easily, at least reasonably, protect other low-security services this way with no added database checks. It is an easy way to block non-authentic or general public users out of a resource.
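
A sketch of that stateless check on the unrelated service’s side (this assumes the jose library and the issuer’s RSA public key; the key path is illustrative, and pinning the algorithm list guards against the “alg” confusion discussed later in this thread):

def request_allowed?(jwt) do
  # Only the issuer's public key is needed; no DB round trip
  public_jwk = JOSE.JWK.from_pem_file("issuer_public.pem")

  # verify_strict/3 accepts only the pinned algorithm, never "none"
  case JOSE.JWT.verify_strict(public_jwk, ["RS256"], jwt) do
    {true, %JOSE.JWT{fields: claims}, _jws} ->
      claims["exp"] > System.system_time(:second)

    _ ->
      false
  end
end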

The downside of JWT is that it adds a bit of data cost for the extra signing information. That’s the only appreciable downside I can see.

Otherwise as you said, it is the exact same thing either way. To an app the format of the “token” or “cookie” is irrelevant - it is up to your design. A plain text cookie that contains the same information is the same as a signed JWT of the same information. Right?

If you are talking browser based design then perhaps that is different but that is not my field so I don’t know about that.

You can sign cookies as well.

Unless it allows access to, for example, a resource whose access could be restricted during that period. For example, imagine that the secured data is some kind of company document: you do not want to allow an employee who was terminated to access such documents after their firing. Or someone has their access reduced for any reason. If you have any form of authorisation beyond authentication, then you will not avoid checking the DB.

Main downside of the JWT is cryptographic agility, which is really bad.

And if you store just a session ID in the token, then you are paying the cost of all the cruft in a JWT that is not even the signature (algorithm selection, timeout, etc.).

If you want short-lived tokens (a matter of hours at most) for stateless services (like object storage or some additional one-off services), then there are still better options than JWT. And for such uses you do not need to revoke/denylist anything, because the token is short-lived (ideally a few minutes should be enough time).

If you want persistent sessions, then use, well, sessions. You will not avoid DB queries anyway (because of authorisation), so it should not matter much (and you still will not need any revocation/denylisting).

3 Likes

If a “signed cookie” is a cookie encrypted with a public and private key (as it seems to be from what I see), it sounds like this is the same thing as a JWT in function and could be used equivalently in non-browser systems (e.g. apps, as I said, where it makes no difference what format you use, because it’s under your control). If so, then at that point we are talking about identical systems except for the token/cookie format, I believe.

Yes, I agree, but it is still an easy way to keep public users out of a general low-security resource without checking a single database or having server-server communication. For example, if you have an S3 or other server with a bunch of non-critical things your users may access on a continuous basis, and you just want to filter out random public people or roughly identify what each user can access within it based on some token variables, you can easily put a lambda-type function on it to check the JWT.

Zero database checking or communication is then required, and it can self-regulate to filter out 99.999% of unauthorized access and even control what resources each user accesses (with the only 0.001% of “bad access” being theoretically stolen tokens, which would have to be recovered by the bad user from stolen/hacked devices).

Thinking about it further, one way to improve this would be to send users a second token (with different claims or signature) from their primary server-session one to use for that resource access, which is short-lived, and just keep sending them new ones on a periodic basis via their main authenticated websocket.

That would provide the same benefit of letting the resource self-regulate access, while tightening the time frame for token viability on that resource if any user devices are hijacked and then “logged out of all” after it is recognized.
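
A sketch of issuing those short-lived resource tokens over the already-authenticated channel (assuming Phoenix channels; the interval, lifetime, and names are illustrative):

# Inside the user's channel module: push a fresh resource token every 10 minutes
def join("user:" <> _id, _params, socket) do
  :timer.send_interval(:timer.minutes(10), :push_resource_token)
  {:ok, socket}
end

def handle_info(:push_resource_token, socket) do
  # The resource server verifies this with a matching max_age of ~15 minutes
  token = Phoenix.Token.sign(MyAppWeb.Endpoint, "resource access", socket.assigns.user_id)
  push(socket, "resource_token", %{token: token, expires_in: 900})
  {:noreply, socket}
end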

Do you imagine a better way to manage such situations?

I like that better, actually. The cost of creating the extra short-duration tokens would not be so terrible, and it allows a tighter time leash on the self-regulating resource.

The only way I imagine you could get a higher level of security in such a circumstance is to have the S3 or other document resource send constant queries, on every user request, to the database server to see if the session from the supplied token/cookie is still valid. That is not reasonable in many cases and will slow things down unreasonably. Having it self-manage by checking a provided token should work well, especially with a shorter time leash.

Thanks for this post - that’s very interesting and helpful. Good to know of those vulnerabilities. Looks like those can be managed.

If we otherwise agree on the basic principle of (1) storing a user session in a database of some kind via a session_id and other information, (2) providing the user an encrypted representation of that session (a JWT, or a signed cookie if that is the same thing), and (3) letting the user send that to be tested, both in terms of decryption success and (for important things) against the session whitelist, before allowing regulated things like websocket creation or account changes, then I do believe we are talking about the same thing, and the only difference is the format in which the session representation is given to the user.

I don’t see much difference at that point, if the obvious flaws you mentioned with JWT are handled. I presume cookies would be better if browser-based, as the handling for them is built into the browser. Otherwise, I’m not sure it matters (does it?).

Thanks for your thoughts.

Nope. If anything, you can sign it with a public and private key, but most of the time what you care about is just tamper-proofing, for which you do not need public-key encryption. You need a keyed hash function (an HMAC) for that.

Alternatively, you may want to encrypt the data in a way that allows only you to read it, in which case you want symmetric encryption.

But if you are using sessions stored in some DB, then TBH you do not need any of that. All you need is a secure ID that is impossible to guess, for example a UUIDv4.
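
For example, either of these one-liners mints an unguessable session ID:

# Two common ways to generate a secure, unguessable session ID
session_id = Ecto.UUID.generate()
session_id = Base.url_encode64(:crypto.strong_rand_bytes(32), padding: false)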

The difference is that JWT has a configurable encryption/signature format. That is a big problem, because of algorithm-confusion attacks (or you are accepting "alg": "none" and your tokens aren’t secured at all).

Why not use the built-in features of S3 with presigned URLs? Their tokens are way simpler, you do not need any extra lambda or other service, etc. But in general that fits the “short-lived tokens” use case. If your token lives for 15 minutes, then JWT isn’t that bad (but still, there are better token formats, for example PASETO).
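
For reference, a sketch of a presigned URL with the ex_aws_s3 package (this assumes configured AWS credentials; the bucket and key are illustrative):

# Generate a 15-minute presigned GET URL for one object
{:ok, url} =
  ExAws.Config.new(:s3)
  |> ExAws.S3.presigned_url(:get, "my-bucket", "docs/report.pdf", expires_in: 900)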

2 Likes

I think this is important to raise and repeat for everyone’s awareness as much as possible, but it also overrates the issue once it is known. I posted some simple code here that I think handles this, and it should not be complicated to fix, per the guidelines at the “best practices” link. I am grateful you pointed it out to me, but it should not be difficult to manage or affect the usage.

It is not really the same thing, and then you are going down the proprietary cloud software pathway (like AWS services) vs. having a generic and portable solution. A signed URL can typically be shared and used by anyone (easy for the average person to do), while if you have a simple script just checking the JWT in the request, that is not easily shareable. You also don’t have to make a new signed link or token for every single access attempt with the JWT.

And you could copy and paste your entire system over to another service (including just your own raw server) and it will still work.

For example, it would take a few hours at most to set up an Elixir server with a router that checks the header for the token and either returns the file or not, based on parameters set in the token. This is a very portable and simple design. You can search “how to make elixir check token in request before allowing access to file at url on server” on ChatGPT and it lays it out pretty simply.
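
As a sketch, such a router could look like this (assuming Plug.Router and Phoenix.Token; the secret is the signing key shared with the issuing app, and the paths, salt, and max_age are illustrative):

defmodule FileGate.Router do
  use Plug.Router

  # Shared signing secret (at least 20 bytes); in reality load it from config
  @secret "a-shared-secret-key-base-of-at-least-20-bytes"

  plug :match
  plug :dispatch

  get "/files/:name" do
    with ["Bearer " <> token | _] <- get_req_header(conn, "authorization"),
         {:ok, _user_id} <- Phoenix.Token.verify(@secret, "resource access", token, max_age: 900) do
      # A real version must sanitize `name` against path traversal
      send_file(conn, 200, Path.join("priv/files", name))
    else
      _ -> send_resp(conn, 401, "Unauthorized")
    end
  end

  match _ do
    send_resp(conn, 404, "Not found")
  end
end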

I hate using proprietary systems like AWS more than is absolutely necessary.

Doing the equivalent with signed links (even on Elixir) is again less secure, as the links can be more easily shared, and it is more work. You would still need to supply some identifying information about the accessing user to the server for it to determine whether that user is allowed to receive a signed link, or have server-server communication to determine this. Signed links don’t fix that. And they are a pain in the ass on proprietary services like AWS, which I touch as little as possible. And more computational work overall.

I don’t think signed links are as good a solution in general. I don’t see any particular advantage to them. They just seem messier and even more work to me.

If I give a user a JWT periodically to access a “low security” documents repository, that server can quickly verify their identity and authorization level and either allow or deny based on the file requested and those details, with a simple script. And this is so easy.

I can’t think of any reason for signed URLs at all unless perhaps it is a situation where they are intended to be spammed out to many people at once for some reason (like a time limited giveaway email link or something) or to track URL clicks again in things like emails.

I agree in theory (though more security is always better: try guessing a session ID AND forging a JWT’s RS256 signature AND guessing other verification details of the session, like the exact date created :joy:), and that is why I think the decision about the token and encryption comes down to the potential ancillary benefits and any net harm. I.e., there is no net harm to using the JWT in this case as long as you watch the “alg” and “kid” headers, and it provides extra benefits (if you want them), like the method of easy, self-contained, non-proprietary verification of user resource access in disconnected systems that I described.

In terms of token formats, thanks for pointing out PASETO, but there is also an advantage to sticking with what is popular and conventional. There is good interoperability, e.g. with OAuth on JWT. There is only so much time in the day, and using one token type everywhere and knowing it well is better than mixing and matching and adding too much complexity.