API url security

dimitarvp · March 31, 2021, 9:33am

There are a lot of programmers out there who only do this for the money, hate their job, and are just ticking boxes to make sure they are not fired – that’s the reason for the ton of data breaches. Not because somebody doesn’t follow a magical gospel with magical steps that magically fix security by blindly following it all.

Cost-benefit analysis, man. I prefer to spend 2 weeks testing my authn / authz code and then peacefully use the easier-to-use IDs (numbers), rather than just assuming my authn / authz code is broken and trying to mitigate that risk by using UUIDs.

I get what you are aiming at but if I was a CTO I absolutely wouldn’t cripple my devs’ daily productivity on the off-chance that random IDs would save part of the data of our starting business in case of a breach. Let them use whatever makes their daily work easy! Most devs suck at security anyway so it’s likely worth it to bring one security-focused person on the team. I’d work with professionals to secure our servers and maybe audit the authn / authz code myself.

Again, cost-benefit analysis. It’s natural that you as a professional in the area will always be pushing for your expertise to be more utilized but there are many other factors at play here.

(I guess one rather bad analogy would be that many mathematicians can say “why aren’t all programming languages strictly and strongly typed, there’s formally proven math clearly demonstrating you get less bugs that way!” but still, almost nobody cares because a lot of people don’t want to spend the extra effort upfront to guarantee smooth sailing later.)

Exadra37 · March 31, 2021, 9:37am

As I say here:

Exadra37:

dimitarvp:

If my authentication system is broken I’d not sleep until I get it right. Not to mention that I’ll definitely test it properly to remove the possibility of the most trivial attacks.

You can have authentication and authorization working correctly, but if they aren’t invoked in the right places, then you expose data that you should not be exposing, and this happens a lot in complex backends, despite the authentication/authorization code itself being well tested.

For example, you add an endpoint to your application and you put it behind authentication, but then you forget to check whether the logged-in user is authorized to access it. Another scenario is when some code is touched and a dev accidentally removes authorization or relocates it to a place where it is no longer effective.
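To make the first scenario concrete, here is a minimal Python sketch, not tied to any framework. `DOCUMENTS`, `Forbidden`, `get_document_unsafe` and `get_document` are hypothetical names invented for illustration. It shows an endpoint that sits behind authentication but skips the per-object authorization check, next to a version that performs it.

```python
# Hypothetical in-memory data; names and structure are assumptions for illustration.
DOCUMENTS = {
    1: {"owner": "alice", "body": "alice's invoice"},
    2: {"owner": "bob", "body": "bob's invoice"},
}


class Forbidden(Exception):
    """Raised when an authenticated user may not access a resource."""


def get_document_unsafe(current_user, doc_id):
    # Authentication only: we know who the caller is, but never check
    # whether the document belongs to them.
    if current_user is None:
        raise PermissionError("login required")
    return DOCUMENTS[doc_id]["body"]


def get_document(current_user, doc_id):
    # Authentication plus the object-level authorization check.
    if current_user is None:
        raise PermissionError("login required")
    doc = DOCUMENTS[doc_id]
    if doc["owner"] != current_user:
        raise Forbidden(f"{current_user} may not read document {doc_id}")
    return doc["body"]
```

Both functions would pass a test that only checks "anonymous users are rejected", which is why a well-tested auth library does not by itself catch this kind of omission.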

You can test it as much as you want, one day you or another developer will misuse it, and an endpoint will leak data it shouldn’t, therefore this has nothing to do with server security:

slouchpie · March 31, 2021, 10:45am

That’s truth! Number 1 on the list should be “apathetic devs”.

dimitarvp · March 31, 2021, 11:03am

Well to be honest, nobody is giving them an incentive to be attentive in these matters.

Exadra37 · March 31, 2021, 2:26pm

Just showed now in my Twitter feed:

https://twitter.com/gabsmashh/status/1377235026358714369

One of my favorite replies:

https://twitter.com/Joe_YadaYada/status/1377250569153298438

stevensonmt · March 31, 2021, 3:07pm

dimitarvp:

Exadra37:

And in practice one of the main reasons was the constant growth of data breaches happening

There are a lot of programmers out there who only do this for the money, hate their job, and are just ticking boxes to make sure they are not fired – that’s the reason for the ton of data breaches. Not because somebody doesn’t follow a magical gospel with magical steps that magically fix security by blindly following it all.
…
Cost-benefit analysis, man. I prefer to spend 2 weeks testing my authn / authz code and then peacefully use the easier-to-use IDs (numbers), rather than just assuming my authn / authz code is broken and trying to mitigate that risk by using UUIDs.

I get what you are aiming at but if I was a CTO I absolutely wouldn’t cripple my devs’ daily productivity on the off-chance that random IDs would save part of the data of our starting business in case of a breach. Let them use whatever makes their daily work easy! Most devs suck at security anyway so it’s likely worth it to bring one security-focused person on the team. I’d work with professionals to secure our servers and maybe audit the authn / authz code myself.

I am out of my depth on the technical merits of this discussion, but I think there are two logical/practical comments to make on this post independent of technical implementations.

In any industry where errors occur, focusing on the merits and actions of individuals tends to be less effective than focusing on producing more robust systems that help reduce errors. In healthcare for instance, requiring two nurses to acknowledge correct handling of controlled substances in the hospital reduces errors both intentional and otherwise. I think what @Exadra37 is suggesting is that software development should follow an ingrained system that makes authorization errors harder to accidentally fall into.
The hypothetical CTO focused on devs’ daily productivity should probably hope their app does not gain a huge user base. I would imagine improving security at a late stage of development is much more difficult than implementing it correctly from the beginning. That up-front investment of time will almost certainly end up better in the long run for the users and thus the company. Maybe there is data to show that economically the company is better off pushing a product out and trying to deal with any security issues later in a whack-a-mole fashion. This is where my lack of experience might be limiting my viewpoint.

stevensonmt · March 31, 2021, 3:16pm

I am working on a toy project right now that I implemented slugs for the URLs that can change and had not considered the need to redirect from old bookmarks/links. Would this be potentially as simple as creating a table of “redirect_slugs” with fields of “old slug” and “new slug” and handling the redirect in the router if the URL matches an entry in the “old slug” field of the table?

Exadra37 · March 31, 2021, 3:16pm

Exactly

I am a very strong believer in security as an opt-out, not an opt-in, i.e. being secure by default and not as an afterthought, as is common practice in our industry at several layers of the development and deployment stages of software.

It’s more difficult, and sometimes never done properly due to the constraints of the current system, and costs a lot more, especially when done after a data-breach.

Exadra37 · March 31, 2021, 3:18pm

That’s how the e-commerce platforms I have worked on do it.

Do the redirect as a 301 or 308.
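A minimal sketch of that redirect table in Python, assuming hypothetical `ARTICLES` and `REDIRECT_SLUGS` dicts standing in for database tables and a made-up `/articles/` URL prefix. On every rename it repoints older redirects at the newest slug, so a lookup never needs more than one hop. It returns 301 here; 308 is the variant that also requires the client to keep the original request method.

```python
# Hypothetical tables, modelled as dicts: live slugs, and old slug -> current slug.
ARTICLES = {"elixir-tips": "Elixir tips"}
REDIRECT_SLUGS = {"elixir-tricks": "elixir-tips"}


def rename_slug(old_slug, new_slug):
    """Rename an article and keep every older slug pointing at the newest one."""
    ARTICLES[new_slug] = ARTICLES.pop(old_slug)
    # Repoint existing redirects so no request ever follows a chain.
    for source, target in list(REDIRECT_SLUGS.items()):
        if target == old_slug:
            REDIRECT_SLUGS[source] = new_slug
    REDIRECT_SLUGS[old_slug] = new_slug
    # A slug that is live again must not redirect away from itself.
    REDIRECT_SLUGS.pop(new_slug, None)


def route(slug):
    """Return (status, payload): 200 with content, 301 with location, or 404."""
    if slug in ARTICLES:
        return 200, ARTICLES[slug]
    if slug in REDIRECT_SLUGS:
        return 301, f"/articles/{REDIRECT_SLUGS[slug]}"
    return 404, None
```

For example, after `rename_slug("elixir-tips", "elixir-guide")` both `elixir-tricks` and `elixir-tips` redirect straight to `/articles/elixir-guide`.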

dimitarvp · March 31, 2021, 3:24pm

But that’s exactly what I’m suggesting: pay proper attention to security at the start and then leave your devs to work in peace.

IMO using obscure IDs is basically a damage-control measure. Not exactly security by itself although it’s kind of sort of a security measure because it introduces obscurity.

Exadra37 · March 31, 2021, 3:26pm

We disagree here.

They are not a damage-control measure, they are a damage-prevention measure for when authentication/authorization controls fail.

dimitarvp · March 31, 2021, 3:31pm

Well, I’m no security professional so I admit partial ignorance here.

It’s just that to me if your authn / authz controls fail then all bets are off and the game is already lost. But that’s likely not always true (it seems to me that this is what you’re saying — if I’m reading you correctly).

Exadra37 · March 31, 2021, 3:37pm

Pretty much any API out there has authentication/authorization flaws, but they are usually only discovered by hackers or security researchers, like Alissa Knight in the mHealth study, and many more cases come up on a weekly basis. You just need to start following the right newsletters and people on Twitter and LinkedIn.

Just to be more specific: you can design a bulletproof authentication/authorization library, but if it is not used correctly in the right places of your code, that’s when things go sideways. In other words, I am not directly calling the authentication/authorization code into question, but how it’s used, or not used when it should have been.

dorgan · March 31, 2021, 4:04pm

The enumeration issue is of course context dependent, but enumerating public resources can still be an issue (though not a data breach).

I experienced this myself. My website allows users to post anything, think Twitter-like platforms. Back when I just used the autoincremented ids in the URLs, someone had the brilliant idea of making an “archive” of posts, so you could see the posts even if they were deleted by my moderation team. They would ping the next id and store it once its contents were finally created. Rinse and repeat, and they stored posts as they were published. And I didn’t even have an API.

And sure, it’s quite harmless, like the Internet Archive, but then there’s content that you want to be gone for good (like child porn, doxxed people’s data and other gross/illegal stuff) that is now easily accessible, linked to your brand and completely out of your control.

The solution was to hash the ids with something like https://hashids.org/. I contributed an example to the Ecto docs with a simple example of this use case: Ecto.Type — Ecto v3.5.8

Was it a vulnerability? No. Was it harmful to my project? Yes. In this case I wanted to limit the spread of harmful content.
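As a rough illustration of the idea of hiding sequential ids behind a reversible public token, here is a minimal Python sketch. It is not the Hashids algorithm and not the Ecto example mentioned above; it swaps in a simple multiplicative scramble modulo 2**32 followed by base-62 encoding. `encode_id`, `decode_id` and all constants are assumptions made up for this sketch, and like any such scheme it is obfuscation, not cryptography.

```python
import string

# Assumptions for illustration: a 32-bit id space and arbitrary constants.
# This is NOT the Hashids algorithm, only a simple reversible scrambling.
MODULUS = 2**32
MULTIPLIER = 2654435761  # odd, so it has an inverse modulo 2**32
INVERSE = pow(MULTIPLIER, -1, MODULUS)  # needs Python 3.8+
XOR_MASK = 0x5F3759DF  # arbitrary constant, kept private in a real setup
ALPHABET = string.digits + string.ascii_letters  # 62 symbols


def encode_id(n):
    """Map a sequential integer id to a short, non-sequential public token."""
    if not 0 <= n < MODULUS:
        raise ValueError("id out of range")
    scrambled = ((n * MULTIPLIER) % MODULUS) ^ XOR_MASK
    token = ""
    while True:
        scrambled, digit = divmod(scrambled, len(ALPHABET))
        token = ALPHABET[digit] + token
        if scrambled == 0:
            return token


def decode_id(token):
    """Invert encode_id; raises ValueError for tokens it could not have produced."""
    scrambled = 0
    for char in token:
        scrambled = scrambled * len(ALPHABET) + ALPHABET.index(char)
    if scrambled >= MODULUS:
        raise ValueError("token out of range")
    return ((scrambled ^ XOR_MASK) * INVERSE) % MODULUS
```

The database keeps its autoincremented integer, while URLs carry `encode_id(post.id)`; consecutive ids no longer produce consecutive URLs, which is what stops the "ping the next id" scraper.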

chulkilee · March 31, 2021, 4:21pm

This is the main point of this thread. Don’t beat around the bush! (see the later section on what could be done better)

First of all, we all agree here on the impact of predictable IDs when the system does not protect the resource correctly.

However, think about the following:

Do we have to use non-numerical ids always?
Can we always use non-numerical ids?
Do we have to consider the use of numerical ids as “insecure”?
- and do we have to fix it ASAP?

I don’t think we can just say “you MUST NOT use numerical ids”. I’m not sure what you’re trying to say - probably “SHOULD NOT”?

I think the current “How To Prevent” section is well written enough. It just needs to be read

I agree that the default configuration or examples should adopt good practices for educational purposes. There would be some trade-offs (e.g. not every database adapter implements the binary id) - but we may push it a little bit more. This is not Elixir specific though. You may write it up and submit to Hacker News…

@Exadra37 I think we can discuss the actual problem without beating around the bush if we do the following.

Don’t provide a specific alternative, since it could lead to off-topic discussion; slug is a bad example.
Present background, facts, and details at once to support your argument, so that we can avoid ping-pong on details and focus on what actually matters!
Use clear and consistent wording - for example I always use RFC 2119 for written proposals

No, it’s not. The main purpose of GDPR is to protect your personal data. The main problem GDPR wants to fix is the collection and sharing of personal data without consent.

I didn’t say that - seems like you’re assuming I’ve done that? I said “I’d like to correct things, as I’ve also done security compliance covering this!”, because my words are based on actual work done in the domain.

I totally agree with this. Actually, I think “you must not use numerical ids since it’s insecure by nature” also leads to “oh, I use random ids so it’s secure” laziness!

BTW I’ve seen this in the real world - publicly accessible S3 object or CDN artifacts with random ids

sfusato · March 31, 2021, 4:28pm

This is one of those things that sound simple at first sight but can get out of hand very quickly. You can easily end up in an endless loop of redirects, or have to keep every past slug’s redirect updated on each change, forever. Amazon, Reddit and most of the huge platforms use the /:id/:slug pattern, which is resilient by design.
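A minimal Python sketch of how the /:id/:slug pattern can be handled, assuming a hypothetical `POSTS` store keyed by an opaque id and a made-up `/posts/` prefix. The id alone selects the record; the slug is decorative, so a stale or missing slug costs at most one redirect and no history of old slugs has to be kept.

```python
import re

# Hypothetical store keyed by an opaque, stable id; the slug is free to change.
POSTS = {"k3x9q": {"slug": "api-url-security", "title": "API url security"}}

ROUTE = re.compile(r"^/posts/(?P<id>[^/]+)(?:/(?P<slug>[^/]*))?$")


def route(path):
    """Return (status, payload) for a /posts/:id/:slug request."""
    match = ROUTE.match(path)
    if match is None or match["id"] not in POSTS:
        return 404, None
    post = POSTS[match["id"]]
    canonical = f"/posts/{match['id']}/{post['slug']}"
    if path != canonical:
        # Stale or missing slug: a single redirect to the canonical URL.
        return 301, canonical
    return 200, post["title"]
```

Renaming a post only means updating `post["slug"]`; every old link keeps working because the lookup never depended on the slug.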

Keep it simple, stupid, and use non-numerical ids if you want to avoid the enumeration issue. Obviously, it goes without saying that for sensitive data you should use proper authn/authz measures and not rely on “random ids”.

Exadra37 · March 31, 2021, 4:38pm

This is a split topic from another one, and the slug example is because of that.

And I say that in the topic:

I am not a user of Hacker News, nor do I like to use it, but thanks for the suggestion.

No, I was not assuming, I just did not want you to take my observation about compliance as a personal attack

Yes, they can be accessible, but you cannot enumerate the rest of the ids if a URL is in the form of example.com/something/asd123dffrt, whereas you can easily do it if it is in the form of example.com/something/123.

The main point about not using numeric IDs is to protect you from enumeration of all other resources when a mis-configuration of authentication/authorization or public/private access occurs.
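The difference can be sketched in a few lines of Python. The two stores and both function names below are hypothetical; they model resources that were accidentally left readable by anyone, once keyed by sequential integers and once by random 128-bit tokens from the standard `secrets` module.

```python
import secrets

# Hypothetical resources that were accidentally left readable by anyone.
NUMERIC = {n: f"record {n}" for n in range(1, 101)}
RANDOM = {secrets.token_urlsafe(16): f"record {n}" for n in range(1, 101)}


def enumerate_numeric(store, attempts):
    """Walk 1, 2, 3, ... the way a scraper would against /something/123."""
    return [store[n] for n in range(1, attempts + 1) if n in store]


def enumerate_random(store, attempts):
    """Guess random 128-bit tokens against /something/<token>."""
    guesses = (secrets.token_urlsafe(16) for _ in range(attempts))
    return [store[guess] for guess in guesses if guess in store]
```

With sequential ids, 100 requests recover all 100 records; with 128-bit random tokens, blind guessing is not expected to find any in a realistic number of attempts. This only covers guessing: a token that is leaked or listed somewhere is exactly as exposed as a numeric id.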

Exadra37 · March 31, 2021, 4:41pm

I am not proposing the use of random IDs as a solo security mechanism. I am proposing not to use numeric IDs, in order to avoid resource enumeration when authentication/authorization fails to be used correctly.

Exadra37 · March 31, 2021, 4:48pm

Completely agree

A very good example why numeric IDs must not be used by default

Since they are used by default by all frameworks in all programming languages, it’s easy to fall into the trap of building an application with them without realizing the consequences, until you have problems like yours.

A good example of how your brand image can be affected and you would not have any control of how to stop it.

tangui · March 31, 2021, 4:48pm

I’m a bit surprised by this assertion because if it was true, there would be dozens if not hundreds of massive data leaks per week. Any source to back this claim is welcome

Using random IDs might actually not help if API access control is flawed: if user A is allowed to make requests on behalf of B, then he can probably list B’s resources and get their random IDs as well.

There are also other means of dealing with authentication & authorization flaws at the API level:

pentests on a regular basis
setting up periodic code audits (especially for object-level authz)
create in-house API for fine-grained authz
use of standard protocols (OAuth2, mTLS, …)
use of well-recognized libraries for access-control (java: Spring Security, Apache: mod_auth_openidc…)
not relying on underpaid subcontractors

Of course all of that depends on the sensitivity of the data and cost-benefit analysis.

I think assuming authentication & authorization is or will necessarily be flawed at some point is an exaggeration from a cybersecurity point of view. For sure it requires extra attention and care, but getting it right is far from impossible.