Arbiter

type1fool · July 6, 2021, 3:46am

Backstory

Before I became a full-time developer on a PHP team in 2018, I had been studying role-based authorization (RBA) for an Elixir application I was developing part-time. Ever since, I have missed working in Elixir.

To sharpen my Elixir skills, I want to create an RBA library. There are a few existing libraries, but they seem to be intended for single-tenant applications. I think there is room for a multi-tenant library for B2B applications that serve multiple businesses.

Arbiter on GitHub

So far, I have written about the intent and preliminary design in the readme. I have also created the modules I expect to need, and started writing tests for the Organization module. I’m hoping to TDD this project to discover the optimal abstractions and to satisfy potential security requirements.

As I get started on the design, I want to hear from potential users to test my ideas. It’s worth noting that this is my first open source project. I have been working on internal PHP applications and a custom framework, but nothing public-facing. I’ve learned the importance of hearing from users before spending hours/weeks on implementation.

Question Time

- Have you used one of the existing libraries or built custom RBA tools?
  - What were the benefits and pain points?
- Do I need a Policy module?
- What are some possible hard requirements for adoption?
- Could the library accept (potentially decorate) a User struct from other auth libraries/modules?
  - How?
- If Ecto is a dependency, should it default to ETS, SQLite, or Postgres?
  - Could/should the library avoid persistence entirely, leaving that to the user?
- Would telemetry be useful in this kind of library?
- Do you have any other suggestions?

thojanssens1 · July 6, 2021, 5:29am

Personally I think what’s missing is the API you eventually want to provide for the library users.

I came to the conclusion that authentication and authorization are way too application-specific to make a library out of them. "Authorization" is actually vague, as it can mean many different systems.
But change my mind.

To elaborate on this thought, take for example authentication and Pow. Pow requires you to inject code (use) in many layers of your application (controllers, views, routes, etc.) for it to work. Even though many devs like and use it, it makes more sense to me to use a scaffolding tool like phx.gen.auth that provides a codebase which you can customize to your application's needs. The fact that Pow must inject code application-wide is for me a clear sign that this should be application-specific code.

A lot of the topics and questions you raise concern business logic (user, team, organization, etc.) and I suspect the library might impose a minimum of conventions to follow in one’s business logic for it to work, but that is something I’ll never want to compromise. The library should work over the existing business logic without a change.

In my case, I work on calendars and a huge part of the authorization is sharing calendars between users with permissions. I have a CalendarSharingPermission Ecto schema that holds those permissions. Would I want to replace this calendar sharing table by a more general authorization solution? I think not, as it’s actually quite complex and specific, and I would also lose in terms of semantics and readability, and customization and flexibility.

I’m not sure what the library’s goal is (as I said, a demo API/usage is missing), but you might consider the alternative of writing a blog post with your knowledge about authorization systems and providing sample code that someone can copy over and customize (similar to scaffolding).

al2o3cr · July 6, 2021, 2:54pm

General advice: for the first pass at the idea, MAKE DECISIONS. Yes, eventually you might need to support primary keys of various types or totally different persistence - but trying to force everything to be too generic too early is a recipe for over-abstraction (see also… most of the last 30 years of OOP).

As @thojanssens1 mentioned, the hard part of RBAC is the interface the rest of the system uses: how does application code tell Arbiter what an “action” is? How will Arbiter answer questions like “show me the most recent 20 Widget records that this user can see”?

A good way to learn the answers to these questions (and have a nice demo to show people) is to build your abstractions inside a demo application and only then try to abstract them into a library.

On the organizations branch, it feels like the documentation is already suggesting a limitation of the interface. In this commit this example is added:

Clerks may start an order, add items, remove items, place an order on hold, accept payment method(s), and complete the order.

    permissions: [
      {:order, [:create, :update]},
    ],

Managers would likely include the same permissions as clerks, but would also be able to cancel orders, process refunds, and view reports.

    permissions: [
      {:order, [:create, :read, :update, :delete]},
      {:report, [:read]},
    ],

The words used to describe the permissions (“clerks can place an order on hold”) are much more specific than the actions used to represent the permissions (order: [:create, :update]).

For instance, can a clerk update an order that’s already been shipped? Can a clerk EVER update the address on an order? “update” is very wide…

Some of this may be a matter of picking a better data model - for instance, only giving managers the “process refund” permission is tricky if it’s viewed as an “order update”, but easier if it’s a “refund create”.
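To make the data-model point concrete, here is a minimal sketch that reuses the tuple-style permissions quoted above and models the refund as its own resource. The module name, `allowed?/3`, and the `:refund` entry are assumptions for illustration, not part of Arbiter.

```elixir
defmodule PermissionSketch do
  # Role => keyword list of {resource, allowed actions}.
  # The clerk and manager entries follow the README example quoted above;
  # the :refund resource is a hypothetical addition.
  @roles %{
    clerk: [order: [:create, :update]],
    manager: [
      order: [:create, :read, :update, :delete],
      report: [:read],
      refund: [:create]
    ]
  }

  # True when the role lists the action for the given resource.
  def allowed?(role, resource, action) do
    @roles
    |> Map.get(role, [])
    |> Keyword.get(resource, [])
    |> Enum.member?(action)
  end
end

PermissionSketch.allowed?(:manager, :refund, :create)
PermissionSketch.allowed?(:clerk, :refund, :create)
```

With "refund create" as a separate permission, there is no need to carve an exception out of a broad order update.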

type1fool · July 6, 2021, 8:07pm

Thank you for the thoughtful responses @thojanssens1 & @al2o3cr!

While it wouldn’t fit every use case, I do think there is a broadly useful API waiting to be discovered.

I have been thinking primarily in terms of using Phoenix routes as resources, and their public functions as actions. If the library is dealing with the typical actions :index, :show, :new, :create, :edit, :update, :delete, I can seed default permissions for each of these. Those atom tuples were a simplified version of what I have in mind.

This is still a very early stage project, so you may be right that the readme and some of the modules are a little over-prescribed. I wanted to get as much of the idea into markdown ASAP in part so I could get the kind of valuable feedback you guys have provided.

There would be an interface for managing permissions (and data for each of the modules), which should provide devs flexibility to follow a strict-ish CRUD design for resources; or they could add custom permissions to be used across roles and resources.

al2o3cr:

For instance, can a clerk update an order that’s already been shipped? Can a clerk EVER update the address on an order? “update” is very wide…

In this example, the org might create different resources for different types of orders and for orders with various attributes. Maybe the Resource struct needs a where field to identify subsets of a resource type.

More Detailed Example

%Resource{
  name: "Pending Order",
  route: Routes.order_path(...), # just a thought
  where: [
    [:status, =, :pending] # 🤔
  ],
  ...
}

yesterday = DateTime.utc_now() |> DateTime.add(-86400)

%Resource{
  name: "Recently Shipped Order",
  route: Routes.order_path(...),
  where: [
    [:status, =, :shipped], # 🤔
    [:shipped_at, <, yesterday]
  ],
  ...
}

%Role{
  name: "Clerk",
  org_id: 1234,
  permissions: [
    %Permission{
      resource: %Resource{name: "Pending Order", ...},
      actions: [:index, :view, :edit, :update]
    },
    %Permission{
      resource: %Resource{name: "Recently Shipped Order", ...},
      actions: [:index, :view]
    },
  ],
}

%Role{
  name: "Manager",
  org_id: 1234,
  permissions: [
    %Permission{
      resource: %Resource{name: "Pending Order", ...},
      actions: [:index, :view, :new, :create, :edit, :update, :delete] # soft delete
    },
    %Permission{
      resource: %Resource{name: "Recently Shipped Order", ...},
      actions: [:index, :view, :edit, :update]
    },
  ],
}
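The `where` conditions above are not valid Elixir as written (a bare `=` or `<` cannot be a list element). One possible encoding, sketched here under assumptions, is a `{field, op, value}` tuple with the operator as an atom, plus a small evaluator. The module name, the operator atoms `:eq`, `:before`, and `:after`, and the reading of "recently shipped" as "shipped within the last day" are all hypothetical.

```elixir
defmodule WhereSketch do
  # True when every {field, op, value} condition holds for the entity (a map).
  def matches?(entity, conditions) do
    Enum.all?(conditions, fn {field, op, value} ->
      compare(Map.get(entity, field), op, value)
    end)
  end

  # Plain equality for atoms, strings, numbers, and so on.
  defp compare(actual, :eq, expected), do: actual == expected

  # DateTime structs are compared with DateTime.compare/2, not < or >.
  defp compare(%DateTime{} = actual, :before, %DateTime{} = limit),
    do: DateTime.compare(actual, limit) == :lt

  defp compare(%DateTime{} = actual, :after, %DateTime{} = limit),
    do: DateTime.compare(actual, limit) == :gt

  # Unknown operators or missing fields never match.
  defp compare(_actual, _op, _value), do: false
end

yesterday = DateTime.add(DateTime.utc_now(), -86_400, :second)
recently_shipped = [{:status, :eq, :shipped}, {:shipped_at, :after, yesterday}]

order = %{status: :shipped, shipped_at: DateTime.utc_now()}
WhereSketch.matches?(order, recently_shipped)
```

Tuples of atoms and values like these are plain data, but a value such as `yesterday` is computed at call time, so it would still need a relative representation to be stored.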

al2o3cr · July 7, 2021, 12:32am

type1fool:

yesterday = DateTime.utc_now() |> DateTime.add(-86400)

%Resource{
  name: "Recently Shipped Order",
  route: Routes.order_path(...),
  where: [
    [:status, =, :shipped], # 🤔
    [:shipped_at, <, yesterday]
  ],

This seems like it would be… challenging to store in the DB

The where syntax could be useful for properties directly attached to the resource (like status and shipped_at), but it would get complicated for properties of related resources. For instance, in a region-based sales system “recently shipped orders in the current user’s district”.

Right here is one of the pain points of trying to design a “generic” authorization system: the result may be a system whose configuration grows in complexity without bound. To balance that, consider establishing a set of “application personas” (see also the “user persona” practice in design); each “application persona” is a detailed description of a system built using RBAC. The example in your organizations branch is a good start at one.

One other thing to consider: designing good data structures is about “what questions should this make easy to ask?”. Consuming the access control (hooking into controllers, or similar mechanism) is important to prototype early.

harmon25 · July 7, 2021, 1:41am

I have been considering using Bodyguard (v2.4.1, hexdocs.pm) to handle relatively simple authorization requirements.

Elixir/Erlang’s pattern matching lends itself quite well to this type of code, which is one of the reasons I like Bodyguard.
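As a rough illustration of why pattern matching suits this, here is a plain-Elixir sketch with no dependencies. It mirrors the general shape of Bodyguard's `authorize/3` policy callback (action, user, params), but the module name, the `:role` and `:status` fields, and the rules themselves are assumptions.

```elixir
defmodule OrderPolicySketch do
  # Managers may perform any action on orders.
  def authorize(_action, %{role: :manager}, _order), do: :ok

  # Clerks may update an order only while it is still pending.
  def authorize(:update_order, %{role: :clerk}, %{status: :pending}), do: :ok

  # Everything else is denied by default.
  def authorize(_action, _user, _order), do: {:error, :unauthorized}
end

OrderPolicySketch.authorize(:update_order, %{role: :clerk}, %{status: :pending})
OrderPolicySketch.authorize(:update_order, %{role: :clerk}, %{status: :shipped})
```

Each rule is one function clause, and the final catch-all clause makes denial the default.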

On the more complex end - think AWS/GCP policies. Take a look at Zanzibar: Google’s Consistent, Global Authorization System

And ORY Keto which is “the first and only open source implementation of Zanzibar”

Open Policy Agent is also worth looking into for inspiration. Maybe building on top of something like that would make sense. (OPA is what earlier versions of Keto were built on)

type1fool · July 7, 2021, 2:28am

Excellent resources! Thanks!

benwilson512 · July 7, 2021, 2:32am

At the risk of nitpicking, this appears to move into the realm of ABAC, not RBAC. The difference is that in RBAC, permissions are determinable on the basis of whether the permissions required for the operation overlap with the permissions granted by the user’s roles. In ABAC (Attribute-Based Access Control), properties of the entities the operation is being performed on are allowed to be relevant.

The danger with ABAC is, I think, well exemplified here. It becomes very tempting to try to implement far too much business logic in the permission system layer. Not all validation failures are access control limitations.

Personally I have come to prefer RBAC inside of scopes. It ends up being far easier to reason about.

type1fool · July 9, 2021, 12:57am

This clarified things a bit for me.

In other words, RBAC isn’t concerned with an entity’s ability to be edited, only the user’s ability to edit this type of entity. So, it would be up to the developer to add entity checks before or after RBAC checks.
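A minimal sketch of that composition, assuming a user map with a `:roles` list and an order map with a `:status` field (all names here are hypothetical): the role check runs first and knows nothing about the order, and the entity check is ordinary application code.

```elixir
defmodule ComposeSketch do
  # Run the RBAC check, then the entity check; the first failure is returned as-is.
  def edit_order(user, order) do
    with :ok <- role_check(user),
         :ok <- entity_check(order) do
      {:ok, order}
    end
  end

  # RBAC step: does the user hold any role that may edit orders?
  defp role_check(%{roles: roles}) do
    if Enum.any?(roles, &(&1 in [:clerk, :manager])) do
      :ok
    else
      {:error, :forbidden}
    end
  end

  # Entity step: is this particular order still editable?
  defp entity_check(%{status: :pending}), do: :ok
  defp entity_check(_order), do: {:error, :not_editable}
end

ComposeSketch.edit_order(%{roles: [:clerk]}, %{status: :pending})
ComposeSketch.edit_order(%{roles: [:clerk]}, %{status: :shipped})
```

The two error atoms stay distinct, so a caller can tell an access-control failure from a business-rule failure.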

Using scopes, it seems that Arbiter could be a simple-ish RBAC that composes nicely with custom entity checks. Plenty of room to grow from there. I’m hoping to have more time to flesh this out in code over the weekend.

@harmon25 Thanks again for posting those links. Most of the complexity in the Zanzibar article seems to result from its distributed design. For now, the happy path is non-distributed applications.

benwilson512 · July 9, 2021, 3:06pm

This is definitely in the right direction. I think the only other caveat I’d add is that RBAC doesn’t tend to think in terms of types of entities either, instead focusing on operations, and the permissions required to perform that operation. Those operations can be sort of parameterized by type, but those types are largely opaque to the RBAC system.

So you’d have operations like :start_shipment or :onboard_user or :edit_billing_info and those operations will require one or more permissions. Clearly there are entities involved there but the RBAC logic doesn’t really know or care about users or shipments, just the named operations. Then it looks to match the permissions required by those operations up to permissions granted to an actor according to the roles that actor possesses.

I think this is a good plan! Just to clarify what I have in mind by scope, it isn’t usually about “this kind of entity” but rather “within this organization, user A has these roles, but within some other organization A has different roles”. Operations are then performed within the context of an organization, and the relevant roles applied.
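The description above could be sketched roughly as follows. The operation names come from the post; the permission atoms, role names, module name, and the shape of the actor map (organization id mapped to a list of roles) are assumptions for illustration.

```elixir
defmodule ScopedRbacSketch do
  # Operation => permissions it requires (permission names are hypothetical).
  @required %{
    start_shipment: [:shipments_write],
    onboard_user: [:users_write],
    edit_billing_info: [:billing_write]
  }

  # Role => permissions it grants (role names are hypothetical).
  @granted %{
    clerk: [:shipments_write],
    admin: [:shipments_write, :users_write, :billing_write]
  }

  # The actor's roles are looked up per organization (the scope).
  def can?(%{roles: roles_by_org}, org_id, operation) do
    held =
      roles_by_org
      |> Map.get(org_id, [])
      |> Enum.flat_map(&Map.get(@granted, &1, []))

    case Map.fetch(@required, operation) do
      {:ok, needed} -> Enum.all?(needed, &(&1 in held))
      # Unknown operations are denied.
      :error -> false
    end
  end
end

actor = %{roles: %{1234 => [:admin], 5678 => [:clerk]}}
ScopedRbacSketch.can?(actor, 1234, :edit_billing_info) # true: admin in org 1234
ScopedRbacSketch.can?(actor, 5678, :edit_billing_info) # false: only a clerk in org 5678
```

The check never inspects a shipment or a user record; it only matches named operations against the permissions the actor's roles grant in the given organization.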