Below are various questions and answers about general authorization concepts. All of this information is general-purpose educational content and is not specific to Authzed in any way.
What is AuthN vs AuthZ?
Authentication is the act of proving or establishing a user's identity. In most applications this manifests itself as "log in", and verifies to the application which "entity" (typically a user) is accessing the application.
Authorization is the act of proving that an entity (user) has permission to access a resource, perform some action, or is otherwise permitted somewhere.
In most applications, authorization is handled by having inline permissions checks, where code will check the user’s permission before performing an action or returning some data:
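A minimal sketch of such an inline check, in Python. The `User`, `Document`, `can_view`, and `read_document` names are hypothetical and stand in for whatever an application actually defines.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str


@dataclass
class Document:
    body: str
    # Names of the users allowed to view this document (hypothetical model).
    viewers: set[str] = field(default_factory=set)


def can_view(user: User, doc: Document) -> bool:
    """Authorization: may this already-authenticated user view the document?"""
    return user.name in doc.viewers


def read_document(user: User, doc: Document) -> str:
    # The inline check sits directly in front of the protected operation.
    if not can_view(user, doc):
        raise PermissionError(f"{user.name} may not view this document")
    return doc.body
```

Authentication has already happened by the time `read_document` runs; the check only decides whether the known user may proceed.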
What is RBAC?
RBAC stands for Role Based Access Control.
RBAC is an extremely common design pattern for permissions systems where each user is assigned a role, from which the user's permissions are derived. When a user changes roles, so do their permissions. This is as specific as the definition gets because implementations can vary greatly.
A note-taking application could have two roles defined for each note: editor and reader.
Editors can view and edit notes. Readers can view notes, but not alter them.
When the application needs to check if a user can edit a note, it checks to see if the user has the editor role. When the application needs to check if a user can view a note, it checks to see if the user has the editor or the reader role.
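The two checks described above could be sketched as follows. The `ROLES` and `ALLOWED_ROLES` tables and the `check` function are illustrative names, not part of any particular library.

```python
# Hypothetical role assignments: (user, note) -> role.
ROLES = {
    ("alice", "note-1"): "editor",
    ("bob", "note-1"): "reader",
}

# Each action lists the roles that grant it.
ALLOWED_ROLES = {
    "edit": {"editor"},
    "view": {"editor", "reader"},
}


def check(user: str, action: str, note: str) -> bool:
    """Return True if the user's role on the note grants the action."""
    role = ROLES.get((user, note))  # None when the user has no role
    return role in ALLOWED_ROLES[action]
```

Here `check("bob", "view", "note-1")` is allowed while `check("bob", "edit", "note-1")` is not, because the reader role appears only in the set for `view`.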
Because RBAC is a very general design, it leaves lots of details unexplored for implementers to discover and decide the best behavior for their application. Often fleshing out these details and implementing additional behavior is referred to as Advanced RBAC.
Because permissions are tied to particular roles, users that do not perfectly fit into these roles cannot be represented. RBAC also doesn't specify whether roles are required to exist for every object.
For example, a note-taking application could check whether a user has the admin role on a collection of notes rather than the admin role on the particular note the user is accessing.
Groups are sets of users organized into arbitrary collections that can be assigned roles. A naive implementation of RBAC uses the roles themselves as groups, but this can be confusing and limiting for users.
Groups can also have hierarchies where the users are inherited into other groups. For example, the users in an owners group might be included in every group in the application.
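One way to picture inherited group membership is a recursive lookup. This is only a sketch; the group names and the `DIRECT_MEMBERS` and `INCLUDED_GROUPS` tables are assumptions made for illustration.

```python
# Hypothetical data: direct members of each group, and which groups are
# inherited into (nested inside) each group.
DIRECT_MEMBERS = {
    "owners": {"alice"},
    "engineering": {"bob"},
    "support": {"carol"},
}
INCLUDED_GROUPS = {
    "engineering": {"owners"},
    "support": {"owners"},
}


def members(group: str, _seen: frozenset = frozenset()) -> set:
    """Return direct members plus members inherited from nested groups."""
    if group in _seen:  # guard against cycles in the hierarchy
        return set()
    result = set(DIRECT_MEMBERS.get(group, set()))
    for nested in INCLUDED_GROUPS.get(group, set()):
        result |= members(nested, _seen | {group})
    return result
```

With this data, the owners group is included in both other groups, so `members("engineering")` contains both `bob` and `alice`.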
Roles may or may not be implemented to exist within a hierarchy.
For example, an admin role on a group of users might mean that admins can only manage group membership or it could also transitively inherit admin roles on each of the resources owned by the team.
It is not immediately obvious if every user should have roles assigned by default. Users without roles can be considered a bug or interpreted as having an implicit role.
Assigning roles when it isn't user-driven
Descriptions of RBAC use examples where an administrative user assigns roles to new users. However, there are events that can occur in applications where a human is not assigning roles manually. These can be simple, such as "creating a resource assigns the creator the admin role." However, not all of these scenarios will have obvious or intuitive solutions.
Imagine an application whose users originate from another system. Events such as a user's first login, which creates their account, put the application in an awkward spot. It can either assign no roles at all or guess which roles to assign to the new user. If the application does the former, administrative users will have to manually assign roles for every user. If the application does the latter, roles could be assigned incorrectly, and the application and the user system likely need to become deeply coupled to be correct.
Enterprises often list RBAC as a requirement for applications that they want to adopt. Businesses have different roles of employees that will use the software, so it's intuitive that they would require a permission system based around roles. However, it may not be intuitive what functionality they need beyond the simplest definition of RBAC.
Predefined groups might not be enough for enterprises attempting to map their existing company structure into an application.
When there are multiple organizations using the same application, there has to be something that owns a set of users or groups for each organization.
On applications like GitHub, this container is actually called an organization and organizations contain user-defined groups called teams which have roles assigned to them.
While GitHub has only one parent, there are times where a container is needed for a set of organizations, and then a parent for that, ad infinitum.
Roles for management
There are a variety of roles that are often overlooked until an enterprise requires them.
- IT or operations employees often require a role for administrating absolutely everything.
- Employees tracking expenses can require access to only configure and collect billing information.
- Auditors or security specialists can require access to only collect logs of actions for compliance.
Integration with authentication systems
There are many companies that already have groups, teams, and roles stored in their authentication service. Enterprises want these users to be able to log into the application -- often with their existing groups matched to particular roles. This sometimes requires active synchronization with protocols such as SCIM and other times the application can lazily create accounts on first sign-in.
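The lazy variant could look roughly like the sketch below. The group names, role names, and the `sign_in` function are hypothetical; a real integration would read the groups from whatever the identity provider returns at login.

```python
# Hypothetical mapping from identity-provider group names to application roles.
GROUP_TO_ROLE = {
    "it-admins": "admin",
    "finance": "billing",
    "security": "auditor",
}

ACCOUNTS: dict[str, set[str]] = {}  # user id -> roles in this application


def sign_in(user_id: str, idp_groups: list[str]) -> set[str]:
    """Lazily create the account on first sign-in and derive roles from groups."""
    if user_id not in ACCOUNTS:
        # Groups with no configured mapping are ignored.
        ACCOUNTS[user_id] = {
            GROUP_TO_ROLE[g] for g in idp_groups if g in GROUP_TO_ROLE
        }
    return ACCOUNTS[user_id]
```

In this sketch roles are derived only once, at account creation, so later group changes in the identity provider are not picked up. That gap is one reason active synchronization is sometimes required instead.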
The following are common enterprise Identity Providers:
What is ABAC?
ABAC stands for Attribute Based Access Control.
Attribute based access control (ABAC) is a slightly more advanced permissions model than RBAC in which each item (user, group, resource, etc.) is assigned attributes. Permissions are derived from those attributes in combination with a policy document.
For example, if we wanted to grant a user permission to change a specific resource, an attribute assigned to the user might be
The policy engine would then be configured to say: if there exists a write-resource attribute for the user matching the resource’s ID, then the user can write to the resource.
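A rough sketch of that rule in Python follows. The attribute shape (a name mapped to a set of resource IDs) and the `Subject`, `Resource`, and `can_write` names are assumptions for illustration; real policy engines express this in their own policy language.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    name: str
    # Attribute name -> set of values; this shape is an assumption.
    attributes: dict[str, set[str]] = field(default_factory=dict)


@dataclass
class Resource:
    id: str


def can_write(user: Subject, resource: Resource) -> bool:
    """Policy: allow writes if a write-resource attribute matches the resource ID."""
    return resource.id in user.attributes.get("write-resource", set())
```

A user created as `Subject("alice", {"write-resource": {"doc-42"}})` can write to the resource with ID `doc-42` and to nothing else.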
Attribute based access control provides for much finer-grained and controlled permissions modeling than RBAC, but at the cost of maintaining a policy document and numerous attributes. ABAC can be used to model RBAC and other simpler permissions systems, but with an additional complexity cost around configuration and validation.
What are other AuthZ models?
- Context Based Access Control (CBAC)
- Capability Based Security
- Discretionary Access Control (DAC)
- Graph-Based Access Control (GBAC)
- Lattice-Based Access Control (LBAC, Label-Based Access Control, Rule-Based Access Control)
  - LBAC is a policy model where users and objects can be combined to define rules. These rules are the minimum requirements to gain access.
- Organization-Based Access Control (OrBAC)
- Mandatory Access Control (MAC)
  - MAC is any system where only a centralized "administrator" controls all policy changes. For example, standard Linux file permissions are not MAC because users can change the permissions of files that they own. Linux does, however, provide the most commonly cited example of MAC: Linux Security Modules, which centrally restrict access to resources in the kernel.
- Relationship-based Access Control (ReBAC)
  - This is the model used by Authzed!
- Rule Set Based Access Control (RSBAC)
What is ACL-Filtering?
ACL, or "Access Control List", in its colloquial usage, is synonymous with the word "permission". ACL filtering is simply filtering a list of objects by whether or not a particular user has access to the items in the list.
Put another way, this is the answer to the question "What are all the things that this user has access to?".
There are a variety of ways that this can be accomplished, depending heavily on the implementation of the authorization system. This document differentiates between prefilters, which filter results before the data is fetched from a database or service, and postfilters, which filter results after they have been fetched.
How does Authzed perform ACL-Filtered List operations?
A prefilter API is currently a work-in-progress. Prefilters query Authzed to return the list of objects that a user can access. The results of this query can be used as the input to a query for selecting items out of a database.
Postfilters query the data source first and feed results into a filter() function provided by an Authzed client library. This function performs Check requests for each item fed into it.
In order to make this strategy efficient, each language's implementation of filter() has performance optimizations such as laziness, batching, and performing check requests in parallel.
Because postfilters are implemented in the Authzed client libraries, they have varying functionality based on which library is used.
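To illustrate the general idea of a lazy, batched postfilter, here is a generic Python sketch. It is not the Authzed client library's filter() implementation; `postfilter`, `check_batch`, and `fake_check_batch` are hypothetical names, and the batch check stands in for whatever permission service is being queried.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def postfilter(
    items: Iterable[str],
    check_batch: Callable[[list[str]], list[bool]],
    batch_size: int = 2,
) -> Iterator[str]:
    """Lazily yield only the items that pass the permission check."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        # One round trip per batch instead of one per item.
        for item, allowed in zip(batch, check_batch(batch)):
            if allowed:
                yield item


# Stand-in for a remote permission check (an assumption for this sketch).
ALLOWED = {"doc-1", "doc-3"}


def fake_check_batch(batch: list[str]) -> list[bool]:
    return [item in ALLOWED for item in batch]
```

Because `postfilter` is a generator, a caller that needs only the first few permitted items stops issuing check requests as soon as it stops iterating.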
To achieve better performance, if a perfectly consistent view of the results is not required, a combination of both filters can be used. The results of a prefilter can be cached and a postfilter run over them to ensure that nothing has been revoked since the results were cached. This will miss items the user has gained access to since caching, but will not return any items that the user has lost access to.
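The trade-off of combining a cached prefilter with a postfilter can be seen in a toy example. Everything here is hypothetical: `cached_ids` stands for a previously cached prefilter result and `current_access` for the permissions as they are now.

```python
def recheck_cached(cached_ids, check):
    """Run a postfilter over cached prefilter results, dropping revoked items."""
    return [object_id for object_id in cached_ids if check(object_id)]


# Cached earlier: the user could access doc-1 and doc-2 at that time.
cached_ids = ["doc-1", "doc-2"]

# Since then, doc-1 was revoked and doc-3 was newly granted.
current_access = {"doc-2", "doc-3"}

result = recheck_cached(cached_ids, lambda object_id: object_id in current_access)
```

The result contains only `doc-2`: the revoked `doc-1` is correctly dropped, while the newly granted `doc-3` is missing until the cache is refreshed.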
How do other systems perform ACL-Filtered List operations?
Policy Engines can only filter potential results by executing policies on each result in a postfilter. This can cause many policies to be executed, but performance overhead can be dealt with by batching and performing execution in parallel.
Homegrown authorization systems that have been modelled in the same relational database as the application usually filter by performing SQL JOINs. This comes with the trade-off that relationships have to be stored in a denormalized form in the database so that JOINs are possible. The more complex the relationships, the harder it is to design, maintain, and keep performant a system based on JOINs.
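A minimal sketch of JOIN-based filtering, using Python's built-in sqlite3 module. The `documents` and `document_viewers` tables and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE document_viewers (document_id INTEGER, user_id TEXT);
    INSERT INTO documents VALUES (1, 'Roadmap'), (2, 'Payroll'), (3, 'Handbook');
    INSERT INTO document_viewers VALUES (1, 'alice'), (3, 'alice'), (2, 'bob');
    """
)


def visible_titles(user_id: str) -> list[str]:
    """Filter documents by joining against the permissions table."""
    rows = conn.execute(
        """
        SELECT d.title
        FROM documents AS d
        JOIN document_viewers AS v ON v.document_id = d.id
        WHERE v.user_id = ?
        ORDER BY d.id
        """,
        (user_id,),
    )
    return [title for (title,) in rows]
```

This works well when access is a single direct relationship. Once access can also come through groups, folders, or organizations, each additional path needs another JOIN or a precomputed table, which is where the maintenance cost grows.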
What is the New Enemy Problem?
The "New Enemy Problem" is an issue that can appear in systems where resources and permissions are distributed amongst replicated data sources, so that there is no bound on how "out of date" any given replica can be.
Imagine a system where there are two databases, one in the United States and one in Europe, with updates being replicated from one database to the other as changes are made. Let's now say a user ("Alice") in the United States removes another user’s ("Bob") permission on a document, and then immediately afterwards updates the contents of the document with information that Bob must not see. If the update to the document reaches the European replica of the database before the permission update, then Bob could see the new information, because Bob’s permission change arrived out of order.
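The scenario can be reproduced with a toy model. The `Replica` class and the update tuples below are purely illustrative and do not model any real replication protocol.

```python
class Replica:
    """A toy replica holding a document body and its viewer list."""

    def __init__(self):
        self.viewers = {"alice", "bob"}
        self.body = "public draft"

    def apply(self, update):
        kind, value = update
        if kind == "remove_viewer":
            self.viewers.discard(value)
        elif kind == "set_body":
            self.body = value

    def read(self, user):
        # Returns None when the user is not allowed to view the document.
        return self.body if user in self.viewers else None


# Alice's writes, in the order she made them in the United States.
updates = [("remove_viewer", "bob"), ("set_body", "secret plans")]

in_order = Replica()
for update in updates:
    in_order.apply(update)

# The European replica has received only the second update so far.
out_of_order = Replica()
out_of_order.apply(updates[1])
```

On the `in_order` replica Bob's read is denied. On the `out_of_order` replica the new body is visible while Bob is still listed as a viewer, so his read returns the content Alice meant to hide.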
What is OpenID Connect (OIDC)?
OpenID Connect provides a way for websites and applications that support OAuth 2 to define a standard means of using OAuth 2 to provide authentication and identification. Typically, OIDC is used to allow users to authenticate to another website or application using their existing account on a service such as Google or Facebook; since Google and Facebook "speak" OIDC, another website that wants to add a "Login with Google" button can do so without having to implement a custom authentication API.
OpenID Connect, like all authentication systems, does not generally provide authorization or permissions; while OAuth 2 does provide permissions, OIDC as a specification is explicitly designed not to do so, in favor of simply providing the identity of the user to the website or application using it.
What is a Policy Engine?
A Policy Engine is software that processes programs called policies in order to produce a final decision. Policies are expressed in policy languages that vary depending on the engine. Some languages are very limited in functionality, while others are Turing-complete.
In permission systems, policy engines are used to determine whether or not a subject has access to perform an action on an object. This can occur at various points during a request's lifecycle:
- early in software infrastructure such as load-balancing
- in a reverse-proxy directly in front of an application
- after an application has received a request and gathered any additional context to be used as input
Choosing between these integration points depends on what the goal of the policies is and what input data is available at each.
Pros & Cons
Policy Engines are great because they separate policy from the business logic in your application. This separation allows for much easier changes to policy than if the policy is deeply coupled to the application. A dedicated language for describing policies can also be far more succinct than a general-purpose programming language. Policies are also often pure, so they can be exhaustively tested.
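To show what purity buys, here is a sketch of a policy written as a plain Python function rather than in a real policy language. The roles, actions, and `policy` function are assumptions for illustration only.

```python
from itertools import product

ALL_ROLES = ["admin", "editor", "reader", None]
ALL_ACTIONS = ["view", "edit", "delete"]


def policy(role, action) -> bool:
    """A pure decision function: same input, same answer, no side effects."""
    if role == "admin":
        return True
    if role == "editor":
        return action in {"view", "edit"}
    if role == "reader":
        return action == "view"
    return False


# Because the input space is small and the policy is pure, every
# combination can be enumerated and compared against an expected table.
decisions = {(r, a): policy(r, a) for r, a in product(ALL_ROLES, ALL_ACTIONS)}
allowed = {pair for pair, ok in decisions.items() if ok}
```

The `allowed` set can be asserted against a hand-written table of expected grants, which covers the policy's entire input space rather than a sample of it.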
Because Policy Engines are only focused on checking access to a resource, they are not a full solution for solving permissions in an application. Policies can only evaluate input and compute "yes or no", so they do not efficiently answer questions such as "Who are all the people with access to this resource?" or "What are all the resources that this user can access?".
Policy Engines can only be as consistent as the input data they are provided, which often puts the task of synchronizing input data into the hands of the application developer. This could be either good or bad, depending on the design of the policy engine and the requirements of the application.