Authorization Query Performance

This document explores factors that can affect Oso Cloud's authorization query performance and situations that can increase query latency.

Performance heuristics

When processing an authorization query, Oso spends most of its time in the authorization data evaluation phase, where it determines whether your centralized authorization data can satisfy the query's unconstrained parameters. For a more conceptual overview, see Authorization Query Processing Flow.

Two key factors influence authorization query performance:

Query Satisfaction Combinations: How many different ways can a given allow(actor, action, resource) query evaluate to true? Fewer possibilities generally mean faster evaluation.
Constraint Evaluation Cost: How expensive is it to evaluate each potential way to satisfy the query? Targeted, highly constrained queries with simple, linear relationships evaluate more quickly than complex ones.

Common performance impact factors

While most queries evaluate quickly, two primary factors can lead to increased latency:

Recursive relationships

Policies with recursive relationships can be expensive to evaluate, especially when combined with broad access patterns. Consider this example:


resource Folder {
  roles = ["reader", "writer"];
  relations = {
    repository: Repository,
    folder: Folder,
  };
  "reader" if "reader" on "repository";
  "writer" if "maintainer" on "repository";
  role if global "admin" and is_public(resource);
  role if role on "folder";
}

Checking whether a user can read a specific folder requires checking whether they have the reader role on that folder or on any of its ancestor folders, so folders in a deeply nested directory structure are more expensive to check. The same applies when querying all accessible folders for a specific user. In addition, a user with broad access (e.g. a global admin) requires recursing over more folders, which makes the query more expensive than for a user with access to only a few folders.

If the folder relationships are deeply nested, this can push latency well above the average.
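To make the cost of depth concrete, the following Python sketch models the recursive role check as a walk up a chain of parent folders. It is a toy model only and does not describe how Oso Cloud evaluates queries internally. The `PARENT` and `READER` tables and the `check_reader` function are hypothetical names invented for this illustration.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical toy data: folder -> parent folder (None at the root).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "docs": "root",
    "specs": "docs",
    "drafts": "specs",
}

# Hypothetical direct "reader" role assignments as (user, folder) pairs.
READER: Set[Tuple[str, str]] = {("alice", "root"), ("bob", "drafts")}


def check_reader(user: str, folder: str) -> Tuple[bool, int]:
    """Return (allowed, lookups), walking up the parent chain until a role is found."""
    lookups = 0
    current: Optional[str] = folder
    while current is not None:
        lookups += 1  # one role lookup per level of nesting
        if (user, current) in READER:
            return True, lookups
        current = PARENT[current]
    return False, lookups


print(check_reader("bob", "drafts"))    # (True, 1): role held directly on the folder
print(check_reader("alice", "drafts"))  # (True, 4): role inherited from the root
```

In this model the number of lookups grows with the distance between the folder and the ancestor that grants the role, which is why deep hierarchies cost more.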

The complexity increases with multiple recursive relationships:


resource Team {
  roles = ["member"];
}
resource Folder {
  permissions = ["read"];
  roles = ["reader"];
  relations = { team: Team, folder: Folder };
  # Multiple relationships determine permission
  "reader" if "reader" on "folder" and "member" on "team";
  "read" if "reader";
}

With this policy, determining whether a user has the "reader" role on a folder requires recursively consulting both the "folder" and "team" relations. In the common case of authorizing access to a specific Folder, this is fine. However, queries that list all readable folders must do this work across many folders and can be considerably more expensive.
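The difference between a single check and a list query can be sketched with a naive Python model of the rule `"reader" if "reader" on "folder" and "member" on "team"`. This is an assumption-laden illustration, not Oso Cloud's evaluation strategy: the data tables, `is_reader`, and `list_readable` are all hypothetical, and a real engine would not necessarily evaluate each folder independently.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy data for illustration only.
PARENT: Dict[str, Optional[str]] = {"root": None, "eng": "root", "design": "root", "api": "eng"}
TEAM: Dict[str, str] = {"root": "everyone", "eng": "engineering", "design": "designers", "api": "engineering"}
DIRECT_READER: Set[Tuple[str, str]] = {("alice", "root")}
MEMBER: Set[Tuple[str, str]] = {("alice", "everyone"), ("alice", "engineering")}


def is_reader(user: str, folder: str, counter: List[int]) -> bool:
    """Naive check: direct role, or team membership plus the role on the parent folder."""
    counter[0] += 1  # direct role lookup
    if (user, folder) in DIRECT_READER:
        return True
    parent = PARENT[folder]
    if parent is None:
        return False
    counter[0] += 1  # team membership lookup
    if (user, TEAM[folder]) not in MEMBER:
        return False
    return is_reader(user, parent, counter)  # recurse through the "folder" relation


def list_readable(user: str) -> Tuple[List[str], int]:
    """Naive list query: run the single-folder check against every folder."""
    counter = [0]
    readable = [f for f in PARENT if is_reader(user, f, counter)]
    return readable, counter[0]


single = [0]
print(is_reader("alice", "api", single), single[0])  # True 5
print(list_readable("alice"))                        # (['root', 'eng', 'api'], 11)
```

Even with four folders, the naive list query performs more than twice the lookups of the single check, and the gap widens as the number of folders and the nesting depth grow.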

Satisfying combination "length"

Consider a condition A that is only true if B and C are both true. This means that B and C must be evaluated in order to prove A. Since A has two sub-conditions, we will say that condition A has length 2.

The sub-conditions themselves might also require evaluating other sub-conditions, or otherwise have their own performance characteristics, such as recursion. For example, C may be true only if D and E are both true. This means that to evaluate A above, we need to evaluate B, D, and E, bringing the length of A to 3.

You need to consider the performance of every sub-condition, including nested sub-conditions, when trying to understand the performance of evaluating the top level condition.
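The length calculation described above can be written as a short recursive function. This Python sketch assumes a hypothetical `RULES` table in which each condition maps to the sub-conditions that must all hold; conditions missing from the table are treated as leaves of length 1. The names `RULES` and `length` are illustrative and not part of any Oso API.

```python
from typing import Dict, List

# Hypothetical rule table: condition -> sub-conditions that must all be true.
# Conditions absent from the table are leaves checked directly against data.
RULES: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "C": ["D", "E"],
}


def length(condition: str) -> int:
    """Count the leaf conditions that must be evaluated to prove `condition`."""
    subs = RULES.get(condition)
    if subs is None:
        return 1  # a leaf contributes exactly one evaluation
    return sum(length(sub) for sub in subs)


print(length("C"))  # 2: D and E
print(length("A"))  # 3: B, D, and E
```

Removing the `"C"` entry from `RULES` makes C a leaf, and `length("A")` drops back to 2, matching the first version of the example.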

As seen above, when thinking about performance heuristics, we consider the number of different ways that a query can be evaluated.

Let's dissect an example. When determining whether a user can read a file, access might be granted if the user has:

Role	Resource
`"reader"`	On the file directly
`"reader"`	On the containing folder (established recursively)
`"admin"`	Globally and the file is public

This represents three combinations that might grant the ability to read the file (though one of them is recursive).

However, it's also important to consider how many conditions must be evaluated for each combination, even when none of them are recursive. Specifically, we count the number of conditions that must all be true for the evaluated condition to be true. We'll call this the condition's "length."

For example:


  role if global "admin" and is_public(resource);

This rule requires that we determine both whether the user is a global "admin" and whether the resource is public. You can think of this as a "length" of 2. If one of the conditions it depends on also had a "length" greater than 1, that condition's length would count toward this condition's length in place of the 1 it contributes here.

Naturally, "longer" conditions take longer to evaluate.
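The two heuristics, number of combinations and length of each combination, can be tallied side by side. The Python sketch below hand-codes the three ways to read a file from the table above; the `WAYS_TO_READ` mapping and its condition labels are hypothetical descriptions written for this example, not output from Oso.

```python
from typing import Dict, List

# Hypothetical breakdown: each way to satisfy "read" -> conditions that must all hold.
WAYS_TO_READ: Dict[str, List[str]] = {
    '"reader" on the file': ['user has "reader" on the file'],
    '"reader" on the containing folder': ['user has "reader" on the folder (recursive)'],
    'global "admin" and public file': ['user has global "admin"', "file is public"],
}


def summarize(ways: Dict[str, List[str]]) -> Dict[str, int]:
    """Map each satisfying combination to its length (number of conditions)."""
    return {name: len(conditions) for name, conditions in ways.items()}


combination_count = len(WAYS_TO_READ)
lengths = summarize(WAYS_TO_READ)

print(combination_count)                          # 3
print(lengths['global "admin" and public file'])  # 2
```

The recursive combination is listed with a single condition here, but as discussed earlier its real cost also depends on how deeply the folders are nested.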

Data skew

Query performance can suffer when access patterns deviate significantly from typical usage. This occurs because:

The database that stores your authorization data plans query execution around the most common access patterns, so atypical access patterns may result in less efficient query execution.
The execution engine might need to scan large portions of your data for broad-access queries.

For example, if your application regularly queries for all the resources a user can view within their organization (where each user belongs to a single organization), the optimizer will efficiently filter by organization early in the query.

However, if you perform the same query for a user that has access to all organizations' resources, the query plan might be less efficient and introduce greater latency.

Optimization considerations

When designing your authorization system, consider these factors:

The depth and complexity of recursive relationships.
The total number of conditions that must be evaluated (i.e. the condition's "length").
The distribution of access patterns across your user base.
The potential for queries that must evaluate large portions of authorization data. For example, broad access roles (like global admins) perform differently than highly targeted queries.

Understanding these factors can help you structure your policies and data to maintain optimal performance for your use case.

End-to-End Example