Filter Data

This guide shows you how to use Data Filtering in the Oso Library. Data filtering lets you select certain data from your data store, based on the logic in your policy. In the Oso Library, data filtering works by telling Oso how to turn Polar constraints into queries against your data store, such as SQL queries or ORM query objects.

If you’re using Oso Cloud as an authorization service, data filtering is built in. Read about how to list authorized resources using the Oso Cloud API.

Why do you need Data Filtering?

When you call authorize(actor, action, resource), Oso evaluates the allow rules you have defined in your policy to determine whether actor is allowed to perform action on resource. For example, if jane wants to "edit" a document, Oso may check that jane = document.owner. But what if you need the set of all documents that Jane is allowed to edit? For example, you may want to render them as a list in your application.

One way to answer this question is to take every document in the system and call is_allowed on it. This is inefficient and often impractical: there could be thousands of documents in a database but only three that have the owner "steve". Instead of fetching every document and passing it to Oso, it’s better to ask the database for only the documents that have the owner "steve". Oso provides a “data filtering” API to do this.
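The difference between the two approaches can be sketched with Python's built-in sqlite3 module. The documents table, its owner column, and the is_allowed function below are hypothetical stand-ins used only for illustration; they are not the Oso API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, owner TEXT)")
# 3000 documents, of which only three are owned by "steve".
rows = [(i, "steve" if i % 1000 == 0 else "other") for i in range(1, 3001)]
conn.executemany("INSERT INTO documents VALUES (?, ?)", rows)


def is_allowed(actor, action, document):
    # Hypothetical stand-in for a per-resource policy check.
    return action == "edit" and document[1] == actor


# Approach 1: fetch every row, then check each one in application code.
all_docs = conn.execute("SELECT id, owner FROM documents ORDER BY id").fetchall()
naive = [doc for doc in all_docs if is_allowed("steve", "edit", doc)]

# Approach 2: push the same condition into the query itself.
filtered = conn.execute(
    "SELECT id, owner FROM documents WHERE owner = ? ORDER BY id", ("steve",)
).fetchall()

print(len(all_docs), len(naive), len(filtered))
```

Both approaches return the same rows, but the first one transfers and checks every document, while the second lets the database do the selection.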

You can use data filtering to enforce authorization on queries made to your data store. Oso takes the logic in the policy and turns it into a query for the authorized data. The query could be an ORM filter object, an HTTP request, or an Elasticsearch query. Both the query object and the way the logic maps to a query are user-defined.

Data filtering is initiated through two methods on Oso.

authorized_resources returns a list of all the resources a user is allowed to perform an action on. It builds the query and executes it, then returns the results.

authorized_query returns the query object itself. This lets you add filters, sorting, or other modifications before executing it.
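The relationship between the two methods can be illustrated with a toy sketch. The functions toy_authorized_query and toy_authorized_resources below are hypothetical stand-ins written for this example (here a query is simply a pair of SQL text and parameters); they are not Oso methods and only mirror the "query first, results second" split.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (id TEXT PRIMARY KEY, org_id TEXT)")
conn.executemany(
    "INSERT INTO repos VALUES (?, ?)",
    [("ios", "apple"), ("oso", "osohq"), ("demo", "osohq")],
)


def toy_authorized_query(org_id):
    # Hypothetical: returns an unexecuted query as (sql, params).
    return "SELECT id FROM repos WHERE org_id = ?", [org_id]


def toy_authorized_resources(org_id):
    # Hypothetical: builds the query and runs it right away.
    sql, params = toy_authorized_query(org_id)
    return [row[0] for row in conn.execute(sql, params)]


# Getting the query back lets the caller refine it before execution,
# for example by adding sorting and pagination.
sql, params = toy_authorized_query("osohq")
sql += " ORDER BY id LIMIT ?"
params.append(1)
first_page = [row[0] for row in conn.execute(sql, params)]
```

With an ORM, the same refinement would typically be a chained call on the returned query object rather than string concatenation.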

The mapping from Polar to a query is defined by an Adapter. If an adapter exists for your ORM or database, you can use it. Otherwise, you may have to implement your own.

Implementing an Adapter

Adapters

An adapter is an interface that defines two methods: build_query and execute_query. Once you’ve defined an adapter, you can configure your Oso instance to use it with the set_data_filtering_adapter method.
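As a rough sketch of that two-method shape, the following toy adapter works against plain Python dictionaries. The class names, the exact method signatures, and the dictionary-based filter format are assumptions made for this example; a real adapter receives Oso's own filter object and targets your actual data store.

```python
from abc import ABC, abstractmethod


class ToyAdapter(ABC):
    """Hypothetical interface mirroring the two adapter methods."""

    @abstractmethod
    def build_query(self, data_filter):
        """Turn a filter description into a query for the data store."""

    @abstractmethod
    def execute_query(self, query):
        """Run a query and return a list of results."""


class InMemoryAdapter(ToyAdapter):
    def __init__(self, store):
        self.store = store  # dict: type name -> list of dict records

    def build_query(self, data_filter):
        # In this sketch a "query" is just (type name, predicate).
        root = data_filter["root"]
        conditions = data_filter["conditions"]  # list of (field, value)

        def predicate(record):
            return all(record[name] == value for name, value in conditions)

        return root, predicate

    def execute_query(self, query):
        root, predicate = query
        return [record for record in self.store[root] if predicate(record)]


adapter = InMemoryAdapter(
    {
        "Repository": [
            {"id": "ios", "org_id": "apple"},
            {"id": "oso", "org_id": "osohq"},
        ]
    }
)
query = adapter.build_query(
    {"root": "Repository", "conditions": [("org_id", "osohq")]}
)
results = adapter.execute_query(query)
```

Keeping query construction and execution separate is what allows a caller to receive the unexecuted query and refine it.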

Build a Query

build_query takes some type information and a DataFilter object and returns a Query.

A DataFilter is a representation of a query. It is very similar to a SQL query. It has four fields:

  • root is the name of the type we are filtering.
  • relations are named relations to other types, typically turned into joins.
  • conditions are the individual pieces of logic that must be true with respect to objects matching the filter. These typically get turned into where clauses.
  • types is a map from type names to user type information, including registered relations. We use this to generate the join SQL.

Relations

A relation has three properties: left, name, and right. The adapter uses these properties to look up the tables and fields to join together for the query.

Conditions

A condition has three properties: left, cmp, and right. The left and right fields are either Immediate objects, which have a value field that can be inserted directly into a query, or Projection objects, which have the string properties source and, optionally, field. A missing field property indicates that the adapter should substitute an appropriate unique identifier, usually a primary key.
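To make the pieces above concrete, here is a minimal sketch of how a filter with relations and conditions could be turned into SQL. The dataclasses are simplified stand-ins defined for this example, not the classes shipped with Oso. In particular, the layout of the types map, the middle property of the relation being called name, the comparison names "Eq" and "Neq", and treating all conditions as a single AND are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class Immediate:
    value: Any  # literal that can be bound directly into the query


@dataclass
class Projection:
    source: str  # type name
    field: Optional[str] = None  # None means "use the primary key"


@dataclass
class Condition:
    left: Union[Immediate, Projection]
    cmp: str  # "Eq" or "Neq" in this sketch (assumed names)
    right: Union[Immediate, Projection]


@dataclass
class Relation:
    left: str  # type the relation starts from
    name: str  # relation name registered on the left type
    right: str  # related type


@dataclass
class DataFilter:
    root: str
    relations: List[Relation]
    conditions: List[Condition]
    types: Dict[str, dict]  # hypothetical layout, see the example below


OPS = {"Eq": "=", "Neq": "!="}


def build_query(flt):
    params = []

    def column(side):
        if isinstance(side, Immediate):
            params.append(side.value)
            return "?"
        info = flt.types[side.source]
        return f"{info['table']}.{side.field or info['pk']}"

    root_table = flt.types[flt.root]["table"]
    sql = f"SELECT {root_table}.* FROM {root_table}"
    for rel in flt.relations:
        my_field, other_field = flt.types[rel.left]["relations"][rel.name]
        left_table = flt.types[rel.left]["table"]
        right_table = flt.types[rel.right]["table"]
        sql += f" JOIN {right_table} ON {left_table}.{my_field} = {right_table}.{other_field}"
    clauses = [f"{column(c.left)} {OPS[c.cmp]} {column(c.right)}" for c in flt.conditions]
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


types = {
    "Repository": {"table": "repos", "pk": "id", "relations": {"organization": ("org_id", "id")}},
    "Organization": {"table": "orgs", "pk": "id", "relations": {}},
}
flt = DataFilter(
    root="Repository",
    relations=[Relation("Repository", "organization", "Organization")],
    # No field on the projection, so the primary key of orgs is used.
    conditions=[Condition(Projection("Organization"), "Eq", Immediate("osohq"))],
    types=types,
)
sql, params = build_query(flt)
```

Values from Immediate objects are passed as bound parameters rather than interpolated into the SQL text, which avoids injection problems.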

Execute a Query

execute_query takes a query and returns a list of the results.
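A minimal sketch of the execution step, assuming the query is a pair of SQL text and bound parameters and the data store is SQLite. Passing the connection explicitly is a choice made for this example; an adapter would more likely hold its session or connection itself.

```python
import sqlite3


def execute_query(conn, query):
    # query is assumed to be an (sql, params) pair from a build step.
    sql, params = query
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    # Return each row as a dict keyed by column name.
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orgs (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE repos (id TEXT PRIMARY KEY, org_id TEXT REFERENCES orgs(id))"
)
conn.executemany("INSERT INTO orgs VALUES (?)", [("osohq",), ("apple",)])
conn.executemany(
    "INSERT INTO repos VALUES (?, ?)",
    [("ios", "apple"), ("oso", "osohq"), ("demo", "osohq")],
)

query = (
    "SELECT repos.* FROM repos JOIN orgs ON repos.org_id = orgs.id "
    "WHERE orgs.id = ? ORDER BY repos.id",
    ["osohq"],
)
results = execute_query(conn, query)
```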

Fields

The other thing you have to provide to use data filtering is type information for registered classes. This lets Oso know what the types of an object’s fields are. Oso needs this information to handle specializers and other things in the policy when we don’t have a concrete resource. The fields are a dictionary from field name to type.
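The idea of a field-name-to-type dictionary can be sketched without Oso at all. The REGISTERED_FIELDS table and the field_type helper below are hypothetical and only show how such type information lets code reason about a field when no concrete object is available.

```python
# Hypothetical registry: type name -> {field name -> Python type}.
REGISTERED_FIELDS = {
    "Organization": {"id": str},
    "Repository": {"id": str, "org_id": str},
}


def field_type(type_name, field_name):
    # Look up the declared type of a field without needing an instance.
    try:
        return REGISTERED_FIELDS[type_name][field_name]
    except KeyError:
        raise LookupError(f"{type_name}.{field_name} is not registered") from None


org_id_type = field_type("Repository", "org_id")
```

A lookup for an unregistered field raises LookupError, which is roughly what a missing type declaration amounts to: there is nothing to resolve the field against.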

Relations

Often you need data that is not contained on the object to make authorization decisions. This comes up when the role required to do something is implied by a role on its parent object. For instance, you want to check the organization for a repository, but that data isn’t embedded on the repository object. You can add a Relation type to the type definition that states how the other resource is related to this one. Then you can access this field in the policy like any other field, and Oso will fetch the data when it needs it (via the query functions).

Relations are a special type that tells Oso how one class is related to another. They specify what the related type is and how it’s related.

  • kind is either “one” or “many”. “one” means there is one related object and “many” means there is a list of related objects.
  • other_type is the class of the related objects.
  • my_field is the field on this object that matches other_field.
  • other_field is the field on the other object that matches my_field.

The my_field / other_field relationship is similar to a foreign key. It lets Oso know which fields to match up when building a query for the other type.
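The foreign-key-like matching can be sketched over in-memory records. ToyRelation and resolve are hypothetical names defined for this example; they imitate the four properties listed above but are not Oso's Relation class.

```python
from dataclasses import dataclass


@dataclass
class ToyRelation:
    kind: str  # "one" or "many"
    other_type: str
    my_field: str
    other_field: str


def resolve(relation, obj, store):
    # Match obj[my_field] against other_field on the related type.
    matches = [
        other
        for other in store[relation.other_type]
        if other[relation.other_field] == obj[relation.my_field]
    ]
    if relation.kind == "one":
        return matches[0] if matches else None
    return matches


store = {
    "Organization": [{"id": "osohq"}, {"id": "apple"}],
    "Repository": [
        {"id": "ios", "org_id": "apple"},
        {"id": "oso", "org_id": "osohq"},
        {"id": "demo", "org_id": "osohq"},
    ],
}

# Repository -> Organization is a "one" relation.
organization = ToyRelation("one", "Organization", "org_id", "id")
# Organization -> Repository is the reverse, a "many" relation.
repositories = ToyRelation("many", "Repository", "id", "org_id")

parent = resolve(organization, store["Repository"][1], store)
children = resolve(repositories, store["Organization"][0], store)
```

Note that the reverse direction swaps my_field and other_field, which is why each direction has to be declared separately.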

Example

data_filtering_example_b.py
from sqlalchemy import create_engine, not_, or_, and_, false
from sqlalchemy.types import String, Boolean, Integer
from sqlalchemy.schema import Column, ForeignKey
from sqlalchemy.orm import sessionmaker, relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Organization(Base):
    __tablename__ = "orgs"

    id = Column(String(), primary_key=True)


# Repositories belong to Organizations
class Repository(Base):
    __tablename__ = "repos"

    id = Column(String(), primary_key=True)
    org_id = Column(String, ForeignKey("orgs.id"), nullable=False)


class User(Base):
    __tablename__ = "users"

    id = Column(String(), primary_key=True)


class RepoRole(Base):
    __tablename__ = "repo_roles"
    id = Column(Integer, primary_key=True)
    user_id = Column(String, ForeignKey("users.id"), nullable=False)
    repo_id = Column(String, ForeignKey("repos.id"), nullable=False)
    user = relationship("User", backref="repo_roles", lazy=True)
    name = Column(String, index=True)


class OrgRole(Base):
    __tablename__ = "org_roles"
    id = Column(Integer, primary_key=True)
    user_id = Column(String, ForeignKey("users.id"), nullable=False)
    org_id = Column(String, ForeignKey("orgs.id"), nullable=False)
    user = relationship("User", backref="org_roles", lazy=True)
    name = Column(String, index=True)


engine = create_engine("sqlite:///:memory:")

Session = sessionmaker(bind=engine)
session = Session()

Base.metadata.create_all(engine)

# Here's some more test data
osohq = Organization(id="osohq")
apple = Organization(id="apple")

ios = Repository(id="ios", org_id="apple")
oso_repo = Repository(id="oso", org_id="osohq")
demo_repo = Repository(id="demo", org_id="osohq")

leina = User(id="leina")
steve = User(id="steve")

role_1 = OrgRole(user_id="leina", org_id="osohq", name="owner")

objs = {
    "leina": leina,
    "steve": steve,
    "osohq": osohq,
    "apple": apple,
    "ios": ios,
    "oso_repo": oso_repo,
    "demo_repo": demo_repo,
    "role_1": role_1,
}
for obj in objs.values():
    session.add(obj)
session.commit()
data_filtering_example_b.py
from oso import Oso, Relation
from polar.data.adapter.sqlalchemy_adapter import SqlAlchemyAdapter

oso = Oso()

oso.set_data_filtering_adapter(SqlAlchemyAdapter(session))

oso.register_class(
    Organization,
    fields={
        "id": str,
    },
)

oso.register_class(
    Repository,
    fields={
        "id": str,
        # Here we use a Relation to represent the logical connection between an Organization and a Repository.
        # Note that this only goes in one direction: to access repositories from an organization, we'd have to
        # add a "many" relation on the Organization class.
        "organization": Relation(
            kind="one", other_type="Organization", my_field="org_id", other_field="id"
        ),
    },
)

oso.register_class(User, fields={"id": str, "repo_roles": list})
policy_b.polar
actor User {}

resource Organization {
	permissions = ["add_member", "read", "delete"];
	roles = ["member", "owner"];

	"add_member" if "owner";
	"delete" if "owner";

	"member" if "owner";
}

# Anyone can read.
allow(_, "read", _org: Organization);

resource Repository {
	permissions = ["read", "push", "delete"];
	roles = ["contributor", "maintainer", "admin"];
	relations = { parent: Organization };

	"read" if "contributor";
	"push" if "maintainer";
	"delete" if "admin";

	"maintainer" if "admin";
	"contributor" if "maintainer";

	"contributor" if "member" on "parent";
	"admin" if "owner" on "parent";
}

has_relation(organization: Organization, "parent", repository: Repository) if
	repository.organization = organization;

has_role(user: User, role_name: String, repository: Repository) if
	role in user.repo_roles and
	role.name = role_name and
	role.repo_id = repository.id;

has_role(user: User, role_name: String, organization: Organization) if
	role in user.org_roles and
	role.name = role_name and
	role.org_id = organization.id;

allow(actor, action, resource) if has_permission(actor, action, resource);
data_filtering_example_b.py
# Load the policy shown above from policy_b.polar.
oso.load_files(["policy_b.polar"])
leina_repos = list(oso.authorized_resources(leina, "read", Repository))
assert leina_repos == [oso_repo, demo_repo]

Evaluation

When Oso evaluates data filtering methods, it uses the adapter to build queries and execute them.

Relation fields also work when you are not using data filtering methods. In that case, Oso also uses the adapter to query for the related resources when you access them.

Limitations

Some Polar operators, including cut and arithmetic operators, aren’t supported in data filtering queries.

You can’t call any methods on the resource argument or pass the resource as an argument to other methods. Many cases where you would want to do this are better handled by Relation fields.

The new data filtering backend doesn’t support queries where a given resource type occurs more than once, so direct or indirect relations from a type to itself are currently unsupported. This limitation will be removed in an upcoming release.
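One way to spot type definitions that might run into this limitation is to look for types that can reach themselves through their relations. The helper below is a hypothetical pre-check written for this guide, not an Oso feature, and it assumes you can list the related types for each registered type. It flags possible trouble only; whether a given query actually hits the limitation depends on the policy.

```python
def self_referential_types(relations):
    """Return the types that reach themselves, directly or indirectly.

    relations maps a type name to the names of the types it relates to.
    """

    def reaches_itself(start):
        seen = set()
        stack = list(relations.get(start, ()))
        while stack:
            current = stack.pop()
            if current == start:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(relations.get(current, ()))
        return False

    return sorted(name for name in relations if reaches_itself(name))


relation_graph = {
    "Repository": ["Organization"],  # fine: no path back to Repository
    "Organization": [],
    "Folder": ["Folder"],  # direct self-relation
    "Issue": ["Comment"],  # indirect: Issue -> Comment -> Issue
    "Comment": ["Issue"],
}
flagged = self_referential_types(relation_graph)
```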
