Oso Winter Hackathon

At Oso, we use hackathons as an opportunity to push the boundaries of what we think the Oso product could do. And, honestly, they're also a fun break from our day-to-day work of cranking out releases of the core product.

Our most recent hackathon just wrapped up and we’re finished hacking on a bunch of neat projects. We think the details will be interesting to you! Here's the lineup:

  • An update to the internals of our language VM
  • A VSCode plugin to auto-format your code, much like gofmt or prettier
  • Visualizing how our language Polar executes a query
  • Audiovisualizing how our language Polar executes a query
  • Writing a custom allocator to prove our code is leak-free

1. Async time, a time sync

Dave Hatch, Gabe Jackson

Award: Most Punctual Project

Gabe and I worked on an improvement to how the Polar VM is implemented. We wanted to remove the need to manually maintain state in the VM—instead, we can use Rust’s built-in await syntax! We called this change the Async VM.

Right now, the VM is written as a hand-rolled event loop in Rust. The host languages (Python, Ruby, etc.) poll this event loop, and Oso’s VM returns a QueryEvent. The QueryEvent tells the host language what to do, whether that’s calling a method, looking up an attribute, or returning a result to the user.

This style has worked well for us, but has made the VM harder to work with. The event loop architecture means that we need to maintain state (and control flow information) on the heap because the Rust stack unwinds all the way to return to the host language. See this snippet below:

/// VM calls this based on the policy.
fn isa(&self) -> QueryEvent {
    let state = State { .. };
    // Developer must save state across return.
    let state_id = self.set_state(state);
    return QueryEvent::ExternalIsa { state_id, .. };
}

/// Host language calls this function with result.
fn handle_isa_result(&self, state_id, isa_result) -> QueryEvent {
    // Developer must retrieve state, and have a way to correlate
    // with the saved state.
    let state = self.get_state(state_id);
    if !isa_result {
        self.backtrack()
    };

    return QueryEvent::Result { .. }
}

We realized that this problem is quite similar to running tasks in an async runtime. The VM is itself an async task that must wait on input from the host. In our hackathon project, we made use of Rust’s built-in async functions to allow us to refactor the VM to use await. This makes it easy to use state across external query events (just use variables normally). Rust’s zero-cost async/await syntax translates this into a future under the hood, taking care of all the state management for us:

async fn isa(&self) -> QueryEvent {
    let state = 1;
    // Rust's await is postfix: `expr.await`.
    let isa_result = self.host.isa(instance_id, class_tag).await;
    // Just use state normally. No need to save across boundary.
    if !isa_result && state == 1 {
        self.backtrack()
    };

    // continue...
    return QueryEvent::Result { state, .. }
}

To achieve this, we wrote our own async runtime for the VM that takes care of returning to the host language as needed, while polling the VM’s async task to make progress. This is just a proof of concept, but we hope to roll out these changes to help our team ship even faster!
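To make the idea concrete, here is a minimal sketch of that shape using only the standard library. It is not Oso's runtime: `Mailbox`, `AskHost`, `isa_query`, `run`, and `NoopWaker` are all hypothetical names, and the "host" is just a fixed boolean answer. The point is only that each `Poll::Pending` is a natural place to hand control back to the host, while `state` stays an ordinary local variable.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical mailbox shared by the VM task and the driver loop.
#[derive(Default)]
struct Mailbox {
    question: Option<String>, // what the VM wants the host to answer
    answer: Option<bool>,     // the host's reply, once it arrives
}

/// Future that posts one question, then waits for its answer.
struct AskHost {
    mailbox: Rc<RefCell<Mailbox>>,
    question: Option<String>,
}

impl Future for AskHost {
    type Output = bool;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<bool> {
        let this = self.get_mut(); // allowed because AskHost is Unpin
        let mut mailbox = this.mailbox.borrow_mut();
        if let Some(question) = this.question.take() {
            // First poll: post the question and yield to the driver.
            mailbox.question = Some(question);
            return Poll::Pending;
        }
        match mailbox.answer.take() {
            Some(answer) => Poll::Ready(answer),
            None => Poll::Pending,
        }
    }
}

/// The "VM": ordinary async code. `state` simply lives across the await.
async fn isa_query(mailbox: Rc<RefCell<Mailbox>>) -> String {
    let state = "isa(Foo)";
    let question = Some(state.to_string());
    let is_match = AskHost { mailbox, question }.await;
    if is_match {
        format!("{}: result", state)
    } else {
        format!("{}: backtrack", state)
    }
}

/// Waker that does nothing; the driver below re-polls on its own schedule.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Driver loop: poll the task, and treat each `Pending` as a return to the host.
fn run(host_answer: bool) -> (Vec<String>, String) {
    let mailbox = Rc::new(RefCell::new(Mailbox::default()));
    let mut task = Box::pin(isa_query(Rc::clone(&mailbox)));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut events = Vec::new();
    loop {
        if let Poll::Ready(outcome) = task.as_mut().poll(&mut cx) {
            return (events, outcome);
        }
        // Stand-in for handing a QueryEvent to the host and getting a reply.
        let question = mailbox.borrow_mut().question.take();
        events.extend(question);
        mailbox.borrow_mut().answer = Some(host_answer);
    }
}
```

Calling `run(true)` drives the task through one host round trip and returns the recorded question alongside the final outcome. A real runtime would return to the host language between polls instead of answering inline.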

2. Polar formatter

Graham “GK” Kaemmer

Award: Prettiest Possible Parsing Project

I made polar-formatter, a code formatter for Polar code. It uses the same basic pretty-printing algorithm used by a lot of modern code formatters, and I’ve got it plugged into our VSCode plugin to automatically reformat Polar files on save.

The formatting algorithm (Wadler’s “prettier printer”) is based on the same concept as that of gofmt, rustfmt, and prettier. We parse the whole Polar file into an AST, and for each AST type (rules, operators, lists, etc.), we define how it should be laid out on one line and on multiple lines. Then, the algorithm uses those definitions to lay out the code so that no line exceeds 80 characters.
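As a rough illustration of the "one line or multiple lines" idea, here is a simplified Wadler-style layout sketch. It is not the polar-formatter source: the `Doc` type, `render`, and the `list` rule are hypothetical, and the fit check only measures the group itself, whereas a full implementation also accounts for the text that follows the group on the same line.

```rust
/// A tiny document language in the spirit of Wadler's pretty printer.
enum Doc {
    Text(String),
    Line,                  // a space when laid out flat, a newline when broken
    Nest(usize, Box<Doc>), // extra indentation applied after newlines
    Group(Box<Doc>),       // laid out flat if it fits, otherwise broken
    Concat(Vec<Doc>),
}

fn text(s: &str) -> Doc {
    Doc::Text(s.to_string())
}

/// Width of a document if it were printed entirely on one line.
fn flat_width(doc: &Doc) -> usize {
    match doc {
        Doc::Text(s) => s.chars().count(),
        Doc::Line => 1,
        Doc::Nest(_, inner) | Doc::Group(inner) => flat_width(&**inner),
        Doc::Concat(parts) => parts.iter().map(flat_width).sum(),
    }
}

/// Lay out `doc`, breaking a group only when its flat form would exceed `width`.
fn render(doc: &Doc, width: usize) -> String {
    let mut out = String::new();
    let mut column = 0;
    // Work stack of (indent, flat mode?, document still to print).
    let mut stack = vec![(0usize, false, doc)];
    while let Some((indent, flat, current)) = stack.pop() {
        match current {
            Doc::Text(s) => {
                out.push_str(s);
                column += s.chars().count();
            }
            Doc::Line if flat => {
                out.push(' ');
                column += 1;
            }
            Doc::Line => {
                out.push('\n');
                out.push_str(&" ".repeat(indent));
                column = indent;
            }
            Doc::Nest(extra, inner) => stack.push((indent + extra, flat, &**inner)),
            Doc::Group(inner) => {
                let fits = flat || column + flat_width(&**inner) <= width;
                stack.push((indent, fits, &**inner));
            }
            Doc::Concat(parts) => {
                // Push in reverse so the first part is printed first.
                for part in parts.iter().rev() {
                    stack.push((indent, flat, part));
                }
            }
        }
    }
    out
}

/// Hypothetical layout rule for a list: `[a, b]` when flat, one item per line otherwise.
fn list(items: &[&str]) -> Doc {
    let mut body = Vec::new();
    for (i, item) in items.iter().enumerate() {
        if i > 0 {
            body.push(text(","));
            body.push(Doc::Line);
        }
        body.push(text(item));
    }
    Doc::Group(Box::new(Doc::Concat(vec![
        text("["),
        Doc::Nest(1, Box::new(Doc::Concat(body))),
        text("]"),
    ])))
}
```

With a generous width, `render(&list(&["a", "b", "c"]), 80)` keeps the list on one line. With a width of 5, the same document breaks after each comma and aligns the items under the first one.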

Handling comments was actually the big challenge here—comments are absent from the Polar AST, so I needed to build a way to look at the source code to fetch “all comments since the last node”, and then print them in the same place. Our existing runtime parser actually doesn’t gather enough information to do this reliably, so I ended up writing a custom Polar parser that maintained more source location information!

I find that it saves a ton of time crafting policies because I can just write out a long line of code, and then the plugin will decide how best to format it. Hopefully we’ll have a release version of it out soon in the VSCode plugin so you can try it out!


3. Polarcoaster

Steve Olsen

Award: The Ride of a Lifetime

For my hackathon project I started with a really good name for a project, "Polarcoaster," and then tried to build something that could live up to it.

As you may know, in Polar you can run a policy with the POLAR_LOG=1 environment variable set to see some tracing output of its execution. The output shows how the query evaluates: when each rule is called, how inner terms are evaluated, when backtracking on failure happens, etc. For instance, here’s our policy:

allow(actor, action, resource) if
    has_permission(actor, action, resource);

has_role(user: User, name: String, org: Organization) if
    role in user.org_roles and
    role.role = name and
    role.org_name = org.name;

actor User {}

resource Organization {
    roles = ["owner", "member"];
    permissions = ["invite", "create_repo"];

    "create_repo" if "member";
    "invite" if "owner";
    "member" if "owner";
}

resource Repository {
    roles = ["writer", "reader"];
    permissions = ["push", "pull"];
    relations = { parent: Organization };

    "pull" if "reader";
    "push" if "writer";

    "reader" if "writer";

    "reader" if "member" on "parent";
    "writer" if "owner" on "parent";
}

has_relation(org: Organization, "parent", repo: Repository) if
    org = repo.org;

When we trace the query oso.query_rule("allow", steve, "invite", osohq), we get the output below, which I’m truncating to 10 lines (the full trace is 50 lines):

Polar tracing enabled. Get help with traces from our engineering team: https://help.osohq.com/trace
[debug]   QUERY: allow(User(name='steve', org_roles=[{'role': 'owner', 'org_name': 'osohq'}]), "invite", Organization(name='osohq')), BINDINGS: {}
[debug]     APPLICABLE_RULES:
[debug]       allow(actor, action, resource) if has_permission(actor, action, resource);
[debug]     RULE: allow(actor, action, resource) if has_permission(actor, action, resource);
[debug]       QUERY: has_permission(_actor_9, _action_10, _resource_11), BINDINGS: {_action_10 => "invite", _actor_9 => User(name='steve', org_roles=[{'role': 'owner', 'org_name': 'osohq'}]), _resource_11 => Organization(name='osohq')}
[debug]         APPLICABLE_RULES:
[debug]           has_permission(actor: Actor{}, "invite", organization: Organization{}) if has_role(actor, "owner", organization);
[debug]         RULE: has_permission(actor: Actor{}, "invite", organization: Organization{}) if has_role(actor, "owner", organization);
[debug]         MATCHES: _actor_16 matches Actor{}, BINDINGS: {_actor_16 => User(name='steve', org_roles=[{'role': 'owner', 'org_name': 'osohq'}])}

This is pretty cool! But it’s hard to read, right?

Query execution is a depth-first search through the knowledge base. The search forms a tree, but it's hard to tell from the logs where the branches are and when the backtracking happens. Wouldn't it be nice if you could see what's going on? No problem, just take a ride on the Polarcoaster! This shows the same query execution, but instead of some boring old logs, it displays the search as a rollercoaster. The cart moves through the query, backtracking when it comes to the end of any one branch. Each node is a rule or a sub-query, and you can see what each one is by mousing over the individual nodes.

https://youtu.be/8mt13Gp8HW0
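For readers who want the traversal in code form, here is a small sketch of the depth-first walk the cart follows. It is only an illustration: `Node`, `Step`, `ride`, and the tree shape in `demo_track` are made up for this example, and unlike the real VM it visits every branch rather than stopping on success or failure.

```rust
/// Hypothetical search-tree node: a rule or sub-query with its child goals.
struct Node {
    label: &'static str,
    children: Vec<Node>,
}

/// One movement of the cart along the track.
#[derive(Debug, PartialEq)]
enum Step {
    Enter(&'static str),
    Backtrack(&'static str),
}

fn node(label: &'static str, children: Vec<Node>) -> Node {
    Node { label, children }
}

/// Depth-first walk: go as deep as possible, then back up at the end of a branch.
fn ride(current: &Node, track: &mut Vec<Step>) {
    track.push(Step::Enter(current.label));
    for child in &current.children {
        ride(child, track);
    }
    track.push(Step::Backtrack(current.label));
}

/// Builds a made-up tree loosely named after the rules in the trace and rides it.
fn demo_track() -> Vec<Step> {
    let tree = node(
        "allow",
        vec![node(
            "has_permission",
            vec![node("has_role", vec![]), node("has_relation", vec![])],
        )],
    );
    let mut track = Vec::new();
    ride(&tree, &mut track);
    track
}
```

Each `Enter` corresponds to the cart moving down into a node, and each `Backtrack` to it rolling back up once that branch is exhausted.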

4. Audio Visual

Patrick O’Doherty

Award: The Synesthesiologist’s Prize

Recently I’ve been reading about generative art, and I was inspired to see if I could answer the question, "what does an Oso authorization query look and sound like?" I implemented some basic instrumentation of the Polar virtual machine to generate and broadcast Open Sound Control messages in response to key events. I wrote a small viewer program using Nannou (a creative coding framework for Rust) which listens for these Open Sound Control messages and uses them to power a live visualization. The stack depth of the query is represented by a sequence of recursively nested squares, and the background color changes in response to branching events in the query. Here's a video of it in action! https://www.youtube.com/watch?v=4GcT_8IxWxI

The audio accompaniment is generated by VCV Rack, a virtual Eurorack synthesizer. You can check out the source for the viewer program on our GitHub.
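If you haven't seen Open Sound Control before, a message is just an address string, a type-tag string, and the arguments, each padded to a multiple of four bytes. The sketch below encodes a single-integer message by hand and sends it over UDP. It is not the instrumentation from the project: the function names, the "/polar/depth" address, and the port are assumptions for illustration, and a real program would more likely use an OSC library.

```rust
use std::io;
use std::net::UdpSocket;

/// Append an OSC string: the bytes, a NUL terminator, then NUL padding
/// up to the next multiple of four bytes.
fn push_osc_string(buf: &mut Vec<u8>, s: &str) {
    buf.extend_from_slice(s.as_bytes());
    buf.push(0);
    while buf.len() % 4 != 0 {
        buf.push(0);
    }
}

/// Encode an OSC message carrying one 32-bit integer argument.
fn encode_int_message(address: &str, value: i32) -> Vec<u8> {
    let mut buf = Vec::new();
    push_osc_string(&mut buf, address);
    push_osc_string(&mut buf, ",i"); // type tags: a single int32
    buf.extend_from_slice(&value.to_be_bytes()); // OSC integers are big-endian
    buf
}

/// Send the current stack depth to a viewer. The address and port are hypothetical.
fn send_depth(depth: i32) -> io::Result<usize> {
    let socket = UdpSocket::bind("127.0.0.1:0")?;
    let message = encode_int_message("/polar/depth", depth);
    socket.send_to(&message, "127.0.0.1:9000")
}
```

A viewer listening on the chosen port could then map each incoming depth to the number of nested squares it draws.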

5. "Hackathon? I have no memory of that"

Sam Scott

Award: The Retrograde Amnesia Prize

This project was about how we make sure our core product is rock-solid and stable. In the past, we've occasionally had some problems with memory leaks, like this issue raised by Brian in June.

As you’ve seen above, we use Rust for the core of our implementation. Rust gives us strong control over, and guarantees about, memory management. But when interfacing with other languages like Python and Go, we’re still passing raw pointers around. Freeing these pointers requires calling back into the Rust code. If we forget to call free, the memory is leaked.

Writing tests to detect memory leaks in four languages sounded like a huge pain. Maybe a set of tests in a Docker container that watches whether memory usage grows over time? Sounds super error-prone.

Instead, I wanted to be able to write reproducible tests that wouldn't flake and would give us a precise answer, no matter how slow or small the leak. To do this, I used Rust's support for custom memory allocators to track exactly how much memory was allocated and where that allocation happened. The allocator logic was super simple:

An allocator needs to reserve memory for a data layout and give back a pointer. In English, this API says: here’s an unsafe function that takes in a reference to self (the allocator), a specification for how memory is laid out (the layout), and returns a mutable pointer to the memory that’s been allocated.

unsafe fn alloc(&self, layout: Layout) -> *mut u8 {

So what I do is count how much memory the layout needs,

let size = layout.size();

ask the system allocator to actually allocate the memory,

let ptr = System.alloc(layout);

and then increment an atomic integer to record the running total.

// This is using Rust atomics to mutate global state!
ALLOCATED.fetch_add(size, Ordering::SeqCst);
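Pieced together, and including the matching deallocation side, a counting allocator along these lines might look like the sketch below. The `Counting` type and the `allocated` helper are names I'm assuming for illustration, and this version only counts bytes. It leaves out recording where each allocation happened.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Bytes currently allocated through `Counting`.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper around the system allocator that counts live bytes.
struct Counting;

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            // Only count allocations that actually succeeded.
            ALLOCATED.fetch_add(layout.size(), Ordering::SeqCst);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        ALLOCATED.fetch_sub(layout.size(), Ordering::SeqCst);
    }
}

// To use it for every allocation in a program, a binary would add:
//
//     #[global_allocator]
//     static GLOBAL: Counting = Counting;

/// How many bytes are live right now.
fn allocated() -> usize {
    ALLOCATED.load(Ordering::SeqCst)
}
```

A test can then read `allocated()` before and after the code under test and require the two numbers to match.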

I do something similar for deallocating memory, and I have a function that reports how much memory is currently allocated! I hooked this up to one of our existing Python tests: https://github.com/osohq/oso/blob/0e07ea1b1c1df29a1bf2552c0395beb97e725a6c/test/test.py#L161-L169

If there were no memory leaks, we would expect the allocated memory to return to zero after deleting the Oso object. However, oh no! It didn't go to zero. The end of the trace showed:

Allocated:  40
D 1cef6f0 40

This says that 40 bytes were still allocated, held at address 1cef6f0, when the program exited. But luckily, because of how I’d written the allocator, I could search through the output to see exactly where these bytes were allocated:

A 1cef6f0 40
  11: polar::LAST_ERROR::__init
             at ./polar-c-api/src/lib.rs:56:59

Aha! Prior to PR #1390, we were using a global variable to handle the "last error". This global variable exists until the program exits: the first time there is an error, these 40 bytes are allocated and never returned. This isn't really a memory leak; it's just a global variable. But we removed that global variable in #1390, so we shouldn’t expect to see those 40 bytes if I run this against main:

Allocated:  0

Success! There are zero memory leaks that happen over the course of running the Python parity tests. Plus, now we don't even have any global variables allocated. Next up: repeating this across all libraries that use the C API (I'm coming for you Ruby, Go and Java!).

Hope you enjoyed, and stay tuned for more hackathon projects from us! If this got you interested in Oso, join the Oso community Slack to connect with us and hundreds of other developers. Also, if quarterly team hackathons sound like something that would be fun to work on, we’re hiring!
