Oso Cloud’s List Filtering: Is It Reverse Indexing?
If you are familiar with Zanzibar, you may know reverse indexing as one of its core concepts. Oso refers to this process as list filtering. List filtering lets an application retrieve only the resources a user is authorized to access, so it avoids processing data that would be discarded anyway.
Check out our documentation on List Filtering to learn about the two methods Oso provides for list filtering: centralized filtering and local filtering.
List Filtering in LLMs and Retrieval-Augmented Generation (RAG)
In LLM chatbots and retrieval-augmented generation (RAG) systems, list filtering ensures that users only see responses or documents they are permitted to access. If an LLM needs to generate responses based on a user’s data, filtering must happen before retrieval to ensure:
- Data privacy: The model only accesses authorized documents.
- Query efficiency: The search space is reduced to relevant content.
- Better answers: The LLM retrieves contextually appropriate data instead of irrelevant or unauthorized information.
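To make the "filter before retrieval" ordering concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the corpus, the `AUTHORIZED_IDS` map, and the word-overlap `score` function are hypothetical stand-ins for a real document store, an authorization service, and a real retriever. The point is only that the ranking step never sees a document the user cannot read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    id: str
    text: str


# Hypothetical corpus and permission map, hard-coded for illustration.
CORPUS = [
    Doc("d1", "Q3 payroll summary for the finance team"),
    Doc("d2", "Payroll FAQ for all employees"),
    Doc("d3", "Office plant watering schedule"),
]
AUTHORIZED_IDS = {"alice": {"d2", "d3"}, "bob": {"d1", "d2", "d3"}}


def score(query: str, doc: Doc) -> int:
    """Toy relevance: count query words that appear in the document."""
    words = set(doc.text.lower().split())
    return sum(1 for w in query.lower().split() if w in words)


def retrieve(user: str, query: str, k: int = 2) -> list[Doc]:
    """Narrow to authorized documents first, then rank only those."""
    allowed = AUTHORIZED_IDS.get(user, set())
    candidates = [d for d in CORPUS if d.id in allowed]
    ranked = sorted(candidates, key=lambda d: score(query, d), reverse=True)
    return [d for d in ranked if score(query, d) > 0][:k]


def build_prompt(user: str, query: str) -> str:
    """Assemble the LLM prompt from authorized, relevant documents only."""
    context = "\n".join(f"- {d.text}" for d in retrieve(user, query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

With this data, the same query for two users produces different context: the finance-only document can appear in the prompt for `bob` but never in the prompt for `alice`, because it is removed before ranking rather than after generation.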
Instead of fetching everything, filtering in memory, and checking access for each item separately, LLM applications can:
- Use centralized filtering to query Oso Cloud for a list of authorized document IDs before retrieving them from the database.
- Use local filtering to enforce authorization constraints directly within database queries, minimizing redundant data transfers.
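The difference between the two options is easier to see side by side. The sketch below does not use the Oso SDK: `FakeAuthz`, `list_authorized_ids`, `local_filter`, and the `doc_access` table are all hypothetical names, and the grants are kept in two places only so both styles can be shown against the same data. It uses an in-memory SQLite database to show the shape of each approach.

```python
import sqlite3


class FakeAuthz:
    """Hypothetical stand-in for an authorization service (not the Oso API)."""

    def __init__(self, grants):
        self._grants = grants  # {user: {doc_id, ...}}

    def list_authorized_ids(self, user):
        """Centralized style: the service returns the IDs the user may read."""
        return sorted(self._grants.get(user, set()))

    def local_filter(self, user, column):
        """Local style: return a SQL condition and its parameters.

        Assumes the access data lives in a local `doc_access` table.
        """
        condition = f"{column} IN (SELECT doc_id FROM doc_access WHERE user_id = ?)"
        return condition, [user]


def make_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT);
        CREATE TABLE doc_access (user_id TEXT, doc_id TEXT);
        INSERT INTO documents VALUES ('d1','alpha'),('d2','beta'),('d3','gamma');
        INSERT INTO doc_access VALUES ('alice','d1'),('alice','d3'),('bob','d2');
    """)
    return db


def fetch_centralized(db, authz, user):
    """Ask the service for IDs first, then fetch exactly those rows."""
    ids = authz.list_authorized_ids(user)
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT id, body FROM documents WHERE id IN ({placeholders}) ORDER BY id"
    return db.execute(sql, ids).fetchall()


def fetch_local(db, authz, user):
    """Embed the authorization condition in the query itself."""
    condition, params = authz.local_filter(user, "documents.id")
    sql = f"SELECT id, body FROM documents WHERE {condition} ORDER BY id"
    return db.execute(sql, params).fetchall()
```

Both functions return the same rows for the same user. The trade-off in this sketch is where the work happens: the centralized version ships a list of IDs to the application and then into the query, while the local version lets the database apply the constraint in a single statement.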
For example, before retrieving a list of support tickets for summarization, you can apply list filtering so that the LLM never reads unauthorized tickets and only summarizes the tickets you are allowed to see, whether access is scoped by department, region, or customer. Similarly, if you work at a SaaS company with an internal chatbot, employees should see only their own data when they ask the chatbot about health insurance or pay statements.
Filtering this way reduces network requests and database load, which improves response times and helps your application scale when you integrate retrieval-augmented generation (RAG) techniques with LLMs.