O’Reilly SuperStream: Retrieval-Augmented Generation in Production

We recently sponsored the O’Reilly SuperStream: Retrieval-Augmented Generation (RAG) in Production webinar.

Large Language Models (LLMs) open up a new way to interact with your data. Your users can now use a natural language interface rather than traditional search to get back relevant information. But the flexibility and scale of LLMs make it harder to ensure that you don’t leak sensitive data. Oso Engineer Greg explored these challenges and demonstrated how to use Retrieval-Augmented Generation to build an authorized LLM chatbot that protects your data. Watch the recording below:

What you will learn

  • Why access control can be lost in RAG chatbot implementations.
  • Different approaches to handling authorization in vector search.
  • How to store and retrieve embeddings with proper access controls.
  • Best practices for integrating access control with vector databases and traditional databases.
  • Practical examples of authorization-aware queries.

Key highlights & technical takeaways

Challenges with access control in RAG chatbots

  • Source data often has access controls that get lost in the extract, transform, load (ETL) process.
  • Without proper filtering, sensitive information can be exposed in chatbot responses.
  • Data leaks can lead to security incidents or reputational damage.

Approaches to preserving access control

  1. Separate embeddings databases (Not ideal)
    • Store HR data, engineering data, etc., in separate embeddings databases.
    • Requires querying multiple databases for users with cross-team access.
    • Increases maintenance complexity and operational overhead.
  2. Embedding-level metadata for access control (Better approach)
    • Attach metadata (e.g., folder IDs, access control lists) to embeddings.
    • Use vector databases like Pinecone/Chroma that support metadata.
    • Query embeddings while filtering based on access rights.
  3. Tightly coupling vector embeddings with application data (Best approach)
    • Store embeddings in a general-purpose database with vector search support (e.g., Postgres, MongoDB).
    • Link embeddings to existing application data with foreign keys.
    • Write authorization-aware queries that filter vector search results before returning responses.
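
As a rough illustration of the third approach, the sketch below keeps embeddings in the same relational database as the application data and joins against a membership table before ranking. It is not the code from the webinar: the table names (folders, folder_members, chunks), the sample rows, and the two-dimensional vectors are all hypothetical, and SQLite stands in for a database such as Postgres so the example runs with only the standard library. Because SQLite has no vector operators, similarity is computed in Python here; with a real vector extension the ordering would normally happen inside the query.

```python
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_db():
    """Create a toy schema: chunks point at folders, folders have members."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE folders (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE folder_members (
            folder_id INTEGER REFERENCES folders(id),
            user_id TEXT
        );
        CREATE TABLE chunks (
            id INTEGER PRIMARY KEY,
            folder_id INTEGER REFERENCES folders(id),
            body TEXT,
            embedding TEXT  -- comma-separated floats (hypothetical storage format)
        );
    """)
    db.executemany("INSERT INTO folders VALUES (?, ?)",
                   [(1, "hr"), (2, "engineering")])
    db.executemany("INSERT INTO folder_members VALUES (?, ?)",
                   [(1, "alice"), (2, "alice"), (2, "bob")])
    db.executemany("INSERT INTO chunks VALUES (?, ?, ?, ?)", [
        (1, 1, "Salary bands for 2024", "0.9,0.1"),
        (2, 2, "Deploy runbook", "0.1,0.9"),
        (3, 2, "On-call salary stipend policy", "0.8,0.2"),
    ])
    return db


def authorized_search(db, user_id, query_vec, k=2):
    """Return the top-k chunk bodies among chunks the user may read."""
    # The JOIN is the authorization step: chunks in folders the user does
    # not belong to never leave the database.
    rows = db.execute("""
        SELECT c.body, c.embedding
        FROM chunks c
        JOIN folder_members m ON m.folder_id = c.folder_id
        WHERE m.user_id = ?
    """, (user_id,)).fetchall()
    scored = [
        (cosine(query_vec, [float(x) for x in emb.split(",")]), body)
        for body, emb in rows
    ]
    scored.sort(reverse=True)
    return [body for _, body in scored[:k]]
```

With this toy data, a query vector close to the "salary" chunks returns the HR document only for a user who is a member of the HR folder; a user limited to engineering gets the engineering chunks, however similar the HR chunk is to the query.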

Technical implementation details

  • Pass access control metadata through the ETL pipeline.
  • Use vector database metadata fields or relational database joins to enforce access control.
  • Ensure chatbot queries only retrieve vectors from authorized data sources.
  • Implement fine-grained authorization logic at query time rather than relying on blunt database separation.
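
The points above can be sketched end to end in a few lines. This is a minimal sketch under stated assumptions, not a production pipeline: SourceDoc, Record, etl, retrieve, and the allowed_groups field are hypothetical names, and toy_embed is a keyword-count placeholder standing in for a real embedding model. What it shows is that the access control metadata is copied onto every chunk during ETL, and that the filter runs at query time, before ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDoc:
    doc_id: str
    text: str
    allowed_groups: frozenset  # access control as defined in the source system


@dataclass(frozen=True)
class Record:
    doc_id: str
    text: str
    vector: tuple
    allowed_groups: frozenset  # carried through ETL, never dropped


def toy_embed(text):
    """Placeholder embedding: counts of three keywords (not a real model)."""
    lowered = text.lower()
    return tuple(float(lowered.count(w)) for w in ("salary", "deploy", "policy"))


def etl(docs, chunk_size=40):
    """Split each document into chunks and copy its access metadata to each one."""
    records = []
    for doc in docs:
        for start in range(0, len(doc.text), chunk_size):
            chunk = doc.text[start:start + chunk_size]
            records.append(
                Record(doc.doc_id, chunk, toy_embed(chunk), doc.allowed_groups)
            )
    return records


def retrieve(records, query, user_groups, k=3):
    """Filter by the user's groups first, then rank by dot product."""
    q = toy_embed(query)
    visible = [r for r in records if r.allowed_groups & user_groups]
    ranked = sorted(
        visible,
        key=lambda r: sum(a * b for a, b in zip(q, r.vector)),
        reverse=True,
    )
    return ranked[:k]
```

Filtering before ranking matters: if the top-k were chosen first and unauthorized hits removed afterwards, a user could receive fewer results than requested, and the unauthorized chunks would still have been loaded into the retrieval step.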