Knowledge Base
The Platform Knowledge Base is the foundation that grounds all agent responses in your actual organizational data. Documents, policies, SOPs, and institutional knowledge are indexed, chunked, embedded, and connected through a knowledge graph that agents query in real time.
What the Knowledge Base is
The Knowledge Base is a unified, AI-optimized repository of your organization's documents, data, and institutional knowledge. Unlike a traditional document management system that just stores files, the BOSS Knowledge Base actively understands content -- extracting entities, mapping relationships, and making everything instantly searchable via natural language.
Every document uploaded to any studio is automatically indexed into the Knowledge Base. Agents use this as their primary source of truth. When you ask an agent a question, it searches the Knowledge Base first, ensuring responses are grounded in your actual data rather than general training knowledge.
Unified repository
All documents, notes, tables, and files across all studios are indexed in a single knowledge base. Cross-studio search finds information wherever it lives.
Knowledge graph
Powered by Graphiti, the knowledge graph maps entities (people, products, policies, dates) and their relationships. Agents can traverse the graph to find connected information.
Semantic search
Vector embeddings enable natural language search. Ask "what is our refund policy?" instead of guessing the right keywords. Results are ranked by relevance, and each result carries a confidence score.
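As a rough illustration of how embedding-based ranking can work, the sketch below scores chunks against a query by cosine similarity. The three-dimensional vectors, the sample texts, and the function names are all made up for the example; a real embedding model produces vectors with hundreds of dimensions, and the platform's actual scoring method is not specified here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks, top_k=3):
    """Return the top_k (score, text) pairs, highest score first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data: (chunk text, hand-written stand-in for an embedding).
chunks = [
    ("Refunds are handled by the billing team.", [0.9, 0.1, 0.0]),
    ("PTO requests go through the HR portal.", [0.0, 0.2, 0.9]),
    ("Returns require proof of purchase.", [0.7, 0.3, 0.1]),
]
query_vec = [1.0, 0.0, 0.0]  # stand-in for the embedded refund question

results = rank_chunks(query_vec, chunks, top_k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The refund chunk ranks first because its vector points in nearly the same direction as the query vector, even though no keyword comparison takes place.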
Access-controlled
Knowledge Base access respects studio permissions. Agents can only retrieve documents from studios they are assigned to, unless explicitly granted cross-studio access.
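The permission rule above can be pictured as a filter applied before any chunk reaches an agent. This is a minimal sketch with hypothetical names (`Agent`, `cross_studio`, `visible_chunks`); the platform's actual permission model is likely richer.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    studios: set = field(default_factory=set)  # studios the agent is assigned to
    cross_studio: bool = False                 # explicit cross-studio grant


def visible_chunks(agent, chunks):
    """Keep only chunks the agent is allowed to retrieve.

    Each chunk is a dict with at least a 'studio' key (assumed shape).
    """
    if agent.cross_studio:
        return list(chunks)
    return [c for c in chunks if c["studio"] in agent.studios]


chunks = [
    {"studio": "hr", "text": "PTO policy excerpt"},
    {"studio": "finance", "text": "Expense policy excerpt"},
]
hr_agent = Agent(name="hr-helper", studios={"hr"})
auditor = Agent(name="auditor", cross_studio=True)
```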
How documents get indexed
When a document is uploaded to the Knowledge Base (or created in any studio), it goes through a 4-stage indexing pipeline. This process typically takes 5-30 seconds depending on document size and complexity.
1. Upload and parse
The document is received, its format is detected, and its content is extracted. Supported formats include PDF, DOCX, XLSX, PPTX, MD, TXT, HTML, CSV, JSON, and 40+ more. Scanned documents go through OCR (optical character recognition) to extract text from images.
Large documents (100+ pages) are processed in parallel chunks for speed.
2. Chunk and embed
Content is split into semantically meaningful chunks. Unlike fixed-size chunking that might split a paragraph mid-sentence, semantic chunking respects document structure -- keeping paragraphs, sections, and tables intact. Each chunk is then embedded into a vector representation using a text embedding model.
Chunk sizes typically range from 200-1000 tokens. Overlap between chunks ensures context is not lost at boundaries.
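To make structure-aware chunking with overlap concrete, here is a simplified sketch that packs whole paragraphs into chunks and repeats the last paragraph of each chunk at the start of the next. It assumes paragraphs are separated by blank lines and approximates tokens by whitespace-separated words; the function name and parameters are illustrative, not the platform's implementation.

```python
def chunk_paragraphs(text, max_tokens=1000, overlap_paragraphs=1):
    """Pack whole paragraphs into chunks of roughly max_tokens words.

    Paragraphs are never split, so a single oversized paragraph
    becomes its own (oversized) chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, current_len = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        if current and current_len + n > max_tokens:
            chunks.append("\n\n".join(current))
            # Carry the tail of the finished chunk forward as overlap.
            current = current[-overlap_paragraphs:] if overlap_paragraphs else []
            current_len = sum(len(p.split()) for p in current)
        current.append(para)
        current_len += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = "a b c\n\nd e f\n\ng h i"
sample_chunks = chunk_paragraphs(sample, max_tokens=6, overlap_paragraphs=1)
```

With the toy input, the middle paragraph appears in both chunks, which is the overlap that preserves context at the boundary.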
3. Extract entities
Named Entity Recognition (NER) extracts people, organizations, dates, amounts, product names, policy references, and other domain-specific entities from the text. These entities become nodes in the knowledge graph.
Entity extraction is configurable per document type. Legal documents extract clause references; financial documents extract amounts and dates.
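The idea of per-document-type configuration can be sketched with a lookup table of extractors. Note that this example substitutes simple regular expressions for a real NER model, and the document types, labels, and patterns are assumptions chosen for illustration.

```python
import re

# Hypothetical configuration: document type -> {entity label: pattern}.
EXTRACTORS = {
    "legal": {
        "clause_reference": re.compile(r"\b(?:Section|Clause)\s+\d+(?:\.\d+)*"),
    },
    "financial": {
        "amount": re.compile(r"\$\d[\d,]*(?:\.\d{2})?"),
        "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    },
}


def extract_entities(text, doc_type):
    """Return entities found in text using the extractors for doc_type."""
    patterns = EXTRACTORS.get(doc_type, {})
    return [
        {"type": label, "text": match.group(0)}
        for label, pattern in patterns.items()
        for match in pattern.finditer(text)
    ]


legal = extract_entities("See Section 4.2 and Clause 7.", "legal")
financial = extract_entities("Invoice of $1,250.00 due 2024-03-01.", "financial")
```

Each returned entity would then become a candidate node for the knowledge graph.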
4. Build graph
Extracted entities are connected with typed relationships (e.g., "PTO Policy" is "defined_in" the "Employee Handbook", which is "managed_by" the "HR Department"). The Graphiti knowledge graph grows with every indexed document, creating a web of organizational knowledge.
The graph enables multi-hop queries: answering "Who manages the policy that covers employee time off?" means following relationships across Policy, Document, Department, and Person nodes.
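A multi-hop query amounts to following a sequence of typed relationships from a starting node. The sketch below uses a plain in-memory dictionary and the two relationships from the example above; it does not use the Graphiti API, and the `KnowledgeGraph` class is purely illustrative.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory graph of (source, relation) -> targets."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, source, relation, target):
        self._edges[(source, relation)].append(target)

    def follow(self, start, relations):
        """Follow each relation in order and return the nodes reached."""
        frontier = {start}
        for relation in relations:
            frontier = {
                target
                for node in frontier
                for target in self._edges.get((node, relation), [])
            }
        return frontier


graph = KnowledgeGraph()
graph.add("PTO Policy", "defined_in", "Employee Handbook")
graph.add("Employee Handbook", "managed_by", "HR Department")

# Two hops: policy -> document -> department.
owners = graph.follow("PTO Policy", ["defined_in", "managed_by"])
```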
How agents query the Knowledge Base
When an agent needs to answer a question or complete a task, it automatically queries the Knowledge Base using RAG (Retrieval-Augmented Generation). This happens transparently -- the user asks a question, and the agent grounds its response in your actual data.
The RAG pipeline works in 5 steps:
1. Agent receives the user message and analyzes intent
2. Agent constructs a search query optimized for the knowledge base (which may differ from the user's exact words)
3. Knowledge Base returns the most relevant chunks with relevance scores
4. Agent synthesizes a response using the retrieved chunks as context
5. Agent includes citations linking back to the source documents
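The five steps can be sketched as a single function that takes the search and generation steps as pluggable callables. Everything here is an assumption made for illustration: the stop-word query rewrite, the shape of a search hit (`text`, `source`, `score`), and the stub `fake_search` and `fake_generate` functions that stand in for the Knowledge Base and the language model.

```python
STOP_WORDS = {"what", "is", "our", "the", "a", "an"}


def answer(user_message, search, generate, top_k=3):
    # Steps 1-2: turn the user message into a compact search query.
    words = user_message.lower().strip("?").split()
    query = " ".join(w for w in words if w not in STOP_WORDS)
    # Step 3: retrieve the most relevant chunks.
    hits = search(query)[:top_k]
    # Step 4: generate a response with the chunks as context.
    context = "\n".join(hit["text"] for hit in hits)
    text = generate(user_message, context)
    # Step 5: attach citations pointing back to the sources.
    return {"query": query, "text": text, "citations": [h["source"] for h in hits]}


def fake_search(query):
    """Stub retriever: returns one canned hit for refund questions."""
    if "refund" in query:
        return [{"text": "Refund policy excerpt", "source": "refund-policy.pdf", "score": 0.93}]
    return []


def fake_generate(message, context):
    """Stub generator: echoes the context it was given."""
    return f"Based on the retrieved context: {context}"


reply = answer("What is our refund policy?", fake_search, fake_generate)
```

Passing `search` and `generate` in as parameters keeps the pipeline logic testable without a live index or model.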
Grounding indicator: Agent responses include a grounded: true flag when the response is based on Knowledge Base data. When an agent generates a response purely from its training data (no KB matches), the flag is grounded: false -- a signal to treat the response with more scrutiny.
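A client can use the grounding flag to decide how to present a response. The sketch below assumes the response arrives as a dictionary with `text` and `grounded` keys, which is a guess at the payload shape; only the existence of the `grounded` flag comes from the description above.

```python
def render_response(response):
    """Prefix ungrounded responses with a visible warning."""
    text = response["text"]
    # Treat a missing flag as ungrounded, the more cautious default.
    if response.get("grounded", False):
        return text
    return "[Unverified: no Knowledge Base match] " + text


grounded = render_response({"text": "PTO accrues monthly.", "grounded": True})
ungrounded = render_response({"text": "PTO usually accrues monthly.", "grounded": False})
```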
Spec update proposals
Agents can also propose updates to the Knowledge Base. When an agent detects that a document is outdated, a policy has changed, or new information contradicts existing content, it creates a Spec Update Proposal.
Proposals go through a review process -- they are never applied automatically. Designated reviewers (typically subject matter experts or compliance officers) approve, reject, or modify the proposal. This creates a feedback loop where agents actively maintain and improve the knowledge base over time.
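The rule that proposals are never applied automatically can be modeled as a small state machine in which a change is written only after an explicit approval. The class, field, and status names below are hypothetical and only illustrate the review gate.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class SpecUpdateProposal:
    document_id: str
    proposed_text: str
    reason: str
    status: Status = Status.PENDING
    reviewer: Optional[str] = None

    def review(self, reviewer, decision, modified_text=None):
        """Record a reviewer's decision, optionally with modified text."""
        if self.status is not Status.PENDING:
            raise ValueError("proposal has already been reviewed")
        if decision not in (Status.APPROVED, Status.REJECTED):
            raise ValueError("decision must be APPROVED or REJECTED")
        if modified_text is not None:
            self.proposed_text = modified_text
        self.status = decision
        self.reviewer = reviewer


def apply_if_approved(proposal, documents):
    """Write the proposed text only when the proposal was approved."""
    if proposal.status is not Status.APPROVED:
        return False
    documents[proposal.document_id] = proposal.proposed_text
    return True


documents = {"pto-policy": "old text"}
proposal = SpecUpdateProposal("pto-policy", "new text", reason="policy changed")
applied_before_review = apply_if_approved(proposal, documents)
proposal.review("compliance-officer", Status.APPROVED)
applied_after_review = apply_if_approved(proposal, documents)
```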
Common proposal triggers:
Enterprise: uploading SOPs, policies, and compliance docs
For enterprise organizations, the Knowledge Base serves as the canonical source of truth for Standard Operating Procedures, corporate policies, compliance documentation, and institutional knowledge. When agents are grounded in your actual policies, they give answers consistent with how your organization actually operates.
SOPs and Procedures
Examples: Employee onboarding, incident response, change management, code review
Agents follow your actual procedures when executing playbooks instead of generic best practices.
Corporate Policies
Examples: PTO, expense, travel, data handling, acceptable use, code of conduct
Agents answer policy questions accurately. HR agents cite the exact policy section in their responses.
Compliance Documents
Examples: SOC 2 controls, HIPAA procedures, GDPR DPIAs, ISO 27001 policies
Compliance agents proactively check agent outputs against regulatory requirements. Audit prep becomes automated.
Product Documentation
Examples: API docs, user guides, architecture diagrams, release notes
Support agents answer customer questions from current documentation. Engineering agents reference specs when building.
Training Materials
Examples: Onboarding guides, role-specific training, tool tutorials, best practices
New employee onboarding agents create personalized training plans based on actual training materials.
Bulk upload: Enterprise customers can bulk-upload entire document repositories via the API or the admin UI at /admin/knowledge-base/upload. BOSS processes documents in parallel and provides a progress dashboard showing indexing status for each file.
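On the client side, a bulk upload through the API could be parallelized with a thread pool while tracking a status per file. In this sketch `upload_one` is a hypothetical callable standing in for whatever performs the actual upload request, and the status strings are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def bulk_upload(paths, upload_one, max_workers=4):
    """Upload files concurrently and return a status per path."""
    status = {path: "queued" for path in paths}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(upload_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
                status[path] = "uploaded"
            except Exception as exc:  # keep going when one file fails
                status[path] = f"failed: {exc}"
    return status


def fake_upload(path):
    """Stub uploader that rejects one file to show error handling."""
    if path.endswith(".bad"):
        raise ValueError("unsupported format")


report = bulk_upload(["handbook.pdf", "sop.docx", "broken.bad"], fake_upload)
```

A failure in one file is recorded in the report rather than aborting the whole batch.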
API routes
The Knowledge Base is fully accessible via the REST API. All routes require authentication with an API key that has the appropriate scopes (documents:read, documents:write, search:query).
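As an illustration of scoped API-key authentication, the sketch below builds (but does not send) a search request. The base URL, the `/search` path, the JSON body shape, and the `Bearer` authorization scheme are all assumptions, since the route list is not reproduced here; only the scope names come from the text above.

```python
import json
import urllib.request


def build_search_request(base_url, api_key, query, key_scopes):
    """Build a POST request for a hypothetical search route.

    Raises PermissionError up front if the key lacks search:query.
    """
    if "search:query" not in key_scopes:
        raise PermissionError("API key lacks the search:query scope")
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search",  # hypothetical path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed scheme
            "Content-Type": "application/json",
        },
    )


request = build_search_request(
    "https://boss.example.com/api",  # placeholder host
    api_key="YOUR_API_KEY",
    query="refund policy",
    key_scopes={"documents:read", "search:query"},
)
```

Checking scopes on the client is only a convenience; the server remains the authority on what a key may do.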
Supported file formats
Documents
PDF, DOCX, DOC, ODT, RTF
Spreadsheets
XLSX, XLS, CSV, TSV, ODS
Presentations
PPTX, PPT, ODP, KEY
Text
TXT, MD, HTML, XML, YAML
Code
JS, TS, PY, GO, RS, JAVA
Images (OCR)
PNG, JPG, TIFF, BMP
Email
EML, MSG, MBOX
Archives
ZIP (auto-extract)
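A client could use the format list above to pre-sort files before uploading. The sketch below encodes the listed extensions in a lookup table; the category keys are invented, and the `email` label is an assumption for the EML, MSG, and MBOX group. Because the platform supports more formats than are listed here, a `None` result means "not in this table", not "unsupported".

```python
from pathlib import Path

FORMATS = {
    "documents": {"pdf", "docx", "doc", "odt", "rtf"},
    "spreadsheets": {"xlsx", "xls", "csv", "tsv", "ods"},
    "presentations": {"pptx", "ppt", "odp", "key"},
    "text": {"txt", "md", "html", "xml", "yaml"},
    "code": {"js", "ts", "py", "go", "rs", "java"},
    "images_ocr": {"png", "jpg", "tiff", "bmp"},
    "email": {"eml", "msg", "mbox"},  # category label assumed
    "archives": {"zip"},
}


def categorize(filename):
    """Return the format category for a filename, or None if unlisted."""
    extension = Path(filename).suffix.lower().lstrip(".")
    for category, extensions in FORMATS.items():
        if extension in extensions:
            return category
    return None


examples = {name: categorize(name) for name in ["Handbook.PDF", "scan.tiff", "notes.xyz"]}
```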