Knowledge Base

The Platform Knowledge Base is the foundation that grounds all agent responses in your actual organizational data. Documents, policies, SOPs, and institutional knowledge are indexed, chunked, embedded, and connected through a knowledge graph that agents query in real time.

What the Knowledge Base is

The Knowledge Base is a unified, AI-optimized repository of your organization's documents, data, and institutional knowledge. Unlike a traditional document management system that just stores files, the BOSS Knowledge Base actively understands content -- extracting entities, mapping relationships, and making everything instantly searchable via natural language.

Every document uploaded to any studio is automatically indexed into the Knowledge Base. Agents use this as their primary source of truth. When you ask an agent a question, it searches the Knowledge Base first, ensuring responses are grounded in your actual data rather than general training knowledge.

Unified repository

All documents, notes, tables, and files across all studios are indexed in a single knowledge base. Cross-studio search finds information wherever it lives.

Knowledge graph

Powered by Graphiti, the knowledge graph maps entities (people, products, policies, dates) and their relationships. Agents can traverse the graph to find connected information.

Semantic search

Vector embeddings enable natural language search. Ask "What is our refund policy?" in plain language instead of guessing the right keywords. Results are ranked by relevance with confidence scores.
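To make the ranking step concrete, here is a minimal sketch of similarity-based ranking over embedded chunks. The names `EmbeddedChunk`, `cosineSimilarity`, and `rankChunks` are illustrative and not part of the platform API, and the toy three-dimensional vectors stand in for real embeddings. Cosine similarity is a common choice for this kind of ranking; the scoring function the Knowledge Base actually uses is not specified here.

```typescript
// Hypothetical shape: a chunk id paired with its embedding vector.
interface EmbeddedChunk {
  id: string;
  vector: number[];
}

interface ScoredChunk {
  id: string;
  score: number;
}

// Cosine similarity of two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query vector, drop weak matches,
// and return at most `limit` results in descending score order.
function rankChunks(
  queryVector: number[],
  chunks: EmbeddedChunk[],
  minScore: number,
  limit: number
): ScoredChunk[] {
  return chunks
    .map((chunk) => ({ id: chunk.id, score: cosineSimilarity(queryVector, chunk.vector) }))
    .filter((scored) => scored.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

// Sample data (made up for illustration).
const sampleChunks: EmbeddedChunk[] = [
  { id: "refund-policy", vector: [1, 0, 0] },
  { id: "travel-policy", vector: [0, 1, 0] },
  { id: "returns-faq", vector: [1, 1, 0] },
];
const topMatches = rankChunks([1, 0, 0], sampleChunks, 0.5, 5);
console.log(topMatches.map((match) => match.id));
```

The `minScore` and `limit` parameters mirror the options of the same name in the search example later on this page.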

Access-controlled

Knowledge Base access respects studio permissions. Agents can only retrieve documents from studios they are assigned to, unless explicitly granted cross-studio access.
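The permission rule above can be pictured as a simple filter applied before retrieval. This is a sketch only: the `AgentAccess` and `IndexedDocument` shapes, the `crossStudioAccess` field, and the sample IDs other than `stu_hr_01` and `kb_doc_abc123` are assumptions made for illustration, not the platform's actual permission model.

```typescript
// Hypothetical shapes; the real permission model may differ.
interface AgentAccess {
  agentId: string;
  studioIds: string[]; // studios the agent is assigned to
  crossStudioAccess: boolean; // explicit grant to read across studios
}

interface IndexedDocument {
  documentId: string;
  studioId: string;
}

// An agent may retrieve a document if it has cross-studio access
// or is assigned to the studio that owns the document.
function canRetrieve(agent: AgentAccess, doc: IndexedDocument): boolean {
  return agent.crossStudioAccess || agent.studioIds.indexOf(doc.studioId) !== -1;
}

// Keep only the documents this agent is allowed to see.
function filterRetrievable(agent: AgentAccess, docs: IndexedDocument[]): IndexedDocument[] {
  return docs.filter((doc) => canRetrieve(agent, doc));
}

const hrAgent: AgentAccess = {
  agentId: "agent_hr_example",
  studioIds: ["stu_hr_01"],
  crossStudioAccess: false,
};
const indexedDocs: IndexedDocument[] = [
  { documentId: "kb_doc_abc123", studioId: "stu_hr_01" },
  { documentId: "kb_doc_finance_example", studioId: "stu_finance_example" },
];
console.log(filterRetrievable(hrAgent, indexedDocs).map((doc) => doc.documentId));
```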

How documents get indexed

When a document is uploaded to the Knowledge Base (or created in any studio), it goes through a 4-stage indexing pipeline. This process typically takes 5-30 seconds depending on document size and complexity.

1. Upload and parse

The document is received, format is detected, and content is extracted. Supports PDF, DOCX, XLSX, PPTX, MD, TXT, HTML, CSV, JSON, and 40+ more formats. Scanned documents go through OCR (Optical Character Recognition) to extract text from images.

Large documents (100+ pages) are processed in parallel chunks for speed.

2. Chunk and embed

Content is split into semantically meaningful chunks. Unlike fixed-size chunking that might split a paragraph mid-sentence, semantic chunking respects document structure -- keeping paragraphs, sections, and tables intact. Each chunk is then embedded into a vector representation using a text embedding model.

Chunk sizes typically range from 200 to 1,000 tokens. Adjacent chunks overlap so that context is not lost at chunk boundaries.
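As a rough illustration of structure-aware chunking with overlap, the sketch below packs whole paragraphs into chunks up to a token budget and repeats the last paragraph of each chunk at the start of the next. It is not the platform's chunker: `countTokens` and `chunkByParagraph` are hypothetical helpers, tokens are approximated by whitespace-separated words, and the sample text is made up.

```typescript
// Rough token estimate: whitespace-separated words.
// (An assumption for this sketch; real tokenizers count differently.)
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Pack whole paragraphs into chunks of at most `maxTokens` where possible.
// Paragraphs are never split, so a single oversized paragraph (or the
// overlap plus one paragraph) can exceed the budget.
function chunkByParagraph(text: string, maxTokens: number): string[] {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current: string[] = [];
  let currentTokens = 0;
  let lastParagraph = "";

  for (const paragraph of paragraphs) {
    const tokens = countTokens(paragraph);
    if (current.length > 0 && currentTokens + tokens > maxTokens) {
      chunks.push(current.join("\n\n"));
      // Overlap: the next chunk starts with the previous chunk's last paragraph.
      current = [lastParagraph];
      currentTokens = countTokens(lastParagraph);
    }
    current.push(paragraph);
    currentTokens += tokens;
    lastParagraph = paragraph;
  }
  if (current.length > 0) {
    chunks.push(current.join("\n\n"));
  }
  return chunks;
}

const sampleText = [
  "PTO accrues monthly.",
  "Unused PTO carries over.",
  "Requests need manager approval.",
].join("\n\n");
const sampleChunkList = chunkByParagraph(sampleText, 8);
console.log(sampleChunkList);
```

With an 8-token budget, the middle paragraph appears in both resulting chunks, which is the overlap described above.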

3. Extract entities

Named Entity Recognition (NER) extracts people, organizations, dates, amounts, product names, policy references, and other domain-specific entities from the text. These entities become nodes in the knowledge graph.

Entity extraction is configurable per document type. Legal documents extract clause references; financial documents extract amounts and dates.

4. Build graph

Extracted entities are connected with typed relationships (e.g., "PTO Policy" is "defined_in" the "Employee Handbook", which is "managed_by" the "HR Department"). The Graphiti knowledge graph grows with every indexed document, creating a web of organizational knowledge.

The graph enables multi-hop queries: "Who manages the policy that covers employee time off?" traverses Person -> Department -> Policy -> Document.
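A multi-hop query amounts to walking typed relationships outward from a starting entity. The sketch below runs a breadth-first traversal over an in-memory edge list, using the two relationships from the example above. The `Relationship` shape and `traverse` function are illustrative assumptions and do not represent the Graphiti API.

```typescript
// Hypothetical edge shape, modeled on the relationship examples in this section.
interface Relationship {
  from: string;
  to: string;
  type: string;
}

// Breadth-first traversal following edges in the from -> to direction.
// Returns each reachable entity with the number of hops needed to reach it.
function traverse(edges: Relationship[], start: string, maxDepth: number): Map<string, number> {
  const depths = new Map<string, number>();
  depths.set(start, 0);
  let frontier: string[] = [start];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const edge of edges) {
        if (edge.from === node && !depths.has(edge.to)) {
          depths.set(edge.to, depth);
          next.push(edge.to);
        }
      }
    }
    frontier = next;
  }
  return depths;
}

const sampleEdges: Relationship[] = [
  { from: "PTO Policy", to: "Employee Handbook", type: "defined_in" },
  { from: "Employee Handbook", to: "HR Department", type: "managed_by" },
];

// One hop reaches the handbook; two hops also reach the department.
const twoHops = traverse(sampleEdges, "PTO Policy", 2);
console.log(twoHops.get("Employee Handbook"), twoHops.get("HR Department"));
```

The `maxDepth` argument plays the same role as the `depth` field in the graph query route listed under API routes.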

upload-document.ts

```typescript
// Upload a document to the knowledge base
const result = await boss.knowledgeBase.upload({
  file: fileBuffer,
  filename: 'employee-handbook-2026.pdf',
  studioId: 'stu_hr_01', // Optional: scope to a studio
  metadata: {
    category: 'policy',
    department: 'HR',
    version: '3.2',
    effectiveDate: '2026-01-01',
  },
  processing: {
    ocr: true, // OCR for scanned documents
    extractEntities: true, // Extract people, orgs, dates, amounts
    generateSummary: true, // AI-generated document summary
    chunkStrategy: 'semantic', // 'fixed', 'semantic', or 'page'
  },
});

// result:
// {
//   documentId: 'kb_doc_abc123',
//   chunks: 47,
//   entities: 23,
//   relationships: 15,
//   summary: 'Employee handbook covering PTO policy, code of conduct...',
//   processingTime: 12400 // ms
// }
```

How agents query the Knowledge Base

When an agent needs to answer a question or complete a task, it automatically queries the Knowledge Base using RAG (Retrieval-Augmented Generation). This happens transparently -- the user asks a question, and the agent grounds its response in your actual data.

The RAG pipeline works in 5 steps:

1. Agent receives the user message and analyzes intent
2. Agent constructs a search query optimized for the knowledge base (may be different from the user's exact words)
3. Knowledge Base returns the most relevant chunks with relevance scores
4. Agent synthesizes a response using the retrieved chunks as context
5. Agent includes citations linking back to the source documents

search-query.ts

```typescript
// Semantic search across the knowledge base
const results = await boss.knowledgeBase.search({
  query: 'What is our PTO policy for employees with less than 2 years tenure?',
  studioId: 'stu_hr_01', // Optional: scope to specific studio
  filters: {
    category: 'policy',
    updatedAfter: '2025-01-01',
  },
  limit: 5,
  minScore: 0.7,
  includeGraph: true, // Include knowledge graph context
});

// results:
// {
//   chunks: [
//     {
//       id: 'chunk_abc123',
//       documentId: 'kb_doc_abc123',
//       content: 'Employees with less than 2 years of service receive 15 days...',
//       score: 0.94,
//       metadata: { page: 12, section: 'Time Off Policy' }
//     }
//   ],
//   graphContext: {
//     entities: ['PTO Policy', 'Employee Handbook', 'HR Department'],
//     relationships: [
//       { from: 'PTO Policy', to: 'Employee Handbook', type: 'defined_in' },
//       { from: 'PTO Policy', to: 'HR Department', type: 'managed_by' }
//     ]
//   }
// }
```

agent-rag-response.json

```jsonc
// How agents query the knowledge base internally
// (This happens automatically when agents use RAG search)
// 1. Agent receives user message
// 2. Agent constructs a search query from the message context
// 3. Agent calls RAG search with studio-scoped filters
// 4. Agent receives relevant chunks with citations
// 5. Agent synthesizes response grounded in actual data
// 6. Agent includes citations in the response

// Example agent response structure:
{
  "content": "Based on our employee handbook (v3.2), employees with less than 2 years of tenure receive 15 PTO days per year. This increases to 20 days after the 2-year anniversary.",
  "citations": [
    {
      "documentId": "kb_doc_abc123",
      "documentTitle": "Employee Handbook 2026",
      "chunk": "Employees with less than 2 years of service receive 15 days...",
      "page": 12,
      "section": "Time Off Policy",
      "confidence": 0.94
    }
  ],
  "grounded": true // Response is grounded in knowledge base data
}
```

Grounding indicator: Agent responses include a grounded: true flag when the response is based on Knowledge Base data. When an agent generates a response purely from its training data (no KB matches), the flag is grounded: false -- a signal to treat the response with more scrutiny.
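The retrieval steps and the grounding flag described above can be sketched as one small function. This is an illustration under stated assumptions, not the agents' internal implementation: `answerWithRag`, `SearchFn`, and `SynthesizeFn` are hypothetical names, the search and synthesis steps are injected as stubs instead of calling a real index or language model, and query rewriting is reduced to trimming the message.

```typescript
interface RetrievedChunk {
  documentId: string;
  content: string;
  score: number;
}

interface Citation {
  documentId: string;
  chunk: string;
  confidence: number;
}

interface AgentAnswer {
  content: string;
  citations: Citation[];
  grounded: boolean;
}

// Hypothetical hooks standing in for the real search index and language model.
type SearchFn = (query: string) => RetrievedChunk[];
type SynthesizeFn = (question: string, context: string[]) => string;

function answerWithRag(
  message: string,
  search: SearchFn,
  synthesize: SynthesizeFn,
  minScore = 0.7
): AgentAnswer {
  // Steps 1-2: derive a search query (a real agent would rewrite the message).
  const query = message.trim();
  // Step 3: retrieve chunks and keep only sufficiently relevant ones, best first.
  const relevant = search(query)
    .filter((chunk) => chunk.score >= minScore)
    .sort((a, b) => b.score - a.score);
  // Step 4: synthesize a response from the retrieved context.
  const content = synthesize(message, relevant.map((chunk) => chunk.content));
  // Step 5: cite every chunk that was used as context.
  const citations = relevant.map((chunk) => ({
    documentId: chunk.documentId,
    chunk: chunk.content,
    confidence: chunk.score,
  }));
  // No Knowledge Base matches means the answer is not grounded.
  return { content, citations, grounded: relevant.length > 0 };
}

const stubSearch: SearchFn = () => [
  {
    documentId: "kb_doc_abc123",
    content: "Employees with less than 2 years of service receive 15 days...",
    score: 0.94,
  },
  { documentId: "kb_doc_abc123", content: "Low-relevance passage", score: 0.4 },
];
const stubSynthesize: SynthesizeFn = (_question, context) =>
  context.length > 0
    ? `Based on the Knowledge Base: ${context[0]}`
    : "No matching Knowledge Base content was found.";

const ragAnswer = answerWithRag("What is our PTO policy?", stubSearch, stubSynthesize);
console.log(ragAnswer.grounded, ragAnswer.citations.length);
```

A caller can branch on `grounded` to decide whether to show citations or flag the response for closer review.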

Spec update proposals

One of the most powerful features of the Knowledge Base is that agents can propose updates to it. When an agent detects that a document is outdated, a policy has changed, or new information contradicts existing content, it creates a Spec Update Proposal.

Proposals always go through a review process; no change is applied to the Knowledge Base until a reviewer approves it. Designated reviewers (typically subject matter experts or compliance officers) approve, reject, or modify the proposal. This creates a feedback loop where agents actively maintain and improve the knowledge base over time.

spec-proposal.ts

```typescript
// Agents can propose updates to the knowledge base
// (Spec Update Proposal pattern)

// When Axiom (Compliance Officer) detects a regulatory change:
const proposal = await boss.knowledgeBase.proposeUpdate({
  documentId: 'kb_doc_compliance_soc2',
  proposedBy: 'agent_axiom',
  reason: 'SOC 2 Type II requirements updated by AICPA effective 2026-06-01',
  changes: [
    {
      section: 'Access Control Requirements',
      currentText: 'Annual access reviews are required for all systems...',
      proposedText: 'Quarterly access reviews are required for all critical systems, with annual reviews for non-critical systems...',
      source: 'AICPA TSC 2026 Update, Section CC6.1',
    },
  ],
  urgency: 'high',
  reviewers: ['usr_compliance_lead', 'usr_ciso'],
});

// Proposal goes to designated reviewers for approval
// Approved proposals update the knowledge base automatically
// Rejected proposals are logged with feedback for the agent
```
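The review gate can be thought of as a small state machine: a proposal starts as pending and only a designated reviewer can move it to approved or rejected. The sketch below is a simplified model, not the platform's implementation. The `Proposal` shape, status names, and `reviewProposal` function are assumptions, and the "modify" outcome mentioned above is left out for brevity.

```typescript
type ProposalStatus = "pending" | "approved" | "rejected";
type ReviewAction = "approve" | "reject";

// Hypothetical proposal record, loosely modeled on the proposeUpdate() example.
interface Proposal {
  id: string;
  documentId: string;
  status: ProposalStatus;
  reviewers: string[];
  feedback: string | null;
}

// Apply a review decision. Only pending proposals can be reviewed,
// and only by one of the designated reviewers.
function reviewProposal(
  proposal: Proposal,
  reviewerId: string,
  action: ReviewAction,
  feedback: string | null
): Proposal {
  if (proposal.status !== "pending") {
    throw new Error(`Proposal ${proposal.id} has already been reviewed`);
  }
  if (proposal.reviewers.indexOf(reviewerId) === -1) {
    throw new Error(`${reviewerId} is not a designated reviewer`);
  }
  return {
    ...proposal,
    status: action === "approve" ? "approved" : "rejected",
    feedback,
  };
}

const pendingProposal: Proposal = {
  id: "prop_example_1", // made-up ID for illustration
  documentId: "kb_doc_compliance_soc2",
  status: "pending",
  reviewers: ["usr_compliance_lead", "usr_ciso"],
  feedback: null,
};
const approvedProposal = reviewProposal(pendingProposal, "usr_ciso", "approve", "Matches the updated criteria");
console.log(approvedProposal.status);
```

Returning a new object instead of mutating the input keeps the pending record available for audit logging.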

Common proposal triggers:

Compliance agents detect regulatory changes that affect existing policies

Research agents find newer data that contradicts existing documents

Process agents identify steps that are no longer followed in practice

Analytics agents detect metrics that have changed significantly

Connector agents receive updates from external systems (e.g., Salesforce field changes)

Enterprise: uploading SOPs, policies, and compliance docs

For enterprise organizations, the Knowledge Base serves as the canonical source of truth for Standard Operating Procedures, corporate policies, compliance documentation, and institutional knowledge. When agents are grounded in your actual policies, they give answers consistent with how your organization actually operates.

SOPs and Procedures

Examples: Employee onboarding, incident response, change management, code review

Agents follow your actual procedures when executing playbooks instead of generic best practices.

Corporate Policies

Examples: PTO, expense, travel, data handling, acceptable use, code of conduct

Agents answer policy questions accurately. HR agents cite the exact policy section in their responses.

Compliance Documents

Examples: SOC 2 controls, HIPAA procedures, GDPR DPIAs, ISO 27001 policies

Compliance agents proactively check agent outputs against regulatory requirements. Audit prep becomes automated.

Product Documentation

Examples: API docs, user guides, architecture diagrams, release notes

Support agents answer customer questions from current documentation. Engineering agents reference specs when building.

Training Materials

Examples: Onboarding guides, role-specific training, tool tutorials, best practices

New employee onboarding agents create personalized training plans based on actual training materials.

Bulk upload: Enterprise customers can bulk-upload entire document repositories via the API or the admin UI at /admin/knowledge-base/upload. BOSS processes documents in parallel and provides a progress dashboard showing indexing status for each file.
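When bulk-uploading through the API from your own script, it can help to cap the number of simultaneous requests on the client side. The helper below is a generic sketch of bounded concurrency; `uploadAll` is a hypothetical name, and `fakeUpload` is a stand-in for whatever upload call you use. Any server-side rate limits are not documented here, so treat the concurrency value as something to tune.

```typescript
// Run `uploadOne` over all items with at most `limit` calls in flight.
// Results are returned in the same order as the input items.
async function uploadAll<T, R>(
  items: T[],
  limit: number,
  uploadOne: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await uploadOne(items[index] as T);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}

// Stand-in uploader for illustration; replace with a real upload call.
const fakeUpload = async (filename: string): Promise<string> => `indexed:${filename}`;

uploadAll(["handbook.pdf", "travel-policy.docx", "onboarding.md"], 2, fakeUpload).then(
  (uploaded) => console.log(uploaded)
);
```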

API routes

The Knowledge Base is fully accessible via the REST API. All routes require authentication with an API key that has the appropriate scopes (documents:read, documents:write, search:query).

knowledge-base-api.sh

```
# Knowledge Base API Routes

# List all indexed documents
GET /api/knowledge-base
Authorization: Bearer sk_cintrico_YOUR_KEY

# Upload a new document
POST /api/knowledge-base/upload
Content-Type: multipart/form-data

# Search the knowledge base
POST /api/search
Content-Type: application/json
{ "query": "...", "studioId": "...", "limit": 10 }

# Trigger re-indexing for a document
POST /api/ingest
Content-Type: application/json
{ "documentId": "kb_doc_abc123", "force": true }

# Get knowledge graph stats
GET /api/knowledge-base/graph/stats

# Query the knowledge graph
POST /api/knowledge-base/graph/query
Content-Type: application/json
{ "entity": "PTO Policy", "depth": 2, "types": ["defined_in", "managed_by"] }

# Propose a spec update
POST /api/knowledge-base/proposals
Content-Type: application/json

# List pending proposals
GET /api/knowledge-base/proposals?status=pending

# Approve/reject a proposal
POST /api/knowledge-base/proposals/{id}/review
Content-Type: application/json
{ "action": "approve", "feedback": "..." }
```
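For callers not using an SDK, the search route above can be assembled by hand. The sketch below only builds the request description and sends nothing over the network. The base URL `https://boss.example.com` is a placeholder, and `buildSearchRequest`, `SearchBody`, and `HttpRequestSpec` are illustrative names; only the path, headers, and body fields come from the route listing.

```typescript
// Body fields taken from the POST /api/search example above.
interface SearchBody {
  query: string;
  studioId?: string;
  limit?: number;
}

// Plain description of an HTTP request (hypothetical helper type).
interface HttpRequestSpec {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the search request; trailing slashes on the base URL are ignored.
function buildSearchRequest(baseUrl: string, apiKey: string, body: SearchBody): HttpRequestSpec {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/search`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

const searchRequest = buildSearchRequest(
  "https://boss.example.com/", // placeholder host
  "sk_cintrico_YOUR_KEY",
  { query: "What is our PTO policy?", studioId: "stu_hr_01", limit: 10 }
);
console.log(searchRequest.method, searchRequest.url);
```

The resulting object can be passed to any HTTP client, for example `fetch(searchRequest.url, { method: searchRequest.method, headers: searchRequest.headers, body: searchRequest.body })`.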

Supported file formats

Documents

PDF, DOCX, DOC, ODT, RTF

Spreadsheets

XLSX, XLS, CSV, TSV, ODS

Presentations

PPTX, PPT, ODP, KEY

Text

TXT, MD, HTML, XML, YAML

Code

JS, TS, PY, GO, RS, JAVA

Images (OCR)

PNG, JPG, TIFF, BMP

Email

EML, MSG, MBOX