The Knowledge Base is the knowledge repository for your agents. Instead of relying solely on the system prompt, the agent queries the base during each conversation and retrieves relevant excerpts before formulating a response. This architecture is called RAG (Retrieval-Augmented Generation).

Core Concepts

| Concept | Description |
| --- | --- |
| Knowledge Base | An independent collection of documents and content that an agent can query. Each base is isolated — documents from one base never appear in searches from another. |
| Chunk | A text fragment produced by splitting long documents. The chunker divides content into pieces of ~4,000 characters with an overlap of ~400 characters between consecutive chunks to preserve context at boundaries. |
| Embedding | A vector representation of a chunk generated by the text-embedding-3-small model. Each embedding has 1,536 dimensions and captures the semantic meaning of the text. |
| Retrieval | A cosine-similarity search process that compares the embedding of the user's question with stored embeddings and returns the semantically closest chunks. |
| RAG | Retrieval-Augmented Generation — retrieved chunks are injected into the LLM context alongside the user's message, allowing the agent to respond based on specific indexed information rather than generic pre-training knowledge. |

Why It Matters

  1. Responses grounded in real data: the agent answers with the exact content you indexed — policies, pricing, catalogs — not generic inferences from the base model.
  2. Updates without reprogramming: simply update the content in the Knowledge Base. The agent will use the new information immediately in subsequent conversations, with no need to modify the system prompt.
  3. Controlled scope: searches are always filtered by knowledge_base_id and company_id, ensuring that one customer’s data never appears to another, and that distinct bases remain isolated even when an agent accesses multiple bases simultaneously.
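The scoping described above can be sketched as a simple tenant-aware filter. The `ChunkRecord` shape below is illustrative — only the `knowledge_base_id` and `company_id` fields mirror the filters named in this document; the rest is assumed for the example.

```typescript
// Hypothetical chunk record; field names mirror the filters described
// above (knowledge_base_id, company_id) but the shape is illustrative.
interface ChunkRecord {
  id: string;
  knowledge_base_id: string;
  company_id: string;
  content: string;
}

// Scope a search to one company and the set of bases connected to the
// agent, so chunks from other tenants or bases can never be returned.
function scopeChunks(
  chunks: ChunkRecord[],
  companyId: string,
  baseIds: string[],
): ChunkRecord[] {
  return chunks.filter(
    (c) => c.company_id === companyId && baseIds.includes(c.knowledge_base_id),
  );
}
```

In production this filtering happens in the database query itself, not in application code; the sketch only makes the isolation rule explicit.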

How It Works

1. Content ingestion

You add content to the base — documents, Q&A pairs, website pages, or YouTube videos. The content is stored as a document in the knowledge_base_documents table.
2. Chunking

The knowledge-process-document function splits the text into pieces of ~4,000 characters. An overlap of ~400 characters is maintained between consecutive chunks so context is not lost at boundaries.
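A minimal sketch of this splitting strategy — not the actual knowledge-process-document code — using the sizes stated above (~4,000-character windows, ~400-character overlap):

```typescript
// Fixed-size windows with overlap, so text spanning a chunk boundary
// appears at the end of one chunk and the start of the next.
function chunkText(text: string, size = 4000, overlap = 400): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance 3,600 chars per chunk by default
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

Real chunkers often also prefer to break on sentence or paragraph boundaries; the fixed-window version only illustrates the size/overlap arithmetic.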
3. Embedding generation

Each chunk is converted into a 1,536-dimensional vector by OpenAI’s text-embedding-3-small model, sent in batches of up to 100 chunks per request. Vectors are stored in the knowledge_chunks table.
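The batching can be sketched with a generic helper; the pipeline described above would call the embeddings API once per batch of up to 100 chunks. The API call itself is omitted — only the batch formation is shown.

```typescript
// Split a list of chunks into batches of at most `batchSize` items,
// one batch per embeddings request.
function toBatches<T>(items: T[], batchSize = 100): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```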
4. Semantic search

When the agent needs information, the knowledge-search function converts the user’s question into a vector and runs a cosine-similarity search. The semantically closest chunks are returned, ordered by relevance.
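The core of this step is cosine similarity over embedding vectors. The sketch below assumes the query has already been embedded (via text-embedding-3-small) and ranks stored chunks against it; the actual knowledge-search function runs this comparison in the database.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored chunk against the query embedding and
// return them ordered by relevance, most similar first.
function rankChunks(
  query: number[],
  chunks: { content: string; embedding: number[] }[],
): { content: string; score: number }[] {
  return chunks
    .map((c) => ({ content: c.content, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```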
5. Augmented generation

The retrieved chunks are injected into the LLM context alongside the user’s message. The agent formulates the response based on that specific information, not on generic pre-training knowledge.
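Context injection can be sketched as simple prompt assembly. The exact template the platform uses is not documented here; the wording below is illustrative.

```typescript
// Concatenate retrieved chunks into a numbered context block placed
// ahead of the user's question, so the model answers from indexed
// content rather than pre-training knowledge.
function buildAugmentedPrompt(chunks: string[], userMessage: string): string {
  const context = chunks.map((c, i) => `[${i + 1}] ${c}`).join("\n\n");
  return `Use only the context below to answer.\n\nContext:\n${context}\n\nQuestion: ${userMessage}`;
}
```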

Supported Content Types

| Type | Description |
| --- | --- |
| Documents | PDFs, text files, and uploaded documents. The processor extracts the text, splits it into chunks, and generates embeddings. |
| Q&A | Manually added question-and-answer pairs. Indexed instantly, with no asynchronous pipeline. High precision because you control exactly what will be retrieved. |
| Website | URLs crawled by the crawler. The system traverses the pages, extracts textual content, and indexes it as documents. Useful for keeping the base in sync with public documentation. |
| YouTube | Video URLs. The system downloads the automatic transcript, splits it into chunks, and indexes it. Useful for knowledge bases built from video tutorials. |

Knowledge Lifecycle

1. Base creation

A Knowledge Base is created in the Knowledge module, given a name, and associated with the workspace. It starts empty, with no documents.
2. Content addition

Documents, Q&As, URLs, and videos are added to the base. Each source goes through the processing pipeline corresponding to its type.
3. Processing and indexing

Content is processed asynchronously: chunking, vectorization, and storage. The status changes from Processing to Indexed when complete.
4. Connecting to an agent

The base is connected to one or more agents in the Training tab. Retrieval parameters — top-k and similarity threshold — are configured per base.
5. Production use

The agent queries the base automatically in every conversation. When the user’s question has a semantic match above the threshold, the relevant chunks are included in the response context.
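The per-base retrieval parameters mentioned above (top-k and similarity threshold) combine as a filter-then-truncate step. A minimal sketch, assuming results have already been scored by cosine similarity:

```typescript
// Keep only matches at or above the similarity threshold,
// then take the top-k highest-scoring chunks.
function applyRetrievalParams(
  results: { content: string; score: number }[],
  topK: number,
  threshold: number,
): { content: string; score: number }[] {
  return results
    .filter((r) => r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

A higher threshold trades recall for precision: questions with no chunk above it yield an empty context, and the agent falls back to its system prompt alone.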

Example

A software company connects three bases to the same support agent:
  • “Product FAQ” base with answers to the most frequent questions about features
  • “Technical Documentation” base with manuals and integration guides
  • “Commercial Policies” base with cancellation, refund, and contract rules
When a user asks “how do I cancel my subscription?”, the agent searches all three bases simultaneously, retrieves the most relevant chunks from the “Commercial Policies” base, and formulates an accurate response based on the company’s actual rules.