What You’ll Build
- An Agno agent that joins conversations as a documentation expert.
- An ingestion pipeline that writes markdown artifacts into `knowledge_agent/data/knowledge/<namespace>`.
- Retrieval and answering logic that always cites the sources it used.
- A `/stream` endpoint that outputs newline-delimited JSON events so CometChat can subscribe without changes.
Prerequisites
- Python 3.10 or newer (3.11 recommended).
- `OPENAI_API_KEY` with access to GPT-4o or any compatible model.
- Optional: an alternate OpenAI base URL or model IDs if you self-host OpenAI-compatible APIs.
- curl or an API client (Hoppscotch, Postman) to call the FastAPI endpoints.
Quick links
- Repo root: `ai-agent-agno-examples`
- Project folder: `knowledge_agent/`
- Server guide: `README.md#knowledge-agent`
- API reference: `knowledge_agent/main.py`
- Knowledge helpers: `knowledge_agent/knowledge_manager.py`
How it works
This example recreates the Vercel knowledge base workflow using Agno:
- Ingest — `collect_documents` accepts URLs, markdown, plain text, uploads, or multipart forms. Sources are deduplicated by a SHA-256 hash and normalized into markdown.
- Store — `KnowledgeManager` keeps one `ChromaDb` collection per namespace, with metadata persisted under `knowledge_agent/data/knowledge/<namespace>`.
- Retrieve — searches hit the vector DB via Agno's `Knowledge` class, returning ranked snippets and the metadata used for citations.
- Answer — `create_agent` enables `search_knowledge` and `add_knowledge_to_context`, forcing every response to cite sources via the system prompt.
- Stream — `/stream` emits newline-delimited JSON events (`text_delta`, `tool_*`, `text_done`, `done`, `error`) that match CometChat's Bring Your Own Agent expectations. Every event echoes the caller's `thread_id` and `run_id`.
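The dedup step above can be sketched with a few lines of stdlib Python. This is an illustrative sketch, not the project's actual code: the function name `ingest_sources` and the result shape are assumptions, only the SHA-256 hashing and the `"duplicate-content"` reason come from this guide.

```python
import hashlib


def ingest_sources(sources: list[str]) -> dict:
    """Illustrative sketch: deduplicate raw source texts by SHA-256 before storage."""
    saved, skipped = [], []
    seen_hashes: set[str] = set()
    for text in sources:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            # Same content hash seen earlier in this batch: skip it.
            skipped.append({"hash": digest, "reason": "duplicate-content"})
            continue
        seen_hashes.add(digest)
        saved.append({"hash": digest, "markdown": text})
    return {"saved": saved, "skipped": skipped}
```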
Setup
1. Clone & install: `git clone https://github.com/cometchat/ai-agent-agno-examples.git`, then inside the repo run `python3 -m venv .venv && source .venv/bin/activate && pip install -e .`
2. Configure environment: create `.env` (or export env vars) with at least `OPENAI_API_KEY`. Optional overrides: `OPENAI_BASE_URL`, `KNOWLEDGE_OPENAI_MODEL`, `KNOWLEDGE_STORAGE_PATH`, `KNOWLEDGE_CHROMA_PATH`.
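A minimal `.env` might look like the following. The values are placeholders, and the optional paths are illustrative guesses, not necessarily the project's shipped defaults:

```bash
OPENAI_API_KEY=sk-your-key-here
# Optional overrides (illustrative values)
OPENAI_BASE_URL=https://api.openai.com/v1
KNOWLEDGE_OPENAI_MODEL=gpt-4o
KNOWLEDGE_STORAGE_PATH=knowledge_agent/data/knowledge
KNOWLEDGE_CHROMA_PATH=knowledge_agent/data/chroma
```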
3. Start the server: launch FastAPI with `uvicorn knowledge_agent.main:app --host 0.0.0.0 --port 8000 --reload`. The app exposes health, ingestion, search, generate, and `/stream` endpoints (newline-delimited JSON).

Project Structure
- FastAPI & wiring
- Knowledge + ingestion
- Constants & helpers
Step 1 - Configure the Knowledge Agent
`KnowledgeManager.create_agent` builds an Agno agent bound to the current namespace:
- Uses `OpenAIChat` with `OPENAI_API_KEY`, an optional custom base URL, and temperature from settings.
- Enables `search_knowledge=True` and `add_knowledge_to_context=True` so retrieved snippets feed the model.
- Injects a system prompt that demands a knowledge search before every reply and enforces the `"Sources: <file>.md"` footer.
- Reuses the namespace-specific `ChromaDb` collection initialised in `_get_namespace`.
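The citation behaviour comes entirely from the instructions string. A hedged sketch of what such a prompt builder could look like follows; the actual wording inside `create_agent` will differ, and `build_instructions` is a hypothetical helper name:

```python
def build_instructions(namespace: str) -> str:
    """Illustrative system prompt enforcing retrieval-first answers with citations."""
    return (
        f"You are a documentation expert answering from the '{namespace}' knowledge base.\n"
        "Always search the knowledge base before replying.\n"
        "End every answer with a footer in the form: Sources: <file>.md\n"
        "If no relevant documents are found, say so instead of guessing."
    )
```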
Step 2 - Ingest Knowledge
`POST /api/tools/ingest` accepts JSON or multipart payloads. Highlights:
- Up to 30 sources per call, 6 MB per file, 200 kB per inline text/markdown.
- URLs, PDFs, HTML pages, plain text, and uploads are normalized to markdown with metadata and timestamps.
- Duplicate hashes are skipped with a `"duplicate-content"` reason; existing files return `"already-ingested"`.
- Responses provide `saved`, `skipped`, `errors`, and the resolved namespace.
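As a quick smoke test you can post a markdown snippet from a short script. The endpoint path comes from above, but the JSON field names (`namespace`, `sources`, `type`, `content`) are assumptions about the payload shape, so check them against `knowledge_agent/main.py`:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumes the local dev server from Setup


def build_ingest_request(markdown: str, namespace: str = "default") -> urllib.request.Request:
    """Build (but do not send) a JSON ingest request; field names are illustrative."""
    payload = {
        "namespace": namespace,
        "sources": [{"type": "markdown", "content": markdown}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/tools/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Only send when run directly, so importing this sketch has no side effects.
    with urllib.request.urlopen(build_ingest_request("# Hello docs")) as resp:
        print(json.load(resp))  # expect saved / skipped / errors keys
```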
Step 3 - Search & Validate
`POST /api/tools/searchDocs` lets you confirm retrieval before opening the agent to users:
- Required body: `{"query": "How do I add tools?"}`, with optional `namespace` and `max_results`.
- Returns ranked snippets with metadata (hashes, distances converted to scores).
- Empty queries immediately return an error so the UI can prompt the operator.
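Vector stores like Chroma report distances where lower is better, so the service flips them into scores for ranking. A minimal sketch of that conversion, assuming the common `score = 1 - distance` convention (the project's actual formula may differ):

```python
def rank_snippets(hits: list[dict], max_results: int = 5) -> list[dict]:
    """Convert vector-store distances into scores and keep the top results.

    Assumes score = 1 - distance, a common convention for cosine distance;
    the service's real conversion may differ.
    """
    ranked = [{**hit, "score": round(1.0 - hit["distance"], 4)} for hit in hits]
    ranked.sort(key=lambda h: h["score"], reverse=True)  # highest score first
    return ranked[:max_results]
```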
Step 4 - Chat & Stream
`POST /api/agents/knowledge/generate` handles non-streaming responses. `POST /stream` streams newline-delimited JSON events that include tool calls, intermediate reasoning, text deltas, and completion markers.
Each event carries a `type` field such as `text_delta`, `tool_call_start`, `tool_result`, `text_done`, or `done`. `thread_id` and `run_id` are echoed back so CometChat can correlate partial events.
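A client consumes the stream by reading line-by-line and dispatching on `type`. Here is a minimal parser over a canned stream; the event types match the list above, but the `delta` payload field is an assumption about the event shape:

```python
import json
from typing import Iterable


def consume_stream(lines: Iterable[str]) -> str:
    """Accumulate text_delta events from newline-delimited JSON until done."""
    answer: list[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate keep-alive blank lines
        event = json.loads(line)
        if event["type"] == "text_delta":
            answer.append(event["delta"])  # "delta" field name is an assumption
        elif event["type"] == "done":
            break
    return "".join(answer)
```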
Step 5 - Connect to CometChat
- Deploy the FastAPI service behind HTTPS (e.g., Fly.io, Render, Railway, or your own Kubernetes cluster).
- Add auth headers or gateway middleware if you need to validate incoming requests from CometChat.
- In the CometChat dashboard, point the Agno agent's Deployment URL at the `/stream` endpoint; use Headers for bearer tokens or basic auth if required.
- Provide `namespace` (or `toolParams.namespace` from CometChat) when you need to target non-default knowledge stores; the service normalizes values before lookup.