Imagine a FastAPI service that ingests your documentation, stores it in a vector database, and streams Agno agent responses with citations that CometChat can consume in real time.

What You’ll Build

  • An Agno agent that joins conversations as a documentation expert.
  • An ingestion pipeline that writes markdown artifacts into knowledge_agent/data/knowledge/<namespace>.
  • Retrieval and answering logic that always cites the sources it used.
  • A /stream endpoint that outputs newline-delimited JSON events so CometChat can subscribe without changes.

Prerequisites

  • Python 3.10 or newer (3.11 recommended).
  • OPENAI_API_KEY with access to GPT-4o or any compatible model.
  • Optional: alternate OpenAI base URL or model IDs if you self-host OpenAI-compatible APIs.
  • curl or an API client (Hoppscotch, Postman) to call the FastAPI endpoints.


How it works

This example recreates the Vercel knowledge base workflow using Agno:
  • Ingest — collect_documents accepts URLs, markdown, plain text, uploads, or multipart forms. Sources are deduplicated by a SHA-256 hash and normalized into markdown.
  • Store — KnowledgeManager keeps one ChromaDb collection per namespace, with metadata persisted under knowledge_agent/data/knowledge/<namespace>.
  • Retrieve — Searches hit the vector DB via Agno’s Knowledge class, returning ranked snippets and the metadata used for citations.
  • Answer — create_agent enables search_knowledge and add_knowledge_to_context, forcing every response to cite sources via the system prompt.
  • Stream — /stream emits newline-delimited JSON events (text_delta, tool_*, text_done, done, error) that match CometChat’s Bring Your Own Agent expectations. Every event echoes the caller’s thread_id and run_id.
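
To make the Ingest step concrete, here is a hedged sketch of the dedup-by-hash logic; save_if_new and the filename scheme are illustrative helpers, not the repo's actual code:

import hashlib
from pathlib import Path

def save_if_new(namespace: str, markdown: str, title: str) -> str:
    """Persist normalized markdown unless its content hash was seen before."""
    digest = hashlib.sha256(markdown.encode("utf-8")).hexdigest()
    short = digest[:12]
    folder = Path("knowledge_agent/data/knowledge") / namespace
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{short}-{title}.md"
    if target.exists():
        return "already-ingested"   # exact file already on disk
    if any(p.name.startswith(short) for p in folder.glob("*.md")):
        return "duplicate-content"  # same hash saved under another title
    target.write_text(markdown, encoding="utf-8")
    return "saved"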

Setup

1. Clone & install

Run git clone https://github.com/cometchat/ai-agent-agno-examples.git, then inside the repo run:
python3 -m venv .venv && source .venv/bin/activate && pip install -e .
2. Configure environment

Create .env (or export env vars) with at least OPENAI_API_KEY. Optional overrides: OPENAI_BASE_URL, KNOWLEDGE_OPENAI_MODEL, KNOWLEDGE_STORAGE_PATH, KNOWLEDGE_CHROMA_PATH.
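
A minimal .env might look like this; the values are placeholders, and the commented defaults are assumptions rather than documented ones:

OPENAI_API_KEY=sk-...
# Optional overrides (uncomment as needed):
# OPENAI_BASE_URL=https://my-openai-gateway.example.com/v1
# KNOWLEDGE_OPENAI_MODEL=gpt-4o
# KNOWLEDGE_STORAGE_PATH=knowledge_agent/data/knowledge
# KNOWLEDGE_CHROMA_PATH=knowledge_agent/data/chroma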
3. Start the server

Launch FastAPI with uvicorn knowledge_agent.main:app --host 0.0.0.0 --port 8000 --reload. The app exposes health, ingestion, search, generate, and /stream endpoints (newline-delimited JSON).
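
As a quick smoke test, hit the health endpoint (the /health path is an assumption; check the repo for the exact route):

# Path assumed; adjust to match the repo.
curl http://localhost:8000/health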

Project Structure


Step 1 - Configure the Knowledge Agent

KnowledgeManager.create_agent builds an Agno agent bound to the current namespace:
  • Uses OpenAIChat with OPENAI_API_KEY, optional custom base URL, and temperature from settings.
  • Enables search_knowledge=True and add_knowledge_to_context=True so retrieved snippets feed the model.
  • Injects a system prompt that demands a knowledge search before every reply and enforces the "Sources: <file>.md" footer.
  • Reuses the namespace-specific ChromaDb collection initialized in _get_namespace.
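
Putting those pieces together, a hedged sketch of create_agent follows; import paths and constructor arguments beyond the names quoted above are assumptions about the Agno API, not the repo's exact code:

# Hedged sketch; exact Agno import paths and signatures may differ by version.
import os

from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.models.openai import OpenAIChat
from agno.vectordb.chroma import ChromaDb

def create_agent(namespace: str = "default") -> Agent:
    # One persistent Chroma collection per namespace (mirrors _get_namespace).
    vector_db = ChromaDb(
        collection=f"knowledge_{namespace}",
        path=os.getenv("KNOWLEDGE_CHROMA_PATH", "knowledge_agent/data/chroma"),
        persistent_client=True,
    )
    return Agent(
        model=OpenAIChat(
            id=os.getenv("KNOWLEDGE_OPENAI_MODEL", "gpt-4o"),
            base_url=os.getenv("OPENAI_BASE_URL"),  # optional self-hosted gateway
        ),
        knowledge=Knowledge(vector_db=vector_db),
        search_knowledge=True,          # expose the knowledge-search tool
        add_knowledge_to_context=True,  # feed retrieved snippets to the model
        instructions=(
            "Search the knowledge base before every reply and end each answer "
            "with a 'Sources: <file>.md' footer."
        ),
    )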

Step 2 - Ingest Knowledge

POST /api/tools/ingest accepts JSON or multipart payloads. Highlights:
  • Up to 30 sources per call, 6 MB per file, 200 kB per inline text/markdown.
  • URLs, PDFs, HTML pages, plain text, and uploads are normalized to markdown with metadata and timestamps.
  • Duplicate hashes are skipped with a "duplicate-content" reason; existing files return "already-ingested".
  • Responses provide saved, skipped, errors, and the resolved namespace.
Example JSON payload:
curl -X POST http://localhost:8000/api/tools/ingest \
  -H "Content-Type: application/json" \
  -d '{
        "namespace": "default",
        "sources": [
          { "type": "url", "value": "https://docs.agno.com/concepts/agents/overview" },
          { "type": "markdown", "title": "Playbook", "value": "# Notes\n\nAgno rocks!" }
        ]
      }'
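
A successful response reports what happened to each source; the field shapes below are illustrative, but the keys match the list above:

{
  "namespace": "default",
  "saved": ["a1b2c3d4e5f6-playbook.md"],
  "skipped": [{ "title": "Playbook", "reason": "duplicate-content" }],
  "errors": []
}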

Step 3 - Search & Validate

POST /api/tools/searchDocs lets you confirm retrieval before opening the agent to users:
  • Required body: {"query": "How do I add tools?"} with optional namespace and max_results.
  • Returns ranked snippets with metadata (hashes, distances converted to scores).
  • Empty queries immediately return an error so the UI can prompt the operator.
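
Example request (namespace and max_results are optional; the values shown are illustrative):

curl -X POST http://localhost:8000/api/tools/searchDocs \
  -H "Content-Type: application/json" \
  -d '{
        "query": "How do I add tools?",
        "namespace": "default",
        "max_results": 5
      }'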

Step 4 - Chat & Stream

  • POST /api/agents/knowledge/generate handles non-streaming responses.
  • POST /stream streams newline-delimited JSON events that include tool calls, intermediate reasoning, text deltas, and completion markers.
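Non-streaming example (assuming generate accepts the same messages shape as /stream):
curl -X POST http://localhost:8000/api/agents/knowledge/generate \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          { "role": "user", "content": "What is an Agno agent?" }
        ]
      }'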
Streaming example (newline-delimited JSON):
curl -N http://localhost:8000/stream \
  -H "Content-Type: application/json" \
  -d '{
        "thread_id": "thread_1",
        "run_id": "run_001",
        "messages": [
          { "role": "user", "content": "Summarize the agent lifecycle." }
        ]
      }'
Each line is a JSON object with a type field such as text_delta, tool_call_start, tool_result, text_done, or done. thread_id and run_id are echoed back so CometChat can correlate partial events.
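Trimmed, a stream might look like this; field names other than type, thread_id, and run_id are illustrative:
{"type":"tool_call_start","tool":"search_knowledge","thread_id":"thread_1","run_id":"run_001"}
{"type":"tool_result","tool":"search_knowledge","thread_id":"thread_1","run_id":"run_001"}
{"type":"text_delta","delta":"Agents move through ","thread_id":"thread_1","run_id":"run_001"}
{"type":"text_done","thread_id":"thread_1","run_id":"run_001"}
{"type":"done","thread_id":"thread_1","run_id":"run_001"}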

Step 5 - Connect to CometChat

  • Deploy the FastAPI service behind HTTPS (e.g., Fly.io, Render, Railway, or your own Kubernetes cluster).
  • Add auth headers or gateway middleware if you need to validate incoming requests from CometChat.
  • In the CometChat dashboard, point the Agno agent’s Deployment URL at the /stream endpoint; use Headers for bearer tokens or basic auth if required.
  • Provide namespace (or toolParams.namespace from CometChat) when you need to target non-default knowledge stores; the service normalizes values before lookup.
With that, you have a fully grounded Agno agent that streams CometChat-compatible events into your UI.