Give your chat experience document intelligence: ingest PDFs, run hybrid semantic + lexical retrieval, and stream grounded answers into CometChat.
  • Live demo: static upload + ask UI (served from docs/ when deployed to GitHub Pages)
  • Source code: GitHub repository
  • Agent files: src/mastra/agents/*
  • Tools: src/mastra/tools/*
  • Workflow: src/mastra/workflows/pdf-workflow.ts
  • Server: src/server.ts

What you’ll build

  • Dual Mastra agents:
    • pdfAgent (multi‑PDF retrieval across all uploaded docs)
    • singlePdfAgent (restricted to latest uploaded PDF)
  • Deterministic retrieval + answer workflow (pdfWorkflow)
  • Express API with upload, ask (JSON), and streaming (SSE) endpoints
  • Persisted hybrid vector + BM25 store (.data/ namespace pdf)
  • Integration with CometChat AI Agents (endpoint wiring)


How it works

  • Users upload PDF(s) → server parses pages → chunks + embeddings generated → persisted in vector + manifest store.
  • Hybrid retrieval ranks candidate chunks: score = α * cosine + (1-α) * sigmoid(BM25) where α = hybridAlpha.
  • Multi‑query expansion (optional) generates paraphrased variants to boost recall.
  • Tools (retrieve-pdf-context, retrieve-single-pdf-context) return stitched context + source metadata.
  • Workflow (pdfWorkflow) orchestrates retrieval + answer; streaming endpoints emit meta then incremental token events.
  • Mastra agent(s) are exposed via REST endpoints you wire into CometChat AI Agents.
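The hybrid ranking formula above can be sketched as a small scoring function. This is a simplified sketch for illustration; the repo's actual scoring lives in src/lib/ and may differ in detail:

```typescript
// Hybrid relevance score: blend semantic (cosine) and lexical (BM25) signals.
// alpha corresponds to the hybridAlpha knob; sigmoid squashes unbounded BM25
// scores into (0, 1) so both signals are on a comparable scale before mixing.
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

function hybridScore(cosine: number, bm25: number, alpha = 0.7): number {
  return alpha * cosine + (1 - alpha) * sigmoid(bm25);
}

// With alpha = 1 the score is purely semantic; with alpha = 0 purely lexical.
console.log(hybridScore(0.9, 4.2, 0.7).toFixed(3)); // → 0.926
```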

Repo layout (key files)

  • src/mastra/agents/pdf-agent.ts – multi‑PDF agent
  • src/mastra/agents/single-pdf-agent.ts – single latest PDF agent
  • src/mastra/tools/retrieve-pdf-context.ts / retrieve-single-pdf-context.ts – hybrid retrieval tools
  • src/mastra/workflows/pdf-workflow.ts – deterministic orchestration
  • src/lib/* – vector store, embeddings, manifest, PDF parsing
  • src/server.ts – Express API (upload, ask, streaming, manifest ops)
  • docs/index.html – optional static UI
  • .data/ – persisted vectors + manifest JSON

Prerequisites

  • Node.js 20+
  • OPENAI_API_KEY (embeddings + chat model)
  • A CometChat app (to register the agent)
  • (Optional) CORS_ORIGIN if restricting browser origins

Step 1 — Clone & install

Clone the example and install dependencies:
git clone https://github.com/cometchat/ai-agent-mastra-examples.git
cd ai-agent-mastra-examples/mastra-knowledge-agent-pdf
npm install
Create a .env with at least:
OPENAI_API_KEY=sk-...
PORT=3000

Step 2 — Define retrieval tools

File: src/mastra/tools/retrieve-pdf-context.ts (multi) & retrieve-single-pdf-context.ts (single)
import { createTool } from '@mastra/core/tools';
import { z } from 'zod';

export const retrieverTool = createTool({ /* simplified example for tutorial brevity */ });
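The core job of both retrieval tools — stitching ranked chunks into a bounded context string plus source metadata — can be sketched independently of Mastra. The `Chunk` shape and `stitchContext` name here are hypothetical; the repo's real types live in src/lib/:

```typescript
// A hypothetical chunk shape produced by the vector store's hybrid search.
interface Chunk {
  docId: string;
  page: number;
  text: string;
  score: number;
}

// Stitch the highest-scoring chunks into one context string, stopping once
// the character budget would be exceeded (mirrors the maxContextChars input).
function stitchContext(
  chunks: Chunk[],
  maxContextChars = 4000
): { context: string; sources: { docId: string; page: number }[] } {
  const parts: string[] = [];
  const sources: { docId: string; page: number }[] = [];
  let used = 0;
  for (const c of [...chunks].sort((a, b) => b.score - a.score)) {
    const piece = `[${c.docId} p.${c.page}] ${c.text}`;
    if (used + piece.length > maxContextChars) break;
    parts.push(piece);
    sources.push({ docId: c.docId, page: c.page });
    used += piece.length;
  }
  return { context: parts.join("\n\n"), sources };
}
```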

Step 3 — Create agents

Files: src/mastra/agents/pdf-agent.ts, src/mastra/agents/single-pdf-agent.ts
import { Agent } from '@mastra/core/agent';
import { openai } from '@ai-sdk/openai';
import { retrieverTool } from '../tools/retrieve-pdf-context';

export const pdfAgent = new Agent({
  name: 'pdfAgent',
  instructions: 'Use the retrieve-pdf-context tool to ground answers and cite sources.',
  model: openai('gpt-4o-mini'), // model choice is illustrative
  tools: { retrieverTool },
});
export const singlePdfAgent = new Agent({ /* same shape, restricted to the latest uploaded doc */ });

Step 4 — Wire Mastra & workflow

File: src/mastra/index.ts registers agents + pdfWorkflow with storage (LibSQL or file‑backed).
import { Mastra } from '@mastra/core/mastra';
import { LibSQLStore } from '@mastra/libsql';
import { pdfAgent } from './agents/pdf-agent';
import { singlePdfAgent } from './agents/single-pdf-agent';
import { pdfWorkflow } from './workflows/pdf-workflow';

export const mastra = new Mastra({
  agents: { pdfAgent, singlePdfAgent },
  workflows: { pdfWorkflow },
  storage: new LibSQLStore({ url: 'file:../mastra.db' }), // or a file-backed store
});
Start the dev server:
npm run dev

Step 5 — Run locally

 ┌──────────────┐          ┌──────────────┐
 │  Express API │ upload → │  PDF Parser  │
 └──────┬───────┘          └──────┬───────┘
        │ chunks + embeddings     │
        ▼                         ▼
 ┌──────────────┐ upsert/search ┌──────────────┐
 │ Vector Store │◀─────────────▶│  Embeddings  │
 └──────┬───────┘               └──────────────┘
        │ hybrid retrieve
        ▼
 ┌──────────────┐  tool calls  ┌──────────────────┐
 │ Mastra Agent │─────────────▶│ retrieve-* tools │
 └──────┬───────┘              └────────┬─────────┘
        │ stitched context             │ fallback
        ▼                              │ broaden
 ┌──────────────┐ answer tokens (SSE) ┌──────────────┐
 │  Workflow    │────────────────────▶│   Client     │
 └──────────────┘                     └──────────────┘

Step 6 — Upload and ask (API)

| Agent | File | Purpose | Tool |
|---|---|---|---|
| pdfAgent | src/mastra/agents/pdf-agent.ts | Multi‑PDF retrieval QA | retrieve-pdf-context |
| singlePdfAgent | src/mastra/agents/single-pdf-agent.ts | Latest single PDF QA | retrieve-single-pdf-context |
Tool input examples:
// retrieve-pdf-context
{ query, docIds?, topK=5, hybridAlpha=0.7, multiQuery=true, qVariants=3, maxContextChars=4000 }

// retrieve-single-pdf-context
{ query, docId?, topK=5, hybridAlpha=0.7, multiQuery=true, qVariants=3, maxContextChars=4000 }
Fallback widens search (higher topK, more qVariants) if initial context is sparse.
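The widen-on-sparse-context fallback can be sketched as a simple retry wrapper. The function and parameter names here are hypothetical; the real fallback lives inside the retrieval tools:

```typescript
// Hypothetical sketch of the fallback: if the first retrieval returns too
// little text, retry once with a broader net (more chunks, more variants).
interface RetrievalParams {
  topK: number;
  qVariants: number;
}

async function retrieveWithFallback(
  retrieve: (p: RetrievalParams) => Promise<string>,
  params: RetrievalParams,
  minChars = 500 // threshold below which context counts as "sparse"
): Promise<string> {
  const first = await retrieve(params);
  if (first.length >= minChars) return first;
  return retrieve({ topK: params.topK * 2, qVariants: params.qVariants + 2 });
}
```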

JSON ask (Mastra dev agent route)

The Mastra dev server automatically exposes a generate route for each registered agent:
curl -X POST http://localhost:4111/api/agents/pdfAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"What is covered in our docs?"}]}'
Expected response:
{
  "reply": "The docs cover..."
}

Step 7 — Deploy & connect to CometChat

  1. Deploy the project (e.g., Vercel, Railway, or AWS).
  2. Copy the deployed endpoint URL.
  3. In CometChat Dashboard → AI Agents, add a new agent:
    • Agent ID: pdfAgent
    • Endpoint: https://your-deployed-url/api/agents/pdfAgent/generate

Step 8 — Optimize & extend

  • Ingest more documents via POST /api/upload (or the static UI in docs/).
  • Swap the file-backed store for a managed vector DB (Pinecone, Weaviate) for large datasets.
  • Extend the agents with memory or multi-tool workflows.


Environment variables

| Name | Description |
|---|---|
| OPENAI_API_KEY | Required for embeddings + chat model |
| CORS_ORIGIN | Optional allowed browser origin |
| PORT | Server port (default 3000) |
.env example:
OPENAI_API_KEY=sk-...
CORS_ORIGIN=http://localhost:3000
PORT=3000

Endpoint summary

| Method | Path | Description |
|---|---|---|
| POST | /api/upload | Upload a PDF (multipart); returns { docId, pages, chunks } |
| GET | /api/documents | List ingested documents |
| DELETE | /api/documents/:id | Delete a document + its vectors |
| GET | /api/documents/:id/file | Stream the stored original PDF |
| POST | /api/ask | Multi‑PDF retrieval + answer (JSON) |
| POST | /api/ask/full | Same as /api/ask (deterministic workflow path) |
| POST | /api/ask/stream | Multi‑PDF streaming (SSE) |
| POST | /api/ask/single | Single latest PDF answer (JSON) |
| POST | /api/ask/single/stream | Single latest PDF streaming (SSE) |

Curl Examples

Upload:
curl -X POST http://localhost:3000/api/upload \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/file.pdf"
Ask (multi):
curl -X POST http://localhost:3000/api/ask \
  -H 'Content-Type: application/json' \
  -d '{"question":"Summarize the abstract","topK":6}'
Stream (multi):
curl -N -X POST http://localhost:3000/api/ask/stream \
  -H 'Content-Type: application/json' \
  -d '{"question":"List key methods","multiQuery":true}'
Ask (single):
curl -X POST http://localhost:3000/api/ask/single \
  -H 'Content-Type: application/json' \
  -d '{"question":"What are the main conclusions?"}'
Stream (single):
curl -N -X POST http://localhost:3000/api/ask/single/stream \
  -H 'Content-Type: application/json' \
  -d '{"question":"Give me an outline"}'

SSE Events

| Event | Payload | Notes |
|---|---|---|
| meta | { sources, docId? } | First packet with retrieval metadata |
| token | { token } | Incremental answer token chunk |
| done | {} | Completion marker |
| error | { error } | Error occurred |
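A minimal client-side parser for these events can be written against the standard SSE wire format (`event:` / `data:` lines, blank-line separated). This sketch assumes the payloads are JSON, matching the table above:

```typescript
// Parse raw SSE frames ("event: ...\ndata: {...}" blocks separated by blank
// lines) into typed events matching the meta/token/done/error table.
interface SseEvent {
  event: string;
  data: unknown;
}

function parseSse(raw: string): SseEvent[] {
  return raw
    .split("\n\n")
    .filter((block) => block.trim().length > 0)
    .map((block) => {
      let event = "message"; // SSE default when no event: line is present
      let data = "";
      for (const line of block.split("\n")) {
        if (line.startsWith("event:")) event = line.slice(6).trim();
        else if (line.startsWith("data:")) data += line.slice(5).trim();
      }
      return { event, data: data ? JSON.parse(data) : null };
    });
}

const frames =
  'event: meta\ndata: {"sources":[]}\n\nevent: token\ndata: {"token":"Hello"}\n\nevent: done\ndata: {}';
console.log(parseSse(frames).map((e) => e.event).join(",")); // → meta,token,done
```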

Tuning & retrieval knobs

| Parameter | Effect | Trade‑off |
|---|---|---|
| hybridAlpha | Higher = more semantic weight | Too high reduces keyword recall |
| topK | More chunks = broader context | Larger responses, slower |
| multiQuery | Recall across paraphrases | Extra model + embedding cost |
| qVariants | Alternative queries for expansion | Diminishing returns above 5 |
| maxContextChars | Caps stitched context size | Too small omits evidence |
Tip: For exploratory QA try topK=8, qVariants=5.

Troubleshooting & debugging

  • Enable internal logging (if available) to inspect scoring.
  • Inspect vectors: open .data/pdf-vectors.json.
  • Manifest corrupted? Delete .data/manifest.json and re‑upload.
  • Low lexical relevance? Lower hybridAlpha (e.g. 0.55).
  • Noise / irrelevant chunks? Reduce topK or lower qVariants.

Hardening & roadmap

  • SSE/WebSocket answer token streaming to clients (UI consumption)
  • Source highlighting + per‑chunk confidence
  • Semantic / layout‑aware advanced chunking
  • Vector deduplication & compression
  • Auth layer (API keys / JWT) & per‑user isolation
  • Background ingestion queue for large docs
  • Retrieval quality regression tests

Repository Links
  • Source: GitHub Repository
  • Multi agent: pdf-agent.ts
  • Single agent: single-pdf-agent.ts
  • Tools: retrieve-pdf-context.ts, retrieve-single-pdf-context.ts
  • Workflow: pdf-workflow.ts
  • Server: server.ts