Quick links
- Live (UI served from `docs/` if deployed to GitHub Pages): static upload + ask demo
- Source code: GitHub repository
- Agent files: `src/mastra/agents/*`
- Tools: `src/mastra/tools/*`
- Workflow: `src/mastra/workflows/pdf-workflow.ts`
- Server: `src/server.ts`
What you’ll build
- Dual Mastra agents:
  - `pdfAgent` (multi‑PDF retrieval across all uploaded docs)
  - `singlePdfAgent` (restricted to latest uploaded PDF)
- Deterministic retrieval + answer workflow (`pdfWorkflow`)
- Express API with upload, ask (JSON), and streaming (SSE) endpoints
- Persisted hybrid vector + BM25 store (`.data/`, namespace `pdf`)
- Integration with CometChat AI Agents (endpoint wiring)
How it works
- Users upload PDF(s) → server parses pages → chunks + embeddings generated → persisted in vector + manifest store.
- Hybrid retrieval ranks candidate chunks: `score = α * cosine + (1-α) * sigmoid(BM25)`, where `α = hybridAlpha`.
- Multi‑query expansion (optional) generates paraphrased variants to boost recall.
- Tools (`retrieve-pdf-context`, `retrieve-single-pdf-context`) return stitched context + source metadata.
- Workflow (`pdfWorkflow`) orchestrates retrieval + answer; streaming endpoints emit `meta` then incremental `token` events.
events. - Mastra agent(s) are exposed via REST endpoints you wire into CometChat AI Agents.
Repo layout (key files)
- `src/mastra/agents/pdf-agent.ts` – multi‑PDF agent
- `src/mastra/agents/single-pdf-agent.ts` – single latest PDF agent
- `src/mastra/tools/retrieve-pdf-context.ts` / `retrieve-single-pdf-context.ts` – hybrid retrieval tools
- `src/mastra/workflows/pdf-workflow.ts` – deterministic orchestration
- `src/lib/*` – vector store, embeddings, manifest, PDF parsing
- `src/server.ts` – Express API (upload, ask, streaming, manifest ops)
- `docs/index.html` – optional static UI
- `.data/` – persisted vectors + manifest JSON
Prerequisites
- Node.js 20+
- `OPENAI_API_KEY` (embeddings + chat model)
- A CometChat app (to register the agent)
- (Optional) `CORS_ORIGIN` if restricting browser origins
Step 1 — Clone & install
Clone the example and install dependencies, then create a `.env` with at least `OPENAI_API_KEY` (see Environment variables).
Step 2 — Define retrieval tools
Files: `src/mastra/tools/retrieve-pdf-context.ts` (multi) and `src/mastra/tools/retrieve-single-pdf-context.ts` (single)
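The retrieval tools return stitched context plus source metadata. The sketch below shows one way that stitching step could work, assuming chunks have already been scored. The `ScoredChunk` and `StitchedContext` shapes and the `stitchContext` function are hypothetical and may not match the repository's actual schema; only `topK` and `maxContextChars` are names taken from this guide.

```typescript
// Hypothetical shapes: the real tool's input/output schema may differ.
interface ScoredChunk {
  docId: string;
  page: number;
  text: string;
  score: number;
}

interface StitchedContext {
  context: string;
  sources: { docId: string; page: number; score: number }[];
}

/**
 * Takes the topK highest-scoring chunks and concatenates them in score order,
 * stopping before maxContextChars would be exceeded.
 * Each included chunk contributes one entry to `sources`.
 */
function stitchContext(
  chunks: ScoredChunk[],
  topK: number,
  maxContextChars: number
): StitchedContext {
  const ranked = chunks.slice().sort((a, b) => b.score - a.score).slice(0, topK);
  const parts: string[] = [];
  const sources: StitchedContext["sources"] = [];
  let used = 0;
  for (const chunk of ranked) {
    const separator = parts.length > 0 ? 2 : 0; // length of the "\n\n" joiner
    if (used + separator + chunk.text.length > maxContextChars) break;
    parts.push(chunk.text);
    sources.push({ docId: chunk.docId, page: chunk.page, score: chunk.score });
    used += separator + chunk.text.length;
  }
  return { context: parts.join("\n\n"), sources };
}
```

For the single-PDF tool, the same function could be applied after filtering `chunks` down to the latest uploaded `docId`.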
Step 3 — Create agents
Files: `src/mastra/agents/pdf-agent.ts`, `src/mastra/agents/single-pdf-agent.ts`
Step 4 — Wire Mastra & workflow
File: `src/mastra/index.ts` registers the agents and `pdfWorkflow` with storage (LibSQL or file‑backed).
Step 5 — Run locally
Step 6 — Upload and ask (API)
Agent | File | Purpose | Tool |
---|---|---|---|
pdfAgent | src/mastra/agents/pdf-agent.ts | Multi‑PDF retrieval QA | retrieve-pdf-context |
singlePdfAgent | src/mastra/agents/single-pdf-agent.ts | Latest single PDF QA | retrieve-single-pdf-context |
Widen retrieval (higher `topK`, more `qVariants`) if initial context is sparse.
JSON ask (Mastra dev agent route)
Mastra automatically exposes the API:
Step 7 — Deploy & connect to CometChat
- Deploy the project (e.g., Vercel, Railway, or AWS).
- Copy the deployed endpoint URL.
- In CometChat Dashboard → AI Agents, add a new agent:
  - Agent ID: `knowledge`
  - Endpoint: `https://your-deployed-url/api/agents/knowledge/generate`
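Before registering the endpoint in CometChat, it can help to call it directly. The sketch below builds such a request. The helper `buildGenerateRequest` is hypothetical, and the body shape (`messages` as an array of role/content objects) is an assumption about the Mastra generate route; check it against your deployed version.

```typescript
// Hypothetical helper; the request body shape is an assumption, not confirmed by this guide.
interface GenerateRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

/** Builds a POST request for <baseUrl>/api/agents/<agentId>/generate. */
function buildGenerateRequest(
  baseUrl: string,
  agentId: string,
  question: string
): GenerateRequest {
  const trimmedBase = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${trimmedBase}/api/agents/${encodeURIComponent(agentId)}/generate`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages: [{ role: "user", content: question }] }),
  };
}

// Usage (needs a running deployment, so it is left commented out):
// const req = buildGenerateRequest("https://your-deployed-url", "knowledge", "Summarize the PDF");
// const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
// console.log(await res.json());
```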
Step 8 — Optimize & extend
- Add more documents to the `docs/` folder.
- Use embeddings + vector DB (Pinecone, Weaviate) for large datasets.
- Extend the agent with memory or multi-tool workflows.
Environment variables
Name | Description |
---|---|
OPENAI_API_KEY | Required for embeddings + model |
CORS_ORIGIN | Optional allowed browser origin |
PORT | Server port (default 3000) |
`.env` example:
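As a rough sketch, the variables in the table could be read and validated at startup as follows. The leading comment shows placeholder `.env` contents (the values are invented for illustration), and `loadConfig` and `ServerConfig` are hypothetical names, not part of the repository.

```typescript
// Example .env contents (placeholder values, not real credentials):
//   OPENAI_API_KEY=sk-your-key
//   CORS_ORIGIN=http://localhost:5173
//   PORT=3000

// Hypothetical config shape for illustration.
interface ServerConfig {
  openAiApiKey: string;
  corsOrigin?: string;
  port: number;
}

/**
 * Validates an environment map (for example, process.env in Node.js).
 * OPENAI_API_KEY is required, CORS_ORIGIN is optional, PORT defaults to 3000.
 */
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const openAiApiKey = env["OPENAI_API_KEY"];
  if (!openAiApiKey) throw new Error("OPENAI_API_KEY is required");

  const rawPort = env["PORT"];
  const port = rawPort ? Number(rawPort) : 3000;
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }

  const config: ServerConfig = { openAiApiKey, port };
  const corsOrigin = env["CORS_ORIGIN"];
  if (corsOrigin) config.corsOrigin = corsOrigin;
  return config;
}
```

Failing fast on a missing key surfaces configuration mistakes at boot rather than on the first upload.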
Endpoint summary
Method | Path | Description |
---|---|---|
POST | /api/upload | Upload a PDF (multipart); returns `{ docId, pages, chunks }` |
GET | /api/documents | List ingested documents |
DELETE | /api/documents/:id | Delete a document + vectors |
GET | /api/documents/:id/file | Stream stored original PDF |
POST | /api/ask | Multi‑PDF retrieval + answer (JSON) |
POST | /api/ask/full | Same as /api/ask (deterministic path) |
POST | /api/ask/stream | Multi‑PDF streaming (SSE) |
POST | /api/ask/single | Single latest PDF answer (JSON) |
POST | /api/ask/single/stream | Single latest PDF streaming (SSE) |
Curl Examples
Upload:
SSE Events
Event | Payload | Notes |
---|---|---|
meta | { sources, docId? } | First packet with retrieval metadata |
token | { token } | Incremental answer token chunk |
done | {} | Completion marker |
error | { error } | Error occurred |
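A client consuming these events might parse them roughly as follows. This is a sketch under two assumptions: the server uses the standard SSE wire format (`event:` and `data:` lines, blank line between events) with JSON payloads, and the input contains only complete event blocks. The names `parseSseChunk` and `collectAnswer` are hypothetical.

```typescript
interface SseEvent {
  event: string;
  data: unknown;
}

/** Parses text made of complete SSE blocks into named events with JSON-decoded data. */
function parseSseChunk(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no "event:" field is present
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length === 0) continue; // skip empty blocks and comments
    events.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }
  return events;
}

/** Concatenates token payloads until `done`; throws if an `error` event appears first. */
function collectAnswer(events: SseEvent[]): string {
  let answer = "";
  for (const e of events) {
    if (e.event === "error") {
      throw new Error(String((e.data as { error?: unknown }).error));
    }
    if (e.event === "token") answer += (e.data as { token: string }).token;
    if (e.event === "done") break;
  }
  return answer;
}
```

A real client reading from a network stream would also need to buffer partial blocks between reads, which this sketch leaves out.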
Tuning & retrieval knobs
Parameter | Effect | Trade‑off |
---|---|---|
hybridAlpha | Higher = more semantic weight | Too high reduces keyword recall |
topK | More chunks = broader context | Larger responses, slower |
multiQuery | Recall across paraphrases | Extra model + embedding cost |
qVariants | Alternative queries for expansion | Diminishing returns >5 |
maxContextChars | Caps stitched context size | Too small omits evidence |
`topK=8`, `qVariants=5`.
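To illustrate how `multiQuery` and `qVariants` interact with `topK`, the sketch below merges hits from several query variants. Keeping each chunk's best score across variants is an assumption made for this example; the repository may merge differently (reciprocal rank fusion is one alternative). `VariantHit` and `mergeVariantHits` are hypothetical names.

```typescript
interface VariantHit {
  chunkId: string;
  score: number;
}

/**
 * Merges per-variant result lists: each chunk keeps its best score across
 * all variants, then the merged list is sorted by score and cut to topK.
 */
function mergeVariantHits(resultsPerVariant: VariantHit[][], topK: number): VariantHit[] {
  const best = new Map<string, number>();
  for (const hits of resultsPerVariant) {
    for (const hit of hits) {
      const previous = best.get(hit.chunkId);
      if (previous === undefined || hit.score > previous) {
        best.set(hit.chunkId, hit.score);
      }
    }
  }
  const merged: VariantHit[] = [];
  best.forEach((score, chunkId) => merged.push({ chunkId, score }));
  return merged.sort((a, b) => b.score - a.score).slice(0, topK);
}
```

Each extra variant costs one more embedding and retrieval pass, which is consistent with the diminishing returns noted in the table.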
Troubleshooting & debugging
- Enable internal logging (if available) to inspect scoring.
- Inspect vectors: open `.data/pdf-vectors.json`.
- Manifest corrupted? Delete `.data/manifest.json` and re‑upload.
- Low lexical relevance? Lower `hybridAlpha` (e.g. 0.55).
- Noise / irrelevant chunks? Reduce `topK` or lower `qVariants`.
Hardening & roadmap
- SSE/WebSocket answer token streaming to clients (UI consumption)
- Source highlighting + per‑chunk confidence
- Semantic / layout‑aware advanced chunking
- Vector deduplication & compression
- Auth layer (API keys / JWT) & per‑user isolation
- Background ingestion queue for large docs
- Retrieval quality regression tests
Repository Links
- Source: GitHub Repository
- Multi agent: `pdf-agent.ts`
- Single agent: `single-pdf-agent.ts`
- Tools: `retrieve-pdf-context.ts`, `retrieve-single-pdf-context.ts`
- Workflow: `pdf-workflow.ts`
- Server: `server.ts`