> Scrapes a website to discover and crawl its pages for the knowledge base. Use this endpoint to add website content as a knowledge source for agents. **Configuration:** Supports options for URL filtering and crawl depth control.

# Crawl Website

For the complete error reference, see [Error Guide](/rest-api/ai-agents-apis/error-codes).
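
As a rough illustration of the request shape, the sketch below builds (but does not send) a `POST` request from the server URL template, the `apikey` header, and the `ScrapePagesDto` fields defined in the OpenAPI section of this page. The helper name `build_crawl_request` and the client-side range checks are assumptions made for this example; they are not part of the CometChat API, and the server remains the authority on validation.

```python
import json
import urllib.request

# Regions listed in the server definition of the OpenAPI document.
REGIONS = {"us", "eu", "in"}
ENDPOINT_PATH = "/ai-agents/agent-builder/knowledge-base/website/scrape"


def build_crawl_request(app_id, region, api_key, url,
                        max_depth=3, max_pages=100,
                        include=None, exclude=None, fetch_sitemap=False):
    """Hypothetical helper: build a urllib Request for the Crawl Website endpoint."""
    if region not in REGIONS:
        raise ValueError(f"region must be one of {sorted(REGIONS)}")
    # Bounds mirror the schema: maxDepth 1..10, maxPages 1..10000.
    if not 1 <= max_depth <= 10:
        raise ValueError("max_depth must be between 1 and 10")
    if not 1 <= max_pages <= 10000:
        raise ValueError("max_pages must be between 1 and 10000")

    body = {
        "url": url,  # the only required field
        "maxDepth": max_depth,
        "maxPages": max_pages,
        "fetchSitemap": fetch_sitemap,
    }
    # Optional substring filters are only sent when provided.
    if include:
        body["include"] = list(include)
    if exclude:
        body["exclude"] = list(exclude)

    endpoint = f"https://{app_id}.api-{region}.cometchat.io/v3{ENDPOINT_PATH}"
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"apikey": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the JSON body; per the documented `200` response, the result carries `success`, `message`, and a `data` object with fields such as `crawlId` and `status`.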


## OpenAPI

````yaml post /ai-agents/agent-builder/knowledge-base/website/scrape
openapi: 3.0.0
info:
  title: AI Agents APIs
  description: API reference for CometChat AI Agents service
  version: '1.0'
servers:
  - url: https://{appId}.api-{region}.cometchat.io/v3
    variables:
      appId:
        default: appId
        description: (Required) App ID
      region:
        enum:
          - us
          - eu
          - in
        default: us
        description: Select Region
security: []
tags:
  - name: ai-agent
    description: ''
paths:
  /ai-agents/agent-builder/knowledge-base/website/scrape:
    post:
      tags:
        - crawl-web-pages
      summary: Crawl Website
      description: >-
        Scrapes a website to discover and crawl its pages for the knowledge
        base. Use this endpoint to add website content as a knowledge source for
        agents.


        **Configuration:** Supports options for URL filtering and crawl depth
        control.
      operationId: CrawlWebPagesController_crawlWebsite
      parameters: []
      requestBody:
        required: true
        description: Website crawling configuration
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ScrapePagesDto'
            examples:
              basic:
                summary: Website sitemap scrape
                description: Scrape website and fetch sitemap with basic configuration
                value:
                  url: https://example.com
                  maxDepth: 3
                  maxPages: 5
                  fetchSitemap: true
      responses:
        '200':
          description: Website scraped successfully
          content:
            application/json:
              schema:
                type: object
                properties:
                  success:
                    type: boolean
                    example: true
                  message:
                    type: string
                    example: Website scrape completed successfully
                  data:
                    type: object
                    properties:
                      crawlId:
                        type: string
                        example: crawl-1701789123456
                      crawlDuration:
                        type: number
                        example: 120000
                      status:
                        type: string
                        example: completed
                      sitemap:
                        type: object
                        properties:
                          found:
                            type: boolean
                            example: true
                          url:
                            type: string
                            example: https://docs.example.com/sitemap.xml
                          totalUrls:
                            type: number
                            example: 245
                          urls:
                            type: array
                            items:
                              type: string
                          lastModified:
                            type: string
                            example: '2025-12-05T10:30:00Z'
              example:
                success: true
                message: Website scrape completed successfully
                data:
                  crawlId: crawl-1701789123456
                  crawlDuration: 120000
                  status: completed
                  sitemap:
                    found: true
                    url: https://docs.example.com/sitemap.xml
                    totalUrls: 245
                    urls:
                      - https://docs.example.com/page1
                      - https://docs.example.com/page2
                    lastModified: '2025-12-05T10:30:00Z'
      security:
        - apiKey: []
components:
  schemas:
    ScrapePagesDto:
      type: object
      properties:
        url:
          type: string
          description: Target website URL to crawl
          example: https://docs.example.com
        maxDepth:
          type: number
          description: Maximum depth to crawl from the starting URL
          minimum: 1
          maximum: 10
          default: 3
          example: 5
        maxPages:
          type: number
          description: Maximum number of pages to crawl
          minimum: 1
          maximum: 10000
          default: 100
          example: 500
        include:
          description: URL patterns to include in crawling (substring matching)
          example:
            - docs/
            - api/
            - guides/
          type: array
          items:
            type: string
        exclude:
          description: URL patterns to exclude from crawling (substring matching)
          example:
            - login
            - signup
            - admin
            - privacy
          type: array
          items:
            type: string
        fetchSitemap:
          type: boolean
          description: Fetch and return sitemap URLs from the website
          default: false
          example: true
      required:
        - url
  securitySchemes:
    apiKey:
      type: apiKey
      description: REST API key from the CometChat Dashboard, sent in the `apikey` header.
      name: apikey
      in: header

````
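
The schema describes `include` and `exclude` as substring matching. The sketch below shows one way a client might preview which URLs a pair of pattern lists would select before starting a crawl. It rests on two assumptions that the schema does not confirm: an empty `include` list admits every URL, and an `exclude` match takes precedence over an `include` match. The helper `url_passes_filters` is hypothetical and only approximates whatever the service does internally.

```python
from typing import Iterable, Optional


def url_passes_filters(url: str,
                       include: Optional[Iterable[str]] = None,
                       exclude: Optional[Iterable[str]] = None) -> bool:
    """Hypothetical preview of substring-based include/exclude filtering."""
    include_patterns = list(include or [])
    exclude_patterns = list(exclude or [])

    # Assumption: any exclude substring rejects the URL outright.
    if any(pattern in url for pattern in exclude_patterns):
        return False
    # Assumption: with include patterns present, at least one must match.
    if include_patterns and not any(pattern in url for pattern in include_patterns):
        return False
    return True


# Example patterns taken from the schema's own examples.
include = ["docs/", "api/", "guides/"]
exclude = ["login", "signup", "admin", "privacy"]

candidates = [
    "https://docs.example.com/docs/getting-started",
    "https://docs.example.com/docs/admin/settings",
    "https://docs.example.com/blog/release-notes",
]
selected = [u for u in candidates if url_passes_filters(u, include, exclude)]
```

Because the match is on plain substrings, a pattern such as `admin` also rejects URLs like `/docs/administration`; narrower patterns (for example `/admin/`) may be a better fit when that is not intended.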