CometChat on-premise delivers an enterprise-grade, self-hosted real-time messaging platform engineered for mission-critical applications requiring complete data sovereignty, regulatory compliance, and predictable performance at scale. Built on battle-tested open-source technologies and cloud-native principles, this deployment architecture supports workloads from 10,000 to 250,000+ monthly active users with linear scalability and sub-100ms message latency.

This guide covers Docker Swarm deployments. For Kubernetes deployments (recommended for 200k+ MAU or multi-region requirements), see the Kubernetes deployment guide or contact us for enterprise support.

Enterprise Value Proposition
  • Data Sovereignty: Complete control over data residency, encryption, and access policies to meet GDPR, HIPAA, SOC 2, and industry-specific compliance requirements
  • Predictable Economics: Eliminate per-user SaaS pricing with fixed infrastructure costs and transparent capacity planning
  • Operational Control: Full visibility into system behavior, customizable monitoring, and direct access to all components for troubleshooting and optimization
  • Security Posture: Deploy within your existing security perimeter with private networks, custom authentication integration, and air-gapped deployment options
  • Performance Guarantees: Achieve consistent sub-100ms latency with dedicated resources and optimized data locality

Who this guide is for

This comprehensive deployment guide is designed for technical decision-makers and implementation teams responsible for enterprise infrastructure:
  • DevOps & SRE Teams: Operations professionals managing production uptime, incident response, capacity planning, and continuous deployment pipelines
  • Platform & Backend Engineers: Technical leads architecting scalable systems, optimizing performance, and integrating real-time messaging into existing application ecosystems
  • Infrastructure Architects: Strategic planners designing multi-region deployments, disaster recovery strategies, compliance frameworks, and long-term scalability roadmaps
  • Security & Compliance Officers: Stakeholders ensuring data protection, access controls, audit logging, and regulatory adherence across all system components

Platform capabilities

CometChat on-premise provides a comprehensive suite of real-time communication features designed for enterprise applications:
Core Messaging Infrastructure
  • 1:1 and Group Conversations: Scalable messaging architecture supporting unlimited conversation threads with persistent message history, rich media attachments, and message threading
  • Real-time Event Streaming: WebSocket-based bi-directional communication delivering instant presence updates, typing indicators, delivery receipts, read receipts, and custom event propagation
  • Message Delivery Guarantees: Effectively-once message delivery semantics using idempotent producers, automatic retry logic, offline message queuing, and synchronization across multiple devices
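
The effectively-once behavior described above can be pictured with a minimal sketch, assuming each message carries a unique ID assigned by the sender so that a retried delivery can be recognized and dropped. The class and field names here are hypothetical and not part of CometChat's API, and a real deployment would keep the seen-ID set in a shared store rather than in process memory.

```python
from dataclasses import dataclass, field


@dataclass
class DedupingInbox:
    """Applies each message at most once, even if it is delivered repeatedly."""

    # IDs already applied; hypothetical in-memory stand-in for a shared store.
    seen_ids: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def deliver(self, message_id: str, body: str) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False  # a retry of something already processed
        self.seen_ids.add(message_id)
        self.applied.append(body)
        return True


inbox = DedupingInbox()
results = [
    inbox.deliver("m-1", "hello"),
    inbox.deliver("m-1", "hello"),  # retry after a lost acknowledgement
    inbox.deliver("m-2", "world"),
]
```

The sender may retry as often as it needs to (at-least-once), and the receiver's ID check turns that into a single visible effect per message.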
Enterprise Features
  • Distributed Event Pipeline: Apache Kafka-powered event backbone enabling decoupled microservices architecture, guaranteed message ordering, and fault-tolerant event processing at scale
  • Push Notifications: Multi-provider notification delivery supporting Firebase Cloud Messaging (FCM), Apple Push Notification Service (APNs), and custom webhook integrations with intelligent batching and delivery optimization
  • Content Moderation: Configurable policy engine with real-time profanity filtering, spam detection, image moderation, and extensible AI/ML adapter framework for custom moderation workflows
  • Webhooks & Integrations: Reliable outbound event delivery system with configurable retry policies, secure authentication, and comprehensive audit trails for third-party system integration
API & Developer Experience
  • RESTful APIs: Comprehensive HTTP APIs for user management, conversation operations, group administration, metadata queries, and administrative functions with OpenAPI documentation
  • Horizontal Scalability: Stateless service design enabling linear scaling by adding compute resources without architectural changes or data migration
  • Multi-tenancy Support: Logical isolation of tenant data with tenant-specific configurations, rate limits, and resource quotas

Data architecture & storage

The platform employs a polyglot persistence strategy with multiple storage technologies optimized for specific data access patterns and consistency requirements:
Primary Data Stores
  • TiDB Cluster: Horizontally scalable, MySQL-compatible distributed SQL database providing ACID transactions, automatic sharding, and multi-region replication. Composed of three components:
    • Placement Driver (PD): Cluster metadata management and intelligent data placement
    • TiKV: Distributed key-value storage engine with Raft consensus for strong consistency
    • TiDB SQL Layer: MySQL-compatible query interface with distributed transaction coordination
  • MongoDB: Document-oriented database optimized for flexible schema evolution and semi-structured data with native JSON support and rich query capabilities
  • Redis Clusters: Dedicated in-memory data structure stores providing sub-millisecond latency for high-frequency operations:
    • Cache Cluster: Application-level caching, query result caching, and frequently accessed data
    • Session & Rate Limiting Cluster: User session management, authentication tokens, and distributed rate limiting counters
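
To illustrate what a rate limiting counter does, here is a minimal fixed-window sketch. It is an assumption-laden illustration, not CometChat's implementation: the class name, the fixed-window policy, and the in-memory dictionary (standing in for counters held in Redis) are all hypothetical. The clock is passed in explicitly so the behavior is easy to follow.

```python
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allows at most `limit` requests per client in each `window_seconds` window."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # (client_id, window_index) -> request count; a shared store such as
        # Redis would hold these counters in a multi-node deployment.
        self.counters = defaultdict(int)

    def allow(self, client_id: str, now: float) -> bool:
        """Count one request at time `now` (seconds) and report whether it is allowed."""
        window_index = int(now // self.window_seconds)
        key = (client_id, window_index)
        if self.counters[key] >= self.limit:
            return False
        self.counters[key] += 1
        return True


limiter = FixedWindowRateLimiter(limit=2, window_seconds=60)
decisions = [
    limiter.allow("user-a", now=0.0),
    limiter.allow("user-a", now=10.0),
    limiter.allow("user-a", now=20.0),  # third request in the same window
    limiter.allow("user-a", now=61.0),  # next window, counter starts over
]
```

Keeping the counters in a store shared by every service instance is what makes the limit apply across the cluster rather than per node.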
Event Streaming Platform
  • Apache Kafka: Distributed commit log serving as the central event backbone for asynchronous communication between microservices
    • Guarantees: Sub-second, real-time message delivery with no message loss, configurable retention policies, and horizontal scalability
Optional Storage Systems
  • Object Storage: S3-compatible storage (Amazon S3, MinIO, Ceph, or Google Cloud Storage) for unstructured data and large binary objects
    • Features: Lifecycle policies, versioning, encryption at rest, and cost-optimized storage tiers
Data Durability & Backup
  • Automated backup strategies with point-in-time recovery capabilities
  • Configurable retention policies aligned with compliance requirements

Deployment models

CometChat on-premise supports multiple deployment architectures to match your operational maturity, scale requirements, and infrastructure preferences:

Docker Swarm (Recommended: 10k-200k MAU)

Target Environment: Production deployments up to ~200,000 monthly active users and ~20,000 peak concurrent connections
Characteristics:
  • Lightweight orchestration with native Docker integration and minimal operational overhead
  • Predictable service placement with node constraints and resource reservations
  • Secure overlay networking with encrypted service-to-service communication
  • Rolling updates with configurable health checks and automatic rollback capabilities
  • Built-in load balancing and service discovery without external dependencies
Enterprise Benefits:
  • Lower operational complexity compared to Kubernetes while maintaining production-grade reliability
  • Faster deployment cycles with straightforward configuration management
  • Reduced infrastructure costs with efficient resource utilization
  • Proven architecture supporting hundreds of production deployments
Recommended For: Mid-market enterprises, SaaS platforms, healthcare applications, financial services, and organizations prioritizing operational simplicity
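
The stated targets of ~200,000 MAU and ~20,000 peak concurrent connections imply a peak concurrency of roughly 10% of MAU. The sketch below turns that into a rough sizing calculation. It assumes the ratio holds at other sizes, which the guide does not claim, and the per-node connection capacity is a hypothetical figure chosen only for illustration.

```python
import math
from fractions import Fraction

# Ratio implied by the stated targets: ~20,000 peak connections at ~200,000 MAU.
PEAK_CONCURRENCY_RATIO = Fraction(20_000, 200_000)


def estimate_peak_connections(mau: int, ratio: Fraction = PEAK_CONCURRENCY_RATIO) -> int:
    """Estimate peak concurrent connections, assuming the ratio scales linearly."""
    return math.ceil(mau * ratio)


def gateway_nodes_needed(peak_connections: int, connections_per_node: int) -> int:
    """Nodes required for a given peak; connections_per_node is a hypothetical figure."""
    return math.ceil(peak_connections / connections_per_node)


peak = estimate_peak_connections(50_000)
nodes = gateway_nodes_needed(peak, connections_per_node=2_000)
```

Measured concurrency from your own traffic should replace the implied ratio as soon as it is available.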

Kubernetes (Enterprise & Multi-Region)

For large-scale deployments exceeding 200,000 MAU, multi-region architectures, or advanced orchestration requirements, see the Kubernetes deployment guide.

High-level architecture

The CometChat on-premise platform employs a modern, microservices-based architecture designed for enterprise-grade reliability, security, and performance. The system is built on proven open-source technologies and follows cloud-native principles to ensure operational excellence at scale.

Figure: CometChat On-premise Architecture

Architecture Components

Client Layer
  • Desktop Clients: Native desktop applications and web browsers accessing the platform via HTTPS and WebSocket protocols
  • Mobile Apps: iOS and Android applications with persistent connections for real-time messaging and push notification support
  • Cloud Services: Third-party integrations, webhook consumers, and external systems interfacing with the platform APIs
Load Balancer
  • Enterprise-grade traffic distribution layer providing high availability, SSL/TLS termination, health checking, and automatic failover across Docker Swarm nodes
  • Supports session affinity for WebSocket connections and intelligent routing based on service health metrics
Docker Swarm Cluster
The core platform runs within a Docker Swarm orchestration environment, providing service discovery, load balancing, and automated container management.
Backend Services
NGINX Reverse Proxy
  • TLS/SSL termination with configurable cipher suites and certificate management
  • HTTP/2 and WebSocket protocol support with automatic upgrade handling
  • Request routing to microservices based on URL patterns and headers
  • Rate limiting, request buffering, and connection pooling for optimal performance
Microservices Layer
  • WebSocket Gateway: Maintains persistent bi-directional connections for real-time event delivery, presence management, typing indicators, and instant message routing with automatic reconnection and session recovery
  • Chat API Service: RESTful API handling message CRUD operations, conversation management, user operations, group administration, and metadata queries with transaction support
  • Moderation Service: Content filtering engine with configurable policies, profanity detection, spam prevention, image moderation, and AI/ML integration for advanced threat detection
  • Notifications Service: Asynchronous push notification dispatcher supporting FCM, APNs, and custom providers with intelligent batching, retry logic, and delivery tracking
  • Webhooks Service: Outbound event delivery system with configurable retry policies, exponential backoff, secure authentication, and comprehensive audit logging
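
The retry-with-exponential-backoff behavior mentioned for the Webhooks Service can be sketched as follows. This is a generic illustration under stated assumptions, not the service's actual policy: the function names, the base delay, the growth factor, and the cap are all hypothetical, and `send` and `sleep` are injected so the example runs without a network or real waiting.

```python
def backoff_schedule(base_seconds: float, factor: float, max_seconds: float, attempts: int) -> list:
    """Return the wait before each retry: base * factor**n, capped at max_seconds."""
    return [min(base_seconds * factor ** n, max_seconds) for n in range(attempts)]


def deliver_with_retry(send, delays, sleep):
    """Call send() until it returns True or retries run out.

    Returns the number of attempts used, or None if every attempt failed.
    The first attempt happens immediately (a wait of 0.0 seconds).
    """
    for attempt, delay in enumerate([0.0] + list(delays), start=1):
        sleep(delay)
        if send():
            return attempt
    return None


delays = backoff_schedule(base_seconds=1.0, factor=2.0, max_seconds=30.0, attempts=6)

# Simulated endpoint that fails twice and then accepts the event.
outcomes = iter([False, False, True])
waits = []
attempts_used = deliver_with_retry(lambda: next(outcomes), delays, waits.append)
```

Production retry loops commonly add random jitter to each delay so that many failed deliveries do not retry at the same instant.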
Kafka Event Bus
  • Distributed event streaming platform serving as the central message backbone for inter-service communication
  • Provides guaranteed message ordering, fault-tolerant persistence, and horizontal scalability
  • Enables decoupled microservices architecture with publish-subscribe and event sourcing patterns
  • Handles real-time message routing, event notifications, and asynchronous processing pipelines
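
Kafka preserves record order within a partition rather than across a whole topic, so ordering for a stream of related events depends on sending them with the same key. The sketch below shows the idea of key-based partitioning. It assumes the conversation ID is used as the key, which this guide does not state, and it uses SHA-256 as a stand-in for the hash function a Kafka client would apply; `partition_for` is a hypothetical helper, not a Kafka API.

```python
import hashlib


def partition_for(conversation_id: str, partition_count: int) -> int:
    """Map a key to a partition deterministically, so one conversation stays on one partition."""
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


events = [("conv-42", "msg-1"), ("conv-7", "msg-1"), ("conv-42", "msg-2")]
routed = [(partition_for(conv, 12), conv, msg) for conv, msg in events]
```

Both `conv-42` events land on the same partition and are therefore consumed in the order they were produced, while unrelated conversations can spread across partitions for parallelism.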
Data Store Components
TiDB Cluster (Distributed SQL Database)
  • Placement Driver (PD): Cluster metadata management, timestamp allocation, and intelligent data placement decisions
  • TiKV: Distributed transactional key-value storage engine with Raft consensus protocol ensuring strong consistency and automatic data replication
  • TiDB SQL Layer: MySQL-compatible query interface with distributed transaction coordination, supporting ACID guarantees and horizontal scalability
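
Raft commits a write once a majority of the replicas in a group have acknowledged it, which is why replica count determines how many failures a group can absorb. The arithmetic is sketched below. The replica counts are illustrative only; this guide does not state how many replicas the TiKV deployment is configured with.

```python
def quorum_size(replicas: int) -> int:
    """Smallest majority of a replica group."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority remains."""
    return replicas - quorum_size(replicas)


# replica count -> (quorum size, failures tolerated)
summary = {n: (quorum_size(n), tolerated_failures(n)) for n in (3, 4, 5)}
```

Note that going from three replicas to four raises the quorum without tolerating any additional failure, which is why odd replica counts are the usual choice.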
Redis Clusters
  • Dedicated in-memory data structure stores providing sub-millisecond latency
  • Cache Cluster: Application-level caching, query result caching, and frequently accessed data
  • Session & Rate Limiting Cluster: User session management, authentication tokens, and distributed rate limiting counters
MongoDB
  • Document-oriented database for flexible schema requirements
  • Stores moderation policies, user preferences, custom metadata, webhook configurations, and audit logs
  • Provides native JSON support and rich query capabilities for semi-structured data
Frontend Service
  • Frontend Application: Web-based administrative dashboard and user interface components
Monitoring Stack
  • Prometheus: Time-series metrics collection system scraping service endpoints, storing performance data, and triggering alerts based on configurable thresholds
  • Grafana: Visualization platform providing real-time operational dashboards, SLA monitoring, capacity planning insights, and customizable alerting workflows
  • Loki & Promtail: Centralized log aggregation and querying infrastructure enabling rapid troubleshooting and audit trail analysis
  • Node Exporter & cAdvisor: Host and container-level metrics collection for infrastructure monitoring and capacity planning
Infrastructure Layer
Host Infrastructure
  • Physical or virtual compute resources running Docker Swarm nodes
  • Persistent storage volumes for stateful services (databases, Kafka, logs)
  • Resource allocation and isolation using Docker resource constraints
Private Network
  • Secure overlay network isolating backend services from external access
  • Encrypted service-to-service communication using Docker Swarm’s built-in encryption
  • Network segmentation separating public-facing services from data stores
  • Optimized routing paths minimizing inter-service latency and maximizing throughput