- Data Sovereignty: Complete control over data residency, encryption, and access policies to meet GDPR, HIPAA, SOC 2, and industry-specific compliance requirements
- Predictable Economics: Eliminate per-user SaaS pricing with fixed infrastructure costs and transparent capacity planning
- Operational Control: Full visibility into system behavior, customizable monitoring, and direct access to all components for troubleshooting and optimization
- Security Posture: Deploy within your existing security perimeter with private networks, custom authentication integration, and air-gapped deployment options
- Performance Guarantees: Achieve consistent sub-100ms latency with dedicated resources and optimized data locality
Who this guide is for
This comprehensive deployment guide is designed for technical decision-makers and implementation teams responsible for enterprise infrastructure:
- DevOps & SRE Teams: Operations professionals managing production uptime, incident response, capacity planning, and continuous deployment pipelines
- Platform & Backend Engineers: Technical leads architecting scalable systems, optimizing performance, and integrating real-time messaging into existing application ecosystems
- Infrastructure Architects: Strategic planners designing multi-region deployments, disaster recovery strategies, compliance frameworks, and long-term scalability roadmaps
- Security & Compliance Officers: Stakeholders ensuring data protection, access controls, audit logging, and regulatory adherence across all system components
Platform capabilities
CometChat on-premise provides a comprehensive suite of real-time communication features designed for enterprise applications:
Core Messaging Infrastructure
- 1:1 and Group Conversations: Scalable messaging architecture supporting unlimited conversation threads with persistent message history, rich media attachments, and message threading
- Real-time Event Streaming: WebSocket-based bi-directional communication delivering instant presence updates, typing indicators, delivery receipts, read receipts, and custom event propagation
- Message Delivery Guarantees: Effectively-once message delivery semantics using idempotent producers, automatic retry logic, offline message queuing, and synchronization across multiple devices
- Distributed Event Pipeline: Apache Kafka-powered event backbone enabling decoupled microservices architecture, per-partition message ordering, and fault-tolerant event processing at scale
- Push Notifications: Multi-provider notification delivery supporting Firebase Cloud Messaging (FCM), Apple Push Notification Service (APNs), and custom webhook integrations with intelligent batching and delivery optimization
- Content Moderation: Configurable policy engine with real-time profanity filtering, spam detection, image moderation, and extensible AI/ML adapter framework for custom moderation workflows
- Webhooks & Integrations: Reliable outbound event delivery system with configurable retry policies, secure authentication, and comprehensive audit trails for third-party system integration
- RESTful APIs: Comprehensive HTTP APIs for user management, conversation operations, group administration, metadata queries, and administrative functions with OpenAPI documentation
- Horizontal Scalability: Stateless service design enabling linear scaling by adding compute resources without architectural changes or data migration
- Multi-tenancy Support: Logical isolation of tenant data with tenant-specific configurations, rate limits, and resource quotas
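The "effectively-once" delivery described in this list is commonly achieved by pairing at-least-once delivery (retries) with idempotent processing on the receiving side. The following is a minimal sketch of that idea, not CometChat's implementation: the `DedupingInbox` class and the message IDs are hypothetical, and a real deployment would keep the set of seen IDs in a shared, durable store rather than in process memory.

```python
from dataclasses import dataclass, field


@dataclass
class DedupingInbox:
    """Applies each message at most once, even when it is delivered repeatedly."""

    seen_ids: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def deliver(self, message_id: str, body: str) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False  # a retry of a message that was already processed
        self.seen_ids.add(message_id)
        self.applied.append(body)
        return True


inbox = DedupingInbox()
inbox.deliver("msg-1", "hello")
inbox.deliver("msg-1", "hello")  # simulated retry after a lost acknowledgement
inbox.deliver("msg-2", "world")
print(inbox.applied)  # ['hello', 'world']
```

Because duplicates are dropped by ID, the sender can retry freely without the recipient seeing a message twice.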
Data architecture & storage
The platform employs a polyglot persistence strategy with multiple storage technologies optimized for specific data access patterns and consistency requirements:
Primary Data Stores
- TiDB Cluster: Horizontally scalable, MySQL-compatible distributed SQL database providing ACID transactions, automatic sharding, and multi-region replication. Composed of three components:
  - Placement Driver (PD): Cluster metadata management and intelligent data placement
  - TiKV: Distributed key-value storage engine with Raft consensus for strong consistency
  - TiDB SQL Layer: MySQL-compatible query interface with distributed transaction coordination
- MongoDB: Document-oriented database optimized for flexible schema evolution and semi-structured data with native JSON support and rich query capabilities
- Redis Clusters: Dedicated in-memory data structure stores providing sub-millisecond latency for high-frequency operations:
  - Cache Cluster: Application-level caching, query result caching, and frequently accessed data
  - Session & Rate Limiting Cluster: User session management, authentication tokens, and distributed rate limiting counters
- Apache Kafka: Distributed commit log serving as the central event backbone for asynchronous communication between microservices
  - Guarantees: Durable, replicated event delivery with sub-second latency, no loss of acknowledged messages when topics are replicated and producers wait for full acknowledgement, configurable retention policies, and horizontal scalability
- Object Storage: S3-compatible storage (Amazon S3, MinIO, Ceph, or Google Cloud Storage) for unstructured data and large binary objects
  - Features: Lifecycle policies, versioning, encryption at rest, and cost-optimized storage tiers
- Automated backup strategies with point-in-time recovery capabilities
- Configurable retention policies aligned with compliance requirements
Deployment models
CometChat on-premise supports multiple deployment architectures to match your operational maturity, scale requirements, and infrastructure preferences:
Docker Swarm (Recommended: 10k-200k MAU)
Target Environment: Production deployments up to ~200,000 monthly active users and ~20,000 peak concurrent connections
Characteristics:
- Lightweight orchestration with native Docker integration and minimal operational overhead
- Predictable service placement with node constraints and resource reservations
- Secure overlay networking with encrypted service-to-service communication
- Rolling updates with configurable health checks and automatic rollback capabilities
- Built-in load balancing and service discovery without external dependencies
- Lower operational complexity compared to Kubernetes while maintaining production-grade reliability
- Faster deployment cycles with straightforward configuration management
- Reduced infrastructure costs with efficient resource utilization
- Proven architecture supporting hundreds of production deployments
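The sizing guidance for Docker Swarm can be turned into a simple pre-deployment check. The sketch below encodes the approximate limits quoted for Swarm (about 200,000 MAU and about 20,000 peak concurrent connections). The function name is hypothetical, and treating the connection limit or a multi-region requirement as a trigger for Kubernetes is an assumption for illustration, not an official sizing rule.

```python
SWARM_MAX_MAU = 200_000               # approximate limit quoted for Docker Swarm
SWARM_MAX_PEAK_CONNECTIONS = 20_000   # approximate limit quoted for Docker Swarm


def recommend_orchestrator(mau: int, peak_connections: int,
                           multi_region: bool = False) -> str:
    """Suggest a deployment model from expected load (illustrative only)."""
    exceeds_swarm = (mau > SWARM_MAX_MAU
                     or peak_connections > SWARM_MAX_PEAK_CONNECTIONS)
    if multi_region or exceeds_swarm:
        return "kubernetes"
    return "docker-swarm"


print(recommend_orchestrator(50_000, 5_000))    # docker-swarm
print(recommend_orchestrator(350_000, 40_000))  # kubernetes
```

Since the published limits are approximate, a deployment close to either threshold deserves load testing rather than a hard cutoff.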
Kubernetes (Enterprise & Multi-Region)
For large-scale deployments exceeding 200,000 MAU, multi-region architectures, or advanced orchestration requirements, see the Kubernetes deployment guide.
High-level architecture
The CometChat on-premise platform employs a modern, microservices-based architecture designed for enterprise-grade reliability, security, and performance. The system is built on proven open-source technologies and follows cloud-native principles to ensure operational excellence at scale.
Architecture Components
Client Layer
- Desktop Clients: Native desktop applications and web browsers accessing the platform via HTTPS and WebSocket protocols
- Mobile Apps: iOS and Android applications with persistent connections for real-time messaging and push notification support
- Cloud Services: Third-party integrations, webhook consumers, and external systems interfacing with the platform APIs
- Enterprise-grade traffic distribution layer providing high availability, SSL/TLS termination, health checking, and automatic failover across Docker Swarm nodes
- Supports session affinity for WebSocket connections and intelligent routing based on service health metrics
- TLS/SSL termination with configurable cipher suites and certificate management
- HTTP/2 and WebSocket protocol support with automatic upgrade handling
- Request routing to microservices based on URL patterns and headers
- Rate limiting, request buffering, and connection pooling for optimal performance
- WebSocket Gateway: Maintains persistent bi-directional connections for real-time event delivery, presence management, typing indicators, and instant message routing with automatic reconnection and session recovery
- Chat API Service: RESTful API handling message CRUD operations, conversation management, user operations, group administration, and metadata queries with transaction support
- Moderation Service: Content filtering engine with configurable policies, profanity detection, spam prevention, image moderation, and AI/ML integration for advanced threat detection
- Notifications Service: Asynchronous push notification dispatcher supporting FCM, APNs, and custom providers with intelligent batching, retry logic, and delivery tracking
- Webhooks Service: Outbound event delivery system with configurable retry policies, exponential backoff, secure authentication, and comprehensive audit logging
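The retry policy with exponential backoff mentioned for the Webhooks Service can be illustrated with a short sketch. The base delay, growth factor, cap, and function names below are assumptions chosen for the example and do not reflect the product's actual defaults. Production systems usually also add random jitter so that many failing deliveries do not retry in lockstep.

```python
import time
from typing import Callable


def backoff_delays(max_attempts: int, base_seconds: float = 1.0,
                   factor: float = 2.0, cap_seconds: float = 60.0) -> list[float]:
    """Delay to wait after each failed attempt, growing geometrically up to a cap."""
    return [min(cap_seconds, base_seconds * factor ** attempt)
            for attempt in range(max_attempts)]


def deliver_with_retry(send: Callable[[], bool], delays: list[float],
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call send() until it succeeds or the delay schedule is exhausted."""
    for delay in delays:
        if send():
            return True
        sleep(delay)  # wait out the backoff after a failed attempt
    return False


print(backoff_delays(8))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting.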
Apache Kafka
- Distributed event streaming platform serving as the central message backbone for inter-service communication
- Provides per-partition message ordering, fault-tolerant persistence, and horizontal scalability
- Enables decoupled microservices architecture with publish-subscribe and event sourcing patterns
- Handles real-time message routing, event notifications, and asynchronous processing pipelines
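Kafka preserves order within a partition, so a common way to keep the messages of one conversation in order is to use the conversation ID as the record key, which sends every message for that conversation to the same partition. The sketch below simulates this with in-memory lists. The key choice and the partition count are assumptions, and `zlib.crc32` is only a stand-in for the hash function a real Kafka client applies to the key.

```python
import zlib
from collections import defaultdict


def partition_for(conversation_id: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically (stand-in for a client partitioner)."""
    return zlib.crc32(conversation_id.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 6  # hypothetical topic size
partitions = defaultdict(list)

events = [("conv-a", "first"), ("conv-b", "first"), ("conv-a", "second")]
for conversation_id, text in events:
    partitions[partition_for(conversation_id, NUM_PARTITIONS)].append((conversation_id, text))

# All events for conv-a share one partition, so their relative order is kept.
conv_a_partition = partitions[partition_for("conv-a", NUM_PARTITIONS)]
print([text for conv, text in conv_a_partition if conv == "conv-a"])  # ['first', 'second']
```

Ordering across different conversations is not guaranteed by this scheme, which is usually acceptable for chat workloads.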
TiDB Cluster
- Placement Driver (PD): Cluster metadata management, timestamp allocation, and intelligent data placement decisions
- TiKV: Distributed transactional key-value storage engine with Raft consensus protocol ensuring strong consistency and automatic data replication
- TiDB SQL Layer: MySQL-compatible query interface with distributed transaction coordination, supporting ACID guarantees and horizontal scalability
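Raft-based replication, as used by TiKV, commits a write once a majority of replicas have accepted it. That gives a simple rule for how many replica failures a group can survive, shown below as a worked calculation. The replica counts in the loop are examples, not a statement about how any particular cluster is configured.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority remains available."""
    return replicas - quorum_size(replicas)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```

An even replica count adds cost without adding fault tolerance, which is why odd counts such as 3 or 5 are the usual choice.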
Redis Clusters
- Dedicated in-memory data structure stores providing sub-millisecond latency
- Cache Cluster: Application-level caching, query result caching, and frequently accessed data
- Session & Rate Limiting Cluster: User session management, authentication tokens, and distributed rate limiting counters
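Distributed rate limiting counters of the kind kept in the Session & Rate Limiting Cluster are often implemented as fixed-window counters: one counter per client per time window, incremented on each request. The sketch below shows the logic with an in-memory dictionary standing in for Redis. The class name, limits, and key layout are hypothetical. With Redis, the increment would typically be an `INCR` on the window key followed by an `EXPIRE` so that old windows clean themselves up.

```python
import time
from collections import defaultdict
from typing import Callable


class FixedWindowRateLimiter:
    """Allows at most `limit` requests per client in each fixed time window."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counters = defaultdict(int)  # stands in for Redis keys

    def allow(self, client_id: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        self.counters[key] += 1  # with Redis: INCR on the key, then EXPIRE
        return self.counters[key] <= self.limit


now = [0.0]  # fake clock so the example runs instantly
limiter = FixedWindowRateLimiter(limit=2, window_seconds=10, clock=lambda: now[0])
print([limiter.allow("tenant-1") for _ in range(3)])  # [True, True, False]
now[0] = 10.0  # the next window starts
print(limiter.allow("tenant-1"))  # True
```

Fixed windows permit short bursts at window boundaries. Sliding-window or token-bucket schemes smooth this out at the cost of more state per client.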
MongoDB
- Document-oriented database for flexible schema requirements
- Stores moderation policies, user preferences, custom metadata, webhook configurations, and audit logs
- Provides native JSON support and rich query capabilities for semi-structured data
- Frontend Application: Web-based administrative dashboard and user interface components
- Prometheus: Time-series metrics collection system scraping service endpoints, storing performance data, and triggering alerts based on configurable thresholds
- Grafana: Visualization platform providing real-time operational dashboards, SLA monitoring, capacity planning insights, and customizable alerting workflows
- Loki & Promtail: Centralized log aggregation and querying infrastructure enabling rapid troubleshooting and audit trail analysis
- Node Exporter & cAdvisor: Host and container-level metrics collection for infrastructure monitoring and capacity planning
- Physical or virtual compute resources running Docker Swarm nodes
- Persistent storage volumes for stateful services (databases, Kafka, logs)
- Resource allocation and isolation using Docker resource constraints
- Secure overlay network isolating backend services from external access
- Encrypted service-to-service communication using Docker Swarm’s built-in encryption
- Network segmentation separating public-facing services from data stores
- Optimized routing paths minimizing inter-service latency and maximizing throughput