What Is Real-Time Messaging?
Real-time messaging lets users communicate with minimal delay over persistent WebSocket connections. When a user sends a message, the client emits an event, the server routes it, and the recipient’s client receives it over its own open connection.
- Event example: "User A pushes 'Hi' to room X, notify User B".
- Recipient receives the event via an open WebSocket connection.
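As a rough illustration of the event in the example above, the sketch below serializes a message event to JSON and parses it back. The class name ChatEvent and every field name are assumptions made for this example; the source does not define a wire format.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class ChatEvent:
    """Hypothetical event envelope; all field names are illustrative."""
    type: str        # kind of event, e.g. "message.new"
    room_id: str     # conversation the event belongs to
    sender_id: str   # user who produced the event
    body: str        # message text
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    sent_at: float = field(default_factory=time.time)

    def to_json(self) -> str:
        """Serialize the event for transmission over a WebSocket."""
        return json.dumps(asdict(self))


def parse_event(raw: str) -> ChatEvent:
    """Rebuild an event from the JSON text received on the socket."""
    return ChatEvent(**json.loads(raw))


# "User A pushes 'Hi' to room X"
event = ChatEvent(type="message.new", room_id="room_x", sender_id="user_a", body="Hi")
wire = event.to_json()
received = parse_event(wire)
```

A real deployment would also validate the parsed fields before trusting them.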
How Message Status Is Tracked
Three core statuses indicate the lifecycle of a message:
- Sent: Server acknowledges receipt of the message.
- Delivered: Recipient’s client acknowledges receipt over WebSocket.
- Read: Recipient opens the chat and an explicit read event is sent.
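One way to model these statuses is as an ordered enum that only moves forward, so a late "delivered" acknowledgement cannot overwrite "read". This is a minimal sketch; the names MessageStatus and advance are assumptions for illustration.

```python
from enum import IntEnum


class MessageStatus(IntEnum):
    """Statuses ordered by their position in the message lifecycle."""
    SENT = 1
    DELIVERED = 2
    READ = 3


def advance(current: MessageStatus, incoming: MessageStatus) -> MessageStatus:
    """Return the later of the two statuses; a status never moves backwards."""
    return max(current, incoming)


status = MessageStatus.SENT
status = advance(status, MessageStatus.READ)       # recipient opened the chat
status = advance(status, MessageStatus.DELIVERED)  # late ack, ignored
```

Taking the maximum makes status updates safe to apply in any order, which helps when acknowledgements arrive out of sequence.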
Why Security, Authentication, and Authorization Matter
Messages are delivered only to authorized participants.
- Each conversation has a unique, server‑generated roomId.
- Users can join only rooms they are permitted to access.
- WebSocket connections are authenticated and tied to user sessions.
- Server validates every event against the roomId and user permissions.
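The checks above could be sketched as follows. RoomRegistry and authorize_event are hypothetical names, and membership is held in memory here purely for illustration. The point to notice is that the user identity comes from the authenticated session, never from the event payload.

```python
import secrets


class RoomRegistry:
    """In-memory stand-in for wherever room membership is really stored."""

    def __init__(self):
        self._members = {}  # room_id -> set of user ids

    def create_room(self, member_ids):
        """Create a room with an unguessable, server-generated id."""
        room_id = "room_" + secrets.token_urlsafe(16)
        self._members[room_id] = set(member_ids)
        return room_id

    def is_member(self, room_id, user_id):
        return user_id in self._members.get(room_id, set())


def authorize_event(registry, session_user_id, event):
    """Accept an event only if the session's user belongs to the target room."""
    room_id = event.get("room_id")
    return isinstance(room_id, str) and registry.is_member(room_id, session_user_id)


registry = RoomRegistry()
room = registry.create_room(["user_a", "user_b"])
allowed = authorize_event(registry, "user_a", {"room_id": room, "body": "Hi"})
denied = authorize_event(registry, "user_c", {"room_id": room, "body": "Hi"})
```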
What Is a Message Broker?
A message broker is a central component that decouples producers (servers) from consumers (other servers) and routes messages based on routing rules.
- Key parts: entry point (exchange), routing logic, queues, and metadata.
- Common exchange types (as in AMQP brokers such as RabbitMQ): Direct, Topic, Fanout, Headers.
Why Use a Message Broker?
Scaling real‑time systems requires handling millions of concurrent connections, which a single server cannot support.
- Distributes load across multiple servers.
- Enables reliable delivery even if a server fails.
- Provides buffering so messages are not lost during spikes.
How Message Brokers Enable Scalable Delivery
Step‑by‑step flow of a message from sender to receiver:
- Step 1: Server 1 receives the message and publishes it to the broker with a routing key (e.g., room:room_123).
- Step 2: The broker’s exchange examines the routing key and forwards the message to the appropriate queue(s) bound to that key.
- Step 3: One or more consumer servers listen on the queue. Server 2 pulls the message and pushes it to User B via WebSocket.
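The three steps can be imitated with a toy in-process exchange. This is not a real broker client: DirectExchange is a hypothetical class that only mimics exact-match routing from a key to its bound queues.

```python
from collections import defaultdict, deque


class DirectExchange:
    """Toy exchange that routes a message to every queue bound to its exact key."""

    def __init__(self):
        self._bindings = defaultdict(list)  # routing key -> bound queues

    def bind(self, routing_key, queue):
        self._bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        """Copy the message into each bound queue; return how many received it."""
        queues = self._bindings.get(routing_key, [])
        for queue in queues:
            queue.append(message)
        return len(queues)


exchange = DirectExchange()
server2_queue = deque()                         # queue consumed by Server 2
exchange.bind("room:room_123", server2_queue)

# Step 1: Server 1 publishes with a routing key.
# Step 2: the exchange forwards the message to the bound queue.
exchange.publish("room:room_123", {"from": "user_a", "body": "Hi"})

# Step 3: Server 2 takes the message and would now push it to User B's socket.
message = server2_queue.popleft()
```

A message published with a key that has no bindings is simply dropped in this sketch; real brokers offer configurable handling for unroutable messages.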
What Scaling Techniques Complement Message Brokers
Additional layers ensure high availability and performance.
- Load Balancing: Distributes incoming WebSocket connections across many servers; redirects traffic if a server fails.
- Database Sharding: Splits data across multiple databases by userId or roomId to avoid a single bottleneck.
- In‑Memory Caching (Redis): Stores frequently accessed data (e.g., user presence, recent messages) for ultra‑fast reads.
- Horizontal Scaling: Adds more servers to increase capacity without a single point of failure.
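For the sharding bullet, a minimal sketch of key-based shard selection might look like this. The function name shard_for and the choice of SHA-256 are assumptions. A stable hash is used because Python's built-in hash() for strings varies between processes.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a userId or roomId to a shard index that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Every server computes the same shard for the same room.
shard = shard_for("room_123", 4)
```

Plain modulo sharding remaps most keys when the shard count changes; consistent hashing is a common alternative when shards are added or removed often.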
How Concurrency Is Managed
High‑throughput systems must prevent race conditions.
- Distributed Locks (Redis): Ensure only one server performs a critical operation at a time (e.g., group creation).
- Database Transactions: Guarantee atomicity; either all related writes succeed or none do.
- Idempotency: Each client‑generated message carries a unique ID; duplicate submissions are ignored.
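The idempotency bullet can be illustrated with a small deduplicating inbox. IdempotentInbox is a hypothetical name, and the in-memory set stands in for whatever shared store a multi-server deployment would use to remember message IDs.

```python
class IdempotentInbox:
    """Stores each client-generated message ID at most once."""

    def __init__(self):
        self._seen = set()   # message IDs already accepted
        self.stored = []     # accepted (id, body) pairs, in arrival order

    def submit(self, client_message_id, body):
        """Return True if the message was stored, False if it was a duplicate."""
        if client_message_id in self._seen:
            return False
        self._seen.add(client_message_id)
        self.stored.append((client_message_id, body))
        return True


inbox = IdempotentInbox()
first = inbox.submit("msg-001", "Hi")   # stored
retry = inbox.submit("msg-001", "Hi")   # client retried after a timeout; ignored
```

Because the client generates the ID, a retry after a dropped acknowledgement carries the same ID and is recognized as a repeat.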
Why Message Ordering Is Crucial and How Kafka Provides It
Conversation flow depends on preserving order.
- All messages for a given room are sent to the same Kafka partition.
- Kafka guarantees order within a partition, so consumers read a room’s messages in the order the broker appended them.
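The idea of keying by room can be shown without a Kafka client. In the toy log below, partition_for uses CRC32 as a stand-in hash (Kafka's own default partitioner uses a different hash function), and ToyLog is a hypothetical in-memory model of per-partition append-only lists.

```python
import zlib
from collections import defaultdict


def partition_for(room_id: str, num_partitions: int) -> int:
    """Pick a partition from the key; the same room always maps to the same one."""
    return zlib.crc32(room_id.encode("utf-8")) % num_partitions


class ToyLog:
    """In-memory model: each partition is an append-only list."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)

    def append(self, room_id, message):
        partition = partition_for(room_id, self.num_partitions)
        self.partitions[partition].append((room_id, message))
        return partition

    def read_room(self, room_id):
        """Read one room's messages in the order they were appended."""
        partition = partition_for(room_id, self.num_partitions)
        return [m for r, m in self.partitions[partition] if r == room_id]


log = ToyLog(num_partitions=3)
for room, text in [("room_1", "a"), ("room_2", "x"), ("room_1", "b"), ("room_1", "c")]:
    log.append(room, text)
ordered = log.read_room("room_1")
```

Messages from different rooms may interleave within a partition, but each room's messages keep their relative order because they never spread across partitions.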
How Rate Limiting Protects the System
Rate limiting prevents abuse and keeps resource usage fair across users.
- Token‑bucket algorithm implemented in Redis limits messages per second per user.
- Exceeding the limit results in throttling, protecting downstream services.
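A minimal in-process sketch of the token-bucket algorithm follows. The source keeps the bucket state in Redis so that all servers share it; this version holds state in a Python object only to show the refill-and-spend logic. The class name and parameters are assumptions.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` of tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                  # tokens added per second
        self.capacity = capacity          # maximum tokens the bucket can hold
        self._clock = clock               # injectable so the bucket can be tested
        self._tokens = float(capacity)    # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# One bucket per user: 5 messages per second, with bursts of up to 10.
bucket = TokenBucket(rate=5, capacity=10)
accepted = bucket.allow()
```

When allow returns False, the server would throttle that user's message rather than pass it downstream.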
End‑to‑End Flow Summary
The complete lifecycle of a chat message:
- User A sends a message via WebSocket to Server 1.
- Rate limiter checks User A’s quota.
- Server 1 validates authentication, authorization, and idempotency.
- Message is stored atomically in the database (transaction).
- Server 1 publishes the event to the broker with a routing key.
- Broker routes the event to the appropriate queue.
- Server 2 consumes the message from the queue and pushes it to User B’s WebSocket.
- User B receives the message, and delivered/read events flow back to User A through the same pipeline.