Introduction
Creating a real-time chat app presents us with a few challenges. Namely, it has to be fast, scalable, and reliable. No user wants a chat app to work slowly or for their messages to go undelivered. Furthermore, the more users a chat app gets, the more valuable it becomes.
There are as many chat app architectures as there are chat apps, each with peculiarities that require attention. Without specific references, I will outline an architecture for developing a robust chat system that scales and provides a pleasant user experience.
Basic Features
For this project, we'll focus on implementing the following features:
- Almost real-time delivery of messages (we’ll discuss the “almost” part later)
- Scalability
- Permanent storage of messages
- Notifications for connected users
- Write-ahead logging for robust message handling
- Presence tracking for users
- Load balancing
- Microservices architecture, decoupling crucial components
Deriving the Architecture
Client
We start with the clients. For simplicity, let’s consider a web-based chat app. Using standard web tools like React, we can implement a UI for users to:
- Log into the app
- View their list of chats
- Exchange messages with other users
Since SEO is not a requirement (everything is private and behind authentication), there’s no need for a server-side rendering (SSR) solution.
Sending Messages
Let’s say User A sends a message to User B. Two key considerations are:
- The message must be permanently stored.
- User B must receive it.
We'll use a message broker to handle message transmission. Clients send messages to this broker, which processes them in a scalable and decoupled way.
For this architecture, we’ll use Amazon SQS to store messages. Since processing may occasionally fail, we’ll include a Dead Letter Queue (DLQ) as a companion to the main queue for handling problematic messages.
Consuming Messages for Permanent Storage
The SQS queue will be consumed by either an ECS cluster or a Lambda function. For simplicity, let’s choose a Lambda function.
- The Lambda function subscribes to the SQS queue and polls messages for consumption.
- Messages are processed and stored in a DynamoDB table.
Decoupling Notifications
To handle notifications for active users, the same Lambda function will send a notification message for each processed message to a separate Notification Messages queue. This queue will also have an associated DLQ.
Consuming Notification Messages
Real-time notifications require WebSocket connections between the server and the clients. Since Lambda functions terminate after completing tasks, they’re not suitable for maintaining persistent connections.
Instead, we’ll use an ECS cluster to:
- Maintain WebSocket connections
- Send real-time notifications to users
Additionally, a DynamoDB table will store the connection IDs to facilitate message delivery.
This architecture provides a foundation for building a fast, scalable, and reliable chat application while addressing challenges like message persistence, real-time notifications, and scalability.