Redis Pub/Sub Fan-Out for WebSockets #
You run three WebSocket nodes behind a load balancer. A user on node A posts a chat message; your handler calls server.clients.forEach(c => c.send(...)) and it works perfectly in development. In production, half the room never sees the message, because those clients are connected to node B and node C, and server.clients only holds the sockets terminated on the node that received the message. A single-node broadcast loop is structurally blind to every connection living on another process. This guide shows how to use Redis pub/sub as the inter-node bus that carries a message published on node A to subscribers connected anywhere in the fleet.
This is the canonical fan-out primitive for Scaling Real-Time Infrastructure: each node keeps a local registry of the sockets it owns, and Redis carries the cross-node notification so every node can broadcast to its own locals.
Prerequisites #
Before wiring up the bridge, confirm the following are already in place:
- A working multi-node WebSocket deployment. Connection acceptance, heartbeats, and reconnection are out of scope here; if a node restarts, clients must reconnect — see how server-side routing patterns dispatch frames once a socket is established.
- A reachable Redis instance (6.x+). A single primary is fine to start; pub/sub is not sharded across a Redis Cluster the way keys are, so plan topology before you scale Redis itself.
- A clear decision on durability. Pub/sub is fire-and-forget with no replay. If you need delivery guarantees, that is a different mechanism — see message delivery guarantees and the streams comparison below.
- ioredis and ws installed (npm i ioredis ws).
How node-to-node fan-out works #
The hardest part to picture is that the WebSocket connection and the Redis subscription are two completely separate channels. A client never talks to Redis. The node receives a frame, publishes a serialized envelope to a Redis channel, and every node — including the publisher — receives that envelope on its subscriber client and replays it to the local sockets it owns.
The asymmetry is the whole point: the bus is global, but the socket sends are always local. No node ever touches another node’s sockets directly.
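To make the split between the global bus and the local sends concrete, here is a minimal in-memory sketch. `InMemoryBus` stands in for Redis and `FanoutNode` for a WebSocket node; both names, and the `FakeSocket` shape, are hypothetical and exist only for illustration. No Redis or ws call is made.

```typescript
// Stand-in for Redis pub/sub: delivers each message to every current subscriber.
type BusHandler = (channel: string, raw: string) => void;

class InMemoryBus {
  private subscribers = new Map<string, Set<BusHandler>>();

  subscribe(channel: string, handler: BusHandler): void {
    let set = this.subscribers.get(channel);
    if (!set) {
      set = new Set();
      this.subscribers.set(channel, set);
    }
    set.add(handler);
  }

  // Returns how many subscribers were reached, like the PUBLISH reply.
  publish(channel: string, raw: string): number {
    const set = this.subscribers.get(channel);
    if (!set) return 0;
    set.forEach((handler) => handler(channel, raw));
    return set.size;
  }
}

// Stand-in for a client socket: records the frames it was sent.
interface FakeSocket {
  received: string[];
}

class FanoutNode {
  readonly id: string;
  private bus: InMemoryBus;
  private locals = new Map<string, Set<FakeSocket>>();

  constructor(id: string, bus: InMemoryBus) {
    this.id = id;
    this.bus = bus;
  }

  join(room: string, socket: FakeSocket): void {
    let set = this.locals.get(room);
    if (!set) {
      set = new Set();
      this.locals.set(room, set);
      // First local member: start listening on the shared bus.
      this.bus.subscribe(room, (_channel, raw) => this.deliverLocally(room, raw));
    }
    set.add(socket);
  }

  // Publish path: hand the envelope to the bus, never to sockets directly.
  broadcast(room: string, payload: unknown): number {
    return this.bus.publish(room, JSON.stringify({ origin: this.id, payload }));
  }

  // Subscribe path: replay the envelope to the sockets this node owns.
  private deliverLocally(room: string, raw: string): void {
    const set = this.locals.get(room);
    if (!set) return;
    const { payload } = JSON.parse(raw) as { payload: unknown };
    const frame = JSON.stringify(payload);
    set.forEach((socket) => socket.received.push(frame));
  }
}

const bus = new InMemoryBus();
const nodeA = new FanoutNode('A', bus);
const nodeB = new FanoutNode('B', bus);
const alice: FakeSocket = { received: [] };
const bob: FakeSocket = { received: [] };
nodeA.join('lobby', alice);
nodeB.join('lobby', bob);

// Node A publishes once; both nodes deliver to their own sockets.
const reached = nodeA.broadcast('lobby', { text: 'hi' });
console.log(reached, alice.received, bob.received);
```

Node A never touches `bob`; it only publishes, and node B's own subscription handler performs the send.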
Core implementation #
You need two Redis connections per node. A connection in subscribe mode cannot issue normal commands, so the publish path and the subscribe path must use separate clients. The pattern below bridges a Redis channel to a local ws registry keyed by room.
import { randomUUID } from 'node:crypto';
import { WebSocketServer, WebSocket } from 'ws';
import Redis from 'ioredis';

const NODE_ID = process.env.NODE_ID ?? randomUUID(); // identify the origin node
const CHANNEL_PREFIX = 'ws:room:'; // namespace pub/sub channels

// Two clients: one stuck in subscribe mode, one free to PUBLISH.
const sub = new Redis(process.env.REDIS_URL!);
const pub = new Redis(process.env.REDIS_URL!);

// Per-node local registry: room -> set of sockets terminated on THIS node.
const localRooms = new Map<string, Set<WebSocket>>();
const wss = new WebSocketServer({ port: 8080 });

function joinRoom(room: string, ws: WebSocket) {
  let set = localRooms.get(room);
  if (!set) {
    set = new Set();
    localRooms.set(room, set);
    // Subscribe lazily: only rooms with local clients.
    sub.subscribe(CHANNEL_PREFIX + room).catch((err) => console.error('subscribe failed', err));
  }
  set.add(ws);
}

function leaveRoom(room: string, ws: WebSocket) {
  const set = localRooms.get(room);
  if (!set) return;
  set.delete(ws);
  if (set.size === 0) {
    localRooms.delete(room);
    // Stop receiving fan-out for empty rooms.
    sub.unsubscribe(CHANNEL_PREFIX + room).catch((err) => console.error('unsubscribe failed', err));
  }
}

// Publish path: serialize an envelope and hand it to Redis. Note we do NOT
// write to local sockets here; the subscriber callback below does that for
// every node uniformly, including this one. This avoids double-sending.
function broadcast(room: string, payload: unknown) {
  const envelope = JSON.stringify({ origin: NODE_ID, room, payload, ts: Date.now() });
  pub.publish(CHANNEL_PREFIX + room, envelope).catch((err) => console.error('publish failed', err));
}

// Subscribe path: every published envelope arrives here on every subscribed node.
sub.on('message', (channel, raw) => {
  const room = channel.slice(CHANNEL_PREFIX.length);
  const set = localRooms.get(room);
  if (!set) return; // no local clients for this room, so drop
  const { payload } = JSON.parse(raw);
  const frame = JSON.stringify(payload);
  for (const ws of set) {
    if (ws.readyState === WebSocket.OPEN) ws.send(frame); // local send only
  }
});

wss.on('connection', (ws, req) => {
  const room = new URL(req.url!, 'http://x').searchParams.get('room') ?? 'lobby';
  joinRoom(room, ws);
  ws.on('message', (data) => {
    try {
      broadcast(room, JSON.parse(data.toString()));
    } catch {
      // Malformed client JSON: ignore the frame instead of crashing the node.
    }
  });
  ws.on('close', () => leaveRoom(room, ws));
});
Two design choices carry the weight. First, the publisher does not also write to its own locals inline; it relies on the subscriber callback to deliver everywhere, so the logic that reaches a local socket is identical regardless of which node originated the message. Second, subscriptions are lazy — a node only subscribes to a Redis channel while it holds at least one local socket for that room, which keeps PUBSUB CHANNELS proportional to active rooms rather than total rooms.
Channel sharding #
One channel per room scales to a lot of rooms, but a single hot room (a stadium-scale broadcast) still funnels every message through one channel and re-fans it to every node. When fan-out volume on a single logical topic dwarfs the rest, shard it: publish to ws:room:42:shardN where N = hash(senderId) % SHARD_COUNT, and have each node subscribe to all shards of rooms it holds. Sharding spreads the per-channel message rate but multiplies subscription count, so treat SHARD_COUNT as a tuning knob, not a default.
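The shard naming described above could be sketched as follows. The helper names (`hashString`, `shardChannel`, `allShardChannels`, `roomFromChannel`) are hypothetical, `SHARD_COUNT = 4` is an arbitrary example value, and the FNV-1a-style hash is just one small, stable choice; any deterministic hash of the sender ID would do.

```typescript
const CHANNEL_PREFIX = 'ws:room:';
const SHARD_COUNT = 4; // example value only; tune per hot room

// FNV-1a-style 32-bit hash over UTF-16 code units (illustrative choice).
function hashString(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Channel a given sender publishes to: ws:room:<room>:shard<N>.
function shardChannel(room: string, senderId: string, shardCount: number = SHARD_COUNT): string {
  const shard = hashString(senderId) % shardCount;
  return CHANNEL_PREFIX + room + ':shard' + shard;
}

// Every channel a node subscribes to while it holds local sockets for the room.
function allShardChannels(room: string, shardCount: number = SHARD_COUNT): string[] {
  const channels: string[] = [];
  for (let shard = 0; shard < shardCount; shard++) {
    channels.push(CHANNEL_PREFIX + room + ':shard' + shard);
  }
  return channels;
}

// Subscriber side: strip the prefix and the shard suffix to recover the room.
function roomFromChannel(channel: string): string {
  return channel.slice(CHANNEL_PREFIX.length).replace(/:shard\d+$/, '');
}

console.log(shardChannel('42', 'user-1'), allShardChannels('42'));
```

With sharded names the subscriber callback can no longer treat everything after the prefix as the room, which is why `roomFromChannel` strips the `:shardN` suffix before the registry lookup.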
Configuration reference #
| Parameter | Type | Default | Production value | Notes |
|---|---|---|---|---|
| REDIS_URL | string | redis://127.0.0.1:6379 | TLS endpoint, auth | Same instance for pub and sub clients |
| CHANNEL_PREFIX | string | ws:room: | app:env:ws:room: | Namespace per app/env to avoid cross-talk |
| SHARD_COUNT | number | 1 | 4–16 for hot rooms | Only shard topics that need it |
| retryStrategy | function | exponential | cap at 2000 ms | On sub client, must resubscribe after reconnect |
| enableOfflineQueue | boolean | true | true (pub) / false (sub) | Queuing subscribes on a dead conn hides failures |
| maxListeners | number | 10 | rooms-per-node | Raise to avoid MaxListenersExceeded on sub |
| socket_keepalive | boolean | off | on | Detect dead Redis links faster |
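One way to turn the table into per-client options is sketched below. `redisOptionsFor` and `retryDelay` are hypothetical helpers. `retryStrategy`, `enableOfflineQueue`, and `keepAlive` are ioredis constructor options; mapping the table's socket_keepalive row to ioredis's numeric `keepAlive` option is an assumption about the intended setting, and the 50 ms base delay and 10 s keepalive delay are example values.

```typescript
type RedisRole = 'pub' | 'sub';

// Plain options object intended to be passed to the ioredis constructor.
interface FanoutRedisOptions {
  retryStrategy: (times: number) => number;
  enableOfflineQueue: boolean;
  keepAlive: number;
}

// Exponential backoff starting at 50 ms (assumed base) and capped at 2000 ms.
function retryDelay(times: number): number {
  return Math.min(50 * Math.pow(2, times - 1), 2000);
}

function redisOptionsFor(role: RedisRole): FanoutRedisOptions {
  return {
    retryStrategy: retryDelay,
    // Queue publishes across brief outages; fail fast on the subscriber.
    enableOfflineQueue: role === 'pub',
    // Assumption: start TCP keepalive probes after 10 s of idle time.
    keepAlive: 10000,
  };
}

// Usage, assuming ioredis is installed:
//   const pub = new Redis(process.env.REDIS_URL!, redisOptionsFor('pub'));
//   const sub = new Redis(process.env.REDIS_URL!, redisOptionsFor('sub'));
console.log(redisOptionsFor('sub'), retryDelay(1), retryDelay(10));
```

With the offline queue disabled on the subscriber, a subscribe issued while the link is down is rejected rather than silently queued, so the caller needs to handle that rejection.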
Edge cases & gotchas #
- Fire-and-forget, no replay. Redis pub/sub delivers only to clients subscribed at publish time. A node that is mid-restart, or a room a node hasn’t subscribed to yet, silently misses the message. There is no backlog and no acknowledgement. If a missed message is a correctness bug rather than a cosmetic one, you need message delivery guarantees, not raw pub/sub.
- Reconnect must resubscribe. When the sub client drops and reconnects, Redis has forgotten every SUBSCRIBE. ioredis re-issues prior subscriptions on auto-reconnect, but if you manage channels dynamically, re-derive them from localRooms inside a sub.on('ready', ...) handler; never assume the server remembers.
- Message amplification. Every published envelope is delivered to every subscribed node, even nodes whose local set is empty for a sharded variant. With N nodes, one publish becomes up to N inbound deliveries. Watch Redis egress bandwidth; it grows with node count, not just message rate. Lazy subscriptions and tight channel scoping are your main levers.
- Self-delivery and double sends. Because the publisher is also a subscriber, it receives its own message. That is intentional here, but if you ever add an inline local send and keep the subscriber send, clients on the origin node get the message twice. Pick one path. The origin: NODE_ID field lets you filter if you must.
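The resubscribe-on-ready advice could look roughly like this. `SubscriberLike` is a hypothetical structural interface covering only the two calls used (`on('ready', ...)` and a variadic `subscribe`), so the sketch runs without ioredis; a real client may need a cast to satisfy it. `resubscribeOnReady` is likewise an illustrative name.

```typescript
// Hypothetical narrow view of the subscriber client used by this sketch.
interface SubscriberLike {
  on(event: 'ready', listener: () => void): unknown;
  subscribe(...channels: string[]): Promise<unknown>;
}

const CHANNEL_PREFIX = 'ws:room:';

// On every (re)connect, rebuild subscriptions from the rooms this node holds.
function resubscribeOnReady<T>(sub: SubscriberLike, localRooms: Map<string, T>): void {
  sub.on('ready', () => {
    const channels: string[] = [];
    localRooms.forEach((_sockets, room) => channels.push(CHANNEL_PREFIX + room));
    if (channels.length === 0) return; // nothing to subscribe to yet
    sub.subscribe(...channels).catch((err: unknown) => {
      console.error('resubscribe failed', err);
    });
  });
}
```

Because the channel list is derived from `localRooms` at the moment the connection becomes ready, rooms joined or emptied during the outage are reflected without any extra bookkeeping.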
Verification #
Confirm the bus is live and carrying traffic with redis-cli while clients are connected:
# Which room channels currently have at least one subscriber node?
redis-cli PUBSUB CHANNELS 'ws:room:*'
# How many nodes are subscribed to a specific room?
redis-cli PUBSUB NUMSUB ws:room:42
# Watch envelopes flow in real time (publish a test message from another shell).
redis-cli MONITOR | grep -i publish
Then prove cross-node delivery end to end: connect one WebSocket client to node A and another to node B (force the load balancer or hit each node’s port directly), send from the A client, and assert the B client receives it. If PUBSUB CHANNELS lists the room but the B client gets nothing, your subscriber callback or local registry is the fault — not Redis.
# Sanity-check that publishing to a channel reaches subscribed nodes.
redis-cli PUBLISH ws:room:42 '{"payload":{"type":"ping"}}'
# Expect the integer reply = number of subscribed nodes (not clients).
Guides in this area #
- Scaling WebSocket Broadcast with Redis Pub/Sub — the full broadcast loop, batching, and backpressure under high message rates.
- Redis Streams vs Pub/Sub for WebSocket Fan-Out — when fire-and-forget is fine and when you need the replayable, consumer-group durability of Streams.
FAQ #
When should I choose Redis Streams instead of pub/sub? #
Choose pub/sub when a missed message is acceptable — live cursors, typing indicators, ephemeral presence pings — and you want the lowest possible latency with zero storage. Choose Streams when subscribers must be able to reconnect and replay what they missed, when you need consumer-group load distribution, or when delivery is a correctness requirement. Pub/sub has no backlog, no offsets, and no acknowledgements; Streams give you all three at the cost of memory and trimming policy. The dedicated comparison guide walks through the trade-offs in detail.
Why do I need two Redis clients per node? #
A Redis connection in subscribe mode is restricted: until it leaves that mode it can only issue the subscribe/unsubscribe family of commands, plus a few connection commands such as PING and QUIT. Your sub client lives permanently in subscribe mode, so it cannot also PUBLISH. The second pub client stays in normal command mode for publishing (and any other Redis commands the node runs). Sharing one client breaks one of the two paths.
Does pub/sub work across a Redis Cluster? #
Not transparently. Classic pub/sub on Redis Cluster broadcasts to all nodes regardless of slot, which can saturate inter-node links at scale. Redis 7 added sharded pub/sub (SSUBSCRIBE/SPUBLISH) that routes a channel to a single slot’s node. If you are on Cluster, prefer sharded pub/sub and key your channels so related rooms land on the same slot.
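Shard channels are assigned to slots with the same algorithm as keys, so a hash tag in the channel name controls placement. The sketch below computes the slot locally to show the effect. `clusterChannel` and its `ws:{group}:room:<id>` naming scheme are hypothetical, and `crc16` assumes ASCII channel names.

```typescript
// CRC16-XMODEM (polynomial 0x1021, initial value 0), as used for Redis Cluster slots.
function crc16(input: string): number {
  let crc = 0;
  for (let i = 0; i < input.length; i++) {
    crc ^= input.charCodeAt(i) << 8; // assumes single-byte (ASCII) characters
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// If the name contains a non-empty {tag}, only the tag is hashed.
function hashSlot(channel: string): number {
  const open = channel.indexOf('{');
  if (open !== -1) {
    const close = channel.indexOf('}', open + 1);
    if (close > open + 1) return crc16(channel.slice(open + 1, close)) % 16384;
  }
  return crc16(channel) % 16384;
}

// Hypothetical naming scheme: rooms in the same group share a hash tag.
function clusterChannel(group: string, room: string): string {
  return 'ws:{' + group + '}:room:' + room;
}

console.log(hashSlot(clusterChannel('eu', '1')), hashSlot(clusterChannel('eu', '2')));
```

Both calls print the same slot because only the `eu` tag is hashed, so every room in that group is served by one shard of the cluster.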
Will I get duplicate messages on the publishing node? #
Yes, by design in this pattern — the publisher is also a subscriber, so it receives its own envelope and delivers it to its locals through the same callback as every other node. That is what keeps the delivery path uniform. You only get true duplicates if you additionally send to local sockets inline at publish time; don’t do both.
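If you do move to an inline local send on the origin node, the subscriber side has to skip envelopes it published itself. A minimal sketch of that filter, assuming the envelope shape from the core implementation; `frameForLocals` and its `sentInlineAtPublish` flag are hypothetical names.

```typescript
interface Envelope {
  origin: string;
  room: string;
  payload: unknown;
  ts: number;
}

// Returns the frame to send to local sockets, or null when the envelope came
// from this node and was already delivered inline at publish time.
function frameForLocals(raw: string, nodeId: string, sentInlineAtPublish: boolean): string | null {
  const envelope = JSON.parse(raw) as Envelope;
  if (sentInlineAtPublish && envelope.origin === nodeId) return null;
  return JSON.stringify(envelope.payload);
}

const sample = JSON.stringify({ origin: 'A', room: '42', payload: { type: 'ping' }, ts: 0 });
console.log(frameForLocals(sample, 'A', true), frameForLocals(sample, 'B', true));
```

With the flag left at false the function behaves like the uniform delivery path in this guide: every node, including the origin, gets a frame.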
How do I keep one hot room from overwhelming a node? #
Shard the room’s channel across SHARD_COUNT channels keyed by a hash of the sender, so message rate per channel drops and Redis spreads the fan-out work. Combine that with batching sends on the WebSocket side and applying backpressure when a socket’s bufferedAmount climbs. Routing-level isolation also helps — see how server-side routing patterns keep broadcast storms inside a single state domain.
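The backpressure half of that advice might be sketched like this. `SocketLike` is a hypothetical structural subset of the ws socket (`readyState`, `bufferedAmount`, `send`), and the 1 MiB budget is an assumed example value, not a recommendation from the guide.

```typescript
// Structural subset of a ws socket, so the sketch runs without the package.
interface SocketLike {
  readyState: number;
  bufferedAmount: number;
  send(data: string): void;
}

const OPEN_STATE = 1; // numeric value of WebSocket.OPEN
const MAX_BUFFERED_BYTES = 1024 * 1024; // assumption: 1 MiB per-socket budget

// Send to every healthy socket; skip and report sockets whose buffer is backed up.
function broadcastWithBackpressure(
  sockets: Set<SocketLike>,
  frame: string,
  maxBuffered: number = MAX_BUFFERED_BYTES,
): SocketLike[] {
  const skipped: SocketLike[] = [];
  sockets.forEach((socket) => {
    if (socket.readyState !== OPEN_STATE) return; // closing or closed: ignore
    if (socket.bufferedAmount > maxBuffered) {
      skipped.push(socket); // slow consumer: do not queue more data on it
      return;
    }
    socket.send(frame);
  });
  return skipped;
}
```

What to do with the returned slow sockets is a policy decision: drop the frame for them, or close them so the client reconnects and resynchronizes.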
Related #
- Scaling Real-Time Infrastructure — the parent area covering fan-out, presence, delivery guarantees, and horizontal scaling.
- Message Delivery Guarantees — at-least-once delivery and acknowledgements for when fire-and-forget pub/sub is not enough.
- Server-Side Routing Patterns — dispatching frames to the right handler and channel before they hit the fan-out bus.
- Scaling WebSocket Broadcast with Redis Pub/Sub — the broadcast loop in depth with batching and backpressure.
- Redis Streams vs Pub/Sub for WebSocket Fan-Out — picking the right Redis primitive for your durability needs.