Server-Side WebSocket Routing Patterns #

A single physical socket carries dozens of logically distinct message types: presence pings, chat events, document edits, telemetry, control frames. The moment you accept more than one message shape on a connection, you need a router — a layer that reads each inbound frame, decides which handler owns it, and dispatches without blocking the event loop. Get this wrong and the failure is not subtle: a tenant’s “delete-all” command leaks into a neighbouring tenant’s channel, a malformed route field throws inside the message loop and kills every socket on that worker, or an unbounded JSON.parse on attacker-supplied frames pins a CPU core at 100%.

This guide covers the dispatch layer that sits between socket acceptance and your business logic. The parent area, Backend WebSocket Connection Management, handles the handshake and the connection lifecycle; here we focus on what happens to each frame after the socket is open and before it reaches application code: parsing, validating, namespacing by tenant, rate-limiting per channel, and fanning out across nodes. The target is O(1) dispatch lookup, hard isolation between tenants, and bounded fan-out during broadcast storms.

Prerequisites #

Routing assumes a healthy, authenticated, single-server connection already exists. Before applying anything here, confirm:

  • Liveness is handled. Dead sockets must be evicted by a heartbeat before they reach the router, or you will dispatch to half-open connections. See Connection Lifecycle & Heartbeats.
  • Identity is established on the upgrade. The router keys every decision off a trusted tenant/user identity. That identity must be pinned at handshake time, not read from the message body — see WebSocket Authentication & Authorization.
  • Affinity exists for in-memory routing tables. If your route map lives in process memory, clients must return to the same node — see Load Balancer Sticky Sessions. For multi-node broadcast you will instead lean on Redis Pub/Sub fan-out.

Figure: Server-side WebSocket frame routing. An inbound frame {route, payload} is parsed and validated against the allowlist, namespaced as tenant:channel, rate-limited per channel, then dispatched to a typed handler (O(1) lookup) or fanned out across nodes via Redis. One socket, many logical channels, hard tenant isolation.

Core implementation #

The router is a Map from a string key to a typed handler. Resolution is O(1); validation happens before dispatch; every tenant decision is keyed off the identity pinned at handshake, never off the message body. Connection state lives in a WeakMap keyed by the socket, so once a closed socket is no longer referenced anywhere else, its context can be garbage-collected without manual bookkeeping.

import type { WebSocket } from 'ws';

// Identity pinned during the upgrade; never trust the message body for this.
interface ConnContext {
  tenantId: string; // e.g. "acme", set once at handshake
  userId: string;
  subscriptions: Set<string>; // channels this socket has joined, namespaced
}

type RouteHandler = (
  payload: unknown,
  ctx: ConnContext,
  ws: WebSocket,
) => void | Promise<void>;

const MAX_FRAME_BYTES = 64 * 1024; // reject oversized frames before parsing
const routes = new Map<string, RouteHandler>();
const contexts = new WeakMap<WebSocket, ConnContext>();

export function registerRoute(name: string, handler: RouteHandler): void {
  routes.set(name, handler); // build the table once, at startup
}

// Namespacing: a channel is ALWAYS scoped by the connection's tenant.
// "room:42" from tenant "acme" becomes "acme:room:42", so tenants cannot collide.
export function namespaced(ctx: ConnContext, channel: string): string {
  return `${ctx.tenantId}:${channel}`;
}

export function routeMessage(ws: WebSocket, raw: Buffer): void {
  const ctx = contexts.get(ws);
  if (!ctx) return; // socket not yet registered: drop

  if (raw.byteLength > MAX_FRAME_BYTES) {
    return closeWith(ws, 1009, 'frame too large'); // 1009 = message too big
  }

  let route: string;
  let payload: unknown;
  try {
    // A body of `null` makes the destructuring throw, which lands in the catch.
    ({ route, payload } = JSON.parse(raw.toString('utf8')));
  } catch {
    return sendError(ws, 'MALFORMED_FRAME'); // never let a parse throw escape
  }

  // Allowlist check BEFORE lookup: an unknown route is a protocol error, and
  // keys such as "__proto__" are rejected like any other unregistered name.
  if (typeof route !== 'string' || !routes.has(route)) {
    return sendError(ws, 'UNKNOWN_ROUTE');
  }

  const handler = routes.get(route)!;
  // Isolate handler failures: one bad payload must not kill the socket loop.
  // The handler runs inside the promise chain so that synchronous throws are
  // caught as well as rejected promises.
  Promise.resolve()
    .then(() => handler(payload, ctx, ws))
    .catch((err) => {
      console.error(`route ${route} failed for ${ctx.tenantId}`, err);
      sendError(ws, 'HANDLER_ERROR');
    });
}

export function registerConnection(ws: WebSocket, ctx: ConnContext): void {
  contexts.set(ws, ctx); // WeakMap: GC-friendly, no leak on close
}

function sendError(ws: WebSocket, code: string): void {
  if (ws.readyState === ws.OPEN) {
    ws.send(JSON.stringify({ type: 'ERROR', code, ts: Date.now() }));
  }
}

function closeWith(ws: WebSocket, code: number, reason: string): void {
  if (ws.readyState === ws.OPEN) ws.close(code, reason);
}
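
The parse-and-validate step can also be pulled out into a pure function, which makes the adversarial cases easy to unit-test without a socket. The following is a minimal sketch rather than part of the router above: parseFrame, ParseResult, and knownRoutes are hypothetical names, and it assumes the byte-size check has already run on the raw buffer, so it only sees decoded text.

```typescript
// Hypothetical helper: classify a decoded frame without touching a socket.
type ParseResult =
  | { ok: true; route: string; payload: unknown }
  | { ok: false; code: 'MALFORMED_FRAME' | 'UNKNOWN_ROUTE' };

function parseFrame(text: string, knownRoutes: ReadonlySet<string>): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return { ok: false, code: 'MALFORMED_FRAME' };
  }

  // JSON.parse also returns null, numbers, strings and arrays; only a plain
  // object can carry the { route, payload } envelope.
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return { ok: false, code: 'MALFORMED_FRAME' };
  }

  const { route, payload } = parsed as { route?: unknown; payload?: unknown };
  if (typeof route !== 'string' || !knownRoutes.has(route)) {
    return { ok: false, code: 'UNKNOWN_ROUTE' };
  }
  return { ok: true, route, payload };
}
```

Returning a tagged result instead of sending errors directly keeps the socket side effects in one place: the caller maps each code onto an ERROR frame or a close.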

Subscription handlers join channels through the namespaced helper, so a tenant can never address another tenant’s room: the prefix is derived from the trusted context, not from anything the client sent. The publish handler then consults a per-channel rate limiter before it broadcasts:

// Token bucket per (tenant, channel). Keeps one noisy room from starving
// the worker and bounds fan-out during a broadcast storm.
const PUBLISH_REFILL_PER_SEC = 20;
const PUBLISH_BURST = 50;

interface Bucket { tokens: number; updatedAt: number; }
const buckets = new Map<string, Bucket>(); // note: entries are never evicted here

function allowPublish(key: string): boolean {
  const now = Date.now();
  const b = buckets.get(key) ?? { tokens: PUBLISH_BURST, updatedAt: now };
  const refill = ((now - b.updatedAt) / 1000) * PUBLISH_REFILL_PER_SEC;
  b.tokens = Math.min(PUBLISH_BURST, b.tokens + refill);
  b.updatedAt = now;
  if (b.tokens < 1) { buckets.set(key, b); return false; }
  b.tokens -= 1;
  buckets.set(key, b);
  return true;
}

// `redis` is assumed to be a connected, promise-returning publisher client
// created at startup; it is not defined in this snippet.
registerRoute('publish', async (payload, ctx) => {
  // The payload is client-controlled and may be null, so check its shape first.
  const { channel, body } = (payload ?? {}) as { channel?: unknown; body?: unknown };
  if (typeof channel !== 'string') return; // malformed publish: drop
  const ns = namespaced(ctx, channel); // tenant-scoped channel id
  if (!ctx.subscriptions.has(ns)) return; // must be subscribed to publish
  if (!allowPublish(ns)) return; // drop over-rate publishes
  // Awaiting routes a Redis failure into the router's error handling.
  await redis.publish(ns, JSON.stringify({ from: ctx.userId, body }));
});
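
The publish handler only accepts channels the socket has already joined, so a matching subscribe step has to populate ctx.subscriptions. A minimal sketch of that step follows. The names subscribe, CHANNEL_PATTERN, and MAX_SUBSCRIPTIONS are hypothetical, and both the channel-name pattern and the per-socket cap are illustrative assumptions, not values taken from this guide.

```typescript
interface ConnContext {
  tenantId: string;
  userId: string;
  subscriptions: Set<string>;
}

// Assumptions for illustration: a conservative channel-name shape such as
// "room:42", and an arbitrary cap on channels per socket.
const CHANNEL_PATTERN = /^[a-z0-9_-]+(:[a-z0-9_-]+)*$/i;
const MAX_SUBSCRIPTIONS = 100;

function namespaced(ctx: ConnContext, channel: string): string {
  return `${ctx.tenantId}:${channel}`;
}

// Returns the namespaced channel on success, or null when the request is
// rejected. The tenant prefix always comes from ctx, never from the payload.
function subscribe(ctx: ConnContext, payload: unknown): string | null {
  const channel = (payload as { channel?: unknown } | null | undefined)?.channel;
  if (typeof channel !== 'string' || !CHANNEL_PATTERN.test(channel)) return null;

  const ns = namespaced(ctx, channel);
  if (!ctx.subscriptions.has(ns) && ctx.subscriptions.size >= MAX_SUBSCRIPTIONS) {
    return null; // refuse to grow the set past the cap
  }
  ctx.subscriptions.add(ns);
  return ns;
}
```

A client in tenant "acme" that asks for "tenant-b:secret" ends up subscribed to "acme:tenant-b:secret", which still lives inside its own namespace.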

When connections span more than one node, that redis.publish call is what carries the message to sockets on other workers; each node subscribes to the channels its local clients hold and re-emits inbound Redis messages to them. The full multi-node fan-out topology (sharding, ordering, and back-pressure) is covered in Redis Pub/Sub fan-out.
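
The receiving side of that hand-off can be sketched as a local registry from namespaced channel to sockets, with a buffered-bytes check before each send. This is an illustration under assumptions, not the fan-out design of the linked guide: LocalSocket, localSubscribers, addLocalSubscriber, and deliverLocal are hypothetical names, the 1 MB threshold is the value from the configuration table, and terminating a laggard (instead of merely skipping it) is one possible policy.

```typescript
// Minimal socket shape used here; the `ws` WebSocket exposes these members.
interface LocalSocket {
  readonly bufferedAmount: number;
  send(data: string): void;
  terminate(): void;
}

const MAX_BUFFERED_BYTES = 1024 * 1024; // back-pressure threshold (1 MB)
const localSubscribers = new Map<string, Set<LocalSocket>>();

function addLocalSubscriber(ns: string, socket: LocalSocket): void {
  let members = localSubscribers.get(ns);
  if (!members) {
    members = new Set<LocalSocket>();
    localSubscribers.set(ns, members);
  }
  members.add(socket);
}

// Intended to be called from the Redis subscriber's message callback with the
// namespaced channel and the raw message string.
function deliverLocal(ns: string, message: string): { sent: number; dropped: number } {
  const members = localSubscribers.get(ns);
  if (!members) return { sent: 0, dropped: 0 };

  let sent = 0;
  let dropped = 0;
  members.forEach((socket) => {
    if (socket.bufferedAmount > MAX_BUFFERED_BYTES) {
      members.delete(socket); // deleting during Set iteration is safe
      socket.terminate();     // slow consumer: disconnect instead of buffering
      dropped += 1;
      return;
    }
    socket.send(message);
    sent += 1;
  });

  if (members.size === 0) localSubscribers.delete(ns); // avoid empty entries
  return { sent, dropped };
}
```

Because delivery is a Map lookup followed by a loop over local members only, the cost of a broadcast on each node is bounded by that node's own subscriber count.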

Configuration reference #

| Parameter | Type | Default | Production value | Notes |
|---|---|---|---|---|
| MAX_FRAME_BYTES | number | none | 65536 | Reject before JSON.parse; mirror in ws maxPayload. |
| PUBLISH_REFILL_PER_SEC | number | unlimited | 20 | Steady-state publishes per channel per second. |
| PUBLISH_BURST | number | unlimited | 50 | Token-bucket ceiling; absorbs short spikes. |
| routes table build | enum | per-message | startup-only | Populate Map once; never mutate per-connection. |
| namespaced prefix source | string | - | handshake context | Derive tenant from pinned identity, never the body. |
| ws.maxPayload (server opt) | number | 104857600 | 65536 | Library-level cap; backstops MAX_FRAME_BYTES. |
| backpressure threshold | number | none | ws.bufferedAmount > 1MB | Pause/drop on slow consumers before OOM. |

Edge cases & gotchas #

  • Tenant leakage via the message body. The single most dangerous bug. If channel is used un-prefixed, tenant-A can subscribe to tenant-B’s room by guessing its name. Always run channel strings through namespaced(ctx, …) so the tenant prefix comes from the handshake identity, not the frame.
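  • A throwing handler killing the whole worker. An uncaught exception inside the message loop propagates up and can crash the process, dropping every socket on that node. Run the handler inside a promise chain, for example Promise.resolve().then(() => handler(...)).catch(...), so that synchronous throws and rejected promises are both caught. Promise.resolve(handler(...)).catch(...) on its own misses a synchronous throw, because the handler has already thrown before Promise.resolve is called. Never await unguarded handler code in the read path.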
  • Unbounded parse on hostile frames. JSON.parse on a multi-megabyte frame blocks the event loop. Enforce MAX_FRAME_BYTES and the library-level maxPayload so a malicious client cannot stall dispatch for everyone.
  • Slow consumers and back-pressure. A subscriber that stops reading makes ws.bufferedAmount grow without bound during fan-out, leaking memory. Check bufferedAmount before broadcasting and drop or disconnect laggards rather than buffering forever.
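
The throwing-handler case above is easy to get subtly wrong, so here is a small self-contained comparison of two dispatch wrappers. It is a sketch for illustration: naiveDispatch, guardedDispatch, fragile, and report are hypothetical names, and the handler signature is reduced to a single payload argument.

```typescript
type Handler = (payload: unknown) => void | Promise<void>;

const reported: string[] = [];
const report = (err: unknown): void => {
  reported.push(err instanceof Error ? err.message : String(err));
};

// Naive form: handler(payload) is evaluated BEFORE Promise.resolve is called,
// so a synchronous throw propagates to the caller and .catch never sees it.
function naiveDispatch(handler: Handler, payload: unknown): void {
  Promise.resolve(handler(payload)).catch(report);
}

// Guarded form: the handler runs inside the promise chain, so synchronous
// throws and asynchronous rejections both arrive at .catch.
function guardedDispatch(handler: Handler, payload: unknown): void {
  Promise.resolve()
    .then(() => handler(payload))
    .catch(report);
}

// A handler that destructures an unvalidated payload throws synchronously
// when the payload is undefined.
const fragile: Handler = (payload) => {
  const { channel } = payload as { channel: string };
  void channel;
};

let naiveEscaped = false;
try {
  naiveDispatch(fragile, undefined);
} catch {
  naiveEscaped = true; // the TypeError reached the message loop
}

let guardedEscaped = false;
try {
  guardedDispatch(fragile, undefined);
} catch {
  guardedEscaped = true; // not reached: the error is delivered to report()
}
```

The guarded form defers the handler by one microtask, which does not reorder handlers for successive frames, since microtasks run in the order they were queued.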

Verification #

Confirm the router behaves under both normal and adversarial traffic:

# 1. Sockets on the listening port are established and owned by the node process.
ss -tnp 'sport = :8080' | head

# 2. Drive a tenant-isolation probe: connect as tenant A, attempt a
#    cross-tenant publish, assert it is dropped (expects no delivery).
wscat -c "wss://api.example.com/ws?token=$TENANT_A_JWT" \
  -x '{"route":"publish","payload":{"channel":"tenant-b:secret","body":1}}'

// 3. Metric assertion in a smoke test: unknown routes must be rejected,
//    not silently dispatched. sendFrame and flood are helpers assumed to
//    exist in your test harness; they are not defined in this guide.
const res = await sendFrame(ws, { route: '__proto__', payload: {} });
assert.equal(res.code, 'UNKNOWN_ROUTE');

// 4. Rate limiter caps fan-out: a burst above PUBLISH_BURST yields drops.
//    The bound below holds when the flood completes within about one second.
const accepted = await flood(ws, 'publish', 200);
assert.ok(accepted <= PUBLISH_BURST + PUBLISH_REFILL_PER_SEC);

In Chrome DevTools, open the Network → WS frames panel and watch a publish round-trip: an over-rate or cross-tenant frame should produce an ERROR frame or no echo, never a delivered payload.
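
The bound asserted in check 4 follows from the token-bucket arithmetic: over a window of t seconds, one channel can accept at most PUBLISH_BURST + PUBLISH_REFILL_PER_SEC × t publishes. The sketch below reproduces that with a simulated clock so it runs without a server. makeLimiter and clockMs are hypothetical names; the bucket logic mirrors allowPublish with the clock injected.

```typescript
const PUBLISH_REFILL_PER_SEC = 20;
const PUBLISH_BURST = 50;

interface Bucket { tokens: number; updatedAt: number; }

// Same token-bucket logic as allowPublish, with an injectable clock so the
// behaviour can be checked deterministically.
function makeLimiter(now: () => number): (key: string) => boolean {
  const buckets = new Map<string, Bucket>();
  return (key: string): boolean => {
    const t = now();
    const b = buckets.get(key) ?? { tokens: PUBLISH_BURST, updatedAt: t };
    const refill = ((t - b.updatedAt) / 1000) * PUBLISH_REFILL_PER_SEC;
    b.tokens = Math.min(PUBLISH_BURST, b.tokens + refill);
    b.updatedAt = t;
    buckets.set(key, b);
    if (b.tokens < 1) return false;
    b.tokens -= 1;
    return true;
  };
}

let clockMs = 0;
const allow = makeLimiter(() => clockMs);

// 200 publishes to one channel, evenly spread across one simulated second.
let accepted = 0;
for (let i = 0; i < 200; i++) {
  clockMs = i * 5;
  if (allow('acme:room:42')) accepted += 1;
}

// Buckets are independent per key: a quiet channel is unaffected.
const otherChannelAllowed = allow('acme:room:7');
```

In this run the accepted count lands between PUBLISH_BURST and PUBLISH_BURST + PUBLISH_REFILL_PER_SEC: the full burst is always available, and at most one second of refill is added on top.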

FAQ #

Should I use Socket.IO namespaces or roll my own router? #

Socket.IO namespaces and rooms give you a namespacing primitive for free, but they couple you to the Socket.IO protocol and its reconnection model. With the raw ws package you own the dispatch table and the wire format, which is what this guide assumes. If you already run Socket.IO, map its namespace to the tenant prefix shown here and keep the same allowlist and rate-limit guards — the security properties do not come from the library.

How do I route messages to a socket connected to a different node? #

An in-process Map lookup reaches only sockets on the local worker. For cross-node delivery, publish to a channel that every node subscribes to and let each node re-emit to its local subscribers. That is exactly the Redis Pub/Sub fan-out pattern; the redis.publish(ns, …) call in the publish handler above is the hand-off point.

Where should tenant identity come from? #

From the connection context pinned during the upgrade handshake, validated by WebSocket Authentication & Authorization. Never read tenantId from the message body — a client can set any value there, which defeats namespacing entirely.

Does an in-memory route table work behind a load balancer? #

Yes, as long as the load balancer keeps a client pinned to the node that holds its subscriptions. Without affinity, a reconnect can land on a node whose Map has no record of the client’s channels. Pair the router with Load Balancer Sticky Sessions, or externalise the subscription registry to Redis so any node can serve any client.