At-least-once WebSocket delivery with acknowledgements #

You shipped a notification, the server’s socket.send() returned without throwing, and the user never saw it. The connection had dropped mid-flight — TCP buffered the frame, the socket reset, and the bytes evaporated. If you are landing here because messages silently disappear across reconnects, the fix is an application-level acknowledgement protocol layered on top of the raw socket. This page builds one: an ack envelope, a per-client outbox of unacked messages, resend-on-timeout, and client-side deduplication by a monotonically increasing sequence number.

Root cause #

Raw WebSocket gives you ordered, reliable message delivery only while the underlying TCP connection is alive. The protocol (RFC 6455) defines no end-to-end delivery receipt. WebSocket.send() does not block until the peer reads the data; it queues the frame for transmission and returns. bufferedAmount tells you how much is still queued locally, but a frame that has left your buffer is not a frame the peer has processed.

When the connection drops, three loss windows open:

  • In-flight frames: bytes written to the TCP send buffer but not yet ACKed at the transport layer are discarded when the socket resets (1006 Abnormal Closure, ECONNRESET).
  • Unread frames: data delivered to the peer’s kernel but not yet read into application memory is lost when the application-side socket object is torn down.
  • The reconnect gap: anything the server tries to send while the client is offline and re-establishing has no socket to write to at all.

A new TCP connection after reconnect is a fresh stream with no memory of the old one. TCP’s automatic retransmission only covers segments within a single live connection; it cannot replay a message across a reconnect. So “the socket reconnected” is not the same as “no messages were lost”: your auto-reconnection strategies restore the pipe without restoring the in-flight payloads. Delivery guarantees have to be rebuilt one layer up, in your own protocol.

At-least-once means: every message is delivered one or more times. The sender keeps retrying until it gets proof of receipt; the receiver tolerates duplicates. That duplicate tolerance is non-negotiable — a resend after a lost ack will deliver the same message twice.
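The retry-until-acked and duplicate-tolerance rules can be sketched without any socket at all. The model below is a minimal illustration, not the implementation used later on this page: `Receiver`, `sendUntilAcked`, and the scripted `dropMsg` / `dropAck` loss patterns are hypothetical names invented for the example.

```typescript
// Hypothetical in-memory model of at-least-once delivery (no real network).
type Msg = { seq: number; payload: string };

class Receiver {
  private lastSeq = 0;
  readonly applied: string[] = [];

  // Applies a message at most once and returns the seq to acknowledge.
  // Duplicates are acknowledged but not re-applied.
  receive(msg: Msg): number {
    if (msg.seq > this.lastSeq) {
      this.applied.push(msg.payload);
      this.lastSeq = msg.seq;
    }
    return msg.seq;
  }
}

// Retries `msg` until an ack comes back. dropMsg[i] / dropAck[i] script whether
// attempt i loses the message or the ack (an assumed loss pattern for the demo).
function sendUntilAcked(rx: Receiver, msg: Msg, dropMsg: boolean[], dropAck: boolean[]): number {
  let attempts = 0;
  for (;;) {
    const i = attempts++;
    if (dropMsg[i]) continue; // message lost in flight: retry
    const ack = rx.receive(msg); // message reached the receiver
    if (dropAck[i]) continue; // ack lost on the way back: retry
    if (ack === msg.seq) return attempts; // proof of receipt: stop retrying
  }
}

const rx = new Receiver();
// Attempt 0 loses the message, attempt 1 loses the ack, attempt 2 succeeds.
const attempts = sendUntilAcked(
  rx,
  { seq: 1, payload: 'hello' },
  [true, false, false],
  [false, true, false],
);
console.log(attempts, rx.applied); // three attempts, payload applied once
```

The receiver sees the message twice (attempts 1 and 2) but applies it once, which is the behaviour the rest of this page builds on.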

Figure: At-least-once ack and resend sequence. Server sends seq 7 from its outbox; the client acks, the server clears the outbox entry, then a reconnect triggers a resend of an unacked seq 8 which the client dedups.

Resolution #

Wrap every payload in an envelope { seq, payload }, hold each unacked envelope in a per-client outbox keyed by seq, and arm a resend timer when you send. On ack {seq}, clear the entry and cancel its timer. On reconnect, replay the entire outbox. The client tracks the highest seq it has processed and drops anything at or below it.

import { WebSocketServer, WebSocket } from 'ws';

const ACK_TIMEOUT_MS = 5_000; // wait this long for an ack before resending
const MAX_RESENDS = 5; // give up (and alert) after this many tries

type Envelope = { type: 'msg'; seq: number; payload: unknown };
type Ack = { type: 'ack'; seq: number };

// One outbox per logical client, keyed by a stable clientId (NOT the socket:
// the socket changes on every reconnect; the client identity must not).
class ClientOutbox {
  private nextSeq = 1;
  private pending = new Map<number, { env: Envelope; timer: NodeJS.Timeout; tries: number }>();
  private socket: WebSocket | null = null;

  // Rebind to the live socket after a reconnect, then flush everything unacked.
  attach(socket: WebSocket) {
    this.socket = socket;
    for (const entry of this.pending.values()) this.write(entry.env); // replay outbox in seq order
  }

  // Unbind only if `socket` is still the current one: a late 'close' from an
  // old connection must not detach the socket of a newer reconnect.
  detach(socket: WebSocket) {
    if (this.socket === socket) this.socket = null; // stop writing; timers keep running
  }

  send(payload: unknown) {
    const env: Envelope = { type: 'msg', seq: this.nextSeq++, payload };
    this.arm(env, 0); // store + start the resend timer
    this.write(env); // best-effort first attempt
  }

  // Called when the client confirms receipt of a specific seq.
  onAck(seq: number) {
    const entry = this.pending.get(seq);
    if (!entry) return; // duplicate or late ack: already cleared, ignore
    clearTimeout(entry.timer);
    this.pending.delete(seq); // delivery confirmed; safe to forget
  }

  private arm(env: Envelope, tries: number) {
    const timer = setTimeout(() => this.onTimeout(env.seq), ACK_TIMEOUT_MS);
    this.pending.set(env.seq, { env, timer, tries });
  }

  private onTimeout(seq: number) {
    const entry = this.pending.get(seq);
    if (!entry) return; // acked before the timer ran: nothing to do
    if (this.socket?.readyState !== WebSocket.OPEN) {
      this.arm(entry.env, entry.tries); // client offline: keep waiting without spending a resend
      return;
    }
    if (entry.tries >= MAX_RESENDS) {
      this.pending.delete(seq); // poison message: stop looping, surface it
      console.error(`giving up on seq ${seq} after ${MAX_RESENDS} resends`);
      return;
    }
    this.arm(entry.env, entry.tries + 1); // re-arm with incremented try count
    this.write(entry.env); // resend the SAME seq, so the client can dedup
  }

  private write(env: Envelope) {
    if (this.socket?.readyState === WebSocket.OPEN) this.socket.send(JSON.stringify(env));
    // if not OPEN, the timer or the next attach() will retry; never drop the entry
  }
}

const outboxes = new Map<string, ClientOutbox>(); // clientId -> outbox, survives reconnects
const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (socket, req) => {
  const clientId = new URL(req.url ?? '/', 'http://x').searchParams.get('cid'); // stable identity
  if (!clientId) {
    socket.close(1008, 'missing cid'); // no identity, no outbox: refuse the connection
    return;
  }
  const outbox = outboxes.get(clientId) ?? new ClientOutbox();
  outboxes.set(clientId, outbox);
  outbox.attach(socket); // replays any unacked messages from before the drop

  socket.on('message', (raw) => {
    let msg: Partial<Ack> | null;
    try {
      msg = JSON.parse(raw.toString());
    } catch {
      return; // ignore frames that are not valid JSON
    }
    if (msg?.type === 'ack' && typeof msg.seq === 'number') outbox.onAck(msg.seq);
  });
  socket.on('close', () => outbox.detach(socket));
});

The client side is deliberately small: process the payload, then ack — and dedup before processing so a resend never double-applies.

// Assumes `ws` is an open WebSocket and `applyPayload` is your own handler.
let lastSeq = 0; // highest seq already applied; persist per session

ws.onmessage = (e) => {
  const env = JSON.parse(e.data) as { type: 'msg'; seq: number; payload: unknown };
  if (env.type !== 'msg') return;

  if (env.seq > lastSeq) { // first time we have seen this seq
    applyPayload(env.payload); // your business logic runs exactly once
    lastSeq = env.seq;
  } // else: a resend of something already applied; skip the work
  ws.send(JSON.stringify({ type: 'ack', seq: env.seq })); // ALWAYS ack, even on a dup
};

Acking even duplicates is the detail people miss: if the original ack was lost, the server resends, and if the client stays silent on the duplicate, the server keeps resending until it exhausts MAX_RESENDS and gives up on a message that was in fact delivered. Comparing seq against lastSeq also lets the client detect a hole (a seq greater than lastSeq + 1) and request a replay from that point, which is the building block for ordered, gap-free delivery on top of at-least-once. The handler above does not do this; it accepts any seq higher than lastSeq. This whole mechanism is one piece of a broader Scaling Real-Time Infrastructure story; the outbox is per-client state that must live somewhere durable once you run more than one node.
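Hole detection could look roughly like the sketch below. It is an illustration under stated assumptions: `OrderedReceiver` and the `replay` reply type are hypothetical, and the server shown earlier has no handler for a `replay` message. Because the out-of-order envelope is never acked, it stays in the server outbox and is resent on timeout even if the replay request is ignored.

```typescript
type Envelope = { type: 'msg'; seq: number; payload: unknown };
type Reply =
  | { type: 'ack'; seq: number }
  | { type: 'replay'; from: number }; // hypothetical message type, not part of the protocol above

class OrderedReceiver {
  private lastSeq = 0;
  private applyPayload: (payload: unknown) => void;

  constructor(applyPayload: (payload: unknown) => void) {
    this.applyPayload = applyPayload;
  }

  // Decides what to send back for one incoming envelope.
  handle(env: Envelope): Reply {
    if (env.seq <= this.lastSeq) {
      return { type: 'ack', seq: env.seq }; // duplicate: ack again, skip the work
    }
    if (env.seq === this.lastSeq + 1) {
      this.applyPayload(env.payload); // next in order: apply it once
      this.lastSeq = env.seq;
      return { type: 'ack', seq: env.seq };
    }
    // Hole: at least one seq between lastSeq and env.seq is missing.
    // Neither apply nor ack; ask for everything after lastSeq instead.
    return { type: 'replay', from: this.lastSeq + 1 };
  }
}

// Usage: seq 3 arrives before seq 2, then both arrive in order, then a dup of seq 1.
const applied: unknown[] = [];
const receiver = new OrderedReceiver((p) => { applied.push(p); });
const replies = [1, 3, 2, 3, 1].map((seq) =>
  receiver.handle({ type: 'msg', seq, payload: `m${seq}` }),
);
console.log(applied, replies.map((r) => r.type));
```

The trade-off is that a single missing message stalls everything behind it until the resend arrives, which is the usual cost of strict ordering.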

Operational checklist #

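  • Tune ACK_TIMEOUT_MS so it comfortably exceeds a normal round trip; too short a value causes spurious resends on healthy connections
  • Confirm the clientId is stable across reconnects, since the outbox is keyed by it and not by the socket
  • Verify duplicate delivery is harmless: every consumer of applyPayload must be idempotent or guarded by lastSeq
  • Persist lastSeq for the session so replayed messages are not applied twice after a reconnect
  • Alert on MAX_RESENDS: a message that exhausts its resends is removed from the outbox and only logged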
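For the lastSeq item, one possible shape is a pair of helpers over a tiny storage interface. This is a sketch, not a prescribed design: `KeyValueStore`, `loadLastSeq`, `saveLastSeq`, and the `ws:lastSeq` key are names made up for the example. The interface mirrors the `getItem` / `setItem` methods of the browser Web Storage API, so `sessionStorage` should fit it; an in-memory stand-in is used here so the snippet runs outside a browser.

```typescript
// Minimal subset of the Web Storage API (getItem / setItem only).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Reads the persisted lastSeq, falling back to 0 for missing or corrupt values.
function loadLastSeq(store: KeyValueStore, key: string): number {
  const raw = store.getItem(key);
  const n = raw === null ? 0 : Number(raw);
  return Number.isSafeInteger(n) && n >= 0 ? n : 0;
}

// Persists only forward progress so a stale write cannot rewind the counter.
function saveLastSeq(store: KeyValueStore, key: string, seq: number): void {
  if (seq > loadLastSeq(store, key)) store.setItem(key, String(seq));
}

// In-memory stand-in for sessionStorage (an assumption for the demo).
const backing = new Map<string, string>();
const memoryStore: KeyValueStore = {
  getItem: (k) => backing.get(k) ?? null,
  setItem: (k, v) => { backing.set(k, v); },
};

saveLastSeq(memoryStore, 'ws:lastSeq', 7);
saveLastSeq(memoryStore, 'ws:lastSeq', 5); // ignored: would move backwards
const restored = loadLastSeq(memoryStore, 'ws:lastSeq');
console.log(restored);
```

One caveat worth checking in your own setup: a persisted lastSeq is only meaningful while the server keeps the same sequence for that clientId. With the in-process outbox shown earlier, a server restart begins again at seq 1, and a client holding a higher lastSeq would treat the new messages as duplicates.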

FAQ #

Does raw WebSocket guarantee delivery on its own? #

No. WebSocket inherits TCP’s reliability only for the lifetime of one connection. There is no end-to-end receipt in RFC 6455, and send() returns once the frame is buffered locally — not when the peer reads it. Any frame in flight when the socket resets is lost, which is why you need an application-level ack.

How is this different from TCP retransmission? #

TCP retransmits segments within a single live connection and gives up when that connection dies. At-least-once acks operate above the socket: the outbox survives the drop and replays on the next connection. TCP can’t replay a message across a reconnect because the new connection is a brand-new stream.

Why must the client ack duplicate messages? #

A resend usually happens precisely because the previous ack was lost. If the client recognizes the duplicate (correctly skipping the work) but stays silent, the server never learns the message arrived; it keeps resending until MAX_RESENDS is exhausted and then reports a delivered message as failed. Always ack: dedup the side effects, not the acknowledgement.

Where should the outbox live in a multi-node deployment? #

In shared, durable storage — Redis (a sorted set or hash keyed by clientId) or a database table. An in-process Map is fine for a single node but loses every unacked message when the process restarts, and a reconnect routed to a different node would find no outbox at all.
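One way to prepare for that move is to put the outbox behind a small storage interface, so the in-process Map and a shared store are interchangeable. The sketch below is illustrative only: `OutboxStore`, its method names, and `MemoryOutboxStore` are hypothetical, and no Redis or database client is used. A shared implementation would map the same four operations onto whatever storage you choose.

```typescript
type Envelope = { type: 'msg'; seq: number; payload: unknown };

// Hypothetical storage boundary; the method names are illustrative, not a library API.
interface OutboxStore {
  nextSeq(clientId: string): Promise<number>; // allocate the next seq for this client
  put(clientId: string, env: Envelope): Promise<void>; // record an unacked envelope
  remove(clientId: string, seq: number): Promise<void>; // forget it once acked
  unacked(clientId: string): Promise<Envelope[]>; // everything to replay, ascending by seq
}

// Single-process reference implementation, useful for tests and local runs.
class MemoryOutboxStore implements OutboxStore {
  private seqs = new Map<string, number>();
  private entries = new Map<string, Map<number, Envelope>>();

  async nextSeq(clientId: string): Promise<number> {
    const next = (this.seqs.get(clientId) ?? 0) + 1;
    this.seqs.set(clientId, next);
    return next;
  }

  async put(clientId: string, env: Envelope): Promise<void> {
    let perClient = this.entries.get(clientId);
    if (!perClient) {
      perClient = new Map<number, Envelope>();
      this.entries.set(clientId, perClient);
    }
    perClient.set(env.seq, env);
  }

  async remove(clientId: string, seq: number): Promise<void> {
    this.entries.get(clientId)?.delete(seq);
  }

  async unacked(clientId: string): Promise<Envelope[]> {
    const perClient = this.entries.get(clientId);
    if (!perClient) return [];
    return Array.from(perClient.values()).sort((a, b) => a.seq - b.seq);
  }
}

// Usage: queue two messages, ack the first, then list what a reconnect would replay.
async function demo(store: OutboxStore): Promise<number[]> {
  for (const payload of ['a', 'b']) {
    const seq = await store.nextSeq('client-1');
    await store.put('client-1', { type: 'msg', seq, payload });
  }
  await store.remove('client-1', 1);
  return (await store.unacked('client-1')).map((e) => e.seq);
}
```

Keeping seq allocation inside the store matters in a multi-node setup: if each node kept its own counter, two nodes could hand out the same seq to one client.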

How do I get exactly-once instead of at-least-once? #

You don’t get true exactly-once over the network — you get at-least-once delivery plus idempotent processing, which is observationally exactly-once. The client’s seq > lastSeq check is exactly that: receive duplicates, apply each effect once. Make every downstream side effect keyed by seq and duplicates become harmless.
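A side effect keyed by seq might be guarded like the sketch below. It is a minimal illustration: `IdempotentApplier` and the balance example are hypothetical, and the in-memory Set stands in for whatever durable record of applied keys your system would keep.

```typescript
// Hypothetical guard: remembers which (clientId, seq) effects have already run.
class IdempotentApplier {
  private done = new Set<string>();
  private effect: (amount: number) => void;

  constructor(effect: (amount: number) => void) {
    this.effect = effect;
  }

  // Returns true if the effect ran, false if this (clientId, seq) was a duplicate.
  apply(clientId: string, seq: number, amount: number): boolean {
    const key = `${clientId}:${seq}`;
    if (this.done.has(key)) return false;
    this.effect(amount); // if this throws, the key is not recorded and a resend retries it
    this.done.add(key);
    return true;
  }
}

// Usage: the same message delivered twice changes the balance once.
let balance = 0;
const applier = new IdempotentApplier((amount) => { balance += amount; });
const first = applier.apply('client-1', 42, 10); // runs the effect
const second = applier.apply('client-1', 42, 10); // duplicate delivery: skipped
const third = applier.apply('client-1', 43, 5); // new seq: runs the effect
console.log(balance, first, second, third);
```

The Set grows with every message, so a real version would need to prune keys, for example everything at or below the lowest seq that can still be resent.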
