Implementing WebSocket Ping/Pong in Node.js
Symptom: Silent Connection Drops & Zombie Sockets
Clients receive stale state updates even though their network connections appear active, and server-side `ws.on('close')` handlers never fire. Memory accumulates as a result: backend WebSocket connection management logs show thousands of idle sockets holding file descriptors, load balancers report 504 Gateway Timeout on long-lived connections, and application metrics show a 15–20% heap increase over 24 hours without traffic spikes.
Run these diagnostics to confirm the issue:
- `lsof -i :8080 | wc -l` compared against active client count
- Node.js heap snapshot analysis for detached `WebSocket` objects
- Reverse proxy access logs showing `504` on idle connections
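To compare the `lsof` figure with what the server itself believes, it can help to tally the tracked sockets by state. The following is a minimal sketch, assuming each socket exposes the numeric `readyState` used by `ws` and the browser WebSocket API (0 CONNECTING, 1 OPEN, 2 CLOSING, 3 CLOSED). `countByState` is a hypothetical helper, not part of `ws`.

```javascript
// Hypothetical helper: tally sockets by their numeric readyState.
const STATE_NAMES = ['connecting', 'open', 'closing', 'closed'];

function countByState(clients) {
  const counts = { connecting: 0, open: 0, closing: 0, closed: 0, total: 0 };
  for (const client of clients) {
    const name = STATE_NAMES[client.readyState];
    if (name !== undefined) counts[name] += 1;
    counts.total += 1; // sockets with an unknown state still count toward the total
  }
  return counts;
}
```

With a `WebSocketServer`, this could be called as `countByState(wss.clients)`. A total well above the number of clients known to be connected points to zombie sockets. Keep in mind that the `lsof` count also includes a header line and the listening socket.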
The root cause is a missing heartbeat. RFC 6455 defines ping and pong control frames but does not require either side to send them, and the `ws` library does not send pings on its own. OS-level TCP keepalives do not fill the gap: even where they are enabled, the Linux default waits 7200s of idle time before the first probe, far longer than modern reverse proxy idle timeouts. Without explicit connection lifecycle and heartbeat logic, proxies silently terminate idle connections while the server keeps holding the dead sockets.
Resolution: Strict Ping/Pong Implementation with Timeout Boundaries
Implement a deterministic heartbeat loop that sends a ping frame every 30s and enforces a strict 10s pong timeout. Terminate and clean up any socket that fails to respond within that window. The following implementation uses the `ws` package (a third-party npm module, not a Node.js built-in) with explicit error handling.
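```javascript
import { WebSocketServer } from 'ws';

const HEARTBEAT_INTERVAL = 30_000; // 30s between pings
const PONG_TIMEOUT = 10_000; // 10s to answer each ping

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', (ws, req) => {
  // Capture the address up front; the underlying socket may be gone when we log.
  const remoteAddress = req.socket.remoteAddress;
  ws.isAlive = true;
  ws.pingTimeout = null;

  // A pong marks the peer as alive and cancels the pending deadline.
  ws.on('pong', () => {
    ws.isAlive = true;
    clearTimeout(ws.pingTimeout);
  });

  // Without an 'error' listener, a socket error is thrown as an uncaught exception.
  ws.on('error', (err) => {
    console.error('WebSocket error:', err.message);
  });

  ws.heartbeatInterval = setInterval(() => {
    if (!ws.isAlive) {
      console.warn('Terminating unresponsive WebSocket:', remoteAddress);
      return ws.terminate();
    }
    ws.isAlive = false;
    try {
      ws.ping();
      // Enforce the pong deadline for this ping.
      ws.pingTimeout = setTimeout(() => {
        if (!ws.isAlive) {
          ws.terminate();
        }
      }, PONG_TIMEOUT);
    } catch (err) {
      console.error('Ping frame send failed:', err.message);
      ws.terminate();
    }
  }, HEARTBEAT_INTERVAL);

  // Clear both timers on close so they do not leak.
  ws.on('close', () => {
    clearInterval(ws.heartbeatInterval);
    clearTimeout(ws.pingTimeout);
  });
});
```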
Apply these error boundaries in production:
- Wrap `ws.ping()` in `try/catch` to prevent unhandled exceptions on half-closed sockets
- Execute `clearTimeout` and `clearInterval` in the `close` handler to prevent timer leaks
- Use `ws.terminate()` instead of `ws.close()` for unresponsive sockets to force immediate TCP teardown
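Per-socket timers are not the only way to structure the heartbeat. A common variation, similar to the pattern shown in the `ws` README, runs one server-wide sweep so the number of timers does not grow with the number of connections. The sketch below assumes each socket has an `isAlive` flag that its `pong` handler sets back to `true`; `sweepClients` is a hypothetical name. The trade-off is that the pong deadline becomes the sweep interval itself rather than a separate 10s window.

```javascript
// One pass over all tracked sockets; intended to be called from a single setInterval.
function sweepClients(clients) {
  let terminated = 0;
  for (const ws of clients) {
    if (ws.isAlive === false) {
      // No pong arrived since the previous sweep: drop the socket.
      ws.terminate();
      terminated += 1;
      continue;
    }
    ws.isAlive = false; // set back to true by the socket's 'pong' handler
    ws.ping();
  }
  return terminated;
}

// Possible wiring, assuming `wss` is the WebSocketServer created earlier:
// wss.on('connection', (ws) => {
//   ws.isAlive = true;
//   ws.on('pong', () => { ws.isAlive = true; });
// });
// const sweepTimer = setInterval(() => sweepClients(wss.clients), 30_000);
// wss.on('close', () => clearInterval(sweepTimer));
```

Because the function only needs objects with `isAlive`, `ping()`, and `terminate()`, it can be exercised in unit tests with plain fakes instead of real sockets.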
Prevention: Monitoring & Configuration Hardening
Deploy structured logging to track heartbeat success and failure ratios. Configure reverse proxy idle timeouts to exceed HEARTBEAT_INTERVAL, so that each ping/pong exchange resets the proxy's idle timer before it expires. For example, with a 30s interval, set Nginx `proxy_read_timeout` to 45s. Integrate socket count metrics into APM dashboards to detect connection leaks early. Re-audit these settings whenever the infrastructure changes.
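One way to track heartbeat ratios is a small set of counters flushed as a JSON log line. This is only a sketch: `createHeartbeatStats`, the `ws_heartbeat` event name, and the field names are all hypothetical and would need to match whatever logging schema is already in place.

```javascript
// Hypothetical counters for heartbeat outcomes.
function createHeartbeatStats() {
  const stats = { pingsSent: 0, pongsReceived: 0, timeouts: 0 };
  return {
    recordPing() { stats.pingsSent += 1; },
    recordPong() { stats.pongsReceived += 1; },
    recordTimeout() { stats.timeouts += 1; },
    // Builds one JSON line per reporting period; counters are cumulative and not reset here.
    toLogLine(activeConnections) {
      const failureRatio = stats.pingsSent === 0 ? 0 : stats.timeouts / stats.pingsSent;
      return JSON.stringify({ event: 'ws_heartbeat', activeConnections, ...stats, failureRatio });
    },
  };
}
```

In the heartbeat loop, `recordPing()` would be called next to `ws.ping()`, `recordPong()` in the `pong` handler, and `recordTimeout()` just before `ws.terminate()`.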
Operational checklist:
- Align proxy idle timeout > heartbeat interval
- Monitor `active_connections` vs `heap_size`
- Automate load testing with simulated network partitions
- Enforce `ws.terminate()` over `ws.close()` for unresponsive sockets
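The first checklist item can be enforced at startup rather than by convention. The sketch below assumes the proxy idle timeout is made available to the application (for example through configuration); `checkHeartbeatTimings` and its parameter names are hypothetical. The second rule, that the pong timeout stays below the heartbeat interval, is an assumption tied to the per-socket implementation above, where each interval tick overwrites the pending timeout.

```javascript
// Hypothetical startup check for the heartbeat and proxy timing relationship.
function checkHeartbeatTimings({ heartbeatIntervalMs, pongTimeoutMs, proxyIdleTimeoutMs }) {
  const problems = [];
  if (proxyIdleTimeoutMs <= heartbeatIntervalMs) {
    problems.push('proxy idle timeout must exceed the heartbeat interval');
  }
  if (pongTimeoutMs >= heartbeatIntervalMs) {
    problems.push('pong timeout should be shorter than the heartbeat interval');
  }
  return {
    ok: problems.length === 0,
    problems,
    // A peer that dies right after answering a ping is noticed
    // one full interval plus one pong timeout later.
    worstCaseDetectionMs: heartbeatIntervalMs + pongTimeoutMs,
  };
}
```

With the values used in this guide (30s interval, 10s pong timeout, 45s proxy timeout), the check passes and the worst-case detection time is 40s.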