OpenAI Accelerates Agent Workflows with WebSockets in Responses API
OpenAI's new WebSocket support in the Responses API reduces latency and API overhead for agentic workflows. The update also introduces connection-scoped caching, which cuts redundant computation across requests on the same connection.

OpenAI has introduced WebSockets in its Responses API to speed up agentic workflows. A persistent WebSocket connection cuts per-request API overhead and improves model latency, so agent interactions complete faster. The update also adds connection-scoped caching: work done for earlier requests on a connection can be reused by later requests on that same connection, minimizing redundant computation.
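OpenAI has not published the internals of this caching, but the idea can be pictured as a cache whose lifetime is tied to a single connection: repeated requests over that connection skip work already done, and the cache vanishes when the connection closes. The `ConnectionCache` class below is a hypothetical toy model for illustration, not part of the Responses API.

```python
import hashlib

class ConnectionCache:
    """Toy model of connection-scoped caching: results are cached
    per connection and discarded when the connection closes."""

    def __init__(self):
        self._store = {}   # cache entries, scoped to this one connection
        self.misses = 0    # how often the "expensive" work actually ran

    def process(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key not in self._store:
            self.misses += 1
            # Stand-in for expensive server-side work (e.g. re-encoding
            # a long shared prefix of the conversation).
            self._store[key] = prompt.upper()
        return self._store[key]

# One WebSocket connection -> one cache. A repeat on the same
# connection is served from the cache; a new connection starts cold.
conn = ConnectionCache()
conn.process("system: you are a helpful agent")
conn.process("system: you are a helpful agent")  # cache hit, no extra work
print(conn.misses)  # 1 miss across two identical requests
```

The key design point is scoping: because the cache lives and dies with the connection, the server never has to invalidate entries across clients, which is what makes this kind of caching cheap to offer.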
This matters for developers building complex agent systems that chain many model calls in real time. Lower per-request latency means agents respond more quickly, improving user experience and enabling more dynamic applications. Unlike traditional HTTP, where each request can pay for connection setup, a WebSocket keeps a single connection open, eliminating repeated handshakes and streamlining data exchange.
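The handshake savings can be sketched with a toy comparison: the two client classes below simulate per-request HTTP setup versus a single WebSocket upgrade, counting handshakes rather than doing real network I/O. Both classes are illustrative stand-ins, not actual client libraries.

```python
class HttpClient:
    """Toy HTTP client: every request pays connection setup."""
    def __init__(self):
        self.handshakes = 0

    def request(self, payload: str) -> str:
        self.handshakes += 1            # TCP/TLS setup on each request
        return f"response to {payload}"

class WebSocketClient:
    """Toy WebSocket client: one upgrade, then many messages."""
    def __init__(self):
        self.handshakes = 1             # single handshake at connect time

    def send(self, payload: str) -> str:
        return f"response to {payload}"  # reuses the open connection

http_client, ws_client = HttpClient(), WebSocketClient()
for turn in range(10):                   # ten agent turns
    http_client.request(f"turn {turn}")
    ws_client.send(f"turn {turn}")

print(http_client.handshakes)  # 10 handshakes over per-request HTTP
print(ws_client.handshakes)    # 1 handshake over WebSocket
```

In a real deployment the per-request cost is a TCP and TLS round trip (unless connections are pooled), so the gap grows with the number of agent turns and with network distance to the API.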
Looking ahead, this enhancement could set a new standard for agentic workflows in the AI industry. Developers are likely to adopt this technology to build more responsive and efficient AI systems. Open questions remain about how this will scale with increased usage and whether other API providers will follow suit with similar optimizations.