2025-08-28 · OpenAI

Introducing gpt-realtime and Realtime API updates

models · infrastructure

Source: OpenAI · Date: 2025-08-28 · URL: https://openai.com/index/introducing-gpt-realtime

Summary

In August 2025 OpenAI launched the gpt-realtime model alongside updates to the Realtime API, enabling low-latency, end-to-end audio conversation. This is not text-to-speech layered on a text model but a native audio model that processes and generates speech directly. The Realtime API lets developers build voice assistants, phone-based AI agents, and real-time interpretation tools without the latency and quality loss of a text intermediary. The model is named “gpt-realtime” to distinguish it from GPT-5’s text API.

Implications

Audio AI as a developer primitive. The Realtime API makes low-latency voice a buildable layer rather than a product only OpenAI can ship. This opens the voice AI market to every developer building on the API: customer service voice agents, healthcare intake, language tutoring, and real-time translation are all use cases that previously required bespoke voice AI infrastructure.

Latency as the differentiator. OpenAI’s May 2024 launch demo of GPT-4o’s real-time voice was compelling partly because it was fast: sub-second turn-taking rather than the 2-3 second waits typical of cascaded pipelines that chain speech-to-text, a text model, and text-to-speech. The Realtime API’s value is making that latency profile available to developers, not just OpenAI’s consumer products.
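To make the latency argument concrete, here is a rough Python sketch of a per-turn latency budget. It assumes the stages of a cascaded pipeline run strictly one after another, so their delays add up, while a native speech model has a single stage. The stage names and millisecond figures are illustrative assumptions chosen to match the orders of magnitude above; they are not measurements from OpenAI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: int  # hypothetical figure, not a benchmark


def turn_latency_ms(stages):
    """Delay before the caller hears a reply, assuming stages run in sequence."""
    return sum(stage.latency_ms for stage in stages)


# Hypothetical cascaded pipeline: transcribe, generate text, synthesize speech.
CASCADED = [
    Stage("speech-to-text", 600),
    Stage("text model", 1200),
    Stage("text-to-speech", 700),
]

# Hypothetical native audio model: one speech-in, speech-out stage.
NATIVE = [Stage("speech-to-speech model", 500)]

cascaded_ms = turn_latency_ms(CASCADED)
native_ms = turn_latency_ms(NATIVE)

print(f"cascaded: {cascaded_ms} ms, native: {native_ms} ms")
print(f"saved per turn: {cascaded_ms - native_ms} ms")
```

Real systems stream partial results between stages, so a strict sum overstates the cascaded case somewhat; the sketch only shows why removing stages shortens the turn.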

Thread: audio model. This release follows Voice Engine (March 2024) and the September 2023 ChatGPT voice launch. The Realtime API is the developer-facing version of the audio stack that powers the consumer voice experience.

Watch: Whether the Realtime API’s pricing, which is metered per audio token rather than per call, makes voice-first AI applications cost-competitive with existing IVR and voice automation infrastructure at enterprise call center scale.
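A back-of-the-envelope version of that cost comparison can be sketched in Python. Every number below is a placeholder assumption: the token counts per conversation minute, the prices per million tokens, and the incumbent per-minute cost are all hypothetical and should be replaced with current published pricing and real call data. The function and variable names are likewise illustrative.

```python
def ai_cost_per_minute(in_tokens, out_tokens, in_price_per_m, out_price_per_m):
    """Model cost in dollars for one minute of conversation.

    in_tokens / out_tokens: audio tokens consumed and produced in that minute.
    in_price_per_m / out_price_per_m: dollars per one million tokens.
    """
    return (in_tokens * in_price_per_m + out_tokens * out_price_per_m) / 1_000_000


# Hypothetical inputs; none of these come from the source.
IN_TOKENS_PER_MIN = 300      # caller speaks for part of the minute
OUT_TOKENS_PER_MIN = 600     # model speaks for part of the minute
IN_PRICE_PER_M = 32.0        # placeholder dollars per 1M audio input tokens
OUT_PRICE_PER_M = 64.0       # placeholder dollars per 1M audio output tokens
INCUMBENT_PER_MIN = 0.10     # placeholder IVR cost per minute

ai_per_min = ai_cost_per_minute(
    IN_TOKENS_PER_MIN, OUT_TOKENS_PER_MIN, IN_PRICE_PER_M, OUT_PRICE_PER_M
)
competitive = ai_per_min < INCUMBENT_PER_MIN

print(f"AI model cost per minute: ${ai_per_min:.4f}")
print(f"Cheaper than incumbent at ${INCUMBENT_PER_MIN:.2f}/min: {competitive}")
```

The sketch covers model usage only. It leaves out telephony, engineering, and any extra input tokens billed when conversation history is carried forward across turns, all of which would push the real per-minute figure higher.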
