OpenAI
OpenAI ships three next-gen voice models in its API
Reasoning, translation, transcription: one stack.
The brief.
OpenAI launched GPT-Realtime-2, GPT-Realtime-Translate and GPT-Realtime-Whisper in its Realtime API on 7 May 2026.
What happened. On 7 May, OpenAI added three audio models to its Realtime API: GPT-Realtime-2 (a voice model with GPT-5-class reasoning), GPT-Realtime-Translate (live translation from 70+ input languages into 13 output languages) and GPT-Realtime-Whisper (streaming transcription).
The pricing. GPT-Realtime-Translate and GPT-Realtime-Whisper are billed by the minute of audio; GPT-Realtime-2 is billed by token usage, a hint that this is where the heavy lifting (and the cost) sits.
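To make the difference between the two billing models concrete, here is a minimal Python sketch. It assumes nothing about OpenAI's actual prices: the class names, the split into input and output token rates, and every rate value are hypothetical placeholders chosen only to show how the arithmetic differs. Per-minute billing scales with audio duration, while per-token billing scales with how much the model reads and generates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerMinuteRate:
    """Flat price per minute of audio (hypothetical rate, not a real price)."""
    usd_per_minute: float

    def cost(self, audio_seconds: float) -> float:
        # Cost depends only on how long the audio runs.
        return self.usd_per_minute * audio_seconds / 60.0


@dataclass(frozen=True)
class PerTokenRate:
    """Price per million tokens (hypothetical rates, not real prices)."""
    usd_per_million_input: float
    usd_per_million_output: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        # Cost depends on token counts, so a long or reasoning-heavy
        # exchange costs more than a short one of the same duration.
        total = (input_tokens * self.usd_per_million_input
                 + output_tokens * self.usd_per_million_output)
        return total / 1_000_000


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    minute_billed = PerMinuteRate(usd_per_minute=0.5)
    token_billed = PerTokenRate(usd_per_million_input=2.0,
                                usd_per_million_output=8.0)

    print(minute_billed.cost(audio_seconds=120))
    print(token_billed.cost(input_tokens=500_000, output_tokens=250_000))
```

Under per-minute billing, the cost of a session can be predicted from its length alone; under per-token billing, two sessions of equal length can cost different amounts depending on how much the model processes and produces.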
Sources
- Advancing voice intelligence with new models in the API — OpenAI, 7 May 2026
- OpenAI launches new voice intelligence features in its API — TechCrunch, 7 May 2026