# OpenAI ships three next-gen voice models in its API

> OpenAI launched GPT-Realtime-2, GPT-Realtime-Translate and GPT-Realtime-Whisper in its Realtime API on 7 May 2026.

*Reasoning, translation, transcription — one stack. The brief.*

By The FeaturedDaily Desk · FeaturedDaily
Canonical: https://featureddaily.com/news/openai-voice-models-brief

> **Key:** **The one-liner:** OpenAI put GPT-5-class reasoning into a real-time voice model, the missing piece for voice agents that can act on a request rather than just talk.

**What happened.** On 7 May 2026, OpenAI added three audio models to its Realtime API: **GPT-Realtime-2** (a voice model with GPT-5-class reasoning), **GPT-Realtime-Translate** (live translation from 70+ input languages into 13 output languages) and **GPT-Realtime-Whisper** (streaming transcription).

**The pricing.** GPT-Realtime-Translate and GPT-Realtime-Whisper are billed by the minute; GPT-Realtime-2 is billed by token usage, a hint that the reasoning model is where the heavy lifting (and the cost) sits.
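
To make the billing split concrete, here is a minimal sketch of how the two schemes scale. Every rate and function name in it is a hypothetical placeholder chosen for illustration; none of the figures are OpenAI's published prices, and the token count for a session is simply assumed.

```python
# Illustrative only: the rates below are made-up placeholders, not OpenAI's prices.
HYPOTHETICAL_RATE_PER_MINUTE = 0.01           # USD per audio minute (assumed)
HYPOTHETICAL_RATE_PER_MILLION_TOKENS = 10.0   # USD per 1M tokens (assumed)


def per_minute_cost(audio_seconds: float, rate_per_minute: float) -> float:
    """Duration-based billing: cost depends only on how long the audio runs."""
    return (audio_seconds / 60.0) * rate_per_minute


def per_token_cost(tokens: int, rate_per_million: float) -> float:
    """Token-based billing: cost scales with the number of tokens processed."""
    return (tokens / 1_000_000) * rate_per_million


if __name__ == "__main__":
    # A 2-minute transcription or translation session under per-minute billing.
    minute_billed = per_minute_cost(120, HYPOTHETICAL_RATE_PER_MINUTE)

    # The same 2 minutes on a token-billed model; the token count is assumed
    # and would vary with how much the model hears, reasons and says.
    token_billed = per_token_cost(500_000, HYPOTHETICAL_RATE_PER_MILLION_TOKENS)

    print(f"per-minute billing: ${minute_billed:.2f}")
    print(f"per-token billing:  ${token_billed:.2f}")
```

The practical difference: a per-minute bill is predictable from call length alone, while a per-token bill moves with how much work the model does inside each call.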

> **Note:** **Why it matters.** Folding reasoning, translation and speech into one low-latency stack is what makes capable voice agents buildable — and it set the bar just before Sesame's app and Apple's Siri AI.

## Key takeaways

- GPT-Realtime-2: first OpenAI voice model with GPT-5-class reasoning.
- GPT-Realtime-Translate: live speech translation, 70+ input languages into 13 output languages.
- GPT-Realtime-Whisper: streaming speech-to-text as you talk.
- Billing: Translate/Whisper by the minute; Realtime-2 by tokens.

## FAQ

### What are the three new OpenAI voice models?
GPT-Realtime-2 (a voice model with GPT-5-class reasoning), GPT-Realtime-Translate (live speech translation from 70+ input languages into 13 output languages), and GPT-Realtime-Whisper (streaming transcription), all released on 7 May 2026 in OpenAI's Realtime API.

### Who are they for?
Developers building voice features into apps — assistants, customer service, translation and transcription. End users feel the benefit indirectly through smarter voice experiences.
