The Decoder, June 8, 2025

OpenAI makes ChatGPT voice more natural for live translation

OpenAI has updated ChatGPT's Advanced Voice Mode for subscribers with more natural speech and continuous real-time translation. The feature can interpret both sides of a conversation until told to stop, but audio glitches and strange unprompted sounds can still occur.

OpenAI has updated ChatGPT's voice features for subscribers, with a focus on making spoken conversations feel smoother, more expressive, and more useful in multilingual settings.

The revamped Advanced Voice Mode is designed to improve how ChatGPT speaks, not only what it says. The update also adds real-time translation that can keep running through a conversation until the user tells ChatGPT to stop.

What changed in ChatGPT voice

According to OpenAI, the updated Advanced Voice Mode now produces speech that is more natural and emotionally nuanced. The improvements include better intonation, more lifelike pauses, and a stronger ability to convey tones such as empathy or sarcasm.

That matters because voice AI is judged in real time. A text response can be scanned, paused, or reread. A spoken response has to land immediately, and small details such as pitch, pacing, and timing can shape whether the exchange feels fluid or awkward.

OpenAI's stated direction is clear: ChatGPT is being pushed closer to natural, real-time interaction. Advanced Voice Mode is meant to support conversations where users can interrupt the AI and where the system can express emotion during an exchange.

Real-time translation becomes a core use case

The most practical addition is continuous real-time translation. Users can ask ChatGPT to translate between specific language pairs, and the AI will then interpret both sides of the conversation until instructed otherwise.

This makes ChatGPT voice less like a one-off translation tool and more like a live conversation assistant. Instead of translating a single phrase, the system can remain active while two people speak across a language gap.

OpenAI suggests this could help in everyday and professional settings, such as ordering at a restaurant or holding multilingual discussions at work. Those examples point to the main value of the update: reducing friction when people need to understand each other quickly and do not want to keep restarting a translation task.

Language pairs: users specify the languages they want ChatGPT to translate between.
Continuous interpretation: ChatGPT keeps translating both sides of the exchange.
User control: the translation continues until the user tells it to stop.
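
The three points above can be sketched as a small session loop. This is a toy model of the described behavior, not OpenAI's implementation: the InterpreterSession class, the stop phrases, and the fake_translate stub are all hypothetical names introduced for illustration, and a real system would work on audio rather than text.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: (text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]

# Illustrative stop commands only; the actual trigger phrases are an assumption.
STOP_PHRASES = {"stop translating", "stop"}


@dataclass
class InterpreterSession:
    """Toy model of continuous two-way interpretation between a language pair."""
    lang_a: str
    lang_b: str
    translate: Translator
    active: bool = True
    transcript: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, text: str, spoken_lang: str) -> Optional[str]:
        # Once stopped, the session no longer translates anything.
        if not self.active:
            return None
        # User control: a stop command ends the session.
        if text.strip().lower() in STOP_PHRASES:
            self.active = False
            return None
        # Continuous interpretation: route each utterance to the other language.
        if spoken_lang == self.lang_a:
            target = self.lang_b
        elif spoken_lang == self.lang_b:
            target = self.lang_a
        else:
            raise ValueError(f"unexpected language: {spoken_lang}")
        result = self.translate(text, spoken_lang, target)
        self.transcript.append((target, result))
        return result


def fake_translate(text: str, source: str, target: str) -> str:
    # Stand-in for a real translation backend; it only tags the direction.
    return f"[{source}->{target}] {text}"


session = InterpreterSession("en", "es", fake_translate)
print(session.handle("A table for two, please.", "en"))  # [en->es] A table for two, please.
print(session.handle("Claro, por aquí.", "es"))          # [es->en] Claro, por aquí.
session.handle("stop translating", "en")                 # ends the session
print(session.handle("Thanks!", "en"))                   # None
```

The point of the sketch is the state it keeps: the language pair is set once, every utterance is routed to the other side without a new request, and only an explicit stop command ends the loop.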

Who can use the updated voice tools

The voice improvements are available to paying users across all platforms. Users reach them through the voice icon in the chat interface.

That platform-wide availability is important because voice and translation features are often most useful away from a desk. A user may need them during a conversation, while ordering food, or while moving between work contexts where typing is not the most practical input method.

Because the update is limited to subscribers, the improved ChatGPT voice experience is a paid feature rather than a general free release.

Known limitations remain

OpenAI also notes that the system is not perfect. Users may still hear occasional drops in audio quality. These can include unexpected changes in pitch or volume, and the issue may be more noticeable with certain voices.

Another limitation is stranger: hallucinations can still occur in audio. In this context, ChatGPT may produce odd sounds without being prompted, including snippets that resemble ads, random noises, or even background music.

One recent case involved a user reporting that ChatGPT suddenly played an advertisement in the middle of a conversation, even though OpenAI doesn't actually serve ads. That example underlines the difference between sounding more natural and being fully predictable. A voice system can become more expressive while still producing outputs that surprise users in the wrong way.

For anyone relying on ChatGPT voice in a real conversation, the practical lesson is simple: the feature is improving, but it still needs attention from the user. Live translation and natural speech can help a conversation move, but users should be prepared for audio quality shifts or unusual sounds.

Where Advanced Voice Mode fits

OpenAI first introduced Advanced Voice Mode in May 2024 with a gradual rollout. It later expanded availability to the EU in October 2024.

The broader aim is to make interaction with ChatGPT feel more immediate and conversational. Voice, interruption, emotion, and real-time translation all move the product beyond a text chat box and toward an assistant that can participate in spoken exchanges.

There is also a visual dimension when users turn on their cameras. In that mode, ChatGPT can comment live on objects or surroundings, adding another layer to real-time interaction.

OpenAI is not alone in this direction. Google offers similar features in its Gemini app. The competitive signal is that major AI products are treating voice, translation, and live context as central parts of the user experience, not side features.