TechCrunch AI September 24, 2024 NEUTRAL

Advanced Voice Mode reaches more paid ChatGPT users

OpenAI is expanding Advanced Voice Mode to more paying ChatGPT customers, starting with the Plus and Team tiers. The rollout adds five voices, a new blue animated sphere, Custom Instructions, Memory, and claimed improvements to speed, smoothness, and accent recognition.

This is mostly a routine feature rollout, with only a mild dependence-on-AI angle from more natural voice interaction.

OpenAI is widening access to Advanced Voice Mode, the ChatGPT audio feature designed to make spoken conversations with the chatbot feel more natural. The rollout starts with paying customers in ChatGPT’s Plus and Team tiers, while Enterprise and Edu customers are expected to begin receiving access next week.

The update is not only about broader availability. OpenAI is also changing how the feature looks, adding more voice options, and bringing some of ChatGPT’s personalization tools into the voice experience.

Who gets Advanced Voice Mode now

Advanced Voice Mode, or AVM, is being made available to a larger group of paying ChatGPT customers. OpenAI said the feature is rolling out first to Plus and Team users, with access arriving through the ChatGPT app.

Users will know they have access when a pop-up appears in the app near the voice icon. OpenAI also said Advanced Voice is rolling out to all Plus and Team users in the ChatGPT app over the course of the week.

Enterprise and Edu customers are not included in the first wave. According to the source, those customers will start receiving access next week.

The rollout is also limited by region. An OpenAI spokesperson said AVM is not yet available in several regions, including the EU, the U.K., Switzerland, Iceland, Norway, and Liechtenstein.

A new look and five more voices

Advanced Voice Mode now has a different visual identity inside ChatGPT. Instead of the animated black dots OpenAI showed during its May presentation, the feature is represented by a blue animated sphere.

The voice lineup is expanding as well. ChatGPT users can now try five new voices: Arbor, Maple, Sol, Spruce, and Vale. Together with Breeze, Juniper, Cove, and Ember, that brings ChatGPT’s total number of voices to nine.

The naming pattern is noticeable because every voice name is drawn from nature. That fits the broader purpose of AVM: making voice interaction with ChatGPT feel less mechanical and more conversational.

One voice is still absent from the list. Sky, the voice OpenAI showed during its spring update, is not part of this rollout. Sky became controversial after Scarlett Johansson, who played an AI system in the feature film "Her," claimed the voice sounded a little too similar to her own.

OpenAI removed Sky and said it had never intended the voice to resemble Johansson’s voice. The source also notes that several staff members had referenced the movie in tweets at the time.

What has improved since the alpha test

OpenAI says it has improved Advanced Voice Mode since the limited alpha test. The company claims ChatGPT’s voice feature is now better at understanding accents, and that conversations are smoother and faster.

Those improvements matter because voice assistants are judged less by isolated answers and more by the flow of the exchange. If a spoken system hesitates, misunderstands accents, or interrupts the rhythm of conversation, the experience can quickly feel less natural.

The source notes that earlier tests with AVM found that glitches were not uncommon. OpenAI says the feature has since improved, though the rollout will put it in front of a much larger group of paying users.

OpenAI also highlighted a multilingual example, saying AVM can say "Sorry I’m late" in over 50 languages. The source does not describe the full language coverage beyond that example.

Personalization comes to voice

Advanced Voice Mode is also gaining access to two ChatGPT customization features: Custom Instructions and Memory.

Custom Instructions allow users to personalize how ChatGPT responds. In a voice setting, that can shape the way spoken answers are framed, though the source does not provide examples of specific instruction settings for AVM.

Memory lets ChatGPT remember conversations for later reference. Bringing Memory into AVM means the spoken version of ChatGPT can connect with the same broader personalization approach used elsewhere in the product.

Together, Custom Instructions and Memory make AVM less like a standalone voice layer and more like another interface for the existing ChatGPT experience. The key change is that personalization is moving into the spoken interaction, not staying only in text chat.

What is still missing

The rollout does not include every capability OpenAI showed in its spring update. Video and screen sharing are still absent.

Those features were shown as part of a broader multimodal experience in which GPT-4o could process visual and audio information at the same time. During the demo, an OpenAI staff member asked ChatGPT real-time questions about math on a piece of paper and code on a computer screen.

At this time, OpenAI has not provided a timeline for when those multimodal capabilities will launch. That means the current AVM rollout is focused on voice conversation, not live visual understanding through video or screen sharing.

For paying ChatGPT users who receive access, the practical change is clear: Advanced Voice Mode is becoming more widely available and more customizable, with a wider selection of voices. But the most ambitious visual features shown earlier remain outside this release.