OpenAI is adding even more features to its ChatGPT chatbot. Today, the company announced it has started rolling out new voice features on its mobile apps, along with ways to upload images that can be analyzed by ChatGPT.
In a blog post, OpenAI announced that ChatGPT users will soon be able to speak to the chatbot. Once the feature is available in the iOS and Android apps, users can go to the Settings menu, tap New Features, and opt into voice conversations. They can then tap the headphone icon and choose from one of five voice options.
OpenAI explained how the feature works: “The new voice capability is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech. We collaborated with professional voice actors to create each of the voices. We also use Whisper, our open-source speech recognition system, to transcribe your spoken words into text.”
Users of the mobile ChatGPT apps will also soon be able to tap the photo button to either take a picture or choose an existing one. ChatGPT can then analyze the photo and perform a number of different tasks, such as interpreting a graph for work or troubleshooting a device that doesn’t work.
According to OpenAI, “Image understanding is powered by multimodal GPT-3.5 and GPT-4. These models apply their language reasoning skills to a wide range of images, such as photographs, screenshots, and documents containing both text and images.”
The new features are rolling out over the next couple of weeks and will be made available first for ChatGPT Plus and Enterprise users. Those features will be expanded to developers and other ChatGPT users in the near future.
Last week, OpenAI announced DALL-E 3, the next version of its AI image generator, which will integrate with ChatGPT. It will officially launch in October.