ChatGPT has, so far, been known as the magical bot that can help you get answers to text-based queries. Even though it can make mistakes pretty often, it can also be helpful in some cases. For the most part, however, you could only interact with it via text. You typed in a query, and it would spell out an answer. Now, OpenAI is announcing a few new ways to interact with it.

OpenAI is adding a few new voice and image features to ChatGPT, offering you more intuitive ways to interact with the AI. These features not only let you communicate with ChatGPT in more ways, but they also let you integrate it into more parts of your daily life.

For one, you can now engage in voice conversations with ChatGPT, allowing for back-and-forth discussions on various topics. Voice capabilities are available on iOS and Android through opt-in settings, with five different voice options made by different voice actors. Whisper, OpenAI's open-source speech recognition system, transcribes spoken words into text for seamless communication.

What about images? You can also discuss images with ChatGPT, making it possible to troubleshoot problems, plan meals, or analyze complex data graphs by showing one or more images. The image understanding feature leverages multimodal GPT-3.5 and GPT-4 models, enabling reasoning with a wide range of images, including photographs, screenshots, and documents containing text and images. Basically, just throw an image at ChatGPT, and it'll do its best to understand it and help you out with your query, much like the features being tested in Bing Chat. It likely won't be perfect, especially at first, but it should improve over time.

The company also stresses that it's deploying these features responsibly to ensure safety and mitigate potential risks. Voice chat, for instance, was developed in collaboration with voice actors to prevent misuse, and OpenAI is working with partners like Spotify on applications such as Voice Translation for podcasts. As for image understanding, OpenAI has tested its models with red teamers and alpha testers to ensure responsible usage. The company is also working with organizations like Be My Eyes to understand the limitations and benefits of vision capabilities, particularly for the visually impaired.

OpenAI is rolling out these capabilities gradually to Plus and Enterprise users before expanding access to other groups. Software developers creating their own GPT-powered apps will also get to use these features at some point.

Source: OpenAI