In last week’s demo, Raul Puri, a scientist who works on GPT-4, gave me a quick tour of the image recognition feature. He uploaded a photo of a kid’s math homework, circled a Sudoku-like puzzle on the screen and asked ChatGPT how you were meant to solve it. ChatGPT replied with the correct steps.
Puri says that he has also used the feature to help him fix his fiancée’s computer by uploading screenshots of error messages and asking ChatGPT what he should do. “This was a very painful experience that it helped me get through,” he says.
ChatGPT’s image recognition ability has already been trialled by a company called Be My Eyes, which makes an app for people with impaired vision. Users of this app can upload a photo of what’s in front of them and ask human volunteers to tell them what it is. In a partnership with OpenAI, Be My Eyes now gives its users the option of asking a chatbot instead.
“Sometimes my kitchen is a little messy or it’s just very early Monday morning and I don’t want to talk to a human being,” Be My Eyes founder Hans Jørgen Wiberg, who uses the app himself, told me when I interviewed him at EmTech Digital in May. “Now you can ask the photo questions.”
OpenAI is aware of the risks of releasing these updates to the public. Combining models brings whole new levels of complexity, says Puri. He says his team has spent months brainstorming possible misuses. You cannot ask ChatGPT questions about photos of private individuals, for example.
Jang gives another example: “Right now if you ask ChatGPT to make a bomb it will refuse,” she says. “But instead of saying, ‘Hey, tell me how to make a bomb,’ what if you showed it an image of a bomb and said, ‘Can you tell me how to make this?’”
“You have all the problems with computer vision, you have all the problems of large language models, voice fraud is a big problem,” says Puri. “You have to consider not just our users, but also the people that aren’t using the product.”
But OpenAI claims that it has addressed the worst problems and is confident that ChatGPT’s updates are safe enough to release. “It’s been a remarkably good learning experience getting all these sharp edges sorted out,” says Puri.