Do you need a different set of eyes for a second opinion? Instead of asking your friend or coworker, you can now turn to ChatGPT.
OpenAI, the Microsoft-backed artificial intelligence firm responsible for the revolutionary chatbot, has now endowed it with the sense of sight. Granted, this is a natural next step, since OpenAI is also the creator of the text-to-image generator DALL-E.
The update comes with the arrival of the broader GPT-4 release, which is "multimodal" (meaning it can recognize more than just text prompts) and accepts images as input. The previous models ChatGPT was based on, GPT-3 and GPT-3.5, were limited to text inputs, making this rollout a rather significant jump.
OpenAI explains that GPT-4 can analyze screenshots, diagrams, documents with both text and photographs, and more—and then explain them in natural language.
“While less capable than humans in many real-world scenarios, [GPT-4] exhibits human-level performance on various professional and academic benchmarks,” OpenAI notes.
For instance, you could upload a graph and have the bot scrutinize it for important data. Or, say, scan in a page from an instruction manual and get it to summarize it in simpler terms.
Here’s a more creative example shared by OpenAI, in which a user gets ChatGPT to predict what comes next in an abstract scenario:
In the buildup to GPT-4’s reveal, media outlets had previously reported that the AI model might be able to generate visual output like videos. At the AI in Focus – Digital Kickoff event, Microsoft Germany CTO Andreas Braun had announced that GPT-4 would “offer completely different possibilities—for example, videos.”
However, it seems that this version of the bot handles images only as input; it does not generate visual output.
With that being said, OpenAI has hinted that GPT-4 is far more advanced than is being advertised, though the organization is keeping mum about the bot’s other talents. Chief scientist Ilya Sutskever tells MIT Technology Review that “we can’t really comment on” GPT-4’s full skillset as it’s “pretty competitive out there.”
What the team is willing to let on is that GPT-4 is “more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5,” it writes in a blog post.
Like its predecessor, GPT-4 still "hallucinates" facts, but it does so less often (about 40% less, according to its creators). To curb this tendency to confidently make things up, OpenAI worked with 50 experts in fields including cybersecurity, international security, and trust and safety to test the model.
GPT-4 has rolled out to ChatGPT and the API. Unfortunately, the new and improved version isn't available to everyone: you'll need a ChatGPT Plus subscription to try it, and there's a cap on how many times you can use it.
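For readers curious what an image-plus-text request might look like in practice, here is a minimal sketch of how such a request body could be assembled. The schema shown (a `content` list mixing `text` and `image_url` parts) and the model name are assumptions drawn from OpenAI's later public documentation; image input was not yet generally available through the API at the time of GPT-4's launch.

```python
import json

def build_image_request(prompt: str, image_url: str, model: str = "gpt-4") -> str:
    """Build a JSON request body pairing a text prompt with an image URL.

    The message schema here is an assumption based on OpenAI's published
    chat-completions format for multimodal input; check the current API
    reference before relying on it.
    """
    body = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
    return json.dumps(body)

# Example: ask the model to analyze a graph, as described above.
payload = build_image_request(
    "Summarize the key data points in this graph.",
    "https://example.com/sales-graph.png",
)
print(payload)
```

The payload would then be POSTed to the chat completions endpoint with an API key; no network call is made here, since this only illustrates the request's shape.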