OpenAI has today announced GPT-4, the next-generation AI language model that can read photos and explain what’s in them, according to a research blog post.
GPT-3 and GPT-3.5 have taken the world by storm through ChatGPT, but those deep learning language models have only accepted text inputs until now. GPT-4 changes that: it will also accept images as prompts.
GPT-4 Improvements Over GPT-3
“It generates text outputs given inputs consisting of interspersed text and images,” OpenAI writes today. “Over a range of domains — including documents with text and photographs, diagrams, or screenshots — GPT-4 exhibits similar capabilities as it does on text-only inputs.”
Simply put, GPT-4 will be able to analyze what is in an image. For example, it can tell the user what is unusual about the photo below of a man ironing his clothes while attached to a moving taxi.
Last week, Microsoft Germany Chief Technical Officer Andreas Braun said that GPT-4 will “offer completely different possibilities — for example, videos.”
However, today’s announcement makes no mention of video in GPT-4; the only multimodal element is image input.
Microsoft had already presented a multi-modal language model that operates in different formats called Kosmos-1.
In the Kosmos-1 presentation, the AI answers questions about images. For example, a picture of a clock showing 10:10 is inputted into the AI with the question “The time now?” To which the AI replies, “10:10 on a large clock.”
Kosmos-1 can also tell the viewer what type of hairstyle a woman is wearing, or recognize a movie poster and tell the user when that movie will be released.
OpenAI says it’s already partnered with a number of companies to integrate GPT-4 into their products, including Duolingo, Stripe, and Khan Academy.
The new model is also available to the general public via ChatGPT Plus, OpenAI’s $20 monthly subscription, and is powering Microsoft’s Bing chatbot.
The new AI model will also be available as an API for developers to build on, but they’ll have to join the waitlist here, which OpenAI says will start admitting users today.
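For developers coming off the waitlist, a request to the model looks much like a GPT-3.5 call with the model name swapped. The sketch below is a minimal illustration that builds the JSON body for OpenAI’s chat completions endpoint; it assumes the standard request shape and omits the network call and API key handling.

```python
import json

def build_gpt4_request(user_prompt: str) -> str:
    """Return a JSON body for a GPT-4 chat completion request.

    This is an illustrative sketch, not official OpenAI client code.
    The body would be POSTed to https://api.openai.com/v1/chat/completions
    with an "Authorization: Bearer <API key>" header.
    """
    payload = {
        "model": "gpt-4",  # access requires the developer waitlist
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
    }
    return json.dumps(payload)

print(build_gpt4_request("What is unusual about this image?"))
```

In practice developers would use OpenAI’s official client library rather than hand-building requests, but the payload structure is the same.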
During the announcement, OpenAI stressed that the new AI model had gone through six months of safety training, and that in internal tests it was “82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5.”
However, that doesn’t mean GPT-4 is perfect. As stated above, GPT-4 has been running Microsoft’s Bing chatbot, and many users have been able to break it in all sorts of creative ways, getting the bot to offer dangerous advice, threaten users, and make up information.
ChatGPT has become wildly popular; it was the fastest-growing consumer app in history to reach 100 million users.