ChatGPT-4: An AI That Can Understand Photos


OpenAI today announced GPT-4, its next-generation AI language model, which can read photos and explain what’s in them, according to a research blog post.

ChatGPT, which runs on GPT-3.5, has taken the world by storm, but the underlying deep learning language model has only accepted text inputs until now. GPT-4 aims to change that by also accepting images as prompts.

OpenAI (@OpenAI), March 14, 2023: “Announcing GPT-4, a large multimodal model, with our best-ever results on capabilities and alignment: https://t.co/TwLFssyALF”

GPT-4 Improvements Over GPT-3

“It generates text outputs given inputs consisting of interspersed text and images,” OpenAI wrote today. “Over a range of domains — including documents with text and photographs, diagrams, or screenshots — GPT-4 exhibits similar capabilities as it does on text-only inputs.”

Simply put, GPT-4 will be able to analyze what is in an image. For example, it can tell the user what is unusual about a photo of a man ironing clothes on an ironing board attached to a taxi.

Last week, Microsoft Germany Chief Technical Officer Andreas Braun said that GPT-4 will “offer completely different possibilities — for example, videos.”

However, today’s announcement makes no mention of video in GPT-4; the only multimodal element is image input.

Microsoft had already presented Kosmos-1, a multimodal language model that works across different input formats.

In the Kosmos-1 presentation, the AI reads images along with text. For example, a picture of a clock showing 10:10 is given to the AI with the question “The time now?”, to which the AI replies, “10:10 on a large clock.”

Kosmos-1 can also tell the viewer what type of hairstyle a woman has, or recognize a movie poster and tell the user when that movie will be released.

ChatGPT-4 Availability

OpenAI says it’s already partnered with a number of companies to integrate GPT-4 into their products, including Duolingo, Stripe, and Khan Academy.

The new model is also available to the general public via ChatGPT Plus, OpenAI’s $20 monthly subscription, and is powering Microsoft’s Bing chatbot.

The new model will also be available as an API for developers to build on, but they will have to join a waitlist, which OpenAI says will start admitting users today.

Security Concerns

During the announcement, OpenAI did stress that the new AI model had gone through six months of safety training, and that in internal tests, it was “82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5.”

However, that doesn’t mean that GPT-4 is perfect. As noted above, GPT-4 has been running Microsoft’s Bing chatbot, and many users have been able to break it in all sorts of creative ways, getting the bot to offer dangerous advice, threaten users, and make up information.

ChatGPT has become wildly popular: it is the fastest consumer app in history to reach 100 million users.

Muchiri