OpenAI GPT-4 Arriving this Month — and It’s Multimodal

OpenAI GPT-4, the newest iteration of OpenAI's Large Language Model, is arriving this month. And it’s multimodal, presenting yet another challenge to one of its most publicized rivals, Google’s Bard.
Two men sit on a black couch in a tech lab, each with a laptop, facing and talking to each other amid wired equipment and OpenAI branding behind them.

Key Details:

  • OpenAI’s GPT-4, the newest iteration of its large language model (the tech behind ChatGPT), is arriving this month. It’s also multimodal, presenting yet another challenge to one of its most publicized rivals, Google’s Bard.
  • Multimodal AI can process multiple kinds of input, going beyond text to include speech, images, and video. GPT-3 and GPT-3.5, by contrast, are text-only.

The newest iteration of ChatGPT is just around the corner, expected to arrive this month. 

And, according to the latest news from Microsoft Germany, it will be multimodal, meaning Google has even further to go to keep up, especially after the very public mistakes made by its own AI chatbot, Bard.

After all, when users type a question into an actual search engine, relied upon by millions of people across the globe, they have a reasonable expectation that the results will be accurate. 

Still, don’t discount Google yet. After all, it’s synonymous with “online search,” and that likely won’t change anytime soon. 

It’s also worth noting that multimodal functionality doesn’t guarantee accuracy. We’ve already mentioned some of ChatGPT’s limitations, especially for those who expect it to do all their work for them (which we do not recommend). 

That said, the news about GPT-4 is exciting. After all, GPT-3 has already changed how real estate professionals work—and we can’t wait to dive into the next version.

Microsoft Germany confirms the (approximate) arrival date

Andreas Braun, CTO of Microsoft Germany, confirmed in a recent news release that GPT-4 is coming within a week of March 9, 2023—and that it will, in fact, be multimodal. 

Modality refers to the type of input an AI can process. Multimodal AI can process multiple kinds of input, going beyond the usual text to include images, video, and sound/speech.

According to the MS Germany news report, GPT-4 may be able to operate in at least those four modalities.

We will introduce GPT-4 next week, there we will have multimodal models that will offer completely different possibilities—for example, videos…

Andreas Braun
CTO of Microsoft Germany

For reference, GPT-3 and GPT-3.5 only operated in one modality—text. 

While Holger Kenn, Microsoft’s Director of Business Strategy, explained what multimodal generally means and what those modalities could include, the report didn’t say whether GPT-4, specifically, will be able to function within all four of the previously mentioned modalities.

But it seems likely that those four were explicitly mentioned for a reason. 

Kenn explained what multimodal AI is about and what it can potentially do, like translating text into images or even into music and/or video. 

Microsoft is also working on “confidence metrics” to ground their AI with facts to strengthen the accuracy of its responses. 

Microsoft Kosmos-1

Though underreported in the U.S., Microsoft released another multimodal AI language model, Kosmos-1, at the beginning of March 2023.

According to the German news site Heise.de: 

The team subjected the pre-trained model to various tests, with good results in classifying images, answering questions about image content, automated labeling of images, optical text recognition and speech generation tasks….Visual reasoning, i.e., drawing conclusions about images without using language as an intermediate step, seems to be a key here…

Holger Kenn
Microsoft Director Business Strategy

Kosmos-1 integrates the modalities of text and images. GPT-4 is expected to go even further, adding video and, by all appearances, sound.

GPT-4 works across multiple languages

Reporting confirms that GPT-4 can work across all languages. One example has it receiving a question in German (or English, etc.) and giving an answer in Italian. And while that sounds like an improbable request, it becomes more practical when the user has a question to ask and the source material is in a language they don’t speak. 

The point of this breakthrough is to transcend language barriers by pulling knowledge from sources across all languages. That would make GPT-4 similar to Google’s multimodal AI called MUM (Multitask Unified Model), which can provide answers in English when the data for it exists only in another language. 

GPT-4 applications

There’s no explicit announcement as to where GPT-4 will show up, but the report did mention Azure-OpenAI. 

Meanwhile, Google’s continuing struggle to integrate competing AI technology into its own search engine strengthens the perception that it’s falling behind. While Google already integrates AI in products like Google Lens and Google Maps, among others, GPT-4’s approach is to use AI as assistive technology to help users with small tasks.

Those little things have a way of piling up and taking time and energy away from tasks that can’t be delegated.

No wonder ChatGPT has already captivated real estate professionals and content creators across the globe. We’re already dreaming up ways to incorporate the different modalities, and can’t wait to test them out. And, as always, we’ll keep you updated. 

About the Author

Sarah Lentz started writing for BAM in late May of 2022 and quickly realized she was exactly where she wanted to be (and still is). Before BAM, she worked as a freelance writer. She lives in Minnesota with her four kids and, in her free time, is writing her next book.
