Google's Gemini Is the Real Start of the Generative AI Boom

The history of artificial intelligence is marked by periods of so-called "AI winters," when the technology stalled and funding seemed to dry up. Each came with claims that building truly intelligent machines was simply beyond human ingenuity.

The unveiling of Google's Gemini, billed as a fundamentally new kind of artificial intelligence and the company's most powerful model yet, suggests that another AI winter is nowhere in sight. In fact, while the 12 months since the launch of ChatGPT have been a banner year for AI, there is good reason to believe that the current AI boom has only just begun.

OpenAI didn't have high hopes when it launched a "modest research preview" called ChatGPT in November 2022. The company was simply testing a new interface for its large language models (LLMs). But the chatbot's ability to handle such a wide range of tasks, from composing essays and poetry to solving coding problems, impressed and alarmed many, and sent the tech industry scrambling. When OpenAI added its newer LLM, GPT-4, to ChatGPT, some experts were so unsettled that they urged the company to slow down.

There is little evidence that anyone heeded the alarm. And now Google, by announcing Gemini, may have changed the rules of the game.

Google had already responded directly to ChatGPT earlier this year with Bard, a chatbot built on LLM technology the company had developed before OpenAI did but had chosen to keep private. With Gemini, Google claims to usher in a new era that goes beyond LLMs, which are largely tied to text, potentially laying the groundwork for a new wave of AI products significantly different from those built on ChatGPT.

Google calls Gemini a "natively multimodal" model, meaning it can learn from data beyond text, extracting information from audio, video, and images as well. ChatGPT showed how much AI models can learn about the world when given enough text. And some AI researchers argue that simply scaling up language models will bring them closer to matching human abilities.

But you can only learn so much about physical reality through the filter of what people write about it, and LLMs like GPT-4 have flaws such as hallucinating information, poor reasoning, and their own strange blind spots. There also appear to be limits to how far existing techniques can scale.

Ahead of yesterday's Gemini announcement, WIRED spoke with Demis Hassabis, who led the development of Gemini and whose previous credits include leading the team that created AlphaGo, the superhuman Go-playing program. He is predictably enthusiastic about Gemini, saying it introduces new capabilities that will ultimately differentiate Google's products. But Hassabis also says that LLMs need to be combined with other AI techniques to create systems that understand the world in ways today's chatbots cannot.

Hassabis is locked in fierce competition with OpenAI, but the rivals seem to agree that radically new approaches are needed. A mysterious OpenAI project known as Q* suggests that the company, too, is exploring ideas that go beyond simply scaling up GPT-4.

This echoes comments OpenAI CEO Sam Altman made at MIT in April, when he said that despite ChatGPT's success, further major progress in AI would require new ideas rather than ever-larger models. “I think we're at the end of an era where there will be giant, giant models,” Altman said. "We're going to make them better in other ways."

Google may have demonstrated an approach that goes beyond what ChatGPT offers. But the bigger message of the Gemini presentation is that Google is aiming at something more ambitious than today's chatbots, and so, it appears, is OpenAI.
