Google introduced Gemini, its largest and “most capable” AI model

As pressure grows on Google to explain how it plans to monetize artificial intelligence, the company is launching on Wednesday what it believes to be its largest and most capable model.

Three sizes will be available for the large language model Gemini: Gemini Ultra, which is the largest and most capable category; Gemini Pro, which scales across a variety of tasks; and Gemini Nano, which is reserved for specialized tasks and mobile devices.

For the time being, the company intends to license Gemini to customers through Google Cloud so they can use it in their own applications. Starting December 13, developers and enterprise customers can access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI. Android developers will also be able to build with Gemini Nano. Additionally, Gemini will power Google products such as its Bard chatbot and Search Generative Experience (SGE), which attempts to answer search queries with conversational-style text (though SGE is not yet generally available).

Businesses and corporations could use it to identify trends for product advertising, as well as for more sophisticated customer service engagement through chatbots and product recommendations. Gemini can also be used for productivity apps that need to produce code for developers or summarize meetings, or for content creation in the case of a business wanting to write blog posts or marketing campaigns.

The company provided examples, such as how Gemini could take a picture of a chart, analyze hundreds of pages of research, and then update the chart. Another example was analyzing a photo of someone’s math homework and pointing out which answers were right and which were wrong.

The company announced in a blog post on Wednesday that Gemini Ultra is the first model to surpass human experts on MMLU (massive multitask language understanding), which tests both problem-solving and general knowledge across 57 subjects including math, physics, history, law, medicine, and ethics. It is said to be able to comprehend reasoning and subtleties in complicated subjects.

“Teams from all across Google, including our colleagues at Google Research, have worked together to create Gemini,” CEO Sundar Pichai wrote in a blog post published on Wednesday. “Its multimodal architecture allows it to comprehend, operate on, and blend various forms of information, such as text, code, audio, images, and video, and to generalize across them.”

Starting today, Google’s chatbot Bard will use Gemini Pro to assist with advanced reasoning, planning, understanding, and other skills. On a call with reporters on Tuesday, executives said that “Bard Advanced,” which will use Gemini Ultra, will launch early next year. It is the most significant update yet to Bard, Google’s ChatGPT-like chatbot.

Eight months have passed since the search giant debuted Bard, and a year has passed since OpenAI introduced ChatGPT, built on GPT-3.5. The startup, headed by Sam Altman, introduced GPT-4 in March of this year. On Tuesday, executives said that Gemini Pro performed better than GPT-3.5, but they avoided discussing how it compared to GPT-4.

However, a white paper Google published on Wednesday claims that in a few benchmarks, Gemini’s Ultra model performed better than GPT-4.

When asked whether there are any plans to charge for access to “Bard Advanced,” Sissie Hsiao, Google’s general manager for Bard, said that the company is focused on providing a positive user experience and is unsure how the product will be monetized.

Eli Collins, vice president of product at Google DeepMind, said during a press briefing that he “suspects” Gemini has novel capabilities compared with current-generation LLMs, but that research to determine those capabilities is still ongoing.

Google reportedly delayed the release of Gemini because it was not ready, which brought back memories of the company’s problematic rollout of its artificial intelligence tools at the start of the year.

When several reporters asked about the delay, Collins responded that testing the more sophisticated models takes longer. He said that Gemini is the most thoroughly tested AI model the company has ever created and has “the most comprehensive safety evaluations” of any Google model.

Collins said that Gemini Ultra is substantially less expensive to serve even though it is the company’s largest model. “It’s more efficient, not just more capable,” he said. “While training Gemini still takes a lot of processing power, we’re making great progress in training these models.”

Collins said that the parameter count will not be released, but the company will publish a technical white paper on Wednesday that includes more model details. As CNBC discovered earlier this year, Google’s PaLM 2 large language model, which was the company’s most recent AI model at the time, used almost five times as much text data for training as its predecessor LLM.

Google also unveiled its next-generation tensor processing unit for AI model training on Wednesday. According to Google, the TPU v5p chip, which Salesforce and the startup Lightricks have started using, offers better performance for the money than the TPU v4, which was announced in 2021. However, the company did not disclose performance data in comparison to Nvidia, the market leader.

The chip announcement comes a few weeks after cloud rivals Microsoft and Amazon showed off custom silicon aimed at AI.

Investors pressed Google executives further during the company’s third-quarter earnings conference call in October, asking more questions about how AI will be used to generate real profit.

Search is still a significant source of revenue for Google, which is why in August it introduced Search Generative Experience, or SGE, as an “early experiment” to show users what a generative AI experience might look like when using the search engine. The results are more conversational, in line with the chatbot era. SGE is still in the experimental stage and has not yet been made available to the general public.

Since May, when the company first revealed the experiment at its annual developer conference, Google I/O, investors have been requesting a timeline for SGE. SGE was hardly mentioned in the Gemini announcement on Wednesday, and executives only stated that Gemini would be integrated into it “in the next year” when asked about plans to launch to the public.

“This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company,” Pichai wrote in his blog post on Wednesday. “I genuinely can’t wait to see what lies ahead and the opportunities that Gemini will bring to people everywhere.”
