Anirban Ghoshal
Senior Writer

Google delivers Gemini LLM support to BigQuery data warehouse

news
Feb 29, 20242 mins
AnalyticsCloud ComputingData Warehousing

BigQuery’s update will make it easier to prepare data for AI with speech-to-text and document processing.

gemini graphic 2
Credit: Google

Google is integrating its Gemini 1.0 Pro large language model with its AI and machine learning platform, Vertex AI, to help enterprises unlock new capabilities of large language models (LLMs), including analysis of text, image and video.

The Gemini API, which has been made generally available, can also be used in Google’s data warehouse, BigQuery, to develop generative AI-based analytical applications.

“The Gemini 1.0 Pro model is designed for higher input-output scale and better result quality across a wide range of tasks like text summarization and sentiment analysis. You can now access it using simple SQL statements or BigQuery’s embedded DataFrame API from right inside the BigQuery console,” Gerrit Kazmaier, general manager of data analytics at Google Cloud, said in a statement.

The company is also expected to integrate the vision version of the Gemini Pro model in the coming months.

In addition, Google is extending Vertex AI’s document processing and speech-to-text APIs to BigQuery to help enterprises analyze unstructured data, such as documents and audio.

Earlier this month, the company announced the preview of BigQuery vector search, which when integrated with Vertex AI can enable vector similarity search on data inside BigQuery along with other features such as retrieval augmented generation (RAG), text clustering and summarization.

Hyoun Park, principal analyst at Amalgam Insights, sees RAG support as table stakes for data warehouse vendors these days.

“Retrieval augmented generation is a capability every data warehouse will need to support, as it refers to accessing data from a third party source when someone asks a question,” Park said. “For instance, if someone asks an HR question, the RAG would also ask the employee’s HR system for relevant and current data to contextualize the question. The relevant capability here is in accessing a real-time update of a specific table or data source when someone asks a question to an LLM.”

Other companies are moving in a similar direction. Steven Dickens, vice president and practice lead at The Futurum Group, said that warehouse stalwarts such as Teradata and Cloudera are also adding vector capabilities alongside players such as Oracle and Elastic.