
Google debuts Cloud Data Fusion, connected sheets in BigQuery, and Data Catalog

Google data center in Douglas County, Georgia.
Image Credit: Google


Alongside the database improvements announced this morning at Google’s annual Cloud Next conference, the Mountain View company unveiled a slew of new capabilities heading to its data analytics portfolio.

The first is Cloud Data Fusion, a fully managed and cloud-native data integration service that’s available starting this week in beta. Google’s pitching it as a way to ingest, integrate, and manipulate data using a library of open source transformations and over a hundred connectors. Pipelines are built mainly through a drag-and-drop interface where data sets and pipelines are represented visually, without code.

Google also introduced Data Catalog in beta, a fully managed and scalable metadata management service with a search interface for data discovery, underpinned by the same search technology that supports Gmail and Drive. It boasts a cataloging system for capturing technical and business metadata, and it integrates with Cloud DLP and Cloud IAM for privileged access and control.

On the BigQuery side of the equation, Google says it’s built a data warehouse migration service to automate data and schema migration to BigQuery from Teradata and Amazon Redshift, as well as data loading from Amazon S3. It also took the wraps off BigQuery BI Engine, a speedy in-memory analysis service designed to handle complex data sets with “sub-second” query response time and high concurrency.




Also new: connected sheets, a type of Google Sheets spreadsheet that works with full BigQuery datasets of up to 10 billion rows. (BigQuery has been able to take in data from Google Sheets since 2016, but only up to a point.) Analyses in connected sheets are performed with formulas, pivot tables, and charts as opposed to SQL, and can be visualized as dashboards and shared with anyone within an organization.

BigQuery BI Engine is available in beta starting today through Google Data Studio for interactive reporting and dashboarding, and Google says that in the coming months, Looker and Tableau will be able to leverage it as well. Connected sheets will arrive a bit later.

In other BigQuery news, BigQuery ML, which facilitates the deployment of AI models on data sets inside BigQuery, is gaining new models like k-means clustering (in beta) and matrix factorization (in alpha), and it’s now possible to build and directly import TensorFlow Deep Neural Network models (in alpha). Moreover, Google said that its BigQuery Data Transfer Service, which automates data movement from software-as-a-service (SaaS) apps to Google BigQuery on a scheduled basis, now supports more than 100 apps, including Salesforce, Marketo, Workday, and Stripe.

“From Fortune 500 enterprises to start-ups, more and more businesses continue to look to the cloud to help them store, manage, and generate insights from their data,” said Google Cloud director of product management Sudhir Hasbe. “And we’ll continue to develop new, transformative tools to help them do just that.”