Google Simplifies SQL
Google BigQuery is making it possible for data analysts to deploy SQL machine learning (ML) models using only simple SQL statements, potentially reducing the ML skills gap.
Google has added machine learning (ML) capabilities to Google BigQuery, the company’s petabyte (PB)-scale cloud database offering. Dubbed BigQuery ML, the new version lets you use simple Structured Query Language (SQL) statements to build and deploy ML models for predictive analytics.
That’s not just good news for data scientists who use Google. It’s also good for business operators interested in advancing their data analytics capabilities because it adds one more effective competitor to a rather small list of vendors capable of delivering this level of sophistication via the cloud. The other two most well-known names are Amazon’s Relational Database Service and Microsoft’s Azure SQL, and you can find more in our recent cloud database service roundup.
The bane of all data product vendors and buyers has always been the skills gap. That’s been especially true for those interested in ML and predictive analytics since these disciplines often require knowledge of new technologies and querying languages.
“For every one data scientist, there are hundreds of analysts working with data, and most using SQL,” Sudhir Hasbe, Director of Product Management at Google Cloud, told PCMag. Something had to give if the power of an army of data analysts was to be uncorked from the bottleneck created by too few and too overworked data scientists.
Google’s answer to this dilemma is nothing short of remarkable. While ML is a hot trend and showing up in products of all kinds everywhere, it’s still firmly data scientist territory. Plenty of vendors have made headway into simplifying the technology, but the ugly truth is, you can simplify it by a lot and it’s still too difficult for more than 99 percent of the human population to use. Yet, we need to be able to use it because ML can do more, and do it faster than a group of super-smart humans can.
Google is planting ML inside Google BigQuery so that it resides closer to the data. That should deliver ML results faster than traditional ML workflows, in part because the analytics can be performed at the source. Now in beta, BigQuery ML enables analysts (and data scientists) to run predictive analytics, such as forecasting sales and creating customer segments, right on top of the data where it is stored. That alone is a respectable and notable upgrade.
However, Google went further than that by adding a capability that enables data analysts to use simple SQL statements to build and deploy ML models. Right now, the options are linear regression and logistic regression models for predictive analysis as those are the two models most commonly used.
Google plans to add more ML options to this capability over time, according to Hasbe. “We need to hear from our customers on which models they want us to add so that we’re providing the most useful ones first,” he said.
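To make the idea concrete, here is a minimal sketch of what such statements might look like. It only renders SQL text in Python and does not connect to BigQuery. The dataset, table, and column names are hypothetical, and the statement shape (CREATE MODEL with a model_type option, then ML.PREDICT) follows BigQuery ML's documented syntax as best understood here, so check it against the current documentation before relying on it.

```python
# Sketch only: builds BigQuery ML statements as strings; nothing is sent to BigQuery.
# All dataset, table, and column names used below are hypothetical.
from typing import Sequence

# The two model types the article says are available in the beta.
SUPPORTED_MODEL_TYPES = ("linear_reg", "logistic_reg")


def create_model_sql(model: str, model_type: str, label: str,
                     source_table: str, features: Sequence[str]) -> str:
    """Render a CREATE MODEL statement that trains on `features` to predict `label`."""
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"unsupported model type: {model_type}")
    columns = ", ".join(list(features) + [label])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}']) AS\n"
        f"SELECT {columns}\n"
        f"FROM `{source_table}`"
    )


def predict_sql(model: str, source_table: str, features: Sequence[str]) -> str:
    """Render a query that scores new rows with a previously trained model."""
    columns = ", ".join(features)
    return (
        "SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`, "
        f"(SELECT {columns} FROM `{source_table}`))"
    )


if __name__ == "__main__":
    # Hypothetical sales-forecasting example.
    print(create_model_sql("shop.sales_model", "linear_reg", "weekly_sales",
                           "shop.sales_history", ["store_id", "week_of_year"]))
    print(predict_sql("shop.sales_model", "shop.upcoming_weeks",
                      ["store_id", "week_of_year"]))
```

The point of the sketch is that training and prediction both read as ordinary queries, which is why an analyst who already knows SQL would not need a separate ML toolchain.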
Additional Google BigQuery Upgrades
Topping the substantial list of upgrades after ML are a clustering capability, BigQuery Geographic Information Systems (BigQuery GIS), and a new Google Sheets data connector.
Clustering is also in beta and enables the creation of clustered tables in a data optimization move that bunches rows with similar cluster keys together. This reduces costs since it improves performance and enables Google BigQuery to charge the user only for the data scanned rather than the entire table or partition.
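Why bunching similar rows together reduces the data scanned can be shown with a toy model. The sketch below is not BigQuery's storage engine. It assumes, purely for illustration, that a table is stored in fixed-size blocks, that each block records the smallest and largest cluster key it holds, and that a filtered query must read every block whose key range could contain the requested key. The customer keys and block size are made up.

```python
# Toy model of block pruning on a clustered table; not BigQuery's actual implementation.
from typing import List, Sequence, Tuple

Row = Tuple[str, int]  # (cluster_key, value); both fields are hypothetical


def build_blocks(rows: Sequence[Row], block_size: int, clustered: bool) -> List[List[Row]]:
    """Split rows into fixed-size blocks, sorting by cluster key first if clustered."""
    ordered = sorted(rows, key=lambda r: r[0]) if clustered else list(rows)
    return [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]


def rows_scanned(blocks: Sequence[Sequence[Row]], key: str) -> int:
    """Count rows in every block whose min/max key range could contain `key`."""
    scanned = 0
    for block in blocks:
        keys = [k for k, _ in block]
        if min(keys) <= key <= max(keys):
            scanned += len(block)
    return scanned


# Sixteen rows spread evenly across four customers, arriving interleaved.
rows = [(f"customer_{i % 4}", i) for i in range(16)]
unclustered = build_blocks(rows, block_size=4, clustered=False)
clustered = build_blocks(rows, block_size=4, clustered=True)

print(rows_scanned(unclustered, "customer_2"))  # 16: every block may hold the key
print(rows_scanned(clustered, "customer_2"))    # 4: only one block can hold the key
```

In this toy setup the same filter touches a quarter of the rows once the table is clustered, which is the effect behind billing for the data scanned rather than for the whole table or partition.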
BigQuery GIS is currently in alpha and is used for geospatial data analysis. While the Google Cloud team partnered with Google Earth Engine to build BigQuery GIS, you have to bring your own geospatial data to the table. That’s not a problem in several industries, including connected car systems, the Internet of Things (IoT), manufacturing, retail, smart cities, and telematics. Nor is it a problem for government agencies ranging from the Environmental Protection Agency (EPA) and the National Geospatial-Intelligence Agency to the National Oceanic and Atmospheric Administration (NOAA) and all of the military branches.
BigQuery GIS uses the S2 library, which now has over a billion users through a variety of products such as Google Earth Engine and Google Maps. If you need more geospatial data, then the federal government shares an immense amount of it on GeoPlatform.
A new Google Sheets data connector is likely to delight many data analysts simply because it’s so practical for daily use. You can access Google BigQuery from Google Sheets, Google’s spreadsheet program, and use Google Sheets tools such as Explore, which is a combined collaboration, data visualization, and natural language querying tool.
Google BigQuery now has a new user interface (UI) in beta, too. One of the more interesting elements is one-click visualization functionality, which Google Data Studio supports. All told, it’s a great round of upgrades for an already elegant service.
SQL Machine Learning, an article by Pam Baker