Last week was packed with new announcements for Google Cloud, thanks to their annual Google Next 2019 conference in San Francisco. I had the pleasure of attending. Here is an overview of what are, in my opinion, the most relevant new data analytics features.
“Information is the oil of the 21st century, and analytics is the combustion engine.” – Peter Sondergaard
- Data Fusion: this is definitely the biggest announcement and a real game-changer in the cloud data analytics world. Cloud Data Fusion is defined as “Fully managed, code-free data integration at any scale.” It comes to the rescue of the many end users who struggle to integrate their different data silos in a streamlined manner. As with every fully managed service, it shifts the focus from code to concrete insights and actions. Built on top of the CDAP open-source project, Data Fusion will ensure the portability of your pipelines and will integrate smoothly with your on-premises and cloud platforms. As this new product is a small revolution in itself, I will dedicate a separate post to it soon. Stay tuned!
- BigQuery BI Engine: probably one of the most valuable releases of Next 2019, and one that will help Data Studio close the gap with major players such as Tableau and Qlik. So what does it do? It finally caches results so that you can analyse data stored in BigQuery with sub-second query response times and high concurrency. Long loading times when refreshing or filtering a dashboard are now history. Last but not least, setting it up has never been so easy: a few clicks and you’re good to go. It will automatically cache your data based on the tables and columns you are using in your dashboards. While this is currently limited to 10 GB tables, the limit should soon increase to 150 GB.
- Cloud Dataflow SQL and Dataflow FlexRS: a common complaint about Dataflow is its steep learning curve for data analysts. This is exactly what Cloud Dataflow SQL tackles: it will make it possible for data analysts to build their own Dataflow pipelines using familiar SQL, for both batch and stream data processing. And all this happens through the BigQuery user interface, the idea being to keep extending BigQuery as an ELT platform. FlexRS (Flexible Resource Scheduling), for its part, targets the cost of batch pipelines rather than the learning curve.
- Connected Sheets: data analysts in most companies still rely heavily on spreadsheets for various kinds of data analysis. For this reason, Google has released “Connected Sheets”, which gives Google Sheets access to the power of BigQuery. In practice, this means no row limits, a direct connection to datasets in BigQuery and, most importantly, no need to learn SQL: you can simply use the familiar Sheets functionality.
- Data Catalog: in a world full of data, there is a need for a tool to document your data and make it easily accessible. Data Catalog comes to the rescue as “a fully managed and scalable metadata management service that empowers organisations to quickly discover, manage, and understand all their data in Google Cloud.” For security and data governance, it integrates with Cloud DLP – so you can discover and catalog sensitive data assets – and with Cloud IAM, where it honours source access control lists (ACLs), simplifying access management.
For the full list of new releases, I invite you to visit our blog here. For more analytics announcements, you can check out the Google Cloud blog. In this article, I’ve focused on the game changers, i.e. the tools that will position GCP as the leader in cloud analytics. These tools all share some common traits: accessibility, user-friendliness and tight integration.
Google Cloud Platform now offers the biggest and most robust suite of analytical tools. As a Google Cloud premier partner, we are here to help you get the most out of it. Any analytics or other cloud-related questions? Drop us a line, we’re happy to help!