Building Resilient Streaming Analytics Systems on Google Cloud (BRSAS)

 

Course Overview

Streaming data processing is increasingly popular because it gives businesses real-time metrics on their operations. This course covers how to build streaming data pipelines on Google Cloud. It introduces Pub/Sub for handling incoming streaming data, shows how to apply aggregations and transformations to streaming data using Dataflow, and explains how to store processed records in BigQuery or Cloud Bigtable for analysis. Learners get hands-on experience building streaming data pipeline components on Google Cloud by using Qwiklabs.

Who should attend

This class is intended for data analysts, data scientists, and programmers who want to build highly available, resilient, high-throughput, real-time streaming analytics systems on Google Cloud.

Prerequisites

  • Experience analyzing and visualizing big data, implementing cloud-based big data solutions, and transforming/processing datasets.
  • Google Cloud Big Data and Machine Learning Fundamentals (or equivalent experience).
  • Some knowledge of Java.

Course Objectives

  • Interpret use-cases for real-time streaming analytics
  • Manage data events by using the Pub/Sub asynchronous messaging service
  • Write streaming pipelines and run transformations where necessary
  • Use Dataflow, BigQuery, and Pub/Sub together for real-time streaming and analysis

Outline: Building Resilient Streaming Analytics Systems on Google Cloud (BRSAS)

Module 1 - Introduction to Processing Streaming Data

Topics:

  • Introduction to processing streaming data

Objectives:

  • Explain streaming data processing.
  • Describe the challenges with streaming data.
  • Identify the Google Cloud products and tools that can help address streaming data challenges.

Module 2 - Serverless Messaging with Pub/Sub

Topics:

  • Introduction to Pub/Sub
  • Pub/Sub push versus pull
  • Publishing with Pub/Sub code

Objectives:

  • Describe the Pub/Sub service.
  • Explain how Pub/Sub works.
  • Simulate real-time streaming sensor data using Pub/Sub.

Module 3 - Dataflow Streaming Features

Topics:

  • Streaming data challenges
  • Dataflow windowing

Objectives:

  • Describe the Dataflow service.
  • Build a stream processing pipeline for live traffic data.
  • Demonstrate how to handle late data by using watermarks, triggers, and accumulation.

Module 4 - High-Throughput BigQuery and Bigtable Streaming Features

Topics:

  • Streaming into BigQuery and visualizing results
  • High-throughput streaming with Bigtable
  • Optimizing Bigtable performance

Objectives:

  • Describe how to perform ad hoc analysis on streaming data using BigQuery and dashboards.
  • Discuss Cloud Bigtable as a low-latency solution.
  • Describe how to architect for Bigtable and how to ingest data into Bigtable.
  • Highlight performance considerations for the relevant services.

Module 5 - Advanced BigQuery Functionality and Performance

Topics:

  • Analytic window functions
  • Geographic Information System (GIS) functions
  • Performance considerations

Objectives:

  • Review some of BigQuery’s advanced analysis capabilities.
  • Discuss ways to improve query performance.

Prices & Delivery methods

Online Training

Duration
1 day

Price
  • Online Training: CAD 785
  • Online Training: US$ 595

Classroom Training

Duration
1 day

Price
  • Canada: CAD 785
