Course Overview

This 2-day course introduces learners to Google Cloud's data integration capability using Cloud Data Fusion. We discuss the challenges of data integration and the need for a data integration platform (middleware), then examine how Cloud Data Fusion helps integrate data from a variety of sources and formats to generate insights. We look at the main components of Cloud Data Fusion and how they work; how to process batch and real-time streaming data with visual pipeline design; rich metadata and data lineage tracking; and how to deploy data pipelines on various runtime engines.

Who should attend

Data Engineers
Data Analysts

Prerequisites

To get the most out of this course, participants are encouraged to have: Completed !

Course Objectives

Identify the need for data integration
Understand the capabilities of Cloud Data Fusion as a data integration platform
Identify use cases for possible implementation with Cloud Data Fusion
List the major components of Cloud Data Fusion
Design and execute batch and real-time data processing pipelines
Work with Wrangler to build data transformations
Use connectors to integrate data from different sources and formats
Configure the runtime environment; monitor and troubleshoot pipeline execution
Understand the relationship between metadata and data lineage

Outline: Data Integration with Cloud Data Fusion (DICDF)

Module 00 - Introduction

Module 01 - Introduction to Data Integration and Cloud Data Fusion

Data integration: what, why, challenges
Data integration tools used in the industry
User personas
Introduction to Cloud Data Fusion
Data integration critical capabilities
Cloud Data Fusion UI components

Module 02 - Building Pipelines

Cloud Data Fusion architecture
Core concepts
Data pipelines and directed acyclic graphs (DAG)
Pipeline lifecycle
Designing pipelines in Pipeline Studio

Module 03 - Designing Complex Pipelines

Branches, merging and joining
Actions and notifications
Error handling and macros
Pipeline configurations, scheduling, import and export

Module 04 - Pipeline Execution Environment

Scheduling and triggers
Execution environment: Compute profile and provisioners
Monitoring pipelines

Module 05 - Building Transformations and Preparing Data with Wrangler

Wrangler
Directives
User-defined directives

Module 06 - Connectors and Streaming Pipelines

Understand the data integration architecture
List various connectors
Use the Cloud Data Loss Prevention (DLP) API
Understand the reference architecture of streaming pipelines
Build and execute a streaming pipeline

Module 07 - Metadata and Data Lineage

Metadata
Data lineage

Module 08 - Summary

Course summary

Prices & Delivery methods

Online Training

Duration
2 days

Price

CAD 2,065

Classroom Training

Duration
2 days

Price

Canada: CAD 2,065

Schedule

Instructor-led Online Training: Dates marked as Online Training in the schedule are delivered as instructor-led online training. If you have any questions about our online courses, feel free to contact us by phone or email at any time.

Italy

Sep 24–25, 2026

Online Training Time zone: Central European Summer Time (CEST)
