AI Infrastructure Essentials (AIIE)

 

Course Overview

This course provides a foundational overview of the hardware, software, and networking components required to develop and manage AI models at scale. It explores Google Cloud's AI Hypercomputer architecture, compares compute accelerators like GPUs and TPUs, and examines the critical data pipelines and storage solutions necessary to maximize training performance.

Who should attend

IT decision-makers and infrastructure architects who need to understand the technical requirements of enterprise-grade AI deployment and what the AI Hypercomputer offers to meet them.

Prerequisites

Familiarity with cloud computing concepts and general data center infrastructure.

Course Objectives

  • Differentiate between the layers of the AI Hypercomputer.
  • Select the most cost-effective accelerators for AI workloads.
  • Evaluate storage and networking solutions to maximize training goodput.
  • Compare various deployment and consumption models for resource optimization.

Outline: AI Infrastructure Essentials (AIIE)

Module 1 - Foundations of AI Infrastructure

Topics:

  • Definition of AI infrastructure
  • The evolution of computing demands
  • The need for new computing power

Objectives:

  • N/A

Activities:

  • N/A

Module 2 - Google Cloud’s AI Hypercomputer

Topics:

  • The AI Hypercomputer
  • The 3 layers of the AI Hypercomputer: Overview

Objectives:

  • Differentiate between the layers of the AI Hypercomputer.

Activities:

  • N/A

Module 3 - Compute Accelerators: GPUs and TPUs

Topics:

  • Graphics Processing Units
    • GPU architecture
    • Google Cloud GPU family
    • Selecting GPUs
  • Tensor Processing Units
    • TPU architecture
    • Google Cloud TPU family
    • Best practices and considerations

Objectives:

  • Select the most cost-effective accelerators for AI workloads.

Activities:

  • 1x exercise/discussion

Module 4 - The AI Data Pipeline: Network and Storage

Topics:

  • Maximizing goodput
  • Networking for data ingestion and training
  • Storage for data preparation and training
  • Architecture for inference

Objectives:

  • Evaluate storage and networking solutions to maximize training goodput.

Activities:

  • 1x discussion

Module 5 - Orchestration and Consumption

Topics:

  • Deployment options
  • Flexible consumption

Objectives:

  • Compare various deployment and consumption models for resource optimization.

Activities:

  • N/A

Module 6 - Course Summary and Quiz

Topics:

  • Course summary
  • Q&A
  • Quiz

Objectives:

  • Differentiate between the layers of the AI Hypercomputer.
  • Select the most cost-effective accelerators for AI workloads.
  • Evaluate storage and networking solutions to maximize training goodput.
  • Compare various deployment and consumption models for resource optimization.

Activities:

  • 1x quiz with 4 MCQs

Prices & Delivery methods

Online Training

Duration
3 hours

Price
  • CAD 485

Classroom Training

Duration
3 hours

Price
  • Canada: CAD 485

Schedule

Currently there are no training dates scheduled for this course.