Fundamentals of Accelerated Computing with Modern CUDA C++ (FACCC)

Course Overview

This workshop provides a comprehensive introduction to general-purpose GPU programming with CUDA. You'll learn how to write, compile, and run GPU-accelerated code, leverage CUDA core libraries to harness the power of massive parallelism provided by modern GPU accelerators, optimize memory migration between CPU and GPU, and implement your own algorithms. At the end of the workshop, you'll have access to additional resources to create your own GPU-accelerated applications.

Please note that confirmed bookings are non-refundable: once you have confirmed your seat for an event, it cannot be cancelled and no refund will be issued, regardless of attendance.

Prerequisites

  • Basic C++ competency, including familiarity with lambda expressions, loops, conditional statements, functions, standard algorithms, and containers.
  • No previous knowledge of CUDA programming is assumed.

Course Objectives

At the conclusion of the workshop, you'll have an understanding of the fundamental concepts and techniques for accelerating C++ code with CUDA and be able to:

  • Write and compile code that runs on the GPU
  • Optimize memory migration between CPU and GPU
  • Leverage powerful parallel algorithms that simplify adding GPU acceleration to your code
  • Implement your own parallel algorithms by directly programming GPUs with CUDA kernels
  • Utilize concurrent CUDA streams to overlap memory traffic with compute
  • Know where, when, and how to best add CUDA acceleration to existing CPU-only applications

Outline: Fundamentals of Accelerated Computing with Modern CUDA C++ (FACCC)

Introduction

  • Meet the instructor.
  • Create an account at courses.nvidia.com/join

CUDA Made Easy: Accelerating Applications with Parallel Algorithms

To ease your first steps in GPU programming, this lab teaches you how to leverage powerful parallel algorithms that let you GPU-accelerate your code by changing only a few lines. While doing so, you’ll learn fundamental concepts such as execution space and memory space, parallelism, heterogeneous computing, and kernel fusion. These concepts will serve as a foundation for your advancement in accelerated computing. By the time you complete this lab, you will be able to:

  • Write, compile, and run GPU code
  • Refactor standard algorithms to execute on the GPU
  • Extend standard algorithms to fit your unique use cases
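
As a rough illustration of the kernel fusion idea mentioned above, the sketch below contrasts two ways of computing the largest absolute difference between two arrays using standard C++ algorithms. It is a host-only example under stated assumptions, not the course's lab code: the function names max_abs_diff_unfused and max_abs_diff_fused are hypothetical, and the code runs on the CPU. The unfused version writes every intermediate difference to a temporary buffer before reducing it, while the fused version applies the transformation inside the reduction and needs no temporary storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Unfused: one pass writes all differences to a temporary buffer,
// a second pass reduces that buffer to its maximum.
double max_abs_diff_unfused(const std::vector<double>& a,
                            const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> diffs(a.size());
    std::transform(a.begin(), a.end(), b.begin(), diffs.begin(),
                   [](double x, double y) { return std::abs(x - y); });
    return std::accumulate(diffs.begin(), diffs.end(), 0.0,
                           [](double m, double d) { return std::max(m, d); });
}

// Fused: the transformation is applied as part of the reduction,
// so no intermediate buffer is allocated or traversed.
double max_abs_diff_fused(const std::vector<double>& a,
                          const std::vector<double>& b) {
    assert(a.size() == b.size());
    return std::transform_reduce(
        a.begin(), a.end(), b.begin(), 0.0,
        [](double m, double d) { return std::max(m, d); },   // reduce step
        [](double x, double y) { return std::abs(x - y); }); // transform step
}
```

Both functions return the same value for the same inputs; the fused form simply avoids the extra memory traffic, which is the kind of saving that matters most when memory bandwidth is the bottleneck.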

Break (60 mins)

Unlocking the GPU’s Full Potential: Harnessing Asynchrony with CUDA Streams

In the previous lab, you learned how to use parallel algorithms. However, parallelism alone is not sufficient for accelerating your applications. To fully utilize GPUs, this lab teaches you another fundamental concept: asynchrony. You'll learn how and when to leverage asynchrony, and you’ll use Nsight Systems to distinguish synchronous from asynchronous algorithms and identify performance bottlenecks. By the time you complete this lab, you will be able to:

  • Use CUDA streams to overlap execution and memory transfers
  • Use CUDA events for asynchronous dependency management
  • Profile CUDA code with NVIDIA Nsight Systems

Break (15 mins)

Implementing New Algorithms with CUDA Kernels

Previous labs equipped you with the necessary understanding of how standard parallel algorithms can provide both convenient and speed-of-light GPU acceleration. However, sometimes your unique use cases are not covered by accelerated libraries. In this lab, you’ll learn the CUDA SIMT programming model to program the GPU directly using CUDA kernels. The lab also covers utilities provided by the CUDA ecosystem that facilitate the development of custom CUDA kernels. By the time you complete this lab, you will be able to:

  • Write and launch custom CUDA kernels
  • Control thread hierarchy
  • Leverage shared memory
  • Use cooperative algorithms
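
To give a feel for what controlling the thread hierarchy involves, here is a minimal host-side sketch of the index arithmetic commonly used in one-dimensional CUDA kernels. It is an emulation under stated assumptions rather than actual device code: the helper names blocks_for and emulate_launch are hypothetical, and the two nested loops stand in for the built-in variables blockIdx.x and threadIdx.x that a real kernel would read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of blocks needed so that blocks * threads_per_block >= n
// (ceiling division).
std::size_t blocks_for(std::size_t n, std::size_t threads_per_block) {
    assert(threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

// CPU emulation of a 1D launch over n elements. Returns how many
// times each element was visited; a correct mapping visits each once.
std::vector<int> emulate_launch(std::size_t n, std::size_t threads_per_block) {
    std::vector<int> visits(n, 0);
    const std::size_t grid = blocks_for(n, threads_per_block);
    for (std::size_t block = 0; block < grid; ++block) {                 // plays the role of blockIdx.x
        for (std::size_t thread = 0; thread < threads_per_block; ++thread) { // plays the role of threadIdx.x
            const std::size_t i = block * threads_per_block + thread;   // global index
            if (i < n) {   // bounds guard: the last block may be only partly used
                ++visits[i];
            }
        }
    }
    return visits;
}
```

For example, blocks_for(1000, 256) yields 4 blocks, so 1024 emulated threads are created and the bounds guard leaves the last 24 idle. On a GPU the loop iterations would run as parallel threads instead of sequentially.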

Final Review

  • Review key learnings and answer any remaining questions.
  • Complete the assessment to earn a certificate.
  • Take the workshop survey.

Prices & Delivery methods

Online Training

Duration
8 hours

Price
  • CAD 690

Classroom Training

Duration
8 hours

Price
  • Canada: CAD 690

Schedule

Currently there are no training dates scheduled for this course.