NVIDIA Fundamentals of Accelerated Computing with Modern CUDA C++

This workshop provides a comprehensive introduction to general-purpose GPU programming with CUDA. You'll learn how to write, compile, and run GPU-accelerated code; leverage CUDA core libraries to harness the massive parallelism of modern GPU accelerators; optimize memory migration between the CPU and GPU; and implement your own algorithms. At the end of the workshop, you'll have access to additional resources for creating your own GPU-accelerated applications.
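
For a flavour of the library-based approach described above, the sketch below is illustrative only and is not taken from the course materials; it assumes the CUDA Toolkit, a CUDA-capable GPU, and compilation with nvcc --extended-lambda. It uses Thrust, one of the CUDA core libraries, to run a simple transform on the GPU without writing a kernel by hand.

    #include <thrust/device_vector.h>
    #include <thrust/transform.h>
    #include <thrust/sequence.h>
    #include <cstdio>

    int main() {
        const int n = 1 << 20;
        thrust::device_vector<float> x(n);   // data lives in GPU memory
        thrust::device_vector<float> y(n);
        thrust::sequence(x.begin(), x.end()); // x = 0, 1, 2, ...

        // y[i] = 2 * x[i], computed in parallel on the GPU via a device lambda.
        thrust::transform(x.begin(), x.end(), y.begin(),
                          [] __device__ (float v) { return 2.0f * v; });

        float check = y[42];                  // copies one value back to the host
        std::printf("y[42] = %f\n", check);   // expect 84.0
        return 0;
    }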

Difficulty rating: ★★★★ Advanced

Who is it for?

  • Both research staff and research students
  • Developers, data scientists, and researchers looking to solve challenging problems with deep learning and accelerated computing

Summary of the topics covered

  • Write and compile code that runs on the GPU
  • Optimize memory migration between CPU and GPU
  • Leverage powerful parallel algorithms that simplify adding GPU acceleration to your code
  • Implement your own parallel algorithms by directly programming GPUs with CUDA kernels (see the sketch after this list)
  • Utilize concurrent CUDA streams to overlap memory traffic with compute
  • Know where, when, and how to best add CUDA acceleration to existing CPU-only applications
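
The sketch below is again illustrative only, not course material, and assumes the CUDA Toolkit and a CUDA-capable GPU. It shows the kind of hand-written kernel the workshop builds up to: a SAXPY computed with a grid-stride loop over unified (managed) memory, which is also the starting point for the memory-migration and stream-overlap topics above.

    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void saxpy(int n, float a, const float *x, float *y) {
        // Grid-stride loop: each thread handles multiple elements if n is large.
        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
             i += gridDim.x * blockDim.x) {
            y[i] = a * x[i] + y[i];
        }
    }

    int main() {
        const int n = 1 << 20;
        float *x, *y;

        // Unified memory is accessible from both CPU and GPU; the driver
        // migrates pages on demand (tuning this migration is covered in the course).
        cudaMallocManaged(&x, n * sizeof(float));
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // Launch enough 256-thread blocks to cover the array.
        int blocks = (n + 255) / 256;
        saxpy<<<blocks, 256>>>(n, 3.0f, x, y);
        cudaDeviceSynchronize();              // wait for the GPU to finish

        std::printf("y[0] = %f\n", y[0]);     // expect 5.0
        cudaFree(x);
        cudaFree(y);
        return 0;
    }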

Prerequisites

Basic C++ competency, including familiarity with lambda expressions, loops, conditional statements, functions, standard algorithms and containers.

Frequency

Twice a year

Duration

12 hours (over 2 days)

Next course

Tuesday 25th November 2025 (9:30–16:30) and Thursday 27th November 2025 (9:30–15:00). Please attend both days.

Book here

Can't attend?

We don't have online materials for this session, but the course will run again, so you'll be very welcome to join next time. You can find more information here.