Course Objectives
After completing this course, students will be able to write fast, scalable applications for modern parallel systems. They will be able to identify suitable hardware architectures and programming models for an application of interest, and to implement, debug, and tune that application for performance.
Schedule
Week | Date | Topic | Reading/Homework |
---|---|---|---|
1 | Sep. 27 | Intro to Parallel Computing | |
 | Sep. 29 | Performance | Amdahl's Law, Gustafson's Law, Isoefficiency |
2 | Oct. 4 | Architecture I | Ninja Performance Gap |
 | Oct. 6 | Architecture II | Cray-1 |
3 | Oct. 11 | Fork-Join Model and OpenMP | OpenMP Tutorial |
 | Oct. 13 | OpenMP II | Homework 01 |
4 | Oct. 18 | OpenMP III | |
 | Oct. 20 | Parallel Patterns | Homework 02 |
5 | Oct. 25 | Collectives and MPI | MPI Tutorial |
 | Oct. 27 | MPI II | Homework 03 |
6 | Nov. 1 | MPI III | |
 | Nov. 3 | GPUs and CUDA | Homework 04, CUDA Programming Guide |
7 | Nov. 8 | CUDA II | Homework 04 Slides |
 | Nov. 10 | CUDA III | |
8 | Nov. 15 | CUDA IV | |
 | Nov. 17 | CUDA continued | |
9 | Nov. 22 | Performance Tools (Guest Lecture) | |
 | Nov. 24 | Optional Lecture: Perf. Modeling & 12 Ways to Fool the Masses | |
10 | Nov. 29 | Project Presentation I | |
 | Dec. 1 | Project Presentation II | |
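As a taste of the Week 1 performance material, the two scaling laws listed in the reading can be stated in a few lines of code. This is a minimal sketch (the function names are illustrative, not part of any course material), where `serial_fraction` is the fraction of the work that cannot be parallelized and `workers` is the number of processors:

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Amdahl's Law: speedup for a fixed problem size.

    S(p) = 1 / (s + (1 - s) / p), which approaches 1/s as p grows.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def gustafson_speedup(serial_fraction: float, workers: int) -> float:
    """Gustafson's Law: speedup when the problem size scales with p.

    S(p) = s + p * (1 - s), which keeps growing with p.
    """
    return serial_fraction + workers * (1.0 - serial_fraction)


# With a 10% serial fraction and 10 workers, Amdahl caps the speedup
# near 5.3x, while Gustafson's scaled-workload view gives about 9.1x.
print(amdahl_speedup(0.1, 10))
print(gustafson_speedup(0.1, 10))
```

The contrast between the two numbers is the point: for a fixed workload the serial fraction dominates quickly, while scaled workloads keep most of the machine busy.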