Syllabus

Course Objectives

After completing this course, students will be able to write fast, scalable applications for modern parallel systems. They will be able to identify the hardware architectures and programming models best suited to their application of interest, and to implement, debug, and tune that application for performance.

Schedule

Week  Date     Topic                              Reading/Homework
1     Sep. 27  Intro to Parallel Computing
      Sep. 29  Performance                        Amdahl’s Law, Gustafson’s Law, Isoefficiency
2     Oct. 4   Architecture I                     Ninja Performance Gap
      Oct. 6   Architecture II                    Cray-1
3     Oct. 11  Fork-Join Model and OpenMP         OpenMP Tutorial
      Oct. 13  OpenMP II                          Homework 01
4     Oct. 18  OpenMP III
      Oct. 20  Parallel Patterns                  Homework 02
5     Oct. 25  Collectives and MPI                MPI Tutorial
      Oct. 27  MPI II                             Homework 03
6     Nov. 1   MPI III
      Nov. 3   GPUs and CUDA                      Homework 04, CUDA Programming Guide
7     Nov. 8   CUDA II                            Homework 04 Slides
      Nov. 10  CUDA III
8     Nov. 15  CUDA IV
      Nov. 17  CUDA continued
9     Nov. 22  (Guest Lecture) Performance Tools
      Nov. 24  Optional Lecture - Perf. Modeling & 12 Ways to Fool the Masses
10    Nov. 29  Project Presentation I
      Dec. 1   Project Presentation II