From Code to Performance

Tutorial 2: Introduction to High-Performance Parallel Programming

1. Sequential vs Parallel

In sequential computing, one task executes at a time. In parallel computing, multiple tasks run simultaneously on separate processors.

Key Insight:

T_parallel ≈ T_sequential / P + T_overhead

Where P = processors, and T_overhead = communication + synchronization cost.
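To make the model concrete, here is a minimal sketch in C that evaluates it. The function names (estimate_parallel_time, estimate_speedup) are hypothetical and chosen only for illustration, and the sketch assumes all times are given in the same unit.

```c
#include <assert.h>

/* Estimated parallel run time under the simple model above:
 *   T_parallel ~= T_sequential / P + T_overhead
 * All times must use the same unit (for example, seconds).
 * Hypothetical helper, not part of any library. */
double estimate_parallel_time(double t_sequential, int processors,
                              double t_overhead)
{
    assert(processors >= 1);
    return t_sequential / processors + t_overhead;
}

/* Speedup implied by the model: T_sequential / T_parallel. */
double estimate_speedup(double t_sequential, int processors,
                        double t_overhead)
{
    return t_sequential /
           estimate_parallel_time(t_sequential, processors, t_overhead);
}
```

With made-up inputs of T_sequential = 100 s, P = 4, and T_overhead = 5 s, the model gives 100 / 4 + 5 = 30 s, a speedup of about 3.3x rather than the ideal 4x. The overhead term is what keeps the result below P.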

2. Levels of Parallelism

Example: summing a large array with an OpenMP reduction

// Each thread keeps a private partial sum; the reduction clause adds them into sum at the end.
#pragma omp parallel for reduction(+:sum)
for (int i = 0; i < N; i++)
    sum += A[i];
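The fragment above leaves out its declarations. One way to wrap it in a compilable function is sketched below. The function name parallel_sum and the choice of double elements are assumptions, since the original does not say what type A holds.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n doubles (hypothetical wrapper around the loop above).
 * When compiled with OpenMP enabled (e.g. gcc -fopenmp), each thread
 * accumulates a private partial sum and the reduction clause combines
 * them. Without OpenMP the pragma is ignored and the loop simply runs
 * sequentially, so the function still returns the correct sum. */
double parallel_sum(const double *a, size_t n)
{
    assert(a != NULL || n == 0);

    double sum = 0.0;
    /* A signed loop index keeps the loop valid for older OpenMP versions. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++)
        sum += a[i];
    return sum;
}
```

Because the reduction adds the partial sums in an unspecified order, a floating-point result can differ in the last few bits from a strictly sequential sum.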

3. Amdahl’s Law

Even with many processors, some code remains sequential, and that sequential portion caps the achievable speedup.

$$\text{Speedup} = \frac{1}{(1 - P) + \frac{P}{N}}$$

where P = parallelizable fraction of the code (a value between 0 and 1, not the processor count that P denotes in Section 1) and N = number of processors.

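Example: If 90% of the code is parallelizable (P = 0.9), the speedup as N → ∞ approaches 1 / (0.1 + 0.9/∞) = 1 / 0.1 = 10, so 10x is the upper limit no matter how many processors are added.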
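The same calculation can be sketched in C to see how quickly the speedup flattens out. The function names amdahl_speedup and amdahl_limit are hypothetical, introduced only for this illustration.

```c
#include <assert.h>

/* Amdahl's Law: speedup = 1 / ((1 - p) + p / n)
 * p: parallelizable fraction of the code, in [0, 1]
 * n: number of processors, at least 1 */
double amdahl_speedup(double p, int n)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(n >= 1);
    return 1.0 / ((1.0 - p) + p / n);
}

/* Upper bound on speedup as n grows without limit: 1 / (1 - p).
 * Requires p < 1, since a fully parallel program has no finite bound. */
double amdahl_limit(double p)
{
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

For P = 0.9, amdahl_speedup gives about 5.3x at N = 10 and about 9.9x at N = 1000, closing in on the 10x bound returned by amdahl_limit. Most of the benefit arrives with the first few dozen processors.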

4. Scalability Challenges