Tutorial 2: Introduction to High-Performance Parallel Programming
1. Sequential vs Parallel
In sequential computing, one task executes at a time. In parallel computing, multiple tasks run simultaneously on different processors or cores.
Key Insight:
T_parallel ≈ T_sequential / P + T_overhead
where P = number of processors and T_overhead = communication + synchronization cost.
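The estimate above can be evaluated directly to see how overhead eats into the ideal speedup. The following is a minimal sketch, not a performance model for any real machine; the function name `estimate_parallel_time` and its parameter names are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical helper: evaluates T_sequential / processors + T_overhead.
   All times are in the same unit (e.g. seconds). */
double estimate_parallel_time(double t_sequential, int processors, double t_overhead)
{
    assert(processors >= 1);      /* at least one processor is required */
    assert(t_overhead >= 0.0);    /* overhead cannot be negative */
    return t_sequential / (double)processors + t_overhead;
}
```

For instance, a 100-second sequential job on 4 processors with 5 seconds of overhead is estimated at 100 / 4 + 5 = 30 seconds, compared with the ideal 25 seconds.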
2. Levels of Parallelism
- Data Parallelism: Each core processes a subset of data.
- Task Parallelism: Different tasks run in parallel.
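- Pipeline Parallelism: Work is split into stages that run concurrently on successive data items. In practice these levels are often combined, for example MPI across nodes with OpenMP within a node.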
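Example: summing a large array with an OpenMP reduction
    double sum = 0.0;   /* A is assumed to hold N doubles */
    /* each thread accumulates a private partial sum; OpenMP combines them at the end */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += A[i];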
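For contrast with the data-parallel reduction above, task parallelism can be sketched with OpenMP `sections`, where two different computations over the same array run as separate tasks. This is an illustrative sketch only: the helper name `sum_and_max` is hypothetical, and the array is assumed to hold doubles. If the code is compiled without OpenMP support, the pragmas are ignored and the two sections simply run one after the other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes the sum and the maximum of an array
   as two independent tasks that may run on different threads. */
void sum_and_max(const double *a, size_t n, double *sum_out, double *max_out)
{
    assert(n > 0);
    double sum = 0.0;
    double max = a[0];

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            /* Task 1: only this section writes to sum */
            for (size_t i = 0; i < n; i++)
                sum += a[i];
        }
        #pragma omp section
        {
            /* Task 2: only this section writes to max */
            for (size_t i = 1; i < n; i++)
                if (a[i] > max)
                    max = a[i];
        }
    }

    *sum_out = sum;
    *max_out = max;
}
```

Each section writes to its own variable, so no synchronization between the two tasks is needed. With only two sections, at most two threads do useful work, which is one reason data parallelism tends to scale further than task parallelism.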
3. Amdahl’s Law
Even with many processors, some code remains sequential.
$$\text{Speedup} = \frac{1}{(1 - P) + \frac{P}{N}}$$
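where P = parallelizable fraction of the program (a value between 0 and 1) and N = number of processors. Note that P here is a fraction, not the processor count used in Section 1.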
Example: If 90% of the code is parallelizable (P = 0.9), then as N → ∞ the speedup approaches 1 / (0.1 + 0.9/∞) = 1 / 0.1 = 10x. No number of processors can exceed this limit.
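To see how quickly the speedup flattens out, Amdahl's Law can be evaluated for a range of processor counts. The sketch below is one way to do this; the function names `amdahl_speedup` and `print_amdahl_table` and the chosen processor counts are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Speedup predicted by Amdahl's Law for parallel fraction p on n processors. */
double amdahl_speedup(double p, int n)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(n >= 1);
    return 1.0 / ((1.0 - p) + p / (double)n);
}

/* Hypothetical helper: prints the predicted speedup for several processor counts. */
void print_amdahl_table(double p)
{
    const int counts[] = {1, 2, 4, 8, 16, 64, 1024};
    const int num_counts = (int)(sizeof counts / sizeof counts[0]);

    printf("P = %.2f\n", p);
    for (int i = 0; i < num_counts; i++)
        printf("N = %4d  speedup = %.2f\n", counts[i], amdahl_speedup(p, counts[i]));
}
```

Calling `print_amdahl_table(0.9)` shows the speedup climbing toward, but never reaching, the 10x limit: with P = 0.9 and N = 10 the formula gives 1 / (0.1 + 0.09) ≈ 5.26, only about half of the theoretical maximum.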
4. Scalability Challenges
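- Communication latency: Time spent moving data between processors instead of computing.
- Memory contention: Cores compete for shared memory bandwidth and cache lines.
- Load imbalance: Some processors finish early and sit idle while others are still working.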
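Load imbalance in particular is easy to reproduce. In the sketch below, iteration i does work proportional to i, so splitting the loop into equal contiguous chunks would give the last thread far more work than the first. One common mitigation, shown here as an assumption rather than a recommendation from this tutorial, is OpenMP's `schedule(dynamic)` clause, which hands out small chunks of iterations to threads as they become free. The function name `triangular_work` and the chunk size of 16 are hypothetical.

```c
#include <assert.h>

/* Hypothetical uneven workload: iteration i performs i + 1 additions,
   so later iterations are more expensive than earlier ones. */
long triangular_work(int n)
{
    assert(n >= 0);
    long total = 0;

    /* dynamic scheduling: idle threads pick up the next chunk of 16 iterations */
    #pragma omp parallel for schedule(dynamic, 16) reduction(+:total)
    for (int i = 0; i < n; i++) {
        long local = 0;
        for (int j = 0; j <= i; j++)
            local += j;
        total += local;
    }
    return total;
}
```

Dynamic scheduling trades some extra scheduling overhead for a more even distribution of work, so the best chunk size depends on how uneven the iterations are.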