Reducing Overhead
Convert to a coarser-grained model:
- More work per parallel region
- Reduce synchronization across threads
Combine multiple DO directives into a single parallel region:
- Continue to use work-sharing directives
- Compiler does the work of distributing iterations
- Less work for user
- Doesn’t break code
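The points above can be sketched in C, where the `for` work-sharing directive plays the role of the Fortran DO directive. This is a minimal illustration, not a tuned implementation: the function names and the scale-then-shift computation are hypothetical and chosen only to show the restructuring. The first function opens two parallel regions; the second opens one region and keeps both work-sharing loops inside it.

```c
/* Hypothetical example: the function names and the arithmetic are
   illustrative assumptions, not taken from the original material. */

/* Fine-grained version: each loop is its own parallel region, so the
   program enters and leaves a parallel region twice. */
void scale_then_shift_fine(double *a, double *b, int n)
{
    int i;

    #pragma omp parallel for
    for (i = 0; i < n; i++)
        a[i] = 2.0 * a[i];

    #pragma omp parallel for
    for (i = 0; i < n; i++)
        b[i] = a[i] + 1.0;
}

/* Coarser-grained version: one parallel region containing two
   work-sharing loops. The compiler still distributes the iterations. */
void scale_then_shift_coarse(double *a, double *b, int n)
{
    int i;

    #pragma omp parallel private(i)
    {
        #pragma omp for
        for (i = 0; i < n; i++)
            a[i] = 2.0 * a[i];
        /* Implicit barrier here: every a[i] is written before any
           thread starts the second loop. */

        #pragma omp for
        for (i = 0; i < n; i++)
            b[i] = a[i] + 1.0;
    }
}
```

Both functions compute the same result. If the compiler is run without OpenMP support, the pragmas are ignored and the loops run serially, which is one sense in which this restructuring does not break the code.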