OpenMP 4.5 Released for Heterogeneous Parallel Programming

OpenMP 4.5 significantly improves programming of accelerator devices and adds support for parallelizing doacross loops

SC15, Austin, Texas – November 16, 2015 – The OpenMP ARB is pleased to announce OpenMP 4.5, a major upgrade of the OpenMP standard language specifications. This release substantially improves support for programming accelerator and GPU devices, and now also supports the parallelization of loops with well-structured dependences. Implementation is underway in GCC and Clang. The new specification is available on the OpenMP website.

Standard for parallel programming extends its reach

With this release, OpenMP, the de facto standard for parallel programming on shared memory systems, continues to extend its reach beyond pure HPC to include DSPs, real time systems, and accelerators. OpenMP aims to provide high-level parallel language support for a wide range of applications, from biotech and automotive to aeronautics, automation, robotics and financial analysis.

“OpenMP 4.5 is a significant achievement that demonstrates the industry-wide collaboration and the hard work and dedication within the OpenMP community”, says Michael Wong, CEO of the OpenMP ARB. “It is more than a minor release, representing the road towards OpenMP 5.0 while we continue on a cadence that delivers Technical Reports and/or Ratified Specifications annually, in keeping pace with the marketplace.”

Many new features

  • Significantly improved support for devices. OpenMP now provides mechanisms for unstructured data mapping and asynchronous execution, as well as runtime routines for device memory management. These routines allow for allocating, copying and freeing device memory.
  • Support for doacross loops. A natural mechanism to parallelize loops with well-structured dependences is provided.
  • New taskloop construct. Support for dividing loops into tasks, avoiding the requirement that all threads execute the loop.
  • Reductions for C/C++ arrays. This often requested feature is now available by building on support for array sections.
  • New hint mechanisms. Hint mechanisms can provide guidance on the relative priority of tasks and on preferred synchronization implementations.
  • Thread affinity support. It is now possible to use runtime functions to determine the effect of thread affinity clauses.
  • Improved support for Fortran 2003. Users can now parallelize many Fortran 2003 programs.
  • SIMD extensions. These extensions include the ability to specify exact SIMD width and additional data-sharing attributes.


Implementation is already almost complete in GCC version 6.0 and is starting in the current Clang trunk. Other vendor compilers are following.