[Omp] OpenMP spec 2.5 seems to have incorrect flush example on page 12
Marcel Beemster
marcel at ace.nl
Sat May 5 03:10:28 PDT 2007
Larry wrote:
> It really does seem specific to this particular optimization; I've been
> trying to think of other cases. Is a flush required even if the
> assignment is never executed?
While Jakub wrote:
> There are no OpenMP directives in the
> for( i = 0 ; i < 100 ; i++ )
> loop, it can very well be moved to a new routine in some other compilation
> unit, perhaps not built with OpenMP flags at all. Are you saying that
> because of OpenMP existence all similar loop transformations are illegal?
I side with Jakub on this. This is really not specific to
this particular optimization. If an OMP compiler does not
have this freedom >between flushes<, then any optimization
involving globally visible objects (including malloced arrays
residing in memory), also in library code that is free of
OMP directives, has to be looked at very carefully.
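
To make the kind of transformation at stake concrete, here is a
minimal sketch in C. It is a hypothetical illustration, not the
example posted earlier in this thread: the names shared_total,
accumulate_as_written and accumulate_as_optimized are invented here.
The second function is what a sequential optimizer might plausibly
produce from the first by keeping the global in a register for the
duration of the loop.

```c
/* Hypothetical illustration; these names are not from the thread
 * or from the OpenMP specification. */
int shared_total;   /* globally visible object residing in memory */

/* As written: shared_total is touched only when a[i] > 0. */
void accumulate_as_written(const int *a)
{
    int i;
    for (i = 0; i < 100; i++)
        if (a[i] > 0)
            shared_total += a[i];
}

/* As a sequential optimizer might rewrite it: one load before the
 * loop, one store after it. The store happens even if the
 * conditional assignment never executes. */
void accumulate_as_optimized(const int *a)
{
    int i;
    int tmp = shared_total;     /* load once into a temporary */
    for (i = 0; i < 100; i++)
        if (a[i] > 0)
            tmp += a[i];
    shared_total = tmp;         /* unconditional write-back */
}
```

For a single thread the two versions leave the same value in
shared_total. They differ only in what another thread could observe
while the loop runs, which is exactly the question of what is
permitted between flushes.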
As a compiler writer, I was happy to read the intentions of
the OMP memory model at the start of section 1.4, page 10. It says:
"OpenMP provides a relaxed-consistency, shared-memory model. All
OpenMP threads have access to a place to store and retrieve
variables, called the memory. In addition, each thread
is allowed to have its own temporary view of the memory.
The temporary view of memory for each thread is not a required
part of the OpenMP memory model, but can represent any kind
of intervening structure, such as machine registers, cache,
or other local storage, between the thread and the memory.
The temporary view of memory allows the thread to cache
variables and thereby avoid going to memory for every reference
to a variable."
I interpret this as saying that an OMP compiler can be as
aggressive in its optimizations as a compiler for sequential
C or Fortran, between points where the application programmer
explicitly places OMP directives. I also believe that this
is what we all should want, because we should not start off
our parallelization effort with a built-in disadvantage over
sequential code.
The implication of allowing compilers to do such optimizations
is that the page 12 communication of shared variables example
should be removed from the OMP specification. I argue that this
is not a great loss: we lose the ability to communicate a shared
variable between essentially >unsynchronized< parallel threads. It
is actually non-trivial to construct a program that performs such
communication; see my example code. If you want to do this, use
volatile.
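
A rough sketch of what the volatile route could look like in C
follows. The names ready, payload, publish and try_consume are
hypothetical, and the sketch assumes that the implementation makes
every volatile access go to memory in program order; whether that
is enough on a particular machine is a separate question.

```c
/* Hypothetical names; a sketch, not a definitive idiom. */
volatile int ready = 0;     /* set to 1 once payload is valid */
volatile int payload = 0;   /* the value being communicated */

/* Producer side: write the data first, then raise the flag.
 * Both objects are volatile, so the compiler may not cache
 * them in registers or reorder the two stores. */
void publish(int value)
{
    payload = value;
    ready = 1;
}

/* Consumer side: returns 1 and stores the value in *out if the
 * flag has been raised, otherwise returns 0 and leaves *out
 * untouched. A waiting thread would call this in a loop. */
int try_consume(int *out)
{
    if (!ready)
        return 0;
    *out = payload;
    return 1;
}
```

Because ready is volatile, a loop such as
while (!try_consume(&v)) ; has to re-read the flag on every
iteration instead of keeping it in a register.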
In that case, the explicit and unsynchronized "#pragma omp
flush" also becomes meaningless and must be removed from
OMP. The flush is only useful when it occurs synchronized between
two or more threads, for example implied at a barrier. I am
not a fan of suddenly removing well-recognized features from
language specifications, but it really is the only logical
outcome of wanting OMP compilers to do optimizations (in an
equally aggressive way as their sequential counterparts).
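
For contrast, here is a small sketch of the synchronized case, where
the flush comes with a barrier. The function name fill_then_sum and
its parameters are hypothetical. It relies on the implicit barrier
at the end of a worksharing for construct, so the thread executing
the single construct sees all elements written by the other
threads without any explicit flush in the source.

```c
/* Hypothetical example; compiles with or without OpenMP enabled
 * (without it the pragmas are ignored and the code runs
 * sequentially). */
void fill_then_sum(int *data, int n, long *sum)
{
    #pragma omp parallel
    {
        int i;

        /* Each thread fills its share of the array. */
        #pragma omp for
        for (i = 0; i < n; i++)
            data[i] = i;
        /* Implicit barrier (and implied flush) at the end of the
         * for construct: all writes to data[] are now visible. */

        /* One thread reads everything the others wrote. */
        #pragma omp single
        {
            long total = 0;
            int j;
            for (j = 0; j < n; j++)
                total += data[j];
            *sum = total;
        }
    }
}
```

Between the directives the compiler remains free to optimize the
loop bodies as it would in sequential code; the only points at which
memory must be made consistent are the ones the programmer marked.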
Marcel
--
Dr. Marcel Beemster, Senior Software Engineer, marcel at ace.nl,www.ace.nl
Associated Compiler Experts bv. Amsterdam, Netherlands. +31 20 6646416.