[Omp] OpenMP spec 2.5 seems to have incorrect flush example on page 12

Greg Bronevetsky greg at bronevetsky.com
Sat May 5 16:40:54 PDT 2007


Let me clarify. Are you saying that if you have code like "if(0) { X=1;
}" then the behavior of the code is undefined if the "X=1;" would
participate in a race were it ever executed? This interpretation would
certainly allow gcc's optimization.

                             Greg Bronevetsky

On Sat, 5 May 2007, Meadows, Lawrence F wrote:

> Sorry, I hit send too soon.
> 
> Your example references a variable X that is being used by two threads but does not use a flush. You need the flush. Think of the variable as being volatile in the region where it might be accessed by another thread. Essentially a load *was* moved and an extra store introduced. A flush was needed to prevent that optimization. Volatile is a heavier hammer that would have worked too.
> 
> -- Larry
> 
>  -----Original Message-----
> From: 	Greg Bronevetsky [mailto:greg at bronevetsky.com]
> Sent:	Saturday, May 05, 2007 04:03 PM Pacific Standard Time
> To:	Meadows, Lawrence F
> Cc:	Marcel Beemster; omp at openmp.org
> Subject:	RE: [Omp] OpenMP spec 2.5 seems to have incorrect flush example on page 12
> 
> I don't understand. This particular optimization doesn't move anything
> around. It introduces new operations that were not present in the original
> code. Furthermore, Jay's example specifically applies to flushes with
> lists. My example uses flushes without lists (the ones implied by lock
> operations), so it's not the same thing.
> 
>                              Greg Bronevetsky
> 
> On Sat, 5 May 2007, Meadows, Lawrence F wrote:
> 
> > Locks imply flush, right? You can't examine or set X without a
> > lock or a flush. Your first example is broken because the
> > assignment to X by thread 0 is not guarded with a lock or
> > a flush. It doesn't matter whether or not it is ever
> > executed. The compiler is free to move things around
> > if a store/load (whether executed or not) isn't guarded
> > with a flush.
> > 
> > The actual act of execution is not the issue, I finally
> > understand this after Jay pointed it out. If a path of
> > execution includes a flush of a variable then the compiler
> > is not allowed to hoist loads or stores out of that
> > path.
> > 
> > 
> > -----Original Message-----
> > From: omp-bounces at openmp.org [mailto:omp-bounces at openmp.org] On Behalf
> > Of Greg Bronevetsky
> > Sent: Saturday, May 05, 2007 2:45 PM
> > To: Marcel Beemster
> > Cc: omp at openmp.org
> > Subject: Re: [Omp] OpenMP spec 2.5 seems to have incorrect flush example
> > on page 12
> > 
> > My understanding of what Marcel has been saying is that it applies
> > equally to flush with and without a list. Marcel suggests that flush in
> > general is a bad idea since it prevents certain sequential optimizations.
> > Although Marcel suggests that programmers be limited to barriers and the
> > like, I believe that barriers are the only synchronization construct that
> > his optimization is compatible with. In particular, locks, critical
> > regions and ordered regions don't seem to be compatible. For example,
> > consider the following program, which uses locks: (the loop is just like
> > Marcel's in that X=1 is never executed but the compiler can't know that)
> > (initially X=21)
> > Thread 0           Thread 1
> > --------           --------
> > while(?)           omp_set_lock(&l)
> >   if(?) X=1        X=42
> >                    omp_unset_lock(&l)
> > omp_set_lock(&l)
> > print X
> > omp_unset_lock(&l)
> > 
> > I think everyone can agree that the print on Thread 0 should output
> > 42. However, Marcel's optimization can transform the above execution
> > into the one below:
> > (initially X=21)
> > Thread 0           Thread 1
> > --------           --------
> > r=X                omp_set_lock(&l)
> > while(?)           X=42
> >   if(?) r=1        omp_unset_lock(&l)
> > X=r
> > omp_set_lock(&l)
> > print X
> > omp_unset_lock(&l)
> > 
> > The result is that Thread 0 prints 21.
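To make the interleaving concrete, here is a minimal sketch in C that replays both versions of the program in one fixed order inside a single thread, so the outcome is deterministic. It is only an illustration, not a real OpenMP program: the function names, the ITERATIONS constant and the cond parameter (standing in for "if(?)") are assumptions made for this sketch, and the lock calls appear only as comments.

```c
#include <assert.h>

/* Number of loop iterations replayed; hypothetical, any value >= 0 works. */
enum { ITERATIONS = 3 };

/*
 * Replays the program as written. Thread 1's critical section runs while
 * Thread 0 is still before or inside its loop. cond models "if(?)"; the
 * example assumes it never holds at run time (cond == 0).
 * Returns the value Thread 0 prints.
 */
int replay_original(int cond)
{
    int X = 21;                 /* stands in for the shared variable */
    int i;

    assert(cond == 0 || cond == 1);

    X = 42;                     /* Thread 1: set lock, X=42, unset lock */
    for (i = 0; i < ITERATIONS; i++)
        if (cond)
            X = 1;              /* Thread 0: if(?) X=1 */
    return X;                   /* Thread 0: set lock, print X, unset lock */
}

/*
 * Replays the transformed program in the order that exposes the problem:
 * Thread 0 loads X into r, Thread 1 then writes 42, and Thread 0's
 * unconditional write-back overwrites it.
 */
int replay_transformed(int cond)
{
    int X = 21;                 /* stands in for the shared variable */
    int r;
    int i;

    assert(cond == 0 || cond == 1);

    r = X;                      /* Thread 0: r=X */
    X = 42;                     /* Thread 1: set lock, X=42, unset lock */
    for (i = 0; i < ITERATIONS; i++)
        if (cond)
            r = 1;              /* Thread 0: if(?) r=1 */
    X = r;                      /* Thread 0: X=r, a store the source never ran */
    return X;                   /* Thread 0: set lock, print X, unset lock */
}
```

With cond == 0, replay_original returns 42 and replay_transformed returns 21, matching the two outcomes described above.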
> > 
> > As such, we have a choice: keep this sequential optimization, or keep
> > all synchronization constructs besides barrier.
> > 
> > There is one other option. Instead of the above transformation, do the
> > following:
> > (initially X=21)
> > Thread 0           Thread 1
> > --------           --------
> > r=X                omp_set_lock(&l)
> > X2=X               X=42
> > while(?)           omp_unset_lock(&l)
> >   if(?) r=1        
> > if(r!=X2) X=r
> > omp_set_lock(&l)
> > print X
> > omp_unset_lock(&l)
> > 
> > This would preserve the desired semantics at a pretty small reduction in
> > sequential performance.
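A minimal sketch of the guarded write-back, replayed in the same fixed order within a single thread. It assumes the guard is meant to compare the register copy r against the saved initial value X2, so that the store happens only when the loop actually assigned r; that reading, the function name and the cond parameter (standing in for "if(?)") are assumptions of this sketch.

```c
#include <assert.h>

/* Number of loop iterations replayed; hypothetical, any value >= 0 works. */
enum { GUARDED_ITERATIONS = 3 };

/*
 * Replays the guarded variant: Thread 0 keeps X in r during the loop but
 * writes it back only if r differs from the value it started with.
 * Returns the value Thread 0 prints.
 */
int replay_guarded(int cond)
{
    int X = 21;                 /* stands in for the shared variable */
    int r, X2;
    int i;

    assert(cond == 0 || cond == 1);

    r = X;                      /* Thread 0: r=X */
    X2 = X;                     /* Thread 0: X2=X, remember the loaded value */
    X = 42;                     /* Thread 1: set lock, X=42, unset lock */
    for (i = 0; i < GUARDED_ITERATIONS; i++)
        if (cond)
            r = 1;              /* Thread 0: if(?) r=1 */
    if (r != X2)
        X = r;                  /* write back only if the loop changed r */
    return X;                   /* Thread 0: set lock, print X, unset lock */
}
```

When the condition never holds, no store is issued and Thread 1's 42 survives. When it does hold, the write-back stores 1, the same value the source would have stored. One caveat of this particular guard: if the loop assigns r a value equal to X2, the store is skipped even though the source performed one.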
> > 
> >                              Greg Bronevetsky
> > 
> > On Sat, 5 May 2007, Marcel Beemster wrote:
> > 
> > > Larry wrote:
> > > > It really does seem specific to this particular optimization; I've
> > > > been trying to think of other cases. Is a flush required even if the
> > > > assignment is never executed?
> > > 
> > > While Jakub wrote:
> > > > There are no OpenMP directives in the
> > > > for( i = 0 ; i < 100 ; i++ )
> > > > loop, it can very well be moved to a new routine in some other
> > > > compilation unit, perhaps not built with OpenMP flags at all.  Are you
> > > > saying that because of OpenMP existence all similar loop
> > > > transformations are illegal?
> > > 
> > > I side with Jakub on this. This is really not specific to
> > > this particular optimization. If an OMP compiler does not
> > > have this freedom >between flushes<, then any optimization
> > > involving globally visible objects (including malloced arrays
> > > residing in memory), also in library code that is free of
> > > OMP directives, has to be looked at very carefully.
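For reference, a minimal sketch in C of the kind of loop transformation under discussion, seen purely sequentially. The function names and the cond array (standing in for a condition the compiler cannot evaluate) are assumptions for illustration. Both versions leave the same final value in *X, which is why the transformation is valid for sequential code; they differ only in how many stores to *X they perform.

```c
#include <assert.h>

/*
 * Loop as written: *X is stored only in iterations where cond[i] holds.
 * Returns the number of stores made to *X.
 */
int loop_as_written(int *X, const int *cond, int n)
{
    int stores = 0;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (cond[i]) {
            *X = 1;
            stores++;
        }
    }
    return stores;
}

/*
 * Loop after promoting *X to a register: one load before the loop and one
 * unconditional store after it, whether or not any iteration assigned r.
 * Returns the number of stores made to *X (always 1).
 */
int loop_promoted(int *X, const int *cond, int n)
{
    int r = *X;                 /* load hoisted out of the loop */
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        if (cond[i])
            r = 1;
    *X = r;                     /* write-back, even if r was never changed */
    return 1;
}
```

If cond is all zeros, loop_as_written performs no store while loop_promoted performs one. That extra store is invisible to a single thread and is the one that matters once another thread can write X.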
> > > 
> > > As a compiler writer, I was happy to read the intentions of
> > > the OMP memory model at the start of 1.4, page 10. It says:
> > > 
> > >     "OpenMP provides a relaxed-consistency, shared-memory model. All
> > >     OpenMP threads have access to a place to store and retrieve
> > >     variables, called the memory. In addition, each thread
> > >     is allowed to have its own temporary view of the memory.
> > >     The temporary view of memory for each thread is not a required
> > >     part of the OpenMP memory model, but can represent any kind
> > >     of intervening structure, such as machine registers, cache,
> > >     or other local storage, between the thread and the memory.
> > >     The temporary view of memory allows the thread to cache
> > >     variables and thereby avoid going to memory for every reference
> > >     to a variable."
> > > 
> > > I interpret this as saying that an OMP compiler can be as
> > > aggressive in its optimizations as a compiler for sequential
> > > C or Fortran, between points where the application programmer
> > > explicitly places OMP directives. I also believe that this
> > > is what we all should want, because we should not start off
> > > our parallelization effort with a built-in disadvantage over
> > > sequential code.
> > > 
> > > The implication of allowing compilers to do such optimizations
> > > is that the page 12 communication of shared variables example
> > > should be removed from the OMP specification. I argue that this
> > > is not a great loss: we lose the ability to communicate a shared
> > > variable between essentially >unsynchronized< parallel threads. It
> > > is actually non-trivial to construct a program that performs such
> > > communication, see my example code. If you want to do this, use
> > > volatile.
> > > 
> > > In that case, the explicit and unsynchronized "#pragma omp
> > > flush" also becomes meaningless and must be removed from
> > > OMP. The flush only has use when it occurs synchronized between
> > > two or more threads, for example implied at a barrier. I am
> > > not a fan of suddenly removing well-recognized features from
> > > language specifications, but it really is the only logical
> > > outcome of wanting OMP compilers to do optimizations (in an
> > > equally aggressive way as their sequential counterparts).
> > > 
> > > Marcel
> > > 
> > > 
> > > -- 
> > > Dr. Marcel Beemster, Senior Software Engineer, marcel at ace.nl, www.ace.nl
> > > Associated Compiler Experts bv. Amsterdam, Netherlands. +31 20 6646416.
> > > 
> > > 
> > > _______________________________________________
> > > Omp mailing list
> > > Omp at openmp.org
> > > http://openmp.org/mailman/listinfo/omp
> > > 
> > 
> 
> 



