[Omp] A question about OpenMP 2.5

Greg Bronevetsky greg at bronevetsky.com
Wed Mar 21 11:43:30 PDT 2007


I'm sorry, that should have been "upper bound". If the bound on the size
of memory transfers is y and the application operates on memory locations
of size x, the bad case arises when x<y. Thus, if an application is
compiled for a given x (>=y) and the implementation uses some larger
transfer size y' (>y), then the application may have races. However, if
the implementation uses a smaller transfer size y' (<y), the application
is guaranteed to work fine. In a lot of ways this situation is similar to
false sharing in caches, except that in caches the only bad outcome is a
slowdown, while in parallel for loops it's an issue of correctness.
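
To make the x<y case concrete, here is a minimal sketch in C. It is not
Brad's code example (which is not quoted in this message), and the names
Y, load_unit, patch_and_store and lost_update_demo are made up for
illustration. It assumes an implementation that carries out a 1-byte store
(x = 1) as a read-modify-write of a 4-byte unit (y = 4), and it replays one
bad interleaving of two threads sequentially, so the outcome is
deterministic rather than timing-dependent.

```c
#include <string.h>

#define Y 4  /* assumed transfer size in bytes (y); a byte store has x = 1 */

/* Step 1 of an emulated byte store: fetch the whole Y-byte unit. */
static void load_unit(const unsigned char *mem, unsigned char *tmp) {
    memcpy(tmp, mem, Y);
}

/* Steps 2 and 3: patch one byte in the private copy, write the unit back. */
static void patch_and_store(unsigned char *mem, unsigned char *tmp,
                            int idx, unsigned char val) {
    tmp[idx] = val;
    memcpy(mem, tmp, Y);
}

/* Replays one bad interleaving of two "threads" that write a[0] and a[1].
 * Returns 1 if thread 0's update was lost, 0 otherwise. */
static int lost_update_demo(void) {
    unsigned char a[Y] = {0, 0, 0, 0};
    unsigned char t0[Y], t1[Y];

    load_unit(a, t0);               /* thread 0 reads the unit */
    load_unit(a, t1);               /* thread 1 reads the same unit */
    patch_and_store(a, t0, 0, 'A'); /* thread 0 writes a[0] */
    patch_and_store(a, t1, 1, 'B'); /* thread 1 writes a[1] with a stale a[0] */

    return a[0] != 'A' && a[1] == 'B';
}
```

Under this interleaving lost_update_demo() returns 1: the two threads never
touch the same byte, yet thread 0's store to a[0] is overwritten when
thread 1 writes back its stale copy of the unit.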

Smaller transfer sizes bring their own memory model complications, but as
far as I know these complications arise only in programs that already have
data races, and thus we don't need to worry about them.

Now, the tricky part. The only truly inclusive upper bound that we can
place on memory transfers is 1 page, because that is the granularity used
by distributed shared memory systems. That's a pretty useless bound for
Brad's code example and I don't know what to do about that. It seems like
the behavior of this code example needs to become officially undefined.
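
As a rough illustration of why a page-sized bound is of little practical
use, the sketch below splits an n-byte array among threads in whole units
of y bytes, so that no unit is shared between threads. The helper
unit_aligned_range is hypothetical, and it assumes the array starts on a
y-byte boundary.

```c
#include <stddef.h>

/* Hypothetical helper: gives thread `tid` of `nthreads` the byte range
 * [*lo, *hi) of an n-byte array, in whole units of y bytes, so that no
 * y-byte unit is shared between threads. Assumes the array starts on a
 * y-byte boundary, y > 0 and nthreads > 0. */
static void unit_aligned_range(size_t n, size_t y, int nthreads, int tid,
                               size_t *lo, size_t *hi) {
    size_t units = (n + y - 1) / y;            /* units covering the array */
    size_t per_thread = (units + (size_t)nthreads - 1) / (size_t)nthreads;
    size_t first = (size_t)tid * per_thread;   /* first unit of this thread */
    size_t last = first + per_thread;          /* one past its last unit */

    *lo = first * y < n ? first * y : n;       /* clamp both ends to n */
    *hi = last * y < n ? last * y : n;
}
```

With n = 1000 and y = 4, four threads get ranges of 252, 252, 252 and 244
bytes. With y = 4096 (a common page size, assumed here), thread 0 gets all
1000 bytes and the other three get empty ranges, so the loop is effectively
serialized.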

                             Greg Bronevetsky


On Wed, 21 Mar 2007, Greg Bronevetsky wrote:

> > Yes, this does not surprise me. My point was a position
> > on how we need to modify the spec. I think Brad's question
> > demonstrates that we will need to specify SOME minimum
> > level of atomicity. I was staking a position of byte level
> > with char; if the hardware doesn't support it then the
> > OpenMP implementation would need to include sufficient
> > synchronization to guarantee it. I understand this might
> > entail undue hardship and I could be convinced that four
> > bytes is the right level (more than that seems unreasonable).
> > 
> In my last email I mentioned that in this discussion we're worrying about
> code that operates on memory locations of size x, while the hardware
> supports memory transfers of size y, where x<y. As such, we don't need to
> specify an official memory location size but instead specify a lower-bound
> for location size.
> 
>                              Greg Bronevetsky
> 
> 
> 



