[Omp] OpenMP spec 2.5 seems to have incorrect flush example on page 12
Bronis R. de Supinski
bronis at llnl.gov
Mon May 7 15:35:23 PDT 2007
Greg:
I do not believe you are correct. The implicit flushes
in OpenMP synchronization constructs are intended to
avoid these types of problems. So, the issue is really
restricted to flushes, which manage the memory model.
Anyway, I am sorry it has taken so long for me to get
back to responding to this issue. I had email access
issues arise over the weekend and I am only now dug
out enough that I can contemplate putting forth a response.
First off, yes, there are many things about the memory
model section that would be improved by moving them from
being spread throughout the specification to being centralized
in that section (or at least placing a clear forward
reference). That is the item intended for the 3.0 spec
that I promised to take on (if I ever have enough time
to handle it - oh well, enough sniveling).
Next, the registerization issue that Marcel raised is an
interesting one. Although there is a distant peripheral
relationship to the atomicity question (i.e., what is the
minimum size write at which atomicity is guaranteed for
accesses by different threads), I think the real question
that we need to dredge up here is the thread safety question.
By that, I mean what is thread safety and what is the user
allowed to assume. Although it is slightly different from
the question of whether or not standard libraries are thread
safe, the registerization problem is primarily one of thread
safety. Specifically, the registerization of unexecuted code
is fine IN THE ABSENCE OF THREADS. It is also fine if there
are threads, provided the compiler ensures that the optimization
is done in a thread-safe manner. I note that this is really
not an OpenMP question but is certainly one that could be
commented upon in the specification.
Anyhow, here is exactly what I mean:
My paraphrase of Marcel's example:
Given this:
  for (...) {
    if (?)
      sharedVar = ...;
  }
the compiler wishes to transform it to this:
  r = sharedVar;
  for (...) {
    if (?)
      /* other occurrences of sharedVar also substituted for... */
      r = ...;
  }
  sharedVar = r;
OK, that works fine in the absence of threads and provides a
win if the compiler predicts that ? will be true more than once.
If the code is threaded and the compiler is right, then there
will be a data race and it is the (stupid) user's fault so they
get what they deserve. However, the compiler could be wrong and
the code is never executed (at least in one thread) and the
compiler has created the race and the (maybe not so stupid, huh?)
user gets stomped on. Clearly that is unacceptable. But why
does it happen? Because the transformation is not thread safe.
If instead the compiler does this:
  r0 = sharedVar;
  r1 = 0;
  for (...) {
    if (?) {
      /* other occurrences of sharedVar also substituted for... */
      r0 = ...;
      r1 = 1;
    }
  }
  if (r1)
    sharedVar = r0;
Now the code will work correctly with threads - that is, the
transformation is now thread safe.
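To make the difference concrete, here is a minimal single-threaded sketch in C of the two transformations above. It is only an illustration under stated assumptions: the helper other_thread_update() is a hypothetical stand-in for a second thread's write that happens to land while the loop is running, and the names run_naive and run_guarded, the loop bound of 10, and the increment of 100 are all made up for the demonstration. The condition is never true, so the original source never writes sharedVar at all.

```c
/* Shared variable; the initial value 42 is borrowed from the sketch in this mail. */
int sharedVar = 42;

/* Hypothetical stand-in for another thread's update arriving mid-loop. */
static void other_thread_update(void)
{
    sharedVar += 100;
}

/* Naive registerization: unconditional load before and store after the loop. */
static void naive(int cond, int n)
{
    int r = sharedVar;
    for (int i = 0; i < n; i++) {
        if (i == n / 2)
            other_thread_update();   /* simulated concurrent write */
        if (cond)
            r = i;
    }
    sharedVar = r;                   /* runs even if cond was never true */
}

/* Guarded registerization: store back only if the loop really assigned. */
static void guarded(int cond, int n)
{
    int r0 = sharedVar;
    int r1 = 0;                      /* set once the register copy is modified */
    for (int i = 0; i < n; i++) {
        if (i == n / 2)
            other_thread_update();   /* simulated concurrent write */
        if (cond) {
            r0 = i;
            r1 = 1;
        }
    }
    if (r1)
        sharedVar = r0;
}

/* Each helper resets sharedVar, runs one variant with cond == 0,
   and returns the final value of sharedVar. */
int run_naive(void)
{
    sharedVar = 42;
    naive(0, 10);
    return sharedVar;
}

int run_guarded(void)
{
    sharedVar = 42;
    guarded(0, 10);
    return sharedVar;
}
```

With the condition false, run_naive() returns 42 because the stale register copy overwrites the simulated concurrent update, while run_guarded() returns 142 because no store is issued. A real race would of course be nondeterministic; the fixed interleaving here just makes the lost update reproducible.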
We already know that we must require external routines to be
compiled with thread safe compilers. It is not an excessive
burden on the compiler to implement the transformation in a
thread-safe manner. And the (not so stupid) user is required
to ensure that their separately compiled modules are thread-safe.
The (stupid) user who fails to do so gets what he deserves.
Everyone should be happy.
Anyway, one final comment - the suggestion that we could just
say "the compiler is not required to make this work even if
we can prove the code is never executed" is absurd. There are
perfectly valid reasons to use coding like this. For example,
consider this sketch:
----
int sharedVar = 42;

void f(int);
void g(int);

int main (int argc, char **argv)
{
#pragma omp parallel sections
  {
#pragma omp section
    {
      f(argc);
    }
#pragma omp section
    {
      g(argc);
    }
  }
  return 0;
}
----
extern int sharedVar;

void f (int sw)
{
  int i;
  for (i = 0; i < TOP; i++) {
    if (sw > 1)
      sharedVar *= i;
  }
}
----
extern int sharedVar;

void g (int sw)
{
  int i;
  for (i = 0; i < TOP; i++) {
    if (sw == 1)
      sharedVar += i;
  }
}
----
In this case, I can prove quite well that there are no races
at the level of my code. I will agree that it would be difficult
for the compiler to prove it (and we can twist it further to
make it less and less likely). Should the code be declared
invalid for this reason? NO! How could a user know when the
specification would say "Asking the compiler to understand
that is too hard so your code is incorrect."? It would be an
impossible nightmare. So, should we disallow registerization?
NO! We simply require the user to compile the functions using
a thread-safe compiler! End of story.
So, we need to add a thread safety section to the specification -
or expand the memory model section to include that issue.
Bronis
On Mon, 7 May 2007, Greg Bronevetsky wrote:
> > The point is simply that _unsophisticated_ users should be aware that
> > writing code with explicit flushes is difficult to get right. They
> > should understand that they run the risk of implanting complex bugs in
> > their code. And not only that - as we've seen in this thread, they
> > should beware of asking the compiler to do aggressive optimizations on
> > such code, because sometimes those optimizations may be unsafe in the
> > presence of explicit flushes.
> >
> I disagree that explicit flushes are the problem here. We get the same
> problem with all OpenMP synchronization constructs given this particular
> compiler optimization. I think that most OpenMP people would want to
> forbid this optimization but I'd like to see some more people give their
> thumbs up or down.
>
> Greg Bronevetsky