[Omp] OpenMP spec 2.5 seems to have incorrect flush example on page 12
Greg Bronevetsky
greg at bronevetsky.com
Mon May 7 16:02:07 PDT 2007
My point is that the implicit flushes do not avoid this problem if that
compiler optimization is used. I agree that it is not threadsafe but I
didn't want to make a stand because I wanted to hear other people's
thoughts first. Your "final comment" section summarizes pretty well why this
should be declared uniformly illegal.
Your fix to the optimization is similar to the one I suggested and I think
that both of them point towards how registerization should be made
thread-safe in general. Registers are an incoherent cache in that they do
not support dirty bits. As such, any compiler optimization that
registerizes shared variables must implement this dirty-bit functionality
in software and only copy-out of registers if their contents are dirty.
Otherwise, we get confused about whether the thread's temporary has new
values to share with others or is simply a receiver of values.
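
To make the dirty-bit idea concrete, here is a minimal single-threaded C
sketch. The names shared_var, naive_update and guarded_update are
hypothetical (they come from neither the spec nor any compiler), and each
function returns the number of stores it makes to the shared variable so
the difference is visible without actually running threads:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared variable discussed in this thread. */
int shared_var = 42;

/* Registerization without a dirty bit: the temporary is always copied
 * back, so a store to shared_var happens even if the guarded assignment
 * never executed. Returns the number of stores to shared_var. */
int naive_update(int n, bool cond)
{
    assert(n >= 0);
    int r = shared_var;            /* copy-in */
    for (int i = 0; i < n; i++) {
        if (cond)
            r += i;
    }
    shared_var = r;                /* unconditional copy-out */
    return 1;
}

/* Registerization with a software dirty bit: copy out only if the
 * temporary was actually written. Returns the number of stores. */
int guarded_update(int n, bool cond)
{
    assert(n >= 0);
    int r = shared_var;            /* copy-in */
    bool dirty = false;            /* software dirty bit */
    for (int i = 0; i < n; i++) {
        if (cond) {
            r += i;
            dirty = true;
        }
    }
    if (dirty) {
        shared_var = r;            /* copy-out only when dirty */
        return 1;
    }
    return 0;
}
```

With cond false, naive_update still performs one store of the stale value,
which is the store that can overwrite another thread's update;
guarded_update performs none.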
Greg Bronevetsky
On Mon, 7 May 2007, Bronis R. de Supinski wrote:
>
> Greg:
>
> I do not believe you are correct. The implicit flushes
> in OpenMP synchronization constructs are intended to
> avoid these types of problems. So, the issue is really
> restricted to flushes, which manage the memory model.
>
> Anyway, I am sorry it has taken so long for me to get
> back to responding to this issue. I had email access
> issues arise over the weekend and I am only now dug
> out enough that I can contemplate putting forth a response.
>
> First off, yes, there are many things about the memory
> model section that would be improved by moving them from
> being spread throughout the specification to being centralized
> in that section (or at least placing a clear forward
> reference). That is the item intended for the 3.0 spec
> that I promised to take on (if I ever have enough time
> to handle it - oh well, enough sniveling).
>
> Next, the registerization issue that Marcel raised is an
> interesting one. Although there is a distant peripheral
> relationship to the atomicity question (i.e., what is the
> minimum size write at which atomicity is guaranteed for
> accesses by different threads), I think the real question
> that we need to dredge up here is the thread safety question.
>
> By that, I mean what is thread safety and what is the user
> allowed to assume. Although it is slightly different from
> the question of whether or not standard libraries are thread
> safe, the registerization problem is primarily one of thread
> safety. Specifically, the registerization of unexecuted code
> is fine IN THE ABSENCE OF THREADS. It is also fine if there
> are threads but the compiler must ensure that the optimization
> is done in a thread safe manner. I note that this is really
> not an OpenMP question but is certainly one that could be
> commented upon in the specification.
>
> Anyhow, here is exactly what I mean:
>
> My paraphrase of Marcel's example:
>
> Given this:
>
> for (...) {
>     if (?)
>         sharedVar = ...;
> }
>
> the compiler wishes to transform it to this:
>
> r = sharedVar;
> for (...) {
>     if (?)
>         /* other occurrences of sharedVar also substituted for... */
>         r = ...;
> }
> sharedVar = r;
>
> OK, that works fine in the absence of threads and provides a
> win if the compiler predicts that ? will be true more than once.
>
> If the code is threaded and the compiler is right, then there
> will be a data race and it is the (stupid) user's fault so they
> get what they deserve. However, the compiler could be wrong and
> the code is never executed (at least in one thread) and the
> compiler has created the race and the (maybe not so stupid, huh?)
> user gets stomped on. Clearly that is unacceptable. But why
> does it happen? Because the transformation is not thread safe.
> If instead the compiler does this:
>
> r0 = sharedVar;
> r1 = 0;
> for (...) {
>     if (?) {
>         /* other occurrences of sharedVar also substituted for... */
>         r0 = ...;
>         r1 = 1;
>     }
> }
> if (r1)
>     sharedVar = r0;
>
> Now the code will work correctly with threads - that is, the
> transformation is now thread safe.
>
> We already know that we must require external routines to be
> compiled with thread safe compilers. It is not an excessive
> burden on the compiler to implement the transformation in a
> thread-safe manner. And the (not so stupid) user is required
> to ensure that their separately compiled modules are thread-safe.
> The (stupid) user who fails to do so gets what he deserves.
> Everyone should be happy.
>
> Anyway, one final comment - the suggestion that we could just
> say "the compiler is not required to make this work even if
> we can prove the code is never executed" is absurd. There are
> perfectly valid reasons to use coding like this. For example,
> consider this sketch:
>
> ----
>
> int sharedVar = 42;
>
> main (argc, argv)
> {
> #pragma omp parallel sections
>     {
> #pragma omp section
>         {
>             f(argc);
>         }
> #pragma omp section
>         {
>             g(argc);
>         }
>     }
> }
>
> ----
>
> f (sw)
> {
>     for (i = 0; i < TOP; i++) {
>         if (sw > 1)
>             sharedVar *= i;
>     }
> }
>
> ----
>
> g (sw)
> {
>     for (i = 0; i < TOP; i++) {
>         if (sw == 1)
>             sharedVar += i;
>     }
> }
>
> ----
>
> In this case, I can prove quite well that there are no races
> at the level of my code. I will agree that it would be difficult
> for the compiler to prove it (and we can twist it further to
> make it less and less likely). Should the code be declared
> invalid for this reason? NO! How could a user know when the
> specification would say "Asking the compiler to understand
> that is too hard so your code is incorrect."? It would be an
> impossible nightmare. So, should we disallow registerization?
> NO! We simply require the user to compile the functions using
> a thread-safe compiler! End of story.
>
> So, we need to add a thread safety section to the specification -
> or expand the memory model section to include that issue.
>
> Bronis
>
>
> On Mon, 7 May 2007, Greg Bronevetsky wrote:
>
> > > The point is simply that _unsophisticated_ users should be aware that
> > > writing code with explicit flushes is difficult to get right. They
> > > should understand that they run the risk of implanting complex bugs in
> > > their code. And not only that - as we've seen in this thread, they
> > > should beware of asking the compiler to do aggressive optimizations on
> > > such code, because sometimes those optimizations may be unsafe in the
> > > presence of explicit flushes.
> > >
> > I disagree that explicit flushes are the problem here. We get the same
> > problem with all OpenMP synchronization constructs given this particular
> > compiler optimization. I think that most OpenMP people would want to
> > forbid this optimization but I'd like to see some more people give their
> > thumbs up or down.
> >
> > Greg Bronevetsky
> >
> >
> >
> > _______________________________________________
> > Omp mailing list
> > Omp at openmp.org
> > http://openmp.org/mailman/listinfo/omp
> >
>