[Omp] A memory model question
Michael Suess
mike_ml at suessnetz.de
Thu Jul 13 04:07:08 PDT 2006
Hi,
I was playing around with OpenMP and C++ the other day, trying to implement
some common concurrent patterns (such as guard objects and singletons). When
implementing the singleton, I chose to try the double-checked locking
variant. I know this pattern does not work with all parallel programming
systems, but it should work with OpenMP (I think). But since after Gregs talk
at the IWOMP-labs I am not so sure I really understand the memory model, I
figured I would double-check (really no pun intended ;-) ) if I am correct.
Therefore please take a look at the following code if you find the time:
omp_singleton.hpp:
------------------
#ifndef OMP_SINGLETON_HPP_
#define OMP_SINGLETON_HPP_
/** This template class implements a threadsafe singleton with OpenMP.
* It is adapted from the book "Pattern-Oriented Software Architecture - vol.
* 2". */
template <typename TYPE>
class omp_singleton {
public:
/* return the same instance of the protected object every time, initialize
* when it's first needed */
static TYPE* instance ();
private:
static TYPE* instance_;
};
#endif /*OMP_SINGLETON_HPP_*/
===============================================================================
omp_singleton.cpp:
------------------
#include <omp.h>
#include "omp_singleton.hpp"
/* define initial value of instance_ variable */
template< typename TYPE>
TYPE* omp_singleton<TYPE>::instance_ = 0;
/* return the same instance of the protected object every time, initialize
* when it's first needed */
template <typename TYPE>
TYPE* omp_singleton<TYPE>::instance ()
{
/* double-checked locking here */
#pragma omp flush (instance_)
if (instance_ == 0) {
#pragma omp critical
{
if (instance_ == 0) {
instance_ = new TYPE;
}
}
}
return instance_;
}
/* the compiler needs this to correctly instantiate the template
* (and I did not want to put the function definitions in the header file) */
template class omp_singleton<int>;
This works on all platforms I have access to and it also seems to do the right
thing (I have a testcase that shows this if you are interested). Actually,
the previous sentence is not correct, since the Intel Compiler 9.1 does not
want to flush the instance_-variable and errors out (but I can work around
this by not giving flush a list - I should probably file this as a bug
report, though).
But is this by design, or just because of the underlaying hardware consistency
model? The critical part of the program is the not protected check if
instance_ equals 0. I remember from Gregs Talk that there may be problems
with this, but I cannot remember exactly why and I cannot find his slides.
Operation reordering should not be a problem, since the check is framed by
two flushes. And even if the temporary view were not correct (e.g. when the
flush is left out), the check is still repeated inside the critical region,
so the worst thing that should happen is that the critical region is entered
more often than necessary. So, theoretically, I should even be able to leave
out the flush, and the program would still be correct (albeit possibly a
little slower?). I think I am missing something, but I cannot figure it out
from the spec and from Jay's and Gregs paper on the memory model.
Any help appreciated,
Best regards,
Michael
More information about the Omp
mailing list