[Omp] slow performance

Thomas.L.Clune at nasa.gov Thomas.L.Clune at nasa.gov
Thu Dec 16 05:45:40 PST 2004


Just being paranoid, but how are you measuring the time?  Many
timers total the time among all child threads - giving results just
like yours even though things are actually running substantially
faster.
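
To make the distinction concrete, here is a minimal C sketch, not taken from the original program, that reads both kinds of clock. The helper names wall_seconds and cpu_seconds are made up for illustration. On POSIX systems clock() reports processor time for the whole process, which in practice adds up the time spent in all threads, while omp_get_wtime() reports elapsed wall-clock time. That routine may be missing from an OpenMP 1.0-level C implementation, so the sketch falls back to the C11 calendar clock when OpenMP is not enabled.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Elapsed (wall-clock) seconds since an arbitrary reference point.
   Hypothetical helper: uses omp_get_wtime() when compiled with OpenMP,
   otherwise the C11 timespec_get() calendar clock. */
double wall_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}

/* Processor time used by the process so far, in seconds.
   With several busy threads this can grow faster than wall time. */
double cpu_seconds(void) {
    return (double)clock() / CLOCKS_PER_SEC;
}
```

Taking the difference of two wall_seconds() readings around the parallel region gives the elapsed time; the same difference with cpu_seconds() may stay flat or even grow as threads are added, which would look like "no speedup".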

- Tom



andrew wang writes:
 > Hi All,
 >
 > Sorry, I forgot to give you the system info:
 >
 > Compaq AlphaServer SC45 with 44 nodes, each node comprising four 1GHz
 > Alpha processors with 1GB memory. I am using only one node with different
 > thread counts (1-3). The Compaq C compiler supports OpenMP spec 1.0. The OS
 > should be Tru64 UNIX.
 >
 >
 > I also tried compiling the same program with the Intel C compiler 8.0 and
 > running it on a two-processor Win2k server. Here is the result:
 >
 > D:\omp\test>try 2
 > omp_get_num_procs=2
 > Parallel region time=12 seconds
 > Total time = 14 seconds
 > D:\omp\test>try 1
 > omp_get_num_procs=2
 > Parallel region time=12 seconds
 > Total time = 14 seconds
 >
 > There seems to be not much difference; same problem.
 >
 >
 > As somebody pointed out, my program does not do much inside the parallel
 > region, so I increased the inner loop from 50 to 500 iterations:
 >
 > ....
 >   for (kk = 0; kk < 500; kk++) {
 >       x = (kk + 0.5) * step;
 >       sum += 4.0 / (1.0 + x * x);   // more complicated calculation here.
 >   }
 > ....
 >
 >
 > Here is the result:
 >
 >
 > d:\omp\test>try 1
 > omp_get_num_procs=2
 > Parallel region time=83 seconds
 > Total time = 87 seconds
 >
 > D:\omp\test>try 2
 > omp_get_num_procs=2
 > Parallel region time=66 seconds
 > Total time = 66 seconds
 >
 > So the performance improved with 2 threads. If this is the case, how
 > should I parallelize such a program? In my real program, I can
 > parallelize only this particular region.
 >
 >
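
One common way to keep the fork-join cost down is to open the parallel region once around the loop that carries the work and combine the per-thread partial sums with a reduction clause. What follows is only a sketch under assumptions, since the full program was not posted: the function name pi_midpoint and the parameter n are made up for illustration, and the loop body is the one quoted above. When OpenMP is not enabled the pragma is skipped and the loop simply runs serially.

```c
/* Midpoint-rule estimate of the integral of 4/(1+x^2) over [0,1],
   which equals pi. Hypothetical function built around the quoted loop. */
double pi_midpoint(long n) {
    const double step = 1.0 / (double)n;
    double sum = 0.0;
    long kk;

    /* One fork-join for the whole loop; each thread accumulates a
       private partial sum that is combined when the loop ends. */
#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (kk = 0; kk < n; kk++) {
        double x = (kk + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return step * sum;
}
```

If the region cannot be moved outward in the real program, the remaining lever is more work per entry into the region, which is what raising the inner loop count from 50 to 500 did.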
 > Thanks
 > Andrew
 >
 > >From: Nils Smeds <smeds at pdc.kth.se>
 > >Reply-To: smeds at pdc.kth.se
 > >To: "andrew wang" <mcwang88 at hotmail.com>
 > >CC: omp at openmp.org
 > >Subject: Re: [Omp] slow performance
 > >Date: Wed, 15 Dec 2004 17:21:50 +0100
 > >
 > >
 > >mcwang88 at hotmail.com said:
 > > > But to my big surprise, the result is quite different from what I
 > > > expected. The more threads I have, the slower the calculation is.
 > >
 > >You need to tell us more about the platform you are running on. How many
 > >processors are available? How many processors are in use? Are there any
 > >other processes running that may interfere with your application? What
 > >kind of processors? Operating system?
 > >
 > >You enter and exit a parallel region 16200*50 times. The 39 second overhead
 > >then divides into 39s/(16200*50) = 48µs per fork-join, which sounds a little
 > >high on a modern system, but it is not outrageously high.
 > >
 > >/Nils
 > >
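
The estimate above is simple enough to check in a few lines of C. The function name overhead_per_region_us is made up for illustration; the inputs (39 s of overhead, 16200 outer and 50 inner iterations) are the figures quoted above.

```c
/* Average overhead per parallel-region entry, in microseconds.
   Hypothetical helper: overhead_s is the extra run time in seconds,
   outer*inner is the number of times the region is entered. */
double overhead_per_region_us(double overhead_s, long outer, long inner) {
    double regions = (double)outer * (double)inner;
    return overhead_s / regions * 1e6;
}
```

Calling overhead_per_region_us(39.0, 16200, 50) gives roughly 48.1, in line with the 48µs per fork-join quoted above.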
 >
 >
 >
 > _______________________________________________
 > Omp mailing list
 > Omp at openmp.org
 > http://openmp.org/mailman/listinfo/omp_openmp.org
 >

-- 
Thomas Clune, Ph.D.				301-286-4635 (W)
Advanced Software Technology Group		301-286-1634 (F)
Science Computing Branch, Code 931		<Thomas.L.Clune at nasa.gov>
NASA GSFC




