[Omp] The test of the barrier

Shengyan Hong shhong at cse.psu.edu
Mon Mar 26 07:13:33 PDT 2007


Dear OMP members,
       I am testing the barrier with code like this:
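!$omp parallel default(shared) private(i,j,k,jj,y1,y2)
!$omp&  shared(is,logd1,d1)
! timestamp: start_time
      CALL MAGIC_BRK_SIM_START()
!$omp do
      do jj = 0, d2 - fftblock, fftblock
! copy one block of x into the work array y1
         do j = 1, fftblock
            do i = 1, d1
               y1(j,i) = x(i,j+jj,k)
            enddo
         enddo
         call cfftz (is, logd1, d1, y1, y2)
! copy the result from y1 into xout
         do j = 1, fftblock
            do i = 1, d1
               xout(i,j+jj,k) = y1(j,i)
            enddo
         enddo
      enddo
!$omp end do nowait
! timestamp: middle_time (this thread has finished its share)
      CALL MAGIC_BRK_SIM_MIDDLE()
!$omp BARRIER
! timestamp: stop_time (this thread has left the barrier)
      CALL MAGIC_BRK_SIM_STOP()
!$omp end parallel
! end of the enclosing loop (its opening do is not shown)
      enddo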
       I measure the exe time and the idle time of each thread:
exe_time = middle_time - start_time, idle_time = stop_time - middle_time
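
       For reference, the same measurement can be sketched outside the
simulator. This is only a minimal sketch under assumptions, not the code
I actually ran: omp_get_wtime() stands in for the MAGIC_BRK_SIM_* calls,
a dummy sqrt loop stands in for the FFT work, and the program and
variable names (barrier_timing, t0, t1, t2, acc) are hypothetical.

```fortran
      program barrier_timing
      use omp_lib
      implicit none
      double precision, allocatable :: t0(:), t1(:), t2(:), acc(:)
      double precision :: s, exe, idle
      integer :: i, t, tid, nthr

! one slot per possible thread, zeroed in case the team is smaller
      nthr = omp_get_max_threads()
      allocate (t0(nthr), t1(nthr), t2(nthr), acc(nthr))
      t0 = 0.0d0
      t1 = 0.0d0
      t2 = 0.0d0
      acc = 0.0d0

!$omp parallel default(shared) private(i, s, tid)
      tid = omp_get_thread_num() + 1
!$omp single
      nthr = omp_get_num_threads()
!$omp end single
! start_time, taken by every thread
      t0(tid) = omp_get_wtime()
      s = 0.0d0
!$omp do schedule(static)
      do i = 1, 4000000
         s = s + sqrt(dble(i))
      enddo
!$omp end do nowait
      acc(tid) = s
! middle_time: this thread has finished its share of the loop
      t1(tid) = omp_get_wtime()
!$omp barrier
! stop_time: this thread has left the barrier
      t2(tid) = omp_get_wtime()
!$omp end parallel

      print '(a)', 'thread      exe_time     idle_time    total_time'
      do t = 1, nthr
         exe = t1(t) - t0(t)
         idle = t2(t) - t1(t)
         print '(i6,3es14.5)', t - 1, exe, idle, exe + idle
      enddo
! print the result so the dummy work cannot be optimized away
      print '(a,es14.5)', 'checksum ', sum(acc)
      end program barrier_timing
```

Saved as barrier_timing.f90 (a hypothetical file name), it can be built
with, for example, gfortran -fopenmp barrier_timing.f90. The times here
are wall-clock seconds, not simulator cycles, so the numbers are not
comparable with the Simics results.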
       I run the program on Simics with 8 processors that have different
frequencies and L1 cache latencies. The parallel code is divided into
8 threads. In the first experiment, each time I record the exe time and
the idle time for one iteration, I reallocate the 8 threads to the
8 processors. In the second experiment, I do not reallocate the threads.
       In practice, I implement the reallocation just by changing the
frequencies and the latencies of the 8 processors.
       Now there is one problem. When I compare the exe times of the two
experiments, I find that for the same iteration the exe time of the same
thread is identical in both, even though the thread runs on a different
processor in one of them. For example, in the reallocation experiment
the exe time of thread 2 in the 32nd iteration is 22664; in the
non-reallocation experiment, the exe time of thread 2 in the same
iteration is also 22664, although the processors running the thread are
different.
       Another problem: I add the exe time and the idle time to get the
total time, but the total time of thread 0 is always about 400 cycles
smaller than that of the other threads.
       Thank you.
                                                   Shengyan Hong

