[Omp] Still on the barrier

Eugene Loh Eugene.Loh at Sun.COM
Sun Mar 11 21:22:50 PDT 2007


Shengyan Hong wrote:

>         Suppose there are 8 processors. Each has a different frequency and 
>L1 cache latency. For example, Cpu1 runs at 1.1 GHz and its L1 cache 
>latency is 2 cycles.
>         Suppose there are 8 threads in one benchmark. Suppose there is one 
>barrier for these 8 threads. Suppose these 8 threads are distributed across 
>the 8 processors.  Now I want to use Simics to measure the idle time at 
>the barrier for these 8 processors.
>         I choose the benchmark FT.  I think that the code should be 
>parallelized, so I choose the code as below:
>!$omp parallel do default(shared) private(i,j,k)
>       do k = 1, d3
>          do j = 1, d2
>             do i = 1, d1
>                u0(i,j,k) = 0.d0
>                u1(i,j,k) = 0.d0
>                indexmap(i,j,k) = 0.d0
>             end do
>          end do
>       end do
>
>       return
>       end
>And change the code to be:
>!$omp parallel do default(shared) private(i,j,k)
>       do k = 1, d3
>C       TID = OMP_GET_THREAD_NUM()
>C       PRINT *, 'thread = ', TID
>C       print *, "March 9"
>        CALL MAGIC_BRK_SIM_START()
>          do j = 1, d2
>             do i = 1, d1
>                u1(i,j,k) = u0(i,j,k)*ex(t*indexmap(i,j,k))
>             end do
>          end do
>C       print *, "Before barrier"
>        CALL MAGIC_BRK_SIM_MIDDLE()
>C       !$OMP BARRIER
>C       print *, "After barrier"
>        CALL MAGIC_BRK_SIM_STOP()
>        end do
>      Now I get the idle time by using MAGIC_BRK_SIM_MIDDLE() and 
>MAGIC_BRK_SIM_STOP(). But each processor has the same idle time: 6 cycles.
>  
>
I'm not sure I understand this.  Everything between SIM_MIDDLE and 
SIM_STOP is commented out.  So, there is nothing to time.  The 6 cycles 
must reflect how long it takes to get two consecutive timestamps.

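How about moving the SIM_STOP statement to after the last "end do"?  The 
"parallel do" has an implicit barrier at its end, so the wait at that 
barrier would then fall between SIM_MIDDLE and SIM_STOP.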
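
A minimal sketch of that change, under some assumptions: the 
MAGIC_BRK_SIM_* routines are empty stubs here (the real ones presumably 
wrap Simics magic breakpoints), the loop body is a placeholder rather than 
the FT code, and the array sizes are made up for illustration.

```fortran
      program barrier_timing
      implicit none
      integer d1, d2, d3
C     Hypothetical sizes, chosen only to keep the sketch small.
      parameter (d1 = 8, d2 = 8, d3 = 8)
      double precision u0(d1,d2,d3), u1(d1,d2,d3)
      integer i, j, k

      u0 = 1.d0

!$omp parallel do default(shared) private(i,j,k)
      do k = 1, d3
         call magic_brk_sim_start()
         do j = 1, d2
            do i = 1, d1
C              Placeholder work standing in for the FT loop body.
               u1(i,j,k) = 2.d0*u0(i,j,k)
            end do
         end do
         call magic_brk_sim_middle()
      end do
C     The parallel do ends here with an implicit barrier.  The call
C     below runs after that barrier, outside the parallel region, so
C     it is executed once by the initial thread.
      call magic_brk_sim_stop()

      print *, 'u1(1,1,1) = ', u1(1,1,1)
      end

C     Stubs standing in for the Simics timestamp routines (assumed).
      subroutine magic_brk_sim_start()
      end

      subroutine magic_brk_sim_middle()
      end

      subroutine magic_brk_sim_stop()
      end
```

This should build with something like "gfortran -fopenmp".  Note that 
SIM_START and SIM_MIDDLE still fire once per k iteration, so for each 
thread it is the last SIM_MIDDLE before SIM_STOP that marks the start of 
the wait.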

>Besides, I measured the execution time by using MAGIC_BRK_SIM_START() and 
>MAGIC_BRK_SIM_MIDDLE(). Each processor has a different execution time, 
>but not too different. For example, 1.7821*10^5 and 1.78345*10^5.
>      Since you told me that a barrier cannot be added inside the parallel 
>region, can you tell me where I can add the barrier? I think that 
>place in the code should have 8 threads. Right? In the end, how can I use 
>a barrier in the Fortran code? Thank you very much.
>
Again, don't worry about the barrier.  Just move the final timestamp to 
after the last "end do".

