next up previous
Next: OpenMP Design Objective Up: OpenMP: A Proposed Industry Previous: Scalability

A More Complicated Example

To highlight additional features of the standard, we present a slightly more complicated example that computes the energy spectrum of a field. This is essentially a histogramming problem, with the twist that the sequence being histogrammed is itself generated in parallel. The histogramming loop and the sequence generation could each be parallelized easily, as in the previous example, but in the interest of performance we would like to histogram as we compute in order to preserve locality.

 
Figure 4: A more complicated example.

The program enters a parallel region immediately with a PARALLEL directive. The variables field and ispectrum are declared SHARED, and everything else is made PRIVATE with a DEFAULT(PRIVATE) clause. Note that this does not affect common blocks, so setup remains a shared data structure.
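The data-scoping rules described above can be sketched in a few lines. This is a minimal illustration, not the code of Figure 4: the array sizes, the contents of the setup common block (a single value scale), and the routine fill_point are assumptions made for the example. The sketch touches the common block only from a called routine, where common-block data is shared unless it is declared THREADPRIVATE.

```fortran
program default_clause_sketch
  implicit none
  integer, parameter :: n = 8          ! hypothetical problem size
  real    :: field(n)
  integer :: ispectrum(4)
  integer :: i
  real    :: scale                      ! hypothetical contents of /setup/
  common /setup/ scale

  scale = 2.0
  ispectrum = 0

  ! field and ispectrum are shared; i (and any other variable named
  ! inside the region) gets a separate copy in each thread.
  !$omp parallel default(private) shared(field, ispectrum)
  !$omp do
  do i = 1, n
     call fill_point(field, n, i)
  end do
  !$omp end do
  !$omp end parallel

  print *, field
end program default_clause_sketch

subroutine fill_point(field, n, i)
  implicit none
  integer :: n, i
  real    :: field(n)
  real    :: scale
  common /setup/ scale                  ! same storage for every thread
  ! Every thread reads the one shared copy of scale set by the main program.
  field(i) = scale * real(i)
end subroutine fill_point
```

Compiled without OpenMP support, the directives are ordinary comments and the program runs serially with the same result.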

Within the parallel region we call initialize_field() to initialize the field and ispectrum arrays. Here we have an example of orphaning of the DO directive. With the X3H5 directives we would have to move these loops up into the main program so that they would be lexically visible within the PARALLEL directive. Clearly that restriction makes it difficult to write good modular parallel programs. We use the NOWAIT clause on the END DO directives to eliminate the implicit barrier there. Finally, we use the SINGLE directive when we initialize a single internal field point. The END SINGLE directive can also take a NOWAIT clause, but to guarantee correctness we need to synchronize here.
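The orphaned DO, NOWAIT, and SINGLE constructs described above might look as follows. This is a sketch under stated assumptions rather than a reproduction of Figure 4: the array sizes, the argument list of initialize_field, and the value stored at the internal point are hypothetical. Unlike the description above, the sketch deliberately keeps the implicit barrier on the second loop, so that no thread can still be zeroing field when the SINGLE block writes its point.

```fortran
program orphan_sketch
  implicit none
  integer, parameter :: n = 16, m = 4   ! hypothetical sizes
  real    :: field(n, n)
  integer :: ispectrum(m)

  !$omp parallel default(private) shared(field, ispectrum)
  call initialize_field(field, ispectrum, n, m)
  !$omp end parallel

  print *, 'internal point =', field(n/2, n/2), ' field sum =', sum(field)
end program orphan_sketch

subroutine initialize_field(field, ispectrum, n, m)
  implicit none
  integer :: n, m, i, j
  real    :: field(n, n)
  integer :: ispectrum(m)

  ! Orphaned work-sharing loop: it is not lexically inside a PARALLEL
  ! directive, but binds to the parallel region of the caller.
  !$omp do
  do j = 1, n
     do i = 1, n
        field(i, j) = 0.0
     end do
  end do
  !$omp end do nowait
  ! NOWAIT: threads go straight on to the next loop.

  !$omp do
  do i = 1, m
     ispectrum(i) = 0
  end do
  !$omp end do
  ! Barrier kept here (sketch assumption): both arrays are fully zeroed
  ! before any thread reaches the SINGLE block.

  ! Exactly one thread sets the internal point; the barrier implied by
  ! END SINGLE makes the value visible to all threads on return.
  !$omp single
  field(n/2, n/2) = 1.0
  !$omp end single
end subroutine initialize_field
```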

The field is computed in compute_field(). This could be any parallel Laplacian solver, but in the interest of brevity we do not include it here. With the field computed, we are ready to compute the spectrum. To do so we histogram the field values, using the ATOMIC directive to eliminate a race condition in the updates to ispectrum. The END DO here has a NOWAIT because the parallel region ends after compute_spectrum() and there is an implied barrier when the threads join.
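The histogramming step can be sketched as below. Again this is an illustration under assumptions, not the code of Figure 4: the array sizes, the binning formula, and the serial loop standing in for compute_field() are all hypothetical. The ATOMIC directive protects the single update statement that follows it, so two threads that hit the same bin cannot lose an increment.

```fortran
program spectrum_sketch
  implicit none
  integer, parameter :: n = 32, m = 8   ! hypothetical sizes
  real    :: field(n, n)
  integer :: ispectrum(m)
  integer :: i, j

  ! Stand-in for compute_field(): fill field with values in [0, 1).
  do j = 1, n
     do i = 1, n
        field(i, j) = real(mod(i + j, m)) / real(m)
     end do
  end do
  ispectrum = 0

  !$omp parallel default(private) shared(field, ispectrum)
  call compute_spectrum(field, ispectrum, n, m)
  !$omp end parallel
  ! END PARALLEL implies a barrier, so the histogram is complete here.

  ! Every field value must have been counted exactly once.
  if (sum(ispectrum) /= n * n) stop 1
  print *, ispectrum
end program spectrum_sketch

subroutine compute_spectrum(field, ispectrum, n, m)
  implicit none
  integer :: n, m, i, j, ibin
  real    :: field(n, n)
  integer :: ispectrum(m)

  !$omp do
  do j = 1, n
     do i = 1, n
        ! Map a value in [0, 1) to a bin index in 1..m (hypothetical binning).
        ibin = min(int(field(i, j) * m) + 1, m)
        ! Several threads may update the same bin, so the increment is atomic.
        !$omp atomic
        ispectrum(ibin) = ispectrum(ibin) + 1
     end do
  end do
  !$omp end do nowait
  ! NOWAIT is safe: the caller's END PARALLEL supplies the barrier.
end subroutine compute_spectrum
```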






Leo Dagum
Wed Nov 5 17:43:05 PST 1997