Summary
The flush construct executes the OpenMP flush operation. This operation makes a thread’s temporary
view of memory consistent with memory and enforces an order on the memory operations of the variables
explicitly specified or implied. See the memory model description in Section 1.4 for more details. The
flush construct is a stand-alone directive.
The syntax of the flush construct is as follows:
!$omp flush [memory-order-clause] [(list)]
where memory-order-clause is one of the following:
seq_cst acq_rel release acquire
Binding
The binding thread set for a flush region is all threads in the device-set of its flush operation. Execution of
a flush region affects the memory and it affects the temporary view of memory of the encountering
thread. It does not affect the temporary view of other threads. Other threads on devices in the device-set
must themselves execute a flush operation in order to be guaranteed to observe the effects of the flush
operation of the encountering thread.
Description
If neither memory-order-clause nor a list appears on the flush construct then the behavior is as if
memory-order-clause is seq_cst.
A flush construct with the seq_cst clause, executed on a given thread, operates as if all data storage
blocks that are accessible to the thread are flushed by a strong flush operation. A flush construct with a list
applies a strong flush operation to the items in the list, and the flush operation does not complete until the
operation is complete for all specified list items. An implementation may implement a flush construct
with a list by ignoring the list and treating it the same as a flush construct with the seq_cst
clause.
If no list items are specified, the flush operation has the release and/or acquire flush properties:
If memory-order-clause is seq_cst or acq_rel, the flush operation is both a release flush and an
acquire flush.
If memory-order-clause is release, the flush operation is a release flush.
If memory-order-clause is acquire, the flush operation is an acquire flush.
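The release and acquire properties above are most often used as a pair. The following is a minimal sketch, not a definitive implementation, written in C; the C spelling #pragma omp flush is an assumption here, since this section shows only the Fortran form. The function name pass_message and the variables data, flag, and received are hypothetical. A producer writes a payload, performs a release flush, and then atomically sets a flag; a consumer atomically reads the flag, performs an acquire flush, and only then reads the payload.

```c
/* Sketch: message passing with a release flush paired with an acquire flush.
 * All names are hypothetical. Compiles as plain serial C if OpenMP is off. */
int pass_message(void)
{
    int data = 0;       /* payload: an ordinary, non-atomic variable */
    int flag = 0;       /* accessed only through atomic constructs */
    int received = -1;  /* stays -1 if the flag was never observed */

    #pragma omp parallel sections num_threads(2) shared(data, flag, received)
    {
        #pragma omp section
        {   /* producer */
            data = 42;
            #pragma omp flush release
            #pragma omp atomic write
            flag = 1;
        }
        #pragma omp section
        {   /* consumer */
            int seen = 0;
            /* Bounded spin so the sketch cannot hang if the sections
             * happen to run one after the other in an unfavorable order. */
            for (long spin = 0; spin < 100000000L && !seen; spin++) {
                #pragma omp atomic read
                seen = flag;
            }
            if (seen) {
                #pragma omp flush acquire
                received = data;  /* ordered after the producer's write */
            }
        }
    }
    return received;
}
```

If the consumer observes the flag, the value it reads from data is the one the producer wrote before its release flush. The memory-order clauses on flush require a compiler that supports them; older OpenMP implementations may reject them.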
If a pointer is present in the list, the pointer itself is flushed, not the memory block to which the pointer
refers.
A flush construct without a list corresponds to a call to atomic_thread_fence, where the argument
is given by the identifier that results from prefixing memory_order_ to memory-order-clause.
For a flush construct without a list, the generated flush region implicitly performs the corresponding
call to atomic_thread_fence. The behavior of an explicit call to atomic_thread_fence that
occurs in the program and does not have the argument memory_order_consume is as if the call is
replaced by its corresponding flush construct.
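The naming rule described above (prefix memory_order_ to the clause name) can be illustrated with a small C sketch. The helpers clause_to_memory_order and fence_for_clause are hypothetical and exist only to make the mapping concrete; they are not part of OpenMP or of the C library. Falling back to memory_order_seq_cst for an unrecognized or empty name is an assumption that mirrors the rule that a flush with neither a clause nor a list behaves as if seq_cst were given.

```c
#include <stdatomic.h>
#include <string.h>

/* Hypothetical helper: map a flush memory-order-clause name to the
 * memory_order value obtained by prefixing "memory_order_". */
memory_order clause_to_memory_order(const char *clause)
{
    if (strcmp(clause, "acq_rel") == 0) return memory_order_acq_rel;
    if (strcmp(clause, "release") == 0) return memory_order_release;
    if (strcmp(clause, "acquire") == 0) return memory_order_acquire;
    /* "seq_cst", and (by assumption) anything else, maps to seq_cst. */
    return memory_order_seq_cst;
}

/* Hypothetical helper: issue the fence that a list-free flush construct
 * with the given clause corresponds to. */
void fence_for_clause(const char *clause)
{
    atomic_thread_fence(clause_to_memory_order(clause));
}
```

For example, fence_for_clause("release") issues atomic_thread_fence(memory_order_release), which is the call that a flush construct with the release clause and no list corresponds to. memory_order_consume is deliberately not mapped, since the text above excludes it.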
If the list item or a subobject of the list item has the POINTER attribute, the allocation or association status
of the POINTER item is flushed, but the pointer target is not. If the list item is a Cray pointer, the pointer is
flushed, but the object to which it points is not. Cray pointer support has been deprecated. If the list item is
of type C_PTR, the variable is flushed, but the storage that corresponds to that address is not flushed.
If the list item or the subobject of the list item has the ALLOCATABLE attribute and has an
allocation status of allocated, the allocated variable is flushed; otherwise the allocation status is
flushed.
Note – Use of a flush construct with a list is extremely error prone and users are strongly
discouraged from attempting it. The following examples illustrate the ordering properties of the flush
operation. In the following incorrect pseudocode example, the programmer intends to prevent
simultaneous execution of the protected section by the two threads, but the program does not
work properly because it does not enforce the proper ordering of the operations on variables a
and b. Any shared data accessed in the protected section is not guaranteed to be current or
consistent during or after the protected section. The atomic notation in the pseudocode in the
following two examples indicates that the accesses to a and b are atomic write and atomic
read operations. Otherwise, both examples would contain data races and automatically result in
unspecified behavior. The flush operations are strong flushes that are applied to the specified flush
lists.
Incorrect example:
                        a = b = 0
    thread 1                        thread 2
    atomic(b = 1)                   atomic(a = 1)
    flush(b)                        flush(a)
    flush(a)                        flush(b)
    atomic(tmp = a)                 atomic(tmp = b)
    if (tmp == 0) then              if (tmp == 0) then
        protected section               protected section
    end if                          end if
The problem with this example is that operations on variables a and b are not ordered with respect to each
other. For instance, nothing prevents the compiler from moving the flush of b on thread 1 or the
flush of a on thread 2 to a position completely after the protected section (assuming that the
protected section on thread 1 does not reference b and the protected section on thread 2 does not
reference a). If either re-ordering happens, both threads can simultaneously execute the protected
section.
The following pseudocode example correctly ensures that the protected section is executed by only
one thread at a time. Execution of the protected section by neither thread is considered correct
in this example. This occurs if both flushes complete prior to either thread executing its if
statement.
Correct example:
                        a = b = 0
    thread 1                        thread 2
    atomic(b = 1)                   atomic(a = 1)
    flush(a,b)                      flush(a,b)
    atomic(tmp = a)                 atomic(tmp = b)
    if (tmp == 0) then              if (tmp == 0) then
        protected section               protected section
    end if                          end if
The compiler is prohibited from moving the flush at all for either thread, ensuring that the
respective assignment is complete and the data is flushed before the if statement is executed.
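The correct pseudocode can be transcribed into C roughly as follows. This is a sketch under stated assumptions rather than a reference implementation: the C directive spellings are not shown in this section, the function name run_protected_sections and the counter entered are hypothetical, and the protected section is reduced to counting how many threads entered it.

```c
/* Sketch of the correct example: each side sets its own flag, flushes both
 * flags with a single flush, then reads the other side's flag.
 * Returns how many sides entered the protected section (expected: 0 or 1). */
int run_protected_sections(void)
{
    int a = 0, b = 0;   /* shared flags, as in the pseudocode */
    int entered = 0;    /* hypothetical stand-in for the protected section */

    #pragma omp parallel sections num_threads(2) shared(a, b, entered)
    {
        #pragma omp section
        {   /* thread 1 */
            int tmp;
            #pragma omp atomic write
            b = 1;
            #pragma omp flush(a, b)
            #pragma omp atomic read
            tmp = a;
            if (tmp == 0) {
                #pragma omp atomic update
                entered++;          /* protected section */
            }
        }
        #pragma omp section
        {   /* thread 2 */
            int tmp;
            #pragma omp atomic write
            a = 1;
            #pragma omp flush(a, b)
            #pragma omp atomic read
            tmp = b;
            if (tmp == 0) {
                #pragma omp atomic update
                entered++;          /* protected section */
            }
        }
    }
    return entered;
}
```

A return value of 0 corresponds to the case, described above as correct, in which neither thread executes the protected section. A return value of 2 would indicate a violation of mutual exclusion.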
Execution Model Events
The flush event occurs in a thread that encounters the flush construct.
Tool Callbacks
A thread dispatches a registered ompt_callback_flush callback for each occurrence of a flush event in
that thread. This callback has the type signature ompt_callback_flush_t.
Restrictions
Restrictions to the flush construct are as follows:
If a memory-order-clause is specified, list items must not be specified on the flush directive.
Implicit Flushes
Flush operations implied when executing an atomic region are described in Section 2.19.7.
A flush region that corresponds to a flush directive with the release clause present is implied at the
following locations:
During a barrier region;
At entry to a parallel region;
At entry to a teams region;
At exit from a critical region;
During an omp_unset_lock region;
During an omp_unset_nest_lock region;
Immediately before every task scheduling point;
At exit from the task region of each implicit task;
At exit from an ordered region, if a threads clause or a depend clause with a source dependence
type is present, or if no clauses are present; and
During a cancel region, if the cancel-var ICV is true.
For a target construct, the device-set of an implicit release flush that is performed in a target task during
the generation of the target region or that is performed on exit from the initial task region that implicitly
encloses the target region consists of the devices that execute the target task and the target
region.
A flush region that corresponds to a flush directive with the acquire clause present is implied at the
following locations:
During a barrier region;
At exit from a teams region;
At entry to a critical region;
If the region causes the lock to be set, during:
    an omp_set_lock region;
    an omp_test_lock region;
    an omp_set_nest_lock region; and
    an omp_test_nest_lock region;
Immediately after every task scheduling point;
At entry to the task region of each implicit task;
At entry to an ordered region, if a threads clause or a depend clause with a sink dependence
type is present, or if no clauses are present; and
Immediately before a cancellation point, if the cancel-var ICV is true and cancellation has been
activated.
For a target construct, the device-set of an implicit acquire flush that is performed in a target task
following the generation of the target region or that is performed on entry to the initial task region that
implicitly encloses the target region consists of the devices that execute the target task and the target
region.
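Because a barrier region appears in both the release list and the acquire list, a value written before a barrier can be read by every thread after it without an explicit flush. The C sketch below illustrates this under stated assumptions: the C directive spellings are not shown in this section, and the function name broadcast_via_barrier and its variables are hypothetical.

```c
/* Sketch: one thread writes, all threads read after a barrier.
 * The barrier implies a release flush and an acquire flush, so no
 * explicit flush directive is needed. Returns 1 if every thread saw 42. */
int broadcast_via_barrier(void)
{
    int data = 0;     /* shared value written by exactly one thread */
    int readers = 0;  /* number of threads that read data */
    long sum = 0;     /* sum of the values those threads observed */

    #pragma omp parallel shared(data) reduction(+:readers, sum)
    {
        #pragma omp single nowait
        data = 42;

        #pragma omp barrier

        readers += 1;
        sum += data;
    }
    return sum == 42L * readers;
}
```

The nowait clause removes the barrier that would otherwise end the single region, so the explicit barrier is the only synchronization point between the write and the reads.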
Note – A flush region is not implied at the following locations:
At entry to worksharing regions; and
At entry to or exit from masked regions.
The synchronization behavior of implicit flushes is as follows:
When a thread executes an atomic region for which the corresponding construct has the release,
acq_rel, or seq_cst clause and specifies an atomic operation that starts a given release sequence,
the release flush that is performed on entry to the atomic operation synchronizes with an acquire
flush that is performed by a different thread and has an associated atomic operation that reads a
value written by a modification in the release sequence.
When a thread executes an atomic region for which the corresponding construct has the acquire,
acq_rel, or seq_cst clause and specifies an atomic operation that reads a value written by a given
modification, a release flush that is performed by a different thread and has an associated release
sequence that contains that modification synchronizes with the acquire flush that is performed on
exit from the atomic operation.
When a thread executes a critical region that has a given name, the behavior is as if the release
flush performed on exit from the region synchronizes with the acquire flush performed on entry to
the next critical region with the same name that is performed by a different thread, if it exists.
When a thread team executes a barrier region, the behavior is as if the release flush performed by
each thread within the region synchronizes with the acquire flush performed by all other threads
within the region.
When a thread executes a taskwait region that does not result in the creation of a dependent task
and the task that encounters the corresponding taskwait construct has at least one child task, the
behavior is as if each thread that executes a child task that is generated before the taskwait region
performs a release flush upon completion of the child task that synchronizes with an acquire flush
performed in the taskwait region.
When a thread executes a taskgroup region, the behavior is as if each thread that executes a
remaining descendant task performs a release flush upon completion of the descendant task that
synchronizes with an acquire flush performed on exit from the taskgroup region.
When a thread executes an ordered region that does not arise from a stand-alone ordered directive,
the behavior is as if the release flush performed on exit from the region synchronizes with the
acquire flush performed on entry to an ordered region encountered in the next logical iteration to
be executed by a different thread, if it exists.
When a thread executes an ordered region that arises from a stand-alone ordered directive, the
behavior is as if the release flush performed in the ordered region from a given source iteration
synchronizes with the acquire flush performed in all ordered regions executed by a different thread
that are waiting for dependences on that iteration to be satisfied.
When a thread team begins execution of a parallel region, the behavior is as if the release flush
performed by the primary thread on entry to the parallel region synchronizes with the acquire flush
performed on entry to each implicit task that is assigned to a different thread.
When an initial thread begins execution of a target region that is generated by a different thread
from a target task, the behavior is as if the release flush performed by the generating thread in the
target task synchronizes with the acquire flush performed by the initial thread on entry to its initial
task region.
When an initial thread completes execution of a target region that is generated by a different thread
from a target task, the behavior is as if the release flush performed by the initial thread on exit from
its initial task region synchronizes with the acquire flush performed by the generating thread in the
target task.
When a thread encounters a teams construct, the behavior is as if the release flush performed by the
thread on entry to the teams region synchronizes with the acquire flush performed on entry to each
initial task that is executed by a different initial thread that participates in the execution of the
teams region.
When a thread that encounters a teams construct reaches the end of the teams region, the behavior
is as if the release flush performed by each different participating initial thread at exit from its
initial task synchronizes with the acquire flush performed by the thread at exit from the teams region.
When a task generates an explicit task that begins execution on a different thread, the behavior is
as if the thread that is executing the generating task performs a release flush that synchronizes with
the acquire flush performed by the thread that begins to execute the explicit task.
When an undeferred task completes execution on a given thread that is different from the thread on
which its generating task is suspended, the behavior is as if a release flush performed by the thread
that completes execution of the undeferred task synchronizes with an acquire flush performed by the
thread that resumes execution of the generating task.
When a dependent task with one or more predecessor tasks begins execution on a given thread, the
behavior is as if each release flush performed by a different thread on completion of a predecessor
task synchronizes with the acquire flush performed by the thread that begins to execute the
dependent task.
When a task begins execution on a given thread and it is mutually exclusive with respect to another
sibling task that is executed by a different thread, the behavior is as if each release flush performed
on completion of the sibling task synchronizes with the acquire flush performed by the thread that
begins to execute the task.
When a thread executes a cancel region, the cancel-var ICV is true, and cancellation is not already
activated for the specified region, the behavior is as if the release flush performed during the cancel
region synchronizes with the acquire flush performed by a different thread immediately before a
cancellation point in which that thread observes cancellation was activated for the region.
When a thread executes an omp_unset_lock region that causes the specified lock to be unset, the
behavior is as if a release flush is performed during the omp_unset_lock region that synchronizes
with an acquire flush that is performed during the next omp_set_lock or omp_test_lock region to be
executed by a different thread that causes the specified lock to be set.
When a thread executes an omp_unset_nest_lock region that causes the specified nested lock to be
unset, the behavior is as if a release flush is performed during the omp_unset_nest_lock region that
synchronizes with an acquire flush that is performed during the next omp_set_nest_lock or
omp_test_nest_lock region to be executed by a different thread that causes the specified nested lock
to be set.
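Two of the synchronization rules above can be illustrated with a short C sketch: the rule for critical regions that share a name, and the rule for a taskwait region with a child task. This is only an illustration under stated assumptions: the C directive spellings are not shown in this section, and the function names sum_with_critical and read_child_result, as well as the critical-region name total_update, are hypothetical.

```c
/* Sketch 1: each exit from the named critical region acts as a release
 * flush that synchronizes with the acquire flush at the next entry by a
 * different thread, so every update sees the previous total. */
long sum_with_critical(int n)
{
    long total = 0;

    #pragma omp parallel for shared(total)
    for (int i = 1; i <= n; i++) {
        #pragma omp critical(total_update)
        total += i;
    }
    return total;
}

/* Sketch 2: the thread that completes the child task behaves as if it
 * performed a release flush that synchronizes with the acquire flush in
 * the taskwait region, so the parent can read the child's result. */
int read_child_result(void)
{
    int result = 0;  /* written by the child task */
    int seen = -1;   /* what the parent observes after taskwait */

    #pragma omp parallel num_threads(2) shared(result, seen)
    {
        #pragma omp single
        {
            #pragma omp task shared(result)
            result = 42;

            #pragma omp taskwait
            seen = result;
        }
    }
    return seen;
}
```

In both functions no explicit flush directive appears; the ordering relies entirely on the implicit flushes listed in this section.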