Summary
The simd construct can be applied to a loop to indicate that the loop can be transformed into a
SIMD loop (that is, multiple iterations of the loop can be executed concurrently using SIMD
instructions).
For C/C++, the simd directive places restrictions on the structure of the associated for-loops. Specifically, all
associated for-loops must have canonical loop form (see Section 2.9.1 on page 271).
For Fortran, if an end simd directive is not specified, an end simd directive is assumed at the end of the
do-loops.
For Fortran, the simd directive places restrictions on the structure of all associated do-loops. Specifically, all
associated do-loops must have canonical loop form (see Section 2.9.1 on page 271).
Binding
A simd region binds to the current task region. The binding thread set of the simd region is the
current team.
Description
The simd construct enables the execution of multiple iterations of the associated loops
concurrently by means of SIMD instructions.
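As an illustration of the simplest case, the following C sketch applies the simd construct to a single loop whose iterations are independent. The function name saxpy_simd and its parameters are hypothetical and are not part of this specification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes y[i] = a * x[i] + y[i] for 0 <= i < n. */
void saxpy_simd(int n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);

    /* Each iteration touches only element i, so several iterations
       may be executed concurrently with SIMD instructions. */
    #pragma omp simd
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

If the directive is ignored (for example, when OpenMP support is disabled), the loop simply executes sequentially and produces the same result.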
The collapse clause may be used to specify how many loops are associated with the construct. The
parameter of the collapse clause must be a constant positive integer expression. If no collapse clause
is present, the only loop that is associated with the simd construct is the one that immediately follows the
directive.
If more than one loop is associated with the simd construct, then the iterations of all associated loops are
collapsed into one larger iteration space that is then executed with SIMD instructions. The sequential
execution of the iterations in all associated loops determines the order of the iterations in the collapsed
iteration space.
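The collapsing of two associated loops might look like the following C sketch. The function scale_matrix and the row-major layout of m are assumptions made for the example only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiplies every element of a rows x cols
   row-major matrix by s. */
void scale_matrix(int rows, int cols, float *m, float s)
{
    assert(m != NULL);

    /* collapse(2) associates both loops with the construct. The nest is
       perfectly nested, so there is no intervening code between the loops. */
    #pragma omp simd collapse(2)
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            m[i * cols + j] *= s;
        }
    }
}
```

In this sketch the collapsed iteration space has rows * cols iterations, ordered as the sequential execution of the nest would visit them (j varies fastest).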
If more than one loop is associated with the simd construct then the number of times that any intervening
code between any two associated loops will be executed is unspecified but will be at least once per
iteration of the loop enclosing the intervening code and at most once per iteration of the innermost
loop associated with the construct. If the iteration count of any loop that is associated with the
simd construct is zero and that loop does not enclose the intervening code, the behavior is
unspecified.
The integer type (or kind, for Fortran) used to compute the iteration count for the collapsed loop is
implementation defined.
A SIMD loop has logical iterations numbered 0,1,...,N-1 where N is the number of loop iterations, and the
logical numbering denotes the sequence in which the iterations would be executed if the associated
loop(s) were executed with no SIMD instructions. At the beginning of each logical iteration,
the loop iteration variable of each associated loop has the value that it would have if the set of
the associated loop(s) were executed sequentially. The number of iterations that are executed
concurrently at any given time is implementation defined. Each concurrent iteration will be
executed by a different SIMD lane. Each set of concurrent iterations is a SIMD chunk. Lexical
forward dependencies in the iterations of the original loop must be preserved within each SIMD
chunk.
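The logical numbering can be made concrete with a small C sketch. It assumes a loop of the form for (i = lb; i < ub; i += step) with a positive step; the helper names iteration_count and loop_var_at are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: number of iterations N of
   for (i = lb; i < ub; i += step), assuming step > 0. */
long iteration_count(long lb, long ub, long step)
{
    assert(step > 0);
    if (ub <= lb) {
        return 0;
    }
    return (ub - lb + step - 1) / step;  /* ceiling of (ub - lb) / step */
}

/* Value of the loop iteration variable at the beginning of
   logical iteration k, for 0 <= k < N. */
long loop_var_at(long lb, long step, long k)
{
    return lb + k * step;
}
```

For example, the loop for (i = 3; i < 20; i += 4) visits i = 3, 7, 11, 15, 19, so it has five logical iterations numbered 0 through 4.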
The safelen clause specifies that no two concurrent iterations within a SIMD chunk can have a distance
in the logical iteration space that is greater than or equal to the value given in the clause. The parameter of
the safelen clause must be a constant positive integer expression. The simdlen clause
specifies the preferred number of iterations to be executed concurrently unless an if clause is
present and evaluates to false, in which case the preferred number of iterations to be executed
concurrently is one. The parameter of the simdlen clause must be a constant positive integer
expression.
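The following C sketch shows one way the safelen and simdlen clauses might be used together on a loop with a loop-carried dependence. The function shift_add and the dependence distance of 8 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a[i] = a[i - 8] + 1 for 8 <= i < n. */
void shift_add(int n, float *a)
{
    assert(a != NULL);

    /* Iteration i reads the value written by iteration i - 8.
       safelen(8) guarantees that two concurrent iterations are fewer than
       8 apart in the logical iteration space, so iterations i - 8 and i
       are never in the same SIMD chunk. simdlen(4) expresses a preference
       for 4 concurrent iterations and satisfies simdlen <= safelen. */
    #pragma omp simd safelen(8) simdlen(4)
    for (int i = 8; i < n; i++) {
        a[i] = a[i - 8] + 1.0f;
    }
}
```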
For C/C++, the aligned clause declares that the object to which each list item points is aligned to the number of
bytes expressed in the optional parameter of the aligned clause.
For Fortran, the aligned clause declares that the location of each list item is aligned to the number of bytes
expressed in the optional parameter of the aligned clause.
The optional parameter of the aligned clause, alignment, must be a constant positive integer expression.
If no optional parameter is specified, implementation-defined default alignments for SIMD instructions on
the target platforms are assumed.
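A minimal C sketch of the aligned clause follows. The function add_aligned and the 32-byte alignment are assumptions for the example; the caller is responsible for actually providing storage with that alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: c[i] = a[i] + b[i] for 0 <= i < n.
   Precondition: a, b and c point to 32-byte-aligned storage. */
void add_aligned(int n, const float *a, const float *b, float *c)
{
    /* Check the promise made to the compiler by the aligned clause. */
    assert((uintptr_t)a % 32 == 0);
    assert((uintptr_t)b % 32 == 0);
    assert((uintptr_t)c % 32 == 0);

    #pragma omp simd aligned(a, b, c : 32)
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

A caller could obtain suitably aligned arrays with, for instance, the C11 _Alignas(32) specifier on the array declarations.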
The nontemporal clause specifies that accesses to the storage locations to which the list
items refer have low temporal locality across the iterations in which those storage locations are
accessed.
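As a sketch of where the nontemporal clause might apply, consider a copy loop that writes each destination element exactly once and never reads it back. The function stream_copy is hypothetical, and whether the hint changes the generated code is up to the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies n floats from src to dst. */
void stream_copy(int n, const float *src, float *dst)
{
    assert(src != NULL && dst != NULL);

    /* Each element of dst is written once and not reused within the loop,
       so accesses through dst have low temporal locality. */
    #pragma omp simd nontemporal(dst)
    for (int i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}
```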
Restrictions
∙ No OpenMP directive may appear in the region between any associated loops.
∙ If a collapse clause is specified, exactly one loop must occur in the region at each nesting level up to
the number of loops specified by the parameter of the collapse clause.
∙ The associated loops must be structured blocks.
∙ A program that branches into or out of a simd region is non-conforming.
∙ Only one collapse clause can appear on a simd directive.
∙ A list-item cannot appear in more than one aligned clause.
∙ A list-item cannot appear in more than one nontemporal clause.
∙ Only one safelen clause can appear on a simd directive.
∙ Only one simdlen clause can appear on a simd directive.
∙ If both simdlen and safelen clauses are specified, the value of the simdlen parameter must be less than
or equal to the value of the safelen parameter.
∙ A modifier may not be specified on a linear clause.
∙ The only OpenMP constructs that can be encountered during execution of a simd region are the atomic
construct, the loop construct, the simd construct and the ordered construct with the simd clause.
∙ If an order(concurrent) clause is present, all restrictions from the loop construct with an
order(concurrent) clause also apply.
∙ C/C++: The simd region cannot contain calls to the longjmp or setjmp functions.
∙ C: The type of list items appearing in the aligned clause must be array or pointer.
∙ C++: The type of list items appearing in the aligned clause must be array, pointer, reference to array, or
reference to pointer.
∙ C++: No exception can be raised in the simd region.
∙ Fortran: The do-loop iteration variable must be of type integer.
∙ Fortran: The do-loop cannot be a DO WHILE or a DO loop without loop control.
∙ Fortran: If a list item on the aligned clause has the ALLOCATABLE attribute, the allocation status must be
allocated.
∙ Fortran: If a list item on the aligned clause has the POINTER attribute, the association status must be
associated.
∙ Fortran: If the type of a list item on the aligned clause is either C_PTR or Cray pointer, the list item must be
defined.
Cross References
∙ order(concurrent) clause, see Section 2.9.5 on page 363.
∙ private, lastprivate, linear and reduction clauses, see Section 2.19.4 on page 842.
2.9.3.2 Worksharing-Loop SIMD Construct
Summary
The worksharing-loop SIMD construct specifies that the iterations of one or more associated
loops will be distributed across threads that already exist in the team and that the iterations executed by each
thread can also be executed concurrently using SIMD instructions. The worksharing-loop SIMD construct is
a composite construct.
where clause can be any of the clauses accepted by the simd or do directives, with identical meanings and
restrictions.
If an end do simd directive is not specified, an end do simd directive is assumed at the end of the
do-loops.
Description
The worksharing-loop SIMD construct will first distribute the iterations of the
associated loop(s) across the implicit tasks of the parallel region in a manner consistent with any
clauses that apply to the worksharing-loop construct. The resulting chunks of iterations will then
be converted to a SIMD loop in a manner consistent with any clauses that apply to the simd
construct.
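The two-stage behavior described above can be sketched in C as follows. The function vec_add_parallel and the schedule(static) clause are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: c[i] = a[i] + b[i] for 0 <= i < n. */
void vec_add_parallel(int n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    #pragma omp parallel
    {
        /* Stage 1: the worksharing-loop part distributes chunks of
           iterations across the threads of the team (schedule clause).
           Stage 2: each thread's chunk is executed as a SIMD loop. */
        #pragma omp for simd schedule(static)
        for (int i = 0; i < n; i++) {
            c[i] = a[i] + b[i];
        }
    }
}
```

The schedule clause here applies to the worksharing-loop part of the composite construct; clauses such as safelen or aligned would apply to the simd part.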
Execution Model Events
This composite construct generates the same events as the worksharing-loop
construct.
Tool Callbacks
This composite construct dispatches the same callbacks as the worksharing-loop
construct.
Restrictions
All restrictions to the worksharing-loop construct and the simd construct apply to the
worksharing-loop SIMD construct. In addition, the following restrictions apply:
∙ No ordered clause with a parameter can be specified.
∙ A list item may appear in a linear or firstprivate clause but not both.
Cross References
∙ worksharing-loop construct, see Section 2.9.2 on page 288.
∙ Data attribute clauses, see Section 2.19.4 on page 842.
2.9.3.3 declare simd Directive
Summary
The declare simd directive can be applied to a function (C, C++ and Fortran) or a subroutine
(Fortran) to enable the creation of one or more versions that can process multiple arguments using SIMD
instructions from a single invocation in a SIMD loop. The declare simd directive is a declarative
directive. There may be multiple declare simd directives for a function (C, C++, Fortran) or subroutine
(Fortran).
Syntax
The syntax of the declare simd directive is as follows:
For C/C++, the use of one or more declare simd directives immediately prior to a function declaration or
definition enables the creation of corresponding SIMD versions of the associated function that can be used to
process multiple arguments from a single invocation in a SIMD loop concurrently.
The expressions appearing in the clauses of each directive are evaluated in the scope of the arguments of the
function declaration or definition.
For Fortran, the use of one or more declare simd directives for a specified subroutine or function enables the
creation of corresponding SIMD versions of the subroutine or function that can be used to process multiple
arguments from a single invocation in a SIMD loop concurrently.
If a SIMD version is created, the number of concurrent arguments for the function is determined by the
simdlen clause. If the simdlen clause is used its value corresponds to the number of concurrent
arguments of the function. The parameter of the simdlen clause must be a constant positive integer
expression. Otherwise, the number of concurrent arguments for the function is implementation
defined.
The special this pointer can be used as if it was one of the arguments to the function in any of the linear,
aligned, or uniform clauses.
The uniform clause declares one or more arguments to have an invariant value for all concurrent
invocations of the function in the execution of a single SIMD loop.
For C/C++, the aligned clause declares that the object to which each list item points is aligned to the number of
bytes expressed in the optional parameter of the aligned clause.
For Fortran, the aligned clause declares that the target of each list item is aligned to the number of bytes
expressed in the optional parameter of the aligned clause.
The optional parameter of the aligned clause, alignment, must be a constant positive integer expression.
If no optional parameter is specified, implementation-defined default alignments for SIMD instructions on
the target platforms are assumed.
The inbranch clause specifies that the SIMD version of the function will always be called from inside a
conditional statement of a SIMD loop. The notinbranch clause specifies that the SIMD version of the
function will never be called from inside a conditional statement of a SIMD loop. If neither clause is
specified, then the SIMD version of the function may or may not be called from inside a conditional
statement of a SIMD loop.
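The clauses described above might be combined as in the following C sketch. The functions scaled_square, load_doubled and apply_all are hypothetical, as is the choice of clauses on each directive.

```c
#include <assert.h>
#include <stddef.h>

/* scale has the same value for all concurrent invocations (uniform);
   the function is only called unconditionally (notinbranch). */
#pragma omp declare simd uniform(scale) notinbranch
float scaled_square(float x, float scale)
{
    return scale * x * x;
}

/* base is invariant across invocations; i advances by 1 from one
   concurrent invocation to the next (linear with step 1). */
#pragma omp declare simd uniform(base) linear(i : 1) notinbranch
float load_doubled(const float *base, int i)
{
    return 2.0f * base[i];
}

/* Hypothetical caller: out[i] = scale * in[i]^2 + 2 * in[i]. */
void apply_all(int n, const float *in, float *out, float scale)
{
    assert(in != NULL && out != NULL);

    #pragma omp simd
    for (int i = 0; i < n; i++) {
        /* Both calls are outside any conditional statement,
           which matches the notinbranch clauses above. */
        out[i] = scaled_square(in[i], scale) + load_doubled(in, i);
    }
}
```

If the calls were made inside an if statement in the loop body, inbranch (or omitting both clauses) would be the appropriate choice instead.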
Restrictions
∙ Each argument can appear in at most one uniform or linear clause.
∙ At most one simdlen clause can appear in a declare simd directive.
∙ Either inbranch or notinbranch may be specified, but not both.
∙ When a linear-step expression is specified in a linear clause it must be either a constant integer
expression or an integer-typed parameter that is specified in a uniform clause on the directive.
∙ The function or subroutine body must be a structured block.
∙ The execution of the function or subroutine, when called from a SIMD loop, cannot result in the execution
of an OpenMP construct except for an ordered construct with the simd clause or an atomic construct.
∙ The execution of the function or subroutine cannot have any side effects that would alter its execution for
concurrent iterations of a SIMD chunk.
∙ A program that branches into or out of the function is non-conforming.
∙ C/C++: If the function has any declarations, then the declare simd construct for any declaration that
has one must be equivalent to the one specified for the definition. Otherwise, the result is
unspecified.
∙ C/C++: The function cannot contain calls to the longjmp or setjmp functions.
∙ C: The type of list items appearing in the aligned clause must be array or pointer.
∙ C++: The function cannot contain any calls to throw.
∙ C++: The type of list items appearing in the aligned clause must be array, pointer, reference to array, or
reference to pointer.
∙ Fortran: proc-name must not be a generic name, procedure pointer or entry name.
∙ Fortran: If proc-name is omitted, the declare simd directive must appear in the specification part of a
subroutine subprogram or a function subprogram for which creation of the SIMD versions is
enabled.
∙ Fortran: Any declare simd directive must appear in the specification part of a subroutine subprogram,
function subprogram or interface body to which it applies.
∙ Fortran: If a declare simd directive is specified in an interface block for a procedure, it must match a
declare simd directive in the definition of the procedure.
∙ Fortran: If a procedure is declared via a procedure declaration statement, the procedure proc-name should
appear in the same specification.
∙ Fortran: If a declare simd directive is specified for a procedure name with explicit interface and a
declare simd directive is also specified for the definition of the procedure then the two
declare simd directives must match. Otherwise the result is unspecified.
∙ Fortran: Procedure pointers may not be used to access versions created by the declare simd directive.
∙ Fortran: The type of list items appearing in the aligned clause must be C_PTR or Cray pointer, or the list
item must have the POINTER or ALLOCATABLE attribute.