Summary
The simd construct can be applied to a loop to indicate that the loop can be transformed into a
SIMD loop (that is, multiple iterations of the loop can be executed concurrently by using SIMD
instructions).
In Fortran, if an end simd directive is not specified, an end simd directive is assumed at the end of the
do-loops.
Binding
A simd region binds to the current task region. The binding thread set of the simd region is the current
team.
Description
The simd construct enables the execution of multiple iterations of the associated loops concurrently by
using SIMD instructions.
The collapse clause may be used to specify how many loops are associated with the simd construct.
The collapse clause specifies the number of loops that are collapsed into a logical iteration space that is
then executed with SIMD instructions. The parameter of the collapse clause must be a constant positive
integer expression. If the collapse clause is omitted, the behavior is as if a collapse clause with a
parameter value of one was specified.
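The effect of the collapse clause can be sketched with a small C example. The function name scale_matrix and its parameters are hypothetical and chosen only for illustration; the sketch assumes a row-major matrix and a compiler that honors OpenMP SIMD directives (a compiler that ignores them still runs the loops sequentially).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiplies every element of a rows x cols matrix,
   stored in row-major order, by factor. collapse(2) associates both loops
   with the simd construct, so the rows * cols logical iterations form a
   single iteration space that may be executed with SIMD instructions. */
void scale_matrix(float *m, int rows, int cols, float factor)
{
    assert(rows >= 0 && cols >= 0);

    #pragma omp simd collapse(2)
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            m[(size_t)i * (size_t)cols + (size_t)j] *= factor;
        }
    }
}
```

Omitting collapse(2) here would associate only the outer loop with the construct, as if collapse(1) had been written.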
At the beginning of each logical iteration, the loop iteration variable or the variable declared by range-decl
of each associated loop has the value that it would have if the set of the associated loops was executed
sequentially. The number of iterations that are executed concurrently at any given time is implementation
defined. Each concurrent iteration will be executed by a different SIMD lane. Each set of concurrent
iterations is a SIMD chunk. Lexical forward dependences in the iterations of the original loop must be
preserved within each SIMD chunk, unless an order clause that specifies concurrent is
present.
The safelen clause specifies that no two concurrent iterations within a SIMD chunk can have a distance
in the logical iteration space that is greater than or equal to the value given in the clause. The parameter of
the safelen clause must be a constant positive integer expression. The simdlen clause
specifies the preferred number of iterations to be executed concurrently, unless an if clause is
present and evaluates to false, in which case the preferred number of iterations to be executed
concurrently is one. The parameter of the simdlen clause must be a constant positive integer
expression.
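As an illustration of safelen and simdlen, the following C sketch uses a loop whose only loop-carried dependence has distance 8. The function name shifted_sum and the chosen clause values are assumptions made for this example. With safelen(8), concurrent iterations within a SIMD chunk are at most 7 apart, so iteration i never runs concurrently with iteration i - 8; simdlen(4) states a preference only and satisfies the requirement that it not exceed the safelen value.

```c
#include <assert.h>

/* Hypothetical helper: a[i] depends on a[i - 8], a loop-carried dependence
   of distance 8. safelen(8) keeps any two concurrent iterations fewer than
   8 apart, so the dependence is never violated within a SIMD chunk. */
void shifted_sum(float *a, const float *b, int n)
{
    assert(n >= 0);

    #pragma omp simd safelen(8) simdlen(4)
    for (int i = 8; i < n; ++i) {
        a[i] = a[i - 8] + b[i];
    }
}
```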
If an order clause is present then the semantics are as described in Section 2.11.3.
In C/C++, the aligned clause declares that the object to which each list item points is aligned to the number of
bytes expressed in the optional parameter of the aligned clause.
In Fortran, the aligned clause declares that the location of each list item is aligned to the number of bytes
expressed in the optional parameter of the aligned clause.
The optional parameter of the aligned clause, alignment, must be a constant positive integer expression.
If no optional parameter is specified, implementation-defined default alignments for SIMD instructions on
the target platforms are assumed.
The nontemporal clause specifies that accesses to the storage locations to which the list
items refer have low temporal locality across the iterations in which those storage locations are
accessed.
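A minimal C sketch of the aligned and nontemporal clauses follows. The names alloc_aligned_floats and copy_scaled, and the 64-byte alignment, are assumptions for illustration only; the sketch relies on the C11 aligned_alloc function to obtain storage that actually satisfies the alignment promised in the aligned clause.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns storage for n floats aligned to 64 bytes.
   aligned_alloc expects a size that is a multiple of the alignment, so the
   byte count is rounded up. The caller releases the storage with free(). */
float *alloc_aligned_floats(size_t n)
{
    assert(n > 0);
    size_t bytes = n * sizeof(float);
    size_t rounded = (bytes + 63u) / 64u * 64u;
    return aligned_alloc(64, rounded);
}

/* Hypothetical helper: dst[i] = factor * src[i]. The aligned clause asserts
   that both pointers refer to 64-byte aligned objects; nontemporal(dst)
   hints that the stored values have low temporal locality. */
void copy_scaled(float *dst, const float *src, int n, float factor)
{
    assert(n >= 0);

    #pragma omp simd aligned(dst, src : 64) nontemporal(dst)
    for (int i = 0; i < n; ++i) {
        dst[i] = factor * src[i];
    }
}
```

Passing pointers that are not 64-byte aligned to copy_scaled would make the aligned clause a false declaration, so callers in this sketch are expected to allocate through alloc_aligned_floats.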
Restrictions
Restrictions to the simd construct are as follows:
∙ At most one collapse clause can appear on a simd directive.
∙ A list-item cannot appear in more than one aligned clause.
∙ A list-item cannot appear in more than one nontemporal clause.
∙ At most one safelen clause can appear on a simd directive.
∙ At most one simdlen clause can appear on a simd directive.
∙ At most one if clause can appear on a simd directive.
∙ If both simdlen and safelen clauses are specified, the value of the simdlen parameter must be less than
or equal to the value of the safelen parameter.
∙ A modifier may not be specified on a linear clause.
∙ The only OpenMP constructs that can be encountered during execution of a simd region are the atomic
construct, the loop construct, the simd construct, and the ordered construct with the simd clause.
∙ If an order clause that specifies concurrent appears on a simd directive, the safelen clause may not
also appear.
∙ (C/C++) The simd region cannot contain calls to the longjmp or setjmp functions.
∙ (C) The type of list items appearing in the aligned clause must be array or pointer.
∙ (C++) The type of list items appearing in the aligned clause must be array, pointer, reference to array, or
reference to pointer.
∙ (C++) No exception can be raised in the simd region.
∙ (C++) The only random access iterator types that are allowed for the associated loops are pointer
types.
∙ (Fortran) If a list item on the aligned clause has the ALLOCATABLE attribute, the allocation status must be
allocated.
∙ (Fortran) If a list item on the aligned clause has the POINTER attribute, the association status must be
associated.
∙ (Fortran) If the type of a list item on the aligned clause is either C_PTR or Cray pointer, the list item must be
defined. Cray pointer support has been deprecated.
Cross References
∙ Data-sharing attribute clauses, see Section 2.21.4.
2.11.5.2 Worksharing-Loop SIMD Construct
Summary
The worksharing-loop SIMD construct specifies that the iterations of one or more associated loops will be
distributed across threads that already exist in the team and that the iterations executed by each thread can
also be executed concurrently using SIMD instructions. The worksharing-loop SIMD construct is a
composite construct.
Syntax
The syntax of the worksharing-loop SIMD construct in C/C++ is as follows:
    #pragma omp for simd [clause[ [,] clause] ... ] new-line
        loop-nest
where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the for or simd
directives with identical meanings and restrictions.
The syntax of the worksharing-loop SIMD construct in Fortran is as follows:
    !$omp do simd [clause[ [,] clause] ... ]
        loop-nest
    [!$omp end do simd [nowait]]
where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the simd or do
directives, with identical meanings and restrictions.
If an end do simd directive is not specified, an end do simd directive is assumed at the end of the
do-loops.
Description
The worksharing-loop SIMD construct will first distribute the logical iterations of the associated loops
across the implicit tasks of the parallel region in a manner consistent with any clauses that apply to the
worksharing-loop construct. Each resulting chunk of iterations will then be converted to a SIMD loop in a
manner consistent with any clauses that apply to the simd construct.
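The two-step behavior described above can be sketched in C as follows. The function name saxpy and the clause values are assumptions for illustration. The schedule(static) clause applies to the worksharing-loop part, which divides the iterations among the threads of the team; the simdlen(8) clause applies to the simd part, which governs how each thread's chunk may be executed with SIMD instructions.

```c
#include <assert.h>

/* Hypothetical helper: y[i] = a * x[i] + y[i]. The iterations are first
   distributed across the threads of the enclosing parallel region, then
   each thread's chunk is treated as a SIMD loop. */
void saxpy(float a, const float *x, float *y, int n)
{
    assert(n >= 0);

    #pragma omp parallel
    {
        #pragma omp for simd schedule(static) simdlen(8)
        for (int i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

Built without OpenMP support, the directives are ignored and the loop runs sequentially with the same result.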
Execution Model Events
This composite construct generates the same events as the worksharing-loop construct.
Tool Callbacks
This composite construct dispatches the same callbacks as the worksharing-loop construct.
Restrictions
All restrictions to the worksharing-loop construct and the simd construct apply to the worksharing-loop
SIMD construct. In addition, the following restrictions apply:
∙ No ordered clause with a parameter can be specified.
∙ A list item may appear in a linear or firstprivate clause, but not in both.
Cross References
∙ Data-sharing attribute clauses, see Section 2.21.4.
2.11.5.3 declare simd Directive
Summary
The declare simd directive can be applied to a function (C, C++, and Fortran) or a subroutine (Fortran)
to enable the creation of one or more versions that can process multiple arguments using SIMD instructions
from a single invocation in a SIMD loop. The declare simd directive is a declarative directive. Multiple
declare simd directives may be specified for a function (C, C++, and Fortran) or subroutine
(Fortran).
Syntax
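The syntax of the declare simd directive in C/C++ is as follows:
    #pragma omp declare simd [clause[ [,] clause] ... ] new-line
    [#pragma omp declare simd [clause[ [,] clause] ... ] new-line]
    [ ... ]
        function definition or declaration
The syntax of the declare simd directive in Fortran is as follows:
    !$omp declare simd [(proc-name)] [clause[ [,] clause] ... ]
where clause is one of simdlen, linear, aligned, uniform, inbranch, or notinbranch.
Description
In C/C++, the use of one or more declare simd directives on a function declaration or definition enables the
creation of corresponding SIMD versions of the associated function that can be used to process multiple
arguments from a single invocation in a SIMD loop concurrently.
In C/C++, the expressions appearing in the clauses of each directive are evaluated in the scope of the
arguments of the function declaration or definition.
In Fortran, the use of one or more declare simd directives in a subroutine or function enables the creation of
corresponding SIMD versions of the subroutine or function that can be used to process multiple arguments
from a single invocation in a SIMD loop concurrently.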
If a SIMD version is created, the number of concurrent arguments for the function is determined by the
simdlen clause. If the simdlen clause is specified, its value is the number of concurrent
arguments of the function; otherwise, the number of concurrent arguments for the function is
implementation defined. The parameter of the simdlen clause must be a constant positive integer
expression.
In C++, the special this pointer can be used as if it were one of the arguments to the function in any of the
linear, aligned, or uniform clauses.
The uniform clause declares one or more arguments to have an invariant value for all concurrent
invocations of the function in the execution of a single SIMD loop.
In C/C++, the aligned clause declares that the object to which each list item points is aligned to the number of
bytes expressed in the optional parameter of the aligned clause.
In Fortran, the aligned clause declares that the target of each list item is aligned to the number of bytes
expressed in the optional parameter of the aligned clause.
The optional parameter of the aligned clause, alignment, must be a constant positive integer expression.
If no optional parameter is specified, implementation-defined default alignments for SIMD instructions on
the target platforms are assumed.
The inbranch clause specifies that the SIMD version of the function will always be called from inside a
conditional statement of a SIMD loop. The notinbranch clause specifies that the SIMD version of the
function will never be called from inside a conditional statement of a SIMD loop. If neither clause is
specified, then the SIMD version of the function may or may not be called from inside a conditional
statement of a SIMD loop.
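The clauses described above can be illustrated with a short C sketch. All function names here (scaled_load, reciprocal, scale_all, invert_nonzero) are hypothetical. The first function is declared uniform in base and scale, linear in i with a step of one, and notinbranch because its caller invokes it unconditionally; the second is declared inbranch because its caller invokes it only inside a conditional statement of the SIMD loop.

```c
#include <assert.h>

/* base and scale have the same value for all concurrent invocations, and
   i advances by one per iteration of the calling SIMD loop. */
#pragma omp declare simd uniform(base, scale) linear(i : 1) notinbranch
float scaled_load(const float *base, int i, float scale)
{
    return base[i] * scale;
}

/* Always called from inside a conditional statement of a SIMD loop. */
#pragma omp declare simd inbranch
float reciprocal(float v)
{
    return 1.0f / v;
}

/* Unconditional call site: matches the notinbranch version. */
void scale_all(const float *in, float *out, int n)
{
    assert(n >= 0);

    #pragma omp simd
    for (int i = 0; i < n; ++i) {
        out[i] = scaled_load(in, i, 2.0f);
    }
}

/* Conditional call site: matches the inbranch version. Elements of out
   whose input is zero are left unchanged. */
void invert_nonzero(const float *in, float *out, int n)
{
    assert(n >= 0);

    #pragma omp simd
    for (int i = 0; i < n; ++i) {
        if (in[i] != 0.0f) {
            out[i] = reciprocal(in[i]);
        }
    }
}
```

Whether a SIMD version is actually created and used at these call sites is up to the implementation; the directives only enable it.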
Restrictions
Restrictions to the declare simd directive are as follows:
∙ Each argument can appear in at most one uniform or linear clause.
∙ At most one simdlen clause can appear in a declare simd directive.
∙ Either inbranch or notinbranch may be specified, but not both.
∙ When a linear-step expression is specified in a linear clause it must be either a constant integer
expression or an integer-typed parameter that is specified in a uniform clause on the directive.
∙ The function or subroutine body must be a structured block.
∙ The execution of the function or subroutine, when called from a SIMD loop, cannot result in the execution
of an OpenMP construct except for an ordered construct with the simd clause or an atomic construct.
∙ The execution of the function or subroutine cannot have any side effects that would alter its execution
for concurrent iterations of a SIMD chunk.
∙ A program that branches into or out of the function is non-conforming.
∙ (C/C++) If the function has any declarations, then the declare simd directive for any declaration that
has one must be equivalent to the one specified for the definition. Otherwise, the result is
unspecified.
∙ (C/C++) The function cannot contain calls to the longjmp or setjmp functions.
∙ (C) The type of list items appearing in the aligned clause must be array or pointer.
∙ (C++) The function cannot contain any calls to throw.
∙ (C++) The type of list items appearing in the aligned clause must be array, pointer, reference to array, or
reference to pointer.
∙ (Fortran) proc-name must not be a generic name, procedure pointer, or entry name.
∙ (Fortran) If proc-name is omitted, the declare simd directive must appear in the specification part of a
subroutine subprogram or a function subprogram for which creation of the SIMD versions is
enabled.
∙ (Fortran) Any declare simd directive must appear in the specification part of a subroutine subprogram,
function subprogram, or interface body to which it applies.
∙ (Fortran) If a declare simd directive is specified in an interface block for a procedure, it must match a
declare simd directive in the definition of the procedure.
∙ (Fortran) If a procedure is declared via a procedure declaration statement, the procedure proc-name should
appear in the same specification.
∙ (Fortran) If a declare simd directive is specified for a procedure name with explicit interface and a
declare simd directive is also specified for the definition of the procedure then the two
declare simd directives must match. Otherwise the result is unspecified.
∙ (Fortran) Procedure pointers may not be used to access versions created by the declare simd directive.
∙ (Fortran) The type of list items appearing in the aligned clause must be C_PTR or Cray pointer, or the list
item must have the POINTER or ALLOCATABLE attribute. Cray pointer support has been
deprecated.