Standard Pattern Names For Generation
Here is a table of the instruction names that are meaningful in the RTL generation pass of the compiler. Giving one of these names to an instruction pattern tells the RTL generation pass that it can use the pattern to accomplish a certain task.
- movm
Here m stands for a two-letter machine mode name, in lowercase. This instruction pattern moves data with that machine mode from operand 1 to operand 0. For example, movsi moves full-word data.
If operand 0 is a subreg with mode m of a register whose own mode is wider than m, the effect of this instruction is to store the specified value in the part of the register that corresponds to mode m. Bits outside of m, but which are within the same target word as the subreg, are undefined. Bits which are outside the target word are left unchanged.
This class of patterns is special in several ways. First of all, each of these names up to and including full word size must be defined, because there is no other way to copy a datum from one place to another. If there are patterns accepting operands in larger modes, movm must be defined for integer modes of those sizes.
Second, these patterns are not used solely in the RTL generation pass. Even the reload pass can generate move insns to copy values from stack slots into temporary registers. When it does so, one of the operands is a hard register and the other is an operand that may need to be reloaded into a register.
Therefore, when given such a pair of operands, the pattern must generate RTL which needs no reloading and needs no temporary registers, that is, no registers other than the operands. For example, if you support the pattern with a define_expand, then in such a case the define_expand mustn't call force_reg or any other such function which might generate new pseudo registers.
This requirement exists even for subword modes on a RISC machine where fetching those modes from memory normally requires several insns and some temporary registers.
During reload a memory reference with an invalid address may be passed as an operand. Such an address will be replaced with a valid address later in the reload pass. In this case, nothing may be done with the address except to use it as it stands. If it is copied, it will not be replaced with a valid address. No attempt should be made to make such an address into a valid address and no routine (such as change_address) that will do so may be called. Note that general_operand will fail when applied to such an address.
The global variable reload_in_progress (which must be explicitly declared if required) can be used to determine whether such special handling is required.
The variety of operands that have reloads depends on the rest of the machine description, but typically on a RISC machine these can only be pseudo registers that did not get hard registers, while on other machines explicit memory references will get optional reloads.
If a scratch register is required to move an object to or from memory, it can be allocated using gen_reg_rtx prior to life analysis.
If there are cases which need scratch registers during or after reload, you must provide an appropriate secondary_reload target hook.
The macro can_create_pseudo_p can be used to determine whether it is safe to create new pseudo registers. If it evaluates to zero, then it is unsafe to call gen_reg_rtx to allocate a new pseudo.
The constraints on a movm must permit moving any hard register to any other hard register provided that TARGET_HARD_REGNO_MODE_OK permits mode m in both registers and TARGET_REGISTER_MOVE_COST applied to their classes returns a value of 2.
It is obligatory to support floating point movm instructions into and out of any registers that can hold fixed point values, because unions and structures (which have modes SImode or DImode) can be in those registers and they may have floating point members.
There may also be a need to support fixed point movm instructions in and out of floating point registers. Unfortunately, I have forgotten why this was so, and I don't know whether it is still true. If TARGET_HARD_REGNO_MODE_OK rejects fixed point values in floating point registers, then the constraints of the fixed point movm instructions must be designed to avoid ever trying to reload into a floating point register.
- reload_inm, reload_outm
These named patterns have been obsoleted by the target hook secondary_reload.
Like movm, but used when a scratch register is required to move between operand 0 and operand 1. Operand 2 describes the scratch register. See the discussion of the SECONDARY_RELOAD_CLASS macro in Register Classes.
There are special restrictions on the form of the match_operands used in these patterns. First, only the predicate for the reload operand is examined, i.e., reload_in examines operand 1, but not the predicates for operand 0 or 2. Second, there may be only one alternative in the constraints. Third, only a single register class letter may be used for the constraint; subsequent constraint letters are ignored. As a special exception, an empty constraint string matches the ALL_REGS register class. This may relieve ports of the burden of defining an ALL_REGS constraint letter just for these patterns.
- movstrictm
Like movm except that if operand 0 is a subreg with mode m of a register whose natural mode is wider, the movstrictm instruction is guaranteed not to alter any of the register except the part which belongs to mode m.
- movmisalignm
This variant of a move pattern is designed to load or store a value from a memory address that is not naturally aligned for its mode. For a store, the memory will be in operand 0; for a load, the memory will be in operand 1. The other operand is guaranteed not to be a memory, so that it’s easy to tell whether this is a load or store.
This pattern is used by the autovectorizer, and when expanding a MISALIGNED_INDIRECT_REF expression.
- load_multiple
Load several consecutive memory locations into consecutive registers. Operand 0 is the first of the consecutive registers, operand 1 is the first memory location, and operand 2 is a constant: the number of consecutive registers.
Define this only if the target machine really has such an instruction; do not define this if the most efficient way of loading consecutive registers from memory is to do them one at a time.
On some machines, there are restrictions as to which consecutive registers can be stored into memory, such as particular starting or ending register numbers or only a range of valid counts. For those machines, use a define_expand (see Defining RTL Sequences for Code Generation) and make the pattern fail if the restrictions are not met.
Write the generated insn as a parallel with elements being a set of one register from the appropriate memory location (you may also need use or clobber elements). Use a match_parallel (see RTL Template) to recognize the insn. See rs6000.md for examples of the use of this insn pattern.
- store_multiple
Similar to load_multiple, but store several consecutive registers into consecutive memory locations. Operand 0 is the first of the consecutive memory locations, operand 1 is the first register, and operand 2 is a constant: the number of consecutive registers.
- vec_load_lanesmn
Perform an interleaved load of several vectors from memory operand 1 into register operand 0. Both operands have mode m. The register operand is viewed as holding consecutive vectors of mode n, while the memory operand is a flat array that contains the same number of elements. The operation is equivalent to:
    int c = GET_MODE_SIZE (m) / GET_MODE_SIZE (n);
    for (j = 0; j < GET_MODE_NUNITS (n); j++)
      for (i = 0; i < c; i++)
        operand0[i][j] = operand1[j * c + i];
For example, vec_load_lanestiv4hi loads 8 16-bit values from memory into a register of mode TI. The register contains two consecutive vectors of mode V4HI.
This pattern can only be used if:
    TARGET_ARRAY_MODE_SUPPORTED_P (n, c)
is true. GCC assumes that, if a target supports this kind of instruction for some mode n, it also supports unaligned loads for vectors of mode n.
This pattern is not allowed to FAIL.
- vec_mask_load_lanesmn
Like vec_load_lanesmn, but takes an additional mask operand (operand 2) that specifies which elements of the destination vectors should be loaded. Other elements of the destination vectors are set to zero. The operation is equivalent to:
    int c = GET_MODE_SIZE (m) / GET_MODE_SIZE (n);
    for (j = 0; j < GET_MODE_NUNITS (n); j++)
      if (operand2[j])
        for (i = 0; i < c; i++)
          operand0[i][j] = operand1[j * c + i];
      else
        for (i = 0; i < c; i++)
          operand0[i][j] = 0;
This pattern is not allowed to FAIL.
- vec_store_lanesmn
Equivalent to vec_load_lanesmn, with the memory and register operands reversed. That is, the instruction is equivalent to:
    int c = GET_MODE_SIZE (m) / GET_MODE_SIZE (n);
    for (j = 0; j < GET_MODE_NUNITS (n); j++)
      for (i = 0; i < c; i++)
        operand0[j * c + i] = operand1[i][j];
for a memory operand 0 and register operand 1.
This pattern is not allowed to FAIL.
- vec_mask_store_lanesmn
Like vec_store_lanesmn, but takes an additional mask operand (operand 2) that specifies which elements of the source vectors should be stored. The operation is equivalent to:
    int c = GET_MODE_SIZE (m) / GET_MODE_SIZE (n);
    for (j = 0; j < GET_MODE_NUNITS (n); j++)
      if (operand2[j])
        for (i = 0; i < c; i++)
          operand0[j * c + i] = operand1[i][j];
This pattern is not allowed to FAIL.
- gather_loadmn
Load several separate memory locations into a vector of mode m. Operand 1 is a scalar base address and operand 2 is a vector of mode n containing offsets from that base. Operand 0 is a destination vector with the same number of elements as n. For each element index i:
  - extend the offset element i to address width, using zero extension if operand 3 is 1 and sign extension if operand 3 is zero;
  - multiply the extended offset by operand 4;
  - add the result to the base; and
  - load the value at that address into element i of operand 0.
The value of operand 3 does not matter if the offsets are already address width.
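The per-element steps above can be sketched as a scalar loop. This is only an illustrative model, assuming 32-bit elements and 32-bit offsets; the function name gather_load_model and its parameter layout are hypothetical and are not part of GCC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of gather_load for 32-bit elements and 32-bit offsets.
   The name and the fixed element widths are illustrative assumptions.  */
static void
gather_load_model (int32_t *op0,              /* operand 0: destination */
                   const unsigned char *op1,  /* operand 1: base address */
                   const uint32_t *op2,       /* operand 2: raw offset bits */
                   int op3,                   /* operand 3: 1 = zero-extend */
                   ptrdiff_t op4,             /* operand 4: scale */
                   size_t nelems)
{
  for (size_t i = 0; i < nelems; i++)
    {
      /* Extend offset element i to address width.  The narrowing cast
         to int32_t relies on GCC's modular conversion behavior.  */
      ptrdiff_t off = op3 ? (ptrdiff_t) op2[i]
                          : (ptrdiff_t) (int32_t) op2[i];
      /* Multiply by the scale, add the base, and load the element.  */
      memcpy (&op0[i], op1 + off * op4, sizeof op0[i]);
    }
}
```

With sign extension selected, an offset whose bits are all ones addresses the element just before the base, which is why operand 3 matters whenever the offsets are narrower than an address.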
- mask_gather_loadmn
Like gather_loadmn, but takes an extra mask operand as operand 5. Bit i of the mask is set if element i of the result should be loaded from memory and clear if element i of the result should be set to zero.
- scatter_storemn
Store a vector of mode m into several distinct memory locations. Operand 0 is a scalar base address and operand 1 is a vector of mode n containing offsets from that base. Operand 4 is the vector of values that should be stored, which has the same number of elements as n. For each element index i:
  - extend the offset element i to address width, using zero extension if operand 2 is 1 and sign extension if operand 2 is zero;
  - multiply the extended offset by operand 3;
  - add the result to the base; and
  - store element i of operand 4 to that address.
The value of operand 2 does not matter if the offsets are already address width.
- mask_scatter_storemn
Like scatter_storemn, but takes an extra mask operand as operand 5. Bit i of the mask is set if element i of the result should be stored to memory.
- vec_setm
Set the given field in the vector value. Operand 0 is the vector to modify, operand 1 is the new value of the field, and operand 2 specifies the field index.
- vec_extractmn
Extract the given field from the vector value. Operand 1 is the vector, operand 2 specifies the field index, and operand 0 is the place to store the value into. The n mode is the mode of the field or vector of fields that should be extracted; it should be either the element mode of the vector mode m, or a vector mode with the same element mode and a smaller number of elements. If n is a vector mode, the index is counted in units of that mode.
- vec_initmn
Initialize the vector to the given values. Operand 0 is the vector to initialize and operand 1 is a parallel containing values for the individual fields. The n mode is the mode of the elements; it should be either the element mode of the vector mode m, or a vector mode with the same element mode and a smaller number of elements.
- vec_duplicatem
Initialize vector output operand 0 so that each element has the value given by scalar input operand 1. The vector has mode m and the scalar has the mode appropriate for one element of m.
This pattern only handles duplicates of non-constant inputs. Constant vectors go through the movm pattern instead.
This pattern is not allowed to FAIL.
- vec_seriesm
Initialize vector output operand 0 so that element i is equal to operand 1 plus i times operand 2. In other words, create a linear series whose base value is operand 1 and whose step is operand 2.
The vector output has mode m and the scalar inputs have the mode appropriate for one element of m. This pattern is not used for floating-point vectors, in order to avoid having to specify the rounding behavior for i > 1.
This pattern is not allowed to FAIL.
- while_ultmn
Set operand 0 to a mask that is true while incrementing operand 1 gives a value that is less than operand 2, for a vector length up to operand 3. Operand 0 has mode n and operands 1 and 2 are scalar integers of mode m. Operand 3 should be omitted when n is a vector mode, and a CONST_INT otherwise. The operation for vector modes is equivalent to:
    operand0[0] = operand1 < operand2;
    for (i = 1; i < GET_MODE_NUNITS (n); i++)
      operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
And for non-vector modes the operation is equivalent to:
    operand0[0] = operand1 < operand2;
    for (i = 1; i < operand3; i++)
      operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
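As a worked instance of the loops above, the following sketch models the mask as an array of bool and the scalar operands as 64-bit unsigned integers. The function name while_ult_model and the explicit nunits parameter are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of while_ult: mask[i] stays true only while every
   earlier lane was true and op1 + i is still below op2.  The name
   and the 64-bit operand width are illustrative assumptions.  */
static void
while_ult_model (bool *mask, uint64_t op1, uint64_t op2, size_t nunits)
{
  mask[0] = op1 < op2;
  for (size_t i = 1; i < nunits; i++)
    /* Unsigned addition may wrap; the chained && keeps a wrapped
       index from switching the mask back on.  */
    mask[i] = mask[i - 1] && (op1 + i < op2);
}
```

For op1 = 5, op2 = 8 and four lanes, the first three lanes are active and the last is not, which is the shape of mask a vectorized loop needs for its final partial iteration.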
- check_raw_ptrsm
Check whether, given two pointers a and b and a length len, a write of len bytes at a followed by a read of len bytes at b can be split into interleaved byte accesses a[0], b[0], a[1], b[1], ... without affecting the dependencies between the bytes. Set operand 0 to true if the split is possible and false otherwise.
Operands 1, 2 and 3 provide the values of a, b and len respectively. Operand 4 is a constant integer that provides the known common alignment of a and b. All inputs have mode m.
This split is possible if:
    a == b || a + len <= b || b + len <= a
You should only define this pattern if the target has a way of accelerating the test without having to do the individual comparisons.
- check_war_ptrsm
Like check_raw_ptrsm, but with the read and write swapped round. The split is possible in this case if:
    b <= a || a + len <= b
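The two conditions can be written out as plain comparisons, which is the fallback that a target-specific pattern would be accelerating. In this sketch the addresses are modelled as uintptr_t values; the function names are hypothetical, and address wraparound is not considered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Write of len bytes at a, then read of len bytes at b.  */
static bool
check_raw_ptrs_model (uintptr_t a, uintptr_t b, uintptr_t len)
{
  return a == b || a + len <= b || b + len <= a;
}

/* Read of len bytes at a, then write of len bytes at b.  */
static bool
check_war_ptrs_model (uintptr_t a, uintptr_t b, uintptr_t len)
{
  return b <= a || a + len <= b;
}
```

Two 16-byte accesses at addresses 100 and 108 overlap partially, so the read-after-write test fails for them, while identical or fully disjoint ranges pass.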
- vec_cmpmn
Output a vector comparison. Operand 0 of mode n is the destination for the predicate in operand 1, which is a signed vector comparison with operands of mode m in operands 2 and 3. The predicate is computed by element-wise evaluation of the vector comparison with a truth value of all-ones and a false value of all-zeros.
- vec_cmpumn
Similar to vec_cmpmn but perform unsigned vector comparison.
- vec_cmpeqmn
Similar to vec_cmpmn but perform equality or non-equality vector comparison only. If the vec_cmpmn or vec_cmpumn instruction pattern is supported, it will be preferred over vec_cmpeqmn, so there is no need to define this instruction pattern if the others are supported.
- vcondmn
Output a conditional vector move. Operand 0 is the destination to receive a combination of operand 1 and operand 2, which are of mode m, dependent on the outcome of the predicate in operand 3, which is a signed vector comparison with operands of mode n in operands 4 and 5. The modes m and n should have the same size. Operand 0 will be set to the value op1 & msk | op2 & ~msk where msk is computed by element-wise evaluation of the vector comparison with a truth value of all-ones and a false value of all-zeros.
- vcondumn
Similar to vcondmn but performs unsigned vector comparison.
- vcondeqmn
Similar to vcondmn but performs equality or non-equality vector comparison only. If the vcondmn or vcondumn instruction pattern is supported, it will be preferred over vcondeqmn, so there is no need to define this instruction pattern if the others are supported.
- vcond_mask_mn
Similar to vcondmn but operand 3 holds a pre-computed result of vector comparison.
- maskloadmn
Perform a masked load of a vector from memory operand 1 of mode m into register operand 0. The mask is provided in register operand 2 of mode n.
This pattern is not allowed to FAIL.
- maskstoremn
Perform a masked store of a vector from register operand 1 of mode m into memory operand 0. The mask is provided in register operand 2 of mode n.
This pattern is not allowed to FAIL.
- len_load_m
Load (operand 2 - operand 3) elements from vector memory operand 1 into vector register operand 0, setting the other elements of operand 0 to undefined values. Operands 0 and 1 have mode m, which must be a vector mode. Operand 2 has whichever integer mode the target prefers. Operand 3 conceptually has mode QI.
Operand 2 can be a variable or a constant amount. Operand 3 specifies a constant bias: it is either a constant 0 or a constant -1. The predicate on operand 3 must only accept the bias values that the target actually supports. GCC handles a bias of 0 more efficiently than a bias of -1.
If (operand 2 - operand 3) exceeds the number of elements in mode m, the behavior is undefined.
If the target prefers the length to be measured in bytes rather than elements, it should only implement this pattern for vectors of QI elements.
This pattern is not allowed to FAIL.
- len_store_m
Store (operand 2 - operand 3) vector elements from vector register operand 1 into memory operand 0, leaving the other elements of operand 0 unchanged. Operands 0 and 1 have mode m, which must be a vector mode. Operand 2 has whichever integer mode the target prefers. Operand 3 conceptually has mode QI.
Operand 2 can be a variable or a constant amount. Operand 3 specifies a constant bias: it is either a constant 0 or a constant -1. The predicate on operand 3 must only accept the bias values that the target actually supports. GCC handles a bias of 0 more efficiently than a bias of -1.
If (operand 2 - operand 3) exceeds the number of elements in mode m, the behavior is undefined.
If the target prefers the length to be measured in bytes rather than elements, it should only implement this pattern for vectors of QI elements.
This pattern is not allowed to FAIL.
- vec_permm
Output a (variable) vector permutation. Operand 0 is the destination to receive elements from operand 1 and operand 2, which are of mode m. Operand 3 is the selector. It is an integral mode vector of the same width and number of elements as mode m.
The input elements are numbered from 0 in operand 1 through 2*N-1 in operand 2. The elements of the selector must be computed modulo 2*N. Note that if rtx_equal_p(operand1, operand2), this can be implemented with just operand 1 and selector elements modulo N.
In order to make things easy for a number of targets, if there is no vec_perm pattern for mode m, but there is for mode q where q is a vector of QImode of the same width as m, the middle-end will lower the mode m VEC_PERM_EXPR to mode q.
See also TARGET_VECTORIZER_VEC_PERM_CONST, which performs the analogous operation for constant selectors.
- pushm1
Output a push instruction. Operand 0 is the value to push. Used only when PUSH_ROUNDING is defined. For historical reasons, this pattern may be missing, in which case a mov expander is used instead, with a MEM expression forming the push operation. The mov expander method is deprecated.
- addm3
Add operand 2 and operand 1, storing the result in operand 0. All operands must have mode m. This can be used even on two-address machines, by means of constraints requiring operands 1 and 0 to be the same location.
- ssaddm3, usaddm3
- subm3, sssubm3, ussubm3
- mulm3, ssmulm3, usmulm3
- divm3, ssdivm3
- udivm3, usdivm3
- modm3, umodm3
- uminm3, umaxm3
- andm3, iorm3, xorm3
Similar, for other arithmetic operations.
- addvm4
Like addm3 but takes a code_label as operand 3 and emits code to jump to it if signed overflow occurs during the addition. This pattern is used to implement the built-in functions performing signed integer addition with overflow checking.
- subvm4, mulvm4
Similar, for other signed arithmetic operations.
- uaddvm4
Like addvm4 but for unsigned addition. That is to say, the operation is the same as signed addition but the jump is taken only on unsigned overflow.
- usubvm4, umulvm4
Similar, for other unsigned arithmetic operations.
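From the user's side, the overflow-checking built-ins mentioned above look like the following sketch. It uses GCC's __builtin_add_overflow; the wrapper names are hypothetical, and whether a given call is actually expanded through addvm4 or uaddvm4 depends on the target and is assumed here rather than guaranteed.

```c
#include <limits.h>
#include <stdbool.h>

/* Return true and store the sum if a + b fits in int; otherwise
   return false.  Signed case, the kind of operation addvm4 serves.  */
static bool
checked_add_int (int a, int b, int *sum)
{
  return !__builtin_add_overflow (a, b, sum);
}

/* Unsigned counterpart, the kind of operation uaddvm4 serves.  */
static bool
checked_add_uint (unsigned a, unsigned b, unsigned *sum)
{
  return !__builtin_add_overflow (a, b, sum);
}
```

The branch taken on a false return corresponds to the jump to the code_label in operand 3.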
- addptrm3
Like addm3 but is guaranteed to only be used for address calculations. The expanded code is not allowed to clobber the condition code. It only needs to be defined if addm3 sets the condition code. If adds used for address calculations and normal adds are not compatible, it is required to expand a distinct pattern (e.g. using an unspec). The pattern is used by LRA to emit address calculations. addm3 is used if addptrm3 is not defined.
- fmam4
Multiply operand 2 and operand 1, then add operand 3, storing the result in operand 0 without doing an intermediate rounding step. All operands must have mode m. This pattern is used to implement the fma, fmaf, and fmal builtin functions from the ISO C99 standard.
- fmsm4
Like fmam4, except operand 3 is subtracted from the product instead of added to the product. This is represented in the rtl as
    (fma:m op1 op2 (neg:m op3))
- fnmam4
Like fmam4 except that the intermediate product is negated before being added to operand 3. This is represented in the rtl as
    (fma:m (neg:m op1) op2 op3)
- fnmsm4
Like fmsm4 except that the intermediate product is negated before subtracting operand 3. This is represented in the rtl as
    (fma:m (neg:m op1) op2 (neg:m op3))
- sminm3, smaxm3
Signed minimum and maximum operations. When used with floating point, if both operands are zeros, or if either operand is NaN, then it is unspecified which of the two operands is returned as the result.
- fminm3, fmaxm3
IEEE-conformant minimum and maximum operations. If one operand is a quiet NaN, then the other operand is returned. If both operands are quiet NaN, then a quiet NaN is returned. When GCC supports signaling NaN (-fsignaling-nans), an invalid floating point exception is raised and a quiet NaN is returned.
All operands have mode m, which is a scalar or vector floating-point mode. These patterns are not allowed to FAIL.
- reduc_smin_scal_m, reduc_smax_scal_m
Find the signed minimum/maximum of the elements of a vector. The vector is operand 1, and operand 0 is the scalar result, with mode equal to the mode of the elements of the input vector.
- reduc_umin_scal_m, reduc_umax_scal_m
Find the unsigned minimum/maximum of the elements of a vector. The vector is operand 1, and operand 0 is the scalar result, with mode equal to the mode of the elements of the input vector.
- reduc_fmin_scal_m, reduc_fmax_scal_m
Find the floating-point minimum/maximum of the elements of a vector, using the same rules as fminm3 and fmaxm3. Operand 1 is a vector of mode m and operand 0 is the scalar result, which has mode GET_MODE_INNER (m).
- reduc_plus_scal_m
Compute the sum of the elements of a vector. The vector is operand 1, and operand 0 is the scalar result, with mode equal to the mode of the elements of the input vector.
- reduc_and_scal_m, reduc_ior_scal_m, reduc_xor_scal_m
Compute the bitwise AND/IOR/XOR reduction of the elements of a vector of mode m. Operand 1 is the vector input and operand 0 is the scalar result. The mode of the scalar result is the same as one element of m.
- extract_last_m
Find the last set bit in mask operand 1 and extract the associated element of vector operand 2. Store the result in scalar operand 0. Operand 2 has vector mode m while operand 0 has the mode appropriate for one element of m. Operand 1 has the usual mask mode for vectors of mode m; see TARGET_VECTORIZE_GET_MASK_MODE.
- fold_extract_last_m
If any bits of mask operand 2 are set, find the last set bit, extract the associated element from vector operand 3, and store the result in operand 0. Store operand 1 in operand 0 otherwise. Operand 3 has mode m and operands 0 and 1 have the mode appropriate for one element of m. Operand 2 has the usual mask mode for vectors of mode m; see TARGET_VECTORIZE_GET_MASK_MODE.
- fold_left_plus_m
Take scalar operand 1 and successively add each element from vector operand 2. Store the result in scalar operand 0. The vector has mode m and the scalars have the mode appropriate for one element of m. The operation is strictly in-order: there is no reassociation.
- mask_fold_left_plus_m
Like fold_left_plus_m, but takes an additional mask operand (operand 3) that specifies which elements of the source vector should be added.
- sdot_prodm
Compute the sum of the products of two signed elements. Operand 1 and operand 2 are of the same mode. Their product, which is of a wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or wider than the mode of the product. The result is placed in operand 0, which is of the same mode as operand 3.
Semantically the expressions perform the multiplication in the following signs:
    sdot<signed op0, signed op1, signed op2, signed op3> ==
       op0 = sign-ext (op1) * sign-ext (op2) + op3
    ...
- udot_prodm
Compute the sum of the products of two unsigned elements. Operand 1 and operand 2 are of the same mode. Their product, which is of a wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or wider than the mode of the product. The result is placed in operand 0, which is of the same mode as operand 3.
Semantically the expressions perform the multiplication in the following signs:
    udot<unsigned op0, unsigned op1, unsigned op2, unsigned op3> ==
       op0 = zero-ext (op1) * zero-ext (op2) + op3
    ...
- usdot_prodm
Compute the sum of the products of elements of different signs. Operand 1 must be unsigned and operand 2 signed. Their product, which is of a wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or wider than the mode of the product. The result is placed in operand 0, which is of the same mode as operand 3.
Semantically the expressions perform the multiplication in the following signs:
    usdot<signed op0, unsigned op1, signed op2, signed op3> ==
       op0 = ((signed-conv) zero-ext (op1)) * sign-ext (op2) + op3
    ...
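The sign handling of the dot-product patterns can be illustrated with a scalar model. The sketch below assumes 8-bit inputs, 32-bit accumulators, and four consecutive products folded into each accumulator lane; that lane grouping and the function names are assumptions made for illustration, since the text above only fixes the extension rules.

```c
#include <stddef.h>
#include <stdint.h>

/* sdot_prod model: both inputs are sign-extended before multiplying.  */
static void
sdot_prod_model (int32_t *op0, const int8_t *op1, const int8_t *op2,
                 const int32_t *op3, size_t nlanes)
{
  for (size_t i = 0; i < nlanes; i++)
    {
      int32_t acc = op3[i];
      for (size_t k = 0; k < 4; k++)
        acc += (int32_t) op1[4 * i + k] * (int32_t) op2[4 * i + k];
      op0[i] = acc;
    }
}

/* usdot_prod model: operand 1 is zero-extended, operand 2 is
   sign-extended, and the accumulation is signed.  */
static void
usdot_prod_model (int32_t *op0, const uint8_t *op1, const int8_t *op2,
                  const int32_t *op3, size_t nlanes)
{
  for (size_t i = 0; i < nlanes; i++)
    {
      int32_t acc = op3[i];
      for (size_t k = 0; k < 4; k++)
        acc += (int32_t) op1[4 * i + k] * (int32_t) op2[4 * i + k];
      op0[i] = acc;
    }
}
```

In the mixed-sign case, an operand 1 byte of 200 contributes as +200 rather than as -56, which is the distinction the usdot pattern exists to express.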
- ssadm, usadm
Compute the sum of absolute differences of two signed/unsigned elements. Operand 1 and operand 2 are of the same mode. Their absolute difference, which is of a wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or wider than the mode of the absolute difference. The result is placed in operand 0, which is of the same mode as operand 3.
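A scalar model of the sum of absolute differences is sketched below, assuming 8-bit inputs, 32-bit accumulators, and four differences folded into each accumulator lane. The lane grouping and the function names are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* usad model: unsigned inputs, widened before subtracting.  */
static void
usad_model (uint32_t *op0, const uint8_t *op1, const uint8_t *op2,
            const uint32_t *op3, size_t nlanes)
{
  for (size_t i = 0; i < nlanes; i++)
    {
      uint32_t acc = op3[i];
      for (size_t k = 0; k < 4; k++)
        {
          uint32_t x = op1[4 * i + k], y = op2[4 * i + k];
          acc += x > y ? x - y : y - x;
        }
      op0[i] = acc;
    }
}

/* ssad model: signed inputs; the difference is taken in a wider
   type so that it cannot overflow.  */
static void
ssad_model (int32_t *op0, const int8_t *op1, const int8_t *op2,
            const int32_t *op3, size_t nlanes)
{
  for (size_t i = 0; i < nlanes; i++)
    {
      int32_t acc = op3[i];
      for (size_t k = 0; k < 4; k++)
        {
          int32_t d = (int32_t) op1[4 * i + k] - (int32_t) op2[4 * i + k];
          acc += d < 0 ? -d : d;
        }
      op0[i] = acc;
    }
}
```

Widening before the subtraction is what allows a single signed difference to reach 255, a value that does not fit in the 8-bit input type.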
- widen_ssumm3, widen_usumm3
Operands 0 and 2 are of the same mode, which is wider than the mode of operand 1. Add operand 1 to operand 2 and place the widened result in operand 0. (This is used to express accumulation of elements into an accumulator of a wider mode.)
- smulhsm3, umulhsm3
Signed/unsigned multiply high with scale. This is equivalent to the C code:
    narrow op0, op1, op2;
    ...
    op0 = (narrow) (((wide) op1 * (wide) op2) >> (N / 2 - 1));
where the sign of narrow determines whether this is a signed or unsigned operation, and N is the size of wide in bits.
- smulhrsm3, umulhrsm3
Signed/unsigned multiply high with round and scale. This is equivalent to the C code:
    narrow op0, op1, op2;
    ...
    op0 = (narrow) (((((wide) op1 * (wide) op2) >> (N / 2 - 2)) + 1) >> 1);
where the sign of narrow determines whether this is a signed or unsigned operation, and N is the size of wide in bits.
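To make the two formulas concrete, the sketch below instantiates them with narrow = int16_t and wide = int32_t, so N is 32 and the shifts are by 15 and by 14. The function names are hypothetical, and the inputs are assumed non-negative because right-shifting a negative value is implementation-defined in ISO C.

```c
#include <stdint.h>

/* smulhs with narrow = int16_t, wide = int32_t (N = 32):
   shift the full product right by N / 2 - 1 = 15.  */
static int16_t
smulhs_model (int16_t op1, int16_t op2)
{
  return (int16_t) (((int32_t) op1 * (int32_t) op2) >> 15);
}

/* smulhrs: shift by N / 2 - 2 = 14, add 1 to round, then shift once
   more.  */
static int16_t
smulhrs_model (int16_t op1, int16_t op2)
{
  return (int16_t) (((((int32_t) op1 * (int32_t) op2) >> 14) + 1) >> 1);
}
```

Read as Q15 fixed point, 16384 is 0.5, so multiplying 16384 by 16384 gives 8192 (0.25) in both variants. The variants differ when the discarded bits amount to a half or more: for inputs 3 and 16384 the exact result is 1.5, which the plain form truncates to 1 and the rounding form rounds to 2.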
- sdiv_pow2m3
Signed division by power-of-2 immediate. Equivalent to:
    signed op0, op1;
    ...
    op0 = op1 / (1 << imm);
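Because C division truncates toward zero, this operation is not the same as an arithmetic right shift for negative inputs. The sketch below shows the reference division alongside one shift-based way to get the same result by working on the magnitude. Both function names are hypothetical, 32-bit operands are assumed, and imm is assumed to be in the range 0 to 30.

```c
#include <stdint.h>

/* Reference semantics: truncating division by 2**imm.  */
static int32_t
sdiv_pow2_model (int32_t op1, unsigned imm)
{
  return op1 / ((int32_t) 1 << imm);
}

/* Same result using only a shift of the magnitude, so that no
   negative value is ever shifted.  */
static int32_t
sdiv_pow2_shift (int32_t op1, unsigned imm)
{
  uint32_t mag = op1 < 0 ? 0u - (uint32_t) op1 : (uint32_t) op1;
  uint32_t q = mag >> imm;
  return op1 < 0 ? (int32_t) -(int64_t) q : (int32_t) q;
}
```

For example, -7 divided by 4 is -1 under these semantics, whereas a bare arithmetic shift of -7 by 2 would typically give -2.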
- vec_shl_insert_m
Shift the elements in vector input operand 1 left one element (i.e. away from element 0) and fill the vacated element 0 with the scalar in operand 2. Store the result in vector output operand 0. Operands 0 and 1 have mode m and operand 2 has the mode appropriate for one element of m.
- vec_shl_m
Whole vector left shift in bits, i.e. away from element 0. Operand 1 is a vector to be shifted. Operand 2 is an integer shift amount in bits. Operand 0 is where the resulting shifted vector is stored. The output and input vectors should have the same modes.
- vec_shr_m
Whole vector right shift in bits, i.e. towards element 0. Operand 1 is a vector to be shifted. Operand 2 is an integer shift amount in bits. Operand 0 is where the resulting shifted vector is stored. The output and input vectors should have the same modes.
- vec_pack_trunc_m
Narrow (demote) and merge the elements of two vectors. Operands 1 and 2 are vectors of the same mode having N integral or floating point elements of size S. Operand 0 is the resulting vector in which 2*N elements of size S/2 are concatenated after narrowing them down using truncation.
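A scalar model of the integer case is sketched below, assuming 32-bit input elements narrowed to 16 bits and assuming that the elements of operand 1 come first in the result. The function name and the element order are illustrative assumptions; unsigned types are used so that the truncation is well defined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* vec_pack_trunc model: two N-element vectors of 32-bit values become
   one 2*N-element vector of 16-bit values, keeping the low half of
   each element.  */
static void
vec_pack_trunc_model (uint16_t *op0, const uint32_t *op1,
                      const uint32_t *op2, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      op0[i] = (uint16_t) op1[i];      /* narrowed elements of operand 1 */
      op0[n + i] = (uint16_t) op2[i];  /* narrowed elements of operand 2 */
    }
}
```

Truncation discards the high bits, so 0x12345 becomes 0x2345. The saturating variants described below would clamp such a value instead.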
- vec_pack_sbool_trunc_m
Narrow and merge the elements of two vectors. Operands 1 and 2 are vectors of the same type having N boolean elements. Operand 0 is the resulting vector in which 2*N elements are concatenated. The last operand (operand 3) is the number of elements in the output vector 2*N as a CONST_INT. This instruction pattern is used when all the vector input and output operands have the same scalar mode m and thus using vec_pack_trunc_m would be ambiguous.
- vec_pack_ssat_m, vec_pack_usat_m
Narrow (demote) and merge the elements of two vectors. Operands 1 and 2 are vectors of the same mode having N integral elements of size S. Operand 0 is the resulting vector in which the elements of the two input vectors are concatenated after narrowing them down using signed/unsigned saturating arithmetic.
- vec_pack_sfix_trunc_m, vec_pack_ufix_trunc_m
Narrow, convert to signed/unsigned integral type and merge the elements of two vectors. Operands 1 and 2 are vectors of the same mode having N floating point elements of size S. Operand 0 is the resulting vector in which 2*N elements of size S/2 are concatenated.
- vec_packs_float_m, vec_packu_float_m
Narrow, convert to floating point type and merge the elements of two vectors. Operands 1 and 2 are vectors of the same mode having N signed/unsigned integral elements of size S. Operand 0 is the resulting vector in which 2*N elements of size S/2 are concatenated.
- vec_unpacks_hi_m, vec_unpacks_lo_m
Extract and widen (promote) the high/low part of a vector of signed integral or floating point elements. The input vector (operand 1) has N elements of size S. Widen (promote) the high/low elements of the vector using signed or floating point extension and place the resulting N/2 values of size 2*S in the output vector (operand 0).
- vec_unpacku_hi_m, vec_unpacku_lo_m
Extract and widen (promote) the high/low part of a vector of unsigned integral elements. The input vector (operand 1) has N elements of size S. Widen (promote) the high/low elements of the vector using zero extension and place the resulting N/2 values of size 2*S in the output vector (operand 0).
- vec_unpacks_sbool_hi_m, vec_unpacks_sbool_lo_m
Extract the high/low part of a vector of boolean elements that have scalar mode m. The input vector (operand 1) has N elements, the output vector (operand 0) has N/2 elements. The last operand (operand 2) is the number of elements of the input vector N as a CONST_INT. These patterns are used if both the input and output vectors have the same scalar mode m and thus using vec_unpacks_hi_m or vec_unpacks_lo_m would be ambiguous.
- vec_unpacks_float_hi_m, vec_unpacks_float_lo_m
- vec_unpacku_float_hi_m, vec_unpacku_float_lo_m
Extract, convert to floating point type and widen the high/low part of a vector of signed/unsigned integral elements. The input vector (operand 1) has N elements of size S. Convert the high/low elements of the vector using floating point conversion and place the resulting N/2 values of size 2*S in the output vector (operand 0).
- vec_unpack_sfix_trunc_hi_m, vec_unpack_sfix_trunc_lo_m
- vec_unpack_ufix_trunc_hi_m, vec_unpack_ufix_trunc_lo_m
Extract, convert to signed/unsigned integer type and widen the high/low part of a vector of floating point elements. The input vector (operand 1) has N elements of size S. Convert the high/low elements of the vector to integers and place the resulting N/2 values of size 2*S in the output vector (operand 0).
- vec_widen_umult_hi_m, vec_widen_umult_lo_m
- vec_widen_smult_hi_m, vec_widen_smult_lo_m
- vec_widen_umult_even_m, vec_widen_umult_odd_m
- vec_widen_smult_even_m, vec_widen_smult_odd_m
Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2) are vectors with N signed/unsigned elements of size S. Multiply the high/low or even/odd elements of the two vectors, and put the N/2 products of size 2*S in the output vector (operand 0). A target shouldn’t implement even/odd pattern pair if it is less efficient than lo/hi one.
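As an illustration of the even/odd variants, the following is a minimal scalar sketch for 16-bit signed inputs and 32-bit products. The function names and the fixed element types are assumptions made for this example; they are not part of GCC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vec_widen_smult_even_m: multiply the
   even-numbered elements of A and B, producing N/2 products that are
   twice as wide as the inputs.  */
void widen_smult_even(const int16_t *a, const int16_t *b,
                      int32_t *out, size_t n)
{
    for (size_t i = 0; i < n / 2; i++)
        out[i] = (int32_t) a[2 * i] * (int32_t) b[2 * i];
}

/* Hypothetical scalar model of vec_widen_smult_odd_m: the same
   operation on the odd-numbered elements.  */
void widen_smult_odd(const int16_t *a, const int16_t *b,
                     int32_t *out, size_t n)
{
    for (size_t i = 0; i < n / 2; i++)
        out[i] = (int32_t) a[2 * i + 1] * (int32_t) b[2 * i + 1];
}
```

Converting both factors before multiplying is what makes the product "widening": the multiplication itself is carried out at size 2*S, so no high bits are lost.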
- vec_widen_ushiftl_hi_m, vec_widen_ushiftl_lo_m, vec_widen_sshiftl_hi_m, vec_widen_sshiftl_lo_m
Signed/Unsigned widening shift left. The first input (operand 1) is a vector with N signed/unsigned elements of size S. Operand 2 is a constant. Shift the high/low elements of operand 1, and put the N/2 results of size 2*S in the output vector (operand 0).
- vec_widen_uaddl_hi_m, vec_widen_uaddl_lo_m, vec_widen_saddl_hi_m, vec_widen_saddl_lo_m
Signed/Unsigned widening add long. Operands 1 and 2 are vectors with N signed/unsigned elements of size S. Add the high/low elements of 1 and 2 together, widen the resulting elements and put the N/2 results of size 2*S in the output vector (operand 0).
- vec_widen_usubl_hi_m, vec_widen_usubl_lo_m, vec_widen_ssubl_hi_m, vec_widen_ssubl_lo_m
Signed/Unsigned widening subtract long. Operands 1 and 2 are vectors with N signed/unsigned elements of size S. Subtract the high/low elements of 2 from 1 and widen the resulting elements. Put the N/2 results of size 2*S in the output vector (operand 0).
- vec_addsubm3
Alternating subtract, add with even lanes doing subtract and odd lanes doing addition. Operands 1 and 2 and the output operand are vectors with mode m.
- vec_fmaddsubm4
Alternating multiply subtract, add with even lanes doing subtract and odd lanes doing addition of the third operand to the multiplication result of the first two operands. Operands 1, 2 and 3 and the output operand are vectors with mode m.
- vec_fmsubaddm4
Alternating multiply add, subtract with even lanes doing addition and odd lanes doing subtraction of the third operand to the multiplication result of the first two operands. Operands 1, 2 and 3 and the output operand are vectors with mode m.
These instructions are not allowed to FAIL.
- mulhisi3
Multiply operands 1 and 2, which have mode HImode, and store a SImode product in operand 0.
- mulqihi3, mulsidi3
Similar widening-multiplication instructions of other widths.
- umulqihi3, umulhisi3, umulsidi3
Similar widening-multiplication instructions that do unsigned multiplication.
- usmulqihi3, usmulhisi3, usmulsidi3
Similar widening-multiplication instructions that interpret the first operand as unsigned and the second operand as signed, then do a signed multiplication.
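The three flavours of scalar widening multiplication differ only in how each operand is extended before the multiply. A minimal sketch for the HImode to SImode case follows; the `_model` function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* mulhisi3: both operands are sign-extended, the product is signed.  */
int32_t mulhisi3_model(int16_t op1, int16_t op2)
{
    return (int32_t) op1 * (int32_t) op2;
}

/* umulhisi3: both operands are zero-extended, the product is unsigned.  */
uint32_t umulhisi3_model(uint16_t op1, uint16_t op2)
{
    return (uint32_t) op1 * (uint32_t) op2;
}

/* usmulhisi3: operand 1 is zero-extended, operand 2 is sign-extended,
   and the multiplication is signed.  The result always fits in 32 bits.  */
int32_t usmulhisi3_model(uint16_t op1, int16_t op2)
{
    return (int32_t) op1 * (int32_t) op2;
}
```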
- smulm3_highpart
Perform a signed multiplication of operands 1 and 2, which have mode m, and store the most significant half of the product in operand 0. The least significant half of the product is discarded. This may be represented in RTL using a smul_highpart RTX expression.
- umulm3_highpart
Similar, but the multiplication is unsigned. This may be represented in RTL using an umul_highpart RTX expression.
- maddmn4
Sign-extend operands 1 and 2 to mode n, multiply them, add operand 3, and store the result in operand 0. Operands 1 and 2 have mode m and operands 0 and 3 have mode n. Both modes must be integer or fixed-point modes and n must be twice the size of m.
In other words, maddmn4 is like mulmn3 except that it also adds operand 3.
These instructions are not allowed to FAIL.
- umaddmn4
Like maddmn4, but zero-extend the multiplication operands instead of sign-extending them.
- ssmaddmn4
Like maddmn4, but all involved operations must be signed-saturating.
- usmaddmn4
Like umaddmn4, but all involved operations must be unsigned-saturating.
- msubmn4
Sign-extend operands 1 and 2 to mode n, multiply them, subtract the product from operand 3, and store the result in operand 0. Operands 1 and 2 have mode m and operands 0 and 3 have mode n. Both modes must be integer or fixed-point modes and n must be twice the size of m.
In other words, msubmn4 is like mulmn3 except that it also subtracts the result from operand 3.
These instructions are not allowed to FAIL.
- umsubmn4
Like msubmn4, but zero-extend the multiplication operands instead of sign-extending them.
- ssmsubmn4
Like msubmn4, but all involved operations must be signed-saturating.
- usmsubmn4
Like umsubmn4, but all involved operations must be unsigned-saturating.
- divmodm4
Signed division that produces both a quotient and a remainder. Operand 1 is divided by operand 2 to produce a quotient stored in operand 0 and a remainder stored in operand 3.
For machines with an instruction that produces both a quotient and a remainder, provide a pattern for divmodm4 but do not provide patterns for divm3 and modm3. This allows optimization in the relatively common case when both the quotient and remainder are computed.
If an instruction that just produces a quotient or just a remainder exists and is more efficient than the instruction that produces both, write the output routine of divmodm4 to call find_reg_note and look for a REG_UNUSED note on the quotient or remainder and generate the appropriate instruction.
- udivmodm4
Similar, but does unsigned division.
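The operand roles of divmodm4 can be sketched in C as follows. The struct and function names are hypothetical, and the sketch assumes C's truncating division, which is what the `/` and `%` operators provide for signed integers.

```c
#include <stdint.h>

/* Hypothetical container for the two outputs of divmodsi4.  */
typedef struct {
    int32_t quot;   /* corresponds to operand 0 */
    int32_t rem;    /* corresponds to operand 3 */
} divmod_result;

/* Scalar model of divmodsi4: OP1 is the dividend and OP2 the divisor.
   OP2 must be nonzero, and OP1 / OP2 must not overflow.  */
divmod_result divmodsi4_model(int32_t op1, int32_t op2)
{
    divmod_result r;
    r.quot = op1 / op2;   /* truncates towards zero */
    r.rem = op1 % op2;    /* takes the sign of the dividend */
    return r;
}
```

Computing both values in one place mirrors the reason for preferring divmodm4 over separate divm3 and modm3 patterns: a single instruction can serve code that needs the quotient, the remainder, or both.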
- ashlm3, ssashlm3, usashlm3
Arithmetic-shift operand 1 left by a number of bits specified by operand 2, and store the result in operand 0. Here m is the mode of operand 0 and operand 1; operand 2’s mode is specified by the instruction pattern, and the compiler will convert the operand to that mode before generating the instruction. The shift or rotate expander or instruction pattern should explicitly specify the mode of operand 2; it should never be VOIDmode. The meaning of out-of-range shift counts can optionally be specified by TARGET_SHIFT_TRUNCATION_MASK. See target_shift_truncation_mask. Operand 2 is always a scalar type.
- ashrm3, lshrm3, rotlm3, rotrm3
Other shift and rotate instructions, analogous to the ashlm3 instructions. Operand 2 is always a scalar type.
- vashlm3, vashrm3, vlshrm3, vrotlm3, vrotrm3
Vector shift and rotate instructions that take vectors as operand 2 instead of a scalar type.
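For the scalar rotate patterns rotlm3 and rotrm3, a minimal 32-bit sketch is shown below. The function names are hypothetical, and reducing the count modulo 32 is an assumption of this sketch only: the actual meaning of out-of-range counts is target-dependent, as noted above.

```c
#include <stdint.h>

/* Hypothetical model of rotlsi3: rotate X left by N bits.
   The count is reduced modulo 32 (an assumption of this sketch).  */
uint32_t rotlsi3_model(uint32_t x, unsigned n)
{
    n &= 31;
    /* Shifting by 32 would be undefined in C, so handle n == 0 apart.  */
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Hypothetical model of rotrsi3: rotate X right by N bits.  */
uint32_t rotrsi3_model(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}
```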
- avgm3_floor, uavgm3_floor
Signed and unsigned average instructions. These instructions add operands 1 and 2 without truncation, divide the result by 2, round towards -Inf, and store the result in operand 0. This is equivalent to the C code:
    narrow op0, op1, op2;
    ...
    op0 = (narrow) (((wide) op1 + (wide) op2) >> 1);
where the sign of narrow determines whether this is a signed or unsigned operation.
- avgm3_ceil, uavgm3_ceil
Like avgm3_floor and uavgm3_floor, but round towards +Inf. This is equivalent to the C code:
    narrow op0, op1, op2;
    ...
    op0 = (narrow) (((wide) op1 + (wide) op2 + 1) >> 1);
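To make the rounding difference concrete, here is a sketch that instantiates the formulas above with 8-bit elements and a 16-bit intermediate. The function names are hypothetical. The signed versions assume that `>>` on a negative value is an arithmetic shift, which GCC guarantees but ISO C leaves implementation-defined.

```c
#include <stdint.h>

/* Signed floor average: narrow = int8_t, wide = int16_t.  */
int8_t avg_floor_s8(int8_t op1, int8_t op2)
{
    return (int8_t) (((int16_t) op1 + (int16_t) op2) >> 1);
}

/* Signed ceiling average: add 1 before halving.  */
int8_t avg_ceil_s8(int8_t op1, int8_t op2)
{
    return (int8_t) (((int16_t) op1 + (int16_t) op2 + 1) >> 1);
}

/* Unsigned floor average: narrow = uint8_t, wide = uint16_t.  */
uint8_t uavg_floor_u8(uint8_t op1, uint8_t op2)
{
    return (uint8_t) (((uint16_t) op1 + (uint16_t) op2) >> 1);
}

/* Unsigned ceiling average.  */
uint8_t uavg_ceil_u8(uint8_t op1, uint8_t op2)
{
    return (uint8_t) (((uint16_t) op1 + (uint16_t) op2 + 1) >> 1);
}
```

Because the sum is formed in the wider type, inputs such as 255 and 254 do not overflow before the division by two.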
- bswapm2
Reverse the order of bytes of operand 1 and store the result in operand 0.
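For a 32-bit operand, the byte reversal can be sketched with shifts and masks. The function name is hypothetical; in user code GCC exposes the same operation as `__builtin_bswap32`.

```c
#include <stdint.h>

/* Hypothetical model of bswapsi2: reverse the four bytes of X.  */
uint32_t bswapsi2_model(uint32_t x)
{
    return ((x & 0x000000FFu) << 24)
         | ((x & 0x0000FF00u) << 8)
         | ((x & 0x00FF0000u) >> 8)
         | ((x & 0xFF000000u) >> 24);
}
```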
- negm2, ssnegm2, usnegm2
Negate operand 1 and store the result in operand 0.
- negvm3
Like negm2 but takes a code_label as operand 2 and emits code to jump to it if signed overflow occurs during the negation.
- absm2
Store the absolute value of operand 1 into operand 0.
- sqrtm2
Store the square root of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- rsqrtm2
Store the reciprocal of the square root of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
On most architectures this pattern is only approximate, so either its C condition or the TARGET_OPTAB_SUPPORTED_P hook should check for the appropriate math flags. (Using the C condition is more direct, but using TARGET_OPTAB_SUPPORTED_P can be useful if a target-specific built-in also uses the rsqrtm2 pattern.)
This pattern is not allowed to FAIL.
- fmodm3
Store the remainder of dividing operand 1 by operand 2 into operand 0, where the quotient is rounded towards zero to an integer. All operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- remainderm3
Store the remainder of dividing operand 1 by operand 2 into operand 0, where the quotient is rounded to the nearest integer. All operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- scalbm3
Raise FLT_RADIX to the power of operand 2, multiply it by operand 1, and store the result in operand 0. All operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- ldexpm3
Raise 2 to the power of operand 2, multiply it by operand 1, and store the result in operand 0. Operands 0 and 1 have mode m, which is a scalar or vector floating-point mode. Operand 2’s mode has the same number of elements as m and each element is wide enough to store an int. The integers are signed.
This pattern is not allowed to FAIL.
- cosm2
Store the cosine of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- sinm2
Store the sine of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- sincosm3
Store the cosine of operand 2 into operand 0 and the sine of operand 2 into operand 1. All operands have mode m, which is a scalar or vector floating-point mode.
Targets that can calculate the sine and cosine simultaneously can implement this pattern as opposed to implementing individual sinm2 and cosm2 patterns. The sin and cos built-in functions will then be expanded to the sincosm3 pattern, with one of the output values left unused.
- tanm2
Store the tangent of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- asinm2
Store the arc sine of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- acosm2
Store the arc cosine of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- atanm2
Store the arc tangent of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- fegetroundm
Store the current machine floating-point rounding mode into operand 0. Operand 0 has mode m, which is scalar. This pattern is used to implement the fegetround function from the ISO C99 standard.
- feclearexceptm, feraiseexceptm
Clears or raises the supported machine floating-point exceptions represented by the bits in operand 1. Error status is stored as a nonzero value in operand 0. Both operands have mode m, which is scalar. These patterns are used to implement the feclearexcept and feraiseexcept functions from the ISO C99 standard.
- expm2
Raise e (the base of natural logarithms) to the power of operand 1 and store the result in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- expm1m2
Raise e (the base of natural logarithms) to the power of operand 1, subtract 1, and store the result in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
For inputs close to zero, the pattern is expected to be more accurate than a separate expm2 and subm3 would be.
This pattern is not allowed to FAIL.
- exp10m2
Raise 10 to the power of operand 1 and store the result in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- exp2m2
Raise 2 to the power of operand 1 and store the result in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- logm2
Store the natural logarithm of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- log1pm2
Add 1 to operand 1, compute the natural logarithm, and store the result in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
For inputs close to zero, the pattern is expected to be more accurate than a separate addm3 and logm2 would be.
This pattern is not allowed to FAIL.
- log10m2
Store the base-10 logarithm of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- log2m2
Store the base-2 logarithm of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- logbm2
Store the base-FLT_RADIX logarithm of operand 1 into operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- significandm2
Store the significand of floating-point operand 1 in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- powm3
Store the value of operand 1 raised to the exponent operand 2 into operand 0. All operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- atan2m3
Store the arc tangent (inverse tangent) of operand 1 divided by operand 2 into operand 0, using the signs of both arguments to determine the quadrant of the result. All operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- floorm2
Store the largest integral value not greater than operand 1 in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode. If -ffp-int-builtin-inexact is in effect, the ‘inexact’ exception may be raised for noninteger operands; otherwise, it may not.
This pattern is not allowed to FAIL.
- btruncm2
Round operand 1 to an integer, towards zero, and store the result in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode. If -ffp-int-builtin-inexact is in effect, the ‘inexact’ exception may be raised for noninteger operands; otherwise, it may not.
This pattern is not allowed to FAIL.
- roundm2
Round operand 1 to the nearest integer, rounding away from zero in the event of a tie, and store the result in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode. If -ffp-int-builtin-inexact is in effect, the ‘inexact’ exception may be raised for noninteger operands; otherwise, it may not.
This pattern is not allowed to FAIL.
- ceilm2
Store the smallest integral value not less than operand 1 in operand 0. Both operands have mode m, which is a scalar or vector floating-point mode. If -ffp-int-builtin-inexact is in effect, the ‘inexact’ exception may be raised for noninteger operands; otherwise, it may not.
This pattern is not allowed to FAIL.
- nearbyintm2
Round operand 1 to an integer, using the current rounding mode, and store the result in operand 0. Do not raise an inexact condition when the result is different from the argument. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- rintm2
Round operand 1 to an integer, using the current rounding mode, and store the result in operand 0. Raise an inexact condition when the result is different from the argument. Both operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- lrintmn2
Convert operand 1 (valid for floating point mode m) to fixed point mode n as a signed number according to the current rounding mode and store in operand 0 (which has mode n).
- lroundmn2
Convert operand 1 (valid for floating point mode m) to fixed point mode n as a signed number rounding to nearest and away from zero and store in operand 0 (which has mode n).
- lfloormn2
Convert operand 1 (valid for floating point mode m) to fixed point mode n as a signed number rounding down and store in operand 0 (which has mode n).
- lceilmn2
Convert operand 1 (valid for floating point mode m) to fixed point mode n as a signed number rounding up and store in operand 0 (which has mode n).
- copysignm3
Store a value with the magnitude of operand 1 and the sign of operand 2 into operand 0. All operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- xorsignm3
Equivalent to op0 = op1 * copysign (1.0, op2): store a value with the magnitude of operand 1 and a sign that is the exclusive-or of the signs of operands 1 and 2 into operand 0. All operands have mode m, which is a scalar or vector floating-point mode.
This pattern is not allowed to FAIL.
- issignalingm2
Set operand 0 to 1 if operand 1 is a signaling NaN and to 0 otherwise.
- cadd90m3
Perform vector add and subtract on even/odd number pairs. The operation being matched is semantically described as
    for (int i = 0; i < N; i += 2)
      {
        c[i] = a[i] - b[i+1];
        c[i+1] = a[i+1] + b[i];
      }
This operation is semantically equivalent to performing a vector addition of complex numbers in operand 1 with operand 2 rotated by 90 degrees around the Argand plane and storing the result in operand 0.
In GCC lane ordering the real part of the number must be in the even lanes with the imaginary part in the odd lanes.
The operation is only supported for vector modes m.
This pattern is not allowed to FAIL.
- cadd270m3
Perform vector add and subtract on even/odd number pairs. The operation being matched is semantically described as
    for (int i = 0; i < N; i += 2)
      {
        c[i] = a[i] + b[i+1];
        c[i+1] = a[i+1] - b[i];
      }
This operation is semantically equivalent to performing a vector addition of complex numbers in operand 1 with operand 2 rotated by 270 degrees around the Argand plane and storing the result in operand 0.
In GCC lane ordering the real part of the number must be in the even lanes with the imaginary part in the odd lanes.
The operation is only supported for vector modes m.
This pattern is not allowed to FAIL.
- cmlam4
Perform a vector multiply and accumulate that is semantically the same as a multiply and accumulate of complex numbers.
    complex TYPE op0[N];
    complex TYPE op1[N];
    complex TYPE op2[N];
    complex TYPE op3[N];
    for (int i = 0; i < N; i += 1)
      {
        op0[i] = op1[i] * op2[i] + op3[i];
      }
In GCC lane ordering the real part of the number must be in the even lanes with the imaginary part in the odd lanes.
The operation is only supported for vector modes m.
This pattern is not allowed to FAIL.
- cmla_conjm4
Perform a vector multiply by conjugate and accumulate that is semantically the same as a multiply and accumulate of complex numbers where the second multiply argument is conjugated.
    complex TYPE op0[N];
    complex TYPE op1[N];
    complex TYPE op2[N];
    complex TYPE op3[N];
    for (int i = 0; i < N; i += 1)
      {
        op0[i] = op1[i] * conj (op2[i]) + op3[i];
      }
In GCC lane ordering the real part of the number must be in the even lanes with the imaginary part in the odd lanes.
The operation is only supported for vector modes m.
This pattern is not allowed to FAIL.
- cmlsm4
Perform a vector multiply and subtract that is semantically the same as a multiply and subtract of complex numbers.
    complex TYPE op0[N];
    complex TYPE op1[N];
    complex TYPE op2[N];
    complex TYPE op3[N];
    for (int i = 0; i < N; i += 1)
      {
        op0[i] = op1[i] * op2[i] - op3[i];
      }
In GCC lane ordering the real part of the number must be in the even lanes with the imaginary part in the odd lanes.
The operation is only supported for vector modes m.
This pattern is not allowed to FAIL.
- cmls_conjm4
Perform a vector multiply by conjugate and subtract that is semantically the same as a multiply and subtract of complex numbers where the second multiply argument is conjugated.
    complex TYPE op0[N];
    complex TYPE op1[N];
    complex TYPE op2[N];
    complex TYPE op3[N];
    for (int i = 0; i < N; i += 1)
      {
        op0[i] = op1[i] * conj (op2[i]) - op3[i];
      }
In GCC lane ordering the real part of the number must be in the even lanes with the imaginary part in the odd lanes.
The operation is only supported for vector modes m.
This pattern is not allowed to FAIL.
- cmulm4
Perform a vector multiply that is semantically the same as a multiply of complex numbers.
    complex TYPE op0[N];
    complex TYPE op1[N];
    complex TYPE op2[N];
    for (int i = 0; i < N; i += 1)
      {
        op0[i] = op1[i] * op2[i];
      }
In GCC lane ordering the real part of the number must be in the even lanes with the imaginary part in the odd lanes.
The operation is only supported for vector modes m.
This pattern is not allowed to FAIL.
- cmul_conjm4
Perform a vector multiply by conjugate that is semantically the same as a multiply of complex numbers where the second multiply argument is conjugated.
    complex TYPE op0[N];
    complex TYPE op1[N];
    complex TYPE op2[N];
    for (int i = 0; i < N; i += 1)
      {
        op0[i] = op1[i] * conj (op2[i]);
      }
In GCC lane ordering the real part of the number must be in the even lanes with the imaginary part in the odd lanes.
The operation is only supported for vector modes m.
This pattern is not allowed to FAIL.
- ffsm2
Store into operand 0 one plus the index of the least significant 1-bit of operand 1. If operand 1 is zero, store zero.
m is either a scalar or vector integer mode. When it is a scalar, operand 1 has mode m but operand 0 can have whatever scalar integer mode is suitable for the target. The compiler will insert conversion instructions as necessary (typically to convert the result to the same width as int). When m is a vector, both operands must have mode m.
This pattern is not allowed to FAIL.
- clrsbm2
Count leading redundant sign bits. Store into operand 0 the number of redundant sign bits in operand 1, starting at the most significant bit position. A redundant sign bit is defined as any sign bit after the first. As such, this count will be one less than the count of leading sign bits.
m is either a scalar or vector integer mode. When it is a scalar, operand 1 has mode m but operand 0 can have whatever scalar integer mode is suitable for the target. The compiler will insert conversion instructions as necessary (typically to convert the result to the same width as int). When m is a vector, both operands must have mode m.
This pattern is not allowed to FAIL.
- clzm2
Store into operand 0 the number of leading 0-bits in operand 1, starting at the most significant bit position. If operand 1 is 0, the CLZ_DEFINED_VALUE_AT_ZERO (see Miscellaneous Parameters) macro defines whether the result is undefined or has a useful value.
m is either a scalar or vector integer mode. When it is a scalar, operand 1 has mode m but operand 0 can have whatever scalar integer mode is suitable for the target. The compiler will insert conversion instructions as necessary (typically to convert the result to the same width as int). When m is a vector, both operands must have mode m.
This pattern is not allowed to FAIL.
- ctzm2
Store into operand 0 the number of trailing 0-bits in operand 1, starting at the least significant bit position. If operand 1 is 0, the CTZ_DEFINED_VALUE_AT_ZERO (see Miscellaneous Parameters) macro defines whether the result is undefined or has a useful value.
m is either a scalar or vector integer mode. When it is a scalar, operand 1 has mode m but operand 0 can have whatever scalar integer mode is suitable for the target. The compiler will insert conversion instructions as necessary (typically to convert the result to the same width as int). When m is a vector, both operands must have mode m.
This pattern is not allowed to FAIL.
- popcountm2
Store into operand 0 the number of 1-bits in operand 1.
m is either a scalar or vector integer mode. When it is a scalar, operand 1 has mode m but operand 0 can have whatever scalar integer mode is suitable for the target. The compiler will insert conversion instructions as necessary (typically to convert the result to the same width as int). When m is a vector, both operands must have mode m.
This pattern is not allowed to FAIL.
- paritym2
Store into operand 0 the parity of operand 1, i.e. the number of 1-bits in operand 1 modulo 2.
m is either a scalar or vector integer mode. When it is a scalar, operand 1 has mode m but operand 0 can have whatever scalar integer mode is suitable for the target. The compiler will insert conversion instructions as necessary (typically to convert the result to the same width as int). When m is a vector, both operands must have mode m.
This pattern is not allowed to FAIL.
- one_cmplm2
Store the bitwise-complement of operand 1 into operand 0.
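The bit-counting patterns above (ffsm2, clrsbm2, clzm2, ctzm2, popcountm2 and paritym2) can be illustrated with straightforward 32-bit reference loops. The function names are hypothetical. Returning 32 from the clz and ctz models for a zero input is a choice made for this sketch only; for the real patterns that case is governed by CLZ_DEFINED_VALUE_AT_ZERO and CTZ_DEFINED_VALUE_AT_ZERO.

```c
#include <stdint.h>

/* Number of 1-bits: clear the lowest set bit until none remain.  */
int popcount32(uint32_t x)
{
    int n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* Parity: number of 1-bits modulo 2.  */
int parity32(uint32_t x) { return popcount32(x) & 1; }

/* Leading zeros; yields 32 for x == 0 (a choice of this sketch).  */
int clz32(uint32_t x)
{
    int n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* Trailing zeros; yields 32 for x == 0 (a choice of this sketch).  */
int ctz32(uint32_t x)
{
    int n = 0;
    for (uint32_t bit = 1; bit != 0 && (x & bit) == 0; bit <<= 1)
        n++;
    return n;
}

/* One plus the index of the least significant 1-bit, or 0 if none.  */
int ffs32(uint32_t x) { return x ? ctz32(x) + 1 : 0; }

/* Redundant sign bits: count the leading copies of the sign bit,
   then subtract one for the sign bit itself.  */
int clrsb32(int32_t x)
{
    uint32_t u = (uint32_t) x;
    if (u & 0x80000000u)
        u = ~u;             /* turn leading ones into leading zeros */
    return clz32(u) - 1;
}
```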
- cpymemm
Block copy instruction. The destination and source blocks of memory are the first two operands, and both are mem:BLKs with an address in mode Pmode.
The number of bytes to copy is the third operand, in mode m. Usually, you specify Pmode for m. However, if you can generate better code knowing the range of valid lengths is smaller than those representable in a full Pmode pointer, you should provide a pattern with a mode corresponding to the range of values you can handle efficiently (e.g., QImode for values in the range 0–127; note we avoid numbers that appear negative) and also a pattern with Pmode.
The fourth operand is the known shared alignment of the source and destination, in the form of a const_int rtx. Thus, if the compiler knows that both source and destination are word-aligned, it may provide the value 4 for this operand.
Optional operands 5 and 6 specify expected alignment and size of block respectively. The expected alignment differs from alignment in operand 4 in a way that the blocks are not required to be aligned according to it in all cases. This expected alignment is also in bytes, just like operand 4. Expected size, when unknown, is set to (const_int -1).
Descriptions of multiple cpymemm patterns can only be beneficial if the patterns for smaller modes have fewer restrictions on their first, second and fourth operands. Note that the mode m in cpymemm does not impose any restriction on the mode of individually copied data units in the block.
The cpymemm patterns need not give special consideration to the possibility that the source and destination strings might overlap. These patterns are used to do inline expansion of __builtin_memcpy.
- movmemm
Block move instruction. The destination and source blocks of memory are the first two operands, and both are mem:BLKs with an address in mode Pmode.
The number of bytes to copy is the third operand, in mode m. Usually, you specify Pmode for m. However, if you can generate better code knowing the range of valid lengths is smaller than those representable in a full Pmode pointer, you should provide a pattern with a mode corresponding to the range of values you can handle efficiently (e.g., QImode for values in the range 0–127; note we avoid numbers that appear negative) and also a pattern with Pmode.
The fourth operand is the known shared alignment of the source and destination, in the form of a const_int rtx. Thus, if the compiler knows that both source and destination are word-aligned, it may provide the value 4 for this operand.
Optional operands 5 and 6 specify expected alignment and size of block respectively. The expected alignment differs from alignment in operand 4 in a way that the blocks are not required to be aligned according to it in all cases. This expected alignment is also in bytes, just like operand 4. Expected size, when unknown, is set to (const_int -1).
Descriptions of multiple movmemm patterns can only be beneficial if the patterns for smaller modes have fewer restrictions on their first, second and fourth operands. Note that the mode m in movmemm does not impose any restriction on the mode of individually copied data units in the block.
The movmemm patterns must correctly handle the case where the source and destination strings overlap. These patterns are used to do inline expansion of __builtin_memmove.
- movstr
String copy instruction, with stpcpy semantics. Operand 0 is an output operand in mode Pmode. The addresses of the destination and source strings are operands 1 and 2, and both are mem:BLKs with addresses in mode Pmode. The execution of the expansion of this pattern should store in operand 0 the address in which the NUL terminator was stored in the destination string.
This pattern also has several optional operands that are the same as in setmem.
- setmemm
Block set instruction. The destination string is the first operand, given as a mem:BLK whose address is in mode Pmode. The number of bytes to set is the second operand, in mode m. The value to initialize the memory with is the third operand. Targets that only support the clearing of memory should reject any value that is not the constant 0. See cpymemm for a discussion of the choice of mode.
The fourth operand is the known alignment of the destination, in the form of a const_int rtx. Thus, if the compiler knows that the destination is word-aligned, it may provide the value 4 for this operand.
Optional operands 5 and 6 specify expected alignment and size of block respectively. The expected alignment differs from alignment in operand 4 in a way that the blocks are not required to be aligned according to it in all cases. This expected alignment is also in bytes, just like operand 4. Expected size, when unknown, is set to (const_int -1). Operand 7 is the minimal size of the block and operand 8 is the maximal size of the block (NULL if it cannot be represented as CONST_INT). Operand 9 is the probable maximal size (i.e. we cannot rely on it for correctness, but it can be used for choosing a proper code sequence for a given size).
The use for multiple setmemm is as for cpymemm.
- cmpstrnm
String compare instruction, with five operands. Operand 0 is the output; it has mode m. The remaining four operands are like the operands of cpymemm. The two memory blocks specified are compared byte by byte in lexicographic order starting at the beginning of each string. The instruction is not allowed to prefetch more than one byte at a time since either string may end in the first byte and reading past that may access an invalid page or segment and cause a fault. The comparison terminates early if the fetched bytes are different or if they are equal to zero. The effect of the instruction is to store a value in operand 0 whose sign indicates the result of the comparison.
- cmpstrm
String compare instruction, without known maximum length. Operand 0 is the output; it has mode m. The second and third operands are the blocks of memory to be compared; both are mem:BLK with an address in mode Pmode.
The fourth operand is the known shared alignment of the source and destination, in the form of a const_int rtx. Thus, if the compiler knows that both source and destination are word-aligned, it may provide the value 4 for this operand.
The two memory blocks specified are compared byte by byte in lexicographic order starting at the beginning of each string. The instruction is not allowed to prefetch more than one byte at a time since either string may end in the first byte and reading past that may access an invalid page or segment and cause a fault. The comparison will terminate when the fetched bytes are different or if they are equal to zero. The effect of the instruction is to store a value in operand 0 whose sign indicates the result of the comparison.
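The termination rules of cmpstrm can be sketched as a byte-at-a-time loop. The function name is hypothetical, and comparing bytes as unsigned char is an assumption borrowed from strcmp; the text above only requires that the sign of the result reflect the ordering.

```c
/* Hypothetical model of cmpstrm: fetch exactly one byte from each
   string per step, stop at the first difference or at a shared NUL.  */
int cmpstr_model(const unsigned char *s1, const unsigned char *s2)
{
    for (;;) {
        unsigned char c1 = *s1++;
        unsigned char c2 = *s2++;
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;   /* only the sign is meaningful */
        if (c1 == 0)
            return 0;                  /* both strings ended together */
    }
}
```

Reading one byte per step is what keeps the loop from touching memory past the terminator, which is the reason the pattern forbids wider prefetching.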
- cmpmemm
Block compare instruction, with five operands like the operands of cmpstrm. The two memory blocks specified are compared byte by byte in lexicographic order starting at the beginning of each block. Unlike cmpstrm the instruction can prefetch any bytes in the two memory blocks. Also unlike cmpstrm the comparison will not stop if both bytes are zero. The effect of the instruction is to store a value in operand 0 whose sign indicates the result of the comparison.
- strlenm
Compute the length of a string, with four operands. Operand 0 is the result (of mode m), operand 1 is a mem referring to the first character of the string, operand 2 is the character to search for (normally zero), and operand 3 is a constant describing the known alignment of the beginning of the string.
- rawmemchrm
Scan memory referred to by operand 1 for the first occurrence of operand 2. Operand 1 is a mem and operand 2 is a const_int of mode m. Operand 0 is the result, i.e., a pointer to the first occurrence of operand 2 in the memory block given by operand 1.
- floatmn2
Convert signed integer operand 1 (valid for fixed point mode m) to floating point mode n and store in operand 0 (which has mode n).
- floatunsmn2
Convert unsigned integer operand 1 (valid for fixed point mode m) to floating point mode n and store in operand 0 (which has mode n).
- fixmn2
Convert operand 1 (valid for floating point mode m) to fixed point mode n as a signed number and store in operand 0 (which has mode n). This instruction’s result is defined only when the value of operand 1 is an integer.
If the machine description defines this pattern, it also needs to define the ftrunc pattern.
- fixunsmn2
Convert operand 1 (valid for floating point mode m) to fixed point mode n as an unsigned number and store in operand 0 (which has mode n). This instruction’s result is defined only when the value of operand 1 is an integer.
- ftruncm2
Convert operand 1 (valid for floating point mode m) to an integer value, still represented in floating point mode m, and store it in operand 0 (valid for floating point mode m).
- fix_truncmn2
Like fixmn2 but works for any floating point value of mode m by converting the value to an integer.
- fixuns_truncmn2
Like fixunsmn2 but works for any floating point value of mode m by converting the value to an integer.
- truncmn2
Truncate operand 1 (valid for mode m) to mode n and store in operand 0 (which has mode n). Both modes must be fixed point or both floating point.
- extendmn2
Sign-extend operand 1 (valid for mode m) to mode n and store in operand 0 (which has mode n). Both modes must be fixed point or both floating point.
- zero_extendmn2
Zero-extend operand 1 (valid for mode m) to mode n and store in operand 0 (which has mode n). Both modes must be fixed point.
- fractmn2
Convert operand 1 of mode m to mode n and store in operand 0 (which has mode n). The conversion from mode m to mode n may be fixed-point to fixed-point, signed integer to fixed-point, fixed-point to signed integer, floating-point to fixed-point, or fixed-point to floating-point. On overflow or underflow, the results are undefined.
- satfractmn2
Convert operand 1 of mode m to mode n and store in operand 0 (which has mode n). The conversion from mode m to mode n may be fixed-point to fixed-point, signed integer to fixed-point, or floating-point to fixed-point. On overflow or underflow, the instruction saturates the results to the maximum or the minimum.
- fractunsmn2
Convert operand 1 of mode m to mode n and store in operand 0 (which has mode n). The conversion from mode m to mode n may be unsigned integer to fixed-point, or fixed-point to unsigned integer. On overflow or underflow, the results are undefined.
- satfractunsmn2
Convert unsigned integer operand 1 of mode m to fixed-point mode n and store in operand 0 (which has mode n). On overflow or underflow, the instruction saturates the results to the maximum or the minimum.
- extvm
Extract a bit-field from register operand 1, sign-extend it, and store it in operand 0. Operand 2 specifies the width of the field in bits and operand 3 the starting bit, which counts from the most significant bit if BITS_BIG_ENDIAN is true and from the least significant bit otherwise.
Operands 0 and 1 both have mode m. Operands 2 and 3 have a target-specific mode.
- extvmisalignm
Extract a bit-field from memory operand 1, sign-extend it, and store it in operand 0. Operand 2 specifies the width in bits and operand 3 the starting bit. The starting bit is always somewhere in the first byte of operand 1; it counts from the most significant bit if BITS_BIG_ENDIAN is true and from the least significant bit otherwise.
Operand 0 has mode m while operand 1 has BLK mode. Operands 2 and 3 have a target-specific mode.
The instruction must not read beyond the last byte of the bit-field.
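The register form of sign-extending extraction described above can be sketched in C for a 32-bit mode. This is an illustrative model only: the name extv_model and the bits_big_endian_p parameter (standing in for the target macro BITS_BIG_ENDIAN) are assumptions, and the final conversion to int32_t relies on the usual two's-complement behaviour of GCC-compatible compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a sign-extending bit-field extract on a 32-bit
   value.  WIDTH corresponds to operand 2 and START to operand 3.  */
int32_t
extv_model (uint32_t src, unsigned width, unsigned start,
            int bits_big_endian_p)
{
  assert (width >= 1 && start + width <= 32);
  /* Express the starting bit as a distance from the least significant
     bit, whichever end the target counts from.  */
  unsigned lsb = bits_big_endian_p ? 32 - start - width : start;
  uint32_t mask = width == 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;
  uint32_t field = (src >> lsb) & mask;
  /* Sign-extend: flip the field's sign bit, then subtract it again.  */
  uint32_t sign = UINT32_C (1) << (width - 1);
  return (int32_t) ((field ^ sign) - sign);
}
```

A zero-extending variant would simply return the masked field without the final sign-extension step.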
- extzvm
Like extvm except that the bit-field value is zero-extended.
- extzvmisalignm
Like extvmisalignm except that the bit-field value is zero-extended.
- insvm
Insert operand 3 into a bit-field of register operand 0. Operand 1 specifies the width of the field in bits and operand 2 the starting bit, which counts from the most significant bit if BITS_BIG_ENDIAN is true and from the least significant bit otherwise.
Operands 0 and 3 both have mode m. Operands 1 and 2 have a target-specific mode.
- insvmisalignm
Insert operand 3 into a bit-field of memory operand 0. Operand 1 specifies the width of the field in bits and operand 2 the starting bit. The starting bit is always somewhere in the first byte of operand 0; it counts from the most significant bit if BITS_BIG_ENDIAN is true and from the least significant bit otherwise.
Operand 3 has mode m while operand 0 has BLK mode. Operands 1 and 2 have a target-specific mode.
The instruction must not read or write beyond the last byte of the bit-field.
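Bit-field insertion into a register can be modelled the same way. The sketch below assumes a 32-bit mode and that bits outside the field are preserved; insv_model and bits_big_endian_p are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a register bit-field insert: returns DEST with
   the low WIDTH bits of VALUE (operand 3) stored in the field described
   by WIDTH (operand 1) and START (operand 2).  */
uint32_t
insv_model (uint32_t dest, unsigned width, unsigned start, uint32_t value,
            int bits_big_endian_p)
{
  assert (width >= 1 && start + width <= 32);
  unsigned lsb = bits_big_endian_p ? 32 - start - width : start;
  uint32_t mask = width == 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;
  /* Clear the field, then merge in the new bits; all other bits of
     DEST are kept (an assumption of this model).  */
  return (dest & ~(mask << lsb)) | ((value & mask) << lsb);
}
```

For instance, inserting a 4-bit zero at bit 4 of an all-ones word clears only bits 4 to 7.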
- extv
Extract a bit-field from operand 1 (a register or memory operand), where operand 2 specifies the width in bits and operand 3 the starting bit, and store it in operand 0. Operand 0 must have mode word_mode. Operand 1 may have mode byte_mode or word_mode; often word_mode is allowed only for registers. Operands 2 and 3 must be valid for word_mode.
The RTL generation pass generates this instruction only with constants for operands 2 and 3 and the constant is never zero for operand 2.
The bit-field value is sign-extended to a full word integer before it is stored in operand 0.
This pattern is deprecated; please use extvm and extvmisalignm instead.
- extzv
Like extv except that the bit-field value is zero-extended.
This pattern is deprecated; please use extzvm and extzvmisalignm instead.
- insv
Store operand 3 (which must be valid for word_mode) into a bit-field in operand 0, where operand 1 specifies the width in bits and operand 2 the starting bit. Operand 0 may have mode byte_mode or word_mode; often word_mode is allowed only for registers. Operands 1 and 2 must be valid for word_mode.
The RTL generation pass generates this instruction only with constants for operands 1 and 2 and the constant is never zero for operand 1.
This pattern is deprecated; please use insvm and insvmisalignm instead.
- movmodecc
Conditionally move operand 2 or operand 3 into operand 0 according to the comparison in operand 1. If the comparison is true, operand 2 is moved into operand 0, otherwise operand 3 is moved.
The mode of the operands being compared need not be the same as the operands being moved. Some machines, sparc64 for example, have instructions that conditionally move an integer value based on the floating point condition codes and vice versa.
If the machine does not have conditional move instructions, do not define these patterns.
- addmodecc
Similar to movmodecc but for conditional addition. Conditionally move operand 2 or (operand 2 + operand 3) into operand 0 according to the comparison in operand 1. If the comparison is false, operand 2 is moved into operand 0, otherwise (operand 2 + operand 3) is moved.
- cond_addmode cond_submode cond_mulmode cond_divmode cond_udivmode cond_modmode cond_umodmode cond_andmode cond_iormode cond_xormode cond_sminmode cond_smaxmode cond_uminmode cond_umaxmode cond_fminmode cond_fmaxmode cond_ashlmode cond_ashrmode cond_lshrmode
When operand 1 is true, perform an operation on operands 2 and 3 and store the result in operand 0, otherwise store operand 4 in operand 0. The operation works elementwise if the operands are vectors.
The scalar case is equivalent to:
op0 = op1 ? op2 op op3 : op4;
while the vector case is equivalent to:
for (i = 0; i < GET_MODE_NUNITS (m); i++) op0[i] = op1[i] ? op2[i] op op3[i] : op4[i];
where, for example, op is + for cond_addmode.
When defined for floating-point modes, the contents of op3[i] are not interpreted if op1[i] is false, just like they would not be in a normal C ?: condition.
Operands 0, 2, 3 and 4 all have mode m. Operand 1 is a scalar integer if m is scalar, otherwise it has the mode returned by TARGET_VECTORIZE_GET_MASK_MODE.
cond_opmode generally corresponds to a conditional form of opmode3. As an exception, the vector forms of shifts correspond to patterns like vashlmode3 rather than patterns like ashlmode3.
- cond_fmamode cond_fmsmode cond_fnmamode cond_fnmsmode
Like cond_addm, except that the conditional operation takes 3 operands rather than two. For example, the vector form of cond_fmamode is equivalent to:
for (i = 0; i < GET_MODE_NUNITS (m); i++) op0[i] = op1[i] ? fma (op2[i], op3[i], op4[i]) : op5[i];
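The masked, elementwise behaviour of the cond_ patterns can be exercised with a small C model. This is a sketch under stated assumptions: cond_add_model is a hypothetical name, the mask is represented as one byte per lane, and n stands in for GET_MODE_NUNITS (m).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical elementwise model of the vector form of cond_addmode.
   MASK plays the role of operand 1, A and B of operands 2 and 3, and
   FALLBACK of operand 4.  */
void
cond_add_model (int32_t *dst, const uint8_t *mask, const int32_t *a,
                const int32_t *b, const int32_t *fallback, size_t n)
{
  assert (dst && mask && a && b && fallback);
  for (size_t i = 0; i < n; i++)
    /* a[i] and b[i] are only read for active lanes, mirroring the way
       the C ?: operator leaves the unselected arm unevaluated.  */
    dst[i] = mask[i] ? a[i] + b[i] : fallback[i];
}
```

Inactive lanes take their value from the fallback operand, so the result is fully defined for every lane regardless of the mask.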
- negmodecc
Similar to movmodecc but for conditional negation. Conditionally move the negation of operand 2 or the unchanged operand 3 into operand 0 according to the comparison in operand 1. If the comparison is true, the negation of operand 2 is moved into operand 0, otherwise operand 3 is moved.
- notmodecc
Similar to negmodecc but for conditional complement. Conditionally move the bitwise complement of operand 2 or the unchanged operand 3 into operand 0 according to the comparison in operand 1. If the comparison is true, the complement of operand 2 is moved into operand 0, otherwise operand 3 is moved.
- cstoremode4
Store zero or nonzero in operand 0 according to whether a comparison is true. Operand 1 is a comparison operator. Operand 2 and operand 3 are the first and second operand of the comparison, respectively. You specify the mode that operand 0 must have when you write the match_operand expression. The compiler automatically sees which mode you have used and supplies an operand of that mode.
The value stored for a true condition must have 1 as its low bit, or else must be negative. Otherwise the instruction is not suitable and you should omit it from the machine description. You describe to the compiler exactly which value is stored by defining the macro STORE_FLAG_VALUE (see Miscellaneous Parameters). If a description cannot be found that can be used for all the possible comparison operators, you should pick one and use a define_expand to map all results onto the one you chose.
These operations may FAIL, but should do so only in relatively uncommon cases; if they would FAIL for common cases involving integer comparisons, it is best to restrict the predicates to not allow these operands. Likewise if a given comparison operator will always fail, independent of the operands (for floating-point modes, the ordered_comparison_operator predicate is often useful in this case).
If this pattern is omitted, the compiler will generate a conditional branch (for example, it may copy a constant one to the target and branch around an assignment of zero to the target) or a libcall. If the predicate for operand 1 only rejects some operators, it will also try reordering the operands and/or inverting the result value (e.g. by an exclusive OR). These possibilities could be cheaper or equivalent to the instructions used for the cstoremode4 pattern followed by those required to convert a positive result from STORE_FLAG_VALUE to 1; in this case, you can and should make operand 1’s predicate reject some operators in the cstoremode4 pattern, or remove the pattern altogether from the machine description.
- cbranchmode4
Conditional branch instruction combined with a compare instruction. Operand 0 is a comparison operator. Operand 1 and operand 2 are the first and second operands of the comparison, respectively. Operand 3 is the code_label to jump to.
- jump
A jump inside a function; an unconditional branch. Operand 0 is the code_label to jump to. This pattern name is mandatory on all machines.
- call
Subroutine call instruction returning no value. Operand 0 is the function to call; operand 1 is the number of bytes of arguments pushed as a const_int. Operand 2 is the result of calling the target hook TARGET_FUNCTION_ARG with the second argument arg yielding true for arg.end_marker_p (), in a call after all parameters have been passed to that hook. By default this is the first register beyond those used for arguments in the call, or NULL if all the argument-registers are used in the call.
On most machines, operand 2 is not actually stored into the RTL pattern. It is supplied for the sake of some RISC machines which need to put this information into the assembler code; they can put it in the RTL instead of operand 1.
Operand 0 should be a mem RTX whose address is the address of the function. Note, however, that this address can be a symbol_ref expression even if it would not be a legitimate memory address on the target machine. If it is also not a valid argument for a call instruction, the pattern for this operation should be a define_expand (see Defining RTL Sequences for Code Generation) that places the address into a register and uses that register in the call instruction.
- call_value
Subroutine call instruction returning a value. Operand 0 is the hard register in which the value is returned. There are three more operands, the same as the three operands of the call instruction (but with numbers increased by one).
Subroutines that return BLKmode objects use the call insn.
- call_pop, call_value_pop
Similar to call and call_value, except used if defined and if RETURN_POPS_ARGS is nonzero. They should emit a parallel that contains both the function call and a set to indicate the adjustment made to the frame pointer.
For machines where RETURN_POPS_ARGS can be nonzero, the use of these patterns increases the number of functions for which the frame pointer can be eliminated, if desired.
- untyped_call
Subroutine call instruction returning a value of any type. Operand 0 is the function to call; operand 1 is a memory location where the result of calling the function is to be stored; operand 2 is a parallel expression where each element is a set expression that indicates the saving of a function return value into the result block.
This instruction pattern should be defined to support __builtin_apply on machines where special instructions are needed to call a subroutine with arbitrary arguments or to save the value returned. This instruction pattern is required on machines that have multiple registers that can hold a return value (i.e. FUNCTION_VALUE_REGNO_P is true for more than one register).
- return
Subroutine return instruction. This instruction pattern name should be defined only if a single instruction can do all the work of returning from a function.
Like the movm patterns, this pattern is also used after the RTL generation phase. In this case it is to support machines where multiple instructions are usually needed to return from a function, but some class of functions only requires one instruction to implement a return. Normally, the applicable functions are those which do not need to save any registers or allocate stack space.
It is valid for this pattern to expand to an instruction using simple_return if no epilogue is required.
- simple_return
Subroutine return instruction. This instruction pattern name should be defined only if a single instruction can do all the work of returning from a function on a path where no epilogue is required. This pattern is very similar to the return instruction pattern, but it is emitted only by the shrink-wrapping optimization on paths where the function prologue has not been executed, and a function return should occur without any of the effects of the epilogue. Additional uses may be introduced on paths where both the prologue and the epilogue have executed.
For such machines, the condition specified in this pattern should only be true when reload_completed is nonzero and the function’s epilogue would only be a single instruction. For machines with register windows, the routine leaf_function_p may be used to determine if a register window push is required.
Machines that have conditional return instructions should define patterns such as
(define_insn ""
  [(set (pc)
        (if_then_else (match_operator 0 "comparison_operator"
                        [(reg:CC CC_REG) (const_int 0)])
                      (return)
                      (pc)))]
  "condition"
  "...")
where condition would normally be the same condition specified on the named return pattern.
- untyped_return
Untyped subroutine return instruction. This instruction pattern should be defined to support __builtin_return on machines where special instructions are needed to return a value of any type.
Operand 0 is a memory location where the result of calling a function with __builtin_apply is stored; operand 1 is a parallel expression where each element is a set expression that indicates the restoring of a function return value from the result block.
- nop
No-op instruction. This instruction pattern name should always be defined to output a no-op in assembler code. (const_int 0) will do as an RTL pattern.
- indirect_jump
An instruction to jump to an address which is operand zero. This pattern name is mandatory on all machines.
- casesi
Instruction to jump through a dispatch table, including bounds checking. This instruction takes five operands:
1. The index to dispatch on, which has mode SImode.
2. The lower bound for indices in the table, an integer constant.
3. The total range of indices in the table, that is, the largest index minus the smallest one (both inclusive).
4. A label that precedes the table itself.
5. A label to jump to if the index has a value outside the bounds.
The table is an addr_vec or addr_diff_vec inside of a jump_table_data. The number of elements in the table is one plus the difference between the upper bound and the lower bound.
- tablejump
Instruction to jump to a variable address. This is a low-level capability which can be used to implement a dispatch table when there is no casesi pattern.
This pattern requires two operands: the address or offset, and a label which should immediately precede the jump table. If the macro CASE_VECTOR_PC_RELATIVE evaluates to a nonzero value then the first operand is an offset which counts from the address of the table; otherwise, it is an absolute address to jump to. In either case, the first operand has mode Pmode.
The tablejump insn is always the last insn before the jump table it uses. Its assembler code normally has no need to use the second operand, but you should incorporate it in the RTL pattern so that the jump optimizer will not delete the table as unreachable code.
- doloop_end
Conditional branch instruction that decrements a register and jumps if the register is nonzero. Operand 0 is the register to decrement and test; operand 1 is the label to jump to if the register is nonzero. See Defining Looping Instruction Patterns.
This optional instruction pattern should be defined for machines with low-overhead looping instructions as the loop optimizer will try to modify suitable loops to utilize it. The target hook TARGET_CAN_USE_DOLOOP_P controls the conditions under which low-overhead loops can be used.
- doloop_begin
Companion instruction to doloop_end required for machines that need to perform some initialization, such as loading a special counter register. Operand 1 is the associated doloop_end pattern and operand 0 is the register that it decrements.
If initialization insns do not always need to be emitted, use a define_expand (see Defining RTL Sequences for Code Generation) and make it fail.
- canonicalize_funcptr_for_compare
Canonicalize the function pointer in operand 1 and store the result into operand 0.
Operand 0 is always a reg and has mode Pmode; operand 1 may be a reg, mem, symbol_ref, const_int, etc., and also has mode Pmode.
Canonicalization of a function pointer usually involves computing the address of the function which would be called if the function pointer were used in an indirect call.
Only define this pattern if function pointers on the target machine can have different values but still call the same function when used in an indirect call.
- save_stack_block save_stack_function save_stack_nonlocal restore_stack_block restore_stack_function restore_stack_nonlocal
Most machines save and restore the stack pointer by copying it to or from an object of mode Pmode. Do not define these patterns on such machines.
Some machines require special handling for stack pointer saves and restores. On those machines, define the patterns corresponding to the non-standard cases by using a define_expand (see Defining RTL Sequences for Code Generation) that produces the required insns. The three types of saves and restores are:
1. save_stack_block saves the stack pointer at the start of a block that allocates a variable-sized object, and restore_stack_block restores the stack pointer when the block is exited.
2. save_stack_function and restore_stack_function do a similar job for the outermost block of a function and are used when the function allocates variable-sized objects or calls alloca. Only the epilogue uses the restored stack pointer, allowing a simpler save or restore sequence on some machines.
3. save_stack_nonlocal is used in functions that contain labels branched to by nested functions. It saves the stack pointer in such a way that the inner function can use restore_stack_nonlocal to restore the stack pointer. The compiler generates code to restore the frame and argument pointer registers, but some machines require saving and restoring additional data such as register window information or stack backchains. Place insns in these patterns to save and restore any such required data.
When saving the stack pointer, operand 0 is the save area and operand 1 is the stack pointer. The mode used to allocate the save area defaults to Pmode but you can override that choice by defining the STACK_SAVEAREA_MODE macro (see Storage Layout). You must specify an integral mode, or VOIDmode if no save area is needed for a particular type of save (either because no save is needed or because a machine-specific save area can be used). For restore operations, operand 0 is the stack pointer and operand 1 is the save area. If save_stack_block is defined, operand 0 must not be VOIDmode since these saves can be arbitrarily nested.
A save area is a mem that is at a constant offset from virtual_stack_vars_rtx when the stack pointer is saved for use by nonlocal gotos and a reg in the other two cases.
- allocate_stack
Subtract (or add if STACK_GROWS_DOWNWARD is undefined) operand 1 from the stack pointer to create space for dynamically allocated data.
Store the resultant pointer to this space into operand 0. If you are allocating space from the main stack, do this by emitting a move insn to copy virtual_stack_dynamic_rtx to operand 0. If you are allocating the space elsewhere, generate code to copy the location of the space to operand 0. In the latter case, you must ensure this space gets freed when the corresponding space on the main stack is free.
Do not define this pattern if all that must be done is the subtraction. Some machines require other operations such as stack probes or maintaining the back chain. Define this pattern to emit those operations in addition to updating the stack pointer.
- check_stack
If stack checking (see Specifying How Stack Checking is Done) cannot be done on your system by probing the stack, define this pattern to perform the needed check and signal an error if the stack has overflowed. The single operand is the address in the stack farthest from the current stack pointer that you need to validate. Normally, on platforms where this pattern is needed, you would obtain the stack limit from a global or thread-specific variable or register.
- probe_stack_address
If stack checking (see Specifying How Stack Checking is Done) can be done on your system by probing the stack but without the need to actually access it, define this pattern and signal an error if the stack has overflowed. The single operand is the memory address in the stack that needs to be probed.
- probe_stack
If stack checking (see Specifying How Stack Checking is Done) can be done on your system by probing the stack but doing it with a ‘store zero’ instruction is not valid or optimal, define this pattern to do the probing differently and signal an error if the stack has overflowed. The single operand is the memory reference in the stack that needs to be probed.
- nonlocal_goto
Emit code to generate a non-local goto, e.g., a jump from one function to a label in an outer function. This pattern has four arguments, each representing a value to be used in the jump. The first argument is to be loaded into the frame pointer, the second is the address to branch to (code to dispatch to the actual label), the third is the address of a location where the stack is saved, and the last is the address of the label, to be placed in the location for the incoming static chain.
On most machines you need not define this pattern, since GCC will already generate the correct code, which is to load the frame pointer and static chain, restore the stack (using the restore_stack_nonlocal pattern, if defined), and jump indirectly to the dispatcher. You need only define this pattern if this code will not work on your machine.
- nonlocal_goto_receiver
This pattern, if defined, contains code needed at the target of a nonlocal goto after the code already generated by GCC. You will not normally need to define this pattern. A typical reason why you might need this pattern is if some value, such as a pointer to a global table, must be restored when the frame pointer is restored. Note that a nonlocal goto only occurs within a unit-of-translation, so a global table pointer that is shared by all functions of a given module need not be restored. There are no arguments.
- exception_receiver
This pattern, if defined, contains code needed at the site of an exception handler that isn’t needed at the site of a nonlocal goto. You will not normally need to define this pattern. A typical reason why you might need this pattern is if some value, such as a pointer to a global table, must be restored after control flow is branched to the handler of an exception. There are no arguments.
- builtin_setjmp_setup
This pattern, if defined, contains additional code needed to initialize the jmp_buf. You will not normally need to define this pattern. A typical reason why you might need this pattern is if some value, such as a pointer to a global table, must be restored, though it is preferable to recalculate the pointer value where possible (given the address of a label, for instance). The single argument is a pointer to the jmp_buf. Note that the buffer is five words long and that the first three are normally used by the generic mechanism.
- builtin_setjmp_receiver
This pattern, if defined, contains code needed at the site of a built-in setjmp that isn’t needed at the site of a nonlocal goto. You will not normally need to define this pattern. A typical reason why you might need this pattern is if some value, such as a pointer to a global table, must be restored. It takes one argument, which is the label to which builtin_longjmp transferred control; this pattern may be emitted at a small offset from that label.
- builtin_longjmp
This pattern, if defined, performs the entire action of the longjmp. You will not normally need to define this pattern unless you also define builtin_setjmp_setup. The single argument is a pointer to the jmp_buf.
- eh_return
This pattern, if defined, affects the way __builtin_eh_return, and thence the call frame exception handling library routines, are built. It is intended to handle non-trivial actions needed along the abnormal return path.
The address of the exception handler to which the function should return is passed as an operand to this pattern. It will normally need to be copied by the pattern to some special register or memory location. If the pattern needs to determine the location of the target call frame in order to do so, it may use EH_RETURN_STACKADJ_RTX, if defined; it will have already been assigned.
If this pattern is not defined, the default action will be to simply copy the return address to EH_RETURN_HANDLER_RTX. Either that macro or this pattern needs to be defined if call frame exception handling is to be used.
- prologue
This pattern, if defined, emits RTL for entry to a function. The function entry is responsible for setting up the stack frame, initializing the frame pointer register, saving callee saved registers, etc.
Using a prologue pattern is generally preferred over defining TARGET_ASM_FUNCTION_PROLOGUE to emit assembly code for the prologue.
The prologue pattern is particularly useful for targets which perform instruction scheduling.
- window_save
This pattern, if defined, emits RTL for a register window save. It should be defined if the target machine has register windows but the window events are decoupled from calls to subroutines. The canonical example is the SPARC architecture.
- epilogue
This pattern emits RTL for exit from a function. The function exit is responsible for deallocating the stack frame, restoring callee saved registers and emitting the return instruction.
Using an epilogue pattern is generally preferred over defining TARGET_ASM_FUNCTION_EPILOGUE to emit assembly code for the epilogue.
The epilogue pattern is particularly useful for targets which perform instruction scheduling or which have delay slots for their return instruction.
- sibcall_epilogue
This pattern, if defined, emits RTL for exit from a function without the final branch back to the calling function. This pattern will be emitted before any sibling call (aka tail call) sites.
The sibcall_epilogue pattern must not clobber any arguments used for parameter passing or any stack slots for arguments passed to the current function.
- trap
This pattern, if defined, signals an error, typically by causing some kind of signal to be raised.
- ctrapMM4
Conditional trap instruction. Operand 0 is a piece of RTL which performs a comparison, and operands 1 and 2 are the arms of the comparison. Operand 3 is the trap code, an integer.
A typical ctrap pattern looks like
(define_insn "ctrapsi4"
  [(trap_if (match_operator 0 "trap_operator"
              [(match_operand 1 "register_operand")
               (match_operand 2 "immediate_operand")])
            (match_operand 3 "const_int_operand" "i"))]
  ""
  "...")
- prefetch
This pattern, if defined, emits code for a non-faulting data prefetch instruction. Operand 0 is the address of the memory to prefetch. Operand 1 is a constant 1 if the prefetch is preparing for a write to the memory address, or a constant 0 otherwise. Operand 2 is the expected degree of temporal locality of the data and is a value between 0 and 3, inclusive; 0 means that the data has no temporal locality, so it need not be left in the cache after the access; 3 means that the data has a high degree of temporal locality and should be left in all levels of cache possible; 1 and 2 mean, respectively, a low or moderate degree of temporal locality.
Targets that do not support write prefetches or locality hints can ignore the values of operands 1 and 2.
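From the source-language side, the three operands correspond to the arguments of GCC's __builtin_prefetch (address, read/write flag, locality hint). The loop below is a minimal usage sketch; the function name sum_with_prefetch and the 16-element prefetch distance are arbitrary assumptions, not tuned values, and whether a prefetch instruction is actually emitted depends on the target.

```c
#include <assert.h>
#include <stddef.h>

/* Sums an array while hinting at data a few elements ahead.  */
long
sum_with_prefetch (const int *data, size_t n)
{
  assert (data != NULL || n == 0);
  long total = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* Second argument 0: prefetch for reading.  Third argument 3:
         high temporal locality.  The index is clamped so that the
         address expression itself always stays inside the array.  */
      size_t ahead = i + 16 < n ? i + 16 : n - 1;
      __builtin_prefetch (&data[ahead], 0, 3);
      total += data[i];
    }
  return total;
}
```

The second and third arguments must be compile-time constants; on targets without a prefetch pattern the builtin has no effect beyond evaluating its address argument.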
- blockage
This pattern defines a pseudo insn that prevents the instruction scheduler and other passes from moving instructions and using register equivalences across the boundary defined by the blockage insn. This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.
- memory_blockage
This pattern, if defined, represents a compiler memory barrier, and will be placed at points across which RTL passes may not propagate memory accesses. This instruction needs to read and write volatile BLKmode memory. It does not need to generate any machine instruction. If this pattern is not defined, the compiler falls back to emitting an instruction corresponding to asm volatile ("" ::: "memory").
- memory_barrier
If the target memory model is not fully synchronous, then this pattern should be defined to an instruction that orders both loads and stores before the instruction with respect to loads and stores after the instruction. This pattern has no operands.
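The ordering guarantee described above is the one a full fence provides in C11. The sketch below uses atomic_thread_fence with memory_order_seq_cst to publish a value through a flag; the names publish, consume, payload and ready are hypothetical, and how GCC maps such a fence onto a particular target pattern is not asserted here.

```c
#include <assert.h>
#include <stdatomic.h>

static int payload;        /* Plain data guarded by the flag below.  */
static atomic_int ready;   /* Zero until the payload is published.  */

/* Writer: store the data, issue a full fence, then raise the flag.  */
void
publish (int value)
{
  assert (value >= 0);
  payload = value;
  atomic_thread_fence (memory_order_seq_cst);
  atomic_store_explicit (&ready, 1, memory_order_relaxed);
}

/* Reader: returns -1 if nothing has been published yet.  */
int
consume (void)
{
  if (!atomic_load_explicit (&ready, memory_order_relaxed))
    return -1;
  atomic_thread_fence (memory_order_seq_cst);
  return payload;
}
```

The fence in the writer keeps the payload store ahead of the flag store, and the fence in the reader keeps the payload load behind the flag load, which is the "before" and "after" ordering the pattern is required to enforce.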
- speculation_barrier
If the target can support speculative execution, then this pattern should be defined to an instruction that will block subsequent execution until any prior speculation conditions has been resolved. The pattern must also ensure that the compiler cannot move memory operations past the barrier, so it needs to be an UNSPEC_VOLATILE pattern. The pattern has no operands.
If this pattern is not defined then the default expansion of __builtin_speculation_safe_value will emit a warning. You can suppress this warning by defining this pattern with a final condition of 0 (zero), which tells the compiler that a speculation barrier is not needed for this target.
- sync_compare_and_swapmode
This pattern, if defined, emits code for an atomic compare-and-swap operation. Operand 1 is the memory on which the atomic operation is performed. Operand 2 is the ‘old’ value to be compared against the current contents of the memory location. Operand 3 is the ‘new’ value to store in the memory if the compare succeeds. Operand 0 is the result of the operation; it should contain the contents of the memory before the operation. If the compare succeeds, this should obviously be a copy of operand 2.
This pattern must show that both operand 0 and operand 1 are modified.
This pattern must issue any memory barrier instructions such that all memory operations before the atomic operation occur before the atomic operation and all memory operations after the atomic operation occur after the atomic operation.
For targets where the success or failure of the compare-and-swap operation is available via the status flags, it is possible to avoid a separate compare operation and issue the subsequent branch or store-flag operation immediately after the compare-and-swap. To this end, GCC will look for a MODE_CC set in the output of sync_compare_and_swapmode; if the machine description includes such a set, the target should also define special cbranchcc4 and/or cstorecc4 instructions. GCC will then be able to take the destination of the MODE_CC set and pass it to the cbranchcc4 or cstorecc4 pattern as the first operand of the comparison (the second will be (const_int 0)).
For targets where the operating system may provide support for this operation via library calls, the sync_compare_and_swap_optab may be initialized to a function with the same interface as the __sync_val_compare_and_swap_n built-in. If the entire set of __sync builtins is supported via library calls, the target can initialize all of the optabs at once with init_sync_libfuncs. For the purposes of C++11 std::atomic::is_lock_free, it is assumed that these library calls do not use any kind of interruptible locking.
- sync_addmode, sync_submode, sync_iormode, sync_andmode, sync_xormode, sync_nandmode
These patterns emit code for an atomic operation on memory. Operand 0 is the memory on which the atomic operation is performed. Operand 1 is the second operand to the binary operator.
This pattern must issue any memory barrier instructions such that all memory operations before the atomic operation occur before the atomic operation and all memory operations after the atomic operation occur after the atomic operation.
If these patterns are not defined, the operation will be constructed from a compare-and-swap operation, if defined.
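In C, these patterns are typically reached through the legacy __sync built-ins when the result is not used. The sketch below also shows, in source form, the kind of compare-and-swap loop that can stand in when no dedicated pattern exists. It assumes a GCC-compatible compiler; counter_bump and add_via_cas are hypothetical names, and the loop illustrates the idea rather than reproducing the compiler's internal expansion.

```c
/* Atomic add whose result is ignored.  */
void
counter_bump (int *counter, int amount)
{
  (void) __sync_fetch_and_add (counter, amount);
}

/* The same effect built from compare-and-swap.  */
void
add_via_cas (int *counter, int amount)
{
  int seen;
  int old = *counter;
  do
    {
      seen = old;
      /* Returns the value found in memory; the swap took place only
         if that value equals `seen`.  */
      old = __sync_val_compare_and_swap (counter, seen, seen + amount);
    }
  while (old != seen);
}
```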
- sync_old_addmode, sync_old_submode, sync_old_iormode, sync_old_andmode, sync_old_xormode, sync_old_nandmode
These patterns emit code for an atomic operation on memory, and return the value that the memory contained before the operation. Operand 0 is the result value, operand 1 is the memory on which the atomic operation is performed, and operand 2 is the second operand to the binary operator.
This pattern must issue any memory barrier instructions such that all memory operations before the atomic operation occur before the atomic operation and all memory operations after the atomic operation occur after the atomic operation.
If these patterns are not defined, the operation will be constructed from a compare-and-swap operation, if defined.
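The "value before the operation" behaviour is what the __sync_fetch_and_op built-ins expose in C. A minimal sketch, assuming a GCC-compatible compiler; the function names are hypothetical.

```c
/* Sets the bits in `mask`; returns the flags as they were before.  */
unsigned
set_flag_bits (unsigned *flags, unsigned mask)
{
  return __sync_fetch_and_or (flags, mask);
}

/* Clears the bits in `mask`; returns the flags as they were before.  */
unsigned
clear_flag_bits (unsigned *flags, unsigned mask)
{
  return __sync_fetch_and_and (flags, ~mask);
}
```

Returning the old value lets a caller tell whether it was the one that changed a given bit.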
- sync_new_addmode, sync_new_submode, sync_new_iormode, sync_new_andmode, sync_new_xormode, sync_new_nandmode
These patterns are like their sync_old_op counterparts, except that they return the value that exists in the memory location after the operation, rather than before the operation.
- sync_lock_test_and_setmode
This pattern takes two forms, based on the capabilities of the target. In either case, operand 0 is the result of the operation, operand 1 is the memory on which the atomic operation is performed, and operand 2 is the value to set in the lock.
In the ideal case, this operation is an atomic exchange operation, in which the previous value in the memory operand is copied into the result operand, and the value operand is stored in the memory operand.
For less capable targets, any value operand that is not the constant 1 should be rejected with FAIL. In this case the target may use an atomic test-and-set bit operation. The result operand should contain 1 if the bit was previously set and 0 if the bit was previously clear. The true contents of the memory operand are implementation defined.
This pattern must issue any memory barrier instructions such that the pattern as a whole acts as an acquire barrier, that is, all memory operations after the pattern do not occur until the lock is acquired.
If this pattern is not defined, the operation will be constructed from a compare-and-swap operation, if defined.
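A common use of this operation is a simple try-lock. The sketch below is a minimal illustration, assuming a GCC-compatible compiler; toy_trylock and toy_unlock are hypothetical names. It stores only the constant 1, so it stays within what the less capable form of the pattern accepts, and it releases the lock with the companion built-in __sync_lock_release.

```c
/* Returns 1 if the lock was acquired, 0 if it was already held.
   The built-in acts as an acquire barrier.  */
int
toy_trylock (int *lock)
{
  return __sync_lock_test_and_set (lock, 1) == 0;
}

/* Writes 0 to the lock; the built-in acts as a release barrier.  */
void
toy_unlock (int *lock)
{
  __sync_lock_release (lock);
}
```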
- sync_lock_releasemode
This pattern, if defined, releases a lock set by sync_lock_test_and_setmode. Operand 0 is the memory that contains the lock; operand 1 is the value to store in the lock.
If the target doesn’t implement full semantics for sync_lock_test_and_setmode, any value operand which is not the constant 0 should be rejected with FAIL, and the true contents of the memory operand are implementation defined.
This pattern must issue any memory barrier instructions such that the pattern as a whole acts as a release barrier, that is, the lock is released only after all previous memory operations have completed.
If this pattern is not defined, then a memory_barrier pattern will be emitted, followed by a store of the value to the memory operand.
- atomic_compare_and_swapmode
This pattern, if defined, emits code for an atomic compare-and-swap operation with memory model semantics. Operand 2 is the memory on which the atomic operation is performed. Operand 0 is an output operand which is set to true or false based on whether the operation succeeded. Operand 1 is an output operand which is set to the contents of the memory before the operation was attempted. Operand 3 is the value that is expected to be in memory. Operand 4 is the value to put in memory if the expected value is found there. Operand 5 is set to 1 if this compare and swap is to be treated as a weak operation. Operand 6 is the memory model to be used if the operation is a success. Operand 7 is the memory model to be used if the operation fails.
If memory referred to in operand 2 contains the value in operand 3, then operand 4 is stored in memory pointed to by operand 2 and fencing based on the memory model in operand 6 is issued.
If memory referred to in operand 2 does not contain the value in operand 3, then fencing based on the memory model in operand 7 is issued.
If a target does not support weak compare-and-swap operations, or the port elects not to implement weak operations, the argument in operand 5 can be ignored. Note a strong implementation must be provided.
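The operand list above lines up closely with the arguments of the __atomic_compare_exchange_n built-in: the return value plays the role of operand 0, the updated expected value that of operand 1, and the remaining arguments follow operands 2 to 7 in order. A minimal sketch, assuming a GCC-compatible compiler; claim_slot is a hypothetical name, and the choice of memory models is only an example.

```c
#include <stdbool.h>

/* Strong compare-and-swap: acquire-release ordering on success,
   relaxed ordering on failure.  On failure, *expected is overwritten
   with the value actually found in memory.  */
bool
claim_slot (int *slot, int *expected, int desired)
{
  return __atomic_compare_exchange_n (slot, expected, desired,
                                      false /* weak = 0: strong */,
                                      __ATOMIC_ACQ_REL,
                                      __ATOMIC_RELAXED);
}
```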
If this pattern is not provided, the __atomic_compare_exchange built-in functions will utilize the legacy sync_compare_and_swap pattern with an __ATOMIC_SEQ_CST memory model.
- atomic_loadmode
This pattern implements an atomic load operation with memory model semantics. Operand 1 is the memory address being loaded from. Operand 0 is the result of the load. Operand 2 is the memory model to be used for the load operation.
If not present, the __atomic_load built-in function will either resort to a normal load with memory barriers, or a compare-and-swap operation if a normal load would not be atomic.
- atomic_storemode
This pattern implements an atomic store operation with memory model semantics. Operand 0 is the memory address being stored to. Operand 1 is the value to be written. Operand 2 is the memory model to be used for the operation.
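Atomic loads and stores with a memory model are most often seen as a release/acquire pair. The sketch below uses the __atomic_store_n and __atomic_load_n built-ins; it assumes a GCC-compatible compiler, and the variable and function names are hypothetical.

```c
static int message;
static int message_ready;

void
send_message (int value)
{
  message = value;
  /* Release store: the write to `message` becomes visible before the
     flag does.  */
  __atomic_store_n (&message_ready, 1, __ATOMIC_RELEASE);
}

/* Returns 1 and fills *out if a message has been published.  */
int
try_receive (int *out)
{
  /* Acquire load: pairs with the release store in send_message.  */
  if (!__atomic_load_n (&message_ready, __ATOMIC_ACQUIRE))
    return 0;
  *out = message;
  return 1;
}
```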
If not present, the __atomic_store built-in function will attempt to perform a normal store and surround it with any required memory fences. If the store would not be atomic, then an __atomic_exchange is attempted with the result being ignored.
- atomic_exchangemode
This pattern implements an atomic exchange operation with memory model semantics. Operand 1 is the memory location the operation is performed on. Operand 0 is an output operand which is set to the original value contained in the memory pointed to by operand 1. Operand 2 is the value to be stored. Operand 3 is the memory model to be used.
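In C the exchange is available as __atomic_exchange_n. A minimal sketch, assuming a GCC-compatible compiler; swap_in is a hypothetical name and the memory model is chosen only for illustration.

```c
/* Atomically replaces *slot with `next` and returns what was there.  */
int
swap_in (int *slot, int next)
{
  return __atomic_exchange_n (slot, next, __ATOMIC_ACQ_REL);
}
```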
If this pattern is not present, the built-in function __atomic_exchange will attempt to perform the operation with a compare-and-swap loop.
- atomic_addmode, atomic_submode, atomic_ormode, atomic_andmode, atomic_xormode, atomic_nandmode
These patterns emit code for an atomic operation on memory with memory model semantics. Operand 0 is the memory on which the atomic operation is performed. Operand 1 is the second operand to the binary operator. Operand 2 is the memory model to be used by the operation.
If these patterns are not defined, attempts will be made to use legacy sync patterns, or equivalent patterns which return a result. If none of these are available a compare-and-swap loop will be used.
- atomic_fetch_addmode, atomic_fetch_submode, atomic_fetch_ormode, atomic_fetch_andmode, atomic_fetch_xormode, atomic_fetch_nandmode
These patterns emit code for an atomic operation on memory with memory model semantics, and return the original value. Operand 0 is an output operand which contains the value of the memory location before the operation was performed. Operand 1 is the memory on which the atomic operation is performed. Operand 2 is the second operand to the binary operator. Operand 3 is the memory model to be used by the operation.
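The corresponding C built-ins are the __atomic_fetch_op family. The sketch below shows two of them, assuming a GCC-compatible compiler; the function names are hypothetical. The nand case is included because its stored value, ~(old & mask), is easy to misread.

```c
/* Hands out consecutive ticket numbers; returns the value held before
   the add.  */
unsigned
next_ticket (unsigned *dispenser)
{
  return __atomic_fetch_add (dispenser, 1, __ATOMIC_RELAXED);
}

/* Stores ~(old & mask) and returns the old value.  */
unsigned
fetch_nand (unsigned *word, unsigned mask)
{
  return __atomic_fetch_nand (word, mask, __ATOMIC_SEQ_CST);
}
```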
If these patterns are not defined, attempts will be made to use legacy sync patterns. If none of these are available a compare-and-swap loop will be used.
- atomic_add_fetchmode, atomic_sub_fetchmode, atomic_or_fetchmode, atomic_and_fetchmode, atomic_xor_fetchmode, atomic_nand_fetchmode
These patterns emit code for an atomic operation on memory with memory model semantics and return the result after the operation is performed. Operand 0 is an output operand which contains the value after the operation. Operand 1 is the memory on which the atomic operation is performed. Operand 2 is the second operand to the binary operator. Operand 3 is the memory model to be used by the operation.
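The relationship between the "value after" and "value before" families can be shown with the matching C built-ins. A minimal sketch, assuming a GCC-compatible compiler; both function names are hypothetical.

```c
/* Returns the value stored by the atomic add.  */
int
add_then_read (int *p, int v)
{
  return __atomic_add_fetch (p, v, __ATOMIC_SEQ_CST);
}

/* Same result, computed from the old value plus ordinary arithmetic
   performed outside the atomic operation.  */
int
add_then_read_via_fetch (int *p, int v)
{
  return __atomic_fetch_add (p, v, __ATOMIC_SEQ_CST) + v;
}
```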
If these patterns are not defined, attempts will be made to use legacy sync patterns, or equivalent patterns which return the result before the operation followed by the arithmetic operation required to produce the result. If none of these are available a compare-and-swap loop will be used.
- atomic_test_and_set
This pattern emits code for __builtin_atomic_test_and_set. Operand 0 is an output operand which is set to true if the previous contents of the byte were “set”, and false otherwise. Operand 1 is the QImode memory to be modified. Operand 2 is the memory model to be used.
The specific value that defines “set” is implementation defined, and is normally based on what is performed by the native atomic test and set instruction.
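From C, a byte-sized flag of this kind is driven with the __atomic_test_and_set and __atomic_clear built-ins. A minimal sketch, assuming a GCC-compatible compiler; the flag and function names are hypothetical. The code only relies on the boolean result, never on the particular byte value that means “set”.

```c
#include <stdbool.h>

static char guard;   /* byte-sized flag, initially clear */

/* Returns true for the caller that found the flag clear.  */
bool
enter_once (void)
{
  return !__atomic_test_and_set (&guard, __ATOMIC_ACQUIRE);
}

/* Clears the flag again.  */
void
leave (void)
{
  __atomic_clear (&guard, __ATOMIC_RELEASE);
}
```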
- atomic_bit_test_and_setmode atomic_bit_test_and_complementmode atomic_bit_test_and_resetmode
These patterns emit code for an atomic bitwise operation on memory with memory model semantics, and return the original value of the specified bit. Operand 0 is an output operand which contains the value of the specified bit from the memory location before the operation was performed. Operand 1 is the memory on which the atomic operation is performed. Operand 2 is the bit within the operand, starting with least significant bit. Operand 3 is the memory model to be used by the operation. Operand 4 is a flag - it is const1_rtx if operand 0 should contain the original value of the specified bit in the least significant bit of the operand, and const0_rtx if the bit should be in its original position in the operand.
atomic_bit_test_and_setmode atomically sets the specified bit after remembering its original value, atomic_bit_test_and_complementmode inverts the specified bit and atomic_bit_test_and_resetmode clears the specified bit.
If these patterns are not defined, attempts will be made to use atomic_fetch_ormode, atomic_fetch_xormode or atomic_fetch_andmode instruction patterns, or their sync counterparts. If none of these are available a compare-and-swap loop will be used.
- atomic_add_fetch_cmp_0mode atomic_sub_fetch_cmp_0mode atomic_and_fetch_cmp_0mode atomic_or_fetch_cmp_0mode atomic_xor_fetch_cmp_0mode
These patterns emit code for an atomic operation on memory with memory model semantics if the fetch result is used only in a comparison against zero. Operand 0 is an output operand which contains a boolean result of comparison of the value after the operation against zero. Operand 1 is the memory on which the atomic operation is performed. Operand 2 is the second operand to the binary operator. Operand 3 is the memory model to be used by the operation. Operand 4 is an integer holding the comparison code, one of EQ, NE, LT, GT, LE or GE.
If these patterns are not defined, attempts will be made to use a separate atomic operation and fetch pattern, followed by a comparison of the result against zero.
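A typical source form in which only the comparison against zero is used is a reference-count release. The sketch below assumes a GCC-compatible compiler; ref_release is a hypothetical name, and whether the compiler actually selects one of these patterns for it depends on the target and optimization level.

```c
#include <stdbool.h>

/* Drops one reference; returns true when the count reached zero.
   The new value itself is never used, only its comparison with 0.  */
bool
ref_release (int *refcount)
{
  return __atomic_sub_fetch (refcount, 1, __ATOMIC_ACQ_REL) == 0;
}
```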
- mem_thread_fence
This pattern emits code required to implement a thread fence with memory model semantics. Operand 0 is the memory model to be used.
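In C, a thread fence is requested with __atomic_thread_fence, whose single argument is the memory model. A minimal sketch, assuming a GCC-compatible compiler; the variable and function names are hypothetical.

```c
static int data_word;
static int data_flag;

void
producer_store (int value)
{
  data_word = value;
  /* Release fence: the store above is ordered before the flag store.  */
  __atomic_thread_fence (__ATOMIC_RELEASE);
  __atomic_store_n (&data_flag, 1, __ATOMIC_RELAXED);
}

/* Returns the published value, or -1 if nothing was published yet.  */
int
consumer_load (void)
{
  if (!__atomic_load_n (&data_flag, __ATOMIC_RELAXED))
    return -1;
  /* Acquire fence: pairs with the release fence in producer_store.  */
  __atomic_thread_fence (__ATOMIC_ACQUIRE);
  return data_word;
}
```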
For the __ATOMIC_RELAXED model no instructions need to be issued and this expansion is not invoked.
The compiler always emits a compiler memory barrier regardless of what expanding this pattern produced.
If this pattern is not defined, the compiler falls back to expanding the memory_barrier pattern, then to emitting a __sync_synchronize library call, and finally to just placing a compiler memory barrier.
- get_thread_pointermode set_thread_pointermode
These patterns emit code that reads/sets the TLS thread pointer. Currently, these are only needed if the target needs to support the __builtin_thread_pointer and __builtin_set_thread_pointer builtins.
The get/set patterns have a single output/input operand respectively, with mode intended to be Pmode.
- stack_protect_combined_set
This pattern, if defined, moves a ptr_mode value from an address whose declaration RTX is given in operand 1 to the memory in operand 0 without leaving the value in a register afterward. If several instructions are needed by the target to perform the operation (e.g. to load the address from a GOT entry then load the ptr_mode value and finally store it), it is the backend’s responsibility to ensure no intermediate result gets spilled. This is to avoid leaking the value some place that an attacker might use to rewrite the stack guard slot after having clobbered it.
If this pattern is not defined, then the address declaration is expanded first in the standard way and a stack_protect_set pattern is then generated to move the value from that address to the address in operand 0.
- stack_protect_set
This pattern, if defined, moves a ptr_mode value from the valid memory location in operand 1 to the memory in operand 0 without leaving the value in a register afterward. This is to avoid leaking the value some place that an attacker might use to rewrite the stack guard slot after having clobbered it.
Note: on targets where the addressing modes do not allow loading directly from the stack guard address, the address is expanded in a standard way first, which could cause some spills.
If this pattern is not defined, then a plain move pattern is generated.
- stack_protect_combined_test
This pattern, if defined, compares a ptr_mode value from an address whose declaration RTX is given in operand 1 with the memory in operand 0 without leaving the value in a register afterward and branches to operand 2 if the values were equal. If several instructions are needed by the target to perform the operation (e.g. to load the address from a GOT entry then load the ptr_mode value and finally store it), it is the backend’s responsibility to ensure no intermediate result gets spilled. This is to avoid leaking the value some place that an attacker might use to rewrite the stack guard slot after having clobbered it.
If this pattern is not defined, then the address declaration is expanded first in the standard way and a stack_protect_test pattern is then generated to compare the value from that address to the value at the memory in operand 0.
- stack_protect_test
This pattern, if defined, compares a ptr_mode value from the valid memory location in operand 1 with the memory in operand 0 without leaving the value in a register afterward and branches to operand 2 if the values were equal.
If this pattern is not defined, then a plain compare pattern and conditional branch pattern is used.
- clear_cache
This pattern, if defined, flushes the instruction cache for a region of memory. The region is bounded by the Pmode pointers in operand 0 (inclusive) and operand 1 (exclusive).
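At the source level the flush is requested with __builtin___clear_cache (begin, end), using the same inclusive-start, exclusive-end convention. A minimal sketch, assuming a GCC-compatible compiler; install_code is a hypothetical helper, and a real code generator would also need the destination to be executable memory.

```c
#include <string.h>

/* Copies freshly generated machine code into `dst`, then flushes the
   instruction cache for the range [dst, dst + len).  */
void
install_code (char *dst, const char *src, size_t len)
{
  memcpy (dst, src, len);
  /* First argument: inclusive start.  Second argument: exclusive end.  */
  __builtin___clear_cache (dst, dst + len);
}
```

On targets with coherent instruction and data caches the built-in may expand to nothing at all.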
If this pattern is not defined, a call to the library function __clear_cache is used.
- spaceshipm3
Initialize output operand 0 with mode of integer type to -1, 0, 1 or 2 if operand 1 with mode m compares less than operand 2, equal to operand 2, greater than operand 2 or is unordered with operand 2. m should be a scalar floating point mode.
This pattern is not allowed to FAIL.
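The result encoding can be written out as a small reference model in C. This is only a sketch of the semantics described above, not the pattern itself; spaceship_model is a hypothetical name and double stands in for the scalar floating point mode m.

```c
/* Reference model of the four possible results.  */
int
spaceship_model (double a, double b)
{
  if (a < b)
    return -1;
  if (a == b)
    return 0;
  if (a > b)
    return 1;
  return 2;   /* unordered: at least one operand is a NaN */
}
```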