Basic PowerPC Built-in Functions#
This section describes PowerPC built-in functions that do not require the inclusion of any special header files to declare prototypes or provide macro definitions. The sections that follow describe additional PowerPC built-in functions.
Basic PowerPC Built-in Functions Available on all Configurations#
-
void __builtin_cpu_init(void)#
This function is a
nop
on the PowerPC platform and is included solely to maintain API compatibility with the x86 builtins.
-
int __builtin_cpu_is(const char *cpuname)#
This function returns a value of
1
if the run-time CPU is of typecpuname
and returns0
otherwiseThe
__builtin_cpu_is
function requires GLIBC 2.23 or newer which exports the hardware capability bits. GCC defines the macro__BUILTIN_CPU_SUPPORTS__
if the__builtin_cpu_supports
built-in function is fully supported.If GCC was configured to use a GLIBC before 2.23, the built-in function
__builtin_cpu_is
always returns a 0 and the compiler issues a warning.The following CPU names can be detected:
power10
IBM POWER10 Server CPU.
power9
IBM POWER9 Server CPU.
power8
IBM POWER8 Server CPU.
power7
IBM POWER7 Server CPU.
power6x
IBM POWER6 Server CPU (RAW mode).
power6
IBM POWER6 Server CPU (Architected mode).
power5+
IBM POWER5+ Server CPU.
power5
IBM POWER5 Server CPU.
ppc970
IBM 970 Server CPU (ie, Apple G5).
power4
IBM POWER4 Server CPU.
ppca2
IBM A2 64-bit Embedded CPU
ppc476
IBM PowerPC 476FP 32-bit Embedded CPU.
ppc464
IBM PowerPC 464 32-bit Embedded CPU.
ppc440
PowerPC 440 32-bit Embedded CPU.
ppc405
PowerPC 405 32-bit Embedded CPU.
ppc-cell-be
IBM PowerPC Cell Broadband Engine Architecture CPU.
Here is an example:
#ifdef __BUILTIN_CPU_SUPPORTS__ if (__builtin_cpu_is ("power8")) { do_power8 (); // POWER8 specific implementation. } else #endif { do_generic (); // Generic implementation. }
-
int __builtin_cpu_supports(const char *feature)#
This function returns a value of
1
if the run-time CPU supports the HWCAP featurefeature
and returns0
otherwise.The
__builtin_cpu_supports
function requires GLIBC 2.23 or newer which exports the hardware capability bits. GCC defines the macro__BUILTIN_CPU_SUPPORTS__
if the__builtin_cpu_supports
built-in function is fully supported.If GCC was configured to use a GLIBC before 2.23, the built-in function
__builtin_cpu_supports
always returns a 0 and the compiler issues a warning.The following features can be detected:
4xxmac
4xx CPU has a Multiply Accumulator.
altivec
CPU has a SIMD/Vector Unit.
arch_2_05
CPU supports ISA 2.05 (eg, POWER6)
arch_2_06
CPU supports ISA 2.06 (eg, POWER7)
arch_2_07
CPU supports ISA 2.07 (eg, POWER8)
arch_3_00
CPU supports ISA 3.0 (eg, POWER9)
arch_3_1
CPU supports ISA 3.1 (eg, POWER10)
archpmu
CPU supports the set of compatible performance monitoring events.
booke
CPU supports the Embedded ISA category.
cellbe
CPU has a CELL broadband engine.
darn
CPU supports the
darn
(deliver a random number) instruction.dfp
CPU has a decimal floating point unit.
dscr
CPU supports the data stream control register.
ebb
CPU supports event base branching.
efpdouble
CPU has a SPE double precision floating point unit.
efpsingle
CPU has a SPE single precision floating point unit.
fpu
CPU has a floating point unit.
htm
CPU has hardware transaction memory instructions.
htm-nosc
Kernel aborts hardware transactions when a syscall is made.
htm-no-suspend
CPU supports hardware transaction memory but does not support the
tsuspend.
instruction.ic_snoop
CPU supports icache snooping capabilities.
ieee128
CPU supports 128-bit IEEE binary floating point instructions.
isel
CPU supports the integer select instruction.
mma
CPU supports the matrix-multiply assist instructions.
mmu
CPU has a memory management unit.
notb
CPU does not have a timebase (eg, 601 and 403gx).
pa6t
CPU supports the PA Semi 6T CORE ISA.
power4
CPU supports ISA 2.00 (eg, POWER4)
power5
CPU supports ISA 2.02 (eg, POWER5)
power5+
CPU supports ISA 2.03 (eg, POWER5+)
power6x
CPU supports ISA 2.05 (eg, POWER6) extended opcodes mffgpr and mftgpr.
ppc32
CPU supports 32-bit mode execution.
ppc601
CPU supports the old POWER ISA (eg, 601)
ppc64
CPU supports 64-bit mode execution.
ppcle
CPU supports a little-endian mode that uses address swizzling.
scv
Kernel supports system call vectored.
smt
CPU support simultaneous multi-threading.
spe
CPU has a signal processing extension unit.
tar
CPU supports the target address register.
true_le
CPU supports true little-endian mode.
ucache
CPU has unified I/D cache.
vcrypto
CPU supports the vector cryptography instructions.
vsx
CPU supports the vector-scalar extension.
Here is an example:
#ifdef __BUILTIN_CPU_SUPPORTS__ if (__builtin_cpu_supports ("fpu")) { asm("fadd %0,%1,%2" : "=d"(dst) : "d"(src1), "d"(src2)); } else #endif { dst = __fadd (src1, src2); // Software FP addition function. }
The following built-in functions are also available on all PowerPC processors:
uint64_t __builtin_ppc_get_timebase ();
unsigned long __builtin_ppc_mftb ();
double __builtin_unpack_ibm128 (__ibm128, int);
__ibm128 __builtin_pack_ibm128 (double, double);
double __builtin_mffs (void);
void __builtin_mtfsf (const int, double);
void __builtin_mtfsb0 (const int);
void __builtin_mtfsb1 (const int);
void __builtin_set_fpscr_rn (int);
The __builtin_ppc_get_timebase
and __builtin_ppc_mftb
functions generate instructions to read the Time Base Register. The
__builtin_ppc_get_timebase
function may generate multiple
instructions and always returns the 64 bits of the Time Base Register.
The __builtin_ppc_mftb
function always generates one instruction and
returns the Time Base Register value as an unsigned long, throwing away
the most significant word on 32-bit environments. The __builtin_mffs
return the value of the FPSCR register. Note, ISA 3.0 supports the
__builtin_mffsl()
which permits software to read the control and
non-sticky status bits in the FSPCR without the higher latency associated with
accessing the sticky status bits. The __builtin_mtfsf
takes a constant
8-bit integer field mask and a double precision floating point argument
and generates the mtfsf
(extended mnemonic) instruction to write new
values to selected fields of the FPSCR. The
__builtin_mtfsb0
and __builtin_mtfsb1
take the bit to change
as an argument. The valid bit range is between 0 and 31. The builtins map to
the mtfsb0
and mtfsb1
instructions which take the argument and
add 32. Hence these instructions only modify the FPSCR[32:63] bits by
changing the specified bit to a zero or one respectively. The
__builtin_set_fpscr_rn
builtin allows changing both of the floating
point rounding mode bits. The argument is a 2-bit value. The argument can
either be a const int
or stored in a variable. The builtin uses
the ISA 3.0
instruction mffscrn
if available, otherwise it reads the FPSCR, masks
the current rounding mode bits out and OR’s in the new value.
Basic PowerPC Built-in Functions Available on ISA 2.05#
The basic built-in functions described in this section are
available on the PowerPC family of processors starting with ISA 2.05
or later. Unless specific options are explicitly disabled on the
command line, specifying option -mcpu=power6
has the effect of
enabling the -mpowerpc64
, -mpowerpc-gpopt
,
-mpowerpc-gfxopt
, -mmfcrf
, -mpopcntb
,
-mfprnd
, -mcmpb
, -mhard-dfp
, and
-mrecip-precision
options. Specify the
-maltivec
option explicitly in
combination with the above options if desired.
The following functions require option -mcmpb
.
unsigned long long __builtin_cmpb (unsigned long long int, unsigned long long int);
unsigned int __builtin_cmpb (unsigned int, unsigned int);
The __builtin_cmpb
function
performs a byte-wise compare on the contents of its two arguments,
returning the result of the byte-wise comparison as the returned
value. For each byte comparison, the corresponding byte of the return
value holds 0xff if the input bytes are equal and 0 if the input bytes
are not equal. If either of the arguments to this built-in function
is wider than 32 bits, the function call expands into the form that
expects unsigned long long int
arguments
which is only available on 64-bit targets.
The following built-in functions are available
when hardware decimal floating point
(-mhard-dfp
) is available:
void __builtin_set_fpscr_drn(int);
_Decimal64 __builtin_ddedpd (int, _Decimal64);
_Decimal128 __builtin_ddedpdq (int, _Decimal128);
_Decimal64 __builtin_denbcd (int, _Decimal64);
_Decimal128 __builtin_denbcdq (int, _Decimal128);
_Decimal64 __builtin_diex (long long, _Decimal64);
_Decimal128 _builtin_diexq (long long, _Decimal128);
_Decimal64 __builtin_dscli (_Decimal64, int);
_Decimal128 __builtin_dscliq (_Decimal128, int);
_Decimal64 __builtin_dscri (_Decimal64, int);
_Decimal128 __builtin_dscriq (_Decimal128, int);
long long __builtin_dxex (_Decimal64);
long long __builtin_dxexq (_Decimal128);
_Decimal128 __builtin_pack_dec128 (unsigned long long, unsigned long long);
unsigned long long __builtin_unpack_dec128 (_Decimal128, int);
The __builtin_set_fpscr_drn builtin allows changing the three decimal floating point rounding mode bits. The argument is a 3-bit value. The argument can either be a const int or the value can be stored in a variable. The builtin uses the ISA 3.0 instruction mffscdrn if available. Otherwise the builtin reads the FPSCR, masks the current decimal rounding mode bits out and OR’s in the new value.
The following functions require -mhard-float
,
-mpowerpc-gfxopt
, and -mpopcntb
options.
double __builtin_recipdiv (double, double);
float __builtin_recipdivf (float, float);
double __builtin_rsqrt (double);
float __builtin_rsqrtf (float);
The vec_rsqrt
, __builtin_rsqrt
, and
__builtin_rsqrtf
functions generate multiple instructions to
implement the reciprocal sqrt functionality using reciprocal sqrt
estimate instructions.
The __builtin_recipdiv
, and __builtin_recipdivf
functions generate multiple instructions to implement division using
the reciprocal estimate instructions.
The following functions require -mhard-float
and
-mmultiple
options.
The __builtin_unpack_longdouble
function takes a
long double
argument and a compile time constant of 0 or 1. If
the constant is 0, the first double
within the
long double
is returned, otherwise the second double
is returned. The __builtin_unpack_longdouble
function is only
available if long double
uses the IBM extended double
representation.
The __builtin_pack_longdouble
function takes two double
arguments and returns a long double
value that combines the two
arguments. The __builtin_pack_longdouble
function is only
available if long double
uses the IBM extended double
representation.
The __builtin_unpack_ibm128
function takes a __ibm128
argument and a compile time constant of 0 or 1. If the constant is 0,
the first double
within the __ibm128
is returned,
otherwise the second double
is returned.
The __builtin_pack_ibm128
function takes two double
arguments and returns a __ibm128
value that combines the two
arguments.
Additional built-in functions are available for the 64-bit PowerPC
family of processors, for efficient use of 128-bit floating point
(__float128
) values.
Basic PowerPC Built-in Functions Available on ISA 2.06#
The basic built-in functions described in this section are
available on the PowerPC family of processors starting with ISA 2.05
or later. Unless specific options are explicitly disabled on the
command line, specifying option -mcpu=power7
has the effect of
enabling all the same options as for -mcpu=power6
in
addition to the -maltivec
, -mpopcntd
, and
-mvsx
options.
The following basic built-in functions require -mpopcntd
:
unsigned int __builtin_addg6s (unsigned int, unsigned int);
long long __builtin_bpermd (long long, long long);
unsigned int __builtin_cbcdtd (unsigned int);
unsigned int __builtin_cdtbcd (unsigned int);
long long __builtin_divde (long long, long long);
unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
int __builtin_divwe (int, int);
unsigned int __builtin_divweu (unsigned int, unsigned int);
vector __int128 __builtin_pack_vector_int128 (long long, long long);
void __builtin_rs6000_speculation_barrier (void);
long long __builtin_unpack_vector_int128 (vector __int128, signed char);
Of these, the __builtin_divde
and __builtin_divdeu
functions
require a 64-bit environment.
The following basic built-in functions, which are also supported on
x86 targets, require -mfloat128
.
__float128 __builtin_fabsq (__float128);
__float128 __builtin_copysignq (__float128, __float128);
__float128 __builtin_infq (void);
__float128 __builtin_huge_valq (void);
__float128 __builtin_nanq (void);
__float128 __builtin_nansq (void);
__float128 __builtin_sqrtf128 (__float128);
__float128 __builtin_fmaf128 (__float128, __float128, __float128);
Basic PowerPC Built-in Functions Available on ISA 2.07#
The basic built-in functions described in this section are
available on the PowerPC family of processors starting with ISA 2.07
or later. Unless specific options are explicitly disabled on the
command line, specifying option -mcpu=power8
has the effect of
enabling all the same options as for -mcpu=power7
in
addition to the -mpower8-fusion
, -mpower8-vector
,
-mcrypto
, -mhtm
, -mquad-memory
, and
-mquad-memory-atomic
options.
This section intentionally empty.
Basic PowerPC Built-in Functions Available on ISA 3.0#
The basic built-in functions described in this section are
available on the PowerPC family of processors starting with ISA 3.0
or later. Unless specific options are explicitly disabled on the
command line, specifying option -mcpu=power9
has the effect of
enabling all the same options as for -mcpu=power8
in
addition to the -misel
option.
The following built-in functions are available on Linux 64-bit systems
that use the ISA 3.0 instruction set (-mcpu=power9
):
-
__float128 __builtin_addf128_round_to_odd(__float128, __float128)#
Perform a 128-bit IEEE floating point add using round to odd as the rounding mode.
-
__float128 __builtin_subf128_round_to_odd(__float128, __float128)#
Perform a 128-bit IEEE floating point subtract using round to odd as the rounding mode.
-
__float128 __builtin_mulf128_round_to_odd(__float128, __float128)#
Perform a 128-bit IEEE floating point multiply using round to odd as the rounding mode.
-
__float128 __builtin_divf128_round_to_odd(__float128, __float128)#
Perform a 128-bit IEEE floating point divide using round to odd as the rounding mode.
-
__float128 __builtin_sqrtf128_round_to_odd(__float128)#
Perform a 128-bit IEEE floating point square root using round to odd as the rounding mode.
-
__float128 __builtin_fmaf128_round_to_odd(__float128, __float128, __float128)#
Perform a 128-bit IEEE floating point fused multiply and add operation using round to odd as the rounding mode.
-
double __builtin_truncf128_round_to_odd(__float128)#
Convert a 128-bit IEEE floating point value to
double
using round to odd as the rounding mode.
The following additional built-in functions are also available for the PowerPC family of processors, starting with ISA 3.0 or later:
long long __builtin_darn (void);
long long __builtin_darn_raw (void);
int __builtin_darn_32 (void);
The __builtin_darn
and __builtin_darn_raw
functions require a
64-bit environment supporting ISA 3.0 or later.
The __builtin_darn
function provides a 64-bit conditioned
random number. The __builtin_darn_raw
function provides a
64-bit raw random number. The __builtin_darn_32
function
provides a 32-bit conditioned random number.
The following additional built-in functions are also available for the PowerPC family of processors, starting with ISA 3.0 or later:
int __builtin_byte_in_set (unsigned char u, unsigned long long set);
int __builtin_byte_in_range (unsigned char u, unsigned int range);
int __builtin_byte_in_either_range (unsigned char u, unsigned int ranges);
int __builtin_dfp_dtstsfi_lt (unsigned int comparison, _Decimal64 value);
int __builtin_dfp_dtstsfi_lt (unsigned int comparison, _Decimal128 value);
int __builtin_dfp_dtstsfi_lt_dd (unsigned int comparison, _Decimal64 value);
int __builtin_dfp_dtstsfi_lt_td (unsigned int comparison, _Decimal128 value);
int __builtin_dfp_dtstsfi_gt (unsigned int comparison, _Decimal64 value);
int __builtin_dfp_dtstsfi_gt (unsigned int comparison, _Decimal128 value);
int __builtin_dfp_dtstsfi_gt_dd (unsigned int comparison, _Decimal64 value);
int __builtin_dfp_dtstsfi_gt_td (unsigned int comparison, _Decimal128 value);
int __builtin_dfp_dtstsfi_eq (unsigned int comparison, _Decimal64 value);
int __builtin_dfp_dtstsfi_eq (unsigned int comparison, _Decimal128 value);
int __builtin_dfp_dtstsfi_eq_dd (unsigned int comparison, _Decimal64 value);
int __builtin_dfp_dtstsfi_eq_td (unsigned int comparison, _Decimal128 value);
int __builtin_dfp_dtstsfi_ov (unsigned int comparison, _Decimal64 value);
int __builtin_dfp_dtstsfi_ov (unsigned int comparison, _Decimal128 value);
int __builtin_dfp_dtstsfi_ov_dd (unsigned int comparison, _Decimal64 value);
int __builtin_dfp_dtstsfi_ov_td (unsigned int comparison, _Decimal128 value);
double __builtin_mffsl(void);
The __builtin_byte_in_set
function requires a
64-bit environment supporting ISA 3.0 or later. This function returns
a non-zero value if and only if its u
argument exactly equals one of
the eight bytes contained within its 64-bit set
argument.
The __builtin_byte_in_range
and
__builtin_byte_in_either_range
require an environment
supporting ISA 3.0 or later. For these two functions, the
range
argument is encoded as 4 bytes, organized as
hi_1:lo_1:hi_2:lo_2
.
The __builtin_byte_in_range
function returns a
non-zero value if and only if its u
argument is within the
range bounded between lo_2
and hi_2
inclusive.
The __builtin_byte_in_either_range
function returns non-zero if
and only if its u
argument is within either the range bounded
between lo_1
and hi_1
inclusive or the range bounded
between lo_2
and hi_2
inclusive.
The __builtin_dfp_dtstsfi_lt
function returns a non-zero value
if and only if the number of signficant digits of its value
argument
is less than its comparison
argument. The
__builtin_dfp_dtstsfi_lt_dd
and
__builtin_dfp_dtstsfi_lt_td
functions behave similarly, but
require that the type of the value
argument be
__Decimal64
and __Decimal128
respectively.
The __builtin_dfp_dtstsfi_gt
function returns a non-zero value
if and only if the number of signficant digits of its value
argument
is greater than its comparison
argument. The
__builtin_dfp_dtstsfi_gt_dd
and
__builtin_dfp_dtstsfi_gt_td
functions behave similarly, but
require that the type of the value
argument be
__Decimal64
and __Decimal128
respectively.
The __builtin_dfp_dtstsfi_eq
function returns a non-zero value
if and only if the number of signficant digits of its value
argument
equals its comparison
argument. The
__builtin_dfp_dtstsfi_eq_dd
and
__builtin_dfp_dtstsfi_eq_td
functions behave similarly, but
require that the type of the value
argument be
__Decimal64
and __Decimal128
respectively.
The __builtin_dfp_dtstsfi_ov
function returns a non-zero value
if and only if its value
argument has an undefined number of
significant digits, such as when value
is an encoding of NaN
.
The __builtin_dfp_dtstsfi_ov_dd
and
__builtin_dfp_dtstsfi_ov_td
functions behave similarly, but
require that the type of the value
argument be
__Decimal64
and __Decimal128
respectively.
The __builtin_mffsl
uses the ISA 3.0 mffsl
instruction to read
the FPSCR. The instruction is a lower latency version of the mffs
instruction. If the mffsl
instruction is not available, then the
builtin uses the older mffs
instruction to read the FPSCR.
Basic PowerPC Built-in Functions Available on ISA 3.1#
The basic built-in functions described in this section are
available on the PowerPC family of processors starting with ISA 3.1.
Unless specific options are explicitly disabled on the
command line, specifying option -mcpu=power10
has the effect of
enabling all the same options as for -mcpu=power9
.
The following built-in functions are available on Linux 64-bit systems
that use a future architecture instruction set (-mcpu=power10
):
-
unsigned long long __builtin_cfuged(unsigned long long, unsigned long long int)#
Perform a 64-bit centrifuge operation, as if implemented by the
cfuged
instruction.
-
unsigned long long __builtin_cntlzdm(unsigned long long int, unsigned long long int)#
Perform a 64-bit count leading zeros operation under mask, as if implemented by the
cntlzdm
instruction.
-
unsigned long long __builtin_cnttzdm(unsigned long long, unsigned long long)#
Perform a 64-bit count trailing zeros operation under mask, as if implemented by the
cnttzdm
instruction.
-
unsigned long long __builtin_pdepd(unsigned long long int, unsigned long long int)#
Perform a 64-bit parallel bits deposit operation, as if implemented by the
pdepd
instruction.
-
unsigned long long __builtin_pextd(unsigned long long, unsigned long long)#
Perform a 64-bit parallel bits extract operation, as if implemented by the
pextd
instruction.
vector signed __int128 vsx_xl_sext (signed long long, signed char *)
vector signed __int128 vsx_xl_sext (signed long long, signed short *)
vector signed __int128 vsx_xl_sext (signed long long, signed int *)
vector signed __int128 vsx_xl_sext (signed long long, signed long long *)
vector unsigned __int128 vsx_xl_zext (signed long long, unsigned char *)
vector unsigned __int128 vsx_xl_zext (signed long long, unsigned short *)
vector unsigned __int128 vsx_xl_zext (signed long long, unsigned int *)
vector unsigned __int128 vsx_xl_zext (signed long long, unsigned long long *)
Load (and sign extend) to an __int128 vector, as if implemented by the ISA 3.1
lxvrbx
, lxvrhx
, lxvrwx
, and lxvrdx
instructions.
void vec_xst_trunc (vector signed __int128, signed long long, signed char *)
void vec_xst_trunc (vector signed __int128, signed long long, signed short *)
void vec_xst_trunc (vector signed __int128, signed long long, signed int *)
void vec_xst_trunc (vector signed __int128, signed long long, signed long long *)
void vec_xst_trunc (vector unsigned __int128, signed long long, unsigned char *)
void vec_xst_trunc (vector unsigned __int128, signed long long, unsigned short *)
void vec_xst_trunc (vector unsigned __int128, signed long long, unsigned int *)
void vec_xst_trunc (vector unsigned __int128, signed long long, unsigned long long *)
Truncate and store the rightmost element of a vector, as if implemented by the
ISA 3.1 stxvrbx
, stxvrhx
, stxvrwx
, and stxvrdx
instructions.