Operands#

In general, expressions in GIMPLE consist of an operation and the appropriate number of simple operands; these operands must either be a GIMPLE rvalue (is_gimple_val), i.e. a constant or a register variable. More complex operands are factored out into temporaries, so that

a = b + c + d

becomes

T1 = b + c;
a = T1 + d;

The same rule holds for arguments to a GIMPLE_CALL.

The target of an assignment is usually a variable, but can also be a MEM_REF or a compound lvalue as described below.

Compound Expressions#

The left-hand side of a C comma expression is simply moved into a separate statement.

Compound Lvalues#

Currently compound lvalues involving array and structure field references are not broken down; an expression like a.b[2] = 42 is not reduced any further (though complex array subscripts are). This restriction is a workaround for limitations in later optimizers; if we were to convert this to

T1 = &a.b;
T1[2] = 42;

alias analysis would not remember that the reference to T1[2] came by way of a.b, so it would think that the assignment could alias another member of a ; this broke struct-alias-1.c. Future optimizer improvements may make this limitation unnecessary.

Conditional Expressions#

A C ?: expression is converted into an if statement with each branch assigning to the same temporary. So,

a = b ? c : d;

becomes

if (b == 1)
  T1 = c;
else
  T1 = d;
a = T1;

The GIMPLE level if-conversion pass re-introduces ?: expression, if appropriate. It is used to vectorize loops with conditions using vector conditional operations.

Note that in GIMPLE, if statements are represented using GIMPLE_COND, as described below.

Logical Operators#

Except when they appear in the condition operand of a GIMPLE_COND, logical ‘and’ and ‘or’ operators are simplified as follows: a = b && c becomes

T1 = (bool)b;
if (T1 == true)
  T1 = (bool)c;
a = T1;

Note that T1 in this example cannot be an expression temporary, because it has two different assignments.

Manipulating operands#

All gimple operands are of type tree. But only certain types of trees are allowed to be used as operand tuples. Basic validation is controlled by the function get_gimple_rhs_class, which given a tree code, returns an enum with the following values of type enum gimple_rhs_class

  • GIMPLE_INVALID_RHS The tree cannot be used as a GIMPLE operand.

  • GIMPLE_TERNARY_RHS The tree is a valid GIMPLE ternary operation.

  • GIMPLE_BINARY_RHS The tree is a valid GIMPLE binary operation.

  • GIMPLE_UNARY_RHS The tree is a valid GIMPLE unary operation.

  • GIMPLE_SINGLE_RHS The tree is a single object, that cannot be split into simpler operands (for instance, SSA_NAME, VAR_DECL, COMPONENT_REF, etc).

    This operand class also acts as an escape hatch for tree nodes that may be flattened out into the operand vector, but would need more than two slots on the RHS. For instance, a COND_EXPR expression of the form (a op b) ? x : y could be flattened out on the operand vector using 4 slots, but it would also require additional processing to distinguish c = a op b from c = a op b ? x : y. Something similar occurs with ASSERT_EXPR. In time, these special case tree expressions should be flattened into the operand vector.

For tree nodes in the categories GIMPLE_TERNARY_RHS, GIMPLE_BINARY_RHS and GIMPLE_UNARY_RHS, they cannot be stored inside tuples directly. They first need to be flattened and separated into individual components. For instance, given the GENERIC expression

a = b + c

its tree representation is:

MODIFY_EXPR <VAR_DECL  <a>, PLUS_EXPR <VAR_DECL <b>, VAR_DECL <c>>>

In this case, the GIMPLE form for this statement is logically identical to its GENERIC form but in GIMPLE, the PLUS_EXPR on the RHS of the assignment is not represented as a tree, instead the two operands are taken out of the PLUS_EXPR sub-tree and flattened into the GIMPLE tuple as follows:

GIMPLE_ASSIGN <PLUS_EXPR, VAR_DECL <a>, VAR_DECL <b>, VAR_DECL <c>>

Operand vector allocation#

The operand vector is stored at the bottom of the three tuple structures that accept operands. This means, that depending on the code of a given statement, its operand vector will be at different offsets from the base of the structure. To access tuple operands use the following accessors

unsigned gimple_num_ops(gimple g)

Returns the number of operands in statement G.

tree gimple_op(gimple g, unsigned i)

Returns operand I from statement G.

tree *gimple_ops(gimple g)

Returns a pointer into the operand vector for statement G. This is computed using an internal table called gimple_ops_offset_ []. This table is indexed by the gimple code of G.

When the compiler is built, this table is filled-in using the sizes of the structures used by each statement code defined in gimple.def. Since the operand vector is at the bottom of the structure, for a gimple code C the offset is computed as sizeof (struct-of C) - sizeof (tree).

This mechanism adds one memory indirection to every access when using gimple_op (), if this becomes a bottleneck, a pass can choose to memoize the result from gimple_ops () and use that to access the operands.

Operand validation#

When adding a new operand to a gimple statement, the operand will be validated according to what each tuple accepts in its operand vector. These predicates are called by the gimple_name_set_...(). Each tuple will use one of the following predicates (Note, this list is not exhaustive):

bool is_gimple_val(tree t)#

Returns true if t is a “GIMPLE value”, which are all the non-addressable stack variables (variables for which is_gimple_reg returns true) and constants (expressions for which is_gimple_min_invariant returns true).

bool is_gimple_addressable(tree t)#

Returns true if t is a symbol or memory reference whose address can be taken.

bool is_gimple_asm_val(tree t)#

Similar to is_gimple_val but it also accepts hard registers.

bool is_gimple_call_addr(tree t)#

Return true if t is a valid expression to use as the function called by a GIMPLE_CALL.

bool is_gimple_mem_ref_addr(tree t)#

Return true if t is a valid expression to use as first operand of a MEM_REF expression.

bool is_gimple_constant(tree t)#

Return true if t is a valid gimple constant.

bool is_gimple_min_invariant(tree t)#

Return true if t is a valid minimal invariant. This is different from constants, in that the specific value of t may not be known at compile time, but it is known that it doesn’t change (e.g., the address of a function local variable).

bool is_gimple_ip_invariant(tree t)#

Return true if t is an interprocedural invariant. This means that t is a valid invariant in all functions (e.g. it can be an address of a global variable but not of a local one).

bool is_gimple_ip_invariant_address(tree t)#

Return true if t is an ADDR_EXPR that does not change once the program is running (and which is valid in all functions).

Statement validation#

bool is_gimple_assign(gimple g)#

Return true if the code of g is GIMPLE_ASSIGN.

bool is_gimple_call(gimple g)#

Return true if the code of g is GIMPLE_CALL.

bool is_gimple_debug(gimple g)#

Return true if the code of g is GIMPLE_DEBUG.

bool gimple_assign_cast_p(const_gimple g)#

Return true if g is a GIMPLE_ASSIGN that performs a type cast operation.

bool gimple_debug_bind_p(gimple g)#

Return true if g is a GIMPLE_DEBUG that binds the value of an expression to a variable.

bool is_gimple_omp(gimple g)#

Return true if g is any of the OpenMP codes.

bool gimple_debug_begin_stmt_p(gimple g)#

Return true if g is a GIMPLE_DEBUG that marks the beginning of a source statement.

bool gimple_debug_inline_entry_p(gimple g)#

Return true if g is a GIMPLE_DEBUG that marks the entry point of an inlined function.

bool gimple_debug_nonbind_marker_p(gimple g)#

Return true if g is a GIMPLE_DEBUG that marks a program location, without any variable binding.