On-the-Side SSA Form for RTL#

The patterns of an individual RTL instruction describe which registers are inputs to that instruction and which registers are outputs from that instruction. However, it is often useful to know where the definition of a register input comes from and where the result of a register output is used. One way of obtaining this information is to use the RTL SSA form, which provides a Static Single Assignment representation of the RTL instructions.

The RTL SSA code is located in the rtl-ssa subdirectory of the GCC source tree. This section only gives a brief overview of it; please see the comments in the source code for more details.

Using RTL SSA in a pass#

A pass that wants to use the RTL SSA form should start with the following:

#define INCLUDE_ALGORITHM
#define INCLUDE_FUNCTIONAL
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "rtl.h"
#include "df.h"
#include "rtl-ssa.h"

All the RTL SSA code is contained in the rtl_ssa namespace, so most passes will then want to do:

using namespace rtl_ssa;

However, this is purely a matter of taste, and the examples in the rest of this section do not require it.

The RTL SSA represention is an optional on-the-side feature that applies on top of the normal RTL instructions. It is currently local to individual RTL passes and is not maintained across passes.

However, in order to allow the RTL SSA information to be preserved across passes in future, crtl->ssa points to the current function’s SSA form (if any). Passes that want to use the RTL SSA form should first do:

crtl->ssa = new rtl_ssa::function_info (fn);

where fn is the function that the pass is processing. (Passes that are using namespace rtl_ssa do not need the rtl_ssa::.)

Once the pass has finished with the SSA form, it should do the following:

free_dominance_info (CDI_DOMINATORS);
if (crtl->ssa->perform_pending_updates ())
  cleanup_cfg (0);

delete crtl->ssa;
crtl->ssa = nullptr;

The free_dominance_info call is necessary because dominance information is not currently maintained between RTL passes. The next two lines commit any changes to the RTL instructions that were queued for later; see the comment above the declaration of perform_pending_updates for details. The final two lines discard the RTL SSA form and free the associated memory.

RTL SSA Instructions#

RTL SSA instructions are represented by an rtl_ssa::insn_info. These instructions are chained together in a single list that follows a reverse postorder (RPO) traversal of the function. This means that if any path through the function can execute an instruction I1 and then later execute an instruction I2 for the first time, I1 appears before I2 in the list. Note that this order is different from the order of the underlying RTL instructions, which follow machine code order instead.

Two RTL SSA instructions can be compared to find which instruction occurs earlier than the other in the RPO. One way to do this is to use the C++ comparison operators, such as:

*insn1 < *insn2

Another way is to use the compare_with function:

insn1->compare_with (insn2)

This expression is greater than zero if insn1 comes after insn2 in the RPO, less than zero if insn1 comes before insn2 in the RPO, or zero if insn1 and insn2 are the same. This order is maintained even if instructions are added to the function or moved around.

The main purpose of rtl_ssa::insn_info is to hold SSA information about an instruction. However, it also caches certain properties of the instruction, such as whether it is an inline assembly instruction, whether it has volatile accesses, and so on.

RTL SSA Basic Blocks#

RTL SSA instructions (see RTL SSA Instructions) are organized into basic blocks, with each block being represented by an rtl_ssa:bb_info. There is a one-to-one mapping between these rtl_ssa:bb_info structures and the underlying CFG basic_block structures (see Basic Blocks).

If a CFG basic block bb contains an RTL instruction insn, the RTL SSA represenation of bb also contains an RTL SSA representation of insn. Note that this excludes non-instruction things like note s and barrier s that also appear in the chain of RTL instructions.

Within RTL SSA, these instructions are referred to as ‘real’ instructions. These real instructions fall into two groups: debug instructions and nondebug instructions. Only nondebug instructions should affect code generation decisions.

In addition, each RTL SSA basic block has two ‘artificial’ instructions: a ‘head’ instruction that comes before all the real instructions and an ‘end’ instruction that comes after all real instructions. These instructions exist to represent things that are conceptually defined or used at the start and end of a basic block. The instructions always exist, even if they do not currently do anything.

Like instructions, these blocks are chained together in a reverse postorder. This list includes the entry block (which always comes first) and the exit block (which always comes last).

RTL SSA basic blocks are chained together into ‘extended basic blocks’ (EBBs), represented by an rtl_ssa::ebb_info. Extended basic blocks contain one or more basic blocks. They have the property that if a block bby comes immediately after a block bbx in an EBB, then bby can only be reached by bbx ; in other words, bbx is the sole predecessor of bby.

Each extended basic block starts with an artificial ‘phi node’ instruction. This instruction defines all phi nodes for the EBB (see RTL SSA Phi Nodes). (Individual blocks in an EBB do not need phi nodes because their live values can only come from one source.)

The contents of a function are therefore represented using a four-level hierarchy:

  • functions (rtl_ssa::function_info), which contain …

  • extended basic blocks (rtl_ssa::ebb_info), which contain …

  • basic blocks (rtl_ssa::bb_info), which contain …

  • instructions (rtl_ssa::insn_info)

In dumps, a basic block is identified as bbn, where n is the index of the associated CFG basic_block structure. An EBB is in turn identified by the index of its first block. For example, an EBB that contains bb10, bb5, bb6 and bb9 is identified as ebb10.

RTL SSA Resources#

The RTL SSA form tracks two types of ‘resource’: registers and memory. Each hard and pseudo register is a separate resource. Memory is a single unified resource, like it is in GIMPLE (see GIMPLE).

Each resource has a unique identifier. The unique identifier for a register is simply its register number. The unique identifier for memory is a special register number called MEM_REGNO.

Since resource numbers so closely match register numbers, it is sometimes convenient to refer to them simply as register numbers, or ‘regnos’ for short. However, the RTL SSA form also provides an abstraction of resources in the form of rtl_ssa::resource_info. This is a lightweight class that records both the regno of a resource and the machine_mode that the resource has (see Machine Modes). It has functions for testing whether a resource is a register or memory. In principle it could be extended to other kinds of resource in future.

RTL SSA Register and Memory Accesses#

In the RTL SSA form, most reads or writes of a resource are represented as a rtl_ssa::access_info. The exceptions are call clobbers, which are generally represented separately. See the comment above rtl_ssa::insn_info for details.

These rtl_ssa::access_info s are organized into the following class hierarchy:

rtl_ssa::access_info
  |
  +-- rtl_ssa::use_info
  |
  +-- rtl_ssa::def_info
        |
        +-- rtl_ssa::clobber_info
        |
        +-- rtl_ssa::set_info
              |
              +-- rtl_ssa::phi_info

A rtl_ssa::use_info represents a read or use of a resource and a rtl_ssa::def_info represents a write or definition of a resource. As in the main RTL representation, there are two basic types of definition: clobbers and sets. The difference is that a clobber leaves the register with an unspecified value that cannot be used or relied on by later instructions, while a set leaves the register with a known value that later instructions could use if they wanted to. A rtl_ssa::clobber_info represents a clobber and a rtl_ssa::set_info represent a set.

Each rtl_ssa::use_info records which single rtl_ssa::set_info provides the value of the resource; this is null if the resource is completely undefined at the point of use. Each rtl_ssa::set_info in turn records all the rtl_ssa::use_info s that use its value.

If a value of a resource can come from multiple sources, a rtl_ssa::phi_info brings those multiple sources together into a single definition (see RTL SSA Phi Nodes).

RTL SSA Phi Nodes#

If a resource is live on entry to an extended basic block and if the resource’s value can come from multiple sources, the extended basic block has a ‘phi node’ that collects together these multiple sources. The phi node conceptually has one input for each incoming edge of the extended basic block, with the input specifying the value of the resource on that edge. For example, suppose a function contains the following RTL:

;; Basic block bb3
...
(set (reg:SI R1) (const_int 0))  ;; A
(set (pc) (label_ref bb5))

;; Basic block bb4
...
(set (reg:SI R1) (const_int 1))  ;; B
;; Fall through

;; Basic block bb5
;; preds: bb3, bb4
;; live in: R1 ...
(code_label bb5)
...
(set (reg:SI R2)
     (plus:SI (reg:SI R1) ...))  ;; C

The value of R1 on entry to block 5 can come from either A or B. The extended basic block that contains block 5 would therefore have a phi node with two inputs: the first input would have the value of R1 defined by A and the second input would have the value of R1 defined by B. This phi node would then provide the value of R1 for C (assuming that R1 does not change again between the start of block 5 and C).

Since RTL is not a ‘native’ SSA representation, these phi nodes simply collect together definitions that already exist. Each input to a phi node for a resource R is itself a definition of resource R (or is null if the resource is completely undefined for a particular incoming edge). This is in contrast to a native SSA representation like GIMPLE, where the phi inputs can be arbitrary expressions. As a result, RTL SSA phi nodes never involve ‘hidden’ moves: all moves are instead explicit.

Phi nodes are represented as a rtl_ssa::phi_node. Each input to a phi node is represented as an rtl_ssa::use_info.

RTL SSA Access Lists#

All the definitions of a resource are chained together in reverse postorder. In general, this list can contain an arbitrary mix of both sets (rtl_ssa::set_info) and clobbers (rtl_ssa::clobber_info). However, it is often useful to skip over all intervening clobbers of a resource in order to find the next set. The list is constructed in such a way that this can be done in amortized constant time.

All uses (rtl_ssa::use_info) of a given set are also chained together into a list. This list of uses is divided into three parts:

The first and second parts individually follow reverse postorder. The third part has no particular order.

The last use by a real nondebug instruction always comes earlier in the reverse postorder than the next definition of the resource (if any). This means that the accesses follow a linear sequence of the form:

  • first definition of resource R

    • first use by a real nondebug instruction of the first definition of resource R

    • last use by a real nondebug instruction of the first definition of resource R

  • second definition of resource R

    • first use by a real nondebug instruction of the second definition of resource R

    • last use by a real nondebug instruction of the second definition of resource R

  • last definition of resource R

    • first use by a real nondebug instruction of the last definition of resource R

    • last use by a real nondebug instruction of the last definition of resource R

(Note that clobbers never have uses; only sets do.)

This linear view is easy to achieve when there is only a single definition of a resource, which is commonly true for pseudo registers. However, things are more complex if code has a structure like the following:

// ebb2, bb2
R = va;        // A
if (...)
  {
    // ebb2, bb3
    use1 (R);  // B
    ...
    R = vc;    // C
  }
else
  {
    // ebb4, bb4
    use2 (R);  // D
  }

The list of accesses would begin as follows:

  • definition of R by A

    • use of A’s definition of R by B

  • definition of R by C

The next access to R is in D, but the value of R that D uses comes from A rather than C.

This is resolved by adding a phi node for ebb4. All inputs to this phi node have the same value, which in the example above is A’s definition of R. In other circumstances, it would not be necessary to create a phi node when all inputs are equal, so these phi nodes are referred to as ‘degenerate’ phi nodes.

The full list of accesses to R is therefore:

  • definition of R by A

    • use of A’s definition of R by B

  • definition of R by C

  • definition of R by ebb4’s phi instruction, with the input coming from A

    • use of the ebb4’s R phi definition of R by B

Note that A’s definition is also used by ebb4’s phi node, but this use belongs to the third part of the use list described above and so does not form part of the linear sequence.

It is possible to ‘look through’ any degenerate phi to the ultimate definition using the function look_through_degenerate_phi. Note that the input to a degenerate phi is never itself provided by a degenerate phi.

At present, the SSA form takes this principle one step further and guarantees that, for any given resource res, one of the following is true:

  • The resource has a single definition def, which is not a phi node. Excluding uses of undefined registers, all uses of res by real nondebug instructions use the value provided by def.

  • Excluding uses of undefined registers, all uses of res use values provided by definitions that occur earlier in the same extended basic block. These definitions might come from phi nodes or from real instructions.

Using the RTL SSA framework to change instructions#

There are various routines that help to change a single RTL instruction or a group of RTL instructions while keeping the RTL SSA form up-to-date. This section first describes the process for changing a single instruction, then goes on to describe the differences when changing multiple instructions.

Changing One RTL SSA Instruction#

Before making a change, passes should first use a statement like the following:

auto attempt = crtl->ssa->new_change_attempt ();

Here, attempt is an RAII object that should remain in scope for the entire change attempt. It automatically frees temporary memory related to the changes when it goes out of scope.

Next, the pass should create an rtl_ssa::insn_change object for the instruction that it wants to change. This object specifies several things:

  • what the instruction’s new list of uses should be (new_uses). By default this is the same as the instruction’s current list of uses.

  • what the instruction’s new list of definitions should be (new_defs). By default this is the same as the instruction’s current list of definitions.

  • where the instruction should be located (move_range). This is a range of instructions after which the instruction could be placed, represented as an rtl_ssa::insn_range. By default the instruction must remain at its current position.

If a pass was attempting to change all these properties of an instruction insn, it might do something like this:

rtl_ssa::insn_change change (insn);
change.new_defs = ...;
change.new_uses = ...;
change.move_range = ...;

This rtl_ssa::insn_change only describes something that the pass might do; at this stage, nothing has actually changed.

As noted above, the default move_range requires the instruction to remain where it is. At the other extreme, it is possible to allow the instruction to move anywhere within its extended basic block, provided that all the new uses and definitions can be performed at the new location. The way to do this is:

change.move_range = insn->ebb ()->insn_range ();

In either case, the next step is to make sure that move range is consistent with the new uses and definitions. The way to do this is:

if (!rtl_ssa::restrict_movement (change))
  return false;

This function tries to limit move_range to a range of instructions at which new_uses and new_defs can be correctly performed. It returns true on success or false if no suitable location exists.

The pass should also tentatively change the pattern of the instruction to whatever form the pass wants the instruction to have. This should use the facilities provided by recog.cc. For example:

rtl_insn *rtl = insn->rtl ();
insn_change_watermark watermark;
validate_change (rtl, &PATTERN (rtl), new_pat, 1);

will tentatively replace insn ‘s pattern with new_pat.

These changes and the construction of the rtl_ssa::insn_change can happen in either order or be interleaved.

After the tentative changes to the instruction are complete, the pass should check whether the new pattern matches a target instruction or satisfies the requirements of an inline asm:

if (!rtl_ssa::recog (change))
  return false;

This step might change the instruction pattern further in order to make it match. It might also add new definitions or restrict the range of the move. For example, if the new pattern did not match in its original form, but could be made to match by adding a clobber of the flags register, rtl_ssa::recog will check whether the flags register is free at an appropriate point. If so, it will add a clobber of the flags register to new_defs and restrict move_range to the locations at which the flags register can be safely clobbered.

Even if the proposed new instruction is valid according to rtl_ssa::recog, the change might not be worthwhile. For example, when optimizing for speed, the new instruction might turn out to be slower than the original one. When optimizing for size, the new instruction might turn out to be bigger than the original one.

Passes should check for this case using change_is_worthwhile. For example:

if (!rtl_ssa::change_is_worthwhile (change))
  return false;

If the change passes this test too then the pass can perform the change using:

confirm_change_group ();
crtl->ssa->change_insn (change);

Putting all this together, the change has the following form:

auto attempt = crtl->ssa->new_change_attempt ();

rtl_ssa::insn_change change (insn);
change.new_defs = ...;
change.new_uses = ...;
change.move_range = ...;

if (!rtl_ssa::restrict_movement (change))
  return false;

insn_change_watermark watermark;
// Use validate_change etc. to change INSN's pattern.
...
if (!rtl_ssa::recog (change)
    || !rtl_ssa::change_is_worthwhile (change))
  return false;

confirm_change_group ();
crtl->ssa->change_insn (change);

Changing Multiple RTL SSA Instructions#

The process for changing multiple instructions is similar to the process for changing single instructions (see Changing One RTL SSA Instruction). The pass should again start the change attempt with:

auto attempt = crtl->ssa->new_change_attempt ();

and keep attempt in scope for the duration of the change attempt. It should then construct an rtl_ssa::insn_change for each change that it wants to make.

After this, it should combine the changes into a sequence of rtl_ssa::insn_change pointers. This sequence must be in reverse postorder; the instructions will remain strictly in the order that the sequence specifies.

For example, if a pass is changing exactly two instructions, it might do:

rtl_ssa::insn_change *changes[] = { &change1, change2 };

where change1 ‘s instruction must come before change2 ‘s. Alternatively, if the pass is changing a variable number of instructions, it might build up the sequence in a vec<rtl_ssa::insn_change *>.

By default, rtl_ssa::restrict_movement assumes that all instructions other than the one passed to it will remain in their current positions and will retain their current uses and definitions. When changing multiple instructions, it is usually more effective to ignore the other instructions that are changing. The sequencing described above ensures that the changing instructions remain in the correct order with respect to each other. The way to do this is:

if (!rtl_ssa::restrict_movement (change, insn_is_changing (changes)))
  return false;

Similarly, when rtl_ssa::restrict_movement is detecting whether a register can be clobbered, it by default assumes that all other instructions will remain in their current positions and retain their current form. It is again more effective to ignore changing instructions (which might, for example, no longer need to clobber the flags register). The way to do this is:

if (!rtl_ssa::recog (change, insn_is_changing (changes)))
  return false;

When changing multiple instructions, the important question is usually not whether each individual change is worthwhile, but whether the changes as a whole are worthwhile. The way to test this is:

if (!rtl_ssa::changes_are_worthwhile (changes))
  return false;

The process for changing single instructions makes sure that one rtl_ssa::insn_change in isolation is valid. But when changing multiple instructions, it is also necessary to test whether the sequence as a whole is valid. For example, it might be impossible to satisfy all of the move_range s at once.

Therefore, once the pass has a sequence of changes that are individually correct, it should use:

if (!crtl->ssa->verify_insn_changes (changes))
  return false;

to check whether the sequence as a whole is valid. If all checks pass, the final step is:

confirm_change_group ();
crtl->ssa->change_insns (changes);

Putting all this together, the process for a two-instruction change is:

auto attempt = crtl->ssa->new_change_attempt ();

rtl_ssa::insn_change change (insn1);
change1.new_defs = ...;
change1.new_uses = ...;
change1.move_range = ...;

rtl_ssa::insn_change change (insn2);
change2.new_defs = ...;
change2.new_uses = ...;
change2.move_range = ...;

rtl_ssa::insn_change *changes[] = { &change1, change2 };

auto is_changing = insn_is_changing (changes);
if (!rtl_ssa::restrict_movement (change1, is_changing)
    || !rtl_ssa::restrict_movement (change2, is_changing))
  return false;

insn_change_watermark watermark;
// Use validate_change etc. to change INSN1's and INSN2's patterns.
...
if (!rtl_ssa::recog (change1, is_changing)
    || !rtl_ssa::recog (change2, is_changing)
    || !rtl_ssa::changes_are_worthwhile (changes)
    || !crtl->ssa->verify_insn_changes (changes))
  return false;

confirm_change_group ();
crtl->ssa->change_insns (changes);