.. Copyright 1988-2022 Free Software Foundation, Inc.
   This is part of the GCC manual.
   For copying conditions, see the copyright.rst file.

.. _ipa:

Using summary information in IPA passes
***************************************

Programs are represented internally as a *callgraph* (a multi-graph where
nodes are functions and edges are call sites) and a *varpool* (a list of
static and external variables in the program).

The inter-procedural optimization is organized as a sequence of individual
passes, which operate on the callgraph and the varpool.  To make the
implementation of WHOPR possible, every inter-procedural optimization pass
is split into several stages that are executed at different times during
WHOPR compilation:

* LGEN time

  * *Generate summary* (``generate_summary`` in ``struct ipa_opt_pass_d``).
    This stage examines every function body and variable initializer and
    stores relevant information into a pass-specific data structure.

  * *Write summary* (``write_summary`` in ``struct ipa_opt_pass_d``).
    This stage writes all the pass-specific information generated by
    ``generate_summary``.  Summaries go into their own ``LTO_section_*``
    sections that have to be declared in :samp:`lto-streamer.h`:
    ``enum lto_section_type``.  A new section is created by calling
    ``create_output_block`` and data can be written using the
    ``lto_output_*`` routines.

* WPA time

  * *Read summary* (``read_summary`` in ``struct ipa_opt_pass_d``).
    This stage reads all the pass-specific information in exactly the same
    order that it was written by ``write_summary``.

  * *Execute* (``execute`` in ``struct opt_pass``).  This performs
    inter-procedural propagation.  This must be done without actual access
    to the individual function bodies or variable initializers.
    Typically, this results in a transitive closure operation over the
    summary information of all the nodes in the callgraph.

  * *Write optimization summary* (``write_optimization_summary`` in
    ``struct ipa_opt_pass_d``).  This writes the result of the
    inter-procedural propagation into the object file.  This can use the
    same data structures and helper routines used in ``write_summary``.

* LTRANS time

  * *Read optimization summary* (``read_optimization_summary`` in
    ``struct ipa_opt_pass_d``).  The counterpart to
    ``write_optimization_summary``.  This reads the inter-procedural
    optimization decisions in exactly the same format emitted by
    ``write_optimization_summary``.

  * *Transform* (``function_transform`` and ``variable_transform`` in
    ``struct ipa_opt_pass_d``).  The actual function bodies and variable
    initializers are updated based on the information passed down from the
    *Execute* stage.

The implementation of the inter-procedural passes is shared between LTO,
WHOPR and classic non-LTO compilation.

* During the traditional file-by-file mode every pass executes its own
  *Generate summary*, *Execute*, and *Transform* stages within the single
  execution context of the compiler.

* In LTO compilation mode, every pass uses the *Generate summary* and
  *Write summary* stages at compilation time, while the *Read summary*,
  *Execute*, and *Transform* stages are executed at link time.

* In WHOPR mode all stages are used.

To simplify development, the GCC pass manager differentiates between
normal inter-procedural passes (see :ref:`regular-ipa-passes`), small
inter-procedural passes (see :ref:`small-ipa-passes`) and late
inter-procedural passes (see :ref:`late-ipa-passes`).
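As a concrete illustration of how the stages enumerated above plug into
the pass manager, here is a condensed sketch of an IPA pass definition,
modeled on existing passes such as ``pass_ipa_cp`` in :samp:`ipa-cp.cc`.
The ``example_*`` functions are hypothetical placeholders for
pass-specific code; the sketch assumes the GCC-internal headers
(:samp:`tree-pass.h`, :samp:`cgraph.h`) and is meant to show the shape of
the wiring, not a buildable pass.

.. code-block:: c++

  /* Hypothetical pass-specific callbacks, one per WHOPR stage.  */
  static void example_generate_summary (void);   /* LGEN */
  static void example_write_summary (void);      /* LGEN */
  static void example_read_summary (void);       /* WPA */
  static void example_write_opt_summary (void);  /* WPA */
  static void example_read_opt_summary (void);   /* LTRANS */
  static unsigned int example_propagate (void);  /* WPA */
  static unsigned int example_transform (cgraph_node *);  /* LTRANS */

  const pass_data pass_data_ipa_example =
  {
    IPA_PASS, /* type */
    "example", /* name */
    OPTGROUP_NONE, /* optinfo_flags */
    TV_NONE, /* tv_id */
    0, /* properties_required */
    0, /* properties_provided */
    0, /* properties_destroyed */
    0, /* todo_flags_start */
    0, /* todo_flags_finish */
  };

  class pass_ipa_example : public ipa_opt_pass_d
  {
  public:
    pass_ipa_example (gcc::context *ctxt)
      : ipa_opt_pass_d (pass_data_ipa_example, ctxt,
                        example_generate_summary,
                        example_write_summary,
                        example_read_summary,
                        example_write_opt_summary,
                        example_read_opt_summary,
                        NULL, /* stmt_fixup */
                        0,    /* function_transform_todo_flags_start */
                        example_transform, /* function_transform */
                        NULL) /* variable_transform */
    {}

    /* The Execute stage: inter-procedural propagation over the
       summaries only, with no access to function bodies.  */
    unsigned int execute (function *) final override
    {
      return example_propagate ();
    }
  };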
A small or late IPA pass (``SIMPLE_IPA_PASS``) does everything at once
and thus cannot be executed during WPA in WHOPR mode.  It defines only
the *Execute* stage and during this stage it accesses and modifies the
function bodies.  Such passes are useful for optimization at LGEN or
LTRANS time and are used, for example, to implement early optimization
before writing object files.  The simple inter-procedural passes can
also be used for easier prototyping and development of a new
inter-procedural pass.

Virtual clones
^^^^^^^^^^^^^^

One of the main challenges of introducing the WHOPR compilation mode was
addressing the interactions between optimization passes.  In LTO
compilation mode, the passes are executed in sequence, and each consists
of analysis (or *Generate summary*), propagation (or *Execute*) and
*Transform* stages.  Once the work of one pass is finished, the next
pass sees the updated program representation and can execute.  This
makes the individual passes dependent on each other.

In WHOPR mode all passes first execute their *Generate summary* stage.
Then summary writing marks the end of the LGEN stage.  At WPA time, the
summaries are read back into memory and all passes run the *Execute*
stage.  Optimization summaries are streamed and sent to LTRANS, where
all the passes execute the *Transform* stage.

Most optimization passes split naturally into analysis, propagation and
transformation stages.  But some do not.  The main problem arises when
one pass performs changes and the following pass gets confused by seeing
different callgraphs between the *Transform* stage and the *Generate
summary* or *Execute* stage.  This means that the passes are required to
communicate their decisions with each other.

To facilitate this communication, the GCC callgraph infrastructure
implements *virtual clones*, a method of representing the changes
performed by the optimization passes in the callgraph without needing to
update function bodies.

A *virtual clone* in the callgraph is a function that has no associated
body, just a description of how to create its body based on a different
function (which itself may be a virtual clone).

The description of function modifications includes adjustments to the
function's signature (which allows, for example, removing or adding
function arguments), substitutions to perform on the function body, and,
for inlined functions, a pointer to the function that it will be inlined
into.

It is also possible to redirect any edge of the callgraph from a
function to its virtual clone.  This implies updating of the call site
to adjust for the new function signature.

Most of the transformations performed by inter-procedural optimizations
can be represented via virtual clones.  For instance, a constant
propagation pass can produce a virtual clone of a function in which one
of the arguments is replaced by a constant.  The inliner can represent
its decisions by producing a clone of a function whose body will be
later integrated into a given function.

Using *virtual clones*, the program can be easily updated during the
*Execute* stage, solving most of the pass interaction problems that
would otherwise occur during *Transform*.  Virtual clones are later
materialized in the LTRANS stage and turned into real functions.  Passes
executed after the virtual clone was introduced also perform their
*Transform* stage on the new functions, so for a pass there is no
significant difference between operating on a real function or a virtual
clone introduced before its *Execute* stage.
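To make the clone mechanism concrete, the following hedged, source-level
sketch shows what a constant propagation decision can be recorded as.
The function names are invented for the example; only the ``constprop``
suffix mirrors GCC's usual naming of materialized clones (e.g.
:samp:`mul.constprop.0`).

.. code-block:: c++

  /* Before: the caller passes the constant 2 for `y'.  At WPA time the
     pass can create a virtual clone of `mul' that records just two
     modifications: substitute 2 for `y' in the body and drop `y' from
     the signature.  No new body exists yet; the call edge in `use' is
     merely redirected to the clone.  */
  static int mul (int x, int y) { return x * y; }
  int use (int a) { return mul (a, 2); }

  /* After the clone is materialized at LTRANS time, the program behaves
     as if `use' had been written like this:  */
  static int mul_constprop (int x) { return x * 2; }
  int use_transformed (int a) { return mul_constprop (a); }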
Optimization passes then work on virtual clones introduced before their
*Execute* stage as if they were real functions.  The only difference is
that clones are not visible during the *Generate summary* stage.

To keep function summaries updated, the callgraph interface allows an
optimizer to register a callback that is called every time a new clone
is introduced as well as when the actual function or variable is
generated or when a function or variable is removed.  These hooks are
registered in the *Generate summary* stage and allow the pass to keep
its information intact until the *Execute* stage.  The same hooks can
also be registered during the *Execute* stage to keep the optimization
summaries updated for the *Transform* stage.

IPA references
^^^^^^^^^^^^^^

GCC represents IPA references in the callgraph.  For a function or
variable ``A``, the *IPA reference* is a list of all locations where the
address of ``A`` is taken and, when ``A`` is a variable, a list of all
direct stores and reads to/from ``A``.  References represent an oriented
multi-graph on the union of nodes of the callgraph and the varpool.  See
:samp:`ipa-reference.cc`: ``ipa_reference_write_optimization_summary``
and :samp:`ipa-reference.cc`:
``ipa_reference_read_optimization_summary`` for details.

Jump functions
^^^^^^^^^^^^^^

Suppose that an optimization pass sees a function ``A`` and it knows the
values of (some of) its arguments.  The *jump function* describes the
value of a parameter of a given function call in function ``A`` based on
this knowledge.

Jump functions are used by several optimizations, such as the
inter-procedural constant propagation pass and the devirtualization
pass.  The inliner also uses jump functions to perform inlining of
callbacks.
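As an illustration of what a jump function can record, consider the call
site in the sketch below.  The classification in the comments follows the
jump function kinds declared in :samp:`ipa-prop.h` (``IPA_JF_CONST``,
``IPA_JF_PASS_THROUGH``, ...); the functions themselves are invented for
the example.

.. code-block:: c++

  int f (int a, int b);

  int
  g (int c)
  {
    /* Jump functions describe the arguments of this call without any
       knowledge of `g's callers:

         - argument 0 is a pass-through of `g's parameter `c' with the
           arithmetic operation `c + 1' applied (IPA_JF_PASS_THROUGH);
         - argument 1 is the known constant 7 (IPA_JF_CONST).

       If constant propagation later discovers that `c' has the same
       known value in all callers of `g', the jump functions allow both
       arguments of `f' to be computed transitively.  */
    return f (c + 1, 7);
  }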