The preprocessing language#
After tokenization, the stream of tokens may simply be passed straight to the compiler’s parser. However, if it contains any operations in the preprocessing language, it will be transformed first. This stage corresponds roughly to the standard’s ‘translation phase 4’ and is what most people think of as the preprocessor’s job.
The preprocessing language consists of directives to be executed and macros to be expanded. Its primary capabilities are:
Inclusion of header files. These are files of declarations that can be substituted into your program.
Macro expansion. You can define macros, which are abbreviations for arbitrary fragments of C code. The preprocessor will replace the macros with their definitions throughout the program. Some macros are automatically defined for you.
Conditional compilation. You can include or exclude parts of the program according to various conditions.
Line control. If you use a program to combine or rearrange source files into an intermediate file which is then compiled, you can use line control to inform the compiler where each source line originally came from.
Diagnostics. You can detect problems at compile time and issue errors or warnings.
There are a few more, less useful, features.
Except for expansion of predefined macros, all these operations are
triggered with preprocessing directives. Preprocessing directives
are lines in your program that start with #
. Whitespace is
allowed before and after the #
. The #
is followed by an
identifier, the directive name. It specifies the operation to
perform. Directives are commonly referred to as #name
where name
is the directive name. For example, #define
is
the directive that defines a macro.
The #
which begins a directive cannot come from a macro
expansion. Also, the directive name is not macro expanded. Thus, if
foo
is defined as a macro expanding to define
, that does
not make #foo
a valid preprocessing directive.
The set of valid directive names is fixed. Programs cannot define new preprocessing directives.
Some directives require arguments; these make up the rest of the
directive line and must be separated from the directive name by
whitespace. For example, #define
must be followed by a macro
name and the intended expansion of the macro.
A preprocessing directive cannot cover more than one line. The line may, however, be continued with backslash-newline, or by a block comment which extends past the end of the line. In either case, when the directive is processed, the continuations have already been merged with the first line to make one long line.