Type Encoding
This is an advanced section. Type encodings are used extensively by the compiler and by the runtime, but you generally do not need to know about them to use Objective-C.
The Objective-C compiler generates type encodings for all the types. These type encodings are used at runtime to find out information about selectors and methods and about objects and classes.
The types are encoded in the following way:
[Table not recovered: the rows mapping each type to its encoding are missing here. The surviving row labels are "unknown type", "Complex types", and "bit-fields".]
The encoding of bit-fields has changed so that the runtime functions that compute sizes and alignments can properly handle types that contain bit-fields. The previous encoding contained only the size of the bit-field, which is not enough to reliably compute the space the bit-field occupies. This matters particularly with the Boehm garbage collector, because objects are allocated using that collector's typed memory facility, and typed memory allocation requires information about where the pointers are located inside the object.
The position in the bit-field is the position, counting in bits, of the bit closest to the beginning of the structure.
The non-atomic types are encoded as follows:
| Type | Encoding |
|---|---|
| pointers | |
| arrays | |
| structures | |
| unions | |
| vectors | |
Here are some types and their encodings, as they are generated by the compiler on an i386 machine:
| Objective-C type | Compiler encoding |
|---|---|
| `int a[10];` | |
| `struct { int i; float f[3]; int a:3; int b:2; char c; }` | |
| `int a __attribute__ ((vector_size (16)));` | |
In addition to the types the compiler also encodes the type specifiers. The table below describes the encoding of the current Objective-C type specifiers:
| Specifier | Encoding |
|---|---|
| `const` | `r` |

[The remaining rows of this table were not recovered.]
The type specifiers are encoded just before the type. Unlike types, however, the type specifiers are only encoded when they appear in method argument types.
Note how `const` interacts with pointers:

| Objective-C type | Compiler encoding |
|---|---|
| `const int` | `ri` |
| `const int*` | `^ri` |
| `int *const` | `r^i` |
`const int*` is a pointer to a `const int`, and so is encoded as `^ri`. `int* const`, instead, is a `const` pointer to an `int`, and so is encoded as `r^i`.
Finally, there is a complication when encoding `const char *` versus `char * const`. Because `char *` is encoded as `*` and not as `^c`, there is no way to express whether `r` applies to the pointer or to the pointee. Hence, it is assumed as a convention that `r*` means `const char *` (since that is what is most often meant), and there is no way to encode `char *const`. `char *const` would simply be encoded as `*`, and the `const` is lost.
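The rules above can be sketched as a small decoder. This is an illustrative helper, not part of any runtime: the names `describe_encoding` and `append` are hypothetical, and it handles only the four characters discussed in this passage (`r`, `^`, `i`, `*`). It is written in plain C so that it compiles without an Objective-C runtime.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append src to dst without overflowing a buffer of dst_size bytes. */
static void append(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    if (used + 1 < dst_size)
        snprintf(dst + used, dst_size - used, "%s", src);
}

/*
 * Describe an encoding built only from the characters discussed above:
 *   r = const, ^ = pointer to, i = int, * = char *
 * Hypothetical helper for illustration. Returns 0 on success, or -1 if
 * an unsupported character is found or no base type terminates the string.
 */
int describe_encoding(const char *enc, char *out, size_t out_size)
{
    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (const char *p = enc; *p != '\0'; p++) {
        switch (*p) {
        case 'r':
            /* Convention: "r*" always means const char *, i.e. the
               qualifier is taken to apply to the pointee. */
            if (p[1] == '*') {
                append(out, out_size, "pointer to const char");
                return p[2] == '\0' ? 0 : -1;
            }
            append(out, out_size, "const ");
            break;
        case '^':
            append(out, out_size, "pointer to ");
            break;
        case 'i':
            append(out, out_size, "int");
            return p[1] == '\0' ? 0 : -1;
        case '*':
            append(out, out_size, "pointer to char");
            return p[1] == '\0' ? 0 : -1;
        default:
            return -1;
        }
    }
    return -1; /* ran out of input before reaching a base type */
}
```

With this sketch, `^ri` is described as "pointer to const int" and `r^i` as "const pointer to int", while `r*` is always read as "pointer to const char", mirroring the convention that a `char *const` cannot be expressed.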
Legacy Type Encoding
Unfortunately, GCC historically had a number of bugs in its encoding code. The NeXT runtime expects GCC to emit type encodings in this historical format (compatible with GCC-3.3), so when using the NeXT runtime, GCC deliberately emits a number of incorrect encodings:

- The read-only qualifier of the pointee gets emitted before the ‘^’. The read-only qualifier of the pointer itself gets ignored, unless it is a typedef. Also, the ‘r’ is only emitted for the outermost type.
- 32-bit longs are encoded as ‘l’ or ‘L’, but not always. For typedefs, the compiler uses ‘i’ or ‘I’ instead if encoding a struct field or a pointer.
- `enum`s are always encoded as ‘i’ (int) even if they are actually unsigned or long.
In addition to that, the NeXT runtime uses a different encoding for bit-fields. It encodes them as `b` followed by the size, without a bit offset or the underlying field type.
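To make the legacy form concrete, here is a minimal sketch that reads the width out of a string of the shape just described (`b` followed by the size). The function name `legacy_bitfield_width` is hypothetical, and the sketch assumes the size is written as a plain decimal number with nothing after it.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return the bit-field width from a legacy
 * encoding of the form "b<size>", or -1 if the string does not
 * have exactly that shape.
 */
long legacy_bitfield_width(const char *enc)
{
    char *end;
    long width;

    if (enc[0] != 'b' || !isdigit((unsigned char)enc[1]))
        return -1;

    width = strtol(enc + 1, &end, 10);

    /* Nothing may follow the size in the legacy form. */
    return *end == '\0' ? width : -1;
}
```

The width is the only thing such a string carries: there is no bit offset and no underlying field type to recover from it.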
@encode
GNU Objective-C supports the `@encode` syntax that allows you to create a type encoding from a C/Objective-C type. For example, `@encode(int)` is compiled by the compiler into `"i"`.

`@encode` does not support type qualifiers other than `const`. For example, `@encode(const char*)` is valid and is compiled into `"r*"`, while `@encode(bycopy char *)` is invalid and will cause a compilation error.
Method Signatures
This section documents the encoding of method types, which is rarely needed to use Objective-C. You should skip it on a first reading; the runtime provides functions that work on methods and can walk through the list of parameters and interpret them for you. These functions are part of the public ‘API’ and are the preferred way to interact with method signatures from user code.
But if you need to debug a problem with method signatures and need to know how they are implemented (i.e., the ‘ABI’), read on.
Methods have their ‘signature’ encoded and made available to the runtime. The ‘signature’ encodes all the information required to dynamically build invocations of the method at runtime: return type and arguments.
The ‘signature’ is a null-terminated string, composed of the following:
- The return type, including type qualifiers. For example, a method returning `int` would have `i` here.
- The total size (in bytes) required to pass all the parameters. This includes the two hidden parameters (the object `self` and the method selector `_cmd`).
- Each argument, with the type encoding, followed by the offset (in bytes) of the argument in the list of parameters.
For example, a method with no arguments and returning `int` would have the signature `i8@0:4` if the size of a pointer is 4. The signature is interpreted as follows: the `i` is the return type (an `int`), the `8` is the total size of the parameters in bytes (two pointers each of size 4), the `@0` is the first parameter (an object at byte offset `0`) and `:4` is the second parameter (a `SEL` at byte offset `4`).
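The interpretation above can be sketched as a small parser. This is only an illustration for debugging, not a replacement for the runtime's own functions: `parse_method_sig`, `struct method_sig`, and the size limits are hypothetical names chosen here. It also assumes that no type encoding in the signature contains a digit, which holds for `i8@0:4` but may not hold for aggregate types.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ARGS 16   /* arbitrary limit for this sketch */
#define MAX_TYPE 32   /* arbitrary limit for this sketch */

struct sig_arg {
    char type[MAX_TYPE];  /* type encoding, e.g. "@" or ":" */
    long offset;          /* byte offset in the parameter list */
};

struct method_sig {
    char return_type[MAX_TYPE];
    long frame_size;      /* total size of all parameters, in bytes */
    size_t nargs;
    struct sig_arg args[MAX_ARGS];
};

/* Copy the run of non-digit characters at *p into type and advance *p.
   Assumption: a type encoding never contains a digit. */
static int read_type(const char **p, char *type)
{
    size_t n = 0;
    while (**p != '\0' && !isdigit((unsigned char)**p)) {
        if (n + 1 >= MAX_TYPE)
            return -1;
        type[n++] = *(*p)++;
    }
    type[n] = '\0';
    return n > 0 ? 0 : -1;
}

/* Read a decimal number at *p and advance *p past it. */
static int read_number(const char **p, long *value)
{
    char *end;
    if (!isdigit((unsigned char)**p))
        return -1;
    *value = strtol(*p, &end, 10);
    *p = end;
    return 0;
}

/* Split a signature into return type, total size, and (type, offset)
   pairs. Returns 0 on success, -1 on malformed input. */
int parse_method_sig(const char *sig, struct method_sig *out)
{
    const char *p = sig;

    out->nargs = 0;
    if (read_type(&p, out->return_type) != 0)
        return -1;
    if (read_number(&p, &out->frame_size) != 0)
        return -1;

    while (*p != '\0') {
        if (out->nargs == MAX_ARGS)
            return -1;
        struct sig_arg *arg = &out->args[out->nargs];
        if (read_type(&p, arg->type) != 0)
            return -1;
        if (read_number(&p, &arg->offset) != 0)
            return -1;
        out->nargs++;
    }
    return 0;
}
```

Calling `parse_method_sig("i8@0:4", &sig)` fills in a return type of `i`, a total size of 8, and two arguments: `@` at offset 0 and `:` at offset 4.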
You can easily find more examples by running the ‘strings’ program on an Objective-C object file compiled by GCC. You’ll see a lot of strings that look very much like `i8@0:4`. They are signatures of Objective-C methods.