GObject binary typelib for introspection
-----------------------------------------

Version 0.8

Changes since 0.7:
- Add dependencies

Changes since 0.6:
- rename metadata to typelib, to follow xpcom terminology

Changes since 0.5:
- basic type cleanup:
  + remove GString
  + add [u]int, [u]long, [s]size_t
  + rename string to utf8, add filename
- allow blob_type to be zero for non-local entries

Changes since 0.4:
- add a UnionBlob

Changes since 0.3:
- drop short_name for ValueBlob

Changes since 0.2:
- make inline types 4 bytes after all, remove header->types and allow
  types to appear anywhere
- allow error domains in the directory

Changes since 0.1:
- drop comments about _GOBJ_METADATA
- drop string pool, strings can appear anywhere
- use 'blob' as collective name for the various blob types
- rename 'type' field in blobs to 'blob_type'
- rename 'type_name' and 'type_init' fields to 'gtype_name', 'gtype_init'
- shrink directory entries to 12 bytes
- merge struct and boxed blobs
- split interface blobs into enum, object and interface blobs
- add an 'unregistered' flag to struct and enum blobs
- add a 'wraps_vfunc' flag to function blobs and link them to
  the vfuncs they wrap
- restrict value blobs to only occur inside enums and flags again
- add constant blobs, allow them toplevel, in interfaces and in objects
- rename 'receiver_owns_value' and 'receiver_owns_container' to
  'transfer_ownership' and 'transfer_container_ownership'
- add a 'struct_offset' field to virtual function and field blobs
- add 'dipper' and 'optional' flags to arg blobs
- add a 'true_stops_emit' flag to signal blobs
- add variable blob sizes to header
- store offsets to signature blobs instead of including them directly
- change the type offset to be measured in words rather than bytes


Typelib
--------

The format of GObject typelib is strongly influenced by the Mozilla XPCOM
format.

Some of the differences from XPCOM include:
- Type information is stored not quite as compactly. XPCOM stores it
  inline in function descriptions, in variable-sized blobs of 1 to n
  bytes. We store 16 bits of type information for each parameter, which
  is enough to encode simple types inline. Complex (e.g. recursive) types
  are stored out of line in a separate list of types.
- String and complex type data is stored outside of typelib entry blobs;
  references are stored as offsets relative to the start of the typelib.
  One possibility is to store the strings and types in pools at the end
  of the typelib.


Overview
--------

The typelib has the following general format.

typelib ::= header, directory, blobs, annotations

directory ::= list of entries

entry ::= blob type, name, namespace, offset

blob ::= function|callback|struct|boxed|enum|flags|object|interface|constant|errordomain|union

annotations ::= list of annotations, sorted by offset

annotation ::= offset, key, value


Details
-------

We describe the fragments that make up the typelib in the form of C structs
(although some fall short of being valid C structs since they contain
multiple flexible arrays).

Header (70 bytes)

struct Header
{
  gchar[16] magic;
  guint8    major_version;
  guint8    minor_version;
  guint16   reserved;

  guint16   n_entries;
  guint16   n_local_entries;
  guint32   directory;
  guint32   annotations;
  guint32   dependencies;

  guint32   size;
  guint32   namespace;
  guint32   nsversion;

  guint16   entry_blob_size;        /* 12 */
  guint16   function_blob_size;     /* 16 */
  guint16   callback_blob_size;     /* 12 */
  guint16   signal_blob_size;       /* 12 */
  guint16   vfunc_blob_size;        /* 16 */
  guint16   arg_blob_size;          /* 12 */
  guint16   property_blob_size;     /* 12 */
  guint16   field_blob_size;        /* 12 */
  guint16   value_blob_size;        /* 16 */
  guint16   constant_blob_size;     /* 20 */
  guint16   error_domain_blob_size; /* 16 */
  guint16   annotation_blob_size;   /* 12 */

  guint16   signature_blob_size;    /*  4 */
  guint16   enum_blob_size;         /* 20 */
  guint16   struct_blob_size;       /* 20 */
  guint16   object_blob_size;       /* 32 */
  guint16   interface_blob_size;    /* 28 */
  guint16   union_blob_size;        /* 28 */
}

magic:    The string "GOBJ\nMETADATA\r\n\032". This was inspired by XPCOM,
          which in turn borrowed from PNG.

major_version,
minor_version:
          The version of the typelib format. Minor version changes indicate
          compatible changes and should still allow the typelib to be parsed
          by a parser designed for the same major_version.

n_entries:
          The number of entries in the directory.

n_local_entries:
          The number of entries referring to blobs in this typelib. The
          local entries must occur before the unresolved entries.

directory:
          Offset of the directory in the typelib.
          FIXME: need to specify if and how the directory is sorted

annotations:
          Offset of the list of annotations in the typelib.

dependencies:
          Offset of a single string, which is the list of
          dependencies, separated by the '|' character. The
          dependencies are required in order to avoid having programs
          consuming a typelib check for an "Unresolved" type return
          from every API call.

size:     The size of the typelib.

namespace:
          Offset of the namespace string in the typelib.

nsversion:
          Offset of the namespace version string in the typelib.

entry_blob_size:
function_blob_size:
callback_blob_size:
signal_blob_size:
vfunc_blob_size:
arg_blob_size:
property_blob_size:
field_blob_size:
value_blob_size:
annotation_blob_size:
constant_blob_size:
          The sizes of fixed-size blobs. Recording this information here
          makes it possible to write parsers that continue to work if the
          format is extended by adding new fields to the end of the
          fixed-size blobs.

signature_blob_size:
enum_blob_size:
struct_blob_size:
interface_blob_size:
          For variable-size blobs, the size of the struct up to the first
          flexible array member. Recording this information here makes it
          possible to write parsers that continue to work if the format is
          extended by adding new fields before the first flexible array
          member in variable-size blobs.


Directory entry (12 bytes)

References to directory entries are stored as 1-based 16-bit indexes.

struct DirectoryEntry
{
  guint16 blob_type;

  guint   is_local : 1;
  guint   reserved :15;

  guint32 name;
  guint32 offset;
}

blob_type:
          The type of blob this entry points to:
           0 unknown (allowed only for non-local entries)
           1 function
           2 callback
           3 struct
           4 boxed
           5 enum
           6 flags
           7 object
           8 interface
           9 constant
          10 errordomain
          11 union

is_local: Whether this entry refers to a blob in this typelib.

name:     The name of the entry.

offset:   If is_local is set, this is the offset of the blob in the typelib.
          Otherwise, it is the offset of the namespace in which the blob has
          to be looked up by name.


All blobs pointed to by a directory entry start with the same layout for
the first 8 bytes (the reserved flags may be used by some blob types).

struct InterfacePrefix
{
  guint16 blob_type;
  guint   deprecated : 1;
  guint   reserved   :15;
  guint32 name;
}

blob_type:
          An integer specifying the type of the blob, see DirectoryEntry
          for details.

deprecated:
          Whether the blob is deprecated.

name:     The name of the blob.


The SignatureBlob is shared between Functions,
Callbacks, Signals and VirtualFunctions.

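As an illustration of the header and directory layout described above, the
following is a minimal sketch of checking the magic string and resolving a
1-based directory index. It is not taken from any existing parser. The
function names and the OFF_* constants are hypothetical; the byte offsets are
computed from the Header struct as listed, assuming no padding between fields
and integers stored in host byte order (neither is stated by the format
description). The entry size is read from the header rather than hardcoded,
in line with the purpose of the blob size fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: offsets derived from the Header struct as listed, packed. */
enum {
    MAGIC_LEN           = 16,
    OFF_N_ENTRIES       = 20,
    OFF_DIRECTORY       = 24,
    OFF_ENTRY_BLOB_SIZE = 48
};

static const char typelib_magic[] = "GOBJ\nMETADATA\r\n\032";

/* ASSUMPTION: integers are stored in host byte order. */
static uint16_t read_u16(const uint8_t *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static uint32_t read_u32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Returns nonzero if the buffer starts with the 16-byte magic string. */
static int typelib_has_magic(const uint8_t *data, size_t len)
{
    assert(data != NULL);
    return len >= MAGIC_LEN && memcmp(data, typelib_magic, MAGIC_LEN) == 0;
}

/* Returns a pointer to the directory entry with the given 1-based index,
 * or NULL if the index is 0, out of range, or the entry would lie outside
 * the buffer. */
static const uint8_t *typelib_get_entry(const uint8_t *data, size_t len,
                                        uint16_t index)
{
    uint16_t n_entries, entry_size;
    uint32_t directory;
    size_t off;

    if (!typelib_has_magic(data, len) || len < (size_t)OFF_ENTRY_BLOB_SIZE + 2)
        return NULL;

    n_entries  = read_u16(data + OFF_N_ENTRIES);
    directory  = read_u32(data + OFF_DIRECTORY);
    entry_size = read_u16(data + OFF_ENTRY_BLOB_SIZE);

    if (index == 0 || index > n_entries || directory > len)
        return NULL;

    off = (size_t)(index - 1) * entry_size;
    if (off > len - directory || entry_size > len - directory - off)
        return NULL;

    return data + directory + off;
}
```

Index 0 is rejected because directory references are 1-based; a caller could
reserve it to mean "no entry".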
SignatureBlob (8 + 12 * n_arguments bytes)

struct SignatureBlob
{
  SimpleTypeBlob return_type;

  guint          may_return_null              : 1;
  guint          caller_owns_return_value     : 1;
  guint          caller_owns_return_container : 1;
  guint          reserved                     :13;

  guint16        n_arguments;

  ArgBlob[]      arguments;
}

return_type:
          Describes the type of the return value. See details below.

may_return_null:
          Only relevant for pointer types. Indicates whether the caller
          must expect NULL as a return value.

caller_owns_return_value:
          If set, the caller is responsible for freeing the return value
          if it is no longer needed.

caller_owns_return_container:
          This flag is only relevant if the return type is a container type.
          If the flag is set, the caller is responsible for freeing the
          container, but not its contents.

n_arguments:
          The number of arguments that this function expects, also the length
          of the array of ArgBlobs.

arguments:
          An array of ArgBlob for the arguments of the function.


FunctionBlob (20 bytes)

struct FunctionBlob
{
  guint16 blob_type;  /* 1 */

  guint   deprecated     : 1;
  guint   is_setter      : 1;
  guint   is_getter      : 1;
  guint   is_constructor : 1;
  guint   wraps_vfunc    : 1;
  guint   throws         : 1;
  guint   index          :10;

  guint32 name;
  guint32 c_name;
  guint32 signature;

  guint   is_static      : 1;
  guint   reserved       :31;
}

c_name:   The symbol which can be used to obtain the function pointer with
          dlsym().

deprecated:
          The function is deprecated.

is_setter:
          The function is a setter for a property. Language bindings may
          prefer to not bind individual setters and rely on the generic
          g_object_set().

is_getter:
          The function is a getter for a property. Language bindings may
          prefer to not bind individual getters and rely on the generic
          g_object_get().

is_constructor:
          The function acts as a constructor for the object it is contained
          in.

wraps_vfunc:
          The function is a simple wrapper for a virtual function.

index:    Index of the property that this function is a setter or getter of
          in the array of properties of the containing interface, or index
          of the virtual function that this function wraps.

signature:
          Offset of the SignatureBlob describing the parameter types and the
          return value type.

is_static:
          The function is a "static method"; in other words it's a pure
          function whose name is conceptually scoped to the object.


CallbackBlob (12 bytes)

struct CallbackBlob
{
  guint16 blob_type;  /* 2 */

  guint   deprecated : 1;
  guint   reserved   :15;

  guint32 name;
  guint32 signature;
}

signature:
          Offset of the SignatureBlob describing the parameter types and the
          return value type.


ArgBlob (12 bytes)

struct ArgBlob
{
  guint32        name;

  guint          in                           : 1;
  guint          out                          : 1;
  guint          dipper                       : 1;
  guint          allow_none                   : 1;
  guint          optional                     : 1;
  guint          transfer_ownership           : 1;
  guint          transfer_container_ownership : 1;
  guint          is_return_value              : 1;
  guint          scope                        : 3;
  guint          reserved                     :21;

  gint8          closure;
  gint8          destroy;

  SimpleTypeBlob arg_type;
}

name:     A suggested name for the parameter.

in:       The parameter is an input to the function.

out:      The parameter is used to return an output of the function.
          Parameters can be both in and out. Out parameters implicitly
          add another level of indirection to the parameter type. I.e., if
          the type is uint32 in an out parameter, the function actually
          takes a uint32*.

dipper:   The parameter is a pointer to a struct or object that will
          receive an output of the function.

allow_none:
          Only meaningful for types which are passed as pointers.
          For an in parameter, indicates whether it is ok to pass NULL in;
          for an out parameter, whether it may return NULL. Note that NULL
          is a valid GList and GSList value, thus allow_none will normally
          be set for parameters of these types.

optional: For an out parameter, indicates that NULL may be passed in
          if the value is not needed.

transfer_ownership:
          For an in parameter, indicates that the function takes over
          ownership of the parameter value. For an out parameter, it
          indicates that the caller is responsible for freeing the return
          value.

transfer_container_ownership:
          For container types, indicates that the ownership of the container,
          but not of its contents, is transferred. This is typically the case
          for out parameters returning lists of statically allocated things.

is_return_value:
          The parameter should be considered the return value of the function.
          Only out parameters can be marked as return value, and there can be
          at most one per function call. If an out parameter is marked as
          return value, the actual return value of the function should be
          either void or a boolean indicating the success of the call.

scope:    If the parameter is of a callback type, this denotes the scope
          of the user_data and the callback function pointer itself
          (for languages that emit code at run-time).

            0 invalid  -- the argument is not of callback type
            1 call     -- the callback and associated user_data is
                          only used during the call to this function
            2 object   -- the callback and associated user_data is
                          used until the object containing this method
                          is destroyed
            3 async    -- the callback and associated user_data is
                          only used until the callback is invoked, and
                          the callback is always invoked exactly once.
            4 notified -- the callback and associated user_data is
                          used until the caller is notified via the
                          destroy_notify

closure:  Index of the closure (user_data) parameter associated with the
          callback, or -1.

destroy:  Index of the destroy notification callback parameter associated
          with the callback, or -1.

arg_type: Describes the type of the parameter. See details below.


Types are specified by four bytes. If the three high bytes are zero,
the low byte describes a basic type, otherwise the 32bit number is an
offset which points to a TypeBlob.

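The rule in the paragraph above can be sketched in a few lines of C. This is
only an illustration: the function names are hypothetical, and the sketch
works on the four bytes already loaded into a 32-bit integer, leaving aside
how they are read from the typelib.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if the three high bytes are zero, i.e. the value describes a
 * basic type inline rather than pointing to a TypeBlob. */
static int type_is_inline(uint32_t raw)
{
    return (raw & 0xFFFFFF00u) == 0;
}

/* The low byte describing the basic type; only valid for inline types. */
static uint8_t type_basic_byte(uint32_t raw)
{
    assert(type_is_inline(raw));
    return (uint8_t)(raw & 0xFFu);
}

/* The offset of the TypeBlob; only valid for non-inline types. */
static uint32_t type_blob_offset(uint32_t raw)
{
    assert(!type_is_inline(raw));
    return raw;
}
```

A consequence of this encoding is that no TypeBlob can be referenced by an
offset smaller than 256, since such a value would be read as a basic type.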
SimpleTypeBlob (4 bytes)

union SimpleTypeBlob
{
  struct
  {
    guint reserved   :24; /* 0 */
    guint is_pointer : 1;
    guint reserved   : 2;
    guint tag        : 5;
  };
  guint32 offset;
}

is_pointer:
          Indicates whether the type is passed by reference.

tag:      Specifies what kind of type is described, as follows:
           0 void
           1 boolean (booleans are passed as ints)
           2 int8
           3 uint8
           4 int16
           5 uint16
           6 int32
           7 uint32
           8 int64
           9 uint64
          10 int
          11 uint
          12 long
          13 ulong
          14 ssize_t
          15 size_t
          16 float
          17 double
          18 time_t
          19 GType
          20 utf8     (these are zero-terminated char[] and assumed to be
                       in UTF-8)
          21 filename (these are zero-terminated char[] and assumed to be
                       in the GLib filename encoding)

          For utf8 and filename, is_pointer will always be set.

offset:   Offset relative to header->types that points to a TypeBlob.
          Unlike other offsets, this is in words (i.e. 32-bit units) rather
          than bytes.


TypeBlob (4 or more bytes)

union TypeBlob
{
  ArrayTypeBlob     array_type;
  InterfaceTypeBlob interface_type;
  ParameterTypeBlob parameter_type;
  ErrorTypeBlob     error_type;
}


ArrayTypeBlob (4 bytes)

Arrays have a tag value of 20. They are passed by reference, thus is_pointer
is always 1.

struct ArrayTypeBlob
{
  guint          is_pointer      :1; /* 1 */
  guint          reserved        :2;
  guint          tag             :5; /* 20 */
  guint          zero_terminated :1;
  guint          has_length      :1;
  guint          length          :6;

  SimpleTypeBlob type;
}

zero_terminated:
          Indicates that the array must be terminated by a suitable NULL
          value.

has_length:
          Indicates that length points to a parameter specifying the length
          of the array. If both has_length and zero_terminated are set, the
          convention is to pass -1 for the length if the array is
          zero-terminated.
          FIXME: what does this mean for types of field and properties ?

length:   The index of the parameter which is used to pass the length of the
          array. The parameter must be an integer type and have the same
          direction as this one.

type:     The type of the array elements.

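The basic type tags listed for SimpleTypeBlob, and the remark that type
offsets are counted in 32-bit words, can be captured in a small helper. This
is a sketch for illustration; the table contents come from the tag list
above, while the function names are hypothetical. The base that a type
offset is relative to is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Names of the basic type tags 0..21, indexed by tag value. */
static const char *const basic_type_names[] = {
    "void",   "boolean", "int8",    "uint8",  "int16", "uint16",
    "int32",  "uint32",  "int64",   "uint64", "int",   "uint",
    "long",   "ulong",   "ssize_t", "size_t", "float", "double",
    "time_t", "GType",   "utf8",    "filename"
};

/* Returns the name of a basic type tag, or NULL if the tag is not one of
 * the basic types. */
static const char *basic_type_name(unsigned tag)
{
    size_t n = sizeof basic_type_names / sizeof basic_type_names[0];
    return tag < n ? basic_type_names[tag] : NULL;
}

/* Converts a type offset, measured in 32-bit words, to a byte count. */
static uint32_t type_offset_in_bytes(uint32_t word_offset)
{
    assert(word_offset <= UINT32_MAX / 4u);  /* guard against wrap-around */
    return word_offset * 4u;
}
```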
InterfaceTypeBlob (4 bytes)

struct InterfaceTypeBlob
{
  guint   is_pointer :1;
  guint   reserved   :2;
  guint   tag        :5; /* 21 */
  guint8  reserved;

  guint16 interface;
}

Types which are described by an entry in the typelib have a tag value of 21.
If the interface is an enum or flags type, is_pointer is 0, otherwise it is 1.

interface:
          Index of the directory entry for the interface.


ParameterTypeBlob (4 + n * 4 bytes)

GLists have a tag value of 22, GSLists have a tag value of 23, GHashTables
have a tag value of 24. They are passed by reference, thus is_pointer is
always 1.

struct ParameterTypeBlob
{
  guint          is_pointer :1; /* 1 */
  guint          reserved   :2;
  guint          tag        :5; /* 22, 23 or 24 */
  guint          reserved   :8;
  guint16        n_types;

  SimpleTypeBlob type[];
}

n_types:  The number of parameter types to follow.

type:     Describes the type of the list elements.


ErrorTypeBlob (4 + 2 * n_domains bytes)

struct ErrorTypeBlob
{
  guint   is_pointer :1; /* 1 */
  guint   reserved   :2;
  guint   tag        :5; /* 25 */
  guint8  reserved;

  guint16 n_domains;

  guint16 domains[];
}

n_domains:
          The number of domains to follow.

domains:  Indices of the directory entries for the error domains.


ErrorDomainBlob (16 bytes)

struct ErrorDomainBlob
{
  guint16 blob_type;  /* 10 */

  guint   deprecated : 1;
  guint   reserved   :15;

  guint32 name;
  guint32 get_quark;
  guint16 error_codes;
}

get_quark:
          The symbol name of the function which must be called to obtain the
          GQuark for the error domain.

error_codes:
          Index of the InterfaceBlob describing the enumeration which lists
          the possible error codes.


PropertyBlob (12 bytes)

struct PropertyBlob
{
  guint32        name;

  guint          deprecated     : 1;
  guint          readable       : 1;
  guint          writable       : 1;
  guint          construct      : 1;
  guint          construct_only : 1;
  guint          reserved       :27;

  SimpleTypeBlob type;
}

name:     The name of the property.

readable:
writable:
construct:
construct_only:
          The ParamFlags used when registering the property.

type:     Describes the type of the property.

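To make the "4 + 2 * n_domains" size of an ErrorTypeBlob concrete, here is a
minimal sketch that reads the domain indices from a blob held in memory. The
function names are hypothetical. It assumes the layout as listed above (one
byte of flags and tag, one reserved byte, n_domains at byte 2, domains from
byte 4) and host byte order, which the description does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: n_domains sits at byte 2, after the tag byte and the
 * reserved byte, stored in host byte order. */
static uint16_t error_type_n_domains(const uint8_t *blob)
{
    uint16_t n;
    assert(blob != NULL);
    memcpy(&n, blob + 2, sizeof n);
    return n;
}

/* Total size of the blob in bytes: 4 + 2 * n_domains. */
static size_t error_type_blob_size(const uint8_t *blob)
{
    return 4 + 2 * (size_t)error_type_n_domains(blob);
}

/* Returns the directory index stored in domains[i]. */
static uint16_t error_type_domain(const uint8_t *blob, uint16_t i)
{
    uint16_t domain;
    assert(i < error_type_n_domains(blob));
    memcpy(&domain, blob + 4 + 2 * (size_t)i, sizeof domain);
    return domain;
}
```

The returned values are 1-based directory indexes, like all other references
to directory entries.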
SignalBlob (12 bytes)

struct SignalBlob
{
  guint32 name;

  guint   deprecated        : 1;
  guint   run_first         : 1;
  guint   run_last          : 1;
  guint   run_cleanup       : 1;
  guint   no_recurse        : 1;
  guint   detailed          : 1;
  guint   action            : 1;
  guint   no_hooks          : 1;
  guint   has_class_closure : 1;
  guint   true_stops_emit   : 1;
  guint   reserved          : 5;

  guint16 class_closure;
  guint32 signature;
}

name:     The name of the signal.

run_first:
run_last:
run_cleanup:
no_recurse:
detailed:
action:
no_hooks: The flags used when registering the signal.

has_class_closure:
          Set if the signal has a class closure.

true_stops_emit:
          Whether the signal has true-stops-emit semantics.

class_closure:
          The index of the class closure in the list of virtual functions
          of the object or interface on which the signal is defined.

signature:
          Offset of the SignatureBlob describing the parameter types and the
          return value type.


VirtualFunctionBlob (16 bytes)

struct VirtualFunctionBlob
{
  guint32 name;

  guint   must_chain_up           : 1;
  guint   must_be_implemented     : 1;
  guint   must_not_be_implemented : 1;
  guint   is_class_closure        : 1;
  guint   reserved                :12;

  guint16 signal;
  guint16 struct_offset;
  guint16 reserved;
  guint32 signature;
}

name:     The name of the virtual function.

must_chain_up:
          If set, every implementation of this virtual function must
          chain up to the implementation of the parent class.

must_be_implemented:
          If set, every derived class must override this virtual function.

must_not_be_implemented:
          If set, derived class must not override this virtual function.

is_class_closure:
          Set if this virtual function is the class closure of a signal.

signal:   The index of the signal in the list of signals of the object or
          interface to which this virtual function belongs.

struct_offset:
          The offset of the function pointer in the class struct. The value
          0xFFFF indicates that the struct offset is unknown.

signature:
          Offset of the SignatureBlob describing the parameter types and the
          return value type.

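As a sketch of how struct_offset might be used, the helper below fetches a
function pointer from a class struct and treats 0xFFFF as "unknown". It is
an illustration only: vfunc_lookup, generic_fn, demo_class and demo_second
are hypothetical names, and a real binding would also need the signature to
call the function correctly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STRUCT_OFFSET_UNKNOWN 0xFFFFu

typedef void (*generic_fn)(void);

/* Returns the function pointer stored struct_offset bytes into the class
 * struct, or NULL if the offset is recorded as unknown. */
static generic_fn vfunc_lookup(const void *klass, uint16_t struct_offset)
{
    generic_fn fn;

    assert(klass != NULL);
    if (struct_offset == STRUCT_OFFSET_UNKNOWN)
        return NULL;

    memcpy(&fn, (const unsigned char *)klass + struct_offset, sizeof fn);
    return fn;
}

/* Hypothetical class struct and implementation, for demonstration only. */
struct demo_class {
    void (*first)(void);
    void (*second)(void);
};

static void demo_second(void)
{
}
```

With a struct demo_class whose second member is set to demo_second, calling
vfunc_lookup with offsetof(struct demo_class, second) yields demo_second.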
FieldBlob (12 bytes)

struct FieldBlob
{
  guint32        name;

  guint          readable : 1;
  guint          writable : 1;
  guint          reserved : 6;
  guint8         bits;

  guint16        struct_offset;

  SimpleTypeBlob type;
}

name:     The name of the field.

readable:
writable: How the field may be accessed.

bits:     If this field is part of a bitfield, the number of bits which it
          uses, otherwise 0.

struct_offset:
          The offset of the field in the struct. The value 0xFFFF indicates
          that the struct offset is unknown.

type:     The type of the field.


ValueBlob (12 bytes)

Values commonly occur in enums and flags.

struct ValueBlob
{
  guint   deprecated : 1;
  guint   reserved   :31;
  guint32 name;
  guint32 value;
}

value:    The numerical value.


GTypeBlob (8 bytes)

struct GTypeBlob
{
  guint32 gtype_name;
  guint32 gtype_init;
}

gtype_name:
          The name under which the type is registered with GType.

gtype_init:
          The symbol name of the get_type() function which registers the type.


StructBlob (20 + 8 * n_fields + x * n_functions)

struct StructBlob
{
  guint16      blob_type;  /* 3: struct, 4: boxed */
  guint        deprecated   : 1;
  guint        unregistered : 1;
  guint        alignment    : 6;
  guint        reserved     : 8;
  guint32      name;

  GTypeBlob    gtype;

  guint32      size;

  guint16      n_fields;
  guint16      n_functions;

  FieldBlob    fields[];
  FunctionBlob functions[];
}

unregistered:
          If this is set, the type is not registered with GType.

alignment:
          The byte boundary that the struct is aligned to in memory.

size:     The size of the struct in bytes.

gtype:    For types which are registered with GType, contains the
          information about the GType. Otherwise unused.

n_fields:
n_functions:
          The lengths of the arrays.

fields:   An array of n_fields FieldBlobs.

functions:
          An array of n_functions FunctionBlobs. The described functions
          should be considered as methods of the struct.

EnumBlob (20 + 16 * n_values)

struct EnumBlob
{
  guint16   blob_type;  /* 5: enum, 6: flags */
  guint     deprecated   : 1;
  guint     unregistered : 1;
  guint     storage_type : 5;
  guint     reserved     : 9;
  guint32   name;

  GTypeBlob gtype;

  guint16   n_values;
  guint16   reserved;

  ValueBlob values[];
}

unregistered:
          If this is set, the type is not registered with GType.

storage_type:
          The tag of the type used for the enum in the C ABI
          (will be a signed or unsigned integral type).

gtype:    For types which are registered with GType, contains the
          information about the GType. Otherwise unused.

n_values: The length of the values array.

values:   Describes the enum values.


ObjectBlob (32 + x bytes)

struct ObjectBlob
{
  guint16 blob_type;  /* 7 */
  guint   deprecated : 1;
  guint   abstract   : 1;
  guint   reserved   :14;
  guint32 name;

  GTypeBlob gtype;

  guint16 parent;

  guint16 n_interfaces;
  guint16 n_fields;
  guint16 n_properties;
  guint16 n_methods;
  guint16 n_signals;
  guint16 n_virtual_functions;
  guint16 n_constants;

  guint16 interfaces[];

  FieldBlob           fields[];
  PropertyBlob        properties[];
  FunctionBlob        methods[];
  SignalBlob          signals[];
  VirtualFunctionBlob virtual_functions[];
  ConstantBlob        constants[];
}

gtype:    Contains the information about the GType.

parent:   The directory index of the parent type. This is only set for
          objects. If an object does not have a parent, it is zero.

n_interfaces:
n_fields:
n_properties:
n_methods:
n_signals:
n_virtual_functions:
n_constants:
          The lengths of the arrays. Up to 16bits of padding may be inserted
          between the arrays to ensure that they start on a 32bit boundary.

interfaces:
          An array of indices of directory entries for the implemented
          interfaces.

fields:   Describes the fields.

methods:  Describes the methods, constructors, setters and getters.

properties:
          Describes the properties.

signals:  Describes the signals.

virtual_functions:
          Describes the virtual functions.

constants:
          Describes the constants.

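The padding rule for ObjectBlob arrays can be illustrated with the first two
arrays: interfaces[] holds 16-bit indices, so an odd n_interfaces leaves the
following fields[] array misaligned unless two bytes of padding are added.
The sketch below is a hypothetical helper, not code from an existing parser;
object_blob_size is meant to be the value read from the header.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds an offset up to the next 32-bit boundary. */
static uint32_t align4(uint32_t offset)
{
    assert(offset <= UINT32_MAX - 3u);  /* guard against wrap-around */
    return (offset + 3u) & ~(uint32_t)3u;
}

/* Typelib offset of the fields[] array of the ObjectBlob starting at blob:
 * the fixed part, then n_interfaces 16-bit indices, then padding up to a
 * 32-bit boundary. */
static uint32_t object_fields_offset(uint32_t blob, uint16_t object_blob_size,
                                     uint16_t n_interfaces)
{
    return align4(blob + object_blob_size + 2u * (uint32_t)n_interfaces);
}
```

For a blob at offset 100 with a fixed part of 32 bytes, three interfaces end
at offset 138, so fields[] would start at 140; with two interfaces no padding
is needed and fields[] starts at 136.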
InterfaceBlob (28 + x bytes)

struct InterfaceBlob
{
  guint16 blob_type;  /* 8 */
  guint   deprecated : 1;
  guint   reserved   :15;
  guint32 name;

  GTypeBlob gtype;

  guint16 n_prerequisites;
  guint16 n_properties;
  guint16 n_methods;
  guint16 n_signals;
  guint16 n_virtual_functions;
  guint16 n_constants;

  guint16 prerequisites[];

  PropertyBlob        properties[];
  FunctionBlob        methods[];
  SignalBlob          signals[];
  VirtualFunctionBlob virtual_functions[];
  ConstantBlob        constants[];
}

n_prerequisites:
n_properties:
n_methods:
n_signals:
n_virtual_functions:
n_constants:
          The lengths of the arrays. Up to 16bits of padding may be inserted
          between the arrays to ensure that they start on a 32bit boundary.

prerequisites:
          An array of indices of directory entries for required interfaces.

methods:  Describes the methods, constructors, setters and getters.

properties:
          Describes the properties.

signals:  Describes the signals.

virtual_functions:
          Describes the virtual functions.

constants:
          Describes the constants.


ConstantBlob (20 bytes)

struct ConstantBlob
{
  guint16        blob_type;  /* 9 */
  guint          deprecated : 1;
  guint          reserved   :15;
  guint32        name;

  SimpleTypeBlob type;

  guint32        size;
  guint32        offset;
}

type:     The type of the value. In most cases this should be a numeric
          type or string.

size:     The size of the value in bytes.

offset:   The offset of the value in the typelib.


AnnotationBlob (12 bytes)

struct AnnotationBlob
{
  guint32 offset;
  guint32 name;
  guint32 value;
}

offset:   The offset of the typelib entry to which this annotation refers.
          Annotations are kept sorted by offset, so that the annotations
          of an entry can be found by a binary search.

name:     The name of the annotation, a string.

value:    The value of the annotation (also a string).


UnionBlob (28 + x bytes)

struct UnionBlob
{
  guint16        blob_type;  /* 11 */
  guint          deprecated    : 1;
  guint          unregistered  : 1;
  guint          discriminated : 1;
  guint          alignment     : 6;
  guint          reserved      : 7;
  guint32        name;

  GTypeBlob      gtype;

  guint32        size;

  guint16        n_fields;
  guint16        n_functions;

  gint32         discriminator_offset;
  SimpleTypeBlob discriminator_type;

  FieldBlob      fields[];
  FunctionBlob   functions[];
  ConstantBlob   discriminator_values[];
}

unregistered:
          If this is set, the type is not registered with GType.

discriminated:
          Is set if the union is discriminated.

alignment:
          The byte boundary that the union is aligned to in memory.

size:     The size of the union in bytes.

gtype:    For types which are registered with GType, contains the
          information about the GType. Otherwise unused.

n_fields: Length of the arrays.

discriminator_offset:
          Offset from the beginning of the union where the discriminator
          of a discriminated union is located. The value 0xFFFF indicates
          that the discriminator offset is unknown.

discriminator_type:
          Type of the discriminator.

discriminator_values:
          One discriminator value per field.

fields:   Array of FieldBlobs describing the alternative branches of the
          union.
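
Returning to the annotation list: since AnnotationBlobs are kept sorted by
offset, the annotations of an entry can be located with a lower-bound binary
search and then read sequentially while the offset matches. The following is
a minimal sketch under the assumption that the annotation list has already
been loaded into an array of 12-byte structs; the annotation type and the
function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory mirror of AnnotationBlob; name and value are string offsets. */
typedef struct {
    uint32_t offset;
    uint32_t name;
    uint32_t value;
} annotation;

/* Returns the index of the first annotation that refers to the entry at
 * the given offset, or n if there is none. The array must be sorted by
 * offset in ascending order. */
static size_t annotation_find_first(const annotation *a, size_t n,
                                    uint32_t offset)
{
    size_t lo = 0, hi = n;

    assert(a != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid].offset < offset)
            lo = mid + 1;   /* everything up to mid is too small */
        else
            hi = mid;       /* mid may be the first match */
    }
    return (lo < n && a[lo].offset == offset) ? lo : n;
}
```

Starting from the returned index, a caller can step forward while
a[i].offset still equals the requested offset to visit every annotation of
that entry.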