diff options
author | Steffen Mueller <smueller@cpan.org> | 2012-01-28 17:47:16 +0100 |
---|---|---|
committer | Steffen Mueller <smueller@cpan.org> | 2012-02-01 08:07:49 +0100 |
commit | 0eb29def25ed2263d8d36222cdd82b0e7694d181 (patch) | |
tree | a90ccb89c414a9db1a0f8e982c9d4819be167086 /dist | |
parent | 56ab3cdc1c88e9c7720d71c2800aae249a917b9a (diff) | |
download | perl-0eb29def25ed2263d8d36222cdd82b0e7694d181.tar.gz |
Move typemap documentation to its own file
Sadly, the POD in Typemap.xs was not easily extractable into a POD file
at build time, so it now lives in a separate POD file from the start.
Makes keeping documentation and testing efforts in sync marginally
harder, but it's probably the right trade-off.
What's left to do is finding the right places in other POD files to
refer to this old/new documentation.
Diffstat (limited to 'dist')
-rw-r--r-- | dist/ExtUtils-ParseXS/lib/perlxstypemap.pod | 578 |
1 files changed, 578 insertions, 0 deletions
diff --git a/dist/ExtUtils-ParseXS/lib/perlxstypemap.pod b/dist/ExtUtils-ParseXS/lib/perlxstypemap.pod new file mode 100644 index 0000000000..8424041b7e --- /dev/null +++ b/dist/ExtUtils-ParseXS/lib/perlxstypemap.pod @@ -0,0 +1,578 @@ +=head1 NAME + +perlxstypemap - Perl XS C/Perl type mapping + +=head1 DESCRIPTION + +The more you think about interfacing between two languages, the more +you'll realize that the majority of programmer effort has to go into +converting between the data structures that are native to either of +the languages involved. This trumps other matter such as differing +calling conventions because the problem space is so much greater. +There are simply more ways to shove data into memory than there are +ways to implement a function call. + +Perl XS' attempt at a solution to this is the concept of typemaps. +At an abstract level, a Perl XS typemap is nothing but a recipe for +converting from a certain Perl data structure to a certain C +data structure and/or vice versa. Since there can be C types that +are sufficiently similar to warrant converting with the same logic, +XS typemaps are represented by a unique identifier, called XS type +henceforth in this document. You can then tell the XS compiler that +multiple C types are to be mapped with the same XS typemap. + +In your XS code, when you define an argument with a C type or when +you are using a C<CODE:> and an C<OUTPUT:> section together with a +C return type of your XSUB, it'll be the typemapping mechanism that +makes this easy. + +=head2 Anatomy of a typemap File + +Traditionally, typemaps needed to be written to a separate file, +conventionally called C<typemap>. With ExtUtils::ParseXS (the XS +compiler) version 3.00 or better (comes with perl 5.16), typemaps +can also be embedded directly into your XS code using a HERE-doc +like syntax: + + TYPEMAP: <<HERE + ... + HERE + +where C<HERE> can be replaced by other identifiers like with normal +Perl HERE-docs. All details below about the typemap textual format +remain valid. + +A typemap file generally has three sections: The C<TYPEMAP> +section is used to associate C types with XS type identifiers. +The C<INPUT> section is used to define the typemaps for I<input> +into the XSUB from Perl, and the C<OUTPUT> section has the opposite +conversion logic for getting data out of an XSUB back into Perl. + +Each section is started by the section name in capital letters on a +line of its own. A typemap file implicitly starts in the C<TYPEMAP> +section. Each type of section can appear an arbitrary number of times +and does not have to appear at all. For example, a typemap file may +lack C<INPUT> and C<OUTPUT> sections if all it needs to do is +associate additional C types with core XS types like T_PTROBJ. +Lines that start with a hash C<#> are considered comments and ignored +in the C<TYPEMAP> section, but are considered significant in C<INPUT> +and C<OUTPUT>. Blank lines are generally ignored. + +The C<TYPEMAP> section should contain one pair of C type and +XS type per line as follows. An example from the core typemap file: + + TYPEMAP + # all variants of char* is handled by the T_PV typemap + char * T_PV + const char * T_PV + unsigned char * T_PV + ... + +The C<INPUT> and C<OUTPUT> sections have identical formats, that is, +each unindented line starts a new in- or output map respectively. +A new in- or output map must start with the name of the XS type to +map on a line by itself, followed by the code that implements it +indented on the following lines. Example: + + INPUT + T_PV + $var = ($type)SvPV_nolen($arg) + T_PTR + $var = INT2PTR($type,SvIV($arg)) + +We'll get to the meaning of those Perlish-looking variables in a +little bit. + +Finally, here's an example of the full typemap file for mapping C +strings of the C<char *> type to Perl scalars/strings: + + TYPEMAP + char * T_PV + + INPUT + T_PV + $var = ($type)SvPV_nolen($arg) + + OUTPUT + T_PV + sv_setpv((SV*)$arg, $var); + +=head2 The Role of the typemap File in Your Distribution + +For CPAN distributions, you can assume that the XS types defined by +the perl core are already available. Additionally, the core typemap +has default XS types for a large number of C types. For example, if +you simply return a C<char *> from your XSUB, the core typemap will +have this C type associated with the T_PV XS type. That means your +C string will be copied into the PV (pointer value) slot of a new scalar +that will be returned from your XSUB to to Perl. + +If you're developing a CPAN distribution using XS, you may add your own +file called F<typemap> to the distribution. That file may contain +typemaps that either map types that are specific to your code or that +override the core typemap file's mappings for common C types. + +=head2 Sharing typemaps Between CPAN Distributions + +Starting with ExtUtils::ParseXS version 3.13_01 (comes with perl 5.16 +and better), it is rather easy to share typemap code between multiple +CPAN distributions. The general idea is to share it as a module that +offers a certain API and have the dependent modules declare that as a +built-time requirement and import the typemap into the XS. An example +of such a typemap-sharing module on CPAN is +C<ExtUtils::Typemaps::Basic>. Two steps to getting that module's +typemaps available in your code: + +=over 4 + +=item * + +Declare C<ExtUtils::Typemaps::Basic> as a build-time dependency +in C<Makefile.PL> (use C<BUILD_REQUIRES>), or in your C<Build.PL> +(use C<build_requires>). + +=item * + +Include the following line in the XS section of your XS file: +(don't break the line) + + INCLUDE_COMMAND: $^X -MExtUtils::Typemaps::Cmd + -e "print embeddable_typemap(q{Basic})" + +=back + +=head2 Full Listing of Core Typemaps + +Each C type is represented by an entry in the typemap file that +is responsible for converting perl variables (SV, AV, HV, CV, etc.) +to and from that type. The following sections list all XS types +that come with perl by default. + +=over 4 + +=item T_SV + +This simply passes the C representation of the Perl variable (an SV*) +in and out of the XS layer. This can be used if the C code wants +to deal directly with the Perl variable. + +=item T_SVREF + +Used to pass in and return a reference to an SV. + +Note that this typemap does not decrement the reference count +when returning the reference to an SV*. +See also: T_SVREF_REFCOUNT_FIXED + +=item T_SVREF_FIXED + +Used to pass in and return a reference to an SV. +This is a fixed +variant of T_SVREF that decrements the refcount appropriately +when returning a reference to an SV*. Introduced in perl 5.15.4. + +=item T_AVREF + +From the perl level this is a reference to a perl array. +From the C level this is a pointer to an AV. + +Note that this typemap does not decrement the reference count +when returning an AV*. See also: T_AVREF_REFCOUNT_FIXED + +=item T_AVREF_REFCOUNT_FIXED + +From the perl level this is a reference to a perl array. +From the C level this is a pointer to an AV. This is a fixed +variant of T_AVREF that decrements the refcount appropriately +when returning an AV*. Introduced in perl 5.15.4. + +=item T_HVREF + +From the perl level this is a reference to a perl hash. +From the C level this is a pointer to an HV. + +Note that this typemap does not decrement the reference count +when returning an HV*. See also: T_HVREF_REFCOUNT_FIXED + +=item T_HVREF_REFCOUNT_FIXED + +From the perl level this is a reference to a perl hash. +From the C level this is a pointer to an HV. This is a fixed +variant of T_HVREF that decrements the refcount appropriately +when returning an HV*. Introduced in perl 5.15.4. + +=item T_CVREF + +From the perl level this is a reference to a perl subroutine +(e.g. $sub = sub { 1 };). From the C level this is a pointer +to a CV. + +Note that this typemap does not decrement the reference count +when returning an HV*. See also: T_HVREF_REFCOUNT_FIXED + +=item T_CVREF_REFCOUNT_FIXED + +From the perl level this is a reference to a perl subroutine +(e.g. $sub = sub { 1 };). From the C level this is a pointer +to a CV. + +This is a fixed +variant of T_HVREF that decrements the refcount appropriately +when returning an HV*. Introduced in perl 5.15.4. + +=item T_SYSRET + +The T_SYSRET typemap is used to process return values from system calls. +It is only meaningful when passing values from C to perl (there is +no concept of passing a system return value from Perl to C). + +System calls return -1 on error (setting ERRNO with the reason) +and (usually) 0 on success. If the return value is -1 this typemap +returns C<undef>. If the return value is not -1, this typemap +translates a 0 (perl false) to "0 but true" (which +is perl true) or returns the value itself, to indicate that the +command succeeded. + +The L<POSIX|POSIX> module makes extensive use of this type. + +=item T_UV + +An unsigned integer. + +=item T_IV + +A signed integer. This is cast to the required integer type when +passed to C and converted to an IV when passed back to Perl. + +=item T_INT + +A signed integer. This typemap converts the Perl value to a native +integer type (the C<int> type on the current platform). When returning +the value to perl it is processed in the same way as for T_IV. + +Its behaviour is identical to using an C<int> type in XS with T_IV. + +=item T_ENUM + +An enum value. Used to transfer an enum component +from C. There is no reason to pass an enum value to C since +it is stored as an IV inside perl. + +=item T_BOOL + +A boolean type. This can be used to pass true and false values to and +from C. + +=item T_U_INT + +This is for unsigned integers. It is equivalent to using T_UV +but explicitly casts the variable to type C<unsigned int>. +The default type for C<unsigned int> is T_UV. + +=item T_SHORT + +Short integers. This is equivalent to T_IV but explicitly casts +the return to type C<short>. The default typemap for C<short> +is T_IV. + +=item T_U_SHORT + +Unsigned short integers. This is equivalent to T_UV but explicitly +casts the return to type C<unsigned short>. The default typemap for +C<unsigned short> is T_UV. + +T_U_SHORT is used for type C<U16> in the standard typemap. + +=item T_LONG + +Long integers. This is equivalent to T_IV but explicitly casts +the return to type C<long>. The default typemap for C<long> +is T_IV. + +=item T_U_LONG + +Unsigned long integers. This is equivalent to T_UV but explicitly +casts the return to type C<unsigned long>. The default typemap for +C<unsigned long> is T_UV. + +T_U_LONG is used for type C<U32> in the standard typemap. + +=item T_CHAR + +Single 8-bit characters. + +=item T_U_CHAR + +An unsigned byte. + +=item T_FLOAT + +A floating point number. This typemap guarantees to return a variable +cast to a C<float>. + +=item T_NV + +A Perl floating point number. Similar to T_IV and T_UV in that the +return type is cast to the requested numeric type rather than +to a specific type. + +=item T_DOUBLE + +A double precision floating point number. This typemap guarantees to +return a variable cast to a C<double>. + +=item T_PV + +A string (char *). + +=item T_PTR + +A memory address (pointer). Typically associated with a C<void *> +type. + +=item T_PTRREF + +Similar to T_PTR except that the pointer is stored in a scalar and the +reference to that scalar is returned to the caller. This can be used +to hide the actual pointer value from the programmer since it is usually +not required directly from within perl. + +The typemap checks that a scalar reference is passed from perl to XS. + +=item T_PTROBJ + +Similar to T_PTRREF except that the reference is blessed into a class. +This allows the pointer to be used as an object. Most commonly used to +deal with C structs. The typemap checks that the perl object passed +into the XS routine is of the correct class (or part of a subclass). + +The pointer is blessed into a class that is derived from the name +of type of the pointer but with all '*' in the name replaced with +'Ptr'. + +=item T_REF_IV_REF + +NOT YET + +=item T_REF_IV_PTR + +Similar to T_PTROBJ in that the pointer is blessed into a scalar object. +The difference is that when the object is passed back into XS it must be +of the correct type (inheritance is not supported). + +The pointer is blessed into a class that is derived from the name +of type of the pointer but with all '*' in the name replaced with +'Ptr'. + +=item T_PTRDESC + +NOT YET + +=item T_REFREF + +Similar to T_PTRREF, except the pointer stored in the referenced scalar +is dereferenced and copied to the output variable. This means that +T_REFREF is to T_PTRREF as T_OPAQUE is to T_OPAQUEPTR. All clear? + +Only the INPUT part of this is implemented (Perl to XSUB) and there +are no known users in core or on CPAN. + +=item T_REFOBJ + +NOT YET + +=item T_OPAQUEPTR + +This can be used to store bytes in the string component of the +SV. Here the representation of the data is irrelevant to perl and the +bytes themselves are just stored in the SV. It is assumed that the C +variable is a pointer (the bytes are copied from that memory +location). If the pointer is pointing to something that is +represented by 8 bytes then those 8 bytes are stored in the SV (and +length() will report a value of 8). This entry is similar to T_OPAQUE. + +In principal the unpack() command can be used to convert the bytes +back to a number (if the underlying type is known to be a number). + +This entry can be used to store a C structure (the number +of bytes to be copied is calculated using the C C<sizeof> function) +and can be used as an alternative to T_PTRREF without having to worry +about a memory leak (since Perl will clean up the SV). + +=item T_OPAQUE + +This can be used to store data from non-pointer types in the string +part of an SV. It is similar to T_OPAQUEPTR except that the +typemap retrieves the pointer directly rather than assuming it +is being supplied. For example, if an integer is imported into +Perl using T_OPAQUE rather than T_IV the underlying bytes representing +the integer will be stored in the SV but the actual integer value will +not be available. i.e. The data is opaque to perl. + +The data may be retrieved using the C<unpack> function if the +underlying type of the byte stream is known. + +T_OPAQUE supports input and output of simple types. +T_OPAQUEPTR can be used to pass these bytes back into C if a pointer +is acceptable. + +=item Implicit array + +xsubpp supports a special syntax for returning +packed C arrays to perl. If the XS return type is given as + + array(type, nelem) + +xsubpp will copy the contents of C<nelem * sizeof(type)> bytes from +RETVAL to an SV and push it onto the stack. This is only really useful +if the number of items to be returned is known at compile time and you +don't mind having a string of bytes in your SV. Use T_ARRAY to push a +variable number of arguments onto the return stack (they won't be +packed as a single string though). + +This is similar to using T_OPAQUEPTR but can be used to process more +than one element. + +=item T_PACKED + +Calls user-supplied functions for conversion. For C<OUTPUT> +(XSUB to Perl), a function named C<XS_pack_$ntype> is called +with the output Perl scalar and the C variable to convert from. +C<$ntype> is the normalized C type that is to be mapped to +Perl. Normalized means that all C<*> are replaced by the +string C<Ptr>. The return value of the function is ignored. + +Conversely for C<INPUT> (Perl to XSUB) mapping, the +function named C<XS_unpack_$ntype> is called with the input Perl +scalar as argument and the return value is cast to the mapped +C type and assigned to the output C variable. + +An example conversion function for a typemapped struct +C<foo_t *> might be: + + static void + XS_pack_foo_tPtr(SV *out, foo_t *in) + { + dTHX; /* alas, signature does not include pTHX_ */ + HV* hash = newHV(); + hv_stores(hash, "int_member", newSViv(in->int_member)); + hv_stores(hash, "float_member", newSVnv(in->float_member)); + /* ... */ + + /* mortalize as thy stack is not refcounted */ + sv_setsv(out, sv_2mortal(newRV_noinc((SV*)hash))); + } + +The conversion from Perl to C is left as an exercise to the reader, +but the prototype would be: + + static foo_t * + XS_unpack_foo_tPtr(SV *in); + +Instead of an actual C function that has to fetch the thread context +using C<dTHX>, you can define macros of the same name and avoid the +overhead. Also, keep in mind to possibly free the memory allocated by +C<XS_unpack_foo_tPtr>. + +=item T_PACKEDARRAY + +T_PACKEDARRAY is similar to T_PACKED. In fact, the C<INPUT> (Perl +to XSUB) typemap is indentical, but the C<OUTPUT> typemap passes +an additional argument to the C<XS_pack_$ntype> function. This +third parameter indicates the number of elements in the output +so that the function can handle C arrays sanely. The variable +needs to be declared by the user and must have the name +C<count_$ntype> where C<$ntype> is the normalized C type name +as explained above. The signature of the function would be for +the example above and C<foo_t **>: + + static void + XS_pack_foo_tPtrPtr(SV *out, foo_t *in, UV count_foo_tPtrPtr); + +The type of the third parameter is arbitrary as far as the typemap +is concerned. It just has to be in line with the declared variable. + +Of course, unless you know the number of elements in the +C<sometype **> C array, within your XSUB, the return value from +C<foo_t ** XS_unpack_foo_tPtrPtr(...)> will be hard to decypher. +Since the details are all up to the XS author (the typemap user), +there are several solutions, none of which particularly elegant. +The most commonly seen solution has been to allocate memory for +N+1 pointers and assign C<NULL> to the (N+1)th to facilitate +iteration. + +Alternatively, using a customized typemap for your purposes in +the first place is probably preferrable. + +=item T_DATAUNIT + +NOT YET + +=item T_CALLBACK + +NOT YET + +=item T_ARRAY + +This is used to convert the perl argument list to a C array +and for pushing the contents of a C array onto the perl +argument stack. + +The usual calling signature is + + @out = array_func( @in ); + +Any number of arguments can occur in the list before the array but +the input and output arrays must be the last elements in the list. + +When used to pass a perl list to C the XS writer must provide a +function (named after the array type but with 'Ptr' substituted for +'*') to allocate the memory required to hold the list. A pointer +should be returned. It is up to the XS writer to free the memory on +exit from the function. The variable C<ix_$var> is set to the number +of elements in the new array. + +When returning a C array to Perl the XS writer must provide an integer +variable called C<size_$var> containing the number of elements in the +array. This is used to determine how many elements should be pushed +onto the return argument stack. This is not required on input since +Perl knows how many arguments are on the stack when the routine is +called. Ordinarily this variable would be called C<size_RETVAL>. + +Additionally, the type of each element is determined from the type of +the array. If the array uses type C<intArray *> xsubpp will +automatically work out that it contains variables of type C<int> and +use that typemap entry to perform the copy of each element. All +pointer '*' and 'Array' tags are removed from the name to determine +the subtype. + +=item T_STDIO + +This is used for passing perl filehandles to and from C using +C<FILE *> structures. + +=item T_INOUT + +This is used for passing perl filehandles to and from C using +C<PerlIO *> structures. The file handle can used for reading and +writing. This corresponds to the C<+E<lt>> mode, see also T_IN +and T_OUT. + +See L<perliol> for more information on the Perl IO abstraction +layer. Perl must have been built with C<-Duseperlio>. + +There is no check to assert that the filehandle passed from Perl +to C was created with the right C<open()> mode. + +=item T_IN + +Same as T_INOUT, but the filehandle that is returned from C to Perl +can only be used for reading (mode C<E<lt>>). + +=item T_OUT + +Same as T_INOUT, but the filehandle that is returned from C to Perl +is set to use the open mode C<+E<gt>>. + +=back + |