author | Larry Wall <lwall@netlabs.com> | 1994-10-17 23:00:00 +0000 |
---|---|---|
committer | Larry Wall <lwall@netlabs.com> | 1994-10-17 23:00:00 +0000 |
commit | a0d0e21ea6ea90a22318550944fe6cb09ae10cda (patch) | |
tree | faca1018149b736b1142f487e44d1ff2de5cc1fa /pod/perlguts.pod | |
parent | 85e6fe838fb25b257a1b363debf8691c0992ef71 (diff) | |
download | perl-a0d0e21ea6ea90a22318550944fe6cb09ae10cda.tar.gz |
perl 5.000 (tag: perl-5.000)
[editor's note: this commit combines approximately 4 months of furious
releases of Andy Dougherty and Larry Wall - see pod/perlhist.pod for
details. Andy notes that:
Alas neither my "Irwin AccuTrack" nor my DC 600A quarter-inch cartridge
backup tapes from that era seem to be readable anymore. I guess 13 years
exceeds the shelf life for that backup technology :-(.
]
Diffstat (limited to 'pod/perlguts.pod')
-rw-r--r-- | pod/perlguts.pod | 521 |
1 files changed, 521 insertions, 0 deletions
diff --git a/pod/perlguts.pod b/pod/perlguts.pod
new file mode 100644
index 0000000000..a08ac95340
--- /dev/null
+++ b/pod/perlguts.pod
@@ -0,0 +1,521 @@
+=head1 NAME
+
+perlguts - Perl's Internal Functions
+
+=head1 DESCRIPTION
+
+This document attempts to describe some of the internal functions of the
+Perl executable. It is far from complete and probably contains many errors.
+Please refer any questions or comments to the author below.
+
+=head1 Datatypes
+
+Perl has three typedefs that handle Perl's three main data types:
+
+    SV  Scalar Value
+    AV  Array Value
+    HV  Hash Value
+
+Each typedef has specific routines that manipulate the various data types.
+
+=head2 What is an "IV"?
+
+Perl uses a special typedef IV which is large enough to hold either an
+integer or a pointer.
+
+Perl also uses a special typedef I32 which will always be a 32-bit integer.
+
+=head2 Working with SV's
+
+An SV can be created and loaded with one command. There are four types of
+values that can be loaded: an integer value (IV), a double (NV), a string
+(PV), and another scalar (SV).
+
+The four routines are:
+
+    SV* newSViv(IV);
+    SV* newSVnv(double);
+    SV* newSVpv(char*, int);
+    SV* newSVsv(SV*);
+
+To change the value of an *already-existing* scalar, there are five routines:
+
+    void sv_setiv(SV*, IV);
+    void sv_setnv(SV*, double);
+    void sv_setpvn(SV*, char*, int);
+    void sv_setpv(SV*, char*);
+    void sv_setsv(SV*, SV*);
+
+Notice that you can choose to specify the length of the string to be
+assigned by using C<sv_setpvn>, or allow Perl to calculate the length by
+using C<sv_setpv>. Be warned, though, that C<sv_setpv> determines the
+string's length by using C<strlen>, which depends on the string terminating
+with a NUL character.
+
+To access the actual value that an SV points to, you can use the macros:
+
+    SvIV(SV*)
+    SvNV(SV*)
+    SvPV(SV*, STRLEN len)
+
+which will automatically coerce the actual scalar type into an IV, double,
+or string.
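The warning above about C<sv_setpv> relying on C<strlen> can be illustrated in plain C, without any Perl headers. The sketch below is only a model: the C<counted_str> type and both setter names are hypothetical stand-ins, not part of Perl's API, and the fixed 32-byte buffer is an arbitrary assumption made to keep the example short.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical counted string: a buffer plus an explicit length.
   This is a stand-in for illustration, not a Perl data type. */
typedef struct {
    char   buf[32];
    size_t len;
} counted_str;

/* Models the sv_setpvn style: the caller supplies the length,
   so embedded NUL bytes are copied like any other byte. */
void set_with_len(counted_str *s, const char *p, size_t n) {
    assert(n <= sizeof s->buf);   /* toy buffer, no growth */
    memcpy(s->buf, p, n);
    s->len = n;
}

/* Models the sv_setpv style: the length comes from strlen,
   so the data is cut off at the first NUL byte. */
void set_with_strlen(counted_str *s, const char *p) {
    set_with_len(s, p, strlen(p));
}
```

Given the bytes 'a', 'b', NUL, 'c', 'd', calling set_with_len with a length of 5 records all five bytes, while set_with_strlen records a length of 2.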
+
+In the C<SvPV> macro, the length of the string returned is placed into the
+variable C<len> (this is a macro, so you do I<not> use C<&len>). If you do not
+care what the length of the data is, use the global variable C<na>. Remember,
+however, that Perl allows arbitrary strings of data that may both contain
+NUL's and not be terminated by a NUL.
+
+If you simply want to know if the scalar value is TRUE, you can use:
+
+    SvTRUE(SV*)
+
+Although Perl will automatically grow strings for you, if you need to force
+Perl to allocate more memory for your SV, you can use the macro
+
+    SvGROW(SV*, STRLEN newlen)
+
+which will determine if more memory needs to be allocated. If so, it will
+call the function C<sv_grow>. Note that C<SvGROW> can only increase, not
+decrease, the allocated memory of an SV.
+
+If you have an SV and want to know what kind of data Perl thinks is stored
+in it, you can use the following macros to check the type of SV you have.
+
+    SvIOK(SV*)
+    SvNOK(SV*)
+    SvPOK(SV*)
+
+You can get and set the current length of the string stored in an SV with
+the following macros:
+
+    SvCUR(SV*)
+    SvCUR_set(SV*, I32 val)
+
+But note that these are valid only if C<SvPOK()> is true.
+
+If you know the name of a scalar variable, you can get a pointer to its SV
+by using the following:
+
+    SV* perl_get_sv("varname", FALSE);
+
+This returns NULL if the variable does not exist.
+
+If you want to know if this variable (or any other SV) is actually defined,
+you can call:
+
+    SvOK(SV*)
+
+The scalar C<undef> value is stored in an SV instance called C<sv_undef>. Its
+address can be used whenever an C<SV*> is needed.
+
+There are also the two values C<sv_yes> and C<sv_no>, which contain Boolean
+TRUE and FALSE values, respectively. Like C<sv_undef>, their addresses can
+be used whenever an C<SV*> is needed.
+
+Do not be fooled into thinking that C<(SV *) 0> is the same as C<&sv_undef>.
+Take this code:
+
+    SV* sv = (SV*) 0;
+    if (I-am-to-return-a-real-value) {
+        sv = sv_2mortal(newSViv(42));
+    }
+    sv_setsv(ST(0), sv);
+
+This code tries to return a new SV (which contains the value 42) if it should
+return a real value, or undef otherwise. Instead it has returned a null
+pointer which, somewhere down the line, will cause a segmentation violation,
+or just weird results. Change the zero to C<&sv_undef> in the first line and
+all will be well.
+
+To free an SV that you've created, call C<SvREFCNT_dec(SV*)>. Normally this
+call is not necessary. See the section on B<MORTALITY>.
+
+=head2 Private and Public Values
+
+Recall that the usual method of determining the type of scalar you have is
+to use C<Sv[INP]OK> macros. Since a scalar can be both a number and a string,
+usually these macros will always return TRUE and calling the C<Sv[INP]V>
+macros will do the appropriate conversion of string to integer/double or
+integer/double to string.
+
+If you I<really> need to know if you have an integer, double, or string
+pointer in an SV, you can use the following three macros instead:
+
+    SvIOKp(SV*)
+    SvNOKp(SV*)
+    SvPOKp(SV*)
+
+These will tell you if you truly have an integer, double, or string pointer
+stored in your SV.
+
+In general, though, it's best to just use the C<Sv[INP]V> macros.
+
+=head2 Working with AV's
+
+There are two ways to create and load an AV. The first method just creates
+an empty AV:
+
+    AV* newAV();
+
+The second method both creates the AV and initially populates it with SV's:
+
+    AV* av_make(I32 num, SV **ptr);
+
+The second argument points to an array containing C<num> C<SV*>'s.
+
+Once the AV has been created, the following operations are possible on AV's:
+
+    void av_push(AV*, SV*);
+    SV* av_pop(AV*);
+    SV* av_shift(AV*);
+    void av_unshift(AV*, I32 num);
+
+These should be familiar operations, with the exception of C<av_unshift>.
+This routine adds C<num> elements at the front of the array with the C<undef>
+value.
+You must then use C<av_store> (described below) to assign values
+to these new elements.
+
+Here are some other functions:
+
+    I32 av_len(AV*); /* Returns highest index in array (length - 1) */
+
+    SV** av_fetch(AV*, I32 key, I32 lval);
+        /* Fetches value at key offset, but it seems to
+           set the value to lval if lval is non-zero */
+    SV** av_store(AV*, I32 key, SV* val);
+        /* Stores val at offset key */
+
+    void av_clear(AV*);
+        /* Clear out all elements, but leave the array */
+    void av_undef(AV*);
+        /* Undefines the array, removing all elements */
+
+If you know the name of an array variable, you can get a pointer to its AV
+by using the following:
+
+    AV* perl_get_av("varname", FALSE);
+
+This returns NULL if the variable does not exist.
+
+=head2 Working with HV's
+
+To create an HV, you use the following routine:
+
+    HV* newHV();
+
+Once the HV has been created, the following operations are possible on HV's:
+
+    SV** hv_store(HV*, char* key, U32 klen, SV* val, U32 hash);
+    SV** hv_fetch(HV*, char* key, U32 klen, I32 lval);
+
+The C<klen> parameter is the length of the key being passed in. The C<val>
+argument contains the SV pointer to the scalar being stored, and C<hash> is
+the pre-computed hash value (zero if you want C<hv_store> to calculate it
+for you). The C<lval> parameter indicates whether this fetch is actually a
+part of a store operation.
+
+Remember that C<hv_store> and C<hv_fetch> return C<SV**>'s and not just
+C<SV*>. In order to access the scalar value, you must first dereference
+the return value. However, you should check to make sure that the return
+value is not NULL before dereferencing it.
+
+These two functions check whether a hash table entry exists and delete it,
+respectively.
+
+    bool hv_exists(HV*, char* key, U32 klen);
+    SV* hv_delete(HV*, char* key, U32 klen);
+
+And more miscellaneous functions:
+
+    void hv_clear(HV*);
+        /* Clears all entries in hash table */
+    void hv_undef(HV*);
+        /* Undefines the hash table */
+
+    I32 hv_iterinit(HV*);
+        /* Prepares starting point to traverse hash table */
+    HE* hv_iternext(HV*);
+        /* Get the next entry, and return a pointer to a
+           structure that has both the key and value */
+    char* hv_iterkey(HE* entry, I32* retlen);
+        /* Get the key from an HE structure and also return
+           the length of the key string */
+    SV* hv_iterval(HV*, HE* entry);
+        /* Return a SV pointer to the value of the HE
+           structure */
+
+If you know the name of a hash variable, you can get a pointer to its HV
+by using the following:
+
+    HV* perl_get_hv("varname", FALSE);
+
+This returns NULL if the variable does not exist.
+
+The hash algorithm, for those who are interested, is:
+
+    i = klen;
+    hash = 0;
+    s = key;
+    while (i--)
+        hash = hash * 33 + *s++;
+
+=head2 References
+
+References are a special type of scalar that point to other scalar types
+(including references). To treat an AV or HV as a scalar, it is simply
+a matter of casting an AV or HV to an SV.
+
+To create a reference, use the following command:
+
+    SV* newRV((SV*) pointer);
+
+Once you have a reference, you can use the following macro with a cast to
+the appropriate typedef (SV, AV, HV):
+
+    SvRV(SV*)
+
+then call the appropriate routines, casting the returned C<SV*> to either an
+C<AV*> or C<HV*>.
+
+To determine, after dereferencing a reference, if you still have a reference,
+you can use the following macro:
+
+    SvROK(SV*)
+
+=head1 XSUB'S and the Argument Stack
+
+The XSUB mechanism is a simple way for Perl programs to access C subroutines.
+An XSUB routine will have a stack that contains the arguments from the Perl
+program, and a way to map from the Perl data structures to a C equivalent.
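Looking back at the hash algorithm quoted under "Working with HV's", the five-line fragment can be wrapped into a standalone C function for experimentation. This is a sketch under stated assumptions: the function name C<times33_hash> is hypothetical, C<uint32_t> stands in for Perl's C<U32>, and the bytes are read as unsigned, which the original fragment does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Times-33 hash over exactly klen bytes of key, as in the quoted
   fragment. The name is hypothetical; uint32_t stands in for U32
   and arithmetic wraps modulo 2^32. */
uint32_t times33_hash(const char *key, size_t klen) {
    const unsigned char *s = (const unsigned char *)key; /* assumption: unsigned bytes */
    uint32_t hash = 0;

    assert(key != NULL || klen == 0);
    while (klen--)
        hash = (uint32_t)(hash * 33u + *s++);
    return hash;
}
```

Because the loop is driven by the length rather than by a terminator, a key containing NUL bytes hashes over all of its bytes, which is why C<hv_store> and C<hv_fetch> take C<klen> explicitly.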
+
+The stack arguments are accessible through the C<ST(n)> macro, which returns
+the C<n>'th stack argument. Argument 0 is the first argument passed in the
+Perl subroutine call. These arguments are C<SV*>, and can be used anywhere
+an C<SV*> is used.
+
+Most of the time, output from the C routine can be handled through use of
+the RETVAL and OUTPUT directives. However, there are some cases where the
+argument stack is not already long enough to handle all the return values.
+An example is the POSIX tzname() call, which takes no arguments, but returns
+two, the local timezone's standard and summer time abbreviations.
+
+To handle this situation, the PPCODE directive is used and the stack is
+extended using the macro:
+
+    EXTEND(sp, num);
+
+where C<sp> is the stack pointer, and C<num> is the number of elements the
+stack should be extended by.
+
+Now that there is room on the stack, values can be pushed on it using the
+macros to push IV's, doubles, strings, and SV pointers respectively:
+
+    PUSHi(IV)
+    PUSHn(double)
+    PUSHp(char*, I32)
+    PUSHs(SV*)
+
+And now, in the Perl program calling C<tzname>, the two values will be
+assigned as in:
+
+    ($standard_abbrev, $summer_abbrev) = POSIX::tzname;
+
+An alternate (and possibly simpler) method to pushing values on the stack is
+to use the macros:
+
+    XPUSHi(IV)
+    XPUSHn(double)
+    XPUSHp(char*, I32)
+    XPUSHs(SV*)
+
+These macros automatically adjust the stack for you, if needed.
+
+=head1 Mortality
+
+In Perl, values are normally "immortal" -- that is, they are not freed unless
+explicitly done so (via the Perl C<undef> call or other routines in Perl
+itself).
+
+In the above example with C<tzname>, we needed to create two new SV's to push
+onto the argument stack, that being the two strings. However, we don't want
+these new SV's to stick around forever because they will eventually be
+copied into the SV's that hold the two scalar variables.
+
+An SV (or AV or HV) that is "mortal" acts in all ways as a normal "immortal"
+SV, AV, or HV, but is only valid in the "current context". When the Perl
+interpreter leaves the current context, the mortal SV, AV, or HV is
+automatically freed. Generally the "current context" means a single
+Perl statement.
+
+To create a mortal variable, use the functions:
+
+    SV* sv_newmortal()
+    SV* sv_2mortal(SV*)
+    SV* sv_mortalcopy(SV*)
+
+The first call creates a mortal SV, the second converts an existing SV to
+a mortal SV, the third creates a mortal copy of an existing SV.
+
+The mortal routines are not just for SV's -- AV's and HV's can be made mortal
+by passing their address (and casting them to C<SV*>) to the C<sv_2mortal> or
+C<sv_mortalcopy> routines.
+
+=head1 Creating New Variables
+
+To create a new Perl variable, which can be accessed from your Perl script,
+use the following routines, depending on the variable type.
+
+    SV* perl_get_sv("varname", TRUE);
+    AV* perl_get_av("varname", TRUE);
+    HV* perl_get_hv("varname", TRUE);
+
+Notice the use of TRUE as the second parameter. The new variable can now
+be set, using the routines appropriate to the data type.
+
+=head1 Stashes and Objects
+
+A stash is a hash table (associative array) that contains all of the
+different objects that are contained within a package. Each key of the
+hash table is a symbol name (shared by all the different types of
+objects that have the same name), and each value in the hash table is
+called a GV (for Glob Value). The GV in turn contains references to
+the various objects of that name, including (but not limited to) the
+following:
+
+    Scalar Value
+    Array Value
+    Hash Value
+    File Handle
+    Directory Handle
+    Format
+    Subroutine
+
+Perl stores various stashes in a GV structure (for global variable) but
+represents them with an HV structure.
+
+To get the HV pointer for a particular package, use the function:
+
+    HV* gv_stashpv(char* name, I32 create)
+    HV* gv_stashsv(SV*, I32 create)
+
+The first function takes a literal string, the second uses the string stored
+in the SV.
+
+The name that C<gv_stash*v> wants is the name of the package whose symbol table
+you want. The default package is called C<main>. If you have multiply nested
+packages, it is legal to pass their names to C<gv_stash*v>, separated by
+C<::> as in the Perl language itself.
+
+Alternately, if you have an SV that is a blessed reference, you can find
+out the stash pointer by using:
+
+    HV* SvSTASH(SvRV(SV*));
+
+then use the following to get the package name itself:
+
+    char* HvNAME(HV* stash);
+
+If you need to return a blessed value to your Perl script, you can use the
+following function:
+
+    SV* sv_bless(SV*, HV* stash)
+
+where the first argument, an C<SV*>, must be a reference, and the second
+argument is a stash. The returned C<SV*> can now be used in the same way
+as any other SV.
+
+=head1 Magic
+
+[This section under construction]
+
+=head1 Double-Typed SV's
+
+Scalar variables normally contain only one type of value, an integer,
+double, pointer, or reference. Perl will automatically convert the
+actual scalar data from the stored type into the requested type.
+
+Some scalar variables contain more than one type of scalar data. For
+example, the variable C<$!> contains either the numeric value of C<errno>
+or its string equivalent from C<sys_errlist[]>.
+
+To force multiple data values into an SV, you must do two things: use the
+C<sv_set*v> routines to add the additional scalar type, then set a flag
+so that Perl will believe it contains more than one type of data. The
+four macros to set the flags are:
+
+    SvIOK_on
+    SvNOK_on
+    SvPOK_on
+    SvROK_on
+
+The particular macro you must use depends on which C<sv_set*v> routine
+you called first.
+This is because every C<sv_set*v> routine turns on
+only the bit for the particular type of data being set, and turns off
+all the rest.
+
+For example, to create a new Perl variable called "dberror" that contains
+both the numeric and descriptive string error values, you could use the
+following code:
+
+    extern int dberror;
+    extern char *dberror_list[];
+
+    SV* sv = perl_get_sv("dberror", TRUE);
+    sv_setiv(sv, (IV) dberror);
+    sv_setpv(sv, dberror_list[dberror]);
+    SvIOK_on(sv);
+
+If the order of C<sv_setiv> and C<sv_setpv> had been reversed, then the
+macro C<SvPOK_on> would need to be called instead of C<SvIOK_on>.
+
+=head1 Calling Perl Routines from within C Programs
+
+There are four routines that can be used to call a Perl subroutine from
+within a C program. These four are:
+
+    I32 perl_call_sv(SV*, I32);
+    I32 perl_call_pv(char*, I32);
+    I32 perl_call_method(char*, I32);
+    I32 perl_call_argv(char*, I32, register char**);
+
+The routine most often used should be C<perl_call_sv>. The C<SV*> argument
+contains either the name of the Perl subroutine to be called, or a reference
+to the subroutine. The second argument tells the appropriate routine what,
+if any, variables are being returned by the Perl subroutine.
+
+All four routines return the number of arguments that the subroutine returned
+on the Perl stack.
+
+When using these four routines, the programmer must manipulate the Perl stack.
+These include the following macros and functions:
+
+    dSP
+    PUSHMARK()
+    PUTBACK
+    SPAGAIN
+    ENTER
+    SAVETMPS
+    FREETMPS
+    LEAVE
+    XPUSH*()
+
+For more information, consult L<perlcall>.
+
+=head1 Memory Allocation
+
+[This section under construction]
+
+=head1 AUTHOR
+
+Jeff Okamoto <okamoto@corp.hp.com>
+
+With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
+Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, and Neil
+Bowers.
+
+=head1 DATE
+
+Version 12: 1994/10/16
+
+
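As a footnote to the "Double-Typed SV's" section of the diff above: the rule that each C<sv_set*v> routine turns on only its own flag and clears the rest, so that a second flag has to be switched back on by hand, can be modelled in a few lines of plain C. Everything in this sketch is a hypothetical toy (the C<toy_sv> struct, the C<TOY_*> flag values, and all function names); it mirrors only the behaviour described in the text and is not how Perl itself lays out an SV.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; the real values in Perl are not implied. */
enum { TOY_IOK = 1u, TOY_POK = 2u };

/* Toy scalar holding an integer slot, a string slot, and flags. */
typedef struct {
    unsigned    flags;
    long        iv;
    const char *pv;
} toy_sv;

/* Each setter stores its value and leaves ONLY its own flag set,
   as the text describes for the sv_set*v routines. */
void toy_setiv(toy_sv *sv, long iv)        { sv->iv = iv; sv->flags = TOY_IOK; }
void toy_setpv(toy_sv *sv, const char *pv) { sv->pv = pv; sv->flags = TOY_POK; }

/* Models SvIOK_on: re-assert the integer flag without touching others. */
void toy_iok_on(toy_sv *sv) { sv->flags |= TOY_IOK; }

int toy_iok(const toy_sv *sv) { return (sv->flags & TOY_IOK) != 0; }
int toy_pok(const toy_sv *sv) { return (sv->flags & TOY_POK) != 0; }
```

Calling toy_setiv and then toy_setpv leaves only the string flag set even though the integer slot still holds its value; toy_iok_on then makes both views valid, which corresponds to the order of calls in the dberror example.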