PerlFAQ sync.

p4raw-id: //depot/perl@18185
author: Rafael Garcia-Suarez <rgarciasuarez@gmail.com> 2002-11-26 21:06:48 +0000
committer: Rafael Garcia-Suarez <rgarciasuarez@gmail.com> 2002-11-26 21:06:48 +0000
commit: 49d635f9372392ae44fe4c5b62b06e41912ae0c9 (patch)
tree: 29a0e48c51466f10da69fffa12babc88587672a9 /pod/perlfaq4.pod
parent: ad0f383a28b730182ea06492027f82167ce7032b (diff)
download: perl-49d635f9372392ae44fe4c5b62b06e41912ae0c9.tar.gz
1 files changed, 194 insertions, 176 deletions
diff --git a/pod/perlfaq4.pod b/pod/perlfaq4.pod
index f2512059cc..7c616aca3d 100644
--- a/pod/perlfaq4.pod
+++ b/pod/perlfaq4.pod
@@ -1,6 +1,6 @@
 =head1 NAME
 
-perlfaq4 - Data Manipulation ($Revision: 1.25 $, $Date: 2002/05/30 07:04:25 $)
+perlfaq4 - Data Manipulation ($Revision: 1.37 $, $Date: 2002/11/13 06:04:00 $)
 
 =head1 DESCRIPTION
 
@@ -11,56 +11,36 @@ numbers, dates, strings, arrays, hashes, and miscellaneous data issues.
 
 =head2 Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?
 
-The infinite set that a mathematician thinks of as the real numbers can
-only be approximated on a computer, since the computer only has a finite
-number of bits to store an infinite number of, um, numbers.
-
-Internally, your computer represents floating-point numbers in binary.
-Floating-point numbers read in from a file or appearing as literals
-in your program are converted from their decimal floating-point
-representation (eg, 19.95) to an internal binary representation.
-
-However, 19.95 can't be precisely represented as a binary
-floating-point number, just like 1/3 can't be exactly represented as a
-decimal floating-point number.  The computer's binary representation
-of 19.95, therefore, isn't exactly 19.95.
-
-When a floating-point number gets printed, the binary floating-point
-representation is converted back to decimal.  These decimal numbers
-are displayed in either the format you specify with printf(), or the
-current output format for numbers.  (See L<perlvar/"$#"> if you use
-print.  C<$#> has a different default value in Perl5 than it did in
-Perl4.  Changing C<$#> yourself is deprecated.)
-
-This affects B<all> computer languages that represent decimal
-floating-point numbers in binary, not just Perl.  Perl provides
-arbitrary-precision decimal numbers with the Math::BigFloat module
-(part of the standard Perl distribution), but mathematical operations
-are consequently slower.
-
-If precision is important, such as when dealing with money, it's good
-to work with integers and then divide at the last possible moment.
-For example, work in pennies (1995) instead of dollars and cents
-(19.95) and divide by 100 at the end.
-
-To get rid of the superfluous digits, just use a format (eg,
-C<printf("%.2f", 19.95)>) to get the required precision.
-See L<perlop/"Floating-point Arithmetic">.  
+Internally, your computer represents floating-point numbers
+in binary. Digital (as in powers of two) computers cannot
+store all numbers exactly.  Some real numbers lose precision
+in the process.  This is a problem with how computers store
+numbers and affects all computer languages, not just Perl.
 
+L<perlnumber> shows the gory details of number
+representations and conversions.
+
+To limit the number of decimal places in your numbers, you
+can use the printf or sprintf function.  See
+L<perlop/"Floating-point Arithmetic"> for more details.
+
+	printf "%.2f", 10/3;
+	
+	my $number = sprintf "%.2f", 10/3;
+	
 =head2 Why isn't my octal data interpreted correctly?
 
-Perl only understands octal and hex numbers as such when they occur
-as literals in your program.  Octal literals in perl must start with 
-a leading "0" and hexadecimal literals must start with a leading "0x".
-If they are read in from somewhere and assigned, no automatic 
-conversion takes place.  You must explicitly use oct() or hex() if you 
-want the values converted to decimal.  oct() interprets
-both hex ("0x350") numbers and octal ones ("0350" or even without the
-leading "0", like "377"), while hex() only converts hexadecimal ones,
-with or without a leading "0x", like "0x255", "3A", "ff", or "deadbeef".
+Perl only understands octal and hex numbers as such when they occur as
+literals in your program.  Octal literals in perl must start with a
+leading "0" and hexadecimal literals must start with a leading "0x".
+If they are read in from somewhere and assigned, no automatic
+conversion takes place.  You must explicitly use oct() or hex() if you
+want the values converted to decimal.  oct() interprets hex ("0x350"),
+octal ("0350" or even without the leading "0", like "377") and binary
+("0b1010") numbers, while hex() only converts hexadecimal ones, with
+or without a leading "0x", like "0x255", "3A", "ff", or "deadbeef".
 The inverse mapping from decimal to octal can be done with either the
-"%o" or "%O" sprintf() formats.  To get from decimal to hex try either 
-the "%x" or the "%X" formats to sprintf().
+"%o" or "%O" sprintf() formats.
 
 This problem shows up most often when people try using chmod(), mkdir(),
 umask(), or sysopen(), which by widespread tradition typically take 
@@ -264,7 +244,7 @@ C<00110011>).  The operators work with the binary form of a number
 (the number C<3> is treated as the bit pattern C<00000011>).
 
 So, saying C<11 & 3> performs the "and" operation on numbers (yielding
-C<1>).  Saying C<"11" & "3"> performs the "and" operation on strings
+C<3>).  Saying C<"11" & "3"> performs the "and" operation on strings
 (yielding C<"1">).
 
 Most problems with C<&> and C<|> arise because the programmer thinks
@@ -335,14 +315,17 @@ Get the http://www.cpan.org/modules/by-module/Roman module.
 
 If you're using a version of Perl before 5.004, you must call C<srand>
 once at the start of your program to seed the random number generator.
+
+	 BEGIN { srand() if $] < 5.004 }
+
 5.004 and later automatically call C<srand> at the beginning.  Don't
-call C<srand> more than once--you make your numbers less random, rather
+call C<srand> more than once---you make your numbers less random, rather
 than more.
 
 Computers are good at being predictable and bad at being random
 (despite appearances caused by bugs in your programs :-).  see the
-F<random> artitcle in the "Far More Than You Ever Wanted To Know"
-collection in http://www.cpan.org/olddoc/FMTEYEWTK.tgz , courtesy of
+F<random> article in the "Far More Than You Ever Wanted To Know"
+collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz , courtesy of
 Tom Phoenix, talks more about this.  John von Neumann said, ``Anyone
 who attempts to generate random numbers by deterministic means is, of
 course, living in a state of sin.''
@@ -388,11 +371,20 @@ Use the following simple functions:
 	return 1+int((((localtime(shift || time))[5] + 1899))/1000);
     } 
 
-On some systems, you'll find that the POSIX module's strftime() function
-has been extended in a non-standard way to use a C<%C> format, which they
-sometimes claim is the "century".  It isn't, because on most such systems,
-this is only the first two digits of the four-digit year, and thus cannot
-be used to reliably determine the current century or millennium.
+You can also use the POSIX strftime() function, which may be a bit
+slower but is easier to read and maintain.
+
+	use POSIX qw/strftime/;
+	
+	my $week_of_the_year = strftime "%W", localtime;
+	my $day_of_the_year  = strftime "%j", localtime;
+
+On some systems, the POSIX module's strftime() function has
+been extended in a non-standard way to use a C<%C> format,
+which they sometimes claim is the "century".  It isn't,
+because on most such systems, this is only the first two
+digits of the four-digit year, and thus cannot be used to
+reliably determine the current century or millennium.
 
 =head2 How can I compare two dates and find the difference?
 
@@ -438,58 +430,60 @@ modules.  (Thanks to David Cassell for most of this text.)
 
 =head2 How do I find yesterday's date?
 
-The C<time()> function returns the current time in seconds since the
-epoch.  Take twenty-four hours off that:
+If you only need to find the date (and not the same time), you
+can use the Date::Calc module.
 
-    $yesterday = time() - ( 24 * 60 * 60 );
+	use Date::Calc qw(Today Add_Delta_Days);
+	
+	my @date = Add_Delta_Days( Today(), -1 );
+	
+	print "@date\n";
 
-Then you can pass this to C<localtime()> and get the individual year,
-month, day, hour, minute, seconds values.
-
-Note very carefully that the code above assumes that your days are
-twenty-four hours each.  For most people, there are two days a year
-when they aren't: the switch to and from summer time throws this off.
-A solution to this issue is offered by Russ Allbery.
+Most people try to use the time rather than the calendar to
+figure out dates, but that assumes that your days are
+twenty-four hours each.  For most people, there are two days
+a year when they aren't: the switch to and from summer time
+throws this off. Russ Allbery offers this solution.
 
     sub yesterday {
-	my $now  = defined $_[0] ? $_[0] : time;
-	my $then = $now - 60 * 60 * 24;
-	my $ndst = (localtime $now)[8] > 0;
-	my $tdst = (localtime $then)[8] > 0;
-	$then - ($tdst - $ndst) * 60 * 60;
-    }
-    # Should give you "this time yesterday" in seconds since epoch relative to
-    # the first argument or the current time if no argument is given and
-    # suitable for passing to localtime or whatever else you need to do with
-    # it.  $ndst is whether we're currently in daylight savings time; $tdst is
-    # whether the point 24 hours ago was in daylight savings time.  If $tdst
-    # and $ndst are the same, a boundary wasn't crossed, and the correction
-    # will subtract 0.  If $tdst is 1 and $ndst is 0, subtract an hour more
-    # from yesterday's time since we gained an extra hour while going off
-    # daylight savings time.  If $tdst is 0 and $ndst is 1, subtract a
-    # negative hour (add an hour) to yesterday's time since we lost an hour.
-    #
-    # All of this is because during those days when one switches off or onto
-    # DST, a "day" isn't 24 hours long; it's either 23 or 25.
-    #
-    # The explicit settings of $ndst and $tdst are necessary because localtime
-    # only says it returns the system tm struct, and the system tm struct at
-    # least on Solaris doesn't guarantee any particular positive value (like,
-    # say, 1) for isdst, just a positive value.  And that value can
-    # potentially be negative, if DST information isn't available (this sub
-    # just treats those cases like no DST).
-    #
-    # Note that between 2am and 3am on the day after the time zone switches
-    # off daylight savings time, the exact hour of "yesterday" corresponding
-    # to the current hour is not clearly defined.  Note also that if used
-    # between 2am and 3am the day after the change to daylight savings time,
-    # the result will be between 3am and 4am of the previous day; it's
-    # arguable whether this is correct.
-    #
-    # This sub does not attempt to deal with leap seconds (most things don't).
-    #
-    # Copyright relinquished 1999 by Russ Allbery <rra@stanford.edu>
-    # This code is in the public domain
+		my $now  = defined $_[0] ? $_[0] : time;
+		my $then = $now - 60 * 60 * 24;
+		my $ndst = (localtime $now)[8] > 0;
+		my $tdst = (localtime $then)[8] > 0;
+		$then - ($tdst - $ndst) * 60 * 60;
+		}
+		
+This should give you "this time yesterday" in seconds since epoch relative to
+the first argument or the current time if no argument is given and
+suitable for passing to localtime or whatever else you need to do with
+it.  $ndst is whether we're currently in daylight savings time; $tdst is
+whether the point 24 hours ago was in daylight savings time.  If $tdst
+and $ndst are the same, a boundary wasn't crossed, and the correction
+will subtract 0.  If $tdst is 1 and $ndst is 0, subtract an hour more
+from yesterday's time since we gained an extra hour while going off
+daylight savings time.  If $tdst is 0 and $ndst is 1, subtract a
+negative hour (add an hour) to yesterday's time since we lost an hour.
+
+All of this is because during those days when one switches off or onto
+DST, a "day" isn't 24 hours long; it's either 23 or 25.
+
+The explicit settings of $ndst and $tdst are necessary because localtime
+only says it returns the system tm struct, and the system tm struct at
+least on Solaris doesn't guarantee any particular positive value (like,
+say, 1) for isdst, just a positive value.  And that value can
+potentially be negative, if DST information isn't available (this sub
+just treats those cases like no DST).
+
+Note that between 2am and 3am on the day after the time zone switches
+off daylight savings time, the exact hour of "yesterday" corresponding
+to the current hour is not clearly defined.  Note also that if used
+between 2am and 3am the day after the change to daylight savings time,
+the result will be between 3am and 4am of the previous day; it's
+arguable whether this is correct.
+
+This sub does not attempt to deal with leap seconds (most things don't).
+
+
 
 =head2 Does Perl have a Year 2000 problem?  Is Perl Y2K compliant?
 
@@ -557,14 +551,6 @@ a subroutine call (in list context) into a string:
 
     print "My sub returned @{[mysub(1,2,3)]} that time.\n";
 
-If you prefer scalar context, similar chicanery is also useful for
-arbitrary expressions:
-
-    print "That yields ${\($n + 5)} widgets\n";
-
-Version 5.004 of Perl had a bug that gave list context to the
-expression in C<${...}>, but this is fixed in version 5.005.
-
 See also ``How can I expand variables in text strings?'' in this
 section of the FAQ.
 
@@ -645,23 +631,25 @@ done by making a shell alias, like so:
 See the documentation for Text::Autoformat to appreciate its many
 capabilities.
 
-=head2 How can I access/change the first N letters of a string?
-
-There are many ways.  If you just want to grab a copy, use
-substr():
+=head2 How can I access or change N characters of a string?
 
-    $first_byte = substr($a, 0, 1);
+You can access the first characters of a string with substr().
+To get the first character, for example, start at position 0
+and grab the string of length 1.  
 
-If you want to modify part of a string, the simplest way is often to
-use substr() as an lvalue:
 
-    substr($a, 0, 3) = "Tom";
+    $string = "Just another Perl Hacker";
+    $first_char = substr( $string, 0, 1 );  #  'J'
 
-Although those with a pattern matching kind of thought process will
-likely prefer
+To change part of a string, you can use the optional fourth
+argument which is the replacement string.
 
-    $a =~ s/^.../Tom/;
+    substr( $string, 13, 4, "Perl 5.8.0" );
+	
+You can also use substr() as an lvalue.
 
+    substr( $string, 13, 4 ) =  "Perl 5.8.0";
+	
 =head2 How do I change the Nth occurrence of something?
 
 You have to keep track of N yourself.  For example, let's say you want
@@ -753,20 +741,21 @@ case", but that's not quite accurate.  Consider the proper
 capitalization of the movie I<Dr. Strangelove or: How I Learned to
 Stop Worrying and Love the Bomb>, for example.
 
-=head2 How can I split a [character] delimited string except when inside
-[character]? (Comma-separated files)
+=head2 How can I split a [character] delimited string except when inside [character]?
 
-Take the example case of trying to split a string that is comma-separated
-into its different fields.  (We'll pretend you said comma-separated, not
-comma-delimited, which is different and almost never what you mean.) You
-can't use C<split(/,/)> because you shouldn't split if the comma is inside
-quotes.  For example, take a data line like this:
+Several modules can handle this sort of parsing---Text::Balanced,
+Text::CSV, Text::CSV_XS, and Text::ParseWords, among others.
+
+Take the example case of trying to split a string that is
+comma-separated into its different fields. You can't use C<split(/,/)>
+because you shouldn't split if the comma is inside quotes.  For
+example, take a data line like this:
 
     SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"
 
 Due to the restriction of the quotes, this is a fairly complex
-problem.  Thankfully, we have Jeffrey Friedl, author of a highly
-recommended book on regular expressions, to handle these for us.  He
+problem.  Thankfully, we have Jeffrey Friedl, author of 
+I<Mastering Regular Expressions>, to handle these for us.  He
 suggests (assuming your string is contained in $text):
 
      @new = ();
@@ -779,8 +768,7 @@ suggests (assuming your string is contained in $text):
 
 If you want to represent quotation marks inside a
 quotation-mark-delimited field, escape them with backslashes (eg,
-C<"like \"this\"">.  Unescaping them is a task addressed earlier in
-this section.
+C<"like \"this\"">.
 
 Alternatively, the Text::ParseWords module (part of the standard Perl
 distribution) lets you say:
@@ -1271,16 +1259,37 @@ an exercise to the reader.
 
 =head2 How do I find the first array element for which a condition is true?
 
-You can use this if you care about the index:
-
-    for ($i= 0; $i < @array; $i++) {
-        if ($array[$i] eq "Waldo") {
-	    $found_index = $i;
-            last;
+To find the first array element which satisfies a condition, you can
+use the first() function in the List::Util module, which comes with
+Perl 5.8.  This example finds the first element that contains "Perl".
+
+	use List::Util qw(first);
+	
+	my $element = first { /Perl/ } @array;
+	
+If you cannot use List::Util, you can make your own loop to do the
+same thing.  Once you find the element, you stop the loop with last.
+
+	my $found;
+	foreach my $element ( @array )
+		{
+		if( $element =~ /Perl/ ) { $found = $element; last }
+		}
+
+If you want the array index, you can iterate through the indices
+and check the array element at each index until you find one
+that satisfies the condition.
+
+	my( $found, $index ) = ( undef, -1 );
+    for( my $i = 0; $i < @array; $i++ )
+    	{
+        if( $array[$i] =~ /Perl/ ) 
+        	{ 
+        	$found = $array[$i];
+        	$index = $i; 
+        	last;
+        	}
         }
-    }
-
-Now C<$found_index> has what you want.
 
 =head2 How do I handle linked lists?
 
@@ -1399,6 +1408,11 @@ Here's another; let's compute spherical volumes:
 	$_ **= 3;
 	$_ *= (4/3) * 3.14159;  # this will be constant folded
     }
+    
+This can also be done with map(), which is made to transform
+one list into another:
+
+	@volumes = map {$_ ** 3 * (4/3) * 3.14159} @radii;
 
 If you want to do the same thing to modify the values of the
 hash, you can use the C<values> function.  As of Perl 5.6
@@ -1431,34 +1445,40 @@ call to rand), you're almost certainly doing something wrong.
 
 =head2 How do I permute N elements of a list?
 
-Here's a little program that generates all permutations
-of all the words on each line of input.  The algorithm embodied
-in the permute() function should work on any list:
-
-    #!/usr/bin/perl -n
-    # tsc-permute: permute each word of input
-    permute([split], []);
-    sub permute {
-        my @items = @{ $_[0] };
-        my @perms = @{ $_[1] };
-        unless (@items) {
-            print "@perms\n";
-	} else {
-            my(@newitems,@newperms,$i);
-            foreach $i (0 .. $#items) {
-                @newitems = @items;
-                @newperms = @perms;
-                unshift(@newperms, splice(@newitems, $i, 1));
-                permute([@newitems], [@newperms]);
-	    }
+Use the List::Permutor module on CPAN.  If the list is
+actually an array, try the Algorithm::Permute module (also
+on CPAN).  It's written in XS code and is very efficient.
+
+	use Algorithm::Permute;
+	my @array = 'a'..'d';
+	my $p_iterator = Algorithm::Permute->new ( \@array );
+	while (my @perm = $p_iterator->next) {
+	   print "next permutation: (@perm)\n";
+	}
+
+Here's a little program that generates all permutations of
+all the words on each line of input. The algorithm embodied
+in the permute() function is discussed in Volume 4 (still
+unpublished) of Knuth's I<The Art of Computer Programming>
+and will work on any list:
+
+	#!/usr/bin/perl -n
+	# Fischer-Krause ordered permutation generator
+
+	sub permute (&@) {
+		my $code = shift;
+		my @idx = 0..$#_;
+		while ( $code->(@_[@idx]) ) {
+			my $p = $#idx;
+			--$p while $idx[$p-1] > $idx[$p];
+			my $q = $p or return;
+			push @idx, reverse splice @idx, $p;
+			++$q while $idx[$p-1] > $idx[$q];
+			@idx[$p-1,$q]=@idx[$q,$p-1];
+		}
 	}
-    }
 
-Unfortunately, this algorithm is very inefficient. The Algorithm::Permute
-module from CPAN runs at least an order of magnitude faster. If you don't
-have a C compiler (or a binary distribution of Algorithm::Permute), then
-you can use List::Permutor which is written in pure Perl, and is still
-several times faster than the algorithm above.
+	permute {print"@_\n"} split;
 
 =head2 How do I sort an array by (anything)?
 
@@ -1502,7 +1522,7 @@ This can be conveniently combined with precalculation of keys as given
 above.
 
 See the F<sort> artitcle article in the "Far More Than You Ever Wanted
-To Know" collection in http://www.cpan.org/olddoc/FMTEYEWTK.tgz for
+To Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz for
 more about this approach.
 
 See also the question below on sorting hashes.
@@ -1842,11 +1862,11 @@ it on top of either DB_File or GDBM_File.
 Use the Tie::IxHash from CPAN.
 
     use Tie::IxHash;
-    tie(%myhash, Tie::IxHash);
-    for ($i=0; $i<20; $i++) {
+    tie my %myhash, 'Tie::IxHash';
+    for (my $i=0; $i<20; $i++) {
         $myhash{$i} = 2*$i;
     }
-    @keys = keys %myhash;
+    my @keys = keys %myhash;
     # @keys = (0,1,2,3,...)
 
 =head2 Why does passing a subroutine an undefined element in a hash create it?
@@ -1902,9 +1922,7 @@ this works fine (assuming the files are found):
 
 On less elegant (read: Byzantine) systems, however, you have
 to play tedious games with "text" versus "binary" files.  See
-L<perlfunc/"binmode"> or L<perlopentut>.  Most of these ancient-thinking
-systems are curses out of Microsoft, who seem to be committed to putting
-the backward into backward compatibility.
+L<perlfunc/"binmode"> or L<perlopentut>.
 
 If you're concerned about 8-bit ASCII data, then see L<perllocale>.
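
For the "Why isn't my octal data interpreted correctly?" answer touched by this patch, a minimal sketch of the conversions it describes may help. The variable names and sample strings below are invented for illustration and are not part of the commit; passing a "0b" string to oct() assumes Perl 5.6 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Strings as they might arrive from a file or the command line
# (hypothetical sample data).  Perl does not convert them automatically:
# in numeric context "0350" is simply the decimal number 350.
my $octal_text  = "0350";
my $hex_text    = "0x350";
my $binary_text = "0b1010";

my $unconverted = $octal_text + 0;      # plain decimal reading of the string

# oct() inspects the prefix and handles octal, hex and binary strings.
my $from_octal  = oct($octal_text);
my $from_hex    = oct($hex_text);
my $from_binary = oct($binary_text);

# hex() only handles hexadecimal, with or without the leading "0x".
my $plain_hex   = hex("ff");

# Going the other way: decimal back to octal or hex text with sprintf().
my $as_octal = sprintf "%o", $from_octal;
my $as_hex   = sprintf "%x", $plain_hex;

print "unconverted: $unconverted\n";
print "oct():       $from_octal $from_hex $from_binary\n";
print "hex():       $plain_hex\n";
print "sprintf():   $as_octal $as_hex\n";
```

This is why functions such as chmod() need either an octal literal (0644) or an explicit oct("0644") when the mode comes from data.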