From fc36a67e8855d031b2a6921819d899eb149eee2d Mon Sep 17 00:00:00 2001
From: Perl 5 Porters
Date: Fri, 25 Apr 1997 00:00:00 +1200
Subject: [inseparable changes from match from perl-5.003_97h to perl-5.003_97i]

CORE PORTABILITY

Subject: Provide memset() if it's missing
From: Chip Salzenberg
Files: global.sym perl.h proto.h util.c

Subject: Don't tell GCC that warn(), croak(), and die() are printf-lik
From: Chip Salzenberg
Files: proto.h

DOCUMENTATION

Subject: FAQ udpate (24-apr-97)
Date: Thu, 24 Apr 1997 16:47:23 -0600 (MDT)
From: Nathan Torkington
Files: pod/perlfaq*.pod
private-msgid: 199704242247.QAA07010@prometheus.frii.com

OTHER CORE CHANGES

Subject: Misc. sv_vcatpvfn() fixes
From: Hugo van der Sanden
Files: gv.c mg.c op.c perl.c pp.c pp_ctl.c sv.c toke.c util.c

Subject: Enforce order of sprintf() elements
From: Chip Salzenberg
Files: sv.c

Subject: Guard against long numbers, <
From: Chip Salzenberg
Files: global.sym mg.c perl.c pod/perldiag.pod proto.h toke.c util.c

Subject: Guard against C to deeply nested label
From: Chip Salzenberg
Files: pod/perldiag.pod pp_ctl.c

Subject: Guard against overflow in dup2() emulation
From: Chip Salzenberg
Files: util.c

Subject: Win32: Guard against long function names
From: Chip Salzenberg
Files: win32/win32sck.c

Subject: Make mess() always work, by using a non-arena SV
From: Chip Salzenberg
Files: perl.c util.c

Subject: When copying a format line, take only its string value
From: Chip Salzenberg
Files: sv.c

Subject: Fix LEAKTEST numbers
From: Chip Salzenberg
Files: ext/DynaLoader/dl_vms.xs handy.h os2/os2.c util.c vms/vms.c win32/win32.c win32/win32sck.c
---
 pod/perlfaq6.pod | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

(limited to 'pod/perlfaq6.pod')

diff --git a/pod/perlfaq6.pod b/pod/perlfaq6.pod
index 1af7948339..d21a11157b 100644
--- a/pod/perlfaq6.pod
+++ b/pod/perlfaq6.pod
@@ -1,6 +1,6 @@
 =head1 NAME
 
-perlfaq6 - Regexps ($Revision: 1.16 $, $Date: 1997/03/25 18:16:56 $)
+perlfaq6 - Regexps ($Revision: 1.17 $, $Date: 1997/04/24 22:44:10 $)
 
 =head1 DESCRIPTION
 
@@ -138,7 +138,8 @@ on matching balanced text.
 $/ must be a string, not a regular expression. Awk has to be better
 for something. :-)
 
-Actually, you could do this if you don't mind reading the whole file into
+Actually, you could do this if you don't mind reading the whole file
+into
 memory:
 
     undef $/;
     @records = split /your_pattern/, ;
@@ -325,9 +326,9 @@ playing hot potato.
 Use the split function:
 
     while (<>) {
-        foreach $word ( split ) {
+        foreach $word ( split ) {
             # do something with $word here
-        }
+        }
     }
 
 Note that this isn't really a word in the English sense; it's just
@@ -360,7 +361,7 @@ in the previous question:
 If you wanted to do the same thing for lines, you wouldn't need a
 regular expression:
 
-    while (<>) {
+    while (<>) {
         $seen{$_}++;
     }
     while ( ($line, $count) = each %seen ) {
@@ -546,19 +547,20 @@ synonymous.
 The following set of approaches was offered by Jeffrey Friedl, whose
 article in issue #5 of The Perl Journal talks about this very matter.
 
-Let's suppose you have some weird Martian encoding where pairs of ASCII
-uppercase letters encode single Martian letters (i.e. the two bytes
-"CV" make a single Martian letter, as do the two bytes "SG", "VS",
-"XX", etc.). Other bytes represent single characters, just like ASCII.
+Let's suppose you have some weird Martian encoding where pairs of
+ASCII uppercase letters encode single Martian letters (i.e. the two
+bytes "CV" make a single Martian letter, as do the two bytes "SG",
+"VS", "XX", etc.). Other bytes represent single characters, just like
+ASCII.
 
-So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the nine
-characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.
+So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the
+nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.
 
 Now, say you want to search for the single character C. Perl
-doesn't know about Martian, so it'll find the two bytes "GX" in the
-"I am CVSGXX!" string, even though that character isn't there: it just
-looks like it is because "SG" is next to "XX", but there's no real "GX".
-This is a big problem.
+doesn't know about Martian, so it'll find the two bytes "GX" in the "I
+am CVSGXX!" string, even though that character isn't there: it just
+looks like it is because "SG" is next to "XX", but there's no real
+"GX". This is a big problem.
 
 Here are a few ways, all painful, to deal with it:
 
-- 
cgit v1.2.1
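
The false match described in the Martian-encoding hunk above can be reproduced with a short sketch. This is only an illustration under stated assumptions, not one of the approaches the FAQ goes on to list: the variable names ($martian, @chars) and the tokenizing pattern are hypothetical, chosen here to mirror the "I am CVSGXX!" example from the text.

```perl
use strict;
use warnings;

# Sample string from the FAQ text: 12 bytes, nine Martian characters.
my $martian = "I am CVSGXX!";

# Naive byte-level search: Perl sees the bytes "G" and "X" side by side
# (the tail of "SG" followed by the head of "XX") and reports a match.
my $naive = ($martian =~ /GX/) ? 1 : 0;

# Assumed tokenizer: try an uppercase pair first, otherwise take one byte.
# This is an illustrative pattern, not taken from the FAQ.
my @chars = $martian =~ /([A-Z][A-Z]|.)/gs;

# Character-aware search: compare whole Martian characters, not raw bytes.
my $aware = (grep { $_ eq 'GX' } @chars) ? 1 : 0;

print "bytes: ", length($martian), "\n";
print "characters: ", scalar(@chars), "\n";
print "naive match: $naive\n";
print "character-aware match: $aware\n";
```

Splitting the string into characters first trades speed and memory for correctness: the byte-level pattern matches across the "SG"/"XX" boundary, while the comparison on whole characters does not.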