author | Perl 5 Porters <perl5-porters@africa.nicoh.com> | 1997-04-25 00:00:00 +1200 |
---|---|---|
committer | Chip Salzenberg <chip@atlantic.net> | 1997-04-25 00:00:00 +1200 |
commit | fc36a67e8855d031b2a6921819d899eb149eee2d (patch) | |
tree | 7e927725470a83d271eae7d78123f60cb86e60df /pod/perlfaq6.pod | |
parent | 74a7701791a30556a92328b89e5a00414a4ce4a3 (diff) | |
download | perl-fc36a67e8855d031b2a6921819d899eb149eee2d.tar.gz |
[inseparable changes from match from perl-5.003_97h to perl-5.003_97i]
CORE PORTABILITY
Subject: Provide memset() if it's missing
From: Chip Salzenberg <chip@perl.com>
Files: global.sym perl.h proto.h util.c
Subject: Don't tell GCC that warn(), croak(), and die() are printf-like
From: Chip Salzenberg <chip@perl.com>
Files: proto.h
DOCUMENTATION
Subject: FAQ update (24-apr-97)
Date: Thu, 24 Apr 1997 16:47:23 -0600 (MDT)
From: Nathan Torkington <gnat@prometheus.frii.com>
Files: pod/perlfaq*.pod
private-msgid: 199704242247.QAA07010@prometheus.frii.com
OTHER CORE CHANGES
Subject: Misc. sv_vcatpvfn() fixes
From: Hugo van der Sanden <hv@crypt.compulink.co.uk>
Files: gv.c mg.c op.c perl.c pp.c pp_ctl.c sv.c toke.c util.c
Subject: Enforce order of sprintf() elements
From: Chip Salzenberg <chip@perl.com>
Files: sv.c
Subject: Guard against long numbers, <<LONG_DELIM, and <long glob>
From: Chip Salzenberg <chip@perl.com>
Files: global.sym mg.c perl.c pod/perldiag.pod proto.h toke.c util.c
Subject: Guard against C<goto> to deeply nested label
From: Chip Salzenberg <chip@perl.com>
Files: pod/perldiag.pod pp_ctl.c
Subject: Guard against overflow in dup2() emulation
From: Chip Salzenberg <chip@perl.com>
Files: util.c
Subject: Win32: Guard against long function names
From: Chip Salzenberg <chip@perl.com>
Files: win32/win32sck.c
Subject: Make mess() always work, by using a non-arena SV
From: Chip Salzenberg <chip@perl.com>
Files: perl.c util.c
Subject: When copying a format line, take only its string value
From: Chip Salzenberg <chip@perl.com>
Files: sv.c
Subject: Fix LEAKTEST numbers
From: Chip Salzenberg <chip@perl.com>
Files: ext/DynaLoader/dl_vms.xs handy.h os2/os2.c util.c vms/vms.c win32/win32.c win32/win32sck.c
Diffstat (limited to 'pod/perlfaq6.pod')
-rw-r--r-- | pod/perlfaq6.pod | 32 |
1 files changed, 17 insertions, 15 deletions
diff --git a/pod/perlfaq6.pod b/pod/perlfaq6.pod
index 1af7948339..d21a11157b 100644
--- a/pod/perlfaq6.pod
+++ b/pod/perlfaq6.pod
@@ -1,6 +1,6 @@
 =head1 NAME
 
-perlfaq6 - Regexps ($Revision: 1.16 $, $Date: 1997/03/25 18:16:56 $)
+perlfaq6 - Regexps ($Revision: 1.17 $, $Date: 1997/04/24 22:44:10 $)
 
 =head1 DESCRIPTION
 
@@ -138,7 +138,8 @@ on matching balanced text.
 $/ must be a string, not a regular expression. Awk has to be better
 for something. :-)
 
-Actually, you could do this if you don't mind reading the whole file into
+Actually, you could do this if you don't mind reading the whole file
+into memory:
 
     undef $/;
     @records = split /your_pattern/, <FH>;
@@ -325,9 +326,9 @@ playing hot potato.
 Use the split function:
 
     while (<>) {
-        foreach $word ( split ) {
+        foreach $word ( split ) {
             # do something with $word here
-        }
+        }
     }
 
 Note that this isn't really a word in the English sense; it's just
@@ -360,7 +361,7 @@ in the previous question:
 If you wanted to do the same thing for lines, you wouldn't need a
 regular expression:
 
-    while (<>) {
+    while (<>) {
         $seen{$_}++;
     }
     while ( ($line, $count) = each %seen ) {
@@ -546,19 +547,20 @@ synonymous.
 The following set of approaches was offered by Jeffrey Friedl, whose
 article in issue #5 of The Perl Journal talks about this very matter.
 
-Let's suppose you have some weird Martian encoding where pairs of ASCII
-uppercase letters encode single Martian letters (i.e. the two bytes
-"CV" make a single Martian letter, as do the two bytes "SG", "VS",
-"XX", etc.). Other bytes represent single characters, just like ASCII.
+Let's suppose you have some weird Martian encoding where pairs of
+ASCII uppercase letters encode single Martian letters (i.e. the two
+bytes "CV" make a single Martian letter, as do the two bytes "SG",
+"VS", "XX", etc.). Other bytes represent single characters, just like
+ASCII.
 
-So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the nine
-characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.
+So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the
+nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.
 
 Now, say you want to search for the single character C</GX/>. Perl
-doesn't know about Martian, so it'll find the two bytes "GX" in the
-"I am CVSGXX!" string, even though that character isn't there: it just
-looks like it is because "SG" is next to "XX", but there's no real "GX".
-This is a big problem.
+doesn't know about Martian, so it'll find the two bytes "GX" in the "I
+am CVSGXX!" string, even though that character isn't there: it just
+looks like it is because "SG" is next to "XX", but there's no real
+"GX". This is a big problem.
 
 Here are a few ways, all painful, to deal with it:
 
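
The false match described in the last hunk can be reproduced in a few lines. This is only an illustrative sketch and not part of the patch: the variable names ($martian, @chars, $naive_hit, $real_hit) are made up here, and the tokenizing rule is an assumption read off the FAQ's own example (an adjacent pair of ASCII uppercase letters is one Martian character, and any other byte, including a lone uppercase letter such as 'I', stands for itself).

```perl
use strict;
use warnings;

my $martian = "I am CVSGXX!";

# Naive byte-level search: "GX" is found straddling the characters SG and XX.
my $naive_hit = ($martian =~ /GX/) ? 1 : 0;

# Assumed tokenizing rule: try an uppercase pair first, otherwise take one byte.
# Every position matches one alternative, so the tokens cover the whole string.
my @chars = $martian =~ /([A-Z][A-Z]|.)/gs;

# Compare whole Martian characters instead of raw bytes.
my $real_hit = scalar grep { $_ eq 'GX' } @chars;

print "bytes: ", length($martian), ", characters: ", scalar(@chars), "\n";
print "naive match: $naive_hit, character-aware match: $real_hit\n";
```

Under these assumptions the string splits into the nine characters listed in the FAQ text, the byte-level search reports a hit, and the character-level comparison does not. Tokenizing first is just one way to show the problem; the approaches the FAQ goes on to list may differ.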