Teach Perl about Unicode named character sequences

mktables is changed to process the Unicode named sequence file. charnames.pm is changed to cache the looked-up values in utf8. A new function, string_vianame, is created that can handle named sequences, as the interface for vianame cannot. The subroutine lookup_name() is slightly refactored to do almost all of the common work for \N{} and the vianame routines. It now understands named sequences as created by mktables. Tests and documentation are added. In the randomized testing section, half of the tests use vianame() and half use string_vianame().
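
The difference between the two interfaces can be sketched as follows. This is a minimal illustration, not code from the commit; it assumes a perl that ships this change (5.14 or later) and uses KATAKANA LETTER AINU P, a named sequence from Unicode's NamedSequences.txt, as the example name. vianame() returns a single code point, so it has no way to return a multi-character sequence, while string_vianame() returns a string.

```perl
use strict;
use warnings;
use charnames ':full';

# vianame() returns one integer code point for a single named character.
my $cp = charnames::vianame("GREEK SMALL LETTER ALPHA");
printf "vianame: U+%04X\n", $cp;

# string_vianame() returns a string, so it can hold a named sequence
# made up of more than one code point.
my $seq = charnames::string_vianame("KATAKANA LETTER AINU P");
my @cps = map { sprintf "U+%04X", ord } split //, $seq;
printf "string_vianame: %d code points (%s)\n", scalar(@cps), join(" ", @cps);

# For a single character, string_vianame() returns a one-character string.
my $alpha = charnames::string_vianame("GREEK SMALL LETTER ALPHA");
printf "single character length: %d\n", length $alpha;
```

Returning a string rather than an integer is what lets the same lookup path serve both \N{} and the run-time routines.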
author: Karl Williamson <public@khwilliamson.com> 2010-09-12 21:33:12 -0600
committer: Father Chrysostomos <sprout@cpan.org> 2010-09-25 00:47:02 -0700
commit: fb121860c2407cd1d1566d63a95a5220fa93d8e4 (patch)
tree: cc61893dd3ffe9966e079addeaa538172e2290e9 /pod/perlre.pod
parent: 8ebef31d4feab4b7c35ff0eb427632a67b1abdd9 (diff)
download: perl-fb121860c2407cd1d1566d63a95a5220fa93d8e4.tar.gz
1 files changed, 4 insertions, 4 deletions
diff --git a/pod/perlre.pod b/pod/perlre.pod
index b9216c156c..88089ee1d7 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -231,7 +231,7 @@ also work:
  \e          escape (think troff)  (ESC)
  \cK         control char          (example: VT)
  \x{}, \x00  character whose ordinal is the given hexadecimal number
- \N{name}    named Unicode character
+ \N{name}    named Unicode character or character sequence
  \N{U+263D}  Unicode character     (example: FIRST QUARTER MOON)
  \o{}, \000  character whose ordinal is the given octal number
  \l          lowercase next char (think vi)
@@ -316,9 +316,9 @@ See L</Extended Patterns> below for details.
 =item [7]
 
 Note that C<\N> has two meanings.  When of the form C<\N{NAME}>, it matches the
-character whose name is C<NAME>; and similarly when of the form
-C<\N{U+I<wide hex char>}>, it matches the character whose Unicode ordinal is
-I<wide hex char>.  Otherwise it matches any character but C<\n>.
+character or character sequence whose name is C<NAME>; and similarly
+when of the form C<\N{U+I<hex>}>, it matches the character whose Unicode
+code point is I<hex>.  Otherwise it matches any character but C<\n>.
 
 =back