author: Karl Williamson <khw@khw-desktop.(none)> 2009-12-21 11:44:35 -0700
committer: Rafael Garcia-Suarez <rgs@consttype.org> 2009-12-22 11:44:37 +0100
commit: 0111a78fcc993bdfaa4b46112924c3a9751ecfa5 (patch)
tree: f9dc23978c71cd47fd18e36fff0613f8673b58e1 /pod/perlrebackslash.pod
parent: c3c0aa283b73660f84ae7e190dcbbd607facb512 (diff)
Fix up pods for \X
Diffstat (limited to 'pod/perlrebackslash.pod')
-rw-r--r-- pod/perlrebackslash.pod | 18
1 files changed, 7 insertions, 11 deletions
diff --git a/pod/perlrebackslash.pod b/pod/perlrebackslash.pod
index 40f73fcbc1..e8ffcf16d0 100644
--- a/pod/perlrebackslash.pod
+++ b/pod/perlrebackslash.pod
@@ -100,7 +100,7 @@ quoted constructs>.
\w Character class for word characters.
\W Character class for non-word characters.
\x{}, \x00 Hexadecimal escape sequence.
- \X Extended Unicode "combining character sequence".
+ \X Unicode "extended grapheme cluster".
\z End of string.
\Z End of string.
@@ -507,18 +507,14 @@ metacharacter, and suggests C<\R> as the notation.
=item \X
-This matches an extended Unicode I<combining character sequence>, and
-is equivalent to C<< (?>\PM\pM*) >>. C<\PM> matches any character that is
-not considered a Unicode mark character, while C<\pM> matches any character
-that is considered a Unicode mark character; so C<\X> matches any non
-mark character followed by zero or more mark characters. Mark characters
-include (but are not restricted to) I<combining characters> and
-I<vowel signs>.
+This matches a Unicode I<extended grapheme cluster>.
C<\X> matches quite well what normal (non-Unicode-programmer) usage
-would consider a single character: for example a base character
-(the C<\PM> above), for example a letter, followed by zero or more
-diacritics, which are I<combining characters> (the C<\pM*> above).
+would consider a single character. As an example, consider a G with some sort
+of accent mark over it (a diacritic). There is no such single character in
+Unicode, but something like one can be constructed by using a G followed by a
+Unicode combining accent, and would be displayed by Unicode-aware software as
+if it were a single character.
Mnemonic: eI<X>tended Unicode character.
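
Reviewer note (not part of the patch): the "G with an accent" example in the new text can be checked with a short script. This is a minimal sketch, assuming U+0301 COMBINING ACUTE ACCENT as the diacritic; the patch does not name a specific accent, and the variable names are only illustrative.

```perl
use strict;
use warnings;

# ASSUMPTION: U+0301 COMBINING ACUTE ACCENT stands in for the unnamed
# "combining accent" in the patch text.
my $g_acute = "G\x{301}";          # two code points, displayed as one character
my $text    = "a" . $g_acute . "b";

# Each \X match consumes one user-perceived character.
my @clusters = $text =~ /(\X)/g;

my $codepoints    = length $text;        # counts code points
my $cluster_count = scalar @clusters;    # counts \X matches

binmode STDOUT, ':encoding(UTF-8)';
printf "%d code points, %d clusters\n", $codepoints, $cluster_count;
```

The accented G counts as two code points for length() but as a single \X match, which is the behaviour the rewritten paragraph describes.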