author | Karl Williamson <khw@cpan.org> | 2020-03-19 22:13:30 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2020-03-20 07:44:31 -0600 |
commit | cc06e157d785a514b8f825dccebf13aec98e7a27 (patch) | |
tree | 9f6f4a73094a6cc39a4773d2d48fb35699c094be /pod/perlunicode.pod | |
parent | 770e79e94914f38efa409664c53df95c2b5073f3 (diff) | |
download | perl-cc06e157d785a514b8f825dccebf13aec98e7a27.tar.gz |
Add named sequences to Unicode wildcard name capabilities
Prior to this commit, specifying a named sequence would result in a
mostly unhelpful fatal error message. This makes their use legal.
This is also the beginning of allowing Unicode string properties, which
are a new thing in the (still draft) Unicode requirements for regular
expression parsing, UTS 18. Full compliance will have to come later.
Diffstat (limited to 'pod/perlunicode.pod')
-rw-r--r-- | pod/perlunicode.pod | 16 |
1 files changed, 6 insertions, 10 deletions
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index fb446d62e2..fa1710dfd3 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -938,7 +938,7 @@ summarizes the differences between these two:
  can interpolate    only with eval       yes            [1]
  custom names            yes             no             [2]
  name aliases            yes             yes            [3]
- named sequences         yes             not yet        [4]
+ named sequences         yes             yes            [4]
  name value parsing     exact       Unicode loose       [5]
 
 =over
@@ -965,10 +965,6 @@ Some characters have multiple names (synonyms).
 Some particular sequences of characters are given a single name, in
 addition to their individual ones.
 
-It is planned to add support for named sequences to the C<\p{...}> form
-before 5.32; in the meantime, an accurate but not fully informative
-message is generated if use of one of these is attempted.
-
 =item [5]
 
 Exact name value matching means you have to specify case, hyphens,
@@ -1079,11 +1075,11 @@ matched by your pattern.
 It's likely that a future release will raise a warning if your pattern
 ends up causing every possible code point to match.
 
-Starting in 5.32, the Name and Name Aliases properties are allowed to be
-matched. They are considered to be a single combination property, just
-as has long been the case for C<\N{}>. Loose matching doesn't work in
-exactly the same way for these as it does for the values of other
-properties. The rules are given in
+Starting in 5.32, the Name, Name Aliases, and Named Sequences properties
+are allowed to be matched. They are considered to be a single
+combination property, just as has long been the case for C<\N{}>. Loose
+matching doesn't work in exactly the same way for these as it does for
+the values of other properties. The rules are given in
 L<https://www.unicode.org/reports/tr44/tr44-24.html#UAX44-LM2>. As a
 result, Perl doesn't try loose matching for you, like it does in other
 properties. All letters in names are uppercase, but you can add C<(?i)>
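The documentation change above says named sequences are now accepted by the C<\p{name=...}> form as well as by C<\N{...}>. A minimal sketch of what that might look like follows. It is an illustration under assumptions, not code from this commit: the sequence name is taken from Unicode's NamedSequences.txt, the variables C<$name>, C<$seq>, and C<$matched> are hypothetical, and the C<\p{name=...}> match is wrapped in C<eval> because it is only expected to work on perl 5.32 or later.

```perl
use strict;
use warnings;
use charnames ();

# A named sequence: a single name that stands for more than one code point.
my $name = 'LATIN CAPITAL LETTER A WITH MACRON AND GRAVE';

# charnames::string_vianame() resolves names at run time and, unlike
# charnames::vianame(), can return a multi-character string.
my $seq = charnames::string_vianame($name);
die "named sequence not known to this perl\n" unless defined $seq;

printf "%s => %s\n", $name,
    join ' ', map { sprintf 'U+%04X', ord($_) } split //, $seq;

# Assumption: on perl 5.32+, \p{name=...} accepts the same name and matches
# the whole sequence. The eval block keeps an older perl from dying here.
my $matched = eval { $seq =~ /\A\p{name=$name}\z/ ? 1 : 0 };
print defined $matched
    ? "\\p{name=...} matched: $matched\n"
    : "\\p{name=...} not usable on this perl: $@";
```

The name is interpolated into C<\p{name=...}> directly, which the comparison table in the diff lists as supported for that form, whereas C<\N{...}> would need a string C<eval> to do the same.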