author | Karl Williamson <public@khwilliamson.com> | 2012-08-26 11:25:13 -0600 |
---|---|---|
committer | Karl Williamson <public@khwilliamson.com> | 2012-08-26 12:28:28 -0600 |
commit | 1e958ea900b080a533d425464154978759f37121 (patch) | |
tree | b406539537a475284369ab74a636b14b869eeda9 /utf8.c | |
parent | 8f78a100ba7595776f161ae7fa4a2780a2e3faca (diff) | |
download | perl-1e958ea900b080a533d425464154978759f37121.tar.gz |
Prepare for Unicode 6.2
This changes code to be able to handle Unicode 6.2, while continuing to
handle all previous releases.
The major change was a new definition of \X, which adds a property to
its calculation. Unfortunately \X is hard-coded into regexec.c, and so
has to be revised whenever there is a change of this magnitude in Unicode,
which fortunately isn't all that often. I refactored the code in
mktables to make it easier next time there is a change like this one.
Diffstat (limited to 'utf8.c')
-rw-r--r-- | utf8.c | 16 |
1 files changed, 13 insertions, 3 deletions
@@ -2270,13 +2270,13 @@ Perl_is_utf8_X_prepend(pTHX_ const U8 *p)
 }
 
 bool
-Perl_is_utf8_X_non_hangul(pTHX_ const U8 *p)
+Perl_is_utf8_X_special_begin(pTHX_ const U8 *p)
 {
     dVAR;
 
-    PERL_ARGS_ASSERT_IS_UTF8_X_NON_HANGUL;
+    PERL_ARGS_ASSERT_IS_UTF8_X_SPECIAL_BEGIN;
 
-    return is_utf8_common(p, &PL_utf8_X_non_hangul, "_X_HST_Not_Applicable");
+    return is_utf8_common(p, &PL_utf8_X_special_begin, "_X_Special_Begin");
 }
 
 bool
@@ -2289,6 +2289,16 @@ Perl_is_utf8_X_L(pTHX_ const U8 *p)
     return is_utf8_common(p, &PL_utf8_X_L, "_X_GCB_L");
 }
 
+bool
+Perl_is_utf8_X_RI(pTHX_ const U8 *p)
+{
+    dVAR;
+
+    PERL_ARGS_ASSERT_IS_UTF8_X_RI;
+
+    return is_utf8_common(p, &PL_utf8_X_RI, "_X_RI");
+}
+
 /* These constants are for finding GCB=LV and GCB=LVT. These are for the
  * pre-composed Hangul syllables, which are all in a contiguous block and
  * arranged there in such a way so as to facilitate alorithmic determination of
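For readers unfamiliar with the property behind the new Perl_is_utf8_X_RI test, here is a minimal standalone sketch. It assumes that "_X_RI" corresponds to the Grapheme_Cluster_Break=Regional_Indicator property introduced in Unicode 6.2, which covers U+1F1E6 through U+1F1FF. The helper names decode_utf8_4byte and sketch_is_utf8_regional_indicator are hypothetical and are not part of Perl. The patch itself looks the property up in a table via is_utf8_common, whereas this sketch hard-codes the range only to illustrate what such a test answers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Perl: decode one 4-byte UTF-8 sequence.
 * Returns false if the lead byte or any continuation byte is malformed. */
static bool
decode_utf8_4byte(const uint8_t *p, size_t len, uint32_t *cp)
{
    size_t i;

    /* A 4-byte sequence starts with the bit pattern 11110xxx. */
    if (len < 4 || (p[0] & 0xF8) != 0xF0)
        return false;

    /* Each continuation byte must look like 10xxxxxx. */
    for (i = 1; i < 4; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return false;
    }

    *cp = ((uint32_t)(p[0] & 0x07) << 18)
        | ((uint32_t)(p[1] & 0x3F) << 12)
        | ((uint32_t)(p[2] & 0x3F) << 6)
        |  (uint32_t)(p[3] & 0x3F);
    return true;
}

/* Hypothetical stand-in for a table lookup: true if p starts with a
 * code point in U+1F1E6..U+1F1FF (assumed to be what "_X_RI" matches). */
static bool
sketch_is_utf8_regional_indicator(const uint8_t *p, size_t len)
{
    uint32_t cp;

    /* Every code point in the range needs four UTF-8 bytes, so anything
     * shorter or malformed cannot match. */
    if (!decode_utf8_4byte(p, len, &cp))
        return false;

    return cp >= 0x1F1E6 && cp <= 0x1F1FF;
}
```

A table-driven lookup, as used in utf8.c, has the advantage that the set of matching code points can follow whichever Unicode version the tables were generated from, which a hard-coded range like this one cannot do.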