author | Jarkko Hietaniemi <jhi@iki.fi> | 2002-04-16 03:59:00 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2002-04-16 03:59:00 +0000 |
commit | 82686b017bb20f55e16f84c47f7ac0bf8d0c714b (patch) | |
tree | e7ad28a90ea768b323c2fb37103841ceb7b8dd93 /handy.h | |
parent | 58858581d2d18dc2bff021fb2c755408c36929c4 (diff) | |
my $utf8here, our $utf8here, and package variable $utf8here.
The actual minimal fix is in utf8.c and from NI-S,
the rest are the tests (in fresh_perl since I couldn't get
them easily to work elsewhere) and a slight behaviour change:
previously UTF-8 identifiers had to start with an alphabetic
character. Now they can start with any (Unicode)
ID_Continue character that is not a (Unicode) digit.
(Limiting the first character to ID_Start would be rather
restrictive, since ID_Start allows only alphabetic letters.)
TODO: use vars qw($utf8here). I don't consider this
a showstopper.
p4raw-id: //depot/perl@15943
Diffstat (limited to 'handy.h')
-rw-r--r-- | handy.h | 5 |
1 files changed, 4 insertions, 1 deletions
    @@ -460,7 +460,10 @@ Converts the specified character to lowercase.
     #define isBLANK_LC_uni(c)	isBLANK(c) /* could be wrong */
     
     #define isALNUM_utf8(p)		is_utf8_alnum(p)
    -#define isIDFIRST_utf8(p)	is_utf8_idfirst(p)
    +/* The ID_Start of Unicode is quite limiting: it assumes a L-class
    + * character (meaning that you cannot have, say, a CJK character).
    + * Instead, let's allow ID_Continue but not digits. */
    +#define isIDFIRST_utf8(p)	(is_utf8_idcont(p) && !is_utf8_digit(p))
     #define isALPHA_utf8(p)		is_utf8_alpha(p)
     #define isSPACE_utf8(p)		is_utf8_space(p)
     #define isDIGIT_utf8(p)		is_utf8_digit(p)
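
The new first-character rule (an ID_Continue character that is not a digit) can be illustrated with a small standalone sketch. This is not the code in utf8.c: `sketch_is_digit`, `sketch_is_idcont` and `SKETCH_IS_IDFIRST` are hypothetical stand-ins that take a decoded code point instead of a pointer to UTF-8 bytes, and they cover only ASCII plus one hand-picked combining mark (U+0301, which Unicode lists as ID_Continue but not ID_Start).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for is_utf8_digit(): ASCII digits only. */
static bool sketch_is_digit(uint32_t cp)
{
    return cp >= '0' && cp <= '9';
}

/* Hypothetical stand-in for is_utf8_idcont(): ASCII letters, digits,
 * underscore, plus U+0301 COMBINING ACUTE ACCENT as one sample of a
 * character that is ID_Continue without being ID_Start. */
static bool sketch_is_idcont(uint32_t cp)
{
    return (cp >= 'a' && cp <= 'z')
        || (cp >= 'A' && cp <= 'Z')
        || cp == '_'
        || cp == 0x0301
        || sketch_is_digit(cp);
}

/* Same shape as the new isIDFIRST_utf8(): continue-class, but not a digit.
 * Note that the argument is evaluated twice, as in the original macro. */
#define SKETCH_IS_IDFIRST(cp) (sketch_is_idcont(cp) && !sketch_is_digit(cp))
```

Under these assumptions a digit is still accepted inside an identifier but rejected as its first character, while the combining mark, which an ID_Start-only test would reject, is now accepted in first position.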