| Commit message | Author | Age | Files | Lines |
|
|
|
|
| |
Message-ID: <3B9D23D6.90BCCC25@rowman.com>
p4raw-id: //depot/perl@11986
|
|
|
|
|
|
| |
and Perl will be built to do that by default (adding that
will break scripts containing non-UTF-8 binary data, such as Latin-1).
p4raw-id: //depot/perl@11656
|
|
|
| |
p4raw-id: //depot/perl@11652
|
|
|
|
|
| |
Message-Id: <200107061339.JAA12582@bottesini.harvard.edu>
p4raw-id: //depot/perl@11184
|
|
|
|
|
|
| |
patch: rename HINT_BYTE and IN_BYTE to HINT_BYTES and IN_BYTES
to match the pragma name; various robustness cleanups.
p4raw-id: //depot/perl@10339
|
|
|
|
|
| |
Message-Id: <5.0.2.1.1.20010421192107.01ce5a50@ix.netcorps.com>
p4raw-id: //depot/perl@9775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
into mainline.
fix a broken workaround for Borland compiler in change#4739
(caused weird "short reads" on DATA, which caused op/misc.t to fail)
nits spotted by Borland compiler
avoid redefinition warnings under Borland 5.02
various nits identified by the Borland 5.5 compiler; remove suppression
of a few warnings
p4raw-link: @9496 on //depot/maint-5.6/perl: 9d05ad52b0aa7d1f7d147da0c4dbc14de5fe4a37
p4raw-link: @9495 on //depot/maint-5.6/perl: 759997f1e719f33541bed70dd7f79bfa26a930b3
p4raw-link: @9494 on //depot/maint-5.6/perl: 01b59bde1cb7ff62776f3b83c0f2575c79a950a6
p4raw-link: @9493 on //depot/maint-5.6/perl: eea7051a8d4ef81c032143ab3193bc1240ab2e8f
p4raw-link: @4739 on //depot/perl: c39cd00800303e8967294e98aa4c427a1872a251
p4raw-id: //depot/perl@9497
p4raw-integrated: from //depot/maint-5.6/perl@9492 'merge in' sv.c
utf8.h (@9288..) toke.c (@9292..) ext/File/Glob/bsd_glob.c
(@9415..) win32/makefile.mk (@9426..) win32/win32.h (@9494..)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Lose the extra level of function on ASCII.
- spotted a chr(0) issue in sv.c
- re-work of UTF-X tr/// ranges to work in Unicode
space. Still issues with the "0xff is illegal UTF-8" hack.
- Yet another ad hoc utf8 'upgrade' in op.c recoded
(why do it once when you can do it all over the place :-(
- Enable HINTS_UTF8 on EBCDIC - then ignore it in toke.c,
need utf8.pm for swashes.
- Simplified and commented scan_const() in toke.c
Still something wrong with regexp and tr (swashes?).
p4raw-id: //depot/perlio@9267
|
|
|
| |
p4raw-id: //depot/perlio@9246
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
encoding on EBCDIC platforms. This has the property that U+0000..U+009F, i.e.
a superset of ASCII, are invariant under the encoding. This is EBCDIC
friendly, as an encoded string can be looked at as being EBCDIC by the lexer,
sprintf("%d",...) etc., in the same manner that a UTF-8 string can be considered
ASCII on ASCII machines.
- re-arrange utf8.h to get ASCII specific vs Unicode generic bits
separate.
- Add some more macros to comprehend different shift amounts and
possible swizzle in UTF-EBCDIC vs UTF-8. Change utf8.c to use them.
- add utfebcdic.h which provides UTF-EBCDIC versions of the macros,
and conditionally #include it.
EBCDIC build as yet untested. ASCII still fails the one test.
p4raw-id: //depot/perlio@9185
|
|
|
| |
p4raw-id: //depot/perlio@9184
|
|
|
| |
p4raw-id: //depot/perlio@9180
|
|
|
| |
p4raw-id: //depot/perlio@9110
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- rename utf8/uv functions to indicate what sort of uv they provide (uvuni/uvchr)
- use utf8n_xxxx (cf. pvn) for forms which take length.
- back out vN.N and $^V exceptions to e2a/a2e
- make "locale" isxxx macros be uvchr (may be redundant?)
Not yet clear that the toUPPER_uni et al. return is being handled correctly.
The tr// and regexp stuff still needs an audit; the assumption is that they are working
in Unicode space.
Need to provide v5.6 names for XS modules (decide whether uni or chr?).
p4raw-id: //depot/perlio@9096
|
|
|
|
|
| |
Message-ID: <Pine.OSF.4.10.10103081617390.377472-100000@aspara.forte.com>
p4raw-id: //depot/perl@9082
|
|
|
| |
p4raw-id: //depot/perl@8770
|
|
|
| |
p4raw-id: //depot/perl@8647
|
|
|
| |
p4raw-id: //depot/perl@8323
|
|
|
| |
p4raw-id: //depot/perl@8289
|
|
|
|
|
| |
it revealed a bug in #8248 (the UTF8_EIGHT_BIT_LO() was wrong).
p4raw-id: //depot/perl@8249
|
|
|
|
|
|
|
|
|
| |
Internally: sv_catsv() wasn't quite okay on UTF-8; it assumed
that the only cases to care about are byte+byte and byte+character.
TODO: See how well pp_concat() could be implemented in terms
of sv_catsv().
p4raw-id: //depot/perl@8248
|
|
|
|
|
|
| |
everywhere because we do generate illegal UTF-8 in some situations.
This is of course naughty.
p4raw-id: //depot/perl@8033
|
|
|
| |
p4raw-id: //depot/perl@8028
|
|
|
| |
p4raw-id: //depot/perl@7700
|
|
|
|
|
|
| |
Subject: [ID 20001114.006] 5.7.0-7680 Solaris 8, 64 bit, utf8 patch
Message-Id: <20001114191623.G20559@Strawberry.COM>
p4raw-id: //depot/perl@7691
|
|
|
|
|
| |
Message-Id: <200011132249.eADMnek09679@garcia.efn.org>
p4raw-id: //depot/perl@7677
|
|
|
| |
p4raw-id: //depot/perl@7438
|
|
|
|
|
| |
UTF8LEN() and UTF8SKIP().
p4raw-id: //depot/perl@7437
|
|
|
|
|
|
|
|
|
|
|
|
| |
malformation happens. This involved adding an argument
to utf8_to_uv_chk(), which involved changing its prototype,
and preferring STRLEN over I32 for the UTF-8 length, which as
a domino effect necessitated changing the prototypes of
scan_bin(), scan_oct(), scan_hex(), and reg_uni().
The stricter UTF-8 decoding checking uses Markus Kuhn's
UTF-8 Decode Stress Tester from
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
p4raw-id: //depot/perl@7416
|
|
|
|
|
|
| |
Subject: [PATCH] Re: [ID 20000918.005] ~ on wide chars
Message-ID: <20001014205213.A9645@pembro4.pmb.ox.ac.uk>
p4raw-id: //depot/perl@7235
|
|
|
| |
p4raw-id: //depot/perl@7154
|
|
|
| |
p4raw-id: //depot/perl@7153
|
|
|
|
|
|
|
| |
permitted by change#5011 (from Gisle Aas)
p4raw-link: @5011 on //depot/perl: 3c77ea2bace63b1ad27d15a6366cb938bdd158cb
p4raw-id: //depot/perl@5136
|
|
|
| |
p4raw-id: //depot/perl@5011
|
|
|
|
|
| |
years (from Gisle Aas)
p4raw-id: //depot/perl@5009
|
|
|
|
|
| |
perlunicode.pod that reflects changes to unicode support so far
p4raw-id: //depot/perl@4941
|
|
|
|
|
|
|
|
|
| |
whether to use widechar semantics; lexer and RE engine continue
to need "use utf8" to enable unicode awareness in literals
and patterns (TODO: this needs to be fixed); $1 et al are marked
SvUTF8 if the pattern was compiled for utf8 (TODO: propagating
it from the data is probably better)
p4raw-id: //depot/perl@4930
|
|
|
|
|
| |
Basic SvUTF8 stuff in headers, no functional changes yet.
p4raw-id: //depot/utfperl@4193
|
|
|
|
|
|
| |
headers, so perl can be built even in C++ mode; win32
build fixups; regen headers
p4raw-id: //depot/perl@3537
|
|
|
| |
p4raw-id: //depot/perl@3124
|
|
|
| |
p4raw-id: //depot/perl@2241
|
|
|
|
|
| |
p4raw-link: @1927 on //depot/perl: eb07465ebe1238598e948058857ec948c6697f86
p4raw-id: //depot/perl@1936
|
|
|
|
|
|
| |
s/PL_utf8skip/utf8skip/ for now, or we end up with Perl_PL_;
add typecasts to silence warnings; tweaks for win32 builds
p4raw-id: //depot/perl@1663
|
|
p4raw-id: //depot/utfperl@1651
|