| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- rename utf8/uv functions to indicate what sort of uv they provide (uvuni/uvchr)
- use utf8n_xxxx (cf. pvn) for forms which take length.
- back out vN.N and $^V exceptions to e2a/a2e
- make "locale" isxxx macros be uvchr (may be redundant?)
Not clear yet that the return values of toUPPER_uni et al. are being handled correctly.
The tr// and regexp stuff still needs an audit; the assumption is they are working
in Unicode space.
Need to provide v5.6 names for XS modules (decide whether uni or chr?).
p4raw-id: //depot/perlio@9096
|
|
|
|
|
|
|
| |
from el zero
Message-ID: <15013.20716.201459.540421@ix.netsoft.ro>
p4raw-id: //depot/perl@9068
|
|
|
|
|
|
| |
by not changing from pre-Unicode days into being Unicode-aware.
Sniff.
p4raw-id: //depot/perl@8966
|
|
|
|
|
|
|
|
| |
In-Reply-To: <20010227140737.Y10633@chaos.wustl.edu>
Message-ID: <Pine.LNX.4.30.0102271322070.8623-100000@lapaki.jach.hawaii.edu>
Replace djSP with dSP.
p4raw-id: //depot/perl@8963
|
|
|
|
|
| |
without rethinking utf8decode.t.
p4raw-id: //depot/perl@8880
|
|
|
| |
p4raw-id: //depot/perl@8875
|
|
|
| |
p4raw-id: //depot/perl@8869
|
|
|
|
|
|
| |
evil influence of 'use bytes'. Similarly, unpack("C", ...)
will understand Unicode, unless under you-know-what.
p4raw-id: //depot/perl@8865
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(via maintperl #8855).
Fixed %^H scoping bug
Removed GV <-> CV refcount loop
Removed %ENV refcount loop
p4raw-link: @8850 on //depot/maint-5.6/pureperl: bae1eca58b94313e4b7677aa241da9fad57bb363
p4raw-link: @8845 on //depot/maint-5.6/pureperl: 4d40626c12bbdd62acfbbe3be104711e58cec2f7
p4raw-link: @8844 on //depot/maint-5.6/pureperl: ea100fc6cfd2f0e23aceb84ac0e804e3c9c3c9a2
p4raw-id: //depot/perl@8858
p4raw-integrated: from //depot/maint-5.6/perl@8857 'merge in' gv.c
scope.c (@8606..) pp.c (@8635..) op.c (@8758..) perl.c
(@8806..)
|
|
|
|
|
|
|
| |
Message-ID: <20010130195105.R76607@plum.flirble.org>
op/inc cure.
p4raw-id: //depot/perl@8637
|
|
|
| |
p4raw-id: //depot/perl@8561
|
|
|
|
|
|
|
|
|
| |
Message-ID: <5930DC161690D2119667009027157547038C8A85@madt009a.siemens.es>
pp_int() was dropping an NV to the floor,
int(279964589018079/59) either returned the non-integer
4745162525730.15, or one got "Attempt to free unreferenced scalar."
p4raw-id: //depot/perl@8464
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- The substr lval was still not okay.
- Now pp_stringify and sv_setsv copy the source's UTF8 flag
even if IN_BYTE. pp_stringify is called from fold_constants
at optimization phase and "\x{100}" was made SvUTF8_off under
use bytes (the bytes pragma is for "byte semantics" and not
for "do not produce UTF8 data")
- New `qu' operator to generate UTF8 string explicitly.
Though I agree with the policy "0x00-0xff always produce bytes",
I sometimes want such a string to be encoded in UTF8.
I can use pack"U0a*" but it requires more typing and has
runtime overhead.
- Fix pp_regcomp bug uncovered by "0x00-0xff always produce bytes"
change; the bug appears if a pm has the PMdf_UTF8 flag but the interpolated
string is not UTF8_on and has chars 0x80-0xff.
TODO: document and test qu.
p4raw-id: //depot/perl@8439
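For illustration only: the notes above turn on the difference between a string of bytes 0x00-0xff and the same characters encoded as UTF8. A minimal C sketch of that re-encoding follows. The helper name `bytes_to_utf8_sketch` is hypothetical and this is not the code from the change; it only shows that each char 0x80-0xff becomes two bytes once the string is UTF8.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API.
 * Re-encodes n bytes (read as code points 0x00-0xff) as UTF-8 into
 * out, which must have room for at least 2*n bytes.
 * Returns the number of bytes written. */
static size_t bytes_to_utf8_sketch(const unsigned char *in, size_t n,
                                   unsigned char *out)
{
    size_t i, o = 0;

    for (i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            /* 0x00-0x7f: same single byte in both forms */
            out[o++] = in[i];
        } else {
            /* 0x80-0xff: lead byte 0xC2 or 0xC3, then one continuation byte */
            out[o++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[o++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    return o;
}
```

A string compared byte-for-byte against its re-encoded form stops matching as soon as it holds a char in 0x80-0xff, which is the kind of mismatch the pp_regcomp item describes.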
|
|
|
|
|
| |
Message-ID: <14941.16925.736415.785818@soda.csua.berkeley.edu>
p4raw-id: //depot/perl@8417
|
|
|
| |
p4raw-id: //depot/perl@8415
|
|
|
|
|
| |
Should finish up the bug id 20001205.022.
p4raw-id: //depot/perl@8382
|
|
|
|
|
|
|
| |
Message-ID: <3A59E510.52BAB5B9@st.rim.or.jp>
UTF-8 fixes for 'x' and tr///.
p4raw-id: //depot/perl@8378
|
|
|
| |
p4raw-id: //depot/perl@8328
|
|
|
| |
p4raw-id: //depot/perl@8323
|
|
|
| |
p4raw-id: //depot/perlio@8298
|
|
|
| |
p4raw-id: //depot/perl@8289
|
|
|
|
|
|
|
|
| |
Message-ID: <3A4D722D.243AFD88@st.rim.or.jp>
Just the patch part for now, and the pragma renamed
as unicode::distinct.
p4raw-id: //depot/perl@8267
|
|
|
|
|
| |
it revealed a bug in #8248 (the UTF8_EIGHT_BIT_LO() was wrong).
p4raw-id: //depot/perl@8249
|
|
|
|
|
|
| |
non-progress) assumed bytes instead of characters in s///
and split().
p4raw-id: //depot/perl@8245
|
|
|
| |
p4raw-id: //depot/perl@8244
|
|
|
|
|
| |
Message-ID: <20001227023003.A7677@deep-dark-truthful-mirror.perlhacker.org>
p4raw-id: //depot/perl@8243
|
|
|
|
|
|
| |
in Digital UNIX (the broken strtoul brokenness detection
seems to have been the fly in the ointment).
p4raw-id: //depot/perl@8138
|
|
|
|
|
| |
(it basically is 8102..8118+8122 but no 8120, 8121, 8123, 8124)
p4raw-id: //depot/perl@8125
|
|
|
|
|
| |
Message-ID: <20001213200849.B71166@plum.flirble.org>
p4raw-id: //depot/perl@8119
|
|
|
|
|
| |
Message-ID: <20001211012144.A23467@deep-dark-truthful-mirror.perlhacker.org>
p4raw-id: //depot/perl@8077
|
|
|
|
|
|
|
|
|
|
| |
Message-ID: <20001210001333.A16221@deep-dark-truthful-mirror.perlhacker.org>
Make the prototype of CORE::substr be '$$;$$' instead of '$$;$;$'.
In other words, make the returned prototypes for any function
stop prepending the ';' optionality marker after the first one.
Once arguments start being optional, all the rest are optional.
p4raw-id: //depot/perl@8064
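The rule can be sketched in C as follows. This is an illustration under assumptions, not the code from the change: `build_prototype` and its parameters are hypothetical. It emits one ';' before the first optional argument and none after it.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Perl.
 * Writes a prototype for the argument sigils in args, of which the
 * first `required` are mandatory. A single ';' is emitted before the
 * first optional argument and never again.
 * Returns 0 on success, -1 if buf is too small. */
static int build_prototype(const char *args, size_t required,
                           char *buf, size_t buflen)
{
    size_t i, o = 0;

    for (i = 0; args[i] != '\0'; i++) {
        if (i == required) {
            if (o + 1 >= buflen)
                return -1;
            buf[o++] = ';';       /* the one and only optionality marker */
        }
        if (o + 1 >= buflen)
            return -1;
        buf[o++] = args[i];
    }
    if (buflen == 0)
        return -1;
    buf[o] = '\0';
    return 0;
}
```

With args "$$$$" and two required arguments this yields '$$;$$', the form given above for CORE::substr.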
|
|
|
| |
p4raw-id: //depot/perl@7984
|
|
|
|
|
| |
for unfinished and buggy :-)
p4raw-id: //depot/perl@7978
|
|
|
|
|
|
|
|
| |
Subject: Re: utf8 in hash keys, implementor missing
Message-ID: <20001202194935.A25673@pembro33.pmb.ox.ac.uk>
The first step at UTF-8 hash keys.
p4raw-id: //depot/perl@7977
|
|
|
|
|
| |
always use (at least) UTF8_MAXLEN + 1 U8s deep buffer.
p4raw-id: //depot/perl@7967
|
|
|
| |
p4raw-id: //depot/perl@7816
|
|\
| |
| | |
p4raw-id: //depot/perlio@7735
|
| |
| |
| | |
p4raw-id: //depot/perl@7732
|
|/
|
|
|
|
|
| |
Valid generic fix to auto-vivify code in rv2gv - only "upgrade" to
SVt_PVRV if not already something better (otherwise vivifying, say, magic
dumps core).
p4raw-id: //depot/perlio@7727
|
|
|
|
|
| |
Message-Id: <200011132249.eADMnek09679@garcia.efn.org>
p4raw-id: //depot/perl@7677
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if none of the characters in the string are > 0xff,
the result is a complemented byte string, not a (UTF-8)
char string. Based on the summary in
Subject: Re: [ID 20000918.005] ~ on wide chars
Message-ID: <jSDD6gzkgi/T092yn@efn.org>
This should give us the maximum backward (pre-char string)
compatibility and utf8 compatibility. The other alternative
would be to limit the bit complement to be always byte only,
taking the least significant byte of the chars.
p4raw-id: //depot/perl@7665
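A minimal C sketch of the rule described above, working on an array of code points rather than on Perl's internal string form. The name `complement_string` is hypothetical, and complementing wide characters as full 32-bit values is an assumption of this sketch, not something stated in the message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not Perl's implementation.
 * Complements n characters from in[] into out[].
 * Returns 1 if the result is to be treated as a byte string
 * (no input character > 0xff), 0 if it stays a char string. */
static int complement_string(const uint32_t *in, size_t n, uint32_t *out)
{
    size_t i;
    int bytes_only = 1;

    /* First pass: is any character wider than one byte? */
    for (i = 0; i < n; i++) {
        if (in[i] > 0xFF) {
            bytes_only = 0;
            break;
        }
    }

    /* Second pass: complement within the chosen width. */
    for (i = 0; i < n; i++) {
        if (bytes_only)
            out[i] = (uint8_t)~in[i];   /* keep only the low 8 bits */
        else
            out[i] = ~in[i];            /* assumption: full 32-bit complement */
    }
    return bytes_only;
}
```

The alternative mentioned above (always byte only) would amount to taking the first branch unconditionally, with in[i] truncated to its least significant byte.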
|
|
|
|
|
| |
Message-Id: <200010262100.e9QL03U06386@garcia.efn.org>
p4raw-id: //depot/perl@7454
|
|
|
|
|
|
|
|
| |
Rename utf8_to_uv_chk() back to utf8_to_uv() because it's
used much more than the simpler API, now called utf8_to_uv_simple().
Still not quite happy with the API: too much partial duplication
of functionality.
p4raw-id: //depot/perl@7439
|
|
|
| |
p4raw-id: //depot/perl@7438
|
|
|
|
|
| |
UTF8LEN() and UTF8SKIP().
p4raw-id: //depot/perl@7437
|
|
|
| |
p4raw-id: //depot/perl@7422
|
|
|
|
|
|
|
|
|
|
|
|
| |
malformation happens. This involved adding an argument
to utf8_to_uv_chk(), which involved changing its prototype,
and preferring STRLEN over I32 for the UTF-8 length, which as
a domino effect necessitated changing the prototypes of
scan_bin(), scan_oct(), scan_hex(), and reg_uni().
The stricter UTF-8 decoding checking uses Markus Kuhn's
UTF-8 Decode Stress Tester from
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
p4raw-id: //depot/perl@7416
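To illustrate what stricter decoding means in practice, here is a minimal C sketch. It is not utf8_to_uv_chk(): `decode_utf8_strict` and its signature are hypothetical, and it handles only sequences of up to four bytes. It takes the buffer length as a size_t (in the spirit of the STRLEN change above), reports the consumed length through an out parameter, and rejects stray continuation bytes, truncated sequences, and overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Perl's utf8_to_uv_chk().
 * Decodes one code point from s[0..len-1].
 * On success stores the number of bytes consumed in *retlen and
 * returns the code point; on malformation sets *retlen to 0 and
 * returns 0. Callers must test *retlen, since U+0000 also returns 0. */
static uint32_t decode_utf8_strict(const unsigned char *s, size_t len,
                                   size_t *retlen)
{
    uint32_t uv, min;
    size_t need, i;

    *retlen = 0;
    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                 /* ASCII */
        *retlen = 1;
        return s[0];
    } else if (s[0] < 0xC0) {          /* stray continuation byte */
        return 0;
    } else if (s[0] < 0xE0) {
        need = 2; uv = s[0] & 0x1F; min = 0x80;
    } else if (s[0] < 0xF0) {
        need = 3; uv = s[0] & 0x0F; min = 0x800;
    } else if (s[0] < 0xF8) {
        need = 4; uv = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                      /* longer forms not handled here */
    }

    if (len < need)
        return 0;                      /* truncated sequence */
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                  /* not a continuation byte */
        uv = (uv << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (uv < min)
        return 0;                      /* overlong encoding */

    *retlen = need;
    return uv;
}
```

Inputs such as the two-byte form 0xC0 0x80 (an overlong NUL) are among the cases a stress-test file of this kind exercises; in this sketch they come back with *retlen set to 0.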
|
|
|
|
|
|
|
|
| |
based on a part of
Subject: [ID 20001016.017] [jens: 5.7.0 Solaris 8, 64 Bit, Workshop 6.0 Compiler]
Message-Id: <20001017083936.A11104@Strawberry.COM>
p4raw-id: //depot/perl@7380
|
|
|
| |
p4raw-id: //depot/perl@7237
|
|
|
| |
p4raw-id: //depot/perl@7236
|