| Commit message (Collapse) | Author | Age | Files | Lines |
| |
[ 9400]
More EBCDIC tweaks:
- one more swash issue: &~(0xA0-1) did not do the right thing
for UTF-EBCDIC, where &~(0x80-1) does for UTF-8.
- add "use re 'asciirange'" to make [!-~] etc. work
use it in MIME::QuotedPrint and t/op/regexp.t and t/op/pat.t
- Choose a key for t/op/each.t test which gets encoded.
- Skip utf8decode if this is UTF-EBCDIC.
p4raw-link: @9400 on //depot/perlio: daf0f78e031c718c75590ef9ef573756f805776e
p4raw-id: //depot/perl@9407
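
The swash item above comes down to a property of bit masks: `x & ~(n-1)` rounds `x` down to a multiple of `n` only when `n` is a power of two. 0x80 is one; 0xA0 is not. A minimal sketch of the arithmetic, assuming nothing about the real swash code (both helper names are hypothetical):

```c
#include <assert.h>

/* Round x down using a mask. Only yields a multiple of n
 * when n is a power of two (0x80 is, 0xA0 is not). */
unsigned long round_down_mask(unsigned long x, unsigned long n)
{
    return x & ~(n - 1);
}

/* Round x down to a multiple of n for any non-zero n. */
unsigned long round_down_mod(unsigned long x, unsigned long n)
{
    assert(n != 0);
    return x - (x % n);
}
```

For x = 0x1A5 the two helpers agree when n is 0x80 (both give 0x180), but with n = 0xA0 the mask form gives 0x120, which is not a multiple of 0xA0, while the modulo form gives 0x140.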
| |
- Lose the extra level of function on ASCII.
- spotted a chr(0) issue in sv.c
- re-work of UTF-X tr/// ranges to work in Unicode
space. Still issues with the "0xff is illegal UTF-8" hack.
- Yet another ad hoc utf8 'upgrade' in op.c recoded
(why do it once when you can do it all over the place :-(
- Enable HINTS_UTF8 on EBCDIC - then ignore it in toke.c,
need utf8.pm for swashes.
- Simplified and commented scan_const() in toke.c
Still something wrong with regexp and tr (swashes?).
p4raw-id: //depot/perlio@9267
| |
fix leak in pregcomp() when RE fails to compile (e.g. m/\\/)
remove squelch controls for "Scalars leaked" messages in most places
(these are now cured)
fix another memory leak reported by purify (tie callbacks that
croak can leak when wiping out magic)
p4raw-link: @9142 on //depot/maint-5.6/perl: 26972843796e21c404c9d13ec5ee86e7b952a2bd
p4raw-link: @9138 on //depot/maint-5.6/perl: ad7f1144250940f9ca43bac32708ec5e718b30ff
p4raw-link: @9137 on //depot/maint-5.6/perl: 1f35595ecca168b4f66e3399344799fdbd496d17
p4raw-id: //depot/perl@9144
p4raw-integrated: from //depot/maint-5.6/perl@9143 'copy in'
t/pragma/strict-vars (@7318..) t/pragma/warn/regcomp (@7887..)
t/op/regexp.t (@8551..) t/op/lex_assign.t (@8987..) 'merge in'
t/op/local.t (@5902..) t/pragma/warn/op (@7846..)
t/pragma/warnings.t (@7895..) t/comp/proto.t (@8173..)
t/pragma/warn/toke (@8570..) regcomp.c (@8777..) scope.c
(@8855..) t/op/pat.t (@9076..)
| |
as it is doing compare against 'W' in \W etc.
p4raw-id: //depot/perlio@9106
| |
Use ASCII_TO_NATIVE and NATIVE_TO_ASCII to avoid some #ifs.
p4raw-id: //depot/perlio@9105
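
A minimal sketch of how a pair of conversion macros can keep `#if` blocks out of call sites. The `SKETCH_` names, the `SKETCH_EBCDIC` switch and the table names are hypothetical stand-ins, not the definitions in the Perl headers: on an ASCII build the macros are identities, on an EBCDIC build they would index 256-entry translation tables.

```c
#include <assert.h>

#ifdef SKETCH_EBCDIC
/* Hypothetical translation tables, defined elsewhere on EBCDIC builds. */
extern const unsigned char sketch_a2e[256];
extern const unsigned char sketch_e2a[256];
#  define SKETCH_ASCII_TO_NATIVE(c) (sketch_a2e[(unsigned char)(c)])
#  define SKETCH_NATIVE_TO_ASCII(c) (sketch_e2a[(unsigned char)(c)])
#else
/* On ASCII platforms both directions are the identity. */
#  define SKETCH_ASCII_TO_NATIVE(c) ((unsigned char)(c))
#  define SKETCH_NATIVE_TO_ASCII(c) ((unsigned char)(c))
#endif

/* True if the native character c maps into the ASCII range '!'..'~'
 * (0x21..0x7E). The caller needs no platform conditional. */
int sketch_is_ascii_graph(int c)
{
    unsigned int a;

    assert(c >= 0 && c <= 255);
    a = SKETCH_NATIVE_TO_ASCII(c);
    return a >= 0x21 && a <= 0x7E;
}
```

The platform test lives in one place (the macro definitions), so a range check written in ASCII code points reads the same on either kind of build.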
| |
p4raw-id: //depot/perl@9098
| |
- rename utf8/uv functions to indicate what sort of uv they provide (uvuni/uvchr)
- use utf8n_xxxx (cf. pvn) for forms which take length.
- back out vN.N and $^V exceptions to e2a/a2e
- make "locale" isxxx macros be uvchr (may be redundant?)
Not clear yet that the toUPPER_uni et al. return values are being handled correctly.
The tr// and regexp stuff still needs an audit; the assumption is that they are working
in Unicode space.
Need to provide v5.6 names for XS modules (decide whether uni or chr?).
p4raw-id: //depot/perlio@9096
| |
p4raw-id: //depot/perl@9058
| |
p4raw-id: //depot/perl@8935
| |
Attempt to fix the EBCDIC character range problem with //.
p4raw-id: //depot/perl@8930
| |
Fixes 20001230.002.
What still remains broken is that the submatches that
have \C in them get their UTF8 flag on because their
parent SV has it on. This will result in malformed
UTF8 if a \C happened to match a non-ASCII byte.
p4raw-id: //depot/perl@8836
| |
Message-Id: <200102130011.AAA14310@crypt.compulink.co.uk>
p4raw-id: //depot/perl@8779
| |
Subject: Re: Incorrect scoping of PL_reg_start_tmp causes leak
Message-Id: <3A808A9D.20F7A035@uk.sun.com>
p4raw-id: //depot/perl@8711
| |
Message-ID: <20010116144318.7140.qmail@plover.com>
p4raw-id: //depot/perl@8455
| |
Message-ID: <3A59E510.52BAB5B9@st.rim.or.jp>
UTF-8 fixes for 'x' and tr///.
p4raw-id: //depot/perl@8378
| |
p4raw-id: //depot/perl@8328
| |
p4raw-id: //depot/perl@8289
| |
Message-ID: <3A4D722D.243AFD88@st.rim.or.jp>
Just the patch part for now, and the pragma renamed
as unicode::distinct.
p4raw-id: //depot/perl@8267
| |
p4raw-id: //depot/perl@8184
| |
Fixes the bug 20001218.016.
p4raw-id: //depot/perl@8183
| |
Fixes at least the bugs 20001028.003 (both of them...) and
20001108.001. The bugs 20001114.001 and 20001205.014 seem
also to be fixed by now, probably already before this patch.
p4raw-id: //depot/perl@8143
| |
p4raw-id: //depot/perl@7984
| |
used to swash_init(), makes regprop() dumps more informative
(+utf8::IsAlpha, -utf8::IsDigit, for example).
p4raw-id: //depot/perl@7969
| |
p4raw-id: //depot/perl@7968
| |
always use (at least) UTF8_MAXLEN + 1 U8s deep buffer.
p4raw-id: //depot/perl@7967
| |
p4raw-id: //depot/perl@7940
| |
should it be required inside regexen?
p4raw-id: //depot/perl@7938
| |
stored and restored, and thusly was trounced by the utf8 swash
routines.
p4raw-id: //depot/perl@7937
| |
Not really fixed (should really dump the UTF-8 charclass),
but stopped displaying the garbage.
Also add a note on the (missing) Unicode PSXSPC and BLANK.
p4raw-id: //depot/perl@7874
| |
p4raw-id: //depot/perl@7870
| |
p4raw-id: //depot/perl@7824
| |
Message-ID: <20001120183051.A15228@monk.mps.ohio-state.edu>
p4raw-id: //depot/perl@7815
| |
Date: Fri, 17 Nov 2000 20:35:11 -0500
Message-ID: <20001117203511.A13121@monk.mps.ohio-state.edu>
Subject: Re: [PATCH 5.7.0] make regcomp reenterable
From: Ilya Zakharevich <ilya@math.ohio-state.edu>
Date: Fri, 17 Nov 2000 21:03:47 -0500
Message-ID: <20001117210347.A16570@monk.mps.ohio-state.edu>
Plus a little bit of tweaking in pregcomp().
p4raw-id: //depot/perl@7741
| |
Message-ID: <20001117172802.A1032@monk.mps.ohio-state.edu>
p4raw-id: //depot/perl@7733
| |
the test to run 0.5% _slower_. Requires much more instrumentation.
Retract #7590.
p4raw-id: //depot/perl@7591
| |
execution time in regcomp.c S_cl_any() and S_cl_is_anything()
by using memset() and testing bytewise (as opposed to bitwise).
p4raw-id: //depot/perl@7590
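
A minimal sketch of the bytewise-versus-bitwise distinction on a 256-bit character-class bitmap. The function names and the 32-byte layout are assumptions for illustration, not the actual S_cl_any() or S_cl_is_anything() code:

```c
#include <assert.h>
#include <string.h>

#define SKETCH_BITMAP_BYTES 32  /* 256 bits, one per byte value */

/* Bitwise: set each of the 256 bits individually. */
void sketch_set_all_bitwise(unsigned char *map)
{
    int i;

    assert(map != NULL);
    for (i = 0; i < 256; i++)
        map[i >> 3] |= (unsigned char)(1u << (i & 7));
}

/* Bytewise: fill the whole bitmap in one memset() call. */
void sketch_set_all_bytewise(unsigned char *map)
{
    assert(map != NULL);
    memset(map, 0xFF, SKETCH_BITMAP_BYTES);
}

/* Bitwise test: 256 single-bit probes. */
int sketch_is_all_set_bitwise(const unsigned char *map)
{
    int i;

    for (i = 0; i < 256; i++)
        if (!(map[i >> 3] & (1u << (i & 7))))
            return 0;
    return 1;
}

/* Bytewise test: 32 whole-byte comparisons. */
int sketch_is_all_set_bytewise(const unsigned char *map)
{
    int i;

    for (i = 0; i < SKETCH_BITMAP_BYTES; i++)
        if (map[i] != 0xFF)
            return 0;
    return 1;
}
```

Both pairs produce identical results; only the amount of work per call differs. Whether that matters in practice depends on how often the routines run, which is the question the follow-up entry (@7591) raises when it retracts this change.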
| |
Message-Id: <200010312239.e9VMdZR01580@night-porter.duskware.de>
p4raw-id: //depot/perl@7512
| |
Rename utf8_to_uv_chk() back to utf8_to_uv() because it's
used much more than the simpler API, now called utf8_to_uv_simple().
Still not quite happy with API, too much partial duplication
of functionality.
p4raw-id: //depot/perl@7439
| |
malformation happens. This involved adding an argument
to utf8_to_uv_chk(), which involved changing its prototype,
and prefer STRLEN over I32 for the UTF-8 length, which as
a domino effect necessitated changing the prototypes of
scan_bin(), scan_oct(), scan_hex(), and reg_uni().
The stricter UTF-8 decoding checking uses Markus Kuhn's
UTF-8 Decode Stress Tester from
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
p4raw-id: //depot/perl@7416
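
To illustrate the kind of malformation a stricter decoder rejects, here is a minimal sketch of a checking UTF-8 decoder. It is not utf8_to_uv_chk(): the name and signature are hypothetical, it handles only sequences of up to four bytes, and it does not attempt every case in the stress test (surrogates, for instance, are not rejected).

```c
#include <assert.h>
#include <stddef.h>

/* Decode one UTF-8 sequence from the first len bytes at s.
 * On success, store the code point in *cp and return the number of
 * bytes consumed. On any malformation, return 0 and leave *cp alone. */
size_t sketch_utf8_decode(const unsigned char *s, size_t len,
                          unsigned long *cp)
{
    size_t need, i;
    unsigned long v, min;

    assert(s != NULL && cp != NULL);
    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                      /* single ASCII byte */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 2-byte lead */
        need = 2; v = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 3-byte lead */
        need = 3; v = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 4-byte lead */
        need = 4; v = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;   /* stray continuation byte or unsupported lead */
    }

    if (len < need)
        return 0;                           /* truncated sequence */

    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                       /* not a continuation byte */
        v = (v << 6) | (unsigned long)(s[i] & 0x3F);
    }

    if (v < min)
        return 0;                           /* overlong encoding */

    *cp = v;
    return need;
}
```

Taking the length as a `size_t` mirrors the entry's preference for an unsigned length type over a signed 32-bit one; the rest of the interface is a design choice of this sketch only.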
| |
i.e. rename Simon's function to Perl_utf8_to_uv_chk, change all calls to it
to use new name and add Perl_utf8_to_uv() as a wrapper which calls it passing
0 to checking to get the warning.
p4raw-id: //depot/perl@7096
| |
Message-Id: <200009141707.SAA13276@tempest.npl.co.uk>
p4raw-id: //depot/perl@7081
| |
Subject: Re: [ID 20000910.005] Another segfault with regexes.
Message-Id: <200009132152.RAA24029@leggy.zk3.dec.com>
p4raw-id: //depot/perl@7076
| |
p4raw-id: //depot/perl@7075
| |
Message-Id: <200008221021.LAA03332@crypt.compulink.co.uk>
p4raw-id: //depot/perl@6770
| |
can't tell the difference and expand arguments also inside
double quoted strings.
p4raw-id: //depot/perl@6747
| |
Subject: PATCH @6698 for [ID 20000817.007] Not OK: perl v5.7.0 +SUIDMAIL +DEVEL6676 on alpha-dec_osf 4.0f (UNINSTALLED)
Message-Id: <200008182241.SAA29667@Orb.Nashua.NH.US>
p4raw-id: //depot/perl@6709
| |
Subject: [ID 20000716.024] [=cc=] / [:blank:]
Message-Id: <200007170055.RAA23528@fummy.dsl.yahoo.com>
(the [=cc=] has already been taken care of by #6439
so the whole bug report can be closed)
and make [[:space:]] to be equivalent to isspace(3)
(as opposed to \s, which is isSPACE()). The difference
is that now [[:space:]] matches the mythical vertical tab,
while \s doesn't.
p4raw-id: //depot/perl@6703
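
The difference described above can be sketched with two predicates. The names are hypothetical, and the `\s` set used here (space, tab, newline, carriage return, form feed) is an assumption about isSPACE() at the time of this change; the only point being shown is that vertical tab falls on one side and not the other.

```c
#include <assert.h>
#include <ctype.h>

/* \s-style test: an explicit list that leaves out vertical tab.
 * The exact set is an assumption for this sketch. */
int sketch_is_perl_space(int c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f';
}

/* [[:space:]]-style test: defer to isspace(3), which in the C locale
 * also accepts vertical tab. */
int sketch_is_posix_space(int c)
{
    assert(c >= 0 && c <= 255);
    return isspace(c) != 0;
}
```

In the C locale the two predicates agree on every byte except `'\v'`, which only the isspace(3)-based one accepts.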
| |
p4raw-id: //depot/perl@6663
| |
since pod makes using the latter quite messy. Reported in
ID 20000814.006 by Abigail and in
Subject: Unknown escape E<> ?
Message-ID: <20000811003027.F17420@alanya.lupe-christoph.de>
p4raw-id: //depot/perl@6653
| |
p4raw-id: //depot/perl@6563