Message-ID: <Pine.OSF.4.10.10103301805450.63762-100000@aspara.forte.com>
p4raw-id: //depot/perl@9485
|
- Lose the extra level of function on ASCII.
- spotted a chr(0) issue in sv.c
- re-work of UTF-X tr/// ranges to work in Unicode
space. Still issues with the "0xff is illegal UTF-8" hack.
- Yet another ad hoc utf8 'upgrade' in op.c recoded
(why do it once when you can do it all over the place :-(
- Enable HINTS_UTF8 on EBCDIC - then ignore it in toke.c,
need utf8.pm for swashes.
- Simplified and commented scan_const() in toke.c
Still something wrong with regexp and tr (swashes?).
p4raw-id: //depot/perlio@9267
|
p4raw-id: //depot/perl@9148
|
p4raw-id: //depot/perl@9098
|
- rename utf8/uv functions to indicate what sort of uv they provide (uvuni/uvchr)
- use utf8n_xxxx (cf. pvn) for forms which take length.
- back out vN.N and $^V exceptions to e2a/a2e
- make "locale" isxxx macros be uvchr (may be redundant?)
Not clear yet that the toUPPER_uni et al. return values are being handled correctly.
The tr// and regexp stuff still needs an audit; the assumption is they are working
in Unicode space.
Need to provide v5.6 names for XS modules (decide whether uni or chr?).
p4raw-id: //depot/perlio@9096
|
Message-Id: <200103081206.MAA06281@tiree.fdgroup.co.uk>
p4raw-id: //depot/perl@9084
|
Fixes 20001230.002.
What still remains broken is that the submatches that
have \C in them get their UTF8 flag on because their
parent SV has it on. This will result in malformed
UTF8 if a \C happened to match a non-ASCII byte.
p4raw-id: //depot/perl@8836
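The remaining \C problem described above comes down to a single byte taken out of a multi-byte character. The following is a minimal sketch in Python, not Perl internals; the helper name `is_well_formed_utf8` is hypothetical and only illustrates why such a byte, if still flagged as UTF-8, is malformed:

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+00E9 is a non-ASCII character that encodes as two bytes in UTF-8.
encoded = "\u00e9".encode("utf-8")

# A byte-wise match (the behaviour attributed to \C above) could capture
# only the first of those two bytes.
single_byte = encoded[:1]

# The whole character is valid UTF-8; the lone lead byte is not.
whole_ok = is_well_formed_utf8(encoded)
fragment_ok = is_well_formed_utf8(single_byte)
```

An ASCII byte captured the same way stays valid, which matches the note that the problem only appears when \C matches a non-ASCII byte.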
|
p4raw-id: //depot/perl@8769
|
p4raw-id: //depot/perl@8762
|
p4raw-id: //depot/perl@8566
|
p4raw-id: //depot/perl@8510
|
p4raw-id: //depot/perl@8415
|
p4raw-id: //depot/perl@8414
|
Message-Id: <200010300133.BAA10390@crypt.compulink.co.uk>
p4raw-id: //depot/perl@8403
|
Message-ID: <3A59E510.52BAB5B9@st.rim.or.jp>
UTF-8 fixes for 'x' and tr///.
p4raw-id: //depot/perl@8378
|
p4raw-id: //depot/perl@8328
|
p4raw-id: //depot/perl@8289
|
Message-ID: <3A4D722D.243AFD88@st.rim.or.jp>
Just the patch part for now, and the pragma renamed
as unicode::distinct.
p4raw-id: //depot/perl@8267
|
Message-Id: <p04320404b6639e7aa043@[192.168.1.4]>
This patchlet is needed in order that perl can be statically linked.
p4raw-id: //depot/perl@8191
|
Fixes at least the bugs 20001028.003 (both of them...) and
20001108.001. The bugs 20001114.001 and 20001205.014 seem
also to be fixed by now, probably already before this patch.
p4raw-id: //depot/perl@8143
|
p4raw-id: //depot/perl@7984
|
used to swash_init(), makes regprop() dumps more informative
(+utf8::IsAlpha, -utf8::IsDigit, for example).
p4raw-id: //depot/perl@7969
|
always use (at least) UTF8_MAXLEN + 1 U8s deep buffer.
p4raw-id: //depot/perl@7967
|
p4raw-id: //depot/perl@7940
|
codes are needed.
p4raw-id: //depot/perl@7881
|
limit for the unused submatch 'cleanup' loop so that under
"use utf8" the following code wouldn't dump core:
"," =~ /([^,]*,)*/ With the wrong lower limit (>=1)
the cleanup loop in regtry() stomped beyond allocated area
in the startp[] array. Therefore, copied the correct lower
loop limit (*PL_reglastparen) to regtry(). Note: something
may still not be quite right: why was the _higher_ loop limit
(prog->nparens) different in the utf8 case?
After this patch "./perl -Ilib -Mutf8 t/op/regexp.t" works
without core dumps, there were about 17 of them before
the patch (with us since Perl 5.7.0). Two failures, still:
496 and 505 (though these may not be severe).
Patch #7881 is also needed since both the cleanup loops
seem to be needed.
Also, the t/op/pat#44 seems to core dump under utf8.
Plus a couple of failures. UGH-8.
p4raw-id: //depot/perl@7879
|
the code in regcppop() seems to be redundant for the test suite --
but it contains a germ of truth, and it is needed for the build
process itself: see #7879 and #7881.
p4raw-id: //depot/perl@7878
|
p4raw-id: //depot/perl@7877
|
p4raw-id: //depot/perl@7873
|
Message-ID: <20001119223026.A5165@monk.mps.ohio-state.edu>
p4raw-id: //depot/perl@7760
|
Message-ID: <20001117172802.A1032@monk.mps.ohio-state.edu>
p4raw-id: //depot/perl@7733
|
Rename utf8_to_uv_chk() back to utf8_to_uv() because it's
used much more than the simpler API, now called utf8_to_uv_simple().
Still not quite happy with API, too much partial duplication
of functionality.
p4raw-id: //depot/perl@7439
|
malformation happens. This involved adding an argument
to utf8_to_uv_chk(), which involved changing its prototype,
and prefer STRLEN over I32 for the UTF-8 length, which as
a domino effect necessitated changing the prototypes of
scan_bin(), scan_oct(), scan_hex(), and reg_uni().
The stricter UTF-8 decoding checking uses Markus Kuhn's
UTF-8 Decode Stress Tester from
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
p4raw-id: //depot/perl@7416
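For readers unfamiliar with what stricter UTF-8 decoding has to reject, here is a minimal sketch in Python. It is not the Perl implementation: the function name `utf8_decode_one` is hypothetical, and the checks follow the current UTF-8 definition (overlong forms, surrogates, values above U+10FFFF), which is an assumption and may differ from what perl checked at the time. Like the change described above, it reports the number of bytes consumed alongside the decoded value:

```python
from typing import Tuple


def utf8_decode_one(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one character at buf[pos]; return (code point, byte length).

    Raises ValueError on any malformation (hypothetical helper).
    """
    if pos >= len(buf):
        raise ValueError("empty input")
    first = buf[pos]
    if first < 0x80:
        return first, 1
    # Lead byte decides how many continuation bytes follow and the
    # smallest code point that legitimately needs that many bytes.
    if 0xC0 <= first <= 0xDF:
        need, cp, minimum = 1, first & 0x1F, 0x80
    elif 0xE0 <= first <= 0xEF:
        need, cp, minimum = 2, first & 0x0F, 0x800
    elif 0xF0 <= first <= 0xF7:
        need, cp, minimum = 3, first & 0x07, 0x10000
    else:
        raise ValueError("unexpected continuation or invalid lead byte")
    for offset in range(1, need + 1):
        if pos + offset >= len(buf):
            raise ValueError("truncated sequence")
        byte = buf[pos + offset]
        if byte & 0xC0 != 0x80:
            raise ValueError("missing continuation byte")
        cp = (cp << 6) | (byte & 0x3F)
    if cp < minimum:
        raise ValueError("overlong encoding")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code point")
    if cp > 0x10FFFF:
        raise ValueError("code point out of range")
    return cp, need + 1
```

Returning the length together with the value lets a caller advance through a buffer without re-inspecting the lead byte.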
|
Message-Id: <200010222347.AAA09697@crypt.compulink.co.uk>
p4raw-id: //depot/perl@7407
|
Message-ID: <20000928215531.A4315@monk.mps.ohio-state.edu>
p4raw-id: //depot/perl@7115
|
i.e. rename Simon's function to Perl_utf8_to_uv_chk, change all calls to it
to use new name and add Perl_utf8_to_uv() as a wrapper which calls it passing
0 to checking to get the warning.
p4raw-id: //depot/perl@7096
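The wrapper arrangement described above can be sketched as follows. This is a Python illustration, not the C code: the names mirror the commit message, but the behaviour assumed here (a nonzero `checking` suppresses the warning, and 0 is returned on malformed input) is an assumption beyond what the message states.

```python
import warnings


def utf8_to_uv_chk(data: bytes, checking: int) -> int:
    """Return the first code point of data.

    Assumption: on malformed input, warn only when checking is 0,
    and return 0 either way.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        if not checking:
            warnings.warn("Malformed UTF-8 character", stacklevel=2)
        return 0
    return ord(text[0]) if text else 0


def utf8_to_uv(data: bytes) -> int:
    """Simple entry point: always passes 0 for checking, so warnings are on."""
    return utf8_to_uv_chk(data, 0)
```

Keeping the checked function as the single implementation means the simple entry point cannot drift out of step with it.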
|
p4raw-id: //depot/perl@7075
|
Subject: [ID 20000903.001] \w in utf8-strings
Message-Id: <E13VUS5-0000cv-00.pgcc-forever-2000-09-03-09-44-29@fuji>
and various related nits.
p4raw-id: //depot/perl@7030
|
Subject: [ID 20000716.024] [=cc=] / [:blank:]
Message-Id: <200007170055.RAA23528@fummy.dsl.yahoo.com>
(the [=cc=] has already been taken care of by #6439
so the whole bug report can be closed)
and make [[:space:]] equivalent to isspace(3)
(as opposed to \s, which is isSPACE()). The difference
is that now [[:space:]] matches the mythical vertical tab,
while \s doesn't.
p4raw-id: //depot/perl@6703
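The difference between the two classes can be modelled with explicit character sets. This is a sketch in Python, assuming the C locale for isspace(3); the names `ISSPACE_CHARS`, `PERL_S_CHARS`, `matches_posix_space` and `matches_backslash_s` are hypothetical. Explicit sets are used because Python's own `\s` already includes the vertical tab and so cannot stand in for the \s described here.

```python
# isspace(3) in the C locale: space, \t, \n, \v, \f, \r.
ISSPACE_CHARS = frozenset(" \t\n\v\f\r")

# \s as described above: the same set without the vertical tab.
PERL_S_CHARS = ISSPACE_CHARS - {"\v"}


def matches_posix_space(ch: str) -> bool:
    """Model of [[:space:]] after this change."""
    return ch in ISSPACE_CHARS


def matches_backslash_s(ch: str) -> bool:
    """Model of \\s as described in the commit message."""
    return ch in PERL_S_CHARS


# The only character on which the two models disagree.
difference = sorted(ISSPACE_CHARS - PERL_S_CHARS)
```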
|
uninitialized value
Message-Id: <200008101823.TAA23580@crypt.compulink.co.uk>
p4raw-id: //depot/perl@6591
|
Message-Id: <200008021353.OAA24761@crypt.compulink.co.uk>
p4raw-id: //depot/perl@6493
|
Subject: Re: [PATCH] [ID 20000701.002] Regular Expressions Not Unsetting $1 Vars When Backtracking
Message-Id: <200007140316.EAA15857@crypt.compulink.co.uk>
p4raw-link: @6337 on //depot/cfgperl: f06a1d4e6ae96bf8af49f0ef1c79f500d8de0143
p4raw-id: //depot/cfgperl@6395
|
Message-Id: <200007111144.MAA04446@crypt.compulink.co.uk>
p4raw-id: //depot/cfgperl@6337
|
(from Ilya Zakharevich)
p4raw-id: //depot/perl@6172
|
p4raw-id: //depot/perl@6152
|
p4raw-id: //depot/perl@6151
|
p4raw-id: //depot/perl@5973
|
p4raw-id: //depot/perl@5931
|
p4raw-id: //depot/perl@5540
|
To: perl5-porters@perl.org
Subject: [ID 20000223.005]
Message-Id: <20000223160308.1830.qmail@md.media-web.de>
p4raw-id: //depot/cfgperl@5277
|