path: root/regcomp.c
Commit message | Author | Age | Files | Lines
* RE: [PATCHES] regcomp.c, pod/perldiag.pod, t/op/pat.t | Paul Marquess | 2001-06-03 | 1 | -2/+10
    Message-ID: <000601c0ebae$77d10dc0$99dcfea9@bfs.phone.com>
    p4raw-id: //depot/perl@10410
* One less -Wall whine. | Jarkko Hietaniemi | 2001-06-03 | 1 | -1/+1
    p4raw-id: //depot/perl@10406
* -Wall cleanup continues. | Jarkko Hietaniemi | 2001-06-02 | 1 | -2/+8
    p4raw-id: //depot/perl@10392
* Re: [PATCHES] regcomp.c, pod/perldiag.pod, t/op/pat.t | Jeff Pinyan | 2001-06-01 | 1 | -2/+55
    Message-ID: <Pine.GSO.4.21.0106011032080.21027-100000@crusoe.crusoe.net>
    p4raw-id: //depot/perl@10376
* More -Wall sweeping. | Jarkko Hietaniemi | 2001-05-30 | 1 | -14/+14
    p4raw-id: //depot/perl@10338
* Medley of -Wall cleanups from Michael Schwen, Hugo van der Sanden, | Jarkko Hietaniemi | 2001-05-30 | 1 | -7/+6
    and Abhijit Menon-Sen.
    p4raw-id: //depot/perl@10321
* Re: [ID 20010506.041] segfault when matching utf8 string | Inaba Hiroto | 2001-05-25 | 1 | -0/+1
    Message-Id: <200105250124.KAA19571@toshiba.co.jp>
    p4raw-id: //depot/perl@10206
* Re: [ID 20000716.007] \G in a m//g expression causes problems | Hugo van der Sanden | 2001-05-23 | 1 | -1/+1
    Message-Id: <200105211532.QAA03999@crypt.compulink.co.uk>
    p4raw-id: //depot/perl@10187
* Re: [PATCH] HERE mark in regex | Ronald J. Kimball | 2001-05-16 | 1 | -3/+3
    Message-ID: <20010516130443.E1516273@linguist.thayer.dartmouth.edu>
    p4raw-id: //depot/perl@10136
* Remove the 'asciir' re subpragma. Should instead implement | Jarkko Hietaniemi | 2001-05-11 | 1 | -31/+7
    the 'physical vs logical' range scheme:

    \xAA-\xCC is a native physical range, you want that range of
    codepoints in your native encoding. In EBCDIC the codepoints
    in the gaps (between i-j and r-s) should be included.

    \x{AA}-\x{CC} is a physical Unicode range, you want that range
    of codepoints in Unicode.

    a-z is a logical range, you want that range of 'logical'
    codepoints in your native encoding. In EBCDIC the codepoints
    in the gaps (between i-j and r-s) should not be included.

    Mixed cases (a-\xAA, etc) should either be errors, or maybe
    the 'logical' endpoints should be converted to native/Unicode
    codepoints, and the range handled as a physical range.

    'Logical endpoints' are to be recognized only in the A-Z, a-z,
    and 0-9 ranges. Probably a warning should be given for mixed
    cases like A-z or a-9 (since such expressions are encoding
    dependent), with a recommendation to use physical ranges.

    p4raw-id: //depot/perl@10085
* Insecure regexes | Robin Houston | 2001-05-07 | 1 | -1/+1
    Message-ID: <20010507215612.A31114@penderel>
    p4raw-id: //depot/perl@10021
* -Wformat error from ext/re/re_comp.c | Robin Barker | 2001-05-04 | 1 | -6/+6
    Message-Id: <200105041709.SAA14835@tempest.npl.co.uk>
    p4raw-id: //depot/perl@9991
* The #9901 had removed one line essential for EBCDIC. | Jarkko Hietaniemi | 2001-05-04 | 1 | -0/+1
    p4raw-id: //depot/perl@9987
* Re: [PATCH bleadperl] [ID 20010426.002] Word boundry regex [...] | Hugo van der Sanden | 2001-04-30 | 1 | -1/+0
    Message-Id: <200104291609.RAA17790@crypt.compulink.co.uk>
    p4raw-id: //depot/perl@9911
* In character classes one couldn't have 0x80..0xff characters | Jarkko Hietaniemi | 2001-04-29 | 1 | -63/+40
    at the left hand side if there were 0x100.. characters in the
    character class.
    p4raw-id: //depot/perl@9901
* Re: [PATCH @9846] dumping ANYOF | Hugo van der Sanden | 2001-04-26 | 1 | -1/+5
    Message-Id: <200104262233.XAA22352@crypt.compulink.co.uk>
    p4raw-id: //depot/perl@9873
* Retract #9851, core dumps from pod2man. | Jarkko Hietaniemi | 2001-04-26 | 1 | -1/+0
    p4raw-id: //depot/perl@9852
* (Retracted by #9852.) | Hugo van der Sanden | 2001-04-26 | 1 | -0/+1
    Subject: [PATCH @9846] dumping ANYOF
    Message-Id: <200104260432.FAA12669@crypt.compulink.co.uk>
    p4raw-id: //depot/perl@9851
* Re: ANYOF_SIZE is wrong in 5.7.1 | Mark-Jason Dominus | 2001-04-22 | 1 | -2/+1
    Message-ID: <20010422012749.27024.qmail@plover.com>
    p4raw-id: //depot/perl@9778
* Re: Regex debugger patch | Mark-Jason Dominus | 2001-04-22 | 1 | -21/+194
    Message-ID: <20010421182439.16508.qmail@plover.com>
    Regex debugger backend.
    p4raw-id: //depot/perl@9776
* Integrate perlio: | Jarkko Hietaniemi | 2001-03-28 | 1 | -8/+34
    [ 9400]
    More EBCDIC tweaks:
     - one more swash issue &~(0xA0-1) did not do the right thing,
       for UTF-EBCDIC where &~(0x80-1) does for UTF-8.
     - add "use re 'asciirange'" to make [!-~] etc. work
       use it in MIME::QuotedPrint and t/op/regexp.t and t/op/pat.t
     - Choose a key for t/op/each.t test which gets encoded.
     - Skip utf8decode if this is UTF-EBCDIC.
    p4raw-link: @9400 on //depot/perlio: daf0f78e031c718c75590ef9ef573756f805776e
    p4raw-id: //depot/perl@9407
* More EBCDIC stuff: | Nick Ing-Simmons | 2001-03-20 | 1 | -10/+6
     - Lose the extra level of function on ASCII.
     - spotted a chr(0) issue in sv.c
     - re-work of UTF-X tr/// ranges to work in Unicode space.
       Still issues with the "0xff is illegal UTF-8" hack.
     - Yet another ad hoc utf8 'upgrade' in op.c recoded
       (why do it once when you can do it all over the place :-(
     - Enable HINTS_UTF8 on EBCDIC - then ignore it in toke.c,
       need utf8.pm for swashes.
     - Simplified and commented scan_const() in toke.c
    Still something wrong with regexp and tr (swashes?).
    p4raw-id: //depot/perlio@9267
* Integrate changes #9137,9138,9142 from maintperl into mainline. | Jarkko Hietaniemi | 2001-03-14 | 1 | -3/+2
    fix leak in pregcomp() when RE fails to compile (e.g. m/\\/)

    remove squelch controls for "Scalars leaked" messages in most
    places (these are now cured)

    fix another memory leak reported by purify (tie callbacks that
    croak can leak when wiping out magic)

    p4raw-link: @9142 on //depot/maint-5.6/perl: 26972843796e21c404c9d13ec5ee86e7b952a2bd
    p4raw-link: @9138 on //depot/maint-5.6/perl: ad7f1144250940f9ca43bac32708ec5e718b30ff
    p4raw-link: @9137 on //depot/maint-5.6/perl: 1f35595ecca168b4f66e3399344799fdbd496d17
    p4raw-id: //depot/perl@9144
    p4raw-integrated: from //depot/maint-5.6/perl@9143
      'copy in' t/pragma/strict-vars (@7318..) t/pragma/warn/regcomp (@7887..) t/op/regexp.t (@8551..) t/op/lex_assign.t (@8987..)
      'merge in' t/op/local.t (@5902..) t/pragma/warn/op (@7846..) t/pragma/warnings.t (@7895..) t/comp/proto.t (@8173..) t/pragma/warn/toke (@8570..) regcomp.c (@8777..) scope.c (@8855..) t/op/pat.t (@9076..)
* regcomp.c is working in native space, not Unicode space (if different) | Nick Ing-Simmons | 2001-03-11 | 1 | -9/+8
    as it is doing compare against 'W' in \W etc.
    p4raw-id: //depot/perlio@9106
* Audit #ifdef EBCDIC and #ifndef ASCIIish, replace latter with former. | Nick Ing-Simmons | 2001-03-11 | 1 | -20/+7
    Use ASCII_TO_NATIVE and NATIVE_TO_ASCII to avoid some #ifs.
    p4raw-id: //depot/perlio@9105
* Fix for ID 20010306.008, UTF-8 and \w without 'use utf8' coredump. | Jarkko Hietaniemi | 2001-03-10 | 1 | -18/+0
    p4raw-id: //depot/perl@9098
* EBCDIC sanity - phase I | Nick Ing-Simmons | 2001-03-10 | 1 | -11/+11
     - rename utf8/uv functions to indicate what sort of uv they
       provide (uvuni/uvchr)
     - use utf8n_xxxx (cf. pvn) for forms which take length.
     - back out vN.N and $^V exceptions to e2a/a2e
     - make "locale" isxxx macros be uvchr (may be redundant?)
    Not clear yet that toUPPER_uni et al. return being handled correctly.
    The tr// and regexp stuff still needs an audit, assumption is
    they are working in Unicode space.
    Need to provide v5.6 names for XS modules (decide is uni or chr ?).
    p4raw-id: //depot/perlio@9096
* Make /\x{abcd}/ work without use utf8. | Jarkko Hietaniemi | 2001-03-06 | 1 | -0/+2
    p4raw-id: //depot/perl@9058
* Retract #8929,8930,8932,8933 for now. | Jarkko Hietaniemi | 2001-02-25 | 1 | -31/+43
    p4raw-id: //depot/perl@8935
* (Retracted by #8935.) | Jarkko Hietaniemi | 2001-02-25 | 1 | -43/+31
    Attempt to fix the EBCDIC character range problem with //.
    p4raw-id: //depot/perl@8930
* Misapplied regex optimizations when \C is present. | Jarkko Hietaniemi | 2001-02-18 | 1 | -0/+3
    Fixes 20001230.002. What still remains broken is that the
    submatches that have \C in them get their UTF8 flag on because
    their parent SV has it on. This will result in malformed UTF8
    if a \C happened to match a non-ASCII byte.
    p4raw-id: //depot/perl@8836
* Re: [ID 20010212.006] Core dump with /((?:hard|soft)cover)?/ | Hugo van der Sanden | 2001-02-13 | 1 | -6/+4
    Message-Id: <200102130011.AAA14310@crypt.compulink.co.uk>
    p4raw-id: //depot/perl@8779
* Manually applied version for dev branch of Alan/Sarathy 5.6 patch. | Alan Burlison | 2001-02-07 | 1 | -118/+117
    Subject: Re: Incorrect scoping of PL_reg_start_tmp causes leak
    Message-Id: <3A808A9D.20F7A035@uk.sun.com>
    p4raw-id: //depot/perl@8711
* regcomp.c old feature removal | Mark-Jason Dominus | 2001-01-16 | 1 | -5/+0
    Message-ID: <20010116144318.7140.qmail@plover.com>
    p4raw-id: //depot/perl@8455
* One more patch for UTF8 | Inaba Hiroto | 2001-01-09 | 1 | -1/+5
    Message-ID: <3A59E510.52BAB5B9@st.rim.or.jp>
    UTF-8 fixes for 'x' and tr///.
    p4raw-id: //depot/perl@8378
* UTF-8 cleanup. | Jarkko Hietaniemi | 2001-01-05 | 1 | -1/+5
    p4raw-id: //depot/perl@8328
* Bump up Larry's copyright. | Jarkko Hietaniemi | 2001-01-01 | 1 | -1/+1
    p4raw-id: //depot/perl@8289
* more UTF8 test suites and an UTF8 patch | Inaba Hiroto | 2000-12-30 | 1 | -40/+89
    Message-ID: <3A4D722D.243AFD88@st.rim.or.jp>
    Just the patch part for now, and the pragma renamed as
    unicode::distinct.
    p4raw-id: //depot/perl@8267
* Comments work so much better when they are closed. | Jarkko Hietaniemi | 2000-12-18 | 1 | -1/+1
    p4raw-id: //depot/perl@8184
* Some compilers (e.g. HP-UX) can't switch on 64-bit integers. | Jarkko Hietaniemi | 2000-12-18 | 1 | -2/+8
    Fixes the bug 20001218.016.
    p4raw-id: //depot/perl@8183
* Polymorphic regexps. | Jarkko Hietaniemi | 2000-12-17 | 1 | -501/+339
    Fixes at least the bugs 20001028.003 (both of them...) and
    20001108.001. The bugs 20001114.001 and 20001205.014 seem also
    to be fixed by now, probably already before this patch.
    p4raw-id: //depot/perl@8143
* dTHR is a nop in 5.6.0 onwards. Ergo, it can go. | Jarkko Hietaniemi | 2000-12-05 | 1 | -26/+0
    p4raw-id: //depot/perl@7984
* On DEBUGGING make ANYOFUTF8 nodes store away also the SV | Jarkko Hietaniemi | 2000-12-03 | 1 | -2/+40
    used to swash_init(), makes regprop() dumps more informative
    (+utf8::IsAlpha, -utf8::IsDigit, for example).
    p4raw-id: //depot/perl@7969
* Implement ANYOFUTF8 regprop() dumping. | Jarkko Hietaniemi | 2000-12-03 | 1 | -10/+39
    p4raw-id: //depot/perl@7968
* Make uv_to_utf8() to zero-terminate its output buffer, | Jarkko Hietaniemi | 2000-12-03 | 1 | -7/+1
    always use (at least) UTF8_MAXLEN + 1 U8s deep buffer.
    p4raw-id: //depot/perl@7967
* Get the three different space character classes right under utf8. | Jarkko Hietaniemi | 2000-12-01 | 1 | -7/+8
    p4raw-id: //depot/perl@7940
* \x{} doesn't any more require 'use utf8' outside regexen so why | Jarkko Hietaniemi | 2000-12-01 | 1 | -7/+1
    should it be required inside regexen?
    p4raw-id: //depot/perl@7938
* Fix for 20001130.008 and 20001130.010, the PL_regnpar wasn't | Jarkko Hietaniemi | 2000-12-01 | 1 | -0/+1
    stored and restored, and thusly was trounced by the utf8 swash
    routines.
    p4raw-id: //depot/perl@7937
* Debug dump of ANYOFUTF8 was garbage (data from ANYOF). | Jarkko Hietaniemi | 2000-11-26 | 1 | -16/+24
    Not really fixed (should really dump the UTF-8 charclass), but
    stopped displaying the garbage. Also add a note on the
    (missing) Unicode PSXSPC and BLANK.
    p4raw-id: //depot/perl@7874
* Message nit. | Jarkko Hietaniemi | 2000-11-26 | 1 | -1/+1
    p4raw-id: //depot/perl@7870