path: root/regexec.c
Commit message [Author, Age, Files, Lines]
* add more positive gofs GPOS tests and fix some bugs too [Yves Orton, 2009-09-10, 1 file, -4/+16]
* Fix RT69056 - positive GPOS leads to segv on first match [Yves Orton, 2009-09-09, 1 file, -1/+4]

    http://rt.perl.org/rt3/Ticket/Display.html?id=69056

    In perl 5.8 we get this:

        $ perl -Mre=debug -le '$_="foo"; s/(.)\G//g; print'
        Freeing REx: `","'
        Compiling REx `(.)\G'
        size 7 Got 60 bytes for offset annotations.
        first at 3
           1: OPEN1(3)
           3: REG_ANY(4)
           4: CLOSE1(6)
           6: GPOS(7)
           7: END(0)
        GPOS minlen 1
        Offsets: [7]
            1[1] 0[0] 2[1] 3[1] 0[0] 4[2] 6[0]
        Matching REx `(.)\G' against `foo'
          Setting an EVAL scope, savestack=3
           0 <> <foo> | 1: OPEN1
           0 <> <foo> | 3: REG_ANY
           1 <f> <oo> | 4: CLOSE1
           1 <f> <oo> | 6: GPOS
                        failed...
          Setting an EVAL scope, savestack=3
           1 <f> <oo> | 1: OPEN1
           1 <f> <oo> | 3: REG_ANY
           2 <fo> <o> | 4: CLOSE1
           2 <fo> <o> | 6: GPOS
                        failed...
          Setting an EVAL scope, savestack=3
           2 <fo> <o> | 1: OPEN1
           2 <fo> <o> | 3: REG_ANY
           3 <foo> <> | 4: CLOSE1
           3 <foo> <> | 6: GPOS
                        failed...
          Setting an EVAL scope, savestack=3
           3 <foo> <> | 1: OPEN1
           3 <foo> <> | 3: REG_ANY
                        failed...
        Match failed
        foo
        Freeing REx: `"(.)\\G"'

    In perl 5.10 we get this:

        $ perl -Mre=debug -le '$_="foo"; s/(.)\G//g; print'
        Compiling REx "(.)\G"
        Final program:
           1: OPEN1 (3)
           3: REG_ANY (4)
           4: CLOSE1 (6)
           6: GPOS (7)
           7: END (0)
        anchored(GPOS) GPOS:1 minlen 1
        Matching REx "(.)\G" against "foo"
          -1 <> <%0foo> | 1:OPEN1(3)
          -1 <> <%0foo> | 3:REG_ANY(4)
           0 <> <foo> | 4:CLOSE1(6)
           0 <> <foo> | 6:GPOS(7)
           0 <> <foo> | 7:END(0)
        Match successful!
        Segmentation fault

    With this patch we get:

        $ ./perl -Ilib -Mre=debug -le '$_="foo"; s/(.)\G//g; print'
        Compiling REx "(.)\G"
        Final program:
           1: OPEN1 (3)
           3: REG_ANY (4)
           4: CLOSE1 (6)
           6: GPOS (7)
           7: END (0)
        anchored(GPOS) GPOS:1 minlen 1
        Matching REx "(.)\G" against "foo"
        Match failed
        foo
        Freeing REx: "(.)\G"

    Which seems to me to be a net improvement.
* much better swap logic to support reentrancy and fix assert failure [George Greer, 2009-07-26, 1 file, -29/+17]

    Commit c74340f9 added backreferences as well as the idea of a ->swap
    regex pointer to keep track of the match offsets in case of
    backtracking. The problem is that when Perl re-enters the regex engine
    to handle utf8::SWASHNEW, the ->swap is not saved/restored/cleared so
    any capture from the utf8 (Perl) code could inadvertently modify the
    regex match data that caused the utf8 swash to get built.

    This change should close out RT #60508
* Save and restore PL_regeol for op inside of regex (RT #66110) [Craig A. Berry, 2009-07-25, 1 file, -0/+2]

    If the op inside of a (?{ }) construct is another regex, the two
    regexen end up corrupting each other's end-of-string markers,
    resulting in various pathologies including access violations, stack
    corruptions, and memory use growing without bound.

    The change here is intended to be a relatively safe, cheap way to
    prevent memory errors and makes no attempt to save and restore other
    aspects of regex state; i.e., general purpose reentrancy for the regex
    engine is still a TODO.
* Regex fails when string is too long [hv@crypt.org, 2009-07-06, 1 file, -2/+3]

    This looks to be a simple oversight. All tests pass here.

    Hugo

    Signed-off-by: H.Merijn Brand <h.m.brand@xs4all.nl>
* fix [RT #60034]. An equivalent fix was already in 5.8.9 as change 34580. [David Mitchell, 2009-03-22, 1 file, -2/+5]
* Fix #56194 Regex: (((??{1 + $^N}))) behaves differently in 5.10.0 than in blead [Bram, 2009-03-12, 1 file, -1/+15]

    PL_reglastparen and PL_reglastcloseparen are pointers set to
    &rex->lastparen and &rex->lastcloseparen.

    In case END the rex var is modified but PL_reglastparen and
    PL_reglastcloseparen are not.

    Some parts of the code access PL_reglastparen while other parts use
    rex->lastparen.

    This patch corrects this and adds 3 assertions.

    I'm currently unable to prove (with a test case) that the code in
    case EVAL_ab is really necessary... Logically speaking it is
    necessary but I do not know if it can cause test failures.

    Also in the patch are missing regressions between 5.8 -> 5.10 and
    5.10 -> 5.11. (and a test script that contains these regressions)

    Message-ID: <rt-3.6.HEAD-4802-1236806863-900.56194-15-0@perl.org>

    [Includes message and patch edits by committer.]
* Fix memory leak [Karl, 2009-01-26, 1 file, -0/+3]
* Another regexp failure with utf8-flagged string and byte-flagged pattern (reminder) [Slaven Rezic, 2009-01-04, 1 file, -2/+6]

    Date: 17 Nov 2007 16:29:29 +0100
    Message-ID: <87r6iohova.fsf@biokovo-amd64.herceg.de>
* Fix malformed utf8 in regexec.c [Karl, 2008-12-28, 1 file, -6/+12]
* fix bug #57042 - preserve $^R across TRIE matches [Yves Orton, 2008-12-27, 1 file, -1/+4]
* Assigning to DEFSV leaks if PL_defgv's gp_sv isn't set. [Marcus Holland-Moritz, 2008-11-08, 1 file, -1/+1]

    As Nicholas already noted in a FIXME, assigning to DEFSV should use
    GvSV instead of GvSVn. This change ensures that, at least under
    -DPERL_CORE, DEFSV cannot be assigned to and introduces a DEFSV_set
    macro to allow setting DEFSV. This fixes #53038: map leaks memory.

    p4raw-id: //depot/perl@34776
* Revert SvPVX() to allow lvalue usage, but also add a [Marcus Holland-Moritz, 2008-11-07, 1 file, -1/+1]

    MUTABLE_SV() check. Use SvPVX_const() instead of SvPVX() where only a
    const SV* is available. Also fix two falsely consted pointers in
    Perl_sv_2pv_flags().

    p4raw-id: //depot/perl@34770
* Various changes to regex diagnostics and testing [Yves Orton, 2008-11-06, 1 file, -1/+2]

    * Make ANYOF output from regprop easier to read by adding ][ in
      between the unicode representation and the "ascii" one
    * Make it possible to make tests in re_tests todo.
    * add a todo test for a complementary character class match that
      should fail (perl #60156)
    * Also add a comment explaining a previous commit (relating to
      perl #60344)

    p4raw-id: //depot/perl@34755
* Resolve perl #60344: Regex lookbehind failure after an (if)then|else in perl 5.10 [Yves Orton, 2008-11-06, 1 file, -0/+1]

    During the de-recursivization it looks like Dave M forgot to reset
    the 'logical' flag after using it, which in turn causes UNLESSM/IFTHEN
    when used after a LOGICAL operator to be incorrectly interpreted. This
    change resets the logical flag after each time it is stored in
    ST.logical.

    p4raw-id: //depot/perl@34746
* PATCH: Large omnibus patch to clean up the JRRT quotes [Tom Christiansen, 2008-11-02, 1 file, -1/+5]

    Message-ID: <25940.1225611819@chthon>
    Date: Sun, 02 Nov 2008 01:43:39 -0600

    p4raw-id: //depot/perl@34698
* Eliminate (SV *) casts from the rest of *.c, picking up one (further) [Nicholas Clark, 2008-10-30, 1 file, -8/+8]

    erroneous const in dump.c.

    p4raw-id: //depot/perl@34675
* Eliminate (AV *) casts in *.c. [Nicholas Clark, 2008-10-29, 1 file, -3/+3]

    p4raw-id: //depot/perl@34650
* Every remaining (HV *) cast in *.c [Nicholas Clark, 2008-10-28, 1 file, -2/+2]

    p4raw-id: //depot/perl@34629
* Update copyright years. [Nicholas Clark, 2008-10-25, 1 file, -1/+2]

    p4raw-id: //depot/perl@34585
* assert() that every NN argument is not NULL. Otherwise we have the [Nicholas Clark, 2008-02-12, 1 file, -15/+49]

    ability to create landmines that will explode under someone in the
    future when they upgrade their compiler to one with better
    optimisation. We've already done this at least twice.
    (Yes, some of the assertions are after code that would already have
    SEGVd because it already dereferences a pointer, but they are put in
    to make it easier to automate checking that each and every case is
    covered.)
    Add a tool, checkARGS_ASSERT.pl, to check that every case is covered.

    p4raw-id: //depot/perl@33291
* REGEXPs are now stored directly in PL_regex_padav, rather than [Nicholas Clark, 2008-01-11, 1 file, -1/+1]

    indirectly via RVs. This saves memory, and removes 1 level of pointer
    indirection.

    p4raw-id: //depot/perl@32950
* It seems that you don't need to reference count PL_reg_curpm without [Nicholas Clark, 2008-01-10, 1 file, -0/+5]

    ithreads, so don't waste time doing it there.

    p4raw-id: //depot/perl@32939
* The correct solution is to reference count the regexp in PL_reg_curpm, [Nicholas Clark, 2008-01-10, 1 file, -3/+6]

    rather than put in lots of hacks to work round not reference counting
    it.

    p4raw-id: //depot/perl@32938
* Change 32899 missed the other double-reference count. [Nicholas Clark, 2008-01-09, 1 file, -1/+1]

    p4raw-id: //depot/perl@32913
* Correct a long-standing ithreads reference counting anomaly - the [Nicholas Clark, 2008-01-08, 1 file, -1/+1]

    reference count only needs "doubling" when the scalar is pushed onto
    PL_regex_padav for the second time.

    p4raw-id: //depot/perl@32899
* Clarify the use of SVf_BREAK on PL_reg_curpm. [Nicholas Clark, 2008-01-07, 1 file, -1/+2]

    p4raw-id: //depot/perl@32895
* Make REGEXP a type distinct from SV. (Much like AV, CV, GV, HV). [Nicholas Clark, 2008-01-05, 1 file, -5/+5]

    p4raw-id: //depot/perl@32861
* Fix regexec.c so $^N and $+ are correctly updated so that they work properly inside of (?{...}) blocks as reported by Moritz Lenz in [Moritz Lenz, 2008-01-05, 1 file, -2/+13]

    Subject: Bugs in extended regexp features
    Message-ID: <477FACED.4000505@casella.verplant.org>

    p4raw-id: //depot/perl@32857
* Convert all accesses of the member paren_names of struct regexp to [Nicholas Clark, 2008-01-05, 1 file, -2/+2]

    be accessed via RXp_PAREN_NAMES(). (They are entirely within the
    regexp implementation).

    p4raw-id: //depot/perl@32853
* Abolish RXf_UTF8. Store the UTF-8-ness of the pattern with SvUTF8(). [Nicholas Clark, 2008-01-05, 1 file, -1/+0]

    p4raw-id: //depot/perl@32852
* Make Perl_pregcomp() use SvUTF8() of the pattern, rather than the flag [Nicholas Clark, 2008-01-05, 1 file, -1/+13]

    bit in pmflags, to decide whether the pattern is UTF-8.

    p4raw-id: //depot/perl@32851
* s/re/rx/ in an assert overlooked during recent renovations [Yves Orton, 2008-01-05, 1 file, -1/+1]

    p4raw-id: //depot/perl@32850
* Replace all reads of RXf_UTF8 with RX_UTF8(). [Nicholas Clark, 2008-01-05, 1 file, -3/+3]

    p4raw-id: //depot/perl@32849
* Add RX_UTF8(), which is effectively SvUTF8() but for regexps. [Nicholas Clark, 2008-01-05, 1 file, -6/+6]

    Remove RXp_PRECOMP() and RXp_WRAPPED().
    Change the parameter of S_debug_start_match() from regexp to REGEXP.
    Change its callers [the only part wrong for 5.10.x]

    p4raw-id: //depot/perl@32840
* Make struct regexp the body of SVt_REGEXP SVs, REGEXPs become SVs, [Nicholas Clark, 2008-01-02, 1 file, -34/+46]

    and regexp reference counting is via the regular SV reference
    counting. This was not as easy as it looks.

    p4raw-id: //depot/perl@32804
* Wrap all dereferences of struct regexp* in macros RX_*() [and for [Nicholas Clark, 2008-01-02, 1 file, -8/+8]

    regcomp.c and regexec.c RXp_* where necessary] so that in future we
    can maintain source compatibility when we add an extra level of
    dereferencing.

    p4raw-id: //depot/perl@32802
* Wrap all accesses to the members precomp and prelen of struct regexp in [Nicholas Clark, 2007-12-28, 1 file, -2/+2]

    the macros RX_PRECOMP() and RX_PRELEN(). This will allow us to reduce
    the regexp storage overhead by computing them at retrieve time.

    p4raw-id: //depot/perl@32753
* First class regexps. [Nicholas Clark, 2007-12-28, 1 file, -6/+18]

    p4raw-id: //depot/perl@32751
* scalars used in postponed subexpressions aren't first class regexps, [Nicholas Clark, 2007-12-27, 1 file, -1/+3]

    so don't upgrade them to ORANGE before attaching qr magic.
    (And don't stop using qr magic once regexps become first class)

    p4raw-id: //depot/perl@32748
* assert() that the sv_unmagic() in S_regmatch() is unneeded. [Nicholas Clark, 2007-12-27, 1 file, -2/+12]

    Add a comment about the mg_find() that follows.

    p4raw-id: //depot/perl@32742
* Regexps are now orange. [Nicholas Clark, 2007-12-27, 1 file, -1/+3]

    (Correct a comparison of $] with 5.011 in B.pm)

    p4raw-id: //depot/perl@32740
* Comment out a now unused variable [Rafael Garcia-Suarez, 2007-12-17, 1 file, -1/+2]

    p4raw-id: //depot/perl@32630
* Fix various bugs in regex engine with mixed utf8/latin pattern and strings. Related to [perl #36207] among others [Yves Orton, 2007-12-17, 1 file, -27/+61]

    Message-ID: <9b18b3110712170621h41de2c76k331971e3660abcb0@mail.gmail.com>

    p4raw-id: //depot/perl@32628
* Re: [perl #47195] $1 suddenly tainted after regexp on utf-8 string [Rick Delaney, 2007-11-07, 1 file, -1/+1]

    Message-ID: <20071107001845.GA21000@bort.ca>

    [plus remove the TODO from the now passing test]

    p4raw-id: //depot/perl@32236
* misc blead stuff [Jarkko Hietaniemi, 2007-08-30, 1 file, -2/+2]

    Message-ID: <46D617B5.3000002@iki.fi>

    p4raw-id: //depot/perl@31765
* TRIE must use 'yes' state transitions when more than one match possible to ensure proper scope cleanup. [Marcus Holland-Moritz, 2007-08-18, 1 file, -10/+2]

    Fix and test for issue raised in:

    Subject: Very strange interaction between regex and lexical array in blead
    Message-ID: <20070818015537.0088db31@r2d2>

    p4raw-id: //depot/perl@31733
* s/\bunicode\b/Unicode/; # For everything not dual life [Nicholas Clark, 2007-06-24, 1 file, -2/+2]

    p4raw-id: //depot/perl@31455
* [perl #43159] 5.9.4 regexp capturing wrongly [Dave Mitchell, 2007-06-18, 1 file, -6/+0]

    change #28398 accidentally made the last branch of an alternation not
    restore the paren state after failure backtrack. Fix this by removing
    the last-branch-skips-pushing-a-state optimisation.

    p4raw-link: @28398 on //depot/perl: 40a824489101168f94fce98aa2824baf40bad402

    p4raw-id: //depot/perl@31417
* add test for, and update comments for, old defined($1) oddity. [Dave Mitchell, 2007-06-18, 1 file, -8/+6]

    Some code in regexec.c had a comment to the effect that without this
    code, Dynaloader failed (this is back at 5.6.0). Replace the comments
    with something more specific, and add a test for it (basically without
    the code $1 is '' rather than undefined sometimes).

    p4raw-id: //depot/perl@31408