path: root/regcomp.c
Commit message, Author, Age, Files, Lines
* [RT #78266] Don't leak memory when accessing named captures that didn't matchÆvar Arnfjörð Bjarmason2011-12-131-5/+2
| Since 5.10 (probably 44a2ac759e) named captures have been leaking memory when they're used, don't actually match, but are later accessed. E.g.:
|
|     $ perl -wle 'for (1..10_000_000) { if ("foo" =~ /(foo|(?<capture>bar))?/) { my $capture = $+{capture} } } system "ps -o rss $$"'
|     RSS
|     238524
|
| Here we match the "foo" branch of our regex, but since we've used a named capture we'll end up running the code in Perl_reg_named_buff_fetch, which allocates a newSVsv(&PL_sv_undef) but never uses it unless it's trying to return an array.
|
| Just change that code not to allocate scalars we don't plan to return. With this fix we don't leak any memory since there's nothing to leak anymore.
|
|     $ ./perl -Ilib -wle 'for (1..10_000_000) { if ("foo" =~ /(foo|(?<capture>bar))?/) { my $capture = $+{capture} } } system "ps -o rss $$"'
|     RSS
|     3528
|
| This reverts commit b28f4af8cf94eb18c0cfde71e9625081912499a8 ("Fix #78266: Memory leak with named regexp captures."), since not allocating something in the first place is a better solution than allocating it, not using it, and then freeing it.
* Fix #78266: Memory leak with named regexp captures.Johannes Plunien2011-12-131-0/+4
|
* Another confusing comment; this time in regcomp.cFather Chrysostomos2011-11-301-1/+2
| | | | | | This one comes from c2123ae38. And we shouldn’t be suing our mothers’ tykes. :-)
* regcomp.c: Bypass unneeded stepKarl Williamson2011-11-111-2/+2
| | | | | We don't have to convert from utf8 to code point to fold; instead we can call the function that starts from utf8.
* Fix volatile declarationKarl Williamson2011-11-091-1/+1
| | | | | | Commit 24efd69ba77ba76cd714519dccee88f45820d8b4 introduced a VOL declaration. I thought I had tested this, but apparently not. It needs to apply to the pointee instead of the pointer.
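The pointee/pointer distinction and the longjmp warning can be illustrated with a minimal, hedged sketch. The commit message does not say which variable in regcomp.c was affected, so all names below (`toy_env`, `toy_count_after_longjmp`) are hypothetical and only show the general C rules: where `volatile` sits in a pointer declaration decides what it qualifies, and a local modified between `setjmp` and `longjmp` is only guaranteed to keep its value if it is `volatile`.

```c
#include <setjmp.h>

static jmp_buf toy_env;

/* Two different declarations (names are illustrative only):
 *   volatile char *to_volatile;    the pointee is volatile, the pointer is not
 *   char *volatile volatile_ptr;   the pointer is volatile, the pointee is not
 */

/* Hypothetical helper: bumps a counter, jumps back once, and reports
 * the counter as seen after the jump. */
int toy_count_after_longjmp(void)
{
    /* volatile: the value written below must survive the longjmp */
    volatile int attempts = 0;

    if (setjmp(toy_env) != 0)
        return attempts;      /* reached via longjmp */

    attempts++;               /* modified between setjmp and longjmp */
    longjmp(toy_env, 1);
    return -1;                /* not reached */
}
```

Calling `toy_count_after_longjmp()` returns the incremented counter; without `volatile` the C standard leaves the value of `attempts` indeterminate after the jump.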
* regcomp.c: Change char used to force reading in fold swashesKarl Williamson2011-11-081-1/+4
| | | | | Future commits will change things so that a latin1 character no longer will go out to disk to load a swash.
* regcomp.c: Add assertionKarl Williamson2011-11-081-0/+1
|
* regcomp.c: Silence compiler warning about longjumpKarl Williamson2011-11-081-1/+1
| | | | | | I believe that there isn't a code path that can screw this up, but one compiler at least believes otherwise. Declaring it volatile should fix that.
* PATCH: [perl #101940]: BBC TkKarl Williamson2011-10-291-6/+8
| | | | | | | | | | | | The commit that turned up this bug, it turns out, merely exposes an underlying problem that could be generated via other means. regcomp.c was looking at the SvUTF8 flag on the input pattern before doing an SvPV on it. Generally the flag is considered not reliable unless checked immediately after a SvPV. I haven't been able to come up with a simple test case that reproduces the bug. I suspect that XS code is required to trigger it.
* regcomp.c: Use no_mg for 2nd fetch of patternKarl Williamson2011-10-291-1/+3
| | | | | | The pattern could be tied, for example, so we want to access it only once. I couldn't come up with a test case that actually exercised this, but I can think of future changes to regcomp that would.
* PATCH: [perl #101970] /[[:lower:]]/i matches upper caseKarl Williamson2011-10-271-18/+31
| | | | | | | | | This bug is a regression in 5.14, in which /[[:lower:]]/i and /[[:upper:]]/i no longer matched the opposite case. The fix is to have these use a different table under /i matching, that includes the correct /i code points. These tables were already available, just unused.
* regcomp.c: Don't prefix posix props with IsKarl Williamson2011-10-271-1/+1
| | | | | | | When confronted with something like [[:alpha:]], regcomp.c adds IsPosixAlpha to the list of properties for code points above 255 to match against when executing the pattern. The 'Is' is extraneous, and future commits will not want it.
* regcomp.c: White space onlyKarl Williamson2011-10-171-9/+9
| | | | | Indent the newly formed block, and reflow comments for narrower available space.
* regcomp.c: generate folded for EXACTF and EXACTFUKarl Williamson2011-10-171-2/+9
| | | | | | regcomp.c folds the string in these two nodes except in one case. Change that case to correspond with the predominant behavior. This enables future optimizations.
* Comment-only nitsKarl Williamson2011-10-011-3/+3
|
* regcomp.c: Add invlist_invert_prop()Karl Williamson2011-10-011-0/+38
| | | | | | | | | This new function inverts a Unicode property. A regular inversion doesn't work because it operates on the whole of the code space, and Unicode property inversions don't invert above-Unicode code points. This does for inversion lists, what an earlier commit did for swashes. This function is currently not called by anyone.
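The difference between a regular inversion and a property inversion can be sketched at the level of membership tests. This is not the regcomp.c implementation: `cp_t`, `TOY_UNICODE_MAX` and the `prop_*` functions are hypothetical names, and the sketch assumes the reading that "don't invert above-Unicode code points" means membership above U+10FFFF is left as it was.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long cp_t;          /* stand-in for Perl's UV (assumption) */
#define TOY_UNICODE_MAX 0x10FFFFUL   /* highest Unicode code point */

/* An inversion list is a sorted array of range boundaries: even indexes
 * start a range that is in the set, odd indexes start one that is not.
 * A code point is in the set when an odd number of boundaries are <= it. */
bool prop_contains(const cp_t *list, size_t len, cp_t cp)
{
    size_t boundaries = 0;
    while (boundaries < len && list[boundaries] <= cp)
        boundaries++;
    return (boundaries % 2) == 1;
}

/* Regular inversion: flips membership over the whole code space. */
bool prop_inverted_contains(const cp_t *list, size_t len, cp_t cp)
{
    return !prop_contains(list, len, cp);
}

/* Property inversion (as assumed here): flips membership only up to
 * TOY_UNICODE_MAX and leaves above-Unicode code points untouched. */
bool prop_inverted_prop_contains(const cp_t *list, size_t len, cp_t cp)
{
    bool in_original = prop_contains(list, len, cp);
    return cp > TOY_UNICODE_MAX ? in_original : !in_original;
}
```

For the list `{0x41, 0x5B}` (A through Z), the regular inversion contains 0x110000 while the property inversion does not; both contain `'a'` and exclude `'A'`.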
* regcomp.c: Add assertionKarl Williamson2011-10-011-0/+2
| | | | | | This is to guard against misuse of the functions. There is currently no guard in the underlying Perl functions against lengthening a string beyond its capacity.
* avoid defining the same global functions in multiple objectsTony Cook2011-07-241-1/+6
| | | | This caused -Uusedl builds to fail
* fix segv in regcomp.c:S_join_exact()David Mitchell2011-07-051-5/+5
| | | | | | | | | | | This function joins multiple EXACT* nodes into a single node. At the end, under DEBUGGING, it marks the optimised-out nodes as being type OPTIMIZED. However, some of the 'nodes' aren't actually nodes; they're random bits of string at the tail of those nodes. So you can't peek at the 'node's OP field to decide what type it was. Instead, just unconditionally overwrite all the slots with fake OPTIMIZED nodes.
* Change 4 inversion list functions from S_ to Perl_Karl Williamson2011-07-031-8/+8
| | | | | This is in preparation for them to be called from another file. Note that they are still protected by an #ifdef in embed.fnc.
* Change _invlist_invert() from being in-lineKarl Williamson2011-07-031-1/+1
| | | | | | This is in preparation for it to be called from another file. If for performance reasons it needs to be made inline again, it could then be moved into a header.
* Change names of some inversion list functionsKarl Williamson2011-07-031-15/+15
| | | | | | The names now begin with an underscore to emphasize that they are for internal use only. This is in preparation for making them accessible beyond regcomp.c.
* regcomp.c: White space onlyKarl Williamson2011-07-031-2/+2
|
* regcomp.c: Do some [^abc] inversion at compile timeKarl Williamson2011-07-031-5/+32
| | | | | The new facilities with inversion lists enable us to do some more compile-time inversions.
* Add 3 methods for inversion listsKarl Williamson2011-07-031-1/+77
| | | | This adds inversion, cloning, and set subtraction
* Add inversion list dump routine, #ifdef'd out to prevent compiler warning, ↵Karl Williamson2011-07-031-0/+24
| | | | for use when debugging
* Add an element to inversion list data structureKarl Williamson2011-07-031-18/+153
| | | | | This element is restricted to either 0 or 1. The comments detail how its use enables an inversion list to be efficiently inverted.
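One way a 0-or-1 element can make inversion cheap is sketched below. The commit only says that the comments in regcomp.c explain the mechanism, so this layout is an assumption: storage always reserves a leading slot holding 0, and the offset says whether the logical list starts at that slot or just after it. The type `off_invlist` and the `off_*` functions are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned long slots[8];  /* slots[0] always holds 0 (reserved) */
    size_t used;             /* slots in use, counting slots[0] */
    int offset;              /* 0: list starts at slots[0]; 1: at slots[1] */
} off_invlist;

/* Start and length of the logical inversion list. */
const unsigned long *off_array(const off_invlist *l) { return l->slots + l->offset; }
size_t off_len(const off_invlist *l) { return l->used - (size_t)l->offset; }

/* Inverting a set adds or removes a leading 0 boundary; with the
 * reserved slot that is just a toggle, no elements are moved. */
void off_invert(off_invlist *l) { l->offset ^= 1; }

/* In the set when an odd number of boundaries are <= cp. */
bool off_contains(const off_invlist *l, unsigned long cp)
{
    const unsigned long *list = off_array(l);
    size_t len = off_len(l), boundaries = 0;
    while (boundaries < len && list[boundaries] <= cp)
        boundaries++;
    return (boundaries % 2) == 1;
}
```

With `slots = {0, 65, 91}`, `used = 3`, `offset = 1` the list describes 65 through 90; after `off_invert` the same storage describes everything except 65 through 90.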
* regcomp.c: Add commentsKarl Williamson2011-07-031-7/+29
|
* regcomp.c: Parenthesize rhs of #defineKarl Williamson2011-07-031-1/+1
|
* regcomp.c: Move a function aroundKarl Williamson2011-07-031-10/+11
| | | | This is so functions that operate on the same data are adjacent
* Add length element to inversion listsKarl Williamson2011-07-031-2/+15
| | | | | Future changes will make the length no longer the same as SvCUR, so create an element to hold the correct length.
* regcomp.c: Use inversion list iteratorKarl Williamson2011-07-031-41/+6
| | | | This changes to use the iterator when traversing an inversion list.
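A range iterator over an inversion list might look roughly like the following. The actual iterator API in regcomp.c is not shown in this log, so `toy_iter`, `toy_iter_init` and `toy_iter_next` are hypothetical names; the sketch only assumes the list is a sorted array of range boundaries.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const unsigned long *list;  /* sorted range boundaries */
    size_t len;                 /* number of boundaries */
    size_t pos;                 /* index of the next unread boundary */
} toy_iter;

void toy_iter_init(toy_iter *it, const unsigned long *list, size_t len)
{
    it->list = list;
    it->len = len;
    it->pos = 0;
}

/* Yields the next range in the set as an inclusive [start, end] pair.
 * Returns false once all ranges have been visited. */
bool toy_iter_next(toy_iter *it, unsigned long *start, unsigned long *end)
{
    if (it->pos >= it->len)
        return false;
    *start = it->list[it->pos++];
    if (it->pos < it->len)
        *end = it->list[it->pos++] - 1;  /* next boundary begins the gap */
    else
        *end = ULONG_MAX;                /* last range runs to the end of the code space */
    return true;
}
```

Callers then loop with `while (toy_iter_next(&it, &start, &end))` instead of indexing the array by hand, which keeps the even/odd boundary bookkeeping in one place.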
* Add iterator for inversion listsKarl Williamson2011-07-031-1/+53
|
* Allow a header in inversion lists.Karl Williamson2011-07-031-5/+16
| | | | | | An inversion list is an array of UVs. This allows for other UVs to be added at the beginning for ancillary purposes. This patch does not allocate any space for these, however.
* regcomp.c: Correct commentKarl Williamson2011-07-031-1/+1
|
* regcomp.c: Macroize two expressionsKarl Williamson2011-07-031-5/+9
| | | | This is in preparation for making things more complex in a later commit
* regcomp.c: Rmv no longer called functionKarl Williamson2011-07-031-16/+0
| This hasn't been used since
|
|     626725768b7b17463e9ec7b92e2da37105036252
|     Author: Nicholas Clark <nick@ccl4.org>
|     Date: Thu May 26 22:29:40 2011 -0600
|
|     regcomp.c: Fix memory leak regression
|
|     There was a remaining memory leak in the new inversion lists data structure under threading. This solves it by changing the implementation to use a SVpPV instead of doing our own memory management. Then the already existing code for handling SVs returns the memory when done.
* regcomp.c: Remove no longer called functionKarl Williamson2011-07-031-10/+0
| | | | | | The invlist_destroy function was misleading, as it has changed to just decrement the reference count, which may or may not lead to immediate destruction
* regcomp.c: Remove invlist_destroy callsKarl Williamson2011-07-031-4/+4
| | | | This is in preparation for removing the function
* regcomp.c: #undef after finishedKarl Williamson2011-07-031-0/+2
| | | | | | regcomp.c has a subsection dealing with the implementation of the inversion list class(-like object). Undef its macros so they can't possibly interfere with the rest of regcomp.c
* regcomp.c: Remove unneeded temporaryKarl Williamson2011-07-031-6/+2
| | | | A previous commit changed things so that this is no longer necessary
* regcomp.c: Revise inversion list APIKarl Williamson2011-07-031-18/+33
| | | | | | | These are static functions so no external effect. Revise the calling sequence of two functions so that they know enough to free the memory of the other parameters when appropriate. This hides from the callers the need for tracking when to free memory.
* regcomp.c: PL_utf8_foldclosures is a HVKarl Williamson2011-07-031-1/+1
| | | | It is not an inversion list, contrary to what this line used to say.
* Change inversion lists to SVsKarl Williamson2011-07-031-33/+32
| | | | | | The inversion list is an opaque object, currently implemented as an SV. Even if it ends up being an HV in the future, Nicholas is of the opinion that it should be presented to the world as an SV*.
* regcomp.c: Add commentKarl Williamson2011-05-311-0/+1
|
* regcomp.c: Fix memory leak regressionNicholas Clark2011-05-261-91/+15
| | | | | | | | There was a remaining memory leak in the new inversion lists data structure under threading. This solves it by changing the implementation to use a SVpPV instead of doing our own memory management. Then the already existing code for handling SVs returns the memory when done.
* regcomp.c: Another leak regressionKarl Williamson2011-05-221-0/+1
|
* regcomp.c: Another memory leak regressionKarl Williamson2011-05-201-0/+1
| | | | The reference count should be decremented upon freeing.
* regcomp.c: Fix bug in inversion list intersectionKarl Williamson2011-05-191-21/+35
| | | | | | | | | This code was derived from published code, which says use at your own risk, and which in fact had bugs. I don't believe these show up in 5.14, as I think you have to have a list that has been inverted for this to happen. The comments describe what should have been done.
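For orientation, a generic merge-style intersection of two inversion lists is sketched below. It is not the regcomp.c code and does not reproduce the bugs the commit fixed (the log does not describe them); `toy_intersect` is a hypothetical name, and the caller is assumed to supply an output buffer with room for `na + nb` entries.

```c
#include <stddef.h>

/* Intersects two inversion lists (sorted arrays of range boundaries).
 * Writes the resulting boundaries to 'out' and returns how many were
 * written. 'out' must have room for na + nb entries. */
size_t toy_intersect(const unsigned long *a, size_t na,
                     const unsigned long *b, size_t nb,
                     unsigned long *out)
{
    size_t ia = 0, ib = 0, no = 0;
    int in_a = 0, in_b = 0;   /* are we currently inside a range of a / of b? */

    while (ia < na || ib < nb) {
        unsigned long cp;
        int was_in_both = in_a && in_b;

        /* Take the smaller of the two next boundaries. */
        if (ib >= nb || (ia < na && a[ia] <= b[ib]))
            cp = a[ia];
        else
            cp = b[ib];

        /* Each boundary toggles membership in its own list. */
        if (ia < na && a[ia] == cp) { in_a = !in_a; ia++; }
        if (ib < nb && b[ib] == cp) { in_b = !in_b; ib++; }

        /* Emit a boundary only where membership in BOTH lists changes. */
        if (was_in_both != (in_a && in_b))
            out[no++] = cp;
    }
    return no;
}
```

Intersecting `{10, 20, 30, 40}` (10 to 19 and 30 to 39) with `{15, 35}` (15 to 34) yields `{15, 20, 30, 35}`, that is, 15 to 19 and 30 to 34.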
* regcomp.c: Add new macro for readabilityKarl Williamson2011-05-191-6/+7
| | | | | Adding this macro, which is the complement of an existing macro, helps in understanding what is happening at its point of use.