| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
This new function inverts a Unicode property. A regular inversion
doesn't work because it operates on the whole of the code space, and
Unicode property inversions don't invert above-Unicode code points.
This does for inversion lists what an earlier commit did for swashes.
This function is currently not called by anyone.
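
The difference between the two kinds of inversion can be sketched in a few lines of C. This is only an illustration, not the code in regcomp.c: the InvList type and the function names are hypothetical, and it assumes the intended result is that neither a property nor its inverse matches code points above 0x10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UNICODE_MAX 0x10FFFFUL   /* highest Unicode code point */
#define INVLIST_MAX 16           /* capacity of this toy list */

/* Hypothetical stand-in for an inversion list: a sorted array in which
 * element 0 starts a range that is in the set, element 1 starts a range
 * that is not, element 2 starts the next range that is, and so on. */
typedef struct {
    unsigned long elems[INVLIST_MAX];
    size_t len;
} InvList;

/* A code point is in the set if an odd number of elements are <= it. */
static int invlist_contains(const InvList *l, unsigned long cp)
{
    size_t i, starts = 0;
    for (i = 0; i < l->len && l->elems[i] <= cp; i++)
        starts++;
    return starts % 2 == 1;
}

/* Regular inversion over the whole code space: toggle a leading 0. */
static void invlist_invert(InvList *l)
{
    if (l->len > 0 && l->elems[0] == 0) {
        memmove(l->elems, l->elems + 1, (l->len - 1) * sizeof l->elems[0]);
        l->len--;
    } else {
        assert(l->len < INVLIST_MAX);
        memmove(l->elems + 1, l->elems, l->len * sizeof l->elems[0]);
        l->elems[0] = 0;
        l->len++;
    }
}

/* Property inversion: invert, then cut the result off at UNICODE_MAX.
 * Assumes the input matches nothing above UNICODE_MAX. */
static void invlist_invert_prop(InvList *l)
{
    assert(l->len % 2 == 0);
    assert(l->len == 0 || l->elems[l->len - 1] <= UNICODE_MAX + 1);

    invlist_invert(l);

    if (l->len > 0 && l->elems[l->len - 1] == UNICODE_MAX + 1) {
        /* The inverted list would start matching at 0x110000; drop that. */
        l->len--;
    } else {
        /* End the final matching range just past the Unicode maximum. */
        assert(l->len < INVLIST_MAX);
        l->elems[l->len++] = UNICODE_MAX + 1;
    }
}
```

Under these assumptions, inverting the list for A-Z (0x41, 0x5B) with invlist_invert_prop yields 0, 0x41, 0x5B, 0x110000, whereas invlist_invert alone would leave the final range open-ended and so match every above-Unicode code point.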
|
|
|
|
|
|
| |
This is to guard against misuse of the functions. There is currently
no guard in the underlying Perl functions against lengthening a string
beyond the capacity to hold it.
|
|
|
|
| |
This caused -Uusedl builds to fail
|
|
|
|
|
|
|
|
|
|
|
| |
This function joins multiple EXACT* nodes into a single node.
At the end, under DEBUGGING, it marks the optimised-out nodes as being
type OPTIMIZED. However, some of the 'nodes' aren't actually nodes;
they're random bits of string at the tail of those nodes. So you
can't peek at the 'node's OP field to decide what type it was.
Instead, just unconditionally overwrite all the slots with fake
OPTIMIZED nodes.
|
|
|
|
|
| |
This is in preparation for them to be called from another file. Note
that they are still protected by an #ifdef in embed.fnc.
|
|
|
|
|
|
| |
This is in preparation for it to be called from another file. If
for performance reasons it needs to be made inline again, it could
then be moved into a header.
|
|
|
|
|
|
| |
The names now begin with an underscore to emphasize that they are
for internal use only. This is in preparation for making them
accessible beyond regcomp.c.
|
| |
|
|
|
|
|
| |
The new facilities with inversion lists enable us to do
some more compile-time inversions.
|
|
|
|
| |
This adds inversion, cloning, and set subtraction
|
|
|
|
| |
for use when debugging
|
|
|
|
|
| |
This element is restricted to either 0 or 1. The comments detail
how its use enables an inversion list to be efficiently inverted.
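
One way a 0-or-1 element can make inversion cheap is sketched below. The layout is an assumption made for illustration, not taken from the commit: the storage always reserves a leading slot holding 0, and the 0-or-1 element says whether the visible array starts at that slot or at the one after it. The OffsetInvList type and the oil_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: slots[0] always holds 0; 'offset' (0 or 1) selects
 * whether the visible array begins at slots[0] or slots[1]; 'len' counts
 * the visible elements. */
typedef struct {
    unsigned long slots[16];
    size_t offset;
    size_t len;
} OffsetInvList;

/* Return the start of the visible array. */
static const unsigned long *oil_array(const OffsetInvList *l)
{
    return l->slots + l->offset;
}

/* Invert in constant time: expose or hide the reserved leading 0 instead
 * of shifting every element up or down by one. */
static void oil_invert(OffsetInvList *l)
{
    assert(l->offset == 0 || l->offset == 1);
    if (l->offset == 0) {
        assert(l->len >= 1);   /* the leading 0 is currently visible */
        l->offset = 1;
        l->len--;
    } else {
        l->offset = 0;
        l->len++;
    }
}
```

With this layout, a list for A-Z stored as slots 0, 0x41, 0x5B with offset 1 and length 2 becomes its complement by setting the offset to 0 and the length to 3; no elements move.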
|
| |
|
| |
|
|
|
|
| |
This is so functions that operate on the same data are adjacent
|
|
|
|
|
| |
Future changes will make the length no longer the same as SvCUR,
so create an element to hold the correct length
|
|
|
|
| |
This changes to use the iterator when traversing an inversion list.
|
| |
|
|
|
|
|
|
| |
An inversion list is an array of UVs. This allows for other UVs
to be added at the beginning for ancillary purposes. This patch
does not allocate any space for these, however.
|
| |
|
|
|
|
| |
This is in preparation for making things more complex in a later commit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This hasn't been used since 626725768b7b17463e9ec7b92e2da37105036252
Author: Nicholas Clark <nick@ccl4.org>
Date: Thu May 26 22:29:40 2011 -0600
regcomp.c: Fix memory leak regression
There was a remaining memory leak in the new inversion lists data
structure under threading. This solves it by changing the
implementation to use a SVpPV instead of doing our own memory
management. Then the already existing code for handling SVs
returns the memory when done.
|
|
|
|
|
|
| |
The invlist_destroy function was misleading, as it has changed to
just decrement the reference count, which may or may not lead to
immediate destruction
|
|
|
|
| |
This is in preparation for removing the function
|
|
|
|
|
|
| |
regcomp.c has a subsection dealing with the implementation of the
inversion list class(-like object). Undef its macros so they
can't possibly interfere with the rest of regcomp.c
|
|
|
|
| |
A previous commit changed things so that this is no longer necessary
|
|
|
|
|
|
|
| |
These are static functions, so there is no external effect. Revise the
calling sequence of two functions so that they know enough to free the
memory of their other parameters when appropriate. This hides from the
callers the need to track when to free memory.
|
|
|
|
| |
It is not an inversion list, contrary to what this line used to say.
|
|
|
|
|
|
| |
The inversion list is an opaque object, currently implemented as an SV.
Even if it ends up being an HV in the future, Nicholas is of the opinion
that it should be presented to the world as an SV*.
|
| |
|
|
|
|
|
|
|
|
| |
There was a remaining memory leak in the new inversion lists data
structure under threading. This solves it by changing the
implementation to use a SVpPV instead of doing our own memory
management. Then the already existing code for handling SVs
returns the memory when done.
|
| |
|
|
|
|
| |
The reference count should be decremented upon freeing.
|
|
|
|
|
|
|
|
|
| |
This code was derived from published code, which says use at your
own risk, and which in fact had bugs. I don't believe these show up in
5.14, as I think you have to have a list that has been inverted for this
to happen.
The comments describe what should have been done.
|
|
|
|
|
| |
Adding this macro, which is the complement of an existing macro, helps
in understanding what is happening at its point of use
|
|
|
|
|
|
| |
GREEK PROSGEGRAMMENI and COMBINING GREEK YPOGEGRAMMENI fold to the
first character of one of the tricky folds, and hence need
to be treated as potentially tricky themselves.
|
| |
|
| |
|
|
|
|
|
| |
These two characters fold to lower-case characters that are involved
in tricky folds, and hence these can be too.
|
|
|
|
|
|
|
|
|
|
|
| |
I carelessly added this memory leak, which happens in a bracketed
character class under /i when there is both an above-255 code point
listed and one of the several below-256 code points that participate
in a fold with ones above 255.
For 5.16, as the use of the inversion list data structure expands, an
automatic method of freeing space will need to be put in place. But
this should be sufficient for 5.14.1.
|
|
|
|
|
|
|
| |
It turns out that caseless tries currently only work on UTF-8 EXACTFU
nodes. The code attempted to test that by using UNI_SEMANTICS, but that
doesn't actually work; what is important is the semantics of the current
node.
|
|
|
|
| |
It has been unused since 28d8d7f41ab202dd restructured the regexp dup code.
|
|
|
|
|
|
| |
gcc 4.6.0 warns about variables that are set but never read, and unless
RE_TRACK_PATTERN_OFFSETS is defined, parse_start is never read. So avoid
declaring or setting it if it's not actually going to be used later.
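
The pattern can be sketched as follows. This is a minimal illustration rather than the regcomp.c code: parse_piece and its offsets parameter are hypothetical, and only the RE_TRACK_PATTERN_OFFSETS and parse_start names come from the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Uncomment to compile in the offset tracking: */
/* #define RE_TRACK_PATTERN_OFFSETS */

/* Hypothetical parser step: scan from 'pos' to the next '|' or the end of
 * the pattern and return the new position. */
static size_t parse_piece(const char *pattern, size_t pos, size_t *offsets)
{
#ifdef RE_TRACK_PATTERN_OFFSETS
    /* Declared and set only when something will read it later. */
    const size_t parse_start = pos;
#endif

    assert(pattern != NULL);
    while (pattern[pos] != '\0' && pattern[pos] != '|')
        pos++;

#ifdef RE_TRACK_PATTERN_OFFSETS
    offsets[parse_start] = pos - parse_start;   /* the only reader */
#else
    (void)offsets;   /* unused when tracking is compiled out */
#endif
    return pos;
}
```

Guarding the declaration with the same #ifdef as its only use keeps the variable out of builds where it would be written but never read, which is the situation a set-but-unused warning reports.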
|
| |
|
|
|
|
|
| |
A previous commit added an 'if' around this code. This now indents
the block properly.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch causes inverted [bracketed] character classes to not handle
multi-character folds. The reason is that these can lead to very
counter-intuitive results (see bug discussion).
In an inverted character class, only single-char folds are now
generated. However the fold for \xDF=>ss is hard-coded in,
and it was too much trouble sending flags to the sub-sub routine that
does this, so another check is done at the point of storing the list of
multi-char folds. Since \xDF doesn't have a single char fold, this
works.
|
|
|
|
|
| |
As agreed, this improvement is going into 5.14. A customized
message is output, instead of a generic one.
|
|
|
|
|
|
|
| |
An earlier commit in this series changed some error messages.
I realized that it did not really make sense to use "/a" for the regex
modifier, when the message was for the infix form "(?a:", so this
just removes the slash.
|
|
|
|
|
|
| |
This allows a second regex 'a' modifier in the infix form to not have to
be contiguous with the first, and improves the message if there are extra
modifiers.
|