path: root/regnodes.h
Commit message | Author | Age | Files | Lines
* regcomp.sym: Add comments | Karl Williamson | 2011-10-17 | 1 | -4/+4
* regcomp.sym: Add nodes for backref of EXACTFA | Karl Williamson | 2011-02-14 | 1 | -90/+100
    These are not used yet.
* regcomp.sym: Add regnode for /aa matching | Karl Williamson | 2011-02-14 | 1 | -118/+123
    It is not used yet.
* Initial setup to accommodate /aa regex modifier | Karl Williamson | 2011-02-14 | 1 | -4/+4
    This changes the bits to add a new charset type for /aa, and other bookkeeping for it.
* Move all the generated file header printing into read_only_top() | Nicholas Clark | 2011-01-23 | 1 | -1/+1
    Previously all the scripts in regen/ had code to generate header comments (buffer-read-only, "do not edit this file", and optionally regeneration script, regeneration data, copyright years and filename). This change results in some minor reformatting of header blocks, and standardises the copyright line as "Larry Wall and others".
* regcomp.sym: Add nodes for /a | Karl Williamson | 2011-01-17 | 1 | -185/+226
    These aren't used yet.
* regcomp.sym: Remove unused nodes DIGITU, NDIGITU | Karl Williamson | 2011-01-16 | 1 | -148/+137
    These are unused because there is no difference between Unicode and non-Unicode semantics for digits. That is, there are no digit characters in the 128-255 range.
* regcomp.sym: Add BOUNDU, NBOUNDU regnodes | Karl Williamson | 2011-01-16 | 1 | -185/+195
    This will make for somewhat more efficient execution, as the regnode type won't have to be tested multiple times, at the expense of slightly bigger code space.
* regex: Add separate regnodes for \w \s Uni semantics | Karl Williamson | 2011-01-16 | 1 | -156/+187
    These nodes aren't actually used yet, but allow the splitting out of Unicode semantics for \w, \s, and their complements.
* regcomp.sym: add clarifying comments | Karl Williamson | 2011-01-16 | 1 | -2/+2
* Use multi-bit field for regex character set | Karl Williamson | 2011-01-16 | 1 | -2/+2
    The /d, /l, and /u regex modifiers are mutually exclusive. This patch changes the field that stores the character set to use more than one bit, with an enum determining which one. This data structure more closely follows the semantics of their being mutually exclusive, conserves bits, and is easier to expand. A small API is added to set and query the bit field. This patch is not .xs source backwards compatible. A handful of CPAN programs are affected.
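A minimal C sketch of the multi-bit approach described in the entry above: one enum value stored in a small bit field, with a setter and a getter. The names (charset_t, CHARSET_SHIFT, CHARSET_MASK, set_charset, get_charset) and the bit position are assumptions made for illustration; they are not taken from the Perl source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum: exactly one value is stored at a time, which is
 * what makes the modifiers mutually exclusive by construction. */
typedef enum {
    CHARSET_DEPENDS = 0,  /* /d */
    CHARSET_LOCALE  = 1,  /* /l */
    CHARSET_UNICODE = 2   /* /u */
} charset_t;

/* Hypothetical position of the two-bit field inside a flags word. */
#define CHARSET_SHIFT 5u
#define CHARSET_MASK  (0x3u << CHARSET_SHIFT)

/* Return flags with the charset field replaced; other bits untouched. */
static inline uint32_t set_charset(uint32_t flags, charset_t cs)
{
    assert(cs == CHARSET_DEPENDS || cs == CHARSET_LOCALE || cs == CHARSET_UNICODE);
    return (flags & ~CHARSET_MASK) | ((uint32_t)cs << CHARSET_SHIFT);
}

/* Extract the charset currently stored in flags. */
static inline charset_t get_charset(uint32_t flags)
{
    return (charset_t)((flags & CHARSET_MASK) >> CHARSET_SHIFT);
}
```

With two bits there is room for a fourth value without claiming another flag bit, which is one way to read "better expandable".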
* regcomp.sym: Add ANYOFV node | Karl Williamson | 2011-01-13 | 1 | -160/+165
    This node is like a straight ANYOF node to match [bracketed character classes], but can match multiple characters; in particular it can match a multi-char fold. When multi-char Unicode folding was added to Perl, it was overlooked that the ANYOF node is supposed to match exactly one character, hence there have been bugs ever since. With a specialized node that can match multiple chars, these can be fixed more easily. I tried at first to make ANYOF match multiple chars, but this causes Perl to not be able to fully compile.
* Run make regen after 486ec47ab73770ab updated regcomp.sym. | Nicholas Clark | 2011-01-07 | 1 | -1/+1
* regcomp.sym: Correct DIGITL, NDIGITL entries | Karl Williamson | 2010-12-07 | 1 | -3/+3
    These entries failed to indicate that the nodes are simple (matching exactly 1 character) and have 0 regnode arguments.
* regcomp.sym: Re-order for better grouping | Karl Williamson | 2010-12-07 | 1 | -134/+134
    The recently added regnodes are moved to their respective equivalence classes, and the named backreferences are moved to just after the numbered backreferences.
* regcomp.sym: Add REFFU and NREFFU nodes | Karl Williamson | 2010-12-01 | 1 | -9/+20
    These will be used for matching capture buffers case-insensitively using Unicode semantics. make regen will regenerate the delivered regnodes.h.
* regcomp.sym: Add EXACTFU regnode | Karl Williamson | 2010-11-28 | 1 | -6/+11
    This node will be used for matching case-insensitive exactish nodes using Unicode semantics.
* regcomp.sym: Clarify comment | Karl Williamson | 2010-11-22 | 1 | -1/+1
    make regen needed
* regcomp.sym: Fix descriptions | Karl Williamson | 2010-11-22 | 1 | -4/+4
    requires regen
* regcomp.pl -> regen/regcomp.pl | Father Chrysostomos | 2010-10-13 | 1 | -1/+1
* Add /d, /l, /u (infixed) regex modifiers | Karl Williamson | 2010-09-22 | 1 | -2/+2
    This patch adds recognition of these modifiers, with appropriate action for d and l. u does nothing useful yet. This allows for the interpolation of a regex into another one without losing the character set semantics that it was compiled with, as for the first time, the semantics is now specified in the stringification as one of these modifiers. To this end, it allocates an unused bit in the structures. The offsets change so as to not disturb other bits.
* regexp.h: Move bits around | Karl Williamson | 2010-08-11 | 1 | -13/+13
    make regen needed. This commit moves some bits in extflags around so that all the unallocated ones are at the boundary between the unshared portion and the portion shared with op.h. This allows them to be allocated in the future to go either way, without affecting binary compatibility at that time. The high-order bits are unaffected, but the low-order ones move to fill the gap.
* Convert REGNODE_{SIMPLE,VARIES} to a bitmask lookup, from a strchr() lookup. | Nicholas Clark | 2010-05-27 | 1 | -6/+22
    This is O(1) with no branching, instead of O(n) with branching. Deprecate the old implementation's externally visible variables PL_simple and PL_varies. Google codesearch suggests that nothing outside the core regexp code was using these.
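The difference between the two lookup styles can be sketched in C as follows. The opcode enum, table contents and function names here are hypothetical and chosen only to illustrate the technique; the real tables in regnodes.h are generated and are not reproduced here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical opcode numbering, for illustration only. */
enum { OP_END, OP_ANY, OP_STAR, OP_PLUS, OP_EXACT, OP_CURLY, OP_MAX };

/* Old style: a NUL-terminated list of opcodes, scanned linearly. */
static const char varies_list[] = { OP_STAR, OP_PLUS, OP_CURLY, 0 };

static int varies_strchr(int op)
{
    /* strchr() also matches the terminating NUL, so opcode 0 is excluded. */
    return op != 0 && strchr(varies_list, op) != NULL;
}

/* New style: one bit per opcode, packed into bytes. */
static const unsigned char varies_bitmask[(OP_MAX + 7) / 8] = {
    (1u << OP_STAR) | (1u << OP_PLUS) | (1u << OP_CURLY)
};

static int varies_bitmask_lookup(int op)
{
    /* The assert is a debugging aid; the lookup itself is a single
     * index, shift and mask, independent of how many opcodes exist. */
    assert(op >= 0 && op < OP_MAX);
    return (varies_bitmask[op >> 3] >> (op & 7)) & 1;
}
```

Both functions answer the same question for every opcode in this sketch; only the cost model differs.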
* Encapsulate lookups in PL_{varies,simple} within macros REGNODE_{VARIES,SIMPLE} | Nicholas Clark | 2010-05-27 | 1 | -0/+4
    This allows the implementation of the lookup mechanism to change.
* Generate PL_simple[] and PL_varies[] with regcomp.pl, rather than hard-coding. | Nicholas Clark | 2010-05-27 | 1 | -0/+24
    Add a new flags column to regcomp.sym, with V if the node type is in PL_varies, S if it is in PL_simple, and . if a placeholder is needed because subsequent optional columns are present.
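The three flag values described in the entry above can be sketched as a small classifier. The real generator is a Perl script (regcomp.pl); this C version, including the names node_class, classify_flag and in_varies_table, is purely hypothetical and only illustrates how V, S and the . placeholder might be interpreted.

```c
#include <assert.h>

/* Hypothetical classification of one character from the flags column. */
enum node_class {
    NODE_BAD_FLAG    = -1, /* anything other than V, S or . */
    NODE_PLAIN       = 0,  /* '.': placeholder, member of neither table */
    NODE_VARIES_FLAG = 1,  /* 'V': belongs in the "varies" table */
    NODE_SIMPLE_FLAG = 2   /* 'S': belongs in the "simple" table */
};

static int classify_flag(char flag)
{
    switch (flag) {
    case 'V': return NODE_VARIES_FLAG;
    case 'S': return NODE_SIMPLE_FLAG;
    case '.': return NODE_PLAIN;
    default:  return NODE_BAD_FLAG;
    }
}

/* Whether a node with this flag would be emitted into the "varies" table.
 * An unknown flag is treated as an error in the input (assumption). */
static int in_varies_table(char flag)
{
    int cls = classify_flag(flag);
    assert(cls != NODE_BAD_FLAG);
    return cls == NODE_VARIES_FLAG;
}
```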
* Remove stray tab character in definition for VERB. | Nicholas Clark | 2010-05-27 | 1 | -2/+2
    As VERB is "Used only for the type field of verbs" this is only a cosmetic change, causing that correct description to appear in the comment in regnodes.h. The change to regarglen doesn't affect anything, as the VERB type is never actually used for compiled nodes.
* Abolish RXf_UTF8. Store the UTF-8-ness of the pattern with SvUTF8(). | Nicholas Clark | 2008-01-05 | 1 | -2/+2
    p4raw-id: //depot/perl@32852
* Reorder the external regexp flags to get RXf_PMf_STD_PMMOD into the | Nicholas Clark | 2007-12-29 | 1 | -30/+30
    lowest 4 bits (which saves a shift), and the "flags indicating special patterns" into contiguous bits. This makes everything a little tidier, and saves 88 bytes (woohoo!) of object file with -Os on x86 FreeBSD.
    p4raw-id: //depot/perl@32775
* TRIE must use 'yes' state transitions when more than one match possible to ensure proper scope cleanup. | Marcus Holland-Moritz | 2007-08-18 | 1 | -2/+2
    Fix and test for issue raised in:
    Subject: Very strange interaction between regex and lexical array in blead
    Message-ID: <20070818015537.0088db31@r2d2>
    p4raw-id: //depot/perl@31733
* /p vs (?p) | Abigail | 2007-06-30 | 1 | -0/+42
    Date: Fri, 29 Jun 2007 23:38:07 +0200
    Message-ID: <20070629213807.GA14454@abigail.nl>
    Subject: [PATCH pod/perlre.pod] Keeping up with the changes.
    From: Abigail <abigail@abigail.be>
    Date: Sat, 30 Jun 2007 01:24:36 +0200
    Message-ID: <20070629232436.GA15326@abigail.nl>
    Plus tweaks, and debug enhancements.
    p4raw-id: //depot/perl@31506
* Re: Analysis of problems with mixed encoding case insensitive matches in regex engine. | Yves Orton | 2007-04-26 | 1 | -6/+11
    Message-ID: <9b18b3110704240746u461e4bdcl208ef7d7f9c5ef64@mail.gmail.com>
    p4raw-id: //depot/perl@31081
* Change meaning of \v, \V, and add \h, \H to match Perl6, add \R to match PCRE and unicode tr18 | Yves Orton | 2007-04-23 | 1 | -6/+31
    Message-ID: <9b18b3110704221434g43457742p28cab00289f83639@mail.gmail.com>
    p4raw-id: //depot/perl@31026
* Symbian sync | Jarkko Hietaniemi | 2007-04-01 | 1 | -1/+1
    Message-ID: <460EB6C1.4020406@iki.fi>
    p4raw-id: //depot/perl@30824
* Change 30461 was wrong. As ext/re (re)builds the regexp engine with | Nicholas Clark | 2007-03-05 | 1 | -5/+3
    -DDEBUGGING, it's going to need PL_reg_name even if core perl doesn't. So something is always going to use it, so always define it, and always export it. (But only define it once, so that static builds work.)
    p4raw-id: //depot/perl@30464
* Define and initialise reg_name only once. | Nicholas Clark | 2007-03-03 | 1 | -6/+7
    This allows re to be a static extension. As it's now no-longer a static variable in regcomp.c, it needs a PL_ prefix.
    p4raw-id: //depot/perl@30451
* Add Regexp::Keep \K functionality to regex engine as well as add \v and \V, cleanup and more docs for regatom() | Yves Orton | 2007-01-11 | 1 | -9/+20
    Message-ID: <9b18b3110701101133i46dc5fd0p1476a0f1dd1e9c5a@mail.gmail.com>
    (plus POD nits by Merijn and myself)
    p4raw-id: //depot/perl@29756
* Re: [PATCH] New regex syntax omnibus | Yves Orton | 2006-11-13 | 1 | -22/+33
    Message-ID: <9b18b3110611090809l667860c9t6c27453d7c86a21e@mail.gmail.com>
    p4raw-id: //depot/perl@29260
* New regex syntax omnibus | Yves Orton | 2006-11-07 | 1 | -130/+162
    Message-ID: <9b18b3110611060406u2fa1572as57073949a5df9e62@mail.gmail.com>
    Plus a portability fix (in string comparison for regex verbs) and doc tweaks / podchecker fixes
    p4raw-id: //depot/perl@29222
* Add more backtracking control verbs to regex engine (?CUT), (?ERROR) | Yves Orton | 2006-11-02 | 1 | -72/+80
    Message-ID: <9b18b3110611020335h7ea469a8g28ca483f6832816d@mail.gmail.com>
    p4raw-id: //depot/perl@29189
* Add a commit verb to regex engine to allow fine tuning of backtracking control. | Yves Orton | 2006-11-01 | 1 | -66/+77
    Message-ID: <9b18b3110610311349n5947cc8fsf0b2e6ddd9a7ee01@mail.gmail.com>
    p4raw-id: //depot/perl@29183
* The second patch from: | Yves Orton | 2006-10-30 | 1 | -118/+118
    Subject: [PATCH] regex engine optimiser should grok subroutine patterns, and, name subroutine regops more intuitively
    Message-ID: <9b18b3110610300915x3abf6cddu9c2071a70bea48e1@mail.gmail.com>
    p4raw-id: //depot/perl@29162
* The first patch from: | Yves Orton | 2006-10-30 | 1 | -3/+3
    Subject: [PATCH] regex engine optimiser should grok subroutine patterns, and, name subroutine regops more intuitively
    Message-ID: <9b18b3110610300915x3abf6cddu9c2071a70bea48e1@mail.gmail.com>
    p4raw-id: //depot/perl@29161
* Fix a problem with jump-tries, add (?FAIL) pattern. | Yves Orton | 2006-10-26 | 1 | -66/+71
    Message-ID: <9b18b3110610260559k3efa98barc28987e88c581a8a@mail.gmail.com>
    p4raw-id: //depot/perl@29118
* Add Regex conditionals. Various bugfixes. More tests. | Yves Orton | 2006-10-12 | 1 | -168/+183
    Message-ID: <9b18b3110610111546j74ca490dg21bd9fd1e7e10d42@mail.gmail.com>
    p4raw-id: //depot/perl@28998
* Re: [PATCH] Initial attempt at named captures for perls regexp engine | Yves Orton | 2006-10-07 | 1 | -66/+81
    Message-ID: <9b18b3110610061016x5ddce965u30d9a821f632d450@mail.gmail.com>
    p4raw-id: //depot/perl@28957
* migrate CURLYX/WHILEM branch in regmatch() to new FSM-esque paradigm | Dave Mitchell | 2006-10-05 | 1 | -50/+65
    p4raw-id: //depot/perl@28944
* Re: [PATCH] Add recursive regexes similar to PCRE | Yves Orton | 2006-10-05 | 1 | -56/+74
    Date: Wed, 4 Oct 2006 15:45:15 +0200
    Message-ID: <9b18b3110610040645s563220a2id6f235494b497e90@mail.gmail.com>
    Subject: Re: [PATCH] Add recursive regexes similar to PCRE
    From: demerphq <demerphq@gmail.com>
    Date: Wed, 4 Oct 2006 21:05:10 +0200
    Message-ID: <9b18b3110610041205m2660eb43m1315cf4b0653db96@mail.gmail.com>
    p4raw-id: //depot/perl@28939
* Fixes to compile Perl with g++ and DEBUGGING. | Steve Peters | 2006-10-04 | 1 | -2/+2
    p4raw-id: //depot/perl@28934
* Re: [PATCH] Add hook for re_dup() into regex engine as reg_dupe (make re pluggable under threads) | Yves Orton | 2006-09-29 | 1 | -93/+93
    Message-ID: <9b18b3110609290341p11767110sec20a6fee2038a00@mail.gmail.com>
    p4raw-id: //depot/perl@28900
* Automate generation of the regmatch() state constants | Yves Orton | 2006-09-25 | 1 | -340/+424
    Subject: Re: Problem with EVAL handling in bleads iterative regex code.
    Message-Id: <9b18b3110609251109t4cb1d443y87d7a7dc94fcfc24@mail.gmail.com>
    p4raw-id: //depot/perl@28892