| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
| |
These are not used yet.
|
|
|
|
| |
It is not used yet.
|
|
|
|
|
| |
This changes the bits to add a new charset type for /aa, and other bookkeeping
for it.
|
|
|
|
|
|
|
|
|
| |
Previously all the scripts in regen/ had code to generate header comments
(buffer-read-only, "do not edit this file", and optionally regeneration
script, regeneration data, copyright years and filename).
This change results in some minor reformatting of header blocks, and
standardises the copyright line as "Larry Wall and others".
|
|
|
|
| |
These aren't used yet.
|
|
|
|
|
|
| |
These are unused because, for digits, there is no difference between
Unicode and non-Unicode semantics. That is, there are no digit characters
in the 128-255 range.
|
|
|
|
|
|
| |
This will make for somewhat more efficient execution, as it won't have to
test the regnode type multiple times, at the expense of slightly bigger
code space.
|
|
|
|
|
| |
These nodes aren't actually used yet, but allow the splitting out of
Unicode semantics for \w, \s, and their complements.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The /d, /l, and /u regex modifiers are mutually exclusive. This patch
changes the field that stores the character set into a multi-bit field
whose value is an enum naming the modifier in effect. This data structure
follows the mutually exclusive semantics more closely, conserves bits, and
is easier to extend.
A small API is added to set and query the bit field.
This patch is not backwards compatible at the .xs source level. A handful
of CPAN programs are affected.
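A minimal sketch of the idea, assuming a 32-bit flags word: the character set is stored as a small enum value in a multi-bit field, with one setter and one getter. All names, the field position, and the field width here are hypothetical and chosen for illustration; they are not taken from the Perl source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and bit positions, for illustration only. */
typedef enum {
    CHARSET_DEPENDS = 0,   /* /d */
    CHARSET_LOCALE  = 1,   /* /l */
    CHARSET_UNICODE = 2    /* /u */
} charset_t;

#define CHARSET_SHIFT 5u                              /* assumed field position */
#define CHARSET_MASK  (UINT32_C(3) << CHARSET_SHIFT)  /* assumed 2-bit field */

/* Store cs in the charset field, leaving every other bit untouched. */
static inline void set_charset(uint32_t *flags, charset_t cs)
{
    assert(((uint32_t)cs & ~UINT32_C(3)) == 0);       /* must fit in the field */
    *flags = (*flags & ~CHARSET_MASK) | ((uint32_t)cs << CHARSET_SHIFT);
}

/* Read the charset field back as an enum value. */
static inline charset_t get_charset(uint32_t flags)
{
    return (charset_t)((flags & CHARSET_MASK) >> CHARSET_SHIFT);
}
```

Because the field holds a single value, two modifiers can never be set at once, and one spare encoding remains in the same two bits for a future character set.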
|
|
|
|
|
|
|
|
|
|
|
|
| |
This node is like a straight ANYOF node to match [bracketed character classes],
but can match multiple characters; in particular it can match a multi-char
fold.
When multi-char Unicode folding was added to Perl, it was overlooked that the
ANYOF node is supposed to match exactly one character, hence there have been
bugs ever since. With a specialized node that can match multiple chars,
these can be fixed more easily. I tried at first to make ANYOF match multiple
chars, but that left Perl unable to compile fully.
|
| |
|
|
|
|
|
| |
These were missing the indication that they are simple (matching exactly
1 character) and have 0 regnode arguments.
|
|
|
|
|
|
| |
The recently added regnodes are moved to their respective equivalence
classes, and the named backreferences are moved to just after the
numbered backreferences
|
|
|
|
|
|
|
| |
These will be used for matching capture buffers case-insensitively using
Unicode semantics.
make regen will regenerate the delivered regnodes.h
|
|
|
|
|
| |
This node will be used for matching case insensitive exactish nodes
using Unicode semantics
|
|
|
|
| |
make regen needed
|
|
|
|
| |
requires regen
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds recognition of these modifiers, with appropriate action
for d and l. u does nothing useful yet. This allows for the
interpolation of a regex into another one without losing the character
set semantics that it was compiled with, as for the first time, the
semantics is now specified in the stringification as one of these
modifiers.
To this end, it allocates an unused bit in the structures. The offsets
change so as to not disturb other bits.
|
|
|
|
|
|
|
|
|
|
|
|
| |
make regen needed.
This commit moves some bits in extflags around so that all the unallocated
ones are at the boundary between the unshared portion and the portion
shared with op.h. This allows them to be allocated in the future to go
either way, without affecting binary compatibility at that time.
The high-order bits are unaffected, but the low order ones move to fill
the gap.
|
|
|
|
|
|
|
| |
This is O(1) with no branching, instead of O(n) with branching.
Deprecate the old implementation's externally visible variables
PL_simple and PL_varies. Google codesearch suggests that nothing outside the
core regexp code was using these.
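The message does not show how the O(1) lookup is done. One common way to get a constant-time, branch-free membership test is a bitmap indexed by opcode, sketched below next to a linear scan for comparison. The opcode names, the list, and both functions are hypothetical and are not the identifiers used in the core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode numbering, for illustration only. */
enum { OP_END = 0, OP_EXACT, OP_ANY, OP_STAR, OP_BRANCH, OP_COUNT };

/* O(n) with branching: scan a zero-terminated list of "varying" opcodes. */
static const uint8_t varies_list[] = { OP_STAR, OP_BRANCH, 0 };

static int varies_linear(uint8_t op)
{
    for (const uint8_t *p = varies_list; *p; p++)
        if (*p == op)
            return 1;
    return 0;
}

/* O(1) without branching: one bit per opcode, set if the opcode varies. */
static const uint8_t varies_bits[(OP_COUNT + 7) / 8] = {
    (1u << OP_STAR) | (1u << OP_BRANCH)
};

/* Caller must pass op < OP_COUNT. */
static int varies_bitmap(uint8_t op)
{
    return (varies_bits[op >> 3] >> (op & 7)) & 1;
}
```

The bitmap version costs one byte load, one shift, and one mask regardless of how many opcodes are in the set, and callers that go through a function or macro like this need not know which representation is behind it.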
|
|
|
|
| |
This allows the implementation of the lookup mechanism to change.
|
|
|
|
|
|
| |
Add a new flags column to regcomp.sym, with V if the node type is in PL_varies,
S if it is in PL_simple, and . if a placeholder is needed because subsequent
optional columns are present.
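As a rough illustration of how such a flags column could be decoded, the sketch below maps V and S to bits and treats . as an empty placeholder. The function name, the bit values, and the error handling are assumptions made for this example; the real regen script is not reproduced here.

```c
#include <assert.h>

/* Hypothetical bit values, for illustration only. */
enum { NODE_VARIES = 1, NODE_SIMPLE = 2 };

/* Decode one flags column: 'V' = varies, 'S' = simple, '.' = placeholder.
 * Returns the combined bits, or -1 if an unknown flag letter is seen. */
static int parse_node_flags(const char *col)
{
    int flags = 0;

    assert(col != 0);
    for (; *col; col++) {
        switch (*col) {
        case 'V': flags |= NODE_VARIES; break;
        case 'S': flags |= NODE_SIMPLE; break;
        case '.': break;              /* placeholder: contributes nothing */
        default:  return -1;          /* unknown flag letter */
        }
    }
    return flags;
}
```

Under these assumptions, `parse_node_flags("V")` yields `NODE_VARIES`, `parse_node_flags("S")` yields `NODE_SIMPLE`, and `parse_node_flags(".")` yields 0.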
|
|
|
|
|
|
|
| |
As VERB is "Used only for the type field of verbs" this is only a cosmetic
change, causing that correct description to appear in the comment in
regnodes.h. The change to regarglen doesn't affect anything, as the VERB type
is never actually used for compiled nodes.
|
|
|
| |
p4raw-id: //depot/perl@32852
|
|
|
|
|
|
|
| |
lowest 4 bits (which saves a shift), and the "flags indicating special
patterns" into contiguous bits. This makes everything a little tidier,
and saves 88 bytes (woohoo!) of object file with -Os on x86 FreeBSD.
p4raw-id: //depot/perl@32775
|
|
|
|
|
|
|
|
|
|
| |
ensure proper scope cleanup.
Fix and test for issue raised in:
Subject: Very strange interaction between regex and lexical array in blead
Message-ID: <20070818015537.0088db31@r2d2>
p4raw-id: //depot/perl@31733
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Date: Fri, 29 Jun 2007 23:38:07 +0200
Message-ID: <20070629213807.GA14454@abigail.nl>
Subject: [PATCH pod/perlre.pod] Keeping up with the changes.
From: Abigail <abigail@abigail.be>
Date: Sat, 30 Jun 2007 01:24:36 +0200
Message-ID: <20070629232436.GA15326@abigail.nl>
Plus tweaks, and debug enhancements.
p4raw-id: //depot/perl@31506
|
|
|
|
|
|
|
| |
regex engine.
Message-ID: <9b18b3110704240746u461e4bdcl208ef7d7f9c5ef64@mail.gmail.com>
p4raw-id: //depot/perl@31081
|
|
|
|
|
|
|
| |
PCRE and unicode tr18
Message-ID: <9b18b3110704221434g43457742p28cab00289f83639@mail.gmail.com>
p4raw-id: //depot/perl@31026
|
|
|
|
|
| |
Message-ID: <460EB6C1.4020406@iki.fi>
p4raw-id: //depot/perl@30824
|
|
|
|
|
|
|
| |
-DDEBUGGING, it's going to need PL_reg_name even if core perl doesn't.
So something is always going to use it, so always define it, and always
export it. (But only define it once, so that static builds work.)
p4raw-id: //depot/perl@30464
|
|
|
|
|
|
|
| |
This allows re to be a static extension.
As it's no longer a static variable in regcomp.c, it needs a PL_
prefix.
p4raw-id: //depot/perl@30451
|
|
|
|
|
|
|
|
|
| |
cleanup and more docs for regatom()
Message-ID: <9b18b3110701101133i46dc5fd0p1476a0f1dd1e9c5a@mail.gmail.com>
(plus POD nits by Merijn and myself)
p4raw-id: //depot/perl@29756
|
|
|
|
|
| |
Message-ID: <9b18b3110611090809l667860c9t6c27453d7c86a21e@mail.gmail.com>
p4raw-id: //depot/perl@29260
|
|
|
|
|
|
|
|
| |
Message-ID: <9b18b3110611060406u2fa1572as57073949a5df9e62@mail.gmail.com>
Plus a portability fix (in string comparison for regex verbs)
and doc tweaks / podchecker fixes
p4raw-id: //depot/perl@29222
|
|
|
|
|
| |
Message-ID: <9b18b3110611020335h7ea469a8g28ca483f6832816d@mail.gmail.com>
p4raw-id: //depot/perl@29189
|
|
|
|
|
| |
Message-ID: <9b18b3110610311349n5947cc8fsf0b2e6ddd9a7ee01@mail.gmail.com>
p4raw-id: //depot/perl@29183
|
|
|
|
|
|
| |
Subject: [PATCH] regex engine optimiser should grok subroutine patterns, and, name subroutine regops more intuitively
Message-ID: <9b18b3110610300915x3abf6cddu9c2071a70bea48e1@mail.gmail.com>
p4raw-id: //depot/perl@29162
|
|
|
|
|
|
| |
Subject: [PATCH] regex engine optimiser should grok subroutine patterns, and, name subroutine regops more intuitively
Message-ID: <9b18b3110610300915x3abf6cddu9c2071a70bea48e1@mail.gmail.com>
p4raw-id: //depot/perl@29161
|
|
|
|
|
| |
Message-ID: <9b18b3110610260559k3efa98barc28987e88c581a8a@mail.gmail.com>
p4raw-id: //depot/perl@29118
|
|
|
|
|
| |
Message-ID: <9b18b3110610111546j74ca490dg21bd9fd1e7e10d42@mail.gmail.com>
p4raw-id: //depot/perl@28998
|
|
|
|
|
| |
Message-ID: <9b18b3110610061016x5ddce965u30d9a821f632d450@mail.gmail.com>
p4raw-id: //depot/perl@28957
|
|
|
| |
p4raw-id: //depot/perl@28944
|
|
|
|
|
|
|
|
|
|
|
| |
Date: Wed, 4 Oct 2006 15:45:15 +0200
Message-ID: <9b18b3110610040645s563220a2id6f235494b497e90@mail.gmail.com>
Subject: Re: [PATCH] Add recursive regexes similar to PCRE
From: demerphq <demerphq@gmail.com>
Date: Wed, 4 Oct 2006 21:05:10 +0200
Message-ID: <9b18b3110610041205m2660eb43m1315cf4b0653db96@mail.gmail.com>
p4raw-id: //depot/perl@28939
|
|
|
| |
p4raw-id: //depot/perl@28934
|
|
|
|
|
|
|
| |
pluggable under threads)
Message-ID: <9b18b3110609290341p11767110sec20a6fee2038a00@mail.gmail.com>
p4raw-id: //depot/perl@28900
|
|
|
|
|
|
| |
Subject: Re: Problem with EVAL handling in bleads iterative regex code.
Message-Id: <9b18b3110609251109t4cb1d443y87d7a7dc94fcfc24@mail.gmail.com>
p4raw-id: //depot/perl@28892
|