| Commit message | Author | Age | Files | Lines |
| |
|
| |
As discussed on p5p, ibcmp has different semantics from other cmp
functions in that it is a binary instead of ternary function. It is
less confusing then to have a name that implies true/false.
There are three functions affected: ibcmp, ibcmp_locale and ibcmp_utf8.
ibcmp is actually equivalent to foldNE, but for the same reason that things
like 'unless' and 'until' are cautioned against, I changed the functions
to foldEQ. The existing names, like ibcmp_utf8, are now defined as
macros that are the complement of foldEQ.
This patch also changes the one file where turning ibcmp into a macro
causes problems. It changes it to use the new name. It also documents
for the first time ibcmp, ibcmp_locale and their new names.
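
The naming change above can be illustrated with a minimal sketch. The names toy_foldEQ and toy_ibcmp are hypothetical stand-ins, not the functions in the Perl source, and the fold here is ASCII-only by assumption:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for foldEQ: returns 1 if the first len bytes of
 * s1 and s2 are equal when ASCII case is ignored, 0 otherwise. */
static int toy_foldEQ(const char *s1, const char *s2, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (tolower((unsigned char)s1[i]) != tolower((unsigned char)s2[i]))
            return 0;
    }
    return 1;
}

/* The old name survives only as a macro: the complement of the new one. */
#define toy_ibcmp(s1, s2, len) (!toy_foldEQ((s1), (s2), (len)))
```

With this shape, a caller reads `if (toy_foldEQ(a, b, n))` as "if equal", while old callers of the ibcmp-style name keep their "nonzero means different" behaviour.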
|
| |
|
| |
This removes the comment about the function name, and converts tabs to
blanks throughout the function, as so much of it is changing already.
It also removes trailing whitespace in other lines of the file.
|
| |
I had a hard time understanding how this routine worked; there were no
comments. In figuring it out, I discovered it could be made more
efficient. This routine is called over and over in the innermost loops
in regex matching, so efficiency is a concern.
Setup is done once before the main while loop so that it now has two
conditions instead of eight. The loop was rearranged slightly to be
smaller and a couple of unneeded assignments to temporaries were
removed, and recomputation of some values was avoided. Several other
small efficiency changes were made.
Several asserts had been commented out, saying that they make tests
fail. But they no longer do, at least on my platform. There was a
reason that they were asserts to begin with, and that is they denoted an
insane or trivial condition. Apparently there have been fixes to the
other code calling this, so I re-enabled them.
The names of several variables were changed to be less confusing; hence
f1 means the fold buffer for string 1 whereas it used to mean its goal,
which is now g1.
The leading indent was changed from 5 to 4 blanks. I made enough
other changes that I didn't submit this as a separate commit.
|
| |
|
|
| |
Users can define their own case changing mappings to replace the
standard ones. Prior to this patch, any mappings on characters whose
ordinals are 0-222, 224-255 that resulted in multiple characters were
ignored.
Note that there still is a deficiency in that the mappings will be
applied only to strings in utf8 format.
|
| |
If a character folds to multiple ones in case-insensitive matching,
it should not match just one of those, or the regular expression can
loop. For example, \N{LATIN SMALL LIGATURE FF} folds to 'ff', and so
"\N{LATIN SMALL LIGATURE FF}" =~ /f+/i
should match. Prior to this patch, this function returned that there is
a match, but left the matching string pointer at the beginning of the
"\N{LATIN SMALL LIGATURE FF}" because it doesn't make sense to match
just half a character, and at this level it doesn't know about the '+'.
This leaves things in an inconsistent state, with the reporting of a
match, but the input pointer unchanged, the result of which is a loop.
I don't know how to fix this so that it correctly matches, and there are
semantic issues with doing so. For example, if
"\N{LATIN SMALL LIGATURE FF}" =~ /ff/i
matches, then one would think that so should
"\N{LATIN SMALL LIGATURE FF}" =~ /(f)(f)/i
But $1 and $2 don't really make sense here, since each would refer to
half of the same character.
So this patch just returns failure if only a partial character is
matched. That leaves things consistent, and solves the problem of
looping, so that Perl doesn't hang on such a construct, but leaves the
ultimate solution for another day.
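
A toy sketch of the "fail on a partial character" rule described above. Everything here is an assumption for illustration: '#' is a made-up single-byte stand-in for a character that folds to "ff", the pattern is assumed to be already folded, and toy_fold_match is not the routine in the Perl source:

```c
#include <ctype.h>
#include <stddef.h>

/* Fold one text character into out[]; returns the number of folded bytes.
 * '#' is a made-up stand-in for a ligature that folds to "ff". */
static size_t toy_fold(char c, char out[2])
{
    if (c == '#') {
        out[0] = 'f';
        out[1] = 'f';
        return 2;
    }
    out[0] = (char)tolower((unsigned char)c);
    return 1;
}

/* Returns the number of text characters consumed if pat matches a prefix
 * of text under folding, or -1 on failure. A pattern that ends in the
 * middle of a multi-byte fold is treated as a failure, not a match. */
static long toy_fold_match(const char *text, const char *pat)
{
    long consumed = 0;
    while (*pat) {
        char f[2];
        size_t n, i;
        if (!*text)
            return -1;              /* text exhausted before the pattern */
        n = toy_fold(*text, f);
        for (i = 0; i < n; i++) {
            if (*pat == '\0')
                return -1;          /* only part of one character matched */
            if (tolower((unsigned char)*pat) != f[i])
                return -1;
            pat++;
        }
        text++;
        consumed++;
    }
    return consumed;
}
```

In this sketch, matching "f" against "#" fails outright instead of reporting a match with zero characters consumed, which is the inconsistent state the message says caused the loop.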
|
| |
|
| |
|
| |
|
| |
The "be understanding" bodge to not panic, introduced in 1de9afcdf18cf98b, is
no longer needed now that c28d61051c446453 fixes the underlying problem.
|
| |
("be understanding" being a bodge added in 1de9afcdf18cf98b, which will soon go
when I fix the underlying cause of the bugs it works around.)
|
| |
|
| |
Both high ends were one too low.
|
| |
|
| |
|
| |
Rather than transposing n + 1 bytes, including 1 it was not passed, before
calling utf16_to_utf8() and having that croak.
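
A minimal sketch of checking the length before transposing, assuming the intent is to reject an odd byte count before any byte is touched. toy_swap_utf16_bytes is a hypothetical helper, and it returns an error code where the real code would croak:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap the two bytes of each UTF-16 code unit in place.
 * Returns 0 on success, or -1 if bytelen is odd, in which case the buffer
 * is left untouched and no byte beyond bytelen is read or written. */
static int toy_swap_utf16_bytes(uint8_t *p, size_t bytelen)
{
    if (bytelen & 1)
        return -1;                  /* reject before swapping anything */
    for (size_t i = 0; i < bytelen; i += 2) {
        uint8_t tmp = p[i];
        p[i] = p[i + 1];
        p[i + 1] = tmp;
    }
    return 0;
}
```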
|
| |
Replace ckWARN_d{,2,3,4}() && Perl_warner() with it, which trades reduced code
size for 1 more function call if warnings are not enabled.
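
The trade-off can be sketched as follows. All names are hypothetical (toy_ckwarn, toy_ck_warner, and the global flag and counter); this is not the Perl API, only the shape of folding the check into the warning function:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global state standing in for the interpreter's warning flags. */
static int toy_warnings_enabled = 0;
static int toy_warn_count = 0;

/* Stand-in for the "are warnings enabled?" check. */
static int toy_ckwarn(void)
{
    return toy_warnings_enabled;
}

/* Combined check-and-warn. Call sites shrink from
 *     if (check()) warn(...);
 * to a single call, at the cost of always making that call, even when
 * warnings are off. */
static void toy_ck_warner(const char *fmt, ...)
{
    va_list ap;
    if (!toy_ckwarn())
        return;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    toy_warn_count++;
}
```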
|
| |
That now reads "Unicode non-character is illegal in interchange" and the
perldiag documentation is expanded a bit.
|
| |
|
| |
is_utf8_string(), is_utf8_string_loclen() as they don't need it
|
| |
|
| |
UTF8SKIP appears to be a rather slow call; use UTF8_IS_INVARIANT to
skip it whenever possible. We also move the malformed utf8 check
until after the loop, since it can be checked after the termination
condition, instead of at every pass through the loop.
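
A sketch of the two ideas above, under stated assumptions: TOY_IS_INVARIANT and toy_utf8skip are simplified stand-ins for UTF8_IS_INVARIANT and UTF8SKIP (ASCII platform, sequences of at most four bytes), and toy_utf8_length is a hypothetical character counter rather than the function being patched:

```c
#include <stddef.h>

/* Simplified stand-in for UTF8_IS_INVARIANT: true for bytes below 0x80. */
#define TOY_IS_INVARIANT(c) (((unsigned char)(c)) < 0x80)

/* Simplified stand-in for UTF8SKIP: sequence length implied by a start byte.
 * Continuation bytes are given length 1 in this sketch. */
static size_t toy_utf8skip(unsigned char c)
{
    if (c < 0xC0) return 1;
    if (c < 0xE0) return 2;
    if (c < 0xF0) return 3;
    return 4;
}

/* Count the characters in the n bytes at s. Returns -1 if the last
 * character claims more bytes than remain. */
static long toy_utf8_length(const unsigned char *s, size_t n)
{
    size_t i = 0;
    long len = 0;
    while (i < n) {
        if (TOY_IS_INVARIANT(s[i]))
            i++;                        /* fast path: no length lookup */
        else
            i += toy_utf8skip(s[i]);
        len++;
    }
    /* Overrun is checked once, after the loop, not on every pass. */
    if (i != n)
        return -1;
    return len;
}
```

The overrun test only needs to run once because the loop exits as soon as the index reaches or passes the end, so a truncated final character is the only way the two can differ.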
|
| |
and hence the 'create' argument is actually 'flags'. Fix code and documentation
that used TRUE or FALSE to use 0 or GV_ADD.
|
| |
From: karl williamson <public@khwilliamson.com>
Date: Tue, 16 Dec 2008 16:00:34 -0700
Message-ID: <49483312.80804@khwilliamson.com>
|
| |
Message-ID: <25940.1225611819@chthon>
Date: Sun, 02 Nov 2008 01:43:39 -0600
p4raw-id: //depot/perl@34698
|
| |
p4raw-id: //depot/perl@34653
|
| |
Those are already in embed.fnc, and most of them were already
outdated. This also fixes the docs for pv_escape and pv_pretty.
p4raw-id: //depot/perl@34642
|
| |
p4raw-id: //depot/perl@34624
|
| |
p4raw-id: //depot/perl@34585
|
| |
p4raw-id: //depot/perl@34416
|
| |
ability to create landmines that will explode under someone in the
future when they upgrade their compiler to one with better
optimisation. We've already done this at least twice.
(Yes, some of the assertions are after code that would already have
SEGVd because it already dereferences a pointer, but they are put in
to make it easier to automate checking that each and every case is
covered.)
Add a tool, checkARGS_ASSERT.pl, to check that every case is covered.
p4raw-id: //depot/perl@33291
|
| |
and mortalizing them. Use these macros where possible. And also
mX?PUSH[inpu] where possible.
p4raw-id: //depot/perl@32821
|
| |
sv_2mortal(newSVpvs(...)) constructions to use it.
p4raw-id: //depot/perl@32819
|
| |
the flags. Move its implementation just ahead of sv_2mortal()'s for
CPU cache locality. Refactor all code that can be to use this.
p4raw-id: //depot/perl@32818
|
| |
Related to [perl #36207] among others
Message-ID: <9b18b3110712170621h41de2c76k331971e3660abcb0@mail.gmail.com>
p4raw-id: //depot/perl@32628
|
| |
From: "Craig A. Berry" <craig.a.berry@gmail.com>
Message-ID: <c9ab31fc0710061147x3ee7f9bdg2b1bac3acd018bb2@mail.gmail.com>
Date: Sat, 6 Oct 2007 13:47:03 -0500
p4raw-id: //depot/perl@32058
|
| |
followed by SvGROW(size+1)
p4raw-id: //depot/perl@32045
|
| |
(the sprintf "%c" code will work correctly when the SV is UTF-8).
Audit all the rest for UTF-8 correctness, and force SvUTF8_off() in
utf8.c to ensure correctness. (The string is reset to "", so this will
not be a behaviour change.)
p4raw-id: //depot/perl@32040
|
| |
replacing them with constructions that are more efficient because they
avoid the overhead of the *printf format parser and interpreter code.
p4raw-id: //depot/perl@32034
|
| |
p4raw-id: //depot/perl@31455
|
| |
p4raw-id: //depot/perl@31252
|
| |
p4raw-id: //depot/perl@31203
|
| |
the local variable. Add an assertion that another cast is not a data
loss (and that there is no buffer overflow)
p4raw-id: //depot/perl@31069
|
| |
p4raw-id: //depot/perl@31055
|
| |
up until you test on a "real" architecture)
p4raw-id: //depot/perl@31023
|
| |
of flags, not a boolean, so correct the documentation and callers.
p4raw-id: //depot/perl@29977
|
| |
p4raw-id: //depot/perl@29696
|
| |
Subject: [PATCH] Cleanup SVf arguments (2nd try)
Message-ID: <20070101201613.4120d9ef@r2d2>
Introduce an SVfARG() macro for %SVf (%-p here) arguments to
perl's printf
p4raw-id: //depot/perl@29687
|