| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On a WinCE build. On the 2nd nmake run, using Makefile.ce, eventually calls
the Extensions target which calls make_ext.pl. What happens is nmake for CE
for each module is called on the Desktop per module makefile from the
earlier Desktop build. Since the Desktop Perl already was built
sucessfully, all rules/deps are met in the Desktop per module makefile, and
nothing happens during the module building phase for a CE build. Previously
I used external file management tools to delete the per module Makefiles
before running Makefile.ce.
*make_ext.pl
- implement deleting and rebuilding the per module makefile on a Cross
build
- use constants for constant folding, there are opportunities for other
variables to be converted to constants in the future
- fix a bug from commit baff067e71 where unlink() on a file with an open
handle ($mfh) didn't delete the file from disk and a new per module
makefile would be not be built by make_ext.pl later since the per module
makefile was still on disk. This was observed on Win32. Also harden the
unlink code with a new _unlink sub that is fatal if the file is still on
disk after unlink supposedly deleted it.
- var $header and the quotemeta is because of an issue in Perl #119793
*Makefile.ce
- bring the debugging symbol generation flags and optimization flags
to be closer to a Dekstop VC Perl build
- ICWD is obsolete as of commit f6b3c354c9 , remove it
- MINIMOD is obsolete as of commit 7b4d95f74b , remove it
- make a poisoned config.h so if there is a XS building mixup between
a desktop and CE perl, the poisoned config.h for CE will stop the build
gracefully
- $(MINIPERL) has never been defined in Makefile.ce from day 1 (10 years)
replace with $(HPERL) everywhere, this was causing things to not run
silently since $(MINIPERL) was empty string. Use HPERL instead of
MINIPERL to allow flexibility to use the full perl binary if necessery
one day
- better cleaning on root makefile clean target
*win32/win32.h
*win32/win32iop.h
- silence alot of redefinition warnings which gave pages of warnings on
each WinCE compliand
*mg.c
- win32_get_errno is only on WIN32 build not WINCE
a "nmake -f Makefile.ce all" will now build the CE interp and all modules
in 1 shot with no user intervention
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since perl previously assigned WSAExxx values to $! on Windows it is quite
possible that (Perl-level) user code may also manually make similar assignments,
which will now cause breakage if the value put in $! is subsequently compared to
Errno/POSIX constants because the latter are now the corresponding Exxx values
where possible.
An example of this is in Net::Ping::tcp_connect(), which does the following to
fetch a socket-level error code:
unpack("i", getsockopt($self->{"fh"}, SOL_SOCKET, SO_ERROR))
and assigns the result (a WSAExxx value such as 10061) to $! and then goes wrong
in the subsequent test (in ping_tcp()) for $! == ECONNREFUSED (which is now 107
rather than 10061 if perl is built with VC10 or higher).
To avoid this we now intercept assignment to $! and convert any WSAExxx values
to Exxx values first. This causes a minor oddity in that this:
perl -le "$! = 10061; print 0+$!"
will now output 107 (for VC10+ perls) but this is surely preferable to the
alternative breakage described above.
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the checks that avoided confusing $^P, ${^PREMATCH} and ${^POSTMATCH}
now that the latter two do not take that code path. Remove a similar
check for $^S added by commit 4ffa73a366885f68 (Feb 2003). (This commit did
not add any other variable starting with a control-S.) This eliminates all
uses of the variable remaining.
Move the goto target do_numbuf_fetch inside the checks for PL_curpm, as
both its comefroms have already made the same check.
|
|
|
|
|
| |
The previous commit's changes to Perl_gv_fetchpvn_flags() rendered this
code in Perl_magic_get() and Perl_magic_set() unreachable.
|
|
|
|
|
|
|
| |
Perl_gv_fetchpvn_flags() now stores the appropriate RX_BUFF_IDX_* constant
in mg_len for $` $' ${^MATCH} ${^PREMATCH} and ${^POSTMATCH}
This makes some code in mg.c unreachable and hence unnecessary; the next
commit will remove it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The match variables $1, $2 etc, along with many other special scalars, have
magic type PERL_MAGIC_sv, with the variable's name stored in mg_ptr. The
look up in mg.c involved calling atoi() on the string in mg_ptr to get the
capture buffer as an integer, which is passed to the regex API.
To avoid this repeated use of atoi() at runtime, change the storage in the
MAGIC structure for $1, $2 etc and $&. Set mg_ptr to NULL, and store the
capture buffer in mg_len. Other code which manipulates magic ignores mg_len
if mg_ptr is NULL, so this representation does not require changes outside
of the routines which set up, read and write these variables.
(Perl_gv_fetchpvn_flags(), Perl_magic_get() and Perl_magic_set())
|
|
|
|
|
| |
The value on which atoi() is called is actually always the buffer of an SV.
Hence we can use sv_2iv() instead.
|
|
|
|
|
|
|
| |
This is something I failed to change in commit ce0d59f. I don’t know
of a way to trigger this in pure-Perl code, hence the use of XS in
the test. It did show up in pure-Perl code due to a bug fixed by the
previous commit.
|
|
|
|
|
|
|
|
|
|
|
| |
pos($x) returns a special magical scalar that sets the match position
on $x. Calling pos($x) twice will provide two such scalars. If we
set one of them to a reference, set the other to undef, and then read
the first, all hail breaks loose, because of the use of SvOK_off.
SvOK_off is not sufficient if arbitrary values can be assigned by Perl
code. Globs, refs and regexps (among others) need special handling,
which sv_setsv knows how to do.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These arrays are very similar to tied arrays, in that the elements are
created on the fly when looked up. So push @_, \$+[0], \$+[0], will
push references to two different scalars on to @_.
That they are created on the fly prevents this bug from showing up
in most code: If you reference the element you can observe that, on
FETCH, it gets set to the corresponding offset *if* the last match has
a set of capturing parentheses with the right number. Otherwise, the
value in the element is left as-is.
So, doing another pattern match with, say, 5 captures and then another
with fewer will leave $+[5] and $-[5] holding values from the first
match, if there is a FETCH in between the two matches:
$ perl -le '" "=~/()()()()(..)/; $_ = \$+[5]; print $$_; ""=~ /()/; print $$_;'
2
2
And attempts at assignment will succeed, even though they croak:
$ perl -le 'for ($-[0]) { eval { $_ = *foo }; print $_ }'
*main::foo
The solution here is to make the magic ‘get’ handler set the SV
no matter what, instead of just setting it when it refers to a
valid offset.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The value of pos() is stored as a byte offset. If it is stored on a
tied variable or a reference (or glob), then the stringification could
change, resulting in pos() now pointing to a different character off-
set or pointing to the middle of a character:
$ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, a; print pos $x'
2
$ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, "\x{1000}"; print pos $x'
Malformed UTF-8 character (unexpected end of string) in match position at -e line 1.
0
So pos() should be stored as a character offset.
The regular expression engine expects byte offsets always, so allow it
to store bytes when possible (a pure non-magical string) but use char-
acters otherwise.
This does result in more complexity than I should like, but the alter-
native (always storing a character offset) would slow down regular
expressions, which is a big no-no.
|
|
|
|
|
| |
lsv can never be null here. This null check has been here since
vec’s get-magic was added in ae389c8a or 6ff81951f7.
|
|
|
|
|
|
|
|
|
| |
If the array has been freed and a reference is then assigned to
the arylen scalar and then get-magic is called on that scalar,
Perl_magic_getarylen misbehaves. SvOK_off is not sufficient if
arbitrary values can be assigned by Perl code. Globs, refs and
regexps (among others) need special handling, which sv_setsv
knows how to do.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a nonexistent array element is passed to a subroutine, a special
‘deferred element’ scalar (implemented using something called defelem
magic) is passed to the subroutine instead, which delegates to the
array element. This allows some_benign_function($array[$nonexistent])
to avoid autovivifying unnecessarily.
Whether this magic would be triggered was based on whether the element
was within the range 0..$#array. Since arrays can contain nonexistent
elements before $#array, this logic is incorrect. It also makes sense
to allow $array[$neg] where the negative number points before the
beginning of the array to create a deferred element and only croak if
it is assigned to.
This commit fixes the logic for when deferred elements are created
and implements these deferred negative elements.
Since we have to be able to store negative values in xlv_targoff, it
is convenient to make it a union (with two types--signed and unsigned)
and use LvSTARGOFF for defelem array indices.
|
|
|
|
|
|
|
| |
This was caused by 3805b5fb04. This commit restores the !=0 that
was there before 2fd13eccf0.
Thanks to Steve Hay for helping to track down the smoke failures.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PL_hints stores the hints at compile time that get copied into the
cop_hints field of each COP (in newSTATEOP).
Since perl-5.8.0-8053-gd5ec298, COPs have stored all the hints.
Before that, COPs used to store only some of the hints. The hints
were copied here and there into PL_compiling, a static COP-shaped buf-
fer used during compilation, so that things like constant folding
would see the correct hints. a0ed51b3 back in 1998 did that.
Now that COPs can store all the hints, we can just use
PL_compiling.cop_hints to avoid having to copy them from PL_hints from
time to time.
This simplifies the code and avoids creating bugs like those that
a547fd219 and 1c75beb82 fixed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
$ ./perl -le 'BEGIN { print $^H; ${^OPEN} = "a\0b"; print $^H}'
256
917760
So changing $^H back should change the value of ${^OPEN} back to undef, right?
$ ./perl -le 'BEGIN { ${^OPEN} = "a\0b"; $^H=256; print ${^OPEN}//"undef"}'
ab
$ ./perl -le 'BEGIN { ${^OPEN} = "a\0b"; $^H=256;}BEGIN{ print ${^OPEN}//"undef"}'
undef
Apparently you have to hop from one BEGIN block to another to see
the changes.
This happens because compile-time hints are stored in PL_hints (which
$^H sets) but ${^OPEN} looks in PL_compiling.cop_hints. Setting
${^OPEN} sets both. The contents of PL_hints are assigned to
PL_compiling.cop_hints at certain points (the start of a BEGIN block
sees the right value because newSTATEOP sets it), but the two are not
always kept in synch.
The smallest fix here is to have $^H set PL_compiling.cop_hints as
well as PL_hints, but the ultimate fix--to come later--is to merge the
two and stop storing hints in two different places.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It no longer needs to worry about SvIsCOW. This logic is left over
from when READONLY+FAKE was used for COWs.
Since it is possible for COWs to be read-only now, this logic is actu-
ally faulty, as it doesn’t temporarily stop read-only COWs from being
read-only, as it does for other read-only values.
This actually causes discrepancies with scalar-tied locked hash keys,
which differ in readonliness when localised depending on whether the previous value used copy-on-write.
Whether such scalars should be read-only after localisation is open
to debate, but it should not differ based on the means of storing the
previous value.
|
|
|
|
|
|
|
|
|
|
|
|
| |
It no longer needs to worry about SvIsCOW. This logic is left over
from when READONLY+FAKE was used for COWs.
Since it is possible for COWs to be read-only now, this logic is actu-
ally faulty, as it doesn’t temporarily stop read-only COWs from being
read-only, as it does for other read-only values.
This actually causes bugs with scalar-tied locked hash keys, which
croak on FETCH.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under ithreads, constants and GVs are stored in the pad.
When names are looked up up in a pad, the search begins at the end and
works its way toward the beginning, so that an $x declared later masks
one declared earlier.
If there are many constants at the end of the pad, which can happen
for generated code such as lib/unicore/TestProp.pl (which has about
100,000 lines and over 500,000 pad entries for constants at the
end of the file scope’s pad), it can take a long time to search
through them all.
Before commit 325e1816, constants used &PL_sv_undef ‘names’. Since
that is the default value for array elements (when viewed directly
through AvARRAY, rather than av_fetch), the pad allocation code did
not even bother storing the ‘name’ for these. So the name pad (aka
padnamelist) was not extended, leaving just 10 entries or so in the
case of lib/unicore/TestProp.pl.
Commit 325e1816 make pad constants have &PL_sv_no names, so the
name pad would be implicitly extended as a result of storing
&PL_sv_no, causing a huge slowdown in t/re/uniprops.t (which runs
lib/unicore/TestProp.pl) under threaded builds.
Now, normally the name pad *does* get extended to match the pad,
in pad_tidy, but that is skipped for string eval (and required
file scope, of course). Hence, wrapping the contents of
lib/unicore/TestProp.pl in a sub or adding ‘my $x’ to end of it will
cause the same slowdown before 325e1816.
lib/unicore/TestProp.pl just happened to be written (ok, generated) in
such a way that it ended up with a small name pad.
This commit fixes things to make them as fast as before by recording
the index of the last named variable in the pad. Anything following
that is disregarded in pad lookup and search begins with the last
named variable. (This actually does make things faster before for
subs with many trailing constants in the pad.)
This is not a complete fix. Adding ‘my $x’ to the end of a large file
like lib/unicore/TestProp.pl will make it just as slow again.
Ultimately we need another algorithm, such as a binary search.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is part of #116907, too. It also fixes #72924 as a side effect;
the next commit will explain.
The value of pos($foo) was being stored as an I32, not allowing values
above I32_MAX. Change it to SSize_t (the signed equivalent of size_t,
representing the maximum string length the OS/compiler supports).
This is accomplished by changing the size of the entry in the magic
struct, which is the simplest fix.
Other parts of the code base can benefit from this, too.
We actually cast the pos value to STRLEN (size_t) when reading
it, to allow *very* long strings. Only the value -1 is special,
meaning there is no pos. So the maximum supported offset is
2**sizeof(size_t)-2.
The regexp engine itself still cannot handle large strings, so being
able to set pos to large values is useless right now. This is but one
piece in a larger puzzle.
Changing the size of mg->mg_len also requires that
Perl_hv_placeholders_p change its type. This function
should in fact not be in the API, since it exists
solely to implement the HvPLACEHOLDERS macro. See
<https://rt.perl.org/rt3/Ticket/Display.html?id=116907#txn-1237043>.
|
|
|
|
|
|
|
|
|
| |
When elements of @_ refer to nonexistent hash or array elements, then
the magic scalar in $_[0] delegates all set/get actions to the element
in represents, vivifying it if needed.
tie/tied/untie, however, were not delegating to the element, but were
tying the the magical ‘deferred element’ scalar itself.
|
|
|
|
|
|
|
|
|
| |
When elements of @_ refer to nonexistent hash or array elements, then
the magic scalar in $_[0] delegates all set/get actions to the element
in represents, vivifying it if needed.
pos($_[0]), however, was not delegating the value to the element, but
storing it on the magical ‘deferred element’ scalar.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Assigning a vstring to a tied variable would result in a plain string
in $_[1] in STORE.
Assigning a vstring to a magic deferred element would result in a
plain string in the aggregate’s actual element.
When magic is invoked, the magic flags are temporarily turned off on
the sv so that recursive calls to magic don’t happen. This makes it
easier to implement functions like Perl_magic_set to read the value of
the sv without triggering get-magic.
Since vstrings are only considered vstrings when they are SvRMAGICAL,
this meant that set-magic would turn vstrings temporarily into plain
strings. Subsequent copying (e.g., in STORE) would then fail to copy
the vstring magic.
This commit changes mg_set to leave the rmagical flag on, since it
does not affect the functionaiity of set-magic.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch sets the utf8 flag on $! if the error string passes utf8
validity tests and has some bytes with the upper bit set. (If none
have that bit set, is an ASCII string, and whether or not it is UTF-8 is
irrelevant.) This is a heuristic that could fail, but as the reference
in the comments points out this is unlikely.
One can reasonably assume that a UTF-8 locale will return a UTF-8
result. So another approach would be to look at that (but we wouldn't
want to turn the flag on for a purely ASCII string anyway, as that could
change the semantics from existing behavior by making the string follow
Unicode rules, whereas it didn't necessarily before.) To do this, we
could keep track of the utf8ness of the LC_MESSAGES locale. But until
the heuristic in this patch is shown to not be good enough, I don't see
the need to do this extra work.
|
|
|
|
|
|
| |
Perl_magic_methcall is not public API, so there is no
need to add another function and we can just change
function's arguments.
|
|
|
|
|
|
|
|
|
| |
Can be used when it's known that method name has no
package part - just method name.
With flag set SV with precomputed hash value is used
and pp_method_named is called instead of pp_method.
Method lookup is faster.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code dealt rather inconsistently with uids and gids. Some
places assumed that they could be safely stored in UVs, others
in IVs, others in ints; All of them should've been using the
macros from config.h instead. Similarly, code that created
SVs or pushed values into the stack was also making incorrect
assumptions -- As a point of reference, only pp_stat did the
right thing:
#if Uid_t_size > IVSIZE
mPUSHn(PL_statcache.st_uid);
#else
# if Uid_t_sign <= 0
mPUSHi(PL_statcache.st_uid);
# else
mPUSHu(PL_statcache.st_uid);
# endif
#endif
The other places were potential bugs, and some were even causing
warnings in some unusual OSs, like haiku or qnx.
This commit ammends the situation by introducing four new macros,
SvUID(), sv_setuid(), SvGID(), and sv_setgid(), and using them
where needed.
|
|
|
|
|
| |
Using SvREFCNT_dec_NN in a couple of places eliminates needless
null checks.
|
|
|
|
|
| |
This causes each deprecated function to have a prominent note to that
effect in its API documentation.
|
|
|
|
| |
I found re-formatting this multi-line 'if' to be easier to understand
|
|
|
|
|
| |
The are lots of places where local vars aren't used when compiled
with NO_TAINT_SUPPORT.
|
|
|
|
|
| |
pod/perldelta.pod: document reversion of ENV{foo} = undef behaviour in delta
t/op/magic.t: add a test for ENV{foo} = undef
|
|
|
|
|
|
|
|
|
|
|
| |
SvGROW unconditionally derefs SvANY to check SvLEN. A crash occurs if the
sv is SVt_NULL. Also mg_get uses SvMAGIC which also has the same problem.
Rather than having people finding these properties out by trial and error,
document them. There is no sense in adding type checks since these 2 calls
have been had sv type restrictions since probably day 1 and for
performance reason. Anyone who hit the restrictions would have either
fixed their code immediately, or abandoned using XS. I observed someone
abandoning XS in the field over these undocumented restrictions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove a large amount of machine code (~4KB for me) from funcs that use
ERRSV making Perl faster and smaller by preventing multiple evaluation.
ERRSV is a macro that contains GvSVn which eventually conditionally calls
Perl_gv_add_by_type. If a SvTRUE or any other multiple evaluation macro
is used on ERRSV, the expansion will, in asm have dozens of calls to
Perl_gv_add_by_type one for each test/deref of the SV in SvTRUE. A less
severe problem exists when multiple funcs (sv_set*) in a row call, each
with ERRSV as an arg. Its recalculated then, Perl_gv_add_by_type and all.
I think ERRSV macro got the func call in commit f5fa9033b8, Perl RT #70862.
Prior to that commit it would be pure derefs I think. Saving the SV* is
still better than looking into interp->gv->gp to get the SV * after each
func call.
I received no responses to
http://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195724.html
explaining when the SV is replaced in PL_errgv, so took a conservative
view and assumed callbacks (with Perl stack/ENTER/LEAVE/eval_*/call_*)
can change it. I also assume ERRSV will never return null, this allows a
more efficiently version of SvTRUE to be used.
In Perl_newATTRSUB_flags a wasteful copy to C stack operation with the
string was removed, and a croak_notcontext to remove push instructions to
the stack. I was not sure about the interaction between ERRSV and message
sv, I didn't change it to a more efficient (instruction wise, speed, idk)
format string combining of the not safe string and ERRSV in the croak call.
If such an optimization is done, a compiler potentially will put the not
safe string on the first, unconditionally, then check PL_in_eval, and
then jump to the croak call site, or eval ERRSV, push the SV on the C stack
then push the format string "%"SVf"%s". The C stack allocated const char
array came from commit e1ec3a884f .
In Perl_eval_pv, croak_on_error was checked first to not eval ERRSV unless
necessery. I was not sure about the side effects of using a more efficient
croak_sv instead of Perl_croak (null chars, utf8, etc) so I left a comment.
nocontext used to save an push instruction on implicit sys perl.
In S_doeval, don't open a new block to avoid large whitespace changes.
The NULL assignment should optimize away unless accidental usage of errsv
in the future happens through a code change. There might be a bug here from
commit ecad31f018 since previous a char * was derefed to check for null
char, but ERRSV will never be null, so "Unknown error\n" branch will never
be taken.
For pp_sys.c, in pp_die a new block was opened to not eval ERRSV if
"well-formed exception supplied". The else if else if else blocks all used
ERRSV, so a "SV * errsv = NULL;" and a eval in the conditional with comma
op thing wouldn't work (maybe it would, see toke.c comments later in this
message). pp_warn, I have no comments.
In S_compile_runtime_code, a croak_sv question comes up same as in
Perl_eval_pv.
In S_new_constant, a eval in the conditional is done to avoid evaling
ERRSV if PL_in_eval short circuits. Same thing in Perl_yyerror_pvn.
Perl__core_swash_init I have no comments.
In the future, a SvEMPTYSTRING macro should be considered (not fully
thought out by me) to replace the SvTRUEs with something smaller and
faster when dealing with ERRSV. _nomg is another thing to think about.
In S_init_main_stash there is an opportunity to prevent an extra ERRSV
between "sv_grow(ERRSV, 240);" and "CLEAR_ERRSV();" that was too complicated
for me to optimize.
before perl517.dll
.text 0xc2f77
.rdata 0x212dc
.data 0x3948
after perl517.dll
.text 0xc20d7
.rdata 0x212dc
.data 0x3948
Numbers are from VC 2003 x86 32 bit.
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the context/pTHX from Perl_croak_no_modify and Perl_croak_xs_usage.
For croak_no_modify, it now has no parameters (and always has been
no return), and on some compilers will now be optimized to a conditional
jump. For Perl_croak_xs_usage one push asm opcode is removed at the caller.
For both funcs, their footprint in their callers (which probably are hot
code) is smaller, which means a tiny bit more room in the cache. My text
section went from 0xC1A2F to 0xC198F after apply this. Also see
http://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195233.html .
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By defining NO_TAINT_SUPPORT, all the various checks that perl does for
tainting become no-ops. It's not an entirely complete change: it doesn't
attempt to remove the taint-related interpreter variables, but instead
virtually eliminates access to it.
Why, you ask? Because it appears to speed up perl's run-time
significantly by avoiding various "are we running under taint" checks
and the like.
This change is not in a state to go into blead yet. The actual way I
implemented it might raise some (valid) objections. Basically, I
replaced all uses of the global taint variables (but not PL_taint_warn!)
with an extra layer of get/set macros (TAINT_get/TAINTING_get).
Furthermore, the change is not complete:
- PL_taint_warn would likely deserve the same treatment.
- Obviously, tests fail. We have tests for -t/-T
- Right now, I added a Perl warn() on startup when -t/-T are detected
but the perl was not compiled support it. It might be argued that it
should be silently ignored! Needs some thinking.
- Code quality concerns - needs review.
- Configure support required.
- Needs thinking: How does this tie in with CPAN XS modules that use
PL_taint and friends? It's easy to backport the new macros via PPPort,
but that doesn't magically change all code out there. Might be
harmless, though, because whenever you're running under
NO_TAINT_SUPPORT, any check of PL_taint/etc is going to come up false.
Thus, the only CPAN code that SHOULD be adversely affected is code
that changes taint state.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was added to make SvREADONLY_off safe. (I think read-only is
turned off during magic so the magic scalar itself can be set without
the sv_set* functions getting upset.) Since SvREADONLY doesn’t mean
read-only for COWs, we don’t actually need to do sv_force_normal, but
can simply skip SvREADONLY_off for COWs.
By leaving it to sv_set* functions to do sv_force_normal, we avoid
having to copy the string buffer if it is just going to be thrown away
anyway. S_save_magic can’t know whether the scalar will actually be
overwritten, so it has to copy the buffer.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Assigning to a substr lvalue scalar was invoking overload too
many times if the target was a UTF8 string and the assigned sub-
string was not.
Since sv_insert_flags itself stringifies the scalar, the easiest
way to fix this is to force the target to a PV before doing any-
thing to it.
|
|
|
|
|
| |
This was added in commit 9bf12eaf4 to fix a crash, but it is not
necessary any more, due to changes elsewhere.
|
|
|
|
|
|
|
|
|
| |
It is not possible to know how to interpret the returned length
without accessing the UTF8 flag, which is not reliable until
the SV has been stringified, which requires get-magic. So length
magic has not made senses since utf8 support was added. I have
removed all uses of length magic from the core, so this is now
dead code.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By checking it before, it can end up treating a UTF8 string as bytes
when calculating offsets if the UTF8 flag is not turned on until
the target is stringified. This can happen with overloading and
typeglobs.
This is a regression from 5.14. 5.14 itself was buggy, too, but one
would have to modify the target after creating the substr lvalue but
before assigning to it; and that because of another bug fixed by
83f78d1a27, which was cancelling out this one.
package o {
use overload '""' => sub { $_[0][0] }
}
my $refee = bless ["\x{100}a"], o::;
my $substr = \substr $refee, -2;
$$substr = "b";
warn $refee;
That prints:
Wide character in warn at - line 7.
Āb at - line 7.
In 5.14 it prints:
b at - line 7.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This touches on issues raised in tickets #114410 and #114690.
If the UTF8ness of an overloaded string changes with each call, it
will make magic_setpos panic if it tries to stringify the SV multiple
times. We have to avoid any sv-specific utf8 length functions when
it comes to overloading. And we should do the same thing for gmagic,
too, to avoid creating caches that will shortly be invalidated.
The test class is very closely based on code written by Nicholas Clark
in a response to #114410.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mg_length returns the number of bytes if a scalar has length magic,
but the number of characters otherwise.
sv_len_utf8 used to assume that mg_length would return bytes. The
first mistake was added in commit b76347f2eb, which assumed that
mg_length would return characters. But it was #ifdeffed out until
commit ffc61ed20e.
Later, commit 5636d518683 met sv_len_utf8’s assumptions by making
mg_length return the length in characters, without accounting for
sv_len, which also used mg_length.
So we ended up with a buggy sv_len that would return a character
count for scalars with get- but not length-magic, and a byte count
otherwise.
In the previous commit, I fixed sv_len not to use mg_length at all. I
plan shortly to remove any use of mg_length (the one remaining use is
in sv_len_utf8, which is currently not called on magical values).
The reason for removing all calls to mg_length is that the returned
length cannot be converted to characters without access to the PV as
well, which requires get-magic. So length magic on scalars makes no
sense since the advent of utf8.
This commit restore mg_length to its old behaviour and lists it as
deprecated. This is mostly cosmetic, as there are no CPAN users. But
it is in the API, and I don’t know whether we can easily remove it.
|
| |
|
|
|
|
|
|
|
| |
This was brought up in ticket #96672.
This variable gives access to the last-read filehandle that Perl uses
when it appends ", <STDIN> line 1" to a warning or error message.
|