| Commit message | Author | Age | Files | Lines |
|
|
|
| |
This was breaking some builds
|
|
|
|
|
|
| |
Under locale rules, this commit quotes all non-ASCII Latin1 characters
in UTF-8 encoded strings. This provides consistency with this function
and other functions, such as lc().
|
| |
| |
As described in the pod changes in this commit, this changes quotemeta()
to consistently quote non-ASCII characters when used under
unicode_strings. The behavior is changed for these and UTF-8 encoded
strings to more closely align with Unicode's recommendations.
The end result is that we *could* at some future point start using other
characters as metacharacters than the 12 we do now.
|
|
|
|
|
|
| |
Changing the macro to a differently-named equivalent stresses that only
ASCII characters may escape from being quoted. That is, all non-ASCII
are quoted.
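The rule described above, that only ASCII characters may escape quoting, can be sketched as follows. This is a minimal illustration, not the code in pp.c: it assumes a string with one byte per character (Latin1), and the names is_ascii_word_char and quote_non_word_bytes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a perl core function: true only for ASCII word
 * characters, the only bytes allowed to escape quoting in this sketch. */
static int is_ascii_word_char(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9') || c == '_';
}

/* Writes a backslash-quoted copy of the len bytes at in into out, which
 * must have room for 2 * len + 1 bytes. Every byte that is not an ASCII
 * word character, including every non-ASCII byte, gets a backslash.
 * Returns the number of bytes written, not counting the trailing NUL. */
static size_t quote_non_word_bytes(const unsigned char *in, size_t len,
                                   unsigned char *out)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (!is_ascii_word_char(in[i]))
            out[n++] = '\\';
        out[n++] = in[i];
    }
    out[n] = '\0';
    return n;
}
```

For a UTF-8 encoded string the backslash would go before each character rather than each byte; the sketch leaves that case out.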
|
| |
|
|
|
|
|
| |
fc() brought to life its own version of #39028. fc(""), like
lc("") and friends, shouldn't taint the result.
|
|
|
|
|
| |
The max expansion when a Latin1 character is folded and converted to
UTF-8 is '2' bytes per input byte, not the more general case.
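The factor of 2 can be seen in the conversion step alone. The sketch below is an illustration rather than the core's code, and the function name latin1_to_utf8 is hypothetical. It covers only the Latin1-to-UTF-8 conversion, not the folding itself.

```c
#include <assert.h>
#include <stddef.h>

/* Converts len Latin1 bytes to UTF-8. out must have room for 2 * len
 * bytes: code points below 0x80 take one byte, and 0x80..0xFF take
 * exactly two. Returns the number of bytes written. */
static size_t latin1_to_utf8(const unsigned char *in, size_t len,
                             unsigned char *out)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            out[n++] = c;                                /* ASCII: 1 byte */
        } else {
            out[n++] = (unsigned char)(0xC0 | (c >> 6)); /* lead byte */
            out[n++] = (unsigned char)(0x80 | (c & 0x3F)); /* continuation */
        }
    }
    return n;
}
```

A caller sizing its buffer as 2 * len therefore never overflows, whatever the input bytes are.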
|
| |
Along with the simple_casefolding and full_casefolding features.
fc() stands for foldcase, a sort of pseudo case (like lowercase),
which is used to implement Unicode casefolding. It maps a string
to a form where all case differences are erased, so it's a
locale-independent way of checking if two strings are the same,
regardless of case.
This functionality was, and still is, available through the
regular expression engine -- /i matches would use casefolding
internally. The fc keyword merely exposes this for easier access.
Previously, one could attempt to case-insensitively test two strings
for equality by doing
lc($a) eq lc($b)
But that might get you wrong results, for example in the case of
\x{DF}, LATIN SMALL LETTER SHARP S.
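Why lowercasing is not enough for \x{DF} can be shown with a toy model in C. This is a sketch under simplifying assumptions, not how fc() is implemented: it works on Latin1 bytes only, it leaves the micro sign alone (its real fold lies outside Latin1), and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy lowercase for Latin1 bytes: ASCII A-Z and 0xC0..0xDE (except the
 * multiplication sign 0xD7) map to the byte 0x20 higher. */
static unsigned char toy_lower(unsigned char c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= 0xC0 && c <= 0xDE && c != 0xD7))
        return (unsigned char)(c + 0x20);
    return c;
}

/* Toy full casefold: like toy_lower, except that 0xDF (sharp s) expands
 * to "ss". out needs room for 2 * len + 1 bytes. Returns the number of
 * bytes written, not counting the trailing NUL. */
static size_t toy_fold(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (in[i] == 0xDF) {
            out[n++] = 's';
            out[n++] = 's';
        } else {
            out[n++] = toy_lower(in[i]);
        }
    }
    out[n] = '\0';
    return n;
}

/* Case-insensitive equality in the style of lc($a) eq lc($b). */
static int equal_by_lower(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    if (la != lb)
        return 0;
    for (size_t i = 0; i < la; i++)
        if (toy_lower((unsigned char)a[i]) != toy_lower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Case-insensitive equality in the style of fc($a) eq fc($b).
 * Inputs are limited to 63 bytes in this sketch. */
static int equal_by_fold(const char *a, const char *b)
{
    unsigned char fa[128], fb[128];
    size_t la = strlen(a), lb = strlen(b);
    assert(la < 64 && lb < 64);
    size_t na = toy_fold((const unsigned char *)a, la, fa);
    size_t nb = toy_fold((const unsigned char *)b, lb, fb);
    return na == nb && memcmp(fa, fb, na) == 0;
}
```

With these helpers, "stra\xDF" "e" and "STRASSE" compare unequal by lowercasing, because the sharp s is already lowercase, but compare equal by folding.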
|
| |
If a read-only scalar is passed to one of | & ^ and it decides to do
a numeric operation, the numeric flags on the read-only scalar are
turned off afterwards if they were not on to begin with.
This was introduced in commit b20c4ee1f, which did so to stop $x | "0"
from coercing the rhs and making it behave differently the second
time through.
What that commit did not take into account was that the read-only
flag is set on cow scalars, and the same pp function is used for the
assignment forms. So it was turning off the numeric flags after
$cow |= 1, leaving $cow undef.
I made this numeric flag-twiddling apply only to read-only scalars
(supposedly), because that seemed the most conservative and acceptable
change. I am actually in favour of extending it to all scalars, to
make these operators less surprising. For that reason, this commit
preserves the current behaviour with cows in the non-assignment case:
they don’t get coerced into numbers. Changing them to work the same
way as non-cow writable scalars would make things more consistent, but
more consistently buggy. I would like to make this non-coercion apply
to all scalars in 5.18.
This commit simply skips the flag-twiddling on the lhs in the assign-
ment case.
|
| |
The convention is that when the interpreter dies with an internal error, the
message starts "panic: ". Historically, many panic messages had been terse
fixed strings, which means that the out-of-range values that triggered the
panic are lost. Now we try to report these values, as such panics may not be
repeatable, and the original error message may be the only diagnostic we get
when we try to find the cause.
We can't report diagnostics when the panic message is generated by something
other than croak(), as we don't have *printf-style format strings. Don't
attempt to report values in panics related to *printf buffer overflows, as
attempting to format the values to strings may repeat or compound the
original error.
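The difference between a terse fixed string and a panic that records its values might look like this. The sketch is hypothetical: format_panic and its message wording are not taken from the core, which reports such errors through croak().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical formatter: builds a "panic: " message that records the
 * out-of-range value and the limit it violated, rather than a fixed
 * string. Returns what snprintf returns, that is, the length the full
 * message needs. */
static int format_panic(char *buf, size_t size, const char *what,
                        long value, long limit)
{
    assert(buf != NULL && what != NULL && size > 0);
    return snprintf(buf, size, "panic: %s %ld out of range (limit %ld)",
                    what, value, limit);
}
```

Calling format_panic(buf, sizeof buf, "offset", 12, 8) yields "panic: offset 12 out of range (limit 8)", which keeps the triggering values even if the failure cannot be reproduced.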
|
| |
> > Actually, the simplest solution seem to be to put the av or hv on
> > the mortals stack in pp_aassign and pp_undef, rather than in
> > [ah]v_undef/clear.
>
> This makes me nervous. The tmps stack is typically cleared only on
> statement boundaries, so we run the risks of
>
> * user-visible delaying of freeing elements;
> * large tmps stack growth might be possible with
> certain types of loop that repeatedly assign to an array without
> freeing tmps (eg map? I think I fixed most map/grep tmps leakage
> a
> while back, but there may still be some edge cases).
>
> Surely an ENTER/SAVEFREESV/LEAVE inside pp_aassign is just as
> efficient,
> without any attendant risks?
>
> Also, although pp_aassign and pp_undef are now fixed, the
> [ah]v_undef/clear functions aren't, and they're part of the public API
> that can be called independently of pp_aassign etc. Ideally they
> should
> be fixed (so they don't crash in mid-loop), and their documentation
> updated to point out that on return, their AV/HV arg may have been
> freed.
This commit takes care of the first part; it changes pp_aassign to use
ENTER/SAVEFREESV/LEAVE and adds the same to h_freeentries (called both
by hv_undef and hv_clear), av_undef and av_clear.
It effectively reverts the C code part of 9f71cfe6ef2.
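The idea of holding a temporary reference for the duration of the clearing can be modelled in plain C. This is a toy sketch, not the core code: toy_array and its functions are hypothetical, and the increment and decrement merely stand in for what ENTER/SAVEFREESV and LEAVE achieve.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_array {
    int refcnt;               /* the array counts as freed at zero */
    int freed;                /* set instead of calling free(), for inspection */
    size_t fill;              /* number of live elements */
    int outside_ref_dropped;  /* state for the example destructor below */
    void (*on_element_free)(struct toy_array *); /* stand-in for a destructor */
} toy_array;

static void toy_dec(toy_array *a)
{
    assert(a->refcnt > 0);
    if (--a->refcnt == 0)
        a->freed = 1;
}

/* Clears the array while holding an extra reference, so that a destructor
 * dropping the last outside reference cannot free it in mid-loop.
 * Returns 1 if the array has been freed by the time of return. */
static int toy_clear(toy_array *a)
{
    assert(a != NULL && !a->freed);
    a->refcnt++;                    /* stands in for ENTER + SAVEFREESV */
    while (a->fill > 0) {
        a->fill--;
        if (a->on_element_free)
            a->on_element_free(a);
        assert(!a->freed);          /* the extra reference keeps it alive */
    }
    toy_dec(a);                     /* stands in for LEAVE */
    return a->freed;
}

/* Example destructor: the first element freed drops the only outside
 * reference to the array. */
static void drop_outside_ref(toy_array *a)
{
    if (!a->outside_ref_dropped) {
        a->outside_ref_dropped = 1;
        toy_dec(a);
    }
}
```

The return value reflects the point made in the quoted review: on return, the aggregate passed in may already have been freed, so callers must not touch it without their own reference.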
|
| |
In pp_undef and pp_aassign, we should put the av or hv that is being
cleared on the mortals stack (with an increased refcount), so that
destructors fired during the clearing do not free the av or hv.
I was going to put this in av_undef, etc., but pp_aassign also needs
to access the aggregate after clearing it. We still get a crash with
that approach.
Putting the aggregate on the mortals stack in av_undef, av_clear and
h_freeentries would work, too, but might cause the aggregate to leak
too far. That may cause problems, e.g., if it is %^H, because it may
last until the end of the current compilation unit.
Directly inside a runloop (in a pp function), it should be OK to use
the mortals stack, as it *will* be cleared ‘soon’. This seems the
least intrusive approach.
|
|
|
|
|
| |
In two instances, I actually modified the code to avoid %s for a
constant string, as it should be faster that way.
|
| |
| |
The lvalue context that the last statement of an lvalue subroutine
provides, when applied to entersub, causes the ops below the entersub
to be compiled oddly. Compare regular subs and lvalue subs:
$ ./perl -Ilib -MO=Concise,bar,foo -e 'sub bar { &$x } sub foo:lvalue { &$x }'
main::bar:
5 <1> leavesub[1 ref] K/REFC,1 ->(end)
- <@> lineseq KP ->5
1 <;> nextstate(main 1 -e:1) v ->2
4 <1> entersub[t2] K/TARG ->5
- <1> ex-list K ->4
2 <0> pushmark s ->3
- <1> ex-rv2cv vK ->-
- <1> ex-rv2sv sK/1 ->-
3 <#> gvsv[*x] s ->4
main::foo:
b <1> leavesublv[1 ref] K/REFC,1 ->(end)
- <@> lineseq KP ->b
6 <;> nextstate(main 2 -e:1) v ->7
a <1> entersub[t2] K/LVINTRO,TARG,INARGS ->b
- <1> ex-list K ->a
7 <0> pushmark s ->8
9 <1> rv2cv vK/NO() ->a
- <1> ex-rv2sv sK/1 ->9
8 <#> gvsv[*x] s ->9
-e syntax OK
Notice that, in the second case, the rv2cv is not being optimised
away. Under strict mode, this allows a sub call on a string, since
rv2cv is not subject to strict refs.
It’s this code in op.c:op_lvalue_flags that is to blame:
if (kid->op_type != OP_GV) {
/* Restore RV2CV to check lvalueness */
restore_2cv:
if (kid->op_next && kid->op_next != kid) { /* Happens? */
okid->op_next = kid->op_next;
kid->op_next = okid;
}
else
okid->op_next = NULL;
okid->op_type = OP_RV2CV;
okid->op_targ = 0;
okid->op_ppaddr = PL_ppaddr[OP_RV2CV];
okid->op_private |= OPpLVAL_INTRO;
okid->op_private &= ~1;
break;
}
This code is a little strange. Using rv2cv to check lvalueness causes
the problem with strict refs. The lvalue check could just as well go
in entersub.
The way this is currently written (and this is something I missed when
supposedly fixing lvalue subs), the rv2cv op will reject a non-lvalue
subroutine even when the caller is not called in lvalue context.
So we actually have two bugs.
Presumably the check was done in rv2cv to keep entersub fast. But the
code I quoted above is only part of it. There is also a special block
to create an rv2cv op anew to deal with method calls.
This commit fixes both issues by moving the run-time lvalueness check
to entersub. I put it after PUSHSUB for speed in the most common
case (when there is no error). PUSHSUB already calls a function
(was_lvalue_sub) to determine whether the current sub call is happen-
ing in lvalue context. So the check I am adding after it only has to
check a couple of flags, instead of calling was_lvalue_sub itself.
This also fixes a bug I introduced earlier in the 5.15.x series. This
is supposed to die (in fact, I made the mistake earlier of changing
tests that were checking for this, but so many tests were wrong back
then it was an easy mistake to make):
$ ./perl -Ilib -e 'sub bar {$x} sub foo:lvalue { bar}; foo=3'
And a fourth bug I discovered when writing tests:
sub AUTOLOAD :lvalue { warn autoloading; $x }
sub _102486 { warn "called" }
&{'_102486'} = 72;
warn $x
__END__
autoloading at - line 1.
72 at - line 4.
And it happens even if there is an lvalue sub defined under that name:
sub AUTOLOAD :lvalue { warn autoloading; $x }
sub _102486 :lvalue { warn "called" }
&{'_102486'} = 72;
warn $x
__END__
autoloading at - line 1.
72 at - line 4.
Since the sub cannot be seen at compile time, the lvalue check happens
in rv2cv, as mentioned above. The autoloading is happening in rv2cv,
too, instead of entersub (the code is repeated), but the sub is not
checked for definition first. It was put in rv2cv because it had to
come before the lvalue check. Putting the latter in entersub lets us
delete that repeated autoload code, which is completely wrong anyway.
|
|
|
|
|
|
|
| |
It’s possible for XS code to create hash entries with null values.
pp_helem and pp_slice were not taking that into account. In fact,
the core produces such hash entries, but they are rarely visible from
Perl. It’s good to check for them anyway.
|
| |
This bug is a side effect of rv2gv’s starting to return an incoercible
mortal copy of a coercible glob in 5.14:
$ perl5.12.4 -le 'open FH, "t/test.pl"; $fh=*FH; tell $fh; print tell'
0
$ perl5.14.0 -le 'open FH, "t/test.pl"; $fh=*FH; tell $fh; print tell'
-1
In the first case, tell without arguments is returning the position of
the filehandle.
In the second case, tell with an explicit argument that happens to
be a coercible glob (tell has an implicit rv2gv, so tell $fh is actu-
ally tell *$fh) sets PL_last_in_gv to a mortal copy thereof, which is
freed at the end of the statement, setting PL_last_in_gv to null. So
there is no ‘last used’ handle by the time we get to the tell without
arguments.
This commit adds a new rv2gv flag that tells it not to copy the glob.
By doing it unconditionally on the kidop, this allows tell(*$fh) to
work the same way.
Let’s hope nobody does tell(*{*$fh}), which will unset PL_last_in_gv
because the inner * returns a mortal copy.
This whole area is really icky. PL_last_in_gv should be refcounted,
but that would cause handles to leak out of scope, breaking programs
that rely on the auto-closing ‘feature’.
|
| |
rather than $__ANONIO__
That dollar sign *has* to have been a mistake. In ck_fun, the
name was set to __ANONIO__, but it seems the change that added it
(afd1915d43) did not account for the fact that a little later on the
same function checks to make sure it begins with a dollar sign, as it
could only be a variable name.
rv2gv’s use of $__ANONIO__ (added recently by yours truly) was just
copying what ck_fun was doing.
|
|
|
|
|
|
|
|
|
| |
As proposed on p5p and approved, this changes the functions uc(), lc(),
ucfirst(), and lcfirst() to respect locale for code points < 256; and
use Unicode semantics for those above 255. This results in better, but
not perfect results, as noted in the changed pods, and brings these
functions into line with how regular expression pattern matching already
works.
|
| |
When substr() occurs in potential lvalue context, the offsets are
adjusted to the current string (negative being converted to positive,
lengths reaching beyond the end of the string being shortened, etc.)
as soon as the special lvalue to be returned is created.
When that lvalue is assigned to, the original scalar is stringified
once more.
That implementation results in two bugs:
1) Fetch is called twice in a simple substr() assignment (except in
void context, due to the special optimisation of commit 24fcb59fc).
2) These two calls are not equivalent:
$SIG{__WARN__} = sub { warn "w ",shift};
sub myprint { print @_; $_[0] = 1 }
print substr("", 2);
myprint substr("", 2);
The second one dies. The first one only warns. That’s mean. The
error is also wrong, sometimes, if the original string is going to get
longer before the substr lvalue is actually used.
The behaviour of \substr($str, -1) if $str changes length is com-
pletely undocumented. Before 5.10, it was documented as being unreli-
able and subject to change.
What this commit does is make the lvalue returned by substr remember
the original arguments and only adjust the offsets when the assign-
ment happens.
This means that the following now prints z, instead of xyz (which is
actually what I would expect):
$str = "a";
$substr = \substr($str,-1);
$str = "xyz";
print $$substr;
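The effect of remembering the original arguments can be modelled in a few lines of C. This is a sketch with hypothetical names, not the LVALUE implementation; in particular, clamping out-of-range offsets is a simplification of what substr really does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the special lvalue: it stores the offset
 * exactly as the caller wrote it, possibly negative. */
typedef struct {
    long offset;
} toy_substr_lvalue;

/* Resolves the remembered offset against the current string length.
 * A negative offset counts from the end; the result is clamped to
 * [0, len], which is a simplification for this sketch. */
static size_t resolve_offset(const toy_substr_lvalue *lv, size_t len)
{
    assert(lv != NULL);
    long off = lv->offset;
    if (off < 0)
        off += (long)len;
    if (off < 0)
        off = 0;
    if ((size_t)off > len)
        off = (long)len;
    return (size_t)off;
}

/* Returns the tail of str that the lvalue refers to right now. */
static const char *read_through(const toy_substr_lvalue *lv, const char *str)
{
    assert(str != NULL);
    return str + resolve_offset(lv, strlen(str));
}
```

An lvalue created with offset -1 while the string is "a" reads "z" once the string has become "xyz", because the offset is resolved at the time of use. Resolving it at creation time would have fixed the offset at 0 and read "xyz".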
|
| |
In void context we can optimise
substr($foo, $bar, $baz) = $replacement;
to something like
substr($foo, $bar, $baz, $replacement);
except that the execution order must be preserved. So what we actu-
ally do is
substr($replacement, $foo, $bar, $baz);
with a flag to indicate that the replacement comes first. This means
we can also optimise assignment to two-argument substr the same way.
Although optimisations are not supposed to change behaviour,
this one does.
• It stops substr assignment from calling get-magic twice, which means
the optimisation makes things less buggy than usual.
• It causes the uninitialized warning (for an undefined first argu-
ment) to mention the substr operator, as it did before the previous
commit, rather than the assignment operator. I think that sort of
detail is minor enough.
I had to make the warning about clobbering references apply whenever
substr does a replacement, and not only when used as an lvalue. So
four-argument substr now emits that warning. I would consider that a
bug fix, too.
Also, if the numeric arguments to four-argument substr and the
replacement string are undefined, the order of the uninitialized warn-
ings is slightly different, but is consistent regardless of whether
the optimisation is in effect.
I believe this will make 95% of substr assignments run faster. So
there is less incentive to use what I consider the less readable form
(the four-argument form, which is not self-documenting).
Since I like naïve benchmarks, here are Before and After:
$ time ./miniperl -le 'do{$x="hello"; substr ($x,0,0) = 34;0}for 1..1000000'
real 0m2.391s
user 0m2.381s
sys 0m0.005s
$ time ./miniperl -le 'do{$x="hello"; substr ($x,0,0) = 34;0}for 1..1000000'
real 0m0.936s
user 0m0.927s
sys 0m0.005s
|
| |
This program:
#!perl -l
sub myprint { print @_ }
print substr *foo, 1;
myprint substr *foo, 1;
produces:
main::foo
Can't coerce GLOB to string in substr at - line 4.
Ouch!
I would expect \substr simply to give me a scalar that peeks into the
original string, but without modifying the original until the return
value of \substr is actually assigned to.
But it turns out that it coerces the original into a string immediately,
unless it’s GMAGICAL. I find the exception for magical variables rather
befuddling. I can only imagine it was for efficiency (since
the stringified form will be overwritten when magic_setsubstr calls
SvGETMAGIC), but that doesn’t make sense as the original variable can
itself be modified between the return of the special lvalue and the
assignment to that lvalue.
Since magic_setsubstr itself coerces the variable into a string upon
assignment to the lvalue, we can just remove the coercion code from
pp_substr.
But that causes double uninitialized warnings in cases like
substr($undef, 0,0) = "lrep".
That happens because pp_substr is still stringifying the variable (but
without modifying it). It has to do that, as it looks at the length
of the original string and accordingly adjusts the offsets stored in
the lvalue if they are negative or if they extend beyond the end of
the string.
So this commit takes the simple route of avoiding the warning in
pp_substr by only stringifying a variable that is SvOK if called in
lvalue context.
Hence, assignment to substr($tied...) will continue to call FETCH
twice, but that is not a new bug.
The ideal solution would be for the offsets to be translated in mg.c,
rather than in pp_substr. But that would be a more involved change
(including most of this commit, which is therefore not wasted) with
potential backward-compatibility issues with negative numbers.
A side effect is that the ‘Attempt to use reference as lvalue in
substr’ warning now occurs during the assignment to the substr lvalue,
rather than in substr itself. This means it occurs even for tied
variables, so things are now more consistent.
The example at the beginning could still croak if the glob were
replaced with a null string, so this commit only partially allevi-
ates the pain.
|
| |
|
|
|
|
|
|
| |
After sv_force_normal_flags, the scalar will no longer be read-only,
except in those cases where sv_force_normal_flags croaks. So this
check will never be true when SvFAKE was true.
|
|
|
|
|
| |
As amagic_deref_call pushes a new stack, PL_stack_sp will always have
the same value before and after, so SPAGAIN is unnecessary.
|
|
|
|
|
| |
After much alternation, altercation and alteration, __SUB__ is
finally here.
|
|
|
|
| |
This brings it into conformity with y without the /r.
|
| |
|
|
|
|
|
| |
A compiler generated a warning about this. It is the degenerate case
with an empty input, so isn't really a problem, but silence the warning.
|
|
|
|
|
|
|
|
| |
Now that there is a function that can convert a latin1 character to
title or upper case without going out to swashes, we can call it instead
of repeating the code. There is the additional overhead of a function
call, but this could be avoided if it comes down to it by making it
in-line.
|
| |
|
|
|
|
|
|
|
|
|
| |
Now that there is a function that can convert a latin1 character to
title or upper case without going out to swashes, we can call it
instead of repeating the code. There is the additional overhead of a
function call, but this could be avoided if it comes down to it by
making it in-line. And this only happens when upper-casing y with
diaeresis and the micro sign.
|
|
|
|
|
| |
This outdents and reflows comments as a result of the removal of a
surrounding block
|
|
|
|
|
|
|
|
|
| |
Now that toLOWER_utf8() and toTITLE_utf8() have the intelligence to skip
going out to swashes for Latin1 code points, it's not so critical to
bypass calling them for these (for speed). It simplifies things not to
have the intelligence repeated. There is the additional overhead of two
function calls (minus the branches saved), but these could be avoided if
it comes down to it by making them in-line.
|
|
|
|
|
| |
This outdents and reflows comments as a result of the removal of a
surrounding block
|
|
|
|
|
|
|
|
|
| |
Now that toUPPER_utf8() has the intelligence to skip going out to
swashes for Latin1 code points, it's not so critical to bypass calling
it for these (for speed). It simplifies things not to have the
intelligence repeated. There is the additional overhead of two function
calls (minus the branches saved), but these could be avoided if it comes
down to it by making them in-line.
|
|
|
|
|
| |
Almost always the input to uc() will be one of the other 253 Latin1
characters rather than one of the three that gets here.
|
|
|
|
|
| |
This outdents and reflows comments as a result of the removal of a
surrounding block
|
|
|
|
|
|
|
|
|
| |
Now that toLOWER_utf8() has the intelligence to skip going out to
swashes for Latin1 code points, it's not so critical to bypass calling
it for these (for speed). It simplifies things not to have the
intelligence repeated. There is the additional overhead of two function
calls (minus the branches saved), but these could be avoided if it comes
down to it by making them in-line.
|
| |
gv_efullname4 produces undef if the GV points to no stash, instead of
using __ANON__, as it does when the stash has no name.
Instead of jumping through hoops to try and work around it elsewhere, fix
gv_efullname4.
This means that
$x = *$io;
$x .= "whate’er";
no longer produces an uninitialized warning. (The warning was rather
strange, as defined() returned true.)
This commit also gives the glob the name $__ANONIO__ (yes, with a dol-
lar sign). It may seem a little strange, but there is precedent in
other autovivified globs, such as those open() produces when it cannot
determine the variable name (e.g., open $t->{fh}).
|
|
|
|
|
|
| |
This outdents a block to the same level as the surrounding text, and
reflows the comments to take advantage of the extra space and use fewer
lines.
|
| |
This code was always #ifdef'd out. It would have been used to convert
to a Greek final sigma from a non-final one, depending on context. The
problem is that we can't know algorithmically if a final sigma is in
order or not. I excerpt this quote, that I find persuasive, from
correspondence from Father Chrysostomos, who knows Greek:
"I cannot see how any algorithm can know to get it right.
"The letter σ (or Σ in capitals) represents the number 200 in Greek
numerals. Those are not just ancient Greek numerals, but are used on a
regular basis even in modern Greek. In many printed books ς is used in
place of ϛ, which represents the number 6. So if casefolding should
change ͵ΑΣʹ to ͵αςʹ, or if an output layer changes ͵ασʹ similarly, it
will be changing the number (from 1200 to 1006). You can’t get around
it by checking for the Greek numeral sign (ʹ), as sometimes the tonos
(΄), oxeia (´), or even the ASCII straight quote is used. And often in
lists or chapter titles a dot is used instead of numeral sign.
"Also, σ is commonly used at the ends of abbreviations. Changing ‘βλέπε
σ. 16’ (‘see page 16’) to ‘βλέπε ς. 16’ is not acceptable.
"So, no, I don’t think a programming language should be fiddling with σ
versus ς. (A word processor is another matter.)"
|
| |
| |
Did you know that a subroutine’s prototype can be modified with s///?
Don’t look:
*AUTOLOAD = *Internals'SvREFCNT;
my $f = "Just another "; eval{main->$f};
print prototype AUTOLOAD;
$f =~ s/Just another /Perl hacker,\n/;
print prototype AUTOLOAD;
You did look, didn’t you? You must admit that’s creepy.
The problem goes back to this:
commit adb5a9ae91a0bed93d396bb0abda99831f9e2e6f
Author: Doug MacEachern <dougm@covalent.net>
Date: Sat Jan 6 01:30:05 2001 -0800
[patch] xsub AUTOLOAD fix/optimization
Message-ID: <Pine.LNX.4.10.10101060924280.24460-100000@mojo.covalent.net>
Allow AUTOLOAD to be an xsub and allow such xsubs
to avoid use of $AUTOLOAD.
p4raw-id: //depot/perl@8362
which includes this:
+ if (CvXSUB(cv)) {
+ /* rather than lookup/init $AUTOLOAD here
+ * only to have the XSUB do another lookup for $AUTOLOAD
+ * and split that value on the last '::',
+ * pass along the same data via some unused fields in the CV
+ */
+ CvSTASH(cv) = stash;
+ SvPVX(cv) = (char *)name; /* cast to loose constness warning */
+ SvCUR(cv) = len;
+ return gv;
+ }
That ‘unused’ field is not unused. It’s where the prototype is
stored. So, not only is it clobbering the prototype, it’s also leak-
ing it by assigning over the top of SvPVX. Furthermore, it’s blindly
assigning someone else’s string, which could be freed before it’s
even used.
Since it has been documented for a long time that SvPVX contains the
name of the AUTOLOADed sub, and since the use of SvPVX for prototypes
is documented nowhere, we have to preserve the former.
So this commit makes the prototype and the sub name share the same
buffer, in a manner resembling that which CvFILE used before I changed
it with bad4ae38.
There are two new internal macros, CvPROTO and CvPROTOLEN for retriev-
ing the prototype.
|
|
|
|
|
|
|
|
| |
This makes perl -E '$::{example} = "\x{30cb}"; say prototype example;'
store and fetch the correctly flagged prototype.
With this, all TODO tests in gv.t pass. The next commit will deal
with making the parsing of prototypes nul-clean.
|
| |
|
| |
|
|
|
|
|
| |
Since typeglobs may have the UTF8 flag set now, we need to avoid
testing SvCUR on a potential glob, as that would trip an assertion.
|
|
|
|
|
|
|
|
|
| |
This adds a new function to sv.c, sv_ref, which is a nul-and-UTF8
clean version of sv_reftype. pp_ref now uses that.
sv_ref() not only returns the SV, but also takes in an SV
to modify, so we can say both sv_ref(TARG, obj, TRUE); and
sv = sv_ref(NULL, obj, TRUE);
|