The macro expansion generates over 1K of object code. This is in every shared
object, and is only called once. Hence this change increases the perl binary
by about 1K (once), to save 1K for every XS module loaded.
|
|
|
|
|
| |
New API function parse_stmtseq() parses a sequence of statements, up to
closing brace or EOF.
|
| |
|
|
|
|
|
| |
Additionally, sort embed.h by public API, then core-or-ext, and finally core
only. This reduces the number of #if/#endif pairs in embed.h and proto.h
|
|
|
|
|
| |
Anywhere an API function takes a string in pvn form, ensure that there
are corresponding pv, pvs, and sv APIs.
|
|
|
|
|
|
| |
This fixes ! by changing sv_2bool to sv_2bool_flags (with a macro
wrapper) and adding SvTRUE_nomg. It also corrects the docs that state
incorrectly that SvTRUE does not handle magic.
|
|
|
|
|
|
| |
This patch changes sv_eq, sv_cmp, sv_cmp_locale and sv_collxfrm
to _flags forms, with macros under the old names for sv_eq and
sv_collxfrm, but functions for sv_cmp* since pp_sort.c needs them.
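
The "_flags form plus macro under the old name" pattern can be sketched outside of perl as follows. This is only an illustration: `struct value`, `VAL_GMAGIC`, `val_get()`, `val_eq_flags()` and `val_eq()` are hypothetical names standing in for SVs, SV_GMAGIC, get magic and sv_eq, not the real API.

```c
#include <string.h>

/* Hypothetical value type; `fetches` counts how often the get hook ran. */
struct value {
    const char *str;
    int fetches;
};

#define VAL_GMAGIC 1    /* hypothetical flag: run the get hook before comparing */

/* Stand-in for get magic. */
static void val_get(struct value *v) { v->fetches++; }

/* The _flags form lets the caller decide whether the hook runs. */
int val_eq_flags(struct value *a, struct value *b, int flags)
{
    if (flags & VAL_GMAGIC) {
        val_get(a);
        val_get(b);
    }
    return strcmp(a->str, b->str) == 0;
}

/* The old name survives as a macro that keeps the historical behaviour. */
#define val_eq(a, b) val_eq_flags((a), (b), VAL_GMAGIC)
```

A caller that has already triggered the hook itself would pass 0 for the flags, which avoids a second fetch.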
|
|
|
|
|
|
|
|
| |
Since regcurly is now a static inline function, it no longer
needs to appear in embed.fnc. embed.pl doesn't quite have the
right flags to deal with static inline functions, so I just
removed regcurly entirely. It's not for embedding or exporting
anyway.
|
| |
It was intended as a temporary namespace only, and we really don't want to
ship it in any release until we've figured out what it should really look like.
This reverts commit 05c0d6bbe3ec5cc9af99d105b8648ad02ed7cc95,
"add sv_reftype_len() and make sv_reftype() be a wrapper for it"
commit 792477b9c2e4c75cb03d07bd6d25dc7e1fdf448e,
"create the "mauve" temporary namespace for things like reftype"
commit 8df6b97c1de8326d50ac9c8cae4bf716393b45bb,
"mauve.t needs access to %Config, make sure it's available"
commit cfe9162d0d593cd12a979c73df82c7509b324343,
"use more efficient sv_reftype_len() interface"
and commit 47b13905e23c2a72acdde8bb4669e25e5eaefec4
"add more tests to lib/mauve.t so it tests also that mauve::reftype can return "LVALUE""
There's a `mauve' branch still containing all the code for the temporary mauve
namespace. That should be used to work on it until it's mostly ready to be
released, and only then merged to blead. Alternatively, it should be deleted if
another way to provide mauve's features in the core is found.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
sv_reftype() mostly returns strings whose length is known at compile
time, so we can avoid a strlen() call if we return the length.
Additionally, the non-length interface is potentially buggy in the
face of class names which contain "\0", therefore providing a way
to obtain the true length allows us to avoid any trickiness.
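
A minimal sketch of the idea, assuming a fixed set of type names: `enum ref_kind`, `reftype_len()` and `reftype()` are hypothetical stand-ins, not the signatures used in the patch. The length comes from `sizeof` on the literal, so it is known at compile time and no strlen() is needed.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kinds of referent a reftype call names. */
enum ref_kind { REF_SCALAR, REF_ARRAY, REF_HASH, REF_CODE };

/* Store the literal's length (a compile-time constant) and return it. */
#define RETURN_LIT(s, len) do { *(len) = sizeof(s) - 1; return (s); } while (0)

const char *reftype_len(enum ref_kind kind, size_t *len)
{
    switch (kind) {
    case REF_SCALAR: RETURN_LIT("SCALAR", len);
    case REF_ARRAY:  RETURN_LIT("ARRAY", len);
    case REF_HASH:   RETURN_LIT("HASH", len);
    case REF_CODE:   RETURN_LIT("CODE", len);
    }
    *len = 0;
    return "";
}

/* The length-less interface becomes a thin wrapper. */
const char *reftype(enum ref_kind kind)
{
    size_t unused;
    return reftype_len(kind, &unused);
}
```

For names that are not literals (class names, say), the same interface can report a stored length, which is what makes embedded "\0" bytes safe.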
|
| |
This commit adds the new construct \o{} to express a character constant
by its octal ordinal value, along with ancillary tests and
documentation.
A function to handle this is added to util.c, and it is called from the
3 parsing places it could occur. The function is a candidate for
in-lining, though I doubt that it will ever be used frequently.
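
As a rough illustration of what such a helper has to do, here is a standalone sketch. `parse_braced_octal()` is a hypothetical name and signature, not the function added to util.c, and its error handling (reject empty braces, non-octal digits, overflow) is an assumption.

```c
#include <stddef.h>

/* Hypothetical helper: parse "\o{DIGITS}" at the start of s.
 * On success store the ordinal in *value and the number of bytes
 * consumed in *consumed, and return 1. Return 0 on malformed input. */
int parse_braced_octal(const char *s, unsigned long *value, size_t *consumed)
{
    const char *p = s;
    unsigned long v = 0;

    if (p[0] != '\\' || p[1] != 'o' || p[2] != '{')
        return 0;
    p += 3;
    if (*p == '}')
        return 0;                       /* empty braces */
    while (*p >= '0' && *p <= '7') {
        if (v > (~0UL >> 3))
            return 0;                   /* the next shift would overflow */
        v = (v << 3) | (unsigned long)(*p - '0');
        p++;
    }
    if (*p != '}')
        return 0;                       /* non-octal digit or missing brace */
    *value = v;
    *consumed = (size_t)(p - s) + 1;    /* include the closing brace */
    return 1;
}
```

The braces make the end of the constant unambiguous, so the digits can be followed directly by other digits in the surrounding string.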
|
| |
Each CV usually has a pointer, CvGV(cv), back to the GV that corresponds
to the CV's name (or to *foo::__ANON__ for anon CVs). This pointer wasn't
reference counted, to avoid loops. This could leave it dangling if the GV
is deleted.
We fix this by:
* For named subs, adding backref magic to the GV, so that when the GV is
  freed, it can trigger processing the CV's CvGV field. This processing
  consists of: if it looks like the freeing of the GV is about to trigger
  freeing of the CV too, set it to NULL; otherwise make it point to
  *foo::__ANON__ (and set CvANON(cv)).
* For anon subs, making CvGV a strong reference, i.e. incrementing the
  refcnt of *foo::__ANON__. This doesn't cause a loop, since in this case
  the __ANON__ glob doesn't point to the CV. This also avoids dangling
  pointers if someone does an explicit 'delete $foo::{__ANON__}'.
Note that there was already some partial protection for CvGV with
commit f1c32fec87699aee2eeb638f44135f21217d2127. This worked by
anonymising any corresponding CV when freeing a stash or stash entry.
This had two drawbacks. First, it didn't fix CVs that were anonymous or that
weren't currently pointed to by the GV (e.g. after local *foo), and
second, it caused *all* CVs to get anonymised during cleanup, even the
ones that would have been deleted shortly afterwards anyway. This commit
effectively removes that former commit, while reusing a bit of the
actual anonymising code.
|
| |
Each CV usually has a pointer, CvSTASH, back to the stash that it was
compiled in. This pointer isn't reference counted, to avoid loops, which
can leave it dangling if the stash is deleted.
There is already protection for the similar GvSTASH field in GVs: the
stash has an array of backrefs, xhv_backreferences, pointing to the GVs
whose GvSTASHes point to it, and which is used to zero all the GvSTASH
fields should the stash be deleted.
All this patch does is add the CVs with CvSTASH to that stash's
backref list too.
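
The backreference technique can be sketched in isolation like this. `struct item`, `struct stash`, `item_set_stash()` and `stash_destroy()` are hypothetical and much simpler than xhv_backreferences; the sketch only shows how an owner can zero uncounted pointers to itself when it goes away.

```c
#include <stddef.h>
#include <stdlib.h>

struct stash;

/* Hypothetical stand-in for a CV or GV: holds an uncounted stash pointer. */
struct item { struct stash *stash; };

/* Hypothetical stand-in for a stash with a backreference list. */
struct stash {
    struct item **backrefs;
    size_t n, cap;
};

/* Point the item at the stash and record the backreference.
 * Returns 0 on success, -1 if the list could not grow. */
int item_set_stash(struct item *it, struct stash *st)
{
    if (st->n == st->cap) {
        size_t cap = st->cap ? st->cap * 2 : 4;
        struct item **p = realloc(st->backrefs, cap * sizeof *p);
        if (!p)
            return -1;
        st->backrefs = p;
        st->cap = cap;
    }
    st->backrefs[st->n++] = it;
    it->stash = st;
    return 0;
}

/* Tear down the stash: zero every weak pointer that still refers to it. */
void stash_destroy(struct stash *st)
{
    for (size_t i = 0; i < st->n; i++)
        if (st->backrefs[i]->stash == st)
            st->backrefs[i]->stash = NULL;
    free(st->backrefs);
    st->backrefs = NULL;
    st->n = st->cap = 0;
}
```

Because the list lives on the stash and holds no counted references, it does not reintroduce the loop that the uncounted pointer was avoiding.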
|
|
|
|
|
| |
This should help prevent people from thinking they can get cute with the
contents.
|
|
|
|
|
|
|
| |
my_stat() and my_lstat() call get magic on the stack arg, so create _flags()
variants that allow us to control this. (I can't just change the signature
or the mg_get() behaviour since my_[l]stat() are listed as being in the
public API, even though they're undocumented.)
|
|
|
|
|
| |
This reduces object code size, reducing CPU cache pressure on the non-exception
paths.
|
| |
As discussed on p5p, ibcmp has different semantics from other cmp
functions in that it is a binary instead of ternary function. It is
less confusing then to have a name that implies true/false.
There are three functions affected: ibcmp, ibcmp_locale and ibcmp_utf8.
ibcmp is actually equivalent to foldNE, but for the same reason that things
like 'unless' and 'until' are cautioned against, I named the functions
foldEQ; the existing names, like ibcmp_utf8, are now defined as macros
that return the complement of foldEQ.
This patch also changes the one file where turning ibcmp into a macro
causes problems. It changes it to use the new name. It also documents
for the first time ibcmp, ibcmp_locale and their new names.
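
The naming change can be illustrated with an ASCII-only sketch. `fold_eq()` and `ib_cmp()` are hypothetical names chosen so as not to be mistaken for the real foldEQ and ibcmp, whose signatures and folding rules are not reproduced here.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical ASCII-only sketch: return 1 when the first len bytes of
 * a and b are equal ignoring case, 0 otherwise. */
int fold_eq(const char *a, const char *b, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* The legacy name survives as the complement: nonzero means "differs". */
#define ib_cmp(a, b, len) (!fold_eq((a), (b), (len)))
```

With the positive name, `if (fold_eq(a, b, n))` reads the way it behaves, whereas `if (!ib_cmp(a, b, n))` looks like a three-way comparison but is not one.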
|
|
|
|
|
|
| |
This is achieved by introducing a new find_rundefsv() function in pad.c
This fixes [perl #75436].
|
| |
As it allocates memory dynamically, add Perl_clone_params_del(). This will
allow CLONE_PARAMS to be expanded in future in a source and binary compatible
fashion.
These implementations of Perl_clone_params_new()/Perl_clone_params_del() jump
through hoops to remain source and binary compatible, in particular, by not
assuming that the structure member is present and correctly initialised. Hence
they should be suitable for inclusion into Devel::PPPort.
Convert threads.xs to use them, resolving RT #73046.
|
| |
|
|
|
|
|
| |
Add a function Perl_hv_fill to perform the count. This will save 1 IV per hash,
and on some systems cause struct xpvhv to become cache aligned.
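
Counting on demand instead of storing the figure might look like the sketch below. `struct table`, `struct entry` and `table_fill()` are hypothetical and are not the layout of struct xpvhv or the body of Perl_hv_fill.

```c
#include <stddef.h>

struct entry { struct entry *next; };

/* Hypothetical hash table: an array of bucket chains, no stored fill count. */
struct table {
    struct entry **buckets;
    size_t nbuckets;
};

/* Count the occupied buckets on demand rather than caching the figure. */
size_t table_fill(const struct table *t)
{
    size_t fill = 0;
    for (size_t i = 0; i < t->nbuckets; i++)
        if (t->buckets[i] != NULL)
            fill++;
    return fill;
}
```

The trade is a linear scan on the rare occasions the count is wanted against one fewer field in every hash.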
|
| |
In most places, ops checked their args for overload *before* doing
mg_get(). This meant that, among other issues, tied vars that
returned overloaded objects wouldn't trigger calling the
overloaded method. (Actually, for tied arrays and hashes, it
still often would since mg_get gets called beforehand in rvalue
context).
This patch does the following:
* Makes sure get magic is called first.
* Moves most of the overload code formerly included by macros at the
  start of each pp function into the separate helper functions
  Perl_try_amagic_bin, Perl_try_amagic_un, S_try_amagic_ftest,
  with 3 new wrapper macros:
  tryAMAGICbin_MG, tryAMAGICun_MG, tryAMAGICftest_MG.
  This made the code 3800 bytes smaller.
* Makes sure that FETCH is not called multiple times. Much of this
  bit was helped by some earlier work from Father Chrysostomos.
* Adds new functions and macros sv_inc_nomg(), sv_dec_nomg(),
  dPOPnv_nomg, dPOPXiirl_ul_nomg, dPOPTOPnnrl_nomg, dPOPTOPiirl_ul_nomg,
  dPOPTOPiirl_nomg, SvIV_please_nomg, SvNV_nomg (again, some of
  these were based on Father Chrysostomos's work).
* Fixes the list version of the repeat operator (x): it now only
  calls overloaded methods for the scalar version:
      (1,2,$overloaded) x 10
  no longer erroneously calls
      x_method($overloaded,10)
The only thing I haven't checked/fixed yet is overloading the
iterator operator, <>.
|
| |
|
|\
| |
| |
| |
| | |
Conflicts:
pp_ctl.c
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
New functions croak_sv(), die_sv(), mess_sv(), and warn_sv() each act
much like their _sv-less counterparts, but take a single SV argument
instead of sprintf-like format and args. They will accept RVs, passing
them through as such. This means there's no more need to clobber ERRSV
in order to throw a structured exception.
pp_warn() and pp_die() are rewritten to use the _sv interfaces.
This fixes part of [perl #74538]. It also means that a structured
warning object will be passed through to $SIG{__WARN__} instead of
being stringified, thus bringing warn in line with die with respect to
structured exception objects.
The new functions and their existing counterparts are all fully
documented.
|
| | |
|
|/ |
|
| |
|
|
|
|
|
|
|
| |
Change from a value/return offset pointer to passing a Unicode offset, and
returning a byte offset. The optional length value/return pointer remains.
Add a flags argument, passed to SvPV_flags(). This allows the caller to
specify whether mg_get() should be called on sv.
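
The core of a character-offset to byte-offset conversion can be sketched as below, assuming well-formed UTF-8 input. `utf8_char_to_byte_offset()` is a hypothetical function; it takes a raw buffer rather than an SV and has no flags or length argument, so it is not the interface this commit describes.

```c
#include <stddef.h>

/* Hypothetical sketch: convert a character offset into a byte offset
 * within a well-formed UTF-8 buffer of `bytes` bytes.
 * Offsets past the end are clamped to `bytes`. */
size_t utf8_char_to_byte_offset(const unsigned char *s, size_t bytes,
                                size_t char_off)
{
    size_t i = 0;
    while (char_off > 0 && i < bytes) {
        i++;                                        /* lead byte */
        while (i < bytes && (s[i] & 0xC0) == 0x80)
            i++;                                    /* continuation bytes */
        char_off--;
    }
    return i;
}
```

Passing the offset in by value and returning the result means the caller's variable is never overwritten in place.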
|
|
|
|
| |
available for the pos and len arguments, with safe conversion to STRLEN where it's smaller than an IV.
|
| |
Authors: John Peacock, David Golden and Zefram
The goal of this mega-patch is to enforce strict rules for version
numbers provided to 'package NAME VERSION' while formalizing the prior,
lax rules used for version object creation. Parsing for use() is
unchanged.
version.pm adds two globals, $STRICT and $LAX, containing regular
expressions that define the rules. There are two additional functions
-- version::is_strict and version::is_lax -- that test an argument
against these rules.
However, parsing of strings that might contain version numbers is done
in core via the Perl_scan_version function, which may be called during
compilation or may be called later when version objects are created by
Perl_new_version or Perl_upg_version.
A new helper function, Perl_prescan_version, has been added to validate
a string under either strict or lax rules. This is used in toke.c for
'package NAME VERSION' in strict mode and by Perl_scan_version in lax
mode. It matches the behavior of the version.pm regular expressions,
but does not use them directly.
A new test file, comp/packagev.t, validates strict and lax behaviors of
'package NAME VERSION' and 'version->new(VERSION)' respectively and
verifies their behavior against the $STRICT and $LAX regular
expressions, as well. Validating these two implementations should help
ensure they each work as intended.
Other files and tests have been modified as necessary to support these
changes.
There is remaining work to be done in a few areas:
* documenting all changes in behavior and new functions
* determining proper treatment of "," as decimal separators in
various locales
* updating diagnostics for new error messages
* porting changes back to the version.pm distribution on CPAN,
including pure-Perl versions
|
| |
|
|
|
|
|
|
|
|
|
| |
Attached is a patch that adds a public API for the lowest layers of
lexing. This is meant to provide a solid foundation for the parsing that
Devel::Declare and similar modules do, and it complements the pluggable
keyword mechanism. The API consists of some existing variables combined
with some new functions, all marked as experimental (which making them
public certainly is).
|
|
| |
Attached is a patch that changes how the tokeniser looks up subroutines,
when they're referenced by a bareword, for prototype and const-sub
purposes. Formerly, it has looked up bareword subs directly in the
package, which is contrary to the way the generated op tree looks up
the sub, via an rv2cv op. The patch makes the tokeniser generate the
rv2cv op earlier, and dig around in that.
The motivation for this is to allow modules to hook the rv2cv op
creation, to affect the name->subroutine lookup process. Currently,
such hooking affects op execution as intended, but everything goes wrong
with a bareword ref where the tokeniser looks at some unrelated CV,
or a blank space, in the package. With the patch in place, an rv2cv
hook correctly affects the tokeniser and therefore the prototype-based
aspects of parsing.
The patch also changes ck_subr (which applies the argument context and
checking parts of prototype behaviour) to handle subs referenced by an
RV const op inside the rv2cv, where formerly it would only handle a gv
op inside the rv2cv. This is to support the most likely kind of
modified rv2cv op.
The attached patch is the resulting revised version of the bareword
sub patch. It incorporates the original patch (allowing rv2cv op
hookers to control prototype processing), the GV-downgrading addition,
and a mention in perldelta.
|
|
|
|
| |
There is currently still a linker error about PL_keyword_plugin.
|
| |
|
|
|
|
|
| |
Replace ckWARN_d{,2,3,4}() && Perl_warner() with it, which trades reduced code
size for 1 more function call if warnings are not enabled.
|
|
|
|
|
|
|
| |
Replace ckWARN{,2,3,4}() && Perl_warner() with it, which trades reduced code
size (about 0.2%), for 1 more function call if warnings are not enabled.
However, if we're now in the L1 or L2 cache when we weren't previously, that's
still going to be a speed win.
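
The trade-off can be seen in a small sketch that folds the "is this category enabled?" test into the warning function itself. `enabled_categories`, `warnings_emitted` and `ck_warn()` are hypothetical; the real functions take an interpreter context and packed category values that are not modelled here.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical warning state: one bit per category. */
unsigned enabled_categories = 0;
int warnings_emitted = 0;

/* Check and warn in a single call, so call sites carry no inline test.
 * Returns 1 if the warning was emitted, 0 if the category is disabled. */
int ck_warn(unsigned category, const char *fmt, ...)
{
    va_list ap;

    if (!(enabled_categories & category))
        return 0;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    warnings_emitted++;
    return 1;
}
```

Each call site shrinks to one call, at the price of entering the function even when the category turns out to be disabled.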
|
| |
|
| |
|
|
|
|
| |
The "short" names become macro wrappers, and the Perl_* versions become mathoms.
|
|
|
|
| |
save_hdelete() is just like save_delete() except that it takes an SV instead of char buffer.
|
|
|
|
| |
It's the counterpart of save_helem_flags(). save_aelem() is now a macro wrapping around save_aelem_flags().
|
|
|
|
| |
(and run "make regen")
|
| |
Consider what currently happens when the tokenizer is scanning a string.
It looks through it byte-by-byte until it finds a character that forces
it to decide to go to utf8. It then calls sv_utf8_upgrade() with the
portion of the string scanned so far.
sv_utf8_upgrade() starts over from the beginning, and scans the string
byte-by-byte until it finds a character that varies between non-utf8 and
utf8. It then calls bytes_to_utf8().
bytes_to_utf8() allocates a new string that can handle the worst case
expansion, 2n+1, of the entire string, and starts over from the
beginning, and scans the input string byte-by-byte copying and
converting each character to the output string as it goes.
It doesn't return the size of the new string, so sv_utf8_upgrade()
assumes it is only as big as what actually got converted, throwing away
knowledge of any spare.
It then returns to the tokenizer, which immediately does a grow to get
space for the unparsed input. This is likely to cause a new string to
be allocated and copied from the one we had just created, even if that
string in actuality had enough space in it.
Thus, the invariant head portion of the string is scanned 3 times, and
probably 2 strings will be allocated and copied.
My solution to cutting this down is to do several things.
First, I added an extra flag for sv_utf8_upgrade that says don't bother
to check if the string needs to be converted to utf8, just assume it
does. This eliminates one of the passes.
I also added a new parameter to sv_utf8_upgrade that says when you
return, I want this much unused space in the string. That eliminates
the extra grow.
This was all done by renaming the current work-horse function from
sv_utf8_upgrade_flags to be sv_utf8_upgrade_flags_grow() and making the
current function name be a macro which calls the revised one with a 0
grow parameter.
I also improved the internal efficiency of sv_utf8_upgrade so that when
it does scan the string, it doesn't call bytes_to_utf8, but does the
conversion itself, using a fast memory copy instead of the byte-oriented
one for the invariant header, and it uses that header to get a better
estimate of the needed size of the new string, and it doesn't throw away
the knowledge of the allocated size.
And, if it is clear without scanning the whole string that the
conversion will fit in the already allocated string, it just uses that
instead of allocating and copying a new one, using the algorithm I
copied from the tokenizer. (In this case it does have to finish
scanning the whole string to get the correct size.) The comments have
details.
It still is byte-oriented. Vectorization et al. could yield
performance improvements. One idea for that is in the comments.
The patch also includes a new synonym I created which is a more accurate
name than NATIVE_TO_ASCII.
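
The combination of a block copy for the invariant head, a tighter size estimate and caller-requested spare space can be sketched as follows. This assumes an ASCII platform and Latin-1 input; `latin1_to_utf8_grow()` is a hypothetical function working on raw buffers, not sv_utf8_upgrade_flags_grow(), and it always allocates rather than reusing an existing buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: convert len Latin-1 bytes to UTF-8 in a new buffer
 * that keeps at least `extra` unused bytes after the terminating NUL.
 * Stores the converted length in *out_len and the allocated size in
 * *out_cap. Returns NULL on allocation failure. */
unsigned char *latin1_to_utf8_grow(const unsigned char *src, size_t len,
                                   size_t extra, size_t *out_len,
                                   size_t *out_cap)
{
    size_t head = 0, cap;
    unsigned char *dst, *d;

    /* One scan for the invariant head: bytes below 0x80 are unchanged. */
    while (head < len && src[head] < 0x80)
        head++;

    /* Only the tail can expand, and by at most a factor of two. */
    cap = head + 2 * (len - head) + 1 + extra;
    dst = malloc(cap);
    if (!dst)
        return NULL;

    memcpy(dst, src, head);             /* fast copy of the invariant head */
    d = dst + head;
    for (size_t i = head; i < len; i++) {
        if (src[i] < 0x80) {
            *d++ = src[i];
        } else {
            *d++ = (unsigned char)(0xC0 | (src[i] >> 6));
            *d++ = (unsigned char)(0x80 | (src[i] & 0x3F));
        }
    }
    *d = '\0';
    *out_len = (size_t)(d - dst);
    *out_cap = cap;
    return dst;
}
```

Reporting the capacity back to the caller is what lets a tokenizer-like caller skip the follow-up grow when the spare space already covers its unparsed input.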
|
| |
|
|
|
|
| |
which can be called from C code (such as the guts of extensions).
|