path: root/sv.c
Commit message | Author | Age | Files | Lines
* sv.c: Use modern macro names | Karl Williamson | 2017-09-09 | 1 | -4/+4
| | | | | This used older names for some locale functions, that have been superseded by more meaningful ones.
* Add API function Perl_langinfo() | Karl Williamson | 2017-09-09 | 1 | -0/+3
| | | | | | This is designed to generally replace nl_langinfo() in XS code. It is thread-safer, hides the quirks of perl's LC_NUMERIC handling, and can be used on systems lacking nl_langinfo.
* Add new API function sv_rvunweaken | Dagfinn Ilmari Mannsåker | 2017-09-04 | 1 | -1/+38
| | | | | | | Needed to fix in-place sort of weak references in a future commit. Stolen from Scalar::Util::unweaken, which will be made to use this when available via CPAN upstream.
* Revert "Perl_sv_vcatpvfn_flags: skip IN_LC(LC_NUMERIC)" | David Mitchell | 2017-08-08 | 1 | -4/+2
| This reverts commit c10a72e1914795f6399890aafae13734552645cd.
|
| I thought that if PL_numeric_radix_sv is true, then IN_LC(LC_NUMERIC) must be true, so no need to test for both. So I replaced the expensive IN_LC(LC_NUMERIC) test with an assert.
|
| But in http://nntp.perl.org/group/perl.perl5.porters/245455, Karl points out that the assert is triggering on HP-UX. So there's something wrong with my logic. So revert.
* set SVs_PADTMP flag on PL_sv_zero | David Mitchell | 2017-08-04 | 1 | -1/+2
| Where an op in scalar but not boolean context returns &PL_sv_zero as a more efficient way of doing sv_2mortal(newSViv(0)), the returned value must be mutable. For example
|
|     my @a = ();
|     my $r = \ scalar grep $_ == 1, @a;
|     $$r += 10;
|
| By setting the SVs_PADTMP flag, this forces pp_srefgen() and similar to make a mortal copy of &PL_sv_zero.
|
| This kind of defeats the original optimisation, but the copy only kicks in under certain circumstances, whereas the newSViv(0) approach would create a new mortal every time.
|
| See RT #78288 for where FC suggested the problem and the solution.
* PVLV-as-REGEXP: avoid PVX double free | David Mitchell | 2017-08-04 | 1 | -2/+10
| | | | | | | | | | | | | | With v5.27.2-30-gdf6b4bd, I changed the way that PVLVs store a regexp value (by making the xpv_len field point to a regexp struct). There was a bug in this, which caused the PVX buffer to be double freed. Several REGEXP SVs can share a PVX buffer. Only one of them will have a non-zero xpv_len field, and that SV gets to free the buffer. After the commit above, the non-zero xpv_len was triggering an extra free. This was showing up in smokes as failures in re/recompile.t when invoked with PERL_DESTRUCT_LEVEL=2 (which t/TEST does).
* make utf8::upgrade() of a REGEXP a NOOP | David Mitchell | 2017-08-04 | 1 | -1/+6
| RT #131821
|
| After my recent commit v5.27.2-30-gdf6b4bd, "give REGEXP SVs the POK flag again",
|
|     $r = qr/.../;
|     utf8::upgrade($$r);
|
| was setting the utf8 flag on the compiled REGEXP SV, which made no sense, as the regex was already compiled and individual nodes would remain non-utf8.
|
| The POK flag was removed from REGEXPs in 5.18.0, but before then it didn't seem to matter if the utf8 flag got set later, but it does now - it broke a Tk test.
* SvTRUE(): inline ROK, outline NOK | David Mitchell | 2017-07-27 | 1 | -0/+4
| | | | | | | | | | | | | | SvTRUE (and its variants) are wrappers around sv_2bool(), which attempt to test for the common cases without the overhead of a function call. This commit changes the definition of common: SvROK() becomes common: it's very common to test whether a variable is undef or a ref; SvNOK becomes uncommon: these days perl prefers IV values over NV values in SVs whenever possible, so testing the truth value of an NV is less common.
* Make immortal SVs contiguous | David Mitchell | 2017-07-27 | 1 | -0/+20
| Ensure that PL_sv_yes, PL_sv_undef, PL_sv_no and PL_sv_zero are allocated adjacently in memory.
|
| This allows the SvIMMORTAL() test to be more efficient, and will (in the next commit) allow SvTRUE() to be more efficient.
|
| In MULTIPLICITY builds the constraint is already met by virtue of them being adjacent items in the interpreter struct. For non-MULTIPLICITY builds, they were just 4 global vars with no guarantees of where they would be allocated. For this case, PL_sv_undef etc. are deleted as global vars and replaced with a new global var PL_sv_immortals[4], with
|
|     #define PL_sv_yes (PL_sv_immortals[0])
|
| etc. in their place.
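The adjacency trick can be sketched in plain C. This is only an illustration under stated assumptions, not the code in sv.c: `toy_sv`, `toy_immortals` and `toy_sv_is_immortal` are hypothetical stand-ins, and only the index of PL_sv_yes is given in the commit message; the other indices here simply follow the order in which the message lists the variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an SV head; not Perl's real struct. */
typedef struct toy_sv { int flags; } toy_sv;

/* A single array guarantees the four immortals are adjacent in memory. */
static toy_sv toy_immortals[4];

/* Index 0 is from the commit message; 1..3 are assumed for the sketch. */
#define TOY_sv_yes   (toy_immortals[0])
#define TOY_sv_undef (toy_immortals[1])
#define TOY_sv_no    (toy_immortals[2])
#define TOY_sv_zero  (toy_immortals[3])

/* Range test: two comparisons replace four separate equality tests. */
static int toy_sv_is_immortal(const toy_sv *sv)
{
    uintptr_t p, lo, hi;

    assert(sv != NULL);
    p  = (uintptr_t)sv;
    lo = (uintptr_t)&toy_immortals[0];
    hi = (uintptr_t)(toy_immortals + 4);   /* one past the last element */
    return p >= lo && p < hi;
}
```

Because the macros expand to array elements, existing code that takes `&TOY_sv_no` keeps working unchanged, which is presumably why the real change could be made with #defines rather than by touching every caller.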
* give REGEXP SVs the POK flag again | David Mitchell | 2017-07-27 | 1 | -31/+21
| Commit v5.17.5-99-g8d919b0 stopped SVt_REGEXP SVs (and PVLVs acting as regexes) from having the POK and pPOK flags set. This made things like SvOK() and SvTRUE() slower, because as well as the quick single test for any I/N/P/R flags, SvOK() also has to test for
|
|     (SvTYPE(sv) == SVt_REGEXP
|      || (SvFLAGS(sv) & (SVTYPEMASK|SVp_POK|SVpgv_GP|SVf_FAKE))
|          == (SVt_PVLV|SVf_FAKE))
|
| This commit fixes the issue fixed by g8d919b0 in a slightly different way, which is less invasive and allows the POK flag.
|
| Background:
|
| PVLVs are basically PVMGs with a few extra fields. They are intended to be a superset of all scalar types, so any scalar value can be assigned to a PVLV SV.
|
| However, once REGEXPs were made into first-class scalar SVs, this assumption broke - there are a whole bunch of fields in a regex SV body which can't be copied to a PVLV. So this broke:
|
|     sub f {
|         my $r = qr/abc/; # $r is reference to an SVt_REGEXP
|         $_[0] = $$r;
|     }
|
|     f($h{foo}); # the hash access is deferred - a temporary PVLV is
|                 # passed instead
|
| The basic idea behind the g8d919b0 fix was, for an LV-acting-as-regex, to attach both a PVLV body and a regex body to the SV head. This commit keeps this basic concept; it just changes how the extra body is attached.
|
| The original fix changed SVt_REGEXP SVs so that sv.sv_u.svu_pv no longer pointed to the regexp's string representation; instead this pointer was stored in a union made out of the xpv_len field. Doing this necessitated not turning the POK flag on for any REGEXP SVs.
|
| This freed up the sv_u to point to the regex body, while the sv_any field could continue to point to the PVLV body. An ReANY() macro was introduced that returned the sv_u field rather than the sv_any field.
|
| This commit changes it so that instead, on regexp SVs (and LV-as-regexp SVs), sv_u always points to the string buffer (so they can have POK set again), but on specifically LV-as-regex SVs, the xpv_len_u union of the PVLV body points to the regexp body.
|
| This means that SVt_REGEXP SVs are now completely "normal" again, and SVt_PVLV SVs are normal except in the one case where they hold a regex, in which case rather than storing the string buffer's length, the PVLV body stores a pointer to the regex body.
* sv_2bool_flags(): assume ROK implies SvRV | David Mitchell | 2017-07-27 | 1 | -1/+2
| If the SvROK flag is set, the SV must have a valid non-null SvRV() pointer, so don't bother to check whether it's null.
* add PL_sv_zero | David Mitchell | 2017-07-27 | 1 | -0/+13
| It's like PL_sv_no, except that its string value is "0" rather than "". It can be used, for example, where a pp function wants to push a zero return value on the stack. The next commit will start to use it.
|
| Also update SvIMMORTAL() to be more efficient: it now checks whether the SV's address is in a range rather than individually checking against &PL_sv_undef, &PL_sv_no etc.
* PERL_SNPRINTF_CHECK(): off by 1 error | David Mitchell | 2017-06-27 | 1 | -1/+9
| PERL_SNPRINTF_CHECK() is used as part of a wrapper for snprintf() to check that snprintf didn't return more bytes than the buffer size given to it (which should never happen anyway).
|
| But it was checking return_value >= buf_size rather than >. So a spurious panic could ensue if the formatted string exactly matched the buffer size.
|
| This hadn't been detected before because the old Perl_sv_vcatpvfn_flags() implementation added lots of fudge factors to the buffer size.
|
| At the same time, change the code in Perl_sv_vcatpvfn_flags() which grows PL_efloatbuf if it's not big enough for float_need:
|
| 1) Make it require the buf size to be at least float_need + 1 rather than just float_need, to accommodate the \0 appended by snprintf() (we don't strictly need the \0, and a conforming snprintf() implementation should just return the string without trailing \0 if there isn't room for it, but it's possible an snprintf() out there might stumble).
|
| 2) When growing PL_efloatbuf, grow by an extra margin of 0x20, to reduce the likelihood of multiple reallocs.
* sv.c: Refactor slightly to avoid a goto | Karl Williamson | 2017-06-08 | 1 | -23/+25
| | | | | The introduction of the inline function in the previous commit makes it clear that the code can be refactored to be more structured.
* sv.c: Convert to use is_utf8_invariant_string_loc | Karl Williamson | 2017-06-08 | 1 | -11/+7
| | | | | | | This inline function was added in the previous commit. And the function has the potential to be sped up by using word-at-a-time operations.
* sv.c: Clarify some comments | Karl Williamson | 2017-06-08 | 1 | -13/+13
|
* Perl_sv_vcatpvfn_flags: rename a label | David Mitchell | 2017-06-07 | 1 | -4/+4
| s/donevalidconversion/done_valid_conversion/ so it's a bit easier to read.
* sv_vcatpvfn_flags and wrappers: s/svmax/sv_count/ | David Mitchell | 2017-06-07 | 1 | -12/+12
| Rename the 'svmax' parameter of Perl_sv_vcatpvfn_flags(), Perl_sv_vcatpvfn(), Perl_sv_vsetpvfn(), to 'sv_count'. 'max' often implies N-1 (e.g. svargs[0]..svargs[svmax]), whereas it's actually the number of SV args passed to the functions.
* Perl_sv_vcatpvfn_flags: handle mixed utf8 better | David Mitchell | 2017-06-07 | 1 | -3/+18
| Once the output string gets upgraded to utf8 (e.g. due to a utf8 %s argument), any remaining appending of plain (non-%) parts of the format string becomes very inefficient. It basically creates an SV out of the next format chunk, upgrades that SV to utf8, then appends the upgraded buffer.
|
| This commit makes it just append the format chunk byte by byte, upgrading on the fly if that byte is !NATIVE_BYTE_IS_INVARIANT.
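A rough sketch of that byte-by-byte approach, assuming an ASCII platform where the source bytes are Latin-1. The helper names `byte_is_invariant` and `append_bytes_as_utf8` are hypothetical and stand in for the macro and the inlined loop; the real code works on SV buffers rather than bare arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for NATIVE_BYTE_IS_INVARIANT on an ASCII platform: bytes
 * below 0x80 have the same encoding in Latin-1 and in UTF-8. */
static int byte_is_invariant(unsigned char c)
{
    return c < 0x80;
}

/* Append len Latin-1 bytes from src to the UTF-8 buffer dst, upgrading
 * each variant byte as it is copied. dst needs room for 2 * len bytes
 * in the worst case. Returns the number of bytes written. */
static size_t append_bytes_as_utf8(unsigned char *dst,
                                   const unsigned char *src, size_t len)
{
    size_t i, out = 0;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (byte_is_invariant(c)) {
            dst[out++] = c;                               /* copy as-is */
        } else {
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));   /* lead byte */
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F)); /* continuation */
        }
    }
    return out;
}
```

The point of the design is that no temporary string is created or upgraded: each byte is either copied or expanded to two bytes directly into the destination.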
* add S_sv_catpvn_simple() for use by sprintf | David Mitchell | 2017-06-07 | 1 | -17/+25
| | | | | | | | | | Currently Perl_sv_vcatpvfn_flags() uses an unrolled sv_catpvn_nomg() to append floating point formats, a call to sv_catpvn_nomg() to append non-% parts of the format, and a few other non-performance-critical calls to sv_catpvn_nomg(). Move the unrolled code block into an inline static function, and make the non-% appending use it too.
* Perl_sv_vcatpvfn_flags: re-indent a code block | David Mitchell | 2017-06-07 | 1 | -10/+11
| | | | whitespace only
* Perl_sv_vcatpvfn_flags: eliminate p var | David Mitchell | 2017-06-07 | 1 | -15/+17
| | | | | | | It has 1500-line scope, and is equal to fmtstart-1 for most of the time. This also allows us to 'const'ify some variables better.
* Perl_sv_vcatpvfn_flags: clarify GCC bug comments | David Mitchell | 2017-06-07 | 1 | -2/+6
| | | | | | | | In particular it wasn't clear what bug was being worked around, nor that '#13488' referred to a GNU ticket rather than a perl ticket. This bug was fixed back in 2004, but the workaround is fairly harmless, so I've left it as-is.
* Perl_sv_vcatpvfn_flags: simplify alt handling | David Mitchell | 2017-06-07 | 1 | -3/+2
| | | | only do calculations for alt (#) formatting in the branches which use it
* Perl_sv_vcatpvfn_flags: rename 'p' var 's' | David Mitchell | 2017-06-07 | 1 | -14/+15
| In the 'append' block of code at the end of the loop, don't re-use the widely-scoped 'p' pointer; instead use a tightly-scoped var (named 's' so it doesn't clash with p, which is still valid in an outer scope).
* Perl_sv_vcatpvfn_flags: simplify format appending | David Mitchell | 2017-06-07 | 1 | -21/+27
| The bit at the end of the main loop has a whole bunch of conditionals along the lines of
|
|     if (gap && !left)
|         append gap
|     if (esignlen && !fill)
|         append esignbuf
|     if (zeros)
|         append zeros
|     if (elen)
|         append ebuf
|     if (gap && left)
|         append gap
|
| This involves many tests along the main code path to cope with all the possibilities (e.g. if left, gap is output after ebuf, otherwise before).
|
| Instead split it into a couple of major branches with duplication between the branches, but requiring few tests along any one code path. For example, sprintf("%5d", -1) formerly required 9 branches, 1 for loop, and 1 memset(). It now requires 2 branches and 3 for loops.
|
| I've removed memset()s and replaced them with for loops. For the short padding typically used (e.g. "%9d" rather than "%8192d") a loop is faster.
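The "two major branches with duplicated loops" shape can be sketched as follows. This is a simplified illustration, not the code in sv.c: `append_field` and its signature are hypothetical, only space padding is modelled (no '0' fill), and the parameter names gap, zeros and left merely echo the pseudo-code in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write one formatted field into dst and return the bytes written.
 * sign is '\0' when there is no sign character; zeros is the count of
 * leading '0's; gap is the count of padding spaces. dst must have room
 * for gap + 1 + zeros + strlen(digits) bytes. */
static size_t append_field(char *dst, char sign, size_t zeros,
                           const char *digits, size_t gap, int left)
{
    size_t i, n = 0, elen;

    assert(dst != NULL && digits != NULL);
    elen = strlen(digits);

    if (left) {
        /* Left-justified: sign, zeros, digits, then the gap. */
        if (sign) dst[n++] = sign;
        for (i = 0; i < zeros; i++) dst[n++] = '0';
        for (i = 0; i < elen; i++)  dst[n++] = digits[i];
        for (i = 0; i < gap; i++)   dst[n++] = ' ';
    } else {
        /* Right-justified: the gap first, then sign, zeros, digits. */
        for (i = 0; i < gap; i++)   dst[n++] = ' ';
        if (sign) dst[n++] = sign;
        for (i = 0; i < zeros; i++) dst[n++] = '0';
        for (i = 0; i < elen; i++)  dst[n++] = digits[i];
    }
    return n;
}
```

The left/right decision is made once, so each path runs straight through its loops instead of re-testing `left` before every piece; short loops stand in for memset() as the message describes.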
* Perl_sv_vcatpvfn_flags: eliminate a wrap check | David Mitchell | 2017-06-07 | 1 | -12/+5
| | | | This is one case where it can never wrap, so don't check.
* Perl_sv_vcatpvfn_flags: simpler special formats | David Mitchell | 2017-06-07 | 1 | -47/+47
| At the top of Perl_sv_vcatpvfn_flags(), certain fixed formats are special-cased: "", "%s", "%-p", "%.0f".
|
| Simplify the code which handles these. In particular, don't try to issue "missing" or "redundant" arg warnings there. Instead, check for the correct number of args as part of the test for whether this can be special-cased, and if not, fall through to the general code in the main body of the function to handle that format and issue any warnings.
|
| This makes the code a lot simpler. It also now detects the redundant arg in printf("%.0f",1,2). The code is now also more efficient - it tries to check for things like pat[0] == '%' only once, rather than re-checking for every special-case variant it's trying.
* Perl_sv_vcatpvfn_flags: simpler redundant arg test | David Mitchell | 2017-06-07 | 1 | -7/+7
| 5.24.0 added a new warning:
|
|     Redundant argument in printf at ....
|
| That warning is issued if there are more args than format elements. However, it may also warn for an invalid format - e.g. for something like
|
|     printf("%Z%d", 1,2)
|
| you get both
|
|     Invalid conversion in printf: "%Z" at ...
|     Redundant argument in printf at ...
|
| Personally I think that once one part of the format has been determined to be invalid, it's hard for perl to second-guess in what way the format was invalid, and thus to be able to conclude that there is in fact a redundant arg.
|
| So this commit suppresses any "redundant" warning once an "invalid" warning has been issued.
|
| Doing this makes it possible to simplify the code and remove the used_explicit_ix variable. Apart from warnings, used_explicit_ix was only used in %p to check for 'simple' special forms - but that code checks for a trailing '$' character anyway, so that test was redundant.
* Perl_sv_vcatpvfn_flags: fix comment typo | David Mitchell | 2017-06-07 | 1 | -1/+1
|
* Perl_sv_vcatpvfn_flags: add comment about wrap | David Mitchell | 2017-06-07 | 1 | -0/+3
|
* Perl_sv_vcatpvfn_flags: only do utf8 in radix code | David Mitchell | 2017-06-07 | 1 | -16/+13
| | | | | | | | | For floating point formats, the output can only be utf8 if the radix point is utf8. Currently the radix point code sets the is_utf8 variable, then later, in the main floating-point code path, it tests is_utf8 and upgrades the output string to utf8. Instead, just do the upgrade directly in the radix code block.
* Perl_sv_vcatpvfn_flags: simplify radix len adding | David Mitchell | 2017-06-07 | 1 | -9/+8
| | | | | Assume the length of the radix point is a constant 1 (i.e. length('.')) and only increment float_need further if we're in a locale.
* sprintf %a/%A more sanity checks | David Mitchell | 2017-06-07 | 1 | -0/+11
| | | | | For the code which generates hexadecimal floating-point formats, add extra sanity checks against buffer overruns.
* S_hextract(): fix #if indentation | David Mitchell | 2017-06-07 | 1 | -5/+6
| | | | | | | a complex set of nested #if/#else/#endif's had incorrect and confusing indentation. whitespace-only change
* Perl_sv_vcatpvfn_flags: simplify some wrap checks | David Mitchell | 2017-06-07 | 1 | -4/+9
| | | | Skip doing some overflow checks when we know it can't overflow.
* Perl_sv_vcatpvfn_flags: simplify float_need calc | David Mitchell | 2017-06-07 | 1 | -12/+7
| | | | | Include another constant addition in the initial assignment, to eliminate a later wrap check.
* S_format_hexfp(): s/int/STRLEN/ | David Mitchell | 2017-06-07 | 1 | -1/+1
| | | | | | | | | In the helper function that sprintf's %a/%A hex floating point values, the calculation of the number of zeros to pad with should be in terms of STRLEN rather than int. A bit academic unless someone ever tries to print a hex f/p value with a precision > 2Gb digits.
* Perl_sv_vcatpvfn_flags: add inits to silence gcc | David Mitchell | 2017-06-07 | 1 | -4/+4
| | | | | | | Add a couple of unnecessary variable initialisers, to keep gcc's "this variable might be used uninitialised - then again it might not - in fact I don't really know what I'm talking about, but I've decided to annoy you with it anyway" warning at bay.
* Perl_sv_vcatpvfn_flags: avoid wrap on precision | David Mitchell | 2017-06-07 | 1 | -3/+10
| Where the precision is specified literally in the format string, the integer precision value could wrap. Instead, make it croak with
|
|     Integer overflow in format string
|
| As in other recent commits, the upper limit is set at 1/4 of STRLEN.
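A minimal sketch of parsing a literal width or precision with a wrap check. It is not the code in sv.c: `parse_width` and `TOY_WIDTH_MAX` are hypothetical names, size_t stands in for STRLEN, and a -1 return marks the point where Perl would croak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap: a quarter of the largest size_t, echoing the
 * 1/4 safety margin mentioned in the commit message. */
#define TOY_WIDTH_MAX (SIZE_MAX / 4)

/* Parse a run of decimal digits at *pp.
 * On success stores the value in *out, advances *pp past the digits
 * and returns 0. Returns -1 if the value would exceed the cap. */
static int parse_width(const char **pp, size_t *out)
{
    const char *p;
    size_t val = 0;

    assert(pp != NULL && *pp != NULL && out != NULL);
    p = *pp;
    while (*p >= '0' && *p <= '9') {
        size_t digit = (size_t)(*p - '0');
        /* Test before multiplying, so val * 10 + digit can never wrap. */
        if (val > (TOY_WIDTH_MAX - digit) / 10)
            return -1;
        val = val * 10 + digit;
        p++;
    }
    *pp = p;
    *out = val;
    return 0;
}
```

Checking against the cap before each multiplication means the accumulator itself never overflows, so the test does not depend on detecting a wrap after the fact.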
* Perl_sv_vcatpvfn_flags: s/int/STRLEN/g | David Mitchell | 2017-06-07 | 1 | -5/+5
| There were a few residual places that used int loop counters, e.g. to prepend N '0's to a number. Since the N's are of type STRLEN, make the loop counters STRLEN too. It's a bit academic since you're unlikely to have a number needing >2Gb worth of zero padding, but it makes things consistent and easier to audit.
|
| At this point I believe that any remaining usage of int / I32 / U32 in Perl_sv_vcatpvfn_flags() is legitimate.
* Perl_sv_vcatpvfn_flags: %n: avoid wrap | David Mitchell | 2017-06-07 | 1 | -3/+5
| It's a bit academic, but in principle if a string was longer than 2Gb chars, the length as set by %n could wrap. So use the correct type(s).
* Perl_sv_vcatpvfn_flags: width/precis arg wrap | David Mitchell | 2017-06-07 | 1 | -21/+83
| When the width or precision is specified via an argument rather than literally, check whether the value wraps. Formerly, something like
|
|     $w = 0x100000005;
|     printf "%*s", $w, "abc";
|
| might print " abc" or similar, depending on platform. Now it croaks with "Integer overflow in format string".
|
| I did wonder whether it should just warn instead, but:
|
| 1) over-large literal widths/precisions already croak.
| 2) Code that has wild field specifiers like that is already likely to crash with an out-of-memory error.
| 3) At least this croak is trappable via eval - OOM isn't.
|
| I also set the maximum allowed value to be 1/4 of the size of a pointer, to give a safety margin for possible wrapping later.
* Perl_sv_vcatpvfn_flags: move vector initialisation | David Mitchell | 2017-06-07 | 1 | -37/+35
| Move the generation of vecstr/veclen/vec_utf8 into the vector-initialisation block, rather than being part of the general 'get next arg' block.
|
| Also, stop vecsv being in scope for the whole of the loop block, and make it two separate tightly-scoped vars (with different purposes).
* Perl_sv_vcatpvfn_flags: warn on missing %v arg | David Mitchell | 2017-06-07 | 1 | -1/+2
| | | | The explicit arg variant, e.g. %3$vd, didn't give 'missing arg' warning.
* Perl_sv_vcatpvfn_flags: warn on missing width arg | David Mitchell | 2017-06-07 | 1 | -1/+1
| Previously it didn't warn when the width value was to be obtained from the next or specified arg, and there wasn't such an arg.
* Eliminate FETCH_VCATPVFN_ARGUMENT macro | David Mitchell | 2017-06-07 | 1 | -17/+7
| | | | | This can be simplified so much now that it might as well just be expanded in situ for its 3 uses.
* Perl_sv_vcatpvfn_flags: re-indent block | David Mitchell | 2017-06-07 | 1 | -16/+16
| | | | whitespace-only
* Perl_sv_vcatpvfn_flags: unify %v vers obj handling | David Mitchell | 2017-06-07 | 1 | -18/+9
| Currently sv_vcatpvfn_flags() has special handling of the arg under %v when the arg is a version object, but only via the perlish interface (argsv and svmax). This commit extends that handling to the C-ish interface (args). There seems no good reason not to, and it simplifies the code.
* Perl_sv_vcatpvfn_flags: unify args handling | David Mitchell | 2017-06-07 | 1 | -22/+16
| Several places do something along the lines of:
|
|     if (explicit arg index)
|         FETCH_VCATPVFN_ARGUMENT(...., svargs[ix-1])
|     else
|         FETCH_VCATPVFN_ARGUMENT(...., svargs[svix++])
|
| For each of these, reduce the duplicate code by changing the above to (approximately)
|
|     ix = ix ? ix - 1 : svix++;
|     FETCH_VCATPVFN_ARGUMENT(...., svargs[ix])
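The index selection can be isolated in a few lines of C. This is only a sketch of the idiom: `pick_arg_index` is a hypothetical helper, and the convention that an explicit index is 1-based with 0 meaning "none given" is inferred from the `ix - 1` in the pseudo-code.

```c
#include <assert.h>
#include <stddef.h>

/* Choose which argument a format element consumes.
 * explicit_ix is the 1-based index from an explicit-index specifier,
 * or 0 if none was given; *next_ix is the running 0-based counter for
 * implicit arguments and is advanced only when it is used. */
static size_t pick_arg_index(size_t explicit_ix, size_t *next_ix)
{
    assert(next_ix != NULL);
    return explicit_ix ? explicit_ix - 1 : (*next_ix)++;
}
```

With the index computed once, a single fetch of `svargs[ix]` replaces the two near-identical branches.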