path: root/embed.fnc
Commit message (author, date, files changed, lines -removed/+added)
* set FD_CLOEXEC atomically in easy cases (Zefram, 2017-12-22, 1 file, -0/+3)
| | | | | | | In many places where a file descriptor is being opened, open it with FD_CLOEXEC already set if possible. This commit covers the easy cases, where the file descriptor arises without the use of PerlIO, pp_open, or my_popen.
* *_cloexec() I/O functions (Zefram, 2017-12-22, 1 file, -0/+19)
| | | | | | | | | | | | | | | | | New functions PerlLIO_dup_cloexec(), PerlLIO_dup2_cloexec(), PerlLIO_open_cloexec(), PerlLIO_open3_cloexec(), PerlProc_pipe_cloexec(), PerlSock_socket_cloexec(), PerlSock_accept_cloexec(), and PerlSock_socketpair_cloexec() each do the same thing as their "_cloexec"-less counterpart, but return with the FD_CLOEXEC flag set on each new file descriptor. They set the flag atomically as part of the file descriptor creation syscall where possible, but will fall back to setting it separately from creation where necessary. In all cases, setting the flag atomically depends not only on the correct syscall interface being defined, but on it being actually implemented in the runtime kernel. Each function will experiment to see whether the atomic flag setting actually works, and is prepared for the flag to cause EINVAL or ENOSYS or to be ignored.
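The try-atomically-then-fall-back behaviour described above can be sketched in plain POSIX C. This is only an illustration, not the Perl core code: `open_cloexec_sketch()` and `set_cloexec()` are hypothetical names, the sketch assumes `flags` does not include `O_CREAT`, and unlike the description above it does not remember the outcome of the experiment between calls.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set FD_CLOEXEC non-atomically, after creation. */
static int set_cloexec(int fd)
{
    int fdflags = fcntl(fd, F_GETFD);
    if (fdflags == -1)
        return -1;
    return fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC);
}

/* Hypothetical sketch: open a file so that the returned descriptor
 * always has FD_CLOEXEC set, atomically where the kernel allows it. */
int open_cloexec_sketch(const char *path, int flags)
{
    int fd;
#ifdef O_CLOEXEC
    fd = open(path, flags | O_CLOEXEC);
    if (fd != -1) {
        /* The kernel may have silently ignored the flag, so verify. */
        int fdflags = fcntl(fd, F_GETFD);
        if (fdflags != -1 && (fdflags & FD_CLOEXEC))
            return fd;
        if (set_cloexec(fd) == -1) {
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        return fd;
    }
    /* Only EINVAL/ENOSYS suggest the flag itself was rejected. */
    if (errno != EINVAL && errno != ENOSYS)
        return -1;
#endif
    /* Fallback: create the descriptor, then set the flag separately. */
    fd = open(path, flags);
    if (fd == -1)
        return -1;
    if (set_cloexec(fd) == -1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

The fallback path leaves a short window in which another thread could fork and exec with the descriptor still inheritable, which is why the atomic path is preferred whenever it works.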
* merge branch zefram/dumb_match (Zefram, 2017-12-17, 1 file, -18/+6)
|\
| * better name for parameter to newGIVENOP() (Zefram, 2017-12-05, 1 file, -1/+1)
| | | | | | | | "cond" sounds like it's a condition. It's actually supplying a topic.
| * internally change "when" to "whereso" (Zefram, 2017-12-05, 1 file, -4/+4)
| | | | | | | | | | The names of ops, context types, functions, etc., all change in accordance with the change of keyword.
| * make "when" do implicit "next" (Zefram, 2017-11-29, 1 file, -1/+0)
| | | | | | | | | | | | | | | | A "when" construct, upon reaching the end of its conditionally-executed block, used to perform an implicit jump to the end of the enclosing topicalizer, defined as either a "given" block or a "foreach" operating on $_. Change it to jump to the enclosing loop of any kind (which now includes "given" blocks).
| * make loop control apply to "given" (Zefram, 2017-11-29, 1 file, -2/+1)
| | | | | | | | A "given" construct is now officially a one-iteration loop.
| * use LOOP struct for entergiven op (Zefram, 2017-11-29, 1 file, -2/+0)
| | | | | | | | | | This will support the upcoming change to let loop control ops apply to "given" blocks.
| * remove useless "default" mechanism (Zefram, 2017-11-28, 1 file, -1/+1)
| |
| * eviscerate smartmatch (Zefram, 2017-11-22, 1 file, -6/+0)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Regularise smartmatch's operand handling, by removing the implicit enreferencement and just supplying scalar context. Eviscerate its runtime behaviour, by removing all the matching rules other than rhs overloading. Overload smartmatching in the Regexp package to perform regexp matching. There are consequential customisations to autodie, in two areas. Firstly, autodie::exception objects are matchers, but autodie has been advising smartmatching with the exception on the lhs. This has to change to the rhs, in both documentation and tests. Secondly, it uses smartmatching as part of its hint mechanism. Most of the hint examples, in documentation and tests, have to change to subroutines, to be portable across Perl versions.
| * remove unused arg from newGIVWHENOP() (Zefram, 2017-11-21, 1 file, -2/+1)
| | | | | | | | The entertarg argument hasn't been used for a long time.
| * regularise "when" (Zefram, 2017-11-21, 1 file, -1/+0)
| | | | | | | | | | | | | | | | | | | | | | Remove from "when" the implicit enreferencement of array/hash conditions and the implicit smartmatch of most conditions. Delete most of the documentation about behaviour of older versions of given/when, because explaining the now-old "when" behaviour would be excessively cumbersome and there's little compatibility to take advantage of. Delete the documentation about differences of given/when from the Perl 6 feature, because the differences are now even more extensive and it's too much difference to sensibly explain. Add tests of "when" in isolation.
* | widen size-type variables in pack/unpack (Zefram, 2017-12-16, 1 file, -5/+5)
| | | | | | | | | | | | Most size-type variables in pp_pack.c were of type I32, with a smattering of other types. Use SSize_t in place of I32, and generally use size_t-width variables as appropriate. Fixes [perl #119367].
* | make exec keep its argument list more reliably (Zefram, 2017-12-14, 1 file, -1/+0)
| | | | | | | | | | | | | | | | | | | | Bits of exec code were putting the constructed commands into globals PL_Argv and PL_Cmd, which could then be clobbered by reentrancy. These are only global in order to manage their freeing, but that's better managed by using the scope stack. So replace them with automatic variables, with ENTER/SAVEFREEPV/LEAVE to free the memory. Also copy the strings acquired from SVs, to avoid magic clobbering the buffers of SVs already read. Fixes [perl #129888].
* | Add variant_under_utf8_count() core function (Karl Williamson, 2017-12-11, 1 file, -0/+4)
| |   This function takes a string that isn't encoded in UTF-8 (hence is assumed to be in Latin1), and counts how many of the bytes therein would change if it were to be translated into UTF-8. Each such byte would occupy two UTF-8 bytes. This function is useful for calculating the expansion factor precisely when converting to UTF-8, so as to know how much to malloc.
| |
| |   This function uses a non-obvious method to do the calculations word-at-a-time, as opposed to the byte-at-a-time method used now, and hence should be much faster than the current methods.
| |
| |   The performance change in short string lengths is equivocal. Here is the result for a single character and a 64-bit word.
| |
| |                bytes     words   Ratio %
| |             --------  --------  --------
| |   Ir           932.0     947.0      98.4
| |   Dr           325.0     325.0     100.0
| |   Dw           104.0     104.0     100.0
| |   COND         136.0     137.0      99.3
| |   IND           28.0      28.0     100.0
| |   COND_m         1.0       0.0       Inf
| |   IND_m          6.0       6.0     100.0
| |
| |   There are some extra instructions executed and an extra branch to check for and handle the case where we can go word-by-word vs. not. But the one cache miss is removed. The results are essentially the same until we get to being able to handle a full word. Some of the extra instructions are to ensure that if the input is not aligned on a word boundary, that performance doesn't suffer.
| |
| |   Here's the results for 8-bytes on a 64-bit system.
| |
| |                bytes     words   Ratio %
| |             --------  --------  --------
| |   Ir           974.0     955.0     102.0
| |   Dr           332.0     325.0     102.2
| |   Dw           104.0     104.0     100.0
| |   COND         143.0     138.0     103.6
| |   IND           28.0      28.0     100.0
| |   COND_m         1.0       0.0       Inf
| |   IND_m          6.0       6.0     100.0
| |
| |   Things keep improving as the strings get longer. Here's for 24 bytes.
| |
| |                bytes     words   Ratio %
| |             --------  --------  --------
| |   Ir          1070.0     975.0     109.7
| |   Dr           348.0     327.0     106.4
| |   Dw           104.0     104.0     100.0
| |   COND         159.0     140.0     113.6
| |   IND           28.0      28.0     100.0
| |   COND_m         2.0       0.0       Inf
| |   IND_m          6.0       6.0     100.0
| |
| |   And 96:
| |
| |                bytes     words   Ratio %
| |             --------  --------  --------
| |   Ir          1502.0    1065.0     141.0
| |   Dr           420.0     336.0     125.0
| |   Dw           104.0     104.0     100.0
| |   COND         231.0     149.0     155.0
| |   IND           28.0      28.0     100.0
| |   COND_m         2.0       1.0     200.0
| |   IND_m          6.0       6.0     100.0
| |
| |   And 10,000
| |
| |                bytes     words   Ratio %
| |             --------  --------  --------
| |   Ir         60926.0   13445.0     453.1
| |   Dr         10324.0    1574.0     655.9
| |   Dw           104.0     104.0     100.0
| |   COND       10135.0    1387.0     730.7
| |   IND           28.0      28.0     100.0
| |   COND_m         2.0       1.0     200.0
| |   IND_m          6.0       6.0     100.0
| |
| |   I found this trick on the internet many years ago, but I can't seem to find it again to give them credit.
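The commit message does not spell out the word-at-a-time method, so the following is only one well-known way to count high-bit bytes a 64-bit word at a time, not necessarily the trick used in the core. `count_high_bit_bytes()` is a hypothetical name; the sketch assumes an ASCII platform (where the bytes that change under UTF-8 are exactly those with the high bit set) and uses `memcpy()` to load words instead of the alignment handling the commit describes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: count the bytes in s[0..len) whose high bit is
 * set, processing eight bytes per loop iteration where possible. */
size_t count_high_bit_bytes(const unsigned char *s, size_t len)
{
    const uint64_t ones = UINT64_C(0x0101010101010101);
    size_t count = 0;
    size_t i = 0;

    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);   /* safe for unaligned input */
        /* Shift each byte's high bit down to its low bit, mask the
         * rest away, then multiply so that the top byte of the product
         * holds the sum of all eight 0/1 values (at most 8). */
        count += (size_t)((((w >> 7) & ones) * ones) >> 56);
    }
    for (; i < len; i++)               /* leftover bytes, one at a time */
        count += s[i] >> 7;
    return count;
}
```

A caller converting a Latin1 buffer of `len` bytes could then allocate `len + count_high_bit_bytes(s, len)` bytes (plus a terminator) for the UTF-8 result, since each counted byte expands to two.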
* | utf8_length() is not a pure function (Karl Williamson, 2017-12-07, 1 file, -1/+1)
| | | | | | | | Because it can output warnings.
* | document newATTRSUB_x() (Zefram, 2017-12-05, 1 file, -1/+1)
| |
* | document newXS_len_flags() (Zefram, 2017-12-05, 1 file, -1/+1)
| |
* | Use is_utf8_invariant_string() more (Karl Williamson, 2017-11-27, 1 file, -9/+9)
| |   Now that this function was changed to do word-at-a-time searching in commit e17544a60909ed9555c0dad7cd24afc40eb736e7, we can more quickly find the first variant byte in a string, if any. Given that a lot of usage of Perl is on ASCII data, it makes sense to try this first before any byte-at-a-time processing. Since Perl can be used on things that are mostly non-ASCII, we give up at the first such one, and process the rest of the string byte-by-byte. Otherwise we could have a pipeline of finding the next variant quickly, but this would only be faster if variants were rare, which I don't feel we can be confident about, after finding at least one.
* | [perl #132485] Warn about "$foo'bar" (Father Chrysostomos, 2017-11-26, 1 file, -1/+2)
| |
* | Add is_utf8_non_invariant_string() (Karl Williamson, 2017-11-26, 1 file, -0/+4)
| |   This function tells whether or not its argument is a sequence of bytes that is legal Perl-extended-UTF-8, and which either requires UTF-8 (because it contains wide characters) or would have a different representation when not under UTF-8. This paradigm is used in several places in the perl core to decide whether to turn on an SV's utf8 flag. None of those places realized that there was a simple way to avoid rescanning the string (though perhaps a good C optimizer would). This commit creates a function that does this task without the rescan; the next commits will convert to use this function.
* | Change 3 functions to be #defines of others (Karl Williamson, 2017-11-26, 1 file, -3/+3)
| | | | | | | | | | | | | | | | | | | | | | I made these separate functions because I thought it would make faster code, but I realized that modern compilers should be able to optimize the more general functions into the same code as the ones removed by this commit, given that the parameters are known to be 0 at compile time. It's easier to maintain one version of a function than two, so this commit favors that.
* | inline.h: Avoid some extra strlen() (Karl Williamson, 2017-11-26, 1 file, -4/+4)
| | | | | | | | | | | | | | | | The API of these functions says that if the length is 0, strlen() is called to compute it. In several cases, control is handed off to a function using 0, throwing away the already-computed length. Change to use the computed length when calling the functions, avoiding the issue.
* | Search for UTF-8 invariants by word (Karl Williamson, 2017-11-23, 1 file, -2/+2)
|/    The functions is_utf8_invariant_string() and is_utf8_invariant_string_loc() are used in several places in the core and are part of the public API. This commit speeds them up significantly on ASCII (not EBCDIC) platforms, by changing to use word-at-a-time parsing instead of per-byte. (Per-byte is retained for any initial bytes to reach the next word boundary, and any final bytes that don't fill an entire word.)
|
|     The following results were obtained parsing a long string on a 64-bit word machine:
|
|                  byte    word
|                ------  ------
|     Ir         100.00  665.35
|     Dr         100.00  797.03
|     Dw         100.00  102.12
|     COND       100.00  799.27
|     IND        100.00   97.56
|     COND_m     100.00  144.83
|     IND_m      100.00   75.00
|     Ir_m1      100.00  100.00
|     Dr_m1      100.00  100.02
|     Dw_m1      100.00  104.12
|     Ir_mm      100.00  100.00
|     Dr_mm      100.00  100.00
|     Dw_mm      100.00  100.00
|
|     100% is baseline; numbers larger than that are improvements. The COND measurement indicates, for example, that there are 1/8 as many conditional branches in the word-at-a-time version.
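As a rough illustration of the word-at-a-time idea, the sketch below tests eight bytes per comparison for a set high bit. It is not the core implementation: `is_ascii_invariant_sketch()` is a hypothetical name, it assumes an ASCII platform with 64-bit words, and it loads words with `memcpy()` rather than stepping byte-by-byte to the next word boundary as the commit describes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: true if no byte in s[0..len) has its high bit
 * set, i.e. the string reads the same whether or not it is UTF-8. */
bool is_ascii_invariant_sketch(const unsigned char *s, size_t len)
{
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    size_t i = 0;

    /* One test and one branch per eight bytes. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);
        if (w & high_bits)
            return false;
    }
    /* Final bytes that do not fill an entire word. */
    for (; i < len; i++) {
        if (s[i] & 0x80)
            return false;
    }
    return true;
}
```

Doing one branch per word instead of one per byte is consistent with the roughly eightfold drop in conditional branches reported in the table above.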
* rip out quicksort and sort algorithm control (Zefram, 2017-11-17, 1 file, -1/+0)
| | | | [perl #119635]
* embed.fnc: Add X flag to newly private UTF16 fcns (Karl Williamson, 2017-11-16, 1 file, -2/+2)
|   The E flag added in cfd95a374972942cba5e8afc019dc6019815b45c requires the function either to have the X flag or to be private to its containing file. Spotted by Craig Berry.
* Remove UTF16 functions from public access (Karl Williamson, 2017-11-16, 1 file, -2/+2)
|   See thread starting at http://nntp.perl.org/group/perl.perl5.porters/247120
|
|   I don't believe this needs a perldelta, as the functions weren't documented, hence are not supposed to be used, and in fact are not used in CPAN.
* add wrap_keyword_plugin function (RT #132413) (Lukas Mai, 2017-11-11, 1 file, -0/+1)
|
* embed.fnc: Change fcn from A to X (Karl Williamson, 2017-11-08, 1 file, -1/+1)
| | | | | | | | This function is marked as accessible anywhere, but experimental, and so is changeable at any time without notice, and its name begins with an underscore to indicate its private nature. I didn't know at the time I wrote it that we have an existing mechanism to deal with functions whose only use should be a public macro. This changes to use that mechanism.
* Change name of internal function (Karl Williamson, 2017-11-08, 1 file, -1/+1)
| | | | | Following on the previous commit, this changes the name of the function that changes the variable to be in sync with it.
* _byte_dump_string(): Don't output leading space (Karl Williamson, 2017-11-08, 1 file, -1/+1)
| | | | | This changes this function to not put an initial space character in the returned string.
* locale.c: Change static fcn name (Karl Williamson, 2017-11-08, 1 file, -1/+1)
| | | | The new name more closely reflects what it does
* locale.c: Refactor static fcn to save work (Karl Williamson, 2017-11-08, 1 file, -1/+1)
|   This adds a parameter to the function that sets the radix character for floating point numbers. We know that the radix by default is a dot, so no need to calculate it in that case.
|
|   This code was previously using localeconv() to find the locale's decimal point. The just added my_nl_langinfo() fcn does the same with an easier API, and is more thread safe, and automatically switches to use localeconv() when nl_langinfo() isn't available, so revise the conditional compilation directives that previously were necessary, and collapse directives that were unnecessarily nested.
|
|   Also adjust indentation.
* locale.c: Create extended internal Perl_langinfo() (Karl Williamson, 2017-11-08, 1 file, -0/+5)
| | | | | | | | This extended version allows it to be called so that it uses the current locale for the LC_NUMERIC, instead of toggling to the underlying one. (This can be useful when in the middle of things.) This ability won't be used until the next commit
* toke.c: Add limit parameter to 3 static functions (Karl Williamson, 2017-11-06, 1 file, -3/+3)
|   This will make it possible to fix the handling of embedded NULs in the next commits.
* dquote.c: Use memchr() instead of strchr() (Karl Williamson, 2017-11-06, 1 file, -2/+6)
| | | | | | | This allows \x and \o to work properly in the face of embedded NULs. A limit parameter is added to each function, and that is passed to memchr (which replaces strchr). See the branch merge message for more information.
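A small example of why the limit parameter matters: `strchr()` stops at the first NUL, while `memchr()` searches an explicit length. The helper name `find_byte()` is hypothetical and is not one of the dquote.c functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the first occurrence of c within the
 * first len bytes of s, or -1 if absent. Embedded NULs are ordinary
 * bytes here, because the search is bounded by len, not by a NUL. */
ptrdiff_t find_byte(const char *s, size_t len, char c)
{
    const char *hit = (const char *)memchr(s, c, len);
    return hit ? hit - s : (ptrdiff_t)-1;
}
```

With the four-byte buffer `"a\0b}"`, `find_byte(buf, 4, '}')` finds the closing brace at offset 3, whereas `strchr(buf, '}')` returns NULL because it never looks past the embedded NUL.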
* Add my_memrchr() implementation of memrchr() (Karl Williamson, 2017-11-01, 1 file, -0/+3)
|   On platforms that have memrchr(), my_memrchr() maps to that instead. This is useful functionality, lacking on many platforms. This commit also uses the new function in two places in the core where the comments previously indicated it would be advantageous to use it if we had it. It is left usable only in core, so that if this turns out to have been a bad idea, it can be easily removed.
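For platforms without memrchr(), a fallback can be as simple as a backwards byte scan. The version below is a minimal sketch under that assumption; `memrchr_sketch()` is a hypothetical name and the core's my_memrchr() may differ in signature and detail.

```c
#include <stddef.h>

/* Hypothetical fallback: return a pointer to the last occurrence of
 * byte c within the first n bytes of s, or NULL if there is none. */
void *memrchr_sketch(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s + n;

    while (n--) {
        if (*--p == (unsigned char)c)
            return (void *)p;   /* cast away const, as memchr() does */
    }
    return NULL;
}
```

A typical use is finding the last path separator in a length-delimited buffer without first scanning forward for a terminating NUL.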
* Add OP_MULTICONCAT op (David Mitchell, 2017-10-31, 1 file, -0/+1)
|   Allow multiple OP_CONCAT, OP_CONST ops, plus optionally an OP_SASSIGN or OP_STRINGIFY, to be combined into a single OP_MULTICONCAT op, which can make things a *lot* faster: 4x or more.
|
|   In more detail: it will optimise into a single OP_MULTICONCAT, most expressions of the form
|
|       LHS RHS
|
|   where LHS is one of
|
|       (empty)
|       my $lexical =
|       $lexical    =
|       $lexical   .=
|       expression  =
|       expression .=
|
|   and RHS is one of
|
|       (A . B . C . ...)              where A,B,C etc are expressions and/or string constants
|
|       "aAbBc..."                     where a,A,b,B etc are expressions and/or string constants
|
|       sprintf "..%s..%s..", A,B,..   where the format is a constant string containing only '%s' and '%%' elements, and A,B, etc are scalar expressions (so only a fixed, compile-time-known number of args: no arrays or list context function calls etc)
|
|   It doesn't optimise other forms, such as
|
|       ($a . $b) . ($c. $d)
|
|       ((($a .= $b) .= $c) .= $d);
|
|   (although sub-parts of those expressions might be converted to an OP_MULTICONCAT). This is partly because it would be hard to maintain the correct ordering of tie or overload calls.
|
|   The compiler uses heuristics to determine when to convert: in general, expressions involving a single OP_CONCAT aren't converted, unless some other saving can be made, for example if an OP_CONST can be eliminated, or in the presence of 'my $x = .. ' which OP_MULTICONCAT can apply OPpTARGET_MY to, but OP_CONST can't.
|
|   The multiconcat op is of type UNOP_AUX, with the op_aux structure directly holding a pointer to a single constant char* string plus a list of segment lengths. So for
|
|       "a=$a b=$b\n";
|
|   the constant string is "a= b=\n", and the segment lengths are (2,3,1). If the constant string has different non-utf8 and utf8 representations (such as "\x80") then both variants are pre-computed and stored in the aux struct, along with two sets of segment lengths.
|
|   For all the above LHS types, any SASSIGN op is optimised away. For a LHS of '$lex=', '$lex.=' or 'my $lex=', the PADSV is optimised away too.
|
|   For example where $a and $b are lexical vars, this statement:
|
|       my $c = "a=$a, b=$b\n";
|
|   formerly compiled to
|
|       const[PV "a="] s
|       padsv[$a:1,3] s
|       concat[t4] sK/2
|       const[PV ", b="] s
|       concat[t5] sKS/2
|       padsv[$b:1,3] s
|       concat[t6] sKS/2
|       const[PV "\n"] s
|       concat[t7] sKS/2
|       padsv[$c:2,3] sRM*/LVINTRO
|       sassign vKS/2
|
|   and now compiles to:
|
|       padsv[$a:1,3] s
|       padsv[$b:1,3] s
|       multiconcat("a=, b=\n",2,4,1)[$c:2,3] vK/LVINTRO,TARGMY,STRINGIFY
|
|   In terms of how much faster it is, this code:
|
|       my $a = "the quick brown fox jumps over the lazy dog";
|       my $b = "to be, or not to be; sorry, what was the question again?";
|
|       for my $i (1..10_000_000) {
|           my $c = "a=$a, b=$b\n";
|       }
|
|   runs 2.7 times faster, and if you throw utf8 mixtures in it gets even better. This loop runs 4 times faster:
|
|       my $s;
|       my $a = "ab\x{100}cde";
|       my $b = "fghij";
|       my $c = "\x{101}klmn";
|
|       for my $i (1..10_000_000) {
|           $s = "\x{100}wxyz";
|           $s .= "foo=$a bar=$b baz=$c";
|       }
|
|   The main ways in which OP_MULTICONCAT gains its speed are:
|
|   * any OP_CONSTs are eliminated, and the constant bits (already in the right encoding) are copied directly from the constant string attached to the op's aux structure.
|
|   * It optimises away any SASSIGN op, and possibly a PADSV op on the LHS, in all cases; OP_CONCAT only did this in very limited circumstances.
|
|   * Because it has a holistic view of the entire concatenation expression, it can do the whole thing in one efficient go, rather than creating and copying intermediate results. pp_multiconcat() goes to considerable efforts to avoid inefficiencies. For example it will only SvGROW() the target once, and to the exact size needed, no matter what mix of utf8 and non-utf8 appear on the LHS and RHS. It never allocates any temporary SVs except possibly in the case of tie or overloading.
|
|   * It does all its own appending and utf8 handling rather than calling out to functions like sv_catsv().
|
|   * It's very good at handling the LHS appearing on the RHS; for example in
|
|         $x = "abcd";
|         $x = "-$x-$x-";
|
|     It will do roughly the equivalent of the following (where targ is $x);
|
|         SvPV_force(targ);
|         SvGROW(targ, 11);
|         p = SvPVX(targ);
|         Move(p, p+1, 4, char);
|         Copy("-", p, 1, char);
|         Copy("-", p+5, 1, char);
|         Copy(p+1, p+6, 4, char);
|         Copy("-", p+10, 1, char);
|         SvCUR(targ) = 11;
|         p[11] = '\0';
|
|     Formerly, pp_concat would have used multiple PADTMPs or temporary SVs to handle situations like that.
|
|   The code is quite big; both S_maybe_multiconcat() and pp_multiconcat() (the main compile-time and runtime parts of the implementation) are over 700 lines each. It turns out that when you combine multiple ops, the number of edge cases grows exponentially ;-)
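The "one constant string plus segment lengths" layout can be illustrated outside the Perl core with ordinary C strings. The function below is a hypothetical sketch, not pp_multiconcat(): `multiconcat_sketch()` and its parameters are invented for illustration, arguments are plain NUL-terminated strings, and utf8, ties and overloading are ignored. It follows the same two-step shape described above: compute the exact final size once, then copy constant segments and arguments alternately.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: interleave nsegs constant segments (stored back
 * to back in consts, with lengths in seg_len) with nsegs-1 argument
 * strings. Writes a NUL-terminated result into out and returns its
 * length, or (size_t)-1 if out (cap bytes) is too small. */
size_t multiconcat_sketch(char *out, size_t cap,
                          const char *consts, const size_t *seg_len,
                          size_t nsegs, const char *const *args)
{
    size_t total = 0;
    size_t i;
    char *p = out;

    /* Pass 1: exact size, so the target is sized only once. */
    for (i = 0; i < nsegs; i++) {
        total += seg_len[i];
        if (i + 1 < nsegs)
            total += strlen(args[i]);
    }
    if (total + 1 > cap)
        return (size_t)-1;

    /* Pass 2: copy constant segment, argument, constant segment, ... */
    for (i = 0; i < nsegs; i++) {
        memcpy(p, consts, seg_len[i]);
        p += seg_len[i];
        consts += seg_len[i];
        if (i + 1 < nsegs) {
            size_t n = strlen(args[i]);
            memcpy(p, args[i], n);
            p += n;
        }
    }
    *p = '\0';
    return total;
}
```

For the `"a=$a b=$b\n"` example in the commit message, the call would pass the constant string `"a= b=\n"`, segment lengths `{2, 3, 1}` and the two argument values.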
* add extra optimization phase (David Mitchell, 2017-10-31, 1 file, -0/+2)
|   Add the function optimize_optree(). Optree optimization/finalization is now done in three main phases:
|
|       1) optimize_optree(optree);
|       2) CALL_PEEP(*startp);
|       3) finalize_optree(optree);
|
|   (1) and (3) are done in top-down order, while (2) is done in execution order.
|
|   Note that this function doesn't actually optimize anything yet; this commit is just adding the necessary infrastructure.
|
|   Adding this extra top-down phase allows certain combinations of ops to be recognised in ways that the peephole optimizer would find hard. For example in
|
|       $a = expression1 . expression2 . expression3 . expression4
|
|   the top-down tree looks like
|
|       sassign
|           concat
|               concat
|                   concat
|                       expression1
|                           ...
|                       expression2
|                           ...
|                   expression3
|                       ...
|               expression4
|                   ...
|           padsv[$a]
|
|   so it's easy to see the nested concats, while execution order looks like
|
|       ... lots of ops for expression1 ...
|       ... lots of ops for expression2 ...
|       concat
|       ... lots of ops for expression3 ...
|       concat
|       ... lots of ops for expression4 ...
|       concat
|       padsv[$a]
|       sassign
|
|   where it's not at all obvious that there is a chain of nested concats.
|
|   Similarly, trying to do this in finalize_optree() is hard because the peephole optimizer will have messed things up. Also it will be too late to remove nulled-out ops from the execution path.
* Provide fallback strnlen implementation (Dagfinn Ilmari Mannsåker, 2017-10-21, 1 file, -0/+4)
|
* Rely on C89 sprintf() return value semantics (Aaron Crane, 2017-10-21, 1 file, -4/+0)
|
* Don't use VOL internally, because "volatile" works just fine (Aaron Crane, 2017-10-21, 1 file, -1/+1)
| | | | However, we do preserve it outside PERL_CORE for the use of XS authors.
* Assume we have sane C89 memcmp() (Aaron Crane, 2017-10-21, 1 file, -3/+0)
| | | | | | | "Sane" means that it works correctly on bytes with their high bit set, as C89 also requires. We therefore no longer need to probe for and/or use BSD bcmp().
* Assume we have C89 memcpy() and memmove() (Aaron Crane, 2017-10-21, 1 file, -3/+0)
| | | | We can therefore also avoid probing for and/or using BSD bcopy().
* Don't look for a "safe" memcpy() (Aaron Crane, 2017-10-21, 1 file, -1/+1)
|   C89 says that, if you want to copy overlapping memory blocks, you must use memmove(), and that attempting to copy overlapping memory blocks using memcpy() yields undefined behaviour. So we should never even attempt to probe for a system memcpy() implementation that just happens to handle overlapping memory blocks. In particular, the compiler might compile the probe program in such a way that Configure thinks overlapping memcpy() works even when it doesn't.
|
|   This has the additional advantage of removing a Configure probe that needs to execute a target-platform program on the build host.
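To make the memcpy()/memmove() distinction concrete, here is a small example of a copy whose source and destination overlap, which is exactly the case where only memmove() is defined. `open_gap()` is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: shift the bytes buf[pos..len) one place to the
 * right, opening a one-byte gap at pos. Source and destination overlap,
 * so memmove() is required; memcpy() here would be undefined behaviour.
 * The caller must ensure buf has room for len + 1 bytes. */
void open_gap(char *buf, size_t len, size_t pos)
{
    memmove(buf + pos + 1, buf + pos, len - pos);
}
```

After the call the byte at `pos` still holds its old value (now duplicated at `pos + 1`), so the caller would normally overwrite it with the byte being inserted.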
* Assume we have C89 memset() (Aaron Crane, 2017-10-21, 1 file, -6/+0)
| | | | This means we also never need to consider using BSD bzero().
* (perl #127663) safer in-place editing (Tony Cook, 2017-09-11, 1 file, -0/+1)
| | | | | | | | | | | | Previously in-place editing opened the file then immediately *replaced* the file, so if an error occurs while writing the output, such as running out of space, the content of the original file is lost. This changes in-place editing to write to a work file which is renamed over the original only once the output file is successfully closed. It also fixes an issue with setting setuid/setgid file modes for recursive in-place editing.
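The write-to-a-work-file-then-rename pattern described above can be sketched with POSIX calls. This is only an illustration of the general technique, not perl's in-place editing code: `replace_file_atomically()` is a hypothetical name, the work file name is an assumption (the target path plus a mkstemp() suffix, so it lives in the same directory), and the sketch makes no attempt to preserve the original file's mode or ownership.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch: replace the contents of path with data, leaving
 * the original file untouched unless every write and the close succeed. */
int replace_file_atomically(const char *path, const char *data, size_t len)
{
    char work[4096];
    int n = snprintf(work, sizeof work, "%s.XXXXXX", path);
    int fd;

    if (n < 0 || (size_t)n >= sizeof work)
        return -1;
    fd = mkstemp(work);               /* work file beside the target */
    if (fd == -1)
        return -1;

    while (len > 0) {
        ssize_t w = write(fd, data, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            unlink(work);             /* e.g. disk full: original survives */
            return -1;
        }
        data += w;
        len -= (size_t)w;
    }
    if (close(fd) == -1) {            /* close can report deferred errors */
        unlink(work);
        return -1;
    }
    if (rename(work, path) == -1) {   /* replaces the original in one step */
        unlink(work);
        return -1;
    }
    return 0;
}
```

Keeping the work file in the same directory matters because rename() only replaces the target in a single step when both names are on the same filesystem.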
* (perl #127663) add our own mkstemp() implementation (Tony Cook, 2017-09-11, 1 file, -0/+4)
| | | | | | | | | | | | Needed to generate temp files for safer in-place editing. Not based on any particular implementation, the BSD implementations tend to be wrappers around a megafunction that also does a few variations of mkstemp() and mkdtemp(), which we don't need (yet.) This might also be useful as a replacement for broken mkstemp() implementations that use a mode of 0666 when creating the file, though we'd need to add Configure probing for that.
* Add API function Perl_langinfo() (Karl Williamson, 2017-09-09, 1 file, -5/+16)
| | | | | | This is designed to generally replace nl_langinfo() in XS code. It is thread-safer, hides the quirks of perl's LC_NUMERIC handling, and can be used on systems lacking nl_langinfo.
* Add new API function sv_rvunweaken (Dagfinn Ilmari Mannsåker, 2017-09-04, 1 file, -0/+1)
| | | | | | | Needed to fix in-place sort of weak references in a future commit. Stolen from Scalar::Util::unweaken, which will be made to use this when available via CPAN upstream.