Commit message | Author | Age | Files | Lines
* add new release to perlhist (tag: v5.23.6) | David Golden | 2015-12-21 | 1 | -0/+1
* Update perldelta with additional module updates | David Golden | 2015-12-21 | 1 | -13/+57
* Update perldelta with Module::CoreList version bump | David Golden | 2015-12-21 | 1 | -0/+5
* Update Module::CoreList from 5.23.6 | David Golden | 2015-12-21 | 1 | -11/+74
* Update perldelta to near-final state | David Golden | 2015-12-21 | 1 | -284/+66
* perldelta for case changing on caseless language | Karl Williamson | 2015-12-21 | 1 | -1/+6
* perldelta for -Dr fix | Karl Williamson | 2015-12-20 | 1 | -0/+7
* Update perldelta | David Golden | 2015-12-21 | 1 | -2/+114
    This commit adds various release notes covering:
      * module updates
      * documentation updates
      * some bug fixes and internal changes
* Correct perldelta typo | David Golden | 2015-12-20 | 1 | -1/+1
* Add alternate email address for dagolden to checkAUTHORS.pl | David Golden | 2015-12-20 | 1 | -0/+1
* perldelta for 18371617dfb (B::Deparse) | Lukas Mai | 2015-12-21 | 1 | -0/+9
* Do not define invlistEQ in the re extension. | Craig A. Berry | 2015-12-20 | 1 | -1/+1
    Because it's already defined in regcomp.c and the VMS build was
    failing with a linker error (multiply-defined symbol).
* regcomp.c: Skip some work | Karl Williamson | 2015-12-19 | 1 | -1/+12
    We can optimize ANYOF nodes that are equivalent to POSIX character
    classes. Discovering if they are equivalent takes work, which can be
    skipped with a simple test that will rule out many run-of-the-mill
    character classes.
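The idea of a cheap rejection test ahead of an expensive equivalence check can be sketched as below. This is an illustrative Python sketch, not the code in regcomp.c. The commit message does not say what the simple test is, so the fingerprint used here (size plus smallest member) and every name in the block are assumptions made for illustration.

```python
# Hypothetical stand-ins for two POSIX classes, restricted to ASCII.
POSIX_CLASSES = {
    "digit": frozenset("0123456789"),
    "lower": frozenset("abcdefghijklmnopqrstuvwxyz"),
}

# Cheap fingerprint computed once per class: (size, smallest member).
# ASSUMPTION: the real test in regcomp.c may look nothing like this.
FINGERPRINTS = {name: (len(s), min(s)) for name, s in POSIX_CLASSES.items()}


def equivalent_posix_class(chars):
    """Return the name of the POSIX class equal to `chars`, or None."""
    chars = frozenset(chars)
    if not chars:
        return None
    probe = (len(chars), min(chars))
    for name, members in POSIX_CLASSES.items():
        # Cheap test first: most ordinary classes are rejected here.
        if FINGERPRINTS[name] != probe:
            continue
        # Only candidates that survive pay for the full comparison.
        if members == chars:
            return name
    return None
```

A class such as "abc" is rejected by the fingerprint alone, while "012345678x" passes the fingerprint and is only rejected by the full comparison.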
* regcomp.c: White space only | Karl Williamson | 2015-12-19 | 1 | -21/+22
    Indent a section of code in preparation for the next commit which
    will make it into a block.
* regcomp.c: Add comments | Karl Williamson | 2015-12-19 | 1 | -2/+22
* mktables: Add "$0:" to its first output | Karl Williamson | 2015-12-19 | 3 | -4/+4
    So in a make, it is abundantly clear where the messages are coming
    from
* regcomp.c: Silence uninit compiler warning | Karl Williamson | 2015-12-18 | 1 | -1/+1
    This shouldn't actually happen, and g++ under -O0 didn't flag it, but
    gcc under -O2 does, so initialize to an illegal value
* regcomp.c: Remove outdated comments | Karl Williamson | 2015-12-18 | 1 | -6/+1
    These were invalidated by commit
    709be747a32edc503b4645d9c5396bd4b40100d2
* Fix -Dr problems. | Karl Williamson | 2015-12-18 | 1 | -2/+3
    Commits 108316fb65dc7243a1c5d87b4b29068b7d62d32e and
    5e85fd899767ba3003766fc9289c0ee2d8427d10 broke -Dr output in rare
    cases.
* perldelta for 572cd850,406d5545 (signbit) | Jarkko Hietaniemi | 2015-12-18 | 1 | -0/+5
* perldelta for the hexfp %a fixes. | Jarkko Hietaniemi | 2015-12-18 | 1 | -0/+8
* perldelta for 3118d7d,74c6ce8,1f02ab1 (ppc64el fp) | Jarkko Hietaniemi | 2015-12-18 | 1 | -0/+5
* perldelta for 68bcb86 (openindiana: useshrplib for all solaris) | Jarkko Hietaniemi | 2015-12-18 | 1 | -2/+15
* Configure: notes on the m68881 extended precision format | Jarkko Hietaniemi | 2015-12-18 | 1 | -1/+9
* Double-double implementations differ. | Jarkko Hietaniemi | 2015-12-18 | 1 | -4/+4
* Optimize some qr/[...]/ classes | Karl Williamson | 2015-12-17 | 4 | -2/+43
    Bracketed character classes generally generate an ANYOF-type regnode,
    which consists of a bitmap for the lower code points, and an inversion
    list or swash to handle ones not in the bitmap. They take up more
    memory than other regnode types. There are already some optimizations
    that use a smaller and/or faster regnode instead. For example, some
    people prefer not to use a backslash to escape metacharacters, instead
    writing something like /abc[.]def/. This has for some time generated
    the same thing as /abc\.def/ does, namely a single EXACT node, which is
    both smaller and faster than an ANYOF node in the middle of two EXACT
    nodes.

    This commit adds some optimizations that hadn't been done previously.
    Now things like /[\p{Word}]/ will optimize to \w, for example. I had
    not done this before, because my tests had shown very little
    performance difference, but I had added most of the code to regcomp.c
    so it wouldn't get lost, #ifdef'd out. It turns out that I hadn't
    tested on code points above the bitmap, which with this commit have a
    small, but appreciable speed up in matching, so this commit enables and
    finishes that code.

    Prior to this commit, things like /[[:word:]]/ were optimized to \w,
    but things like /[_[:word:]]/ were not. This commit fixes that.

    If the following command is run on a perl compiled with -O2 and no
    DEBUGGING:

        blead Porting/bench.pl --raw --benchfile=charclass_perf --perlargs=-Ilib /path_to_prior_perl="before this commit" /path_to_this_perl=after

    and the file 'charclass_perf' contains

        [
            'regex::charclass::ascii' => {
                desc  => 'charclass, ascii range',
                setup => 'my $a = qr/[\p{Word}]/',
                code  => '"A" =~ $a'
            },
            'regex::charclass::upper_latin1' => {
                desc  => 'charclass, upper latin1 range',
                setup => 'my $a = qr/[\p{Word}]/',
                code  => '"\x{e0}" =~ $a'
            },
            'regex::charclass::above_latin1' => {
                desc  => 'charclass, above latin1 range',
                setup => 'my $a = qr/[\p{Word}]/',
                code  => '"\x{100}" =~ $a'
            },
            'regex::charclass::high_Unicode' => {
                desc  => 'charclass, high Unicode code point',
                setup => 'my $a = qr/[\p{Word}]/',
                code  => '"\x{10FFFF}" =~ $a'
            },
        ];

    the following results are obtained:

    The numbers represent raw counts per loop iteration.

    regex::charclass::above_latin1
    charclass, above latin1 range

           before this commit    after
           ------------------ --------
        Ir             3344.0   2888.0
        Dr              971.0    855.0
        Dw              604.0    541.0
      COND              575.0    504.0
       IND               25.0     25.0
    COND_m               11.0     10.7
     IND_m               10.0     10.0
     Ir_m1                8.9      6.0
     Dr_m1                3.0      3.2
     Dw_m1                1.5      1.4
     Ir_mm                0.0      0.0
     Dr_mm                0.0      0.0
     Dw_mm                0.0      0.0

    regex::charclass::ascii
    charclass, ascii range

           before this commit    after
           ------------------ --------
        Ir             2661.0   2649.0
        Dr              798.0    795.0
        Dw              516.0    517.0
      COND              467.0    465.0
       IND               23.0     23.0
    COND_m               10.0      8.8
     IND_m               10.0     10.0
     Ir_m1                7.9      0.0
     Dr_m1                2.9      3.1
     Dw_m1                1.3      1.3
     Ir_mm                0.0      0.0
     Dr_mm                0.0      0.0
     Dw_mm                0.0      0.0

    regex::charclass::high_Unicode
    charclass, high Unicode code point

           before this commit    after
           ------------------ --------
        Ir             3344.0   2888.0
        Dr              971.0    855.0
        Dw              604.0    541.0
      COND              575.0    504.0
       IND               25.0     25.0
    COND_m               11.0     10.7
     IND_m               10.0     10.0
     Ir_m1                8.9      6.0
     Dr_m1                3.0      3.2
     Dw_m1                1.5      1.4
     Ir_mm                0.0      0.0
     Dr_mm                0.0      0.0
     Dw_mm                0.0      0.0

    regex::charclass::upper_latin1
    charclass, upper latin1 range

           before this commit    after
           ------------------ --------
        Ir             2661.0   2651.0
        Dr              798.0    796.0
        Dw              516.0    517.0
      COND              467.0    466.0
       IND               23.0     23.0
    COND_m               11.0      8.8
     IND_m               10.0     10.0
     Ir_m1                7.9      0.0
     Dr_m1                2.9      3.3
     Dw_m1                1.5      1.2
     Ir_mm                0.0      0.0
     Dr_mm                0.0      0.0
     Dw_mm                0.0      0.0
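The two-part lookup this commit message describes, a bitmap for low code points plus an inversion list for the rest, can be sketched roughly as follows. This is a toy Python model and not perl's regnode layout; the 256 cut-off, the class name, and the sample ranges are assumptions chosen for illustration. An inversion list here is a sorted list of code points at which membership toggles, starting with "in the set".

```python
from bisect import bisect_right

BITMAP_SIZE = 256  # ASSUMPTION: cut-off chosen for this sketch only


class ToyCharClass:
    """Toy model of a class split into a bitmap and an inversion list."""

    def __init__(self, low_members, inversion_list):
        # One bit per code point below BITMAP_SIZE.
        self.bitmap = 0
        for cp in low_members:
            if not 0 <= cp < BITMAP_SIZE:
                raise ValueError("low_members must be below BITMAP_SIZE")
            self.bitmap |= 1 << cp
        # Sorted toggle points for code points at or above BITMAP_SIZE.
        self.inversion_list = sorted(inversion_list)

    def matches(self, cp):
        if cp < BITMAP_SIZE:
            # Fast path: a single bit test.
            return bool((self.bitmap >> cp) & 1)
        # Slower path: binary search. An odd number of toggle points at or
        # below cp means cp lies inside one of the ranges.
        return bisect_right(self.inversion_list, cp) % 2 == 1


# Arbitrary sample data: A-Z in the bitmap, plus the ranges
# 0x100-0x17F and 0x400-0x4FF above it.
toy = ToyCharClass(range(ord("A"), ord("Z") + 1), [0x100, 0x180, 0x400, 0x500])
```

In this model the bitmap path costs one shift and mask, while the inversion list path costs a binary search, which is one plausible reading of why the benchmark gains above show up mainly for code points above the bitmap.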
* regcomp.h: Add comments | Karl Williamson | 2015-12-17 | 1 | -40/+119
* regex matching: Don't do unnecessary work | Karl Williamson | 2015-12-17 | 3 | -3/+6
    This commit sets a flag at pattern compilation time to indicate if a
    rare case is present that requires special handling, so that that
    handling can be avoided unless necessary.
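The general pattern of recording a flag at compile time so the matcher can skip special handling might look like the sketch below. It is a toy Python matcher, not perl's regex engine; the "rare case" chosen here (a `?` wildcard in an otherwise literal pattern) and all names are hypothetical, since the commit message does not say which rare case is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompiledPattern:
    source: str
    has_wildcard: bool  # decided once, when the pattern is compiled


def compile_pattern(source):
    # The scan for the rare case happens here, once per pattern.
    return CompiledPattern(source, "?" in source)


def matches(pattern, text):
    if not pattern.has_wildcard:
        # Common case: plain substring search, no special handling.
        return pattern.source in text
    # Rare case: '?' matches any single character, so compare by position.
    n = len(pattern.source)
    for start in range(len(text) - n + 1):
        window = text[start:start + n]
        if all(p == "?" or p == t for p, t in zip(pattern.source, window)):
            return True
    return False
```

Every call to `matches` then pays only for a boolean test in the common case, instead of re-examining the pattern for the rare construct on each match.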
* regcomp.h: Renumber 2 flag bits | Karl Williamson | 2015-12-17 | 1 | -4/+4
    This changes the spare bit to be adjacent to the LOC_FOLD bit, in
    preparation for the next commit, which will use that bit for a
    LOC_FOLD-related use.
* regex: Free an ANYOF node bit | Karl Williamson | 2015-12-17 | 3 | -50/+59
    This is done by combining 2 mutually exclusive bits into one. I hadn't
    seen this possibility before because the name of one of them misled me.
    It also misled me into turning that flag on unnecessarily, and into
    missing opportunities to avoid creating a swash at runtime. This
    commit corrects those things as well.
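One way two mutually exclusive flags can share a single bit is for some other piece of context to decide which meaning applies. The sketch below assumes, purely for illustration, that each flag is meaningful on only one kind of node; the node kinds, flag names, and bit values are hypothetical and are not taken from regcomp.h.

```python
from enum import Enum


class NodeKind(Enum):
    KIND_A = "A"  # hypothetical node kind on which only FLAG_X makes sense
    KIND_B = "B"  # hypothetical node kind on which only FLAG_Y makes sense


# One physical bit carries both meanings. Because the two flags can never
# apply to the same node, the node kind disambiguates which one is meant.
SHARED_X_OR_Y_BIT = 0x01
FREED_BIT = 0x02  # the bit that becomes available for a new use


def has_flag_x(kind, flags):
    return kind is NodeKind.KIND_A and bool(flags & SHARED_X_OR_Y_BIT)


def has_flag_y(kind, flags):
    return kind is NodeKind.KIND_B and bool(flags & SHARED_X_OR_Y_BIT)
```

A deliberately long name for the shared bit spells out both meanings at every use, which trades brevity for fewer misreadings.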
* regcomp.c: Move comments adjacent to their object | Karl Williamson | 2015-12-17 | 1 | -3/+4
* regcomp.c: Try simplifications in some qr/[...]/d | Karl Williamson | 2015-12-17 | 1 | -5/+72
    Characters in a bracketed character class can come from a bunch of
    sources, all bundled together. Some things under /d match only when
    the target string is UTF-8; some match only when it isn't UTF-8. Other
    sources may introduce ones that match regardless. It may be that some
    things are specified as conditionally matching from one source, and as
    unconditionally matching from another. We can subtract the
    unconditionals from the conditionals, leaving a simpler set of things
    that must be conditionally matched. In some cases, the conditional set
    may go to zero, allowing other optimizations to happen that otherwise
    couldn't.

    An example is qr/[\W\xAB]/ which before this commit compiled to:

        ANYOFD[^0-9A-Z_a-z\x{80}-\x{AA}\x{AC}-\x{FF}][{non-utf8-latin1-all}
            {utf8}0080-00A9 00AC-00B4 00B6-00B9 00BB-00BF 00D7 00F7
            02C2-02C5...] (12)

    and after it, compiles to

        ANYOFD[^0-9A-Z_a-z\x{AA}\x{B5}\x{BA}\x{C0}-\x{D6}\x{D8}-\x{F6}
            \x{F8}-\x{FF}][{non-utf8-latin1-all}{utf8}02C2-02C5...] (12)

    Notice that the {utf8} component has been stripped of everything below
    256. That means no swash has to be created at runtime when matching
    code points below 256, unlike the case before this commit.

    A starker example, though unlikely in real life except in
    machine-generated code, is

        qr/[\w\W]/

    Before this commit, it would generate:

        ANYOFD[\x{00}-\x{7F}][{non-utf8-latin1-all}{above_bitmap_all}
            {utf8}0080-00FF]

    and afterwards, simply:

        SANY
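The subtraction step described in this commit message can be sketched with plain sets. This is an illustrative Python model rather than the inversion-list code in regcomp.c; the function name, the return shape, and the sample code points are assumptions made for the example.

```python
def simplify_conditionals(unconditional, only_if_utf8, only_if_not_utf8):
    """Drop from each conditional set anything that matches regardless.

    Returns the two reduced conditional sets plus a flag saying whether
    any conditional matching is still needed at all.
    """
    unconditional = frozenset(unconditional)
    utf8_left = frozenset(only_if_utf8) - unconditional
    non_utf8_left = frozenset(only_if_not_utf8) - unconditional
    still_conditional = bool(utf8_left or non_utf8_left)
    return utf8_left, non_utf8_left, still_conditional


# Toy data: 0xAB is listed both unconditionally and as "only if UTF-8".
utf8_left, non_utf8_left, still_conditional = simplify_conditionals(
    unconditional={0xAB},
    only_if_utf8={0xAB, 0xD7},
    only_if_not_utf8=set(),
)
```

When both reduced sets come back empty, nothing depends on the UTF-8-ness of the target string any more, which corresponds to the case the message says opens the door to further optimizations.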
* regcomp.c: Change variable name to be clearer | Karl Williamson | 2015-12-17 | 1 | -20/+29
    This name confused me, and led to suboptimal code. The new name is
    more cumbersome, but won't confuse (at least it won't confuse me).
* Configure: grep -q is not portable | Jarkko Hietaniemi | 2015-12-17 | 1 | -1/+1
    It does not work in SysV (solaris) or old BSD greps.
* Revert "Upgrade Socket from 2.020 to 2.021" | Steve Hay | 2015-12-17 | 6 | -102/+47
    This reverts commit 0bd66ca801c5fb84ee6a8feeb8114f0d8248029f.

    Worked for me, but Jenkins isn't happy :-(
* Update META.yml following commit 0d99ea0387 | Steve Hay | 2015-12-17 | 1 | -1/+1
* Upgrade Term-ANSIColor from 4.03 to 4.04 | Steve Hay | 2015-12-17 | 2 | -11/+15
* Upgrade Socket from 2.020 to 2.021 | Steve Hay | 2015-12-17 | 6 | -47/+102
    Blead customizations are now assimilated.
* Upgrade CPAN-Meta-YAML from 0.017-TRIAL to 0.018 | Steve Hay | 2015-12-17 | 2 | -3/+3
* Upgrade CPAN-Meta-Requirements from 2.133 to 2.140 | Steve Hay | 2015-12-17 | 9 | -73/+204
* perldelta for e3962106e93f | Tony Cook | 2015-12-17 | 1 | -0/+11
* [perl #126240] use -DPERL_USE_SAFE_PUTENV where possible on OS X | Tony Cook | 2015-12-17 | 1 | -0/+10
    On threaded builds on OS X, libSystem registers atfork handlers that
    call setenv(), which internally modifies members of environ[], setting
    them to malloc()ed blocks.

    In some cases Perl_my_setenv() reallocates environ[] using
    safesysmalloc(), which under debugging builds adds a tracking header,
    and if perl_destruct() sees that environ[] has been reallocated, frees
    it with safesysfree().

    When these combine, perl attempts to free the malloc()ed block with
    safesysfree(), which attempts to access the tracking header, causing
    an invalid access in tools like valgrind, or a "free from wrong pool"
    error, since the header contains unrelated data.

    Avoid this mess by letting libc manage environ[] if unsetenv() is
    available.
* perldelta for dc9ef9989ca4 | Tony Cook | 2015-12-17 | 1 | -0/+5
* document save_gp() and the GVf_INTRO flag | Tony Cook | 2015-12-17 | 3 | -1/+18
* [perl #124097] don't let the GPs be removed out from under pp_sort | Tony Cook | 2015-12-17 | 2 | -1/+19
    pp_sort() saves the SV pointers for *a and *b. If the sort block
    cleared *a or *b, the GP in which the pointer is stored would be
    freed, and the save stack processing would try to write to freed
    memory.

    Make sure the GP lasts at least long enough for the SV slots to be
    restored.

    This doesn't attempt to restore *a or *b; the user chose to clear
    them.
* Explicitly build the shared Perl library in Solaris and variants. | Jarkko Hietaniemi | 2015-12-16 | 1 | -0/+6
    Symptom of failure: in openindiana "make" fails:

        ...
        ./perl -Ilib -f pod/buildtoc -q
        Can't load 'lib/auto/re/re.so' for module re: ld.so.1: perl: fatal: relocation error: file lib/auto/re/re.so: symbol PL_localizing: referenced symbol not found at lib/XSLoader.pm line 71.
         at lib/re.pm line 88.
        ...

    Running the above command with 'env LD_DEBUG=files ...' shows that
    there are many other symbol lookup failures, the one above is just the
    last one before bailing.

    If configured explicitly with -Duseshrplib, openindiana build
    succeeds.

    Curiously, while the hints/solaris_2.sh (which openindiana uses) does
    not specify useshrplib, Oracle/Sun builds/has been building their perl
    with useshrplib since Perl 5.6.1 or thereabouts (source: Alan
    Burlison). Using shared libraries is strongly recommended in Solaris
    in general (source: the same).

    Tested in:
      - Solaris 5.10/i386 with solstudio 12.2 and gcc 4.8.0
      - Solaris 5.10/sparc with solarisstudio 12.3 and gcc 4.9.2
      - OpenIndiana 5.11/i386 with solarisstudio 12.3 and gcc 4.5.0
* perlpodspec: fix typo | Lukas Mai | 2015-12-16 | 1 | -1/+1
* Deprecate wide chars in logical string ops | Karl Williamson | 2015-12-16 | 8 | -0/+81
    See thread starting at
    http://nntp.perl.org/group/perl.perl5.porters/227698

    Ricardo Signes provided the perldelta and perldiag text.
* Change deprecation warning text | Karl Williamson | 2015-12-16 | 3 | -23/+23
    The old text used the passive voice. No 5.23 release has been made
    with the old text, so no perldelta changes are needed.
* perldiag.pod: Correctly alphabetize an entry | Karl Williamson | 2015-12-16 | 1 | -8/+8