Commit message  [Author, Age, Files, Lines]
* Perlhist entry for 5.27.8 (tag: v5.27.8)  [Abigail, 2018-01-20, 1, -0/+1]
|
* Fixup for perldelta.  [Abigail, 2018-01-20, 1, -3/+0]
|   Removed an XXX section.
* Updated modules for perldelta.  [Abigail, 2018-01-20, 1, -17/+71]
|
* Acknowledgements for perldelta  [Abigail, 2018-01-20, 1, -2/+26]
|
* Update Module::CoreList for 5.27.8  [Abigail, 2018-01-20, 1, -0/+41]
|
* perldelta: Clarify entry  [Karl Williamson, 2018-01-19, 1, -1/+2]
|   Spotted by Dan Book
* Don’t vivify elems when putting array on stack  [Father Chrysostomos, 2018-01-19, 2, -4/+43]
|   6661956a2 was a little too powerful, and, in addition to fixing the
|   bug that @_ did not properly alias nonexistent elements, also broke
|   other uses of nonexistent array elements. (See the tests added.)
|
|   This commit changes it so that putting @a on the stack does not
|   vivify all ‘holes’ in @a, but instead creates defelem (deferred
|   element) scalars, and only in lvalue context.
* Apply the mod flag to @a in \(@a)  [Father Chrysostomos, 2018-01-19, 1, -2/+10]
|   The next commit will depend on it.
* perldelta for signatures/attribute order flip  [David Mitchell, 2018-01-19, 1, -0/+14]
|
* move sub attributes before the signature  [David Mitchell, 2018-01-19, 8, -1232/+1171]
|   RT #132141
|
|   Attributes such as :lvalue have to come *before* the signature to
|   ensure that they're applied to any code block within the signature;
|   e.g.
|
|       sub f :lvalue ($a = do { $x = "abc"; return substr($x,0,1)}) {
|           ....
|       }
|
|   So this commit moves sub attributes to come before the signature.
|   This is how they were originally, but they were swapped with
|   v5.21.7-394-gabcf453. This commit is essentially a revert of that
|   commit (and its followups v5.21.7-395-g71917f6,
|   v5.21.7-421-g63ccd0d), plus some extra work for Deparse, and an
|   extra test.
|
|   See:
|
|   RT #123069 for why they were originally swapped
|   RT #132141 for why that broke :lvalue
|   http://nntp.perl.org/group/perl.perl5.porters/247999
|       for a general discussion about RT #132141
* newSVpvn(): Fix pod  [Karl Williamson, 2018-01-19, 1, -1/+1]
|   There is no "buffer" argument; don't refer to one.
|
|   Spotted by KES
* Raise deprecation for qr/(?foo})/  [Karl Williamson, 2018-01-19, 5, -26/+53]
|   An unescaped left brace that is meant to be taken literally is
|   officially deprecated, though there are no plans to remove it in
|   contexts where we don't expect to use it to mean something else,
|   and no warning is raised in those contexts. reg_mesg.t tests the
|   known set of these contexts, currently (after this commit):
|
|       /^{/
|       /foo|{/
|       /foo|^{/
|       /foo(:?{bar)/
|       /\s*{/
|       /a{3,4}{/
|
|   This commit deprecates this context:
|
|       /foo({bar})/
|
|   This probably should have been illegal all along when 'bar' is a
|   valid quantifier, as we do with the other quantifiers that follow a
|   left paren whose illegality we haven't already taken advantage of
|   to mean something else:
|
|       qr/(+0)/    Quantifier follows nothing in regex
|
|   This deprecation will allow ({...}) to be usable for a possible
|   future regex extension.
* doop.c: White-space only  [Karl Williamson, 2018-01-19, 1, -10/+10]
|   Indent to correspond with the new block placed by the previous commit.
* Deprecate above \xFF in bitwise string ops  [Karl Williamson, 2018-01-19, 5, -7/+106]
|   This is already a fatal error for operations whose outcome depends
|   on them, but in things like
|
|       "abc" & "def\x{100}"
|
|   the wide character doesn't actually need to participate in the AND,
|   and so perl leaves it out. As a result of the discussion in the
|   thread beginning with
|   http://nntp.perl.org/group/perl.perl5.porters/244884, it was
|   decided to deprecate these ones too.
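The length rule behind the "abc" & "def\x{100}" case can be sketched in C. This is only an illustration, assuming the strings are held as arrays of code points; `and_codepoints` is a hypothetical helper and not a function from doop.c. The AND is taken over the shorter operand only, so a trailing code point above 0xFF in the longer operand never takes part in the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from doop.c): AND two code-point arrays
 * element by element. Only the first min(alen, blen) positions are
 * combined; anything beyond that in the longer operand is ignored.
 * Returns the length of the result written to out. */
static size_t and_codepoints(const unsigned long *a, size_t alen,
                             const unsigned long *b, size_t blen,
                             unsigned long *out)
{
    size_t len = alen < blen ? alen : blen;
    size_t i;

    assert(a != NULL && b != NULL && out != NULL);
    for (i = 0; i < len; i++)
        out[i] = a[i] & b[i];
    return len;
}
```

Calling it with the code points of "abc" and of "def\x{100}" gives a three-element result; the 0x100 in the fourth position is never read.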
* doop.c: Use MIN()  [Karl Williamson, 2018-01-19, 1, -1/+1]
|   This is slightly cleaner than hand rolling the min.
* op/bop.t: Fix typo in test name  [Karl Williamson, 2018-01-19, 1, -1/+1]
|
* Update Copyright years in README and perl.c.  [Abigail, 2018-01-19, 2, -3/+4]
|   Now, 2018 is included.
* perldelta: add recent tr/// changes  [David Mitchell, 2018-01-19, 1, -0/+22]
|
* [MERGE] various tr/// fixups, esp for /c and /d  [David Mitchell, 2018-01-19, 13, -203/+811]
|\
| |   This branch does the following:
| |
| |   Fixes an issue with tr/non_utf8/long_non_utf8/c, where
| |   length(long_non_utf8) > 0x7fff.
| |
| |   Fixes an issue with tr/non_utf8/non_utf8/cd: basically, the
| |   implicit \x{100}-\x{7fffffff} added to the searchlist by /c wasn't
| |   being added.
| |
| |   Adds a lot of code comments to the various tr/// functions.
| |
| |   Adds tr///c tests - basically /c was almost completely untested.
| |
| |   Changes the layout of the op_pv transliteration table: it used to
| |   be roughly
| |
| |       256 x short - basic table
| |         1 x short - length of extended table (n)
| |         n x short - extended table
| |
| |   where the 2nd and 3rd items were only present under /c. It's now
| |
| |             1 x Size_t - length of table (256+n)
| |       (256+n) x short  - table - both basic and extended
| |
| |   where n == 0 apart from under /c.
| |
| |   The new table format also allowed the tr/non_utf8/non_utf8/ code
| |   branches to be considerably simplified.
| |
| |   op_dump() now dumps the contents of the (non-utf8 variant)
| |   transliteration table.
| |
| |   Removes I32s from the tr/non_utf8/non_utf8/ code paths, making it
| |   fully 64-bit clean.
| |
| |   Improves the pod for tr///.
| * perlop: improve tr/// documentation  [David Mitchell, 2018-01-19, 1, -6/+20]
| |   Specifically, explain more clearly what the /csd modifiers do.
| * tr///: eliminate I32 from the do_trans*() fns  [David Mitchell, 2018-01-19, 1, -15/+15]
| |   Replace each with a more appropriate type.
| * tr///: return Size_t count rather than I32  [David Mitchell, 2018-01-19, 4, -29/+29]
| |   Change the signature of all the internal do_trans*() functions to
| |   return Size_t rather than I32, so that the count returned by tr///
| |   can cope with strings longer than 2GB.
| * tr///: remove some I32 from S_pmtrans()  [David Mitchell, 2018-01-19, 1, -16/+15]
| |   Using I32 to hold char counts etc. is generally a bug. I've
| |   replaced it with Size_t.
| |
| |   I've left the swash part of the code alone.
| * tr/nonutf8/nonutf8/c: simplify GROW calc  [David Mitchell, 2018-01-19, 1, -2/+5]
| |   When, for each slot, deciding whether to set OPpTRANS_GROWS, the
| |   calculation is only done in one of 4 possible branches. It turns
| |   out that in the other branches, the condition can never be true;
| |   but determining that is subtle, and the assumption might break for
| |   future changes. Move the test outside the if/else tree so it can
| |   be seen to always apply.
| |
| |   So in theory this commit makes no functional difference.
| * op_dump(): dump tr/// translation table  [David Mitchell, 2018-01-19, 1, -3/+35]
| |   Previously it just displayed its address.
| |
| |   Also, when the table is in fact a swash, don't display its address
| |   on threaded builds, as it's actually just a padix.
| * tr///; simplify $utf8 =~ tr/nonutf8/nonutf8/  [David Mitchell, 2018-01-19, 5, -175/+77]
| |   The run-time code to handle a non-utf8 tr/// against a utf8 string
| |   is complex, with many variants of similar code repeated depending
| |   on the presence of the /s and /c flags.
| |
| |   Simplify them all into a single code block by changing how the
| |   translation table is stored. Formerly, the tr struct contained
| |   possibly two tables: the basic 0-255 slot one, plus in the presence
| |   of /c, a second one to map the implicit search range (\x{100}...)
| |   against any residual replacement chars not consumed by the first
| |   table.
| |
| |   This commit merges the two tables into a single unified whole. For
| |   example
| |
| |       tr/\x00-\xfe/abcd/c
| |
| |   is equivalent to
| |
| |       tr/\xff-\x{7fffffff}/abcd/
| |
| |   which generates a 259-entry translation table consisting of:
| |
| |       0x00  => -1
| |       0x01  => -1
| |       ...
| |       0xfe  => -1
| |       0xff  => a
| |       0x100 => b
| |       0x101 => c
| |       0x102 => d
| |
| |   In addition we store:
| |
| |   1) the size of the translation table (0x103 in the example above);
| |   2) an extra 'wildcard' entry stored 1 slot beyond the main table,
| |      which specifies the action for any codepoints outside the range
| |      of the table (i.e. chars 0x103..0x7fffffff). This can be either:
| |
| |      a) a character, when the last replacement char is repeated;
| |      b) -1 when /c isn't in effect;
| |      c) -2 when /d is in effect;
| |      d) -3 identity: when the replacement list is empty but not /d.
| |
| |   In the example above, this would be
| |
| |       0x103 => d
| |
| |   The addition of -3 as a valid slot value is new.
| |
| |   This makes the main runtime code for the utf8 string with non-utf8
| |   tr/// case look like, at its core:
| |
| |       size = tbl->size;
| |       mapped_ch = tbl->map[ch >= size ? size : ch];
| |
| |   which then processes mapped_ch based on whether it's >= 0, or
| |   -1/-2/-3.
| |
| |   This is a lot simpler than the old scheme, and should generally be
| |   faster too.
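The unified table and its wildcard slot can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the struct perl defines: the names `tr_table`, `tr_lookup`, `build_example` and the `TR_*` constants are hypothetical, and only the `size`/`map` lookup expression is taken from the commit message. The table built here is the one from the tr/\x00-\xfe/abcd/c example.

```c
#include <assert.h>
#include <stddef.h>

/* Special slot values as described in the commit message; the enum
 * names themselves are hypothetical. */
enum { TR_UNMAPPED = -1, TR_DELETE = -2, TR_IDENTITY = -3 };

/* Hypothetical layout: `size` ordinary slots followed by one wildcard
 * slot at map[size] that covers every code point >= size. */
typedef struct {
    size_t size;
    const short *map;   /* holds size + 1 entries */
} tr_table;

#define EXAMPLE_SIZE 0x103

/* Fill `slots` (EXAMPLE_SIZE + 1 entries) for tr/\x00-\xfe/abcd/c. */
static void build_example(short *slots)
{
    size_t i;

    for (i = 0; i <= 0xfe; i++)
        slots[i] = TR_UNMAPPED;
    slots[0xff]  = 'a';
    slots[0x100] = 'b';
    slots[0x101] = 'c';
    slots[0x102] = 'd';
    slots[EXAMPLE_SIZE] = 'd';   /* wildcard: last replacement char repeats */
}

/* Core lookup: out-of-range code points are redirected to the wildcard. */
static long tr_lookup(const tr_table *tbl, unsigned long ch)
{
    assert(tbl != NULL && tbl->map != NULL);
    return tbl->map[ch >= tbl->size ? tbl->size : ch];
}
```

A caller would then branch on the returned value: a non-negative result is the replacement character, while -1, -2 and -3 select the leave-alone, delete and identity actions respectively.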
| * tr///c: handle len(replacement charlist) > 32767  [David Mitchell, 2018-01-19, 6, -6/+34]
| |   RT #132608
| |
| |   In the non-utf8 case, the /c (complement) flag to tr adds an
| |   implied \x{100}-\x{7fffffff} range to the search charlist. If the
| |   replacement list contains more chars than are paired with the 0-255
| |   part of the search list, then the excess chars are stored in an
| |   extended part of the table. The excess char count was being stored
| |   as a short, which caused problems if the replacement list contained
| |   more than 32767 excess chars: either substituting the wrong char,
| |   or substituting for a char located up to 0xffff bytes in memory
| |   before the real translation table.
| |
| |   So change it to SSize_t.
| |
| |   Note that this is only a problem when the search and replacement
| |   charlists are non-utf8, the replacement list contains around
| |   0x8000+ entries, and where the string being translated is utf8 with
| |   at least one codepoint >= U+8000.
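Why a short is too narrow for the excess count can be sketched in C. The names `excess_len_t` and `excess_fits_in_short` are hypothetical, and `ptrdiff_t` is used here only as a stand-in for a wide signed type such as perl's SSize_t. The check uses SHRT_MAX rather than a literal 32767 so that it stays correct on platforms where short is wider than 16 bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-in for a wide signed count type (an assumption for this sketch). */
typedef ptrdiff_t excess_len_t;

/* Returns nonzero if an excess-char count could be stored in a short
 * without loss. Counts above SHRT_MAX are exactly the ones that used
 * to be mangled when the field was declared as a short. */
static int excess_fits_in_short(excess_len_t count)
{
    assert(count >= 0);
    return count <= (excess_len_t)SHRT_MAX;
}
```

On a platform with 16-bit shorts, a replacement list with 32768 or more excess chars fails this check, which matches the threshold described above.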
| * B, Deparse fixups for tr///c  [David Mitchell, 2018-01-19, 3, -16/+34]
| |   Recent commits slightly changed the layout of the extended map
| |   table: it now always stores a repeat count, and there are now two
| |   structs defined, rather than treating certain slots, like
| |   tbl[0x101], specially.
| |
| |   Update B and Deparse to reflect this.
| * add two structs for OP_TRANS  [David Mitchell, 2018-01-19, 3, -46/+67]
| |   Originally, the op_pv of an OP_TRANS op pointed to a 256-slot array
| |   of shorts, which contained the translations. However, in the
| |   presence of tr///c, extra information needs to be stored to handle
| |   utf8 strings.
| |
| |   The 256 slot array was extended, with slot 0x100 holding a length,
| |   and slots 0x101 onwards holding some extra chars.
| |
| |   This has made things a bit messy, so this commit adds two structs,
| |   one being an array of 256 shorts, and the other being the same but
| |   with some extra fields. So for example tbl[0x100] has been replaced
| |   with tbl->excess_len.
| |
| |   This commit should make no functional difference, but will allow us
| |   shortly to fix a bug by changing the type of the excess_len field
| |   from short to something bigger, for example.
| * S_do_trans_complex(): re-indent  [David Mitchell, 2018-01-19, 1, -6/+6]
| |   Outdent a code block following the previous commit.
| * fix "\x{100}..." =~ tr/.../.../cd  [David Mitchell, 2018-01-19, 3, -40/+47]
| |   In transliterations where the search and replacement charlists are
| |   non-utf8, but where the string being modified contains codepoints
| |   >= 0x100, then tr/.../.../cd would always delete all such
| |   codepoints, rather than potentially mapping some of them.
| |
| |   In more detail: in the presence of /c (complement), an implicit
| |   0x100..0x7fffffff is added to a non-utf8 search charlist. If the
| |   replacement list is longer than the < 0x100 part of the search
| |   list, then the last few replacement chars should in principle be
| |   paired off against the first few of (\x{100}, \x{101}, ...).
| |   However, this wasn't happening. For example,
| |
| |       tr/\x00-\xfd/ABCD/cd
| |
| |   should be equivalent to
| |
| |       tr/\xfe-\x{7fffffff}/ABCD/d
| |
| |   which should map:
| |
| |       \xfe => A, \xff => B, \x{100} => C, \x{101} => D,
| |
| |   and delete \x{102} onwards. But instead, it behaved like
| |
| |       tr/\xfe-\x{7fffffff}/AB/d
| |
| |   and deleted all codepoints >= 0x100.
| |
| |   This commit fixes that by using the extended mapping table format
| |   for all /c variants (formerly it excluded /cd).
| |
| |   I also changed a variable holding the mapped char from being I32 to
| |   UV: principally to avoid a casting mess in the fixed code. This may
| |   (or may not), as a side-effect, have fixed possible issues with
| |   very large codepoints.
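The intended behaviour of the tr/\x00-\xfd/ABCD/cd example can be written out directly in C. This sketch hard-codes that one example and does not reflect how perl's table-driven code is organised; `apply_example_cd` is a hypothetical name, and the string is assumed to be an array of code points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: apply the equivalent of
 *     tr/\xfe-\x{7fffffff}/ABCD/d
 * to `n` code points from `in`, writing the survivors to `out`
 * (which must have room for n entries). Returns the new length. */
static size_t apply_example_cd(const unsigned long *in, size_t n,
                               unsigned long *out)
{
    static const unsigned long replacement[] = { 'A', 'B', 'C', 'D' };
    const size_t nrepl = sizeof replacement / sizeof replacement[0];
    size_t i, kept = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        unsigned long ch = in[i];

        if (ch < 0xfe)
            out[kept++] = ch;                      /* not in the search list */
        else if (ch - 0xfe < nrepl)
            out[kept++] = replacement[ch - 0xfe];  /* paired with A..D */
        /* otherwise no replacement char is left, so /d deletes it */
    }
    return kept;
}
```

The bug described above corresponds to treating `nrepl` as 2 for this input, so that \x{100} and \x{101} were deleted instead of being mapped to C and D.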
| * OP_TRANS: change extended table format  [David Mitchell, 2018-01-19, 2, -28/+54]
| |   For non-utf8, OP_TRANS(R) ops have a translation table consisting
| |   of an array of 256 shorts attached. For tr///c, this table is
| |   extended to hold information about chars in the replacement list
| |   which aren't paired with chars in the search list. For example,
| |
| |       tr/\x00-AE-\xff/bcdefg/c
| |
| |   is equivalent to
| |
| |       tr/BCD\x{100}-\x{7fffffff}/bcdefg/
| |
| |   which is equivalent to
| |
| |       tr/BCD\x{100}-\x{7fffffff}/bcdefggggggggg..../
| |
| |   Only the BCD => bcd mappings can be stored in the basic 256-slot
| |   table, so potentially the following extra information needs
| |   recording in an extended table to handle codepoints > 0xff in the
| |   string being modified:
| |
| |   1) the extra replacement chars ("efg");
| |   2) the number of extra replacement chars (3);
| |   3) the "repeat" char ('g').
| |
| |   Currently 2) and 3) are combined: the repeat char is found as the
| |   last extra char, and if there are no extra chars, the repeat char
| |   is treated as an extra char list of length 1.
| |
| |   Similarly, an 'extra chars' length value of 1 can imply either one
| |   extra char, or no extra chars with the repeat char being faked as
| |   an extra char. An 'extra chars' length of 0 implies an empty
| |   replacement list, i.e. tr/....//c.
| |
| |   This commit changes it so that the repeat char is *always* stored
| |   (in slot 0x101), with the extra chars stored beginning at slot
| |   0x102. The 'extra chars' length value (located at slot 0x100) has
| |   changed its meaning slightly: now
| |
| |       -1 implies tr/....//c
| |        0 implies no more replacement chars than search chars
| |       1+ the number of excess replacement chars.
| |
| |   This (should) make no functional difference, but the extra
| |   information stored will make it easier to fix some bugs shortly.
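The slot layout introduced by this commit can be sketched as a lookup for code points at or above 0x100. This is an illustration only: `ext_lookup` and the `EXT_*` names are hypothetical, the /d and /s flags are ignored, and returning the input unchanged for a length of -1 is an assumption about how an empty replacement list behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Slot positions as described in the commit message; names are
 * hypothetical. */
enum {
    EXT_LEN_SLOT    = 0x100,   /* -1, 0, or number of excess chars */
    EXT_REPEAT_SLOT = 0x101,   /* the "repeat" char, always stored */
    EXT_FIRST_SLOT  = 0x102    /* first excess replacement char */
};

/* Map a code point >= 0x100 using the extended part of the table. */
static unsigned long ext_lookup(const short *tbl, unsigned long ch)
{
    long excess_len;
    unsigned long idx;

    assert(tbl != NULL && ch >= 0x100);
    excess_len = tbl[EXT_LEN_SLOT];
    if (excess_len == -1)
        return ch;                       /* tr/....//c: assumed identity */
    idx = ch - 0x100;
    if (excess_len > 0 && idx < (unsigned long)excess_len)
        return (unsigned long)tbl[EXT_FIRST_SLOT + idx];
    return (unsigned long)tbl[EXT_REPEAT_SLOT];
}
```

For the tr/\x00-AE-\xff/bcdefg/c example, the length slot would hold 3, the repeat slot 'g', and the excess slots "efg", so \x{100} maps to e and everything from \x{102} onwards maps to g.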
| * S_pmtrans(): add assert and simplify conditional  [David Mitchell, 2018-01-19, 1, -2/+4]
| |   In tr/search/replace/c, the number of 'paired' replacement chars
| |   will always be <= length(replace).
| |
| |   Assert this, and thus simplify a couple of conditionals from >= to
| |   ==.
| |
| |   It should make no difference to execution, but reduces the
| |   cognitive load.
| * t/op/tr.t: add tr///c tests  [David Mitchell, 2018-01-19, 1, -1/+411]
| |   The /c (complement) flag is almost completely untested. Indeed, for
| |   the all non-utf8 case, nothing in core exercises a plain tr///c.
| |
| |   So this commit adds reasonably comprehensive tests for tr///c and
| |   variants (/cs, /cd, /csd) where the search and replacement ranges
| |   are non-utf8, and the string being matched may or may not be utf8.
| |
| |   A few tests are TODO for now as I've exposed some bugs - to be
| |   fixed shortly.
| * S_pmtrans(): always use op_private flag variables  [David Mitchell, 2018-01-19, 1, -3/+2]
| |   Various flag vars are set early on, such as:
| |
| |       const I32 complement = o->op_private & OPpTRANS_COMPLEMENT;
| |
| |   but sometimes these vars weren't being used, and op_private was
| |   being tested again.
| * remove fossil debugging statement from do_trans()  [David Mitchell, 2018-01-19, 1, -2/+0]
| |   This:
| |
| |       DEBUG_t( Perl_deb(aTHX_ "2.TBL\n"));
| |
| |   has been around in one form or another since perl1, but it makes no
| |   sense since perl5.000, where -Dt now shows the name of the op being
| |   executed.
| * S_pmtrans(): remove some whitespace  [David Mitchell, 2018-01-19, 1, -2/+1]
| |   Removal of MAD a long time ago left a couple of lines with very
| |   weird indentation.
| * tr/// functions: add some basic code comments  [David Mitchell, 2018-01-19, 3, -6/+156]
|/
|   For the various C functions which implement the compile-time and
|   run-time aspects of OP_TRANS, add some basic code comments at the
|   top of each function explaining what its purpose is.
|
|   Also add lots of code comments to the body of S_pmtrans() (which
|   compiles a tr///).
|
|   Also comment what the OPpTRANS_ private flag bits mean.
|
|   No functional changes.
* Fix original version of Socket in perldelta  [Dagfinn Ilmari Mannsåker, 2018-01-19, 1, -1/+1]
|   The upgrade from 2.020_04 to 2.025 was not noted manually in
|   perldelta, since there was nothing particularly noteworthy in the
|   update, and the comment at the top of the section says this will
|   happen automatically as part of the release process.
* perldelta 60fa46621ae5d0d44c802aedc205274584701fa0  [Zefram, 2018-01-19, 1, -0/+5]
|
* fix F0convert() on edge cases  [Zefram, 2018-01-19, 2, -4/+14]
|   The F0convert() function used to implement the %.0f format
|   specifier more cheaply went wrong on some edge cases. Its rounding
|   went wrong when the exponent is such that fractional values are not
|   representable, making the "+= 0.5" invoke floating point rounding.
|   Fix that by only invoking that rounding logic for values that start
|   out fractional. That fixes the output part of [perl #47602]. It
|   also failed to emit the sign for negative zero. Fix that by making
|   it not apply to zero values.
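The rounding problem can be reproduced in a few lines of C. This is a simplified sketch rather than the code in F0convert(): the function names are hypothetical, only non-negative values below 2^53 are handled, ties are rounded up rather than to even, and IEEE 754 doubles with the default rounding mode are assumed. It covers only the rounding half of the fix, not the negative-zero half.

```c
#include <assert.h>

/* Naive approach: always add 0.5, then truncate. For doubles at or
 * above 2^52 the sum is not representable, so the addition itself
 * rounds and can change an already-integral value. */
static unsigned long long round_naive(double x)
{
    assert(x >= 0.0 && x < 9007199254740992.0);   /* 2^53 */
    return (unsigned long long)(x + 0.5);
}

/* Guarded approach: only apply the "+ 0.5" step to values that
 * actually have a fractional part. */
static unsigned long long round_guarded(double x)
{
    unsigned long long whole;

    assert(x >= 0.0 && x < 9007199254740992.0);   /* 2^53 */
    whole = (unsigned long long)x;                /* truncates toward zero */
    if ((double)whole == x)
        return whole;                             /* already integral */
    return (unsigned long long)(x + 0.5);
}
```

With x = 4503599627370497.0 (2^52 + 1), the naive version returns a value one larger than x under the stated assumptions, while the guarded version returns x unchanged.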
* perldelta for 7d97880ddc4f275caa3eeab435a4f5a8cf601971  [James E Keenan, 2018-01-18, 1, -0/+4]
|
* Sync Socket with CPAN (2.025 -> 2.027).  [James E Keenan, 2018-01-18, 4, -8/+11]
|   Addresses RT #132737.
* Getting perldelta for 5.27.8 into shape.  [Abigail, 2018-01-19, 1, -222/+59]
|
* Additional fix-ups for configure.com.  [Craig A. Berry, 2018-01-18, 1, -0/+5]
|   Some things VMS doesn't have and one that it does. All were missing
|   from the config.sh we generate.
* override autodetection of mkostemp() on Darwin  [Zefram, 2018-01-18, 1, -0/+4]
|   On Darwin 15.6.0, mkostemp() was observed to be autodetected as
|   present but actually be unlinkable. It is unknown what other Darwin
|   versions are affected, so for the time being just override the
|   autodetection on all versions.
* Revert "Revert "make PerlIO handle FD_CLOEXEC""  [Zefram, 2018-01-18, 10, -71/+85]
|   This reverts commit 523d71b314dc75bd212794cc8392eab8267ea744,
|   reinstating commit 2cdf406af42834c46ef407517daab0734f7066fc.
|   Reversion is not the way to address the porting problem that
|   motivated that reversion.
* tick off release 5.27.7  [Karen Etheridge, 2018-01-18, 1, -1/+1]
|
* Correct pad.c pod: PadARRAY, not PAD_ARRAY  [Father Chrysostomos, 2018-01-18, 1, -1/+1]
|
* perl -Dr: avoid coredump in \1  [David Mitchell, 2018-01-18, 1, -1/+1]
|   When displaying each reg node being executed, the code that dumps a
|   REF node assumed that a capture was valid if
|   progs->offs[n].start != -1. In fact during backtracking after a
|   failure, a capture is "undone" by merely setting
|   progs->offs[n].end = -1. So make the dump code account for that
|   too.
|
|   This was causing a test in t/re/pat.t to coredump:
|
|       use re qw(Debug EXECUTE);
|       "x" =~ m{ () y | () \1 }x;
|
|   Although given that neither the test nor the REF code in regprop()
|   has changed recently, I'm not sure why this has only recently
|   started crashing.
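The corrected validity test amounts to checking both ends of the capture. A minimal C sketch follows, assuming a pair of offsets per capture group; `capture_offsets` and `capture_is_valid` are hypothetical names and do not reproduce perl's regexp structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the capture offsets array. */
typedef struct {
    long start;
    long end;
} capture_offsets;

/* A capture is usable only if neither offset is -1. Backtracking may
 * "undo" a capture by resetting just `end`, so testing `start` alone
 * is not enough. */
static int capture_is_valid(const capture_offsets *cap)
{
    assert(cap != NULL);
    return cap->start != -1 && cap->end != -1;
}
```

A capture with start 0 and end -1 is the case the old check accepted by mistake; under this predicate it is rejected.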