delta/perl.git - github.com: perl/perl5.git

	Commit message	Author	Age	Files	Lines
*	Perlhist entry for 5.27.8 (tag: v5.27.8)	Abigail	2018-01-20	1	-0/+1
\|
*	Fixup for perldelta.	Abigail	2018-01-20	1	-3/+0
\|	Removed an XXX section.
*	Updated modules for perldelta.	Abigail	2018-01-20	1	-17/+71
\|
*	Acknowledgements for perldelta	Abigail	2018-01-20	1	-2/+26
\|
*	Update Module::CoreList for 5.27.8	Abigail	2018-01-20	1	-0/+41
\|
*	perldelta: Clarify entry	Karl Williamson	2018-01-19	1	-1/+2
\|	Spotted by Dan Book
*	Don’t vivify elems when putting array on stack	Father Chrysostomos	2018-01-19	2	-4/+43
\|	6661956a2 was a little too powerful, and, in addition to fixing the bug that @_ did not properly alias nonexistent elements, also broke other uses of nonexistent array elements. (See the tests added.)
\|
\|	This commit changes it so that putting @a on the stack does not vivify all ‘holes’ in @a but instead creates defelem (deferred element) scalars, and only in lvalue context.
*	Apply the mod flag to @a in \(@a)	Father Chrysostomos	2018-01-19	1	-2/+10
\|	The next commit will depend on it.
*	perldelta for signatures/attribute order flip	David Mitchell	2018-01-19	1	-0/+14
\|
*	move sub attributes before the signature	David Mitchell	2018-01-19	8	-1232/+1171
\|	RT #132141
\|
\|	Attributes such as :lvalue have to come before the signature to ensure that they're applied to any code block within the signature; e.g.
\|
\|	    sub f :lvalue ($a = do { $x = "abc"; return substr($x,0,1)}) { .... }
\|
\|	So this commit moves sub attributes to come before the signature. This is how they were originally, but they were swapped with v5.21.7-394-gabcf453.
\|
\|	This commit is essentially a revert of that commit (and its followups v5.21.7-395-g71917f6, v5.21.7-421-g63ccd0d), plus some extra work for Deparse, and an extra test.
\|
\|	See:
\|	    RT #123069 for why they were originally swapped
\|	    RT #132141 for why that broke :lvalue
\|	    http://nntp.perl.org/group/perl.perl5.porters/247999 for a general discussion about RT #132141
*	newSVpvn(): Fix pod	Karl Williamson	2018-01-19	1	-1/+1
\|	There is no "buffer" argument; don't refer to one.
\|
\|	Spotted by KES
*	Raise deprecation for qr/(?foo})/	Karl Williamson	2018-01-19	5	-26/+53
\|	An unescaped left brace that is meant to be taken literally is officially deprecated, though there are no plans to remove it in contexts where we don't expect to use it to mean something else, and no warning is raised in those contexts.
\|
\|	reg_mesg.t tests the known set of these contexts, currently (after this commit):
\|
\|	    /^{/
\|	    /foo\|{/
\|	    /foo\|^{/
\|	    /foo(:?{bar)/
\|	    /\s*{/
\|	    /a{3,4}{/
\|
\|	This commit deprecates this context:
\|
\|	    /foo({bar})/
\|
\|	This probably should have been illegal all along when 'bar' is a valid quantifier, as we do with the other quantifiers that follow a left paren whose illegality we haven't already taken advantage of to mean something else:
\|
\|	    qr/(+0)/
\|	    Quantifier follows nothing in regex
\|
\|	This deprecation will allow ({...}) to be usable for a possible future regex extension.
*	doop.c: White-space only	Karl Williamson	2018-01-19	1	-10/+10
\|	Indent to correspond with the new block placed by the previous commit.
*	Deprecate above \xFF in bitwise string ops	Karl Williamson	2018-01-19	5	-7/+106
\|	This is already a fatal error for operations whose outcome depends on them, but in things like "abc" & "def\x{100}" the wide character doesn't actually need to participate in the AND, and so perl doesn't.
\|
\|	As a result of the discussion in the thread beginning with http://nntp.perl.org/group/perl.perl5.porters/244884, it was decided to deprecate these ones too.
*	doop.c: Use MIN()	Karl Williamson	2018-01-19	1	-1/+1
\|	This is slightly cleaner than hand rolling the min.
*	op/bop.t: Fix typo in test name	Karl Williamson	2018-01-19	1	-1/+1
\|
*	Update Copyright years in README and perl.c.	Abigail	2018-01-19	2	-3/+4
\|	Now, 2018 is included.
*	perldelta: add recent tr/// changes	David Mitchell	2018-01-19	1	-0/+22
\|
*	[MERGE] various tr/// fixups, esp for /c and /d	David Mitchell	2018-01-19	13	-203/+811
\|\	This branch does the following:
\| \|
\| \|	Fixes an issue with tr/non_utf8/long_non_utf8/c, where length(long_non_utf8) > 0x7fff.
\| \|
\| \|	Fixes an issue with tr/non_utf8/non_utf8/cd: basically, the implicit \x{100}-\x{7fffffff} added to the searchlist by /c wasn't being added.
\| \|
\| \|	Adds a lot of code comments to the various tr/// functions.
\| \|
\| \|	Adds tr///c tests - basically /c was almost completely untested.
\| \|
\| \|	Changes the layout of the op_pv transliteration table: it used to be roughly
\| \|
\| \|	    256 x short - basic table
\| \|	    1 x short - length of extended table (n)
\| \|	    n x short - extended table
\| \|
\| \|	where the 2nd and 3rd items were only present under /c. It's now
\| \|
\| \|	    1 x Size_t - length of table (256+n)
\| \|	    (256+n) x short - table - both basic and extended
\| \|
\| \|	where n == 0 apart from under /c.
\| \|
\| \|	The new table format also allowed the tr/non_utf8/non_utf8/ code branches to be considerably simplified.
\| \|
\| \|	op_dump() now dumps the contents of the (non-utf8 variant) transliteration table.
\| \|
\| \|	Removes I32s from the tr/non_utf8/non_utf8/ code paths, making it fully 64-bit clean.
\| \|
\| \|	Improves the pod for tr///.
\| *	perlop: improve tr/// documentation	David Mitchell	2018-01-19	1	-6/+20
\| \|	Specifically, explain more clearly what the /csd modifiers do.
\| *	tr///: eliminate I32 from the do_trans*() fns	David Mitchell	2018-01-19	1	-15/+15
\| \|	Replace each with a more appropriate type.
\| *	tr///: return Size_t count rather than I32	David Mitchell	2018-01-19	4	-29/+29
\| \|	Change the signature of all the internal do_trans*() functions to return Size_t rather than I32, so that the count returned by tr/// can cope with strings longer than 2GB.
\| *	tr///: remove some I32 from S_pmtrans()	David Mitchell	2018-01-19	1	-16/+15
\| \|	Using I32 to hold char counts etc. is generally a bug. I've replaced them with Size_t. I've left the swash part of the code alone.
\| *	tr/nonutf8/nonutf8/c: simplify GROW calc	David Mitchell	2018-01-19	1	-2/+5
\| \|	When, for each slot, deciding whether to set OPpTRANS_GROWS, the calculation is only done in one of 4 possible branches. It turns out that in the other branches, the condition can never be true; but determining that is subtle, and the assumption might break for future changes.
\| \|
\| \|	Move the test outside the if/else tree so it can be seen to always apply.
\| \|
\| \|	So in theory this commit makes no functional difference.
\| *	op_dump(): dump tr/// translation table	David Mitchell	2018-01-19	1	-3/+35
\| \|	Previously it just displayed its address.
\| \|
\| \|	Also, when the table is in fact a swash, don't display its address on threaded builds, as it's actually just a padix.
\| *	tr///; simplify $utf8 =~ tr/nonutf8/nonutf8/	David Mitchell	2018-01-19	5	-175/+77
\| \|	The run-time code to handle a non-utf8 tr/// against a utf8 string is complex, with many variants of similar code repeated depending on the presence of the /s and /c flags. Simplify them all into a single code block by changing how the translation table is stored.
\| \|
\| \|	Formerly, the tr struct contained possibly two tables: the basic 0-255 slot one, plus in the presence of /c, a second one to map the implicit search range (\x{100}...) against any residual replacement chars not consumed by the first table. This commit merges the two tables into a single unified whole. For example
\| \|
\| \|	    tr/\x00-\xfe/abcd/c
\| \|
\| \|	is equivalent to
\| \|
\| \|	    tr/\xff-\x{7fffffff}/abcd/
\| \|
\| \|	which generates a 259-entry translation table consisting of:
\| \|
\| \|	    0x00  => -1
\| \|	    0x01  => -1
\| \|	    ...
\| \|	    0xfe  => -1
\| \|	    0xff  => a
\| \|	    0x100 => b
\| \|	    0x101 => c
\| \|	    0x102 => d
\| \|
\| \|	In addition we store:
\| \|
\| \|	1) the size of the translation table (0x103 in the example above);
\| \|
\| \|	2) an extra 'wildcard' entry stored 1 slot beyond the main table, which specifies the action for any codepoints outside the range of the table (i.e. chars 0x103..0x7fffffff). This can be either:
\| \|
\| \|	    a) a character, when the last replacement char is repeated;
\| \|	    b) -1 when /c isn't in effect;
\| \|	    c) -2 when /d is in effect;
\| \|	    d) -3 identity: when the replacement list is empty but not /d.
\| \|
\| \|	In the example above, this would be
\| \|
\| \|	    0x103 => d
\| \|
\| \|	The addition of -3 as a valid slot value is new.
\| \|
\| \|	This makes the main runtime code for the utf8 string with non-utf8 tr// case look like, at its core:
\| \|
\| \|	    size = tbl->size;
\| \|	    mapped_ch = tbl->map[ch >= size ? size : ch];
\| \|
\| \|	which then processes mapped_ch based on whether it's >= 0, or -1/-2/-3.
\| \|
\| \|	This is a lot simpler than the old scheme, and should generally be faster too.
\| *	tr///c: handle len(replacement charlist) > 32767	David Mitchell	2018-01-19	6	-6/+34
\| \|	RT #132608
\| \|
\| \|	In the non-utf8 case, the /c (complement) flag to tr adds an implied \x{100}-\x{7fffffff} range to the search charlist. If the replacement list contains more chars than are paired with the 0-255 part of the search list, then the excess chars are stored in an extended part of the table. The excess char count was being stored as a short, which caused problems if the replacement list contained more than 32767 excess chars: either substituting the wrong char, or substituting for a char located up to 0xffff bytes in memory before the real translation table.
\| \|
\| \|	So change it to SSize_t.
\| \|
\| \|	Note that this is only a problem when the search and replacement charlists are non-utf8, the replacement list contains around 0x8000+ entries, and where the string being translated is utf8 with at least one codepoint >= U+8000.
\| *	B, Deparse fixups for tr///c	David Mitchell	2018-01-19	3	-16/+34
\| \|	Recent commits slightly changed the layout of the extended map table: it now always stores a repeat count, and there are now two structs defined, rather than treating certain slots, like tbl[0x101], specially.
\| \|
\| \|	Update B and Deparse to reflect this.
\| *	add two structs for OP_TRANS	David Mitchell	2018-01-19	3	-46/+67
\| \|	Originally, the op_pv of an OP_TRANS op pointed to a 256-slot array of shorts, which contained the translations. However, in the presence of tr///c, extra information needs to be stored to handle utf8 strings. The 256-slot array was extended, with slot 0x100 holding a length, and slots from 0x101 onwards holding some extra chars.
\| \|
\| \|	This has made things a bit messy, so this commit adds two structs, one being an array of 256 shorts, and the other being the same but with some extra fields. So for example tbl->[0x100] has been replaced with tbl->excess_len.
\| \|
\| \|	This commit should make no functional difference, but will allow us shortly to fix a bug by changing the type of the excess_len field from short to something bigger, for example.
\| *	S_do_trans_complex(): re-indent	David Mitchell	2018-01-19	1	-6/+6
\| \|	Outdent a code block following the previous commit.
\| *	fix "\x{100}..." =~ tr/.../.../cd	David Mitchell	2018-01-19	3	-40/+47
\| \|	In transliterations where the search and replacement charlists are non-utf8, but where the string being modified contains codepoints >= 0x100, then tr/.../.../cd would always delete all such codepoints, rather than potentially mapping some of them.
\| \|
\| \|	In more detail: in the presence of /c (complement), an implicit 0x100..0x7fffffff is added to a non-utf8 search charlist. If the replacement list is longer than the < 0x100 part of the search list, then the last few replacement chars should in principle be paired off against the first few of (\x{100}, \x{101}, ...). However, this wasn't happening. For example,
\| \|
\| \|	    tr/\x00-\xfd/ABCD/cd
\| \|
\| \|	should be equivalent to
\| \|
\| \|	    tr/\xfe-\x{7fffffff}/ABCD/d
\| \|
\| \|	which should map:
\| \|
\| \|	    \xfe => A, \xff => B, \x{100} => C, \x{101} => D,
\| \|
\| \|	and delete \x{102} onwards. But instead, it behaved like
\| \|
\| \|	    tr/\xfe-\x{7fffffff}/AB/d
\| \|
\| \|	and deleted all codepoints >= 0x100.
\| \|
\| \|	This commit fixes that by using the extended mapping table format for all /c variants (formerly it excluded /cd).
\| \|
\| \|	I also changed a variable holding the mapped char from being I32 to UV: principally to avoid a casting mess in the fixed code. This may (or may not), as a side-effect, have fixed possible issues with very large codepoints.
\| *	OP_TRANS: change extended table format	David Mitchell	2018-01-19	2	-28/+54
\| \|	For non-utf8, OP_TRANS(R) ops have a translation table consisting of an array of 256 shorts attached. For tr///c, this table is extended to hold information about chars in the replacement list which aren't paired with chars in the search list. For example,
\| \|
\| \|	    tr/\x00-AE-\xff/bcdefg/c
\| \|
\| \|	is equivalent to
\| \|
\| \|	    tr/BCD\x{100}-\x{7fffffff}/bcdefg/
\| \|
\| \|	which is equivalent to
\| \|
\| \|	    tr/BCD\x{100}-\x{7fffffff}/bcdefggggggggg..../
\| \|
\| \|	Only the BCD => bcd mappings can be stored in the basic 256-slot table, so potentially the following extra information needs recording in an extended table to handle codepoints > 0xff in the string being modified:
\| \|
\| \|	    1) the extra replacement chars ("efg");
\| \|	    2) the number of extra replacement chars (3);
\| \|	    3) the "repeat" char ('g').
\| \|
\| \|	Currently 2) and 3) are combined: the repeat char is found as the last extra char, and if there are no extra chars, the repeat char is treated as an extra char list of length 1. Similarly, an 'extra chars' length value of 1 can imply either one extra char, or no extra chars with the repeat char being faked as an extra char. An 'extra chars' length of 0 implies an empty replacement list, i.e. tr/....//c.
\| \|
\| \|	This commit changes it so that the repeat char is always stored (in slot 0x101), with the extra chars stored beginning at slot 0x102. The 'extra chars' length value (located at slot 0x100) has changed its meaning slightly: now
\| \|
\| \|	    -1 implies tr/....//c
\| \|	    0  implies no more replacement chars than search chars
\| \|	    1+ the number of excess replacement chars.
\| \|
\| \|	This (should) make no functional difference, but the extra information stored will make it easier to fix some bugs shortly.
\| *	S_pmtrans(): add assert and simplify conditional	David Mitchell	2018-01-19	1	-2/+4
\| \|	In tr/search/replace/c, the number of 'paired' replacement chars will always be <= length(replace). Assert this, and thus simplify a couple of conditionals from >= to ==.
\| \|
\| \|	It should make no difference to execution, but reduces the cognitive load.
\| *	t/op/tr.t: add tr///c tests	David Mitchell	2018-01-19	1	-1/+411
\| \|	The /c (complement) flag is almost completely untested. Indeed, for the all non-utf8 case, nothing in core exercises a plain tr///c.
\| \|
\| \|	So this commit adds reasonably comprehensive tests for tr//c and variants (/cs, /cd, /csd) where the search and replacement ranges are non-utf8, and the string being matched may or may not be utf8.
\| \|
\| \|	A few tests are TODO for now as I've exposed some bugs - to be fixed shortly.
\| *	S_pmtrans(): always use op_private flag variables	David Mitchell	2018-01-19	1	-3/+2
\| \|	Various flag vars are set early on, such as:
\| \|
\| \|	    const I32 complement = o->op_private & OPpTRANS_COMPLEMENT;
\| \|
\| \|	but sometimes these vars weren't being used, and op_private was being tested again.
\| *	remove fossil debugging statement from do_trans()	David Mitchell	2018-01-19	1	-2/+0
\| \|	This:
\| \|
\| \|	    DEBUG_t( Perl_deb(aTHX_ "2.TBL\n"));
\| \|
\| \|	has been around in one form or another since perl1, but it has made no sense since perl5.000, where -Dt now shows the name of the op being executed.
\| *	S_pmtrans(): remove some whitespace	David Mitchell	2018-01-19	1	-2/+1
\| \|	Removal of MAD a long time ago left a couple of lines with very weird indentation.
\| *	tr/// functions: add some basic code comments	David Mitchell	2018-01-19	3	-6/+156
\|/	For the various C functions which implement the compile-time and run-time aspects of OP_TRANS, add some basic code comments at the top of each function explaining what its purpose is.
\|
\|	Also add lots of code comments to the body of S_pmtrans() (which compiles a tr///).
\|
\|	Also comment what the OPpTRANS_ private flag bits mean.
\|
\|	No functional changes.
*	Fix original version of Socket in perldelta	Dagfinn Ilmari Mannsåker	2018-01-19	1	-1/+1
\|	The upgrade from 2.020_04 to 2.025 was not noted manually in perldelta, since there was nothing particularly noteworthy in the update, and the comment at the top of the section says this will happen automatically as part of the release process.
*	perldelta 60fa46621ae5d0d44c802aedc205274584701fa0	Zefram	2018-01-19	1	-0/+5
\|
*	fix F0convert() on edge cases	Zefram	2018-01-19	2	-4/+14
\|	The F0convert() function used to implement the %.0f format specifier more cheaply went wrong on some edge cases. Its rounding went wrong when the exponent was such that fractional values are not representable, making the "+= 0.5" invoke floating point rounding. Fix that by only invoking that rounding logic for values that start out fractional. That fixes the output part of [perl #47602].
\|
\|	It also failed to emit the sign for negative zero. Fix that by making it not apply to zero values.
*	perldelta for 7d97880ddc4f275caa3eeab435a4f5a8cf601971	James E Keenan	2018-01-18	1	-0/+4
\|
*	Sync Socket with CPAN (2.025 -> 2.027).	James E Keenan	2018-01-18	4	-8/+11
\|	Addresses RT #132737.
*	Getting perldelta for 5.27.8 into shape.	Abigail	2018-01-19	1	-222/+59
\|
*	Additional fix-ups for configure.com.	Craig A. Berry	2018-01-18	1	-0/+5
\|	Some things VMS doesn't have and one that it does. All were missing from the config.sh we generate.
*	override autodetection of mkostemp() on Darwin	Zefram	2018-01-18	1	-0/+4
\|	On Darwin 15.6.0, mkostemp() was observed to be autodetected as present but actually be unlinkable. It is unknown what other Darwin versions are affected, so for the time being just override the autodetection on all versions.
*	Revert "Revert "make PerlIO handle FD_CLOEXEC""	Zefram	2018-01-18	10	-71/+85
\|	This reverts commit 523d71b314dc75bd212794cc8392eab8267ea744, reinstating commit 2cdf406af42834c46ef407517daab0734f7066fc. Reversion is not the way to address the porting problem that motivated that reversion.
*	tick off release 5.27.7	Karen Etheridge	2018-01-18	1	-1/+1
\|
*	Correct pad.c pod: PadARRAY, not PAD_ARRAY	Father Chrysostomos	2018-01-18	1	-1/+1
\|
*	perl -Dr: avoid coredump in \1	David Mitchell	2018-01-18	1	-1/+1
\|	When displaying each reg node being executed, the code that dumps a REF node assumed that a capture was valid if progs->offs[n].start != -1. In fact during backtracking after a failure, a capture is "undone" by merely setting progs->offs[n].end = -1. So make the dump code account for that too.
\|
\|	This was causing a test in t/re/pat.t to coredump:
\|
\|	    use re qw(Debug EXECUTE);
\|	    "x" =~ m{ () y \| () \1 }x;
\|
\|	Although given that neither the test nor the REF code in regprop() have changed recently, I'm not sure why this has only recently started crashing.