delta/perl.git - github.com: perl/perl5.git

	Commit message	Author	Age	Files	Lines
*	Remove duplicate "the" in comments	Elvin Aslanov	2023-05-03	1	-2/+2
	Fix spelling on various files pertaining to core Perl.
*	fix incorrect vi filetype declarations in generated files	Lukas Mai	2023-03-24	1	-1/+1
	Vim's filetype declarations are case sensitive. The correct types for Perl, C, and Pod are perl, c, and pod, respectively.
*	manually triggered generated files - add file type data to modeline	Yves Orton	2023-02-19	1	-2/+2
	so that github syntax highlights them properly
*	mktables: flush output immediately when debugging	Karl Williamson	2022-09-28	1	-1/+1
	Helps disentangle mixed up output
*	Support Unicode 15.0	Unicode Consortium	2022-09-28	1	-7597/+7676
*	mktables: Skip some new 15.0 files	Karl Williamson	2022-09-28	1	-1/+1
	These are newly delivered by Unicode. I haven't had time to analyze them for use for potential new properties. They deal with security issues of characters that look alike. I'm not adding them to the list of files under git, but they are explicitly mentioned in mktables to indicate their not being used.
*	mktables: Skip some code for Unicode 15	Karl Williamson	2022-09-28	1	-1/+1
	As it becomes obsolete
*	mktables: Revise Version line search in inputs	Karl Williamson	2022-09-28	1	-1/+1
	Unicode 15.0 is revising the heading format for non-UCD files; fix mktables to be able to parse that.
*	mktables: Accept multiple @missing lines in input files	Karl Williamson	2022-09-28	1	-1/+1
	Unicode 15.0 will now use this approach to deal with ranges of code points that have a different default for unassigned code points than the table at large. For example, a table may have one default, but all Ideographic character ranges have something else. Prior to this new mechanism, the files had entries for each unassigned code point that had a different default than the global one. So this saves some lines in the files that Unicode delivers that were otherwise useless. Not all files in 15.0 have been converted to use the new scheme, for whatever reason.
*	mktables: Multi_Default now accepts multiple defaults per property	Karl Williamson	2022-09-28	1	-1/+1
	Unicode 15.0 may have multiple @missing lines for a single property that should use this class. This commit converts the storage into an array to accommodate that need.
*	mktables: Add two methods to Multi_Default class	Karl Williamson	2022-09-28	1	-1/+1
	These are so that you don't have to know everything at construction time. The constructor function changes to call these with whatever it does get passed.
*	mktables: More closely examine @missing lines	Karl Williamson	2022-09-28	1	-1/+1
	These lines have all had the same range (all of Unicode). But in Unicode 15.0, there will be some with different ranges. This commit changes mktables to save those values (which are currently still unused).
*	mktables: Standardize value aliases	Karl Williamson	2022-09-28	1	-1/+1
	As stated in the code comments added by this commit, Unicode has various spellings for the same property value. For example, in some places it uses 'W', and in others 'Wide'. The legal spellings are listed in PropValueAliases.txt, which is processed early in the construction. So we can standardize things on input, which makes it easier later. This commit produces minimal changes in the generated tables, so that the algorithm can be verified by inspection of the results. And no other code that has hard-coded expected spellings needs to be changed. Prior to this commit, we standardized the default value for properties that have a default value.
*	mktables: Add/Fix comments, white-space	Karl Williamson	2022-09-28	1	-1/+1
	This includes indenting a block of code in anticipation of a future commit which will form a conditional block around it
*	mktables: Use intermed variable to shorten name	Karl Williamson	2022-09-28	1	-1/+1
	This changes an inside-out hash reference to have a shorthand for it, making for better readability
*	mktables: Convert array to hash	Karl Williamson	2022-09-28	1	-1/+1
	Prior to this commit we had a two element array, and it was known that element 0 contained a particular thing; and element 1 contained the other. But a future commit will add several elements, so keeping track of which is which will become more problematic. Solve this by using a hash instead, with the elements appropriately named.
*	mktables: Fix some function signatures	Karl Williamson	2022-09-28	1	-1/+1
	These functions were missed or broken by 4fe9356b250. They're used only in debugging, so it wasn't noticed until now.
*	mktables: Reorder some code	Karl Williamson	2022-09-28	1	-1/+1
	This moves some code ahead of other code so that the end of the sub all works on a single related issue. This is in preparation for 15.0, where that issue becomes moot, so we can then change to return early from the sub.
*	mktables: Stop infinite loop with invalid input	Karl Williamson	2022-09-28	1	-1/+1
	This failed to exit when the file handle was exhausted
*	regen/mk_invlists.pl - under DEBUG=1 show some progress output	Yves Orton	2022-08-03	1	-1/+1
*	mktables: Don't generate pod for Name.pm	Karl Williamson	2022-07-02	1	-1/+2
	This is a relic from long ago. mktables creates lib/unicore/Name.pm. In that file, which is for internal core use only, it was creating the beginnings of some pod, but quite incomplete; this was confusing buildtoc, which perhaps could be hardened against such inputs.
*	Change handy.h macro names to be C standard conformant	Karl Williamson	2022-06-12	1	-2/+1
	C reserves symbols beginning with underscores for its own use. This commit moves the underscore so it is trailing, which is legal. The symbols changed here are many of the ones in handy.h that have significant uses outside it.
*	Update checksums in some generated files	Karl Williamson	2022-06-06	1	-1/+2
	These use checksums to see if the generated data could be out of date. The new NormTest.pl wasn't counted in this, and needn't be, but excluding it and other similar ones is more trouble than it's worth, so make a comment to that effect and update to include the NormTest.pl digest value.
*	regen/mph.pl - make sure the author of _squeeze() has a commit in the log	Ilya Sashcheka	2022-04-21	1	-1/+1
	This commit is actually by the committer, and is intended to ensure that someone looking for what the author wrote can find it. It took me a while to get an email address for him or I would have done this in eda35008b17e739922 which is where his work on the _squeeze() split key algorithm was added. Credit where credit is due and all of that. Thanks Ilya.
*	regen/mph.pl - add a validation step to build_split_words()	Yves Orton	2022-04-19	1	-1/+1
	Exercise an abundance of caution and validate that the buffer and split point data returned is fit for purpose. Includes the output of running regen/mk_invlists.pl.
*	regen/mph.pl & mk_invlists.pl - add the "_squeeze" algorithm to produce smaller blobs	Yves Orton	2022-04-19	1	-7588/+7560
	The squeeze algorithm produces smaller blobs, 10-20% depending on how it is used. With the "randomize_squeeze" option enabled it is slower but produces 20% smaller blobs than the "_simple" strategy we used to use. With the "randomize_squeeze" option disabled it is about as fast as "_simple" but produces about 10% smaller blobs. Regardless, "_squeeze" uses more memory than "_simple"; quite a bit more currently, although that is unforced and could be changed if required. -blob length: 10548 +blob length: 8635 ... -data size: 69908 (%67.07) +data size: 67995 (%65.23) So it saves 1913 bytes running with this seed. I happened to get lucky with the seed; depending on the seed used, the blob ended up at about 8650 bytes. This algorithm is originally by Ilya Sashcheka, so I have added him to the AUTHORS file, but unfortunately I no longer have his email address as we lost touch. It contains many modifications by me.
*	regen/mph.pl & mk_invlists.pl - convert from sub interfaces to OO interfaces	Yves Orton	2022-04-19	1	-3/+5
	The old sub based API was passing around an awkward number of arguments and it was becoming difficult to enhance in certain ways. This patch changes all the "user serviceable" functions into methods, and moves the configuration defaults into the constructor. Note that not all the functions have been converted; the core routines with simple interfaces have not been changed. This is OO for the purpose of encapsulation, not inheritance or overloading.
*	regen/mph.pl - Clean up diagnostics logic, allow DEBUG from env.	Yves Orton	2022-04-19	1	-271/+272
	Be silent unless requested to. If DEBUG>1 produce lots of output, if DEBUG==1 produce some basic information about what is going on.
*	regen/mk_invlists.pl - add a way to dump the keywords hash for review	Yves Orton	2022-04-19	1	-1/+1
	This adds a way to tell mk_invlists.pl to dump the keywords hash so it can be reviewed, or used for testing or whatnot. A user can define the env var DUMP_KEYWORDS_FILE to be a file name which will be used to save the keywords hash to. If the env var is not set the file won't get written to disk. Includes regenerated output from running regen/mk_invlists.pl to keep porting/regen.t happy.
*	regen/mk_invlists.pl - move token_name() sub closer to where it is used	Yves Orton	2022-04-19	1	-1/+1
	sub token_name() was injected into the middle of totally unrelated logic that does not use it. token_name() is a wrapper around sanitize_name() so move it next to that sub. Also includes the output from running regen/mk_invlists.pl to keep porting/regen.t happy.
*	regen/mk_invlists.pl - move require to top of file	Yves Orton	2022-04-19	1	-7581/+7580
	mk_invlists.pl does a lot and takes a while before it gets to the part where it requires regen/mph.pl, which means that if there are issues in it they aren't discovered until a fair amount of time elapses, which is frustrating when debugging. Moving the require to the top means the script dies early and can be fixed. Includes a regen of uni_keywords.h and friends as this changes a regen script which causes regen.t to fail if its output is not up to date.
*	Bump \p{nv=} precision from 2 to 3	Karl Williamson	2022-04-12	1	-6296/+6297
	This closes #19603. Unicode has various characters whose numeric value is rational non-integer. These can be specified in \p{nv=...} constructs by either the rational form or by an expression that it evaluates to. The number of significant digits that must match is kept to a minimum to allow for variances in different platforms' floating-point lengths and rounding decisions. Previously that number was 2 digits, but that is no longer always sufficient for all platforms. This commit changes it to 3.
*	Remove 'no warnings experimental::signatures' from support files	Paul "LeoNerd" Evans	2022-02-20	1	-1/+1
*	Fix lib/unicore/mktables for experimental::builtin warnings	Paul "LeoNerd" Evans	2022-01-25	1	-1/+1
*	Remove remaining uses of @_ in signatured subs in lib/unicore/mktables	Paul "LeoNerd" Evans	2022-01-24	1	-1/+1
*	Add missing aliases for \p{Present_In}	Karl Williamson	2022-01-05	1	-7332/+7336
	\p{Present_In} is a Perl extension of the Unicode Age property, added because knowing the exact Unicode version in which a code point became assigned is rarely what you want; much more frequently you want to know if the code point exists in the version or not. (Since this extension was added, Unicode changed their language to declare that the Age property should be interpreted in pattern matching, not as described, but as Perl's Present_In is. But I chose to not change Age, to avoid backwards compatibility issues, and this way, a coder can choose which thing s/he wants.) Unicode typically has synonyms (aliases) for each value a property can take on, so \p{Age=6.1} and \p{Age=V6_1} mean the same thing. Prior to this commit, neither \p{Present_In=1_1} nor \p{Present_In=NA} worked.
*	mktables: Use builtin::refaddr	Karl Williamson	2021-12-13	1	-1/+1
	Now that this function is available in miniperl, mktables can use it to avoid a bunch of visually distracting 'no overloading' calls.
*	mktables: Don't calculate some unused values	Karl Williamson	2021-12-13	1	-1/+1
	These apparently were once needed, but no longer.
*	mktables: Use mnemonic variable names	Karl Williamson	2021-12-07	1	-1/+1
	Spotted by Dagfinn Ilmari Mannsåker
*	Fix unicore/mktables to avoid any @_ accesses in signatured subs	Paul "LeoNerd" Evans	2021-12-07	1	-7348/+7348
*	mktables: Remove relics of removed legacy tables	Karl Williamson	2021-09-15	1	-1/+1
	These mentions of the tables removed in b852e1da77b497e086508451bebff00541073fb1 were missed in that commit.
*	Support Unicode 14.0	Unicode Consortium	2021-09-15	1	-7482/+7637
*	regen/mk_invlists.pl: Add comment	Karl Williamson	2021-09-15	1	-1/+1
*	mktables: Split a Line Break equivalence class	Karl Williamson	2021-09-15	1	-2/+2
	This is used for \b{lb}, and the rule is changing in Unicode 14.0
*	mktables: Reorder some comments, white-space	Karl Williamson	2021-09-15	1	-1/+1
	Move comments closer to the action
*	mktables: Rename variable, and hoist calc from loop	Karl Williamson	2021-09-15	1	-1/+1
*	Unicode::UCD: Don't depend on a file's current syntax	Karl Williamson	2021-08-31	1	-1/+1
	This generated file will be changed in a future commit. This shouldn't have been relying on its syntax anyway, but rather on the value it returns.
*	Unicode::UCD: Fix typo in pod	Karl Williamson	2021-08-31	1	-1/+1
*	Remove deprecated Unicode files	Karl Williamson	2021-09-01	1	-1/+1
	These files were once apparently intended for use by modules to supplement the core Unicode handling. They contain tables of the portions of the Unicode character database about changing the case of characters and finding the numeric value of a given \d character, in a form suitable for use by Perl code. In particular, they were designed for fast access using the swash mechanism that has since been removed. Unicode::UCD now contains more convenient methods of accessing the data these contain, and the use of these files has been deprecated since 5.16. I could not figure out a way to force a message should someone open and read one of these files, but each of their texts says that the file may be removed without notice at any time. I did not find any uses of them on CPAN. Unicode is adding new properties that the format of these files will not be able to handle. Consequently I'm coming up with a new format. Though these files don't contain the new properties, their existence means having the burden of having to maintain two separate mechanisms. Better to have just one mechanism, suitable for going forward.
*	mktables: Generate =head1 NAME line in Name.pm	Karl Williamson	2021-08-15	1	-1/+1
	All .pm files are supposed to have this line. So far this hasn't been necessary for this file, but future commits will require it.
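
The 2022-09-28 commit "mktables: Accept multiple @missing lines in input files" describes range-specific defaults for unassigned code points. As a rough illustration only, and not how mktables (a Perl script) implements it, the Python sketch below parses comment lines of the form "# @missing: START..END; VALUE" and looks up the default for a code point. The sample lines, the names parse_missing and default_for, and the rule that a later matching line overrides an earlier one are all assumptions made for this example.

```python
import re

# Hypothetical sample input, modeled on the "# @missing: range; value"
# comment form; it is not copied from any particular Unicode data file.
SAMPLE = """\
# @missing: 0000..10FFFF; N
# @missing: 3400..4DBF; W
# @missing: 4E00..9FFF; W
0020;Na
"""

MISSING_RE = re.compile(
    r"^#\s*@missing:\s*([0-9A-Fa-f]+)\.\.([0-9A-Fa-f]+)\s*;\s*(\S+)"
)


def parse_missing(text):
    """Return a list of (start, end, value) tuples, in file order."""
    defaults = []
    for line in text.splitlines():
        m = MISSING_RE.match(line)
        if m:
            start = int(m.group(1), 16)
            end = int(m.group(2), 16)
            defaults.append((start, end, m.group(3)))
    return defaults


def default_for(code_point, defaults):
    """Return the default for code_point, or None if no range covers it.

    Assumption for this sketch: when several ranges cover the code point,
    the one that appears last in the file wins.
    """
    value = None
    for start, end, val in defaults:
        if start <= code_point <= end:
            value = val
    return value


if __name__ == "__main__":
    defaults = parse_missing(SAMPLE)
    for cp in (0x0041, 0x3400, 0x4E00):
        print(f"U+{cp:04X} -> {default_for(cp, defaults)}")
```

Keeping the ranges in a plain list preserves file order, which is what lets a narrower range listed later take precedence over the table-wide default in this sketch.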