path: root/regen
Commit message | Author | Age | Files | Lines
* Remove no longer necessary constants | Karl Williamson | 2013-08-29 | 1 | -6/+0
| | | | | | These character constants were used only for a special edge case in trie construction that has been removed -- except for one instance in regexec.c which could just as well be some other character.
* utf8.h, unicode_constants.h: Add some #defines. | Karl Williamson | 2013-08-29 | 1 | -0/+3
| | | | These will be used in a future commit
* unicode_constants.h: Add #defines for CR, LF | Karl Williamson | 2013-08-29 | 1 | -0/+2
* regen/regcharclass.pl: Make more EBCDIC-friendly | Karl Williamson | 2013-08-29 | 1 | -3/+19
| | | | | | | | This commit changes the code generated by the macros so that they work right out-of-the-box on non-ASCII platforms for non-UTF-8 inputs. THEY ARE WRONG for UTF-8, but this is good enough to get perl bootstrapped onto the target platform, and regcharclass.pl can be run there, generating macros with correct UTF-8.
* unicode_constants.h: Add #defines for Byte Order Mark | Karl Williamson | 2013-08-29 | 1 | -0/+2
| | | | These will be used in future commits
* Don't refer to U+XXXX when mean native | Karl Williamson | 2013-08-29 | 1 | -1/+1
| | | | These messages say the output number is Unicode, but it is really native, so change them to say 0xXXXX instead.
* [perl #117265] safesyscalls: check embedded nul in syscall args | Tony Cook | 2013-08-26 | 1 | -3/+4
| | | | Check for the nul char in pathnames and string arguments to syscalls; on finding one, return undef and set errno to ENOENT. Added the syscalls category to the io warnings. Strings with embedded \0 chars were previously ignored in the syscall but kept in perl. The hidden payloads in these invalid string args may cause unnoticed security problems, as they are hard to detect, ignored by the syscalls but kept around in perl PVs. Allow an ending \0 though, as several modules add a \0 to such strings without adjusting the length. This is based on a change originally by Reini Urban, but pretty much all of the code has been replaced.
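The rule in this commit message (reject a nul anywhere inside the string, tolerate one at the very end) can be sketched as below. This is an illustrative Python sketch rather than the C code the commit touches; the function name `is_safe_syscall_arg` is hypothetical.

```python
def is_safe_syscall_arg(arg: bytes) -> bool:
    """Return True if `arg` contains no embedded NUL byte.

    A NUL as the final byte is tolerated, following the commit message's
    allowance for modules that append a "\\0" without adjusting the length.
    """
    # Inspect everything except the last byte: a NUL there would cut the
    # string short once it reaches a C-level syscall.
    return b"\0" not in arg[:-1]
```

A caller would refuse the operation (the commit message says: return undef and set errno to ENOENT) whenever this check fails.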
* Generate the lib/ cleanup rules in the Win32 Makefiles from MANIFEST. | Nicholas Clark | 2013-07-24 | 1 | -4/+32
* Generate the lib/ cleanup rules in Makefile.SH automatically from MANIFEST. | Nicholas Clark | 2013-07-24 | 1 | -3/+33
* Generate lib/.gitignore from MANIFEST. | Nicholas Clark | 2013-07-24 | 1 | -0/+122
| | | | | | | | | | It's possible to programmatically determine almost all the files and directories which will be created in lib/ by building the extensions. Hence add a new script regen/lib_cleanup.pl to do this. This saves having to manually update lib/.gitignore to reflect changes in the build products of extensions, which has become a small but reoccurring instance of scut-work.
* On failure, regen_lib.pl now generates diagnostics, not just "not ok". | Nicholas Clark | 2013-07-24 | 1 | -2/+33
| | | | | We have to stop using File::Compare's compare(), as it doesn't return diagnostics about what went wrong.
* Fix off-by-one error in inversion lists. | Karl Williamson | 2013-07-16 | 1 | -7/+2
| | | | | | | The first commit of this topic branch added a dummy 0 element to the end of certain inversion lists to work around an off-by-one error. This commit makes the necessary changes to stop that error, and to remove the dummy element. SvCUR() and invlist_len() now are kept in sync.
* Reinstate "regcomp.c: Make C-array inversion lists const" | Karl Williamson | 2013-07-16 | 1 | -2/+2
| | | | | | | | | | | This reverts commit 18505f093a44607b687ae5fe644872f835f66313, which reverted 241136e0ed70738cccd6c4b20ce12b26231f30e5, thus reinstating the latter commit. It turns out that the error being chased down was not due to this commit. Its original message was: The inversion lists that are compiled into a C header are now const.
* Reinstate "regcomp.c: Move 2 hdr inversion fields to SV hdr" | Karl Williamson | 2013-07-16 | 1 | -7/+1
| | | | | | | | | | | | | | This reverts commit 67434bafe4f2406e7c92e69013aecd446c896a9a, which reverted 4fdeca7844470c929f35857f49078db1fd124dbc, thus reinstating the latter commit. It turns out that the error being chased down was not due to this commit. Its original message was: This commit continues the process of separating the header area of inversion lists from the body. 2 more fields are moved out of the header portion of the inversion list, and into the header portion of the SV that contains it.
* Reinstate + fix "Revert "regcomp.c: Add a constant 0 element before inversion lists" " | Karl Williamson | 2013-07-16 | 1 | -18/+14
| | | | This reverts commit de353015643cf10b437d714d3483c1209e079916 which reverted 533c4e2f08b42d977e5004e823d4849f7473d2d0, thus reinstating it, plus this commit adds a fix to get it to pass under Address Sanitizer. The root cause of the problem is that there are two measures of the length of an inversion list. One is SvCUR(), and the other is invlist_len(). The original commit caused these to get off-by-one in some cases. The ultimate solution is to only store one value, and return the other one based off that. Rather than redo the whole branch, I've taken an easier way out, which is to add a dummy element at the end of some inversion lists, so that they aren't off-by-one. Then the other patches from the original branch will be applied. Each will be tested with Address Sanitizer. Then the work to fix the underlying problem will be done. The original commit's message was: This commit is the first step to separating the header from the body of inversion lists. Doing so will allow the compiled-in inversion lists to be fully read-only. To invert an inversion list, one simply unshifts a 0 to the front of it if one is not there, and shifts off the 0 if it does have one. The current data structure reserves an element at the beginning of each inversion list that is either 0 or 1. If 0, it means the inversion list begins there; if 1, it means the inversion list starts at the next element. Inverting involves flipping this bit. This commit changes the structure so that there is an additional element just after the element that flips. This new element is always 0, and the flipping element now says whether the inversion list begins at the constant 0 element, or the one after that. Doing this allows the flipping element to be separated in later commits from the body of the inversion list, which will always begin with the constant 0 element. That means that the body of the inversion list can be const.
* Specify the versions of ExtUtils::MiniPerl and ExtUtils::Embed needed. | Nicholas Clark | 2013-07-12 | 1 | -1/+1
| | | | | Without this, regen/miniperlmain.pl could end up finding versions which are out of date, and silently generate an incorrect miniperlmain.c
* Move the "editor block" from miniperlmain.c to ExtUtils::Miniperl | Nicholas Clark | 2013-07-07 | 1 | -2/+2
| | | | | | | As miniperlmain.c is now generated by ExtUtils::Miniperl (and not the other way round), there's no reason to have an editor block in the generated file, as it's not intended to be edited. Instead, add the "generated from" and read-only headers to miniperlmain.c
* Invert the build logic for miniperlmain.c and ExtUtils::Miniperl | Nicholas Clark | 2013-07-07 | 1 | -0/+15
| | | | | | | | | | | | | Now ExtUtils::Miniperl has the master version of {mini,}perlmain.c and is checked into the repository. miniperlmain.c is now generated by a script in regen/ which uses ExtUtils::Miniperl. Tweak ExtUtils::Miniperl::writemain() to take an optional first argument, a reference to a file handle. This permits the regen script to use the regen_lib.pl functions for file opening/closing/renaming and TAP generation. For now check in ExtUtils::Miniperl minimally modified from the version generated by the former minimod.pl. The next commit will tidy it up.
* Add an "always update" parameter to regen_lib's open_new(). | Nicholas Clark | 2013-07-07 | 1 | -11/+18
| | | | | | | | | | | | | | | | | | By default the code in regen_lib compares the newly written file it has just closed with the (assumed) existing file, and only overwrites the existing file if the new file differs. This is a useful behaviour for regeneration scripts. However, it's not ideal for build scripts called from the Makefile, as make assumes that targets will be regenerated (and the timestamp touched). So add an "always update" parameter for the use of Makefile invoked scripts, such as autodoc.pl. If set, delete any existing file early (so that fatal errors during the generation don't confuse the build by leaving an existing stale file around), skip the comparison and skip the diagnostic output listing the changed files. Change autodoc.pl to set this parameter. Correct a typo in an error message in regen_lib's open_new().
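The default compare-then-overwrite behaviour and the "always update" switch described above can be sketched as follows. This is a simplified Python illustration, not regen_lib.pl itself: the name `install_generated`, the `.new` temporary suffix, and the boolean return value are assumptions, and the early deletion of a stale file mentioned in the commit message is not modelled.

```python
import os


def install_generated(path: str, new_text: str, always_update: bool = False) -> bool:
    """Write new_text to path; return True if the file was (re)written."""
    if not always_update and os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            if fh.read() == new_text:
                # Unchanged: leave the existing file and its timestamp alone.
                return False
    # Write to a temporary name first, then rename over the target, so a
    # failure part-way through never leaves a half-written file in place.
    tmp_path = path + ".new"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.write(new_text)
    os.replace(tmp_path, path)
    return True
```

A regeneration script would call this with the default, so untouched outputs keep their timestamps; a script driven by make would pass `always_update=True` so the target's timestamp always advances.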
* Refactor the Text::Wrap::wrap() logic in regen/regen_lib.pl | Nicholas Clark | 2013-07-07 | 1 | -6/+9
| | | | | | Provide a local subroutine wrap(). Pass columns as its first parameter and set $Text::Wrap::columns, as all uses of Text::Wrap::wrap() were setting this variable.
* Refactor regen_lib.pl to reduce verbosity. | Nicholas Clark | 2013-07-07 | 1 | -14/+12
| | | | | Use hash slices to avoid repeated typeglob dereferences on $fh. In read_only_top() use a lexical to avoid repeated $args{lang} lookups.
* Revert "regcomp.c: Add a constant 0 element before inversion lists" | Karl Williamson | 2013-07-04 | 1 | -9/+18
| | | | | | | This reverts commit 533c4e2f08b42d977e5004e823d4849f7473d2d0. This continues the backing out of this topic branch. A bisect shows that the first commit exhibiting an error is the first one in the branch.
* Revert "regcomp.c: Move 2 hdr inversion fields to SV hdr" | Karl Williamson | 2013-07-04 | 1 | -1/+7
| | | | | | | This reverts commit 4fdeca7844470c929f35857f49078db1fd124dbc. This continues the backing out of this topic branch. A bisect shows that the first commit exhibiting an error is the first one in the branch.
* Revert "regcomp.c: Make C-array inversion lists const" | Karl Williamson | 2013-07-04 | 1 | -2/+2
| | | | | | | This reverts commit 241136e0ed70738cccd6c4b20ce12b26231f30e5. This continues the backing out of this topic branch. A bisect shows that the first commit exhibiting an error is the first one in the branch.
* regcomp.c: Make C-array inversion lists const | Karl Williamson | 2013-07-03 | 1 | -2/+2
| | | | The inversion lists that are compiled into a C header are now const.
* regcomp.c: Move 2 hdr inversion fields to SV hdr | Karl Williamson | 2013-07-03 | 1 | -7/+1
| | | | | | | This commit continues the process of separating the header area of inversion lists from the body. 2 more fields are moved out of the header portion of the inversion list, and into the header portion of the SV that contains it.
* regcomp.c: Add a constant 0 element before inversion lists | Karl Williamson | 2013-07-03 | 1 | -18/+9
| | | | | | | | | | | | | | | | | | | | | | | | This commit is the first step to separating the header from the body of inversion lists. Doing so will allow the compiled-in inversion lists to be fully read-only. To invert an inversion list, one simply unshifts a 0 to the front of it if one is not there, and shifts off the 0 if it does have one. The current data structure reserves an element at the beginning of each inversion list that is either 0 or 1. If 0, it means the inversion list begins there; if 1, it means the inversion list starts at the next element. Inverting involves flipping this bit. This commit changes the structure so that there is an additional element just after the element that flips. This new element is always 0, and the flipping element now says whether the inversion list begins at the constant 0 element, or the one after that. Doing this allows the flipping element to be separated in later commits from the body of the inversion list, which will always begin with the constant 0 element. That means that the body of the inversion list can be const.
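The unshift/shift inversion described in this message can be illustrated with a plain sorted list of range boundaries. The sketch below is in Python and models only the abstract idea (each element toggles between "not in the set" and "in the set"); it does not reproduce the header layout in regcomp.c, and the function names are hypothetical.

```python
from bisect import bisect_right


def invlist_contains(invlist: list[int], code_point: int) -> bool:
    """True if code_point is in the set described by the inversion list."""
    # Count the boundaries at or below code_point; an odd count means we
    # are inside one of the "in the set" ranges.
    return bisect_right(invlist, code_point) % 2 == 1


def invlist_invert(invlist: list[int]) -> list[int]:
    """Return the complement: drop a leading 0 if present, else prepend one."""
    if invlist and invlist[0] == 0:
        return invlist[1:]
    return [0] + invlist
```

For example, `[0x41, 0x5B]` describes the code points for A through Z; inverting it gives `[0, 0x41, 0x5B]`, which matches everything except A through Z, and inverting again restores the original list.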
* regen/genpacksizetables.pl: Add comment | Karl Williamson | 2013-06-26 | 1 | -0/+4
* Show intflags as well as extflags | Yves Orton | 2013-06-22 | 1 | -1/+46
* Make ‘make regen’ regenerate the tree in perllexwarn | Father Chrysostomos | 2013-06-09 | 1 | -1/+23
| | | | | | | | | | | ‘perl regen/warnings.pl tree’ would already generate the tree, but it had to be run separately and then copied and pasted into perllexwarn. Now regen/warnings.pl modifies perllexwarn in place as part of its regeneration. The ‘tree’ command line argument will still cause the tree to be output to STDOUT. This causes the three missing experimental categories to be listed in perllexwarn, resolving ticket #118369.
* In regen/regen_lib.pl, add 'Pod' as a third supported 'language'. | Nicholas Clark | 2013-05-23 | 1 | -5/+9
| | | | Pod needs a commenting style distinct from C and Perl (i.e. the empty string).
* typo fixes for regen scripts | David Steinbrunner | 2013-05-22 | 4 | -11/+11
* Eliminate pre-5.9.x conditional code for PERL_PACK_CAN_SHRIEKSIGN | Nicholas Clark | 2013-05-20 | 1 | -4/+4
| | | | | | | PERL_PACK_CAN_SHRIEKSIGN has been unconditionally defined for versions 5.9.x and greater, and undefined for 5.8.x. As we are never going to need to port changes back to maint-5.8 any more, eliminate all the 5.8.x related code and the macro that supports it.
* Move genpacksizetables.pl to regen/genpacksizetables.pl | Nicholas Clark | 2013-05-20 | 1 | -0/+125
* Fix multi-char fold edge case | Karl Williamson | 2013-05-20 | 1 | -0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | use locale; fc("\N{LATIN CAPITAL LETTER SHARP S}") eq 2 x fc("\N{LATIN SMALL LETTER LONG S}") should return true, as the SHARP S folds to two 's's in a row, and the LONG S is an antique variant of 's', and folds to s. Until this commit, the expression was false. Similarly, the following should match, but didn't until this commit: "\N{LATIN SMALL LETTER SHARP S}" =~ /\N{LATIN SMALL LETTER LONG S}{2}/iaa The reason these didn't work properly is that in both cases the actual fold to 's' is disallowed. In the first case because of locale; and in the second because of /aa. And the code wasn't smart enough to realize that these were legal. The fix is to special case these so that the fold of sharp s (both capital and small) is two LONG S's under /aa; as is the fold of the capital sharp s under locale. The latter is user-visible, and the documentation of fc() now points that out. I believe this is such an edge case that no mention of it need be done in perldelta.
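The equality this message expects can be checked against ordinary Unicode full case folding. The Python sketch below uses `str.casefold()` for that purpose; it shows only the unrestricted folds (sharp s to "ss", long s to "s") and does not model the locale or /aa restrictions that the commit works around. The helper name `folds_equal` is hypothetical.

```python
SHARP_S_CAPITAL = "\N{LATIN CAPITAL LETTER SHARP S}"
SHARP_S_SMALL = "\N{LATIN SMALL LETTER SHARP S}"
LONG_S = "\N{LATIN SMALL LETTER LONG S}"


def folds_equal(left: str, right: str) -> bool:
    """Compare two strings after Unicode full case folding."""
    return left.casefold() == right.casefold()


# Both sharp s forms fold to two 's' characters, and so do two long s's.
print(folds_equal(SHARP_S_CAPITAL, LONG_S * 2))
print(folds_equal(SHARP_S_SMALL, LONG_S * 2))
```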
* unicode_constants.h: Add some #defines | Karl Williamson | 2013-05-20 | 1 | -0/+3
| | | | These will be used in future commits
* pp.c: Eliminate custom macro and use Copy() instead | Karl Williamson | 2013-05-20 | 1 | -0/+3
| | | | | | I think it's clearer to use Copy. When I wrote this custom macro, we didn't have the infrastructure to generate a UTF-8 encoded string at compile time.
* regen feature.pm | Ricardo Signes | 2013-05-18 | 1 | -1/+3
* Improve how regcomp.pl handles multibits | Yves Orton | 2013-03-27 | 1 | -7/+16
| | | | In preparation for future changes.
* Make smartmatch, given & when experimental | Brian Fraser | 2013-03-26 | 1 | -1/+3
* regen/unicode_constants.pl: Change #define name | Karl Williamson | 2013-03-08 | 1 | -1/+1
| | | | | This was added in the 5.17 series so there's no code relying on its current name. I think that the abbreviation is clearer.
* regen/unicode_constants.pl: Make portable to non-ASCII | Karl Williamson | 2013-03-08 | 1 | -32/+31
| | | | This now uses the U+ notation to indicate code points, which is unambiguous no matter what the platform's character set is. (charnames accepts the U+ notation.)
* regen/unicode_constants.pl: Remove unused constant | Karl Williamson | 2013-03-08 | 1 | -1/+0
| | | | | This was added in the 5.17 series, so can't be yet in the field; and isn't needed.
* regen/unicode_constants.pl: Pass through input comments | Karl Williamson | 2013-03-08 | 1 | -6/+17
| | | | | The data can now have comments, which are converted to C and passed through
* regen/unicode_constants.pl: Convert '-' in names to '_' | Karl Williamson | 2013-03-08 | 1 | -4/+4
| | | | | | Unicode character names can have dashes in them. These aren't accepted in C macro names. Change so both blanks and the hyphen-minus are converted to underscores.
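The name conversion described here is small enough to show directly. The Python sketch below illustrates the rule (blanks and hyphen-minus both become underscores) and is not the Perl code in regen/unicode_constants.pl; the function name `macro_name` is hypothetical, and any suffix the real script may add to the result is not modelled.

```python
import re


def macro_name(char_name: str) -> str:
    """Turn a Unicode character name into a valid C identifier fragment."""
    # Inside a character class, a trailing '-' is a literal hyphen-minus.
    return re.sub(r"[ -]", "_", char_name)
```

For instance, `macro_name("ZERO WIDTH NO-BREAK SPACE")` yields `ZERO_WIDTH_NO_BREAK_SPACE`.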
* put an experimental warning on lexical topic | Ricardo Signes | 2013-02-20 | 1 | -0/+2
* In warnings.pm, delete a hash slice, instead of using a loop. | Nicholas Clark | 2013-02-17 | 1 | -2/+2
| | | | | | | Deleting a hash slice compiles 5 fewer ops, and executes 21 fewer than looping over the keys to delete each in turn. Whilst this is arguably a micro-optimisation, it does not increase obfuscation and is in code loaded by nearly every Perl program, so feels worthwhile.
* regen/embed.pl: Extract out duplicate code into a fcn | Karl Williamson | 2013-02-08 | 1 | -20/+13
* regen/embed.pl: Warn if have > 1 i, p, and s flags | Karl Williamson | 2013-02-08 | 1 | -2/+7
| | | | These should be mutually exclusive
* regcharclass.h: Add macro for non-ASCII PATWS | Karl Williamson | 2013-01-23 | 1 | -1/+1
| | | | This will be used to deprecate uses of non-ASCII Pattern White Space