Commit message  Author  Age  Files  Lines
* utf8.h: Add new #define for extended length UTF-8  Karl Williamson  2021-08-07  2  -1/+2
|   The previous commit added a convenient place to create a symbol to indicate that the UTF-8 on this platform includes Perl's nearly-double length extension. The platforms this isn't needed on are 32-bit ASCII ones. This symbol allows removing one place where EBCDIC need be considered, and future commits will use it as well.
* utf8.h: Refactor MAX_UTF8_TWO_BYTE  Karl Williamson  2021-08-07  1  -3/+11
|   The previous commit removed a macro that the comments for this refer to in explaining its derivation. So use an alternative that is actually clearer.
* Reimplement OFFUNISKIP  Karl Williamson  2021-08-07  2  -47/+31
|   Now that previous commits have made it fast to find the position of the first set bit in a word, we can use a formula to find how many bytes the UTF-8 of that will occupy. This allows for simplification of this macro, removing several conditionals.
* utf8.h: Add macro to compute UV skip by its log2  Karl Williamson  2021-08-07  1  -2/+30
|   This macro will calculate at compile time, if passed a compile-time constant, how many UTF-8 bytes are required to represent the parameter. The macro is a helper which works fine except for edge cases, which a wrapper is needed to handle. The commit changes one instance to use this new macro.
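The relationship between the highest set bit and the encoded length, as described in the two commit messages above, can be sketched as follows. This is an illustration, not the macro from utf8.h: the function names are hypothetical, a portable loop stands in for a fast bit-position primitive, and only ASCII-platform UTF-8 for values below 2**36 is assumed. Larger values are among the edge cases the sketch does not handle.

```c
#include <stdint.h>

/* Position (0-based) of the highest set bit; the caller guarantees v != 0.
 * A portable loop stands in for a fast count-leading-zeros primitive. */
static unsigned msbit_pos_sketch(uint64_t v)
{
    unsigned pos = 0;
    while (v >>= 1)
        pos++;
    return pos;
}

/* Bytes needed to encode code point cp, for cp below 2**36.
 * One byte covers 7 bits; an n-byte sequence (n >= 2) covers 5n + 1 bits,
 * so n = (msbit_pos + 4) / 5 using integer division. */
static unsigned utf8_skip_sketch(uint64_t cp)
{
    if (cp < 0x80)
        return 1;
    return (msbit_pos_sketch(cp) + 4) / 5;
}
```

For example, 0x7FF has its highest bit at position 10, giving (10 + 4) / 5 = 2 bytes, while 0x800 has it at position 11, giving 3 bytes.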
* utf8.h: Rmv EBCDIC dependency  Karl Williamson  2021-08-07  2  -31/+59
|   This moves a #define into the common code for ASCII and EBCDIC machines. It adds a bunch of comments about the value that I wish I hadn't had to figure out for myself.
* Rename internal macro and move to utf8.h  Karl Williamson  2021-08-07  2  -5/+5
|   This macro has a corresponding, older, name for the non-UTF-8 case. It makes sense to use the same paradigm, and move the definitions together so that the comments for one don't have to be repeated for the other.
* utf8.h: Remove an EBCDIC dependency  Karl Williamson  2021-08-07  2  -4/+19
|   A symbol introduced in a previous commit allows this internal macro to only need a single version, suitable for either EBCDIC or ASCII.
* utf8.h: Add symbol for easing EBCDIC handling  Karl Williamson  2021-08-07  2  -5/+12
|   This is then used in regcomp.c to avoid an #ifdef EBCDIC.
* utf8.h: Make a bit of EBCDIC known to ASCII  Karl Williamson  2021-08-07  2  -9/+16
|   This info is needed in one other place; doing it here means only specifying it once.
* utf8.h: Add a #define synonym  Karl Williamson  2021-08-07  2  -5/+10
|   This is more clearly named for various uses in this file. It has an unwieldy length, but is unlikely to be used outside it.
* Refactor UTF_START_MASK()  Karl Williamson  2021-08-07  1  -5/+14
|   A slight change to this very low level macro (hence called a lot) removes the need for a conditional, and causes it to work on single-byte UTF-8 characters on ASCII platforms. The definition is also moved to a more logical place in the file.
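One way a start-byte mask can be computed without a branch, and still give a sensible answer for single-byte characters, is sketched below. This is an assumption for illustration only and is not claimed to be the definition in utf8.h: the function name is hypothetical and ASCII-platform UTF-8 with sequence lengths 1 through 7 is assumed.

```c
/* Mask selecting the payload bits of the first byte of a len-byte
 * sequence, for 1 <= len <= 7.  A single-byte character keeps 7 bits;
 * each longer form loses one more bit to the length marker, so the
 * shift count is len + 1 for len >= 2 and 1 for len == 1. */
static unsigned start_mask_sketch(unsigned len)
{
    return 0xFFu >> (len + (len != 1));
}
```

Under these assumptions the results are 0x7F for length 1, 0x1F for length 2, 0x0F for length 3, and 0x00 for length 7, where the start byte carries no payload bits.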
* utf8.h: Move macro to earlier in file  Karl Williamson  2021-08-07  1  -13/+13
|   This is now defined before first use.
* UTF8_IS_DOWNGRADEABLE_START: Call less general helper  Karl Williamson  2021-08-07  1  -1/+1
|   Future commits would otherwise make the expansion of this macro too complicated for some C compilers. Use a less general internal helper function to avoid that.
* regcharclass.pl: Further improve EBCDIC code  Karl Williamson  2021-08-07  2  -32/+48
|   A commit a couple of commits ago improved the generated output of this script. This builds on that. The improvements were to try a transform that could lead to fewer conditionals, as bytes were grouped in fewer ranges. But that introduced a useless transformation for the single element ranges that remain. This commit removes the transformation if not needed.
* regcharclass.pl: Make 2 locals into global hashes  Karl Williamson  2021-08-07  2  -7/+9
|   This is in preparation for a future commit.
* regcharclass.pl: Improve generated code for EBCDIC  Karl Williamson  2021-08-07  2  -183/+240
|   UTF-8 has some desirable characteristics not shared by UTF-EBCDIC. One example is all the continuation bytes are in a single range. By transforming a UTF-EBCDIC byte into I8 (similar to UTF-8), we gain those characteristics, and may be able to save a conditional or three. This commit creates a 2nd pass over the bytes that are to be matched, transforming them into I8. If that pass results in fewer conditionals than the traditional, native, generated code, use the fewer result. This saves quite a bit in some of the generated code, enabling the quotemeta macro to be represented in a single part; previously it had to be split to avoid compiler macro size limits.
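Why a single contiguous range is cheap to test can be seen in a small sketch. In UTF-8 the continuation bytes are exactly 0x80 through 0xBF, so one range check covers all of them; bytes scattered over several ranges would need one such check per range. The function names below are hypothetical and this is not the code that regcharclass.pl generates.

```c
/* True if byte b lies in [lo, hi].  The unsigned subtraction folds the
 * two comparisons (b >= lo and b <= hi) into a single one: if b < lo the
 * difference wraps to a very large unsigned value and the test fails. */
static int in_range_sketch(unsigned char b, unsigned char lo, unsigned char hi)
{
    return (unsigned)(b - lo) <= (unsigned)(hi - lo);
}

/* UTF-8 continuation bytes form the single range 0x80..0xBF. */
static int is_utf8_continuation_sketch(unsigned char b)
{
    return in_range_sketch(b, 0x80, 0xBF);
}
```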
* regcharclass.pl: White-space comment only  Karl Williamson  2021-08-07  2  -16/+20
|   A future commit will put a block around this; indent now.
* regcharclass.pl: Get UTF EBCDIC translations  Karl Williamson  2021-08-07  2  -3/+13
|   These will be used in a future commit.
* regcharclass.pl: Add ability to avoid wrong mnemonic  Karl Williamson  2021-08-07  2  -2/+3
|   A future commit will pass this function data that shouldn't be translated into a mnemonic, like 'f' for the letter f. The reason is that that code will potentially be executed on a machine with a different character set than what the mnemonic would be valid for.
* regcharclass.pl: Change variable name  Karl Williamson  2021-08-07  2  -7/+7
|   A future commit will use this differently than the current name implies.
* regcharclass.pl: Reorder execution path  Karl Williamson  2021-08-07  2  -18/+17
|   This moves a loop earlier in the execution path. This will be useful in a later commit.
* regcharclass.pl: Rmv unused variable  Karl Williamson  2021-08-07  2  -3/+2
|
* regcharclass.pl: Add an error check  Karl Williamson  2021-08-07  2  -2/+6
|
* regcharclass.pl: Move some code earlier  Karl Williamson  2021-08-07  2  -40/+42
|   We can short circuit some work by moving the test earlier. This does not change the generated file.
* regcharclass.pl: Rmv unused variable  Karl Williamson  2021-08-07  2  -3/+1
|
* regen/regcharclass.pl: Use deref of an array  Karl Williamson  2021-08-07  2  -6/+7
|   This will make future commits read better.
* fix typo in pod  Karen Etheridge  2021-08-06  1  -1/+1
|
* doop.c: do_vecget(): Add trivial case to the switch()  Karl Williamson  2021-08-06  1  -8/+9
|   We can save another conditional by adding a default: case to the switch statement created by the previous commit.
* doop.c: Refactor do_vecget()  Karl Williamson  2021-08-06  1  -116/+50
|   By using a switch statement this function can be cut in half, with fewer conditionals executed.
* doop.c: White space only  Karl Williamson  2021-08-06  1  -2/+2
|
* doop.c: Call the macro instead of reinventing it  Karl Williamson  2021-08-06  1  -1/+1
|
* doop.c: Refactor do_vecset()  Karl Williamson  2021-08-06  1  -25/+20
|   By converting to a switch statement with fall through, some redundancies can be removed and conditionals avoided.
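The fall-through idea can be illustrated with a minimal sketch that stores a value big-endian at a given width. It is not the code in doop.c: the function name is hypothetical, and only whole-byte widths (8, 16, 32, 64 bits) are handled here. Each wider case writes its extra leading bytes and then falls into the narrower case, so the shared trailing stores appear only once.

```c
#include <stdint.h>

/* Store the low `bits` bits of val big-endian at s; bits must be
 * 8, 16, 32 or 64.  Returns 0 on success, -1 for any other width. */
static int store_be_sketch(unsigned char *s, unsigned bits, uint64_t val)
{
    switch (bits) {
    case 64:
        *s++ = (unsigned char)(val >> 56);
        *s++ = (unsigned char)(val >> 48);
        *s++ = (unsigned char)(val >> 40);
        *s++ = (unsigned char)(val >> 32);
        /* FALLTHROUGH */
    case 32:
        *s++ = (unsigned char)(val >> 24);
        *s++ = (unsigned char)(val >> 16);
        /* FALLTHROUGH */
    case 16:
        *s++ = (unsigned char)(val >> 8);
        /* FALLTHROUGH */
    case 8:
        *s = (unsigned char)val;
        return 0;
    default:
        return -1;
    }
}
```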
* Add MAIN_MODULE to Porting/Maintainers.pl  Aristotle Pagaltzis  2021-08-06  2  -8/+12
|
* Porting/sync-with-cpan: remove redundant variable  Aristotle Pagaltzis  2021-08-06  1  -2/+1
|
* Merge branch 'smoke-me/jkeenan/eserte/gh-19017-net-hostent-20210804' into blead  James E Keenan  2021-08-06  2  -8/+17
|\
| * Enable capture variables to be used in a Net::hostent gethost call  Slaven Rezic  2021-08-06  2  -8/+17
|/    Add test cases for gethost. For: https://github.com/Perl/perl5/issues/19017
* perldelta for upgrade to Text-Tabs+Wrap (3 commits)  James E Keenan  2021-08-05  1  -0/+5
|
* Text-Tabs+Wrap: Sync with CPAN version 2021.0804  Aristotle Pagaltzis  2021-08-05  22  -576/+151
|   From upstream CHANGELOG:
|     * Explicitly declared strictures and warnings everywhere (to support -Dusedefaultstrict perls)
|     * Makefile.PL fixes
|     * Unicode support on all supported versions of Perl
|     * Full strict and warnings cleanliness
|     * Packaging cleanups
|     * Removal of reference benchmark from test suite (moved to xt/bench)
|   Committer: Manual verification of the procedure Aristotle used in https://github.com/Perl/perl5/pull/19026.
* Text-Tabs+Wrap: Prepare to synch with CPAN  James E Keenan  2021-08-05  1  -3/+3
|   Because both (i) the CPAN contributor who is releasing this distribution upstream and (ii) the EXCLUDED condition have changed, we need to manually edit this file before running Porting/sync-with-cpan.
* Merge branch 'ap-contrib-sync-with-cpan-weird-distnames' into blead  James E Keenan  2021-08-05  1  -1/+1
|\    For: https://github.com/Perl/perl5/pull/19025
| * Merge branch 'sync-with-cpan-weird-distnames' of https://github.com/ap-contrib/perl5 into ap-contrib-sync-with-cpan-weird-distnames  James E Keenan  2021-08-05  1  -1/+1
| |\
|/ /
| * Porting/sync-with-cpan: handle weird tarball basedir names  Aristotle Pagaltzis  2021-08-05  1  -1/+1
| |   This was failing to map new Text::Tabs releases properly because its distname is Text-Tabs+Wrap and interpolating it into a pattern without quoting causes the `+` to be misinterpreted as a quantifier.
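The pitfall is not specific to Perl, where `\Q...\E` or `quotemeta` are the usual remedies. A sketch of the same failure using POSIX extended regular expressions in C follows; the helper names are hypothetical, a POSIX `<regex.h>` is assumed, and this is not the code in Porting/sync-with-cpan. Unquoted, `s+` means "one or more s", so the pattern no longer matches the literal name.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, backslash-escaping POSIX ERE metacharacters.
 * dst must have room for at least 2 * strlen(src) + 1 bytes. */
static void quote_ere_sketch(char *dst, const char *src)
{
    for (; *src; src++) {
        if (strchr("^.[$()|*+?{\\", *src))
            *dst++ = '\\';
        *dst++ = *src;
    }
    *dst = '\0';
}

/* Return 1 if pattern (an ERE) matches somewhere in text, else 0.
 * A pattern that fails to compile is treated as "no match". */
static int ere_matches_sketch(const char *pattern, const char *text)
{
    regex_t re;
    int found;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    found = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return found;
}
```

With these helpers, the raw pattern `Text-Tabs+Wrap` fails to match the string `Text-Tabs+Wrap-2021.0804`, while the quoted form `Text-Tabs\+Wrap` matches it.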
* | Only initialize threads::shared interpreter once  Leon Timmermans  2021-08-05  2  -10/+12
|/    Previously, the shared interpreter would be recreated every time the bootstrap was run, on the assumption that the bootstrap would only be run once. This assumption isn't necessarily true if multiple non-cloned interpreters exist. Theoretically there's still a race condition around initialization, but I'm not particularly worried about that.
* POSIX: Use NV instead of hardcoded 'double' in strtol()/strtoul().  TAKAI Kousuke  2021-08-05  1  -2/+2
|   Casting an (unsigned) long value to 'double' might cause unnecessary loss of precision if double's significand is not wide enough to preserve an (unsigned) long and NV is configured to be wider than double.
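The precision loss is easy to demonstrate with a small sketch, assuming an IEEE 754 binary64 `double` (53-bit significand) and 64-bit integers. The function name is hypothetical and this is not the code in the POSIX module. A wider NV, such as a `long double` with a 64-bit significand, would preserve values that `double` cannot.

```c
#include <stdint.h>

/* Round-trip v through double and report whether the value survived.
 * Intended for magnitudes well below 2**63, so that converting back
 * to int64_t stays in range. */
static int survives_double_sketch(int64_t v)
{
    volatile double d = (double)v;   /* volatile: force a real double store */
    return (int64_t)d == v;
}
```

Under these assumptions 2**53 (9007199254740992) survives the round trip, but 2**53 + 1 does not: it rounds to 2**53 when converted to `double`.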
* Text::Tabs: skip failing test file for now  David Mitchell  2021-08-04  1  -0/+9
|   This was causing smokes to fail:
|     $ PERLIO=stdio ./perl -Ilib cpan/Text-Tabs/t/dnsparks.t
|     -T and -B not implemented on filehandles at cpan/Text-Tabs/t/dnsparks.t line 130
* perlexperiment: fix missing POD directive in heading  Dagfinn Ilmari Mannsåker  2021-08-03  1  -1/+1
|
* use CLANG_DIAG_IGNORE_STMT instead of GCC_DIAG_IGNORE_STMT  Tony Cook  2021-08-03  1  -2/+2
|   It turns out gcc wasn't warning on this code, but older gcc (as included in debian buster) *does* warn on the switch which it doesn't recognise. Newer gcc does recognise the -Wstring-compare switch, but it controls warning on a different construct, so there's no reason to present it to gcc.
* Merge branch 'core-team-data' into blead  Ricardo Signes  2021-08-02  6  -86/+199
|\
| * perlgov-team-update: cope with non-ASCII content  Ricardo Signes  2021-08-02  1  -2/+8
| |
| * mailmap: add another entry for xdg  Ricardo Signes  2021-08-02  1  -0/+1
| |