delta/perl.git - github.com: perl/perl5.git

	Commit message	Author	Age	Files	Lines
*	Tiny comment typo fix in handy.h	Father Chrysostomos	2011-06-24	1	-1/+1
\|
*	handy.h: Link moved to perlhacktips	Karl Williamson	2011-05-18	1	-1/+1
\|
*	handy.h: isIDFIRST_utf8() changed to use XIDStart	Karl Williamson	2011-02-17	1	-10/+7
\| \| \| \| \| \| \| \| \| \|	Previously this used a home-grown definition of an identifier start, stemming from a bug in some early Unicode versions. This led to some problems, fixed by #74022. But the home-grown solution did not track Unicode, and allowed characters such as marks to begin words when they shouldn't. This change brings this macro into compliance with Unicode going forward.
*	Add comments	Karl Williamson	2011-02-14	1	-0/+2
\|
*	Move the non-generated parts of l1_char_class_tab.h out into handy.h	Nicholas Clark	2011-01-24	1	-1/+45
\| \| \| \| \|	Now l1_char_class_tab.h contains only the output of Porting/mk_PL_charclass.pl
*	Move metaconfig control comments into its own files	H.Merijn Brand	2010-12-21	1	-14/+0
\|
*	Add sin6_scope_id probe (LeoNerd)	H.Merijn Brand	2010-12-20	1	-1/+1
\|
*	Add probe for sa_len availability in sockaddr struct	H.Merijn Brand	2010-12-10	1	-0/+1
\| \| \| \|	Sorry for the huge config_h.SH re-order. Don't know (yet) what caused that
*	regexec.c: Latin1 chars can fold match UTF8_ALL	Karl Williamson	2010-11-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Some ANYOF regnodes have the ANYOF_UNICODE_ALL flag set, which means they match any non-Latin1 character. Under /i (in a utf8 target string), these should match any ASCII or Latin1 character that folds outside the Latin1 range. As part of this patch, an internal-only macro is renamed to account for its new use in regexec.c. The cumbersome name is to ward off others from using it until the final semantics have been settled on.
*	handy.h: New #define to use new bit	Karl Williamson	2010-11-22	1	-0/+1
\| \| \| \| \| \| \| \| \|	This creates a new macro for use by regcomp to test the new bit regarding non-ascii folds. Because the semantics may change in the future to deal with multi-char folds, the name of the macro is unwieldy and specific enough that no one should be tempted to use it.
*	[perl #74022] Parser hangs on some Unicode characters	Father Chrysostomos	2010-11-14	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This changes the definition of isIDFIRST_utf8 to avoid any characters that would put the parser in a loop. isIDFIRST_utf8 is used all over the place in toke.c. Almost every instance is followed by a call to S_scan_word. S_scan_word is only called when it is known that there is a word to scan. What was happening was that isIDFIRST_utf8 would accept a character, but S_scan_word in toke.c would then reject it, as it was using is_utf8_alnum, resulting in an infinite number of zero-length identifiers. Another possible solution was to change S_scan_word to use isIDFIRST_utf8 or similar, but that has back-compatibility problems, as it stops q·foo· from being a string and makes it an identifier instead.
*	systematically provide pv/pvn/pvs/sv quartets	Zefram	2010-09-28	1	-3/+44
\| \| \| \| \|	Anywhere an API function takes a string in pvn form, ensure that there are corresponding pv, pvs, and sv APIs.
*	handy.h: Fix so x2p compiles	Karl Williamson	2010-09-25	1	-19/+54
\| \| \| \| \| \| \| \| \| \| \| \| \|	The recent series of commits on handy.h causes x2p to not compile. These commits had some differences from what I submitted, in that they moved the new table to a new header file instead of the submitted perl.h. Unfortunately, this bypasses code in perl.h that figures out about duplicate definitions, and externs, and so fails on programs that include handy.h but not perl.h. This patch changes things so that the table lookup is not used unless perl.h is included. This is essentially my original patch, but adding an #include of the new header file.
*	handy.h: Add isFOO_L1() macros, using table lookup	Karl Williamson	2010-09-25	1	-37/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds *_L1() macros for character class lookup, using table lookup for O(1) performance. These force a Latin-1 interpretation on ASCII platforms. There were a couple existing macros that had the suffix U for Unicode semantics. I thought that those names might be confusing, so settled on L1 as the least bad name. The older names are kept as synonyms for backward compatibility. The problem with those names is that these are actually macros, not functions, and hence can be called with any int, including any Unicode code point. The U suffix might be mistaken for indicating they are more general purpose, whereas they are really only valid for the latin1 subset of Unicode (including the EBCDIC isomorphs). When called with something outside the latin1 range, they will return false. This patch necessitated rearranging a few things in the file. I added documentation for several more macros, and intend to document the rest. (This commit was modified from its original form by Steffen.)
*	handy.h: Make isWORDCHAR() primary documentation	Karl Williamson	2010-09-25	1	-5/+8
\| \| \| \| \|	This macro is clearer as to intent over isALNUM, and isn't confusable with isALNUMC. So document it primarily.
*	handy.h: Slightly change the pod	Karl Williamson	2010-09-25	1	-8/+8
\|
*	handy.h: alphabetize pod entries	Karl Williamson	2010-09-25	1	-8/+8
\| \| \| \| \|	There are a number of macros missing from the documentation. This helps me figure out which ones.
*	handy.h: Change isFOO_A() to be O(1) performance	Karl Williamson	2010-09-25	1	-32/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch changes the macros whose names end in _A to use table lookup except for the one (isASCII) which always has only one comparison. The table is in l1_char_class_tab.h. The advantage of this is speed. It replaces some fairly complicated expressions with an O(1) look-up and a mask. It uses the FITS_IN_8_BITS() macro to guarantee that the table bounds are not exceeded. For legal inputs that are byte size, the optimizer should get rid of this macro leaving only the lookup and mask. (This commit was changed from its original form by Steffen.)
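The lookup-and-mask idea described above can be sketched roughly as follows. This is a minimal illustration and not the handy.h code: the bit names, `class_tab`, `IN_TABLE()` and the `is_*_a()` macros are hypothetical, the table is filled at run time rather than generated at build time, and an ASCII platform is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical class bits; the real table has its own layout. */
#define CC_DIGIT_A (1u << 0)
#define CC_UPPER_A (1u << 1)
#define CC_LOWER_A (1u << 2)

static uint8_t class_tab[256];   /* one entry per byte value */

/* Fill the table once. The letter loops assume ASCII, where A-Z and a-z
 * are contiguous. */
void init_class_tab(void)
{
    int c;
    for (c = '0'; c <= '9'; c++) class_tab[c] |= CC_DIGIT_A;
    for (c = 'A'; c <= 'Z'; c++) class_tab[c] |= CC_UPPER_A;
    for (c = 'a'; c <= 'z'; c++) class_tab[c] |= CC_LOWER_A;
}

/* Range guard: true only when no bit above the low byte is set, so the
 * index below can never leave the table. */
#define IN_TABLE(c) (((c) & ~0xFF) == 0)

/* Guard, then one table read and one mask. Note that the argument is
 * evaluated twice, so it should be free of side effects. */
#define is_digit_a(c) \
    (IN_TABLE(c) && (class_tab[(uint8_t)(c)] & CC_DIGIT_A) != 0)
#define is_alpha_a(c) \
    (IN_TABLE(c) && (class_tab[(uint8_t)(c)] & (CC_UPPER_A | CC_LOWER_A)) != 0)

void demo_class_lookup(void)
{
    init_class_tab();
    assert(is_digit_a('7'));
    assert(!is_alpha_a(0x141));   /* low byte is 'A', but the guard rejects it */
}
```

Without the guard, 0x141 would be truncated to 0x41 and misclassified as a letter, which is the kind of out-of-range input the commit message says the bounds check is there for.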
*	handy.h: EBCDIC should use native isalpha()	Karl Williamson	2010-09-25	1	-1/+2
\|
*	handy.h: Add isFOO_A() macros for ASCII range matches	Karl Williamson	2010-09-25	1	-26/+64
\| \| \| \|	These macros return true only if the parameter is an ASCII character.
*	handy.h: should use EBCDIC libc isdigit()	Karl Williamson	2010-09-25	1	-1/+2
\| \| \| \|	as it is better optimized and suitable for the purpose.
*	handy.h: move macro in file	Karl Williamson	2010-09-25	1	-1/+2
\|
*	Subject: handy.h: Add isWORDCHAR() for clarity	Karl Williamson	2010-09-25	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The name isALNUM() is problematic, as it is very close to isALNUMC(), and doesn't mean exactly what most people might think. I presume the C in isALNUMC stands for C language or libc, but am not sure. Others don't know either. But in any event, isALNUM is different from the C isalnum(), in that it matches the Perl concept of \w, which differs from the C definition in exactly one place: Perl includes the underscore character, '_'. So, I'm adding an isWORDCHAR() macro for future code to use to be more clear. I thought also about isWORD(), but I think confusion can arise from thinking that means a whole word. isWORDCHAR_L1() matches in the Latin1 range, to be equivalent to isALNUMU(). The motivation for using L1 instead of U will be explained in a commit message for the other L1 macros that are to be added.
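The one-character difference between C's isalnum() and the \w notion described above can be shown with a small sketch. `is_wordchar_ascii()` is a hypothetical stand-in, not the macro the commit adds, and it is restricted to the ASCII range (assuming an ASCII platform) to keep locale behaviour out of the picture.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: C's isalnum() plus the underscore, ASCII range only. */
int is_wordchar_ascii(int c)
{
    if (c < 0 || c > 127)
        return 0;
    return isalnum(c) || c == '_';
}

void demo_wordchar(void)
{
    assert(!isalnum('_'));             /* C does not count '_' as alphanumeric */
    assert(is_wordchar_ascii('_'));    /* the \w-style test does */
    assert(is_wordchar_ascii('a'));
    assert(!is_wordchar_ascii('-'));
}
```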
*	Add a comment; clarify another	Karl Williamson	2010-09-25	1	-2/+2
\|
*	Indent a comment better	Karl Williamson	2010-09-25	1	-1/+1
\|
*	Subject: handy.h: Reorder #defines alphabetically	Karl Williamson	2010-09-25	1	-12/+13
\| \| \| \| \|	The only change here is that I sorted these #defines within their groups, to make it much easier to follow what's going on.
*	handy.h: isSPACE() is wrong for EBCDIC	Karl Williamson	2010-09-25	1	-2/+3
\| \| \| \|	It didn't include the Latin1 space components.
*	handy.h: EBCDIC isBLANK() is wrong	Karl Williamson	2010-09-25	1	-1/+2
\| \| \| \|	It doesn't include NBSP
*	handy.h: isPSXSPC() is wrong for EBCDIC	Karl Williamson	2010-09-25	1	-1/+2
\| \| \| \| \| \|	The macro was using the ASCII definition, which doesn't include NEL nor NBSP. But, libc contains the correct definition, which is usable on EBCDIC since we don't worry about locales there.
*	Subject: handy.h: Move defn's outside #ifndef EBCDIC	Karl Williamson	2010-09-25	1	-15/+15
\| \| \| \| \| \|	Commit 4125141464884619e852c7b0986a51eba8fe1636 improperly got rid of EBCDIC handling, as it combined the ASCII and EBCDIC versions, but left the result in the ASCII-only branch. Just move to the common code.
*	Rename isALNUM_L1 to isWORDCHAR_L1	Karl Williamson	2010-09-23	1	-1/+1
\|
*	handy.h: Add isALNUM_L1() macro	Karl Williamson	2010-09-23	1	-0/+1
\| \| \| \|	This is a synonym for isALNUMU
*	Subject: handy.h: Add isSPACE_L1 with Unicode semantics	Karl Williamson	2010-09-23	1	-0/+4
\|
*	handy.h: isASCII() extend to work on > 8 bit values	Karl Williamson	2010-09-22	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to this patch, if isASCII() is called with something like '256', it would return true. For some reason unknown to me, U64 is defined only inside the perl core. However, the equivalent U64TYPE is known everywhere, so in the macro that can be called outside of core, use that instead. The commit log doesn't give a reason for not defining U64 outside of core, and no tests in the suite fail when it is defined outside core. But out of caution, I'm just doing this workaround instead of exposing U64.
*	handy.h: Don't use isascii() as not in all libc's	Karl Williamson	2010-09-22	1	-2/+1
\| \| \| \| \|	EBCDIC platforms use isascii(), but it is not in all libc's, so it is better to use our own.
*	handy.h: Fix-up documentation	Karl Williamson	2010-09-22	1	-18/+25
\| \| \| \| \|	Previous documentation was wrong for EBCDIC platforms. This fixes that and adds some more explanation.
*	handy.h: toUPPER is not a char class fcn	Karl Williamson	2010-09-22	1	-0/+2
\| \| \| \| \| \|	toUPPER() and toLOWER() were grouped with the character class functions (in perlapi), to which they are related, but aren't the same. Create a new heading for these.
*	Fix /[\8]/ to not match NULL; give correct warning	Karl Williamson	2010-09-16	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	8 and 9 are not treated as alphas in parsing as opposed to illegal octals. This also adds tests to verify that 1-3 digits work in char classes. I created an isOCTAL macro in case that lookup gets moved to a bit field, as I plan to do later, for speed.
*	handy.h: Add bounds checking to case change arrays	Karl Williamson	2010-09-13	1	-7/+13
\| \| \| \| \| \| \|	This makes sure that the index into the arrays used to change between lower and upper case will fit into their bounds; returning an error character if not. The check is likely to be optimized out if the index is stored in 8 bits.
*	handy.h: Add FITS_IN_8_BITS() macro	Karl Williamson	2010-09-13	1	-0/+14
\| \| \| \| \| \| \|	This macro is designed to be optimized out if the argument is byte-length, but otherwise to be a bomb-proof way of making sure that the argument occupies only 8 bits or fewer in whatever storage class it is in.
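One way a check with these properties can be written is sketched below. This is an illustration of the technique, not necessarily the definition in handy.h; the lowercase name is hypothetical and `uint64_t` stands in for whatever widest unsigned type the real macro uses.

```c
#include <assert.h>
#include <stdint.h>

/* sizeof(c) == 1 is a compile-time constant, so for byte-sized arguments the
 * whole expression can fold to 1. Wider arguments get a mask-and-compare
 * that fails if any bit above the low 8 is set (negative values included,
 * since they widen to a value with high bits set). */
#define fits_in_8_bits(c) \
    ((sizeof(c) == 1) || (((uint64_t)(c) & 0xFF) == (uint64_t)(c)))

void demo_fits_in_8_bits(void)
{
    unsigned char byte = 200;
    int wide = 256;
    assert(fits_in_8_bits(byte));    /* byte-sized: always true */
    assert(fits_in_8_bits(255));
    assert(!fits_in_8_bits(wide));
    assert(!fits_in_8_bits(-1));
}
```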
*	add lex_stuff_pvs()	Zefram	2010-08-22	1	-0/+9
\| \| \| \|	New macro lex_stuff_pvs(), wrapping lex_stuff_pvn() for literal strings.
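The general shape of a "pvs" wrapper around a "pvn" function can be sketched as below. `count_spaces_pvn()` and `count_spaces_pvs()` are hypothetical names chosen for illustration; the sketch does not reproduce lex_stuff_pvn() or its arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "pvn"-style function: pointer plus explicit length. */
size_t count_spaces_pvn(const char *pv, size_t len)
{
    size_t i, n = 0;
    for (i = 0; i < len; i++)
        if (pv[i] == ' ')
            n++;
    return n;
}

/* "pvs"-style wrapper for literal strings: "" s "" only compiles when s is
 * a string literal, and sizeof(s) - 1 is its length without the trailing
 * NUL, computed at compile time. */
#define count_spaces_pvs(s) count_spaces_pvn("" s "", sizeof(s) - 1)

void demo_pvs(void)
{
    assert(count_spaces_pvs("a b c") == 2);
    assert(count_spaces_pvs("") == 0);
}
```

The caller never has to count characters by hand, and passing a `char *` variable instead of a literal becomes a compile error rather than a silently wrong length.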
*	handy.h: Note Devel::PPPort has duplicated macros	Karl Williamson	2010-08-02	1	-0/+3
\| \| \| \| \| \|	If a bug is found in the handy.h macros, it may be necessary to fix the duplicates in the cpan module. This may require filing a bug report there.
*	Add C_ARRAY_END(), returning a pointer to after the last element of an array.	Nicholas Clark	2010-05-28	1	-0/+1
\| \| \| \|	Refactor the macro append_flags() in dump.c to use it.
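A macro returning a pointer just past the last element of an array can be sketched as follows. The lowercase names and `sum_flags()` are hypothetical, chosen for illustration rather than copied from handy.h or dump.c.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (not a pointer), and the one-past-the-end
 * pointer derived from it. */
#define array_length(a) (sizeof(a) / sizeof((a)[0]))
#define array_end(a)    ((a) + array_length(a))

/* Walk a static table without repeating its size anywhere. */
int sum_flags(void)
{
    static const int flags[] = { 1, 2, 4, 8 };
    const int *p;
    int total = 0;
    for (p = flags; p < array_end(flags); p++)
        total += *p;
    return total;
}

void demo_array_end(void)
{
    assert(sum_flags() == 15);
}
```

Both macros are only meaningful for real arrays; applied to a pointer, `sizeof` would measure the pointer itself.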
*	PATCH: Clean up EBCDIC handling of \cX	Karl Williamson	2010-05-17	1	-10/+3
\| \| \| \| \| \| \| \| \| \|	The function perl_ebcdic_control() is unnecessary, as the toCTRL macro that calls it can be changed to just map EBCDIC to ASCII first, and then do the normal procedure. This means that EBCDIC and ASCII will no longer diverge. Currently, EBCDIC gives a syntax error for inputs outside its domain, whereas the ASCII version accepts some of them.
*	Make sure isCNTRL and isASCII work on signed chars	Karl Williamson	2010-04-26	1	-2/+7
\| \| \| \| \| \| \| \|	Prior to this patch, there was a potential bug in these two macros: if they were called with a signed character outside the ASCII range, the value would be negative, and they always returned true for negative values. Casting the parameter to an unsigned type should fix that by having it be interpreted as a number above the ASCII range.
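The signed-character problem can be reproduced with a small sketch. The two macros below are hypothetical simplifications rather than the definitions before and after the patch, and an ASCII platform is assumed.

```c
#include <assert.h>

/* Compares the raw value: a negative signed char passes the upper bound. */
#define is_ascii_naive(c) ((c) <= 127)

/* Casting to unsigned first turns a negative value into a large positive
 * one, which then fails the test as intended. */
#define is_ascii_cast(c)  ((unsigned int)(c) <= 127)

void demo_signed_char(void)
{
    signed char high = (signed char)-23;   /* same bits as byte 0xE9 */
    assert(is_ascii_naive(high));          /* wrongly reported as ASCII */
    assert(!is_ascii_cast(high));
    assert(is_ascii_cast('A'));
}
```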
*	More defensive definition of memEQs().	Nicholas Clark	2010-04-25	1	-1/+1
\|
*	Set the legacy process name with prctl() on assignment to $0 on Linux	Ævar Arnfjörð Bjarmason	2010-04-15	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Ever since perl 4.000 we've only set the POSIX process name via argv[0]. Unfortunately on Linux the POSIX name isn't used by utilities like top(1), ps(1) and killall(1). Now when we set C<$0 = "hello"> both C<qx[ps h $$]> (POSIX) and C<qx[ps hc $$]> (legacy) will say "hello", instead of the latter being "perl" as was previously the case. See also the March 9 2010 thread "Why doesn't assignment to $0 on Linux also call prctl()?" on perl5-porters.
*	use cBOOL for bool casts	David Mitchell	2010-04-15	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	bool b = (bool)some_int doesn't necessarily do what you think. In some builds, bool is defined as char, and that cast's behaviour is thus undefined. So this line in mg.c: const bool was_temp = (bool)SvTEMP(sv); was actually setting was_temp to false even when the SVs_TEMP flag was set. Fix this by replacing all the (bool) casts with a new cBOOL() cast macro that (hopefully) does the right thing.
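The failure mode can be sketched as below, assuming a build where the boolean type is a char. `char_bool`, `FLAG_TEMP` and `to_bool()` are hypothetical names for illustration; the flag value is not the real SVs_TEMP bit, and the sketch is not claimed to be the cBOOL() definition.

```c
#include <assert.h>

typedef char char_bool;          /* stands in for a build where bool is char */
#define FLAG_TEMP 0x00080000UL   /* hypothetical flag bit above the low byte */

/* Test the value for truth instead of narrowing it. */
#define to_bool(x) ((x) ? (char_bool)1 : (char_bool)0)

/* A plain cast keeps only the low byte on common compilers, so a flag that
 * lives in a higher bit is lost. */
char_bool temp_by_cast(unsigned long flags)
{
    return (char_bool)(flags & FLAG_TEMP);
}

char_bool temp_by_macro(unsigned long flags)
{
    return to_bool(flags & FLAG_TEMP);
}

void demo_to_bool(void)
{
    assert(temp_by_macro(FLAG_TEMP) == 1);
    assert(temp_by_macro(0) == 0);
}
```

With a real C99 `_Bool` the cast would already yield 1 for any non-zero value, so the difference only shows up when the boolean type is an ordinary integer type.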
*	Probe for prctl () and check if PR_SET_NAME is supported	H.Merijn Brand	2010-04-13	1	-1/+2
\|
*	PATCH: deprecation warnings for unreasonable charnames	Karl Williamson	2010-02-20	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to now just about anything has been legal for a character name in \N{...}. This means that legal code was broken by having \N{3,4} for example mean [^\n]{3,4}. Such code doesn't come from standard charnames, but from legal custom translators. This patch deprecates "unreasonable" names. handy.h is changed by the addition of macros that, taken together, define the names we deem reasonable, namely those beginning with an alpha, with alphanumerics and some punctuation as continuations. toke.c is changed to parse each name and to raise a warning if any problematic characters are found. Some tests and diagnostic documentation are also included.