Change filter of problematic code points for EBCDIC

There are three classes of problematic Unicode code points that may require special handling. Determining whether a code point belongs to one of them is fairly complicated and requires many branches. However, the smallest problematic code point is 0xD800, so most code points in modern use lie below all of them, and a single test can exclude just about everything likely to be encountered. The problem was that the way this test was done on EBCDIC let far too many characters through, which then had to be checked by the more complicated branches. The digits 0-9 and some capital letters were not filtered out, for example. This commit changes the EBCDIC test to first transform the byte into I8 (an array lookup), so that it now excludes the characters that should not have passed before.
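
The single-test filter described above can be illustrated with a minimal sketch. It is not the code in utf8.c: it assumes an ASCII platform, well-formed UTF-8 of at most four bytes, and takes the three classes to be surrogates, non-characters, and code points above 0x10FFFF. All names (possibly_problematic, decode, classify, cp_class) are hypothetical. On ASCII platforms the start bytes sort in code point order, so comparing the first byte against 0xED (the start byte of U+D800) is enough; on EBCDIC the native bytes do not sort that way, which is why the commit goes through I8 first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: ASCII platform. U+D800 encodes as ED A0 80, so any
 * character whose first byte is below 0xED is below every
 * problematic code point. */
#define FIRST_PROBLEMATIC_START_BYTE 0xED

typedef enum { CP_OK, CP_SURROGATE, CP_NONCHAR, CP_SUPER } cp_class;

/* Cheap pre-filter: one comparison on the first byte only. */
static bool possibly_problematic(uint8_t first_byte)
{
    return first_byte >= FIRST_PROBLEMATIC_START_BYTE;
}

/* Decode one character; assumes well-formed UTF-8, 1 to 4 bytes. */
static uint32_t decode(const uint8_t *s)
{
    if (s[0] < 0x80) return s[0];
    if (s[0] < 0xE0) return ((uint32_t)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
    if (s[0] < 0xF0)
        return ((uint32_t)(s[0] & 0x0F) << 12)
             | ((uint32_t)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
    return ((uint32_t)(s[0] & 0x07) << 18) | ((uint32_t)(s[1] & 0x3F) << 12)
         | ((uint32_t)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
}

/* Only characters that pass the pre-filter reach the slower branches. */
static cp_class classify(const uint8_t *s)
{
    if (!possibly_problematic(s[0]))
        return CP_OK;                       /* common case: one test */

    uint32_t cp = decode(s);
    if (cp > 0x10FFFF)
        return CP_SUPER;                    /* above Unicode */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return CP_SURROGATE;
    if ((cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE)
        return CP_NONCHAR;
    return CP_OK;                           /* passed filter, but harmless */
}
```

In this sketch a character such as U+E000 (EE 80 80) still passes the pre-filter and is cleared only by the slower branches; the filter is meant to be cheap and conservative, not exact.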
author: Karl Williamson <khw@cpan.org> 2015-05-15 14:48:23 -0600
committer: Karl Williamson <khw@cpan.org> 2015-09-04 10:21:17 -0600
commit: ac6f1fbe3462b7efc6bfb0e77bde7e04d14f02c2 (patch)
tree: 59a949d083d1c799ab6606bd56fe0eb867e12bcc /utf8.c
parent: a62b247b9f3d5cc6214f83defea2e06d12398275 (diff)
download: perl-ac6f1fbe3462b7efc6bfb0e77bde7e04d14f02c2.tar.gz
1 file changed, 1 insertion, 1 deletion
diff --git a/utf8.c b/utf8.c
index 2a9d20e794..2a44d75f1d 100644
--- a/utf8.c
+++ b/utf8.c
@@ -3840,7 +3840,7 @@ Perl_check_utf8_print(pTHX_ const U8* s, const STRLEN len)
 			   "%s in %s", unees, PL_op ? OP_DESC(PL_op) : "print");
 	    return FALSE;
 	}
-	if (UNLIKELY(*s >= UTF8_FIRST_PROBLEMATIC_CODE_POINT_FIRST_BYTE)) {
+	if (UNLIKELY(isUTF8_POSSIBLY_PROBLEMATIC(*s))) {
 	    STRLEN char_len;
 	    if (UTF8_IS_SUPER(s, e)) {
 		if (ckWARN_d(WARN_NON_UNICODE)) {