author Eric Blake <eblake@redhat.com> 2010-09-21 12:09:55 -0600
committer Eric Blake <eblake@redhat.com> 2010-09-21 14:16:30 -0600
commit d07f81f1ce3a6420701eeaf9c9eddedba4ec37fb (patch)
tree c8d0b177aff30166dcdf0a56cb27850afe2ff850
parent de858e701fe75a7254ef708688ce8fec90945278 (diff)
download autoconf-d07f81f1ce3a6420701eeaf9c9eddedba4ec37fb.tar.gz
tests: XFAIL in the face of a MacOS X bug

* doc/autoconf.texi (Limitations of Usual Tools) <sed>: Mention
the issue.
* tests/torture.at (Substitute and define special characters):
Detect if sed cannot process 8-bit bytes in the C locale.
* THANKS: Update.
Reported by Rochan.

Signed-off-by: Eric Blake <eblake@redhat.com>

-rw-r--r-- ChangeLog 10
-rw-r--r-- THANKS 1
-rw-r--r-- doc/autoconf.texi 22
-rw-r--r-- tests/torture.at 3
4 files changed, 36 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index 1b47a2c4..311f41f2 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2010-09-21 Eric Blake <eblake@redhat.com>
+
+ tests: XFAIL in the face of a MacOS X bug
+ * doc/autoconf.texi (Limitations of Usual Tools) <sed>: Mention
+ the issue.
+ * tests/torture.at (Substitute and define special characters):
+ Detect if sed cannot process 8-bit bytes in the C locale.
+ * THANKS: Update.
+ Reported by Rochan.
+
2010-09-20 Eric Blake <eblake@redhat.com>
autom4te: don't filter out portions of location traces
diff --git a/THANKS b/THANKS
index cb1589b2..4acb36f6 100644
--- a/THANKS
+++ b/THANKS
@@ -339,6 +339,7 @@ Richard Stallman rms@gnu.org
Robert Lipe robertlipe@usa.net
Robert S. Maier rsm@math.arizona.edu
Roberto Bagnara bagnara@cs.unipr.it
+Rochan rochan@ices.utexas.edu
Roger Leigh rleigh@whinlatter.ukfsn.org
Roland McGrath roland@gnu.org
Rolf Ebert rolf.ebert.gcc@gmx.de
diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 64243029..66d8a211 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -18700,6 +18700,28 @@ implementations have an input buffer limited to 4000 bytes. Likewise,
not all @command{sed} implementations can handle embedded @code{NUL} or
a missing trailing newline.
+Remember that ranges within a bracket expression of a regular expression
+are only well-defined in the @samp{C} (or @samp{POSIX}) locale.
+Meanwhile, support for character classes like @samp{[[:upper:]]} is not
+yet universal, so if you cannot guarantee the setting of @env{LC_ALL},
+it is better to spell out a range @samp{[ABCDEFGHIJKLMNOPQRSTUVWXYZ]}
+than to rely on @samp{[A-Z]}.
+
+Additionally, Posix states that regular expressions are only
+well-defined on characters. Unfortunately, there exist platforms such
+as MacOS X 10.5 where not all 8-bit byte values are valid characters,
+even though that platform has a single-byte @samp{C} locale. And Posix
+allows the existence of a multi-byte @samp{C} locale, although that does
+not yet appear to be a common implementation. At any rate, it means
+that not all bytes will be matched by the regular expression @samp{.}:
+
+@example
+$ @kbd{printf '\200\n' | LC_ALL=C sed -n /./p | wc -l}
+0
+$ @kbd{printf '\200\n' | LC_ALL=en_US.ISO8859-1 sed -n /./p | wc -l}
+1
+@end example
+
Portable @command{sed} regular expressions should use @samp{\} only to escape
characters in the string @samp{$()*.0123456789[\^n@{@}}. For example,
alternation, @samp{\|}, is common but Posix does not require its
diff --git a/tests/torture.at b/tests/torture.at
index 673c7a59..511834df 100644
--- a/tests/torture.at
+++ b/tests/torture.at
@@ -882,6 +882,9 @@ AT_CLEANUP
AT_SETUP([Substitute and define special characters])
AT_KEYWORDS([AC@&t@_DEFINE AC@&t@_DEFINE_UNQUOTED])
+AT_XFAIL_IF([byte=\\200s; dnl
+test `{ printf $byte; echo; } | sed -n '/^./p' | wc -l` = 0])
+
AT_DATA([Foo.in], [@foo@
@bar@@notsubsted@@baz@ stray @ and more@@@baz@
abc@bar@baz@baz
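
Note for readers of this diff: the condition that the new AT_XFAIL_IF probes can be reproduced outside the test suite. The following is a minimal sketch in POSIX shell and is not part of the commit. The helper name `count_dot_matches` is hypothetical, and forcing `LC_ALL=C` follows the documentation example rather than the test itself, which runs `sed` in whatever locale the test suite has set.

```shell
# Hypothetical helper (not part of the commit): print how many lines sed's
# /^./ pattern matches when fed a single byte followed by a newline, with
# the C locale forced.  $1 is a printf octal escape such as '\200'.
count_dot_matches () {
  # tr strips the padding that some wc implementations put around the count.
  { printf "$1"; echo; } | LC_ALL=C sed -n '/^./p' | wc -l | tr -d ' \t'
}

ascii=$(count_dot_matches '\101')   # 'A', a valid character in the C locale
high=$(count_dot_matches '\200')    # byte 0x80; the result is platform-dependent

if test "$high" = 0; then
  echo "sed does not match byte 0x80 with '.' in the C locale"
else
  echo "sed matches byte 0x80 with '.' in the C locale"
fi
```

A count of 0 for the high byte corresponds to the behavior the commit documents for MacOS X 10.5; on platforms where every 8-bit byte is a valid character in the C locale, the count is 1.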