author | Eric Blake <eblake@redhat.com> | 2010-09-21 12:09:55 -0600 |
---|---|---|
committer | Eric Blake <eblake@redhat.com> | 2010-09-21 14:16:30 -0600 |
commit | d07f81f1ce3a6420701eeaf9c9eddedba4ec37fb (patch) | |
tree | c8d0b177aff30166dcdf0a56cb27850afe2ff850 | |
parent | de858e701fe75a7254ef708688ce8fec90945278 (diff) | |
download | autoconf-d07f81f1ce3a6420701eeaf9c9eddedba4ec37fb.tar.gz |
tests: XFAIL in the face of a MacOS X bug

* doc/autoconf.texi (Limitations of Usual Tools) <sed>: Mention
the issue.
* tests/torture.at (Substitute and define special characters):
Detect if sed cannot process 8-bit bytes in the C locale.
* THANKS: Update.
Reported by Rochan.

Signed-off-by: Eric Blake <eblake@redhat.com>
-rw-r--r-- | ChangeLog | 10 |
-rw-r--r-- | THANKS | 1 |
-rw-r--r-- | doc/autoconf.texi | 22 |
-rw-r--r-- | tests/torture.at | 3 |
4 files changed, 36 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2010-09-21  Eric Blake  <eblake@redhat.com>
+
+	tests: XFAIL in the face of a MacOS X bug
+	* doc/autoconf.texi (Limitations of Usual Tools) <sed>: Mention
+	the issue.
+	* tests/torture.at (Substitute and define special characters):
+	Detect if sed cannot process 8-bit bytes in the C locale.
+	* THANKS: Update.
+	Reported by Rochan.
+
 2010-09-20  Eric Blake  <eblake@redhat.com>
 
 	autom4te: don't filter out portions of location traces
diff --git a/THANKS b/THANKS
--- a/THANKS
+++ b/THANKS
@@ -339,6 +339,7 @@ Richard Stallman rms@gnu.org
 Robert Lipe                 robertlipe@usa.net
 Robert S. Maier             rsm@math.arizona.edu
 Roberto Bagnara             bagnara@cs.unipr.it
+Rochan                      rochan@ices.utexas.edu
 Roger Leigh                 rleigh@whinlatter.ukfsn.org
 Roland McGrath              roland@gnu.org
 Rolf Ebert                  rolf.ebert.gcc@gmx.de
diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 64243029..66d8a211 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -18700,6 +18700,28 @@ implementations have an input buffer limited to 4000 bytes.
 Likewise, not all @command{sed} implementations can handle embedded
 @code{NUL} or a missing trailing newline.
 
+Remember that ranges within a bracket expression of a regular expression
+are only well-defined in the @samp{C} (or @samp{POSIX}) locale.
+Meanwhile, support for character classes like @samp{[[:upper:]]} is not
+yet universal, so if you cannot guarantee the setting of @env{LC_ALL},
+it is better to spell out a range @samp{[ABCDEFGHIJKLMNOPQRSTUVWXYZ]}
+than to rely on @samp{[A-Z]}.
+
+Additionally, Posix states that regular expressions are only
+well-defined on characters.  Unfortunately, there exist platforms such
+as MacOS X 10.5 where not all 8-bit byte values are valid characters,
+even though that platform has a single-byte @samp{C} locale.  And Posix
+allows the existence of a multi-byte @samp{C} locale, although that does
+not yet appear to be a common implementation.  At any rate, it means
+that not all bytes will be matched by the regular expression @samp{.}:
+
+@example
+$ @kbd{printf '\200\n' | LC_ALL=C sed -n /./p | wc -l}
+0
+$ @kbd{printf '\200\n' | LC_ALL=en_US.ISO8859-1 sed -n /./p | wc -l}
+1
+@end example
+
 Portable @command{sed} regular expressions should use @samp{\} only to
 escape characters in the string @samp{$()*.0123456789[\^n@{@}}.  For
 example, alternation, @samp{\|}, is common but Posix does not require its
diff --git a/tests/torture.at b/tests/torture.at
index 673c7a59..511834df 100644
--- a/tests/torture.at
+++ b/tests/torture.at
@@ -882,6 +882,9 @@ AT_CLEANUP
 AT_SETUP([Substitute and define special characters])
 AT_KEYWORDS([AC@&t@_DEFINE AC@&t@_DEFINE_UNQUOTED])
 
+AT_XFAIL_IF([byte=\\200s; dnl
+test `{ printf $byte; echo; } | sed -n '/^./p' | wc -l` = 0])
+
 AT_DATA([Foo.in], [@foo@
 @bar@@notsubsted@@baz@ stray @ and more@@@baz@
 abc@bar@baz@baz
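
The two portability points documented in the patch can be tried outside the test suite with a small standalone script. What follows is only a sketch, not code from Autoconf: the function names `sed_matches_8bit_in_c_locale` and `first_upper_letter` are hypothetical, and the probe assumes that `printf` understands the octal escape `\200`.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; neither is part of Autoconf.

# Succeed if sed's '.' matches the single byte 0x80 in the C locale.
# On platforms with the bug described in the patch, sed prints nothing
# and the line count is 0 instead of 1.
sed_matches_8bit_in_c_locale ()
{
  # The backquotes are left unquoted on purpose, as in the patch, so
  # that any blank padding emitted by wc is dropped by word splitting.
  test `printf '\200\n' | LC_ALL=C sed -n '/^./p' | wc -l` -eq 1
}

# Print the first upper-case ASCII letter of $1, or nothing if there is
# none.  The bracket expressions spell out every letter rather than
# relying on [A-Z] or [[:upper:]].
first_upper_letter ()
{
  printf '%s\n' "$1" | LC_ALL=C sed -n \
    's/^[^ABCDEFGHIJKLMNOPQRSTUVWXYZ]*\([ABCDEFGHIJKLMNOPQRSTUVWXYZ]\).*$/\1/p'
}

if sed_matches_8bit_in_c_locale; then
  echo "sed matches 8-bit bytes in the C locale"
else
  echo "sed cannot match 8-bit bytes in the C locale"
fi
```

The probe runs the same kind of pipeline as the `AT_XFAIL_IF` condition, but with the opposite sense: the test suite expects failure when the count is 0, while this function succeeds when the count is 1.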