author John Millaway <john43@users.sourceforge.net> 2002-12-17 20:28:21 +0000
committer John Millaway <john43@users.sourceforge.net> 2002-12-17 20:28:21 +0000
commit 59081b956bb717e46ca6af28ed221989ab73cfa0
tree 5c53218d9115f78f72bc24022237a5be64544cc7
parent 604e6483d56d1da0a19491c2a5a35f7bf048afc9
Documented new behavior with character ranges.
-rw-r--r-- flex.texi 29
1 files changed, 25 insertions, 4 deletions
diff --git a/flex.texi b/flex.texi
index 7056bf0..412bbc0 100644
--- a/flex.texi
+++ b/flex.texi
@@ -856,14 +856,35 @@ For example, the following character classes are all equivalent:
@end verbatim
@end example
+Some notes on patterns are in order.
+
+
+@itemize
@cindex case-insensitive, effect on character classes
-If your scanner is case-insensitive (the @samp{-i} flag), then
+@item If your scanner is case-insensitive (the @samp{-i} flag), then
@samp{[:upper:]} and @samp{[:lower:]} are equivalent to
@samp{[:alpha:]}.
-Some notes on patterns are in order.
+@anchor{case and character ranges}
+@item Character classes with ranges, such as @samp{[a-Z]}, should be used with
+caution in a case-insensitive scanner if the range spans upper or lowercase
+characters. Flex does not know if you want to fold all upper and lowercase
+characters together, or if you want the literal numeric range specified (with
+no case folding). When in doubt, flex will assume that you meant the literal
+numeric range, and will issue a warning. The exception to this rule is a
+character range such as @samp{[a-z]} or @samp{[S-W]} where it is obvious that you
+want case-folding to occur. Here are some examples with the @samp{-i} flag
+enabled:
+
+@multitable {@samp{[a-zA-Z]}} {ambiguous} {@samp{[A-Z\[\\\]_`a-t]}} {@samp{[@@A-Z\[\\\]_`abc]}}
+@item Range @tab Result @tab Literal Range @tab Alternate Range
+@item @samp{[a-t]} @tab ok @tab @samp{[a-tA-T]} @tab
+@item @samp{[A-T]} @tab ok @tab @samp{[a-tA-T]} @tab
+@item @samp{[A-t]} @tab ambiguous @tab @samp{[A-Z\[\\\]_`a-t]} @tab @samp{[a-tA-T]}
+@item @samp{[_-@{]} @tab ambiguous @tab @samp{[_`a-z@{]} @tab @samp{[_`a-zA-Z@{]}
+@item @samp{[@@-C]} @tab ambiguous @tab @samp{[@@ABC]} @tab @samp{[@@A-Z\[\\\]_`abc]}
+@end multitable
-@itemize
@cindex end of line, in negated character classes
@cindex EOL, in negated character classes
@item
@@ -2445,7 +2466,7 @@ instructs @code{flex} to generate a @dfn{case-insensitive} scanner. The
case of letters given in the @code{flex} input patterns will be ignored,
and tokens in the input will be matched regardless of case. The matched
text given in @code{yytext} will have the preserved case (i.e., it will
-not be folded).
+not be folded). For tricky behavior, see @ref{case and character ranges}.