summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authormillaway <millaway>2002-07-11 20:06:52 +0000
committermillaway <millaway>2002-07-11 20:06:52 +0000
commit7c3c752ffad196597dfd94ab605d8bb6690ca7a5 (patch)
tree285945eb3f9a8976d5c16c40197c4a74e15cde3f
parente909e7375fbd3031709bbf26b04029331c24827c (diff)
downloadflex-7c3c752ffad196597dfd94ab605d8bb6690ca7a5.tar.gz
Moved all faqs into manual -- but did not evaluate them yet.
Removed the old faq files.
-rw-r--r--faq.texi2255
1 files changed, 2255 insertions, 0 deletions
diff --git a/faq.texi b/faq.texi
index 5efefeb..4142e6e 100644
--- a/faq.texi
+++ b/faq.texi
@@ -69,6 +69,71 @@
* Whenever flex can not match the input it says "flex scanner jammed".::
* Why doesn't flex have non-greedy operators like perl does?::
* Memory leak - 16386 bytes allocated by malloc.::
+* How do I track the byte offset for lseek()?::
+* unnamed-faq-1::
+* unnamed-faq-16::
+* unnamed-faq-22::
+* unnamed-faq-33::
+* unnamed-faq-42::
+* unnamed-faq-43::
+* unnamed-faq-44::
+* unnamed-faq-45::
+* unnamed-faq-46::
+* unnamed-faq-47::
+* unnamed-faq-48::
+* unnamed-faq-49::
+* unnamed-faq-50::
+* unnamed-faq-51::
+* unnamed-faq-52::
+* unnamed-faq-53::
+* unnamed-faq-54::
+* unnamed-faq-55::
+* unnamed-faq-56::
+* unnamed-faq-57::
+* unnamed-faq-58::
+* unnamed-faq-59::
+* unnamed-faq-60::
+* unnamed-faq-61::
+* unnamed-faq-62::
+* unnamed-faq-63::
+* unnamed-faq-64::
+* unnamed-faq-65::
+* unnamed-faq-66::
+* unnamed-faq-67::
+* unnamed-faq-68::
+* unnamed-faq-69::
+* unnamed-faq-70::
+* unnamed-faq-71::
+* unnamed-faq-72::
+* unnamed-faq-73::
+* unnamed-faq-74::
+* unnamed-faq-75::
+* unnamed-faq-76::
+* unnamed-faq-77::
+* unnamed-faq-78::
+* unnamed-faq-79::
+* unnamed-faq-80::
+* unnamed-faq-81::
+* unnamed-faq-82::
+* unnamed-faq-83::
+* unnamed-faq-84::
+* unnamed-faq-85::
+* unnamed-faq-86::
+* unnamed-faq-87::
+* unnamed-faq-88::
+* unnamed-faq-89::
+* unnamed-faq-90::
+* unnamed-faq-91::
+* unnamed-faq-92::
+* unnamed-faq-93::
+* unnamed-faq-94::
+* unnamed-faq-95::
+* unnamed-faq-96::
+* unnamed-faq-97::
+* unnamed-faq-98::
+* unnamed-faq-99::
+* unnamed-faq-100::
+* unnamed-faq-101::
@end menu
@@ -756,6 +821,10 @@ This approach also has much better error-reporting properties.
@node Memory leak - 16386 bytes allocated by malloc.
@unnumberedsec Memory leak - 16386 bytes allocated by malloc.
+UPDATED 2002-07-10: As of flex version 2.5.9, this leak means that you did not
+call yylex_destroy(). If you are using an earlier version of flex, then read
+on.
+
The leak is about 16426 bytes. That is, (8192 * 2 + 2) for the read-buffer, and
about 40 for struct yy_buffer_state (depending upon alignment). The leak is in
the non-reentrant C scanner only (NOT in the reentrant scanner, NOT in the C++
@@ -778,3 +847,2189 @@ yy_init = 1;
Note: yy_init is an "internal variable", and hasn't been tested in this
situation. It is possible that some other globals may need resetting as well.
+@node How do I track the byte offset for lseek()?
+@unnumberedsec How do I track the byte offset for lseek()?
+
+@example
+@verbatim
+> We thought that it would be possible to have this number through the
+> evaluation of the following expression:
+>
+> seek_position = (no_buffers)*YY_READ_BUF_SIZE + yy_c_buf_p - yy_current_buffer->yy_ch_buf
+@end verbatim
+@end example
+
+While this is the right ideas, it has two problems. The first is that
+it's possible that flex will request less than YY_READ_BUF_SIZE during
+an invocation of YY_INPUT (or that your input source will return less
+even though YY_READ_BUF_SIZE bytes were requested). The second problem
+is that when refilling its internal buffer, flex keeps some characters
+from the previous buffer (because usually it's in the middle of a match,
+and needs those characters to construct yytext for the match once it's
+done). Because of this, yy_c_buf_p - yy_current_buffer->yy_ch_buf won't
+be exactly the number of characters already read from the current buffer.
+
+An alternative solution is to count the number of characters you've matched
+since starting to scan. This can be done by using YY_USER_ACTION. For
+example,
+
+ #define YY_USER_ACTION num_chars += yyleng;
+
+(You need to be careful to update your bookkeeping if you use yymore(),
+yyless(), unput(), or input().)
+
+@c TODO: Evaluate this faq.
+@c TODO: Evaluate this faq.
+@node unnamed-faq-1
+@unnumberedsec unnamed-faq-1
+@example
+@verbatim
+Return-Path: mdonahue@neptune.convex.com
+Received: from 130.168.1.1 by horse.ee.lbl.gov for vern (5.65/1.41)
+ id AA06605; Tue, 7 Jul 92 13:00:27 -0700
+Received: from neptune.convex.com by convex.convex.com (5.64/1.35)
+ id AA13737; Tue, 7 Jul 92 14:24:28 -0500
+Received: by neptune.convex.com (5.64/1.28)
+ id AA27330; Tue, 7 Jul 92 14:24:23 -0500
+From: mdonahue@neptune.convex.com (Mike Donahue)
+Message-Id: <9207071924.AA27330@neptune.convex.com>
+Subject: Re: Flex and Multiple Parsers
+To: vern@horse.ee.lbl.gov (Vern Paxson)
+Date: Tue, 7 Jul 92 14:24:22 CDT
+In-Reply-To: <9207021608.AA25669@horse.ee.lbl.gov>; from "Vern Paxson" at Jul 2, 92 9:08 am
+X-Mailer: ELM [version 2.3 PL11]
+
+
+ Thanks for your help, actually my solution ended up being
+ quite different. let me further describe my problem and
+ its solution to you just in case someone else has this problem.
+
+ Basically we have a library of parsers (I think there are 15).
+ Most programs we write also have several of there own parsers and
+ call some of the routines in the library. Therefore each program
+ must be able to have several parsers.
+
+ To this end I simply changed all the subroutines in the skel file
+ to static (just the subroutines)
+
+ We have parsers that start parsing a file and find a section that
+ needs to be parsed by a library parser (the same file) therefore
+ the file pointer and a state must be passed into the new parser
+ or the file pointer and state buffers must be global declarations
+ (the solution we use in the previous flex and now the new one)
+
+ Because the parsers are working on the same file declaring
+ yyout, yytoken, yyin, yytext, and yylen
+ and the other variables as static does not work, so I declared each
+ as and extern and local(therefore making those variable available to
+ all parsers)
+
+ I.E.
+
+ extern YYCHAR yytext;
+ YYCHAR yytext;
+
+ The solution is still not yet complete, now there are compiler conflicts
+ with the variables that need start states (yy_init, yy_c_buf_p, ...)
+ and if these are static each parser will try to re_init since yy_init
+ will be set to 1 in every parser. The solution here was to get rid
+ of these declarations and make these variables global to all parsers
+ as well (see above). Now the user must initialize the variables,
+ an easy way to do this was to change yyrestart and call it after opening
+ a new file for reading :
+
+static void yyrestart( input_file )
+ {
+ /* Below 3 added by MJD for multiple interactive parsers */
+ yy_c_buf_p = (YY_CHAR *) 0;
+ yy_init = 1; /* whether we need to initialize */
+ yy_start = 0; /* start state number */
+
+ /* Below 2 added by MJD for multiple interactive parsers */
+ if (!input_file) return;
+ if (!yy_current_buffer) return;
+
+ yy_init_buffer( yy_current_buffer, input_file );
+ yy_load_buffer_state();
+
+ }
+
+I also added a routine called yyopenfile (this is actually in our
+library, not in the skeleton), which opens a file, does the restart
+and sets yyin to the new file pointer, returning a NULL if unsuccessful.
+If this routine is used by all the parsers to open a new file then
+the changes are transparent.
+
+
+Anyways, to make a long story short that has got us going with the new
+flex and if anyone else is having trouble with a similar problem
+feel free to give them this solution.
+
+Thanks for your help and the effort you have put in providing flex,
+it is an excellent tool.
+
+Mike
+
+
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-16
+@unnumberedsec unnamed-faq-16
+@example
+@verbatim
+To: steves@telebase.com
+Subject: Re: flex C++ question
+In-reply-to: Your message of Thu, 08 Dec 94 13:10:58 EST.
+Date: Wed, 14 Dec 94 16:40:47 PST
+From: Vern Paxson <vern>
+
+> We'd like to override the provided LexerInput() and LexerOutput()
+> functions, but we'd like to *not* use iostreams. Instead, we'd like
+> to use some of our own I/O classes. Is this possible?
+
+You can do this by passing the various functions nil iostream*'s, and then
+dealing with your own I/O classes surreptitiously (i.e., stashing them in
+special member variables). This works because the only assumption about
+the lexer regarding what's done with the iostream's is that they're
+ultimately passed to LexerInput and LexerOutput, which then do whatever
+necessary with them.
+
+When the flex C++ scanning class rewrite finally happens (no date for this
+in sight), then this sort of thing should become much easier.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-22
+@unnumberedsec unnamed-faq-22
+@example
+@verbatim
+To: matute@emf.net
+Subject: Re: Question
+In-reply-to: Your message of Mon, 15 May 95 11:32:12 PDT.
+Date: Mon, 15 May 95 17:03:25 PDT
+From: Vern Paxson <vern>
+
+> %x SKIP
+> ...
+>
+> <SKIP>"endofit" { BEGIN(INITIAL); }
+> <SKIP>.* ;
+> <SKIP>\n lineno++;
+
+The simplest (but slow) fix is:
+
+<SKIP>"endofit" { BEGIN(INITIAL); }
+<SKIP>.
+<SKIP>\n lineno++;
+
+The better fixes all involve making the second rule match more, without
+making it match "endofit" plus something else. So for example:
+
+<SKIP>"endofit" { BEGIN(INITIAL); }
+<SKIP>.{1,7}
+<SKIP>\n lineno++;
+
+or
+
+<SKIP>"endofit" { BEGIN(INITIAL); }
+<SKIP>[^e\n]+
+<SKIP>. /* so you eat up e's, too */
+<SKIP>\n lineno++;
+
+- Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-33
+@unnumberedsec unnamed-faq-33
+@example
+@verbatim
+QUESTION:
+When was flex born?
+
+Vern Paxson took over
+the Software Tools lex project from Jef Poskanzer in 1982. At that point it
+was written in Ratfor. Around 1987 or so, Paxson translated it into C, and
+a legend was born :-).
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-42
+@unnumberedsec unnamed-faq-42
+@example
+@verbatim
+To: Adoram Rogel <adoram@orna.hybridge.com>
+Subject: Re: Flex 2.5.2 performance questions
+In-reply-to: Your message of Wed, 18 Sep 96 11:12:17 EDT.
+Date: Wed, 18 Sep 96 10:51:02 PDT
+From: Vern Paxson <vern>
+
+[Note, the most recent flex release is 2.5.4, which you can get from
+ ftp.ee.lbl.gov. It has bug fixes over 2.5.2 and 2.5.3.]
+
+> 1. Using the pattern
+> ([Ff](oot)?)?[Nn](ote)?(\.)?
+> instead of
+> (((F|f)oot(N|n)ote)|((N|n)ote)|((N|n)\.)|((F|f)(N|n)(\.)))
+> (in a very complicated flex program) caused the program to slow from
+> 300K+/min to 100K/min (no other changes were done).
+
+These two are not equivalent. For example, the first can match "footnote."
+but the second can only match "footnote". This is almost certainly the
+cause in the discrepancy - the slower scanner run is matching more tokens,
+and/or having to do more backing up.
+
+> 2. Which of these two are better: [Ff]oot or (F|f)oot ?
+
+From a performance point of view, they're equivalent (modulo presumably
+minor effects such as memory cache hit rates; and the presence of trailing
+context, see below). From a space point of view, the first is slightly
+preferable.
+
+> 3. I have a pattern that look like this:
+> pats {p1}|{p2}|{p3}|...|{p50} (50 patterns ORd)
+>
+> running yet another complicated program that includes the following rule:
+> <snext>{and}/{no4}{bb}{pats}
+>
+> gets me to "too complicated - over 32,000 states"...
+
+I can't tell from this example whether the trailing context is variable-length
+or fixed-length (it could be the latter if {and} is fixed-length). If it's
+variable length, which flex -p will tell you, then this reflects a basic
+performance problem, and if you can eliminate it by restructuring your
+scanner, you will see significant improvement.
+
+> so I divided {pats} to {pats1}, {pats2},..., {pats5} each consists of about
+> 10 patterns and changed the rule to be 5 rules.
+> This did compile, but what is the rule of thumb here ?
+
+The rule is to avoid trailing context other than fixed-length, in which for
+a/b, either the 'a' pattern or the 'b' pattern have a fixed length. Use
+of the '|' operator automatically makes the pattern variable length, so in
+this case '[Ff]oot' is preferred to '(F|f)oot'.
+
+> 4. I changed a rule that looked like this:
+> <snext8>{and}{bb}/{ROMAN}[^A-Za-z] { BEGIN...
+>
+> to the next 2 rules:
+> <snext8>{and}{bb}/{ROMAN}[A-Za-z] { ECHO;}
+> <snext8>{and}{bb}/{ROMAN} { BEGIN...
+>
+> Again, I understand the using [^...] will cause a great performance loss
+
+Actually, it doesn't cause any sort of performance loss. It's a surprising
+fact about regular expressions that they always match in linear time
+regardless of how complex they are.
+
+> but are there any specific rules about it ?
+
+See the "Performance Considerations" section of the man page, and also
+the example in MISC/fastwc/.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-43
+@unnumberedsec unnamed-faq-43
+@example
+@verbatim
+To: Adoram Rogel <adoram@hybridge.com>
+Subject: Re: Flex 2.5.2 performance questions
+In-reply-to: Your message of Thu, 19 Sep 96 10:16:04 EDT.
+Date: Thu, 19 Sep 96 09:58:00 PDT
+From: Vern Paxson <vern>
+
+> a lot about the backing up problem.
+> I believe that there lies my biggest problem, and I'll try to improve
+> it.
+
+Since you have variable trailing context, this is a bigger performance
+problem. Fixing it is usually easier than fixing backing up, which in a
+complicated scanner (yours seems to fit the bill) can be extremely
+difficult to do correctly.
+
+You also don't mention what flags you are using for your scanner.
+-f makes a large speed difference, and -Cfe buys you nearly as much
+speed but the resulting scanner is considerably smaller.
+
+> I have an | operator in {and} and in {pats} so both of them are variable
+> length.
+
+-p should have reported this.
+
+> Is changing one of them to fixed-length is enough ?
+
+Yes.
+
+> Is it possible to change the 32,000 states limit ?
+
+Yes. I've appended instructions on how. Before you make this change,
+though, you should think about whether there are ways to fundamentally
+simplify your scanner - those are certainly preferable!
+
+ Vern
+
+
+To increase the 32K limit (on a machine with 32 bit integers), you increase
+the magnitude of the following in flexdef.h:
+
+ #define JAMSTATE -32766 /* marks a reference to the state that always jams */
+ #define MAXIMUM_MNS 31999
+ #define BAD_SUBSCRIPT -32767
+ #define MAX_SHORT 32700
+
+Adding a 0 or two after each should do the trick.
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-44
+@unnumberedsec unnamed-faq-44
+@example
+@verbatim
+To: Heeman_Lee@hp.com
+Subject: Re: flex - multi-byte support?
+In-reply-to: Your message of Thu, 03 Oct 1996 17:24:04 PDT.
+Date: Fri, 04 Oct 1996 11:42:18 PDT
+From: Vern Paxson <vern>
+
+> I assume as long as my *.l file defines the
+> range of expected character code values (in octal format), flex will
+> scan the file and read multi-byte characters correctly. But I have no
+> confidence in this assumption.
+
+Your lack of confidence is justified - this won't work.
+
+Flex has in it a widespread assumption that the input is processed
+one byte at a time. Fixing this is on the to-do list, but is involved,
+so it won't happen any time soon. In the interim, the best I can suggest
+(unless you want to try fixing it yourself) is to write your rules in
+terms of pairs of bytes, using definitions in the first section:
+
+ X \xfe\xc2
+ ...
+ %%
+ foo{X}bar found_foo_fe_c2_bar();
+
+etc. Definitely a pain - sorry about that.
+
+By the way, the email address you used for me is ancient, indicating you
+have a very old version of flex. You can get the most recent, 2.5.4, from
+ftp.ee.lbl.gov.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-45
+@unnumberedsec unnamed-faq-45
+@example
+@verbatim
+To: moleary@primus.com
+Subject: Re: Flex / Unicode compatibility question
+In-reply-to: Your message of Tue, 22 Oct 1996 10:15:42 PDT.
+Date: Tue, 22 Oct 1996 11:06:13 PDT
+From: Vern Paxson <vern>
+
+Unfortunately flex at the moment has a widespread assumption within it
+that characters are processed 8 bits at a time. I don't see any easy
+fix for this (other than writing your rules in terms of double characters -
+a pain). I also don't know of a wider lex, though you might try surfing
+the Plan 9 stuff because I know it's a Unicode system, and also the PCCT
+toolkit (try searching say Alta Vista for "Purdue Compiler Construction
+Toolkit").
+
+Fixing flex to handle wider characters is on the long-term to-do list.
+But since flex is a strictly spare-time project these days, this probably
+won't happen for quite a while, unless someone else does it first.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-46
+@unnumberedsec unnamed-faq-46
+@example
+@verbatim
+To: Johan Linde <jl@theophys.kth.se>
+Subject: Re: translation of flex
+In-reply-to: Your message of Sun, 10 Nov 1996 09:16:36 PST.
+Date: Mon, 11 Nov 1996 10:33:50 PST
+From: Vern Paxson <vern>
+
+> I'm working for the Swedish team translating GNU program, and I'm currently
+> working with flex. I have a few questions about some of the messages which
+> I hope you can answer.
+
+All of the things you're wondering about, by the way, concerning flex
+internals - probably the only person who understands what they mean in
+English is me! So I wouldn't worry too much about getting them right.
+That said ...
+
+> #: main.c:545
+> msgid " %d protos created\n"
+>
+> Does proto mean prototype?
+
+Yes - prototypes of state compression tables.
+
+> #: main.c:539
+> msgid " %d/%d (peak %d) template nxt-chk entries created\n"
+>
+> Here I'm mainly puzzled by 'nxt-chk'. I guess it means 'next-check'. (?)
+> However, 'template next-check entries' doesn't make much sense to me. To be
+> able to find a good translation I need to know a little bit more about it.
+
+There is a scheme in the Aho/Sethi/Ullman compiler book for compressing
+scanner tables. It involves creating two pairs of tables. The first has
+"base" and "default" entries, the second has "next" and "check" entries.
+The "base" entry is indexed by the current state and yields an index into
+the next/check table. The "default" entry gives what to do if the state
+transition isn't found in next/check. The "next" entry gives the next
+state to enter, but only if the "check" entry verifies that this entry is
+correct for the current state. Flex creates templates of series of
+next/check entries and then encodes differences from these templates as a
+way to compress the tables.
+
+> #: main.c:533
+> msgid " %d/%d base-def entries created\n"
+>
+> The same problem here for 'base-def'.
+
+See above.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-47
+@unnumberedsec unnamed-faq-47
+@example
+@verbatim
+To: Xinying Li <xli@npac.syr.edu>
+Subject: Re: FLEX ?
+In-reply-to: Your message of Wed, 13 Nov 1996 17:28:38 PST.
+Date: Wed, 13 Nov 1996 19:51:54 PST
+From: Vern Paxson <vern>
+
+> "unput()" them to input flow, question occurs. If I do this after I scan
+> a carriage, the variable "yy_current_buffer->yy_at_bol" is changed. That
+> means the carriage flag has gone.
+
+You can control this by calling yy_set_bol(). It's described in the manual.
+
+> And if in pre-reading it goes to the end of file, is anything done
+> to control the end of curren buffer and end of file?
+
+No, there's no way to put back an end-of-file.
+
+> By the way I am using flex 2.5.2 and using the "-l".
+
+The latest release is 2.5.4, by the way. It fixes some bugs in 2.5.2 and
+2.5.3. You can get it from ftp.ee.lbl.gov.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-48
+@unnumberedsec unnamed-faq-48
+@example
+@verbatim
+To: Alain.ISSARD@st.com
+Subject: Re: Start condition with FLEX
+In-reply-to: Your message of Mon, 18 Nov 1996 09:45:02 PST.
+Date: Mon, 18 Nov 1996 10:41:34 PST
+From: Vern Paxson <vern>
+
+> I am not able to use the start condition scope and to use the | (OR) with
+> rules having start conditions.
+
+The problem is that if you use '|' as a regular expression operator, for
+example "a|b" meaning "match either 'a' or 'b'", then it must *not* have
+any blanks around it. If you instead want the special '|' *action* (which
+from your scanner appears to be the case), which is a way of giving two
+different rules the same action:
+
+ foo |
+ bar matched_foo_or_bar();
+
+then '|' *must* be separated from the first rule by whitespace and *must*
+be followed by a new line. You *cannot* write it as:
+
+ foo | bar matched_foo_or_bar();
+
+even though you might think you could because yacc supports this syntax.
+The reason for this unfortunately incompatibility is historical, but it's
+unlikely to be changed.
+
+Your problems with start condition scope are simply due to syntax errors
+from your use of '|' later confusing flex.
+
+Let me know if you still have problems.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-49
+@unnumberedsec unnamed-faq-49
+@example
+@verbatim
+To: Gregory Margo <gmargo@newton.vip.best.com>
+Subject: Re: flex-2.5.3 bug report
+In-reply-to: Your message of Sat, 23 Nov 1996 16:50:09 PST.
+Date: Sat, 23 Nov 1996 17:07:32 PST
+From: Vern Paxson <vern>
+
+> Enclosed is a lex file that "real" lex will process, but I cannot get
+> flex to process it. Could you try it and maybe point me in the right direction?
+
+Your problem is that some of the definitions in the scanner use the '/'
+trailing context operator, and have it enclosed in ()'s. Flex does not
+allow this operator to be enclosed in ()'s because doing so allows undefined
+regular expressions such as "(a/b)+". So the solution is to remove the
+parentheses. Note that you must also be building the scanner with the -l
+option for AT&T lex compatibility. Without this option, flex automatically
+encloses the definitions in parentheses.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-50
+@unnumberedsec unnamed-faq-50
+@example
+@verbatim
+To: Thomas Hadig <hadig@toots.physik.rwth-aachen.de>
+Subject: Re: Flex Bug ?
+In-reply-to: Your message of Tue, 26 Nov 1996 14:35:01 PST.
+Date: Tue, 26 Nov 1996 11:15:05 PST
+From: Vern Paxson <vern>
+
+> In my lexer code, i have the line :
+> ^\*.* { }
+>
+> Thus all lines starting with an astrix (*) are comment lines.
+> This does not work !
+
+I can't get this problem to reproduce - it works fine for me. Note
+though that if what you have is slightly different:
+
+ COMMENT ^\*.*
+ %%
+ {COMMENT} { }
+
+then it won't work, because flex pushes back macro definitions enclosed
+in ()'s, so the rule becomes
+
+ (^\*.*) { }
+
+and now that the '^' operator is not at the immediate beginning of the
+line, it's interpreted as just a regular character. You can avoid this
+behavior by using the "-l" lex-compatibility flag, or "%option lex-compat".
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-51
+@unnumberedsec unnamed-faq-51
+@example
+@verbatim
+To: Adoram Rogel <adoram@hybridge.com>
+Subject: Re: Flex 2.5.4 BOF ???
+In-reply-to: Your message of Tue, 26 Nov 1996 16:10:41 PST.
+Date: Wed, 27 Nov 1996 10:56:25 PST
+From: Vern Paxson <vern>
+
+> Organization(s)?/[a-z]
+>
+> This matched "Organizations" (looking in debug mode, the trailing s
+> was matched with trailing context instead of the optional (s) in the
+> end of the word.
+
+That should only happen with lex. Flex can properly match this pattern.
+(That might be what you're saying, I'm just not sure.)
+
+> Is there a way to avoid this dangerous trailing context problem ?
+
+Unfortunately, there's no easy way. On the other hand, I don't see why
+it should be a problem. Lex's matching is clearly wrong, and I'd hope
+that usually the intent remains the same as expressed with the pattern,
+so flex's matching will be correct.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-52
+@unnumberedsec unnamed-faq-52
+@example
+@verbatim
+To: Cameron MacKinnon <mackin@interlog.com>
+Subject: Re: Flex documentation bug
+In-reply-to: Your message of Mon, 02 Dec 1996 00:07:08 PST.
+Date: Sun, 01 Dec 1996 22:29:39 PST
+From: Vern Paxson <vern>
+
+> I'm not sure how or where to submit bug reports (documentation or
+> otherwise) for the GNU project stuff ...
+
+Well, strictly speaking flex isn't part of the GNU project. They just
+distribute it because no one's written a decent GPL'd lex replacement.
+So you should send bugs directly to me. Those sent to the GNU folks
+sometimes find there way to me, but some may drop between the cracks.
+
+> In GNU Info, under the section 'Start Conditions', and also in the man
+> page (mine's dated April '95) is a nice little snippet showing how to
+> parse C quoted strings into a buffer, defined to be MAX_STR_CONST in
+> size. Unfortunately, no overflow checking is ever done ...
+
+This is already mentioned in the manual:
+
+ Finally, here's an example of how to match C-style quoted
+ strings using exclusive start conditions, including expanded
+ escape sequences (but not including checking for a string
+ that's too long):
+
+The reason for not doing the overflow checking is that it will needlessly
+clutter up an example whose main purpose is just to demonstrate how to
+use flex.
+
+The latest release is 2.5.4, by the way, available from ftp.ee.lbl.gov.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-53
+@unnumberedsec unnamed-faq-53
+@example
+@verbatim
+To: tsv@cs.UManitoba.CA
+Subject: Re: Flex (reg)..
+In-reply-to: Your message of Thu, 06 Mar 1997 23:50:16 PST.
+Date: Thu, 06 Mar 1997 15:54:19 PST
+From: Vern Paxson <vern>
+
+> [:alpha:] ([:alnum:] | \\_)*
+
+If your rule really has embedded blanks as shown above, then it won't
+work, as the first blank delimits the rule from the action. (It wouldn't
+even compile ...) You need instead:
+
+[:alpha:]([:alnum:]|\\_)*
+
+and that should work fine - there's no restriction on what can go inside
+of ()'s except for the trailing context operator, '/'.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-54
+@unnumberedsec unnamed-faq-54
+@example
+@verbatim
+To: "Mike Stolnicki" <mstolnic@ford.com>
+Subject: Re: FLEX help
+In-reply-to: Your message of Fri, 30 May 1997 13:33:27 PDT.
+Date: Fri, 30 May 1997 10:46:35 PDT
+From: Vern Paxson <vern>
+
+> We'd like to add "if-then-else", "while", and "for" statements to our
+> language ...
+> We've investigated many possible solutions. The one solution that seems
+> the most reasonable involves knowing the position of a TOKEN in yyin.
+
+I strongly advise you to instead build a parse tree (abstract syntax tree)
+and loop over that instead. You'll find this has major benefits in keeping
+your interpreter simple and extensible.
+
+That said, the functionality you mention for get_position and set_position
+have been on the to-do list for a while. As flex is a purely spare-time
+project for me, no guarantees when this will be added (in particular, it
+for sure won't be for many months to come).
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-55
+@unnumberedsec unnamed-faq-55
+@example
+@verbatim
+To: Colin Paul Adams <colin@colina.demon.co.uk>
+Subject: Re: Flex C++ classes and Bison
+In-reply-to: Your message of 09 Aug 1997 17:11:41 PDT.
+Date: Fri, 15 Aug 1997 10:48:19 PDT
+From: Vern Paxson <vern>
+
+> #define YY_DECL int yylex (YYSTYPE *lvalp, struct parser_control
+> *parm)
+>
+> I have been trying to get this to work as a C++ scanner, but it does
+> not appear to be possible (warning that it matches no declarations in
+> yyFlexLexer, or something like that).
+>
+> Is this supposed to be possible, or is it being worked on (I DID
+> notice the comment that scanner classes are still experimental, so I'm
+> not too hopeful)?
+
+What you need to do is derive a subclass from yyFlexLexer that provides
+the above yylex() method, squirrels away lvalp and parm into member
+variables, and then invokes yyFlexLexer::yylex() to do the regular scanning.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-56
+@unnumberedsec unnamed-faq-56
+@example
+@verbatim
+To: Mikael.Latvala@lmf.ericsson.se
+Subject: Re: Possible mistake in Flex v2.5 document
+In-reply-to: Your message of Fri, 05 Sep 1997 16:07:24 PDT.
+Date: Fri, 05 Sep 1997 10:01:54 PDT
+From: Vern Paxson <vern>
+
+> In that example you show how to count comment lines when using
+> C style /* ... */ comments. My question is, shouldn't you take into
+> account a scenario where end of a comment marker occurs inside
+> character or string literals?
+
+The scanner certainly needs to also scan character and string literals.
+However it does that (there's an example in the man page for strings), the
+lexer will recognize the beginning of the literal before it runs across the
+embedded "/*". Consequently, it will finish scanning the literal before it
+even considers the possibility of matching "/*".
+
+Example:
+
+ '([^']*|{ESCAPE_SEQUENCE})'
+
+will match all the text between the ''s (inclusive). So the lexer
+considers this as a token beginning at the first ', and doesn't even
+attempt to match other tokens inside it.
+
+I thinnk this subtlety is not worth putting in the manual, as I suspect
+it would confuse more people than it would enlighten.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-57
+@unnumberedsec unnamed-faq-57
+@example
+@verbatim
+To: "Marty Leisner" <leisner@sdsp.mc.xerox.com>
+Subject: Re: flex limitations
+In-reply-to: Your message of Sat, 06 Sep 1997 11:27:21 PDT.
+Date: Mon, 08 Sep 1997 11:38:08 PDT
+From: Vern Paxson <vern>
+
+> %%
+> [a-zA-Z]+ /* skip a line */
+> { printf("got %s\n", yytext); }
+> %%
+
+What version of flex are you using? If I feed this to 2.5.4, it complains:
+
+ "bug.l", line 5: EOF encountered inside an action
+ "bug.l", line 5: unrecognized rule
+ "bug.l", line 5: fatal parse error
+
+Not the world's greatest error message, but it manages to flag the problem.
+
+(With the introduction of start condition scopes, flex can't accommodate
+an action on a separate line, since it's ambiguous with an indented rule.)
+
+You can get 2.5.4 from ftp.ee.lbl.gov.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-58
+@unnumberedsec unnamed-faq-58
+@example
+@verbatim
+To: uocarroll@deagostini.co.uk (Ultan O'Carroll)
+Subject: Re: Flex repositries
+In-reply-to: Your message of Fri, 12 Sep 1997 15:02:28 PDT.
+Date: Fri, 12 Sep 1997 10:31:50 PDT
+From: Vern Paxson <vern>
+
+> before I start beavering away I wonder if you know of any
+> place/libraries for flex
+> desciption files that might already do this or give me a head start ?
+
+Unfortunately, no, I don't. You might try asking on comp.compilers.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-59
+@unnumberedsec unnamed-faq-59
+@example
+@verbatim
+To: Adoram Rogel <adoram@hybridge.com>
+Subject: Re: Conditional compiling in the definitions section
+In-reply-to: Your message of Thu, 25 Sep 1997 11:22:42 PDT.
+Date: Thu, 25 Sep 1997 10:56:31 PDT
+From: Vern Paxson <vern>
+
+> I'm trying to combine two large lex files that now differ only in
+> about 10 lines in the definitions section.
+> I would like to have something like this:
+> #ifdef FFF
+> it \<IT\>
+> #else
+> it \<I\>
+> #endif
+>
+> Now, I can't add states for these, as I have already too many states
+> and the program is very complicated, and I won't be able to handle
+> 10 or 20 more states.
+>
+> Any trick to do this ?
+
+You might try using m4, or the C preprocessor plus a sed script to
+clean up the result (strip out the #line's).
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-60
+@unnumberedsec unnamed-faq-60
+@example
+@verbatim
+To: Steve Antoch <SteveAn@visio.com>
+Subject: Re: lex and yacc grammars
+In-reply-to: Your message of Mon, 17 Nov 1997 15:31:25 PST.
+Date: Mon, 17 Nov 1997 15:27:01 PST
+From: Vern Paxson <vern>
+
+> Would you happen to know where I can find grammars for lex and yacc?
+
+The flex sources have a grammar for (f)lex. Dunno about yacc,
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-61
+@unnumberedsec unnamed-faq-61
+@example
+@verbatim
+To: Bryan Housel <bryan@drawcomp.com>
+Subject: Re: Question about Flex v2.5
+In-reply-to: Your message of Tue, 11 Nov 1997 21:30:23 PST.
+Date: Mon, 17 Nov 1997 17:12:21 PST
+From: Vern Paxson <vern>
+
+> It prints one of those "end of buffer.." messages for each character in the
+> token...
+
+This will happen if your LexerInput() function returns only one character
+at a time, which can happen either if you're scanner is "interactive", or
+if the streams library on your platform always returns 1 for yyin->gcount().
+
+Solution: override LexerInput() with a version that returns whole buffers.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-62
+@unnumberedsec unnamed-faq-62
+@example
+@verbatim
+To: Georg.Rehm@CL-KI.Uni-Osnabrueck.DE
+Subject: Re: Flex maximums
+In-reply-to: Your message of Mon, 17 Nov 1997 17:16:06 PST.
+Date: Mon, 17 Nov 1997 17:16:15 PST
+From: Vern Paxson <vern>
+
+> I took a quick look into the flex-sources and altered some #defines in
+> flexdefs.h:
+>
+> #define INITIAL_MNS 64000
+> #define MNS_INCREMENT 1024000
+> #define MAXIMUM_MNS 64000
+
+The things to fix are to add a couple of zeroes to:
+
+ #define JAMSTATE -32766 /* marks a reference to the state that always jams */
+ #define MAXIMUM_MNS 31999
+ #define BAD_SUBSCRIPT -32767
+ #define MAX_SHORT 32700
+
+and, if you get complaints about too many rules, make the following change too:
+
+ #define YY_TRAILING_MASK 0x200000
+ #define YY_TRAILING_HEAD_MASK 0x400000
+
+- Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-63
+@unnumberedsec unnamed-faq-63
+@example
+@verbatim
+To: jimmey@lexis-nexis.com (Jimmey Todd)
+Subject: Re: FLEX question regarding istream vs ifstream
+In-reply-to: Your message of Mon, 08 Dec 1997 15:54:15 PST.
+Date: Mon, 15 Dec 1997 13:21:35 PST
+From: Vern Paxson <vern>
+
+> stdin_handle = YY_CURRENT_BUFFER;
+> ifstream fin( "aFile" );
+> yy_switch_to_buffer( yy_create_buffer( fin, YY_BUF_SIZE ) );
+>
+> What I'm wanting to do, is pass the contents of a file thru one set
+> of rules and then pass stdin thru another set... It works great if, I
+> don't use the C++ classes. But since everything else that I'm doing is
+> in C++, I thought I'd be consistent.
+>
+> The problem is that 'yy_create_buffer' is expecting an istream* as it's
+> first argument (as stated in the man page). However, fin is a ifstream
+> object. Any ideas on what I might be doing wrong? Any help would be
+> appreciated. Thanks!!
+
+You need to pass &fin, to turn it into an ifstream* instead of an ifstream.
+Then its type will be compatible with the expected istream*, because ifstream
+is derived from istream.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-64
+@unnumberedsec unnamed-faq-64
+@example
+@verbatim
+To: Enda Fadian <fadiane@piercom.ie>
+Subject: Re: Question related to Flex man page?
+In-reply-to: Your message of Tue, 16 Dec 1997 15:17:34 PST.
+Date: Tue, 16 Dec 1997 14:17:09 PST
+From: Vern Paxson <vern>
+
+> Can you explain to me what is ment by a long-jump in relation to flex?
+
+Using the longjmp() function while inside yylex() or a routine called by it.
+
+> what is the flex activation frame.
+
+Just yylex()'s stack frame.
+
+> As far as I can see yyrestart will bring me back to the sart of the input
+> file and using flex++ isnot really an option!
+
+No, yyrestart() doesn't imply a rewind, even though its name might sound
+like it does. It tells the scanner to flush its internal buffers and
+start reading from the given file at its present location.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-65
+@unnumberedsec unnamed-faq-65
+@example
+@verbatim
+To: hassan@larc.info.uqam.ca (Hassan Alaoui)
+Subject: Re: Need urgent Help
+In-reply-to: Your message of Sat, 20 Dec 1997 19:38:19 PST.
+Date: Sun, 21 Dec 1997 21:30:46 PST
+From: Vern Paxson <vern>
+
+> /usr/lib/yaccpar: In function `int yyparse()':
+> /usr/lib/yaccpar:184: warning: implicit declaration of function `int yylex(...)'
+>
+> ld: Undefined symbol
+> _yylex
+> _yyparse
+> _yyin
+
+This is a known problem with Solaris C++ (and/or Solaris yacc). I believe
+the fix is to explicitly insert some 'extern "C"' statements for the
+corresponding routines/symbols.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-66
+@unnumberedsec unnamed-faq-66
+@example
+@verbatim
+To: mc0307@mclink.it
+Cc: gnu@prep.ai.mit.edu
+Subject: Re: [mc0307@mclink.it: Help request]
+In-reply-to: Your message of Fri, 12 Dec 1997 17:57:29 PST.
+Date: Sun, 21 Dec 1997 22:33:37 PST
+From: Vern Paxson <vern>
+
+> This is my definition for float and integer types:
+> . . .
+> NZD [1-9]
+> ...
+> I've tested my program on other lex version (on UNIX Sun Solaris an HP
+> UNIX) and it work well, so I think that my definitions are correct.
+> There are any differences between Lex and Flex?
+
+There are indeed differences, as discussed in the man page. The one
+you are probably running into is that when flex expands a name definition,
+it puts parentheses around the expansion, while lex does not. There's
+an example in the man page of how this can lead to different matching.
+Flex's behavior complies with the POSIX standard (or at least with the
+last POSIX draft I saw).
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-67
+@unnumberedsec unnamed-faq-67
+@example
+@verbatim
+To: hassan@larc.info.uqam.ca (Hassan Alaoui)
+Subject: Re: Thanks
+In-reply-to: Your message of Mon, 22 Dec 1997 16:06:35 PST.
+Date: Mon, 22 Dec 1997 14:35:05 PST
+From: Vern Paxson <vern>
+
+> Thank you very much for your help. I compile and link well with C++ while
+> declaring 'yylex ...' extern, But a little problem remains. I get a
+> segmentation default when executing ( I linked with lfl library) while it
+> works well when using LEX instead of flex. Do you have some ideas about the
+> reason for this ?
+
+The one possible reason for this that comes to mind is if you've defined
+yytext as "extern char yytext[]" (which is what lex uses) instead of
+"extern char *yytext" (which is what flex uses). If it's not that, then
+I'm afraid I don't know what the problem might be.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-68
+@unnumberedsec unnamed-faq-68
+@example
+@verbatim
+To: "Bart Niswonger" <NISWONGR@almaden.ibm.com>
+Subject: Re: flex 2.5: c++ scanners & start conditions
+In-reply-to: Your message of Tue, 06 Jan 1998 10:34:21 PST.
+Date: Tue, 06 Jan 1998 19:19:30 PST
+From: Vern Paxson <vern>
+
+> The problem is that when I do this (using %option c++) start
+> conditions seem to not apply.
+
+The BEGIN macro modifies the yy_start variable. For C scanners, this
+is a static with scope visible through the whole file. For C++ scanners,
+it's a member variable, so it only has visible scope within a member
+function. Your lexbegin() routine is not a member function when you
+build a C++ scanner, so it's not modifying the correct yy_start. The
+diagnostic that indicates this is that you found you needed to add
+a declaration of yy_start in order to get your scanner to compile when
+using C++; instead, the correct fix is to make lexbegin() a member
+function (by deriving from yyFlexLexer).
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-69
+@unnumberedsec unnamed-faq-69
+@example
+@verbatim
+To: "Boris Zinin" <boris@ippe.rssi.ru>
+Subject: Re: current position in flex buffer
+In-reply-to: Your message of Mon, 12 Jan 1998 18:58:23 PST.
+Date: Mon, 12 Jan 1998 12:03:15 PST
+From: Vern Paxson <vern>
+
+> The problem is how to determine the current position in flex active
+> buffer when a rule is matched....
+
+You will need to keep track of this explicitly, such as by redefining
+YY_USER_ACTION to count the number of characters matched.
+
+The latest flex release, by the way, is 2.5.4, available from ftp.ee.lbl.gov.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-70
+@unnumberedsec unnamed-faq-70
+@example
+@verbatim
+To: Bik.Dhaliwal@bis.org
+Subject: Re: Flex question
+In-reply-to: Your message of Mon, 26 Jan 1998 13:05:35 PST.
+Date: Tue, 27 Jan 1998 22:41:52 PST
+From: Vern Paxson <vern>
+
+> That requirement involves knowing
+> the character position at which a particular token was matched
+> in the lexer.
+
+The way you have to do this is by explicitly keeping track of where
+you are in the file, by counting the number of characters scanned
+for each token (available in yyleng). It may prove convenient to
+do this by redefining YY_USER_ACTION, as described in the manual.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-71
+@unnumberedsec unnamed-faq-71
+@example
+@verbatim
+To: Vladimir Alexiev <vladimir@cs.ualberta.ca>
+Subject: Re: flex: how to control start condition from parser?
+In-reply-to: Your message of Mon, 26 Jan 1998 05:50:16 PST.
+Date: Tue, 27 Jan 1998 22:45:37 PST
+From: Vern Paxson <vern>
+
+> It seems useful for the parser to be able to tell the lexer about such
+> context dependencies, because then they don't have to be limited to
+> local or sequential context.
+
+One way to do this is to have the parser call a stub routine that's
+included in the scanner's .l file, and consequently that has access ot
+BEGIN. The only ugliness is that the parser can't pass in the state
+it wants, because those aren't visible - but if you don't have many
+such states, then using a different set of names doesn't seem like
+to much of a burden.
+
+While generating a .h file like you suggests is certainly cleaner,
+flex development has come to a virtual stand-still :-(, so a workaround
+like the above is much more pragmatic than waiting for a new feature.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-72
+@unnumberedsec unnamed-faq-72
+@example
+@verbatim
+To: Barbara Denny <denny@3com.com>
+Subject: Re: freebsd flex bug?
+In-reply-to: Your message of Fri, 30 Jan 1998 12:00:43 PST.
+Date: Fri, 30 Jan 1998 12:42:32 PST
+From: Vern Paxson <vern>
+
+> lex.yy.c:1996: parse error before `='
+
+This is the key, identifying this error. (It may help to pinpoint
+it by using flex -L, so it doesn't generate #line directives in its
+output.) I will bet you heavy money that you have a start condition
+name that is also a variable name, or something like that; flex spits
+out #define's for each start condition name, mapping them to a number,
+so you can wind up with:
+
+ %x foo
+ %%
+ ...
+ %%
+ void bar()
+ {
+ int foo = 3;
+ }
+
+and the penultimate will turn into "int 1 = 3" after C preprocessing,
+since flex will put "#define foo 1" in the generated scanner.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-73
+@unnumberedsec unnamed-faq-73
+@example
+@verbatim
+To: Maurice Petrie <mpetrie@infoscigroup.com>
+Subject: Re: Lost flex .l file
+In-reply-to: Your message of Mon, 02 Feb 1998 14:10:01 PST.
+Date: Mon, 02 Feb 1998 11:15:12 PST
+From: Vern Paxson <vern>
+
+> I am curious as to
+> whether there is a simple way to backtrack from the generated source to
+> reproduce the lost list of tokens we are searching on.
+
+In theory, it's straight-forward to go from the DFA representation
+back to a regular-expression representation - the two are isomorphic.
+In practice, a huge headache, because you have to unpack all the tables
+back into a single DFA representation, and then write a program to munch
+on that and translate it into an RE.
+
+Sorry for the less-than-happy news ...
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-74
+@unnumberedsec unnamed-faq-74
+@example
+@verbatim
+To: jimmey@lexis-nexis.com (Jimmey Todd)
+Subject: Re: Flex performance question
+In-reply-to: Your message of Thu, 19 Feb 1998 11:01:17 PST.
+Date: Thu, 19 Feb 1998 08:48:51 PST
+From: Vern Paxson <vern>
+
+> What I have found, is that the smaller the data chunk, the faster the
+> program executes. This is the opposite of what I expected. Should this be
+> happening this way?
+
+This is exactly what will happen if your input file has embedded NULs.
+From the man page:
+
+ A final note: flex is slow when matching NUL's, particularly
+ when a token contains multiple NUL's. It's best to write
+ rules which match short amounts of text if it's anticipated
+ that the text will often include NUL's.
+
+So that's the first thing to look for.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-75
+@unnumberedsec unnamed-faq-75
+@example
+@verbatim
+To: jimmey@lexis-nexis.com (Jimmey Todd)
+Subject: Re: Flex performance question
+In-reply-to: Your message of Thu, 19 Feb 1998 11:01:17 PST.
+Date: Thu, 19 Feb 1998 15:42:25 PST
+From: Vern Paxson <vern>
+
+So there are several problems.
+
+First, to go fast, you want to match as much text as possible, which
+your scanners don't in the case that what they're scanning is *not*
+a <RN> tag. So you want a rule like:
+
+ [^<]+
+
+Second, C++ scanners are particularly slow if they're interactive,
+which they are by default. Using -B speeds it up by a factor of 3-4
+on my workstation.
+
+Third, C++ scanners that use the istream interface are slow, because
+of how poorly implemented istream's are. I built two versions of
+the following scanner:
+
+ %%
+ .*\n
+ .*
+ %%
+
+and the C version inhales a 2.5MB file on my workstation in 0.8 seconds.
+The C++ istream version, using -B, takes 3.8 seconds.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-76
+@unnumberedsec unnamed-faq-76
+@example
+@verbatim
+To: "Frescatore, David (CRD, TAD)" <frescatore@exc01crdge.crd.ge.com>
+Subject: Re: FLEX 2.5 & THE YEAR 2000
+In-reply-to: Your message of Wed, 03 Jun 1998 11:26:22 PDT.
+Date: Wed, 03 Jun 1998 10:22:26 PDT
+From: Vern Paxson <vern>
+
+> I am researching the Y2K problem with General Electric R&D
+> and need to know if there are any known issues concerning
+> the above mentioned software and Y2K regardless of version.
+
+There shouldn't be, all it ever does with the date is ask the system
+for it and then print it out.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-77
+@unnumberedsec unnamed-faq-77
+@example
+@verbatim
+To: "Hans Dermot Doran" <htd@ibhdoran.com>
+Subject: Re: flex problem
+In-reply-to: Your message of Wed, 15 Jul 1998 21:30:13 PDT.
+Date: Tue, 21 Jul 1998 14:23:34 PDT
+From: Vern Paxson <vern>
+
+> To overcome this, I gets() the stdin into a string and lex the string. The
+> string is lexed OK except that the end of string isn't lexed properly
+> (yy_scan_string()), that is the lexer dosn't recognise the end of string.
+
+Flex doesn't contain mechanisms for recognizing buffer endpoints. But if
+you use fgets instead (which you should anyway, to protect against buffer
+overflows), then the final \n will be preserved in the string, and you can
+scan that in order to find the end of the string.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-78
+@unnumberedsec unnamed-faq-78
+@example
+@verbatim
+To: soumen@almaden.ibm.com
+Subject: Re: Flex++ 2.5.3 instance member vs. static member
+In-reply-to: Your message of Mon, 27 Jul 1998 02:10:04 PDT.
+Date: Tue, 28 Jul 1998 01:10:34 PDT
+From: Vern Paxson <vern>
+
+> %{
+> int mylineno = 0;
+> %}
+> ws [ \t]+
+> alpha [A-Za-z]
+> dig [0-9]
+> %%
+>
+> Now you'd expect mylineno to be a member of each instance of class
+> yyFlexLexer, but is this the case? A look at the lex.yy.cc file seems to
+> indicate otherwise; unless I am missing something the declaration of
+> mylineno seems to be outside any class scope.
+>
+> How will this work if I want to run a multi-threaded application with each
+> thread creating a FlexLexer instance?
+
+Derive your own subclass and make mylineno a member variable of it.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-79
+@unnumberedsec unnamed-faq-79
+@example
+@verbatim
+To: Adoram Rogel <adoram@hybridge.com>
+Subject: Re: More than 32K states change hangs
+In-reply-to: Your message of Tue, 04 Aug 1998 16:55:39 PDT.
+Date: Tue, 04 Aug 1998 22:28:45 PDT
+From: Vern Paxson <vern>
+
+> Vern Paxson,
+>
+> I followed your advice, posted on Usenet bu you, and emailed to me
+> personally by you, on how to overcome the 32K states limit. I'm running
+> on Linux machines.
+> I took the full source of version 2.5.4 and did the following changes in
+> flexdef.h:
+> #define JAMSTATE -327660
+> #define MAXIMUM_MNS 319990
+> #define BAD_SUBSCRIPT -327670
+> #define MAX_SHORT 327000
+>
+> and compiled.
+> All looked fine, including check and bigcheck, so I installed.
+
+Hmmm, you shouldn't increase MAX_SHORT, though looking through my email
+archives I see that I did indeed recommend doing so. Try setting it back
+to 32700; that should suffice that you no longer need -Ca. If it still
+hangs, then the interesting question is - where?
+
+> Compiling the same hanged program with a out-of-the-box (RedHat 4.2
+> distribution of Linux)
+> flex 2.5.4 binary works.
+
+Since Linux comes with source code, you should diff it against what
+you have to see what problems they missed.
+
+> Should I always compile with the -Ca option now ? even short and simple
+> filters ?
+
+No, definitely not. It's meant to be for those situations where you
+absolutely must squeeze every last cycle out of your scanner.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-80
+@unnumberedsec unnamed-faq-80
+@example
+@verbatim
+To: "Schmackpfeffer, Craig" <Craig.Schmackpfeffer@usa.xerox.com>
+Subject: Re: flex output for static code portion
+In-reply-to: Your message of Tue, 11 Aug 1998 11:55:30 PDT.
+Date: Mon, 17 Aug 1998 23:57:42 PDT
+From: Vern Paxson <vern>
+
+> I would like to use flex under the hood to generate a binary file
+> containing the data structures that control the parse.
+
+This has been on the wish-list for a long time. In principle it's
+straight-forward - you redirect mkdata() et al's I/O to another file,
+and modify the skeleton to have a start-up function that slurps these
+into dynamic arrays. The concerns are (1) the scanner generation code
+is hairy and full of corner cases, so it's easy to get surprised when
+going down this path :-( ; and (2) being careful about buffering so
+that when the tables change you make sure the scanner starts in the
+correct state and reading at the right point in the input file.
+
+> I was wondering if you know of anyone who has used flex in this way.
+
+I don't - but it seems like a reasonable project to undertake (unlike
+numerous other flex tweaks :-).
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-81
+@unnumberedsec unnamed-faq-81
+@example
+@verbatim
+Received: from 131.173.17.11 (131.173.17.11 [131.173.17.11])
+ by ee.lbl.gov (8.9.1/8.9.1) with ESMTP id AAA03838
+ for <vern@ee.lbl.gov>; Thu, 20 Aug 1998 00:47:57 -0700 (PDT)
+Received: from hal.cl-ki.uni-osnabrueck.de (hal.cl-ki.Uni-Osnabrueck.DE [131.173.141.2])
+ by deimos.rz.uni-osnabrueck.de (8.8.7/8.8.8) with ESMTP id JAA34694
+ for <vern@ee.lbl.gov>; Thu, 20 Aug 1998 09:47:55 +0200
+Received: (from georg@localhost) by hal.cl-ki.uni-osnabrueck.de (8.6.12/8.6.12) id JAA34834 for vern@ee.lbl.gov; Thu, 20 Aug 1998 09:47:54 +0200
+From: Georg Rehm <georg@hal.cl-ki.uni-osnabrueck.de>
+Message-Id: <199808200747.JAA34834@hal.cl-ki.uni-osnabrueck.de>
+Subject: "flex scanner push-back overflow"
+To: vern@ee.lbl.gov
+Date: Thu, 20 Aug 1998 09:47:54 +0200 (MEST)
+Reply-To: Georg.Rehm@CL-KI.Uni-Osnabrueck.DE
+X-NoJunk: Do NOT send commercial mail, spam or ads to this address!
+X-URL: http://www.cl-ki.uni-osnabrueck.de/~georg/
+X-Mailer: ELM [version 2.4ME+ PL28 (25)]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=US-ASCII
+Content-Transfer-Encoding: 7bit
+
+Hi Vern,
+
+Yesterday, I encountered a strange problem: I use the macro processor m4
+to include some lengthy lists into a .l file. Following is a flex macro
+definition that causes some serious pain in my neck:
+
+AUTHOR ("A. Boucard / L. Boucard"|"A. Dastarac / M. Levent"|"A.Boucaud / L.Boucaud"|"Abderrahim Lamchichi"|"Achmat Dangor"|"Adeline Toullier"|"Adewale Maja-Pearce"|"Ahmed Ziri"|"Akram Ellyas"|"Alain Bihr"|"Alain Gresh"|"Alain Guillemoles"|"Alain Joxe"|"Alain Morice"|"Alain Renon"|"Alain Zecchini"|"Albert Memmi"|"Alberto Manguel"|"Alex De Waal"|"Alfonso Artico"| [...])
+
+The complete list contains about 10kB. When I try to "flex" this file
+(on a Solaris 2.6 machine, using a modified flex 2.5.4 (I only increased
+some of the predefined values in flexdefs.h) I get the error:
+
+myflex/flex -8 sentag.tmp.l
+flex scanner push-back overflow
+
+When I remove the slashes in the macro definition everything works fine.
+As I understand it, the double quotes escape the slash-character so it
+really means "/" and not "trailing context". Furthermore, I tried to
+escape the slashes with backslashes, but with no use, the same error message
+appeared when flexing the code.
+
+Do you have an idea what's going on here?
+
+Greetings from Germany,
+ Georg
+--
+Georg Rehm georg@cl-ki.uni-osnabrueck.de
+Institute for Semantic Information Processing, University of Osnabrueck, FRG
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-82
+@unnumberedsec unnamed-faq-82
+@example
+@verbatim
+To: Georg.Rehm@CL-KI.Uni-Osnabrueck.DE
+Subject: Re: "flex scanner push-back overflow"
+In-reply-to: Your message of Thu, 20 Aug 1998 09:47:54 PDT.
+Date: Thu, 20 Aug 1998 07:05:35 PDT
+From: Vern Paxson <vern>
+
+> myflex/flex -8 sentag.tmp.l
+> flex scanner push-back overflow
+
+Flex itself uses a flex scanner. That scanner is running out of buffer
+space when it tries to unput() the humongous macro you've defined. When
+you remove the '/'s, you make it small enough so that it fits in the buffer;
+removing spaces would do the same thing.
+
+The fix is to either rethink how come you're using such a big macro and
+perhaps there's another/better way to do it; or to rebuild flex's own
+scan.c with a larger value for
+
+ #define YY_BUF_SIZE 16384
+
+- Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-83
+@unnumberedsec unnamed-faq-83
+@example
+@verbatim
+To: Jan Kort <jan@research.techforce.nl>
+Subject: Re: Flex
+In-reply-to: Your message of Fri, 04 Sep 1998 12:18:43 +0200.
+Date: Sat, 05 Sep 1998 00:59:49 PDT
+From: Vern Paxson <vern>
+
+> %%
+>
+> "TEST1\n" { fprintf(stderr, "TEST1\n"); yyless(5); }
+> ^\n { fprintf(stderr, "empty line\n"); }
+> . { }
+> \n { fprintf(stderr, "new line\n"); }
+>
+> %%
+> -- input ---------------------------------------
+> TEST1
+> -- output --------------------------------------
+> TEST1
+> empty line
+> ------------------------------------------------
+
+IMHO, it's not clear whether or not this is in fact a bug. It depends
+on whether you view yyless() as backing up in the input stream, or as
+pushing new characters onto the beginning of the input stream. Flex
+interprets it as the latter (for implementation convenience, I'll admit),
+and so considers the newline as in fact matching at the beginning of a
+line, as after all the last token scanned an entire line and so the
+scanner is now at the beginning of a new line.
+
+I agree that this is counter-intuitive for yyless(), given its
+functional description (it's less so for unput(), depending on whether
+you're unput()'ing new text or scanned text). But I don't plan to
+change it any time soon, as it's a pain to do so. Consequently,
+you do indeed need to use yy_set_bol() and YY_AT_BOL() to tweak
+your scanner into the behavior you desire.
+
+Sorry for the less-than-completely-satisfactory answer.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-84
+@unnumberedsec unnamed-faq-84
+@example
+@verbatim
+To: Patrick Krusenotto <krusenot@mac-info-link.de>
+Subject: Re: Problems with restarting flex-2.5.2-generated scanner
+In-reply-to: Your message of Thu, 24 Sep 1998 10:14:07 PDT.
+Date: Thu, 24 Sep 1998 23:28:43 PDT
+From: Vern Paxson <vern>
+
+> I am using flex-2.5.2 and bison 1.25 for Solaris and I am desperately
+> trying to make my scanner restart with a new file after my parser stops
+> with a parse error. When my compiler restarts, the parser always
+> receives the token after the token (in the old file!) that caused the
+> parser error.
+
+I suspect the problem is that your parser has read ahead in order
+to attempt to resolve an ambiguity, and when it's restarted it picks
+up with that token rather than reading a fresh one. If you're using
+yacc, then the special "error" production can sometimes be used to
+consume tokens in an attempt to get the parser into a consistent state.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-85
+@unnumberedsec unnamed-faq-85
+@example
+@verbatim
+To: Henric Jungheim <junghelh@pe-nelson.com>
+Subject: Re: flex 2.5.4a
+In-reply-to: Your message of Tue, 27 Oct 1998 16:41:42 PST.
+Date: Tue, 27 Oct 1998 16:50:14 PST
+From: Vern Paxson <vern>
+
+> This brings up a feature request: How about a command line
+> option to specify the filename when reading from stdin? That way one
+> doesn't need to create a temporary file in order to get the "#line"
+> directives to make sense.
+
+Use -o combined with -t (per the man page description of -o).
+
+> P.S., Is there any simple way to use non-blocking IO to parse multiple
+> streams?
+
+Simple, no.
+
+One approach might be to return a magic character on EWOULDBLOCK and
+have a rule
+
+ .*<magic-character> // put back .*, eat magic character
+
+This is off the top of my head, not sure it'll work.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-86
+@unnumberedsec unnamed-faq-86
+@example
+@verbatim
+To: "Repko, Billy D" <billy.d.repko@intel.com>
+Subject: Re: Compiling scanners
+In-reply-to: Your message of Wed, 13 Jan 1999 10:52:47 PST.
+Date: Thu, 14 Jan 1999 00:25:30 PST
+From: Vern Paxson <vern>
+
+> It appears that maybe it cannot find the lfl library.
+
+The Makefile in the distribution builds it, so you should have it.
+It's exceedingly trivial, just a main() that calls yylex() and
+a yyrap() that always returns 1.
+
+> %%
+> \n ++num_lines; ++num_chars;
+> . ++num_chars;
+
+You can't indent your rules like this - that's where the errors are coming
+from. Flex copies indented text to the output file, it's how you do things
+like
+
+ int num_lines_seen = 0;
+
+to declare local variables.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-87
+@unnumberedsec unnamed-faq-87
+@example
+@verbatim
+To: Erick Branderhorst <Erick.Branderhorst@asml.nl>
+Subject: Re: flex input buffer
+In-reply-to: Your message of Tue, 09 Feb 1999 13:53:46 PST.
+Date: Tue, 09 Feb 1999 21:03:37 PST
+From: Vern Paxson <vern>
+
+> In the flex.skl file the size of the default input buffers is set. Can you
+> explain why this size is set and why it is such a high number.
+
+It's large to optimize performance when scanning large files. You can
+safely make it a lot lower if needed.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-88
+@unnumberedsec unnamed-faq-88
+@example
+@verbatim
+To: "Guido Minnen" <guidomi@cogs.susx.ac.uk>
+Subject: Re: Flex error message
+In-reply-to: Your message of Wed, 24 Feb 1999 15:31:46 PST.
+Date: Thu, 25 Feb 1999 00:11:31 PST
+From: Vern Paxson <vern>
+
+> I'm extending a larger scanner written in Flex and I keep running into
+> problems. More specifically, I get the error message:
+> "flex: input rules are too complicated (>= 32000 NFA states)"
+
+Increase the definitions in flexdef.h for:
+
+ #define JAMSTATE -32766 /* marks a reference to the state that always j
+ams */
+ #define MAXIMUM_MNS 31999
+ #define BAD_SUBSCRIPT -32767
+
+recompile everything, and it should all work.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-89
+@unnumberedsec unnamed-faq-89
+@example
+@verbatim
+To: John Victor J <vjohn@its.soft.net>
+Subject: Re: flex---is thread safe
+In-reply-to: Your message of Sun, 23 May 1999 12:56:56 +0530.
+Date: Sun, 23 May 1999 00:32:53 PDT
+From: Vern Paxson <vern>
+
+> I would like to know whether flex is thread safe???
+
+I take it you mean the scanners it generates and not flex itself.
+
+The answer is (still) No, except if you use the -+ option to generate
+a C++ scanning class (and if your stream library is thread-safe).
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-90
+@unnumberedsec unnamed-faq-90
+@example
+@verbatim
+To: "Dmitriy Goldobin" <gold@ems.chel.su>
+Subject: Re: FLEX trouble
+In-reply-to: Your message of Mon, 31 May 1999 18:44:49 PDT.
+Date: Tue, 01 Jun 1999 00:15:07 PDT
+From: Vern Paxson <vern>
+
+> I have a trouble with FLEX. Why rule "/*".*"*/" work properly,=20
+> but rule "/*"(.|\n)*"*/" don't work ?
+
+The second of these will have to scan the entire input stream (because
+"(.|\n)*" matches an arbitrary amount of any text) in order to see if
+it ends with "*/", terminating the comment. That potentially will overflow
+the input buffer.
+
+> More complex rule "/*"([^*]|(\*/[^/]))*"*/ give an error
+> 'unrecognized rule'.
+
+You can't use the '/' operator inside parentheses. It's not clear
+what "(a/b)*" actually means.
+
+> I now use workaround with state <comment>, but single-rule is
+> better, i think.
+
+Single-rule is nice but will always have the problem of either setting
+restrictions on comments (like not allowing multi-line comments) and/or
+running the risk of consuming the entire input stream, as noted above.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-91
+@unnumberedsec unnamed-faq-91
+@example
+@verbatim
+Received: from mc-qout4.whowhere.com (mc-qout4.whowhere.com [209.185.123.18])
+ by ee.lbl.gov (8.9.3/8.9.3) with SMTP id IAA05100
+ for <vern@ee.lbl.gov>; Tue, 15 Jun 1999 08:56:06 -0700 (PDT)
+Received: from Unknown/Local ([?.?.?.?]) by my-deja.com; Tue Jun 15 08:55:43 1999
+To: vern@ee.lbl.gov
+Date: Tue, 15 Jun 1999 08:55:43 -0700
+From: "Aki Niimura" <neko@my-deja.com>
+Message-ID: <KNONDOHDOBGAEAAA@my-deja.com>
+Mime-Version: 1.0
+Cc:
+X-Sent-Mail: on
+Reply-To:
+X-Mailer: MailCity Service
+Subject: A question on flex C++ scanner
+X-Sender-Ip: 12.72.207.61
+Organization: My Deja Email (http://www.my-deja.com:80)
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+
+Dear Dr. Paxon,
+
+I have been using flex for years.
+It works very well on many projects.
+Most case, I used it to generate a scanner on C language.
+However, one project I needed to generate a scanner
+on C++ lanuage. Thanks to your enhancement, flex did
+the job.
+
+Currently, I'm working on enhancing my previous project.
+I need to deal with multiple input streams (recursive
+inclusion) in this scanner (C++).
+I did similar thing for another scanner (C) as you
+explained in your documentation.
+
+The generated scanner (C++) has necessary methods:
+- switch_to_buffer(struct yy_buffer_state *b)
+- yy_create_buffer(istream *is, int sz)
+- yy_delete_buffer(struct yy_buffer_state *b)
+
+However, I couldn't figure out how to access current
+buffer (yy_current_buffer).
+
+yy_current_buffer is a protected member of yyFlexLexer.
+I can't access it directly.
+Then, I thought yy_create_buffer() with is = 0 might
+return current stream buffer. But it seems not as far
+as I checked the source. (flex 2.5.4)
+
+I went through the Web in addition to Flex documentation.
+However, it hasn't been successful, so far.
+
+It is not my intention to bother you, but, can you
+comment about how to obtain the current stream buffer?
+
+Your response would be highly appreciated.
+
+Best regards,
+Aki Niimura
+
+
+
+--== Sent via Deja.com http://www.deja.com/ ==--
+Share what you know. Learn what you don't.
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-92
+@unnumberedsec unnamed-faq-92
+@example
+@verbatim
+To: neko@my-deja.com
+Subject: Re: A question on flex C++ scanner
+In-reply-to: Your message of Tue, 15 Jun 1999 08:55:43 PDT.
+Date: Tue, 15 Jun 1999 09:04:24 PDT
+From: Vern Paxson <vern>
+
+> However, I couldn't figure out how to access current
+> buffer (yy_current_buffer).
+
+Derive your own subclass from yyFlexLexer.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-93
+@unnumberedsec unnamed-faq-93
+@example
+@verbatim
+To: "Stones, Darren" <Darren.Stones@nectech.co.uk>
+Subject: Re: You're the man to see?
+In-reply-to: Your message of Wed, 23 Jun 1999 11:10:29 PDT.
+Date: Wed, 23 Jun 1999 09:01:40 PDT
+From: Vern Paxson <vern>
+
+> I hope you can help me. I am using Flex and Bison to produce an interpreted
+> language. However all goes well until I try to implement an IF statement or
+> a WHILE. I cannot get this to work as the parser parses all the conditions
+> eg. the TRUE and FALSE conditons to check for a rule match. So I cannot
+> make a decision!!
+
+You need to use the parser to build a parse tree (= abstract syntax trwee),
+and when that's all done you recursively evaluate the tree, binding variables
+to values at that time.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-94
+@unnumberedsec unnamed-faq-94
+@example
+@verbatim
+To: Petr Danecek <petr@ics.cas.cz>
+Subject: Re: flex - question
+In-reply-to: Your message of Mon, 28 Jun 1999 19:21:41 PDT.
+Date: Fri, 02 Jul 1999 16:52:13 PDT
+From: Vern Paxson <vern>
+
+> file, it takes an enormous amount of time. It is funny, because the
+> source code has only 12 rules!!! I think it looks like an exponencial
+> growth.
+
+Right, that's the problem - some patterns (those with a lot of
+ambiguity, where yours has because at any given time the scanner can
+be in the middle of all sorts of combinations of the different
+rules) blow up exponentially.
+
+For your rules, there is an easy fix. Change the ".*" that comes fater
+the directory name to "[^ ]*". With that in place, the rules are no
+longer nearly so ambiguous, because then once one of the directories
+has been matched, no other can be matched (since they all require a
+leading blank).
+
+If that's not an acceptable solution, then you can enter a start state
+to pick up the .*\n after each directory is matched.
+
+Also note that for speed, you'll want to add a ".*" rule at the end,
+otherwise rules that don't match any of the patterns will be matched
+very slowly, a character at a time.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-95
+@unnumberedsec unnamed-faq-95
+@example
+@verbatim
+To: Tielman Koekemoer <tielman@spi.co.za>
+Subject: Re: Please help.
+In-reply-to: Your message of Thu, 08 Jul 1999 13:20:37 PDT.
+Date: Thu, 08 Jul 1999 08:20:39 PDT
+From: Vern Paxson <vern>
+
+> I was hoping you could help me with my problem.
+>
+> I tried compiling (gnu)flex on a Solaris 2.4 machine
+> but when I ran make (after configure) I got an error.
+>
+> --------------------------------------------------------------
+> gcc -c -I. -I. -g -O parse.c
+> ./flex -t -p ./scan.l >scan.c
+> sh: ./flex: not found
+> *** Error code 1
+> make: Fatal error: Command failed for target `scan.c'
+> -------------------------------------------------------------
+>
+> What's strange to me is that I'm only
+> trying to install flex now. I then edited the Makefile to
+> and changed where it says "FLEX = flex" to "FLEX = lex"
+> ( lex: the native Solaris one ) but then it complains about
+> the "-p" option. Is there any way I can compile flex without
+> using flex or lex?
+>
+> Thanks so much for your time.
+
+You managed to step on the bootstrap sequence, which first copies
+initscan.c to scan.c in order to build flex. Try fetching a fresh
+distribution from ftp.ee.lbl.gov. (Or you can first try removing
+".bootstrap" and doing a make again.)
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-96
+@unnumberedsec unnamed-faq-96
+@example
+@verbatim
+To: Tielman Koekemoer <tielman@spi.co.za>
+Subject: Re: Please help.
+In-reply-to: Your message of Fri, 09 Jul 1999 09:16:14 PDT.
+Date: Fri, 09 Jul 1999 00:27:20 PDT
+From: Vern Paxson <vern>
+
+> First I removed .bootstrap (and ran make) - no luck. I downloaded the
+> software but I still have the same problem. Is there anything else I
+> could try.
+
+Try:
+
+ cp initscan.c scan.c
+ touch scan.c
+ make scan.o
+
+If this last tries to first build scan.c from scan.l using ./flex, then
+your "make" is broken, in which case compile scan.c to scan.o by hand.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-97
+@unnumberedsec unnamed-faq-97
+@example
+@verbatim
+To: Sumanth Kamenani <skamenan@crl.nmsu.edu>
+Subject: Re: Error
+In-reply-to: Your message of Mon, 19 Jul 1999 23:08:41 PDT.
+Date: Tue, 20 Jul 1999 00:18:26 PDT
+From: Vern Paxson <vern>
+
+> I am getting a compilation error. The error is given as "unknown symbol- yylex".
+
+The parser relies on calling yylex(), but you're instead using the C++ scanning
+class, so you need to supply a yylex() "glue" function that calls an instance
+scanner of the scanner (e.g., "scanner->yylex()").
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-98
+@unnumberedsec unnamed-faq-98
+@example
+@verbatim
+To: daniel@synchrods.synchrods.COM (Daniel Senderowicz)
+Subject: Re: lex
+In-reply-to: Your message of Mon, 22 Nov 1999 11:19:04 PST.
+Date: Tue, 23 Nov 1999 15:54:30 PST
+From: Vern Paxson <vern>
+
+Well, your problem is the
+
+ switch (yybgin-yysvec-1) { /* witchcraft */
+
+at the beginning of lex rules. "witchcraft" == "non-portable". It's
+assuming knowledge of the AT&T lex's internal variables.
+
+For flex, you can probably do the equivalent using a switch on YYSTATE.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-99
+@unnumberedsec unnamed-faq-99
+@example
+@verbatim
+To: archow@hss.hns.com
+Subject: Re: Regarding distribution of flex and yacc based grammars
+In-reply-to: Your message of Sun, 19 Dec 1999 17:50:24 +0530.
+Date: Wed, 22 Dec 1999 01:56:24 PST
+From: Vern Paxson <vern>
+
+> When we provide the customer with an object code distribution, is it
+> necessary for us to provide source
+> for the generated C files from flex and bison since they are generated by
+> flex and bison ?
+
+For flex, no. I don't know what the current state of this is for bison.
+
+> Also, is there any requrirement for us to neccessarily provide source for
+> the grammar files which are fed into flex and bison ?
+
+Again, for flex, no.
+
+See the file "COPYING" in the flex distribution for the legalese.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-100
+@unnumberedsec unnamed-faq-100
+@example
+@verbatim
+To: Martin Gallwey <gallweym@hyperion.moe.ul.ie>
+Subject: Re: Flex, and self referencing rules
+In-reply-to: Your message of Sun, 20 Feb 2000 01:01:21 PST.
+Date: Sat, 19 Feb 2000 18:33:16 PST
+From: Vern Paxson <vern>
+
+> However, I do not use unput anywhere. I do use self-referencing
+> rules like this:
+>
+> UnaryExpr ({UnionExpr})|("-"{UnaryExpr})
+
+You can't do this - flex is *not* a parser like yacc (which does indeed
+allow recursion), it is a scanner that's confined to regular expressions.
+
+ Vern
+@end verbatim
+@end example
+
+@c TODO: Evaluate this faq.
+@node unnamed-faq-101
+@unnumberedsec unnamed-faq-101
+@example
+@verbatim
+To: slg3@lehigh.edu (SAMUEL L. GULDEN)
+Subject: Re: Flex problem
+In-reply-to: Your message of Thu, 02 Mar 2000 12:29:04 PST.
+Date: Thu, 02 Mar 2000 23:00:46 PST
+From: Vern Paxson <vern>
+
+If this is exactly your program:
+
+> digit [0-9]
+> digits {digit}+
+> whitespace [ \t\n]+
+>
+> %%
+> "[" { printf("open_brac\n");}
+> "]" { printf("close_brac\n");}
+> "+" { printf("addop\n");}
+> "*" { printf("multop\n");}
+> {digits} { printf("NUMBER = %s\n", yytext);}
+> whitespace ;
+
+then the problem is that the last rule needs to be "{whitespace}" !
+
+ Vern
+@end verbatim
+@end example
+