summaryrefslogtreecommitdiff
path: root/ACE/apps/gperf/gperf.info
diff options
context:
space:
mode:
Diffstat (limited to 'ACE/apps/gperf/gperf.info')
-rw-r--r--ACE/apps/gperf/gperf.info1129
1 files changed, 1129 insertions, 0 deletions
diff --git a/ACE/apps/gperf/gperf.info b/ACE/apps/gperf/gperf.info
new file mode 100644
index 00000000000..49793970903
--- /dev/null
+++ b/ACE/apps/gperf/gperf.info
@@ -0,0 +1,1129 @@
+This is ../../../apps/gperf/gperf.info, produced by makeinfo version
+4.3 from ../../../apps/gperf/gperf.texi.
+
+INFO-DIR-SECTION GNU programming tools
+START-INFO-DIR-ENTRY
+* Gperf: (gperf). Perfect Hash Function Generator.
+END-INFO-DIR-ENTRY
+
+ This file documents the features of the GNU Perfect Hash Function
+Generator
+
+ Copyright (C) 1989 Free Software Foundation, Inc.
+
+ Permission is granted to make and distribute verbatim copies of this
+manual provided the copyright notice and this permission notice are
+preserved on all copies.
+
+ Permission is granted to copy and distribute modified versions of
+this manual under the conditions for verbatim copying, provided also
+that the section entitled "GNU General Public License" is included
+exactly as in the original, and provided that the entire resulting
+derived work is distributed under the terms of a permission notice
+identical to this one.
+
+ Permission is granted to copy and distribute translations of this
+manual into another language, under the above conditions for modified
+versions, except that the section entitled "GNU `gperf' General Public
+License" an d this permission notice may be included in translations
+approved by the Free Software Foundation instead of in the original
+English.
+
+
+File: gperf.info, Node: Top, Next: Copying, Prev: (dir), Up: (dir)
+
+GNU GPERF Utility
+*****************
+
+Introduction
+************
+
+ This manual documents the GNU `gperf' perfect hash function generator
+utility, focusing on its features and how to use them, and how to report
+bugs.
+
+* Menu:
+
+* Copying:: GNU `gperf' General Public License says
+ how you can copy and share `gperf'.
+* Contributors:: People who have contributed to `gperf'.
+* Motivation:: Static search structures and GNU GPERF.
+* Search Structures:: Static search structures and GNU `gperf'
+* Description:: High-level discussion of how GPERF functions.
+* Options:: A description of options to the program.
+* Bugs:: Known bugs and limitations with GPERF.
+* Projects:: Things still left to do.
+* Implementation:: Implementation Details for GNU GPERF.
+* Bibliography:: Material Referenced in this Report.
+
+ --- The Detailed Node Listing ---
+
+High-Level Description of GNU `gperf'
+
+* Input Format:: Input Format to `gperf'
+* Output Format:: Output Format for Generated C Code with `gperf'
+
+Input Format to `gperf'
+
+* Declarations:: `struct' Declarations and C Code Inclusion.
+* Keywords:: Format for Keyword Entries.
+* Functions:: Including Additional C Functions.
+
+
+File: gperf.info, Node: Copying, Next: Contributors, Prev: Top, Up: Top
+
+GNU GENERAL PUBLIC LICENSE
+**************************
+
+ Version 1, February 1989
+ Copyright (C) 1989 Free Software Foundation, Inc.
+ 675 Mass Ave, Cambridge, MA 02139, USA
+
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+Preamble
+========
+
+ The license agreements of most software companies try to keep users
+at the mercy of those companies. By contrast, our General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. The
+General Public License applies to the Free Software Foundation's
+software and to any other program whose authors commit to using it.
+You can use it for your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Specifically, the General Public License is designed to make
+sure that you have the freedom to give away or sell copies of free
+software, that you receive source code or can get it if you want it,
+that you can change the software or use pieces of it in new free
+programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of a such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must tell them their rights.
+
+ We protect your rights with two steps: (1) copyright the software,
+and (2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ TERMS AND CONDITIONS
+ 1. This License Agreement applies to any program or other work which
+ contains a notice placed by the copyright holder saying it may be
+ distributed under the terms of this General Public License. The
+ "Program", below, refers to any such program or work, and a "work
+ based on the Program" means either the Program or any work
+ containing the Program or a portion of it, either verbatim or with
+ modifications. Each licensee is addressed as "you".
+
+ 2. You may copy and distribute verbatim copies of the Program's source
+ code as you receive it, in any medium, provided that you
+ conspicuously and appropriately publish on each copy an
+ appropriate copyright notice and disclaimer of warranty; keep
+ intact all the notices that refer to this General Public License
+ and to the absence of any warranty; and give any other recipients
+ of the Program a copy of this General Public License along with
+ the Program. You may charge a fee for the physical act of
+ transferring a copy.
+
+ 3. You may modify your copy or copies of the Program or any portion of
+ it, and copy and distribute such modifications under the terms of
+ Paragraph 1 above, provided that you also do the following:
+
+ * cause the modified files to carry prominent notices stating
+ that you changed the files and the date of any change; and
+
+ * cause the whole of any work that you distribute or publish,
+ that in whole or in part contains the Program or any part
+ thereof, either with or without modifications, to be licensed
+ at no charge to all third parties under the terms of this
+ General Public License (except that you may choose to grant
+ warranty protection to some or all third parties, at your
+ option).
+
+ * If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the simplest and most usual way, to print
+ or display an announcement including an appropriate copyright
+ notice and a notice that there is no warranty (or else,
+ saying that you provide a warranty) and that users may
+ redistribute the program under these conditions, and telling
+ the user how to view a copy of this General Public License.
+
+ * You may charge a fee for the physical act of transferring a
+ copy, and you may at your option offer warranty protection in
+ exchange for a fee.
+
+ Mere aggregation of another independent work with the Program (or
+ its derivative) on a volume of a storage or distribution medium
+ does not bring the other work under the scope of these terms.
+
+ 4. You may copy and distribute the Program (or a portion or
+ derivative of it, under Paragraph 2) in object code or executable
+ form under the terms of Paragraphs 1 and 2 above provided that you
+ also do one of the following:
+
+ * accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of
+ Paragraphs 1 and 2 above; or,
+
+ * accompany it with a written offer, valid for at least three
+ years, to give any third party free (except for a nominal
+ charge for the cost of distribution) a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Paragraphs 1 and 2 above; or,
+
+ * accompany it with the information you received as to where the
+ corresponding source code may be obtained. (This alternative
+ is allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form alone.)
+
+ Source code for a work means the preferred form of the work for
+ making modifications to it. For an executable file, complete
+ source code means all the source code for all modules it contains;
+ but, as a special exception, it need not include source code for
+ modules which are standard libraries that accompany the operating
+ system on which the executable file runs, or for standard header
+ files or definitions files that accompany that operating system.
+
+ 5. You may not copy, modify, sublicense, distribute or transfer the
+ Program except as expressly provided under this General Public
+ License. Any attempt otherwise to copy, modify, sublicense,
+ distribute or transfer the Program is void, and will automatically
+ terminate your rights to use the Program under this License.
+ However, parties who have received copies, or rights to use
+ copies, from you under this General Public License will not have
+ their licenses terminated so long as such parties remain in full
+ compliance.
+
+ 6. By copying, distributing or modifying the Program (or any work
+ based on the Program) you indicate your acceptance of this license
+ to do so, and all its terms and conditions.
+
+ 7. Each time you redistribute the Program (or any work based on the
+ Program), the recipient automatically receives a license from the
+ original licensor to copy, distribute or modify the Program
+ subject to these terms and conditions. You may not impose any
+ further restrictions on the recipients' exercise of the rights
+ granted herein.
+
+ 8. The Free Software Foundation may publish revised and/or new
+ versions of the General Public License from time to time. Such
+ new versions will be similar in spirit to the present version, but
+ may differ in detail to address new problems or concerns.
+
+ Each version is given a distinguishing version number. If the
+ Program specifies a version number of the license which applies to
+ it and "any later version", you have the option of following the
+ terms and conditions either of that version or of any later
+ version published by the Free Software Foundation. If the Program
+ does not specify a version number of the license, you may choose
+ any version ever published by the Free Software Foundation.
+
+ 9. If you wish to incorporate parts of the Program into other free
+ programs whose distribution conditions are different, write to the
+ author to ask for permission. For software which is copyrighted
+ by the Free Software Foundation, write to the Free Software
+ Foundation; we sometimes make exceptions for this. Our decision
+ will be guided by the two goals of preserving the free status of
+ all derivatives of our free software and of promoting the sharing
+ and reuse of software generally.
+
+ NO WARRANTY
+
+ 10. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO
+ WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE
+ LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+ HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT
+ WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT
+ NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
+ FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE
+ QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+ PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY
+ SERVICING, REPAIR OR CORRECTION.
+
+ 11. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
+ WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY
+ MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE
+ LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL,
+ INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR
+ INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+ DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU
+ OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY
+ OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
+ ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+Appendix: How to Apply These Terms to Your New Programs
+=======================================================
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to humanity, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these
+terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least the
+"copyright" line and a pointer to where the full notice is found.
+
+ ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
+ Copyright (C) 19YY NAME OF AUTHOR
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 1, or (at your option)
+ any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+ Also add information on how to contact you by electronic and paper
+mail.
+
+ If the program is interactive, make it output a short notice like
+this when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) 19YY NAME OF AUTHOR
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+ The hypothetical commands `show w' and `show c' should show the
+appropriate parts of the General Public License. Of course, the
+commands you use may be called something other than `show w' and `show
+c'; they could even be mouse-clicks or menu items--whatever suits your
+program.
+
+ You should also get your employer (if you work as a programmer) or
+your school, if any, to sign a "copyright disclaimer" for the program,
+if necessary. Here a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the
+ program `Gnomovision' (a program to direct compilers to make passes
+ at assemblers) written by James Hacker.
+
+ SIGNATURE OF TY COON, 1 April 1989
+ Ty Coon, President of Vice
+
+ That's all there is to it!
+
+
+File: gperf.info, Node: Contributors, Next: Motivation, Prev: Copying, Up: Top
+
+Contributors to GNU `gperf' Utility
+***********************************
+
+ * The GNU `gperf' perfect hash function generator utility was
+ originally written in GNU C++ by Douglas C. Schmidt. It is now
+ also available in a highly-portable "old-style" C version. The
+ general idea for the perfect hash function generator was inspired
+ by Keith Bostic's algorithm written in C, and distributed to
+ net.sources around 1984. The current program is a heavily
+ modified, enhanced, and extended implementation of Keith's basic
+ idea, created at the University of California, Irvine. Bugs,
+ patches, and suggestions should be reported to schmidt at
+ ics.uci.edu.
+
+ * Special thanks is extended to Michael Tiemann and Doug Lea, for
+ providing a useful compiler, and for giving me a forum to exhibit
+ my creation.
+
+ In addition, Adam de Boor and Nels Olson provided many tips and
+ insights that greatly helped improve the quality and functionality
+ of `gperf'.
+
+
+File: gperf.info, Node: Motivation, Next: Search Structures, Prev: Contributors, Up: Top
+
+Introduction
+************
+
+ `gperf' is a perfect hash function generator written in C++. It
+transforms an _n_ element user-specified keyword set _W_ into a perfect
+hash function _F_. _F_ uniquely maps keywords in _W_ onto the range
+0.._k_, where _k_ >= _n_. If _k = n_ then _F_ is a _minimal_ perfect
+hash function. `gperf' generates a 0.._k_ element static lookup table
+and a pair of C functions. These functions determine whether a given
+character string _s_ occurs in _W_, using at most one probe into the
+lookup table.
+
+ `gperf' currently generates the reserved keyword recognizer for
+lexical analyzers in several production and research compilers and
+language processing tools, including GNU C, GNU C++, GNU Pascal, GNU
+Modula 3, and GNU indent. Complete C++ source code for `gperf' is
+available via anonymous ftp from ics.uci.edu. `gperf' also is
+distributed along with the GNU libg++ library. A highly portable,
+functionally equivalent K&R C version of `gperf' is archived in
+comp.sources.unix, volume 20. Finally, a paper describing `gperf''s
+design and implementation in greater detail is available in the Second
+USENIX C++ Conference proceedings.
+
+
+File: gperf.info, Node: Search Structures, Next: Description, Prev: Motivation, Up: Top
+
+Static search structures and GNU `gperf'
+****************************************
+
+ A "static search structure" is an Abstract Data Type with certain
+fundamental operations, _e.g._, _initialize_, _insert_, and _retrieve_.
+Conceptually, all insertions occur before any retrievals. In
+practice, `gperf' generates a `static' array containing search set
+keywords and any associated attributes specified by the user. Thus,
+there is essentially no execution-time cost for the insertions. It is
+a useful data structure for representing _static search sets_. Static
+search sets occur frequently in software system applications. Typical
+static search sets include compiler reserved words, assembler
+instruction opcodes, and built-in shell interpreter commands. Search
+set members, called "keywords", are inserted into the structure only
+once, usually during program initialization, and are not generally
+modified at run-time.
+
+ Numerous static search structure implementations exist, _e.g._,
+arrays, linked lists, binary search trees, digital search tries, and
+hash tables. Different approaches offer trade-offs between space
+utilization and search time efficiency. For example, an _n_ element
+sorted array is space efficient, though the average-case time
+complexity for retrieval operations using binary search is proportional
+to log _n_. Conversely, hash table implementations often locate a
+table entry in constant time, but typically impose additional memory
+overhead and exhibit poor worst case performance.
+
+ _Minimal perfect hash functions_ provide an optimal solution for a
+particular class of static search sets. A minimal perfect hash
+function is defined by two properties:
+
+ * It allows keyword recognition in a static search set using at most
+ _one_ probe into the hash table. This represents the "perfect"
+ property.
+
+ * The actual memory allocated to store the keywords is precisely
+ large enough for the keyword set, and _no larger_. This is the
+ "minimal" property.
+
+ For most applications it is far easier to generate _perfect_ hash
+functions than _minimal perfect_ hash functions. Moreover, non-minimal
+perfect hash functions frequently execute faster than minimal ones in
+practice. This phenomena occurs since searching a sparse keyword table
+increases the probability of locating a "null" entry, thereby reducing
+string comparisons. `gperf''s default behavior generates
+_near-minimal_ perfect hash functions for keyword sets. However,
+`gperf' provides many options that permit user control over the degree
+of minimality and perfection.
+
+ Static search sets often exhibit relative stability over time. For
+example, Ada's 63 reserved words have remained constant for nearly a
+decade. It is therefore frequently worthwhile to expend concerted
+effort building an optimal search structure _once_, if it subsequently
+receives heavy use multiple times. `gperf' removes the drudgery
+associated with constructing time- and space-efficient search
+structures by hand. It has proven a useful and practical tool for
+serious programming projects. Output from `gperf' is currently used in
+several production and research compilers, including GNU C, GNU C++,
+GNU Pascal, and GNU Modula 3. The latter two compilers are not yet
+part of the official GNU distribution. Each compiler utilizes `gperf'
+to automatically generate static search structures that efficiently
+identify their respective reserved keywords.
+
+
+File: gperf.info, Node: Description, Next: Options, Prev: Search Structures, Up: Top
+
+High-Level Description of GNU `gperf'
+*************************************
+
+* Menu:
+
+* Input Format:: Input Format to `gperf'
+* Output Format:: Output Format for Generated C Code with `gperf'
+
+ The perfect hash function generator `gperf' reads a set of
+"keywords" from a "keyfile" (or from the standard input by default).
+It attempts to derive a perfect hashing function that recognizes a
+member of the "static keyword set" with at most a single probe into the
+lookup table. If `gperf' succeeds in generating such a function it
+produces a pair of C source code routines that perform hashing and
+table lookup recognition. All generated C code is directed to the
+standard output. Command-line options described below allow you to
+modify the input and output format to `gperf'.
+
+ By default, `gperf' attempts to produce time-efficient code, with
+less emphasis on efficient space utilization. However, several options
+exist that permit trading-off execution time for storage space and vice
+versa. In particular, expanding the generated table size produces a
+sparse search structure, generally yielding faster searches.
+Conversely, you can direct `gperf' to utilize a C `switch' statement
+scheme that minimizes data space storage size. Furthermore, using a C
+`switch' may actually speed up the keyword retrieval time somewhat.
+Actual results depend on your C compiler, of course.
+
+ In general, `gperf' assigns values to the characters it is using for
+hashing until some set of values gives each keyword a unique value. A
+helpful heuristic is that the larger the hash value range, the easier
+it is for `gperf' to find and generate a perfect hash function.
+Experimentation is the key to getting the most from `gperf'.
+
+
+File: gperf.info, Node: Input Format, Next: Output Format, Prev: Description, Up: Description
+
+Input Format to `gperf'
+=======================
+
+ You can control the input keyfile format by varying certain
+command-line arguments, in particular the `-t' option. The input's
+appearance is similar to GNU utilities `flex' and `bison' (or UNIX
+utilities `lex' and `yacc'). Here's an outline of the general format:
+
+ declarations
+ %%
+ keywords
+ %%
+ functions
+
+ _Unlike_ `flex' or `bison', all sections of `gperf''s input are
+optional. The following sections describe the input format for each
+section.
+
+* Menu:
+
+* Declarations:: `struct' Declarations and C Code Inclusion.
+* Keywords:: Format for Keyword Entries.
+* Functions:: Including Additional C Functions.
+
+
+File: gperf.info, Node: Declarations, Next: Keywords, Prev: Input Format, Up: Input Format
+
+`struct' Declarations and C Code Inclusion
+------------------------------------------
+
+ The keyword input file optionally contains a section for including
+arbitrary C declarations and definitions, as well as provisions for
+providing a user-supplied `struct'. If the `-t' option _is_ enabled,
+you _must_ provide a C `struct' as the last component in the
+declaration section from the keyfile file. The first field in this
+struct must be a `char *' identifier called "name," although it is
+possible to modify this field's name with the `-K' option described
+below.
+
+ Here is simple example, using months of the year and their
+attributes as input:
+
+ struct months { char *name; int number; int days; int leap_days; };
+ %%
+ january, 1, 31, 31
+ february, 2, 28, 29
+ march, 3, 31, 31
+ april, 4, 30, 30
+ may, 5, 31, 31
+ june, 6, 30, 30
+ july, 7, 31, 31
+ august, 8, 31, 31
+ september, 9, 30, 30
+ october, 10, 31, 31
+ november, 11, 30, 30
+ december, 12, 31, 31
+
+ Separating the `struct' declaration from the list of key words and
+other fields are a pair of consecutive percent signs, `%%', appearing
+left justified in the first column, as in the UNIX utility `lex'.
+
+ Using a syntax similar to GNU utilities `flex' and `bison', it is
+possible to directly include C source text and comments verbatim into
+the generated output file. This is accomplished by enclosing the region
+inside left-justified surrounding `%{', `%}' pairs. Here is an input
+fragment based on the previous example that illustrates this feature:
+
+ %{
+ #include <assert.h>
+ /* This section of code is inserted directly into the output. */
+ int return_month_days (struct months *months, int is_leap_year);
+ %}
+ struct months { char *name; int number; int days; int leap_days; };
+ %%
+ january, 1, 31, 31
+ february, 2, 28, 29
+ march, 3, 31, 31
+ ...
+
+ It is possible to omit the declaration section entirely. In this
+case the keyfile begins directly with the first keyword line, _e.g._:
+
+ january, 1, 31, 31
+ february, 2, 28, 29
+ march, 3, 31, 31
+ april, 4, 30, 30
+ ...
+
+
+File: gperf.info, Node: Keywords, Next: Functions, Prev: Declarations, Up: Input Format
+
+Format for Keyword Entries
+--------------------------
+
+ The second keyfile format section contains lines of keywords and any
+associated attributes you might supply. A line beginning with `#' in
+the first column is considered a comment. Everything following the `#'
+is ignored, up to and including the following newline.
+
+ The first field of each non-comment line is always the key itself.
+It should be given as a simple name, _i.e._, without surrounding string
+quotation marks, and be left-justified flush against the first column.
+In this context, a "field" is considered to extend up to, but not
+include, the first blank, comma, or newline. Here is a simple example
+taken from a partial list of C reserved words:
+
+ # These are a few C reserved words, see the c.`gperf' file
+ # for a complete list of ANSI C reserved words.
+ unsigned
+ sizeof
+ switch
+ signed
+ if
+ default
+ for
+ while
+ return
+
+ Note that unlike `flex' or `bison' the first `%%' marker may be
+elided if the declaration section is empty.
+
+ Additional fields may optionally follow the leading keyword. Fields
+should be separated by commas, and terminate at the end of line. What
+these fields mean is entirely up to you; they are used to initialize the
+elements of the user-defined `struct' provided by you in the
+declaration section. If the `-t' option is _not_ enabled these fields
+are simply ignored. All previous examples except the last one contain
+keyword attributes.
+
+
+File: gperf.info, Node: Functions, Prev: Keywords, Up: Input Format
+
+Including Additional C Functions
+--------------------------------
+
+ The optional third section also corresponds closely with conventions
+found in `flex' and `bison'. All text in this section, starting at the
+final `%%' and extending to the end of the input file, is included
+verbatim into the generated output file. Naturally, it is your
+responsibility to ensure that the code contained in this section is
+valid C.
+
+
+File: gperf.info, Node: Output Format, Prev: Input Format, Up: Description
+
+Output Format for Generated C Code with `gperf'
+===============================================
+
+ Several options control how the generated C code appears on the
+standard output. Two C function are generated. They are called `hash'
+and `in_word_set', although you may modify the name for `in_word_set'
+with a command-line option. Both functions require two arguments, a
+string, `char *' STR, and a length parameter, `int' LEN. Their default
+function prototypes are as follows:
+
+ static int hash (char *str, int len);
+ int in_word_set (char *str, int len);
+
+ By default, the generated `hash' function returns an integer value
+created by adding LEN to several user-specified STR key positions
+indexed into an "associated values" table stored in a local static
+array. The associated values table is constructed internally by
+`gperf' and later output as a static local C array called HASH_TABLE;
+its meaning and properties are described below. *Note
+Implementation::. The relevant key positions are specified via the `-k'
+option when running `gperf', as detailed in the _Options_ section
+below. *Note Options::.
+
+ Two options, `-g' (assume you are compiling with GNU C and its
+`inline' feature) and `-a' (assume ANSI C-style function prototypes),
+alter the content of both the generated `hash' and `in_word_set'
+routines. However, function `in_word_set' may be modified more
+extensively, in response to your option settings. The options that
+affect the `in_word_set' structure are:
+
+ `-p'
+ Have function `in_word_set' return a pointer rather than a
+ boolean.
+
+ `-t'
+ Make use of the user-defined `struct'.
+
+ `-S TOTAL SWITCH STATEMENTS'
+ Generate 1 or more C `switch' statement rather than use a
+ large, (and potentially sparse) static array. Although the
+ exact time and space savings of this approach vary according
+ to your C compiler's degree of optimization, this method
+ often results in smaller and faster code.
+
+ If the `-t', `-S', and `-p' options are omitted the default action
+is to generate a `char *' array containing the keys, together with
+additional null strings used for padding the array. By experimenting
+with the various input and output options, and timing the resulting C
+code, you can determine the best option choices for different keyword
+set characteristics.
+
+
+File: gperf.info, Node: Options, Next: Bugs, Prev: Description, Up: Top
+
+Options to the `gperf' Utility
+******************************
+
+ There are _many_ options to `gperf'. They were added to make the
+program more convenient for use with real applications. "On-line" help
+is readily available via the `-h' option. Other options include:
+
+ `-a'
+ Generate ANSI Standard C code using function prototypes. The
+ default is to use "classic" K&R C function declaration syntax.
+
+ `-c'
+ Generates C code that uses the `strncmp' function to perform
+ string comparisons. The default action is to use `strcmp'.
+
+ `-C'
+ Makes the contents of all generated lookup tables constant,
+ _i.e._, "readonly." Many compilers can generate more
+ efficient code for this by putting the tables in readonly
+ memory.
+
+ `-d'
+ Enables the debugging option. This produces verbose
+ diagnostics to "standard error" when `gperf' is executing.
+ It is useful both for maintaining the program and for
+ determining whether a given set of options is actually
+ speeding up the search for a solution. Some useful
+ information is dumped at the end of the program when the `-d'
+ option is enabled.
+
+ `-D'
+ Handle keywords whose key position sets hash to duplicate
+ values. Duplicate hash values occur for two reasons:
+
+ * Since `gperf' does not backtrack it is possible for it
+ to process all your input keywords without finding a
+ unique mapping for each word. However, frequently only
+ a very small number of duplicates occur, and the
+ majority of keys still require one probe into the table.
+
+ * Sometimes a set of keys may have the same names, but
+ possess different attributes. With the -D option
+ `gperf' treats all these keys as part of an equivalence
+ class and generates a perfect hash function with multiple
+ comparisons for duplicate keys. It is up to you to
+ completely disambiguate the keywords by modifying the
+ generated C code. However, `gperf' helps you out by
+ organizing the output.
+
+ Option `-D' is extremely useful for certain large or highly
+ redundant keyword sets, _i.e._, assembler instruction opcodes.
+ Using this option usually means that the generated hash
+ function is no longer perfect. On the other hand, it permits
+ `gperf' to work on keyword sets that it otherwise could not
+ handle.
+
+ `-e KEYWORD DELIMITER LIST'
+ Allows the user to provide a string containing delimiters
+ used to separate keywords from their attributes. The default
+ is ",\n". This option is essential if you want to use
+ keywords that have embedded commas or newlines. One useful
+ trick is to use -e'TAB', where TAB is the literal tab
+ character.
+
+ `-E'
+ Define constant values using an enum local to the lookup
+ function rather than with #defines. This also means that
+ different lookup functions can reside in the same file.
+ Thanks to James Clark (jjc at ai.mit.edu).
+
+ `-f ITERATION AMOUNT'
+ Generate the perfect hash function "fast." This decreases
+ `gperf''s running time at the cost of minimizing generated
+ table-size. The iteration amount represents the number of
+ times to iterate when resolving a collision. `0' means
+ `iterate by the number of keywords. This option is probably
+ most useful when used in conjunction with options `-D' and/or
+ `-S' for _large_ keyword sets.
+
+ `-g'
+ Assume a GNU compiler, _e.g._, `g++' or `gcc'. This makes
+ all generated routines use the "inline" keyword to remove the
+ cost of function calls. Note that `-g' does _not_ imply
+ `-a', since other non-ANSI C compilers may have provisions
+ for a function `inline' feature.
+
+ `-G'
+ Generate the static table of keywords as a static global
+ variable, rather than hiding it inside of the lookup function
+ (which is the default behavior).
+
+ `-h'
+ Prints a short summary on the meaning of each program option.
+ Aborts further program execution.
+
+ `-H HASH FUNCTION NAME'
+ Allows you to specify the name for the generated hash
+ function. Default name is `hash.' This option permits the
+ use of two hash tables in the same file.
+
+ `-i INITIAL VALUE'
+ Provides an initial VALUE for the associate values array.
+ Default is 0. Increasing the initial value helps inflate the
+ final table size, possibly leading to more time efficient
+ keyword lookups. Note that this option is not particularly
+ useful when `-S' is used. Also, `-i' is overriden when the
+ `-r' option is used.
+
+ `-j JUMP VALUE'
+ Affects the "jump value," _i.e._, how far to advance the
+ associated character value upon collisions. JUMP VALUE is
+ rounded up to an odd number, the default is 5. If the JUMP
+ VALUE is 0 `gper f' jumps by random amounts.
+
+ `-k KEYS'
+ Allows selection of the character key positions used in the
+ keywords' hash function. The allowable choices range between
+ 1-126, inclusive. The positions are separated by commas,
+ _e.g._, `-k 9,4,13,14'; ranges may be used, _e.g._, `-k 2-7';
+ and positions may occur in any order. Furthermore, the
+ meta-character '*' causes the generated hash function to
+ consider *all* character positions in each key, whereas '$'
+ instructs the hash function to use the "final character" of a
+ key (this is the only way to use a character position greater
+ than 126, incidentally).
+
+ For instance, the option `-k 1,2,4,6-10,'$'' generates a hash
+ function that considers positions 1,2,4,6,7,8,9,10, plus the
+ last character in each key (which may differ for each key,
+ obviously). Keys with length less than the indicated key
+ positions work properly, since selected key positions
+ exceeding the key length are simply not referenced in the
+ hash function.
+
+ `-K KEY NAME'
+ By default, the program assumes the structure component
+ identifier for the keyword is "name." This option allows an
+ arbitrary choice of identifier for this component, although
+ it still must occur as the first field in your supplied
+ `struct'.
+
+ `-l'
+ Compare key lengths before trying a string comparison. This
+ might cut down on the number of string comparisons made
+ during the lookup, since keys with different lengths are
+ never compared via `strcmp'. However, using `-l' might
+ greatly increase the size of the generated C code if the
+ lookup table range is large (which implies that the switch
+ option `-S' is not enabled), since the length table contains
+ as many elements as there are entries in the lookup table.
+
+ `-L GENERATED LANGUAGE NAME'
+ Instructs `gperf' to generate code in the language specified
+ by the option's argument. Languages handled are currently
+ C++ and C. The default is C.
+
+ `-n'
+ Instructs the generator not to include the length of a
+ keyword when computing its hash value. This may save a few
+ assembly instructions in the generated lookup table.
+
+ `-N LOOKUP FUNCTION NAME'
+ Allows you to specify the name for the generated lookup
+ function. Default name is `in_word_set.' This option
+ permits completely automatic generation of perfect hash
+ functions, especially when multiple generated hash functions
+ are used in the same application.
+
+ `-o'
+ Reorders the keywords by sorting the keywords so that
+ frequently occuring key position set components appear first.
+ A second reordering pass follows so that keys with "already
+ determined values" are placed towards the front of the
+ keylist. This may decrease the time required to generate a
+ perfect hash function for many keyword sets, and also produce
+ more minimal perfect hash functions. The reason for this is
+ that the reordering helps prune the search time by handling
+ inevitable collisions early in the search process. On the
+ other hand, if the number of keywords is _very_ large using
+ `-o' may _increase_ `gperf''s execution time, since
+ collisions will begin earlier and continue throughout the
+ remainder of keyword processing. See Cichelli's paper from
+ the January 1980 Communications of the ACM for details.
+
+ `-p'
+ Changes the return value of the generated function
+ `in_word_set' from boolean (_i.e._, 0 or 1), to either type
+ "pointer to user-defined struct," (if the `-t' option is
+ enabled), or simply to `char *', if `-t' is not enabled.
+ This option is most useful when the `-t' option (allowing
+ user-defined structs) is used. For example, it is possible
+ to automatically generate the GNU C reserved word lookup
+ routine with the options `-p' and `-t'.
+
+ `-r'
+ Utilizes randomness to initialize the associated values
+ table. This frequently generates solutions faster than using
+ deterministic initialization (which starts all associated
+ values at 0). Furthermore, using the randomization option
+ generally increases the size of the table. If `gperf' has
+ difficultly with a certain keyword set try using `-r' or `-D'.
+
+ `-s SIZE-MULTIPLE'
+ Affects the size of the generated hash table. The numeric
+ argument for this option indicates "how many times larger or
+ smaller" the maximum associated value range should be, in
+ relationship to the number of keys. If the SIZE-MULTIPLE is
+ negative the maximum associated value is calculated by
+ _dividing_ it into the total number of keys. For example, a
+ value of 3 means "allow the maximum associated value to be
+ about 3 times larger than the number of input keys."
+
+ Conversely, a value of -3 means "allow the maximum associated
+ value to be about 3 times smaller than the number of input
+ keys." Negative values are useful for limiting the overall
+ size of the generated hash table, though this usually
+ increases the number of duplicate hash values.
+
+ If `generate switch' option `-S' is _not_ enabled, the maximum
+ associated value influences the static array table size, and
+ a larger table should decrease the time required for an
+ unsuccessful search, at the expense of extra table space.
+
+ The default value is 1, thus the default maximum associated
+ value about the same size as the number of keys (for
+ efficiency, the maximum associated value is always rounded up
+ to a power of 2). The actual table size may vary somewhat,
+ since this technique is essentially a heuristic. In
+ particular, setting this value too high slows down `gperf''s
+ runtime, since it must search through a much larger range of
+ values. Judicious use of the `-f' option helps alleviate this
+ overhead, however.
+
+ `-S TOTAL SWITCH STATEMENTS'
+ Causes the generated C code to use a `switch' statement
+ scheme, rather than an array lookup table. This can lead to
+ a reduction in both time and space requirements for some
+ keyfiles. The argument to this option determines how many
+ `switch' statements are generated. A value of 1 generates 1
+ `switch' containing all the elements, a value of 2 generates
+ 2 tables with 1/2 the elements in each `switch', etc. This
+ is useful since many C compilers cannot correctly generate
+ code for large `switch' statements. This option was inspired
+ in part by Keith Bostic's original C program.
+
+ `-t'
+ Allows you to include a `struct' type declaration for
+ generated code. Any text before a pair of consecutive %% is
+ consider part of the type declaration. Key words and
+ additional fields may follow this, one group of fields per
+ line. A set of examples for generating perfect hash tables
+ and functions for Ada, C, and G++, Pascal, and Modula 2 and 3
+ reserved words are distributed with this release.
+
+ `-T'
+ Prevents the transfer of the type declaration to the output
+ file. Use this option if the type is already defined
+ elsewhere.
+
+ `-v'
+ Prints out the current version number.
+
+ `-Z CLASS NAME'
+ Allow user to specify name of generated C++ class. Default
+ name is `Perfect_Hash'.
+
+
+File: gperf.info, Node: Bugs, Next: Projects, Prev: Options, Up: Top
+
+Known Bugs and Limitations with `gperf'
+***************************************
+
+ The following are some limitations with the current release of
+`gperf':
+
+ * The `gperf' utility is tuned to execute quickly, and works quickly
+ for small to medium size data sets (around 1000 keywords). It is
+ extremely useful for maintaining perfect hash functions for
+ compiler keyword sets. Several recent enhancements now enable
+ `gperf' to work efficiently on much larger keyword sets (over
+ 15,000 keywords). When processing large keyword sets it helps
+ greatly to have over 8 megs of RAM.
+
+ However, since `gperf' does not backtrack no guaranteed solution
+ occurs on every run. On the other hand, it is usually easy to
+ obtain a solution by varying the option parameters. In
+ particular, try the `-r' option, and also try changing the default
+ arguments to the `-s' and `-j' options. To _guarantee_ a
+ solution, use the `-D' and `-S' options, although the final
+ results are not likely to be a _perfect_ hash function anymore!
+ Finally, use the `-f' option if you want `gperf' to generate the
+ perfect hash function _fast_, with less emphasis on making it
+ minimal.
+
+ * The size of the generate static keyword array can get _extremely_
+ large if the input keyword file is large or if the keywords are
+ quite similar. This tends to slow down the compilation of the
+ generated C code, and _greatly_ inflates the object code size. If
+ this situation occurs, consider using the `-S' option to reduce
+ data size, potentially increasing keyword recognition time a
+ negligible amount. Since many C compilers cannot correctly
+ generated code for large switch statements it is important to
+ qualify the -S option with an appropriate numerical argument that
+ controls the number of switch statements generated.
+
+ * The maximum number of key positions selected for a given key has an
+ arbitrary limit of 126. This restriction should be removed, and if
+ anyone considers this a problem write me and let me know so I can
+ remove the constraint.
+
+ * The C++ source code only compiles correctly with GNU G++, version
+ 1.36 (and hopefully later versions). Porting to AT&T cfront would
+ be tedious, but possible (and desirable). There is also a K&R C
+ version available now. This should compile without change on most
+ BSD systems, but may require a bit of work to run on SYSV, since
+ `gperf' uses ALLOCA in several places. Send mail to schmidt at
+ ics.uci.edu for information.
+
+
+File: gperf.info, Node: Projects, Next: Implementation, Prev: Bugs, Up: Top
+
+Things Still Left to Do
+***********************
+
+ It should be "relatively" easy to replace the current perfect hash
+function algorithm with a more exhaustive approach; the perfect hash
+module is essential independent from other program modules. Additional
+worthwhile improvements include:
+
+ * Make the algorithm more robust. At present, the program halts
+ with an error diagnostic if it can't find a direct solution and
+ the `-D' option is not enabled. A more comprehensive, albeit
+ computationally expensive, approach would employ backtracking or
+ enable alternative options and retry. It's not clear how helpful
+ this would be, in general, since most search sets are rather small
+ in practice.
+
+ * Another useful extension involves modifying the program to generate
+ "minimal" perfect hash functions (under certain circumstances, the
+ current version can be rather extravagant in the generated table
+ size). Again, this is mostly of theoretical interest, since a
+ sparse table often produces faster lookups, and use of the `-S'
+ `switch' option can minimize the data size, at the expense of
+ slightly longer lookups (note that the gcc compiler generally
+ produces good code for `switch' statements, reducing the need for
+ more complex schemes).
+
+ * In addition to improving the algorithm, it would also be useful to
+ generate a C++ class or Ada package as the code output, in
+ addition to the current C routines.
+
+
+File: gperf.info, Node: Implementation, Next: Bibliography, Prev: Projects, Up: Top
+
+Implementation Details of GNU `gperf'
+*************************************
+
+ A paper describing the high-level description of the data structures
+and algorithms used to implement `gperf' will soon be available. This
+paper is useful not only from a maintenance and enhancement perspective,
+but also because they demonstrate several clever and useful programming
+techniques, _e.g._, `Iteration Number' boolean arrays, double hashing,
+a "safe" and efficient method for reading arbitrarily long input from a
+file, and a provably optimal algorithm for simultaneously determining
+both the minimum and maximum elements in a list.
+
+
+File: gperf.info, Node: Bibliography, Prev: Implementation, Up: Top
+
+Bibliography
+************
+
+ [1] Chang, C.C.: A Scheme for Constructing Ordered Minimal Perfect
+Hashing Functions Information Sciences 39(1986), 187-195.
+
+ [2] Cichelli, Richard J. Author's Response to "On Cichelli's Minimal
+Perfec t Hash Functions Method" Communications of the ACM, 23,
+12(December 1980), 729.
+
+ [3] Cichelli, Richard J. Minimal Perfect Hash Functions Made Simple
+Communications of the ACM, 23, 1(January 1980), 17-19.
+
+ [4] Cook, C. R. and Oldehoeft, R.R. A Letter Oriented Minimal
+Perfect Hashing Function SIGPLAN Notices, 17, 9(September 1982), 18-27.
+
+ [5] Cormack, G. V. and Horspool, R. N. S. and Kaiserwerth, M.
+Practical Perfect Hashing Computer Journal, 28, 1(January 1985), 54-58.
+
+ [6] Jaeschke, G. Reciprocal Hashing: A Method for Generating Minimal
+Perfect Hashing Functions Communications of the ACM, 24, 12(December
+1981), 829-833.
+
+ [7] Jaeschke, G. and Osterburg, G. On Cichelli's Minimal Perfect
+Hash Functions Method Communications of the ACM, 23, 12(December 1980),
+728-729.
+
+ [8] Sager, Thomas J. A Polynomial Time Generator for Minimal Perfect
+Hash Functions Communications of the ACM, 28, 5(December 1985), 523-532
+
+ [9] Schmidt, Douglas C. GPERF: A Perfect Hash Function Generator
+Second USENIX C++ Conference Proceedings, April 1990.
+
+ [10] Sebesta, R.W. and Taylor, M.A. Minimal Perfect Hash Functions
+for Reserved Word Lists SIGPLAN Notices, 20, 12(September 1985), 47-53.
+
+ [11] Sprugnoli, R. Perfect Hashing Functions: A Single Probe
+Retrieving Method for Static Sets Communications of the ACM, 20
+11(November 1977), 841-850.
+
+ [12] Stallman, Richard M. Using and Porting GNU CC Free Software
+Foundation, 1988.
+
+ [13] Stroustrup, Bjarne The C++ Programming Language.
+Addison-Wesley, 1986.
+
+ [14] Tiemann, Michael D. User's Guide to GNU C++ Free Software
+Foundation, 1989.
+
+
+
+Tag Table:
+Node: Top1277
+Node: Copying2566
+Node: Contributors15867
+Node: Motivation16967
+Node: Search Structures18234
+Node: Description21787
+Node: Input Format23607
+Node: Declarations24402
+Node: Keywords26709
+Node: Functions28300
+Node: Output Format28794
+Node: Options31264
+Node: Bugs44634
+Node: Projects47321
+Node: Implementation48898
+Node: Bibliography49617
+
+End Tag Table