author    Arnold D. Robbins <arnold@skeeve.com>  2016-10-26 21:52:49 +0300
committer Arnold D. Robbins <arnold@skeeve.com>  2016-10-26 21:52:49 +0300
commit    627c1b8f9913547703c7c53b0716b913f327a402 (patch)
tree      f34e72451f571b4be5640e18ed9df054d4c3fff3
parent    e404706d5e2ea41229fe5be9b0725202f49bf308 (diff)
parent    e5abd6a16d42fc0f42277919a2d0a2c28476788c (diff)
download  gawk-627c1b8f9913547703c7c53b0716b913f327a402.tar.gz
Merge branch 'master' into feature/typed-regex
-rw-r--r--  .gitignore             2
-rw-r--r--  doc/gawk.info      35781
-rw-r--r--  doc/gawkinet.info   4406
3 files changed, 40187 insertions, 2 deletions
diff --git a/.gitignore b/.gitignore
index 937a497b..72445191 100644
--- a/.gitignore
+++ b/.gitignore
@@ -16,5 +16,3 @@ gawk
stamp-h1
test/fmtspcl.ok
-
-doc/*.info
diff --git a/doc/gawk.info b/doc/gawk.info
new file mode 100644
index 00000000..b8ab365a
--- /dev/null
+++ b/doc/gawk.info
@@ -0,0 +1,35781 @@
+This is gawk.info, produced by makeinfo version 6.1 from gawk.texi.
+
+Copyright (C) 1989, 1991, 1992, 1993, 1996-2005, 2007, 2009-2016
+Free Software Foundation, Inc.
+
+
+ This is Edition 4.1 of 'GAWK: Effective AWK Programming: A User's
+Guide for GNU Awk', for the 4.1.4 (or later) version of the GNU
+implementation of AWK.
+
+ Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License", with the
+Front-Cover Texts being "A GNU Manual", and with the Back-Cover Texts as
+in (a) below. A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+ a. The FSF's Back-Cover Text is: "You have the freedom to copy and
+ modify this GNU manual."
+INFO-DIR-SECTION Text creation and manipulation
+START-INFO-DIR-ENTRY
+* Gawk: (gawk). A text scanning and processing language.
+END-INFO-DIR-ENTRY
+
+INFO-DIR-SECTION Individual utilities
+START-INFO-DIR-ENTRY
+* awk: (gawk)Invoking gawk. Text scanning and processing.
+END-INFO-DIR-ENTRY
+
+
+File: gawk.info, Node: Top, Next: Foreword3, Up: (dir)
+
+General Introduction
+********************
+
+This file documents 'awk', a program that you can use to select
+particular records in a file and perform operations upon them.
+
+ Copyright (C) 1989, 1991, 1992, 1993, 1996-2005, 2007, 2009-2016
+Free Software Foundation, Inc.
+
+
+ This is Edition 4.1 of 'GAWK: Effective AWK Programming: A User's
+Guide for GNU Awk', for the 4.1.4 (or later) version of the GNU
+implementation of AWK.
+
+ Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License", with the
+Front-Cover Texts being "A GNU Manual", and with the Back-Cover Texts as
+in (a) below. A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+ a. The FSF's Back-Cover Text is: "You have the freedom to copy and
+ modify this GNU manual."
+
+* Menu:
+
+* Foreword3:: Some nice words about this
+ Info file.
+* Foreword4:: More nice words.
+* Preface:: What this Info file is about; brief
+ history and acknowledgments.
+* Getting Started:: A basic introduction to using
+ 'awk'. How to run an 'awk'
+ program. Command-line syntax.
+* Invoking Gawk:: How to run 'gawk'.
+* Regexp:: All about matching things using regular
+ expressions.
+* Reading Files:: How to read files and manipulate fields.
+* Printing:: How to print using 'awk'. Describes
+ the 'print' and 'printf'
+ statements. Also describes redirection of
+ output.
+* Expressions:: Expressions are the basic building blocks
+ of statements.
+* Patterns and Actions:: Overviews of patterns and actions.
+* Arrays:: The description and use of arrays. Also
+ includes array-oriented control statements.
+* Functions:: Built-in and user-defined functions.
+* Library Functions:: A Library of 'awk' Functions.
+* Sample Programs:: Many 'awk' programs with complete
+ explanations.
+* Advanced Features:: Stuff for advanced users, specific to
+ 'gawk'.
+* Internationalization:: Getting 'gawk' to speak your
+ language.
+* Debugger:: The 'gawk' debugger.
+* Arbitrary Precision Arithmetic:: Arbitrary precision arithmetic with
+ 'gawk'.
+* Dynamic Extensions:: Adding new built-in functions to
+ 'gawk'.
+* Language History:: The evolution of the 'awk'
+ language.
+* Installation:: Installing 'gawk' under various
+ operating systems.
+* Notes:: Notes about adding things to 'gawk'
+ and possible future work.
+* Basic Concepts:: A very quick introduction to programming
+ concepts.
+* Glossary:: An explanation of some unfamiliar terms.
+* Copying:: Your right to copy and distribute
+ 'gawk'.
+* GNU Free Documentation License:: The license for this Info file.
+* Index:: Concept and Variable Index.
+
+* History:: The history of 'gawk' and
+ 'awk'.
+* Names:: What name to use to find
+ 'awk'.
+* This Manual:: Using this Info file. Includes
+ sample input files that you can use.
+* Conventions:: Typographical Conventions.
+* Manual History:: Brief history of the GNU project and
+ this Info file.
+* How To Contribute:: Helping to save the world.
+* Acknowledgments:: Acknowledgments.
+* Running gawk:: How to run 'gawk' programs;
+ includes command-line syntax.
+* One-shot:: Running a short throwaway
+ 'awk' program.
+* Read Terminal:: Using no input files (input from the
+ keyboard instead).
+* Long:: Putting permanent 'awk'
+ programs in files.
+* Executable Scripts:: Making self-contained 'awk'
+ programs.
+* Comments:: Adding documentation to 'gawk'
+ programs.
+* Quoting:: More discussion of shell quoting
+ issues.
+* DOS Quoting:: Quoting in Windows Batch Files.
+* Sample Data Files:: Sample data files for use in the
+ 'awk' programs illustrated in
+ this Info file.
+* Very Simple:: A very simple example.
+* Two Rules:: A less simple one-line example using
+ two rules.
+* More Complex:: A more complex example.
+* Statements/Lines:: Subdividing or combining statements
+ into lines.
+* Other Features:: Other Features of 'awk'.
+* When:: When to use 'gawk' and when to
+ use other things.
+* Intro Summary:: Summary of the introduction.
+* Command Line:: How to run 'awk'.
+* Options:: Command-line options and their
+ meanings.
+* Other Arguments:: Input file names and variable
+ assignments.
+* Naming Standard Input:: How to specify standard input with
+ other files.
+* Environment Variables:: The environment variables
+ 'gawk' uses.
+* AWKPATH Variable:: Searching directories for
+ 'awk' programs.
+* AWKLIBPATH Variable:: Searching directories for
+ 'awk' shared libraries.
+* Other Environment Variables:: The environment variables.
+* Exit Status:: 'gawk''s exit status.
+* Include Files:: Including other files into your
+ program.
+* Loading Shared Libraries:: Loading shared libraries into your
+ program.
+* Obsolete:: Obsolete Options and/or features.
+* Undocumented:: Undocumented Options and Features.
+* Invoking Summary:: Invocation summary.
+* Regexp Usage:: How to Use Regular Expressions.
+* Escape Sequences:: How to write nonprinting characters.
+* Regexp Operators:: Regular Expression Operators.
+* Bracket Expressions:: What can go between '[...]'.
+* Leftmost Longest:: How much text matches.
+* Computed Regexps:: Using Dynamic Regexps.
+* GNU Regexp Operators:: Operators specific to GNU software.
+* Case-sensitivity:: How to do case-insensitive matching.
+* Strong Regexp Constants:: Strongly typed regexp constants.
+* Regexp Summary:: Regular expressions summary.
+* Records:: Controlling how data is split into
+ records.
+* awk split records:: How standard 'awk' splits
+ records.
+* gawk split records:: How 'gawk' splits records.
+* Fields:: An introduction to fields.
+* Nonconstant Fields:: Nonconstant Field Numbers.
+* Changing Fields:: Changing the Contents of a Field.
+* Field Separators:: The field separator and how to change
+ it.
+* Default Field Splitting:: How fields are normally separated.
+* Regexp Field Splitting:: Using regexps as the field separator.
+* Single Character Fields:: Making each character a separate
+ field.
+* Command Line Field Separator:: Setting 'FS' from the command
+ line.
+* Full Line Fields:: Making the full line be a single
+ field.
+* Field Splitting Summary:: Some final points and a summary table.
+* Constant Size:: Reading constant width data.
+* Splitting By Content:: Defining Fields By Content
+* Multiple Line:: Reading multiline records.
+* Getline:: Reading files under explicit program
+ control using the 'getline'
+ function.
+* Plain Getline:: Using 'getline' with no
+ arguments.
+* Getline/Variable:: Using 'getline' into a variable.
+* Getline/File:: Using 'getline' from a file.
+* Getline/Variable/File:: Using 'getline' into a variable
+ from a file.
+* Getline/Pipe:: Using 'getline' from a pipe.
+* Getline/Variable/Pipe:: Using 'getline' into a variable
+ from a pipe.
+* Getline/Coprocess:: Using 'getline' from a coprocess.
+* Getline/Variable/Coprocess:: Using 'getline' into a variable
+ from a coprocess.
+* Getline Notes:: Important things to know about
+ 'getline'.
+* Getline Summary:: Summary of 'getline' Variants.
+* Read Timeout:: Reading input with a timeout.
+* Retrying Input:: Retrying input after certain errors.
+* Command-line directories:: What happens if you put a directory on
+ the command line.
+* Input Summary:: Input summary.
+* Input Exercises:: Exercises.
+* Print:: The 'print' statement.
+* Print Examples:: Simple examples of 'print'
+ statements.
+* Output Separators:: The output separators and how to
+ change them.
+* OFMT:: Controlling Numeric Output With
+ 'print'.
+* Printf:: The 'printf' statement.
+* Basic Printf:: Syntax of the 'printf' statement.
+* Control Letters:: Format-control letters.
+* Format Modifiers:: Format-specification modifiers.
+* Printf Examples:: Several examples.
+* Redirection:: How to redirect output to multiple
+ files and pipes.
+* Special FD:: Special files for I/O.
+* Special Files:: File name interpretation in
+ 'gawk'. 'gawk' allows
+ access to inherited file descriptors.
+* Other Inherited Files:: Accessing other open files with
+ 'gawk'.
+* Special Network:: Special files for network
+ communications.
+* Special Caveats:: Things to watch out for.
+* Close Files And Pipes:: Closing Input and Output Files and
+ Pipes.
+* Nonfatal:: Enabling Nonfatal Output.
+* Output Summary:: Output summary.
+* Output Exercises:: Exercises.
+* Values:: Constants, Variables, and Regular
+ Expressions.
+* Constants:: String, numeric and regexp constants.
+* Scalar Constants:: Numeric and string constants.
+* Nondecimal-numbers:: What are octal and hex numbers.
+* Regexp Constants:: Regular Expression constants.
+* Using Constant Regexps:: When and how to use a regexp constant.
+* Variables:: Variables give names to values for
+ later use.
+* Using Variables:: Using variables in your programs.
+* Assignment Options:: Setting variables on the command line
+ and a summary of command-line syntax.
+ This is an advanced method of input.
+* Conversion:: The conversion of strings to numbers
+ and vice versa.
+* Strings And Numbers:: How 'awk' Converts Between
+ Strings And Numbers.
+* Locale influences conversions:: How the locale may affect conversions.
+* All Operators:: 'gawk''s operators.
+* Arithmetic Ops:: Arithmetic operations ('+',
+ '-', etc.)
+* Concatenation:: Concatenating strings.
+* Assignment Ops:: Changing the value of a variable or a
+ field.
+* Increment Ops:: Incrementing the numeric value of a
+ variable.
+* Truth Values and Conditions:: Testing for true and false.
+* Truth Values:: What is "true" and what is
+ "false".
+* Typing and Comparison:: How variables acquire types and how
+ this affects comparison of numbers and
+ strings with '<', etc.
+* Variable Typing:: String type versus numeric type.
+* Comparison Operators:: The comparison operators.
+* POSIX String Comparison:: String comparison with POSIX rules.
+* Boolean Ops:: Combining comparison expressions using
+ boolean operators '||' ("or"),
+ '&&' ("and") and '!'
+ ("not").
+* Conditional Exp:: Conditional expressions select between
+ two subexpressions under control of a
+ third subexpression.
+* Function Calls:: A function call is an expression.
+* Precedence:: How various operators nest.
+* Locales:: How the locale affects things.
+* Expressions Summary:: Expressions summary.
+* Pattern Overview:: What goes into a pattern.
+* Regexp Patterns:: Using regexps as patterns.
+* Expression Patterns:: Any expression can be used as a
+ pattern.
+* Ranges:: Pairs of patterns specify record
+ ranges.
+* BEGIN/END:: Specifying initialization and cleanup
+ rules.
+* Using BEGIN/END:: How and why to use BEGIN/END rules.
+* I/O And BEGIN/END:: I/O issues in BEGIN/END rules.
+* BEGINFILE/ENDFILE:: Two special patterns for advanced
+ control.
+* Empty:: The empty pattern, which matches every
+ record.
+* Using Shell Variables:: How to use shell variables with
+ 'awk'.
+* Action Overview:: What goes into an action.
+* Statements:: Describes the various control
+ statements in detail.
+* If Statement:: Conditionally execute some
+ 'awk' statements.
+* While Statement:: Loop until some condition is
+ satisfied.
+* Do Statement:: Do specified action while looping
+ until some condition is satisfied.
+* For Statement:: Another looping statement, that
+ provides initialization and increment
+ clauses.
+* Switch Statement:: Switch/case evaluation for conditional
+ execution of statements based on a
+ value.
+* Break Statement:: Immediately exit the innermost
+ enclosing loop.
+* Continue Statement:: Skip to the end of the innermost
+ enclosing loop.
+* Next Statement:: Stop processing the current input
+ record.
+* Nextfile Statement:: Stop processing the current file.
+* Exit Statement:: Stop execution of 'awk'.
+* Built-in Variables:: Summarizes the predefined variables.
+* User-modified:: Built-in variables that you change to
+ control 'awk'.
+* Auto-set:: Built-in variables where 'awk'
+ gives you information.
+* ARGC and ARGV:: Ways to use 'ARGC' and
+ 'ARGV'.
+* Pattern Action Summary:: Patterns and Actions summary.
+* Array Basics:: The basics of arrays.
+* Array Intro:: Introduction to Arrays
+* Reference to Elements:: How to examine one element of an
+ array.
+* Assigning Elements:: How to change an element of an array.
+* Array Example:: Basic Example of an Array
+* Scanning an Array:: A variation of the 'for'
+ statement. It loops through the
+ indices of an array's existing
+ elements.
+* Controlling Scanning:: Controlling the order in which arrays
+ are scanned.
+* Numeric Array Subscripts:: How to use numbers as subscripts in
+ 'awk'.
+* Uninitialized Subscripts:: Using Uninitialized variables as
+ subscripts.
+* Delete:: The 'delete' statement removes an
+ element from an array.
+* Multidimensional:: Emulating multidimensional arrays in
+ 'awk'.
+* Multiscanning:: Scanning multidimensional arrays.
+* Arrays of Arrays:: True multidimensional arrays.
+* Arrays Summary:: Summary of arrays.
+* Built-in:: Summarizes the built-in functions.
+* Calling Built-in:: How to call built-in functions.
+* Numeric Functions:: Functions that work with numbers,
+ including 'int()', 'sin()'
+ and 'rand()'.
+* String Functions:: Functions for string manipulation,
+ such as 'split()', 'match()'
+ and 'sprintf()'.
+* Gory Details:: More than you want to know about
+ '\' and '&' with
+ 'sub()', 'gsub()', and
+ 'gensub()'.
+* I/O Functions:: Functions for files and shell
+ commands.
+* Time Functions:: Functions for dealing with timestamps.
+* Bitwise Functions:: Functions for bitwise operations.
+* Type Functions:: Functions for type information.
+* I18N Functions:: Functions for string translation.
+* User-defined:: Describes User-defined functions in
+ detail.
+* Definition Syntax:: How to write definitions and what they
+ mean.
+* Function Example:: An example function definition and
+ what it does.
+* Function Caveats:: Things to watch out for.
+* Calling A Function:: Don't use spaces.
+* Variable Scope:: Controlling variable scope.
+* Pass By Value/Reference:: Passing parameters.
+* Return Statement:: Specifying the value a function
+ returns.
+* Dynamic Typing:: How variable types can change at
+ runtime.
+* Indirect Calls:: Choosing the function to call at
+ runtime.
+* Functions Summary:: Summary of functions.
+* Library Names:: How to best name private global
+ variables in library functions.
+* General Functions:: Functions that are of general use.
+* Strtonum Function:: A replacement for the built-in
+ 'strtonum()' function.
+* Assert Function:: A function for assertions in
+ 'awk' programs.
+* Round Function:: A function for rounding if
+ 'sprintf()' does not do it
+ correctly.
+* Cliff Random Function:: The Cliff Random Number Generator.
+* Ordinal Functions:: Functions for using characters as
+ numbers and vice versa.
+* Join Function:: A function to join an array into a
+ string.
+* Getlocaltime Function:: A function to get formatted times.
+* Readfile Function:: A function to read an entire file at
+ once.
+* Shell Quoting:: A function to quote strings for the
+ shell.
+* Data File Management:: Functions for managing command-line
+ data files.
+* Filetrans Function:: A function for handling data file
+ transitions.
+* Rewind Function:: A function for rereading the current
+ file.
+* File Checking:: Checking that data files are readable.
+* Empty Files:: Checking for zero-length files.
+* Ignoring Assigns:: Treating assignments as file names.
+* Getopt Function:: A function for processing command-line
+ arguments.
+* Passwd Functions:: Functions for getting user
+ information.
+* Group Functions:: Functions for getting group
+ information.
+* Walking Arrays:: A function to walk arrays of arrays.
+* Library Functions Summary:: Summary of library functions.
+* Library Exercises:: Exercises.
+* Running Examples:: How to run these examples.
+* Clones:: Clones of common utilities.
+* Cut Program:: The 'cut' utility.
+* Egrep Program:: The 'egrep' utility.
+* Id Program:: The 'id' utility.
+* Split Program:: The 'split' utility.
+* Tee Program:: The 'tee' utility.
+* Uniq Program:: The 'uniq' utility.
+* Wc Program:: The 'wc' utility.
+* Miscellaneous Programs:: Some interesting 'awk'
+ programs.
+* Dupword Program:: Finding duplicated words in a
+ document.
+* Alarm Program:: An alarm clock.
+* Translate Program:: A program similar to the 'tr'
+ utility.
+* Labels Program:: Printing mailing labels.
+* Word Sorting:: A program to produce a word usage
+ count.
+* History Sorting:: Eliminating duplicate entries from a
+ history file.
+* Extract Program:: Pulling out programs from Texinfo
+ source files.
+* Simple Sed:: A Simple Stream Editor.
+* Igawk Program:: A wrapper for 'awk' that
+ includes files.
+* Anagram Program:: Finding anagrams from a dictionary.
+* Signature Program:: People do amazing things with too much
+ time on their hands.
+* Programs Summary:: Summary of programs.
+* Programs Exercises:: Exercises.
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array
+ traversal and sorting arrays.
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use 'asort()' and
+ 'asorti()'.
+* Two-way I/O:: Two-way communications with another
+ process.
+* TCP/IP Networking:: Using 'gawk' for network
+ programming.
+* Profiling:: Profiling your 'awk' programs.
+* Advanced Features Summary:: Summary of advanced features.
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU 'gettext' works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging 'printf' arguments.
+* I18N Portability:: 'awk'-level portability
+ issues.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: 'gawk' is also
+ internationalized.
+* I18N Summary:: Summary of I18N stuff.
+* Debugging:: Introduction to 'gawk'
+ debugger.
+* Debugging Concepts:: Debugging in General.
+* Debugging Terms:: Additional Debugging Concepts.
+* Awk Debugging:: Awk Debugging.
+* Sample Debugging Session:: Sample debugging session.
+* Debugger Invocation:: How to Start the Debugger.
+* Finding The Bug:: Finding the Bug.
+* List of Debugger Commands:: Main debugger commands.
+* Breakpoint Control:: Control of Breakpoints.
+* Debugger Execution Control:: Control of Execution.
+* Viewing And Changing Data:: Viewing and Changing Data.
+* Execution Stack:: Dealing with the Stack.
+* Debugger Info:: Obtaining Information about the
+ Program and the Debugger State.
+* Miscellaneous Debugger Commands:: Miscellaneous Commands.
+* Readline Support:: Readline support.
+* Limitations:: Limitations and future plans.
+* Debugging Summary:: Debugging summary.
+* Computer Arithmetic:: A quick intro to computer math.
+* Math Definitions:: Defining terms used.
+* MPFR features:: The MPFR features in 'gawk'.
+* FP Math Caution:: Things to know.
+* Inexactness of computations:: Floating point math is not exact.
+* Inexact representation:: Numbers are not exactly represented.
+* Comparing FP Values:: How to compare floating point values.
+* Errors accumulate:: Errors get bigger as they go.
+* Getting Accuracy:: Getting more accuracy takes some work.
+* Try To Round:: Add digits and round.
+* Setting precision:: How to set the precision.
+* Setting the rounding mode:: How to set the rounding mode.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic
+ with 'gawk'.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+* Floating point summary:: Summary of floating point discussion.
+* Extension Intro:: What is an extension.
+* Plugin License:: A note about licensing.
+* Extension Mechanism Outline:: An outline of how it works.
+* Extension API Description:: A full description of the API.
+* Extension API Functions Introduction:: Introduction to the API functions.
+* General Data Types:: The data types.
+* Memory Allocation Functions:: Functions for allocating memory.
+* Constructor Functions:: Functions for creating values.
+* Registration Functions:: Functions to register things with
+ 'gawk'.
+* Extension Functions:: Registering extension functions.
+* Exit Callback Functions:: Registering an exit callback.
+* Extension Version String:: Registering a version string.
+* Input Parsers:: Registering an input parser.
+* Output Wrappers:: Registering an output wrapper.
+* Two-way processors:: Registering a two-way processor.
+* Printing Messages:: Functions for printing messages.
+* Updating ERRNO:: Functions for updating 'ERRNO'.
+* Requesting Values:: How to get a value.
+* Accessing Parameters:: Functions for accessing parameters.
+* Symbol Table Access:: Functions for accessing global
+ variables.
+* Symbol table by name:: Accessing variables by name.
+* Symbol table by cookie:: Accessing variables by "cookie".
+* Cached values:: Creating and using cached values.
+* Array Manipulation:: Functions for working with arrays.
+* Array Data Types:: Data types for working with arrays.
+* Array Functions:: Functions for working with arrays.
+* Flattening Arrays:: How to flatten arrays.
+* Creating Arrays:: How to create and populate arrays.
+* Redirection API:: How to access and manipulate redirections.
+* Extension API Variables:: Variables provided by the API.
+* Extension Versioning:: API Version information.
+* Extension API Informational Variables:: Variables providing information about
+ 'gawk''s invocation.
+* Extension API Boilerplate:: Boilerplate code for using the API.
+* Finding Extensions:: How 'gawk' finds compiled
+ extensions.
+* Extension Example:: Example C code for an extension.
+* Internal File Description:: What the new functions will do.
+* Internal File Ops:: The code for internal file operations.
+* Using Internal File Ops:: How to use an external extension.
+* Extension Samples:: The sample extensions that ship with
+ 'gawk'.
+* Extension Sample File Functions:: The file functions sample.
+* Extension Sample Fnmatch:: An interface to 'fnmatch()'.
+* Extension Sample Fork:: An interface to 'fork()' and
+ other process functions.
+* Extension Sample Inplace:: Enabling in-place file editing.
+* Extension Sample Ord:: Character to value to character
+ conversions.
+* Extension Sample Readdir:: An interface to 'readdir()'.
+* Extension Sample Revout:: Reversing output sample output
+ wrapper.
+* Extension Sample Rev2way:: Reversing data sample two-way
+ processor.
+* Extension Sample Read write array:: Serializing an array to a file.
+* Extension Sample Readfile:: Reading an entire file into a string.
+* Extension Sample Time:: An interface to 'gettimeofday()'
+ and 'sleep()'.
+* Extension Sample API Tests:: Tests for the API.
+* gawkextlib:: The 'gawkextlib' project.
+* Extension summary:: Extension summary.
+* Extension Exercises:: Exercises.
+* V7/SVR3.1:: The major changes between V7 and
+ System V Release 3.1.
+* SVR4:: Minor changes between System V
+ Releases 3.1 and 4.
+* POSIX:: New features from the POSIX standard.
+* BTL:: New features from Brian Kernighan's
+ version of 'awk'.
+* POSIX/GNU:: The extensions in 'gawk' not
+ in POSIX 'awk'.
+* Feature History:: The history of the features in
+ 'gawk'.
+* Common Extensions:: Common Extensions Summary.
+* Ranges and Locales:: How locales used to affect regexp
+ ranges.
+* Contributors:: The major contributors to
+ 'gawk'.
+* History summary:: History summary.
+* Gawk Distribution:: What is in the 'gawk'
+ distribution.
+* Getting:: How to get the distribution.
+* Extracting:: How to extract the distribution.
+* Distribution contents:: What is in the distribution.
+* Unix Installation:: Installing 'gawk' under
+ various versions of Unix.
+* Quick Installation:: Compiling 'gawk' under Unix.
+* Shell Startup Files:: Shell convenience functions.
+* Additional Configuration Options:: Other compile-time options.
+* Configuration Philosophy:: How it's all supposed to work.
+* Non-Unix Installation:: Installation on Other Operating
+ Systems.
+* PC Installation:: Installing and Compiling 'gawk' on
+ Microsoft Windows.
+* PC Binary Installation:: Installing a prepared distribution.
+* PC Compiling:: Compiling 'gawk' for Windows32.
+* PC Using:: Running 'gawk' on Windows32.
+* Cygwin:: Building and running 'gawk'
+ for Cygwin.
+* MSYS:: Using 'gawk' In The MSYS
+ Environment.
+* VMS Installation:: Installing 'gawk' on VMS.
+* VMS Compilation:: How to compile 'gawk' under
+ VMS.
+* VMS Dynamic Extensions:: Compiling 'gawk' dynamic
+ extensions on VMS.
+* VMS Installation Details:: How to install 'gawk' under
+ VMS.
+* VMS Running:: How to run 'gawk' under VMS.
+* VMS GNV:: The VMS GNV Project.
+* VMS Old Gawk:: An old version comes with some VMS
+ systems.
+* Bugs:: Reporting Problems and Bugs.
+* Bug address:: Where to send reports to.
+* Usenet:: Where not to send reports to.
+* Maintainers:: Maintainers of non-*nix ports.
+* Other Versions:: Other freely available 'awk'
+ implementations.
+* Installation summary:: Summary of installation.
+* Compatibility Mode:: How to disable certain 'gawk'
+ extensions.
+* Additions:: Making Additions To 'gawk'.
+* Accessing The Source:: Accessing the Git repository.
+* Adding Code:: Adding code to the main body of
+ 'gawk'.
+* New Ports:: Porting 'gawk' to a new
+ operating system.
+* Derived Files:: Why derived files are kept in the Git
+ repository.
+* Future Extensions:: New features that may be implemented
+ one day.
+* Implementation Limitations:: Some limitations of the
+ implementation.
+* Extension Design:: Design notes about the extension API.
+* Old Extension Problems:: Problems with the old mechanism.
+* Extension New Mechanism Goals:: Goals for the new mechanism.
+* Extension Other Design Decisions:: Some other design decisions.
+* Extension Future Growth:: Some room for future growth.
+* Old Extension Mechanism:: Some compatibility for old extensions.
+* Notes summary:: Summary of implementation notes.
+* Basic High Level:: The high level view.
+* Basic Data Typing:: A very quick intro to data types.
+
+ To my parents, for their love, and for the wonderful example they set
+for me.
+
+ To my wife Miriam, for making me complete. Thank you for building
+your life together with me.
+
+ To our children Chana, Rivka, Nachum and Malka, for enriching our
+lives in innumerable ways.
+
+
+File: gawk.info, Node: Foreword3, Next: Foreword4, Prev: Top, Up: Top
+
+Foreword to the Third Edition
+*****************************
+
+Arnold Robbins and I are good friends. We were introduced in 1990 by
+circumstances--and our favorite programming language, AWK. The
+circumstances started a couple of years earlier. I was working at a new
+job and noticed an unplugged Unix computer sitting in the corner. No
+one knew how to use it, and neither did I. However, a couple of days
+later, it was running, and I was 'root' and the one-and-only user. That
+day, I began the transition from statistician to Unix programmer.
+
+ On one of many trips to the library or bookstore in search of books
+on Unix, I found the gray AWK book, a.k.a. Alfred V. Aho, Brian W.
+Kernighan, and Peter J. Weinberger's 'The AWK Programming Language'
+(Addison-Wesley, 1988). 'awk''s simple programming paradigm--find a
+pattern in the input and then perform an action--often reduced complex
+or tedious data manipulations to a few lines of code. I was excited to
+try my hand at programming in AWK.
+
+ Alas, the 'awk' on my computer was a limited version of the language
+described in the gray book. I discovered that my computer had "old
+'awk'" and the book described "new 'awk'." I learned that this was
+typical; the old version refused to step aside or relinquish its name.
+If a system had a new 'awk', it was invariably called 'nawk', and few
+systems had it. The best way to get a new 'awk' was to 'ftp' the source
+code for 'gawk' from 'prep.ai.mit.edu'. 'gawk' was a version of new
+'awk' written by David Trueman and Arnold, and available under the GNU
+General Public License.
+
+ (Incidentally, it's no longer difficult to find a new 'awk'. 'gawk'
+ships with GNU/Linux, and you can download binaries or source code for
+almost any system; my wife uses 'gawk' on her VMS box.)
+
+ My Unix system started out unplugged from the wall; it certainly was
+not plugged into a network. So, oblivious to the existence of 'gawk'
+and the Unix community in general, and desiring a new 'awk', I wrote my
+own, called 'mawk'. Before I was finished, I knew about 'gawk', but it
+was too late to stop, so I eventually posted to a 'comp.sources'
+newsgroup.
+
+ A few days after my posting, I got a friendly email from Arnold
+introducing himself. He suggested we share design and algorithms and
+attached a draft of the POSIX standard so that I could update 'mawk' to
+support language extensions added after publication of 'The AWK
+Programming Language'.
+
+ Frankly, if our roles had been reversed, I would not have been so
+open and we probably would have never met. I'm glad we did meet. He is
+an AWK expert's AWK expert and a genuinely nice person. Arnold
+contributes significant amounts of his expertise and time to the Free
+Software Foundation.
+
+ This book is the 'gawk' reference manual, but at its core it is a
+book about AWK programming that will appeal to a wide audience. It is a
+definitive reference to the AWK language as defined by the 1987 Bell
+Laboratories release and codified in the 1992 POSIX Utilities standard.
+
+ On the other hand, the novice AWK programmer can study a wealth of
+practical programs that emphasize the power of AWK's basic idioms:
+data-driven control flow, pattern matching with regular expressions, and
+associative arrays. Those looking for something new can try out
+'gawk''s interface to network protocols via special '/inet' files.
+
+ The programs in this book make clear that an AWK program is typically
+much smaller and faster to develop than a counterpart written in C.
+Consequently, there is often a payoff to prototyping an algorithm or
+design in AWK to get it running quickly and expose problems early.
+Often, the interpreted performance is adequate and the AWK prototype
+becomes the product.
+
+ The new 'pgawk' (profiling 'gawk') produces program execution
+counts. I recently experimented with an algorithm that, for n lines of
+input, exhibited ~ C n^2 performance, while theory predicted ~ C n log n
+behavior. A few minutes poring over the 'awkprof.out' profile
+pinpointed the problem to a single line of code. 'pgawk' is a welcome
+addition to my programmer's toolbox.
+
+ Arnold has distilled over a decade of experience writing and using
+AWK programs, and developing 'gawk', into this book. If you use AWK or
+want to learn how, then read this book.
+
+ Michael Brennan
+ Author of 'mawk'
+ March 2001
+
+
+File: gawk.info, Node: Foreword4, Next: Preface, Prev: Foreword3, Up: Top
+
+Foreword to the Fourth Edition
+******************************
+
+Some things don't change. Thirteen years ago I wrote: "If you use AWK
+or want to learn how, then read this book." True then, and still true
+today.
+
+ Learning to use a programming language is about more than mastering
+the syntax. One needs to acquire an understanding of how to use the
+features of the language to solve practical programming problems. A
+focus of this book is the many examples that show how to use AWK.
+
+ Some things do change. Our computers are much faster and have more
+memory. Consequently, speed and storage inefficiencies of a high-level
+language matter less. Prototyping in AWK and then rewriting in C for
+performance reasons happens less, because more often the prototype is
+fast enough.
+
+ Of course, there are computing operations that are best done in C or
+C++. With 'gawk' 4.1 and later, you do not have to choose between
+writing your program in AWK or in C/C++. You can write most of your
+program in AWK and the aspects that require C/C++ capabilities can be
+written in C/C++, and then the pieces glued together when the 'gawk'
+module loads the C/C++ module as a dynamic plug-in. *note Dynamic
+Extensions::, has all the details, and, as expected, many examples to
+help you learn the ins and outs.
+
+ I enjoy programming in AWK and had fun (re)reading this book. I
+think you will too.
+
+ Michael Brennan
+ Author of 'mawk'
+ October 2014
+
+
+File: gawk.info, Node: Preface, Next: Getting Started, Prev: Foreword4, Up: Top
+
+Preface
+*******
+
+Several kinds of tasks occur repeatedly when working with text files.
+You might want to extract certain lines and discard the rest. Or you
+may need to make changes wherever certain patterns appear, but leave the
+rest of the file alone. Such jobs are often easy with 'awk'. The 'awk'
+utility interprets a special-purpose programming language that makes it
+easy to handle simple data-reformatting jobs.
+
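+ For example, the following one-line 'awk' program prints every input
+line that contains the string "widget" (the file name 'inventory.txt'
+is only a placeholder for a data file of your own):
+
+ $ awk '/widget/ { print }' inventory.txt
+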
+ The GNU implementation of 'awk' is called 'gawk'; if you invoke it
+with the proper options or environment variables, it is fully compatible
+with the POSIX(1) specification of the 'awk' language and with the Unix
+version of 'awk' maintained by Brian Kernighan. This means that all
+properly written 'awk' programs should work with 'gawk'. So most of the
+time, we don't distinguish between 'gawk' and other 'awk'
+implementations.
+
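+ For instance, the '--posix' command-line option (or the
+'POSIXLY_CORRECT' environment variable) restricts 'gawk' to the
+behavior specified by POSIX, while '--traditional' selects
+compatibility with the 'awk' maintained by Brian Kernighan. The
+program and data file names below are only placeholders:
+
+ $ gawk --posix -f program.awk datafile
+ $ gawk --traditional -f program.awk datafile
+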
+ Using 'awk' you can:
+
+ * Manage small, personal databases
+
+ * Generate reports
+
+ * Validate data
+
+ * Produce indexes and perform other document-preparation tasks
+
+ * Experiment with algorithms that you can adapt later to other
+ computer languages
+
+ In addition, 'gawk' provides facilities that make it easy to:
+
+ * Extract bits and pieces of data for processing
+
+ * Sort data
+
+ * Perform simple network communications
+
+ * Profile and debug 'awk' programs
+
+ * Extend the language with functions written in C or C++
+
+ This Info file teaches you about the 'awk' language and how you can
+use it effectively. You should already be familiar with basic system
+commands, such as 'cat' and 'ls',(2) as well as basic shell facilities,
+such as input/output (I/O) redirection and pipes.
+
+ Implementations of the 'awk' language are available for many
+different computing environments. This Info file, while describing the
+'awk' language in general, also describes the particular implementation
+of 'awk' called 'gawk' (which stands for "GNU 'awk'"). 'gawk' runs on a
+broad range of Unix systems, ranging from Intel-architecture PC-based
+computers up through large-scale systems. 'gawk' has also been ported
+to Mac OS X, Microsoft Windows (all versions), and OpenVMS.(3)
+
+* Menu:
+
+* History:: The history of 'gawk' and
+ 'awk'.
+* Names:: What name to use to find 'awk'.
+* This Manual:: Using this Info file. Includes sample
+ input files that you can use.
+* Conventions:: Typographical Conventions.
+* Manual History:: Brief history of the GNU project and this
+ Info file.
+* How To Contribute:: Helping to save the world.
+* Acknowledgments:: Acknowledgments.
+
+ ---------- Footnotes ----------
+
+ (1) The 2008 POSIX standard is accessible online at
+<http://www.opengroup.org/onlinepubs/9699919799/>.
+
+ (2) These utilities are available on POSIX-compliant systems, as well
+as on traditional Unix-based systems. If you are using some other
+operating system, you still need to be familiar with the ideas of I/O
+redirection and pipes.
+
+ (3) Some other, obsolete systems to which 'gawk' was once ported are
+no longer supported and the code for those systems has been removed.
+
+
+File: gawk.info, Node: History, Next: Names, Up: Preface
+
+History of 'awk' and 'gawk'
+===========================
+
+ Recipe for a Programming Language
+
+ 1 part 'egrep' 1 part 'snobol'
+ 2 parts 'ed' 3 parts C
+
+ Blend all parts well using 'lex' and 'yacc'. Document minimally and
+release.
+
+ After eight years, add another part 'egrep' and two more parts C.
+Document very well and release.
+
+ The name 'awk' comes from the initials of its designers: Alfred V.
+Aho, Peter J. Weinberger, and Brian W. Kernighan. The original version
+of 'awk' was written in 1977 at AT&T Bell Laboratories. In 1985, a new
+version made the programming language more powerful, introducing
+user-defined functions, multiple input streams, and computed regular
+expressions. This new version became widely available with Unix System
+V Release 3.1 (1987). The version in System V Release 4 (1989) added
+some new features and cleaned up the behavior in some of the "dark
+corners" of the language. The specification for 'awk' in the POSIX
+Command Language and Utilities standard further clarified the language.
+Both the 'gawk' designers and the original 'awk' designers at Bell
+Laboratories provided feedback for the POSIX specification.
+
+ Paul Rubin wrote 'gawk' in 1986. Jay Fenlason completed it, with
+advice from Richard Stallman. John Woods contributed parts of the code
+as well. In 1988 and 1989, David Trueman, with help from me, thoroughly
+reworked 'gawk' for compatibility with the newer 'awk'. Circa 1994, I
+became the primary maintainer. Current development focuses on bug
+fixes, performance improvements, standards compliance, and,
+occasionally, new features.
+
+ In May 1997, Jürgen Kahrs felt the need for network access from
+'awk', and with a little help from me, set about adding features to do
+this for 'gawk'. At that time, he also wrote the bulk of 'TCP/IP
+Internetworking with 'gawk'' (a separate document, available as part of
+the 'gawk' distribution). His code finally became part of the main
+'gawk' distribution with 'gawk' version 3.1.
+
+ John Haque rewrote the 'gawk' internals, in the process providing an
+'awk'-level debugger. This version became available as 'gawk' version
+4.0 in 2011.
+
+ *Note Contributors:: for a full list of those who have made important
+contributions to 'gawk'.
+
+
+File: gawk.info, Node: Names, Next: This Manual, Prev: History, Up: Preface
+
+A Rose by Any Other Name
+========================
+
+The 'awk' language has evolved over the years. Full details are
+provided in *note Language History::. The language described in this
+Info file is often referred to as "new 'awk'." By analogy, the original
+version of 'awk' is referred to as "old 'awk'."
+
+ On most current systems, when you run the 'awk' utility you get some
+version of new 'awk'.(1) If your system's standard 'awk' is the old
+one, you will see something like this if you try the test program:
+
+ $ awk 1 /dev/null
+ error-> awk: syntax error near line 1
+ error-> awk: bailing out near line 1
+
+In this case, you should find a version of new 'awk', or just install
+'gawk'!
+
+ Throughout this Info file, whenever we refer to a language feature
+that should be available in any complete implementation of POSIX 'awk',
+we simply use the term 'awk'. When referring to a feature that is
+specific to the GNU implementation, we use the term 'gawk'.
+
+ ---------- Footnotes ----------
+
+ (1) Only Solaris systems still use an old 'awk' for the default 'awk'
+utility. A more modern 'awk' lives in '/usr/xpg6/bin' on these systems.
+
+
+File: gawk.info, Node: This Manual, Next: Conventions, Prev: Names, Up: Preface
+
+Using This Book
+===============
+
+The term 'awk' refers to a particular program as well as to the language
+you use to tell this program what to do. When we need to be careful, we
+call the language "the 'awk' language," and the program "the 'awk'
+utility." This Info file explains both how to write programs in the
+'awk' language and how to run the 'awk' utility. The term "'awk'
+program" refers to a program written by you in the 'awk' programming
+language.
+
+ Primarily, this Info file explains the features of 'awk' as defined
+in the POSIX standard. It does so in the context of the 'gawk'
+implementation. While doing so, it also attempts to describe important
+differences between 'gawk' and other 'awk' implementations.(1) Finally,
+it notes any 'gawk' features that are not in the POSIX standard for
+'awk'.
+
+ There are sidebars scattered throughout the Info file. They add a
+more complete explanation of points that are relevant, but not likely to
+be of interest on first reading. All appear in the index, under the
+heading "sidebar."
+
+ Most of the time, the examples use complete 'awk' programs. Some of
+the more advanced minor nodes show only the part of the 'awk' program
+that illustrates the concept being described.
+
+ Although this Info file is aimed principally at people who have not
+been exposed to 'awk', there is a lot of information here that even the
+'awk' expert should find useful. In particular, the description of
+POSIX 'awk' and the example programs in *note Library Functions::, and
+in *note Sample Programs::, should be of interest.
+
+ This Info file is split into several parts, as follows:
+
+ * Part I describes the 'awk' language and the 'gawk' program in
+ detail. It starts with the basics, and continues through all of
+ the features of 'awk'. It contains the following chapters:
+
+ - *note Getting Started::, provides the essentials you need to
+ know to begin using 'awk'.
+
+ - *note Invoking Gawk::, describes how to run 'gawk', the
+ meaning of its command-line options, and how it finds 'awk'
+ program source files.
+
+ - *note Regexp::, introduces regular expressions in general, and
+ in particular the flavors supported by POSIX 'awk' and 'gawk'.
+
+ - *note Reading Files::, describes how 'awk' reads your data.
+ It introduces the concepts of records and fields, as well as
+ the 'getline' command. I/O redirection is first described
+ here. Network I/O is also briefly introduced here.
+
+ - *note Printing::, describes how 'awk' programs can produce
+ output with 'print' and 'printf'.
+
+ - *note Expressions::, describes expressions, which are the
+ basic building blocks for getting most things done in a
+ program.
+
+ - *note Patterns and Actions::, describes how to write patterns
+ for matching records, actions for doing something when a
+ record is matched, and the predefined variables 'awk' and
+ 'gawk' use.
+
+ - *note Arrays::, covers 'awk''s one-and-only data structure:
+ the associative array. Deleting array elements and whole
+ arrays is described, as well as sorting arrays in 'gawk'. The
+ major node also describes how 'gawk' provides arrays of
+ arrays.
+
+ - *note Functions::, describes the built-in functions 'awk' and
+ 'gawk' provide, as well as how to define your own functions.
+ It also discusses how 'gawk' lets you call functions
+ indirectly.
+
+ * Part II shows how to use 'awk' and 'gawk' for problem solving.
+ There is lots of code here for you to read and learn from. This
+ part contains the following chapters:
+
+ - *note Library Functions::, provides a number of functions
+ meant to be used from main 'awk' programs.
+
+ - *note Sample Programs::, provides many sample 'awk' programs.
+
+ Reading these two chapters allows you to see 'awk' solving real
+ problems.
+
+ * Part III focuses on features specific to 'gawk'. It contains the
+ following chapters:
+
+ - *note Advanced Features::, describes a number of advanced
+ features. Of particular note are the abilities to control the
+ order of array traversal, have two-way communications with
+ another process, perform TCP/IP networking, and profile your
+ 'awk' programs.
+
+ - *note Internationalization::, describes special features for
+ translating program messages into different languages at
+ runtime.
+
+ - *note Debugger::, describes the 'gawk' debugger.
+
+ - *note Arbitrary Precision Arithmetic::, describes advanced
+ arithmetic facilities.
+
+ - *note Dynamic Extensions::, describes how to add new variables
+ and functions to 'gawk' by writing extensions in C or C++.
+
+ * Part IV provides the appendices, the Glossary, and two licenses
+ that cover the 'gawk' source code and this Info file, respectively.
+ It contains the following appendices:
+
+ - *note Language History::, describes how the 'awk' language has
+ evolved since its first release to the present. It also
+ describes how 'gawk' has acquired features over time.
+
+ - *note Installation::, describes how to get 'gawk', how to
+ compile it on POSIX-compatible systems, and how to compile and
+ use it on different non-POSIX systems. It also describes how
+ to report bugs in 'gawk' and where to get other freely
+ available 'awk' implementations.
+
+ - *note Notes::, describes how to disable 'gawk''s extensions,
+ as well as how to contribute new code to 'gawk', and some
+ possible future directions for 'gawk' development.
+
+ - *note Basic Concepts::, provides some very cursory background
+ material for those who are completely unfamiliar with computer
+ programming.
+
+ The *note Glossary::, defines most, if not all, of the
+ significant terms used throughout the Info file. If you find
+ terms that you aren't familiar with, try looking them up here.
+
+ - *note Copying::, and *note GNU Free Documentation License::,
+ present the licenses that cover the 'gawk' source code and
+ this Info file, respectively.
+
+ ---------- Footnotes ----------
+
+ (1) All such differences appear in the index under the entry
+"differences in 'awk' and 'gawk'."
+
+
+File: gawk.info, Node: Conventions, Next: Manual History, Prev: This Manual, Up: Preface
+
+Typographical Conventions
+=========================
+
+This Info file is written in Texinfo
+(http://www.gnu.org/software/texinfo/), the GNU documentation formatting
+language. A single Texinfo source file is used to produce both the
+printed and online versions of the documentation. This minor node
+briefly documents the typographical conventions used in Texinfo.
+
+ Examples you would type at the command line are preceded by the
+common shell primary and secondary prompts, '$' and '>'. Input that you
+type is shown 'like this'. Output from the command is preceded by the
+glyph "-|". This typically represents the command's standard output.
+Error messages and other output on the command's standard error are
+preceded by the glyph "error->". For example:
+
+ $ echo hi on stdout
+ -| hi on stdout
+ $ echo hello on stderr 1>&2
+ error-> hello on stderr
+
+ Characters that you type at the keyboard look 'like this'. In
+particular, there are special characters called "control characters."
+These are characters that you type by holding down both the 'CONTROL'
+key and another key, at the same time. For example, a 'Ctrl-d' is typed
+by first pressing and holding the 'CONTROL' key, next pressing the 'd'
+key, and finally releasing both keys.
+
+ For the sake of brevity, throughout this Info file, we refer to Brian
+Kernighan's version of 'awk' as "BWK 'awk'." (*Note Other Versions::
+for information on his and other versions.)
+
+Dark Corners
+------------
+
+ Dark corners are basically fractal--no matter how much you
+ illuminate, there's always a smaller but darker one.
+ -- _Brian Kernighan_
+
+ Until the POSIX standard (and 'GAWK: Effective AWK Programming'),
+many features of 'awk' were either poorly documented or not documented
+at all. Descriptions of such features (often called "dark corners") are
+noted in this Info file with "(d.c.)." They also appear in the index
+under the heading "dark corner."
+
+ But, as noted by the opening quote, any coverage of dark corners is
+by definition incomplete.
+
+ Extensions to the standard 'awk' language that are supported by more
+than one 'awk' implementation are marked "(c.e.)," and listed in the
+index under "common extensions" and "extensions, common."
+
+
+File: gawk.info, Node: Manual History, Next: How To Contribute, Prev: Conventions, Up: Preface
+
+The GNU Project and This Book
+=============================
+
+The Free Software Foundation (FSF) is a nonprofit organization dedicated
+to the production and distribution of freely distributable software. It
+was founded by Richard M. Stallman, the author of the original Emacs
+editor. GNU Emacs is the most widely used version of Emacs today.
+
+ The GNU(1) Project is an ongoing effort on the part of the Free
+Software Foundation to create a complete, freely distributable,
+POSIX-compliant computing environment. The FSF uses the GNU General
+Public License (GPL) to ensure that its software's source code is always
+available to the end user. A copy of the GPL is included for your
+reference (*note Copying::). The GPL applies to the C language source
+code for 'gawk'. To find out more about the FSF and the GNU Project
+online, see the GNU Project's home page (http://www.gnu.org). This Info
+file may also be read from GNU's website
+(http://www.gnu.org/software/gawk/manual/).
+
+ A shell, an editor (Emacs), highly portable optimizing C, C++, and
+Objective-C compilers, a symbolic debugger and dozens of large and small
+utilities (such as 'gawk'), have all been completed and are freely
+available. The GNU operating system kernel (the HURD), has been
+released but remains in an early stage of development.
+
+ Until the GNU operating system is more fully developed, you should
+consider using GNU/Linux, a freely distributable, Unix-like operating
+system for Intel, Power Architecture, Sun SPARC, IBM S/390, and other
+systems.(2) Many GNU/Linux distributions are available for download
+from the Internet.
+
+ The Info file itself has gone through multiple previous editions.
+Paul Rubin wrote the very first draft of 'The GAWK Manual'; it was
+around 40 pages long. Diane Close and Richard Stallman improved it,
+yielding a version that was around 90 pages and barely described the
+original, "old" version of 'awk'.
+
+ I started working with that version in the fall of 1988. As work on
+it progressed, the FSF published several preliminary versions (numbered
+0.X). In 1996, edition 1.0 was released with 'gawk' 3.0.0. The FSF
+published the first two editions under the title 'The GNU Awk User's
+Guide'.
+
+ This edition maintains the basic structure of the previous editions.
+For FSF edition 4.0, the content was thoroughly reviewed and updated.
+All references to 'gawk' versions prior to 4.0 were removed. Of
+significant note for that edition was the addition of *note Debugger::.
+
+ For FSF edition 4.1, the content has been reorganized into parts, and
+the major new additions are *note Arbitrary Precision Arithmetic::, and
+*note Dynamic Extensions::.
+
+ This Info file will undoubtedly continue to evolve. If you find an
+error in the Info file, please report it! *Note Bugs:: for information
+on submitting problem reports electronically.
+
+ ---------- Footnotes ----------
+
+ (1) GNU stands for "GNU's Not Unix."
+
+ (2) The terminology "GNU/Linux" is explained in the *note Glossary::.
+
+
+File: gawk.info, Node: How To Contribute, Next: Acknowledgments, Prev: Manual History, Up: Preface
+
+How to Contribute
+=================
+
+As the maintainer of GNU 'awk', I once thought that I would be able to
+manage a collection of publicly available 'awk' programs and I even
+solicited contributions. Making things available on the Internet helps
+keep the 'gawk' distribution down to manageable size.
+
+ The initial collection of material, such as it is, is still available
+at <ftp://ftp.freefriends.org/arnold/Awkstuff>. In the hopes of doing
+something more broad, I acquired the 'awk.info' domain.
+
+ However, I found that I could not dedicate enough time to managing
+contributed code: the archive did not grow and the domain went unused
+for several years.
+
+ Late in 2008, a volunteer took on the task of setting up an
+'awk'-related website--<http://awk.info>--and did a very nice job.
+
+ If you have written an interesting 'awk' program, or have written a
+'gawk' extension that you would like to share with the rest of the
+world, please see <http://awk.info/?contribute> for how to contribute it
+to the website.
+
+
+File: gawk.info, Node: Acknowledgments, Prev: How To Contribute, Up: Preface
+
+Acknowledgments
+===============
+
+The initial draft of 'The GAWK Manual' had the following
+acknowledgments:
+
+ Many people need to be thanked for their assistance in producing
+ this manual. Jay Fenlason contributed many ideas and sample
+ programs. Richard Mlynarik and Robert Chassell gave helpful
+ comments on drafts of this manual. The paper 'A Supplemental
+ Document for AWK' by John W. Pierce of the Chemistry Department at
+ UC San Diego, pinpointed several issues relevant both to 'awk'
+ implementation and to this manual, that would otherwise have
+ escaped us.
+
+ I would like to acknowledge Richard M. Stallman, for his vision of a
+better world and for his courage in founding the FSF and starting the
+GNU Project.
+
+ Earlier editions of this Info file had the following
+acknowledgments:
+
+ The following people (in alphabetical order) provided helpful
+ comments on various versions of this book: Rick Adams, Dr. Nelson
+ H.F. Beebe, Karl Berry, Dr. Michael Brennan, Rich Burridge, Claire
+ Cloutier, Diane Close, Scott Deifik, Christopher ("Topher") Eliot,
+ Jeffrey Friedl, Dr. Darrel Hankerson, Michal Jaegermann, Dr.
+ Richard J. LeBlanc, Michael Lijewski, Pat Rankin, Miriam Robbins,
+ Mary Sheehan, and Chuck Toporek.
+
+ Robert J. Chassell provided much valuable advice on the use of
+ Texinfo. He also deserves special thanks for convincing me _not_
+ to title this Info file 'How to Gawk Politely'. Karl Berry helped
+ significantly with the TeX part of Texinfo.
+
+ I would like to thank Marshall and Elaine Hartholz of Seattle and
+ Dr. Bert and Rita Schreiber of Detroit for large amounts of quiet
+ vacation time in their homes, which allowed me to make significant
+ progress on this Info file and on 'gawk' itself.
+
+ Phil Hughes of SSC contributed in a very important way by loaning
+ me his laptop GNU/Linux system, not once, but twice, which allowed
+ me to do a lot of work while away from home.
+
+ David Trueman deserves special credit; he has done a yeoman job of
+ evolving 'gawk' so that it performs well and without bugs.
+ Although he is no longer involved with 'gawk', working with him on
+ this project was a significant pleasure.
+
+ The intrepid members of the GNITS mailing list, and most notably
+ Ulrich Drepper, provided invaluable help and feedback for the
+ design of the internationalization features.
+
+ Chuck Toporek, Mary Sheehan, and Claire Cloutier of O'Reilly &
+ Associates contributed significant editorial help for this Info
+ file for the 3.1 release of 'gawk'.
+
+ Dr. Nelson Beebe, Andreas Buening, Dr. Manuel Collado, Antonio
+Colombo, Stephen Davies, Scott Deifik, Akim Demaille, Daniel Richard G.,
+Darrel Hankerson, Michal Jaegermann, Jürgen Kahrs, Stepan Kasal, John
+Malmberg, Dave Pitts, Chet Ramey, Pat Rankin, Andrew Schorr, Corinna
+Vinschen, and Eli Zaretskii (in alphabetical order) make up the current
+'gawk' "crack portability team." Without their hard work and help,
+'gawk' would not be nearly the robust, portable program it is today. It
+has been and continues to be a pleasure working with this team of fine
+people.
+
+ Notable code and documentation contributions were made by a number of
+people. *Note Contributors:: for the full list.
+
+ Thanks to Michael Brennan for the Forewords.
+
+ Thanks to Patrice Dumas for the new 'makeinfo' program. Thanks to
+Karl Berry, who continues to work to keep the Texinfo markup language
+sane.
+
+ Robert P.J. Day, Michael Brennan, and Brian Kernighan kindly acted as
+reviewers for the 2015 edition of this Info file. Their feedback helped
+improve the final work.
+
+ I would also like to thank Brian Kernighan for his invaluable
+assistance during the testing and debugging of 'gawk', and for his
+ongoing help and advice in clarifying numerous points about the
+language. We could not have done nearly as good a job on either 'gawk'
+or its documentation without his help.
+
+ Brian is in a class by himself as a programmer and technical author.
+I have to thank him (yet again) for his ongoing friendship and for being
+a role model to me for close to 30 years! Having him as a reviewer is
+an exciting privilege. It has also been extremely humbling...
+
+ I must thank my wonderful wife, Miriam, for her patience through the
+many versions of this project, for her proofreading, and for sharing me
+with the computer. I would like to thank my parents for their love, and
+for the grace with which they raised and educated me. Finally, I also
+must acknowledge my gratitude to G-d, for the many opportunities He has
+sent my way, as well as for the gifts He has given me with which to take
+advantage of those opportunities.
+
+
+Arnold Robbins
+Nof Ayalon
+Israel
+February 2015
+
+
+File: gawk.info, Node: Getting Started, Next: Invoking Gawk, Prev: Preface, Up: Top
+
+1 Getting Started with 'awk'
+****************************
+
+The basic function of 'awk' is to search files for lines (or other units
+of text) that contain certain patterns. When a line matches one of the
+patterns, 'awk' performs specified actions on that line. 'awk'
+continues to process input lines in this way until it reaches the end of
+the input files.
+
+ Programs in 'awk' are different from programs in most other
+languages, because 'awk' programs are "data driven" (i.e., you describe
+the data you want to work with and then what to do when you find it).
+Most other languages are "procedural"; you have to describe, in great
+detail, every step the program should take. When working with
+procedural languages, it is usually much harder to clearly describe the
+data your program will process. For this reason, 'awk' programs are
+often refreshingly easy to read and write.
+
+ When you run 'awk', you specify an 'awk' "program" that tells 'awk'
+what to do. The program consists of a series of "rules" (it may also
+contain "function definitions", an advanced feature that we will ignore
+for now; *note User-defined::). Each rule specifies one pattern to
+search for and one action to perform upon finding the pattern.
+
+ Syntactically, a rule consists of a "pattern" followed by an
+"action". The action is enclosed in braces to separate it from the
+pattern. Newlines usually separate rules. Therefore, an 'awk' program
+looks like this:
+
+ PATTERN { ACTION }
+ PATTERN { ACTION }
+ ...
+
+* Menu:
+
+* Running gawk:: How to run 'gawk' programs; includes
+ command-line syntax.
+* Sample Data Files:: Sample data files for use in the 'awk'
+ programs illustrated in this Info file.
+* Very Simple:: A very simple example.
+* Two Rules:: A less simple one-line example using two
+ rules.
+* More Complex:: A more complex example.
+* Statements/Lines:: Subdividing or combining statements into
+ lines.
+* Other Features:: Other Features of 'awk'.
+* When:: When to use 'gawk' and when to use
+ other things.
+* Intro Summary:: Summary of the introduction.
+
+
+File: gawk.info, Node: Running gawk, Next: Sample Data Files, Up: Getting Started
+
+1.1 How to Run 'awk' Programs
+=============================
+
+There are several ways to run an 'awk' program. If the program is
+short, it is easiest to include it in the command that runs 'awk', like
+this:
+
+ awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ...
+
+ When the program is long, it is usually more convenient to put it in
+a file and run it with a command like this:
+
+ awk -f PROGRAM-FILE INPUT-FILE1 INPUT-FILE2 ...
+
+ This minor node discusses both mechanisms, along with several
+variations of each.
+
+* Menu:
+
+* One-shot:: Running a short throwaway 'awk'
+ program.
+* Read Terminal:: Using no input files (input from the keyboard
+ instead).
+* Long:: Putting permanent 'awk' programs in
+ files.
+* Executable Scripts:: Making self-contained 'awk' programs.
+* Comments:: Adding documentation to 'gawk'
+ programs.
+* Quoting:: More discussion of shell quoting issues.
+
+
+File: gawk.info, Node: One-shot, Next: Read Terminal, Up: Running gawk
+
+1.1.1 One-Shot Throwaway 'awk' Programs
+---------------------------------------
+
+Once you are familiar with 'awk', you will often type in simple programs
+the moment you want to use them. Then you can write the program as the
+first argument of the 'awk' command, like this:
+
+ awk 'PROGRAM' INPUT-FILE1 INPUT-FILE2 ...
+
+where PROGRAM consists of a series of patterns and actions, as described
+earlier.
+
+ This command format instructs the "shell", or command interpreter, to
+start 'awk' and use the PROGRAM to process records in the input file(s).
+There are single quotes around PROGRAM so the shell won't interpret any
+'awk' characters as special shell characters. The quotes also cause the
+shell to treat all of PROGRAM as a single argument for 'awk', and allow
+PROGRAM to be more than one line long.
+
+ This format is also useful for running short or medium-sized 'awk'
+programs from shell scripts, because it avoids the need for a separate
+file for the 'awk' program. A self-contained shell script is more
+reliable because there are no other files to misplace.
+
+ Later in this chapter, in *note Very Simple::, we'll see examples of
+several short, self-contained programs.
+
+
+File: gawk.info, Node: Read Terminal, Next: Long, Prev: One-shot, Up: Running gawk
+
+1.1.2 Running 'awk' Without Input Files
+---------------------------------------
+
+You can also run 'awk' without any input files. If you type the
+following command line:
+
+ awk 'PROGRAM'
+
+'awk' applies the PROGRAM to the "standard input", which usually means
+whatever you type on the keyboard. This continues until you indicate
+end-of-file by typing 'Ctrl-d'. (On non-POSIX operating systems, the
+end-of-file character may be different.)
+
+ As an example, the following program prints a friendly piece of
+advice (from Douglas Adams's 'The Hitchhiker's Guide to the Galaxy'), to
+keep you from worrying about the complexities of computer programming:
+
+ $ awk 'BEGIN { print "Don\47t Panic!" }'
+ -| Don't Panic!
+
+ 'awk' executes statements associated with 'BEGIN' before reading any
+input. If there are no other statements in your program, as is the case
+here, 'awk' just stops, instead of trying to read input it doesn't know
+how to process. The '\47' is a magic way (explained later) of getting a
+single quote into the program, without having to engage in ugly shell
+quoting tricks.
+
+ NOTE: If you use Bash as your shell, you should execute the command
+ 'set +H' before running this program interactively, to disable the
+     C shell-style history expansion, which treats '!' as a special
+ character. We recommend putting this command into your personal
+ startup file.
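+
+     For example, a short interactive Bash session might look like
+     this (a sketch; the program is the same one shown above):
+
+          $ set +H
+          $ awk 'BEGIN { print "Don\47t Panic!" }'
+          -| Don't Panic!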
+
+ This next simple 'awk' program emulates the 'cat' utility; it copies
+whatever you type on the keyboard to its standard output (why this works
+is explained shortly):
+
+ $ awk '{ print }'
+ Now is the time for all good men
+ -| Now is the time for all good men
+ to come to the aid of their country.
+ -| to come to the aid of their country.
+ Four score and seven years ago, ...
+ -| Four score and seven years ago, ...
+ What, me worry?
+ -| What, me worry?
+ Ctrl-d
+
+
+File: gawk.info, Node: Long, Next: Executable Scripts, Prev: Read Terminal, Up: Running gawk
+
+1.1.3 Running Long Programs
+---------------------------
+
+Sometimes 'awk' programs are very long. In these cases, it is more
+convenient to put the program into a separate file. In order to tell
+'awk' to use that file for its program, you type:
+
+ awk -f SOURCE-FILE INPUT-FILE1 INPUT-FILE2 ...
+
+ The '-f' instructs the 'awk' utility to get the 'awk' program from
+the file SOURCE-FILE (*note Options::). Any file name can be used for
+SOURCE-FILE. For example, you could put the program:
+
+ BEGIN { print "Don't Panic!" }
+
+into the file 'advice'. Then this command:
+
+ awk -f advice
+
+does the same thing as this one:
+
+ awk 'BEGIN { print "Don\47t Panic!" }'
+
+This was explained earlier (*note Read Terminal::). Note that you don't
+usually need single quotes around the file name that you specify with
+'-f', because most file names don't contain any of the shell's special
+characters. Notice that in 'advice', the 'awk' program did not have
+single quotes around it. The quotes are only needed for programs that
+are provided on the 'awk' command line. (Also, placing the program in a
+file allows us to use a literal single quote in the program text,
+instead of the magic '\47'.)
+
+ If you want to clearly identify an 'awk' program file as such, you
+can add the extension '.awk' to the file name. This doesn't affect the
+execution of the 'awk' program but it does make "housekeeping" easier.
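+
+   For example, you might store the program shown earlier in a file
+named 'advice.awk' and run it the same way:
+
+     awk -f advice.awk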
+
+
+File: gawk.info, Node: Executable Scripts, Next: Comments, Prev: Long, Up: Running gawk
+
+1.1.4 Executable 'awk' Programs
+-------------------------------
+
+Once you have learned 'awk', you may want to write self-contained 'awk'
+scripts, using the '#!' script mechanism. You can do this on many
+systems.(1) For example, you could update the file 'advice' to look
+like this:
+
+ #! /bin/awk -f
+
+ BEGIN { print "Don't Panic!" }
+
+After making this file executable (with the 'chmod' utility), simply
+type 'advice' at the shell and the system arranges to run 'awk' as if
+you had typed 'awk -f advice':
+
+ $ chmod +x advice
+ $ advice
+ -| Don't Panic!
+
+(We assume you have the current directory in your shell's search path
+variable [typically '$PATH']. If not, you may need to type './advice'
+at the shell.)
+
+ Self-contained 'awk' scripts are useful when you want to write a
+program that users can invoke without their having to know that the
+program is written in 'awk'.
+
+ Understanding '#!'
+
+ 'awk' is an "interpreted" language. This means that the 'awk'
+utility reads your program and then processes your data according to the
+instructions in your program. (This is different from a "compiled"
+language such as C, where your program is first compiled into machine
+code that is executed directly by your system's processor.) The 'awk'
+utility is thus termed an "interpreter". Many modern languages are
+interpreted.
+
+ The line beginning with '#!' lists the full file name of an
+interpreter to run and a single optional initial command-line argument
+to pass to that interpreter. The operating system then runs the
+interpreter with the given argument and the full argument list of the
+executed program. The first argument in the list is the full file name
+of the 'awk' program. The rest of the argument list contains either
+options to 'awk', or data files, or both. (Note that on many systems
+'awk' may be found in '/usr/bin' instead of in '/bin'.)
+
+ Some systems limit the length of the interpreter name to 32
+characters. Often, this can be dealt with by using a symbolic link.
+
+ You should not put more than one argument on the '#!' line after the
+path to 'awk'. It does not work. The operating system treats the rest
+of the line as a single argument and passes it to 'awk'. Doing this
+leads to confusing behavior--most likely a usage diagnostic of some sort
+from 'awk'.
+
+ Finally, the value of 'ARGV[0]' (*note Built-in Variables::) varies
+depending upon your operating system. Some systems put 'awk' there,
+some put the full pathname of 'awk' (such as '/bin/awk'), and some put
+the name of your script ('advice'). (d.c.) Don't rely on the value of
+'ARGV[0]' to provide your script name.
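+
+   If you are curious what your system puts there, a tiny script such
+as the following (a sketch; adjust the path to 'awk' for your system)
+shows you:
+
+     #! /bin/awk -f
+
+     BEGIN { print "ARGV[0] is", ARGV[0] }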
+
+ ---------- Footnotes ----------
+
+ (1) The '#!' mechanism works on GNU/Linux systems, BSD-based systems,
+and commercial Unix systems.
+
+
+File: gawk.info, Node: Comments, Next: Quoting, Prev: Executable Scripts, Up: Running gawk
+
+1.1.5 Comments in 'awk' Programs
+--------------------------------
+
+A "comment" is some text that is included in a program for the sake of
+human readers; it is not really an executable part of the program.
+Comments can explain what the program does and how it works. Nearly all
+programming languages have provisions for comments, as programs are
+typically hard to understand without them.
+
+ In the 'awk' language, a comment starts with the number sign
+character ('#') and continues to the end of the line. The '#' does not
+have to be the first character on the line. The 'awk' language ignores
+the rest of a line following a number sign. For example, we could have
+put the following into 'advice':
+
+ # This program prints a nice, friendly message. It helps
+ # keep novice users from being afraid of the computer.
+ BEGIN { print "Don't Panic!" }
+
+ You can put comment lines into keyboard-composed throwaway 'awk'
+programs, but this usually isn't very useful; the purpose of a comment
+is to help you or another person understand the program when reading it
+at a later time.
+
+ CAUTION: As mentioned in *note One-shot::, you can enclose short to
+ medium-sized programs in single quotes, in order to keep your shell
+ scripts self-contained. When doing so, _don't_ put an apostrophe
+ (i.e., a single quote) into a comment (or anywhere else in your
+ program). The shell interprets the quote as the closing quote for
+ the entire program. As a result, usually the shell prints a
+ message about mismatched quotes, and if 'awk' actually runs, it
+ will probably print strange messages about syntax errors. For
+ example, look at the following:
+
+ $ awk 'BEGIN { print "hello" } # let's be cute'
+ >
+
+ The shell sees that the first two quotes match, and that a new
+ quoted object begins at the end of the command line. It therefore
+ prompts with the secondary prompt, waiting for more input. With
+ Unix 'awk', closing the quoted string produces this result:
+
+ $ awk '{ print "hello" } # let's be cute'
+ > '
+ error-> awk: can't open file be
+ error-> source line number 1
+
+ Putting a backslash before the single quote in 'let's' wouldn't
+ help, because backslashes are not special inside single quotes.
+ The next node describes the shell's quoting rules.
+
+
+File: gawk.info, Node: Quoting, Prev: Comments, Up: Running gawk
+
+1.1.6 Shell Quoting Issues
+--------------------------
+
+* Menu:
+
+* DOS Quoting:: Quoting in Windows Batch Files.
+
+For short to medium-length 'awk' programs, it is most convenient to
+enter the program on the 'awk' command line. This is best done by
+enclosing the entire program in single quotes. This is true whether you
+are entering the program interactively at the shell prompt, or writing
+it as part of a larger shell script:
+
+ awk 'PROGRAM TEXT' INPUT-FILE1 INPUT-FILE2 ...
+
+ Once you are working with the shell, it is helpful to have a basic
+knowledge of shell quoting rules. The following rules apply only to
+POSIX-compliant, Bourne-style shells (such as Bash, the GNU Bourne-Again
+Shell). If you use the C shell, you're on your own.
+
+ Before diving into the rules, we introduce a concept that appears
+throughout this Info file, which is that of the "null", or empty,
+string.
+
+ The null string is character data that has no value. In other words,
+it is empty. It is written in 'awk' programs like this: '""'. In the
+shell, it can be written using single or double quotes: '""' or ''''.
+Although the null string has no characters in it, it does exist. For
+example, consider this command:
+
+ $ echo ""
+
+Here, the 'echo' utility receives a single argument, even though that
+argument has no characters in it. In the rest of this Info file, we use
+the terms "null string" and "empty string" interchangeably. Now, on to
+the quoting rules:
+
+ * Quoted items can be concatenated with nonquoted items as well as
+ with other quoted items. The shell turns everything into one
+ argument for the command.
+
+ * Preceding any single character with a backslash ('\') quotes that
+ character. The shell removes the backslash and passes the quoted
+ character on to the command.
+
+ * Single quotes protect everything between the opening and closing
+ quotes. The shell does no interpretation of the quoted text,
+ passing it on verbatim to the command. It is _impossible_ to embed
+ a single quote inside single-quoted text. Refer back to *note
+ Comments:: for an example of what happens if you try.
+
+ * Double quotes protect most things between the opening and closing
+ quotes. The shell does at least variable and command substitution
+ on the quoted text. Different shells may do additional kinds of
+ processing on double-quoted text.
+
+ Because certain characters within double-quoted text are processed
+ by the shell, they must be "escaped" within the text. Of note are
+ the characters '$', '`', '\', and '"', all of which must be
+ preceded by a backslash within double-quoted text if they are to be
+ passed on literally to the program. (The leading backslash is
+ stripped first.) Thus, the example seen in *note Read Terminal:::
+
+ awk 'BEGIN { print "Don\47t Panic!" }'
+
+ could instead be written this way:
+
+ $ awk "BEGIN { print \"Don't Panic!\" }"
+ -| Don't Panic!
+
+ Note that the single quote is not special within double quotes.
+
+ * Null strings are removed when they occur as part of a non-null
+ command-line argument, while explicit null objects are kept. For
+ example, to specify that the field separator 'FS' should be set to
+ the null string, use:
+
+ awk -F "" 'PROGRAM' FILES # correct
+
+ Don't use this:
+
+ awk -F"" 'PROGRAM' FILES # wrong!
+
+ In the second case, 'awk' attempts to use the text of the program
+ as the value of 'FS', and the first file name as the text of the
+ program! This results in syntax errors at best, and confusing
+ behavior at worst.
+
+ Mixing single and double quotes is difficult. You have to resort to
+shell quoting tricks, like this:
+
+ $ awk 'BEGIN { print "Here is a single quote <'"'"'>" }'
+ -| Here is a single quote <'>
+
+This program consists of three concatenated quoted strings. The first
+and the third are single-quoted, and the second is double-quoted.
+
+ This can be "simplified" to:
+
+ $ awk 'BEGIN { print "Here is a single quote <'\''>" }'
+ -| Here is a single quote <'>
+
+Judge for yourself which of these two is the more readable.
+
+ Another option is to use double quotes, escaping the embedded,
+'awk'-level double quotes:
+
+ $ awk "BEGIN { print \"Here is a single quote <'>\" }"
+ -| Here is a single quote <'>
+
+This option is also painful, because double quotes, backslashes, and
+dollar signs are very common in more advanced 'awk' programs.
+
+ A third option is to use the octal escape sequence equivalents (*note
+Escape Sequences::) for the single- and double-quote characters, like
+so:
+
+ $ awk 'BEGIN { print "Here is a single quote <\47>" }'
+ -| Here is a single quote <'>
+ $ awk 'BEGIN { print "Here is a double quote <\42>" }'
+ -| Here is a double quote <">
+
+This works nicely, but you should comment clearly what the escapes mean.
+
+ A fourth option is to use command-line variable assignment, like
+this:
+
+ $ awk -v sq="'" 'BEGIN { print "Here is a single quote <" sq ">" }'
+ -| Here is a single quote <'>
+
+ (Here, the two string constants and the value of 'sq' are
+concatenated into a single string that is printed by 'print'.)
+
+ If you really need both single and double quotes in your 'awk'
+program, it is probably best to move it into a separate file, where the
+shell won't be part of the picture and you can say what you mean.
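+
+   For example, stored in a program file (say 'quotes.awk', a name
+chosen just for this sketch), the following rule needs no shell quoting
+at all; only the 'awk'-level '\"' escape remains:
+
+     BEGIN { print "A single quote <'> and a double quote <\">" }
+
+Run it with 'awk -f quotes.awk'.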
+
+
+File: gawk.info, Node: DOS Quoting, Up: Quoting
+
+1.1.6.1 Quoting in MS-Windows Batch Files
+.........................................
+
+Although this Info file generally only worries about POSIX systems and
+the POSIX shell, the following issue arises often enough for many users
+that it is worth addressing.
+
+ The "shells" on Microsoft Windows systems use the double-quote
+character for quoting, and make it difficult or impossible to include an
+escaped double-quote character in a command-line script. The following
+example, courtesy of Jeroen Brink, shows how to print all lines in a
+file surrounded by double quotes:
+
+ gawk "{ print \"\042\" $0 \"\042\" }" FILE
+
+
+File: gawk.info, Node: Sample Data Files, Next: Very Simple, Prev: Running gawk, Up: Getting Started
+
+1.2 Data Files for the Examples
+===============================
+
+Many of the examples in this Info file take their input from two sample
+data files. The first, 'mail-list', represents a list of peoples' names
+together with their email addresses and information about those people.
+The second data file, called 'inventory-shipped', contains information
+about monthly shipments. In both files, each line is considered to be
+one "record".
+
+   In 'mail-list', each record contains a person's name, phone number,
+email address, and a code for that person's relationship with the
+author of the list. The columns are aligned using spaces. An
+'A' in the last column means that the person is an acquaintance. An 'F'
+in the last column means that the person is a friend. An 'R' means that
+the person is a relative:
+
+ Amelia 555-5553 amelia.zodiacusque@gmail.com F
+ Anthony 555-3412 anthony.asserturo@hotmail.com A
+ Becky 555-7685 becky.algebrarum@gmail.com A
+ Bill 555-1675 bill.drowning@hotmail.com A
+ Broderick 555-0542 broderick.aliquotiens@yahoo.com R
+ Camilla 555-2912 camilla.infusarum@skynet.be R
+ Fabius 555-1234 fabius.undevicesimus@ucb.edu F
+ Julie 555-6699 julie.perscrutabor@skeeve.com F
+ Martin 555-6480 martin.codicibus@hotmail.com A
+ Samuel 555-3430 samuel.lanceolis@shu.edu A
+ Jean-Paul 555-2127 jeanpaul.campanorum@nyu.edu R
+
+ The data file 'inventory-shipped' represents information about
+shipments during the year. Each record contains the month, the number
+of green crates shipped, the number of red boxes shipped, the number of
+orange bags shipped, and the number of blue packages shipped,
+respectively. There are 16 entries, covering the 12 months of last year
+and the first four months of the current year. An empty line separates
+the data for the two years:
+
+ Jan 13 25 15 115
+ Feb 15 32 24 226
+ Mar 15 24 34 228
+ Apr 31 52 63 420
+ May 16 34 29 208
+ Jun 31 42 75 492
+ Jul 24 34 67 436
+ Aug 15 34 47 316
+ Sep 13 55 37 277
+ Oct 29 54 68 525
+ Nov 20 87 82 577
+ Dec 17 35 61 401
+
+ Jan 21 36 64 620
+ Feb 26 58 80 652
+ Mar 24 75 70 495
+ Apr 21 70 74 514
+
+ The sample files are included in the 'gawk' distribution, in the
+directory 'awklib/eg/data'.
+
+
+File: gawk.info, Node: Very Simple, Next: Two Rules, Prev: Sample Data Files, Up: Getting Started
+
+1.3 Some Simple Examples
+========================
+
+The following command runs a simple 'awk' program that searches the
+input file 'mail-list' for the character string 'li' (a grouping of
+characters is usually called a "string"; the term "string" is based on
+similar usage in English, such as "a string of pearls" or "a string of
+cars in a train"):
+
+ awk '/li/ { print $0 }' mail-list
+
+When lines containing 'li' are found, they are printed because
+'print $0' means print the current line. (Just 'print' by itself means
+the same thing, so we could have written that instead.)
+
+ You will notice that slashes ('/') surround the string 'li' in the
+'awk' program. The slashes indicate that 'li' is the pattern to search
+for. This type of pattern is called a "regular expression", which is
+covered in more detail later (*note Regexp::). The pattern is allowed
+to match parts of words. There are single quotes around the 'awk'
+program so that the shell won't interpret any of it as special shell
+characters.
+
+ Here is what this program prints:
+
+ $ awk '/li/ { print $0 }' mail-list
+ -| Amelia 555-5553 amelia.zodiacusque@gmail.com F
+ -| Broderick 555-0542 broderick.aliquotiens@yahoo.com R
+ -| Julie 555-6699 julie.perscrutabor@skeeve.com F
+ -| Samuel 555-3430 samuel.lanceolis@shu.edu A
+
+ In an 'awk' rule, either the pattern or the action can be omitted,
+but not both. If the pattern is omitted, then the action is performed
+for _every_ input line. If the action is omitted, the default action is
+to print all lines that match the pattern.
+
+ Thus, we could leave out the action (the 'print' statement and the
+braces) in the previous example and the result would be the same: 'awk'
+prints all lines matching the pattern 'li'. By comparison, omitting the
+'print' statement but retaining the braces makes an empty action that
+does nothing (i.e., no lines are printed).
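+
+   To make the contrast concrete, compare these two commands (a small
+sketch using the sample 'mail-list' file). The first prints the
+matching lines; the second, with an empty action, prints nothing:
+
+     awk '/li/' mail-list
+     awk '/li/ { }' mail-list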
+
+ Many practical 'awk' programs are just a line or two long. Following
+is a collection of useful, short programs to get you started. Some of
+these programs contain constructs that haven't been covered yet. (The
+description of the program will give you a good idea of what is going
+on, but you'll need to read the rest of the Info file to become an 'awk'
+expert!) Most of the examples use a data file named 'data'. This is
+just a placeholder; if you use these programs yourself, substitute your
+own file names for 'data'. For future reference, note that there is
+often more than one way to do things in 'awk'. At some point, you may
+want to look back at these examples and see if you can come up with
+different ways to do the same things shown here:
+
+ * Print every line that is longer than 80 characters:
+
+ awk 'length($0) > 80' data
+
+ The sole rule has a relational expression as its pattern and has no
+ action--so it uses the default action, printing the record.
+
+ * Print the length of the longest input line:
+
+ awk '{ if (length($0) > max) max = length($0) }
+ END { print max }' data
+
+ The code associated with 'END' executes after all input has been
+ read; it's the other side of the coin to 'BEGIN'.
+
+ * Print the length of the longest line in 'data':
+
+ expand data | awk '{ if (x < length($0)) x = length($0) }
+ END { print "maximum line length is " x }'
+
+ This example differs slightly from the previous one: the input is
+ processed by the 'expand' utility to change TABs into spaces, so
+ the widths compared are actually the right-margin columns, as
+ opposed to the number of input characters on each line.
+
+ * Print every line that has at least one field:
+
+ awk 'NF > 0' data
+
+ This is an easy way to delete blank lines from a file (or rather,
+ to create a new file similar to the old file but from which the
+ blank lines have been removed).
+
+ * Print seven random numbers from 0 to 100, inclusive:
+
+ awk 'BEGIN { for (i = 1; i <= 7; i++)
+ print int(101 * rand()) }'
+
+ * Print the total number of bytes used by FILES:
+
+ ls -l FILES | awk '{ x += $5 }
+ END { print "total bytes: " x }'
+
+ * Print the total number of kilobytes used by FILES:
+
+ ls -l FILES | awk '{ x += $5 }
+ END { print "total K-bytes:", x / 1024 }'
+
+ * Print a sorted list of the login names of all users:
+
+ awk -F: '{ print $1 }' /etc/passwd | sort
+
+ * Count the lines in a file:
+
+ awk 'END { print NR }' data
+
+ * Print the even-numbered lines in the data file:
+
+ awk 'NR % 2 == 0' data
+
+ If you used the expression 'NR % 2 == 1' instead, the program would
+ print the odd-numbered lines.
+
+
+File: gawk.info, Node: Two Rules, Next: More Complex, Prev: Very Simple, Up: Getting Started
+
+1.4 An Example with Two Rules
+=============================
+
+The 'awk' utility reads the input files one line at a time. For each
+line, 'awk' tries the patterns of each rule. If several patterns match,
+then several actions execute in the order in which they appear in the
+'awk' program. If no patterns match, then no actions run.
+
+ After processing all the rules that match the line (and perhaps there
+are none), 'awk' reads the next line. (However, *note Next Statement::
+and also *note Nextfile Statement::.) This continues until the program
+reaches the end of the file. For example, the following 'awk' program
+contains two rules:
+
+ /12/ { print $0 }
+ /21/ { print $0 }
+
+The first rule has the string '12' as the pattern and 'print $0' as the
+action. The second rule has the string '21' as the pattern and also has
+'print $0' as the action. Each rule's action is enclosed in its own
+pair of braces.
+
+ This program prints every line that contains the string '12' _or_ the
+string '21'. If a line contains both strings, it is printed twice, once
+by each rule.
+
+ This is what happens if we run this program on our two sample data
+files, 'mail-list' and 'inventory-shipped':
+
+ $ awk '/12/ { print $0 }
+ > /21/ { print $0 }' mail-list inventory-shipped
+ -| Anthony 555-3412 anthony.asserturo@hotmail.com A
+ -| Camilla 555-2912 camilla.infusarum@skynet.be R
+ -| Fabius 555-1234 fabius.undevicesimus@ucb.edu F
+ -| Jean-Paul 555-2127 jeanpaul.campanorum@nyu.edu R
+ -| Jean-Paul 555-2127 jeanpaul.campanorum@nyu.edu R
+ -| Jan 21 36 64 620
+ -| Apr 21 70 74 514
+
+Note how the line beginning with 'Jean-Paul' in 'mail-list' was printed
+twice, once for each rule.
+
+
+File: gawk.info, Node: More Complex, Next: Statements/Lines, Prev: Two Rules, Up: Getting Started
+
+1.5 A More Complex Example
+==========================
+
+Now that we've mastered some simple tasks, let's look at what typical
+'awk' programs do. This example shows how 'awk' can be used to
+summarize, select, and rearrange the output of another utility. It uses
+features that haven't been covered yet, so don't worry if you don't
+understand all the details:
+
+ ls -l | awk '$6 == "Nov" { sum += $5 }
+ END { print sum }'
+
+ This command prints the total number of bytes in all the files in the
+current directory that were last modified in November (of any year).
+The 'ls -l' part of this example is a system command that gives you a
+listing of the files in a directory, including each file's size and the
+date the file was last modified. Its output looks like this:
+
+ -rw-r--r-- 1 arnold user 1933 Nov 7 13:05 Makefile
+ -rw-r--r-- 1 arnold user 10809 Nov 7 13:03 awk.h
+ -rw-r--r-- 1 arnold user 983 Apr 13 12:14 awk.tab.h
+ -rw-r--r-- 1 arnold user 31869 Jun 15 12:20 awkgram.y
+ -rw-r--r-- 1 arnold user 22414 Nov 7 13:03 awk1.c
+ -rw-r--r-- 1 arnold user 37455 Nov 7 13:03 awk2.c
+ -rw-r--r-- 1 arnold user 27511 Dec 9 13:07 awk3.c
+ -rw-r--r-- 1 arnold user 7989 Nov 7 13:03 awk4.c
+
+The first field contains the file's permissions, the second field
+contains the number of links to the file, and the third field identifies
+the file's owner. The fourth field identifies the file's group. The
+fifth field contains the file's size in bytes. The sixth, seventh, and
+eighth fields contain the month, day, and time, respectively, that the
+file was last modified. Finally, the ninth field contains the file
+name.
+
+ The '$6 == "Nov"' in our 'awk' program is an expression that tests
+whether the sixth field of the output from 'ls -l' matches the string
+'Nov'. Each time a line has the string 'Nov' for its sixth field, 'awk'
+performs the action 'sum += $5'. This adds the fifth field (the file's
+size) to the variable 'sum'. As a result, when 'awk' has finished
+reading all the input lines, 'sum' is the total of the sizes of the
+files whose lines matched the pattern. (This works because 'awk'
+variables are automatically initialized to zero.)
+
+ After the last line of output from 'ls' has been processed, the 'END'
+rule executes and prints the value of 'sum'. In this example, the value
+of 'sum' is 80600.
+
+ These more advanced 'awk' techniques are covered in later minor nodes
+(*note Action Overview::). Before you can move on to more advanced
+'awk' programming, you have to know how 'awk' interprets your input and
+displays your output. By manipulating fields and using 'print'
+statements, you can produce some very useful and impressive-looking
+reports.
+
+
+File: gawk.info, Node: Statements/Lines, Next: Other Features, Prev: More Complex, Up: Getting Started
+
+1.6 'awk' Statements Versus Lines
+=================================
+
+Most often, each line in an 'awk' program is a separate statement or
+separate rule, like this:
+
+ awk '/12/ { print $0 }
+ /21/ { print $0 }' mail-list inventory-shipped
+
+ However, 'gawk' ignores newlines after any of the following symbols
+and keywords:
+
+ , { ? : || && do else
+
+A newline at any other point is considered the end of the statement.(1)
+
+ If you would like to split a single statement into two lines at a
+point where a newline would terminate it, you can "continue" it by
+ending the first line with a backslash character ('\'). The backslash
+must be the final character on the line in order to be recognized as a
+continuation character. A backslash is allowed anywhere in the
+statement, even in the middle of a string or regular expression. For
+example:
+
+ awk '/This regular expression is too long, so continue it\
+ on the next line/ { print $1 }'
+
+We have generally not used backslash continuation in our sample
+programs. 'gawk' places no limit on the length of a line, so backslash
+continuation is never strictly necessary; it just makes programs more
+readable. For this same reason, as well as for clarity, we have kept
+most statements short in the programs presented throughout the Info
+file. Backslash continuation is most useful when your 'awk' program is
+in a separate source file instead of entered from the command line. You
+should also note that many 'awk' implementations are more particular
+about where you may use backslash continuation. For example, they may
+not allow you to split a string constant using backslash continuation.
+Thus, for maximum portability of your 'awk' programs, it is best not to
+split your lines in the middle of a regular expression or a string.
+
+ CAUTION: _Backslash continuation does not work as described with
+ the C shell._ It works for 'awk' programs in files and for
+ one-shot programs, _provided_ you are using a POSIX-compliant
+ shell, such as the Unix Bourne shell or Bash. But the C shell
+ behaves differently! There you must use two backslashes in a row,
+ followed by a newline. Note also that when using the C shell,
+ _every_ newline in your 'awk' program must be escaped with a
+ backslash. To illustrate:
+
+ % awk 'BEGIN { \
+ ? print \\
+ ? "hello, world" \
+ ? }'
+ -| hello, world
+
+ Here, the '%' and '?' are the C shell's primary and secondary
+ prompts, analogous to the standard shell's '$' and '>'.
+
+ Compare the previous example to how it is done with a
+ POSIX-compliant shell:
+
+ $ awk 'BEGIN {
+ > print \
+ > "hello, world"
+ > }'
+ -| hello, world
+
+ 'awk' is a line-oriented language. Each rule's action has to begin
+on the same line as the pattern. To have the pattern and action on
+separate lines, you _must_ use backslash continuation; there is no other
+option.
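+
+   For instance, a rule whose action starts on the following line must
+be written something like this (a minimal sketch):
+
+     /12/ \
+         { print $0 }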
+
+ Another thing to keep in mind is that backslash continuation and
+comments do not mix. As soon as 'awk' sees the '#' that starts a
+comment, it ignores _everything_ on the rest of the line. For example:
+
+ $ gawk 'BEGIN { print "dont panic" # a friendly \
+ > BEGIN rule
+ > }'
+ error-> gawk: cmd. line:2: BEGIN rule
+ error-> gawk: cmd. line:2: ^ syntax error
+
+In this case, it looks like the backslash would continue the comment
+onto the next line. However, the backslash-newline combination is never
+even noticed because it is "hidden" inside the comment. Thus, the
+'BEGIN' is noted as a syntax error.
+
+ When 'awk' statements within one rule are short, you might want to
+put more than one of them on a line. This is accomplished by separating
+the statements with a semicolon (';'). This also applies to the rules
+themselves. Thus, the program shown at the start of this minor node
+could also be written this way:
+
+ /12/ { print $0 } ; /21/ { print $0 }
+
+ NOTE: The requirement that states that rules on the same line must
+ be separated with a semicolon was not in the original 'awk'
+ language; it was added for consistency with the treatment of
+ statements within an action.
+
+ ---------- Footnotes ----------
+
+   (1) The '?' and ':' referred to here are part of the three-operand
+conditional expression described in *note Conditional Exp::. Splitting
+lines after '?' and ':' is a minor 'gawk' extension; if '--posix' is
+specified (*note Options::), then this extension is disabled.
+
+
+File: gawk.info, Node: Other Features, Next: When, Prev: Statements/Lines, Up: Getting Started
+
+1.7 Other Features of 'awk'
+===========================
+
+The 'awk' language provides a number of predefined, or "built-in",
+variables that your programs can use to get information from 'awk'.
+There are other variables your program can set as well to control how
+'awk' processes your data.
+
+ In addition, 'awk' provides a number of built-in functions for doing
+common computational and string-related operations. 'gawk' provides
+built-in functions for working with timestamps, performing bit
+manipulation, for runtime string translation (internationalization),
+determining the type of a variable, and array sorting.
+
+ As we develop our presentation of the 'awk' language, we will
+introduce most of the variables and many of the functions. They are
+described systematically in *note Built-in Variables:: and in *note
+Built-in::.
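+
+   As a small taste of what is to come, the following one-liner (a
+sketch using only features described later) prints each input line's
+number and length, using the built-in variable 'NR' and the built-in
+function 'length()':
+
+     awk '{ print NR, length($0) }' mail-list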
+
+
+File: gawk.info, Node: When, Next: Intro Summary, Prev: Other Features, Up: Getting Started
+
+1.8 When to Use 'awk'
+=====================
+
+Now that you've seen some of what 'awk' can do, you might wonder how
+'awk' could be useful for you. By using utility programs, advanced
+patterns, field separators, arithmetic statements, and other selection
+criteria, you can produce much more complex output. The 'awk' language
+is very useful for producing reports from large amounts of raw data,
+such as summarizing information from the output of other utility
+programs like 'ls'. (*Note More Complex::.)
+
+ Programs written with 'awk' are usually much smaller than they would
+be in other languages. This makes 'awk' programs easy to compose and
+use. Often, 'awk' programs can be quickly composed at your keyboard,
+used once, and thrown away. Because 'awk' programs are interpreted, you
+can avoid the (usually lengthy) compilation part of the typical
+edit-compile-test-debug cycle of software development.
+
+ Complex programs have been written in 'awk', including a complete
+retargetable assembler for eight-bit microprocessors (*note Glossary::,
+for more information), and a microcode assembler for a special-purpose
+Prolog computer. The original 'awk''s capabilities were strained by
+tasks of such complexity, but modern versions are more capable.
+
+ If you find yourself writing 'awk' scripts of more than, say, a few
+hundred lines, you might consider using a different programming
+language. The shell is good at string and pattern matching; in
+addition, it allows powerful use of the system utilities. Python offers
+a nice balance between high-level ease of programming and access to
+system facilities.(1)
+
+ ---------- Footnotes ----------
+
+ (1) Other popular scripting languages include Ruby and Perl.
+
+
+File: gawk.info, Node: Intro Summary, Prev: When, Up: Getting Started
+
+1.9 Summary
+===========
+
+ * Programs in 'awk' consist of PATTERN-ACTION pairs.
+
+   * An ACTION without a PATTERN always runs. The default ACTION for
+     a PATTERN without an ACTION of its own is '{ print $0 }'.
+
+ * Use either 'awk 'PROGRAM' FILES' or 'awk -f PROGRAM-FILE FILES' to
+ run 'awk'.
+
+ * You may use the special '#!' header line to create 'awk' programs
+ that are directly executable.
+
+ * Comments in 'awk' programs start with '#' and continue to the end
+ of the same line.
+
+ * Be aware of quoting issues when writing 'awk' programs as part of a
+ larger shell script (or MS-Windows batch file).
+
+ * You may use backslash continuation to continue a source line.
+ Lines are automatically continued after a comma, open brace,
+ question mark, colon, '||', '&&', 'do', and 'else'.
+
+
+File: gawk.info, Node: Invoking Gawk, Next: Regexp, Prev: Getting Started, Up: Top
+
+2 Running 'awk' and 'gawk'
+**************************
+
+This major node covers how to run 'awk', both POSIX-standard and
+'gawk'-specific command-line options, and what 'awk' and 'gawk' do with
+nonoption arguments. It then proceeds to cover how 'gawk' searches for
+source files, reading standard input along with other files, 'gawk''s
+environment variables, 'gawk''s exit status, using include files, and
+obsolete and undocumented options and/or features.
+
+ Many of the options and features described here are discussed in more
+detail later in the Info file; feel free to skip over things in this
+major node that don't interest you right now.
+
+* Menu:
+
+* Command Line:: How to run 'awk'.
+* Options:: Command-line options and their meanings.
+* Other Arguments:: Input file names and variable assignments.
+* Naming Standard Input:: How to specify standard input with other
+ files.
+* Environment Variables:: The environment variables 'gawk' uses.
+* Exit Status:: 'gawk''s exit status.
+* Include Files:: Including other files into your program.
+* Loading Shared Libraries:: Loading shared libraries into your program.
+* Obsolete:: Obsolete Options and/or features.
+* Undocumented:: Undocumented Options and Features.
+* Invoking Summary:: Invocation summary.
+
+
+File: gawk.info, Node: Command Line, Next: Options, Up: Invoking Gawk
+
+2.1 Invoking 'awk'
+==================
+
+There are two ways to run 'awk'--with an explicit program or with one or
+more program files. Here are templates for both of them; items enclosed
+in [...] in these templates are optional:
+
+ 'awk' [OPTIONS] '-f' PROGFILE ['--'] FILE ...
+ 'awk' [OPTIONS] ['--'] ''PROGRAM'' FILE ...
+
+ In addition to traditional one-letter POSIX-style options, 'gawk'
+also supports GNU long options.
+
+ It is possible to invoke 'awk' with an empty program:
+
+ awk '' datafile1 datafile2
+
+Doing so makes little sense, though; 'awk' exits silently when given an
+empty program. (d.c.) If '--lint' has been specified on the command
+line, 'gawk' issues a warning that the program is empty.
+
+
+File: gawk.info, Node: Options, Next: Other Arguments, Prev: Command Line, Up: Invoking Gawk
+
+2.2 Command-Line Options
+========================
+
+Options begin with a dash and consist of a single character. GNU-style
+long options consist of two dashes and a keyword. The keyword can be
+abbreviated, as long as the abbreviation allows the option to be
+uniquely identified. If the option takes an argument, either the
+keyword is immediately followed by an equals sign ('=') and the
+argument's value, or the keyword and the argument's value are separated
+by whitespace. If a particular option with a value is given more than
+once, it is the last value that counts.
+
+ Each long option for 'gawk' has a corresponding POSIX-style short
+option. The long and short options are interchangeable in all contexts.
+The following list describes options mandated by the POSIX standard:
+
+'-F FS'
+'--field-separator FS'
+ Set the 'FS' variable to FS (*note Field Separators::).
+
+'-f SOURCE-FILE'
+'--file SOURCE-FILE'
+     Read the 'awk' program source from SOURCE-FILE instead of from the
+ first nonoption argument. This option may be given multiple times;
+ the 'awk' program consists of the concatenation of the contents of
+ each specified SOURCE-FILE.
+
+'-v VAR=VAL'
+'--assign VAR=VAL'
+ Set the variable VAR to the value VAL _before_ execution of the
+ program begins. Such variable values are available inside the
+ 'BEGIN' rule (*note Other Arguments::).
+
+ The '-v' option can only set one variable, but it can be used more
+ than once, setting another variable each time, like this: 'awk
+ -v foo=1 -v bar=2 ...'.
+
+ CAUTION: Using '-v' to set the values of the built-in
+ variables may lead to surprising results. 'awk' will reset
+ the values of those variables as it needs to, possibly
+ ignoring any initial value you may have given.
+
+'-W GAWK-OPT'
+ Provide an implementation-specific option. This is the POSIX
+ convention for providing implementation-specific options. These
+ options also have corresponding GNU-style long options. Note that
+ the long options may be abbreviated, as long as the abbreviations
+ remain unique. The full list of 'gawk'-specific options is
+ provided next.
+
+'--'
+ Signal the end of the command-line options. The following
+ arguments are not treated as options even if they begin with '-'.
+ This interpretation of '--' follows the POSIX argument parsing
+ conventions.
+
+ This is useful if you have file names that start with '-', or in
+ shell scripts, if you have file names that will be specified by the
+ user that could start with '-'. It is also useful for passing
+ options on to the 'awk' program; see *note Getopt Function::.
+
+ The following list describes 'gawk'-specific options:
+
+'-b'
+'--characters-as-bytes'
+ Cause 'gawk' to treat all input data as single-byte characters. In
+ addition, all output written with 'print' or 'printf' is treated as
+ single-byte characters.
+
+ Normally, 'gawk' follows the POSIX standard and attempts to process
+ its input data according to the current locale (*note Locales::).
+ This can often involve converting multibyte characters into wide
+ characters (internally), and can lead to problems or confusion if
+ the input data does not contain valid multibyte characters. This
+ option is an easy way to tell 'gawk', "Hands off my data!"
+
+'-c'
+'--traditional'
+ Specify "compatibility mode", in which the GNU extensions to the
+ 'awk' language are disabled, so that 'gawk' behaves just like BWK
+ 'awk'. *Note POSIX/GNU::, which summarizes the extensions. Also
+ see *note Compatibility Mode::.
+
+'-C'
+'--copyright'
+ Print the short version of the General Public License and then
+ exit.
+
+'-d'[FILE]
+'--dump-variables'['='FILE]
+ Print a sorted list of global variables, their types, and final
+ values to FILE. If no FILE is provided, print this list to a file
+ named 'awkvars.out' in the current directory. No space is allowed
+ between the '-d' and FILE, if FILE is supplied.
+
+ Having a list of all global variables is a good way to look for
+ typographical errors in your programs. You would also use this
+ option if you have a large program with a lot of functions, and you
+ want to be sure that your functions don't inadvertently use global
+ variables that you meant to be local. (This is a particularly easy
+ mistake to make with simple variable names like 'i', 'j', etc.)
+
+'-D'[FILE]
+'--debug'['='FILE]
+ Enable debugging of 'awk' programs (*note Debugging::). By
+ default, the debugger reads commands interactively from the
+ keyboard (standard input). The optional FILE argument allows you
+ to specify a file with a list of commands for the debugger to
+ execute noninteractively. No space is allowed between the '-D' and
+ FILE, if FILE is supplied.
+
+'-e' PROGRAM-TEXT
+'--source' PROGRAM-TEXT
+ Provide program source code in the PROGRAM-TEXT. This option
+ allows you to mix source code in files with source code that you
+ enter on the command line. This is particularly useful when you
+ have library functions that you want to use from your command-line
+ programs (*note AWKPATH Variable::).
+
+'-E' FILE
+'--exec' FILE
+ Similar to '-f', read 'awk' program text from FILE. There are two
+ differences from '-f':
+
+ * This option terminates option processing; anything else on the
+ command line is passed on directly to the 'awk' program.
+
+ * Command-line variable assignments of the form 'VAR=VALUE' are
+ disallowed.
+
+ This option is particularly necessary for World Wide Web CGI
+ applications that pass arguments through the URL; using this option
+ prevents a malicious (or other) user from passing in options,
+ assignments, or 'awk' source code (via '-e') to the CGI
+ application.(1) This option should be used with '#!' scripts
+ (*note Executable Scripts::), like so:
+
+ #! /usr/local/bin/gawk -E
+
+ AWK PROGRAM HERE ...
+
+'-g'
+'--gen-pot'
+ Analyze the source program and generate a GNU 'gettext' portable
+ object template file on standard output for all string constants
+ that have been marked for translation. *Note
+ Internationalization::, for information about this option.
+
+'-h'
+'--help'
+ Print a "usage" message summarizing the short- and long-style
+ options that 'gawk' accepts and then exit.
+
+'-i' SOURCE-FILE
+'--include' SOURCE-FILE
+ Read an 'awk' source library from SOURCE-FILE. This option is
+ completely equivalent to using the '@include' directive inside your
+ program. It is very similar to the '-f' option, but there are two
+ important differences. First, when '-i' is used, the program
+ source is not loaded if it has been previously loaded, whereas with
+ '-f', 'gawk' always loads the file. Second, because this option is
+ intended to be used with code libraries, 'gawk' does not recognize
+ such files as constituting main program input. Thus, after
+ processing an '-i' argument, 'gawk' still expects to find the main
+ source code via the '-f' option or on the command line.
+
+'-l' EXT
+'--load' EXT
+ Load a dynamic extension named EXT. Extensions are stored as
+ system shared libraries. This option searches for the library
+ using the 'AWKLIBPATH' environment variable. The correct library
+ suffix for your platform will be supplied by default, so it need
+ not be specified in the extension name. The extension
+ initialization routine should be named 'dl_load()'. An alternative
+ is to use the '@load' keyword inside the program to load a shared
+ library. This advanced feature is described in detail in *note
+ Dynamic Extensions::.
+
+'-L'[VALUE]
+'--lint'['='VALUE]
+ Warn about constructs that are dubious or nonportable to other
+ 'awk' implementations. No space is allowed between the '-L' and
+ VALUE, if VALUE is supplied. Some warnings are issued when 'gawk'
+ first reads your program. Others are issued at runtime, as your
+ program executes. With an optional argument of 'fatal', lint
+ warnings become fatal errors. This may be drastic, but its use
+ will certainly encourage the development of cleaner 'awk' programs.
+ With an optional argument of 'invalid', only warnings about things
+ that are actually invalid are issued. (This is not fully
+ implemented yet.)
+
+ Some warnings are only printed once, even if the dubious constructs
+ they warn about occur multiple times in your 'awk' program. Thus,
+ when eliminating problems pointed out by '--lint', you should take
+ care to search for all occurrences of each inappropriate construct.
+ As 'awk' programs are usually short, doing so is not burdensome.
+
+'-M'
+'--bignum'
+ Select arbitrary-precision arithmetic on numbers. This option has
+ no effect if 'gawk' is not compiled to use the GNU MPFR and MP
+ libraries (*note Arbitrary Precision Arithmetic::).
+
+'-n'
+'--non-decimal-data'
+ Enable automatic interpretation of octal and hexadecimal values in
+ input data (*note Nondecimal Data::).
+
+ CAUTION: This option can severely break old programs. Use
+ with care. Also note that this option may disappear in a
+ future version of 'gawk'.
+
+'-N'
+'--use-lc-numeric'
+ Force the use of the locale's decimal point character when parsing
+ numeric input data (*note Locales::).
+
+'-o'[FILE]
+'--pretty-print'['='FILE]
+ Enable pretty-printing of 'awk' programs. Implies '--no-optimize'.
+ By default, the output program is created in a file named
+ 'awkprof.out' (*note Profiling::). The optional FILE argument
+ allows you to specify a different file name for the output. No
+ space is allowed between the '-o' and FILE, if FILE is supplied.
+
+ NOTE: In the past, this option would also execute your
+ program. This is no longer the case.
+
+'-O'
+'--optimize'
+ Enable 'gawk''s default optimizations on the internal
+ representation of the program. At the moment, this includes simple
+ constant folding and tail recursion elimination in function calls.
+
+ These optimizations are enabled by default. This option remains
+ primarily for backwards compatibility. However, it may be used to
+ cancel the effect of an earlier '-s' option (see later in this
+ list).
+
+'-p'[FILE]
+'--profile'['='FILE]
+ Enable profiling of 'awk' programs (*note Profiling::). Implies
+ '--no-optimize'. By default, profiles are created in a file named
+ 'awkprof.out'. The optional FILE argument allows you to specify a
+ different file name for the profile file. No space is allowed
+ between the '-p' and FILE, if FILE is supplied.
+
+ The profile contains execution counts for each statement in the
+ program in the left margin, and function call counts for each
+ function.
+
+'-P'
+'--posix'
+ Operate in strict POSIX mode. This disables all 'gawk' extensions
+ (just like '--traditional') and disables all extensions not allowed
+ by POSIX. *Note Common Extensions:: for a summary of the extensions
+ in 'gawk' that are disabled by this option. Also, the following
+ additional restrictions apply:
+
+ * Newlines are not allowed after '?' or ':' (*note Conditional
+ Exp::).
+
+ * Specifying '-Ft' on the command line does not set the value of
+ 'FS' to be a single TAB character (*note Field Separators::).
+
+ * The locale's decimal point character is used for parsing input
+ data (*note Locales::).
+
+ If you supply both '--traditional' and '--posix' on the command
+ line, '--posix' takes precedence. 'gawk' issues a warning if both
+ options are supplied.
+
+'-r'
+'--re-interval'
+ Allow interval expressions (*note Regexp Operators::) in regexps.
+ This is now 'gawk''s default behavior. Nevertheless, this option
+ remains (both for backward compatibility and for use in combination
+ with '--traditional').
+
+'-s'
+'--no-optimize'
+ Disable 'gawk''s default optimizations on the internal
+ representation of the program.
+
+'-S'
+'--sandbox'
+ Disable the 'system()' function, input redirections with 'getline',
+ output redirections with 'print' and 'printf', and dynamic
+ extensions. This is particularly useful when you want to run 'awk'
+ scripts from questionable sources and need to make sure the scripts
+ can't access your system (other than the specified input data
+ file).
+
+'-t'
+'--lint-old'
+ Warn about constructs that are not available in the original
+ version of 'awk' from Version 7 Unix (*note V7/SVR3.1::).
+
+'-V'
+'--version'
+ Print version information for this particular copy of 'gawk'. This
+ allows you to determine if your copy of 'gawk' is up to date with
+ respect to whatever the Free Software Foundation is currently
+ distributing. It is also useful for bug reports (*note Bugs::).
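+
+   As a quick illustration of combining options, the following command
+(a sketch; 'program.awk' and 'datafile' are placeholder names) enables
+lint warnings and presets a variable before the program runs:
+
+     gawk --lint -v debug=1 -f program.awk datafile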
+
+ As long as program text has been supplied, any other options are
+flagged as invalid with a warning message but are otherwise ignored.
+
+ In compatibility mode, as a special case, if the value of FS supplied
+to the '-F' option is 't', then 'FS' is set to the TAB character
+('"\t"'). This is true only for '--traditional' and not for '--posix'
+(*note Field Separators::).
+
+ The '-f' option may be used more than once on the command line. If
+it is, 'awk' reads its program source from all of the named files, as if
+they had been concatenated together into one big file. This is useful
+for creating libraries of 'awk' functions. These functions can be
+written once and then retrieved from a standard place, instead of having
+to be included in each individual program. The '-i' option is similar
+in this regard. (As mentioned in *note Definition Syntax::, function
+names must be unique.)
+
+ With standard 'awk', library functions can still be used, even if the
+program is entered at the keyboard, by specifying '-f /dev/tty'. After
+typing your program, type 'Ctrl-d' (the end-of-file character) to
+terminate it. (You may also use '-f -' to read program source from the
+standard input, but then you will not be able to also use the standard
+input as a source of data.)
+
+   Because it is clumsy to mix source-file and command-line 'awk'
+programs using the standard 'awk' mechanisms, 'gawk' provides the '-e'
+option. This does not require you to preempt the standard input for
+your source code; it allows you to easily mix command-line and library
+source code (*note AWKPATH Variable::). As with '-f', the '-e' and '-i'
+options may also be used multiple times on the command line.
+
+ If no '-f' or '-e' option is specified, then 'gawk' uses the first
+nonoption command-line argument as the text of the program source code.
+
+ If the environment variable 'POSIXLY_CORRECT' exists, then 'gawk'
+behaves in strict POSIX mode, exactly as if you had supplied '--posix'.
+Many GNU programs look for this environment variable to suppress
+extensions that conflict with POSIX, but 'gawk' behaves differently: it
+suppresses all extensions, even those that do not conflict with POSIX,
+and behaves in strict POSIX mode. If '--lint' is supplied on the
+command line and 'gawk' turns on POSIX mode because of
+'POSIXLY_CORRECT', then it issues a warning message indicating that
+POSIX mode is in effect. You would typically set this variable in your
+shell's startup file. For a Bourne-compatible shell (such as Bash), you
+would add these lines to the '.profile' file in your home directory:
+
+ POSIXLY_CORRECT=true
+ export POSIXLY_CORRECT
+
+ For a C shell-compatible shell,(2) you would add this line to the
+'.login' file in your home directory:
+
+ setenv POSIXLY_CORRECT true
+
+ Having 'POSIXLY_CORRECT' set is not recommended for daily use, but it
+is good for testing the portability of your programs to other
+environments.
+
+ ---------- Footnotes ----------
+
+ (1) For more detail, please see Section 4.4 of RFC 3875
+(http://www.ietf.org/rfc/rfc3875). Also see the explanatory note sent
+to the 'gawk' bug mailing list
+(http://lists.gnu.org/archive/html/bug-gawk/2014-11/msg00022.html).
+
+ (2) Not recommended.
+
+
+File: gawk.info, Node: Other Arguments, Next: Naming Standard Input, Prev: Options, Up: Invoking Gawk
+
+2.3 Other Command-Line Arguments
+================================
+
+Any additional arguments on the command line are normally treated as
+input files to be processed in the order specified. However, an
+argument that has the form 'VAR=VALUE' assigns the value VALUE to the
+variable VAR--it does not specify a file at all.  (See *note Assignment
+Options::.)  In the following example, 'count=1' is a variable
+assignment, not a file name:
+
+ awk -f program.awk file1 count=1 file2
+
+ All the command-line arguments are made available to your 'awk'
+program in the 'ARGV' array (*note Built-in Variables::). Command-line
+options and the program text (if present) are omitted from 'ARGV'. All
+other arguments, including variable assignments, are included. As each
+element of 'ARGV' is processed, 'gawk' sets 'ARGIND' to the index in
+'ARGV' of the current element.
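+
+   For example, the following command shows what ends up in 'ARGV'
+('ARGV[0]' holds the name used to invoke the program, here 'gawk'; the
+file name arguments need not exist, since a program consisting only of
+a 'BEGIN' rule reads no input):
+
+     $ gawk 'BEGIN { for (i = 0; i < ARGC; i++) print i, ARGV[i] }' \
+     >    file1 count=1 file2
+     -| 0 gawk
+     -| 1 file1
+     -| 2 count=1
+     -| 3 file2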
+
+ Changing 'ARGC' and 'ARGV' in your 'awk' program lets you control how
+'awk' processes the input files; this is described in more detail in
+*note ARGC and ARGV::.
+
+ The distinction between file name arguments and variable-assignment
+arguments is made when 'awk' is about to open the next input file. At
+that point in execution, it checks the file name to see whether it is
+really a variable assignment; if so, 'awk' sets the variable instead of
+reading a file.
+
+ Therefore, the variables actually receive the given values after all
+previously specified files have been read. In particular, the values of
+variables assigned in this fashion are _not_ available inside a 'BEGIN'
+rule (*note BEGIN/END::), because such rules are run before 'awk' begins
+scanning the argument list.
+
+ The variable values given on the command line are processed for
+escape sequences (*note Escape Sequences::). (d.c.)
+
+ In some very early implementations of 'awk', when a variable
+assignment occurred before any file names, the assignment would happen
+_before_ the 'BEGIN' rule was executed. 'awk''s behavior was thus
+inconsistent; some command-line assignments were available inside the
+'BEGIN' rule, while others were not. Unfortunately, some applications
+came to depend upon this "feature." When 'awk' was changed to be more
+consistent, the '-v' option was added to accommodate applications that
+depended upon the old behavior.
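+
+   As a minimal illustration of the difference (the variable name 'x'
+and the value 'hello' are arbitrary), a value given with '-v' is
+visible in a 'BEGIN' rule, while a trailing command-line assignment is
+not:
+
+     $ gawk -v x=hello 'BEGIN { print "x is <" x ">" }'
+     -| x is <hello>
+     $ gawk 'BEGIN { print "x is <" x ">" }' x=hello
+     -| x is <>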
+
+ The variable assignment feature is most useful for assigning to
+variables such as 'RS', 'OFS', and 'ORS', which control input and output
+formats, before scanning the data files. It is also useful for
+controlling state if multiple passes are needed over a data file. For
+example:
+
+ awk 'pass == 1 { PASS 1 STUFF }
+ pass == 2 { PASS 2 STUFF }' pass=1 mydata pass=2 mydata
+
+ Given the variable assignment feature, the '-F' option for setting
+the value of 'FS' is not strictly necessary. It remains for historical
+compatibility.
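+
+   For example, assuming a colon-separated file such as '/etc/passwd'
+exists on your system, the following three commands are equivalent ways
+of printing its first field:
+
+     awk -F: '{ print $1 }' /etc/passwd
+     awk -v FS=: '{ print $1 }' /etc/passwd
+     awk '{ print $1 }' FS=: /etc/passwd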
+
+
+File: gawk.info, Node: Naming Standard Input, Next: Environment Variables, Prev: Other Arguments, Up: Invoking Gawk
+
+2.4 Naming Standard Input
+=========================
+
+Often, you may wish to read standard input together with other files.
+For example, you may wish to read one file, read standard input coming
+from a pipe, and then read another file.
+
+ The way to name the standard input, with all versions of 'awk', is to
+use a single, standalone minus sign or dash, '-'. For example:
+
+ SOME_COMMAND | awk -f myprog.awk file1 - file2
+
+Here, 'awk' first reads 'file1', then it reads the output of
+SOME_COMMAND, and finally it reads 'file2'.
+
+ You may also use '"-"' to name standard input when reading files with
+'getline' (*note Getline/File::).
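+
+   For example, this reads one line from standard input in a 'BEGIN'
+rule (the text piped in here is arbitrary):
+
+     $ echo hello | gawk 'BEGIN { getline x < "-"; print "got:", x }'
+     -| got: hello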
+
+ In addition, 'gawk' allows you to specify the special file name
+'/dev/stdin', both on the command line and with 'getline'. Some other
+versions of 'awk' also support this, but it is not standard. (Some
+operating systems provide a '/dev/stdin' file in the filesystem;
+however, 'gawk' always processes this file name itself.)
+
+
+File: gawk.info, Node: Environment Variables, Next: Exit Status, Prev: Naming Standard Input, Up: Invoking Gawk
+
+2.5 The Environment Variables 'gawk' Uses
+=========================================
+
+A number of environment variables influence how 'gawk' behaves.
+
+* Menu:
+
+* AWKPATH Variable:: Searching directories for 'awk'
+ programs.
+* AWKLIBPATH Variable:: Searching directories for 'awk' shared
+ libraries.
+* Other Environment Variables:: The environment variables.
+
+
+File: gawk.info, Node: AWKPATH Variable, Next: AWKLIBPATH Variable, Up: Environment Variables
+
+2.5.1 The 'AWKPATH' Environment Variable
+----------------------------------------
+
+The previous minor node described how 'awk' program files can be named
+on the command line with the '-f' option. In most 'awk'
+implementations, you must supply a precise pathname for each program
+file, unless the file is in the current directory. But with 'gawk', if
+the file name supplied to the '-f' or '-i' options does not contain a
+directory separator '/', then 'gawk' searches a list of directories
+(called the "search path") one by one, looking for a file with the
+specified name.
+
+ The search path is a string consisting of directory names separated
+by colons.(1) 'gawk' gets its search path from the 'AWKPATH'
+environment variable. If that variable does not exist, or if it has an
+empty value, 'gawk' uses a default path (described shortly).
+
+ The search path feature is particularly helpful for building
+libraries of useful 'awk' functions. The library files can be placed in
+a standard directory in the default path and then specified on the
+command line with a short file name. Otherwise, you would have to type
+the full file name for each file.
+
+ By using the '-i' or '-f' options, your command-line 'awk' programs
+can use facilities in 'awk' library files (*note Library Functions::).
+Path searching is not done if 'gawk' is in compatibility mode. This is
+true for both '--traditional' and '--posix'. *Note Options::.
+
+ If the source code file is not found after the initial search, the
+path is searched again after adding the suffix '.awk' to the file name.
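+
+   For example, with a hypothetical library directory (the directory
+and file names below are purely illustrative), you might set up and use
+the search path like this:
+
+     AWKPATH=".:/home/me/awklib"        # hypothetical library directory
+     export AWKPATH
+     gawk -f mylib -f myprog.awk data1
+
+Here, 'gawk' looks for 'mylib' (or 'mylib.awk') first in the current
+directory and then in '/home/me/awklib'.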
+
+ 'gawk''s path search mechanism is similar to the shell's. (See 'The
+Bourne-Again SHell manual' (http://www.gnu.org/software/bash/manual/).)
+It treats a null entry in the path as indicating the current directory.
+(A null entry is indicated by starting or ending the path with a colon
+or by placing two colons next to each other ['::'].)
+
+ NOTE: To include the current directory in the path, either place
+ '.' as an entry in the path or write a null entry in the path.
+
+ Different past versions of 'gawk' would also look explicitly in the
+ current directory, either before or after the path search. As of
+ version 4.1.2, this no longer happens; if you wish to look in the
+ current directory, you must include '.' either as a separate entry
+ or as a null entry in the search path.
+
+ The default value for 'AWKPATH' is '.:/usr/local/share/awk'.(2)
+Since '.' is included at the beginning, 'gawk' searches first in the
+current directory and then in '/usr/local/share/awk'. In practice, this
+means that you will rarely need to change the value of 'AWKPATH'.
+
+ *Note Shell Startup Files::, for information on functions that help
+to manipulate the 'AWKPATH' variable.
+
+ 'gawk' places the value of the search path that it used into
+'ENVIRON["AWKPATH"]'. This provides access to the actual search path
+value from within an 'awk' program.
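+
+   For example, assuming the default path described earlier is in
+effect and 'AWKPATH' is not set in the environment, you might see:
+
+     $ gawk 'BEGIN { print ENVIRON["AWKPATH"] }'
+     -| .:/usr/local/share/awk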
+
+ Although you can change 'ENVIRON["AWKPATH"]' within your 'awk'
+program, this has no effect on the running program's behavior. This
+makes sense: the 'AWKPATH' environment variable is used to find the
+program source files. Once your program is running, all the files have
+been found, and 'gawk' no longer needs to use 'AWKPATH'.
+
+ ---------- Footnotes ----------
+
+ (1) Semicolons on MS-Windows.
+
+ (2) Your version of 'gawk' may use a different directory; it will
+depend upon how 'gawk' was built and installed. The actual directory is
+the value of '$(datadir)' generated when 'gawk' was configured. You
+probably don't need to worry about this, though.
+
+
+File: gawk.info, Node: AWKLIBPATH Variable, Next: Other Environment Variables, Prev: AWKPATH Variable, Up: Environment Variables
+
+2.5.2 The 'AWKLIBPATH' Environment Variable
+-------------------------------------------
+
+The 'AWKLIBPATH' environment variable is similar to the 'AWKPATH'
+variable, but it is used to search for loadable extensions (stored as
+system shared libraries) specified with the '-l' option rather than for
+source files. If the extension is not found, the path is searched again
+after adding the appropriate shared library suffix for the platform.
+For example, on GNU/Linux systems, the suffix '.so' is used. The search
+path specified is also used for extensions loaded via the '@load'
+keyword (*note Loading Shared Libraries::).
+
+ If 'AWKLIBPATH' does not exist in the environment, or if it has an
+empty value, 'gawk' uses a default path; this is typically
+'/usr/local/lib/gawk', although it can vary depending upon how 'gawk'
+was built.
+
+ *Note Shell Startup Files::, for information on functions that help
+to manipulate the 'AWKLIBPATH' variable.
+
+ 'gawk' places the value of the search path that it used into
+'ENVIRON["AWKLIBPATH"]'. This provides access to the actual search path
+value from within an 'awk' program.
+
+
+File: gawk.info, Node: Other Environment Variables, Prev: AWKLIBPATH Variable, Up: Environment Variables
+
+2.5.3 Other Environment Variables
+---------------------------------
+
+A number of other environment variables affect 'gawk''s behavior, but
+they are more specialized. Those in the following list are meant to be
+used by regular users:
+
+'GAWK_MSEC_SLEEP'
+ Specifies the interval between connection retries, in milliseconds.
+ On systems that do not support the 'usleep()' system call, the
+ value is rounded up to an integral number of seconds.
+
+'GAWK_READ_TIMEOUT'
+ Specifies the time, in milliseconds, for 'gawk' to wait for input
+ before returning with an error. *Note Read Timeout::.
+
+'GAWK_SOCK_RETRIES'
+ Controls the number of times 'gawk' attempts to retry a two-way
+ TCP/IP (socket) connection before giving up. *Note TCP/IP
+ Networking::. Note that when nonfatal I/O is enabled (*note
+ Nonfatal::), 'gawk' only tries to open a TCP/IP socket once.
+
+'POSIXLY_CORRECT'
+ Causes 'gawk' to switch to POSIX-compatibility mode, disabling all
+ traditional and GNU extensions. *Note Options::.
+
+ The environment variables in the following list are meant for use by
+the 'gawk' developers for testing and tuning. They are subject to
+change. The variables are:
+
+'AWKBUFSIZE'
+ This variable only affects 'gawk' on POSIX-compliant systems. With
+ a value of 'exact', 'gawk' uses the size of each input file as the
+ size of the memory buffer to allocate for I/O. Otherwise, the value
+ should be a number, and 'gawk' uses that number as the size of the
+ buffer to allocate. (When this variable is not set, 'gawk' uses
+ the smaller of the file's size and the "default" blocksize, which
+ is usually the filesystem's I/O blocksize.)
+
+'AWK_HASH'
+ If this variable exists with a value of 'gst', 'gawk' switches to
+ using the hash function from GNU Smalltalk for managing arrays.
+ This function may be marginally faster than the standard function.
+
+'AWKREADFUNC'
+ If this variable exists, 'gawk' switches to reading source files
+ one line at a time, instead of reading in blocks. This exists for
+ debugging problems on filesystems on non-POSIX operating systems
+ where I/O is performed in records, not in blocks.
+
+'GAWK_MSG_SRC'
+ If this variable exists, 'gawk' includes the file name and line
+ number within the 'gawk' source code from which warning and/or
+ fatal messages are generated. Its purpose is to help isolate the
+ source of a message, as there are multiple places that produce the
+ same warning or error message.
+
+'GAWK_LOCALE_DIR'
+ Specifies the location of compiled message object files for 'gawk'
+ itself. This is passed to the 'bindtextdomain()' function when
+ 'gawk' starts up.
+
+'GAWK_NO_DFA'
+ If this variable exists, 'gawk' does not use the DFA regexp matcher
+ for "does it match" kinds of tests. This can cause 'gawk' to be
+ slower. Its purpose is to help isolate differences between the two
+ regexp matchers that 'gawk' uses internally. (There aren't
+ supposed to be differences, but occasionally theory and practice
+ don't coordinate with each other.)
+
+'GAWK_STACKSIZE'
+ This specifies the amount by which 'gawk' should grow its internal
+ evaluation stack, when needed.
+
+'INT_CHAIN_MAX'
+     This specifies the intended maximum number of items 'gawk' will
+ maintain on a hash chain for managing arrays indexed by integers.
+
+'STR_CHAIN_MAX'
+     This specifies the intended maximum number of items 'gawk' will
+ maintain on a hash chain for managing arrays indexed by strings.
+
+'TIDYMEM'
+ If this variable exists, 'gawk' uses the 'mtrace()' library calls
+ from the GNU C library to help track down possible memory leaks.
+
+
+File: gawk.info, Node: Exit Status, Next: Include Files, Prev: Environment Variables, Up: Invoking Gawk
+
+2.6 'gawk''s Exit Status
+========================
+
+If the 'exit' statement is used with a value (*note Exit Statement::),
+then 'gawk' exits with the numeric value given to it.
+
+ Otherwise, if there were no problems during execution, 'gawk' exits
+with the value of the C constant 'EXIT_SUCCESS'. This is usually zero.
+
+ If an error occurs, 'gawk' exits with the value of the C constant
+'EXIT_FAILURE'. This is usually one.
+
+ If 'gawk' exits because of a fatal error, the exit status is two. On
+non-POSIX systems, this value may be mapped to 'EXIT_FAILURE'.
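+
+   For example, from a POSIX-style shell you can observe the exit
+status with '$?':
+
+     $ gawk 'BEGIN { exit 3 }'
+     $ echo $?
+     -| 3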
+
+
+File: gawk.info, Node: Include Files, Next: Loading Shared Libraries, Prev: Exit Status, Up: Invoking Gawk
+
+2.7 Including Other Files into Your Program
+===========================================
+
+This minor node describes a feature that is specific to 'gawk'.
+
+ The '@include' keyword can be used to read external 'awk' source
+files. This gives you the ability to split large 'awk' source files
+into smaller, more manageable pieces, and also lets you reuse common
+'awk' code from various 'awk' scripts. In other words, you can group
+together 'awk' functions used to carry out specific tasks into external
+files. These files can be used just like function libraries, using the
+'@include' keyword in conjunction with the 'AWKPATH' environment
+variable. Note that source files may also be included using the '-i'
+option.
+
+ Let's see an example. We'll start with two (trivial) 'awk' scripts,
+namely 'test1' and 'test2'. Here is the 'test1' script:
+
+ BEGIN {
+ print "This is script test1."
+ }
+
+and here is 'test2':
+
+ @include "test1"
+ BEGIN {
+ print "This is script test2."
+ }
+
+ Running 'gawk' with 'test2' produces the following result:
+
+ $ gawk -f test2
+ -| This is script test1.
+ -| This is script test2.
+
+ 'gawk' runs the 'test2' script, which includes 'test1' using the
+'@include' keyword. So, to include external 'awk' source files, you
+just use '@include' followed by the name of the file to be included,
+enclosed in double quotes.
+
+ NOTE: Keep in mind that this is a language construct and the file
+ name cannot be a string variable, but rather just a literal string
+ constant in double quotes.
+
+ The files to be included may be nested; e.g., given a third script,
+namely 'test3':
+
+ @include "test2"
+ BEGIN {
+ print "This is script test3."
+ }
+
+Running 'gawk' with the 'test3' script produces the following results:
+
+ $ gawk -f test3
+ -| This is script test1.
+ -| This is script test2.
+ -| This is script test3.
+
+ The file name can, of course, be a pathname. For example:
+
+ @include "../io_funcs"
+
+and:
+
+ @include "/usr/awklib/network"
+
+are both valid. The 'AWKPATH' environment variable can be of great
+value when using '@include'. The same rules for the use of the
+'AWKPATH' variable in command-line file searches (*note AWKPATH
+Variable::) apply to '@include' also.
+
+ This is very helpful in constructing 'gawk' function libraries. If
+you have a large script with useful, general-purpose 'awk' functions,
+you can break it down into library files and put those files in a
+special directory. You can then include those "libraries," either by
+using the full pathnames of the files, or by setting the 'AWKPATH'
+environment variable accordingly and then using '@include' with just the
+file part of the full pathname. Of course, you can keep library files
+in more than one directory; the more complex the working environment is,
+the more directories you may need to organize the files to be included.
+
+ Given the ability to specify multiple '-f' options, the '@include'
+mechanism is not strictly necessary. However, the '@include' keyword
+can help you in constructing self-contained 'gawk' programs, thus
+reducing the need for writing complex and tedious command lines. In
+particular, '@include' is very useful for writing CGI scripts to be run
+from web pages.
+
+ As mentioned in *note AWKPATH Variable::, the current directory is
+always searched first for source files, before searching in 'AWKPATH';
+this also applies to files named with '@include'.
+
+
+File: gawk.info, Node: Loading Shared Libraries, Next: Obsolete, Prev: Include Files, Up: Invoking Gawk
+
+2.8 Loading Dynamic Extensions into Your Program
+================================================
+
+This minor node describes a feature that is specific to 'gawk'.
+
+ The '@load' keyword can be used to read external 'awk' extensions
+(stored as system shared libraries). This allows you to link in
+compiled code that may offer superior performance and/or give you access
+to extended capabilities not supported by the 'awk' language. The
+'AWKLIBPATH' variable is used to search for the extension. Using
+'@load' is completely equivalent to using the '-l' command-line option.
+
+ If the extension is not initially found in 'AWKLIBPATH', another
+search is conducted after appending the platform's default shared
+library suffix to the file name. For example, on GNU/Linux systems, the
+suffix '.so' is used:
+
+ $ gawk '@load "ordchr"; BEGIN {print chr(65)}'
+ -| A
+
+This is equivalent to the following example:
+
+ $ gawk -lordchr 'BEGIN {print chr(65)}'
+ -| A
+
+For command-line usage, the '-l' option is more convenient, but '@load'
+is useful for embedding inside an 'awk' source file that requires access
+to an extension.
+
+ *note Dynamic Extensions::, describes how to write extensions (in C
+or C++) that can be loaded with either '@load' or the '-l' option. It
+also describes the 'ordchr' extension.
+
+
+File: gawk.info, Node: Obsolete, Next: Undocumented, Prev: Loading Shared Libraries, Up: Invoking Gawk
+
+2.9 Obsolete Options and/or Features
+====================================
+
+This minor node describes features and/or command-line options from
+previous releases of 'gawk' that either are not available in the current
+version or are still supported but deprecated (meaning that they will
+_not_ be in the next release).
+
+ The process-related special files '/dev/pid', '/dev/ppid',
+'/dev/pgrpid', and '/dev/user' were deprecated in 'gawk' 3.1, but still
+worked. As of version 4.0, they are no longer interpreted specially by
+'gawk'. (Use 'PROCINFO' instead; see *note Auto-set::.)
+
+
+File: gawk.info, Node: Undocumented, Next: Invoking Summary, Prev: Obsolete, Up: Invoking Gawk
+
+2.10 Undocumented Options and Features
+======================================
+
+ Use the Source, Luke!
+ -- _Obi-Wan_
+
+ This minor node intentionally left blank.
+
+
+File: gawk.info, Node: Invoking Summary, Prev: Undocumented, Up: Invoking Gawk
+
+2.11 Summary
+============
+
+ * Use either 'awk 'PROGRAM' FILES' or 'awk -f PROGRAM-FILE FILES' to
+ run 'awk'.
+
+ * The three standard options for all versions of 'awk' are '-f',
+ '-F', and '-v'. 'gawk' supplies these and many others, as well as
+ corresponding GNU-style long options.
+
+ * Nonoption command-line arguments are usually treated as file names,
+ unless they have the form 'VAR=VALUE', in which case they are taken
+ as variable assignments to be performed at that point in processing
+ the input.
+
+ * All nonoption command-line arguments, excluding the program text,
+ are placed in the 'ARGV' array. Adjusting 'ARGC' and 'ARGV'
+ affects how 'awk' processes input.
+
+ * You can use a single minus sign ('-') to refer to standard input on
+ the command line. 'gawk' also lets you use the special file name
+ '/dev/stdin'.
+
+ * 'gawk' pays attention to a number of environment variables.
+ 'AWKPATH', 'AWKLIBPATH', and 'POSIXLY_CORRECT' are the most
+ important ones.
+
+ * 'gawk''s exit status conveys information to the program that
+ invoked it. Use the 'exit' statement from within an 'awk' program
+ to set the exit status.
+
+ * 'gawk' allows you to include other 'awk' source files into your
+ program using the '@include' statement and/or the '-i' and '-f'
+ command-line options.
+
+ * 'gawk' allows you to load additional functions written in C or C++
+ using the '@load' statement and/or the '-l' option. (This advanced
+ feature is described later, in *note Dynamic Extensions::.)
+
+
+File: gawk.info, Node: Regexp, Next: Reading Files, Prev: Invoking Gawk, Up: Top
+
+3 Regular Expressions
+*********************
+
+A "regular expression", or "regexp", is a way of describing a set of
+strings. Because regular expressions are such a fundamental part of
+'awk' programming, their format and use deserve a separate major node.
+
+ A regular expression enclosed in slashes ('/') is an 'awk' pattern
+that matches every input record whose text belongs to that set. The
+simplest regular expression is a sequence of letters, numbers, or both.
+Such a regexp matches any string that contains that sequence. Thus, the
+regexp 'foo' matches any string containing 'foo'.  Therefore, the
+pattern '/foo/' matches any input record containing the three adjacent
+characters 'foo' _anywhere_ in the record. Other kinds of regexps let
+you specify more complicated classes of strings.
+
+* Menu:
+
+* Regexp Usage:: How to Use Regular Expressions.
+* Escape Sequences:: How to write nonprinting characters.
+* Regexp Operators:: Regular Expression Operators.
+* Bracket Expressions:: What can go between '[...]'.
+* Leftmost Longest:: How much text matches.
+* Computed Regexps:: Using Dynamic Regexps.
+* GNU Regexp Operators:: Operators specific to GNU software.
+* Case-sensitivity:: How to do case-insensitive matching.
+* Strong Regexp Constants:: Strongly typed regexp constants.
+* Regexp Summary:: Regular expressions summary.
+
+
+File: gawk.info, Node: Regexp Usage, Next: Escape Sequences, Up: Regexp
+
+3.1 How to Use Regular Expressions
+==================================
+
+A regular expression can be used as a pattern by enclosing it in
+slashes. Then the regular expression is tested against the entire text
+of each record. (Normally, it only needs to match some part of the text
+in order to succeed.) For example, the following prints the second
+field of each record where the string 'li' appears anywhere in the
+record:
+
+ $ awk '/li/ { print $2 }' mail-list
+ -| 555-5553
+ -| 555-0542
+ -| 555-6699
+ -| 555-3430
+
+ Regular expressions can also be used in matching expressions. These
+expressions allow you to specify the string to match against; it need
+not be the entire current input record. The two operators '~' and '!~'
+perform regular expression comparisons. Expressions using these
+operators can be used as patterns, or in 'if', 'while', 'for', and 'do'
+statements. (*Note Statements::.) For example, the following is true
+if the expression EXP (taken as a string) matches REGEXP:
+
+ EXP ~ /REGEXP/
+
+This example matches, or selects, all input records with the uppercase
+letter 'J' somewhere in the first field:
+
+ $ awk '$1 ~ /J/' inventory-shipped
+ -| Jan 13 25 15 115
+ -| Jun 31 42 75 492
+ -| Jul 24 34 67 436
+ -| Jan 21 36 64 620
+
+ So does this:
+
+ awk '{ if ($1 ~ /J/) print }' inventory-shipped
+
+ This next example is true if the expression EXP (taken as a character
+string) does _not_ match REGEXP:
+
+ EXP !~ /REGEXP/
+
+ The following example matches, or selects, all input records whose
+first field _does not_ contain the uppercase letter 'J':
+
+ $ awk '$1 !~ /J/' inventory-shipped
+ -| Feb 15 32 24 226
+ -| Mar 15 24 34 228
+ -| Apr 31 52 63 420
+ -| May 16 34 29 208
+ ...
+
+ When a regexp is enclosed in slashes, such as '/foo/', we call it a
+"regexp constant", much like '5.27' is a numeric constant and '"foo"' is
+a string constant.
+
+
+File: gawk.info, Node: Escape Sequences, Next: Regexp Operators, Prev: Regexp Usage, Up: Regexp
+
+3.2 Escape Sequences
+====================
+
+Some characters cannot be included literally in string constants
+('"foo"') or regexp constants ('/foo/'). Instead, they should be
+represented with "escape sequences", which are character sequences
+beginning with a backslash ('\'). One use of an escape sequence is to
+include a double-quote character in a string constant. Because a plain
+double quote ends the string, you must use '\"' to represent an actual
+double-quote character as a part of the string. For example:
+
+ $ awk 'BEGIN { print "He said \"hi!\" to her." }'
+ -| He said "hi!" to her.
+
+ The backslash character itself is another character that cannot be
+included normally; you must write '\\' to put one backslash in the
+string or regexp. Thus, the string whose contents are the two
+characters '"' and '\' must be written '"\"\\"'.
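+
+   For example, printing that two-character string shows just the two
+characters:
+
+     $ awk 'BEGIN { print "\"\\" }'
+     -| "\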
+
+ Other escape sequences represent unprintable characters such as TAB
+or newline. There is nothing to stop you from entering most unprintable
+characters directly in a string constant or regexp constant, but they
+may look ugly.
+
+ The following list presents all the escape sequences used in 'awk'
+and what they represent. Unless noted otherwise, all these escape
+sequences apply to both string constants and regexp constants:
+
+'\\'
+ A literal backslash, '\'.
+
+'\a'
+ The "alert" character, 'Ctrl-g', ASCII code 7 (BEL). (This often
+ makes some sort of audible noise.)
+
+'\b'
+ Backspace, 'Ctrl-h', ASCII code 8 (BS).
+
+'\f'
+ Formfeed, 'Ctrl-l', ASCII code 12 (FF).
+
+'\n'
+ Newline, 'Ctrl-j', ASCII code 10 (LF).
+
+'\r'
+ Carriage return, 'Ctrl-m', ASCII code 13 (CR).
+
+'\t'
+ Horizontal TAB, 'Ctrl-i', ASCII code 9 (HT).
+
+'\v'
+ Vertical TAB, 'Ctrl-k', ASCII code 11 (VT).
+
+'\NNN'
+ The octal value NNN, where NNN stands for 1 to 3 digits between '0'
+ and '7'. For example, the code for the ASCII ESC (escape)
+ character is '\033'.
+
+'\xHH...'
+ The hexadecimal value HH, where HH stands for a sequence of
+ hexadecimal digits ('0'-'9', and either 'A'-'F' or 'a'-'f'). A
+     maximum of two digits is allowed after the '\x'.  Any further
+ hexadecimal digits are treated as simple letters or numbers.
+ (c.e.) (The '\x' escape sequence is not allowed in POSIX awk.)
+
+ CAUTION: In ISO C, the escape sequence continues until the
+ first nonhexadecimal digit is seen. For many years, 'gawk'
+ would continue incorporating hexadecimal digits into the value
+ until a non-hexadecimal digit or the end of the string was
+ encountered. However, using more than two hexadecimal digits
+ produced undefined results. As of version 4.2, only two
+ digits are processed.
+
+'\/'
+ A literal slash (necessary for regexp constants only). This
+ sequence is used when you want to write a regexp constant that
+ contains a slash (such as '/.*:\/home\/[[:alnum:]]+:.*/'; the
+ '[[:alnum:]]' notation is discussed in *note Bracket
+ Expressions::). Because the regexp is delimited by slashes, you
+ need to escape any slash that is part of the pattern, in order to
+ tell 'awk' to keep processing the rest of the regexp.
+
+'\"'
+ A literal double quote (necessary for string constants only). This
+ sequence is used when you want to write a string constant that
+ contains a double quote (such as '"He said \"hi!\" to her."').
+ Because the string is delimited by double quotes, you need to
+ escape any quote that is part of the string, in order to tell 'awk'
+ to keep processing the rest of the string.
+
+ In 'gawk', a number of additional two-character sequences that begin
+with a backslash have special meaning in regexps. *Note GNU Regexp
+Operators::.
+
+ In a regexp, a backslash before any character that is not in the
+previous list and not listed in *note GNU Regexp Operators:: means that
+the next character should be taken literally, even if it would normally
+be a regexp operator. For example, '/a\+b/' matches the three
+characters 'a+b'.
+
+ For complete portability, do not use a backslash before any character
+not shown in the previous list or that is not an operator.
+
+ Backslash Before Regular Characters
+
+ If you place a backslash in a string constant before something that
+is not one of the characters previously listed, POSIX 'awk' purposely
+leaves what happens as undefined. There are two choices:
+
+Strip the backslash out
+ This is what BWK 'awk' and 'gawk' both do. For example, '"a\qc"'
+ is the same as '"aqc"'. (Because this is such an easy bug both to
+ introduce and to miss, 'gawk' warns you about it.) Consider 'FS =
+ "[ \t]+\|[ \t]+"' to use vertical bars surrounded by whitespace as
+ the field separator. There should be two backslashes in the
+     string: 'FS = "[ \t]+\\|[ \t]+"'.
+
+Leave the backslash alone
+ Some other 'awk' implementations do this. In such implementations,
+ typing '"a\qc"' is the same as typing '"a\\qc"'.
+
+ To summarize:
+
+ * The escape sequences in the preceding list are always processed
+ first, for both string constants and regexp constants. This
+ happens very early, as soon as 'awk' reads your program.
+
+ * 'gawk' processes both regexp constants and dynamic regexps (*note
+ Computed Regexps::), for the special operators listed in *note GNU
+ Regexp Operators::.
+
+ * A backslash before any other character means to treat that
+ character literally.
+
+ Escape Sequences for Metacharacters
+
+ Suppose you use an octal or hexadecimal escape to represent a regexp
+metacharacter. (See *note Regexp Operators::.) Does 'awk' treat the
+character as a literal character or as a regexp operator?
+
+ Historically, such characters were taken literally. (d.c.) However,
+the POSIX standard indicates that they should be treated as real
+metacharacters, which is what 'gawk' does. In compatibility mode (*note
+Options::), 'gawk' treats the characters represented by octal and
+hexadecimal escape sequences literally when used in regexp constants.
+Thus, '/a\52b/' is equivalent to '/a\*b/'.
+
+
+File: gawk.info, Node: Regexp Operators, Next: Bracket Expressions, Prev: Escape Sequences, Up: Regexp
+
+3.3 Regular Expression Operators
+================================
+
+You can combine regular expressions with special characters, called
+"regular expression operators" or "metacharacters", to increase the
+power and versatility of regular expressions.
+
+ The escape sequences described in *note Escape Sequences:: are valid
+inside a regexp. They are introduced by a '\' and are recognized and
+converted into corresponding real characters as the very first step in
+processing regexps.
+
+ Here is a list of metacharacters. All characters that are not escape
+sequences and that are not listed here stand for themselves:
+
+'\'
+ This suppresses the special meaning of a character when matching.
+ For example, '\$' matches the character '$'.
+
+'^'
+ This matches the beginning of a string. '^@chapter' matches
+ '@chapter' at the beginning of a string, for example, and can be
+ used to identify chapter beginnings in Texinfo source files. The
+ '^' is known as an "anchor", because it anchors the pattern to
+ match only at the beginning of the string.
+
+ It is important to realize that '^' does not match the beginning of
+ a line (the point right after a '\n' newline character) embedded in
+ a string. The condition is not true in the following example:
+
+ if ("line1\nLINE 2" ~ /^L/) ...
+
+'$'
+ This is similar to '^', but it matches only at the end of a string.
+ For example, 'p$' matches a record that ends with a 'p'. The '$'
+ is an anchor and does not match the end of a line (the point right
+ before a '\n' newline character) embedded in a string. The
+ condition in the following example is not true:
+
+ if ("line1\nLINE 2" ~ /1$/) ...
+
+'.' (period)
+ This matches any single character, _including_ the newline
+ character. For example, '.P' matches any single character followed
+ by a 'P' in a string. Using concatenation, we can make a regular
+ expression such as 'U.A', which matches any three-character
+ sequence that begins with 'U' and ends with 'A'.
+
+ In strict POSIX mode (*note Options::), '.' does not match the NUL
+ character, which is a character with all bits equal to zero.
+ Otherwise, NUL is just another character. Other versions of 'awk'
+ may not be able to match the NUL character.
+
+'['...']'
+ This is called a "bracket expression".(1) It matches any _one_ of
+ the characters that are enclosed in the square brackets. For
+ example, '[MVX]' matches any one of the characters 'M', 'V', or 'X'
+ in a string. A full discussion of what can be inside the square
+ brackets of a bracket expression is given in *note Bracket
+ Expressions::.
+
+'[^'...']'
+ This is a "complemented bracket expression". The first character
+ after the '[' _must_ be a '^'. It matches any characters _except_
+ those in the square brackets. For example, '[^awk]' matches any
+ character that is not an 'a', 'w', or 'k'.
+
+'|'
+ This is the "alternation operator" and it is used to specify
+ alternatives. The '|' has the lowest precedence of all the regular
+ expression operators. For example, '^P|[aeiouy]' matches any
+ string that matches either '^P' or '[aeiouy]'. This means it
+ matches any string that starts with 'P' or contains (anywhere
+ within it) a lowercase English vowel.
+
+ The alternation applies to the largest possible regexps on either
+ side.
+
+'('...')'
+ Parentheses are used for grouping in regular expressions, as in
+ arithmetic. They can be used to concatenate regular expressions
+ containing the alternation operator, '|'. For example,
+ '@(samp|code)\{[^}]+\}' matches both '@code{foo}' and '@samp{bar}'.
+ (These are Texinfo formatting control sequences. The '+' is
+ explained further on in this list.)
+
+'*'
+ This symbol means that the preceding regular expression should be
+ repeated as many times as necessary to find a match. For example,
+ 'ph*' applies the '*' symbol to the preceding 'h' and looks for
+ matches of one 'p' followed by any number of 'h's. This also
+ matches just 'p' if no 'h's are present.
+
+ There are two subtle points to understand about how '*' works.
+ First, the '*' applies only to the single preceding regular
+ expression component (e.g., in 'ph*', it applies just to the 'h').
+ To cause '*' to apply to a larger subexpression, use parentheses:
+ '(ph)*' matches 'ph', 'phph', 'phphph', and so on.
+
+ Second, '*' finds as many repetitions as possible. If the text to
+ be matched is 'phhhhhhhhhhhhhhooey', 'ph*' matches all of the 'h's.
+
+'+'
+ This symbol is similar to '*', except that the preceding expression
+ must be matched at least once. This means that 'wh+y' would match
+ 'why' and 'whhy', but not 'wy', whereas 'wh*y' would match all
+ three.
+
+'?'
+ This symbol is similar to '*', except that the preceding expression
+ can be matched either once or not at all. For example, 'fe?d'
+ matches 'fed' and 'fd', but nothing else.
+
+'{'N'}'
+'{'N',}'
+'{'N','M'}'
+ One or two numbers inside braces denote an "interval expression".
+ If there is one number in the braces, the preceding regexp is
+ repeated N times. If there are two numbers separated by a comma,
+ the preceding regexp is repeated N to M times. If there is one
+ number followed by a comma, then the preceding regexp is repeated
+ at least N times:
+
+ 'wh{3}y'
+ Matches 'whhhy', but not 'why' or 'whhhhy'.
+
+ 'wh{3,5}y'
+ Matches 'whhhy', 'whhhhy', or 'whhhhhy' only.
+
+ 'wh{2,}y'
+ Matches 'whhy', 'whhhy', and so on.
+
+ Interval expressions were not traditionally available in 'awk'.
+ They were added as part of the POSIX standard to make 'awk' and
+ 'egrep' consistent with each other.
+
+ Initially, because old programs may use '{' and '}' in regexp
+ constants, 'gawk' did _not_ match interval expressions in regexps.
+
+ However, beginning with version 4.0, 'gawk' does match interval
+ expressions by default. This is because compatibility with POSIX
+ has become more important to most 'gawk' users than compatibility
+ with old programs.
+
+ For programs that use '{' and '}' in regexp constants, it is good
+ practice to always escape them with a backslash. Then the regexp
+ constants are valid and work the way you want them to, using any
+ version of 'awk'.(2)
+
+ Finally, when '{' and '}' appear in regexp constants in a way that
+ cannot be interpreted as an interval expression (such as '/q{a}/'),
+ then they stand for themselves.
+
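+   For example, the following uses an interval expression with
+'gsub()' (*note String Functions::) to replace 'whhhy' but not 'why':
+
+     $ echo why whhhy | gawk '{ gsub(/wh{3}y/, "<match>"); print }'
+     -| why <match>
+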
+ In regular expressions, the '*', '+', and '?' operators, as well as
+the braces '{' and '}', have the highest precedence, followed by
+concatenation, and finally by '|'. As in arithmetic, parentheses can
+change how operators are grouped.
+
+ In POSIX 'awk' and 'gawk', the '*', '+', and '?' operators stand for
+themselves when there is nothing in the regexp that precedes them. For
+example, '/+/' matches a literal plus sign. However, many other
+versions of 'awk' treat such a usage as a syntax error.
+
+ If 'gawk' is in compatibility mode (*note Options::), interval
+expressions are not available in regular expressions.
+
+ ---------- Footnotes ----------
+
+ (1) In other literature, you may see a bracket expression referred to
+as either a "character set", a "character class", or a "character list".
+
+ (2) Use two backslashes if you're using a string constant with a
+regexp operator or function.
+
+
+File: gawk.info, Node: Bracket Expressions, Next: Leftmost Longest, Prev: Regexp Operators, Up: Regexp
+
+3.4 Using Bracket Expressions
+=============================
+
+As mentioned earlier, a bracket expression matches any character among
+those listed between the opening and closing square brackets.
+
+ Within a bracket expression, a "range expression" consists of two
+characters separated by a hyphen. It matches any single character that
+sorts between the two characters, based upon the system's native
+character set. For example, '[0-9]' is equivalent to '[0123456789]'.
+(See *note Ranges and Locales:: for an explanation of how the POSIX
+standard and 'gawk' have changed over time. This is mainly of
+historical interest.)
+
+ With the increasing popularity of the Unicode character standard
+(http://www.unicode.org), there is an additional wrinkle to consider.
+Octal and hexadecimal escape sequences inside bracket expressions are
+taken to represent only single-byte characters (characters whose values
+fit within the range 0-255).  To match a range of characters where the
+endpoints of the range are larger than 255, enter the multibyte
+encodings of the characters directly.
+
+ To include one of the characters '\', ']', '-', or '^' in a bracket
+expression, put a '\' in front of it. For example:
+
+ [d\]]
+
+matches either 'd' or ']'. Additionally, if you place ']' right after
+the opening '[', the closing bracket is treated as one of the characters
+to be matched.
+
+ The treatment of '\' in bracket expressions is compatible with other
+'awk' implementations and is also mandated by POSIX. The regular
+expressions in 'awk' are a superset of the POSIX specification for
+Extended Regular Expressions (EREs). POSIX EREs are based on the
+regular expressions accepted by the traditional 'egrep' utility.
+
+ "Character classes" are a feature introduced in the POSIX standard.
+A character class is a special notation for describing lists of
+characters that have a specific attribute, but the actual characters can
+vary from country to country and/or from character set to character set.
+For example, the notion of what is an alphabetic character differs
+between the United States and France.
+
+ A character class is only valid in a regexp _inside_ the brackets of
+a bracket expression. Character classes consist of '[:', a keyword
+denoting the class, and ':]'. *note Table 3.1: table-char-classes.
+lists the character classes defined by the POSIX standard.
+
+Class Meaning
+--------------------------------------------------------------------------
+'[:alnum:]' Alphanumeric characters
+'[:alpha:]' Alphabetic characters
+'[:blank:]' Space and TAB characters
+'[:cntrl:]' Control characters
+'[:digit:]' Numeric characters
+'[:graph:]' Characters that are both printable and visible (a space is
+ printable but not visible, whereas an 'a' is both)
+'[:lower:]' Lowercase alphabetic characters
+'[:print:]' Printable characters (characters that are not control
+ characters)
+'[:punct:]' Punctuation characters (characters that are not letters,
+ digits, control characters, or space characters)
+'[:space:]' Space characters (such as space, TAB, and formfeed, to name
+ a few)
+'[:upper:]' Uppercase alphabetic characters
+'[:xdigit:]'Characters that are hexadecimal digits
+
+Table 3.1: POSIX character classes
+
+ For example, before the POSIX standard, you had to write
+'/[A-Za-z0-9]/' to match alphanumeric characters. If your character set
+had other alphabetic characters in it, this would not match them. With
+the POSIX character classes, you can write '/[[:alnum:]]/' to match the
+alphabetic and numeric characters in your character set.
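+
+   For instance, the following replaces every alphanumeric character in
+some arbitrary sample text with an asterisk:
+
+     $ echo "foo123 bar!" | gawk '{ gsub(/[[:alnum:]]/, "*"); print }'
+     -| ****** ***!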
+
+ Some utilities that match regular expressions provide a nonstandard
+'[:ascii:]' character class; 'awk' does not. However, you can simulate
+such a construct using '[\x00-\x7F]'. This matches all values
+numerically between zero and 127, which is the defined range of the
+ASCII character set. Use a complemented character list ('[^\x00-\x7F]')
+to match any single-byte characters that are not in the ASCII range.
+
+ Two additional special sequences can appear in bracket expressions.
+These apply to non-ASCII character sets, which can have single symbols
+(called "collating elements") that are represented with more than one
+character. They can also have several characters that are equivalent
+for "collating", or sorting, purposes. (For example, in French, a plain
+"e" and a grave-accented "e`" are equivalent.) These sequences are:
+
+Collating symbols
+ Multicharacter collating elements enclosed between '[.' and '.]'.
+ For example, if 'ch' is a collating element, then '[[.ch.]]' is a
+ regexp that matches this collating element, whereas '[ch]' is a
+ regexp that matches either 'c' or 'h'.
+
+Equivalence classes
+ Locale-specific names for a list of characters that are equal. The
+ name is enclosed between '[=' and '=]'. For example, the name 'e'
+ might be used to represent all of "e," "e^," "e`," and "e'." In
+ this case, '[[=e=]]' is a regexp that matches any of 'e', 'e^',
+ 'e'', or 'e`'.
+
+ These features are very valuable in non-English-speaking locales.
+
+ CAUTION: The library functions that 'gawk' uses for regular
+ expression matching currently recognize only POSIX character
+ classes; they do not recognize collating symbols or equivalence
+ classes.
+
+ Inside a bracket expression, an opening bracket ('[') that does not
+start a character class, collating element or equivalence class is taken
+literally. This is also true of '.' and '*'.
+
+
+File: gawk.info, Node: Leftmost Longest, Next: Computed Regexps, Prev: Bracket Expressions, Up: Regexp
+
+3.5 How Much Text Matches?
+==========================
+
+Consider the following:
+
+ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
+
+ This example uses the 'sub()' function to make a change to the input
+record. ('sub()' replaces the first instance of any text matched by the
+first argument with the string provided as the second argument; *note
+String Functions::.) Here, the regexp '/a+/' indicates "one or more 'a'
+characters," and the replacement text is '<A>'.
+
+ The input contains four 'a' characters. 'awk' (and POSIX) regular
+expressions always match the leftmost, _longest_ sequence of input
+characters that can match. Thus, all four 'a' characters are replaced
+with '<A>' in this example:
+
+ $ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
+ -| <A>bcd
+
+ For simple match/no-match tests, this is not so important. But when
+doing text matching and substitutions with the 'match()', 'sub()',
+'gsub()', and 'gensub()' functions, it is very important. *Note String
+Functions::, for more information on these functions. Understanding
+this principle is also important for regexp-based record and field
+splitting (*note Records::, and also *note Field Separators::).
+
+
+File: gawk.info, Node: Computed Regexps, Next: GNU Regexp Operators, Prev: Leftmost Longest, Up: Regexp
+
+3.6 Using Dynamic Regexps
+=========================
+
+The righthand side of a '~' or '!~' operator need not be a regexp
+constant (i.e., a string of characters between slashes). It may be any
+expression. The expression is evaluated and converted to a string if
+necessary; the contents of the string are then used as the regexp. A
+regexp computed in this way is called a "dynamic regexp" or a "computed
+regexp":
+
+ BEGIN { digits_regexp = "[[:digit:]]+" }
+ $0 ~ digits_regexp { print }
+
+This sets 'digits_regexp' to a regexp that describes one or more digits,
+and tests whether the input record matches this regexp.
+
+ NOTE: When using the '~' and '!~' operators, be aware that there is
+ a difference between a regexp constant enclosed in slashes and a
+ string constant enclosed in double quotes. If you are going to use
+ a string constant, you have to understand that the string is, in
+ essence, scanned _twice_: the first time when 'awk' reads your
+ program, and the second time when it goes to match the string on
+ the lefthand side of the operator with the pattern on the right.
+ This is true of any string-valued expression (such as
+ 'digits_regexp', shown in the previous example), not just string
+ constants.
+
+ What difference does it make if the string is scanned twice? The
+answer has to do with escape sequences, and particularly with
+backslashes. To get a backslash into a regular expression inside a
+string, you have to type two backslashes.
+
+ For example, '/\*/' is a regexp constant for a literal '*'. Only one
+backslash is needed. To do the same thing with a string, you have to
+type '"\\*"'. The first backslash escapes the second one so that the
+string actually contains the two characters '\' and '*'.
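+
+   A small demonstration of the equivalence, using an arbitrary input
+line that contains a '*':
+
+     $ echo "2 * 3" | gawk '$0 ~ /\*/'
+     -| 2 * 3
+     $ echo "2 * 3" | gawk '$0 ~ "\\*"'
+     -| 2 * 3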
+
+ Given that you can use both regexp and string constants to describe
+regular expressions, which should you use? The answer is "regexp
+constants," for several reasons:
+
+ * String constants are more complicated to write and more difficult
+ to read. Using regexp constants makes your programs less
+ error-prone. Not understanding the difference between the two
+ kinds of constants is a common source of errors.
+
+ * It is more efficient to use regexp constants. 'awk' can note that
+ you have supplied a regexp and store it internally in a form that
+ makes pattern matching more efficient. When using a string
+ constant, 'awk' must first convert the string into this internal
+ form and then perform the pattern matching.
+
+ * Using regexp constants is better form; it shows clearly that you
+ intend a regexp match.
+
+ Using '\n' in Bracket Expressions of Dynamic Regexps
+
+ Some older versions of 'awk' do not allow the newline character to be
+used inside a bracket expression for a dynamic regexp:
+
+ $ awk '$0 ~ "[ \t\n]"'
+ error-> awk: newline in character class [
+ error-> ]...
+ error-> source line number 1
+ error-> context is
+ error-> $0 ~ "[ >>> \t\n]" <<<
+
+ But a newline in a regexp constant works with no problem:
+
+ $ awk '$0 ~ /[ \t\n]/'
+ here is a sample line
+ -| here is a sample line
+ Ctrl-d
+
+ 'gawk' does not have this problem, and it isn't likely to occur often
+in practice, but it's worth noting for future reference.
+
+
+File: gawk.info, Node: GNU Regexp Operators, Next: Case-sensitivity, Prev: Computed Regexps, Up: Regexp
+
+3.7 'gawk'-Specific Regexp Operators
+====================================
+
+GNU software that deals with regular expressions provides a number of
+additional regexp operators. These operators are described in this
+minor node and are specific to 'gawk'; they are not available in other
+'awk' implementations. Most of the additional operators deal with word
+matching. For our purposes, a "word" is a sequence of one or more
+letters, digits, or underscores ('_'):
+
+'\s'
+ Matches any whitespace character. Think of it as shorthand for
+ '[[:space:]]'.
+
+'\S'
+ Matches any character that is not whitespace. Think of it as
+ shorthand for '[^[:space:]]'.
+
+'\w'
+ Matches any word-constituent character--that is, it matches any
+ letter, digit, or underscore. Think of it as shorthand for
+ '[[:alnum:]_]'.
+
+'\W'
+ Matches any character that is not word-constituent. Think of it as
+ shorthand for '[^[:alnum:]_]'.
+
+'\<'
+ Matches the empty string at the beginning of a word. For example,
+ '/\<away/' matches 'away' but not 'stowaway'.
+
+'\>'
+ Matches the empty string at the end of a word. For example,
+ '/stow\>/' matches 'stow' but not 'stowaway'.
+
+'\y'
+ Matches the empty string at either the beginning or the end of a
+ word (i.e., the word boundar*y*). For example, '\yballs?\y'
+ matches either 'ball' or 'balls', as a separate word.
+
+'\B'
+ Matches the empty string that occurs between two word-constituent
+ characters. For example, '/\Brat\B/' matches 'crate', but it does
+ not match 'dirty rat'. '\B' is essentially the opposite of '\y'.
+
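+   For example, using the '\<' operator just described, 'gsub()'
+replaces the word 'away' on its own but leaves 'stowaway' alone:
+
+     $ echo "away stowaway" | gawk '{ gsub(/\<away/, "X"); print }'
+     -| X stowaway
+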
+ There are two other operators that work on buffers. In Emacs, a
+"buffer" is, naturally, an Emacs buffer. Other GNU programs, including
+'gawk', consider the entire string to match as the buffer. The
+operators are:
+
+'\`'
+ Matches the empty string at the beginning of a buffer (string)
+
+'\''
+ Matches the empty string at the end of a buffer (string)
+
+ Because '^' and '$' always work in terms of the beginning and end of
+strings, these operators don't add any new capabilities for 'awk'. They
+are provided for compatibility with other GNU software.
+
+ In other GNU software, the word-boundary operator is '\b'. However,
+that conflicts with the 'awk' language's definition of '\b' as
+backspace, so 'gawk' uses a different letter. An alternative method
+would have been to require two backslashes in the GNU operators, but
+this was deemed too confusing. The current method of using '\y' for the
+GNU '\b' appears to be the lesser of two evils.
+
+ The various command-line options (*note Options::) control how 'gawk'
+interprets characters in regexps:
+
+No options
+ In the default case, 'gawk' provides all the facilities of POSIX
+ regexps and the GNU regexp operators described in *note Regexp
+ Operators::.
+
+'--posix'
+ Match only POSIX regexps; the GNU operators are not special (e.g.,
+ '\w' matches a literal 'w'). Interval expressions are allowed.
+
+'--traditional'
+ Match traditional Unix 'awk' regexps. The GNU operators are not
+ special, and interval expressions are not available. Because BWK
+ 'awk' supports them, the POSIX character classes ('[[:alnum:]]',
+ etc.) are available. Characters described by octal and
+ hexadecimal escape sequences are treated literally, even if they
+ represent regexp metacharacters.
+
+'--re-interval'
+ Allow interval expressions in regexps, if '--traditional' has been
+ provided. Otherwise, interval expressions are available by
+ default.
+
+
+File: gawk.info, Node: Case-sensitivity, Next: Strong Regexp Constants, Prev: GNU Regexp Operators, Up: Regexp
+
+3.8 Case Sensitivity in Matching
+================================
+
+Case is normally significant in regular expressions, both when matching
+ordinary characters (i.e., not metacharacters) and inside bracket
+expressions. Thus, a 'w' in a regular expression matches only a
+lowercase 'w' and not an uppercase 'W'.
+
+ The simplest way to do a case-independent match is to use a bracket
+expression--for example, '[Ww]'. However, this can be cumbersome if you
+need to use it often, and it can make the regular expressions harder to
+read. There are two alternatives that you might prefer.
+
+ One way to perform a case-insensitive match at a particular point in
+the program is to convert the data to a single case, using the
+'tolower()' or 'toupper()' built-in string functions (which we haven't
+discussed yet; *note String Functions::). For example:
+
+ tolower($1) ~ /foo/ { ... }
+
+converts the first field to lowercase before matching against it. This
+works in any POSIX-compliant 'awk'.
+
+ Another method, specific to 'gawk', is to set the variable
+'IGNORECASE' to a nonzero value (*note Built-in Variables::). When
+'IGNORECASE' is not zero, _all_ regexp and string operations ignore
+case.
+
+ Changing the value of 'IGNORECASE' dynamically controls the case
+sensitivity of the program as it runs. Case is significant by default
+because 'IGNORECASE' (like most variables) is initialized to zero:
+
+ x = "aB"
+ if (x ~ /ab/) ... # this test will fail
+
+ IGNORECASE = 1
+ if (x ~ /ab/) ... # now it will succeed
+
+ In general, you cannot use 'IGNORECASE' to make certain rules case
+insensitive and other rules case sensitive, as there is no
+straightforward way to set 'IGNORECASE' just for the pattern of a
+particular rule.(1) To do this, use either bracket expressions or
+'tolower()'. However, one thing you can do with 'IGNORECASE' only is
+dynamically turn case sensitivity on or off for all the rules at once.
+
+ 'IGNORECASE' can be set on the command line or in a 'BEGIN' rule
+(*note Other Arguments::; also *note Using BEGIN/END::). Setting
+'IGNORECASE' from the command line is a way to make a program case
+insensitive without having to edit it.
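+
+   For example, here case-insensitive matching is switched on for a
+one-line program without editing it:
+
+     $ echo Hello | gawk -v IGNORECASE=1 '/hello/ { print "matched" }'
+     -| matched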
+
+ In multibyte locales, the equivalences between upper- and lowercase
+characters are tested based on the wide-character values of the locale's
+character set. Otherwise, the characters are tested based on the
+ISO-8859-1 (ISO Latin-1) character set. This character set is a
+superset of the traditional 128 ASCII characters, which also provides a
+number of characters suitable for use with European languages.(2)
+
+ The value of 'IGNORECASE' has no effect if 'gawk' is in compatibility
+mode (*note Options::). Case is always significant in compatibility
+mode.
+
+ ---------- Footnotes ----------
+
+   (1) Experienced C and C++ programmers will note that it is possible
+to do this, using something like 'IGNORECASE = 1 && /foObAr/ { ... }'
+and 'IGNORECASE = 0 || /foobar/ { ... }'. However, this is somewhat
+obscure and we don't recommend it.
+
+ (2) If you don't understand this, don't worry about it; it just means
+that 'gawk' does the right thing.
+
+
+File: gawk.info, Node: Strong Regexp Constants, Next: Regexp Summary, Prev: Case-sensitivity, Up: Regexp
+
+3.9 Strongly Typed Regexp Constants
+===================================
+
+This minor node describes a 'gawk'-specific feature.
+
+ Regexp constants ('/.../') hold a strange position in the 'awk'
+language. In most contexts, they act like an expression: '$0 ~ /.../'.
+In other contexts, they denote only a regexp to be matched. In no case
+are they really a "first class citizen" of the language. That is, you
+cannot define a scalar variable whose type is "regexp" in the same sense
+that you can define a variable to be a number or a string:
+
+ num = 42 Numeric variable
+ str = "hi" String variable
+ re = /foo/ Wrong! re is the result of $0 ~ /foo/
+
+
+File: gawk.info, Node: Regexp Summary, Prev: Strong Regexp Constants, Up: Regexp
+
+3.10 Summary
+============
+
+ * Regular expressions describe sets of strings to be matched. In
+ 'awk', regular expression constants are written enclosed between
+ slashes: '/'...'/'.
+
+ * Regexp constants may be used standalone in patterns and in
+ conditional expressions, or as part of matching expressions using
+ the '~' and '!~' operators.
+
+ * Escape sequences let you represent nonprintable characters and also
+ let you represent regexp metacharacters as literal characters to be
+ matched.
+
+ * Regexp operators provide grouping, alternation, and repetition.
+
+ * Bracket expressions give you a shorthand for specifying sets of
+ characters that can match at a particular point in a regexp.
+ Within bracket expressions, POSIX character classes let you specify
+ certain groups of characters in a locale-independent fashion.
+
+ * Regular expressions match the leftmost longest text in the string
+ being matched. This matters for cases where you need to know the
+ extent of the match, such as for text substitution and when the
+ record separator is a regexp.
+
+ * Matching expressions may use dynamic regexps (i.e., string values
+ treated as regular expressions).
+
+ * 'gawk''s 'IGNORECASE' variable lets you control the case
+ sensitivity of regexp matching. In other 'awk' versions, use
+ 'tolower()' or 'toupper()'.
+
+
+File: gawk.info, Node: Reading Files, Next: Printing, Prev: Regexp, Up: Top
+
+4 Reading Input Files
+*********************
+
+In the typical 'awk' program, 'awk' reads all input either from the
+standard input (by default, this is the keyboard, but often it is a pipe
+from another command) or from files whose names you specify on the 'awk'
+command line. If you specify input files, 'awk' reads them in order,
+processing all the data from one before going on to the next. The name
+of the current input file can be found in the predefined variable
+'FILENAME' (*note Built-in Variables::).
+
+ The input is read in units called "records", and is processed by the
+rules of your program one record at a time. By default, each record is
+one line. Each record is automatically split into chunks called
+"fields". This makes it more convenient for programs to work on the
+parts of a record.
+
+ On rare occasions, you may need to use the 'getline' command. The
+'getline' command is valuable both because it can do explicit input from
+any number of files, and because the files used with it do not have to
+be named on the 'awk' command line (*note Getline::).
+
+* Menu:
+
+* Records:: Controlling how data is split into records.
+* Fields:: An introduction to fields.
+* Nonconstant Fields:: Nonconstant Field Numbers.
+* Changing Fields:: Changing the Contents of a Field.
+* Field Separators:: The field separator and how to change it.
+* Constant Size:: Reading constant width data.
+* Splitting By Content:: Defining Fields By Content
+* Multiple Line:: Reading multiline records.
+* Getline:: Reading files under explicit program control
+ using the 'getline' function.
+* Read Timeout:: Reading input with a timeout.
+* Retrying Input:: Retrying input after certain errors.
+* Command-line directories:: What happens if you put a directory on the
+ command line.
+* Input Summary:: Input summary.
+* Input Exercises:: Exercises.
+
+
+File: gawk.info, Node: Records, Next: Fields, Up: Reading Files
+
+4.1 How Input Is Split into Records
+===================================
+
+'awk' divides the input for your program into records and fields. It
+keeps track of the number of records that have been read so far from the
+current input file. This value is stored in a predefined variable
+called 'FNR', which is reset to zero every time a new file is started.
+Another predefined variable, 'NR', records the total number of input
+records read so far from all data files. It starts at zero, but is
+never automatically reset to zero.
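+
+   For example, the following one-liner (the file names are only
+placeholders) prints both counters for each record; 'FNR' starts over
+with each new file, while 'NR' keeps counting:
+
+     awk '{ print FILENAME, FNR, NR }' file1 file2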
+
+* Menu:
+
+* awk split records:: How standard 'awk' splits records.
+* gawk split records:: How 'gawk' splits records.
+
+
+File: gawk.info, Node: awk split records, Next: gawk split records, Up: Records
+
+4.1.1 Record Splitting with Standard 'awk'
+------------------------------------------
+
+Records are separated by a character called the "record separator". By
+default, the record separator is the newline character. This is why
+records are, by default, single lines. To use a different character for
+the record separator, simply assign that character to the predefined
+variable 'RS'.
+
+ Like any other variable, the value of 'RS' can be changed in the
+'awk' program with the assignment operator, '=' (*note Assignment
+Ops::). The new record-separator character should be enclosed in
+quotation marks, which indicate a string constant. Often, the right
+time to do this is at the beginning of execution, before any input is
+processed, so that the very first record is read with the proper
+separator. To do this, use the special 'BEGIN' pattern (*note
+BEGIN/END::). For example:
+
+ awk 'BEGIN { RS = "u" }
+ { print $0 }' mail-list
+
+changes the value of 'RS' to 'u', before reading any input. The new
+value is a string whose first character is the letter "u"; as a result,
+records are separated by the letter "u". Then the input file is read,
+and the second rule in the 'awk' program (the action with no pattern)
+prints each record. Because each 'print' statement adds a newline at
+the end of its output, this 'awk' program copies the input with each 'u'
+changed to a newline. Here are the results of running the program on
+'mail-list':
+
+ $ awk 'BEGIN { RS = "u" }
+ > { print $0 }' mail-list
+ -| Amelia 555-5553 amelia.zodiac
+ -| sq
+ -| e@gmail.com F
+ -| Anthony 555-3412 anthony.assert
+ -| ro@hotmail.com A
+ -| Becky 555-7685 becky.algebrar
+ -| m@gmail.com A
+ -| Bill 555-1675 bill.drowning@hotmail.com A
+ -| Broderick 555-0542 broderick.aliq
+ -| otiens@yahoo.com R
+ -| Camilla 555-2912 camilla.inf
+ -| sar
+ -| m@skynet.be R
+ -| Fabi
+ -| s 555-1234 fabi
+ -| s.
+ -| ndevicesim
+ -| s@
+ -| cb.ed
+ -| F
+ -| J
+ -| lie 555-6699 j
+ -| lie.perscr
+ -| tabor@skeeve.com F
+ -| Martin 555-6480 martin.codicib
+ -| s@hotmail.com A
+ -| Sam
+ -| el 555-3430 sam
+ -| el.lanceolis@sh
+ -| .ed
+ -| A
+ -| Jean-Pa
+ -| l 555-2127 jeanpa
+ -| l.campanor
+ -| m@ny
+ -| .ed
+ -| R
+ -|
+
+Note that the entry for the name 'Bill' is not split. In the original
+data file (*note Sample Data Files::), the line looks like this:
+
+ Bill 555-1675 bill.drowning@hotmail.com A
+
+It contains no 'u', so there is no reason to split the record, unlike
+the others, which each have one or more occurrences of the 'u'. In
+fact, this record is treated as part of the previous record; the newline
+separating them in the output is the original newline in the data file,
+not the one added by 'awk' when it printed the record!
+
+ Another way to change the record separator is on the command line,
+using the variable-assignment feature (*note Other Arguments::):
+
+ awk '{ print $0 }' RS="u" mail-list
+
+This sets 'RS' to 'u' before processing 'mail-list'.
+
+ Using an alphabetic character such as 'u' for the record separator is
+highly likely to produce strange results. Using an unusual character
+such as '/' is more likely to produce correct behavior in the majority
+of cases, but there are no guarantees. The moral is: Know Your Data.
+
+ When using regular characters as the record separator, there is one
+unusual case that occurs when 'gawk' is being fully POSIX-compliant
+(*note Options::). Then, the following (extreme) pipeline prints a
+surprising '1':
+
+ $ echo | gawk --posix 'BEGIN { RS = "a" } ; { print NF }'
+ -| 1
+
+ There is one field, consisting of a newline. The value of the
+built-in variable 'NF' is the number of fields in the current record.
+(In the normal case, 'gawk' treats the newline as whitespace, printing
+'0' as the result. Most other versions of 'awk' also act this way.)
+
+ Reaching the end of an input file terminates the current input
+record, even if the last character in the file is not the character in
+'RS'. (d.c.)
+
+ The empty string '""' (a string without any characters) has a special
+meaning as the value of 'RS'. It means that records are separated by
+one or more blank lines and nothing else. *Note Multiple Line:: for
+more details.
+
+ If you change the value of 'RS' in the middle of an 'awk' run, the
+new value is used to delimit subsequent records, but the record
+currently being processed, as well as records already processed, are not
+affected.
+
+ After the end of the record has been determined, 'gawk' sets the
+variable 'RT' to the text in the input that matched 'RS'.
+
+
+File: gawk.info, Node: gawk split records, Prev: awk split records, Up: Records
+
+4.1.2 Record Splitting with 'gawk'
+----------------------------------
+
+When using 'gawk', the value of 'RS' is not limited to a one-character
+string. It can be any regular expression (*note Regexp::). (c.e.) In
+general, each record ends at the next string that matches the regular
+expression; the next record starts at the end of the matching string.
+This general rule is actually at work in the usual case, where 'RS'
+contains just a newline: a record ends at the beginning of the next
+matching string (the next newline in the input), and the following
+record starts just after the end of this string (at the first character
+of the following line). The newline, because it matches 'RS', is not
+part of either record.
+
+ When 'RS' is a single character, 'RT' contains the same single
+character. However, when 'RS' is a regular expression, 'RT' contains
+the actual input text that matched the regular expression.
+
+ If the input file ends without any text matching 'RS', 'gawk' sets
+'RT' to the null string.
+
+ The following example illustrates both of these features. It sets
+'RS' equal to a regular expression that matches either a newline or a
+series of one or more uppercase letters with optional leading and/or
+trailing whitespace:
+
+ $ echo record 1 AAAA record 2 BBBB record 3 |
+ > gawk 'BEGIN { RS = "\n|( *[[:upper:]]+ *)" }
+ > { print "Record =", $0,"and RT = [" RT "]" }'
+ -| Record = record 1 and RT = [ AAAA ]
+ -| Record = record 2 and RT = [ BBBB ]
+ -| Record = record 3 and RT = [
+ -| ]
+
+The square brackets delineate the contents of 'RT', letting you see the
+leading and trailing whitespace. The final value of 'RT' is a newline.
+*Note Simple Sed:: for a more useful example of 'RS' as a regexp and
+'RT'.
+
+ If you set 'RS' to a regular expression that allows optional trailing
+text, such as 'RS = "abc(XYZ)?"', it is possible, due to implementation
+constraints, that 'gawk' may match the leading part of the regular
+expression, but not the trailing part, particularly if the input text
+that could match the trailing part is fairly long. 'gawk' attempts to
+avoid this problem, but currently, there's no guarantee that this will
+never happen.
+
+ NOTE: Remember that in 'awk', the '^' and '$' anchor metacharacters
+ match the beginning and end of a _string_, and not the beginning
+ and end of a _line_. As a result, something like 'RS =
+ "^[[:upper:]]"' can only match at the beginning of a file. This is
+ because 'gawk' views the input file as one long string that happens
+ to contain newline characters. It is thus best to avoid anchor
+ metacharacters in the value of 'RS'.
+
+ The use of 'RS' as a regular expression and the 'RT' variable are
+'gawk' extensions; they are not available in compatibility mode (*note
+Options::). In compatibility mode, only the first character of the
+value of 'RS' determines the end of the record.
+
+ 'RS = "\0"' Is Not Portable
+
+ There are times when you might want to treat an entire data file as a
+single record. The only way to make this happen is to give 'RS' a value
+that you know doesn't occur in the input file. This is hard to do in a
+general way, such that a program always works for arbitrary input files.
+
+ You might think that for text files, the NUL character, which
+consists of a character with all bits equal to zero, is a good value to
+use for 'RS' in this case:
+
+ BEGIN { RS = "\0" } # whole file becomes one record?
+
+ 'gawk' in fact accepts this, and uses the NUL character for the
+record separator. This works for certain special files, such as
+'/proc/environ' on GNU/Linux systems, where the NUL character is in fact
+the record separator. However, this usage is _not_ portable to most
+other 'awk' implementations.
+
+ Almost all other 'awk' implementations(1) store strings internally as
+C-style strings. C strings use the NUL character as the string
+terminator. In effect, this means that 'RS = "\0"' is the same as 'RS =
+""'. (d.c.)
+
+ It happens that recent versions of 'mawk' can use the NUL character
+as a record separator. However, this is a special case: 'mawk' does not
+allow embedded NUL characters in strings. (This may change in a future
+version of 'mawk'.)
+
+ *Note Readfile Function:: for an interesting way to read whole files.
+If you are using 'gawk', see *note Extension Sample Readfile:: for
+another option.
+
+ ---------- Footnotes ----------
+
+ (1) At least that we know about.
+
+
+File: gawk.info, Node: Fields, Next: Nonconstant Fields, Prev: Records, Up: Reading Files
+
+4.2 Examining Fields
+====================
+
+When 'awk' reads an input record, the record is automatically "parsed"
+or separated by the 'awk' utility into chunks called "fields". By
+default, fields are separated by "whitespace", like words in a line.
+Whitespace in 'awk' means any string of one or more spaces, TABs, or
+newlines; other characters that are considered whitespace by other
+languages (such as formfeed, vertical tab, etc.) are _not_ considered
+whitespace by 'awk'.
+
+ The purpose of fields is to make it more convenient for you to refer
+to these pieces of the record. You don't have to use them--you can
+operate on the whole record if you want--but fields are what make simple
+'awk' programs so powerful.
+
+ You use a dollar sign ('$') to refer to a field in an 'awk' program,
+followed by the number of the field you want. Thus, '$1' refers to the
+first field, '$2' to the second, and so on. (Unlike in the Unix shells,
+the field numbers are not limited to single digits. '$127' is the 127th
+field in the record.) For example, suppose the following is a line of
+input:
+
+ This seems like a pretty nice example.
+
+Here the first field, or '$1', is 'This', the second field, or '$2', is
+'seems', and so on. Note that the last field, '$7', is 'example.'.
+Because there is no space between the 'e' and the '.', the period is
+considered part of the seventh field.
+
+ 'NF' is a predefined variable whose value is the number of fields in
+the current record. 'awk' automatically updates the value of 'NF' each
+time it reads a record. No matter how many fields there are, the last
+field in a record can be represented by '$NF'. So, '$NF' is the same as
+'$7', which is 'example.'. If you try to reference a field beyond the
+last one (such as '$8' when the record has only seven fields), you get
+the empty string. (If used in a numeric operation, you get zero.)
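+
+   For instance, this small pipeline (the input is made up purely for
+illustration) prints the number of fields and the last field:
+
+     $ echo one two three | awk '{ print NF, $NF }'
+     -| 3 three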
+
+ The use of '$0', which looks like a reference to the "zeroth" field,
+is a special case: it represents the whole input record. Use it when
+you are not interested in specific fields. Here are some more examples:
+
+ $ awk '$1 ~ /li/ { print $0 }' mail-list
+ -| Amelia 555-5553 amelia.zodiacusque@gmail.com F
+ -| Julie 555-6699 julie.perscrutabor@skeeve.com F
+
+This example prints each record in the file 'mail-list' whose first
+field contains the string 'li'.
+
+ By contrast, the following example looks for 'li' in _the entire
+record_ and prints the first and last fields for each matching input
+record:
+
+ $ awk '/li/ { print $1, $NF }' mail-list
+ -| Amelia F
+ -| Broderick R
+ -| Julie F
+ -| Samuel A
+
+
+File: gawk.info, Node: Nonconstant Fields, Next: Changing Fields, Prev: Fields, Up: Reading Files
+
+4.3 Nonconstant Field Numbers
+=============================
+
+A field number need not be a constant. Any expression in the 'awk'
+language can be used after a '$' to refer to a field. The value of the
+expression specifies the field number. If the value is a string, rather
+than a number, it is converted to a number. Consider this example:
+
+ awk '{ print $NR }'
+
+Recall that 'NR' is the number of records read so far: one in the first
+record, two in the second, and so on. So this example prints the first
+field of the first record, the second field of the second record, and so
+on. For the twentieth record, field number 20 is printed; most likely,
+the record has fewer than 20 fields, so this prints a blank line. Here
+is another example of using expressions as field numbers:
+
+ awk '{ print $(2*2) }' mail-list
+
+ 'awk' evaluates the expression '(2*2)' and uses its value as the
+number of the field to print. The '*' represents multiplication, so the
+expression '2*2' evaluates to four. The parentheses are used so that
+the multiplication is done before the '$' operation; they are necessary
+whenever there is a binary operator(1) in the field-number expression.
+This example, then, prints the type of relationship (the fourth field)
+for every line of the file 'mail-list'. (All of the 'awk' operators are
+listed, in order of decreasing precedence, in *note Precedence::.)
+
+ If the field number you compute is zero, you get the entire record.
+Thus, '$(2-2)' has the same value as '$0'. Negative field numbers are
+not allowed; trying to reference one usually terminates the program.
+(The POSIX standard does not define what happens when you reference a
+negative field number. 'gawk' notices this and terminates your program.
+Other 'awk' implementations may behave differently.)
+
+ As mentioned in *note Fields::, 'awk' stores the current record's
+number of fields in the built-in variable 'NF' (also *note Built-in
+Variables::). Thus, the expression '$NF' is not a special feature--it
+is the direct consequence of evaluating 'NF' and using its value as a
+field number.
+
+ ---------- Footnotes ----------
+
+ (1) A "binary operator", such as '*' for multiplication, is one that
+takes two operands. The distinction is required because 'awk' also has
+unary (one-operand) and ternary (three-operand) operators.
+
+
+File: gawk.info, Node: Changing Fields, Next: Field Separators, Prev: Nonconstant Fields, Up: Reading Files
+
+4.4 Changing the Contents of a Field
+====================================
+
+The contents of a field, as seen by 'awk', can be changed within an
+'awk' program; this changes what 'awk' perceives as the current input
+record. (The actual input is untouched; 'awk' _never_ modifies the
+input file.) Consider the following example and its output:
+
+ $ awk '{ nboxes = $3 ; $3 = $3 - 10
+ > print nboxes, $3 }' inventory-shipped
+ -| 25 15
+ -| 32 22
+ -| 24 14
+ ...
+
+The program first saves the original value of field three in the
+variable 'nboxes'. The '-' sign represents subtraction, so this program
+reassigns field three, '$3', as the original value of field three minus
+ten: '$3 - 10'. (*Note Arithmetic Ops::.) Then it prints the original
+and new values for field three. (Someone in the warehouse made a
+consistent mistake while inventorying the red boxes.)
+
+ For this to work, the text in '$3' must make sense as a number; the
+string of characters must be converted to a number for the computer to
+do arithmetic on it. The number resulting from the subtraction is
+converted back to a string of characters that then becomes field three.
+*Note Conversion::.
+
+ When the value of a field is changed (as perceived by 'awk'), the
+text of the input record is recalculated to contain the new field where
+the old one was. In other words, '$0' changes to reflect the altered
+field. Thus, this program prints a copy of the input file, with 10
+subtracted from the second field of each line:
+
+ $ awk '{ $2 = $2 - 10; print $0 }' inventory-shipped
+ -| Jan 3 25 15 115
+ -| Feb 5 32 24 226
+ -| Mar 5 24 34 228
+ ...
+
+ It is also possible to assign contents to fields that are out of
+range. For example:
+
+ $ awk '{ $6 = ($5 + $4 + $3 + $2)
+ > print $6 }' inventory-shipped
+ -| 168
+ -| 297
+ -| 301
+ ...
+
+We've just created '$6', whose value is the sum of fields '$2', '$3',
+'$4', and '$5'. The '+' sign represents addition. For the file
+'inventory-shipped', '$6' represents the total number of parcels shipped
+for a particular month.
+
+ Creating a new field changes 'awk''s internal copy of the current
+input record, which is the value of '$0'. Thus, if you do 'print $0'
+after adding a field, the record printed includes the new field, with
+the appropriate number of field separators between it and the previously
+existing fields.
+
+ This recomputation affects and is affected by 'NF' (the number of
+fields; *note Fields::). For example, the value of 'NF' is set to the
+number of the highest field you create. The exact format of '$0' is
+also affected by a feature that has not been discussed yet: the "output
+field separator", 'OFS', used to separate the fields (*note Output
+Separators::).
+
+ Note, however, that merely _referencing_ an out-of-range field does
+_not_ change the value of either '$0' or 'NF'. Referencing an
+out-of-range field only produces an empty string. For example:
+
+ if ($(NF+1) != "")
+ print "can't happen"
+ else
+ print "everything is normal"
+
+should print 'everything is normal', because 'NF+1' is certain to be out
+of range. (*Note If Statement:: for more information about 'awk''s
+'if-else' statements. *Note Typing and Comparison:: for more
+information about the '!=' operator.)
+
+ It is important to note that making an assignment to an existing
+field changes the value of '$0' but does not change the value of 'NF',
+even when you assign the empty string to a field. For example:
+
+ $ echo a b c d | awk '{ OFS = ":"; $2 = ""
+ > print $0; print NF }'
+ -| a::c:d
+ -| 4
+
+The field is still there; it just has an empty value, delimited by the
+two colons between 'a' and 'c'. This example shows what happens if you
+create a new field:
+
+ $ echo a b c d | awk '{ OFS = ":"; $2 = ""; $6 = "new"
+ > print $0; print NF }'
+ -| a::c:d::new
+ -| 6
+
+The intervening field, '$5', is created with an empty value (indicated
+by the second pair of adjacent colons), and 'NF' is updated with the
+value six.
+
+ Decrementing 'NF' throws away the values of the fields after the new
+value of 'NF' and recomputes '$0'. (d.c.) Here is an example:
+
+ $ echo a b c d e f | awk '{ print "NF =", NF;
+ > NF = 3; print $0 }'
+ -| NF = 6
+ -| a b c
+
+ CAUTION: Some versions of 'awk' don't rebuild '$0' when 'NF' is
+ decremented.
+
+ Finally, there are times when it is convenient to force 'awk' to
+rebuild the entire record, using the current values of the fields and
+'OFS'. To do this, use the seemingly innocuous assignment:
+
+ $1 = $1 # force record to be reconstituted
+ print $0 # or whatever else with $0
+
+This forces 'awk' to rebuild the record. It does help to add a comment,
+as we've shown here.
+
+ There is a flip side to the relationship between '$0' and the fields.
+Any assignment to '$0' causes the record to be reparsed into fields
+using the _current_ value of 'FS'. This also applies to any built-in
+function that updates '$0', such as 'sub()' and 'gsub()' (*note String
+Functions::).
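+
+   For example (the input line here is invented for illustration),
+assigning '$0' to itself after changing 'FS' causes the record to be
+resplit using the new separator:
+
+     $ echo 'a:b:c' | awk '{ FS = ":"; $0 = $0; print $2 }'
+     -| b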
+
+ Understanding '$0'
+
+ It is important to remember that '$0' is the _full_ record, exactly
+as it was read from the input. This includes any leading or trailing
+whitespace, and the exact whitespace (or other characters) that
+separates the fields.
+
+ It is a common error to try to change the field separators in a
+record simply by setting 'FS' and 'OFS', and then expecting a plain
+'print' or 'print $0' to print the modified record.
+
+ But this does not work, because nothing was done to change the record
+itself. Instead, you must force the record to be rebuilt, typically
+with a statement such as '$1 = $1', as described earlier.
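+
+   The following pair of commands (the input is invented for
+illustration) shows the difference. The first prints the record
+unchanged; the second rebuilds it with the new 'OFS':
+
+     $ echo 'a b c' | awk '{ OFS = ":"; print }'
+     -| a b c
+     $ echo 'a b c' | awk '{ OFS = ":"; $1 = $1; print }'
+     -| a:b:c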
+
+
+File: gawk.info, Node: Field Separators, Next: Constant Size, Prev: Changing Fields, Up: Reading Files
+
+4.5 Specifying How Fields Are Separated
+=======================================
+
+* Menu:
+
+* Default Field Splitting:: How fields are normally separated.
+* Regexp Field Splitting:: Using regexps as the field separator.
+* Single Character Fields:: Making each character a separate field.
+* Command Line Field Separator:: Setting 'FS' from the command line.
+* Full Line Fields:: Making the full line be a single field.
+* Field Splitting Summary:: Some final points and a summary table.
+
+The "field separator", which is either a single character or a regular
+expression, controls the way 'awk' splits an input record into fields.
+'awk' scans the input record for character sequences that match the
+separator; the fields themselves are the text between the matches.
+
+ In the examples that follow, we use the bullet symbol (*) to
+represent spaces in the output. If the field separator is 'oo', then
+the following line:
+
+ moo goo gai pan
+
+is split into three fields: 'm', '*g', and '*gai*pan'. Note the leading
+spaces in the values of the second and third fields.
+
+ The field separator is represented by the predefined variable 'FS'.
+Shell programmers take note: 'awk' does _not_ use the name 'IFS' that is
+used by the POSIX-compliant shells (such as the Unix Bourne shell, 'sh',
+or Bash).
+
+ The value of 'FS' can be changed in the 'awk' program with the
+assignment operator, '=' (*note Assignment Ops::). Often, the right
+time to do this is at the beginning of execution before any input has
+been processed, so that the very first record is read with the proper
+separator. To do this, use the special 'BEGIN' pattern (*note
+BEGIN/END::). For example, here we set the value of 'FS' to the string
+'","':
+
+ awk 'BEGIN { FS = "," } ; { print $2 }'
+
+Given the input line:
+
+ John Q. Smith, 29 Oak St., Walamazoo, MI 42139
+
+this 'awk' program extracts and prints the string '*29*Oak*St.'.
+
+ Sometimes the input data contains separator characters that don't
+separate fields the way you thought they would. For instance, the
+person's name in the example we just used might have a title or suffix
+attached, such as:
+
+ John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
+
+The same program would extract '*LXIX' instead of '*29*Oak*St.'. If you
+were expecting the program to print the address, you would be surprised.
+The moral is to choose your data layout and separator characters
+carefully to prevent such problems. (If the data is not in a form that
+is easy to process, perhaps you can massage it first with a separate
+'awk' program.)
+
+
+File: gawk.info, Node: Default Field Splitting, Next: Regexp Field Splitting, Up: Field Separators
+
+4.5.1 Whitespace Normally Separates Fields
+------------------------------------------
+
+Fields are normally separated by whitespace sequences (spaces, TABs, and
+newlines), not by single spaces. Two spaces in a row do not delimit an
+empty field. The default value of the field separator 'FS' is a string
+containing a single space, '" "'. If 'awk' interpreted this value in
+the usual way, each space character would separate fields, so two spaces
+in a row would make an empty field between them. The reason this does
+not happen is that a single space as the value of 'FS' is a special
+case--it is taken to specify the default manner of delimiting fields.
+
+ If 'FS' is any other single character, such as '","', then each
+occurrence of that character separates two fields. Two consecutive
+occurrences delimit an empty field. If the character occurs at the
+beginning or the end of the line, that too delimits an empty field. The
+space character is the only single character that does not follow these
+rules.
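+
+   For instance, with a comma as the separator (the input is invented
+for illustration), two adjacent commas delimit an empty second field:
+
+     $ echo 'a,,b' | awk -F, '{ print NF, "<" $2 ">" }'
+     -| 3 <>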
+
+
+File: gawk.info, Node: Regexp Field Splitting, Next: Single Character Fields, Prev: Default Field Splitting, Up: Field Separators
+
+4.5.2 Using Regular Expressions to Separate Fields
+--------------------------------------------------
+
+The previous node discussed the use of single characters or simple
+strings as the value of 'FS'. More generally, the value of 'FS' may be
+a string containing any regular expression. In this case, each match in
+the record for the regular expression separates fields. For example,
+the assignment:
+
+ FS = ", \t"
+
+makes every area of an input line that consists of a comma followed by a
+space and a TAB into a field separator. ('\t' is an "escape sequence"
+that stands for a TAB; *note Escape Sequences::, for the complete list
+of similar escape sequences.)
+
+ For a less trivial example of a regular expression, try using single
+spaces to separate fields the way single commas are used. 'FS' can be
+set to '"[ ]"' (left bracket, space, right bracket). This regular
+expression matches a single space and nothing else (*note Regexp::).
+
+ There is an important difference between the two cases of 'FS = " "'
+(a single space) and 'FS = "[ \t\n]+"' (a regular expression matching
+one or more spaces, TABs, or newlines). For both values of 'FS', fields
+are separated by "runs" (multiple adjacent occurrences) of spaces, TABs,
+and/or newlines. However, when the value of 'FS' is '" "', 'awk' first
+strips leading and trailing whitespace from the record and then decides
+where the fields are. For example, the following pipeline prints 'b':
+
+ $ echo ' a b c d ' | awk '{ print $2 }'
+ -| b
+
+However, this pipeline prints 'a' (note the extra spaces around each
+letter):
+
+ $ echo ' a b c d ' | awk 'BEGIN { FS = "[ \t\n]+" }
+ > { print $2 }'
+ -| a
+
+In this case, the first field is null, or empty.
+
+ The stripping of leading and trailing whitespace also comes into play
+whenever '$0' is recomputed. For instance, study this pipeline:
+
+ $ echo ' a b c d' | awk '{ print; $2 = $2; print }'
+ -| a b c d
+ -| a b c d
+
+The first 'print' statement prints the record as it was read, with
+leading whitespace intact. The assignment to '$2' rebuilds '$0' by
+concatenating '$1' through '$NF' together, separated by the value of
+'OFS' (which is a space by default). Because the leading whitespace was
+ignored when finding '$1', it is not part of the new '$0'. Finally, the
+last 'print' statement prints the new '$0'.
+
+ There is an additional subtlety to be aware of when using regular
+expressions for field splitting. It is not well specified in the POSIX
+standard, or anywhere else, what '^' means when splitting fields. Does
+the '^' match only at the beginning of the entire record? Or is each
+field separator a new string? It turns out that different 'awk'
+versions answer this question differently, and you should not rely on
+any specific behavior in your programs. (d.c.)
+
+ As a point of information, BWK 'awk' allows '^' to match only at the
+beginning of the record. 'gawk' also works this way. For example:
+
+ $ echo 'xxAA xxBxx C' |
+ > gawk -F '(^x+)|( +)' '{ for (i = 1; i <= NF; i++)
+ > printf "-->%s<--\n", $i }'
+ -| --><--
+ -| -->AA<--
+ -| -->xxBxx<--
+ -| -->C<--
+
+
+File: gawk.info, Node: Single Character Fields, Next: Command Line Field Separator, Prev: Regexp Field Splitting, Up: Field Separators
+
+4.5.3 Making Each Character a Separate Field
+--------------------------------------------
+
+There are times when you may want to examine each character of a record
+separately. This can be done in 'gawk' by simply assigning the null
+string ('""') to 'FS'. (c.e.) In this case, each individual character
+in the record becomes a separate field. For example:
+
+ $ echo a b | gawk 'BEGIN { FS = "" }
+ > {
+ > for (i = 1; i <= NF; i = i + 1)
+ > print "Field", i, "is", $i
+ > }'
+ -| Field 1 is a
+ -| Field 2 is
+ -| Field 3 is b
+
+ Traditionally, the behavior of 'FS' equal to '""' was not defined.
+In this case, most versions of Unix 'awk' simply treat the entire record
+as only having one field. (d.c.) In compatibility mode (*note
+Options::), if 'FS' is the null string, then 'gawk' also behaves this
+way.
+
+
+File: gawk.info, Node: Command Line Field Separator, Next: Full Line Fields, Prev: Single Character Fields, Up: Field Separators
+
+4.5.4 Setting 'FS' from the Command Line
+----------------------------------------
+
+'FS' can be set on the command line. Use the '-F' option to do so. For
+example:
+
+ awk -F, 'PROGRAM' INPUT-FILES
+
+sets 'FS' to the ',' character. Notice that the option uses an
+uppercase 'F' instead of a lowercase 'f'. The latter option ('-f')
+specifies a file containing an 'awk' program.
+
+ The value used for the argument to '-F' is processed in exactly the
+same way as assignments to the predefined variable 'FS'. Any special
+characters in the field separator must be escaped appropriately. For
+example, to use a '\' as the field separator on the command line, you
+would have to type:
+
+ # same as FS = "\\"
+ awk -F\\\\ '...' files ...
+
+Because '\' is used for quoting in the shell, 'awk' sees '-F\\'. Then
+'awk' processes the '\\' for escape characters (*note Escape
+Sequences::), finally yielding a single '\' to use for the field
+separator.
+
+ As a special case, in compatibility mode (*note Options::), if the
+argument to '-F' is 't', then 'FS' is set to the TAB character. If you
+type '-F\t' at the shell, without any quotes, the '\' gets deleted, so
+'awk' figures that you really want your fields to be separated with TABs
+and not 't's. Use '-v FS="t"' or '-F"[t]"' on the command line if you
+really do want to separate your fields with 't's. Use '-F '\t'' when
+not in compatibility mode to specify that TABs separate fields.
+
+ As an example, let's use an 'awk' program file called 'edu.awk' that
+contains the pattern '/edu/' and the action 'print $1':
+
+ /edu/ { print $1 }
+
+ Let's also set 'FS' to be the '-' character and run the program on
+the file 'mail-list'. The following command prints a list of the names
+of the people that work at or attend a university, and the first three
+digits of their phone numbers:
+
+ $ awk -F- -f edu.awk mail-list
+ -| Fabius 555
+ -| Samuel 555
+ -| Jean
+
+Note the third line of output. The third line in the original file
+looked like this:
+
+ Jean-Paul 555-2127 jeanpaul.campanorum@nyu.edu R
+
+ The '-' as part of the person's name was used as the field separator,
+instead of the '-' in the phone number that was originally intended.
+This demonstrates why you have to be careful in choosing your field and
+record separators.
+
+ Perhaps the most common use of a single character as the field
+separator occurs when processing the Unix system password file. On many
+Unix systems, each user has a separate entry in the system password
+file, with one line per user. The information in these lines is
+separated by colons. The first field is the user's login name and the
+second is the user's encrypted or shadow password. (A shadow password
+is indicated by the presence of a single 'x' in the second field.) A
+password file entry might look like this:
+
+ arnold:x:2076:10:Arnold Robbins:/home/arnold:/bin/bash
+
+ The following program searches the system password file and prints
+the entries for users whose full name is not indicated:
+
+ awk -F: '$5 == ""' /etc/passwd
+
+
+File: gawk.info, Node: Full Line Fields, Next: Field Splitting Summary, Prev: Command Line Field Separator, Up: Field Separators
+
+4.5.5 Making the Full Line Be a Single Field
+--------------------------------------------
+
+Occasionally, it's useful to treat the whole input line as a single
+field. This can be done easily and portably, simply by setting 'FS' to
+'"\n"' (a newline):(1)
+
+ awk -F'\n' 'PROGRAM' FILES ...
+
+When you do this, '$1' is the same as '$0'.
+
+ Changing 'FS' Does Not Affect the Fields
+
+ According to the POSIX standard, 'awk' is supposed to behave as if
+each record is split into fields at the time it is read. In particular,
+this means that if you change the value of 'FS' after a record is read,
+the values of the fields (i.e., how they were split) should reflect the
+old value of 'FS', not the new one.
+
+ However, many older implementations of 'awk' do not work this way.
+Instead, they defer splitting the fields until a field is actually
+referenced. The fields are split using the _current_ value of 'FS'!
+(d.c.) This behavior can be difficult to diagnose. The following
+example illustrates the difference between the two methods:
+
+ sed 1q /etc/passwd | awk '{ FS = ":" ; print $1 }'
+
+which usually prints:
+
+ root
+
+on an incorrect implementation of 'awk', while 'gawk' prints the full
+first line of the file, something like:
+
+ root:x:0:0:Root:/:
+
+ (The 'sed'(2) command prints just the first line of '/etc/passwd'.)
+
+ ---------- Footnotes ----------
+
+ (1) Thanks to Andrew Schorr for this tip.
+
+ (2) The 'sed' utility is a "stream editor." Its behavior is also
+defined by the POSIX standard.
+
+
+File: gawk.info, Node: Field Splitting Summary, Prev: Full Line Fields, Up: Field Separators
+
+4.5.6 Field-Splitting Summary
+-----------------------------
+
+It is important to remember that when you assign a string constant as
+the value of 'FS', it undergoes normal 'awk' string processing. For
+example, with Unix 'awk' and 'gawk', the assignment 'FS = "\.."' assigns
+the character string '".."' to 'FS' (the backslash is stripped). This
+creates a regexp meaning "fields are separated by occurrences of any two
+characters." If instead you want fields to be separated by a literal
+period followed by any single character, use 'FS = "\\.."'.
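+
+   As a brief illustration (the input line is invented), the doubled
+backslash survives string processing and yields the intended regexp:
+
+     $ echo 'a.xb.yc' | awk 'BEGIN { FS = "\\.." } { print $2 }'
+     -| b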
+
+ The following list summarizes how fields are split, based on the
+value of 'FS' ('==' means "is equal to"):
+
+'FS == " "'
+ Fields are separated by runs of whitespace. Leading and trailing
+ whitespace are ignored. This is the default.
+
+'FS == ANY OTHER SINGLE CHARACTER'
+ Fields are separated by each occurrence of the character. Multiple
+ successive occurrences delimit empty fields, as do leading and
+ trailing occurrences. The character can even be a regexp
+ metacharacter; it does not need to be escaped.
+
+'FS == REGEXP'
+ Fields are separated by occurrences of characters that match
+ REGEXP. Leading and trailing matches of REGEXP delimit empty
+ fields.
+
+'FS == ""'
+ Each individual character in the record becomes a separate field.
+ (This is a common extension; it is not specified by the POSIX
+ standard.)
+
+ 'FS' and 'IGNORECASE'
+
+ The 'IGNORECASE' variable (*note User-modified::) affects field
+splitting _only_ when the value of 'FS' is a regexp. It has no effect
+when 'FS' is a single character, even if that character is a letter.
+Thus, in the following code:
+
+ FS = "c"
+ IGNORECASE = 1
+ $0 = "aCa"
+ print $1
+
+The output is 'aCa'. If you really want to split fields on an
+alphabetic character while ignoring case, use a regexp that will do it
+for you (e.g., 'FS = "[c]"'). In this case, 'IGNORECASE' will take
+effect.
+
+
+File: gawk.info, Node: Constant Size, Next: Splitting By Content, Prev: Field Separators, Up: Reading Files
+
+4.6 Reading Fixed-Width Data
+============================
+
+This minor node discusses an advanced feature of 'gawk'. If you are a
+novice 'awk' user, you might want to skip it on the first reading.
+
+ 'gawk' provides a facility for dealing with fixed-width fields with
+no distinctive field separator. For example, data of this nature arises
+in the input for old Fortran programs where numbers are run together, or
+in the output of programs that did not anticipate the use of their
+output as input for other programs.
+
+ An example of the latter is a table where all the columns are lined
+up by the use of a variable number of spaces and _empty fields are just
+spaces_. Clearly, 'awk''s normal field splitting based on 'FS' does not
+work well in this case. Although a portable 'awk' program can use a
+series of 'substr()' calls on '$0' (*note String Functions::), this is
+awkward and inefficient for a large number of fields.
+
+ The splitting of an input record into fixed-width fields is specified
+by assigning a string containing space-separated numbers to the built-in
+variable 'FIELDWIDTHS'. Each number specifies the width of the field,
+_including_ columns between fields. If you want to ignore the columns
+between fields, you can specify the width as a separate field that is
+subsequently ignored. It is a fatal error to supply a field width that
+has a negative value. The following data is the output of the Unix 'w'
+utility. It is useful to illustrate the use of 'FIELDWIDTHS':
+
+ 10:06pm up 21 days, 14:04, 23 users
+ User tty login idle JCPU PCPU what
+ hzuo ttyV0 8:58pm 9 5 vi p24.tex
+ hzang ttyV3 6:37pm 50 -csh
+ eklye ttyV5 9:53pm 7 1 em thes.tex
+ dportein ttyV6 8:17pm 1:47 -csh
+ gierd ttyD3 10:00pm 1 elm
+ dave ttyD4 9:47pm 4 4 w
+ brent ttyp0 26Jun91 4:46 26:46 4:41 bash
+ dave ttyq4 26Jun9115days 46 46 wnewmail
+
+ The following program takes this input, converts the idle time to
+number of seconds, and prints out the first two fields and the
+calculated idle time:
+
+ BEGIN { FIELDWIDTHS = "9 6 10 6 7 7 35" }
+ NR > 2 {
+ idle = $4
+ sub(/^ +/, "", idle) # strip leading spaces
+ if (idle == "")
+ idle = 0
+ if (idle ~ /:/) {
+ split(idle, t, ":")
+ idle = t[1] * 60 + t[2]
+ }
+ if (idle ~ /days/)
+ idle *= 24 * 60 * 60
+
+ print $1, $2, idle
+ }
+
+ NOTE: The preceding program uses a number of 'awk' features that
+ haven't been introduced yet.
+
+ Running the program on the data produces the following results:
+
+ hzuo ttyV0 0
+ hzang ttyV3 50
+ eklye ttyV5 0
+ dportein ttyV6 107
+ gierd ttyD3 1
+ dave ttyD4 0
+ brent ttyp0 286
+ dave ttyq4 1296000
+
+ Another (possibly more practical) example of fixed-width input data
+is the input from a deck of balloting cards. In some parts of the
+United States, voters mark their choices by punching holes in computer
+cards. These cards are then processed to count the votes for any
+particular candidate or on any particular issue. Because a voter may
+choose not to vote on some issue, any column on the card may be empty.
+An 'awk' program for processing such data could use the 'FIELDWIDTHS'
+feature to simplify reading the data. (Of course, getting 'gawk' to run
+on a system with card readers is another story!)
+
+ Assigning a value to 'FS' causes 'gawk' to use 'FS' for field
+splitting again. Use 'FS = FS' to make this happen, without having to
+know the current value of 'FS'. In order to tell which kind of field
+splitting is in effect, use 'PROCINFO["FS"]' (*note Auto-set::). The
+value is '"FS"' if regular field splitting is being used, or
+'"FIELDWIDTHS"' if fixed-width field splitting is being used:
+
+ if (PROCINFO["FS"] == "FS")
+ REGULAR FIELD SPLITTING ...
+ else if (PROCINFO["FS"] == "FIELDWIDTHS")
+ FIXED-WIDTH FIELD SPLITTING ...
+ else
+ CONTENT-BASED FIELD SPLITTING ... (see next minor node)
+
+ This information is useful when writing a function that needs to
+temporarily change 'FS' or 'FIELDWIDTHS', read some records, and then
+restore the original settings (*note Passwd Functions:: for an example
+of such a function).
+
+
+File: gawk.info, Node: Splitting By Content, Next: Multiple Line, Prev: Constant Size, Up: Reading Files
+
+4.7 Defining Fields by Content
+==============================
+
+This minor node discusses an advanced feature of 'gawk'. If you are a
+novice 'awk' user, you might want to skip it on the first reading.
+
+ Normally, when using 'FS', 'gawk' defines the fields as the parts of
+the record that occur in between each field separator. In other words,
+'FS' defines what a field _is not_, instead of what a field _is_.
+However, there are times when you really want to define the fields by
+what they are, and not by what they are not.
+
+ The most notorious such case is so-called "comma-separated values"
+(CSV) data. Many spreadsheet programs, for example, can export their
+data into text files, where each record is terminated with a newline,
+and fields are separated by commas. If commas only separated the data,
+there wouldn't be an issue. The problem comes when one of the fields
+contains an _embedded_ comma. In such cases, most programs embed the
+field in double quotes.(1) So, we might have data like this:
+
+ Robbins,Arnold,"1234 A Pretty Street, NE",MyTown,MyState,12345-6789,USA
+
+ The 'FPAT' variable offers a solution for cases like this. The value
+of 'FPAT' should be a string that provides a regular expression. This
+regular expression describes the contents of each field.
+
+ In the case of CSV data as presented here, each field is either
+"anything that is not a comma," or "a double quote, anything that is not
+a double quote, and a closing double quote." If written as a regular
+expression constant (*note Regexp::), we would have
+'/([^,]+)|("[^"]+")/'. Writing this as a string requires us to escape
+the double quotes, leading to:
+
+ FPAT = "([^,]+)|(\"[^\"]+\")"
+
+ Putting this to use, here is a simple program to parse the data:
+
+ BEGIN {
+ FPAT = "([^,]+)|(\"[^\"]+\")"
+ }
+
+ {
+ print "NF = ", NF
+ for (i = 1; i <= NF; i++) {
+ printf("$%d = <%s>\n", i, $i)
+ }
+ }
+
+ When run, we get the following:
+
+ $ gawk -f simple-csv.awk addresses.csv
+ NF = 7
+ $1 = <Robbins>
+ $2 = <Arnold>
+ $3 = <"1234 A Pretty Street, NE">
+ $4 = <MyTown>
+ $5 = <MyState>
+ $6 = <12345-6789>
+ $7 = <USA>
+
+ Note the embedded comma in the value of '$3'.
+
+ A straightforward improvement when processing CSV data of this sort
+would be to remove the quotes when they occur, with something like this:
+
+ if (substr($i, 1, 1) == "\"") {
+ len = length($i)
+ $i = substr($i, 2, len - 2) # Get text within the two quotes
+ }
+
+ As with 'FS', the 'IGNORECASE' variable (*note User-modified::)
+affects field splitting with 'FPAT'.
+
+ Assigning a value to 'FPAT' overrides field splitting with 'FS' and
+with 'FIELDWIDTHS'. Similar to 'FIELDWIDTHS', the value of
+'PROCINFO["FS"]' will be '"FPAT"' if content-based field splitting is
+being used.
+
+ NOTE: Some programs export CSV data that contains embedded newlines
+ between the double quotes. 'gawk' provides no way to deal with
+ this. Even though a formal specification for CSV data exists,
+ there isn't much more to be done; the 'FPAT' mechanism provides an
+ elegant solution for the majority of cases, and the 'gawk'
+ developers are satisfied with that.
+
+ As written, the regexp used for 'FPAT' requires that each field
+contain at least one character. A straightforward modification
+(changing the first '+' to '*') allows fields to be empty:
+
+ FPAT = "([^,]*)|(\"[^\"]+\")"
+
+ Finally, the 'patsplit()' function makes the same functionality
+available for splitting regular strings (*note String Functions::).
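+
+   Here is a small sketch of 'patsplit()' at work, reusing the CSV
+regexp from earlier (the data string is invented for illustration):
+
+     $ gawk 'BEGIN {
+     >     fpat = "([^,]+)|(\"[^\"]+\")"
+     >     n = patsplit("Robbins,\"1234 A Pretty St, NE\",USA", parts, fpat)
+     >     for (i = 1; i <= n; i++)
+     >         print i, parts[i]
+     > }'
+     -| 1 Robbins
+     -| 2 "1234 A Pretty St, NE"
+     -| 3 USA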
+
+ To recap, 'gawk' provides three independent methods to split input
+records into fields. The mechanism used is based on which of the three
+variables--'FS', 'FIELDWIDTHS', or 'FPAT'--was last assigned to.
+
+ ---------- Footnotes ----------
+
+ (1) The CSV format lacked a formal standard definition for many
+years. RFC 4180 (http://www.ietf.org/rfc/rfc4180.txt) standardizes the
+most common practices.
+
+
+File: gawk.info, Node: Multiple Line, Next: Getline, Prev: Splitting By Content, Up: Reading Files
+
+4.8 Multiple-Line Records
+=========================
+
+In some databases, a single line cannot conveniently hold all the
+information in one entry. In such cases, you can use multiline records.
+The first step in doing this is to choose your data format.
+
+ One technique is to use an unusual character or string to separate
+records. For example, you could use the formfeed character (written
+'\f' in 'awk', as in C) to separate them, making each record a page of
+the file. To do this, just set the variable 'RS' to '"\f"' (a string
+containing the formfeed character). Any other character could equally
+well be used, as long as it won't be part of the data in a record.
+
+ Another technique is to have blank lines separate records. By a
+special dispensation, an empty string as the value of 'RS' indicates
+that records are separated by one or more blank lines. When 'RS' is set
+to the empty string, each record always ends at the first blank line
+encountered. The next record doesn't start until the first nonblank
+line that follows. No matter how many blank lines appear in a row, they
+all act as one record separator. (Blank lines must be completely empty;
+lines that contain only whitespace do not count.)
+
+ You can achieve the same effect as 'RS = ""' by assigning the string
+'"\n\n+"' to 'RS'. This regexp matches the newline at the end of the
+record and one or more blank lines after the record. In addition, a
+regular expression always matches the longest possible sequence when
+there is a choice (*note Leftmost Longest::). So, the next record
+doesn't start until the first nonblank line that follows--no matter how
+many blank lines appear in a row, they are considered one record
+separator.
+
+ However, there is an important difference between 'RS = ""' and 'RS =
+"\n\n+"'. In the first case, leading newlines in the input data file
+are ignored, and if a file ends without extra blank lines after the last
+record, the final newline is removed from the record. In the second
+case, this special processing is not done. (d.c.)
+
+ Now that the input is separated into records, the second step is to
+separate the fields in the records. One way to do this is to divide
+each of the lines into fields in the normal manner. This happens by
+default as the result of a special feature. When 'RS' is set to the
+empty string _and_ 'FS' is set to a single character, the newline
+character _always_ acts as a field separator. This is in addition to
+whatever field separations result from 'FS'.(1)
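+
+   For example, in the following sketch (the input is invented for
+illustration), the colon and the newline both separate fields, so the
+single two-line record has four fields:
+
+     $ printf 'a:b\nc:d\n\n' | awk 'BEGIN { RS = ""; FS = ":" } { print NF }'
+     -| 4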
+
+ The original motivation for this special exception was probably to
+provide useful behavior in the default case (i.e., 'FS' is equal to
+'" "'). This feature can be a problem if you really don't want the
+newline character to separate fields, because there is no way to prevent
+it. However, you can work around this by using the 'split()' function
+to break up the record manually (*note String Functions::). If you have
+a single-character field separator, you can work around the special
+feature in a different way, by making 'FS' into a regexp for that single
+character. For example, if the field separator is a percent character,
+instead of 'FS = "%"', use 'FS = "[%]"'.
+
+ Another way to separate fields is to put each field on a separate
+line: to do this, just set the variable 'FS' to the string '"\n"'.
+(This single-character separator matches a single newline.) A practical
+example of a data file organized this way might be a mailing list, where
+blank lines separate the entries. Consider a mailing list in a file
+named 'addresses', which looks like this:
+
+ Jane Doe
+ 123 Main Street
+ Anywhere, SE 12345-6789
+
+ John Smith
+ 456 Tree-lined Avenue
+ Smallville, MW 98765-4321
+ ...
+
+A simple program to process this file is as follows:
+
+ # addrs.awk --- simple mailing list program
+
+ # Records are separated by blank lines.
+ # Each line is one field.
+ BEGIN { RS = "" ; FS = "\n" }
+
+ {
+ print "Name is:", $1
+ print "Address is:", $2
+ print "City and State are:", $3
+ print ""
+ }
+
+ Running the program produces the following output:
+
+ $ awk -f addrs.awk addresses
+ -| Name is: Jane Doe
+ -| Address is: 123 Main Street
+ -| City and State are: Anywhere, SE 12345-6789
+ -|
+ -| Name is: John Smith
+ -| Address is: 456 Tree-lined Avenue
+ -| City and State are: Smallville, MW 98765-4321
+ -|
+ ...
+
+ *Note Labels Program:: for a more realistic program dealing with
+address lists. The following list summarizes how records are split,
+based on the value of 'RS'. ('==' means "is equal to.")
+
+'RS == "\n"'
+ Records are separated by the newline character ('\n'). In effect,
+ every line in the data file is a separate record, including blank
+ lines. This is the default.
+
+'RS == ANY SINGLE CHARACTER'
+ Records are separated by each occurrence of the character.
+ Multiple successive occurrences delimit empty records.
+
+'RS == ""'
+ Records are separated by runs of blank lines. When 'FS' is a
+ single character, then the newline character always serves as a
+ field separator, in addition to whatever value 'FS' may have.
+ Leading and trailing newlines in a file are ignored.
+
+'RS == REGEXP'
+ Records are separated by occurrences of characters that match
+ REGEXP. Leading and trailing matches of REGEXP delimit empty
+ records. (This is a 'gawk' extension; it is not specified by the
+ POSIX standard.)
+
+ If not in compatibility mode (*note Options::), 'gawk' sets 'RT' to
+the input text that matched the value specified by 'RS'. But if the
+input file ended without any text that matches 'RS', then 'gawk' sets
+'RT' to the null string.
+
+ ---------- Footnotes ----------
+
+ (1) When 'FS' is the null string ('""') or a regexp, this special
+feature of 'RS' does not apply. It does apply to the default field
+separator of a single space: 'FS = " "'.
+
+
+File: gawk.info, Node: Getline, Next: Read Timeout, Prev: Multiple Line, Up: Reading Files
+
+4.9 Explicit Input with 'getline'
+=================================
+
+So far we have been getting our input data from 'awk''s main input
+stream--either the standard input (usually your keyboard, sometimes the
+output from another program) or the files specified on the command line.
+The 'awk' language has a special built-in command called 'getline' that
+can be used to read input under your explicit control.
+
+ The 'getline' command is used in several different ways and should
+_not_ be used by beginners. The examples that follow the explanation of
+the 'getline' command include material that has not been covered yet.
+Therefore, come back and study the 'getline' command _after_ you have
+reviewed the rest of this Info file and have a good knowledge of how
+'awk' works.
+
+ The 'getline' command returns 1 if it finds a record and 0 if it
+encounters the end of the file. If there is some error in getting a
+record, such as a file that cannot be opened, then 'getline' returns -1.
+In this case, 'gawk' sets the variable 'ERRNO' to a string describing
+the error that occurred.
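+
+   Because of these return values, a careful program tests the result
+of 'getline' explicitly. The following sketch (the file name is only a
+placeholder; the file variant of 'getline' used here is described in
+*note Getline/File::) reads a file line by line and reports any read
+error:
+
+     while ((ret = (getline line < "data.txt")) > 0)
+         print line
+     if (ret < 0)
+         print "error reading data.txt:", ERRNO > "/dev/stderr"
+     close("data.txt")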
+
+ If 'ERRNO' indicates that the I/O operation may be retried, and
+'PROCINFO["INPUT", "RETRY"]' is set, then 'getline' returns -2 instead
+of -1, and further calls to 'getline' may be attempted. *Note Retrying
+Input:: for further information about this feature.
+
+ In the following examples, COMMAND stands for a string value that
+represents a shell command.
+
+ NOTE: When '--sandbox' is specified (*note Options::), reading
+ lines from files, pipes, and coprocesses is disabled.
+
+* Menu:
+
+* Plain Getline:: Using 'getline' with no arguments.
+* Getline/Variable:: Using 'getline' into a variable.
+* Getline/File:: Using 'getline' from a file.
+* Getline/Variable/File:: Using 'getline' into a variable from a
+ file.
+* Getline/Pipe:: Using 'getline' from a pipe.
+* Getline/Variable/Pipe:: Using 'getline' into a variable from a
+ pipe.
+* Getline/Coprocess:: Using 'getline' from a coprocess.
+* Getline/Variable/Coprocess:: Using 'getline' into a variable from a
+ coprocess.
+* Getline Notes:: Important things to know about 'getline'.
+* Getline Summary:: Summary of 'getline' Variants.
+
+
+File: gawk.info, Node: Plain Getline, Next: Getline/Variable, Up: Getline
+
+4.9.1 Using 'getline' with No Arguments
+---------------------------------------
+
+The 'getline' command can be used without arguments to read input from
+the current input file. All it does in this case is read the next input
+record and split it up into fields. This is useful if you've finished
+processing the current record, but want to do some special processing on
+the next record _right now_. For example:
+
+ # Remove text between /* and */, inclusive
+ {
+ if ((i = index($0, "/*")) != 0) {
+ out = substr($0, 1, i - 1) # leading part of the string
+ rest = substr($0, i + 2) # ... */ ...
+ j = index(rest, "*/") # is */ in trailing part?
+ if (j > 0) {
+ rest = substr(rest, j + 2) # remove comment
+ } else {
+ while (j == 0) {
+ # get more text
+ if (getline <= 0) {
+ print("unexpected EOF or error:", ERRNO) > "/dev/stderr"
+ exit
+ }
+ # build up the line using string concatenation
+ rest = rest $0
+ j = index(rest, "*/") # is */ in trailing part?
+ if (j != 0) {
+ rest = substr(rest, j + 2)
+ break
+ }
+ }
+ }
+ # build up the output line using string concatenation
+ $0 = out rest
+ }
+ print $0
+ }
+
+ This 'awk' program deletes C-style comments ('/* ... */') from the
+input. It uses a number of features we haven't covered yet, including
+string concatenation (*note Concatenation::) and the 'index()' and
+'substr()' built-in functions (*note String Functions::). By replacing
+the 'print $0' with other statements, you could perform more complicated
+processing on the decommented input, such as searching for matches of a
+regular expression. (This program has a subtle problem--it does not
+work if one comment ends and another begins on the same line.)
+
+ This form of the 'getline' command sets 'NF', 'NR', 'FNR', 'RT', and
+the value of '$0'.
+
+ NOTE: The new value of '$0' is used to test the patterns of any
+ subsequent rules. The original value of '$0' that triggered the
+ rule that executed 'getline' is lost. By contrast, the 'next'
+ statement reads a new record but immediately begins processing it
+ normally, starting with the first rule in the program. *Note Next
+ Statement::.
+
+
+File: gawk.info, Node: Getline/Variable, Next: Getline/File, Prev: Plain Getline, Up: Getline
+
+4.9.2 Using 'getline' into a Variable
+-------------------------------------
+
+You can use 'getline VAR' to read the next record from 'awk''s input
+into the variable VAR. No other processing is done. For example,
+suppose the next line is a comment or a special string, and you want to
+read it without triggering any rules. This form of 'getline' allows you
+to read that line and store it in a variable so that the main
+read-a-line-and-check-each-rule loop of 'awk' never sees it. The
+following example swaps every two lines of input:
+
+ {
+ if ((getline tmp) > 0) {
+ print tmp
+ print $0
+ } else
+ print $0
+ }
+
+It takes the following list:
+
+ wan
+ tew
+ free
+ phore
+
+and produces these results:
+
+ tew
+ wan
+ phore
+ free
+
+ The 'getline' command used in this way sets only the variables 'NR',
+'FNR', and 'RT' (and, of course, VAR). The record is not split into
+fields, so the values of the fields (including '$0') and the value of
+'NF' do not change.
+
+
+File: gawk.info, Node: Getline/File, Next: Getline/Variable/File, Prev: Getline/Variable, Up: Getline
+
+4.9.3 Using 'getline' from a File
+---------------------------------
+
+Use 'getline < FILE' to read the next record from FILE. Here, FILE is a
+string-valued expression that specifies the file name. '< FILE' is
+called a "redirection" because it directs input to come from a different
+place. For example, the following program reads its input record from
+the file 'secondary.input' when it encounters a first field with a value
+equal to 10 in the current input file:
+
+ {
+ if ($1 == 10) {
+ getline < "secondary.input"
+ print
+ } else
+ print
+ }
+
+ Because the main input stream is not used, the values of 'NR' and
+'FNR' are not changed. However, the record it reads is split into
+fields in the normal manner, so the values of '$0' and the other fields
+are changed, resulting in a new value of 'NF'. 'RT' is also set.
+
+ According to POSIX, 'getline < EXPRESSION' is ambiguous if EXPRESSION
+contains unparenthesized operators other than '$'; for example, 'getline
+< dir "/" file' is ambiguous because the concatenation operator (not
+discussed yet; *note Concatenation::) is not parenthesized. You should
+write it as 'getline < (dir "/" file)' if you want your program to be
+portable to all 'awk' implementations.
+
+
+File: gawk.info, Node: Getline/Variable/File, Next: Getline/Pipe, Prev: Getline/File, Up: Getline
+
+4.9.4 Using 'getline' into a Variable from a File
+-------------------------------------------------
+
+Use 'getline VAR < FILE' to read input from the file FILE, and put it in
+the variable VAR. As earlier, FILE is a string-valued expression that
+specifies the file from which to read.
+
+ In this version of 'getline', none of the predefined variables are
+changed and the record is not split into fields. The only variable
+changed is VAR.(1) For example, the following program copies all the
+input files to the output, except for records that say
+'@include FILENAME'. Such a record is replaced by the contents of the
+file FILENAME:
+
+ {
+ if (NF == 2 && $1 == "@include") {
+ while ((getline line < $2) > 0)
+ print line
+ close($2)
+ } else
+ print
+ }
+
+ Note here how the name of the extra input file is not built into the
+program; it is taken directly from the data, specifically from the
+second field on the '@include' line.
+
+ The 'close()' function is called to ensure that if two identical
+'@include' lines appear in the input, the entire specified file is
+included twice. *Note Close Files And Pipes::.
+
+ One deficiency of this program is that it does not process nested
+'@include' statements (i.e., '@include' statements in included files)
+the way a true macro preprocessor would. *Note Igawk Program:: for a
+program that does handle nested '@include' statements.
+
+ ---------- Footnotes ----------
+
+ (1) This is not quite true. 'RT' could be changed if 'RS' is a
+regular expression.
+
+
+File: gawk.info, Node: Getline/Pipe, Next: Getline/Variable/Pipe, Prev: Getline/Variable/File, Up: Getline
+
+4.9.5 Using 'getline' from a Pipe
+---------------------------------
+
+ Omniscience has much to recommend it. Failing that, attention to
+ details would be useful.
+ -- _Brian Kernighan_
+
+ The output of a command can also be piped into 'getline', using
+'COMMAND | getline'. In this case, the string COMMAND is run as a shell
+command and its output is piped into 'awk' to be used as input. This
+form of 'getline' reads one record at a time from the pipe. For
+example, the following program copies its input to its output, except
+for lines that begin with '@execute', which are replaced by the output
+produced by running the rest of the line as a shell command:
+
+ {
+ if ($1 == "@execute") {
+ tmp = substr($0, 10) # Remove "@execute"
+ while ((tmp | getline) > 0)
+ print
+ close(tmp)
+ } else
+ print
+ }
+
+The 'close()' function is called to ensure that if two identical
+'@execute' lines appear in the input, the command is run for each one.
+*Note Close Files And Pipes::. Given the input:
+
+ foo
+ bar
+ baz
+ @execute who
+ bletch
+
+the program might produce:
+
+ foo
+ bar
+ baz
+ arnold ttyv0 Jul 13 14:22
+ miriam ttyp0 Jul 13 14:23 (murphy:0)
+ bill ttyp1 Jul 13 14:23 (murphy:0)
+ bletch
+
+Notice that this program ran the command 'who' and printed the result.
+(If you try this program yourself, you will of course get different
+results, depending upon who is logged in on your system.)
+
+ This variation of 'getline' splits the record into fields, sets the
+value of 'NF', and recomputes the value of '$0'. The values of 'NR' and
+'FNR' are not changed. 'RT' is set.
+
+ According to POSIX, 'EXPRESSION | getline' is ambiguous if EXPRESSION
+contains unparenthesized operators other than '$'--for example, '"echo "
+"date" | getline' is ambiguous because the concatenation operator is not
+parenthesized. You should write it as '("echo " "date") | getline' if
+you want your program to be portable to all 'awk' implementations.
+
+ NOTE: Unfortunately, 'gawk' has not been consistent in its
+ treatment of a construct like '"echo " "date" | getline'. Most
+ versions, including the current version, treat it as '("echo "
+ "date") | getline'. (This is also how BWK 'awk' behaves.) Some
+ versions instead treat it as '"echo " ("date" | getline)'. (This
+ is how 'mawk' behaves.) In short, _always_ use explicit
+ parentheses, and then you won't have to worry.
+
+
+File: gawk.info, Node: Getline/Variable/Pipe, Next: Getline/Coprocess, Prev: Getline/Pipe, Up: Getline
+
+4.9.6 Using 'getline' into a Variable from a Pipe
+-------------------------------------------------
+
+When you use 'COMMAND | getline VAR', the output of COMMAND is sent
+through a pipe to 'getline' and into the variable VAR. For example, the
+following program reads the current date and time into the variable
+'current_time', using the 'date' utility, and then prints it:
+
+ BEGIN {
+ "date" | getline current_time
+ close("date")
+ print "Report printed on " current_time
+ }
+
+ In this version of 'getline', none of the predefined variables are
+changed and the record is not split into fields. However, 'RT' is set.
+
+ According to POSIX, 'EXPRESSION | getline VAR' is ambiguous if
+EXPRESSION contains unparenthesized operators other than '$'; for
+example, '"echo " "date" | getline VAR' is ambiguous because the
+concatenation operator is not parenthesized. You should write it as
+'("echo " "date") | getline VAR' if you want your program to be portable
+to other 'awk' implementations.
+
+
+File: gawk.info, Node: Getline/Coprocess, Next: Getline/Variable/Coprocess, Prev: Getline/Variable/Pipe, Up: Getline
+
+4.9.7 Using 'getline' from a Coprocess
+--------------------------------------
+
+Reading input into 'getline' from a pipe is a one-way operation. The
+command that is started with 'COMMAND | getline' only sends data _to_
+your 'awk' program.
+
+ On occasion, you might want to send data to another program for
+processing and then read the results back. 'gawk' allows you to start a
+"coprocess", with which two-way communications are possible. This is
+done with the '|&' operator. Typically, you write data to the coprocess
+first and then read the results back, as shown in the following:
+
+ print "SOME QUERY" |& "db_server"
+ "db_server" |& getline
+
+which sends a query to 'db_server' and then reads the results.
+
+ The values of 'NR' and 'FNR' are not changed, because the main input
+stream is not used. However, the record is split into fields in the
+normal manner, thus changing the values of '$0', of the other fields,
+and of 'NF' and 'RT'.
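+
+   Here is a small self-contained sketch that uses the 'sort' utility
+as the coprocess: it writes two lines, closes the "to" end so that
+'sort' sees end-of-input, and then reads the sorted results back.
+(Closing just one end of a two-way pipe is discussed in *note Two-way
+I/O::.)
+
+     BEGIN {
+         command = "sort"
+         print "banana" |& command
+         print "apple"  |& command
+         close(command, "to")     # let 'sort' know the input is complete
+         while ((command |& getline line) > 0)
+             print "got:", line
+         close(command)
+     }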
+
+ Coprocesses are an advanced feature. They are discussed here only
+because this is the minor node on 'getline'. *Note Two-way I/O::, where
+coprocesses are discussed in more detail.
+
+
+File: gawk.info, Node: Getline/Variable/Coprocess, Next: Getline Notes, Prev: Getline/Coprocess, Up: Getline
+
+4.9.8 Using 'getline' into a Variable from a Coprocess
+------------------------------------------------------
+
+When you use 'COMMAND |& getline VAR', the output from the coprocess
+COMMAND is sent through a two-way pipe to 'getline' and into the
+variable VAR.
+
+ In this version of 'getline', none of the predefined variables are
+changed and the record is not split into fields. The only variable
+changed is VAR. However, 'RT' is set.
+
+ Coprocesses are an advanced feature. They are discussed here only
+because this is the minor node on 'getline'. *Note Two-way I/O::, where
+coprocesses are discussed in more detail.
+
+
+File: gawk.info, Node: Getline Notes, Next: Getline Summary, Prev: Getline/Variable/Coprocess, Up: Getline
+
+4.9.9 Points to Remember About 'getline'
+----------------------------------------
+
+Here are some miscellaneous points about 'getline' that you should bear
+in mind:
+
+ * When 'getline' changes the value of '$0' and 'NF', 'awk' does _not_
+ automatically jump to the start of the program and start testing
+ the new record against every pattern. However, the new record is
+ tested against any subsequent rules.
+
+ * Some very old 'awk' implementations limit the number of pipelines
+ that an 'awk' program may have open to just one. In 'gawk', there
+ is no such limit. You can open as many pipelines (and coprocesses)
+ as the underlying operating system permits.
+
+ * An interesting side effect occurs if you use 'getline' without a
+ redirection inside a 'BEGIN' rule. Because an unredirected
+ 'getline' reads from the command-line data files, the first
+ 'getline' command causes 'awk' to set the value of 'FILENAME'.
+ Normally, 'FILENAME' does not have a value inside 'BEGIN' rules,
+ because you have not yet started to process the command-line data
+ files. (d.c.) (See *note BEGIN/END::; also *note Auto-set::.)
+
+ * Using 'FILENAME' with 'getline' ('getline < FILENAME') is likely to
+ be a source of confusion. 'awk' opens a separate input stream from
+ the current input file. However, because no variable is used, '$0'
+ and 'NF' are still updated. If you're doing this, it's probably by
+ accident, and you should reconsider what it is you're trying to
+ accomplish.
+
+ * *note Getline Summary::, presents a table summarizing the 'getline'
+ variants and which variables they can affect. It is worth noting
+ that those variants that do not use redirection can cause
+ 'FILENAME' to be updated if they cause 'awk' to start reading a new
+ input file.
+
+ * If the variable being assigned is an expression with side effects,
+ different versions of 'awk' behave differently upon encountering
+ end-of-file. Some versions don't evaluate the expression; many
+ versions (including 'gawk') do. Here is an example, courtesy of
+ Duncan Moore:
+
+ BEGIN {
+ system("echo 1 > f")
+ while ((getline a[++c] < "f") > 0) { }
+ print c
+ }
+
+ Here, the side effect is the '++c'. Is 'c' incremented if
+ end-of-file is encountered before the element in 'a' is assigned?
+
+ 'gawk' treats 'getline' like a function call, and evaluates the
+ expression 'a[++c]' before attempting to read from 'f'. However,
+ some versions of 'awk' only evaluate the expression once they know
+ that there is a string value to be assigned.
+
+
+File: gawk.info, Node: Getline Summary, Prev: Getline Notes, Up: Getline
+
+4.9.10 Summary of 'getline' Variants
+------------------------------------
+
+*note Table 4.1: table-getline-variants. summarizes the eight variants
+of 'getline', listing which predefined variables are set by each one,
+and whether the variant is standard or a 'gawk' extension. Note: for
+each variant, 'gawk' sets the 'RT' predefined variable.
+
+Variant Effect 'awk' / 'gawk'
+-------------------------------------------------------------------------
+'getline' Sets '$0', 'NF', 'FNR', 'awk'
+ 'NR', and 'RT'
+'getline' VAR Sets VAR, 'FNR', 'NR', 'awk'
+ and 'RT'
+'getline <' FILE Sets '$0', 'NF', and 'RT' 'awk'
+'getline VAR < FILE' Sets VAR and 'RT' 'awk'
+COMMAND '| getline' Sets '$0', 'NF', and 'RT' 'awk'
+COMMAND '| getline' Sets VAR and 'RT' 'awk'
+VAR
+COMMAND '|& getline' Sets '$0', 'NF', and 'RT' 'gawk'
+COMMAND '|& getline' Sets VAR and 'RT' 'gawk'
+VAR
+
+Table 4.1: 'getline' variants and what they set
+
+
+File: gawk.info, Node: Read Timeout, Next: Retrying Input, Prev: Getline, Up: Reading Files
+
+4.10 Reading Input with a Timeout
+=================================
+
+This minor node describes a feature that is specific to 'gawk'.
+
+ You may specify a timeout in milliseconds for reading input from the
+keyboard, a pipe, or two-way communication, including TCP/IP sockets.
+This can be done on a per-input, per-command, or per-connection basis,
+by setting a special element in the 'PROCINFO' array (*note Auto-set::):
+
+ PROCINFO["input_name", "READ_TIMEOUT"] = TIMEOUT IN MILLISECONDS
+
+ When set, this causes 'gawk' to time out and return failure if no
+data is available to read within the specified timeout period. For
+example, a TCP client can decide to give up on receiving any response
+from the server after a certain amount of time:
+
+ Service = "/inet/tcp/0/localhost/daytime"
+ PROCINFO[Service, "READ_TIMEOUT"] = 100
+ if ((Service |& getline) > 0)
+ print $0
+ else if (ERRNO != "")
+ print ERRNO
+
+ Here is how to read interactively from the user(1) without waiting
+for more than five seconds:
+
+ PROCINFO["/dev/stdin", "READ_TIMEOUT"] = 5000
+ while ((getline < "/dev/stdin") > 0)
+ print $0
+
+ 'gawk' terminates the read operation if input does not arrive after
+waiting for the timeout period, returns failure, and sets 'ERRNO' to an
+appropriate string value. A negative or zero value for the timeout is
+the same as specifying no timeout at all.
+
+ A timeout can also be set for reading from the keyboard in the
+implicit loop that reads input records and matches them against
+patterns, like so:
+
+ $ gawk 'BEGIN { PROCINFO["-", "READ_TIMEOUT"] = 5000 }
+ > { print "You entered: " $0 }'
+ gawk
+ -| You entered: gawk
+
+ In this case, failure to respond within five seconds results in the
+following error message:
+
+ error-> gawk: cmd. line:2: (FILENAME=- FNR=1) fatal: error reading input file `-': Connection timed out
+
+ The timeout can be set or changed at any time, and will take effect
+on the next attempt to read from the input device. In the following
+example, we start with a timeout value of one second, and progressively
+reduce it by one-tenth of a second until we wait indefinitely for the
+input to arrive:
+
+ PROCINFO[Service, "READ_TIMEOUT"] = 1000
+ while ((Service |& getline) > 0) {
+ print $0
+ PROCINFO[Service, "READ_TIMEOUT"] -= 100
+ }
+
+ NOTE: You should not assume that the read operation will block
+ exactly after the tenth record has been printed. It is possible
+ that 'gawk' will read and buffer more than one record's worth of
+ data the first time. Because of this, changing the value of the
+ timeout as in the preceding example is not very useful.
+
+ If the 'PROCINFO' element is not present and the 'GAWK_READ_TIMEOUT'
+environment variable exists, 'gawk' uses its value to initialize the
+timeout value. The exclusive use of the environment variable to specify
+timeout has the disadvantage of not being able to control it on a
+per-command or per-connection basis.
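+
+   For example, here is a sketch of giving every input a five-second
+timeout for an entire run from the shell:
+
+     $ GAWK_READ_TIMEOUT=5000 gawk '{ print "You entered: " $0 }' -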
+
+ 'gawk' considers a timeout event to be an error even though the
+attempt to read from the underlying device may succeed in a later
+attempt. This is a limitation, and it also means that you cannot use
+this to multiplex input from two or more sources. *Note Retrying
+Input:: for a way to enable later I/O attempts to succeed.
+
+ Assigning a timeout value prevents read operations from blocking
+indefinitely. But bear in mind that there are other ways 'gawk' can
+stall waiting for an input device to be ready. A network client can
+sometimes take a long time to establish a connection before it can start
+reading any data, or the attempt to open a FIFO special file for reading
+can block indefinitely until some other process opens it for writing.
+
+ ---------- Footnotes ----------
+
+ (1) This assumes that standard input is the keyboard.
+
+
+File: gawk.info, Node: Retrying Input, Next: Command-line directories, Prev: Read Timeout, Up: Reading Files
+
+4.11 Retrying Reads After Certain Input Errors
+==============================================
+
+This minor node describes a feature that is specific to 'gawk'.
+
+ When 'gawk' encounters an error while reading input, by default
+'getline' returns -1, and subsequent attempts to read from that file
+result in an end-of-file indication. However, you may optionally
+instruct 'gawk' to allow I/O to be retried when certain errors are
+encountered by setting a special element in the 'PROCINFO' array (*note
+Auto-set::):
+
+ PROCINFO["INPUT_NAME", "RETRY"] = 1
+
+ When this element exists, 'gawk' checks the value of the system (C
+language) 'errno' variable when an I/O error occurs. If 'errno'
+indicates a subsequent I/O attempt may succeed, 'getline' instead
+returns -2 and further calls to 'getline' may succeed. This applies to
+the 'errno' values 'EAGAIN', 'EWOULDBLOCK', 'EINTR', or 'ETIMEDOUT'.
+
+ This feature is useful in conjunction with 'PROCINFO["INPUT_NAME",
+"READ_TIMEOUT"]' or situations where a file descriptor has been
+configured to behave in a non-blocking fashion.
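+
+   As an illustrative sketch, the following combines 'RETRY' with a
+read timeout on a coprocess (the command name is only a placeholder):
+
+     BEGIN {
+         command = "some_slow_server"          # placeholder command
+         PROCINFO[command, "RETRY"] = 1
+         PROCINFO[command, "READ_TIMEOUT"] = 100
+         do {
+             ret = (command |& getline line)
+             if (ret > 0)
+                 print line
+         } while (ret > 0 || ret == -2)   # -2 means "try again later"
+         # a real program would also limit the number of retries
+         close(command)
+     }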
+
+
+File: gawk.info, Node: Command-line directories, Next: Input Summary, Prev: Retrying Input, Up: Reading Files
+
+4.12 Directories on the Command Line
+====================================
+
+According to the POSIX standard, files named on the 'awk' command line
+must be text files; it is a fatal error if they are not. Most versions
+of 'awk' treat a directory on the command line as a fatal error.
+
+ By default, 'gawk' produces a warning for a directory on the command
+line, but otherwise ignores it. This makes it easier to use shell
+wildcards with your 'awk' program:
+
+     $ gawk -f whizprog.awk *      Directories could kill this program
+
+ If either of the '--posix' or '--traditional' options is given, then
+'gawk' reverts to treating a directory on the command line as a fatal
+error.
+
+ *Note Extension Sample Readdir:: for a way to treat directories as
+usable data from an 'awk' program.
+
+
+File: gawk.info, Node: Input Summary, Next: Input Exercises, Prev: Command-line directories, Up: Reading Files
+
+4.13 Summary
+============
+
+ * Input is split into records based on the value of 'RS'. The
+ possibilities are as follows:
+
+ Value of 'RS' Records are split on 'awk' / 'gawk'
+ ...
+ ---------------------------------------------------------------------------
+ Any single That character 'awk'
+ character
+ The empty string Runs of two or more 'awk'
+ ('""') newlines
+ A regexp Text that matches the 'gawk'
+ regexp
+
+ * 'FNR' indicates how many records have been read from the current
+ input file; 'NR' indicates how many records have been read in
+ total.
+
+ * 'gawk' sets 'RT' to the text matched by 'RS'.
+
+ * After splitting the input into records, 'awk' further splits the
+ records into individual fields, named '$1', '$2', and so on. '$0'
+ is the whole record, and 'NF' indicates how many fields there are.
+ The default way to split fields is between whitespace characters.
+
+ * Fields may be referenced using a variable, as in '$NF'. Fields may
+ also be assigned values, which causes the value of '$0' to be
+ recomputed when it is later referenced. Assigning to a field with
+ a number greater than 'NF' creates the field and rebuilds the
+ record, using 'OFS' to separate the fields. Incrementing 'NF' does
+ the same thing. Decrementing 'NF' throws away fields and rebuilds
+ the record.
+
+ * Field splitting is more complicated than record splitting:
+
+ Field separator value Fields are split ... 'awk' /
+ 'gawk'
+ ---------------------------------------------------------------------------
+ 'FS == " "' On runs of whitespace 'awk'
+ 'FS == ANY SINGLE On that character 'awk'
+ CHARACTER'
+ 'FS == REGEXP' On text matching the regexp 'awk'
+ 'FS == ""' Such that each individual 'gawk'
+ character is a separate
+ field
+ 'FIELDWIDTHS == LIST OF Based on character position 'gawk'
+ COLUMNS'
+ 'FPAT == REGEXP' On the text surrounding 'gawk'
+ text matching the regexp
+
+ * Using 'FS = "\n"' causes the entire record to be a single field
+ (assuming that newlines separate records).
+
+ * 'FS' may be set from the command line using the '-F' option. This
+ can also be done using command-line variable assignment.
+
+ * Use 'PROCINFO["FS"]' to see how fields are being split.
+
+ * Use 'getline' in its various forms to read additional records from
+ the default input stream, from a file, or from a pipe or coprocess.
+
+ * Use 'PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out for
+ FILE.
+
+ * Directories on the command line are fatal for standard 'awk';
+ 'gawk' ignores them if not in POSIX mode.
+
+
+File: gawk.info, Node: Input Exercises, Prev: Input Summary, Up: Reading Files
+
+4.14 Exercises
+==============
+
+ 1. Using the 'FIELDWIDTHS' variable (*note Constant Size::), write a
+ program to read election data, where each record represents one
+ voter's votes. Come up with a way to define which columns are
+ associated with each ballot item, and print the total votes,
+ including abstentions, for each item.
+
+ 2. *note Plain Getline::, presented a program to remove C-style
+ comments ('/* ... */') from the input. That program does not work
+ if one comment ends on one line and another one starts later on the
+ same line. That can be fixed by making one simple change. What is
+ it?
+
+
+File: gawk.info, Node: Printing, Next: Expressions, Prev: Reading Files, Up: Top
+
+5 Printing Output
+*****************
+
+One of the most common programming actions is to "print", or output,
+some or all of the input. Use the 'print' statement for simple output,
+and the 'printf' statement for fancier formatting. The 'print'
+statement is not limited when computing _which_ values to print.
+However, with two exceptions, you cannot specify _how_ to print
+them--how many columns, whether to use exponential notation or not, and
+so on. (For the exceptions, *note Output Separators:: and *note
+OFMT::.) For printing with specifications, you need the 'printf'
+statement (*note Printf::).
+
+ Besides basic and formatted printing, this major node also covers I/O
+redirections to files and pipes, introduces the special file names that
+'gawk' processes internally, and discusses the 'close()' built-in
+function.
+
+* Menu:
+
+* Print:: The 'print' statement.
+* Print Examples:: Simple examples of 'print' statements.
+* Output Separators:: The output separators and how to change them.
+* OFMT:: Controlling Numeric Output With 'print'.
+* Printf:: The 'printf' statement.
+* Redirection:: How to redirect output to multiple files and
+ pipes.
+* Special FD:: Special files for I/O.
+* Special Files:: File name interpretation in 'gawk'.
+ 'gawk' allows access to inherited file
+ descriptors.
+* Close Files And Pipes:: Closing Input and Output Files and Pipes.
+* Nonfatal:: Enabling Nonfatal Output.
+* Output Summary:: Output summary.
+* Output Exercises:: Exercises.
+
+
+File: gawk.info, Node: Print, Next: Print Examples, Up: Printing
+
+5.1 The 'print' Statement
+=========================
+
+Use the 'print' statement to produce output with simple, standardized
+formatting. You specify only the strings or numbers to print, in a list
+separated by commas. They are output, separated by single spaces,
+followed by a newline. The statement looks like this:
+
+ print ITEM1, ITEM2, ...
+
+The entire list of items may be optionally enclosed in parentheses. The
+parentheses are necessary if any of the item expressions uses the '>'
+relational operator; otherwise it could be confused with an output
+redirection (*note Redirection::).
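+
+   For example, here is a brief sketch of the difference:
+
+     print ($1 > $2)    # '>' is a comparison; prints 1 or 0
+     print $1 > $2      # redirection; $1 goes to the file named by $2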
+
+ The items to print can be constant strings or numbers, fields of the
+current record (such as '$1'), variables, or any 'awk' expression.
+Numeric values are converted to strings and then printed.
+
+ The simple statement 'print' with no items is equivalent to 'print
+$0': it prints the entire current record. To print a blank line, use
+'print ""'. To print a fixed piece of text, use a string constant, such
+as '"Don't Panic"', as one item. If you forget to use the double-quote
+characters, your text is taken as an 'awk' expression, and you will
+probably get an error. Keep in mind that a space is printed between any
+two items.
+
+ Note that the 'print' statement is a statement and not an
+expression--you can't use it in the pattern part of a pattern-action
+statement, for example.
+
+
+File: gawk.info, Node: Print Examples, Next: Output Separators, Prev: Print, Up: Printing
+
+5.2 'print' Statement Examples
+==============================
+
+Each 'print' statement makes at least one line of output. However, it
+isn't limited to only one line. If an item value is a string containing
+a newline, the newline is output along with the rest of the string. A
+single 'print' statement can make any number of lines this way.
+
+ The following is an example of printing a string that contains
+embedded newlines (the '\n' is an escape sequence, used to represent the
+newline character; *note Escape Sequences::):
+
+ $ awk 'BEGIN { print "line one\nline two\nline three" }'
+ -| line one
+ -| line two
+ -| line three
+
+ The next example, which is run on the 'inventory-shipped' file,
+prints the first two fields of each input record, with a space between
+them:
+
+ $ awk '{ print $1, $2 }' inventory-shipped
+ -| Jan 13
+ -| Feb 15
+ -| Mar 15
+ ...
+
+ A common mistake in using the 'print' statement is to omit the comma
+between two items. This often has the effect of making the items run
+together in the output, with no space. The reason for this is that
+juxtaposing two string expressions in 'awk' means to concatenate them.
+Here is the same program, without the comma:
+
+ $ awk '{ print $1 $2 }' inventory-shipped
+ -| Jan13
+ -| Feb15
+ -| Mar15
+ ...
+
+ To someone unfamiliar with the 'inventory-shipped' file, neither
+example's output makes much sense. A heading line at the beginning
+would make it clearer. Let's add some headings to our table of months
+('$1') and green crates shipped ('$2'). We do this using a 'BEGIN' rule
+(*note BEGIN/END::) so that the headings are only printed once:
+
+ awk 'BEGIN { print "Month Crates"
+ print "----- ------" }
+ { print $1, $2 }' inventory-shipped
+
+When run, the program prints the following:
+
+ Month Crates
+ ----- ------
+ Jan 13
+ Feb 15
+ Mar 15
+ ...
+
+The only problem, however, is that the headings and the table data don't
+line up! We can fix this by printing some spaces between the two
+fields:
+
+ awk 'BEGIN { print "Month Crates"
+ print "----- ------" }
+ { print $1, " ", $2 }' inventory-shipped
+
+ Lining up columns this way can get pretty complicated when there are
+many columns to fix. Counting spaces for two or three columns is
+simple, but any more than this can take up a lot of time. This is why
+the 'printf' statement was created (*note Printf::); one of its
+specialties is lining up columns of data.
+
+ NOTE: You can continue either a 'print' or 'printf' statement
+ simply by putting a newline after any comma (*note
+ Statements/Lines::).
+
+
+File: gawk.info, Node: Output Separators, Next: OFMT, Prev: Print Examples, Up: Printing
+
+5.3 Output Separators
+=====================
+
+As mentioned previously, a 'print' statement contains a list of items
+separated by commas. In the output, the items are normally separated by
+single spaces. However, this doesn't need to be the case; a single
+space is simply the default. Any string of characters may be used as
+the "output field separator" by setting the predefined variable 'OFS'.
+The initial value of this variable is the string '" "' (i.e., a single
+space).
+
+ The output from an entire 'print' statement is called an "output
+record". Each 'print' statement outputs one output record, and then
+outputs a string called the "output record separator" (or 'ORS'). The
+initial value of 'ORS' is the string '"\n"' (i.e., a newline character).
+Thus, each 'print' statement normally makes a separate line.
+
+ In order to change how output fields and records are separated,
+assign new values to the variables 'OFS' and 'ORS'. The usual place to
+do this is in the 'BEGIN' rule (*note BEGIN/END::), so that it happens
+before any input is processed. It can also be done with assignments on
+the command line, before the names of the input files, or using the '-v'
+command-line option (*note Options::). The following example prints the
+first and second fields of each input record, separated by a semicolon,
+with a blank line added after each newline:
+
+ $ awk 'BEGIN { OFS = ";"; ORS = "\n\n" }
+ > { print $1, $2 }' mail-list
+ -| Amelia;555-5553
+ -|
+ -| Anthony;555-3412
+ -|
+ -| Becky;555-7685
+ -|
+ -| Bill;555-1675
+ -|
+ -| Broderick;555-0542
+ -|
+ -| Camilla;555-2912
+ -|
+ -| Fabius;555-1234
+ -|
+ -| Julie;555-6699
+ -|
+ -| Martin;555-6480
+ -|
+ -| Samuel;555-3430
+ -|
+ -| Jean-Paul;555-2127
+ -|
+
+ If the value of 'ORS' does not contain a newline, the program's
+output runs together on a single line.
+
+
+File: gawk.info, Node: OFMT, Next: Printf, Prev: Output Separators, Up: Printing
+
+5.4 Controlling Numeric Output with 'print'
+===========================================
+
+When printing numeric values with the 'print' statement, 'awk'
+internally converts each number to a string of characters and prints
+that string. 'awk' uses the 'sprintf()' function to do this conversion
+(*note String Functions::). For now, it suffices to say that the
+'sprintf()' function accepts a "format specification" that tells it how
+to format numbers (or strings), and that there are a number of different
+ways in which numbers can be formatted. The different format
+specifications are discussed more fully in *note Control Letters::.
+
+ The predefined variable 'OFMT' contains the format specification that
+'print' uses with 'sprintf()' when it wants to convert a number to a
+string for printing. The default value of 'OFMT' is '"%.6g"'. The way
+'print' prints numbers can be changed by supplying a different format
+specification for the value of 'OFMT', as shown in the following
+example:
+
+ $ awk 'BEGIN {
+ > OFMT = "%.0f" # print numbers as integers (rounds)
+ > print 17.23, 17.54 }'
+ -| 17 18
+
+According to the POSIX standard, 'awk''s behavior is undefined if 'OFMT'
+contains anything but a floating-point conversion specification. (d.c.)
+
+
+File: gawk.info, Node: Printf, Next: Redirection, Prev: OFMT, Up: Printing
+
+5.5 Using 'printf' Statements for Fancier Printing
+==================================================
+
+For more precise control over the output format than what is provided by
+'print', use 'printf'. With 'printf' you can specify the width to use
+for each item, as well as various formatting choices for numbers (such
+as what output base to use, whether to print an exponent, whether to
+print a sign, and how many digits to print after the decimal point).
+
+* Menu:
+
+* Basic Printf:: Syntax of the 'printf' statement.
+* Control Letters:: Format-control letters.
+* Format Modifiers:: Format-specification modifiers.
+* Printf Examples:: Several examples.
+
+
+File: gawk.info, Node: Basic Printf, Next: Control Letters, Up: Printf
+
+5.5.1 Introduction to the 'printf' Statement
+--------------------------------------------
+
+A simple 'printf' statement looks like this:
+
+ printf FORMAT, ITEM1, ITEM2, ...
+
+As for 'print', the entire list of arguments may optionally be enclosed
+in parentheses. Here too, the parentheses are necessary if any of the
+item expressions uses the '>' relational operator; otherwise, it can be
+confused with an output redirection (*note Redirection::).
+
+ The difference between 'printf' and 'print' is the FORMAT argument.
+This is an expression whose value is taken as a string; it specifies how
+to output each of the other arguments. It is called the "format
+string".
+
+ The format string is very similar to that in the ISO C library
+function 'printf()'. Most of FORMAT is text to output verbatim.
+Scattered among this text are "format specifiers"--one per item. Each
+format specifier says to output the next item in the argument list at
+that place in the format.
+
+ The 'printf' statement does not automatically append a newline to its
+output. It outputs only what the format string specifies. So if a
+newline is needed, you must include one in the format string. The
+output separator variables 'OFS' and 'ORS' have no effect on 'printf'
+statements. For example:
+
+ $ awk 'BEGIN {
+ > ORS = "\nOUCH!\n"; OFS = "+"
+ > msg = "Don\47t Panic!"
+ > printf "%s\n", msg
+ > }'
+ -| Don't Panic!
+
+Here, neither the '+' nor the 'OUCH!' appears in the output message.
+
+
+File: gawk.info, Node: Control Letters, Next: Format Modifiers, Prev: Basic Printf, Up: Printf
+
+5.5.2 Format-Control Letters
+----------------------------
+
+A format specifier starts with the character '%' and ends with a
+"format-control letter"--it tells the 'printf' statement how to output
+one item. The format-control letter specifies what _kind_ of value to
+print. The rest of the format specifier is made up of optional
+"modifiers" that control _how_ to print the value, such as the field
+width. Here is a list of the format-control letters:
+
+'%c'
+ Print a number as a character; thus, 'printf "%c", 65' outputs the
+ letter 'A'. The output for a string value is the first character
+ of the string.
+
+ NOTE: The POSIX standard says the first character of a string
+ is printed. In locales with multibyte characters, 'gawk'
+ attempts to convert the leading bytes of the string into a
+ valid wide character and then to print the multibyte encoding
+ of that character. Similarly, when printing a numeric value,
+ 'gawk' allows the value to be within the numeric range of
+ values that can be held in a wide character. If the
+ conversion to multibyte encoding fails, 'gawk' uses the low
+ eight bits of the value as the character to print.
+
+ Other 'awk' versions generally restrict themselves to printing
+ the first byte of a string or to numeric values within the
+ range of a single byte (0-255).
+
+'%d', '%i'
+ Print a decimal integer. The two control letters are equivalent.
+ (The '%i' specification is for compatibility with ISO C.)
+
+'%e', '%E'
+ Print a number in scientific (exponential) notation. For example:
+
+ printf "%4.3e\n", 1950
+
+ prints '1.950e+03', with a total of four significant figures, three
+ of which follow the decimal point. (The '4.3' represents two
+ modifiers, discussed in the next node.) '%E' uses 'E' instead of
+ 'e' in the output.
+
+'%f'
+ Print a number in floating-point notation. For example:
+
+ printf "%4.3f", 1950
+
+ prints '1950.000', with a total of four significant figures, three
+ of which follow the decimal point. (The '4.3' represents two
+ modifiers, discussed in the next node.)
+
+ On systems supporting IEEE 754 floating-point format, values
+ representing negative infinity are formatted as '-inf' or
+ '-infinity', and positive infinity as 'inf' or 'infinity'. The
+ special "not a number" value formats as '-nan' or 'nan' (*note Math
+ Definitions::).
+
+'%F'
+ Like '%f', but the infinity and "not a number" values are spelled
+ using uppercase letters.
+
+ The '%F' format is a POSIX extension to ISO C; not all systems
+ support it. On those that don't, 'gawk' uses '%f' instead.
+
+'%g', '%G'
+ Print a number in either scientific notation or in floating-point
+ notation, whichever uses fewer characters; if the result is printed
+ in scientific notation, '%G' uses 'E' instead of 'e'.
+
+'%o'
+ Print an unsigned octal integer (*note Nondecimal-numbers::).
+
+'%s'
+ Print a string.
+
+'%u'
+ Print an unsigned decimal integer. (This format is of marginal
+ use, because all numbers in 'awk' are floating point; it is
+ provided primarily for compatibility with C.)
+
+'%x', '%X'
+ Print an unsigned hexadecimal integer; '%X' uses the letters 'A'
+ through 'F' instead of 'a' through 'f' (*note
+ Nondecimal-numbers::).
+
+'%%'
+ Print a single '%'. This does not consume an argument and it
+ ignores any modifiers.
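+
+   As a brief combined sketch, the following exercises several of these
+letters at once:
+
+     $ gawk 'BEGIN { printf "%c %d %o %x %s\n", 65, 255, 255, 255, "hello" }'
+     -| A 255 377 ff hello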
+
+ NOTE: When using the integer format-control letters for values that
+ are outside the range of the widest C integer type, 'gawk' switches
+ to the '%g' format specifier. If '--lint' is provided on the
+ command line (*note Options::), 'gawk' warns about this. Other
+ versions of 'awk' may print invalid values or do something else
+ entirely. (d.c.)
+
+
+File: gawk.info, Node: Format Modifiers, Next: Printf Examples, Prev: Control Letters, Up: Printf
+
+5.5.3 Modifiers for 'printf' Formats
+------------------------------------
+
+A format specification can also include "modifiers" that can control how
+much of the item's value is printed, as well as how much space it gets.
+The modifiers come between the '%' and the format-control letter. We
+use the bullet symbol "*" in the following examples to represent spaces
+in the output. Here are the possible modifiers, in the order in which
+they may appear:
+
+'N$'
+ An integer constant followed by a '$' is a "positional specifier".
+ Normally, format specifications are applied to arguments in the
+ order given in the format string. With a positional specifier, the
+ format specification is applied to a specific argument, instead of
+ what would be the next argument in the list. Positional specifiers
+ begin counting with one. Thus:
+
+ printf "%s %s\n", "don't", "panic"
+ printf "%2$s %1$s\n", "panic", "don't"
+
+ prints the famous friendly message twice.
+
+ At first glance, this feature doesn't seem to be of much use. It
+ is in fact a 'gawk' extension, intended for use in translating
+ messages at runtime. *Note Printf Ordering::, which describes how
+ and why to use positional specifiers. For now, we ignore them.
+
+'-' (Minus)
+ The minus sign, used before the width modifier (see later on in
+ this list), says to left-justify the argument within its specified
+ width. Normally, the argument is printed right-justified in the
+ specified width. Thus:
+
+ printf "%-4s", "foo"
+
+ prints 'foo*'.
+
+SPACE
+ For numeric conversions, prefix positive values with a space and
+ negative values with a minus sign.
+
+'+'
+ The plus sign, used before the width modifier (see later on in this
+ list), says to always supply a sign for numeric conversions, even
+ if the data to format is positive. The '+' overrides the space
+ modifier.
+
+'#'
+ Use an "alternative form" for certain control letters. For '%o',
+ supply a leading zero. For '%x' and '%X', supply a leading '0x' or
+ '0X' for a nonzero result. For '%e', '%E', '%f', and '%F', the
+ result always contains a decimal point. For '%g' and '%G',
+ trailing zeros are not removed from the result.
+
+'0'
+ A leading '0' (zero) acts as a flag indicating that output should
+ be padded with zeros instead of spaces. This applies only to the
+ numeric output formats. This flag only has an effect when the
+ field width is wider than the value to print.
+
+'''
+ A single quote or apostrophe character is a POSIX extension to ISO
+ C. It indicates that the integer part of a floating-point value, or
+ the entire value of an integer decimal value, should have a
+ thousands-separator character in it. This only works in locales
+ that support such characters. For example:
+
+     $ cat thousands.awk                    Show source program
+     -| BEGIN { printf "%'d\n", 1234567 }
+     $ LC_ALL=C gawk -f thousands.awk
+     -| 1234567                             Results in "C" locale
+     $ LC_ALL=en_US.UTF-8 gawk -f thousands.awk
+     -| 1,234,567                           Results in US English UTF locale
+
+ For more information about locales and internationalization issues,
+ see *note Locales::.
+
+ NOTE: The ''' flag is a nice feature, but its use complicates
+ things: it becomes difficult to use it in command-line
+ programs. For information on appropriate quoting tricks, see
+ *note Quoting::.
+
+WIDTH
+ This is a number specifying the desired minimum width of a field.
+ Inserting any number between the '%' sign and the format-control
+ character forces the field to expand to this width. The default
+ way to do this is to pad with spaces on the left. For example:
+
+ printf "%4s", "foo"
+
+ prints '*foo'.
+
+ The value of WIDTH is a minimum width, not a maximum. If the item
+ value requires more than WIDTH characters, it can be as wide as
+ necessary. Thus, the following:
+
+ printf "%4s", "foobar"
+
+ prints 'foobar'.
+
+ Preceding the WIDTH with a minus sign causes the output to be
+ padded with spaces on the right, instead of on the left.
+
+'.PREC'
+ A period followed by an integer constant specifies the precision to
+ use when printing. The meaning of the precision varies by control
+ letter:
+
+ '%d', '%i', '%o', '%u', '%x', '%X'
+ Minimum number of digits to print.
+
+ '%e', '%E', '%f', '%F'
+ Number of digits to the right of the decimal point.
+
+ '%g', '%G'
+ Maximum number of significant digits.
+
+ '%s'
+ Maximum number of characters from the string that should
+ print.
+
+ Thus, the following:
+
+ printf "%.4s", "foobar"
+
+ prints 'foob'.
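+
+   As a short combined sketch of the sign, zero-padding, and
+alternative-form flags:
+
+     $ gawk 'BEGIN { printf "[%+d] [%05d] [%#o] [%#x]\n", 42, 42, 8, 255 }'
+     -| [+42] [00042] [010] [0xff]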
+
+ The C library 'printf''s dynamic WIDTH and PREC capability (e.g.,
+'"%*.*s"') is supported. Instead of supplying explicit WIDTH and/or
+PREC values in the format string, they are passed in the argument list.
+For example:
+
+ w = 5
+ p = 3
+ s = "abcdefg"
+ printf "%*.*s\n", w, p, s
+
+is exactly equivalent to:
+
+ s = "abcdefg"
+ printf "%5.3s\n", s
+
+Both programs output '**abc'. Earlier versions of 'awk' did not support
+this capability. If you must use such a version, you may simulate this
+feature by using concatenation to build up the format string, like so:
+
+ w = 5
+ p = 3
+ s = "abcdefg"
+ printf "%" w "." p "s\n", s
+
+This is not particularly easy to read, but it does work.
+
+ C programmers may be used to supplying additional modifiers ('h',
+'j', 'l', 'L', 't', and 'z') in 'printf' format strings. These are not
+valid in 'awk'. Most 'awk' implementations silently ignore them. If
+'--lint' is provided on the command line (*note Options::), 'gawk' warns
+about their use. If '--posix' is supplied, their use is a fatal error.
+
+
+File: gawk.info, Node: Printf Examples, Prev: Format Modifiers, Up: Printf
+
+5.5.4 Examples Using 'printf'
+-----------------------------
+
+The following simple example shows how to use 'printf' to make an
+aligned table:
+
+ awk '{ printf "%-10s %s\n", $1, $2 }' mail-list
+
+This command prints the names of the people ('$1') in the file
+'mail-list' as a string of 10 characters that are left-justified. It
+also prints the phone numbers ('$2') next on the line. This produces an
+aligned two-column table of names and phone numbers, as shown here:
+
+ $ awk '{ printf "%-10s %s\n", $1, $2 }' mail-list
+ -| Amelia 555-5553
+ -| Anthony 555-3412
+ -| Becky 555-7685
+ -| Bill 555-1675
+ -| Broderick 555-0542
+ -| Camilla 555-2912
+ -| Fabius 555-1234
+ -| Julie 555-6699
+ -| Martin 555-6480
+ -| Samuel 555-3430
+ -| Jean-Paul 555-2127
+
+ In this case, the phone numbers had to be printed as strings because
+the numbers are separated by dashes. Printing the phone numbers as
+numbers would have produced just the first three digits: '555'. This
+would have been pretty confusing.
+
+ It wasn't necessary to specify a width for the phone numbers because
+they are last on their lines. They don't need to have spaces after
+them.
+
+ The table could be made to look even nicer by adding headings to the
+tops of the columns. This is done using a 'BEGIN' rule (*note
+BEGIN/END::) so that the headers are only printed once, at the beginning
+of the 'awk' program:
+
+ awk 'BEGIN { print "Name Number"
+ print "---- ------" }
+ { printf "%-10s %s\n", $1, $2 }' mail-list
+
+ The preceding example mixes 'print' and 'printf' statements in the
+same program. Using just 'printf' statements can produce the same
+results:
+
+ awk 'BEGIN { printf "%-10s %s\n", "Name", "Number"
+ printf "%-10s %s\n", "----", "------" }
+ { printf "%-10s %s\n", $1, $2 }' mail-list
+
+Printing each column heading with the same format specification used for
+the column elements ensures that the headings are aligned just like the
+columns.
+
+ The fact that the same format specification is used three times can
+be emphasized by storing it in a variable, like this:
+
+ awk 'BEGIN { format = "%-10s %s\n"
+ printf format, "Name", "Number"
+ printf format, "----", "------" }
+ { printf format, $1, $2 }' mail-list
+
+
+File: gawk.info, Node: Redirection, Next: Special FD, Prev: Printf, Up: Printing
+
+5.6 Redirecting Output of 'print' and 'printf'
+==============================================
+
+So far, the output from 'print' and 'printf' has gone to the standard
+output, usually the screen. Both 'print' and 'printf' can also send
+their output to other places. This is called "redirection".
+
+ NOTE: When '--sandbox' is specified (*note Options::), redirecting
+ output to files, pipes, and coprocesses is disabled.
+
+ A redirection appears after the 'print' or 'printf' statement.
+Redirections in 'awk' are written just like redirections in shell
+commands, except that they are written inside the 'awk' program.
+
+ There are four forms of output redirection: output to a file, output
+appended to a file, output through a pipe to another command, and output
+to a coprocess. We show them all for the 'print' statement, but they
+work identically for 'printf':
+
+'print ITEMS > OUTPUT-FILE'
+ This redirection prints the items into the output file named
+ OUTPUT-FILE. The file name OUTPUT-FILE can be any expression. Its
+ value is changed to a string and then used as a file name (*note
+ Expressions::).
+
+ When this type of redirection is used, the OUTPUT-FILE is erased
+ before the first output is written to it. Subsequent writes to the
+ same OUTPUT-FILE do not erase OUTPUT-FILE, but append to it. (This
+ is different from how you use redirections in shell scripts.) If
+ OUTPUT-FILE does not exist, it is created. For example, here is
+ how an 'awk' program can write a list of people's names to one file
+ named 'name-list', and a list of phone numbers to another file
+ named 'phone-list':
+
+ $ awk '{ print $2 > "phone-list"
+ > print $1 > "name-list" }' mail-list
+ $ cat phone-list
+ -| 555-5553
+ -| 555-3412
+ ...
+ $ cat name-list
+ -| Amelia
+ -| Anthony
+ ...
+
+ Each output file contains one name or number per line.
+
+'print ITEMS >> OUTPUT-FILE'
+ This redirection prints the items into the preexisting output file
+ named OUTPUT-FILE. The difference between this and the single-'>'
+ redirection is that the old contents (if any) of OUTPUT-FILE are
+ not erased. Instead, the 'awk' output is appended to the file. If
+ OUTPUT-FILE does not exist, then it is created.
+
+'print ITEMS | COMMAND'
+ It is possible to send output to another program through a pipe
+ instead of into a file. This redirection opens a pipe to COMMAND,
+ and writes the values of ITEMS through this pipe to another process
+ created to execute COMMAND.
+
+ The redirection argument COMMAND is actually an 'awk' expression.
+ Its value is converted to a string whose contents give the shell
+ command to be run. For example, the following produces two files,
+ one unsorted list of people's names, and one list sorted in reverse
+ alphabetical order:
+
+ awk '{ print $1 > "names.unsorted"
+ command = "sort -r > names.sorted"
+ print $1 | command }' mail-list
+
+ The unsorted list is written with an ordinary redirection, while
+ the sorted list is written by piping through the 'sort' utility.
+
+ The next example uses redirection to mail a message to the mailing
+ list 'bug-system'. This might be useful when trouble is
+ encountered in an 'awk' script run periodically for system
+ maintenance:
+
+ report = "mail bug-system"
+ print("Awk script failed:", $0) | report
+ print("at record number", FNR, "of", FILENAME) | report
+ close(report)
+
+ The 'close()' function is called here because it's a good idea to
+ close the pipe as soon as all the intended output has been sent to
+ it. *Note Close Files And Pipes:: for more information.
+
+ This example also illustrates the use of a variable to represent a
+ FILE or COMMAND--it is not necessary to always use a string
+ constant. Using a variable is generally a good idea, because (if
+ you mean to refer to that same file or command) 'awk' requires that
+ the string value be written identically every time.
+
+'print ITEMS |& COMMAND'
+ This redirection prints the items to the input of COMMAND. The
+ difference between this and the single-'|' redirection is that the
+ output from COMMAND can be read with 'getline'. Thus, COMMAND is a
+ "coprocess", which works together with but is subsidiary to the
+ 'awk' program.
+
+ This feature is a 'gawk' extension, and is not available in POSIX
+ 'awk'. *Note Getline/Coprocess::, for a brief discussion. *Note
+ Two-way I/O::, for a more complete discussion.
+
+ Redirecting output using '>', '>>', '|', or '|&' asks the system to
+open a file, pipe, or coprocess only if the particular FILE or COMMAND
+you specify has not already been written to by your program or if it has
+been closed since it was last written to.
+
+ It is a common error to use '>' redirection for the first 'print' to
+a file, and then to use '>>' for subsequent output:
+
+ # clear the file
+ print "Don't panic" > "guide.txt"
+ ...
+ # append
+ print "Avoid improbability generators" >> "guide.txt"
+
+This is indeed how redirections must be used from the shell. But in
+'awk', it isn't necessary. In this kind of case, a program should use
+'>' for all the 'print' statements, because the output file is only
+opened once. (It happens that if you mix '>' and '>>', output is
+produced in the expected order. However, mixing the operators for the
+same file is definitely poor style, and is confusing to readers of your
+program.)
+
+ Many older 'awk' implementations limit the number of pipelines that
+an 'awk' program may have open to just one! In 'gawk', there is no such
+limit. 'gawk' allows a program to open as many pipelines as the
+underlying operating system permits.
+
+ Piping into 'sh'
+
+ A particularly powerful way to use redirection is to build command
+lines and pipe them into the shell, 'sh'. For example, suppose you have
+a list of files brought over from a system where all the file names are
+stored in uppercase, and you wish to rename them to have names in all
+lowercase. The following program is both simple and efficient:
+
+ { printf("mv %s %s\n", $0, tolower($0)) | "sh" }
+
+ END { close("sh") }
+
+ The 'tolower()' function returns its argument string with all
+uppercase characters converted to lowercase (*note String Functions::).
+The program builds up a list of command lines, using the 'mv' utility to
+rename the files. It then sends the list to the shell for execution.
+
+ *Note Shell Quoting:: for a function that can help in generating
+command lines to be fed to the shell.
+
+
+File: gawk.info, Node: Special FD, Next: Special Files, Prev: Redirection, Up: Printing
+
+5.7 Special Files for Standard Preopened Data Streams
+=====================================================
+
+Running programs conventionally have three input and output streams
+already available to them for reading and writing. These are known as
+the "standard input", "standard output", and "standard error output".
+These open streams (and any other open files or pipes) are often
+referred to by the technical term "file descriptors".
+
+ These streams are, by default, connected to your keyboard and screen,
+but they are often redirected with the shell, via the '<', '<<', '>',
+'>>', '>&', and '|' operators. Standard error is typically used for
+writing error messages; the reason there are two separate streams,
+standard output and standard error, is so that they can be redirected
+separately.
+
+ In traditional implementations of 'awk', the only way to write an
+error message to standard error in an 'awk' program is as follows:
+
+ print "Serious error detected!" | "cat 1>&2"
+
+This works by opening a pipeline to a shell command that can access the
+standard error stream that it inherits from the 'awk' process. This is
+far from elegant, and it also requires a separate process. So people
+writing 'awk' programs often don't do this. Instead, they send the
+error messages to the screen, like this:
+
+ print "Serious error detected!" > "/dev/tty"
+
+('/dev/tty' is a special file supplied by the operating system that is
+connected to your keyboard and screen. It represents the "terminal,"(1)
+which on modern systems is a keyboard and screen, not a serial console.)
+This generally has the same effect, but not always: although the
+standard error stream is usually the screen, it can be redirected; when
+that happens, writing to the screen is not correct. In fact, if 'awk'
+is run from a background job, it may not have a terminal at all. Then
+opening '/dev/tty' fails.
+
+ 'gawk', BWK 'awk', and 'mawk' provide special file names for
+accessing the three standard streams. If the file name matches one of
+these special names when 'gawk' (or one of the others) redirects input
+or output, then it directly uses the descriptor that the file name
+stands for. These special file names work for all operating systems
+that 'gawk' has been ported to, not just those that are POSIX-compliant:
+
+'/dev/stdin'
+ The standard input (file descriptor 0).
+
+'/dev/stdout'
+ The standard output (file descriptor 1).
+
+'/dev/stderr'
+ The standard error output (file descriptor 2).
+
+ With these facilities, the proper way to write an error message then
+becomes:
+
+ print "Serious error detected!" > "/dev/stderr"
+
+ Note the use of quotes around the file name. Like with any other
+redirection, the value must be a string. It is a common error to omit
+the quotes, which leads to confusing results.
+
+ 'gawk' does not treat these file names as special when in
+POSIX-compatibility mode. However, because BWK 'awk' supports them,
+'gawk' does support them even when invoked with the '--traditional'
+option (*note Options::).
+
+ ---------- Footnotes ----------
+
+ (1) The "tty" in '/dev/tty' stands for "Teletype," a serial terminal.
+
+
+File: gawk.info, Node: Special Files, Next: Close Files And Pipes, Prev: Special FD, Up: Printing
+
+5.8 Special File Names in 'gawk'
+================================
+
+Besides access to standard input, standard output, and standard error,
+'gawk' provides access to any open file descriptor. Additionally, there
+are special file names reserved for TCP/IP networking.
+
+* Menu:
+
+* Other Inherited Files:: Accessing other open files with
+ 'gawk'.
+* Special Network:: Special files for network communications.
+* Special Caveats:: Things to watch out for.
+
+
+File: gawk.info, Node: Other Inherited Files, Next: Special Network, Up: Special Files
+
+5.8.1 Accessing Other Open Files with 'gawk'
+--------------------------------------------
+
+Besides the '/dev/stdin', '/dev/stdout', and '/dev/stderr' special file
+names mentioned earlier, 'gawk' provides syntax for accessing any other
+inherited open file:
+
+'/dev/fd/N'
+ The file associated with file descriptor N. Such a file must be
+ opened by the program initiating the 'awk' execution (typically the
+ shell). Unless special pains are taken in the shell from which
+ 'gawk' is invoked, only descriptors 0, 1, and 2 are available.
+
+ The file names '/dev/stdin', '/dev/stdout', and '/dev/stderr' are
+essentially aliases for '/dev/fd/0', '/dev/fd/1', and '/dev/fd/2',
+respectively. However, those names are more self-explanatory.
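+
+ For example, here is a brief sketch in which the shell opens
+descriptor 3 for 'gawk' and the program writes to it by name (the
+output file name 'extra.out' is purely illustrative):
+
+     gawk 'BEGIN { print "extra output" > "/dev/fd/3" }' 3> extra.out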
+
+ Note that using 'close()' on a file name of the form '"/dev/fd/N"',
+for file descriptor numbers above two, does actually close the given
+file descriptor.
+
+
+File: gawk.info, Node: Special Network, Next: Special Caveats, Prev: Other Inherited Files, Up: Special Files
+
+5.8.2 Special Files for Network Communications
+----------------------------------------------
+
+'gawk' programs can open a two-way TCP/IP connection, acting as either a
+client or a server. This is done using a special file name of the form:
+
+ /NET-TYPE/PROTOCOL/LOCAL-PORT/REMOTE-HOST/REMOTE-PORT
+
+ The NET-TYPE is one of 'inet', 'inet4', or 'inet6'. The PROTOCOL is
+one of 'tcp' or 'udp', and the other fields represent the other
+essential pieces of information for making a networking connection.
+These file names are used with the '|&' operator for communicating with
+a coprocess (*note Two-way I/O::). This is an advanced feature,
+mentioned here only for completeness. Full discussion is delayed until
+*note TCP/IP Networking::.
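+
+ As a small taste of what such a file name looks like in use, here is
+a sketch that assumes a "daytime" service is listening on the local
+host; it connects as a client, reads one line, and closes the
+connection:
+
+     BEGIN {
+         service = "/inet/tcp/0/localhost/daytime"
+         service |& getline result
+         print result
+         close(service)
+     }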
+
+
+File: gawk.info, Node: Special Caveats, Prev: Special Network, Up: Special Files
+
+5.8.3 Special File Name Caveats
+-------------------------------
+
+Here are some things to bear in mind when using the special file names
+that 'gawk' provides:
+
+ * Recognition of the file names for the three standard preopened
+ files is disabled only in POSIX mode.
+
+ * Recognition of the other special file names is disabled if 'gawk'
+ is in compatibility mode (either '--traditional' or '--posix';
+ *note Options::).
+
+ * 'gawk' _always_ interprets these special file names. For example,
+ using '/dev/fd/4' for output actually writes on file descriptor 4,
+ and not on a new file descriptor that is 'dup()'ed from file
+ descriptor 4. Most of the time this does not matter; however, it
+ is important to _not_ close any of the files related to file
+ descriptors 0, 1, and 2. Doing so results in unpredictable
+ behavior.
+
+
+File: gawk.info, Node: Close Files And Pipes, Next: Nonfatal, Prev: Special Files, Up: Printing
+
+5.9 Closing Input and Output Redirections
+=========================================
+
+If the same file name or the same shell command is used with 'getline'
+more than once during the execution of an 'awk' program (*note
+Getline::), the file is opened (or the command is executed) the first
+time only. At that time, the first record of input is read from that
+file or command. The next time the same file or command is used with
+'getline', another record is read from it, and so on.
+
+ Similarly, when a file or pipe is opened for output, 'awk' remembers
+the file name or command associated with it, and subsequent writes to
+the same file or command are appended to the previous writes. The file
+or pipe stays open until 'awk' exits.
+
+ This implies that special steps are necessary in order to read the
+same file again from the beginning, or to rerun a shell command (rather
+than reading more output from the same command). The 'close()' function
+makes these things possible:
+
+ close(FILENAME)
+
+or:
+
+ close(COMMAND)
+
+ The argument FILENAME or COMMAND can be any expression. Its value
+must _exactly_ match the string that was used to open the file or start
+the command (spaces and other "irrelevant" characters included). For
+example, if you open a pipe with this:
+
+ "sort -r names" | getline foo
+
+then you must close it with this:
+
+ close("sort -r names")
+
+ Once this function call is executed, the next 'getline' from that
+file or command, or the next 'print' or 'printf' to that file or
+command, reopens the file or reruns the command. Because the expression
+that you use to close a file or pipeline must exactly match the
+expression used to open the file or run the command, it is good practice
+to use a variable to store the file name or command. The previous
+example becomes the following:
+
+ sortcom = "sort -r names"
+ sortcom | getline foo
+ ...
+ close(sortcom)
+
+This helps avoid hard-to-find typographical errors in your 'awk'
+programs. Here are some of the reasons for closing an output file:
+
+ * To write a file and read it back later on in the same 'awk'
+ program. Close the file after writing it, then begin reading it
+ with 'getline'.
+
+ * To write numerous files, successively, in the same 'awk' program.
+ If the files aren't closed, eventually 'awk' may exceed a system
+ limit on the number of open files in one process. It is best to
+ close each one when the program has finished writing it.
+
+ * To make a command finish. When output is redirected through a
+ pipe, the command reading the pipe normally continues to try to
+ read input as long as the pipe is open. Often this means the
+ command cannot really do its work until the pipe is closed. For
+ example, if output is redirected to the 'mail' program, the message
+ is not actually sent until the pipe is closed.
+
+ * To run the same program a second time, with the same arguments.
+ This is not the same thing as giving more input to the first run!
+
+ For example, suppose a program pipes output to the 'mail' program.
+ If it outputs several lines redirected to this pipe without closing
+ it, they make a single message of several lines. By contrast, if
+ the program closes the pipe after each line of output, then each
+ line makes a separate message.
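+
+ For instance, the first reason listed above (writing a file and then
+reading it back) might look like the following sketch, in which the
+temporary file name is purely illustrative:
+
+     BEGIN {
+         tmp = "saved-data"          # illustrative file name
+         print "first line"  > tmp
+         print "second line" > tmp
+         close(tmp)                  # finish writing before reading
+         while ((getline line < tmp) > 0)
+             print "read back:", line
+         close(tmp)
+     }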
+
+ If you use more files than the system allows you to have open, 'gawk'
+attempts to multiplex the available open files among your data files.
+'gawk''s ability to do this depends upon the facilities of your
+operating system, so it may not always work. It is therefore both good
+practice and good portability advice to always use 'close()' on your
+files when you are done with them. In fact, if you are using a lot of
+pipes, it is essential that you close commands when done. For example,
+consider something like this:
+
+ {
+ ...
+ command = ("grep " $1 " /some/file | my_prog -q " $3)
+ while ((command | getline) > 0) {
+ PROCESS OUTPUT OF command
+ }
+ # need close(command) here
+ }
+
+ This example creates a new pipeline based on data in _each_ record.
+Without the call to 'close()' indicated in the comment, 'awk' creates
+child processes to run the commands, until it eventually runs out of
+file descriptors for more pipelines.
+
+ Even though each command has finished (as indicated by the
+end-of-file return status from 'getline'), the child process is not
+terminated;(1) more importantly, the file descriptor for the pipe is not
+closed and released until 'close()' is called or 'awk' exits.
+
+ 'close()' silently does nothing if given an argument that does not
+represent a file, pipe, or coprocess that was opened with a redirection.
+In such a case, it returns a negative value, indicating an error. In
+addition, 'gawk' sets 'ERRNO' to a string indicating the error.
+
+ Note also that 'close(FILENAME)' has no "magic" effects on the
+implicit loop that reads through the files named on the command line.
+It is, more likely, a close of a file that was never opened with a
+redirection, so 'awk' silently does nothing, except return a negative
+value.
+
+ When using the '|&' operator to communicate with a coprocess, it is
+occasionally useful to be able to close one end of the two-way pipe
+without closing the other. This is done by supplying a second argument
+to 'close()'. As in any other call to 'close()', the first argument is
+the name of the command or special file used to start the coprocess.
+The second argument should be a string, with either of the values '"to"'
+or '"from"'. Case does not matter. As this is an advanced feature,
+discussion is delayed until *note Two-way I/O::, which describes it in
+more detail and gives an example.
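+
+ In barest outline, the technique looks something like this sketch;
+*note Two-way I/O:: explains why and when it is needed:
+
+     command = "sort"
+     print "cherry" |& command
+     print "apple"  |& command
+     close(command, "to")      # 'sort' now sees end-of-file on its input
+     while ((command |& getline line) > 0)
+         print "sorted:", line
+     close(command)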
+
+ Using 'close()''s Return Value
+
+ In many older versions of Unix 'awk', the 'close()' function is
+actually a statement. (d.c.) It is a syntax error to try to use the
+return value from 'close()':
+
+ command = "..."
+ command | getline info
+ retval = close(command) # syntax error in many Unix awks
+
+ 'gawk' treats 'close()' as a function. The return value is -1 if the
+argument names something that was never opened with a redirection, or if
+there is a system problem closing the file or process. In these cases,
+'gawk' sets the predefined variable 'ERRNO' to a string describing the
+problem.
+
+ In 'gawk', starting with version 4.2, when closing a pipe or
+coprocess (input or output), the return value is the exit status of the
+command, as described in *note Table 5.1:
+table-close-pipe-return-values.(2) Otherwise, it is the return value
+from the system's 'close()' or 'fclose()' C functions when closing input
+or output files, respectively. This value is zero if the close
+succeeds, or -1 if it fails.
+
+Situation                        Return value from 'close()'
+--------------------------------------------------------------------------
+Normal exit of command           Command's exit status
+Death by signal of command       256 + number of murderous signal
+Death by signal of command,      512 + number of murderous signal
+  with core dump
+Some kind of error               -1
+
+Table 5.1: Return values from 'close()' of a pipe
+
+ The POSIX standard is very vague; it says that 'close()' returns zero
+on success and a nonzero value otherwise. In general, different
+implementations vary in what they report when closing pipes; thus, the
+return value cannot be used portably. (d.c.) In POSIX mode (*note
+Options::), 'gawk' just returns zero when closing a pipe.
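+
+ If you do want to act on the exit status with 'gawk' 4.2 or later, a
+sketch along the following lines (the command shown is hypothetical)
+decodes the return values listed in the table above:
+
+     command = "some_command"          # hypothetical command
+     while ((command | getline line) > 0)
+         continue                      # consume the command's output
+     status = close(command)
+     if (status == -1)
+         print "close failed:", ERRNO > "/dev/stderr"
+     else if (status > 512)
+         print "died from signal", status - 512, "(core dumped)"
+     else if (status > 256)
+         print "died from signal", status - 256
+     else
+         print "exit status was", status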
+
+ ---------- Footnotes ----------
+
+ (1) The technical terminology is rather morbid. The finished child
+is called a "zombie," and cleaning up after it is referred to as
+"reaping."
+
+ (2) Prior to version 4.2, the return value from closing a pipe or
+co-process was the full 16-bit exit value as defined by the 'wait()'
+system call.
+
+
+File: gawk.info, Node: Nonfatal, Next: Output Summary, Prev: Close Files And Pipes, Up: Printing
+
+5.10 Enabling Nonfatal Output
+=============================
+
+This minor node describes a 'gawk'-specific feature.
+
+ In standard 'awk', an I/O error during output with 'print' or
+'printf', such as writing to a nonexistent file or filling up the
+disk, is a fatal error.
+
+ $ gawk 'BEGIN { print "hi" > "/no/such/file" }'
+ error-> gawk: cmd. line:1: fatal: can't redirect to `/no/such/file' (No such file or directory)
+
+ 'gawk' makes it possible to detect that an error has occurred,
+allowing you to possibly recover from the error, or at least print an
+error message of your choosing before exiting. You can do this in one
+of two ways:
+
+ * For all output files, by assigning any value to
+ 'PROCINFO["NONFATAL"]'.
+
+ * On a per-file basis, by assigning any value to 'PROCINFO[FILENAME,
+ "NONFATAL"]'. Here, FILENAME is the name of the file to which you
+ wish output to be nonfatal.
+
+ Once you have enabled nonfatal output, you must check 'ERRNO' after
+every relevant 'print' or 'printf' statement to see if something went
+wrong. It is also a good idea to initialize 'ERRNO' to zero before
+attempting the output. For example:
+
+ $ gawk '
+ > BEGIN {
+ > PROCINFO["NONFATAL"] = 1
+ > ERRNO = 0
+ > print "hi" > "/no/such/file"
+ > if (ERRNO) {
+ > print("Output failed:", ERRNO) > "/dev/stderr"
+ > exit 1
+ > }
+ > }'
+ error-> Output failed: No such file or directory
+
+ Here, 'gawk' did not produce a fatal error; instead it let the 'awk'
+program code detect the problem and handle it.
+
+ This mechanism also works for standard output and standard error.
+For standard output, you may use 'PROCINFO["-", "NONFATAL"]' or
+'PROCINFO["/dev/stdout", "NONFATAL"]'. For standard error, use
+'PROCINFO["/dev/stderr", "NONFATAL"]'.
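+
+ As a sketch of the per-file form, the following marks just one output
+file as nonfatal (the path used here is intentionally bogus):
+
+     BEGIN {
+         file = "/no/such/dir/logfile"
+         PROCINFO[file, "NONFATAL"] = 1
+         ERRNO = 0
+         print "message" > file
+         if (ERRNO)
+             print "Output failed:", ERRNO > "/dev/stderr"
+     }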
+
+ When attempting to open a TCP/IP socket (*note TCP/IP Networking::),
+'gawk' tries multiple times. The 'GAWK_SOCK_RETRIES' environment
+variable (*note Other Environment Variables::) allows you to override
+'gawk''s builtin default number of attempts. However, once nonfatal I/O
+is enabled for a given socket, 'gawk' only retries once, relying on
+'awk'-level code to notice that there was a problem.
+
+
+File: gawk.info, Node: Output Summary, Next: Output Exercises, Prev: Nonfatal, Up: Printing
+
+5.11 Summary
+============
+
+ * The 'print' statement prints comma-separated expressions. Each
+ expression is separated by the value of 'OFS' and terminated by the
+ value of 'ORS'. 'OFMT' provides the conversion format for numeric
+ values for the 'print' statement.
+
+ * The 'printf' statement provides finer-grained control over output,
+ with format-control letters for different data types and various
+ flags that modify the behavior of the format-control letters.
+
+ * Output from both 'print' and 'printf' may be redirected to files,
+ pipes, and coprocesses.
+
+ * 'gawk' provides special file names for access to standard input,
+ output, and error, and for network communications.
+
+ * Use 'close()' to close open file, pipe, and coprocess redirections.
+ For coprocesses, it is possible to close only one direction of the
+ communications.
+
+ * Normally errors with 'print' or 'printf' are fatal. 'gawk' lets
+ you make output errors be nonfatal either for all files or on a
+ per-file basis. You must then check for errors after every
+ relevant output statement.
+
+
+File: gawk.info, Node: Output Exercises, Prev: Output Summary, Up: Printing
+
+5.12 Exercises
+==============
+
+ 1. Rewrite the program:
+
+ awk 'BEGIN { print "Month Crates"
+ print "----- ------" }
+ { print $1, " ", $2 }' inventory-shipped
+
+ from *note Output Separators::, by using a new value of 'OFS'.
+
+ 2. Use the 'printf' statement to line up the headings and table data
+ for the 'inventory-shipped' example that was covered in *note
+ Print::.
+
+ 3. What happens if you forget the double quotes when redirecting
+ output, as follows:
+
+ BEGIN { print "Serious error detected!" > /dev/stderr }
+
+
+File: gawk.info, Node: Expressions, Next: Patterns and Actions, Prev: Printing, Up: Top
+
+6 Expressions
+*************
+
+Expressions are the basic building blocks of 'awk' patterns and actions.
+An expression evaluates to a value that you can print, test, or pass to
+a function. Additionally, an expression can assign a new value to a
+variable or a field by using an assignment operator.
+
+ An expression can serve as a pattern or action statement on its own.
+Most other kinds of statements contain one or more expressions that
+specify the data on which to operate. As in other languages,
+expressions in 'awk' can include variables, array references, constants,
+and function calls, as well as combinations of these with various
+operators.
+
+* Menu:
+
+* Values:: Constants, Variables, and Regular Expressions.
+* All Operators:: 'gawk''s operators.
+* Truth Values and Conditions:: Testing for true and false.
+* Function Calls:: A function call is an expression.
+* Precedence:: How various operators nest.
+* Locales:: How the locale affects things.
+* Expressions Summary:: Expressions summary.
+
+
+File: gawk.info, Node: Values, Next: All Operators, Up: Expressions
+
+6.1 Constants, Variables, and Conversions
+=========================================
+
+Expressions are built up from values and the operations performed upon
+them. This minor node describes the elementary objects that provide the
+values used in expressions.
+
+* Menu:
+
+* Constants:: String, numeric and regexp constants.
+* Using Constant Regexps:: When and how to use a regexp constant.
+* Variables:: Variables give names to values for later use.
+* Conversion:: The conversion of strings to numbers and vice
+ versa.
+
+
+File: gawk.info, Node: Constants, Next: Using Constant Regexps, Up: Values
+
+6.1.1 Constant Expressions
+--------------------------
+
+The simplest type of expression is the "constant", which always has the
+same value. There are three types of constants: numeric, string, and
+regular expression.
+
+ Each is used in the appropriate context when you need a data value
+that isn't going to change. Numeric constants can have different forms,
+but are internally stored in an identical manner.
+
+* Menu:
+
+* Scalar Constants:: Numeric and string constants.
+* Nondecimal-numbers:: What are octal and hex numbers.
+* Regexp Constants:: Regular Expression constants.
+
+
+File: gawk.info, Node: Scalar Constants, Next: Nondecimal-numbers, Up: Constants
+
+6.1.1.1 Numeric and String Constants
+....................................
+
+A "numeric constant" stands for a number. This number can be an
+integer, a decimal fraction, or a number in scientific (exponential)
+notation.(1) Here are some examples of numeric constants that all have
+the same value:
+
+ 105
+ 1.05e+2
+ 1050e-1
+
+ A "string constant" consists of a sequence of characters enclosed in
+double quotation marks. For example:
+
+ "parrot"
+
+represents the string whose contents are 'parrot'. Strings in 'gawk'
+can be of any length, and they can contain any of the possible eight-bit
+ASCII characters, including ASCII NUL (character code zero). Other
+'awk' implementations may have difficulty with some character codes.
+
+ ---------- Footnotes ----------
+
+ (1) The internal representation of all numbers, including integers,
+uses double-precision floating-point numbers. On most modern systems,
+these are in IEEE 754 standard format. *Note Arbitrary Precision
+Arithmetic::, for much more information.
+
+
+File: gawk.info, Node: Nondecimal-numbers, Next: Regexp Constants, Prev: Scalar Constants, Up: Constants
+
+6.1.1.2 Octal and Hexadecimal Numbers
+.....................................
+
+In 'awk', all numbers are in decimal (i.e., base 10). Many other
+programming languages allow you to specify numbers in other bases, often
+octal (base 8) and hexadecimal (base 16). In octal, the numbers go 0,
+1, 2, 3, 4, 5, 6, 7, 10, 11, 12, and so on. Just as '11' in decimal is
+1 times 10 plus 1, so '11' in octal is 1 times 8 plus 1. This equals 9
+in decimal. In hexadecimal, there are 16 digits. Because the everyday
+decimal number system only has ten digits ('0'-'9'), the letters 'a'
+through 'f' are used to represent the rest. (Case in the letters is
+usually irrelevant; hexadecimal 'a' and 'A' have the same value.) Thus,
+'11' in hexadecimal is 1 times 16 plus 1, which equals 17 in decimal.
+
+ Just by looking at plain '11', you can't tell what base it's in. So,
+in C, C++, and other languages derived from C, there is a special
+notation to signify the base. Octal numbers start with a leading '0',
+and hexadecimal numbers start with a leading '0x' or '0X':
+
+'11'
+ Decimal value 11
+
+'011'
+ Octal 11, decimal value 9
+
+'0x11'
+ Hexadecimal 11, decimal value 17
+
+ This example shows the difference:
+
+ $ gawk 'BEGIN { printf "%d, %d, %d\n", 011, 11, 0x11 }'
+ -| 9, 11, 17
+
+ Being able to use octal and hexadecimal constants in your programs is
+most useful when working with data that cannot be represented
+conveniently as characters or as regular numbers, such as binary data of
+various sorts.
+
+ 'gawk' allows the use of octal and hexadecimal constants in your
+program text. However, such numbers in the input data are not treated
+differently; doing so by default would break old programs. (If you
+really need to do this, use the '--non-decimal-data' command-line
+option; *note Nondecimal Data::.) If you have octal or hexadecimal
+data, you can use the 'strtonum()' function (*note String Functions::)
+to convert the data into a number. Most of the time, you will want to
+use octal or hexadecimal constants when working with the built-in
+bit-manipulation functions; see *note Bitwise Functions:: for more
+information.
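+
+ For instance, here is a quick sketch of using 'strtonum()' on
+hexadecimal input data:
+
+     $ echo 0x11 | gawk '{ print strtonum($1) }'
+     -| 17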
+
+ Unlike in some early C implementations, '8' and '9' are not valid in
+octal constants. For example, 'gawk' treats '018' as decimal 18:
+
+ $ gawk 'BEGIN { print "021 is", 021 ; print 018 }'
+ -| 021 is 17
+ -| 18
+
+ Octal and hexadecimal source code constants are a 'gawk' extension.
+If 'gawk' is in compatibility mode (*note Options::), they are not
+available.
+
+ A Constant's Base Does Not Affect Its Value
+
+ Once a numeric constant has been converted internally into a number,
+'gawk' no longer remembers what the original form of the constant was;
+the internal value is always used. This has particular consequences for
+conversion of numbers to strings:
+
+ $ gawk 'BEGIN { printf "0x11 is <%s>\n", 0x11 }'
+ -| 0x11 is <17>
+
+
+File: gawk.info, Node: Regexp Constants, Prev: Nondecimal-numbers, Up: Constants
+
+6.1.1.3 Regular Expression Constants
+....................................
+
+A "regexp constant" is a regular expression description enclosed in
+slashes, such as '/^beginning and end$/'. Most regexps used in 'awk'
+programs are constant, but the '~' and '!~' matching operators can also
+match computed or dynamic regexps (which are typically just ordinary
+strings or variables that contain a regexp, but could be more complex
+expressions).
+
+
+File: gawk.info, Node: Using Constant Regexps, Next: Variables, Prev: Constants, Up: Values
+
+6.1.2 Using Regular Expression Constants
+----------------------------------------
+
+When used on the righthand side of the '~' or '!~' operators, a regexp
+constant merely stands for the regexp that is to be matched. However,
+regexp constants (such as '/foo/') may be used like simple expressions.
+When a regexp constant appears by itself, it has the same meaning as if
+it appeared in a pattern (i.e., '($0 ~ /foo/)'). (d.c.) *Note
+Expression Patterns::. This means that the following two code segments:
+
+ if ($0 ~ /barfly/ || $0 ~ /camelot/)
+ print "found"
+
+and:
+
+ if (/barfly/ || /camelot/)
+ print "found"
+
+are exactly equivalent. One rather bizarre consequence of this rule is
+that the following Boolean expression is valid, but does not do what its
+author probably intended:
+
+ # Note that /foo/ is on the left of the ~
+ if (/foo/ ~ $1) print "found foo"
+
+This code is "obviously" testing '$1' for a match against the regexp
+'/foo/'. But in fact, the expression '/foo/ ~ $1' really means '($0 ~
+/foo/) ~ $1'. In other words, first match the input record against the
+regexp '/foo/'. The result is either zero or one, depending upon the
+success or failure of the match. That result is then matched against
+the first field in the record. Because it is unlikely that you would
+ever really want to make this kind of test, 'gawk' issues a warning when
+it sees this construct in a program. Another consequence of this rule
+is that the assignment statement:
+
+ matches = /foo/
+
+assigns either zero or one to the variable 'matches', depending upon the
+contents of the current input record.
+
+ Constant regular expressions are also used as the first argument for
+the 'gensub()', 'sub()', and 'gsub()' functions, as the second argument
+of the 'match()' function, and as the third argument of the 'split()'
+and 'patsplit()' functions (*note String Functions::). Modern
+implementations of 'awk', including 'gawk', allow the third argument of
+'split()' to be a regexp constant, but some older implementations do
+not. (d.c.) Because some built-in functions accept regexp constants as
+arguments, confusion can arise when attempting to use regexp constants
+as arguments to user-defined functions (*note User-defined::). For
+example:
+
+ function mysub(pat, repl, str, global)
+ {
+ if (global)
+ gsub(pat, repl, str)
+ else
+ sub(pat, repl, str)
+ return str
+ }
+
+ {
+ ...
+ text = "hi! hi yourself!"
+ mysub(/hi/, "howdy", text, 1)
+ ...
+ }
+
+ In this example, the programmer wants to pass a regexp constant to
+the user-defined function 'mysub()', which in turn passes it on to
+either 'sub()' or 'gsub()'. However, what really happens is that the
+'pat' parameter is assigned a value of either one or zero, depending
+upon whether or not '$0' matches '/hi/'. 'gawk' issues a warning when
+it sees a regexp constant used as a parameter to a user-defined
+function, because passing a truth value in this way is probably not what
+was intended.
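+
+ One way to sidestep the problem in a function like 'mysub()' is to
+pass the pattern as a string (i.e., a dynamic regexp) instead of as a
+regexp constant:
+
+     text = "hi! hi yourself!"
+     text = mysub("hi", "howdy", text, 1)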
+
+
+File: gawk.info, Node: Variables, Next: Conversion, Prev: Using Constant Regexps, Up: Values
+
+6.1.3 Variables
+---------------
+
+"Variables" are ways of storing values at one point in your program for
+use later in another part of your program. They can be manipulated
+entirely within the program text, and they can also be assigned values
+on the 'awk' command line.
+
+* Menu:
+
+* Using Variables:: Using variables in your programs.
+* Assignment Options:: Setting variables on the command line and a
+ summary of command-line syntax. This is an
+ advanced method of input.
+
+
+File: gawk.info, Node: Using Variables, Next: Assignment Options, Up: Variables
+
+6.1.3.1 Using Variables in a Program
+....................................
+
+Variables let you give names to values and refer to them later.
+Variables have already been used in many of the examples. The name of a
+variable must be a sequence of letters, digits, or underscores, and it
+may not begin with a digit. Here, a "letter" is any one of the 52
+upper- and lowercase English letters. Other characters that may be
+defined as letters in non-English locales are not valid in variable
+names. Case is significant in variable names; 'a' and 'A' are distinct
+variables.
+
+ A variable name is a valid expression by itself; it represents the
+variable's current value. Variables are given new values with
+"assignment operators", "increment operators", and "decrement operators"
+(*note Assignment Ops::). In addition, the 'sub()' and 'gsub()'
+functions can change a variable's value, and the 'match()', 'split()',
+and 'patsplit()' functions can change the contents of their array
+parameters (*note String Functions::).
+
+ A few variables have special built-in meanings, such as 'FS' (the
+field separator) and 'NF' (the number of fields in the current input
+record). *Note Built-in Variables:: for a list of the predefined
+variables. These predefined variables can be used and assigned just
+like all other variables, but their values are also used or changed
+automatically by 'awk'. All predefined variables' names are entirely
+uppercase.
+
+ Variables in 'awk' can be assigned either numeric or string values.
+The kind of value a variable holds can change over the life of a
+program. By default, variables are initialized to the empty string,
+which is zero if converted to a number. There is no need to explicitly
+initialize a variable in 'awk', which is what you would do in C and in
+most other traditional languages.
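+
+ A tiny sketch of the default initialization:
+
+     $ gawk 'BEGIN { print x + 0; print "<" x ">" }'
+     -| 0
+     -| <>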
+
+
+File: gawk.info, Node: Assignment Options, Prev: Using Variables, Up: Variables
+
+6.1.3.2 Assigning Variables on the Command Line
+...............................................
+
+Any 'awk' variable can be set by including a "variable assignment" among
+the arguments on the command line when 'awk' is invoked (*note Other
+Arguments::). Such an assignment has the following form:
+
+ VARIABLE=TEXT
+
+With it, a variable is set either at the beginning of the 'awk' run or
+in between input files. When the assignment is preceded with the '-v'
+option, as in the following:
+
+ -v VARIABLE=TEXT
+
+the variable is set at the very beginning, even before the 'BEGIN' rules
+execute. The '-v' option and its assignment must precede all the file
+name arguments, as well as the program text. (*Note Options:: for more
+information about the '-v' option.) Otherwise, the variable assignment
+is performed at a time determined by its position among the input file
+arguments--after the processing of the preceding input file argument.
+For example:
+
+ awk '{ print $n }' n=4 inventory-shipped n=2 mail-list
+
+prints the value of field number 'n' for all input records. Before the
+first file is read, the command line sets the variable 'n' equal to
+four. This causes the fourth field to be printed in lines from
+'inventory-shipped'. After the first file has finished, but before the
+second file is started, 'n' is set to two, so that the second field is
+printed in lines from 'mail-list':
+
+ $ awk '{ print $n }' n=4 inventory-shipped n=2 mail-list
+ -| 15
+ -| 24
+ ...
+ -| 555-5553
+ -| 555-3412
+ ...
+
+ Command-line arguments are made available for explicit examination by
+the 'awk' program in the 'ARGV' array (*note ARGC and ARGV::). 'awk'
+processes the values of command-line assignments for escape sequences
+(*note Escape Sequences::). (d.c.)
+
+
+File: gawk.info, Node: Conversion, Prev: Variables, Up: Values
+
+6.1.4 Conversion of Strings and Numbers
+---------------------------------------
+
+Number-to-string and string-to-number conversion are generally
+straightforward. There can be subtleties to be aware of; this minor
+node discusses this important facet of 'awk'.
+
+* Menu:
+
+* Strings And Numbers:: How 'awk' Converts Between Strings And
+ Numbers.
+* Locale influences conversions:: How the locale may affect conversions.
+
+
+File: gawk.info, Node: Strings And Numbers, Next: Locale influences conversions, Up: Conversion
+
+6.1.4.1 How 'awk' Converts Between Strings and Numbers
+......................................................
+
+Strings are converted to numbers and numbers are converted to strings,
+if the context of the 'awk' program demands it. For example, if the
+value of either 'foo' or 'bar' in the expression 'foo + bar' happens to
+be a string, it is converted to a number before the addition is
+performed. If numeric values appear in string concatenation, they are
+converted to strings. Consider the following:
+
+ two = 2; three = 3
+ print (two three) + 4
+
+This prints the (numeric) value 27. The numeric values of the variables
+'two' and 'three' are converted to strings and concatenated together.
+The resulting string is converted back to the number 23, to which 4 is
+then added.
+
+ If, for some reason, you need to force a number to be converted to a
+string, concatenate that number with the empty string, '""'. To force a
+string to be converted to a number, add zero to that string. A string
+is converted to a number by interpreting any numeric prefix of the
+string as numerals: '"2.5"' converts to 2.5, '"1e3"' converts to 1,000,
+and '"25fix"' has a numeric value of 25. Strings that can't be
+interpreted as valid numbers convert to zero.
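+
+ As a brief sketch of both forced conversions:
+
+     str  = "3.5 inches"
+     num  = str + 0        # numeric value 3.5
+     str2 = num ""         # back to the string "3.5"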
+
+ The exact manner in which numbers are converted into strings is
+controlled by the 'awk' predefined variable 'CONVFMT' (*note Built-in
+Variables::). Numbers are converted using the 'sprintf()' function with
+'CONVFMT' as the format specifier (*note String Functions::).
+
+ 'CONVFMT''s default value is '"%.6g"', which creates a value with at
+most six significant digits. For some applications, you might want to
+change it to specify more precision. On most modern machines, 17 digits
+is usually enough to capture a floating-point number's value exactly.(1)
+
+ Strange results can occur if you set 'CONVFMT' to a string that
+doesn't tell 'sprintf()' how to format floating-point numbers in a
+useful way. For example, if you forget the '%' in the format, 'awk'
+converts all numbers to the same constant string.
+
+ As a special case, if a number is an integer, then the result of
+converting it to a string is _always_ an integer, no matter what the
+value of 'CONVFMT' may be. Given the following code fragment:
+
+ CONVFMT = "%2.2f"
+ a = 12
+ b = a ""
+
+'b' has the value '"12"', not '"12.00"'. (d.c.)
+
+ Pre-POSIX 'awk' Used 'OFMT' for String Conversion
+
+ Prior to the POSIX standard, 'awk' used the value of 'OFMT' for
+converting numbers to strings. 'OFMT' specifies the output format to
+use when printing numbers with 'print'. 'CONVFMT' was introduced in
+order to separate the semantics of conversion from the semantics of
+printing. Both 'CONVFMT' and 'OFMT' have the same default value:
+'"%.6g"'. In the vast majority of cases, old 'awk' programs do not
+change their behavior. *Note Print:: for more information on the
+'print' statement.
+
+ ---------- Footnotes ----------
+
+ (1) Pathological cases can require up to 752 digits (!), but we doubt
+that you need to worry about this.
+
+
+File: gawk.info, Node: Locale influences conversions, Prev: Strings And Numbers, Up: Conversion
+
+6.1.4.2 Locales Can Influence Conversion
+........................................
+
+Where you are can matter when it comes to converting between numbers and
+strings. The local character set and language--the "locale"--can affect
+numeric formats. In particular, for 'awk' programs, it affects the
+decimal point character and the thousands-separator character. The
+'"C"' locale, and most English-language locales, use the period
+character ('.') as the decimal point and don't have a thousands
+separator. However, many (if not most) European and non-English locales
+use the comma (',') as the decimal point character. European locales
+often use either a space or a period as the thousands separator, if they
+have one.
+
+ The POSIX standard says that 'awk' always uses the period as the
+decimal point when reading the 'awk' program source code, and for
+command-line variable assignments (*note Other Arguments::). However,
+when interpreting input data, for 'print' and 'printf' output, and for
+number-to-string conversion, the local decimal point character is used.
+(d.c.) In all cases, numbers in source code and in input data cannot
+have a thousands separator. Here are some examples indicating the
+difference in behavior, on a GNU/Linux system:
+
+ $ export POSIXLY_CORRECT=1 Force POSIX behavior
+ $ gawk 'BEGIN { printf "%g\n", 3.1415927 }'
+ -| 3.14159
+ $ LC_ALL=en_DK.utf-8 gawk 'BEGIN { printf "%g\n", 3.1415927 }'
+ -| 3,14159
+ $ echo 4,321 | gawk '{ print $1 + 1 }'
+ -| 5
+ $ echo 4,321 | LC_ALL=en_DK.utf-8 gawk '{ print $1 + 1 }'
+ -| 5,321
+
+The 'en_DK.utf-8' locale is for English in Denmark, where the comma acts
+as the decimal point separator. In the normal '"C"' locale, 'gawk'
+treats '4,321' as 4, while in the Danish locale, it's treated as the
+full number including the fractional part, 4.321.
+
+ Some earlier versions of 'gawk' fully complied with this aspect of
+the standard. However, many users in non-English locales complained
+about this behavior, because their data used a period as the decimal
+point, so the default behavior was restored to use a period as the
+decimal point character. You can use the '--use-lc-numeric' option
+(*note Options::) to force 'gawk' to use the locale's decimal point
+character. ('gawk' also uses the locale's decimal point character when
+in POSIX mode, either via '--posix' or the 'POSIXLY_CORRECT' environment
+variable, as shown previously.)
+
+ *note Table 6.1: table-locale-affects. describes the cases in which
+the locale's decimal point character is used and when a period is used.
+Some of these features have not been described yet.
+
+Feature          Default          '--posix' or '--use-lc-numeric'
+------------------------------------------------------------------
+'%'g'            Use locale       Use locale
+'%g'             Use period       Use locale
+Input            Use period       Use locale
+'strtonum()'     Use period       Use locale
+
+Table 6.1: Locale decimal point versus a period
+
+ Finally, modern-day formal standards and the IEEE standard
+floating-point representation can have an unusual but important effect
+on the way 'gawk' converts some special string values to numbers. The
+details are presented in *note POSIX Floating Point Problems::.
+
+
+File: gawk.info, Node: All Operators, Next: Truth Values and Conditions, Prev: Values, Up: Expressions
+
+6.2 Operators: Doing Something with Values
+==========================================
+
+This minor node introduces the "operators" that make use of the values
+provided by constants and variables.
+
+* Menu:
+
+* Arithmetic Ops:: Arithmetic operations ('+', '-',
+ etc.)
+* Concatenation:: Concatenating strings.
+* Assignment Ops:: Changing the value of a variable or a field.
+* Increment Ops:: Incrementing the numeric value of a variable.
+
+
+File: gawk.info, Node: Arithmetic Ops, Next: Concatenation, Up: All Operators
+
+6.2.1 Arithmetic Operators
+--------------------------
+
+The 'awk' language uses the common arithmetic operators when evaluating
+expressions. All of these arithmetic operators follow normal precedence
+rules and work as you would expect them to.
+
+ The following example uses a file named 'grades', which contains a
+list of student names as well as three test scores per student (it's a
+small class):
+
+ Pat 100 97 58
+ Sandy 84 72 93
+ Chris 72 92 89
+
+This program takes the file 'grades' and prints the average of the
+scores:
+
+ $ awk '{ sum = $2 + $3 + $4 ; avg = sum / 3
+ > print $1, avg }' grades
+ -| Pat 85
+ -| Sandy 83
+ -| Chris 84.3333
+
+ The following list provides the arithmetic operators in 'awk', in
+order from the highest precedence to the lowest:
+
+'X ^ Y'
+'X ** Y'
+ Exponentiation; X raised to the Y power. '2 ^ 3' has the value
+ eight; the character sequence '**' is equivalent to '^'. (c.e.)
+
+'- X'
+ Negation.
+
+'+ X'
+ Unary plus; the expression is converted to a number.
+
+'X * Y'
+ Multiplication.
+
+'X / Y'
+ Division; because all numbers in 'awk' are floating-point numbers,
+ the result is _not_ rounded to an integer--'3 / 4' has the value
+ 0.75. (It is a common mistake, especially for C programmers, to
+ forget that _all_ numbers in 'awk' are floating point, and that
+ division of integer-looking constants produces a real number, not
+ an integer.)
+
+'X % Y'
+ Remainder; further discussion is provided in the text, just after
+ this list.
+
+'X + Y'
+ Addition.
+
+'X - Y'
+ Subtraction.
+
+ Unary plus and minus have the same precedence, the multiplication
+operators all have the same precedence, and addition and subtraction
+have the same precedence.
+
+ When computing the remainder of 'X % Y', the quotient is rounded
+toward zero to an integer and multiplied by Y. This result is
+subtracted from X; this operation is sometimes known as "trunc-mod."
+The following relation always holds:
+
+ b * int(a / b) + (a % b) == a
+
+ One possibly undesirable effect of this definition of remainder is
+that 'X % Y' is negative if X is negative. Thus:
+
+ -17 % 8 = -1
+
+ In other 'awk' implementations, the signedness of the remainder may
+be machine-dependent.
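+
+ A quick check of the sign behavior in 'gawk':
+
+     $ gawk 'BEGIN { print -17 % 8, 17 % 8 }'
+     -| -1 1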
+
+ NOTE: The POSIX standard only specifies the use of '^' for
+ exponentiation. For maximum portability, do not use the '**'
+ operator.
+
+
+File: gawk.info, Node: Concatenation, Next: Assignment Ops, Prev: Arithmetic Ops, Up: All Operators
+
+6.2.2 String Concatenation
+--------------------------
+
+ It seemed like a good idea at the time.
+ -- _Brian Kernighan_
+
+ There is only one string operation: concatenation. It does not have
+a specific operator to represent it. Instead, concatenation is
+performed by writing expressions next to one another, with no operator.
+For example:
+
+ $ awk '{ print "Field number one: " $1 }' mail-list
+ -| Field number one: Amelia
+ -| Field number one: Anthony
+ ...
+
+ Without the space in the string constant after the ':', the line runs
+together. For example:
+
+ $ awk '{ print "Field number one:" $1 }' mail-list
+ -| Field number one:Amelia
+ -| Field number one:Anthony
+ ...
+
+ Because string concatenation does not have an explicit operator, it
+is often necessary to ensure that it happens at the right time by using
+parentheses to enclose the items to concatenate. For example, you might
+expect that the following code fragment concatenates 'file' and 'name':
+
+ file = "file"
+ name = "name"
+ print "something meaningful" > file name
+
+This produces a syntax error with some versions of Unix 'awk'.(1) It is
+necessary to use the following:
+
+ print "something meaningful" > (file name)
+
+ Parentheses should be used around concatenation in all but the most
+common contexts, such as on the righthand side of '='. Be careful about
+the kinds of expressions used in string concatenation. In particular,
+the order of evaluation of expressions used for concatenation is
+undefined in the 'awk' language. Consider this example:
+
+ BEGIN {
+ a = "don't"
+ print (a " " (a = "panic"))
+ }
+
+It is not defined whether the second assignment to 'a' happens before or
+after the value of 'a' is retrieved for producing the concatenated
+value. The result could be either 'don't panic', or 'panic panic'.
+
+ The precedence of concatenation, when mixed with other operators, is
+often counter-intuitive. Consider this example:
+
+ $ awk 'BEGIN { print -12 " " -24 }'
+ -| -12-24
+
+ This "obviously" is concatenating -12, a space, and -24. But where
+did the space disappear to? The answer lies in the combination of
+operator precedences and 'awk''s automatic conversion rules. To get the
+desired result, write the program this way:
+
+ $ awk 'BEGIN { print -12 " " (-24) }'
+ -| -12 -24
+
+ This forces 'awk' to treat the '-' on the '-24' as unary. Otherwise,
+it's parsed as follows:
+
+ -12 ('" "' - 24)
+ => -12 (0 - 24)
+ => -12 (-24)
+ => -12-24
+
+ As mentioned earlier, when mixing concatenation with other operators,
+_parenthesize_. Otherwise, you're never quite sure what you'll get.
+
+ ---------- Footnotes ----------
+
+ (1) It happens that BWK 'awk', 'gawk', and 'mawk' all "get it right,"
+but you should not rely on this.
+
+
+File: gawk.info, Node: Assignment Ops, Next: Increment Ops, Prev: Concatenation, Up: All Operators
+
+6.2.3 Assignment Expressions
+----------------------------
+
+An "assignment" is an expression that stores a (usually different) value
+into a variable. For example, let's assign the value one to the
+variable 'z':
+
+ z = 1
+
+ After this expression is executed, the variable 'z' has the value
+one. Whatever old value 'z' had before the assignment is forgotten.
+
+ Assignments can also store string values. For example, the following
+stores the value '"this food is good"' in the variable 'message':
+
+ thing = "food"
+ predicate = "good"
+ message = "this " thing " is " predicate
+
+This also illustrates string concatenation. The '=' sign is called an
+"assignment operator". It is the simplest assignment operator because
+the value of the righthand operand is stored unchanged. Most operators
+(addition, concatenation, and so on) have no effect except to compute a
+value. If the value isn't used, there's no reason to use the operator.
+An assignment operator is different; it does produce a value, but even
+if you ignore it, the assignment still makes itself felt through the
+alteration of the variable. We call this a "side effect".
+
+ The lefthand operand of an assignment need not be a variable (*note
+Variables::); it can also be a field (*note Changing Fields::) or an
+array element (*note Arrays::). These are all called "lvalues", which
+means they can appear on the lefthand side of an assignment operator.
+The righthand operand may be any expression; it produces the new value
+that the assignment stores in the specified variable, field, or array
+element. (Such values are called "rvalues".)
+
+ It is important to note that variables do _not_ have permanent types.
+A variable's type is simply the type of whatever value was last assigned
+to it. In the following program fragment, the variable 'foo' has a
+numeric value at first, and a string value later on:
+
+ foo = 1
+ print foo
+ foo = "bar"
+ print foo
+
+When the second assignment gives 'foo' a string value, the fact that it
+previously had a numeric value is forgotten.
+
+ String values that do not begin with a digit have a numeric value of
+zero. After executing the following code, the value of 'foo' is five:
+
+ foo = "a string"
+ foo = foo + 5
+
+ NOTE: Using a variable as a number and then later as a string can
+ be confusing and is poor programming style. The previous two
+ examples illustrate how 'awk' works, _not_ how you should write
+ your programs!
+
+ An assignment is an expression, so it has a value--the same value
+that is assigned. Thus, 'z = 1' is an expression with the value one.
+One consequence of this is that you can write multiple assignments
+together, such as:
+
+ x = y = z = 5
+
+This example stores the value five in all three variables ('x', 'y', and
+'z'). It does so because the value of 'z = 5', which is five, is stored
+into 'y' and then the value of 'y = z = 5', which is five, is stored
+into 'x'.
+
+ Assignments may be used anywhere an expression is called for. For
+example, it is valid to write 'x != (y = 1)' to set 'y' to one, and then
+test whether 'x' equals one. But this style tends to make programs hard
+to read; such nesting of assignments should be avoided, except perhaps
+in a one-shot program.
+
+ Aside from '=', there are several other assignment operators that do
+arithmetic with the old value of the variable. For example, the
+operator '+=' computes a new value by adding the righthand value to the
+old value of the variable. Thus, the following assignment adds five to
+the value of 'foo':
+
+ foo += 5
+
+This is equivalent to the following:
+
+ foo = foo + 5
+
+Use whichever makes the meaning of your program clearer.
+
+ There are situations where using '+=' (or any assignment operator) is
+_not_ the same as simply repeating the lefthand operand in the righthand
+expression. For example:
+
+ # Thanks to Pat Rankin for this example
+ BEGIN {
+ foo[rand()] += 5
+ for (x in foo)
+ print x, foo[x]
+
+ bar[rand()] = bar[rand()] + 5
+ for (x in bar)
+ print x, bar[x]
+ }
+
+The indices of 'bar' are practically guaranteed to be different, because
+'rand()' returns different values each time it is called. (Arrays and
+the 'rand()' function haven't been covered yet. *Note Arrays::, and
+*note Numeric Functions:: for more information.) This example
+illustrates an important fact about assignment operators: the lefthand
+expression is only evaluated _once_.
+
+ It is up to the implementation as to which expression is evaluated
+first, the lefthand or the righthand. Consider this example:
+
+ i = 1
+ a[i += 2] = i + 1
+
+The value of 'a[3]' could be either two or four.
+
+ *note Table 6.2: table-assign-ops. lists the arithmetic assignment
+operators. In each case, the righthand operand is an expression whose
+value is converted to a number.
+
+Operator                   Effect
+--------------------------------------------------------------------------
+LVALUE '+=' INCREMENT      Add INCREMENT to the value of LVALUE.
+LVALUE '-=' DECREMENT      Subtract DECREMENT from the value of LVALUE.
+LVALUE '*=' COEFFICIENT    Multiply the value of LVALUE by COEFFICIENT.
+LVALUE '/=' DIVISOR        Divide the value of LVALUE by DIVISOR.
+LVALUE '%=' MODULUS        Set LVALUE to its remainder by MODULUS.
+LVALUE '^=' POWER          Raise LVALUE to the power POWER.
+LVALUE '**=' POWER         Raise LVALUE to the power POWER.  (c.e.)
+
+Table 6.2: Arithmetic assignment operators
+
+ NOTE: Only the '^=' operator is specified by POSIX. For maximum
+ portability, do not use the '**=' operator.
+
+ Syntactic Ambiguities Between '/=' and Regular Expressions
+
+ There is a syntactic ambiguity between the '/=' assignment operator
+and regexp constants whose first character is an '='. (d.c.) This is
+most notable in some commercial 'awk' versions. For example:
+
+ $ awk /==/ /dev/null
+ error-> awk: syntax error at source line 1
+ error-> context is
+ error-> >>> /= <<<
+ error-> awk: bailing out at source line 1
+
+A workaround is:
+
+ awk '/[=]=/' /dev/null
+
+ 'gawk' does not have this problem; BWK 'awk' and 'mawk' also do not.
+
+
+File: gawk.info, Node: Increment Ops, Prev: Assignment Ops, Up: All Operators
+
+6.2.4 Increment and Decrement Operators
+---------------------------------------
+
+"Increment" and "decrement operators" increase or decrease the value of
+a variable by one. An assignment operator can do the same thing, so the
+increment operators add no power to the 'awk' language; however, they
+are convenient abbreviations for very common operations.
+
+ The operator used for adding one is written '++'. It can be used to
+increment a variable either before or after taking its value. To
+"pre-increment" a variable 'v', write '++v'. This adds one to the value
+of 'v'--that new value is also the value of the expression. (The
+assignment expression 'v += 1' is completely equivalent.) Writing the
+'++' after the variable specifies "post-increment". This increments the
+variable value just the same; the difference is that the value of the
+increment expression itself is the variable's _old_ value. Thus, if
+'foo' has the value four, then the expression 'foo++' has the value
+four, but it changes the value of 'foo' to five. In other words, the
+operator returns the old value of the variable, but with the side effect
+of incrementing it.
+
+ The post-increment 'foo++' is nearly the same as writing '(foo += 1)
+- 1'. It is not perfectly equivalent because all numbers in 'awk' are
+floating point--in floating point, 'foo + 1 - 1' does not necessarily
+equal 'foo'. But the difference is minute as long as you stick to
+numbers that are fairly small (less than 10e12).
+
+ Fields and array elements are incremented just like variables. (Use
+'$(i++)' when you want to do a field reference and a variable increment
+at the same time. The parentheses are necessary because of the
+precedence of the field reference operator '$'.)
+
+ The decrement operator '--' works just like '++', except that it
+subtracts one instead of adding it. As with '++', it can be used before
+the lvalue to pre-decrement or after it to post-decrement. Following is
+a summary of increment and decrement expressions:
+
+'++LVALUE'
+ Increment LVALUE, returning the new value as the value of the
+ expression.
+
+'LVALUE++'
+ Increment LVALUE, returning the _old_ value of LVALUE as the value
+ of the expression.
+
+'--LVALUE'
+ Decrement LVALUE, returning the new value as the value of the
+ expression. (This expression is like '++LVALUE', but instead of
+ adding, it subtracts.)
+
+'LVALUE--'
+ Decrement LVALUE, returning the _old_ value of LVALUE as the value
+ of the expression. (This expression is like 'LVALUE++', but
+ instead of adding, it subtracts.)
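+
+ A short sketch showing the difference between the two forms:
+
+     $ gawk 'BEGIN { foo = 4; print foo++; print ++foo }'
+     -| 4
+     -| 6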
+
+ Operator Evaluation Order
+
+ Doctor, it hurts when I do this!
+ Then don't do that!
+ -- _Groucho Marx_
+
+What happens for something like the following?
+
+ b = 6
+ print b += b++
+
+Or something even stranger?
+
+ b = 6
+ b += ++b + b++
+ print b
+
+ In other words, when do the various side effects prescribed by the
+postfix operators ('b++') take effect? When side effects happen is
+"implementation-defined". In other words, it is up to the particular
+version of 'awk'. The result for the first example may be 12 or 13, and
+for the second, it may be 22 or 23.
+
+ In short, doing things like this is not recommended and definitely
+not anything that you can rely upon for portability. You should avoid
+such things in your own programs.
+
+
+File: gawk.info, Node: Truth Values and Conditions, Next: Function Calls, Prev: All Operators, Up: Expressions
+
+6.3 Truth Values and Conditions
+===============================
+
+In certain contexts, expression values also serve as "truth values";
+i.e., they determine what should happen next as the program runs. This
+minor node describes how 'awk' defines "true" and "false" and how values
+are compared.
+
+* Menu:
+
+* Truth Values:: What is "true" and what is "false".
+* Typing and Comparison:: How variables acquire types and how this
+ affects comparison of numbers and strings with
+ '<', etc.
+* Boolean Ops:: Combining comparison expressions using boolean
+ operators '||' ("or"), '&&'
+ ("and") and '!' ("not").
+* Conditional Exp:: Conditional expressions select between two
+ subexpressions under control of a third
+ subexpression.
+
+
+File: gawk.info, Node: Truth Values, Next: Typing and Comparison, Up: Truth Values and Conditions
+
+6.3.1 True and False in 'awk'
+-----------------------------
+
+Many programming languages have a special representation for the
+concepts of "true" and "false." Such languages usually use the special
+constants 'true' and 'false', or perhaps their uppercase equivalents.
+However, 'awk' is different. It borrows a very simple concept of true
+and false from C. In 'awk', any nonzero numeric value _or_ any nonempty
+string value is true. Any other value (zero or the null string, '""')
+is false. The following program prints 'A strange truth value' three
+times:
+
+ BEGIN {
+ if (3.1415927)
+ print "A strange truth value"
+ if ("Four Score And Seven Years Ago")
+ print "A strange truth value"
+ if (j = 57)
+ print "A strange truth value"
+ }
+
+ There is a surprising consequence of the "nonzero or non-null" rule:
+the string constant '"0"' is actually true, because it is non-null.
+(d.c.)
+
+
+File: gawk.info, Node: Typing and Comparison, Next: Boolean Ops, Prev: Truth Values, Up: Truth Values and Conditions
+
+6.3.2 Variable Typing and Comparison Expressions
+------------------------------------------------
+
+ The Guide is definitive. Reality is frequently inaccurate.
+ -- _Douglas Adams, 'The Hitchhiker's Guide to the Galaxy'_
+
+ Unlike in other programming languages, in 'awk' variables do not have
+a fixed type. Instead, they can be either a number or a string,
+depending upon the value that is assigned to them. We look now at how
+variables are typed, and how 'awk' compares variables.
+
+* Menu:
+
+* Variable Typing:: String type versus numeric type.
+* Comparison Operators:: The comparison operators.
+* POSIX String Comparison:: String comparison with POSIX rules.
+
+
+File: gawk.info, Node: Variable Typing, Next: Comparison Operators, Up: Typing and Comparison
+
+6.3.2.1 String Type versus Numeric Type
+.......................................
+
+The POSIX standard introduced the concept of a "numeric string", which
+is simply a string that looks like a number--for example, '" +2"'. This
+concept is used for determining the type of a variable. The type of the
+variable is important because the types of two variables determine how
+they are compared. Variable typing follows these rules:
+
+ * A numeric constant or the result of a numeric operation has the
+ "numeric" attribute.
+
+ * A string constant or the result of a string operation has the
+ "string" attribute.
+
+ * Fields, 'getline' input, 'FILENAME', 'ARGV' elements, 'ENVIRON'
+ elements, and the elements of an array created by 'match()',
+ 'split()', and 'patsplit()' that are numeric strings have the
+ "strnum" attribute. Otherwise, they have the "string" attribute.
+ Uninitialized variables also have the "strnum" attribute.
+
+ * Attributes propagate across assignments but are not changed by any
+ use.
+
+ The last rule is particularly important. In the following program,
+'a' has numeric type, even though it is later used in a string
+operation:
+
+ BEGIN {
+ a = 12.345
+ b = a " is a cute number"
+ print b
+ }
+
+ When two operands are compared, either string comparison or numeric
+comparison may be used. This depends upon the attributes of the
+operands, according to the following symmetric matrix:
+
+ +-------------------------------
+ | STRING NUMERIC STRNUM
+ -----+-------------------------------
+ |
+ STRING | string string string
+ |
+ NUMERIC | string numeric numeric
+ |
+ STRNUM | string numeric numeric
+ -----+-------------------------------
+
+ The basic idea is that user input that looks numeric--and _only_ user
+input--should be treated as numeric, even though it is actually made of
+characters and is therefore also a string. Thus, for example, the
+string constant '" +3.14"', when it appears in program source code, is a
+string--even though it looks numeric--and is _never_ treated as a number
+for comparison purposes.
+
+ In short, when one operand is a "pure" string, such as a string
+constant, then a string comparison is performed. Otherwise, a numeric
+comparison is performed.
+
+ This point bears additional emphasis: All user input is made of
+characters, and so is first and foremost of string type; input strings
+that look numeric are additionally given the strnum attribute. Thus,
+the six-character input string ' +3.14' receives the strnum attribute.
+In contrast, the eight characters '" +3.14"' appearing in program text
+comprise a string constant. The following examples print '1' when the
+comparison between the two different constants is true, and '0'
+otherwise:
+
+ $ echo ' +3.14' | awk '{ print($0 == " +3.14") }' True
+ -| 1
+ $ echo ' +3.14' | awk '{ print($0 == "+3.14") }' False
+ -| 0
+ $ echo ' +3.14' | awk '{ print($0 == "3.14") }' False
+ -| 0
+ $ echo ' +3.14' | awk '{ print($0 == 3.14) }' True
+ -| 1
+ $ echo ' +3.14' | awk '{ print($1 == " +3.14") }' False
+ -| 0
+ $ echo ' +3.14' | awk '{ print($1 == "+3.14") }' True
+ -| 1
+ $ echo ' +3.14' | awk '{ print($1 == "3.14") }' False
+ -| 0
+ $ echo ' +3.14' | awk '{ print($1 == 3.14) }' True
+ -| 1
+
+
+File: gawk.info, Node: Comparison Operators, Next: POSIX String Comparison, Prev: Variable Typing, Up: Typing and Comparison
+
+6.3.2.2 Comparison Operators
+............................
+
+"Comparison expressions" compare strings or numbers for relationships
+such as equality. They are written using "relational operators", which
+are a superset of those in C. *note Table 6.3: table-relational-ops.
+describes them.
+
+Expression Result
+--------------------------------------------------------------------------
+X '<' Y True if X is less than Y
+X '<=' Y True if X is less than or equal to Y
+X '>' Y True if X is greater than Y
+X '>=' Y True if X is greater than or equal to Y
+X '==' Y True if X is equal to Y
+X '!=' Y True if X is not equal to Y
+X '~' Y True if the string X matches the regexp denoted by Y
+X '!~' Y True if the string X does not match the regexp
+ denoted by Y
+SUBSCRIPT 'in' True if the array ARRAY has an element with the
+ARRAY subscript SUBSCRIPT
+
+Table 6.3: Relational operators
+
+ Comparison expressions have the value one if true and zero if false.
+When comparing operands of mixed types, numeric operands are converted
+to strings using the value of 'CONVFMT' (*note Conversion::).
+
+ Strings are compared by comparing the first character of each, then
+the second character of each, and so on. Thus, '"10"' is less than
+'"9"'. If there are two strings where one is a prefix of the other, the
+shorter string is less than the longer one. Thus, '"abc"' is less than
+'"abcd"'.
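+
+   For example, both of the following comparisons should yield one
+(true), because they are string comparisons:
+
+     $ awk 'BEGIN { print ("10" < "9"), ("abc" < "abcd") }'
+     -| 1 1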
+
+ It is very easy to accidentally mistype the '==' operator and leave
+off one of the '=' characters. The result is still valid 'awk' code,
+but the program does not do what is intended:
+
+ if (a = b) # oops! should be a == b
+ ...
+ else
+ ...
+
+Unless 'b' happens to be zero or the null string, the 'if' part of the
+test always succeeds. Because the operators are so similar, this kind
+of error is very difficult to spot when scanning the source code.
+
+ The following list of expressions illustrates the kinds of
+comparisons 'awk' performs, as well as what the result of each
+comparison is:
+
+'1.5 <= 2.0'
+ Numeric comparison (true)
+
+'"abc" >= "xyz"'
+ String comparison (false)
+
+'1.5 != " +2"'
+ String comparison (true)
+
+'"1e2" < "3"'
+ String comparison (true)
+
+'a = 2; b = "2"'
+'a == b'
+ String comparison (true)
+
+'a = 2; b = " +2"'
+'a == b'
+ String comparison (false)
+
+ In this example:
+
+ $ echo 1e2 3 | awk '{ print ($1 < $2) ? "true" : "false" }'
+ -| false
+
+the result is 'false' because both '$1' and '$2' are user input. They
+are numeric strings--therefore both have the strnum attribute, dictating
+a numeric comparison. The purpose of the comparison rules and the use
+of numeric strings is to attempt to produce the behavior that is "least
+surprising," while still "doing the right thing."
+
+ String comparisons and regular expression comparisons are very
+different. For example:
+
+ x == "foo"
+
+has the value one, or is true if the variable 'x' is precisely 'foo'.
+By contrast:
+
+ x ~ /foo/
+
+has the value one if 'x' contains 'foo', such as '"Oh, what a fool am
+I!"'.
+
+ The righthand operand of the '~' and '!~' operators may be either a
+regexp constant ('/'...'/') or an ordinary expression. In the latter
+case, the value of the expression as a string is used as a dynamic
+regexp (*note Regexp Usage::; also *note Computed Regexps::).
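+
+   For instance, the following sketch matches each record's first field
+against a regexp held in an ordinary variable:
+
+     BEGIN { pattern = "^[0-9]+$" }
+     $1 ~ pattern { print "first field is all digits" }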
+
+ A constant regular expression in slashes by itself is also an
+expression. '/REGEXP/' is an abbreviation for the following comparison
+expression:
+
+ $0 ~ /REGEXP/
+
+ One special place where '/foo/' is _not_ an abbreviation for '$0 ~
+/foo/' is when it is the righthand operand of '~' or '!~'. *Note Using
+Constant Regexps::, where this is discussed in more detail.
+
+
+File: gawk.info, Node: POSIX String Comparison, Prev: Comparison Operators, Up: Typing and Comparison
+
+6.3.2.3 String Comparison Based on Locale Collating Order
+.........................................................
+
+The POSIX standard used to say that all string comparisons are performed
+based on the locale's "collating order". This is the order in which
+characters sort, as defined by the locale (for more discussion, *note
+Locales::). This order is usually very different from the results
+obtained when doing straight byte-by-byte comparison.(1)
+
+ Because this behavior differs considerably from existing practice,
+'gawk' only implemented it when in POSIX mode (*note Options::). Here
+is an example to illustrate the difference, in an 'en_US.UTF-8' locale:
+
+ $ gawk 'BEGIN { printf("ABC < abc = %s\n",
+ > ("ABC" < "abc" ? "TRUE" : "FALSE")) }'
+ -| ABC < abc = TRUE
+ $ gawk --posix 'BEGIN { printf("ABC < abc = %s\n",
+ > ("ABC" < "abc" ? "TRUE" : "FALSE")) }'
+ -| ABC < abc = FALSE
+
+ Fortunately, as of August 2016, comparison based on locale collating
+order is no longer required for the '==' and '!=' operators.(2)
+However, comparison based on locales is still required for '<', '<=',
+'>', and '>='. POSIX thus recommends as follows:
+
+ Since the '==' operator checks whether strings are identical, not
+ whether they collate equally, applications needing to check whether
+ strings collate equally can use:
+
+ a <= b && a >= b
+
+ As of version 4.2, 'gawk' continues to use locale collating order for
+'<', '<=', '>', and '>=' only in POSIX mode.
+
+ ---------- Footnotes ----------
+
+ (1) Technically, string comparison is supposed to behave the same way
+as if the strings were compared with the C 'strcoll()' function.
+
+ (2) See the Austin Group website
+(http://austingroupbugs.net/view.php?id=1070).
+
+
+File: gawk.info, Node: Boolean Ops, Next: Conditional Exp, Prev: Typing and Comparison, Up: Truth Values and Conditions
+
+6.3.3 Boolean Expressions
+-------------------------
+
+A "Boolean expression" is a combination of comparison expressions or
+matching expressions, using the Boolean operators "or" ('||'), "and"
+('&&'), and "not" ('!'), along with parentheses to control nesting. The
+truth value of the Boolean expression is computed by combining the truth
+values of the component expressions. Boolean expressions are also
+referred to as "logical expressions". The terms are equivalent.
+
+ Boolean expressions can be used wherever comparison and matching
+expressions can be used. They can be used in 'if', 'while', 'do', and
+'for' statements (*note Statements::). They have numeric values (one if
+true, zero if false) that come into play if the result of the Boolean
+expression is stored in a variable or used in arithmetic.
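+
+   For example, the following rules use the numeric value of a
+comparison to count records whose third field is positive:
+
+     { positive += ($3 > 0) }
+     END { print positive, "records have a positive third field" }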
+
+ In addition, every Boolean expression is also a valid pattern, so you
+can use one as a pattern to control the execution of rules. The Boolean
+operators are:
+
+'BOOLEAN1 && BOOLEAN2'
+ True if both BOOLEAN1 and BOOLEAN2 are true. For example, the
+ following statement prints the current input record if it contains
+ both 'edu' and 'li':
+
+ if ($0 ~ /edu/ && $0 ~ /li/) print
+
+ The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is true.
+ This can make a difference when BOOLEAN2 contains expressions that
+ have side effects. In the case of '$0 ~ /foo/ && ($2 == bar++)',
+ the variable 'bar' is not incremented if there is no substring
+ 'foo' in the record.
+
+'BOOLEAN1 || BOOLEAN2'
+ True if at least one of BOOLEAN1 or BOOLEAN2 is true. For example,
+ the following statement prints all records in the input that
+ contain _either_ 'edu' or 'li':
+
+ if ($0 ~ /edu/ || $0 ~ /li/) print
+
+ The subexpression BOOLEAN2 is evaluated only if BOOLEAN1 is false.
+ This can make a difference when BOOLEAN2 contains expressions that
+ have side effects. (Thus, this test never really distinguishes
+ records that contain both 'edu' and 'li'--as soon as 'edu' is
+ matched, the full test succeeds.)
+
+'! BOOLEAN'
+ True if BOOLEAN is false. For example, the following program
+ prints 'no home!' in the unusual event that the 'HOME' environment
+ variable is not defined:
+
+ BEGIN { if (! ("HOME" in ENVIRON))
+ print "no home!" }
+
+ (The 'in' operator is described in *note Reference to Elements::.)
+
+ The '&&' and '||' operators are called "short-circuit" operators
+because of the way they work. Evaluation of the full expression is
+"short-circuited" if the result can be determined partway through its
+evaluation.
+
+ Statements that end with '&&' or '||' can be continued simply by
+putting a newline after them. But you cannot put a newline in front of
+either of these operators without using backslash continuation (*note
+Statements/Lines::).
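+
+   For example, the following rule's pattern is continued after the
+'&&', with no backslash required:
+
+     NF > 0 &&
+     $1 != "#"    { print }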
+
+ The actual value of an expression using the '!' operator is either
+one or zero, depending upon the truth value of the expression it is
+applied to. The '!' operator is often useful for changing the sense of
+a flag variable from false to true and back again. For example, the
+following program is one way to print lines in between special
+bracketing lines:
+
+ $1 == "START" { interested = ! interested; next }
+ interested { print }
+ $1 == "END" { interested = ! interested; next }
+
+The variable 'interested', as with all 'awk' variables, starts out
+initialized to zero, which is also false. When a line is seen whose
+first field is 'START', the value of 'interested' is toggled to true,
+using '!'. The next rule prints lines as long as 'interested' is true.
+When a line is seen whose first field is 'END', 'interested' is toggled
+back to false.(1)
+
+ Most commonly, the '!' operator is used in the conditions of 'if' and
+'while' statements, where it often makes more sense to phrase the logic
+in the negative:
+
+ if (! SOME CONDITION || SOME OTHER CONDITION) {
+ ... DO WHATEVER PROCESSING ...
+ }
+
+ NOTE: The 'next' statement is discussed in *note Next Statement::.
+ 'next' tells 'awk' to skip the rest of the rules, get the next
+ record, and start processing the rules over again at the top. The
+ reason it's there is to avoid printing the bracketing 'START' and
+ 'END' lines.
+
+ ---------- Footnotes ----------
+
+ (1) This program has a bug; it prints lines starting with 'END'. How
+would you fix it?
+
+
+File: gawk.info, Node: Conditional Exp, Prev: Boolean Ops, Up: Truth Values and Conditions
+
+6.3.4 Conditional Expressions
+-----------------------------
+
+A "conditional expression" is a special kind of expression that has
+three operands. It allows you to use one expression's value to select
+one of two other expressions. The conditional expression in 'awk' is
+the same as in the C language, as shown here:
+
+ SELECTOR ? IF-TRUE-EXP : IF-FALSE-EXP
+
+There are three subexpressions. The first, SELECTOR, is always computed
+first. If it is "true" (not zero or not null), then IF-TRUE-EXP is
+computed next, and its value becomes the value of the whole expression.
+Otherwise, IF-FALSE-EXP is computed next, and its value becomes the
+value of the whole expression. For example, the following expression
+produces the absolute value of 'x':
+
+ x >= 0 ? x : -x
+
+ Each time the conditional expression is computed, only one of
+IF-TRUE-EXP and IF-FALSE-EXP is used; the other is ignored. This is
+important when the expressions have side effects. For example, this
+conditional expression examines element 'i' of either array 'a' or array
+'b', and increments 'i':
+
+ x == y ? a[i++] : b[i++]
+
+This is guaranteed to increment 'i' exactly once, because each time only
+one of the two increment expressions is executed and the other is not.
+*Note Arrays::, for more information about arrays.
+
+ As a minor 'gawk' extension, a statement that uses '?:' can be
+continued simply by putting a newline after either character. However,
+putting a newline in front of either character does not work without
+using backslash continuation (*note Statements/Lines::). If '--posix'
+is specified (*note Options::), this extension is disabled.
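+
+   For example, 'gawk' accepts the following layout, with newlines
+after both the '?' and the ':':
+
+     grade = (score >= 60) ?
+             "pass" :
+             "fail"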
+
+
+File: gawk.info, Node: Function Calls, Next: Precedence, Prev: Truth Values and Conditions, Up: Expressions
+
+6.4 Function Calls
+==================
+
+A "function" is a name for a particular calculation. This enables you
+to ask for it by name at any point in the program. For example, the
+function 'sqrt()' computes the square root of a number.
+
+ A fixed set of functions are "built in", which means they are
+available in every 'awk' program. The 'sqrt()' function is one of
+these. *Note Built-in:: for a list of built-in functions and their
+descriptions. In addition, you can define functions for use in your
+program. *Note User-defined:: for instructions on how to do this.
+Finally, 'gawk' lets you write functions in C or C++ that may be called
+from your program (*note Dynamic Extensions::).
+
+ The way to use a function is with a "function call" expression, which
+consists of the function name followed immediately by a list of
+"arguments" in parentheses. The arguments are expressions that provide
+the raw materials for the function's calculations. When there is more
+than one argument, they are separated by commas. If there are no
+arguments, just write '()' after the function name. The following
+examples show function calls with and without arguments:
+
+ sqrt(x^2 + y^2) one argument
+ atan2(y, x) two arguments
+ rand() no arguments
+
+ CAUTION: Do not put any space between the function name and the
+ opening parenthesis! A user-defined function name looks just like
+ the name of a variable--a space would make the expression look like
+ concatenation of a variable with an expression inside parentheses.
+ With built-in functions, space before the parenthesis is harmless,
+ but it is best not to get into the habit of using space to avoid
+ mistakes with user-defined functions.
+
+ Each function expects a particular number of arguments. For example,
+the 'sqrt()' function must be called with a single argument, the number
+whose square root is to be computed:
+
+ sqrt(ARGUMENT)
+
+ Some of the built-in functions have one or more optional arguments.
+If those arguments are not supplied, the functions use a reasonable
+default value. *Note Built-in:: for full details. If arguments are
+omitted in calls to user-defined functions, then those arguments are
+treated as local variables. Such local variables act like the empty
+string if referenced where a string value is required, and like zero if
+referenced where a numeric value is required (*note User-defined::).
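+
+   As a sketch of the last point, the extra parameter 'sep' in the
+following (hypothetical) function is omitted by the caller, so it acts
+as an uninitialized local variable and behaves like the empty string:
+
+     function join2(a, b,    sep)
+     {
+         return a sep b
+     }
+     BEGIN { print join2("foo", "bar") }    # prints "foobar"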
+
+ As an advanced feature, 'gawk' provides indirect function calls,
+which is a way to choose the function to call at runtime, instead of
+when you write the source code to your program. We defer discussion of
+this feature until later; see *note Indirect Calls::.
+
+ Like every other expression, the function call has a value, often
+called the "return value", which is computed by the function based on
+the arguments you give it. In this example, the return value of
+'sqrt(ARGUMENT)' is the square root of ARGUMENT. The following program
+reads numbers, one number per line, and prints the square root of each
+one:
+
+ $ awk '{ print "The square root of", $1, "is", sqrt($1) }'
+ 1
+ -| The square root of 1 is 1
+ 3
+ -| The square root of 3 is 1.73205
+ 5
+ -| The square root of 5 is 2.23607
+ Ctrl-d
+
+ A function can also have side effects, such as assigning values to
+certain variables or doing I/O. This program shows how the 'match()'
+function (*note String Functions::) changes the variables 'RSTART' and
+'RLENGTH':
+
+ {
+ if (match($1, $2))
+ print RSTART, RLENGTH
+ else
+ print "no match"
+ }
+
+Here is a sample run:
+
+ $ awk -f matchit.awk
+ aaccdd c+
+ -| 3 2
+ foo bar
+ -| no match
+ abcdefg e
+ -| 5 1
+
+
+File: gawk.info, Node: Precedence, Next: Locales, Prev: Function Calls, Up: Expressions
+
+6.5 Operator Precedence (How Operators Nest)
+============================================
+
+"Operator precedence" determines how operators are grouped when
+different operators appear close by in one expression. For example, '*'
+has higher precedence than '+'; thus, 'a + b * c' means to multiply 'b'
+and 'c', and then add 'a' to the product (i.e., 'a + (b * c)').
+
+ The normal precedence of the operators can be overruled by using
+parentheses. Think of the precedence rules as saying where the
+parentheses are assumed to be. In fact, it is wise to always use
+parentheses whenever there is an unusual combination of operators,
+because other people who read the program may not remember what the
+precedence is in this case. Even experienced programmers occasionally
+forget the exact rules, which leads to mistakes. Explicit parentheses
+help prevent any such mistakes.
+
+ When operators of equal precedence are used together, the leftmost
+operator groups first, except for the assignment, conditional, and
+exponentiation operators, which group in the opposite order. Thus, 'a -
+b + c' groups as '(a - b) + c' and 'a = b = c' groups as 'a = (b = c)'.
+
+ Normally the precedence of prefix unary operators does not matter,
+because there is only one way to interpret them: innermost first. Thus,
+'$++i' means '$(++i)' and '++$x' means '++($x)'. However, when another
+operator follows the operand, then the precedence of the unary operators
+can matter. '$x^2' means '($x)^2', but '-x^2' means '-(x^2)', because
+'-' has lower precedence than '^', whereas '$' has higher precedence.
+Also, operators cannot be combined in a way that violates the precedence
+rules; for example, '$$0++--' is not a valid expression because the
+first '$' has higher precedence than the '++'; to avoid the problem the
+expression can be rewritten as '$($0++)--'.
+
+ This list presents 'awk''s operators, in order of highest to lowest
+precedence:
+
+'('...')'
+ Grouping.
+
+'$'
+ Field reference.
+
+'++ --'
+ Increment, decrement.
+
+'^ **'
+ Exponentiation. These operators group right to left.
+
+'+ - !'
+ Unary plus, minus, logical "not."
+
+'* / %'
+ Multiplication, division, remainder.
+
+'+ -'
+ Addition, subtraction.
+
+String concatenation
+ There is no special symbol for concatenation. The operands are
+ simply written side by side (*note Concatenation::).
+
+'< <= == != > >= >> | |&'
+ Relational and redirection. The relational operators and the
+ redirections have the same precedence level. Characters such as
+ '>' serve both as relationals and as redirections; the context
+ distinguishes between the two meanings.
+
+ Note that the I/O redirection operators in 'print' and 'printf'
+ statements belong to the statement level, not to expressions. The
+ redirection does not produce an expression that could be the
+ operand of another operator. As a result, it does not make sense
+ to use a redirection operator near another operator of lower
+ precedence without parentheses. Such combinations (e.g., 'print
+ foo > a ? b : c') result in syntax errors. The correct way to
+ write this statement is 'print foo > (a ? b : c)'.
+
+'~ !~'
+ Matching, nonmatching.
+
+'in'
+ Array membership.
+
+'&&'
+ Logical "and."
+
+'||'
+ Logical "or."
+
+'?:'
+ Conditional. This operator groups right to left.
+
+'= += -= *= /= %= ^= **='
+ Assignment. These operators group right to left.
+
+ NOTE: The '|&', '**', and '**=' operators are not specified by
+ POSIX. For maximum portability, do not use them.
+
+
+File: gawk.info, Node: Locales, Next: Expressions Summary, Prev: Precedence, Up: Expressions
+
+6.6 Where You Are Makes a Difference
+====================================
+
+Modern systems support the notion of "locales": a way to tell the system
+about the local character set and language. The ISO C standard defines
+a default '"C"' locale, which is an environment that is typical of what
+many C programmers are used to.
+
+ Once upon a time, the locale setting used to affect regexp matching,
+but this is no longer true (*note Ranges and Locales::).
+
+ Locales can affect record splitting. For the normal case of 'RS =
+"\n"', the locale is largely irrelevant. For other single-character
+record separators, setting 'LC_ALL=C' in the environment will give you
+much better performance when reading records. Otherwise, 'gawk' has to
+make several function calls, _per input character_, to find the record
+terminator.
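+
+   For example, when using such a record separator on a large
+(hypothetical) file 'data.txt', a run along these lines should be
+noticeably faster:
+
+     $ LC_ALL=C gawk 'BEGIN { RS = ";" } END { print NR }' data.txt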
+
+ Locales can affect how dates and times are formatted (*note Time
+Functions::). For example, a common way to abbreviate the date
+September 4, 2015, in the United States is "9/4/15." In many countries
+in Europe, however, it is abbreviated "4.9.15." Thus, the '%x'
+specification in a '"US"' locale might produce '9/4/15', while in a
+'"EUROPE"' locale, it might produce '4.9.15'.
+
+ According to POSIX, string comparison is also affected by locales
+(similar to regular expressions). The details are presented in *note
+POSIX String Comparison::.
+
+ Finally, the locale affects the value of the decimal point character
+used when 'gawk' parses input data. This is discussed in detail in
+*note Conversion::.
+
+
+File: gawk.info, Node: Expressions Summary, Prev: Locales, Up: Expressions
+
+6.7 Summary
+===========
+
+ * Expressions are the basic elements of computation in programs.
+ They are built from constants, variables, function calls, and
+ combinations of the various kinds of values with operators.
+
+ * 'awk' supplies three kinds of constants: numeric, string, and
+ regexp. 'gawk' lets you specify numeric constants in octal and
+ hexadecimal (bases 8 and 16) as well as decimal (base 10). In
+ certain contexts, a standalone regexp constant such as '/foo/' has
+ the same meaning as '$0 ~ /foo/'.
+
+ * Variables hold values between uses in computations. A number of
+ built-in variables provide information to your 'awk' program, and a
+ number of others let you control how 'awk' behaves.
+
+ * Numbers are automatically converted to strings, and strings to
+ numbers, as needed by 'awk'. Numeric values are converted as if
+ they were formatted with 'sprintf()' using the format in 'CONVFMT'.
+ Locales can influence the conversions.
+
+ * 'awk' provides the usual arithmetic operators (addition,
+ subtraction, multiplication, division, modulus), and unary plus and
+ minus. It also provides comparison operators, Boolean operators,
+ an array membership testing operator, and regexp matching
+ operators. String concatenation is accomplished by placing two
+ expressions next to each other; there is no explicit operator. The
+ three-operand '?:' operator provides an "if-else" test within
+ expressions.
+
+ * Assignment operators provide convenient shorthands for common
+ arithmetic operations.
+
+ * In 'awk', a value is considered to be true if it is nonzero _or_
+ non-null. Otherwise, the value is false.
+
+ * A variable's type is set upon each assignment and may change over
+ its lifetime. The type determines how it behaves in comparisons
+ (string or numeric).
+
+ * Function calls return a value that may be used as part of a larger
+ expression. Expressions used to pass parameter values are fully
+ evaluated before the function is called. 'awk' provides built-in
+ and user-defined functions; this is described in *note Functions::.
+
+ * Operator precedence specifies the order in which operations are
+ performed, unless explicitly overridden by parentheses. 'awk''s
+ operator precedence is compatible with that of C.
+
+ * Locales can affect the format of data as output by an 'awk'
+ program, and occasionally the format for data read as input.
+
+
+File: gawk.info, Node: Patterns and Actions, Next: Arrays, Prev: Expressions, Up: Top
+
+7 Patterns, Actions, and Variables
+**********************************
+
+As you have already seen, each 'awk' statement consists of a pattern
+with an associated action. This major node describes how you build
+patterns and actions, what kinds of things you can do within actions,
+and 'awk''s predefined variables.
+
+ The pattern-action rules and the statements available for use within
+actions form the core of 'awk' programming. In a sense, everything
+covered up to here has been the foundation that programs are built on
+top of. Now it's time to start building something useful.
+
+* Menu:
+
+* Pattern Overview:: What goes into a pattern.
+* Using Shell Variables:: How to use shell variables with 'awk'.
+* Action Overview:: What goes into an action.
+* Statements:: Describes the various control statements in
+ detail.
+* Built-in Variables:: Summarizes the predefined variables.
+* Pattern Action Summary:: Patterns and Actions summary.
+
+
+File: gawk.info, Node: Pattern Overview, Next: Using Shell Variables, Up: Patterns and Actions
+
+7.1 Pattern Elements
+====================
+
+* Menu:
+
+* Regexp Patterns:: Using regexps as patterns.
+* Expression Patterns:: Any expression can be used as a pattern.
+* Ranges:: Pairs of patterns specify record ranges.
+* BEGIN/END:: Specifying initialization and cleanup rules.
+* BEGINFILE/ENDFILE:: Two special patterns for advanced control.
+* Empty:: The empty pattern, which matches every record.
+
+Patterns in 'awk' control the execution of rules--a rule is executed
+when its pattern matches the current input record. The following is a
+summary of the types of 'awk' patterns:
+
+'/REGULAR EXPRESSION/'
+ A regular expression. It matches when the text of the input record
+ fits the regular expression. (*Note Regexp::.)
+
+'EXPRESSION'
+ A single expression. It matches when its value is nonzero (if a
+ number) or non-null (if a string). (*Note Expression Patterns::.)
+
+'BEGPAT, ENDPAT'
+ A pair of patterns separated by a comma, specifying a "range" of
+ records. The range includes both the initial record that matches
+ BEGPAT and the final record that matches ENDPAT. (*Note Ranges::.)
+
+'BEGIN'
+'END'
+ Special patterns for you to supply startup or cleanup actions for
+ your 'awk' program. (*Note BEGIN/END::.)
+
+'BEGINFILE'
+'ENDFILE'
+ Special patterns for you to supply startup or cleanup actions to be
+ done on a per-file basis. (*Note BEGINFILE/ENDFILE::.)
+
+'EMPTY'
+ The empty pattern matches every input record. (*Note Empty::.)
+
+
+File: gawk.info, Node: Regexp Patterns, Next: Expression Patterns, Up: Pattern Overview
+
+7.1.1 Regular Expressions as Patterns
+-------------------------------------
+
+Regular expressions are one of the first kinds of patterns presented in
+this book. This kind of pattern is simply a regexp constant in the
+pattern part of a rule. Its meaning is '$0 ~ /PATTERN/'. The pattern
+matches when the input record matches the regexp. For example:
+
+ /foo|bar|baz/ { buzzwords++ }
+ END { print buzzwords, "buzzwords seen" }
+
+
+File: gawk.info, Node: Expression Patterns, Next: Ranges, Prev: Regexp Patterns, Up: Pattern Overview
+
+7.1.2 Expressions as Patterns
+-----------------------------
+
+Any 'awk' expression is valid as an 'awk' pattern. The pattern matches
+if the expression's value is nonzero (if a number) or non-null (if a
+string). The expression is reevaluated each time the rule is tested
+against a new input record. If the expression uses fields such as '$1',
+the value depends directly on the new input record's text; otherwise, it
+depends on only what has happened so far in the execution of the 'awk'
+program.
+
+ Comparison expressions, using the comparison operators described in
+*note Typing and Comparison::, are a very common kind of pattern.
+Regexp matching and nonmatching are also very common expressions. The
+left operand of the '~' and '!~' operators is a string. The right
+operand is either a constant regular expression enclosed in slashes
+('/REGEXP/'), or any expression whose string value is used as a dynamic
+regular expression (*note Computed Regexps::). The following example
+prints the second field of each input record whose first field is
+precisely 'li':
+
+ $ awk '$1 == "li" { print $2 }' mail-list
+
+(There is no output, because there is no person with the exact name
+'li'.) Contrast this with the following regular expression match, which
+accepts any record with a first field that contains 'li':
+
+ $ awk '$1 ~ /li/ { print $2 }' mail-list
+ -| 555-5553
+ -| 555-6699
+
+ A regexp constant as a pattern is also a special case of an
+expression pattern. The expression '/li/' has the value one if 'li'
+appears in the current input record. Thus, as a pattern, '/li/' matches
+any record containing 'li'.
+
+ Boolean expressions are also commonly used as patterns. Whether the
+pattern matches an input record depends on whether its subexpressions
+match. For example, the following command prints all the records in
+'mail-list' that contain both 'edu' and 'li':
+
+ $ awk '/edu/ && /li/' mail-list
+ -| Samuel 555-3430 samuel.lanceolis@shu.edu A
+
+ The following command prints all records in 'mail-list' that contain
+_either_ 'edu' or 'li' (or both, of course):
+
+ $ awk '/edu/ || /li/' mail-list
+ -| Amelia 555-5553 amelia.zodiacusque@gmail.com F
+ -| Broderick 555-0542 broderick.aliquotiens@yahoo.com R
+ -| Fabius 555-1234 fabius.undevicesimus@ucb.edu F
+ -| Julie 555-6699 julie.perscrutabor@skeeve.com F
+ -| Samuel 555-3430 samuel.lanceolis@shu.edu A
+ -| Jean-Paul 555-2127 jeanpaul.campanorum@nyu.edu R
+
+ The following command prints all records in 'mail-list' that do _not_
+contain the string 'li':
+
+ $ awk '! /li/' mail-list
+ -| Anthony 555-3412 anthony.asserturo@hotmail.com A
+ -| Becky 555-7685 becky.algebrarum@gmail.com A
+ -| Bill 555-1675 bill.drowning@hotmail.com A
+ -| Camilla 555-2912 camilla.infusarum@skynet.be R
+ -| Fabius 555-1234 fabius.undevicesimus@ucb.edu F
+ -| Martin 555-6480 martin.codicibus@hotmail.com A
+ -| Jean-Paul 555-2127 jeanpaul.campanorum@nyu.edu R
+
+ The subexpressions of a Boolean operator in a pattern can be constant
+regular expressions, comparisons, or any other 'awk' expressions. Range
+patterns are not expressions, so they cannot appear inside Boolean
+patterns. Likewise, the special patterns 'BEGIN', 'END', 'BEGINFILE',
+and 'ENDFILE', which never match any input record, are not expressions
+and cannot appear inside Boolean patterns.
+
+ The precedence of the different operators that can appear in patterns
+is described in *note Precedence::.
+
+
+File: gawk.info, Node: Ranges, Next: BEGIN/END, Prev: Expression Patterns, Up: Pattern Overview
+
+7.1.3 Specifying Record Ranges with Patterns
+--------------------------------------------
+
+A "range pattern" is made of two patterns separated by a comma, in the
+form 'BEGPAT, ENDPAT'. It is used to match ranges of consecutive input
+records. The first pattern, BEGPAT, controls where the range begins,
+while ENDPAT controls where the range ends. For example, the
+following:
+
+ awk '$1 == "on", $1 == "off"' myfile
+
+prints every record in 'myfile' between 'on'/'off' pairs, inclusive.
+
+ A range pattern starts out by matching BEGPAT against every input
+record. When a record matches BEGPAT, the range pattern is "turned on",
+and the range pattern matches this record as well. As long as the range
+pattern stays turned on, it automatically matches every input record
+read. The range pattern also matches ENDPAT against every input record;
+when this succeeds, the range pattern is "turned off" again for the
+following record. Then the range pattern goes back to checking BEGPAT
+against each record.
+
+ The record that turns on the range pattern and the one that turns it
+off both match the range pattern. If you don't want to operate on these
+records, you can write 'if' statements in the rule's action to
+distinguish them from the records you are interested in.
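+
+   For example, this sketch prints only the records _between_ the 'on'
+and 'off' lines, excluding the marker records themselves:
+
+     $1 == "on", $1 == "off" {
+         if ($1 != "on" && $1 != "off")
+             print
+     }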
+
+ It is possible for a pattern to be turned on and off by the same
+record. If the record satisfies both conditions, then the action is
+executed for just that record. For example, suppose there is text
+between two identical markers (e.g., the '%' symbol), each on its own
+line, that should be ignored. A first attempt would be to combine a
+range pattern that describes the delimited text with the 'next'
+statement (not discussed yet, *note Next Statement::). This causes
+'awk' to skip any further processing of the current record and start
+over again with the next input record. Such a program looks like this:
+
+ /^%$/,/^%$/ { next }
+ { print }
+
+This program fails because the range pattern is both turned on and
+turned off by the first line, which just has a '%' on it. To accomplish
+this task, write the program in the following manner, using a flag:
+
+ /^%$/ { skip = ! skip; next }
+ skip == 1 { next } # skip lines with `skip' set
+
+ In a range pattern, the comma (',') has the lowest precedence of all
+the operators (i.e., it is evaluated last). Thus, the following program
+attempts to combine a range pattern with another, simpler test:
+
+ echo Yes | awk '/1/,/2/ || /Yes/'
+
+ The intent of this program is '(/1/,/2/) || /Yes/'. However, 'awk'
+interprets this as '/1/, (/2/ || /Yes/)'. This cannot be changed or
+worked around; range patterns do not combine with other patterns:
+
+ $ echo Yes | gawk '(/1/,/2/) || /Yes/'
+ error-> gawk: cmd. line:1: (/1/,/2/) || /Yes/
+ error-> gawk: cmd. line:1: ^ syntax error
+
+ As a minor point of interest, although it is poor style, POSIX allows
+you to put a newline after the comma in a range pattern. (d.c.)
+
+
+File: gawk.info, Node: BEGIN/END, Next: BEGINFILE/ENDFILE, Prev: Ranges, Up: Pattern Overview
+
+7.1.4 The 'BEGIN' and 'END' Special Patterns
+--------------------------------------------
+
+All the patterns described so far are for matching input records. The
+'BEGIN' and 'END' special patterns are different. They supply startup
+and cleanup actions for 'awk' programs. 'BEGIN' and 'END' rules must
+have actions; there is no default action for these rules because there
+is no current record when they run. 'BEGIN' and 'END' rules are often
+referred to as "'BEGIN' and 'END' blocks" by longtime 'awk' programmers.
+
+* Menu:
+
+* Using BEGIN/END:: How and why to use BEGIN/END rules.
+* I/O And BEGIN/END:: I/O issues in BEGIN/END rules.
+
+
+File: gawk.info, Node: Using BEGIN/END, Next: I/O And BEGIN/END, Up: BEGIN/END
+
+7.1.4.1 Startup and Cleanup Actions
+...................................
+
+A 'BEGIN' rule is executed once only, before the first input record is
+read. Likewise, an 'END' rule is executed once only, after all the
+input is read. For example:
+
+ $ awk '
+ > BEGIN { print "Analysis of \"li\"" }
+ > /li/ { ++n }
+ > END { print "\"li\" appears in", n, "records." }' mail-list
+ -| Analysis of "li"
+ -| "li" appears in 4 records.
+
+ This program finds the number of records in the input file
+'mail-list' that contain the string 'li'. The 'BEGIN' rule prints a
+title for the report. There is no need to use the 'BEGIN' rule to
+initialize the counter 'n' to zero, as 'awk' does this automatically
+(*note Variables::). The second rule increments the variable 'n' every
+time a record containing the pattern 'li' is read. The 'END' rule
+prints the value of 'n' at the end of the run.
+
+ The special patterns 'BEGIN' and 'END' cannot be used in ranges or
+with Boolean operators (indeed, they cannot be used with any operators).
+An 'awk' program may have multiple 'BEGIN' and/or 'END' rules. They are
+executed in the order in which they appear: all the 'BEGIN' rules at
+startup and all the 'END' rules at termination. 'BEGIN' and 'END' rules
+may be intermixed with other rules. This feature was added in the 1987
+version of 'awk' and is included in the POSIX standard. The original
+(1978) version of 'awk' required the 'BEGIN' rule to be placed at the
+beginning of the program, the 'END' rule to be placed at the end, and
+only allowed one of each. This is no longer required, but it is a good
+idea to follow this template in terms of program organization and
+readability.
+
+ Multiple 'BEGIN' and 'END' rules are useful for writing library
+functions, because each library file can have its own 'BEGIN' and/or
+'END' rule to do its own initialization and/or cleanup. The order in
+which library functions are named on the command line controls the order
+in which their 'BEGIN' and 'END' rules are executed. Therefore, you
+have to be careful when writing such rules in library files so that the
+order in which they are executed doesn't matter. *Note Options:: for
+more information on using library functions. *Note Library Functions::,
+for a number of useful library functions.
+
+ If an 'awk' program has only 'BEGIN' rules and no other rules, then
+the program exits after the 'BEGIN' rules are run.(1) However, if an
+'END' rule exists, then the input is read, even if there are no other
+rules in the program. This is necessary in case the 'END' rule checks
+the 'FNR' and 'NR' variables.
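+
+   For example, the following program has only an 'END' rule, yet it
+still reads all of its input so that 'NR' has the correct value:
+
+     awk 'END { print NR, "records were read" }' mail-list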
+
+ ---------- Footnotes ----------
+
+ (1) The original version of 'awk' kept reading and ignoring input
+until the end of the file was seen.
+
+
+File: gawk.info, Node: I/O And BEGIN/END, Prev: Using BEGIN/END, Up: BEGIN/END
+
+7.1.4.2 Input/Output from 'BEGIN' and 'END' Rules
+.................................................
+
+There are several (sometimes subtle) points to be aware of when doing
+I/O from a 'BEGIN' or 'END' rule. The first has to do with the value of
+'$0' in a 'BEGIN' rule. Because 'BEGIN' rules are executed before any
+input is read, there simply is no input record, and therefore no fields,
+when executing 'BEGIN' rules. References to '$0' and the fields yield a
+null string or zero, depending upon the context. One way to give '$0' a
+real value is to execute a 'getline' command without a variable (*note
+Getline::). Another way is simply to assign a value to '$0'.
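+
+   For example, this 'BEGIN' rule assigns a record to '$0' directly,
+after which the fields and 'NF' are available as usual:
+
+     BEGIN {
+         $0 = "one two three"
+         print NF, $2      # prints "3 two"
+     }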
+
+ The second point is similar to the first, but from the other
+direction. Traditionally, due largely to implementation issues, '$0'
+and 'NF' were _undefined_ inside an 'END' rule. The POSIX standard
+specifies that 'NF' is available in an 'END' rule. It contains the
+number of fields from the last input record. Most probably due to an
+oversight, the standard does not say that '$0' is also preserved,
+although logically one would think that it should be. In fact, all of
+BWK 'awk', 'mawk', and 'gawk' preserve the value of '$0' for use in
+'END' rules. Be aware, however, that some other implementations and
+many older versions of Unix 'awk' do not.
+
+ The third point follows from the first two. The meaning of 'print'
+inside a 'BEGIN' or 'END' rule is the same as always: 'print $0'. If
+'$0' is the null string, then this prints an empty record. Many
+longtime 'awk' programmers use an unadorned 'print' in 'BEGIN' and 'END'
+rules, to mean 'print ""', relying on '$0' being null. Although one
+might generally get away with this in 'BEGIN' rules, it is a very bad
+idea in 'END' rules, at least in 'gawk'. It is also poor style, because
+if an empty line is needed in the output, the program should print one
+explicitly.
+
+ Finally, the 'next' and 'nextfile' statements are not allowed in a
+'BEGIN' rule, because the implicit
+read-a-record-and-match-against-the-rules loop has not started yet.
+Similarly, those statements are not valid in an 'END' rule, because all
+the input has been read. (*Note Next Statement:: and *note Nextfile
+Statement::.)
+
+
+File: gawk.info, Node: BEGINFILE/ENDFILE, Next: Empty, Prev: BEGIN/END, Up: Pattern Overview
+
+7.1.5 The 'BEGINFILE' and 'ENDFILE' Special Patterns
+----------------------------------------------------
+
+This minor node describes a 'gawk'-specific feature.
+
+ Two special kinds of rule, 'BEGINFILE' and 'ENDFILE', give you
+"hooks" into 'gawk''s command-line file processing loop. As with the
+'BEGIN' and 'END' rules (*note BEGIN/END::), all 'BEGINFILE' rules in a
+program are merged, in the order they are read by 'gawk', and all
+'ENDFILE' rules are merged as well.
+
+ The body of the 'BEGINFILE' rules is executed just before 'gawk'
+reads the first record from a file. 'FILENAME' is set to the name of
+the current file, and 'FNR' is set to zero.
+
+ The 'BEGINFILE' rule provides you the opportunity to accomplish two
+tasks that would otherwise be difficult or impossible to perform:
+
+ * You can test if the file is readable. Normally, it is a fatal
+ error if a file named on the command line cannot be opened for
+ reading. However, you can bypass the fatal error and move on to
+ the next file on the command line.
+
+ You do this by checking if the 'ERRNO' variable is not the empty
+ string; if so, then 'gawk' was not able to open the file. In this
+ case, your program can execute the 'nextfile' statement (*note
+ Nextfile Statement::). This causes 'gawk' to skip the file
+      entirely.  Otherwise, 'gawk' exits with the usual fatal error.
+      (A short sketch of this test appears after this list.)
+
+ * If you have written extensions that modify the record handling (by
+ inserting an "input parser"; *note Input Parsers::), you can invoke
+ them at this point, before 'gawk' has started processing the file.
+ (This is a _very_ advanced feature, currently used only by the
+ 'gawkextlib' project (http://sourceforge.net/projects/gawkextlib).)
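+
+   Here is a sketch of the readability test described in the first item
+above; it skips, with a warning, any command-line file that cannot be
+opened:
+
+     BEGINFILE {
+         if (ERRNO != "") {
+             print "cannot open", FILENAME > "/dev/stderr"
+             nextfile     # skip this file and move on
+         }
+     }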
+
+ The 'ENDFILE' rule is called when 'gawk' has finished processing the
+last record in an input file. For the last input file, it will be
+called before any 'END' rules. The 'ENDFILE' rule is executed even for
+empty input files.
+
+ Normally, when an error occurs when reading input in the normal
+input-processing loop, the error is fatal. However, if an 'ENDFILE'
+rule is present, the error becomes non-fatal, and instead 'ERRNO' is
+set. This makes it possible to catch and process I/O errors at the
+level of the 'awk' program.
+
+ The 'next' statement (*note Next Statement::) is not allowed inside
+either a 'BEGINFILE' or an 'ENDFILE' rule. The 'nextfile' statement is
+allowed only inside a 'BEGINFILE' rule, not inside an 'ENDFILE' rule.
+
+ The 'getline' statement (*note Getline::) is restricted inside both
+'BEGINFILE' and 'ENDFILE': only redirected forms of 'getline' are
+allowed.
+
+ 'BEGINFILE' and 'ENDFILE' are 'gawk' extensions. In most other 'awk'
+implementations, or if 'gawk' is in compatibility mode (*note
+Options::), they are not special.
+
+
+File: gawk.info, Node: Empty, Prev: BEGINFILE/ENDFILE, Up: Pattern Overview
+
+7.1.6 The Empty Pattern
+-----------------------
+
+An empty (i.e., nonexistent) pattern is considered to match _every_
+input record. For example, the program:
+
+ awk '{ print $1 }' mail-list
+
+prints the first field of every record.
+
+
+File: gawk.info, Node: Using Shell Variables, Next: Action Overview, Prev: Pattern Overview, Up: Patterns and Actions
+
+7.2 Using Shell Variables in Programs
+=====================================
+
+'awk' programs are often used as components in larger programs written
+in shell. For example, it is very common to use a shell variable to
+hold a pattern that the 'awk' program searches for. There are two ways
+to get the value of the shell variable into the body of the 'awk'
+program.
+
+ A common method is to use shell quoting to substitute the variable's
+value into the program inside the script. For example, consider the
+following program:
+
+ printf "Enter search pattern: "
+ read pattern
+ awk "/$pattern/ "'{ nmatches++ }
+ END { print nmatches, "found" }' /path/to/data
+
+The 'awk' program consists of two pieces of quoted text that are
+concatenated together to form the program. The first part is
+double-quoted, which allows substitution of the 'pattern' shell variable
+inside the quotes. The second part is single-quoted.
+
+ Variable substitution via quoting works, but can potentially be
+messy. It requires a good understanding of the shell's quoting rules
+(*note Quoting::), and it's often difficult to correctly match up the
+quotes when reading the program.
+
+ A better method is to use 'awk''s variable assignment feature (*note
+Assignment Options::) to assign the shell variable's value to an 'awk'
+variable. Then use dynamic regexps to match the pattern (*note Computed
+Regexps::). The following shows how to redo the previous example using
+this technique:
+
+ printf "Enter search pattern: "
+ read pattern
+ awk -v pat="$pattern" '$0 ~ pat { nmatches++ }
+ END { print nmatches, "found" }' /path/to/data
+
+Now, the 'awk' program is just one single-quoted string. The assignment
+'-v pat="$pattern"' still requires double quotes, in case there is
+whitespace in the value of '$pattern'. The 'awk' variable 'pat' could
+be named 'pattern' too, but that would be more confusing. Using a
+variable also provides more flexibility, as the variable can be used
+anywhere inside the program--for printing, as an array subscript, or for
+any other use--without requiring the quoting tricks at every point in
+the program.
+
+
+File: gawk.info, Node: Action Overview, Next: Statements, Prev: Using Shell Variables, Up: Patterns and Actions
+
+7.3 Actions
+===========
+
+An 'awk' program or script consists of a series of rules and function
+definitions interspersed. (Functions are described later. *Note
+User-defined::.) A rule contains a pattern and an action, either of
+which (but not both) may be omitted. The purpose of the "action" is to
+tell 'awk' what to do once a match for the pattern is found. Thus, in
+outline, an 'awk' program generally looks like this:
+
+ [PATTERN] '{ ACTION }'
+ PATTERN ['{ ACTION }']
+ ...
+ 'function NAME(ARGS) { ... }'
+ ...
+
+ An action consists of one or more 'awk' "statements", enclosed in
+braces ('{...}'). Each statement specifies one thing to do. The
+statements are separated by newlines or semicolons. The braces around
+an action must be used even if the action contains only one statement,
+or if it contains no statements at all. However, if you omit the action
+entirely, omit the braces as well. An omitted action is equivalent to
+'{ print $0 }':
+
+ /foo/ { } match 'foo', do nothing -- empty action
+ /foo/ match 'foo', print the record -- omitted action
+
+ The following types of statements are supported in 'awk':
+
+Expressions
+ Call functions or assign values to variables (*note Expressions::).
+ Executing this kind of statement simply computes the value of the
+ expression. This is useful when the expression has side effects
+ (*note Assignment Ops::).
+
+Control statements
+ Specify the control flow of 'awk' programs. The 'awk' language
+ gives you C-like constructs ('if', 'for', 'while', and 'do') as
+ well as a few special ones (*note Statements::).
+
+Compound statements
+ Enclose one or more statements in braces. A compound statement is
+ used in order to put several statements together in the body of an
+ 'if', 'while', 'do', or 'for' statement.
+
+Input statements
+ Use the 'getline' command (*note Getline::). Also supplied in
+ 'awk' are the 'next' statement (*note Next Statement::) and the
+ 'nextfile' statement (*note Nextfile Statement::).
+
+Output statements
+ Such as 'print' and 'printf'. *Note Printing::.
+
+Deletion statements
+ For deleting array elements. *Note Delete::.
+
+
+File: gawk.info, Node: Statements, Next: Built-in Variables, Prev: Action Overview, Up: Patterns and Actions
+
+7.4 Control Statements in Actions
+=================================
+
+"Control statements", such as 'if', 'while', and so on, control the flow
+of execution in 'awk' programs. Most of 'awk''s control statements are
+patterned after similar statements in C.
+
+ All the control statements start with special keywords, such as 'if'
+and 'while', to distinguish them from simple expressions. Many control
+statements contain other statements. For example, the 'if' statement
+contains another statement that may or may not be executed. The
+contained statement is called the "body". To include more than one
+statement in the body, group them into a single "compound statement"
+with braces, separating them with newlines or semicolons.
+
+* Menu:
+
+* If Statement:: Conditionally execute some 'awk'
+ statements.
+* While Statement:: Loop until some condition is satisfied.
+* Do Statement:: Do specified action while looping until some
+ condition is satisfied.
+* For Statement:: Another looping statement, that provides
+ initialization and increment clauses.
+* Switch Statement:: Switch/case evaluation for conditional
+ execution of statements based on a value.
+* Break Statement:: Immediately exit the innermost enclosing loop.
+* Continue Statement:: Skip to the end of the innermost enclosing
+ loop.
+* Next Statement:: Stop processing the current input record.
+* Nextfile Statement:: Stop processing the current file.
+* Exit Statement:: Stop execution of 'awk'.
+
+
+File: gawk.info, Node: If Statement, Next: While Statement, Up: Statements
+
+7.4.1 The 'if'-'else' Statement
+-------------------------------
+
+The 'if'-'else' statement is 'awk''s decision-making statement. It
+looks like this:
+
+ 'if (CONDITION) THEN-BODY' ['else ELSE-BODY']
+
+The CONDITION is an expression that controls what the rest of the
+statement does. If the CONDITION is true, THEN-BODY is executed;
+otherwise, ELSE-BODY is executed. The 'else' part of the statement is
+optional. The condition is considered false if its value is zero or the
+null string; otherwise, the condition is true. Refer to the following:
+
+ if (x % 2 == 0)
+ print "x is even"
+ else
+ print "x is odd"
+
+ In this example, if the expression 'x % 2 == 0' is true (i.e., if the
+value of 'x' is evenly divisible by two), then the first 'print'
+statement is executed; otherwise, the second 'print' statement is
+executed. If the 'else' keyword appears on the same line as THEN-BODY
+and THEN-BODY is not a compound statement (i.e., not surrounded by
+braces), then a semicolon must separate THEN-BODY from the 'else'. To
+illustrate this, the previous example can be rewritten as:
+
+ if (x % 2 == 0) print "x is even"; else
+ print "x is odd"
+
+If the ';' is left out, 'awk' can't interpret the statement and it
+produces a syntax error. Don't actually write programs this way,
+because a human reader might fail to see the 'else' if it is not the
+first thing on its line.
+
+
+File: gawk.info, Node: While Statement, Next: Do Statement, Prev: If Statement, Up: Statements
+
+7.4.2 The 'while' Statement
+---------------------------
+
+In programming, a "loop" is a part of a program that can be executed two
+or more times in succession. The 'while' statement is the simplest
+looping statement in 'awk'. It repeatedly executes a statement as long
+as a condition is true. For example:
+
+ while (CONDITION)
+ BODY
+
+BODY is a statement called the "body" of the loop, and CONDITION is an
+expression that controls how long the loop keeps running. The first
+thing the 'while' statement does is test the CONDITION. If the
+CONDITION is true, it executes the statement BODY. (The CONDITION is
+true when the value is not zero and not a null string.) After BODY has
+been executed, CONDITION is tested again, and if it is still true, BODY
+executes again. This process repeats until the CONDITION is no longer
+true. If the CONDITION is initially false, the body of the loop never
+executes and 'awk' continues with the statement following the loop.
+This example prints the first three fields of each record, one per line:
+
+ awk '
+ {
+ i = 1
+ while (i <= 3) {
+ print $i
+ i++
+ }
+ }' inventory-shipped
+
+The body of this loop is a compound statement enclosed in braces,
+containing two statements. The loop works in the following manner:
+first, the value of 'i' is set to one. Then, the 'while' statement
+tests whether 'i' is less than or equal to three. This is true when 'i'
+equals one, so the 'i'th field is printed. Then the 'i++' increments
+the value of 'i' and the loop repeats. The loop terminates when 'i'
+reaches four.
+
+ A newline is not required between the condition and the body;
+however, using one makes the program clearer unless the body is a
+compound statement or else is very simple. The newline after the open
+brace that begins the compound statement is not required either, but the
+program is harder to read without it.
+
+
+File: gawk.info, Node: Do Statement, Next: For Statement, Prev: While Statement, Up: Statements
+
+7.4.3 The 'do'-'while' Statement
+--------------------------------
+
+The 'do' loop is a variation of the 'while' looping statement. The 'do'
+loop executes the BODY once and then repeats the BODY as long as the
+CONDITION is true. It looks like this:
+
+ do
+ BODY
+ while (CONDITION)
+
+ Even if the CONDITION is false at the start, the BODY executes at
+least once (and only once, unless executing BODY makes CONDITION true).
+Contrast this with the corresponding 'while' statement:
+
+ while (CONDITION)
+ BODY
+
+This statement does not execute the BODY even once if the CONDITION is
+false to begin with. The following is an example of a 'do' statement:
+
+ {
+ i = 1
+ do {
+ print $0
+ i++
+ } while (i <= 10)
+ }
+
+This program prints each input record 10 times. However, it isn't a
+very realistic example, because in this case an ordinary 'while' would
+do just as well. This situation reflects actual experience; only
+occasionally is there a real use for a 'do' statement.
+
+
+File: gawk.info, Node: For Statement, Next: Switch Statement, Prev: Do Statement, Up: Statements
+
+7.4.4 The 'for' Statement
+-------------------------
+
+The 'for' statement makes it more convenient to count iterations of a
+loop. The general form of the 'for' statement looks like this:
+
+ for (INITIALIZATION; CONDITION; INCREMENT)
+ BODY
+
+The INITIALIZATION, CONDITION, and INCREMENT parts are arbitrary 'awk'
+expressions, and BODY stands for any 'awk' statement.
+
+ The 'for' statement starts by executing INITIALIZATION. Then, as
+long as the CONDITION is true, it repeatedly executes BODY and then
+INCREMENT. Typically, INITIALIZATION sets a variable to either zero or
+one, INCREMENT adds one to it, and CONDITION compares it against the
+desired number of iterations. For example:
+
+ awk '
+ {
+ for (i = 1; i <= 3; i++)
+ print $i
+ }' inventory-shipped
+
+This prints the first three fields of each input record, with one field
+per line.
+
+ It isn't possible to set more than one variable in the INITIALIZATION
+part without using a multiple assignment statement such as 'x = y = 0'.
+This makes sense only if all the initial values are equal. (But it is
+possible to initialize additional variables by writing their assignments
+as separate statements preceding the 'for' loop.)
+
+ The same is true of the INCREMENT part. Incrementing additional
+variables requires separate statements at the end of the loop. The C
+compound expression, using C's comma operator, is useful in this
+context, but it is not supported in 'awk'.
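+
+   For example, the following fragment counts the numeric fields of a
+record; the second variable 'n' is updated with a separate statement
+inside the body:
+
+     n = 0
+     for (i = 1; i <= NF; i++) {
+         if ($i ~ /^[0-9]+$/)
+             n++          # one more numeric field seen
+     }
+     print n, "numeric fields"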
+
+ Most often, INCREMENT is an increment expression, as in the previous
+example. But this is not required; it can be any expression whatsoever.
+For example, the following statement prints all the powers of two
+between 1 and 100:
+
+ for (i = 1; i <= 100; i *= 2)
+ print i
+
+ If there is nothing to be done, any of the three expressions in the
+parentheses following the 'for' keyword may be omitted. Thus,
+'for (; x > 0;)' is equivalent to 'while (x > 0)'. If the CONDITION is
+omitted, it is treated as true, effectively yielding an "infinite loop"
+(i.e., a loop that never terminates).
+
+ In most cases, a 'for' loop is an abbreviation for a 'while' loop, as
+shown here:
+
+ INITIALIZATION
+ while (CONDITION) {
+ BODY
+ INCREMENT
+ }
+
+The only exception is when the 'continue' statement (*note Continue
+Statement::) is used inside the loop. Changing a 'for' statement to a
+'while' statement in this way can change the effect of the 'continue'
+statement inside the loop.
+
+ The 'awk' language has a 'for' statement in addition to a 'while'
+statement because a 'for' loop is often both less work to type and more
+natural to think of. Counting the number of iterations is very common
+in loops. It can be easier to think of this counting as part of looping
+rather than as something to do inside the loop.
+
+ There is an alternative version of the 'for' loop, for iterating over
+all the indices of an array:
+
+ for (i in array)
+ DO SOMETHING WITH array[i]
+
+*Note Scanning an Array:: for more information on this version of the
+'for' loop.
+
+
+File: gawk.info, Node: Switch Statement, Next: Break Statement, Prev: For Statement, Up: Statements
+
+7.4.5 The 'switch' Statement
+----------------------------
+
+This minor node describes a 'gawk'-specific feature. If 'gawk' is in
+compatibility mode (*note Options::), it is not available.
+
+ The 'switch' statement allows the evaluation of an expression and the
+execution of statements based on a 'case' match. Case statements are
+checked for a match in the order they are defined. If no suitable
+'case' is found, the 'default' section is executed, if supplied.
+
+ Each 'case' contains a single constant, be it numeric, string, or
+regexp. The 'switch' expression is evaluated, and then each 'case''s
+constant is compared against the result in turn. The type of constant
+determines the comparison: numeric and string constants are compared
+in the usual way.
+A regexp constant does a regular expression match against the string
+value of the original expression. The general form of the 'switch'
+statement looks like this:
+
+ switch (EXPRESSION) {
+ case VALUE OR REGULAR EXPRESSION:
+ CASE-BODY
+ default:
+ DEFAULT-BODY
+ }
+
+ Control flow in the 'switch' statement works as it does in C. Once a
+match to a given case is made, the case statement bodies execute until a
+'break', 'continue', 'next', 'nextfile', or 'exit' is encountered, or
+the end of the 'switch' statement itself. For example:
+
+ while ((c = getopt(ARGC, ARGV, "aksx")) != -1) {
+ switch (c) {
+ case "a":
+ # report size of all files
+ all_files = TRUE;
+ break
+ case "k":
+ BLOCK_SIZE = 1024 # 1K block size
+ break
+ case "s":
+ # do sums only
+ sum_only = TRUE
+ break
+ case "x":
+ # don't cross filesystems
+ fts_flags = or(fts_flags, FTS_XDEV)
+ break
+ case "?":
+ default:
+ usage()
+ break
+ }
+ }
+
+ Note that if none of the statements specified here halt execution of
+a matched 'case' statement, execution falls through to the next 'case'
+until execution halts. In this example, the 'case' for '"?"' falls
+through to the 'default' case, which is to call a function named
+'usage()'. (The 'getopt()' function being called here is described in
+*note Getopt Function::.)
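+
+ Here is a shorter sketch of our own (the input and messages are made
+up) showing a string constant and a regexp constant used as 'case'
+values:
+
+     {
+         switch ($1) {
+         case "quit":              # string constant
+             exit
+         case /^[0-9]+$/:          # regexp constant, matched against $1
+             print "numeric first field:", $1
+             break
+         default:
+             print "something else:", $1
+             break
+         }
+     }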
+
+
+File: gawk.info, Node: Break Statement, Next: Continue Statement, Prev: Switch Statement, Up: Statements
+
+7.4.6 The 'break' Statement
+---------------------------
+
+The 'break' statement jumps out of the innermost 'for', 'while', or 'do'
+loop that encloses it. The following example finds the smallest divisor
+of any integer, and also identifies prime numbers:
+
+ # find smallest divisor of num
+ {
+ num = $1
+ for (divisor = 2; divisor * divisor <= num; divisor++) {
+ if (num % divisor == 0)
+ break
+ }
+ if (num % divisor == 0)
+ printf "Smallest divisor of %d is %d\n", num, divisor
+ else
+ printf "%d is prime\n", num
+ }
+
+ When the remainder is zero in the first 'if' statement, 'awk'
+immediately "breaks out" of the containing 'for' loop. This means that
+'awk' proceeds immediately to the statement following the loop and
+continues processing. (This is very different from the 'exit'
+statement, which stops the entire 'awk' program. *Note Exit
+Statement::.)
+
+ The following program illustrates how the CONDITION of a 'for' or
+'while' statement could be replaced with a 'break' inside an 'if':
+
+ # find smallest divisor of num
+ {
+ num = $1
+ for (divisor = 2; ; divisor++) {
+ if (num % divisor == 0) {
+ printf "Smallest divisor of %d is %d\n", num, divisor
+ break
+ }
+ if (divisor * divisor > num) {
+ printf "%d is prime\n", num
+ break
+ }
+ }
+ }
+
+ The 'break' statement is also used to break out of the 'switch'
+statement. This is discussed in *note Switch Statement::.
+
+ The 'break' statement has no meaning when used outside the body of a
+loop or 'switch'. However, although it was never documented, historical
+implementations of 'awk' treated the 'break' statement outside of a loop
+as if it were a 'next' statement (*note Next Statement::). (d.c.)
+Recent versions of BWK 'awk' no longer allow this usage, nor does
+'gawk'.
+
+
+File: gawk.info, Node: Continue Statement, Next: Next Statement, Prev: Break Statement, Up: Statements
+
+7.4.7 The 'continue' Statement
+------------------------------
+
+Similar to 'break', the 'continue' statement is used only inside 'for',
+'while', and 'do' loops. It skips over the rest of the loop body,
+causing the next cycle around the loop to begin immediately. Contrast
+this with 'break', which jumps out of the loop altogether.
+
+ The 'continue' statement in a 'for' loop directs 'awk' to skip the
+rest of the body of the loop and resume execution with the
+increment-expression of the 'for' statement. The following program
+illustrates this fact:
+
+ BEGIN {
+ for (x = 0; x <= 20; x++) {
+ if (x == 5)
+ continue
+ printf "%d ", x
+ }
+ print ""
+ }
+
+This program prints all the numbers from 0 to 20--except for 5, for
+which the 'printf' is skipped. Because the increment 'x++' is not
+skipped, 'x' does not remain stuck at 5. Contrast the 'for' loop from
+the previous example with the following 'while' loop:
+
+ BEGIN {
+ x = 0
+ while (x <= 20) {
+ if (x == 5)
+ continue
+ printf "%d ", x
+ x++
+ }
+ print ""
+ }
+
+This program loops forever once 'x' reaches 5, because the increment
+('x++') is never reached.
+
+ The 'continue' statement has no special meaning with respect to the
+'switch' statement, nor does it have any meaning when used outside the
+body of a loop. Historical versions of 'awk' treated a 'continue'
+statement outside a loop the same way they treated a 'break' statement
+outside a loop: as if it were a 'next' statement (*note Next
+Statement::). (d.c.) Recent versions of BWK 'awk' no longer work this
+way, nor does 'gawk'.
+
+
+File: gawk.info, Node: Next Statement, Next: Nextfile Statement, Prev: Continue Statement, Up: Statements
+
+7.4.8 The 'next' Statement
+--------------------------
+
+The 'next' statement forces 'awk' to immediately stop processing the
+current record and go on to the next record. This means that no further
+rules are executed for the current record, and the rest of the current
+rule's action isn't executed.
+
+ Contrast this with the effect of the 'getline' function (*note
+Getline::). That also causes 'awk' to read the next record immediately,
+but it does not alter the flow of control in any way (i.e., the rest of
+the current action executes with a new input record).
+
+ At the highest level, 'awk' program execution is a loop that reads an
+input record and then tests each rule's pattern against it. If you
+think of this loop as a 'for' statement whose body contains the rules,
+then the 'next' statement is analogous to a 'continue' statement. It
+skips to the end of the body of this implicit loop and executes the
+increment (which reads another record).
+
+ For example, suppose an 'awk' program works only on records with four
+fields, and it shouldn't fail when given bad input. To avoid
+complicating the rest of the program, write a "weed out" rule near the
+beginning, in the following manner:
+
+ NF != 4 {
+ printf("%s:%d: skipped: NF != 4\n", FILENAME, FNR) > "/dev/stderr"
+ next
+ }
+
+Because of the 'next' statement, the program's subsequent rules won't
+see the bad record. The error message is redirected to the standard
+error output stream, as error messages should be. For more detail, see
+*note Special Files::.
+
+ If the 'next' statement causes the end of the input to be reached,
+then the code in any 'END' rules is executed. *Note BEGIN/END::.
+
+ The 'next' statement is not allowed inside 'BEGINFILE' and 'ENDFILE'
+rules. *Note BEGINFILE/ENDFILE::.
+
+ According to the POSIX standard, the behavior is undefined if the
+'next' statement is used in a 'BEGIN' or 'END' rule. 'gawk' treats it
+as a syntax error. Although POSIX does not disallow it, most other
+'awk' implementations don't allow the 'next' statement inside function
+bodies (*note User-defined::). Just as with any other 'next' statement,
+a 'next' statement inside a function body reads the next record and
+starts processing it with the first rule in the program.
+
+
+File: gawk.info, Node: Nextfile Statement, Next: Exit Statement, Prev: Next Statement, Up: Statements
+
+7.4.9 The 'nextfile' Statement
+------------------------------
+
+The 'nextfile' statement is similar to the 'next' statement. However,
+instead of abandoning processing of the current record, the 'nextfile'
+statement instructs 'awk' to stop processing the current data file.
+
+ Upon execution of the 'nextfile' statement, 'FILENAME' is updated to
+the name of the next data file listed on the command line, 'FNR' is
+reset to one, and processing starts over with the first rule in the
+program. If the 'nextfile' statement causes the end of the input to be
+reached, then the code in any 'END' rules is executed. An exception to
+this is when 'nextfile' is invoked during execution of any statement in
+an 'END' rule; in this case, it causes the program to stop immediately.
+*Note BEGIN/END::.
+
+ The 'nextfile' statement is useful when there are many data files to
+process but it isn't necessary to process every record in every file.
+Without 'nextfile', in order to move on to the next data file, a program
+would have to continue scanning the unwanted records. The 'nextfile'
+statement accomplishes this much more efficiently.
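+
+ As an illustrative sketch (the '__END__' marker is made up), the
+following program prints each file's records only up to the first
+record whose first field is '__END__':
+
+     $1 == "__END__" { nextfile }
+     { print FILENAME ": " $0 }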
+
+ In 'gawk', execution of 'nextfile' causes additional things to
+happen: any 'ENDFILE' rules are executed if 'gawk' is not currently in
+an 'END' or 'BEGINFILE' rule, 'ARGIND' is incremented, and any
+'BEGINFILE' rules are executed. ('ARGIND' hasn't been introduced yet.
+*Note Built-in Variables::.)
+
+ With 'gawk', 'nextfile' is useful inside a 'BEGINFILE' rule to skip
+over a file that would otherwise cause 'gawk' to exit with a fatal
+error. In this case, 'ENDFILE' rules are not executed. *Note
+BEGINFILE/ENDFILE::.
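+
+ One common pattern (shown here as a sketch, with a made-up message)
+checks 'ERRNO' in a 'BEGINFILE' rule and skips files that could not be
+opened:
+
+     BEGINFILE {
+         if (ERRNO != "") {
+             print "skipping", FILENAME, "--", ERRNO > "/dev/stderr"
+             nextfile
+         }
+     }
+     { print }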
+
+ Although it might seem that 'close(FILENAME)' would accomplish the
+same as 'nextfile', this isn't true. 'close()' is reserved for closing
+files, pipes, and coprocesses that are opened with redirections. It is
+not related to the main processing that 'awk' does with the files listed
+in 'ARGV'.
+
+ NOTE: For many years, 'nextfile' was a common extension. In
+ September 2012, it was accepted for inclusion into the POSIX
+ standard. See the Austin Group website
+ (http://austingroupbugs.net/view.php?id=607).
+
+ The current version of BWK 'awk' and 'mawk' also support 'nextfile'.
+However, they don't allow the 'nextfile' statement inside function
+bodies (*note User-defined::). 'gawk' does; a 'nextfile' inside a
+function body reads the first record from the next file and starts
+processing it with the first rule in the program, just as any other
+'nextfile' statement.
+
+
+File: gawk.info, Node: Exit Statement, Prev: Nextfile Statement, Up: Statements
+
+7.4.10 The 'exit' Statement
+---------------------------
+
+The 'exit' statement causes 'awk' to immediately stop executing the
+current rule and to stop processing input; any remaining input is
+ignored. The 'exit' statement is written as follows:
+
+ 'exit' [RETURN CODE]
+
+ When an 'exit' statement is executed from a 'BEGIN' rule, the program
+stops processing everything immediately. No input records are read.
+However, if an 'END' rule is present, as part of executing the 'exit'
+statement, the 'END' rule is executed (*note BEGIN/END::). If 'exit' is
+used in the body of an 'END' rule, it causes the program to stop
+immediately.
+
+ An 'exit' statement that is not part of a 'BEGIN' or 'END' rule stops
+the execution of any further automatic rules for the current record,
+skips reading any remaining input records, and executes the 'END' rule
+if there is one. 'gawk' also skips any 'ENDFILE' rules; they do not
+execute.
+
+ In such a case, if you don't want the 'END' rule to do its job, set a
+variable to a nonzero value before the 'exit' statement and check that
+variable in the 'END' rule. *Note Assert Function:: for an example that
+does this.
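+
+ The following sketch (our own; the field check and messages are made
+up) shows the idea: a flag variable is set before 'exit', and the 'END'
+rule checks it before doing its normal work:
+
+     NR == 1 && NF != 4 {
+         print "unexpected input format -- giving up" > "/dev/stderr"
+         die = 1
+         exit 1
+     }
+     { sum += $4 }
+     END {
+         if (die)
+             exit 1
+         print "total:", sum
+     }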
+
+ If an argument is supplied to 'exit', its value is used as the exit
+status code for the 'awk' process. If no argument is supplied, 'exit'
+causes 'awk' to return a "success" status. In the case where an
+argument is supplied to a first 'exit' statement, and then 'exit' is
+called a second time from an 'END' rule with no argument, 'awk' uses the
+previously supplied exit value. (d.c.) *Note Exit Status:: for more
+information.
+
+ For example, suppose an error condition occurs that is difficult or
+impossible to handle. Conventionally, programs report this by exiting
+with a nonzero status. An 'awk' program can do this using an 'exit'
+statement with a nonzero argument, as shown in the following example:
+
+ BEGIN {
+ if (("date" | getline date_now) <= 0) {
+ print "Can't get system date" > "/dev/stderr"
+ exit 1
+ }
+ print "current date is", date_now
+ close("date")
+ }
+
+ NOTE: For full portability, exit values should be between zero and
+ 126, inclusive. Negative values, and values of 127 or greater, may
+ not produce consistent results across different operating systems.
+
+
+File: gawk.info, Node: Built-in Variables, Next: Pattern Action Summary, Prev: Statements, Up: Patterns and Actions
+
+7.5 Predefined Variables
+========================
+
+Most 'awk' variables are available to use for your own purposes; they
+never change unless your program assigns values to them, and they never
+affect anything unless your program examines them. However, a few
+variables in 'awk' have special built-in meanings. 'awk' examines some
+of these automatically, so that they enable you to tell 'awk' how to do
+certain things. Others are set automatically by 'awk', so that they
+carry information from the internal workings of 'awk' to your program.
+
+ This minor node documents all of 'gawk''s predefined variables, most
+of which are also documented in the major nodes describing their areas
+of activity.
+
+* Menu:
+
+* User-modified:: Built-in variables that you change to control
+ 'awk'.
+* Auto-set:: Built-in variables where 'awk' gives
+ you information.
+* ARGC and ARGV:: Ways to use 'ARGC' and 'ARGV'.
+
+
+File: gawk.info, Node: User-modified, Next: Auto-set, Up: Built-in Variables
+
+7.5.1 Built-in Variables That Control 'awk'
+-------------------------------------------
+
+The following is an alphabetical list of variables that you can change
+to control how 'awk' does certain things.
+
+ The variables that are specific to 'gawk' are marked with a pound
+sign ('#'). These variables are 'gawk' extensions. In other 'awk'
+implementations or if 'gawk' is in compatibility mode (*note Options::),
+they are not special. (Any exceptions are noted in the description of
+each variable.)
+
+'BINMODE #'
+ On non-POSIX systems, this variable specifies use of binary mode
+ for all I/O. Numeric values of one, two, or three specify that
+ input files, output files, or all files, respectively, should use
+ binary I/O. A numeric value less than zero is treated as zero, and
+ a numeric value greater than three is treated as three.
+ Alternatively, string values of '"r"' or '"w"' specify that input
+ files and output files, respectively, should use binary I/O. A
+ string value of '"rw"' or '"wr"' indicates that all files should
+ use binary I/O. Any other string value is treated the same as
+ '"rw"', but causes 'gawk' to generate a warning message. 'BINMODE'
+ is described in more detail in *note PC Using::. 'mawk' (*note
+ Other Versions::) also supports this variable, but only using
+ numeric values.
+
+'CONVFMT'
+ A string that controls the conversion of numbers to strings (*note
+ Conversion::). It works by being passed, in effect, as the first
+ argument to the 'sprintf()' function (*note String Functions::).
+ Its default value is '"%.6g"'. 'CONVFMT' was introduced by the
+ POSIX standard.
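+
+     For example, this small sketch (not one of the manual's standard
+     examples) shows 'CONVFMT' taking effect when a number is converted
+     to a string by concatenation:
+
+          BEGIN {
+              CONVFMT = "%2.2f"
+              a = 12.3456
+              b = a ""           # concatenation forces the conversion
+              print b            # prints 12.35
+          }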
+
+'FIELDWIDTHS #'
+ A space-separated list of columns that tells 'gawk' how to split
+ input with fixed columnar boundaries. Assigning a value to
+ 'FIELDWIDTHS' overrides the use of 'FS' and 'FPAT' for field
+ splitting. *Note Constant Size:: for more information.
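+
+     As a sketch (the input layout is made up), fixed-width dates such
+     as '20160229' can be split into parts like so:
+
+          BEGIN { FIELDWIDTHS = "4 2 2" }
+          { print $1 "-" $2 "-" $3 }     # prints 2016-02-29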
+
+'FPAT #'
+ A regular expression (as a string) that tells 'gawk' to create the
+ fields based on text that matches the regular expression.
+ Assigning a value to 'FPAT' overrides the use of 'FS' and
+ 'FIELDWIDTHS' for field splitting. *Note Splitting By Content::
+ for more information.
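+
+     For example, this sketch (ours) treats every run of non-comma
+     characters as a field, so the separators never appear in the
+     fields themselves:
+
+          BEGIN { FPAT = "[^,]+" }
+          { print NF, $1 }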
+
+'FS'
+ The input field separator (*note Field Separators::). The value is
+ a single-character string or a multicharacter regular expression
+ that matches the separations between fields in an input record. If
+ the value is the null string ('""'), then each character in the
+ record becomes a separate field. (This behavior is a 'gawk'
+ extension. POSIX 'awk' does not specify the behavior when 'FS' is
+ the null string. Nonetheless, some other versions of 'awk' also
+ treat '""' specially.)
+
+ The default value is '" "', a string consisting of a single space.
+ As a special exception, this value means that any sequence of
+ spaces, TABs, and/or newlines is a single separator. It also
+ causes spaces, TABs, and newlines at the beginning and end of a
+ record to be ignored.
+
+ You can set the value of 'FS' on the command line using the '-F'
+ option:
+
+ awk -F, 'PROGRAM' INPUT-FILES
+
+ If 'gawk' is using 'FIELDWIDTHS' or 'FPAT' for field splitting,
+ assigning a value to 'FS' causes 'gawk' to return to the normal,
+ 'FS'-based field splitting. An easy way to do this is to simply
+ say 'FS = FS', perhaps with an explanatory comment.
+
+'IGNORECASE #'
+ If 'IGNORECASE' is nonzero or non-null, then all string comparisons
+ and all regular expression matching are case-independent. This
+ applies to regexp matching with '~' and '!~', the 'gensub()',
+ 'gsub()', 'index()', 'match()', 'patsplit()', 'split()', and
+ 'sub()' functions, record termination with 'RS', and field
+ splitting with 'FS' and 'FPAT'. However, the value of 'IGNORECASE'
+ does _not_ affect array subscripting and it does not affect field
+ splitting when using a single-character field separator. *Note
+ Case-sensitivity::.
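+
+     For example, in this two-line sketch (the pattern is made up), the
+     regexp matches regardless of case:
+
+          BEGIN { IGNORECASE = 1 }
+          /volcano/ { print }      # also matches "VOLCANO", "Volcano", ...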
+
+'LINT #'
+ When this variable is true (nonzero or non-null), 'gawk' behaves as
+ if the '--lint' command-line option is in effect (*note Options::).
+ With a value of '"fatal"', lint warnings become fatal errors. With
+ a value of '"invalid"', only warnings about things that are
+ actually invalid are issued. (This is not fully implemented yet.)
+ Any other true value prints nonfatal warnings. Assigning a false
+ value to 'LINT' turns off the lint warnings.
+
+ This variable is a 'gawk' extension. It is not special in other
+ 'awk' implementations. Unlike with the other special variables,
+ changing 'LINT' does affect the production of lint warnings, even
+ if 'gawk' is in compatibility mode. Much as the '--lint' and
+ '--traditional' options independently control different aspects of
+ 'gawk''s behavior, the control of lint warnings during program
+ execution is independent of the flavor of 'awk' being executed.
+
+'OFMT'
+ A string that controls conversion of numbers to strings (*note
+ Conversion::) for printing with the 'print' statement. It works by
+ being passed as the first argument to the 'sprintf()' function
+ (*note String Functions::). Its default value is '"%.6g"'.
+ Earlier versions of 'awk' used 'OFMT' to specify the format for
+ converting numbers to strings in general expressions; this is now
+ done by 'CONVFMT'.
+
+'OFS'
+ The output field separator (*note Output Separators::). It is
+ output between the fields printed by a 'print' statement. Its
+ default value is '" "', a string consisting of a single space.
+
+'ORS'
+ The output record separator. It is output at the end of every
+ 'print' statement. Its default value is '"\n"', the newline
+ character. (*Note Output Separators::.)
+
+'PREC #'
+ The working precision of arbitrary-precision floating-point
+ numbers, 53 bits by default (*note Setting precision::).
+
+'ROUNDMODE #'
+ The rounding mode to use for arbitrary-precision arithmetic on
+ numbers, by default '"N"' ('roundTiesToEven' in the IEEE 754
+ standard; *note Setting the rounding mode::).
+
+'RS'
+ The input record separator. Its default value is a string
+ containing a single newline character, which means that an input
+ record consists of a single line of text. It can also be the null
+ string, in which case records are separated by runs of blank lines.
+ If it is a regexp, records are separated by matches of the regexp
+ in the input text. (*Note Records::.)
+
+ The ability for 'RS' to be a regular expression is a 'gawk'
+ extension. In most other 'awk' implementations, or if 'gawk' is in
+ compatibility mode (*note Options::), just the first character of
+ 'RS''s value is used.
+
+'SUBSEP'
+ The subscript separator. It has the default value of '"\034"' and
+ is used to separate the parts of the indices of a multidimensional
+ array. Thus, the expression 'foo["A", "B"]' really accesses
+ 'foo["A\034B"]' (*note Multidimensional::).
+
+'TEXTDOMAIN #'
+ Used for internationalization of programs at the 'awk' level. It
+ sets the default text domain for specially marked string constants
+ in the source text, as well as for the 'dcgettext()',
+ 'dcngettext()', and 'bindtextdomain()' functions (*note
+ Internationalization::). The default value of 'TEXTDOMAIN' is
+ '"messages"'.
+
+
+File: gawk.info, Node: Auto-set, Next: ARGC and ARGV, Prev: User-modified, Up: Built-in Variables
+
+7.5.2 Built-in Variables That Convey Information
+------------------------------------------------
+
+The following is an alphabetical list of variables that 'awk' sets
+automatically on certain occasions in order to provide information to
+your program.
+
+ The variables that are specific to 'gawk' are marked with a pound
+sign ('#'). These variables are 'gawk' extensions. In other 'awk'
+implementations or if 'gawk' is in compatibility mode (*note Options::),
+they are not special:
+
+'ARGC', 'ARGV'
+ The command-line arguments available to 'awk' programs are stored
+ in an array called 'ARGV'. 'ARGC' is the number of command-line
+ arguments present. *Note Other Arguments::. Unlike most 'awk'
+ arrays, 'ARGV' is indexed from 0 to 'ARGC' - 1. In the following
+ example:
+
+ $ awk 'BEGIN {
+ > for (i = 0; i < ARGC; i++)
+ > print ARGV[i]
+ > }' inventory-shipped mail-list
+ -| awk
+ -| inventory-shipped
+ -| mail-list
+
+ 'ARGV[0]' contains 'awk', 'ARGV[1]' contains 'inventory-shipped',
+ and 'ARGV[2]' contains 'mail-list'. The value of 'ARGC' is three,
+ one more than the index of the last element in 'ARGV', because the
+ elements are numbered from zero.
+
+ The names 'ARGC' and 'ARGV', as well as the convention of indexing
+ the array from 0 to 'ARGC' - 1, are derived from the C language's
+ method of accessing command-line arguments.
+
+ The value of 'ARGV[0]' can vary from system to system. Also, you
+ should note that the program text is _not_ included in 'ARGV', nor
+ are any of 'awk''s command-line options. *Note ARGC and ARGV:: for
+ information about how 'awk' uses these variables. (d.c.)
+
+'ARGIND #'
+ The index in 'ARGV' of the current file being processed. Every
+ time 'gawk' opens a new data file for processing, it sets 'ARGIND'
+ to the index in 'ARGV' of the file name. When 'gawk' is processing
+ the input files, 'FILENAME == ARGV[ARGIND]' is always true.
+
+ This variable is useful in file processing; it allows you to tell
+ how far along you are in the list of data files as well as to
+ distinguish between successive instances of the same file name on
+ the command line.
+
+ While you can change the value of 'ARGIND' within your 'awk'
+ program, 'gawk' automatically sets it to a new value when it opens
+ the next file.
+
+'ENVIRON'
+ An associative array containing the values of the environment. The
+ array indices are the environment variable names; the elements are
+ the values of the particular environment variables. For example,
+ 'ENVIRON["HOME"]' might be '/home/arnold'.
+
+ For POSIX 'awk', changing this array does not affect the
+ environment passed on to any programs that 'awk' may spawn via
+ redirection or the 'system()' function.
+
+ However, beginning with version 4.2, if not in POSIX compatibility
+ mode, 'gawk' does update its own environment when 'ENVIRON' is
+ changed, thus changing the environment seen by programs that it
+ creates. You should therefore be especially careful if you modify
+ 'ENVIRON["PATH"]', which is the search path for finding executable
+ programs.
+
+ This can also affect the running 'gawk' program, since some of the
+ built-in functions may pay attention to certain environment
+ variables. The most notable instance of this is 'mktime()' (*note
+     Time Functions::), which pays attention to the value of the 'TZ'
+ environment variable on many systems.
+
+ Some operating systems may not have environment variables. On such
+ systems, the 'ENVIRON' array is empty (except for
+ 'ENVIRON["AWKPATH"]' and 'ENVIRON["AWKLIBPATH"]'; *note AWKPATH
+ Variable:: and *note AWKLIBPATH Variable::).
+
+'ERRNO #'
+ If a system error occurs during a redirection for 'getline', during
+ a read for 'getline', or during a 'close()' operation, then 'ERRNO'
+ contains a string describing the error.
+
+ In addition, 'gawk' clears 'ERRNO' before opening each command-line
+ input file. This enables checking if the file is readable inside a
+ 'BEGINFILE' pattern (*note BEGINFILE/ENDFILE::).
+
+ Otherwise, 'ERRNO' works similarly to the C variable 'errno'.
+ Except for the case just mentioned, 'gawk' _never_ clears it (sets
+ it to zero or '""'). Thus, you should only expect its value to be
+ meaningful when an I/O operation returns a failure value, such as
+ 'getline' returning -1. You are, of course, free to clear it
+ yourself before doing an I/O operation.
+
+ If the value of 'ERRNO' corresponds to a system error in the C
+ 'errno' variable, then 'PROCINFO["errno"]' will be set to the value
+ of 'errno'. For non-system errors, 'PROCINFO["errno"]' will be
+ zero.
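+
+     For example, this sketch (the file name 'data.txt' is made up)
+     reports why a 'getline' redirection failed:
+
+          BEGIN {
+              if ((getline line < "data.txt") < 0)
+                  print "cannot read data.txt:", ERRNO > "/dev/stderr"
+          }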
+
+'FILENAME'
+ The name of the current input file. When no data files are listed
+ on the command line, 'awk' reads from the standard input and
+ 'FILENAME' is set to '"-"'. 'FILENAME' changes each time a new
+ file is read (*note Reading Files::). Inside a 'BEGIN' rule, the
+ value of 'FILENAME' is '""', because there are no input files being
+ processed yet.(1) (d.c.) Note, though, that using 'getline'
+ (*note Getline::) inside a 'BEGIN' rule can give 'FILENAME' a
+ value.
+
+'FNR'
+ The current record number in the current file. 'awk' increments
+ 'FNR' each time it reads a new record (*note Records::). 'awk'
+ resets 'FNR' to zero each time it starts a new input file.
+
+'NF'
+ The number of fields in the current input record. 'NF' is set each
+ time a new record is read, when a new field is created, or when
+ '$0' changes (*note Fields::).
+
+ Unlike most of the variables described in this node, assigning a
+ value to 'NF' has the potential to affect 'awk''s internal
+ workings. In particular, assignments to 'NF' can be used to create
+ fields in or remove fields from the current record. *Note Changing
+ Fields::.
+
+'FUNCTAB #'
+ An array whose indices and corresponding values are the names of
+ all the built-in, user-defined, and extension functions in the
+ program.
+
+ NOTE: Attempting to use the 'delete' statement with the
+ 'FUNCTAB' array causes a fatal error. Any attempt to assign
+ to an element of 'FUNCTAB' also causes a fatal error.
+
+'NR'
+ The number of input records 'awk' has processed since the beginning
+ of the program's execution (*note Records::). 'awk' increments
+ 'NR' each time it reads a new record.
+
+'PROCINFO #'
+ The elements of this array provide access to information about the
+ running 'awk' program. The following elements (listed
+ alphabetically) are guaranteed to be available:
+
+ 'PROCINFO["egid"]'
+ The value of the 'getegid()' system call.
+
+ 'PROCINFO["errno"]'
+ The value of the C 'errno' variable when 'ERRNO' is set to the
+ associated error message.
+
+ 'PROCINFO["euid"]'
+ The value of the 'geteuid()' system call.
+
+ 'PROCINFO["FS"]'
+ This is '"FS"' if field splitting with 'FS' is in effect,
+ '"FIELDWIDTHS"' if field splitting with 'FIELDWIDTHS' is in
+ effect, or '"FPAT"' if field matching with 'FPAT' is in
+ effect.
+
+ 'PROCINFO["gid"]'
+ The value of the 'getgid()' system call.
+
+ 'PROCINFO["identifiers"]'
+ A subarray, indexed by the names of all identifiers used in
+ the text of the 'awk' program. An "identifier" is simply the
+ name of a variable (be it scalar or array), built-in function,
+ user-defined function, or extension function. For each
+ identifier, the value of the element is one of the following:
+
+ '"array"'
+ The identifier is an array.
+
+ '"builtin"'
+ The identifier is a built-in function.
+
+ '"extension"'
+ The identifier is an extension function loaded via
+ '@load' or '-l'.
+
+ '"scalar"'
+ The identifier is a scalar.
+
+ '"untyped"'
+ The identifier is untyped (could be used as a scalar or
+ an array; 'gawk' doesn't know yet).
+
+ '"user"'
+ The identifier is a user-defined function.
+
+ The values indicate what 'gawk' knows about the identifiers
+ after it has finished parsing the program; they are _not_
+ updated while the program runs.
+
+ 'PROCINFO["pgrpid"]'
+ The process group ID of the current process.
+
+ 'PROCINFO["pid"]'
+ The process ID of the current process.
+
+ 'PROCINFO["ppid"]'
+ The parent process ID of the current process.
+
+ 'PROCINFO["strftime"]'
+ The default time format string for 'strftime()'. Assigning a
+ new value to this element changes the default. *Note Time
+ Functions::.
+
+ 'PROCINFO["uid"]'
+ The value of the 'getuid()' system call.
+
+ 'PROCINFO["version"]'
+ The version of 'gawk'.
+
+ The following additional elements in the array are available to
+ provide information about the MPFR and GMP libraries if your
+ version of 'gawk' supports arbitrary-precision arithmetic (*note
+ Arbitrary Precision Arithmetic::):
+
+ 'PROCINFO["gmp_version"]'
+ The version of the GNU MP library.
+
+ 'PROCINFO["mpfr_version"]'
+ The version of the GNU MPFR library.
+
+ 'PROCINFO["prec_max"]'
+ The maximum precision supported by MPFR.
+
+ 'PROCINFO["prec_min"]'
+ The minimum precision required by MPFR.
+
+ The following additional elements in the array are available to
+ provide information about the version of the extension API, if your
+ version of 'gawk' supports dynamic loading of extension functions
+ (*note Dynamic Extensions::):
+
+ 'PROCINFO["api_major"]'
+ The major version of the extension API.
+
+ 'PROCINFO["api_minor"]'
+ The minor version of the extension API.
+
+ On some systems, there may be elements in the array, '"group1"'
+ through '"groupN"' for some N. N is the number of supplementary
+ groups that the process has. Use the 'in' operator to test for
+ these elements (*note Reference to Elements::).
+
+ The following elements allow you to change 'gawk''s behavior:
+
+ 'PROCINFO["NONFATAL"]'
+ If this element exists, then I/O errors for all output
+ redirections become nonfatal. *Note Nonfatal::.
+
+ 'PROCINFO["OUTPUT_NAME", "NONFATAL"]'
+ Make output errors for OUTPUT_NAME be nonfatal. *Note
+ Nonfatal::.
+
+ 'PROCINFO["COMMAND", "pty"]'
+ For two-way communication to COMMAND, use a pseudo-tty instead
+ of setting up a two-way pipe. *Note Two-way I/O:: for more
+ information.
+
+ 'PROCINFO["INPUT_NAME", "READ_TIMEOUT"]'
+ Set a timeout for reading from input redirection INPUT_NAME.
+ *Note Read Timeout:: for more information.
+
+ 'PROCINFO["sorted_in"]'
+ If this element exists in 'PROCINFO', its value controls the
+ order in which array indices will be processed by 'for (INDX
+ in ARRAY)' loops. This is an advanced feature, so we defer
+ the full description until later; see *note Scanning an
+ Array::.
+
+'RLENGTH'
+ The length of the substring matched by the 'match()' function
+ (*note String Functions::). 'RLENGTH' is set by invoking the
+ 'match()' function. Its value is the length of the matched string,
+ or -1 if no match is found.
+
+'RSTART'
+ The start index in characters of the substring that is matched by
+ the 'match()' function (*note String Functions::). 'RSTART' is set
+ by invoking the 'match()' function. Its value is the position of
+ the string where the matched substring starts, or zero if no match
+ was found.
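+
+     For example, this sketch (with made-up text) shows both variables
+     being set by 'match()':
+
+          BEGIN {
+              if (match("the quick brown fox", /quick/))
+                  print RSTART, RLENGTH     # prints "5 5"
+          }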
+
+'RT #'
+ The input text that matched the text denoted by 'RS', the record
+ separator. It is set every time a record is read.
+
+'SYMTAB #'
+ An array whose indices are the names of all defined global
+ variables and arrays in the program. 'SYMTAB' makes 'gawk''s
+ symbol table visible to the 'awk' programmer. It is built as
+ 'gawk' parses the program and is complete before the program starts
+ to run.
+
+ The array may be used for indirect access to read or write the
+ value of a variable:
+
+ foo = 5
+ SYMTAB["foo"] = 4
+ print foo # prints 4
+
+ The 'isarray()' function (*note Type Functions::) may be used to
+ test if an element in 'SYMTAB' is an array. Also, you may not use
+ the 'delete' statement with the 'SYMTAB' array.
+
+ You may use an index for 'SYMTAB' that is not a predefined
+ identifier:
+
+ SYMTAB["xxx"] = 5
+ print SYMTAB["xxx"]
+
+ This works as expected: in this case 'SYMTAB' acts just like a
+ regular array. The only difference is that you can't then delete
+ 'SYMTAB["xxx"]'.
+
+ The 'SYMTAB' array is more interesting than it looks. Andrew
+ Schorr points out that it effectively gives 'awk' data pointers.
+ Consider his example:
+
+ # Indirect multiply of any variable by amount, return result
+
+ function multiply(variable, amount)
+ {
+ return SYMTAB[variable] *= amount
+ }
+
+ You would use it like this:
+
+ BEGIN {
+ answer = 10.5
+ multiply("answer", 4)
+ print "The answer is", answer
+ }
+
+ When run, this produces:
+
+ $ gawk -f answer.awk
+ -| The answer is 42
+
+ NOTE: In order to avoid severe time-travel paradoxes,(2)
+ neither 'FUNCTAB' nor 'SYMTAB' is available as an element
+ within the 'SYMTAB' array.
+
+ Changing 'NR' and 'FNR'
+
+ 'awk' increments 'NR' and 'FNR' each time it reads a record, instead
+of setting them to the absolute value of the number of records read.
+This means that a program can change these variables and their new
+values are incremented for each record. (d.c.) The following example
+shows this:
+
+ $ echo '1
+ > 2
+ > 3
+ > 4' | awk 'NR == 2 { NR = 17 }
+ > { print NR }'
+ -| 1
+ -| 17
+ -| 18
+ -| 19
+
+Before 'FNR' was added to the 'awk' language (*note V7/SVR3.1::), many
+'awk' programs used this feature to track the number of records in a
+file by resetting 'NR' to zero when 'FILENAME' changed.
+
+ ---------- Footnotes ----------
+
+ (1) Some early implementations of Unix 'awk' initialized 'FILENAME'
+to '"-"', even if there were data files to be processed. This behavior
+was incorrect and should not be relied upon in your programs.
+
+ (2) Not to mention difficult implementation issues.
+
+
+File: gawk.info, Node: ARGC and ARGV, Prev: Auto-set, Up: Built-in Variables
+
+7.5.3 Using 'ARGC' and 'ARGV'
+-----------------------------
+
+*note Auto-set:: presented the following program describing the
+information contained in 'ARGC' and 'ARGV':
+
+ $ awk 'BEGIN {
+ > for (i = 0; i < ARGC; i++)
+ > print ARGV[i]
+ > }' inventory-shipped mail-list
+ -| awk
+ -| inventory-shipped
+ -| mail-list
+
+In this example, 'ARGV[0]' contains 'awk', 'ARGV[1]' contains
+'inventory-shipped', and 'ARGV[2]' contains 'mail-list'. Notice that
+the 'awk' program is not entered in 'ARGV'. The other command-line
+options, with their arguments, are also not entered. This includes
+variable assignments done with the '-v' option (*note Options::).
+Normal variable assignments on the command line _are_ treated as
+arguments and do show up in the 'ARGV' array. Given the following
+program in a file named 'showargs.awk':
+
+ BEGIN {
+ printf "A=%d, B=%d\n", A, B
+ for (i = 0; i < ARGC; i++)
+ printf "\tARGV[%d] = %s\n", i, ARGV[i]
+ }
+ END { printf "A=%d, B=%d\n", A, B }
+
+Running it produces the following:
+
+ $ awk -v A=1 -f showargs.awk B=2 /dev/null
+ -| A=1, B=0
+ -| ARGV[0] = awk
+ -| ARGV[1] = B=2
+ -| ARGV[2] = /dev/null
+ -| A=1, B=2
+
+ A program can alter 'ARGC' and the elements of 'ARGV'. Each time
+'awk' reaches the end of an input file, it uses the next element of
+'ARGV' as the name of the next input file. By storing a different
+string there, a program can change which files are read. Use '"-"' to
+represent the standard input. Storing additional elements and
+incrementing 'ARGC' causes additional files to be read.
+
+ If the value of 'ARGC' is decreased, that eliminates input files from
+the end of the list. By recording the old value of 'ARGC' elsewhere, a
+program can treat the eliminated arguments as something other than file
+names.
+
+ To eliminate a file from the middle of the list, store the null
+string ('""') into 'ARGV' in place of the file's name. As a special
+feature, 'awk' ignores file names that have been replaced with the null
+string. Another option is to use the 'delete' statement to remove
+elements from 'ARGV' (*note Delete::).
+
+ All of these actions are typically done in the 'BEGIN' rule, before
+actual processing of the input begins. *Note Split Program:: and *note
+Tee Program:: for examples of each way of removing elements from 'ARGV'.
+
+ To actually get options into an 'awk' program, end the 'awk' options
+with '--' and then supply the 'awk' program's options, in the following
+manner:
+
+ awk -f myprog.awk -- -v -q file1 file2 ...
+
+ The following fragment processes 'ARGV' in order to examine, and then
+remove, the previously mentioned command-line options:
+
+ BEGIN {
+ for (i = 1; i < ARGC; i++) {
+ if (ARGV[i] == "-v")
+ verbose = 1
+ else if (ARGV[i] == "-q")
+ debug = 1
+ else if (ARGV[i] ~ /^-./) {
+ e = sprintf("%s: unrecognized option -- %c",
+ ARGV[0], substr(ARGV[i], 2, 1))
+ print e > "/dev/stderr"
+ } else
+ break
+ delete ARGV[i]
+ }
+ }
+
+ Ending the 'awk' options with '--' isn't necessary in 'gawk'. Unless
+'--posix' has been specified, 'gawk' silently puts any unrecognized
+options into 'ARGV' for the 'awk' program to deal with. As soon as it
+sees an unknown option, 'gawk' stops looking for other options that it
+might otherwise recognize. The previous command line with 'gawk' would
+be:
+
+ gawk -f myprog.awk -q -v file1 file2 ...
+
+Because '-q' is not a valid 'gawk' option, it and the following '-v' are
+passed on to the 'awk' program. (*Note Getopt Function:: for an 'awk'
+library function that parses command-line options.)
+
+ When designing your program, you should choose options that don't
+conflict with 'gawk''s, because it will process any options that it
+accepts before passing the rest of the command line on to your program.
+Using '#!' with the '-E' option may help (*note Executable Scripts:: and
+*note Options::).
+
+
+File: gawk.info, Node: Pattern Action Summary, Prev: Built-in Variables, Up: Patterns and Actions
+
+7.6 Summary
+===========
+
+ * Pattern-action pairs make up the basic elements of an 'awk'
+ program. Patterns are either normal expressions, range
+ expressions, or regexp constants; one of the special keywords
+ 'BEGIN', 'END', 'BEGINFILE', or 'ENDFILE'; or empty. The action
+ executes if the current record matches the pattern. Empty
+ (missing) patterns match all records.
+
+ * I/O from 'BEGIN' and 'END' rules has certain constraints. This is
+ also true, only more so, for 'BEGINFILE' and 'ENDFILE' rules. The
+ latter two give you "hooks" into 'gawk''s file processing, allowing
+ you to recover from a file that otherwise would cause a fatal error
+ (such as a file that cannot be opened).
+
+ * Shell variables can be used in 'awk' programs by careful use of
+ shell quoting. It is easier to pass a shell variable into 'awk' by
+ using the '-v' option and an 'awk' variable.
+
+ * Actions consist of statements enclosed in curly braces. Statements
+ are built up from expressions, control statements, compound
+ statements, input and output statements, and deletion statements.
+
+ * The control statements in 'awk' are 'if'-'else', 'while', 'for',
+ and 'do'-'while'. 'gawk' adds the 'switch' statement. There are
+ two flavors of 'for' statement: one for performing general looping,
+ and the other for iterating through an array.
+
+ * 'break' and 'continue' let you exit early or start the next
+ iteration of a loop (or get out of a 'switch').
+
+ * 'next' and 'nextfile' let you read the next record and start over
+ at the top of your program or skip to the next input file and start
+ over, respectively.
+
+ * The 'exit' statement terminates your program. When executed from
+ an action (or function body), it transfers control to the 'END'
+ statements. From an 'END' statement body, it exits immediately.
+ You may pass an optional numeric value to be used as 'awk''s exit
+ status.
+
+ * Some predefined variables provide control over 'awk', mainly for
+ I/O. Other variables convey information from 'awk' to your program.
+
+ * 'ARGC' and 'ARGV' make the command-line arguments available to your
+ program. Manipulating them from a 'BEGIN' rule lets you control
+ how 'awk' will process the provided data files.
+
+
+File: gawk.info, Node: Arrays, Next: Functions, Prev: Patterns and Actions, Up: Top
+
+8 Arrays in 'awk'
+*****************
+
+An "array" is a table of values called "elements". The elements of an
+array are distinguished by their "indices". Indices may be either
+numbers or strings.
+
+ This major node describes how arrays work in 'awk', how to use array
+elements, how to scan through every element in an array, and how to
+remove array elements. It also describes how 'awk' simulates
+multidimensional arrays, as well as some of the less obvious points
+about array usage. The major node moves on to discuss 'gawk''s facility
+for sorting arrays, and ends with a brief description of 'gawk''s
+ability to support true arrays of arrays.
+
+* Menu:
+
+* Array Basics:: The basics of arrays.
+* Numeric Array Subscripts:: How to use numbers as subscripts in
+ 'awk'.
+* Uninitialized Subscripts:: Using Uninitialized variables as subscripts.
+* Delete:: The 'delete' statement removes an element
+ from an array.
+* Multidimensional:: Emulating multidimensional arrays in
+ 'awk'.
+* Arrays of Arrays:: True multidimensional arrays.
+* Arrays Summary:: Summary of arrays.
+
+
+File: gawk.info, Node: Array Basics, Next: Numeric Array Subscripts, Up: Arrays
+
+8.1 The Basics of Arrays
+========================
+
+This minor node presents the basics: working with elements in arrays one
+at a time, and traversing all of the elements in an array.
+
+* Menu:
+
+* Array Intro:: Introduction to Arrays
+* Reference to Elements:: How to examine one element of an array.
+* Assigning Elements:: How to change an element of an array.
+* Array Example:: Basic Example of an Array
+* Scanning an Array:: A variation of the 'for' statement. It
+ loops through the indices of an array's
+ existing elements.
+* Controlling Scanning:: Controlling the order in which arrays are
+ scanned.
+
+
+File: gawk.info, Node: Array Intro, Next: Reference to Elements, Up: Array Basics
+
+8.1.1 Introduction to Arrays
+----------------------------
+
+ Doing linear scans over an associative array is like trying to club
+ someone to death with a loaded Uzi.
+ -- _Larry Wall_
+
+ The 'awk' language provides one-dimensional arrays for storing groups
+of related strings or numbers. Every 'awk' array must have a name.
+Array names have the same syntax as variable names; any valid variable
+name would also be a valid array name. But one name cannot be used in
+both ways (as an array and as a variable) in the same 'awk' program.
+
+ Arrays in 'awk' superficially resemble arrays in other programming
+languages, but there are fundamental differences. In 'awk', it isn't
+necessary to specify the size of an array before starting to use it.
+Additionally, any number or string, not just consecutive integers, may
+be used as an array index.
+
+ In most other languages, arrays must be "declared" before use,
+including a specification of how many elements or components they
+contain. In such languages, the declaration causes a contiguous block
+of memory to be allocated for that many elements. Usually, an index in
+the array must be a nonnegative integer. For example, the index zero
+specifies the first element in the array, which is actually stored at
+the beginning of the block of memory. Index one specifies the second
+element, which is stored in memory right after the first element, and so
+on. It is impossible to add more elements to the array, because it has
+room only for as many elements as given in the declaration. (Some
+languages allow arbitrary starting and ending indices--e.g., '15 ..
+27'--but the size of the array is still fixed when the array is
+declared.)
+
+ A contiguous array of four elements might look like *note Figure 8.1:
+figure-array-elements, conceptually, if the element values are eight,
+'"foo"', '""', and 30.
+
+
++---------+---------+--------+---------+
+|    8    |  "foo"  |   ""   |   30    |    Value
++---------+---------+--------+---------+
+     0         1        2         3        Index
+
+Figure 8.1: A contiguous array
+
+Only the values are stored; the indices are implicit from the order of
+the values. Here, eight is the value at index zero, because eight
+appears in the position with zero elements before it.
+
+ Arrays in 'awk' are different--they are "associative". This means
+that each array is a collection of pairs--an index and its corresponding
+array element value:
+
+ Index Value
+------------------------
+ '3' '30'
+ '1' '"foo"'
+ '0' '8'
+ '2' '""'
+
+The pairs are shown in jumbled order because their order is
+irrelevant.(1)
+
+ One advantage of associative arrays is that new pairs can be added at
+any time. For example, suppose a tenth element is added to the array
+whose value is '"number ten"'. The result is:
+
+ Index Value
+-------------------------------
+     '10' '"number ten"'
+ '3' '30'
+ '1' '"foo"'
+ '0' '8'
+ '2' '""'
+
+Now the array is "sparse", which just means some indices are missing.
+It has elements 0-3 and 10, but doesn't have elements 4, 5, 6, 7, 8, or
+9.
+
+ Another consequence of associative arrays is that the indices don't
+have to be nonnegative integers. Any number, or even a string, can be
+an index. For example, the following is an array that translates words
+from English to French:
+
+ Index Value
+------------------------
+ '"dog"' '"chien"'
+ '"cat"' '"chat"'
+ '"one"' '"un"'
+ '1' '"un"'
+
+Here we decided to translate the number one in both spelled-out and
+numeric form--thus illustrating that a single array can have both
+numbers and strings as indices. (In fact, array subscripts are always
+strings. There are some subtleties to how numbers work when used as
+array subscripts; this is discussed in more detail in *note Numeric
+Array Subscripts::.) Here, the number '1' isn't double-quoted, because
+'awk' automatically converts it to a string.
+
+ The value of 'IGNORECASE' has no effect upon array subscripting. The
+identical string value used to store an array element must be used to
+retrieve it. When 'awk' creates an array (e.g., with the 'split()'
+built-in function), that array's indices are consecutive integers
+starting at one. (*Note String Functions::.)
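+
+ For example, in this small sketch (ours, not one of the manual's
+standard examples), 'split()' produces indices 1, 2, and 3:
+
+     BEGIN {
+         n = split("Jan Feb Mar", month, " ")
+         print n, month[1], month[3]      # prints "3 Jan Mar"
+     }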
+
+ 'awk''s arrays are efficient--the time to access an element is
+independent of the number of elements in the array.
+
+ ---------- Footnotes ----------
+
+ (1) The ordering will vary among 'awk' implementations, which
+typically use hash tables to store array elements and values.
+
+
+File: gawk.info, Node: Reference to Elements, Next: Assigning Elements, Prev: Array Intro, Up: Array Basics
+
+8.1.2 Referring to an Array Element
+-----------------------------------
+
+The principal way to use an array is to refer to one of its elements.
+An "array reference" is an expression as follows:
+
+ ARRAY[INDEX-EXPRESSION]
+
+Here, ARRAY is the name of an array. The expression INDEX-EXPRESSION is
+the index of the desired element of the array.
+
+ The value of the array reference is the current value of that array
+element. For example, 'foo[4.3]' is an expression referencing the
+element of array 'foo' at index '4.3'.
+
+ A reference to an array element that has no recorded value yields a
+value of '""', the null string. This includes elements that have not
+been assigned any value as well as elements that have been deleted
+(*note Delete::).
+
+ NOTE: A reference to an element that does not exist _automatically_
+ creates that array element, with the null string as its value. (In
+ some cases, this is unfortunate, because it might waste memory
+ inside 'awk'.)
+
+ Novice 'awk' programmers often make the mistake of checking if an
+ element exists by checking if the value is empty:
+
+ # Check if "foo" exists in a: Incorrect!
+ if (a["foo"] != "") ...
+
+ This is incorrect for two reasons. First, it _creates_ 'a["foo"]'
+ if it didn't exist before! Second, it is valid (if a bit unusual)
+ to set an array element equal to the empty string.
+
+ To determine whether an element exists in an array at a certain
+index, use the following expression:
+
+ INDX in ARRAY
+
+This expression tests whether the particular index INDX exists, without
+the side effect of creating that element if it is not present. The
+expression has the value one (true) if 'ARRAY[INDX]' exists and zero
+(false) if it does not exist. (We use INDX here, because 'index' is the
+name of a built-in function.) For example, this statement tests whether
+the array 'frequencies' contains the index '2':
+
+ if (2 in frequencies)
+ print "Subscript 2 is present."
+
+ Note that this is _not_ a test of whether the array 'frequencies'
+contains an element whose _value_ is two. There is no way to do that
+except to scan all the elements. Also, this _does not_ create
+'frequencies[2]', while the following (incorrect) alternative does:
+
+ if (frequencies[2] != "")
+ print "Subscript 2 is present."
+
+
+File: gawk.info, Node: Assigning Elements, Next: Array Example, Prev: Reference to Elements, Up: Array Basics
+
+8.1.3 Assigning Array Elements
+------------------------------
+
+Array elements can be assigned values just like 'awk' variables:
+
+ ARRAY[INDEX-EXPRESSION] = VALUE
+
+ARRAY is the name of an array. The expression INDEX-EXPRESSION is the
+index of the element of the array that is assigned a value. The
+expression VALUE is the value to assign to that element of the array.
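+
+ For example, this one-rule sketch (an illustration of our own) stores
+every input record in an array, indexed by its record number:
+
+     { data[NR] = $0 }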
+
+
+File: gawk.info, Node: Array Example, Next: Scanning an Array, Prev: Assigning Elements, Up: Array Basics
+
+8.1.4 Basic Array Example
+-------------------------
+
+The following program takes a list of lines, each beginning with a line
+number, and prints them out in order of line number. The line numbers
+are not in order when they are first read--instead, they are scrambled.
+This program sorts the lines by making an array using the line numbers
+as subscripts. The program then prints out the lines in sorted order of
+their numbers. It is a very simple program and gets confused upon
+encountering repeated numbers, gaps, or lines that don't begin with a
+number:
+
+ {
+ if ($1 > max)
+ max = $1
+ arr[$1] = $0
+ }
+
+ END {
+ for (x = 1; x <= max; x++)
+ print arr[x]
+ }
+
+ The first rule keeps track of the largest line number seen so far; it
+also stores each line into the array 'arr', at an index that is the
+line's number. The second rule runs after all the input has been read,
+to print out all the lines. When this program is run with the following
+input:
+
+ 5 I am the Five man
+ 2 Who are you? The new number two!
+ 4 . . . And four on the floor
+ 1 Who is number one?
+ 3 I three you.
+
+Its output is:
+
+ 1 Who is number one?
+ 2 Who are you? The new number two!
+ 3 I three you.
+ 4 . . . And four on the floor
+ 5 I am the Five man
+
+ If a line number is repeated, the last line with a given number
+overrides the others. Gaps in the line numbers can be handled with an
+easy improvement to the program's 'END' rule, as follows:
+
+ END {
+ for (x = 1; x <= max; x++)
+ if (x in arr)
+ print arr[x]
+ }
+
+
+File: gawk.info, Node: Scanning an Array, Next: Controlling Scanning, Prev: Array Example, Up: Array Basics
+
+8.1.5 Scanning All Elements of an Array
+---------------------------------------
+
+In programs that use arrays, it is often necessary to use a loop that
+executes once for each element of an array. In other languages, where
+arrays are contiguous and indices are limited to nonnegative integers,
+this is easy: all the valid indices can be found by counting from the
+lowest index up to the highest. This technique won't do the job in
+'awk', because any number or string can be an array index. So 'awk' has
+a special kind of 'for' statement for scanning an array:
+
+ for (VAR in ARRAY)
+ BODY
+
+This loop executes BODY once for each index in ARRAY that the program
+has previously used, with the variable VAR set to that index.
+
+ The following program uses this form of the 'for' statement. The
+first rule scans the input records and notes which words appear (at
+least once) in the input, by storing a one into the array 'used' with
+the word as the index. The second rule scans the elements of 'used' to
+find all the distinct words that appear in the input. It prints each
+word that is more than 10 characters long and also prints the number of
+such words. *Note String Functions:: for more information on the
+built-in function 'length()'.
+
+ # Record a 1 for each word that is used at least once
+ {
+ for (i = 1; i <= NF; i++)
+ used[$i] = 1
+ }
+
+ # Find number of distinct words more than 10 characters long
+ END {
+ for (x in used) {
+ if (length(x) > 10) {
+ ++num_long_words
+ print x
+ }
+ }
+ print num_long_words, "words longer than 10 characters"
+ }
+
+*Note Word Sorting:: for a more detailed example of this type.
+
+ The order in which elements of the array are accessed by this
+statement is determined by the internal arrangement of the array
+elements within 'awk' and in standard 'awk' cannot be controlled or
+changed. This can lead to problems if new elements are added to ARRAY
+by statements in the loop body; it is not predictable whether the 'for'
+loop will reach them. Similarly, changing VAR inside the loop may
+produce strange results. It is best to avoid such things.
+
+ As a point of information, 'gawk' sets up the list of elements to be
+iterated over before the loop starts, and does not change it. But not
+all 'awk' versions do so. Consider this program, named 'loopcheck.awk':
+
+ BEGIN {
+ a["here"] = "here"
+ a["is"] = "is"
+ a["a"] = "a"
+ a["loop"] = "loop"
+ for (i in a) {
+ j++
+ a[j] = j
+ print i
+ }
+ }
+
+ Here is what happens when run with 'gawk' (and 'mawk'):
+
+ $ gawk -f loopcheck.awk
+ -| here
+ -| loop
+ -| a
+ -| is
+
+ Contrast this to BWK 'awk':
+
+ $ nawk -f loopcheck.awk
+ -| loop
+ -| here
+ -| is
+ -| a
+ -| 1
+
+
+File: gawk.info, Node: Controlling Scanning, Prev: Scanning an Array, Up: Array Basics
+
+8.1.6 Using Predefined Array Scanning Orders with 'gawk'
+--------------------------------------------------------
+
+This node describes a feature that is specific to 'gawk'.
+
+ By default, when a 'for' loop traverses an array, the order is
+undefined, meaning that the 'awk' implementation determines the order in
+which the array is traversed. This order is usually based on the
+internal implementation of arrays and will vary from one version of
+'awk' to the next.
+
+ Often, though, you may wish to do something simple, such as "traverse
+the array by comparing the indices in ascending order," or "traverse the
+array by comparing the values in descending order." 'gawk' provides two
+mechanisms that give you this control:
+
+ * Set 'PROCINFO["sorted_in"]' to one of a set of predefined values.
+ We describe this now.
+
+ * Set 'PROCINFO["sorted_in"]' to the name of a user-defined function
+ to use for comparison of array elements. This advanced feature is
+ described later in *note Array Sorting::.
+
+ The following special values for 'PROCINFO["sorted_in"]' are
+available:
+
+'"@unsorted"'
+ Array elements are processed in arbitrary order, which is the
+ default 'awk' behavior.
+
+'"@ind_str_asc"'
+ Order by indices in ascending order compared as strings; this is
+ the most basic sort. (Internally, array indices are always
+ strings, so with 'a[2*5] = 1' the index is '"10"' rather than
+ numeric 10.)
+
+'"@ind_num_asc"'
+ Order by indices in ascending order but force them to be treated as
+ numbers in the process. Any index with a non-numeric value will
+ end up positioned as if it were zero.
+
+'"@val_type_asc"'
+ Order by element values in ascending order (rather than by
+ indices). Ordering is by the type assigned to the element (*note
+ Typing and Comparison::). All numeric values come before all
+ string values, which in turn come before all subarrays. (Subarrays
+ have not been described yet; *note Arrays of Arrays::.)
+
+'"@val_str_asc"'
+ Order by element values in ascending order (rather than by
+ indices). Scalar values are compared as strings. Subarrays, if
+ present, come out last.
+
+'"@val_num_asc"'
+ Order by element values in ascending order (rather than by
+ indices). Scalar values are compared as numbers. Subarrays, if
+ present, come out last. When numeric values are equal, the string
+ values are used to provide an ordering: this guarantees consistent
+ results across different versions of the C 'qsort()' function,(1)
+ which 'gawk' uses internally to perform the sorting.
+
+'"@ind_str_desc"'
+ Like '"@ind_str_asc"', but the string indices are ordered from high
+ to low.
+
+'"@ind_num_desc"'
+ Like '"@ind_num_asc"', but the numeric indices are ordered from
+ high to low.
+
+'"@val_type_desc"'
+ Like '"@val_type_asc"', but the element values, based on type, are
+ ordered from high to low. Subarrays, if present, come out first.
+
+'"@val_str_desc"'
+ Like '"@val_str_asc"', but the element values, treated as strings,
+ are ordered from high to low. Subarrays, if present, come out
+ first.
+
+'"@val_num_desc"'
+ Like '"@val_num_asc"', but the element values, treated as numbers,
+ are ordered from high to low. Subarrays, if present, come out
+ first.
+
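+ For example, the following (a brief sketch) visits elements from the
+smallest value to the largest:
+
+     $ gawk '
+     > BEGIN {
+     > PROCINFO["sorted_in"] = "@val_num_asc"
+     > a["x"] = 30; a["y"] = 10; a["z"] = 20
+     > for (i in a)
+     > print i, a[i]
+     > }'
+     -| y 10
+     -| z 20
+     -| x 30
+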
+ The array traversal order is determined before the 'for' loop starts
+to run. Changing 'PROCINFO["sorted_in"]' in the loop body does not
+affect the loop. For example:
+
+ $ gawk '
+ > BEGIN {
+ > a[4] = 4
+ > a[3] = 3
+ > for (i in a)
+ > print i, a[i]
+ > }'
+ -| 4 4
+ -| 3 3
+ $ gawk '
+ > BEGIN {
+ > PROCINFO["sorted_in"] = "@ind_str_asc"
+ > a[4] = 4
+ > a[3] = 3
+ > for (i in a)
+ > print i, a[i]
+ > }'
+ -| 3 3
+ -| 4 4
+
+ When sorting an array by element values, if a value happens to be a
+subarray then it is considered to be greater than any string or numeric
+value, regardless of what the subarray itself contains, and all
+subarrays are treated as being equal to each other. Their order
+relative to each other is determined by their index strings.
+
+ Here are some additional things to bear in mind about sorted array
+traversal:
+
+ * The value of 'PROCINFO["sorted_in"]' is global. That is, it
+ affects all array traversal 'for' loops. If you need to change it
+ within your own code, you should see if it's defined and save and
+ restore the value:
+
+ ...
+ if ("sorted_in" in PROCINFO) {
+ save_sorted = PROCINFO["sorted_in"]
+ PROCINFO["sorted_in"] = "@val_str_desc" # or whatever
+ }
+ ...
+ if (save_sorted)
+ PROCINFO["sorted_in"] = save_sorted
+
+ * As already mentioned, the default array traversal order is
+ represented by '"@unsorted"'. You can also get the default
+ behavior by assigning the null string to 'PROCINFO["sorted_in"]' or
+ by just deleting the '"sorted_in"' element from the 'PROCINFO'
+ array with the 'delete' statement. (The 'delete' statement hasn't
+ been described yet; *note Delete::.)
+
+ In addition, 'gawk' provides built-in functions for sorting arrays;
+see *note Array Sorting Functions::.
+
+ ---------- Footnotes ----------
+
+ (1) When two elements compare as equal, the C 'qsort()' function does
+not guarantee that they will maintain their original relative order
+after sorting. Using the string value to provide a unique ordering when
+the numeric values are equal ensures that 'gawk' behaves consistently
+across different environments.
+
+
+File: gawk.info, Node: Numeric Array Subscripts, Next: Uninitialized Subscripts, Prev: Array Basics, Up: Arrays
+
+8.2 Using Numbers to Subscript Arrays
+=====================================
+
+An important aspect to remember about arrays is that _array subscripts
+are always strings_. When a numeric value is used as a subscript, it is
+converted to a string value before being used for subscripting (*note
+Conversion::). This means that the value of the predefined variable
+'CONVFMT' can affect how your program accesses elements of an array.
+For example:
+
+ xyz = 12.153
+ data[xyz] = 1
+ CONVFMT = "%2.2f"
+ if (xyz in data)
+ printf "%s is in data\n", xyz
+ else
+ printf "%s is not in data\n", xyz
+
+This prints '12.15 is not in data'. The first statement gives 'xyz' a
+numeric value. Assigning to 'data[xyz]' subscripts 'data' with the
+string value '"12.153"' (using the default conversion value of
+'CONVFMT', '"%.6g"'). Thus, the array element 'data["12.153"]' is
+assigned the value one. The program then changes the value of
+'CONVFMT'. The test '(xyz in data)' generates a new string value from
+'xyz'--this time '"12.15"'--because the value of 'CONVFMT' only allows
+two significant digits. This test fails, because '"12.15"' is different
+from '"12.153"'.
+
+ According to the rules for conversions (*note Conversion::), integer
+values always convert to strings as integers, no matter what the value
+of 'CONVFMT' may happen to be. So the usual case of the following
+works:
+
+ for (i = 1; i <= maxsub; i++)
+ do something with array[i]
+
+ The "integer values always convert to strings as integers" rule has
+an additional consequence for array indexing. Octal and hexadecimal
+constants (*note Nondecimal-numbers::) are converted internally into
+numbers, and their original form is forgotten. This means, for example,
+that 'array[17]', 'array[021]', and 'array[0x11]' all refer to the same
+element!
+
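+ For instance, this small sketch stores a value using a hexadecimal
+constant and finds it again with decimal and octal constants:
+
+     $ gawk 'BEGIN { a[0x11] = "seventeen"; print a[17], (021 in a) }'
+     -| seventeen 1
+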
+ As with many things in 'awk', the majority of the time things work as
+you would expect them to. But it is useful to have a precise knowledge
+of the actual rules, as they can sometimes have a subtle effect on your
+programs.
+
+
+File: gawk.info, Node: Uninitialized Subscripts, Next: Delete, Prev: Numeric Array Subscripts, Up: Arrays
+
+8.3 Using Uninitialized Variables as Subscripts
+===============================================
+
+Suppose it's necessary to write a program to print the input data in
+reverse order. A reasonable attempt to do so (with some test data)
+might look like this:
+
+ $ echo 'line 1
+ > line 2
+ > line 3' | awk '{ l[lines] = $0; ++lines }
+ > END {
+ > for (i = lines - 1; i >= 0; i--)
+ > print l[i]
+ > }'
+ -| line 3
+ -| line 2
+
+ Unfortunately, the very first line of input data did not appear in
+the output!
+
+ Upon first glance, we would think that this program should have
+worked. The variable 'lines' is uninitialized, and uninitialized
+variables have the numeric value zero. So, 'awk' should have printed
+the value of 'l[0]'.
+
+ The issue here is that subscripts for 'awk' arrays are _always_
+strings. Uninitialized variables, when used as strings, have the value
+'""', not zero. Thus, 'line 1' ends up stored in 'l[""]'. The
+following version of the program works correctly:
+
+ { l[lines++] = $0 }
+ END {
+ for (i = lines - 1; i >= 0; i--)
+ print l[i]
+ }
+
+ Here, the '++' forces 'lines' to be numeric, thus making the "old
+value" numeric zero. This is then converted to '"0"' as the array
+subscript.
+
+ Even though it is somewhat unusual, the null string ('""') is a valid
+array subscript. (d.c.) 'gawk' warns about the use of the null string
+as a subscript if '--lint' is provided on the command line (*note
+Options::).
+
+
+File: gawk.info, Node: Delete, Next: Multidimensional, Prev: Uninitialized Subscripts, Up: Arrays
+
+8.4 The 'delete' Statement
+==========================
+
+To remove an individual element of an array, use the 'delete' statement:
+
+ delete ARRAY[INDEX-EXPRESSION]
+
+ Once an array element has been deleted, any value the element once
+had is no longer available. It is as if the element had never been
+referred to or been given a value. The following is an example of
+deleting elements in an array:
+
+ for (i in frequencies)
+ delete frequencies[i]
+
+This example removes all the elements from the array 'frequencies'.
+Once an element is deleted, a subsequent 'for' statement to scan the
+array does not report that element and using the 'in' operator to check
+for the presence of that element returns zero (i.e., false):
+
+ delete foo[4]
+ if (4 in foo)
+ print "This will never be printed"
+
+ It is important to note that deleting an element is _not_ the same as
+assigning it a null value (the empty string, '""'). For example:
+
+ foo[4] = ""
+ if (4 in foo)
+ print "This is printed, even though foo[4] is empty"
+
+ It is not an error to delete an element that does not exist.
+However, if '--lint' is provided on the command line (*note Options::),
+'gawk' issues a warning message when an element that is not in the array
+is deleted.
+
+ All the elements of an array may be deleted with a single statement
+by leaving off the subscript in the 'delete' statement, as follows:
+
+ delete ARRAY
+
+ Using this version of the 'delete' statement is about three times
+more efficient than the equivalent loop that deletes each element one at
+a time.
+
+ This form of the 'delete' statement is also supported by BWK 'awk'
+and 'mawk', as well as by a number of other implementations.
+
+ NOTE: For many years, using 'delete' without a subscript was a
+ common extension. In September 2012, it was accepted for inclusion
+ into the POSIX standard. See the Austin Group website
+ (http://austingroupbugs.net/view.php?id=544).
+
+ The following statement provides a portable but nonobvious way to
+clear out an array:(1)
+
+ split("", array)
+
+ The 'split()' function (*note String Functions::) clears out the
+target array first. This call asks it to split apart the null string.
+Because there is no data to split out, the function simply clears the
+array and then returns.
+
+ CAUTION: Deleting all the elements from an array does not change
+ its type; you cannot clear an array and then use the array's name
+ as a scalar (i.e., a regular variable). For example, the following
+ does not work:
+
+ a[1] = 3
+ delete a
+ a = 3
+
+ ---------- Footnotes ----------
+
+ (1) Thanks to Michael Brennan for pointing this out.
+
+
+File: gawk.info, Node: Multidimensional, Next: Arrays of Arrays, Prev: Delete, Up: Arrays
+
+8.5 Multidimensional Arrays
+===========================
+
+* Menu:
+
+* Multiscanning:: Scanning multidimensional arrays.
+
+A "multidimensional array" is an array in which an element is identified
+by a sequence of indices instead of a single index. For example, a
+two-dimensional array requires two indices. The usual way (in many
+languages, including 'awk') to refer to an element of a two-dimensional
+array named 'grid' is with 'grid[X,Y]'.
+
+ Multidimensional arrays are supported in 'awk' through concatenation
+of indices into one string. 'awk' converts the indices into strings
+(*note Conversion::) and concatenates them together, with a separator
+between them. This creates a single string that describes the values of
+the separate indices. The combined string is used as a single index
+into an ordinary, one-dimensional array. The separator used is the
+value of the built-in variable 'SUBSEP'.
+
+ For example, suppose we evaluate the expression 'foo[5,12] = "value"'
+when the value of 'SUBSEP' is '"@"'. The numbers 5 and 12 are converted
+to strings and concatenated with an '@' between them, yielding '"5@12"';
+thus, the array element 'foo["5@12"]' is set to '"value"'.
+
+ Once the element's value is stored, 'awk' has no record of whether it
+was stored with a single index or a sequence of indices. The two
+expressions 'foo[5,12]' and 'foo[5 SUBSEP 12]' are always equivalent.
+
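+ For example, with 'SUBSEP' set to '"@"' as above (a quick sketch):
+
+     $ gawk 'BEGIN { SUBSEP = "@"; foo[5,12] = "value"
+     > print foo[5 SUBSEP 12], foo["5@12"] }'
+     -| value value
+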
+ The default value of 'SUBSEP' is the string '"\034"', which contains
+a nonprinting character that is unlikely to appear in an 'awk' program
+or in most input data. The usefulness of choosing an unlikely character
+comes from the fact that index values that contain a string matching
+'SUBSEP' can lead to combined strings that are ambiguous. Suppose that
+'SUBSEP' is '"@"'; then 'foo["a@b", "c"]' and 'foo["a", "b@c"]' are
+indistinguishable because both are actually stored as 'foo["a@b@c"]'.
+
+ To test whether a particular index sequence exists in a
+multidimensional array, use the same operator ('in') that is used for
+single-dimensional arrays. Write the whole sequence of indices in
+parentheses, separated by commas, as the left operand:
+
+ if ((SUBSCRIPT1, SUBSCRIPT2, ...) in ARRAY)
+ ...
+
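+ For example, continuing with 'foo[5,12] = "value"' from earlier (a
+short sketch):
+
+     if ((5, 12) in foo)
+         print "foo[5,12] is present"
+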
+ Here is an example that treats its input as a two-dimensional array
+of fields; it rotates this array 90 degrees clockwise and prints the
+result. It assumes that all lines have the same number of elements:
+
+ {
+ if (max_nf < NF)
+ max_nf = NF
+ max_nr = NR
+ for (x = 1; x <= NF; x++)
+ vector[x, NR] = $x
+ }
+
+ END {
+ for (x = 1; x <= max_nf; x++) {
+ for (y = max_nr; y >= 1; --y)
+ printf("%s ", vector[x, y])
+ printf("\n")
+ }
+ }
+
+When given the input:
+
+ 1 2 3 4 5 6
+ 2 3 4 5 6 1
+ 3 4 5 6 1 2
+ 4 5 6 1 2 3
+
+the program produces the following output:
+
+ 4 3 2 1
+ 5 4 3 2
+ 6 5 4 3
+ 1 6 5 4
+ 2 1 6 5
+ 3 2 1 6
+
+
+File: gawk.info, Node: Multiscanning, Up: Multidimensional
+
+8.5.1 Scanning Multidimensional Arrays
+--------------------------------------
+
+There is no special 'for' statement for scanning a "multidimensional"
+array. There cannot be one, because, in truth, 'awk' does not have
+multidimensional arrays or elements--there is only a multidimensional
+_way of accessing_ an array.
+
+ However, if your program has an array that is always accessed as
+multidimensional, you can get the effect of scanning it by combining the
+scanning 'for' statement (*note Scanning an Array::) with the built-in
+'split()' function (*note String Functions::). It works in the
+following manner:
+
+ for (combined in array) {
+ split(combined, separate, SUBSEP)
+ ...
+ }
+
+This sets the variable 'combined' to each concatenated combined index in
+the array, and splits it into the individual indices by breaking it
+apart where the value of 'SUBSEP' appears. The individual indices then
+become the elements of the array 'separate'.
+
+ Thus, if a value is previously stored in 'array[1, "foo"]', then an
+element with index '"1\034foo"' exists in 'array'. (Recall that the
+default value of 'SUBSEP' is the character with code 034.) Sooner or
+later, the 'for' statement finds that index and does an iteration with
+the variable 'combined' set to '"1\034foo"'. Then the 'split()'
+function is called as follows:
+
+ split("1\034foo", separate, "\034")
+
+The result is to set 'separate[1]' to '"1"' and 'separate[2]' to
+'"foo"'. Presto! The original sequence of separate indices is
+recovered.
+
+
+File: gawk.info, Node: Arrays of Arrays, Next: Arrays Summary, Prev: Multidimensional, Up: Arrays
+
+8.6 Arrays of Arrays
+====================
+
+'gawk' goes beyond standard 'awk''s multidimensional array access and
+provides true arrays of arrays. Elements of a subarray are referred to
+by their own indices enclosed in square brackets, just like the elements
+of the main array. For example, the following creates a two-element
+subarray at index '1' of the main array 'a':
+
+ a[1][1] = 1
+ a[1][2] = 2
+
+ This simulates a true two-dimensional array. Each subarray element
+can contain another subarray as a value, which in turn can hold other
+arrays as well. In this way, you can create arrays of three or more
+dimensions. The indices can be any 'awk' expressions, including scalars
+separated by commas (i.e., a regular 'awk' simulated multidimensional
+subscript). So the following is valid in 'gawk':
+
+ a[1][3][1, "name"] = "barney"
+
+ Each subarray and the main array can be of different length. In
+fact, the elements of an array or its subarray do not all have to have
+the same type. This means that the main array and any of its subarrays
+can be nonrectangular, or jagged in structure. You can assign a scalar
+value to the index '4' of the main array 'a', even though 'a[1]' is
+itself an array and not a scalar:
+
+ a[4] = "An element in a jagged array"
+
+ The terms "dimension", "row", and "column" are meaningless when
+applied to such an array, but we will use "dimension" henceforth to
+imply the maximum number of indices needed to refer to an existing
+element. The type of any element that has already been assigned cannot
+be changed by assigning a value of a different type. You have to first
+delete the current element, which effectively makes 'gawk' forget about
+the element at that index:
+
+ delete a[4]
+ a[4][5][6][7] = "An element in a four-dimensional array"
+
+This removes the scalar value from index '4' and then inserts a
+three-level nested subarray containing a scalar. You can also delete an
+entire subarray or subarray of subarrays:
+
+ delete a[4][5]
+ a[4][5] = "An element in subarray a[4]"
+
+ But recall that you cannot delete the main array 'a' and then use it
+as a scalar.
+
+ The built-in functions that take array arguments can also be used
+with subarrays. For example, the following code fragment uses
+'length()' (*note String Functions::) to determine the number of
+elements in the main array 'a' and its subarrays:
+
+ print length(a), length(a[1]), length(a[1][3])
+
+This results in the following output for our main array 'a':
+
+     2 3 1
+
+The 'SUBSCRIPT in ARRAY' expression (*note Reference to Elements::)
+works similarly for both regular 'awk'-style arrays and arrays of
+arrays. For example, the tests '1 in a', '3 in a[1]', and '(1, "name")
+in a[1][3]' all evaluate to one (true) for our array 'a'.
+
+ The 'for (item in array)' statement (*note Scanning an Array::) can
+be nested to scan all the elements of an array of arrays if it is
+rectangular in structure. In order to print the contents (scalar
+values) of a two-dimensional array of arrays (i.e., in which each
+first-level element is itself an array, not necessarily of the same
+length), you could use the following code:
+
+ for (i in array)
+ for (j in array[i])
+ print array[i][j]
+
+ The 'isarray()' function (*note Type Functions::) lets you test if an
+array element is itself an array:
+
+     for (i in array) {
+         if (isarray(array[i])) {
+             for (j in array[i]) {
+                 print array[i][j]
+             }
+         }
+         else
+             print array[i]
+     }
+
+ If the structure of a jagged array of arrays is known in advance, you
+can often devise workarounds using control statements. For example, the
+following code prints the elements of our main array 'a':
+
+ for (i in a) {
+ for (j in a[i]) {
+ if (j == 3) {
+ for (k in a[i][j])
+ print a[i][j][k]
+ } else
+ print a[i][j]
+ }
+ }
+
+*Note Walking Arrays:: for a user-defined function that "walks" an
+arbitrarily dimensioned array of arrays.
+
+ Recall that a reference to an uninitialized array element yields a
+value of '""', the null string. This has one important implication when
+you intend to use a subarray as an argument to a function, as
+illustrated by the following example:
+
+ $ gawk 'BEGIN { split("a b c d", b[1]); print b[1][1] }'
+ error-> gawk: cmd. line:1: fatal: split: second argument is not an array
+
+ The way to work around this is to first force 'b[1]' to be an array
+by creating an arbitrary index:
+
+ $ gawk 'BEGIN { b[1][1] = ""; split("a b c d", b[1]); print b[1][1] }'
+ -| a
+
+
+File: gawk.info, Node: Arrays Summary, Prev: Arrays of Arrays, Up: Arrays
+
+8.7 Summary
+===========
+
+ * Standard 'awk' provides one-dimensional associative arrays (arrays
+ indexed by string values). All arrays are associative; numeric
+ indices are converted automatically to strings.
+
+ * Array elements are referenced as 'ARRAY[INDX]'. Referencing an
+ element creates it if it did not exist previously.
+
+ * The proper way to see if an array has an element with a given index
+ is to use the 'in' operator: 'INDX in ARRAY'.
+
+ * Use 'for (INDX in ARRAY) ...' to scan through all the individual
+ elements of an array. In the body of the loop, INDX takes on the
+ value of each element's index in turn.
+
+ * The order in which a 'for (INDX in ARRAY)' loop traverses an array
+ is undefined in POSIX 'awk' and varies among implementations.
+ 'gawk' lets you control the order by assigning special predefined
+ values to 'PROCINFO["sorted_in"]'.
+
+ * Use 'delete ARRAY[INDX]' to delete an individual element. To
+ delete all of the elements in an array, use 'delete ARRAY'. This
+ latter feature has been a common extension for many years and is
+ now standard, but may not be supported by all commercial versions
+ of 'awk'.
+
+ * Standard 'awk' simulates multidimensional arrays by separating
+ subscript values with commas. The values are concatenated into a
+ single string, separated by the value of 'SUBSEP'. The fact that
+ such a subscript was created in this way is not retained; thus,
+ changing 'SUBSEP' may have unexpected consequences. You can use
+ '(SUB1, SUB2, ...) in ARRAY' to see if such a multidimensional
+ subscript exists in ARRAY.
+
+ * 'gawk' provides true arrays of arrays. You use a separate set of
+ square brackets for each dimension in such an array:
+ 'data[row][col]', for example. Array elements may thus be either
+ scalar values (number or string) or other arrays.
+
+ * Use the 'isarray()' built-in function to determine if an array
+ element is itself a subarray.
+
+
+File: gawk.info, Node: Functions, Next: Library Functions, Prev: Arrays, Up: Top
+
+9 Functions
+***********
+
+This major node describes 'awk''s built-in functions, which fall into
+three categories: numeric, string, and I/O. 'gawk' provides additional
+groups of functions to work with values that represent time, do bit
+manipulation, sort arrays, provide type information, and
+internationalize and localize programs.
+
+ Besides the built-in functions, 'awk' has provisions for writing new
+functions that the rest of a program can use. The second half of this
+major node describes these "user-defined" functions. Finally, we
+explore indirect function calls, a 'gawk'-specific extension that lets
+you determine at runtime what function is to be called.
+
+* Menu:
+
+* Built-in:: Summarizes the built-in functions.
+* User-defined:: Describes User-defined functions in detail.
+* Indirect Calls:: Choosing the function to call at runtime.
+* Functions Summary:: Summary of functions.
+
+
+File: gawk.info, Node: Built-in, Next: User-defined, Up: Functions
+
+9.1 Built-in Functions
+======================
+
+"Built-in" functions are always available for your 'awk' program to
+call. This minor node defines all the built-in functions in 'awk'; some
+of these are mentioned in other minor nodes but are summarized here for
+your convenience.
+
+* Menu:
+
+* Calling Built-in:: How to call built-in functions.
+* Numeric Functions:: Functions that work with numbers, including
+ 'int()', 'sin()' and 'rand()'.
+* String Functions:: Functions for string manipulation, such as
+ 'split()', 'match()' and
+ 'sprintf()'.
+* I/O Functions:: Functions for files and shell commands.
+* Time Functions:: Functions for dealing with timestamps.
+* Bitwise Functions:: Functions for bitwise operations.
+* Type Functions:: Functions for type information.
+* I18N Functions:: Functions for string translation.
+
+
+File: gawk.info, Node: Calling Built-in, Next: Numeric Functions, Up: Built-in
+
+9.1.1 Calling Built-in Functions
+--------------------------------
+
+To call one of 'awk''s built-in functions, write the name of the
+function followed by arguments in parentheses. For example, 'atan2(y +
+z, 1)' is a call to the function 'atan2()' and has two arguments.
+
+ Whitespace is ignored between the built-in function name and the
+opening parenthesis, but nonetheless it is good practice to avoid using
+whitespace there. User-defined functions do not permit whitespace in
+this way, and it is easier to avoid mistakes by following a simple
+convention that always works--no whitespace after a function name.
+
+ Each built-in function accepts a certain number of arguments. In
+some cases, arguments can be omitted. The defaults for omitted
+arguments vary from function to function and are described under the
+individual functions. In some 'awk' implementations, extra arguments
+given to built-in functions are ignored. However, in 'gawk', it is a
+fatal error to give extra arguments to a built-in function.
+
+ When a function is called, expressions that create the function's
+actual parameters are evaluated completely before the call is performed.
+For example, in the following code fragment:
+
+ i = 4
+ j = sqrt(i++)
+
+the variable 'i' is incremented to the value five before 'sqrt()' is
+called with a value of four for its actual parameter. The order of
+evaluation of the expressions used for the function's parameters is
+undefined. Thus, avoid writing programs that assume that parameters are
+evaluated from left to right or from right to left. For example:
+
+ i = 5
+ j = atan2(++i, i *= 2)
+
+ If the order of evaluation is left to right, then 'i' first becomes
+six, and then 12, and 'atan2()' is called with the two arguments six and
+12. But if the order of evaluation is right to left, 'i' first becomes
+10, then 11, and 'atan2()' is called with the two arguments 11 and 10.
+
+
+File: gawk.info, Node: Numeric Functions, Next: String Functions, Prev: Calling Built-in, Up: Built-in
+
+9.1.2 Numeric Functions
+-----------------------
+
+The following list describes all of the built-in functions that work
+with numbers. Optional parameters are enclosed in square
+brackets ([ ]):
+
+'atan2(Y, X)'
+ Return the arctangent of 'Y / X' in radians. You can use 'pi =
+ atan2(0, -1)' to retrieve the value of pi.
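+
+     For example, checking that identity (a quick sketch):
+
+          $ gawk 'BEGIN { printf "%.5f\n", atan2(0, -1) }'
+          -| 3.14159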
+
+'cos(X)'
+ Return the cosine of X, with X in radians.
+
+'exp(X)'
+ Return the exponential of X ('e ^ X') or report an error if X is
+ out of range. The range of values X can have depends on your
+ machine's floating-point representation.
+
+'int(X)'
+ Return the nearest integer to X, located between X and zero and
+ truncated toward zero. For example, 'int(3)' is 3, 'int(3.9)' is
+ 3, 'int(-3.9)' is -3, and 'int(-3)' is -3 as well.
+
+'intdiv(NUMERATOR, DENOMINATOR, RESULT)'
+     Perform integer division, similar to the standard C 'div()'
+     function.  First, truncate 'numerator' and 'denominator' towards
+ zero, creating integer values. Clear the 'result' array, and then
+ set 'result["quotient"]' to the result of 'numerator /
+ denominator', truncated towards zero to an integer, and set
+ 'result["remainder"]' to the result of 'numerator % denominator',
+ truncated towards zero to an integer. This function is primarily
+ intended for use with arbitrary length integers; it avoids creating
+ MPFR arbitrary precision floating-point values (*note Arbitrary
+ Precision Integers::).
+
+ This function is a 'gawk' extension. It is not available in
+ compatibility mode (*note Options::).
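+
+     For example, assuming a version of 'gawk' that provides
+     'intdiv()', dividing seven by three might look like this (a
+     sketch):
+
+          $ gawk 'BEGIN { intdiv(7, 3, res)
+          > print res["quotient"], res["remainder"] }'
+          -| 2 1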
+
+'log(X)'
+ Return the natural logarithm of X, if X is positive; otherwise,
+ return 'NaN' ("not a number") on IEEE 754 systems. Additionally,
+     'gawk' prints a warning message when X is negative.
+
+'rand()'
+ Return a random number. The values of 'rand()' are uniformly
+ distributed between zero and one. The value could be zero but is
+ never one.(1)
+
+ Often random integers are needed instead. Following is a
+ user-defined function that can be used to obtain a random
+ nonnegative integer less than N:
+
+ function randint(n)
+ {
+ return int(n * rand())
+ }
+
+ The multiplication produces a random number greater than or equal
+ to zero and less than 'n'. Using 'int()', this result is made into
+ an integer between zero and 'n' - 1, inclusive.
+
+ The following example uses a similar function to produce random
+ integers between one and N. This program prints a new random
+ number for each input record:
+
+ # Function to roll a simulated die.
+ function roll(n) { return 1 + int(rand() * n) }
+
+ # Roll 3 six-sided dice and
+ # print total number of points.
+ {
+ printf("%d points\n", roll(6) + roll(6) + roll(6))
+ }
+
+ CAUTION: In most 'awk' implementations, including 'gawk',
+ 'rand()' starts generating numbers from the same starting
+ number, or "seed", each time you run 'awk'.(2) Thus, a
+ program generates the same results each time you run it. The
+ numbers are random within one 'awk' run but predictable from
+ run to run. This is convenient for debugging, but if you want
+ a program to do different things each time it is used, you
+ must change the seed to a value that is different in each run.
+ To do this, use 'srand()'.
+
+'sin(X)'
+ Return the sine of X, with X in radians.
+
+'sqrt(X)'
+ Return the positive square root of X. 'gawk' prints a warning
+ message if X is negative. Thus, 'sqrt(4)' is 2.
+
+'srand('[X]')'
+ Set the starting point, or seed, for generating random numbers to
+ the value X.
+
+ Each seed value leads to a particular sequence of random
+ numbers.(3) Thus, if the seed is set to the same value a second
+ time, the same sequence of random numbers is produced again.
+
+ CAUTION: Different 'awk' implementations use different
+ random-number generators internally. Don't expect the same
+ 'awk' program to produce the same series of random numbers
+ when executed by different versions of 'awk'.
+
+ If the argument X is omitted, as in 'srand()', then the current
+ date and time of day are used for a seed. This is the way to get
+ random numbers that are truly unpredictable.
+
+ The return value of 'srand()' is the previous seed. This makes it
+ easy to keep track of the seeds in case you need to consistently
+ reproduce sequences of random numbers.
+
+ POSIX does not specify the initial seed; it differs among 'awk'
+ implementations.
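+
+     Because 'srand()' returns the _previous_ seed, one way to record
+     the seed actually in use for a run (a sketch of a common idiom) is
+     to call it twice and then restore the value:
+
+          srand()          # seed from the current date and time
+          seed = srand()   # remember that seed; this also reseeds ...
+          srand(seed)      # ... so put the remembered seed back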
+
+ ---------- Footnotes ----------
+
+ (1) The C version of 'rand()' on many Unix systems is known to
+produce fairly poor sequences of random numbers. However, nothing
+requires that an 'awk' implementation use the C 'rand()' to implement
+the 'awk' version of 'rand()'. In fact, 'gawk' uses the BSD 'random()'
+function, which is considerably better than 'rand()', to produce random
+numbers.
+
+ (2) 'mawk' uses a different seed each time.
+
+ (3) Computer-generated random numbers really are not truly random.
+They are technically known as "pseudorandom". This means that although
+the numbers in a sequence appear to be random, you can in fact generate
+the same sequence of random numbers over and over again.
+
+
+File: gawk.info, Node: String Functions, Next: I/O Functions, Prev: Numeric Functions, Up: Built-in
+
+9.1.3 String-Manipulation Functions
+-----------------------------------
+
+The functions in this minor node look at or change the text of one or
+more strings.
+
+ 'gawk' understands locales (*note Locales::) and does all string
+processing in terms of _characters_, not _bytes_. This distinction is
+particularly important to understand for locales where one character may
+be represented by multiple bytes. Thus, for example, 'length()' returns
+the number of characters in a string, and not the number of bytes used
+to represent those characters. Similarly, 'index()' works with
+character indices, and not byte indices.
+
+ CAUTION: A number of functions deal with indices into strings. For
+ these functions, the first character of a string is at position
+ (index) one. This is different from C and the languages descended
+ from it, where the first character is at position zero. You need
+ to remember this when doing index calculations, particularly if you
+ are used to C.
+
+ In the following list, optional parameters are enclosed in square
+brackets ([ ]). Several functions perform string substitution; the full
+discussion is provided in the description of the 'sub()' function, which
+comes toward the end, because the list is presented alphabetically.
+
+ Those functions that are specific to 'gawk' are marked with a pound
+sign ('#'). They are not available in compatibility mode (*note
+Options::):
+
+* Menu:
+
+* Gory Details:: More than you want to know about '\' and
+ '&' with 'sub()', 'gsub()', and
+ 'gensub()'.
+
+'asort('SOURCE [',' DEST [',' HOW ] ]') #'
+'asorti('SOURCE [',' DEST [',' HOW ] ]') #'
+ These two functions are similar in behavior, so they are described
+ together.
+
+ NOTE: The following description ignores the third argument,
+ HOW, as it requires understanding features that we have not
+ discussed yet. Thus, the discussion here is a deliberate
+ simplification. (We do provide all the details later on; see
+ *note Array Sorting Functions:: for the full story.)
+
+ Both functions return the number of elements in the array SOURCE.
+ For 'asort()', 'gawk' sorts the values of SOURCE and replaces the
+ indices of the sorted values of SOURCE with sequential integers
+ starting with one. If the optional array DEST is specified, then
+ SOURCE is duplicated into DEST. DEST is then sorted, leaving the
+ indices of SOURCE unchanged.
+
+ When comparing strings, 'IGNORECASE' affects the sorting (*note
+ Array Sorting Functions::). If the SOURCE array contains subarrays
+ as values (*note Arrays of Arrays::), they will come last, after
+ all scalar values. Subarrays are _not_ recursively sorted.
+
+ For example, if the contents of 'a' are as follows:
+
+ a["last"] = "de"
+ a["first"] = "sac"
+ a["middle"] = "cul"
+
+ A call to 'asort()':
+
+ asort(a)
+
+ results in the following contents of 'a':
+
+ a[1] = "cul"
+ a[2] = "de"
+ a[3] = "sac"
+
+ The 'asorti()' function works similarly to 'asort()'; however, the
+ _indices_ are sorted, instead of the values. Thus, in the previous
+ example, starting with the same initial set of indices and values
+ in 'a', calling 'asorti(a)' would yield:
+
+ a[1] = "first"
+ a[2] = "last"
+ a[3] = "middle"
+
+'gensub(REGEXP, REPLACEMENT, HOW' [', TARGET']') #'
+ Search the target string TARGET for matches of the regular
+ expression REGEXP. If HOW is a string beginning with 'g' or 'G'
+ (short for "global"), then replace all matches of REGEXP with
+ REPLACEMENT. Otherwise, HOW is treated as a number indicating
+ which match of REGEXP to replace. If no TARGET is supplied, use
+ '$0'. It returns the modified string as the result of the function
+ and the original target string is _not_ changed.
+
+ 'gensub()' is a general substitution function. Its purpose is to
+ provide more features than the standard 'sub()' and 'gsub()'
+ functions.
+
+ 'gensub()' provides an additional feature that is not available in
+ 'sub()' or 'gsub()': the ability to specify components of a regexp
+ in the replacement text. This is done by using parentheses in the
+ regexp to mark the components and then specifying '\N' in the
+ replacement text, where N is a digit from 1 to 9. For example:
+
+ $ gawk '
+ > BEGIN {
+ > a = "abc def"
+ > b = gensub(/(.+) (.+)/, "\\2 \\1", "g", a)
+ > print b
+ > }'
+ -| def abc
+
+ As with 'sub()', you must type two backslashes in order to get one
+ into the string. In the replacement text, the sequence '\0'
+ represents the entire matched text, as does the character '&'.
+
+ The following example shows how you can use the third argument to
+ control which match of the regexp should be changed:
+
+ $ echo a b c a b c |
+ > gawk '{ print gensub(/a/, "AA", 2) }'
+ -| a b c AA b c
+
+ In this case, '$0' is the default target string. 'gensub()'
+ returns the new string as its result, which is passed directly to
+ 'print' for printing.
+
+ If the HOW argument is a string that does not begin with 'g' or
+ 'G', or if it is a number that is less than or equal to zero, only
+ one substitution is performed. If HOW is zero, 'gawk' issues a
+ warning message.
+
+ If REGEXP does not match TARGET, 'gensub()''s return value is the
+ original unchanged value of TARGET.
+
+'gsub(REGEXP, REPLACEMENT' [', TARGET']')'
+ Search TARGET for _all_ of the longest, leftmost, _nonoverlapping_
+ matching substrings it can find and replace them with REPLACEMENT.
+ The 'g' in 'gsub()' stands for "global," which means replace
+ everywhere. For example:
+
+ { gsub(/Britain/, "United Kingdom"); print }
+
+ replaces all occurrences of the string 'Britain' with 'United
+ Kingdom' for all input records.
+
+ The 'gsub()' function returns the number of substitutions made. If
+ the variable to search and alter (TARGET) is omitted, then the
+ entire input record ('$0') is used. As in 'sub()', the characters
+ '&' and '\' are special, and the third argument must be assignable.
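+
+     For example, the return value can be used to learn how many
+     replacements were made (a small sketch):
+
+          $ echo a b c a b c |
+          > gawk '{ n = gsub(/a/, "A"); print n, $0 }'
+          -| 2 A b c A b c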
+
+'index(IN, FIND)'
+ Search the string IN for the first occurrence of the string FIND,
+ and return the position in characters where that occurrence begins
+ in the string IN. Consider the following example:
+
+ $ awk 'BEGIN { print index("peanut", "an") }'
+ -| 3
+
+ If FIND is not found, 'index()' returns zero.
+
+ With BWK 'awk' and 'gawk', it is a fatal error to use a regexp
+ constant for FIND. Other implementations allow it, simply treating
+ the regexp constant as an expression meaning '$0 ~ /regexp/'.
+ (d.c.)
+
+'length('[STRING]')'
+ Return the number of characters in STRING. If STRING is a number,
+ the length of the digit string representing that number is
+ returned. For example, 'length("abcde")' is five. By contrast,
+ 'length(15 * 35)' works out to three. In this example, 15 * 35 =
+ 525, and 525 is then converted to the string '"525"', which has
+ three characters.
+
+ If no argument is supplied, 'length()' returns the length of '$0'.
+
+ NOTE: In older versions of 'awk', the 'length()' function
+ could be called without any parentheses. Doing so is
+ considered poor practice, although the 2008 POSIX standard
+ explicitly allows it, to support historical practice. For
+ programs to be maximally portable, always supply the
+ parentheses.
+
+ If 'length()' is called with a variable that has not been used,
+ 'gawk' forces the variable to be a scalar. Other implementations
+ of 'awk' leave the variable without a type. (d.c.) Consider:
+
+ $ gawk 'BEGIN { print length(x) ; x[1] = 1 }'
+ -| 0
+ error-> gawk: fatal: attempt to use scalar `x' as array
+
+ $ nawk 'BEGIN { print length(x) ; x[1] = 1 }'
+ -| 0
+
+ If '--lint' has been specified on the command line, 'gawk' issues a
+ warning about this.
+
+ With 'gawk' and several other 'awk' implementations, when given an
+ array argument, the 'length()' function returns the number of
+ elements in the array. (c.e.) This is less useful than it might
+ seem at first, as the array is not guaranteed to be indexed from
+ one to the number of elements in it. If '--lint' is provided on
+ the command line (*note Options::), 'gawk' warns that passing an
+ array argument is not portable. If '--posix' is supplied, using an
+ array argument is a fatal error (*note Arrays::).
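+
+     For example, with 'gawk' (a brief sketch):
+
+          $ gawk 'BEGIN { split("a b c d", arr); print length(arr) }'
+          -| 4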
+
+'match(STRING, REGEXP' [', ARRAY']')'
+ Search STRING for the longest, leftmost substring matched by the
+ regular expression REGEXP and return the character position (index)
+ at which that substring begins (one, if it starts at the beginning
+ of STRING). If no match is found, return zero.
+
+ The REGEXP argument may be either a regexp constant ('/'...'/') or
+ a string constant ('"'...'"'). In the latter case, the string is
+ treated as a regexp to be matched. *Note Computed Regexps:: for a
+ discussion of the difference between the two forms, and the
+ implications for writing your program correctly.
+
+ The order of the first two arguments is the opposite of most other
+ string functions that work with regular expressions, such as
+ 'sub()' and 'gsub()'. It might help to remember that for
+ 'match()', the order is the same as for the '~' operator: 'STRING ~
+ REGEXP'.
+
+ The 'match()' function sets the predefined variable 'RSTART' to the
+ index. It also sets the predefined variable 'RLENGTH' to the
+ length in characters of the matched substring. If no match is
+ found, 'RSTART' is set to zero, and 'RLENGTH' to -1.
+
+ For example:
+
+ {
+ if ($1 == "FIND")
+ regex = $2
+ else {
+ where = match($0, regex)
+ if (where != 0)
+ print "Match of", regex, "found at", where, "in", $0
+ }
+ }
+
+ This program looks for lines that match the regular expression
+ stored in the variable 'regex'. This regular expression can be
+ changed. If the first word on a line is 'FIND', 'regex' is changed
+ to be the second word on that line. Therefore, if given:
+
+ FIND ru+n
+ My program runs
+ but not very quickly
+ FIND Melvin
+ JF+KM
+ This line is property of Reality Engineering Co.
+ Melvin was here.
+
+ 'awk' prints:
+
+ Match of ru+n found at 12 in My program runs
+ Match of Melvin found at 1 in Melvin was here.
+
+ If ARRAY is present, it is cleared, and then the zeroth element of
+ ARRAY is set to the entire portion of STRING matched by REGEXP. If
+ REGEXP contains parentheses, the integer-indexed elements of ARRAY
+ are set to contain the portion of STRING matching the corresponding
+ parenthesized subexpression. For example:
+
+ $ echo foooobazbarrrrr |
+ > gawk '{ match($0, /(fo+).+(bar*)/, arr)
+ > print arr[1], arr[2] }'
+ -| foooo barrrrr
+
+ In addition, multidimensional subscripts are available providing
+ the start index and length of each matched subexpression:
+
+ $ echo foooobazbarrrrr |
+ > gawk '{ match($0, /(fo+).+(bar*)/, arr)
+ > print arr[1], arr[2]
+ > print arr[1, "start"], arr[1, "length"]
+ > print arr[2, "start"], arr[2, "length"]
+ > }'
+ -| foooo barrrrr
+ -| 1 5
+ -| 9 7
+
+     There may not be subscripts for the start and length for every
+ parenthesized subexpression, because they may not all have matched
+ text; thus, they should be tested for with the 'in' operator (*note
+ Reference to Elements::).
+
+ The ARRAY argument to 'match()' is a 'gawk' extension. In
+ compatibility mode (*note Options::), using a third argument is a
+ fatal error.
+
+'patsplit(STRING, ARRAY' [', FIELDPAT' [', SEPS' ] ]') #'
+ Divide STRING into pieces defined by FIELDPAT and store the pieces
+ in ARRAY and the separator strings in the SEPS array. The first
+ piece is stored in 'ARRAY[1]', the second piece in 'ARRAY[2]', and
+ so forth. The third argument, FIELDPAT, is a regexp describing the
+ fields in STRING (just as 'FPAT' is a regexp describing the fields
+ in input records). It may be either a regexp constant or a string.
+ If FIELDPAT is omitted, the value of 'FPAT' is used. 'patsplit()'
+ returns the number of elements created. 'SEPS[I]' is the separator
+ string between 'ARRAY[I]' and 'ARRAY[I+1]'. Any leading separator
+ will be in 'SEPS[0]'.
+
+ The 'patsplit()' function splits strings into pieces in a manner
+ similar to the way input lines are split into fields using 'FPAT'
+ (*note Splitting By Content::).
+
+ Before splitting the string, 'patsplit()' deletes any previously
+ existing elements in the arrays ARRAY and SEPS.
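+
+     For example, the following sketch pulls runs of digits out of a
+     string:
+
+          $ gawk 'BEGIN {
+          > n = patsplit("10:20:30", parts, /[0-9]+/, seps)
+          > print n, parts[1], parts[2], parts[3], seps[1]
+          > }'
+          -| 3 10 20 30 :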
+
+'split(STRING, ARRAY' [', FIELDSEP' [', SEPS' ] ]')'
+ Divide STRING into pieces separated by FIELDSEP and store the
+ pieces in ARRAY and the separator strings in the SEPS array. The
+ first piece is stored in 'ARRAY[1]', the second piece in
+ 'ARRAY[2]', and so forth. The string value of the third argument,
+ FIELDSEP, is a regexp describing where to split STRING (much as
+ 'FS' can be a regexp describing where to split input records). If
+ FIELDSEP is omitted, the value of 'FS' is used. 'split()' returns
+ the number of elements created. SEPS is a 'gawk' extension, with
+ 'SEPS[I]' being the separator string between 'ARRAY[I]' and
+ 'ARRAY[I+1]'. If FIELDSEP is a single space, then any leading
+ whitespace goes into 'SEPS[0]' and any trailing whitespace goes
+ into 'SEPS[N]', where N is the return value of 'split()' (i.e., the
+ number of elements in ARRAY).
+
+ The 'split()' function splits strings into pieces in a manner
+ similar to the way input lines are split into fields. For example:
+
+ split("cul-de-sac", a, "-", seps)
+
+ splits the string '"cul-de-sac"' into three fields using '-' as the
+ separator. It sets the contents of the array 'a' as follows:
+
+ a[1] = "cul"
+ a[2] = "de"
+ a[3] = "sac"
+
+ and sets the contents of the array 'seps' as follows:
+
+ seps[1] = "-"
+ seps[2] = "-"
+
+ The value returned by this call to 'split()' is three.
+
+ As with input field-splitting, when the value of FIELDSEP is '" "',
+ leading and trailing whitespace is ignored in values assigned to
+ the elements of ARRAY but not in SEPS, and the elements are
+ separated by runs of whitespace. Also, as with input field
+ splitting, if FIELDSEP is the null string, each individual
+ character in the string is split into its own array element.
+ (c.e.)
+
+ Note, however, that 'RS' has no effect on the way 'split()' works.
+ Even though 'RS = ""' causes the newline character to also be an
+ input field separator, this does not affect how 'split()' splits
+ strings.
+
+ Modern implementations of 'awk', including 'gawk', allow the third
+ argument to be a regexp constant ('/'...'/') as well as a string.
+ (d.c.) The POSIX standard allows this as well. *Note Computed
+ Regexps:: for a discussion of the difference between using a string
+ constant or a regexp constant, and the implications for writing
+ your program correctly.
+
+ Before splitting the string, 'split()' deletes any previously
+ existing elements in the arrays ARRAY and SEPS.
+
+ If STRING is null, the array has no elements. (So this is a
+ portable way to delete an entire array with one statement. *Note
+ Delete::.)
+
+ If STRING does not match FIELDSEP at all (but is not null), ARRAY
+ has one element only. The value of that element is the original
+ STRING.
+
+ In POSIX mode (*note Options::), the fourth argument is not
+ allowed.
+
+'sprintf(FORMAT, EXPRESSION1, ...)'
+ Return (without printing) the string that 'printf' would have
+ printed out with the same arguments (*note Printf::). For example:
+
+ pival = sprintf("pi = %.2f (approx.)", 22/7)
+
+ assigns the string 'pi = 3.14 (approx.)' to the variable 'pival'.
+
+'strtonum(STR) #'
+ Examine STR and return its numeric value. If STR begins with a
+ leading '0', 'strtonum()' assumes that STR is an octal number. If
+ STR begins with a leading '0x' or '0X', 'strtonum()' assumes that
+ STR is a hexadecimal number. For example:
+
+ $ echo 0x11 |
+ > gawk '{ printf "%d\n", strtonum($1) }'
+ -| 17
+
+ Using the 'strtonum()' function is _not_ the same as adding zero to
+ a string value; the automatic coercion of strings to numbers works
+ only for decimal data, not for octal or hexadecimal.(1)
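+
+     For example, contrast the two (a quick sketch):
+
+          $ echo 0x11 | gawk '{ print $1 + 0, strtonum($1) }'
+          -| 0 17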
+
+ Note also that 'strtonum()' uses the current locale's decimal point
+ for recognizing numbers (*note Locales::).
+
+'sub(REGEXP, REPLACEMENT' [', TARGET']')'
+ Search TARGET, which is treated as a string, for the leftmost,
+ longest substring matched by the regular expression REGEXP. Modify
+ the entire string by replacing the matched text with REPLACEMENT.
+ The modified string becomes the new value of TARGET. Return the
+ number of substitutions made (zero or one).
+
+ The REGEXP argument may be either a regexp constant ('/'...'/') or
+ a string constant ('"'...'"'). In the latter case, the string is
+ treated as a regexp to be matched. *Note Computed Regexps:: for a
+ discussion of the difference between the two forms, and the
+ implications for writing your program correctly.
+
+ This function is peculiar because TARGET is not simply used to
+ compute a value, and not just any expression will do--it must be a
+ variable, field, or array element so that 'sub()' can store a
+ modified value there. If this argument is omitted, then the
+ default is to use and alter '$0'.(2) For example:
+
+ str = "water, water, everywhere"
+ sub(/at/, "ith", str)
+
+ sets 'str' to 'wither, water, everywhere', by replacing the
+ leftmost longest occurrence of 'at' with 'ith'.
+
+ If the special character '&' appears in REPLACEMENT, it stands for
+ the precise substring that was matched by REGEXP. (If the regexp
+ can match more than one string, then this precise substring may
+ vary.) For example:
+
+ { sub(/candidate/, "& and his wife"); print }
+
+ changes the first occurrence of 'candidate' to 'candidate and his
+ wife' on each input line. Here is another example:
+
+ $ awk 'BEGIN {
+ > str = "daabaaa"
+ > sub(/a+/, "C&C", str)
+ > print str
+ > }'
+ -| dCaaCbaaa
+
+ This shows how '&' can represent a nonconstant string and also
+ illustrates the "leftmost, longest" rule in regexp matching (*note
+ Leftmost Longest::).
+
+ The effect of this special character ('&') can be turned off by
+ putting a backslash before it in the string. As usual, to insert
+ one backslash in the string, you must write two backslashes.
+ Therefore, write '\\&' in a string constant to include a literal
+ '&' in the replacement. For example, the following shows how to
+ replace the first '|' on each line with an '&':
+
+ { sub(/\|/, "\\&"); print }
+
+ As mentioned, the third argument to 'sub()' must be a variable,
+ field, or array element. Some versions of 'awk' allow the third
+ argument to be an expression that is not an lvalue. In such a
+ case, 'sub()' still searches for the pattern and returns zero or
+ one, but the result of the substitution (if any) is thrown away
+ because there is no place to put it. Such versions of 'awk' accept
+ expressions like the following:
+
+ sub(/USA/, "United States", "the USA and Canada")
+
+ For historical compatibility, 'gawk' accepts such erroneous code.
+ However, using any other nonchangeable object as the third
+ parameter causes a fatal error and your program will not run.
+
+ Finally, if the REGEXP is not a regexp constant, it is converted
+ into a string, and then the value of that string is treated as the
+ regexp to match.
+
+'substr(STRING, START' [', LENGTH' ]')'
+ Return a LENGTH-character-long substring of STRING, starting at
+ character number START. The first character of a string is
+ character number one.(3) For example, 'substr("washington", 5, 3)'
+ returns '"ing"'.
+
+ If LENGTH is not present, 'substr()' returns the whole suffix of
+ STRING that begins at character number START. For example,
+ 'substr("washington", 5)' returns '"ington"'. The whole suffix is
+ also returned if LENGTH is greater than the number of characters
+ remaining in the string, counting from character START.
+
+     If START is less than one, 'substr()' treats it as if it were one.
+ (POSIX doesn't specify what to do in this case: BWK 'awk' acts this
+ way, and therefore 'gawk' does too.) If START is greater than the
+ number of characters in the string, 'substr()' returns the null
+ string. Similarly, if LENGTH is present but less than or equal to
+ zero, the null string is returned.
+
+ The string returned by 'substr()' _cannot_ be assigned. Thus, it
+ is a mistake to attempt to change a portion of a string, as shown
+ in the following example:
+
+ string = "abcdef"
+ # try to get "abCDEf", won't work
+ substr(string, 3, 3) = "CDE"
+
+ It is also a mistake to use 'substr()' as the third argument of
+ 'sub()' or 'gsub()':
+
+ gsub(/xyz/, "pdq", substr($0, 5, 20)) # WRONG
+
+ (Some commercial versions of 'awk' treat 'substr()' as assignable,
+ but doing so is not portable.)
+
+ If you need to replace bits and pieces of a string, combine
+ 'substr()' with string concatenation, in the following manner:
+
+ string = "abcdef"
+ ...
+ string = substr(string, 1, 2) "CDE" substr(string, 6)
+
+'tolower(STRING)'
+ Return a copy of STRING, with each uppercase character in the
+ string replaced with its corresponding lowercase character.
+ Nonalphabetic characters are left unchanged. For example,
+ 'tolower("MiXeD cAsE 123")' returns '"mixed case 123"'.
+
+'toupper(STRING)'
+ Return a copy of STRING, with each lowercase character in the
+ string replaced with its corresponding uppercase character.
+ Nonalphabetic characters are left unchanged. For example,
+ 'toupper("MiXeD cAsE 123")' returns '"MIXED CASE 123"'.
+
+ Matching the Null String
+
+ In 'awk', the '*' operator can match the null string. This is
+particularly important for the 'sub()', 'gsub()', and 'gensub()'
+functions. For example:
+
+ $ echo abc | awk '{ gsub(/m*/, "X"); print }'
+ -| XaXbXcX
+
+Although this makes a certain amount of sense, it can be surprising.
+
+ ---------- Footnotes ----------
+
+ (1) Unless you use the '--non-decimal-data' option, which isn't
+recommended. *Note Nondecimal Data:: for more information.
+
+ (2) Note that this means that the record will first be regenerated
+using the value of 'OFS' if any fields have been changed, and that the
+fields will be updated after the substitution, even if the operation is
+a "no-op" such as 'sub(/^/, "")'.
+
+ (3) This is different from C and C++, in which the first character is
+number zero.
+
+
+File: gawk.info, Node: Gory Details, Up: String Functions
+
+9.1.3.1 More about '\' and '&' with 'sub()', 'gsub()', and 'gensub()'
+.....................................................................
+
+ CAUTION: This subsubsection has been reported to cause headaches.
+ You might want to skip it upon first reading.
+
+ When using 'sub()', 'gsub()', or 'gensub()', and trying to get
+literal backslashes and ampersands into the replacement text, you need
+to remember that there are several levels of "escape processing" going
+on.
+
+ First, there is the "lexical" level, which is when 'awk' reads your
+program and builds an internal copy of it to execute. Then there is the
+runtime level, which is when 'awk' actually scans the replacement string
+to determine what to generate.
+
+ At both levels, 'awk' looks for a defined set of characters that can
+come after a backslash. At the lexical level, it looks for the escape
+sequences listed in *note Escape Sequences::. Thus, for every '\' that
+'awk' processes at the runtime level, you must type two backslashes at
+the lexical level. When a character that is not valid for an escape
+sequence follows the '\', BWK 'awk' and 'gawk' both simply remove the
+initial '\' and put the next character into the string. Thus, for
+example, '"a\qb"' is treated as '"aqb"'.
+
+ At the runtime level, the various functions handle sequences of '\'
+and '&' differently. The situation is (sadly) somewhat complex.
+Historically, the 'sub()' and 'gsub()' functions treated the
+two-character sequence '\&' specially; this sequence was replaced in the
+generated text with a single '&'. Any other '\' within the REPLACEMENT
+string that did not precede an '&' was passed through unchanged. This
+is illustrated in *note Table 9.1: table-sub-escapes.
+
+ You type 'sub()' sees 'sub()' generates
+ ----- ------- ----------
+ '\&' '&' The matched text
+ '\\&' '\&' A literal '&'
+ '\\\&' '\&' A literal '&'
+ '\\\\&' '\\&' A literal '\&'
+ '\\\\\&' '\\&' A literal '\&'
+ '\\\\\\&' '\\\&' A literal '\\&'
+ '\\q' '\q' A literal '\q'
+
+Table 9.1: Historical escape sequence processing for 'sub()' and
+'gsub()'
+
+This table shows the lexical-level processing, where an odd number of
+backslashes becomes an even number at the runtime level, as well as the
+runtime processing done by 'sub()'. (For the sake of simplicity, the
+rest of the following tables only show the case of even numbers of
+backslashes entered at the lexical level.)
+
+ The problem with the historical approach is that there is no way to
+get a literal '\' followed by the matched text.
+
+ Several editions of the POSIX standard attempted to fix this problem
+but weren't successful. The details are irrelevant at this point in
+time.
+
+ At one point, the 'gawk' maintainer submitted proposed text for a
+revised standard that reverts to rules that correspond more closely to
+the original existing practice. The proposed rules have special cases
+that make it possible to produce a '\' preceding the matched text. This
+is shown in *note Table 9.2: table-sub-proposed.
+
+ You type 'sub()' sees 'sub()' generates
+ ----- ------- ----------
+ '\\\\\\&' '\\\&' A literal '\&'
+ '\\\\&' '\\&' A literal '\', followed by the matched text
+ '\\&' '\&' A literal '&'
+ '\\q' '\q' A literal '\q'
+ '\\\\' '\\' '\\'
+
+Table 9.2: 'gawk' rules for 'sub()' and backslash
+
+ In a nutshell, at the runtime level, there are now three special
+sequences of characters ('\\\&', '\\&', and '\&') whereas historically
+there was only one. However, as in the historical case, any '\' that is
+not part of one of these three sequences is not special and appears in
+the output literally.
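+
+   For example, here are these rules in action (this is the default
+'gawk' behavior, not '--posix'):
+
+     $ echo hello | gawk '{ sub(/ll/, "<&>"); print }'
+     -| he<ll>o
+     $ echo hello | gawk '{ sub(/ll/, "<\\&>"); print }'
+     -| he<&>o
+     $ echo hello | gawk '{ sub(/ll/, "<\\\\&>"); print }'
+     -| he<\ll>o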
+
+ 'gawk' 3.0 and 3.1 follow these rules for 'sub()' and 'gsub()'. The
+POSIX standard took much longer to be revised than was expected. In
+addition, the 'gawk' maintainer's proposal was lost during the
+standardization process. The final rules are somewhat simpler. The
+results are similar except for one case.
+
+ The POSIX rules state that '\&' in the replacement string produces a
+literal '&', '\\' produces a literal '\', and '\' followed by anything
+else is not special; the '\' is placed straight into the output. These
+rules are presented in *note Table 9.3: table-posix-sub.
+
+ You type 'sub()' sees 'sub()' generates
+ ----- ------- ----------
+ '\\\\\\&' '\\\&' A literal '\&'
+ '\\\\&' '\\&' A literal '\', followed by the matched text
+ '\\&' '\&' A literal '&'
+ '\\q' '\q' A literal '\q'
+ '\\\\' '\\' '\'
+
+Table 9.3: POSIX rules for 'sub()' and 'gsub()'
+
+ The only case where the difference is noticeable is the last one:
+'\\\\' is seen as '\\' and produces '\' instead of '\\'.
+
+ Starting with version 3.1.4, 'gawk' followed the POSIX rules when
+'--posix' was specified (*note Options::). Otherwise, it continued to
+follow the proposed rules, as that had been its behavior for many years.
+
+ When version 4.0.0 was released, the 'gawk' maintainer made the POSIX
+rules the default, breaking well over a decade's worth of backward
+compatibility.(1) Needless to say, this was a bad idea, and as of
+version 4.0.1, 'gawk' resumed its historical behavior, and only follows
+the POSIX rules when '--posix' is given.
+
+ The rules for 'gensub()' are considerably simpler. At the runtime
+level, whenever 'gawk' sees a '\', if the following character is a
+digit, then the text that matched the corresponding parenthesized
+subexpression is placed in the generated output. Otherwise, no matter
+what character follows the '\', it appears in the generated text and the
+'\' does not, as shown in *note Table 9.4: table-gensub-escapes.
+
+ You type 'gensub()' sees 'gensub()' generates
+ ----- --------- ------------
+ '&' '&' The matched text
+ '\\&' '\&' A literal '&'
+ '\\\\' '\\' A literal '\'
+ '\\\\&' '\\&' A literal '\', then the matched text
+ '\\\\\\&' '\\\&' A literal '\&'
+ '\\q' '\q' A literal 'q'
+
+Table 9.4: Escape sequence processing for 'gensub()'
+
+ Because of the complexity of the lexical- and runtime-level
+processing and the special cases for 'sub()' and 'gsub()', we recommend
+the use of 'gawk' and 'gensub()' when you have to do substitutions.
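+
+   For example, following *note Table 9.4: table-gensub-escapes, a
+parenthesized subexpression can be rearranged in the generated text:
+
+     $ gawk 'BEGIN { print gensub(/(a+)(b+)/, "<\\2-\\1>", "g", "xaabbby") }'
+     -| x<bbb-aa>y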
+
+ ---------- Footnotes ----------
+
+ (1) This was rather naive of him, despite there being a note in this
+minor node indicating that the next major version would move to the
+POSIX rules.
+
+
+File: gawk.info, Node: I/O Functions, Next: Time Functions, Prev: String Functions, Up: Built-in
+
+9.1.4 Input/Output Functions
+----------------------------
+
+The following functions relate to input/output (I/O). Optional
+parameters are enclosed in square brackets ([ ]):
+
+'close('FILENAME [',' HOW]')'
+ Close the file FILENAME for input or output. Alternatively, the
+ argument may be a shell command that was used for creating a
+ coprocess, or for redirecting to or from a pipe; then the coprocess
+ or pipe is closed. *Note Close Files And Pipes:: for more
+ information.
+
+ When closing a coprocess, it is occasionally useful to first close
+ one end of the two-way pipe and then to close the other. This is
+ done by providing a second argument to 'close()'. This second
+ argument (HOW) should be one of the two string values '"to"' or
+ '"from"', indicating which end of the pipe to close. Case in the
+ string does not matter. *Note Two-way I/O::, which discusses this
+ feature in more detail and gives an example.
+
+ Note that the second argument to 'close()' is a 'gawk' extension;
+ it is not available in compatibility mode (*note Options::).
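+
+     For example, here is a minimal sketch of this, using 'sort' purely
+     as an illustrative coprocess command:
+
+          print "line 2" |& "sort"
+          print "line 1" |& "sort"
+          close("sort", "to")        # sort now sees end-of-file on its input
+          while (("sort" |& getline line) > 0)
+              print line
+          close("sort", "from")      # close the read end when finished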
+
+'fflush('[FILENAME]')'
+ Flush any buffered output associated with FILENAME, which is either
+ a file opened for writing or a shell command for redirecting output
+ to a pipe or coprocess.
+
+ Many utility programs "buffer" their output (i.e., they save
+ information to write to a disk file or the screen in memory until
+ there is enough for it to be worthwhile to send the data to the
+ output device). This is often more efficient than writing every
+ little bit of information as soon as it is ready. However,
+ sometimes it is necessary to force a program to "flush" its buffers
+ (i.e., write the information to its destination, even if a buffer
+ is not full). This is the purpose of the 'fflush()'
+ function--'gawk' also buffers its output, and the 'fflush()'
+ function forces 'gawk' to flush its buffers.
+
+ Brian Kernighan added 'fflush()' to his 'awk' in April 1992. For
+ two decades, it was a common extension. In December 2012, it was
+ accepted for inclusion into the POSIX standard. See the Austin
+ Group website (http://austingroupbugs.net/view.php?id=634).
+
+ POSIX standardizes 'fflush()' as follows: if there is no argument,
+ or if the argument is the null string ('""'), then 'awk' flushes
+ the buffers for _all_ open output files and pipes.
+
+ NOTE: Prior to version 4.0.2, 'gawk' would flush only the
+ standard output if there was no argument, and flush all output
+ files and pipes if the argument was the null string. This was
+ changed in order to be compatible with Brian Kernighan's
+ 'awk', in the hope that standardizing this feature in POSIX
+ would then be easier (which indeed proved to be the case).
+
+ With 'gawk', you can use 'fflush("/dev/stdout")' if you wish
+ to flush only the standard output.
+
+ 'fflush()' returns zero if the buffer is successfully flushed;
+ otherwise, it returns a nonzero value. ('gawk' returns -1.) In
+ the case where all buffers are flushed, the return value is zero
+ only if all buffers were flushed successfully. Otherwise, it is
+ -1, and 'gawk' warns about the problem FILENAME.
+
+ 'gawk' also issues a warning message if you attempt to flush a file
+ or pipe that was opened for reading (such as with 'getline'), or if
+ FILENAME is not an open file, pipe, or coprocess. In such a case,
+ 'fflush()' returns -1, as well.
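+
+     For example (a minimal sketch):
+
+          print "one line of output"
+          fflush()                     # flush all buffered output
+          fflush("/dev/stdout")        # gawk: flush only standard output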
+
+ Interactive Versus Noninteractive Buffering
+
+ As a side point, buffering issues can be even more confusing if
+ your program is "interactive" (i.e., communicating with a user
+ sitting at a keyboard).(1)
+
+ Interactive programs generally "line buffer" their output (i.e.,
+ they write out every line). Noninteractive programs wait until
+ they have a full buffer, which may be many lines of output. Here
+ is an example of the difference:
+
+ $ awk '{ print $1 + $2 }'
+ 1 1
+ -| 2
+ 2 3
+ -| 5
+ Ctrl-d
+
+ Each line of output is printed immediately. Compare that behavior
+ with this example:
+
+ $ awk '{ print $1 + $2 }' | cat
+ 1 1
+ 2 3
+ Ctrl-d
+ -| 2
+ -| 5
+
+ Here, no output is printed until after the 'Ctrl-d' is typed,
+ because it is all buffered and sent down the pipe to 'cat' in one
+ shot.
+
+'system(COMMAND)'
+ Execute the operating system command COMMAND and then return to the
+ 'awk' program. Return COMMAND's exit status (see further on).
+
+ For example, if the following fragment of code is put in your 'awk'
+ program:
+
+ END {
+ system("date | mail -s 'awk run done' root")
+ }
+
+ the system administrator is sent mail when the 'awk' program
+ finishes processing input and begins its end-of-input processing.
+
+ Note that redirecting 'print' or 'printf' into a pipe is often
+ enough to accomplish your task. If you need to run many commands,
+ it is more efficient to simply print them down a pipeline to the
+ shell:
+
+ while (MORE STUFF TO DO)
+ print COMMAND | "/bin/sh"
+ close("/bin/sh")
+
+ However, if your 'awk' program is interactive, 'system()' is useful
+ for running large self-contained programs, such as a shell or an
+ editor. Some operating systems cannot implement the 'system()'
+ function. 'system()' causes a fatal error if it is not supported.
+
+ NOTE: When '--sandbox' is specified, the 'system()' function
+ is disabled (*note Options::).
+
+ On POSIX systems, a command's exit status is a 16-bit number. The
+ exit value passed to the C 'exit()' function is held in the
+ high-order eight bits. The low-order bits indicate if the process
+ was killed by a signal (bit 7) and if so, the guilty signal number
+ (bits 0-6).
+
+ Traditionally, 'awk''s 'system()' function has simply returned the
+ exit status value divided by 256. In the normal case this gives
+ the exit status but in the case of death-by-signal it yields a
+ fractional floating-point value.(2) POSIX states that 'awk''s
+ 'system()' should return the full 16-bit value.
+
+ 'gawk' steers a middle ground. The return values are summarized in
+ *note Table 9.5: table-system-return-values.
+
+ Situation Return value from 'system()'
+ --------------------------------------------------------------------------
+ '--traditional' C 'system()''s value divided by 256
+ '--posix' C 'system()''s value
+ Normal exit of command Command's exit status
+ Death by signal of command 256 + number of murderous signal
+ Death by signal of command 512 + number of murderous signal
+ with core dump
+ Some kind of error -1
+
+ Table 9.5: Return values from 'system()'
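+
+     For example, a caller might decode the return value along the
+     lines of Table 9.5 (a sketch, assuming the default mode and using
+     '/bin/true' purely as an illustrative command):
+
+          ret = system("/bin/true")
+          if (ret == -1)
+              print "could not run the command" > "/dev/stderr"
+          else if (ret >= 512)
+              print "killed by signal", ret - 512, "(core dumped)"
+          else if (ret >= 256)
+              print "killed by signal", ret - 256
+          else
+              print "exit status", ret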
+
+ Controlling Output Buffering with 'system()'
+
+ The 'fflush()' function provides explicit control over output
+buffering for individual files and pipes. However, its use is not
+portable to many older 'awk' implementations. An alternative method to
+flush output buffers is to call 'system()' with a null string as its
+argument:
+
+ system("") # flush output
+
+'gawk' treats this use of the 'system()' function as a special case and
+is smart enough not to run a shell (or other command interpreter) with
+the empty command. Therefore, with 'gawk', this idiom is not only
+useful, it is also efficient. Although this method should work with
+other 'awk' implementations, it does not necessarily avoid starting an
+unnecessary shell. (Other implementations may only flush the buffer
+associated with the standard output and not necessarily all buffered
+output.)
+
+ If you think about what a programmer expects, it makes sense that
+'system()' should flush any pending output. The following program:
+
+ BEGIN {
+ print "first print"
+ system("echo system echo")
+ print "second print"
+ }
+
+must print:
+
+ first print
+ system echo
+ second print
+
+and not:
+
+ system echo
+ first print
+ second print
+
+ If 'awk' did not flush its buffers before calling 'system()', you
+would see the latter (undesirable) output.
+
+ ---------- Footnotes ----------
+
+ (1) A program is interactive if the standard output is connected to a
+terminal device. On modern systems, this means your keyboard and
+screen.
+
+ (2) In private correspondence, Dr. Kernighan has indicated to me that
+the way this was done was probably a mistake.
+
+
+File: gawk.info, Node: Time Functions, Next: Bitwise Functions, Prev: I/O Functions, Up: Built-in
+
+9.1.5 Time Functions
+--------------------
+
+'awk' programs are commonly used to process log files containing
+timestamp information, indicating when a particular log record was
+written. Many programs log their timestamps in the form returned by the
+'time()' system call, which is the number of seconds since a particular
+epoch. On POSIX-compliant systems, it is the number of seconds since
+1970-01-01 00:00:00 UTC, not counting leap seconds.(1) All known
+POSIX-compliant systems support timestamps from 0 through 2^31 - 1,
+which is sufficient to represent times through 2038-01-19 03:14:07 UTC.
+Many systems support a wider range of timestamps, including negative
+timestamps that represent times before the epoch.
+
+ In order to make it easier to process such log files and to produce
+useful reports, 'gawk' provides the following functions for working with
+timestamps. They are 'gawk' extensions; they are not specified in the
+POSIX standard.(2) However, recent versions of 'mawk' (*note Other
+Versions::) also support these functions. Optional parameters are
+enclosed in square brackets ([ ]):
+
+'mktime(DATESPEC)'
+ Turn DATESPEC into a timestamp in the same form as is returned by
+ 'systime()'. It is similar to the function of the same name in ISO
+ C. The argument, DATESPEC, is a string of the form
+ '"YYYY MM DD HH MM SS [DST]"'. The string consists of six or seven
+ numbers representing, respectively, the full year including
+ century, the month from 1 to 12, the day of the month from 1 to 31,
+ the hour of the day from 0 to 23, the minute from 0 to 59, the
+ second from 0 to 60,(3) and an optional daylight-savings flag.
+
+ The values of these numbers need not be within the ranges
+ specified; for example, an hour of -1 means 1 hour before midnight.
+ The origin-zero Gregorian calendar is assumed, with year 0
+ preceding year 1 and year -1 preceding year 0. The time is assumed
+ to be in the local time zone. If the daylight-savings flag is
+ positive, the time is assumed to be daylight savings time; if zero,
+ the time is assumed to be standard time; and if negative (the
+ default), 'mktime()' attempts to determine whether daylight savings
+ time is in effect for the specified time.
+
+ If DATESPEC does not contain enough elements or if the resulting
+ time is out of range, 'mktime()' returns -1.
+
+'strftime('[FORMAT [',' TIMESTAMP [',' UTC-FLAG] ] ]')'
+ Format the time specified by TIMESTAMP based on the contents of the
+ FORMAT string and return the result. It is similar to the function
+ of the same name in ISO C. If UTC-FLAG is present and is either
+ nonzero or non-null, the value is formatted as UTC (Coordinated
+ Universal Time, formerly GMT or Greenwich Mean Time). Otherwise,
+ the value is formatted for the local time zone. The TIMESTAMP is
+ in the same format as the value returned by the 'systime()'
+ function. If no TIMESTAMP argument is supplied, 'gawk' uses the
+ current time of day as the timestamp. Without a FORMAT argument,
+ 'strftime()' uses the value of 'PROCINFO["strftime"]' as the format
+ string (*note Built-in Variables::). The default string value is
+ '"%a %b %e %H:%M:%S %Z %Y"'. This format string produces output
+ that is equivalent to that of the 'date' utility. You can assign a
+ new value to 'PROCINFO["strftime"]' to change the default format;
+ see the following list for the various format directives.
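+
+     For example (a minimal sketch):
+
+          PROCINFO["strftime"] = "%Y-%m-%d %H:%M:%S"   # change the default
+          print strftime()          # the current time in the new format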
+
+'systime()'
+ Return the current time as the number of seconds since the system
+ epoch. On POSIX systems, this is the number of seconds since
+ 1970-01-01 00:00:00 UTC, not counting leap seconds. It may be a
+ different number on other systems.
+
+ The 'systime()' function allows you to compare a timestamp from a log
+file with the current time of day. In particular, it is easy to
+determine how long ago a particular record was logged. It also allows
+you to produce log records using the "seconds since the epoch" format.
+
+ The 'mktime()' function allows you to convert a textual
+representation of a date and time into a timestamp. This makes it easy
+to do before/after comparisons of dates and times, particularly when
+dealing with date and time data coming from an external source, such as
+a log file.
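+
+   For example, assuming log records whose first six fields are the
+components of a '"YYYY MM DD HH MM SS"' timestamp (purely illustrative),
+one might compute how long ago each record was written:
+
+     {
+         logged = mktime($1 " " $2 " " $3 " " $4 " " $5 " " $6)
+         if (logged == -1)
+             next                             # malformed timestamp
+         age = systime() - logged
+         printf "record is %d seconds old\n", age
+     }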
+
+ The 'strftime()' function allows you to easily turn a timestamp into
+human-readable information. It is similar in nature to the 'sprintf()'
+function (*note String Functions::), in that it copies nonformat
+specification characters verbatim to the returned string, while
+substituting date and time values for format specifications in the
+FORMAT string.
+
+ 'strftime()' is guaranteed by the 1999 ISO C standard(4) to support
+the following date format specifications:
+
+'%a'
+ The locale's abbreviated weekday name.
+
+'%A'
+ The locale's full weekday name.
+
+'%b'
+ The locale's abbreviated month name.
+
+'%B'
+ The locale's full month name.
+
+'%c'
+ The locale's "appropriate" date and time representation. (This is
+ '%A %B %d %T %Y' in the '"C"' locale.)
+
+'%C'
+ The century part of the current year. This is the year divided by
+ 100 and truncated to the next lower integer.
+
+'%d'
+ The day of the month as a decimal number (01-31).
+
+'%D'
+ Equivalent to specifying '%m/%d/%y'.
+
+'%e'
+ The day of the month, padded with a space if it is only one digit.
+
+'%F'
+ Equivalent to specifying '%Y-%m-%d'. This is the ISO 8601 date
+ format.
+
+'%g'
+ The year modulo 100 of the ISO 8601 week number, as a decimal
+ number (00-99). For example, January 1, 2012, is in week 52 of
+ 2011. Thus, the year of its ISO 8601 week number is 2011, even
+ though its year is 2012. Similarly, December 31, 2012, is in week
+ 1 of 2013. Thus, the year of its ISO week number is 2013, even
+ though its year is 2012.
+
+'%G'
+ The full year of the ISO week number, as a decimal number.
+
+'%h'
+ Equivalent to '%b'.
+
+'%H'
+ The hour (24-hour clock) as a decimal number (00-23).
+
+'%I'
+ The hour (12-hour clock) as a decimal number (01-12).
+
+'%j'
+ The day of the year as a decimal number (001-366).
+
+'%m'
+ The month as a decimal number (01-12).
+
+'%M'
+ The minute as a decimal number (00-59).
+
+'%n'
+ A newline character (ASCII LF).
+
+'%p'
+ The locale's equivalent of the AM/PM designations associated with a
+ 12-hour clock.
+
+'%r'
+ The locale's 12-hour clock time. (This is '%I:%M:%S %p' in the
+ '"C"' locale.)
+
+'%R'
+ Equivalent to specifying '%H:%M'.
+
+'%S'
+ The second as a decimal number (00-60).
+
+'%t'
+ A TAB character.
+
+'%T'
+ Equivalent to specifying '%H:%M:%S'.
+
+'%u'
+ The weekday as a decimal number (1-7). Monday is day one.
+
+'%U'
+ The week number of the year (with the first Sunday as the first day
+ of week one) as a decimal number (00-53).
+
+'%V'
+ The week number of the year (with the first Monday as the first day
+ of week one) as a decimal number (01-53). The method for
+ determining the week number is as specified by ISO 8601. (To wit:
+ if the week containing January 1 has four or more days in the new
+ year, then it is week one; otherwise it is the last week [52 or 53]
+ of the previous year and the next week is week one.)
+
+'%w'
+ The weekday as a decimal number (0-6). Sunday is day zero.
+
+'%W'
+ The week number of the year (with the first Monday as the first day
+ of week one) as a decimal number (00-53).
+
+'%x'
+ The locale's "appropriate" date representation. (This is '%A %B %d
+ %Y' in the '"C"' locale.)
+
+'%X'
+ The locale's "appropriate" time representation. (This is '%T' in
+ the '"C"' locale.)
+
+'%y'
+ The year modulo 100 as a decimal number (00-99).
+
+'%Y'
+ The full year as a decimal number (e.g., 2015).
+
+'%z'
+ The time zone offset in a '+HHMM' format (e.g., the format
+ necessary to produce RFC 822/RFC 1036 date headers).
+
+'%Z'
+ The time zone name or abbreviation; no characters if no time zone
+ is determinable.
+
+'%Ec %EC %Ex %EX %Ey %EY %Od %Oe %OH'
+'%OI %Om %OM %OS %Ou %OU %OV %Ow %OW %Oy'
+ "Alternative representations" for the specifications that use only
+ the second letter ('%c', '%C', and so on).(5) (These facilitate
+ compliance with the POSIX 'date' utility.)
+
+'%%'
+ A literal '%'.
+
+ If a conversion specifier is not one of those just listed, the
+behavior is undefined.(6)
+
+ For systems that are not yet fully standards-compliant, 'gawk'
+supplies a copy of 'strftime()' from the GNU C Library. It supports all
+of the just-listed format specifications. If that version is used to
+compile 'gawk' (*note Installation::), then the following additional
+format specifications are available:
+
+'%k'
+ The hour (24-hour clock) as a decimal number (0-23). Single-digit
+ numbers are padded with a space.
+
+'%l'
+ The hour (12-hour clock) as a decimal number (1-12). Single-digit
+ numbers are padded with a space.
+
+'%s'
+ The time as a decimal timestamp in seconds since the epoch.
+
+ Additionally, the alternative representations are recognized but
+their normal representations are used.
+
+ The following example is an 'awk' implementation of the POSIX 'date'
+utility. Normally, the 'date' utility prints the current date and time
+of day in a well-known format. However, if you provide an argument to
+it that begins with a '+', 'date' copies nonformat specifier characters
+to the standard output and interprets the current time according to the
+format specifiers in the string. For example:
+
+ $ date '+Today is %A, %B %d, %Y.'
+ -| Today is Monday, September 22, 2014.
+
+ Here is the 'gawk' version of the 'date' utility. It has a shell
+"wrapper" to handle the '-u' option, which requires that 'date' run as
+if the time zone is set to UTC:
+
+ #! /bin/sh
+ #
+ # date --- approximate the POSIX 'date' command
+
+ case $1 in
+ -u) TZ=UTC0 # use UTC
+ export TZ
+ shift ;;
+ esac
+
+ gawk 'BEGIN {
+ format = PROCINFO["strftime"]
+ exitval = 0
+
+ if (ARGC > 2)
+ exitval = 1
+ else if (ARGC == 2) {
+ format = ARGV[1]
+ if (format ~ /^\+/)
+ format = substr(format, 2) # remove leading +
+ }
+ print strftime(format)
+ exit exitval
+ }' "$@"
+
+ ---------- Footnotes ----------
+
+ (1) *Note Glossary::, especially the entries "Epoch" and "UTC."
+
+ (2) The GNU 'date' utility can also do many of the things described
+here. Its use may be preferable for simple time-related operations in
+shell scripts.
+
+ (3) Occasionally there are minutes in a year with a leap second,
+which is why the seconds can go up to 60.
+
+ (4) Unfortunately, not every system's 'strftime()' necessarily
+supports all of the conversions listed here.
+
+ (5) If you don't understand any of this, don't worry about it; these
+facilities are meant to make it easier to "internationalize" programs.
+Other internationalization features are described in *note
+Internationalization::.
+
+ (6) This is because ISO C leaves the behavior of the C version of
+'strftime()' undefined and 'gawk' uses the system's version of
+'strftime()' if it's there. Typically, the conversion specifier either
+does not appear in the returned string or appears literally.
+
+
+File: gawk.info, Node: Bitwise Functions, Next: Type Functions, Prev: Time Functions, Up: Built-in
+
+9.1.6 Bit-Manipulation Functions
+--------------------------------
+
+ I can explain it for you, but I can't understand it for you.
+ -- _Anonymous_
+
+ Many languages provide the ability to perform "bitwise" operations on
+two integer numbers. In other words, the operation is performed on each
+successive pair of bits in the operands. Three common operations are
+bitwise AND, OR, and XOR. The operations are described in *note Table
+9.6: table-bitwise-ops.
+
+ Bit operator
+ | AND | OR | XOR
+ |--+--+--+--+--+--
+ Operands | 0 | 1 | 0 | 1 | 0 | 1
+ -------+--+--+--+--+--+--
+ 0 | 0 0 | 0 1 | 0 1
+ 1 | 0 1 | 1 1 | 1 0
+
+Table 9.6: Bitwise operations
+
+ As you can see, the result of an AND operation is 1 only when _both_
+bits are 1. The result of an OR operation is 1 if _either_ bit is 1.
+The result of an XOR operation is 1 if either bit is 1, but not both.
+The next operation is the "complement"; the complement of 1 is 0 and the
+complement of 0 is 1. Thus, this operation "flips" all the bits of a
+given value.
+
+ Finally, two other common operations are to shift the bits left or
+right. For example, if you have a bit string '10111001' and you shift
+it right by three bits, you end up with '00010111'.(1) If you start
+over again with '10111001' and shift it left by three bits, you end up
+with '11001000'. The following list describes 'gawk''s built-in
+functions that implement the bitwise operations. Optional parameters
+are enclosed in square brackets ([ ]):
+
+'and(V1, V2 [, ...])'
+ Return the bitwise AND of the arguments. There must be at least
+ two.
+
+'compl(VAL)'
+ Return the bitwise complement of VAL.
+
+'lshift(VAL, COUNT)'
+ Return the value of VAL, shifted left by COUNT bits.
+
+'or(V1, V2 [, ...])'
+ Return the bitwise OR of the arguments. There must be at least
+ two.
+
+'rshift(VAL, COUNT)'
+ Return the value of VAL, shifted right by COUNT bits.
+
+'xor(V1, V2 [, ...])'
+ Return the bitwise XOR of the arguments. There must be at least
+ two.
+
+ CAUTION: Beginning with 'gawk' 4.2, negative operands are not
+ allowed for any of these functions. A negative operand produces a
+ fatal error. See the sidebar "Beware The Smoke and Mirrors!" for
+ more information as to why.
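+
+   For example, 12 is binary '1100' and 10 is binary '1010', so:
+
+     $ gawk 'BEGIN { printf "%d %d %d\n", and(12, 10), or(12, 10), xor(12, 10) }'
+     -| 8 14 6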
+
+ Here is a user-defined function (*note User-defined::) that
+illustrates the use of these functions:
+
+ # bits2str --- turn a byte into readable ones and zeros
+
+ function bits2str(bits, data, mask)
+ {
+ if (bits == 0)
+ return "0"
+
+ mask = 1
+ for (; bits != 0; bits = rshift(bits, 1))
+ data = (and(bits, mask) ? "1" : "0") data
+
+ while ((length(data) % 8) != 0)
+ data = "0" data
+
+ return data
+ }
+
+ BEGIN {
+ printf "123 = %s\n", bits2str(123)
+ printf "0123 = %s\n", bits2str(0123)
+ printf "0x99 = %s\n", bits2str(0x99)
+ comp = compl(0x99)
+ printf "compl(0x99) = %#x = %s\n", comp, bits2str(comp)
+ shift = lshift(0x99, 2)
+ printf "lshift(0x99, 2) = %#x = %s\n", shift, bits2str(shift)
+ shift = rshift(0x99, 2)
+ printf "rshift(0x99, 2) = %#x = %s\n", shift, bits2str(shift)
+ }
+
+This program produces the following output when run:
+
+ $ gawk -f testbits.awk
+ -| 123 = 01111011
+ -| 0123 = 01010011
+ -| 0x99 = 10011001
+ -| compl(0x99) = 0x3fffffffffff66 = 00111111111111111111111111111111111111111111111101100110
+ -| lshift(0x99, 2) = 0x264 = 0000001001100100
+ -| rshift(0x99, 2) = 0x26 = 00100110
+
+ The 'bits2str()' function turns a binary number into a string.
+Initializing 'mask' to one creates a binary value where the rightmost
+bit is set to one. Using this mask, the function repeatedly checks the
+rightmost bit. ANDing the mask with the value indicates whether the
+rightmost bit is one or not. If so, a '"1"' is concatenated onto the
+front of the string. Otherwise, a '"0"' is added. The value is then
+shifted right by one bit and the loop continues until there are no more
+one bits.
+
+ If the initial value is zero, it returns a simple '"0"'. Otherwise,
+at the end, it pads the value with zeros to represent multiples of 8-bit
+quantities. This is typical in modern computers.
+
+ The main code in the 'BEGIN' rule shows the difference between the
+decimal and octal values for the same numbers (*note
+Nondecimal-numbers::), and then demonstrates the results of the
+'compl()', 'lshift()', and 'rshift()' functions.
+
+ Beware The Smoke and Mirrors!
+
+ In other languages, bitwise operations are performed on integer
+values, not floating-point values. As a general statement, such
+operations work best when performed on unsigned integers.
+
+ 'gawk' attempts to treat the arguments to the bitwise functions as
+unsigned integers. For this reason, negative arguments produce a fatal
+error.
+
+ In normal operation, for all of these functions, first the
+double-precision floating-point value is converted to the widest C
+unsigned integer type, then the bitwise operation is performed. If the
+result cannot be represented exactly as a C 'double', leading nonzero
+bits are removed one by one until it can be represented exactly. The
+result is then converted back into a C 'double'.(2)
+
+ However, when using arbitrary precision arithmetic with the '-M'
+option (*note Arbitrary Precision Arithmetic::), the results may differ.
+This is particularly noticeable with the 'compl()' function:
+
+ $ gawk 'BEGIN { print compl(42) }'
+ -| 9007199254740949
+ $ gawk -M 'BEGIN { print compl(42) }'
+ -| -43
+
+ What's going on becomes clear when printing the results in
+hexadecimal:
+
+ $ gawk 'BEGIN { printf "%#x\n", compl(42) }'
+ -| 0x1fffffffffffd5
+ $ gawk -M 'BEGIN { printf "%#x\n", compl(42) }'
+ -| 0xffffffffffffffd5
+
+ When using the '-M' option, under the hood, 'gawk' uses GNU MP
+arbitrary precision integers which have at least 64 bits of precision.
+When not using '-M', 'gawk' stores integral values in regular
+double-precision floating point, which only maintains 53 bits of
+precision. Furthermore, the GNU MP library treats (or at least seems to
+treat) the leading bit as a sign bit; thus the result with '-M' in this
+case is a negative number.
+
+ In short, using 'gawk' for any but the simplest kind of bitwise
+operations is probably a bad idea; caveat emptor!
+
+ ---------- Footnotes ----------
+
+ (1) This example shows that zeros come in on the left side. For
+'gawk', this is always true, but in some languages, it's possible to
+have the left side fill with ones.
+
+ (2) If you don't understand this paragraph, the upshot is that 'gawk'
+can only store a particular range of integer values; numbers outside
+that range are reduced to fit within the range.
+
+
+File: gawk.info, Node: Type Functions, Next: I18N Functions, Prev: Bitwise Functions, Up: Built-in
+
+9.1.7 Getting Type Information
+------------------------------
+
+'gawk' provides two functions that let you distinguish the type of a
+variable. This is necessary for writing code that traverses every
+element of an array of arrays (*note Arrays of Arrays::), and in other
+contexts.
+
+'isarray(X)'
+ Return a true value if X is an array. Otherwise, return false.
+
+'typeof(X)'
+ Return one of the following strings, depending upon the type of X:
+
+ '"array"'
+ X is an array.
+
+ '"number"'
+ X is a number.
+
+ '"string"'
+ X is a string.
+
+ '"strnum"'
+ X is a string that might be a number, such as a field or the
+ result of calling 'split()'. (I.e., X has the STRNUM
+ attribute; *note Variable Typing::.)
+
+ '"unassigned"'
+ X is a scalar variable that has not been assigned a value yet.
+ For example:
+
+ BEGIN {
+ a[1] # creates a[1] but it has no assigned value
+ print typeof(a[1]) # unassigned
+ }
+
+ '"untyped"'
+ X has not yet been used at all; it can become a scalar or
+ an array. For example:
+
+ BEGIN {
+ print typeof(x) # x never used --> untyped
+ mk_arr(x)
+ print typeof(x) # x now an array --> array
+ }
+
+ function mk_arr(a) { a[1] = 1 }
+
+ 'isarray()' is meant for use in two circumstances. The first is when
+traversing a multidimensional array: you can test if an element is
+itself an array or not. The second is inside the body of a user-defined
+function (not discussed yet; *note User-defined::), to test if a
+parameter is an array or not.
+
+ NOTE: Using 'isarray()' at the global level to test variables makes
+ no sense. Because you are the one writing the program, you are
+ supposed to know if your variables are arrays or not. And in fact,
+ due to the way 'gawk' works, if you pass the name of a variable
+ that has not been previously used to 'isarray()', 'gawk' ends up
+ turning it into a scalar.
+
+ The 'typeof()' function is general; it allows you to determine if a
+variable or function parameter is a scalar or an array.
+
+ 'isarray()' is deprecated; you should use 'typeof()' instead. You
+should replace any existing uses of 'isarray(var)' in your code with
+'typeof(var) == "array"'.
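+
+   For example, 'typeof()' can drive a traversal of an array of arrays
+(a sketch; the helper name 'walk_array' is arbitrary):
+
+     function walk_array(arr, name,    i)
+     {
+         for (i in arr) {
+             if (typeof(arr[i]) == "array")
+                 walk_array(arr[i], (name "[" i "]"))
+             else
+                 printf "%s[%s] = %s\n", name, i, arr[i]
+         }
+     }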
+
+
+File: gawk.info, Node: I18N Functions, Prev: Type Functions, Up: Built-in
+
+9.1.8 String-Translation Functions
+----------------------------------
+
+'gawk' provides facilities for internationalizing 'awk' programs. These
+include the functions described in the following list. The descriptions
+here are purposely brief. *Note Internationalization::, for the full
+story. Optional parameters are enclosed in square brackets ([ ]):
+
+'bindtextdomain(DIRECTORY' [',' DOMAIN]')'
+ Set the directory in which 'gawk' will look for message translation
+ files, in case they will not or cannot be placed in the "standard"
+ locations (e.g., during testing). It returns the directory in
+ which DOMAIN is "bound."
+
+ The default DOMAIN is the value of 'TEXTDOMAIN'. If DIRECTORY is
+ the null string ('""'), then 'bindtextdomain()' returns the current
+ binding for the given DOMAIN.
+
+'dcgettext(STRING' [',' DOMAIN [',' CATEGORY] ]')'
+ Return the translation of STRING in text domain DOMAIN for locale
+ category CATEGORY. The default value for DOMAIN is the current
+ value of 'TEXTDOMAIN'. The default value for CATEGORY is
+ '"LC_MESSAGES"'.
+
+'dcngettext(STRING1, STRING2, NUMBER' [',' DOMAIN [',' CATEGORY] ]')'
+ Return the plural form used for NUMBER of the translation of
+ STRING1 and STRING2 in text domain DOMAIN for locale category
+ CATEGORY. STRING1 is the English singular variant of a message,
+ and STRING2 is the English plural variant of the same message. The
+ default value for DOMAIN is the current value of 'TEXTDOMAIN'. The
+ default value for CATEGORY is '"LC_MESSAGES"'.
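+
+   For example (a minimal sketch; the text domain '"myprog"' and the
+'./locale' directory are purely illustrative):
+
+     BEGIN {
+         TEXTDOMAIN = "myprog"
+         bindtextdomain("./locale")       # e.g., during testing
+         print dcgettext("Don't Panic!")
+         n = 3
+         printf(dcngettext("%d file processed\n",
+                           "%d files processed\n", n), n)
+     }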
+
+
+File: gawk.info, Node: User-defined, Next: Indirect Calls, Prev: Built-in, Up: Functions
+
+9.2 User-Defined Functions
+==========================
+
+Complicated 'awk' programs can often be simplified by defining your own
+functions. User-defined functions can be called just like built-in ones
+(*note Function Calls::), but it is up to you to define them (i.e., to
+tell 'awk' what they should do).
+
+* Menu:
+
+* Definition Syntax:: How to write definitions and what they mean.
+* Function Example:: An example function definition and what it
+ does.
+* Function Caveats:: Things to watch out for.
+* Return Statement:: Specifying the value a function returns.
+* Dynamic Typing:: How variable types can change at runtime.
+
+
+File: gawk.info, Node: Definition Syntax, Next: Function Example, Up: User-defined
+
+9.2.1 Function Definition Syntax
+--------------------------------
+
+ It's entirely fair to say that the awk syntax for local variable
+ definitions is appallingly awful.
+ -- _Brian Kernighan_
+
+ Definitions of functions can appear anywhere between the rules of an
+'awk' program. Thus, the general form of an 'awk' program is extended
+to include sequences of rules _and_ user-defined function definitions.
+There is no need to put the definition of a function before all uses of
+the function. This is because 'awk' reads the entire program before
+starting to execute any of it.
+
+ The definition of a function named NAME looks like this:
+
+ 'function' NAME'('[PARAMETER-LIST]')'
+ '{'
+ BODY-OF-FUNCTION
+ '}'
+
+Here, NAME is the name of the function to define. A valid function name
+is like a valid variable name: a sequence of letters, digits, and
+underscores that doesn't start with a digit. Here too, only the 52
+upper- and lowercase English letters may be used in a function name.
+Within a single 'awk' program, any particular name can only be used as a
+variable, array, or function.
+
+ PARAMETER-LIST is an optional list of the function's arguments and
+local variable names, separated by commas. When the function is called,
+the argument names are used to hold the argument values given in the
+call.
+
+ A function cannot have two parameters with the same name, nor may it
+have a parameter with the same name as the function itself.
+
+ CAUTION: According to the POSIX standard, function parameters
+ cannot have the same name as one of the special predefined
+ variables (*note Built-in Variables::), nor may a function
+ parameter have the same name as another function.
+
+ Not all versions of 'awk' enforce these restrictions. 'gawk'
+ always enforces the first restriction. With '--posix' (*note
+ Options::), it also enforces the second restriction.
+
+ Local variables act like the empty string if referenced where a
+string value is required, and like zero if referenced where a numeric
+value is required. This is the same as the behavior of regular
+variables that have never been assigned a value. (There is more to
+understand about local variables; *note Dynamic Typing::.)
+
+ The BODY-OF-FUNCTION consists of 'awk' statements. It is the most
+important part of the definition, because it says what the function
+should actually _do_. The argument names exist to give the body a way
+to talk about the arguments; local variables exist to give the body
+places to keep temporary values.
+
+ Argument names are not distinguished syntactically from local
+variable names. Instead, the number of arguments supplied when the
+function is called determines how many argument variables there are.
+Thus, if three argument values are given, the first three names in
+PARAMETER-LIST are arguments and the rest are local variables.
+
+ It follows that if the number of arguments is not the same in all
+calls to the function, some of the names in PARAMETER-LIST may be
+arguments on some occasions and local variables on others. Another way
+to think of this is that omitted arguments default to the null string.
+
+ Usually when you write a function, you know how many names you intend
+to use for arguments and how many you intend to use as local variables.
+It is conventional to place some extra space between the arguments and
+the local variables, in order to document how your function is supposed
+to be used.
+
+ During execution of the function body, the arguments and local
+variable values hide, or "shadow", any variables of the same names used
+in the rest of the program. The shadowed variables are not accessible
+in the function definition, because there is no way to name them while
+their names have been taken away for the arguments and local variables.
+All other variables used in the 'awk' program can be referenced or set
+normally in the function's body.
+
+ The arguments and local variables last only as long as the function
+body is executing. Once the body finishes, you can once again access
+the variables that were shadowed while the function was running.
+
+ The function body can contain expressions that call functions. They
+can even call this function, either directly or by way of another
+function. When this happens, we say the function is "recursive". The
+act of a function calling itself is called "recursion".
+
+ All the built-in functions return a value to their caller.
+User-defined functions can do so also, using the 'return' statement,
+which is described in detail in *note Return Statement::. Many of the
+subsequent examples in this minor node use the 'return' statement.
+
+ In many 'awk' implementations, including 'gawk', the keyword
+'function' may be abbreviated 'func'. (c.e.) However, POSIX only
+specifies the use of the keyword 'function'. This actually has some
+practical implications. If 'gawk' is in POSIX-compatibility mode (*note
+Options::), then the following statement does _not_ define a function:
+
+ func foo() { a = sqrt($1) ; print a }
+
+Instead, it defines a rule that, for each record, concatenates the value
+of the variable 'func' with the return value of the function 'foo'. If
+the resulting string is non-null, the action is executed. This is
+probably not what is desired. ('awk' accepts this input as
+syntactically valid, because functions may be used before they are
+defined in 'awk' programs.(1))
+
+ To ensure that your 'awk' programs are portable, always use the
+keyword 'function' when defining a function.
+
+ ---------- Footnotes ----------
+
+ (1) This program won't actually run, because 'foo()' is undefined.
+
+
+File: gawk.info, Node: Function Example, Next: Function Caveats, Prev: Definition Syntax, Up: User-defined
+
+9.2.2 Function Definition Examples
+----------------------------------
+
+Here is an example of a user-defined function, called 'myprint()', that
+takes a number and prints it in a specific format:
+
+ function myprint(num)
+ {
+ printf "%6.3g\n", num
+ }
+
+To illustrate, here is an 'awk' rule that uses our 'myprint()' function:
+
+ $3 > 0 { myprint($3) }
+
+This program prints, in our special format, all the third fields that
+contain a positive number in our input. Therefore, when given the
+following input:
+
+ 1.2 3.4 5.6 7.8
+ 9.10 11.12 -13.14 15.16
+ 17.18 19.20 21.22 23.24
+
+this program, using our function to format the results, prints:
+
+ 5.6
+ 21.2
+
+ This function deletes all the elements in an array (recall that the
+extra whitespace signifies the start of the local variable list):
+
+ function delarray(a, i)
+ {
+ for (i in a)
+ delete a[i]
+ }
+
+ When working with arrays, it is often necessary to delete all the
+elements in an array and start over with a new list of elements (*note
+Delete::). Instead of having to repeat this loop everywhere that you
+need to clear out an array, your program can just call 'delarray()'.
+(This guarantees portability. The use of 'delete ARRAY' to delete the
+contents of an entire array is a relatively recent(1) addition to the
+POSIX standard.)
+
+ The following is an example of a recursive function. It takes a
+string as an input parameter and returns the string in reverse order.
+Recursive functions must always have a test that stops the recursion.
+In this case, the recursion terminates when the input string is already
+empty:
+
+ function rev(str)
+ {
+ if (str == "")
+ return ""
+
+ return (rev(substr(str, 2)) substr(str, 1, 1))
+ }
+
+ If this function is in a file named 'rev.awk', it can be tested this
+way:
+
+ $ echo "Don't Panic!" |
+ > gawk -e '{ print rev($0) }' -f rev.awk
+ -| !cinaP t'noD
+
+ The C 'ctime()' function takes a timestamp and returns it as a
+string, formatted in a well-known fashion. The following example uses
+the built-in 'strftime()' function (*note Time Functions::) to create an
+'awk' version of 'ctime()':
+
+ # ctime.awk
+ #
+ # awk version of C ctime(3) function
+
+ function ctime(ts, format)
+ {
+ format = "%a %b %e %H:%M:%S %Z %Y"
+
+ if (ts == 0)
+ ts = systime() # use current time as default
+ return strftime(format, ts)
+ }
+
+ You might think that 'ctime()' could use 'PROCINFO["strftime"]' for
+its format string. That would be a mistake, because 'ctime()' is
+supposed to return the time formatted in a standard fashion, and
+user-level code could have changed 'PROCINFO["strftime"]'.
+
+ ---------- Footnotes ----------
+
+ (1) Late in 2012.
+
+
+File: gawk.info, Node: Function Caveats, Next: Return Statement, Prev: Function Example, Up: User-defined
+
+9.2.3 Calling User-Defined Functions
+------------------------------------
+
+"Calling a function" means causing the function to run and do its job.
+A function call is an expression and its value is the value returned by
+the function.
+
+* Menu:
+
+* Calling A Function:: Don't use spaces.
+* Variable Scope:: Controlling variable scope.
+* Pass By Value/Reference:: Passing parameters.
+
+
+File: gawk.info, Node: Calling A Function, Next: Variable Scope, Up: Function Caveats
+
+9.2.3.1 Writing a Function Call
+...............................
+
+A function call consists of the function name followed by the arguments
+in parentheses. 'awk' expressions are what you write in the call for
+the arguments. Each time the call is executed, these expressions are
+evaluated, and the values become the actual arguments. For example,
+here is a call to 'foo()' with three arguments (the first being a string
+concatenation):
+
+ foo(x y, "lose", 4 * z)
+
+ CAUTION: Whitespace characters (spaces and TABs) are not allowed
+ between the function name and the opening parenthesis of the
+ argument list. If you write whitespace by mistake, 'awk' might
+ think that you mean to concatenate a variable with an expression in
+ parentheses. However, it notices that you used a function name and
+ not a variable name, and reports an error.
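+
+   For example:
+
+     foo(x y, "lose", 4 * z)      # OK: no space before the '('
+     foo (x y, "lose", 4 * z)     # error: space between 'foo' and '('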
+
+
+File: gawk.info, Node: Variable Scope, Next: Pass By Value/Reference, Prev: Calling A Function, Up: Function Caveats
+
+9.2.3.2 Controlling Variable Scope
+..................................
+
+Unlike in many languages, there is no way to make a variable local to a
+'{' ... '}' block in 'awk', but you can make a variable local to a
+function. It is good practice to do so whenever a variable is needed
+only in that function.
+
+ To make a variable local to a function, simply declare the variable
+as an argument after the actual function arguments (*note Definition
+Syntax::). Look at the following example, where variable 'i' is a
+global variable used by both functions 'foo()' and 'bar()':
+
+ function bar()
+ {
+ for (i = 0; i < 3; i++)
+ print "bar's i=" i
+ }
+
+ function foo(j)
+ {
+ i = j + 1
+ print "foo's i=" i
+ bar()
+ print "foo's i=" i
+ }
+
+ BEGIN {
+ i = 10
+ print "top's i=" i
+ foo(0)
+ print "top's i=" i
+ }
+
+ Running this script produces the following, because the 'i' in
+functions 'foo()' and 'bar()' and at the top level refer to the same
+variable instance:
+
+ top's i=10
+ foo's i=1
+ bar's i=0
+ bar's i=1
+ bar's i=2
+ foo's i=3
+ top's i=3
+
+ If you want 'i' to be local to both 'foo()' and 'bar()', do as
+follows (the extra space before 'i' is a coding convention to indicate
+that 'i' is a local variable, not an argument):
+
+ function bar( i)
+ {
+ for (i = 0; i < 3; i++)
+ print "bar's i=" i
+ }
+
+ function foo(j, i)
+ {
+ i = j + 1
+ print "foo's i=" i
+ bar()
+ print "foo's i=" i
+ }
+
+ BEGIN {
+ i = 10
+ print "top's i=" i
+ foo(0)
+ print "top's i=" i
+ }
+
+ Running the corrected script produces the following:
+
+ top's i=10
+ foo's i=1
+ bar's i=0
+ bar's i=1
+ bar's i=2
+ foo's i=1
+ top's i=10
+
+ Besides scalar values (strings and numbers), you may also have local
+arrays. By using a parameter name as an array, 'awk' treats it as an
+array, and it is local to the function. In addition, recursive calls
+create new arrays. Consider this example:
+
+ function some_func(p1, a)
+ {
+ if (p1++ > 3)
+ return
+
+ a[p1] = p1
+
+ some_func(p1)
+
+ printf("At level %d, index %d %s found in a\n",
+ p1, (p1 - 1), (p1 - 1) in a ? "is" : "is not")
+ printf("At level %d, index %d %s found in a\n",
+ p1, p1, p1 in a ? "is" : "is not")
+ print ""
+ }
+
+ BEGIN {
+ some_func(1)
+ }
+
+ When run, this program produces the following output:
+
+ At level 4, index 3 is not found in a
+ At level 4, index 4 is found in a
+
+ At level 3, index 2 is not found in a
+ At level 3, index 3 is found in a
+
+ At level 2, index 1 is not found in a
+ At level 2, index 2 is found in a
+
+
+File: gawk.info, Node: Pass By Value/Reference, Prev: Variable Scope, Up: Function Caveats
+
+9.2.3.3 Passing Function Arguments by Value Or by Reference
+...........................................................
+
+In 'awk', when you declare a function, there is no way to declare
+explicitly whether the arguments are passed "by value" or "by
+reference".
+
+ Instead, the passing convention is determined at runtime when the
+function is called, according to the following rule: if the argument is
+an array variable, then it is passed by reference. Otherwise, the
+argument is passed by value.
+
+ Passing an argument by value means that when a function is called, it
+is given a _copy_ of the value of this argument. The caller may use a
+variable as the expression for the argument, but the called function
+does not know this--it only knows what value the argument had. For
+example, if you write the following code:
+
+ foo = "bar"
+ z = myfunc(foo)
+
+then you should not think of the argument to 'myfunc()' as being "the
+variable 'foo'." Instead, think of the argument as the string value
+'"bar"'. If the function 'myfunc()' alters the values of its local
+variables, this has no effect on any other variables. Thus, if
+'myfunc()' does this:
+
+ function myfunc(str)
+ {
+ print str
+ str = "zzz"
+ print str
+ }
+
+to change its first argument variable 'str', it does _not_ change the
+value of 'foo' in the caller. The role of 'foo' in calling 'myfunc()'
+ended when its value ('"bar"') was computed. If 'str' also exists
+outside of 'myfunc()', the function body cannot alter this outer value,
+because it is shadowed during the execution of 'myfunc()' and cannot be
+seen or changed from there.
+
+ However, when arrays are the parameters to functions, they are _not_
+copied. Instead, the array itself is made available for direct
+manipulation by the function. This is usually termed "call by
+reference". Changes made to an array parameter inside the body of a
+function _are_ visible outside that function.
+
+ NOTE: Changing an array parameter inside a function can be very
+ dangerous if you do not watch what you are doing. For example:
+
+ function changeit(array, ind, nvalue)
+ {
+ array[ind] = nvalue
+ }
+
+ BEGIN {
+ a[1] = 1; a[2] = 2; a[3] = 3
+ changeit(a, 2, "two")
+ printf "a[1] = %s, a[2] = %s, a[3] = %s\n",
+ a[1], a[2], a[3]
+ }
+
+ prints 'a[1] = 1, a[2] = two, a[3] = 3', because 'changeit()'
+ stores '"two"' in the second element of 'a'.
+
+ Some 'awk' implementations allow you to call a function that has not
+been defined. They only report a problem at runtime, when the program
+actually tries to call the function. For example:
+
+ BEGIN {
+ if (0)
+ foo()
+ else
+ bar()
+ }
+ function bar() { ... }
+ # note that `foo' is not defined
+
+Because the 'if' statement will never be true, it is not really a
+problem that 'foo()' has not been defined. Usually, though, it is a
+problem if a program calls an undefined function.
+
+ If '--lint' is specified (*note Options::), 'gawk' reports calls to
+undefined functions.
+
+ Some 'awk' implementations generate a runtime error if you use either
+the 'next' statement or the 'nextfile' statement (*note Next
+Statement::, and *note Nextfile Statement::) inside a user-defined
+function. 'gawk' does not have this limitation.
+
+
+File: gawk.info, Node: Return Statement, Next: Dynamic Typing, Prev: Function Caveats, Up: User-defined
+
+9.2.4 The 'return' Statement
+----------------------------
+
+As seen in several earlier examples, the body of a user-defined function
+can contain a 'return' statement. This statement returns control to the
+calling part of the 'awk' program. It can also be used to return a
+value for use in the rest of the 'awk' program. It looks like this:
+
+ 'return' [EXPRESSION]
+
+ The EXPRESSION part is optional. Due most likely to an oversight,
+POSIX does not define what the return value is if you omit the
+EXPRESSION. Technically speaking, this makes the returned value
+undefined, and therefore, unpredictable. In practice, though, all
+versions of 'awk' simply return the null string, which acts like zero if
+used in a numeric context.
+
+ A 'return' statement without an EXPRESSION is assumed at the end of
+every function definition. So, if control reaches the end of the
+function body, then technically the function returns an unpredictable
+value. In practice, it returns the empty string. 'awk' does _not_ warn
+you if you use the return value of such a function.
+
+ Sometimes, you want to write a function for what it does, not for
+what it returns. Such a function corresponds to a 'void' function in C,
+C++, or Java, or to a 'procedure' in Ada. Thus, it may be appropriate
+to not return any value; simply bear in mind that you should not be
+using the return value of such a function.
+
+ The following is an example of a user-defined function that returns a
+value for the largest number among the elements of an array:
+
+ function maxelt(vec, i, ret)
+ {
+ for (i in vec) {
+ if (ret == "" || vec[i] > ret)
+ ret = vec[i]
+ }
+ return ret
+ }
+
+You call 'maxelt()' with one argument, which is an array name. The
+local variables 'i' and 'ret' are not intended to be arguments; there is
+nothing to stop you from passing more than one argument to 'maxelt()'
+but the results would be strange. The extra space before 'i' in the
+function parameter list indicates that 'i' and 'ret' are local
+variables. You should follow this convention when defining functions.
+
+ The following program uses the 'maxelt()' function. It loads an
+array, calls 'maxelt()', and then reports the maximum number in that
+array:
+
+ function maxelt(vec, i, ret)
+ {
+ for (i in vec) {
+ if (ret == "" || vec[i] > ret)
+ ret = vec[i]
+ }
+ return ret
+ }
+
+ # Load all fields of each record into nums.
+ {
+ for(i = 1; i <= NF; i++)
+ nums[NR, i] = $i
+ }
+
+ END {
+ print maxelt(nums)
+ }
+
+ Given the following input:
+
+ 1 5 23 8 16
+ 44 3 5 2 8 26
+ 256 291 1396 2962 100
+ -6 467 998 1101
+ 99385 11 0 225
+
+the program reports (predictably) that 99,385 is the largest value in
+the array.
+
+
+File: gawk.info, Node: Dynamic Typing, Prev: Return Statement, Up: User-defined
+
+9.2.5 Functions and Their Effects on Variable Typing
+----------------------------------------------------
+
+'awk' is a very fluid language. It is possible that 'awk' can't tell if
+an identifier represents a scalar variable or an array until runtime.
+Here is an annotated sample program:
+
+ function foo(a)
+ {
+ a[1] = 1 # parameter is an array
+ }
+
+ BEGIN {
+ b = 1
+ foo(b) # invalid: fatal type mismatch
+
+ foo(x) # x uninitialized, becomes an array dynamically
+ x = 1 # now not allowed, runtime error
+ }
+
+ In this example, the first call to 'foo()' generates a fatal error,
+so 'awk' will not report the second error. If you comment out that
+call, though, then 'awk' does report the second error.
+
+ Usually, such things aren't a big issue, but it's worth being aware
+of them.
+
+
+File: gawk.info, Node: Indirect Calls, Next: Functions Summary, Prev: User-defined, Up: Functions
+
+9.3 Indirect Function Calls
+===========================
+
+This section describes an advanced, 'gawk'-specific extension.
+
+ Often, you may wish to defer the choice of function to call until
+runtime. For example, you may have different kinds of records, each of
+which should be processed differently.
+
+ Normally, you would have to use a series of 'if'-'else' statements to
+decide which function to call. By using "indirect" function calls, you
+can specify the name of the function to call as a string variable, and
+then call the function. Let's look at an example.
+
+ Suppose you have a file with your test scores for the classes you are
+taking, and you wish to get the sum and the average of your test scores.
+The first field is the class name. The following fields are the
+functions to call to process the data, up to a "marker" field 'data:'.
+Following the marker, to the end of the record, are the various numeric
+test scores.
+
+ Here is the initial file:
+
+ Biology_101 sum average data: 87.0 92.4 78.5 94.9
+ Chemistry_305 sum average data: 75.2 98.3 94.7 88.2
+ English_401 sum average data: 100.0 95.6 87.1 93.4
+
+ To process the data, you might write initially:
+
+ {
+ class = $1
+ for (i = 2; $i != "data:"; i++) {
+ if ($i == "sum")
+ sum() # processes the whole record
+ else if ($i == "average")
+ average()
+ ... # and so on
+ }
+ }
+
+This style of programming works, but can be awkward. With "indirect"
+function calls, you tell 'gawk' to use the _value_ of a variable as the
+_name_ of the function to call.
+
+ The syntax is similar to that of a regular function call: an
+identifier immediately followed by an opening parenthesis, any
+arguments, and then a closing parenthesis, with the addition of a
+leading '@' character:
+
+ the_func = "sum"
+ result = @the_func() # calls the sum() function
+
+ Here is a full program that processes the previously shown data,
+using indirect function calls:
+
+ # indirectcall.awk --- Demonstrate indirect function calls
+
+ # average --- return the average of the values in fields $first - $last
+
+ function average(first, last, sum, i)
+ {
+ sum = 0;
+ for (i = first; i <= last; i++)
+ sum += $i
+
+ return sum / (last - first + 1)
+ }
+
+ # sum --- return the sum of the values in fields $first - $last
+
+ function sum(first, last, ret, i)
+ {
+ ret = 0;
+ for (i = first; i <= last; i++)
+ ret += $i
+
+ return ret
+ }
+
+ These two functions expect to work on fields; thus, the parameters
+'first' and 'last' indicate where in the fields to start and end.
+Otherwise, they perform the expected computations and are not unusual:
+
+ # For each record, print the class name and the requested statistics
+ {
+ class_name = $1
+ gsub(/_/, " ", class_name) # Replace _ with spaces
+
+ # find start
+ for (i = 1; i <= NF; i++) {
+ if ($i == "data:") {
+ start = i + 1
+ break
+ }
+ }
+
+ printf("%s:\n", class_name)
+ for (i = 2; $i != "data:"; i++) {
+ the_function = $i
+ printf("\t%s: <%s>\n", $i, @the_function(start, NF) "")
+ }
+ print ""
+ }
+
+ This is the main processing for each record. It prints the class
+name (with underscores replaced with spaces). It then finds the start
+of the actual data, saving it in 'start'. The last part of the code
+loops through each function name (from '$2' up to the marker, 'data:'),
+calling the function named by the field. The indirect function call
+itself occurs as a parameter in the call to 'printf'. (The 'printf'
+format string uses '%s' as the format specifier so that we can use
+functions that return strings, as well as numbers. Note that the result
+from the indirect call is concatenated with the empty string, in order
+to force it to be a string value.)
+
+ Here is the result of running the program:
+
+ $ gawk -f indirectcall.awk class_data1
+ -| Biology 101:
+ -| sum: <352.8>
+ -| average: <88.2>
+ -|
+ -| Chemistry 305:
+ -| sum: <356.4>
+ -| average: <89.1>
+ -|
+ -| English 401:
+ -| sum: <376.1>
+ -| average: <94.025>
+
+ The ability to use indirect function calls is more powerful than you
+may think at first. The C and C++ languages provide "function
+pointers," which are a mechanism for calling a function chosen at
+runtime. One of the most well-known uses of this ability is the C
+'qsort()' function, which sorts an array using the famous "quicksort"
+algorithm (see the Wikipedia article
+(http://en.wikipedia.org/wiki/Quicksort) for more information). To use
+this function, you supply a pointer to a comparison function. This
+mechanism allows you to sort arbitrary data in an arbitrary fashion.
+
+ We can do something similar using 'gawk', like this:
+
+ # quicksort.awk --- Quicksort algorithm, with user-supplied
+ # comparison function
+
+ # quicksort --- C.A.R. Hoare's quicksort algorithm. See Wikipedia
+ # or almost any algorithms or computer science text.
+
+ function quicksort(data, left, right, less_than, i, last)
+ {
+ if (left >= right) # do nothing if array contains fewer
+ return # than two elements
+
+ quicksort_swap(data, left, int((left + right) / 2))
+ last = left
+ for (i = left + 1; i <= right; i++)
+ if (@less_than(data[i], data[left]))
+ quicksort_swap(data, ++last, i)
+ quicksort_swap(data, left, last)
+ quicksort(data, left, last - 1, less_than)
+ quicksort(data, last + 1, right, less_than)
+ }
+
+ # quicksort_swap --- helper function for quicksort, should really be inline
+
+ function quicksort_swap(data, i, j, temp)
+ {
+ temp = data[i]
+ data[i] = data[j]
+ data[j] = temp
+ }
+
+ The 'quicksort()' function receives the 'data' array, the starting
+and ending indices to sort ('left' and 'right'), and the name of a
+function that performs a "less than" comparison. It then implements the
+quicksort algorithm.
+
+ To make use of the sorting function, we return to our previous
+example. The first thing to do is write some comparison functions:
+
+ # num_lt --- do a numeric less than comparison
+
+ function num_lt(left, right)
+ {
+ return ((left + 0) < (right + 0))
+ }
+
+ # num_ge --- do a numeric greater than or equal to comparison
+
+ function num_ge(left, right)
+ {
+ return ((left + 0) >= (right + 0))
+ }
+
+ The 'num_ge()' function is needed to perform a descending sort; when
+used to perform a "less than" test, it actually does the opposite
+(greater than or equal to), which yields data sorted in descending
+order.
+
+ Next comes a sorting function. It is parameterized with the starting
+and ending field numbers and the comparison function. It builds an
+array with the data and calls 'quicksort()' appropriately, and then
+formats the results as a single string:
+
+ # do_sort --- sort the data according to `compare'
+ # and return it as a string
+
+ function do_sort(first, last, compare, data, i, retval)
+ {
+ delete data
+ for (i = 1; first <= last; first++) {
+ data[i] = $first
+ i++
+ }
+
+ quicksort(data, 1, i-1, compare)
+
+ retval = data[1]
+ for (i = 2; i in data; i++)
+ retval = retval " " data[i]
+
+ return retval
+ }
+
+ Finally, the two sorting functions call 'do_sort()', passing in the
+names of the two comparison functions:
+
+ # sort --- sort the data in ascending order and return it as a string
+
+ function sort(first, last)
+ {
+ return do_sort(first, last, "num_lt")
+ }
+
+ # rsort --- sort the data in descending order and return it as a string
+
+ function rsort(first, last)
+ {
+ return do_sort(first, last, "num_ge")
+ }
+
+ Here is an extended version of the data file:
+
+ Biology_101 sum average sort rsort data: 87.0 92.4 78.5 94.9
+ Chemistry_305 sum average sort rsort data: 75.2 98.3 94.7 88.2
+ English_401 sum average sort rsort data: 100.0 95.6 87.1 93.4
+
+ Finally, here are the results when the enhanced program is run:
+
+ $ gawk -f quicksort.awk -f indirectcall.awk class_data2
+ -| Biology 101:
+ -| sum: <352.8>
+ -| average: <88.2>
+ -| sort: <78.5 87.0 92.4 94.9>
+ -| rsort: <94.9 92.4 87.0 78.5>
+ -|
+ -| Chemistry 305:
+ -| sum: <356.4>
+ -| average: <89.1>
+ -| sort: <75.2 88.2 94.7 98.3>
+ -| rsort: <98.3 94.7 88.2 75.2>
+ -|
+ -| English 401:
+ -| sum: <376.1>
+ -| average: <94.025>
+ -| sort: <87.1 93.4 95.6 100.0>
+ -| rsort: <100.0 95.6 93.4 87.1>
+
+ Another example where indirect function calls are useful can be
+found in processing arrays. This is described in *note Walking
+Arrays::.
+
+ Remember that you must supply a leading '@' in front of an indirect
+function call.
+
+ Starting with version 4.1.2 of 'gawk', indirect function calls may
+also be used with built-in functions and with extension functions (*note
+Dynamic Extensions::). There are some limitations when calling built-in
+functions indirectly, as follows.
+
+ * You cannot pass a regular expression constant to a built-in
+ function through an indirect function call.(1) This applies to the
+ 'sub()', 'gsub()', 'gensub()', 'match()', 'split()' and
+ 'patsplit()' functions.
+
+ * If calling 'sub()' or 'gsub()', you may only pass two arguments,
+ since those functions are unusual in that they update their third
+ argument. This means that '$0' will be updated.
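+
+ For example, here is a minimal sketch (assuming 'gawk' 4.1.2 or
+later) of calling built-in functions indirectly; the syntax is the
+same as for indirect calls to user-defined functions:
+
+     the_func = "sqrt"
+     print @the_func(4)           # prints 2
+     the_func = "toupper"
+     print @the_func("hello")     # prints HELLO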
+
+ 'gawk' does its best to make indirect function calls efficient. For
+example, in the following case:
+
+ for (i = 1; i <= n; i++)
+ @the_func()
+
+'gawk' looks up the actual function to call only once.
+
+ ---------- Footnotes ----------
+
+ (1) This may change in a future version; recheck the documentation
+that comes with your version of 'gawk' to see if it has.
+
+
+File: gawk.info, Node: Functions Summary, Prev: Indirect Calls, Up: Functions
+
+9.4 Summary
+===========
+
+ * 'awk' provides built-in functions and lets you define your own
+ functions.
+
+ * POSIX 'awk' provides three kinds of built-in functions: numeric,
+ string, and I/O. 'gawk' provides functions that sort arrays, work
+ with values representing time, do bit manipulation, determine
+ variable type (array versus scalar), and internationalize and
+ localize programs. 'gawk' also provides several extensions to some
+    of the standard functions, typically in the form of additional
+ arguments.
+
+ * Functions accept zero or more arguments and return a value. The
+ expressions that provide the argument values are completely
+ evaluated before the function is called. Order of evaluation is
+ not defined. The return value can be ignored.
+
+ * The handling of backslash in 'sub()' and 'gsub()' is not simple.
+ It is more straightforward in 'gawk''s 'gensub()' function, but
+ that function still requires care in its use.
+
+ * User-defined functions provide important capabilities but come with
+ some syntactic inelegancies. In a function call, there cannot be
+ any space between the function name and the opening left
+ parenthesis of the argument list. Also, there is no provision for
+ local variables, so the convention is to add extra parameters, and
+ to separate them visually from the real parameters by extra
+ whitespace.
+
+ * User-defined functions may call other user-defined (and built-in)
+ functions and may call themselves recursively. Function parameters
+ "hide" any global variables of the same names. You cannot use the
+ name of a reserved variable (such as 'ARGC') as the name of a
+ parameter in user-defined functions.
+
+ * Scalar values are passed to user-defined functions by value. Array
+ parameters are passed by reference; any changes made by the
+ function to array parameters are thus visible after the function
+ has returned.
+
+ * Use the 'return' statement to return from a user-defined function.
+ An optional expression becomes the function's return value. Only
+ scalar values may be returned by a function.
+
+ * If a variable that has never been used is passed to a user-defined
+ function, how that function treats the variable can set its nature:
+ either scalar or array.
+
+ * 'gawk' provides indirect function calls using a special syntax. By
+ setting a variable to the name of a function, you can determine at
+ runtime what function will be called at that point in the program.
+ This is equivalent to function pointers in C and C++.
+
+
+File: gawk.info, Node: Library Functions, Next: Sample Programs, Prev: Functions, Up: Top
+
+10 A Library of 'awk' Functions
+*******************************
+
+*note User-defined:: describes how to write your own 'awk' functions.
+Writing functions is important, because it allows you to encapsulate
+algorithms and program tasks in a single place. It simplifies
+programming, making program development more manageable and making
+programs more readable.
+
+ In their seminal 1976 book, 'Software Tools',(1) Brian Kernighan and
+P.J. Plauger wrote:
+
+ Good Programming is not learned from generalities, but by seeing
+ how significant programs can be made clean, easy to read, easy to
+ maintain and modify, human-engineered, efficient and reliable, by
+ the application of common sense and good programming practices.
+ Careful study and imitation of good programs leads to better
+ writing.
+
+ In fact, they felt this idea was so important that they placed this
+statement on the cover of their book. Because we believe strongly that
+their statement is correct, this major node and *note Sample Programs::,
+provide a good-sized body of code for you to read and, we hope, to learn
+from.
+
+ This major node presents a library of useful 'awk' functions. Many
+of the sample programs presented later in this Info file use these
+functions. The functions are presented here in a progression from
+simple to complex.
+
+ *note Extract Program:: presents a program that you can use to
+extract the source code for these example library functions and programs
+from the Texinfo source for this Info file. (This has already been done
+as part of the 'gawk' distribution.)
+
+ If you have written one or more useful, general-purpose 'awk'
+functions and would like to contribute them to the 'awk' user community,
+see *note How To Contribute::, for more information.
+
+ The programs in this major node and in *note Sample Programs::,
+freely use 'gawk'-specific features. Rewriting these programs for
+different implementations of 'awk' is pretty straightforward:
+
+ * Diagnostic error messages are sent to '/dev/stderr'. Use '| "cat
+ 1>&2"' instead of '> "/dev/stderr"' if your system does not have a
+ '/dev/stderr', or if you cannot use 'gawk'.
+
+ * A number of programs use 'nextfile' (*note Nextfile Statement::) to
+ skip any remaining input in the input file.
+
+ * Finally, some of the programs choose to ignore upper- and lowercase
+ distinctions in their input. They do so by assigning one to
+ 'IGNORECASE'. You can achieve almost the same effect(2) by adding
+ the following rule to the beginning of the program:
+
+ # ignore case
+ { $0 = tolower($0) }
+
+ Also, verify that all regexp and string constants used in
+ comparisons use only lowercase letters.
+
+* Menu:
+
+* Library Names:: How to best name private global variables in
+ library functions.
+* General Functions:: Functions that are of general use.
+* Data File Management:: Functions for managing command-line data
+ files.
+* Getopt Function:: A function for processing command-line
+ arguments.
+* Passwd Functions:: Functions for getting user information.
+* Group Functions:: Functions for getting group information.
+* Walking Arrays:: A function to walk arrays of arrays.
+* Library Functions Summary:: Summary of library functions.
+* Library Exercises:: Exercises.
+
+ ---------- Footnotes ----------
+
+ (1) Sadly, over 35 years later, many of the lessons taught by this
+book have yet to be learned by a vast number of practicing programmers.
+
+ (2) The effects are not identical. Output of the transformed record
+will be in all lowercase, while 'IGNORECASE' preserves the original
+contents of the input record.
+
+
+File: gawk.info, Node: Library Names, Next: General Functions, Up: Library Functions
+
+10.1 Naming Library Function Global Variables
+=============================================
+
+Due to the way the 'awk' language evolved, variables are either "global"
+(usable by the entire program) or "local" (usable just by a specific
+function). There is no intermediate state analogous to 'static'
+variables in C.
+
+ Library functions often need to have global variables that they can
+use to preserve state information between calls to the function--for
+example, 'getopt()''s variable '_opti' (*note Getopt Function::). Such
+variables are called "private", as the only functions that need to use
+them are the ones in the library.
+
+ When writing a library function, you should try to choose names for
+your private variables that will not conflict with any variables used by
+either another library function or a user's main program. For example,
+a name like 'i' or 'j' is not a good choice, because user programs often
+use variable names like these for their own purposes.
+
+ The example programs shown in this major node all start the names of
+their private variables with an underscore ('_'). Users generally don't
+use leading underscores in their variable names, so this convention
+immediately decreases the chances that the variable names will be
+accidentally shared with the user's program.
+
+ In addition, several of the library functions use a prefix that helps
+indicate what function or set of functions use the variables--for
+example, '_pw_byname()' in the user database routines (*note Passwd
+Functions::). This convention is recommended, as it even further
+decreases the chance of inadvertent conflict among variable names. Note
+that this convention is used equally well for variable names and for
+private function names.(1)
+
+ As a final note on variable naming, if a function makes global
+variables available for use by a main program, it is a good convention
+to start those variables' names with a capital letter--for example,
+'getopt()''s 'Opterr' and 'Optind' variables (*note Getopt Function::).
+The leading capital letter indicates that it is global, while the fact
+that the variable name is not all capital letters indicates that the
+variable is not one of 'awk''s predefined variables, such as 'FS'.
+
+ It is also important that _all_ variables in library functions that
+do not need to save state are, in fact, declared local.(2) If this is
+not done, the variables could accidentally be used in the user's
+program, leading to bugs that are very difficult to track down:
+
+ function lib_func(x, y, l1, l2)
+ {
+ ...
+ # some_var should be local but by oversight is not
+ USE VARIABLE some_var
+ ...
+ }
+
+ A different convention, common in the Tcl community, is to use a
+single associative array to hold the values needed by the library
+function(s), or "package." This significantly decreases the number of
+actual global names in use. For example, the functions described in
+*note Passwd Functions:: might have used array elements
+'PW_data["inited"]', 'PW_data["total"]', 'PW_data["count"]', and
+'PW_data["awklib"]', instead of '_pw_inited', '_pw_awklib', '_pw_total',
+and '_pw_count'.
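+
+ As a purely hypothetical sketch of this convention, a small
+"statistics" package might keep all of its private state in a single
+array, here called 'Stats_data':
+
+     # stats library --- all private state lives in one array
+
+     function stats_add(val)
+     {
+         Stats_data["count"]++
+         Stats_data["total"] += val
+     }
+
+     function stats_mean()
+     {
+         if (Stats_data["count"] == 0)
+             return 0
+         return Stats_data["total"] / Stats_data["count"]
+     }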
+
+ The conventions presented in this minor node are exactly that:
+conventions. You are not required to write your programs this way--we
+merely recommend that you do so.
+
+ ---------- Footnotes ----------
+
+ (1) Although all the library routines could have been rewritten to
+use this convention, this was not done, in order to show how our own
+'awk' programming style has evolved and to provide some basis for this
+discussion.
+
+ (2) 'gawk''s '--dump-variables' command-line option is useful for
+verifying this.
+
+
+File: gawk.info, Node: General Functions, Next: Data File Management, Prev: Library Names, Up: Library Functions
+
+10.2 General Programming
+========================
+
+This minor node presents a number of functions that are of general
+programming use.
+
+* Menu:
+
+* Strtonum Function:: A replacement for the built-in
+ 'strtonum()' function.
+* Assert Function:: A function for assertions in 'awk'
+ programs.
+* Round Function:: A function for rounding if 'sprintf()'
+ does not do it correctly.
+* Cliff Random Function:: The Cliff Random Number Generator.
+* Ordinal Functions:: Functions for using characters as numbers and
+ vice versa.
+* Join Function:: A function to join an array into a string.
+* Getlocaltime Function:: A function to get formatted times.
+* Readfile Function:: A function to read an entire file at once.
+* Shell Quoting:: A function to quote strings for the shell.
+
+
+File: gawk.info, Node: Strtonum Function, Next: Assert Function, Up: General Functions
+
+10.2.1 Converting Strings to Numbers
+------------------------------------
+
+The 'strtonum()' function (*note String Functions::) is a 'gawk'
+extension. The following function provides an implementation for other
+versions of 'awk':
+
+ # mystrtonum --- convert string to number
+
+ function mystrtonum(str, ret, n, i, k, c)
+ {
+ if (str ~ /^0[0-7]*$/) {
+ # octal
+ n = length(str)
+ ret = 0
+ for (i = 1; i <= n; i++) {
+ c = substr(str, i, 1)
+ # index() returns 0 if c not in string,
+ # includes c == "0"
+ k = index("1234567", c)
+
+ ret = ret * 8 + k
+ }
+ } else if (str ~ /^0[xX][[:xdigit:]]+$/) {
+ # hexadecimal
+ str = substr(str, 3) # lop off leading 0x
+ n = length(str)
+ ret = 0
+ for (i = 1; i <= n; i++) {
+ c = substr(str, i, 1)
+ c = tolower(c)
+ # index() returns 0 if c not in string,
+ # includes c == "0"
+ k = index("123456789abcdef", c)
+
+ ret = ret * 16 + k
+ }
+ } else if (str ~ \
+ /^[-+]?([0-9]+([.][0-9]*([Ee][0-9]+)?)?|([.][0-9]+([Ee][-+]?[0-9]+)?))$/) {
+ # decimal number, possibly floating point
+ ret = str + 0
+ } else
+ ret = "NOT-A-NUMBER"
+
+ return ret
+ }
+
+ # BEGIN { # gawk test harness
+ # a[1] = "25"
+ # a[2] = ".31"
+ # a[3] = "0123"
+ # a[4] = "0xdeadBEEF"
+ # a[5] = "123.45"
+ # a[6] = "1.e3"
+ # a[7] = "1.32"
+ # a[8] = "1.32E2"
+ #
+ # for (i = 1; i in a; i++)
+ # print a[i], strtonum(a[i]), mystrtonum(a[i])
+ # }
+
+ The function first looks for C-style octal numbers (base 8). If the
+input string matches a regular expression describing octal numbers, then
+'mystrtonum()' loops through each character in the string. It sets 'k'
+to the index in '"1234567"' of the current octal digit. The return
+value will either be the same number as the digit, or zero if the
+character is not there, which will be true for a '0'. This is safe,
+because the regexp test in the 'if' ensures that only octal values are
+converted.
+
+ Similar logic applies to the code that checks for and converts a
+hexadecimal value, which starts with '0x' or '0X'. The use of
+'tolower()' simplifies the computation for finding the correct numeric
+value for each hexadecimal digit.
+
+ Finally, if the string matches the (rather complicated) regexp for a
+regular decimal integer or floating-point number, the computation 'ret =
+str + 0' lets 'awk' convert the value to a number.
+
+ A commented-out test program is included, so that the function can be
+tested with 'gawk' and the results compared to the built-in 'strtonum()'
+function.
+
+
+File: gawk.info, Node: Assert Function, Next: Round Function, Prev: Strtonum Function, Up: General Functions
+
+10.2.2 Assertions
+-----------------
+
+When writing large programs, it is often useful to know that a condition
+or set of conditions is true. Before proceeding with a particular
+computation, you make a statement about what you believe to be the case.
+Such a statement is known as an "assertion". The C language provides an
+'<assert.h>' header file and corresponding 'assert()' macro that a
+programmer can use to make assertions. If an assertion fails, the
+'assert()' macro arranges to print a diagnostic message describing the
+condition that should have been true but was not, and then it kills the
+program. In C, using 'assert()' looks like this:
+
+ #include <assert.h>
+
+ int myfunc(int a, double b)
+ {
+ assert(a <= 5 && b >= 17.1);
+ ...
+ }
+
+ If the assertion fails, the program prints a message similar to this:
+
+ prog.c:5: assertion failed: a <= 5 && b >= 17.1
+
+ The C language makes it possible to turn the condition into a string
+for use in printing the diagnostic message. This is not possible in
+'awk', so this 'assert()' function also requires a string version of the
+condition that is being tested. Following is the function:
+
+ # assert --- assert that a condition is true. Otherwise, exit.
+
+ function assert(condition, string)
+ {
+ if (! condition) {
+ printf("%s:%d: assertion failed: %s\n",
+ FILENAME, FNR, string) > "/dev/stderr"
+ _assert_exit = 1
+ exit 1
+ }
+ }
+
+ END {
+ if (_assert_exit)
+ exit 1
+ }
+
+ The 'assert()' function tests the 'condition' parameter. If it is
+false, it prints a message to standard error, using the 'string'
+parameter to describe the failed condition. It then sets the variable
+'_assert_exit' to one and executes the 'exit' statement. The 'exit'
+statement jumps to the 'END' rule. If the 'END' rule finds
+'_assert_exit' to be true, it exits immediately.
+
+ The purpose of the test in the 'END' rule is to keep any other 'END'
+rules from running. When an assertion fails, the program should exit
+immediately. If no assertions fail, then '_assert_exit' is still false
+when the 'END' rule is run normally, and the rest of the program's 'END'
+rules execute. For all of this to work correctly, 'assert.awk' must be
+the first source file read by 'awk'. The function can be used in a
+program in the following way:
+
+ function myfunc(a, b)
+ {
+ assert(a <= 5 && b >= 17.1, "a <= 5 && b >= 17.1")
+ ...
+ }
+
+If the assertion fails, you see a message similar to the following:
+
+ mydata:1357: assertion failed: a <= 5 && b >= 17.1
+
+ There is a small problem with this version of 'assert()'. An 'END'
+rule is automatically added to the program calling 'assert()'.
+Normally, if a program consists of just a 'BEGIN' rule, the input files
+and/or standard input are not read. However, now that the program has
+an 'END' rule, 'awk' attempts to read the input data files or standard
+input (*note Using BEGIN/END::), most likely causing the program to hang
+as it waits for input.
+
+ There is a simple workaround to this: make sure that such a 'BEGIN'
+rule always ends with an 'exit' statement.
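+
+ For example, a 'BEGIN'-only program that uses 'assert()' might look
+like the following sketch (run as 'gawk -f assert.awk -f program.awk';
+the condition tested here is arbitrary):
+
+     BEGIN {
+         assert(ARGC >= 1, "ARGC >= 1")
+         print "all assertions passed"
+         exit 0   # without this, the END rule supplied by
+                  # assert.awk would make awk wait for input
+     }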
+
+
+File: gawk.info, Node: Round Function, Next: Cliff Random Function, Prev: Assert Function, Up: General Functions
+
+10.2.3 Rounding Numbers
+-----------------------
+
+The way 'printf' and 'sprintf()' (*note Printf::) perform rounding often
+depends upon the system's C 'sprintf()' subroutine. On many machines,
+'sprintf()' rounding is "unbiased", which means it doesn't always round
+a trailing .5 up, contrary to naive expectations. In unbiased rounding,
+.5 rounds to even, rather than always up, so 1.5 rounds to 2 but 4.5
+rounds to 4. This means that if you are using a format that does
+rounding (e.g., '"%.0f"'), you should check what your system does. The
+following function does traditional rounding; it might be useful if your
+'awk''s 'printf' does unbiased rounding:
+
+ # round.awk --- do normal rounding
+
+ function round(x, ival, aval, fraction)
+ {
+ ival = int(x) # integer part, int() truncates
+
+ # see if fractional part
+ if (ival == x) # no fraction
+ return ival # ensure no decimals
+
+ if (x < 0) {
+ aval = -x # absolute value
+ ival = int(aval)
+ fraction = aval - ival
+ if (fraction >= .5)
+ return int(x) - 1 # -2.5 --> -3
+ else
+ return int(x) # -2.3 --> -2
+ } else {
+ fraction = x - ival
+ if (fraction >= .5)
+ return ival + 1
+ else
+ return ival
+ }
+ }
+
+ # test harness
+ # { print $0, round($0) }
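+
+ To check what your own system does, you can print a few halfway
+cases directly (a sketch):
+
+     $ gawk 'BEGIN { printf("%.0f %.0f %.0f\n", 0.5, 1.5, 2.5) }'
+
+With unbiased rounding this prints '0 2 2'; with traditional rounding
+it prints '1 2 3'.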
+
+
+File: gawk.info, Node: Cliff Random Function, Next: Ordinal Functions, Prev: Round Function, Up: General Functions
+
+10.2.4 The Cliff Random Number Generator
+----------------------------------------
+
+The Cliff random number generator
+(http://mathworld.wolfram.com/CliffRandomNumberGenerator.html) is a very
+simple random number generator that "passes the noise sphere test for
+randomness by showing no structure." It is easily programmed, in less
+than 10 lines of 'awk' code:
+
+ # cliff_rand.awk --- generate Cliff random numbers
+
+ BEGIN { _cliff_seed = 0.1 }
+
+ function cliff_rand()
+ {
+ _cliff_seed = (100 * log(_cliff_seed)) % 1
+ if (_cliff_seed < 0)
+ _cliff_seed = - _cliff_seed
+ return _cliff_seed
+ }
+
+ This algorithm requires an initial "seed" of 0.1. Each new value
+uses the current seed as input for the calculation. If the built-in
+'rand()' function (*note Numeric Functions::) isn't random enough, you
+might try using this function instead.
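+
+ For example, the following sketch prints five Cliff random numbers
+(it assumes that 'cliff_rand.awk' is loaded first with '-f', so that
+its 'BEGIN' rule has already set the seed):
+
+     BEGIN {
+         for (i = 1; i <= 5; i++)
+             printf("%.6f\n", cliff_rand())
+     }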
+
+
+File: gawk.info, Node: Ordinal Functions, Next: Join Function, Prev: Cliff Random Function, Up: General Functions
+
+10.2.5 Translating Between Characters and Numbers
+-------------------------------------------------
+
+One commercial implementation of 'awk' supplies a built-in function,
+'ord()', which takes a character and returns the numeric value for that
+character in the machine's character set. If the string passed to
+'ord()' has more than one character, only the first one is used.
+
+ The inverse of this function is 'chr()' (from the function of the
+same name in Pascal), which takes a number and returns the corresponding
+character. Both functions are written very nicely in 'awk'; there is no
+real reason to build them into the 'awk' interpreter:
+
+ # ord.awk --- do ord and chr
+
+ # Global identifiers:
+ # _ord_: numerical values indexed by characters
+ # _ord_init: function to initialize _ord_
+
+ BEGIN { _ord_init() }
+
+ function _ord_init( low, high, i, t)
+ {
+ low = sprintf("%c", 7) # BEL is ascii 7
+ if (low == "\a") { # regular ascii
+ low = 0
+ high = 127
+ } else if (sprintf("%c", 128 + 7) == "\a") {
+ # ascii, mark parity
+ low = 128
+ high = 255
+ } else { # ebcdic(!)
+ low = 0
+ high = 255
+ }
+
+ for (i = low; i <= high; i++) {
+ t = sprintf("%c", i)
+ _ord_[t] = i
+ }
+ }
+
+ Some explanation of the numbers used by '_ord_init()' is worthwhile.
+The most prominent character set in use today is ASCII.(1) Although an
+8-bit byte can hold 256 distinct values (from 0 to 255), ASCII only
+defines characters that use the values from 0 to 127.(2) In the now
+distant past, at least one minicomputer manufacturer used ASCII, but
+with mark parity, meaning that the leftmost bit in the byte is always 1.
+This means that on those systems, characters have numeric values from
+128 to 255. Finally, large mainframe systems use the EBCDIC character
+set, which uses all 256 values. There are other character sets in use
+on some older systems, but they are not really worth worrying about:
+
+ function ord(str, c)
+ {
+ # only first character is of interest
+ c = substr(str, 1, 1)
+ return _ord_[c]
+ }
+
+ function chr(c)
+ {
+ # force c to be numeric by adding 0
+ return sprintf("%c", c + 0)
+ }
+
+ #### test code ####
+ # BEGIN {
+ # for (;;) {
+ # printf("enter a character: ")
+ # if (getline var <= 0)
+ # break
+ # printf("ord(%s) = %d\n", var, ord(var))
+ # }
+ # }
+
+ An obvious improvement to these functions is to move the code for the
+'_ord_init' function into the body of the 'BEGIN' rule. It was written
+this way initially for ease of development. There is a "test program"
+in a 'BEGIN' rule, to test the function. It is commented out for
+production use.
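+
+ For example, on an ASCII system, a quick check from the command line
+might look like this (a sketch using 'gawk''s '-e' option):
+
+     $ gawk -f ord.awk -e 'BEGIN { print ord("A"), chr(66) }'
+     -| 65 B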
+
+ ---------- Footnotes ----------
+
+ (1) This is changing; many systems use Unicode, a very large
+character set that includes ASCII as a subset. On systems with full
+Unicode support, a character can occupy up to 32 bits, making simple
+tests such as used here prohibitively expensive.
+
+ (2) ASCII has been extended in many countries to use the values from
+128 to 255 for country-specific characters. If your system uses these
+extensions, you can simplify '_ord_init()' to loop from 0 to 255.
+
+
+File: gawk.info, Node: Join Function, Next: Getlocaltime Function, Prev: Ordinal Functions, Up: General Functions
+
+10.2.6 Merging an Array into a String
+-------------------------------------
+
+When doing string processing, it is often useful to be able to join all
+the strings in an array into one long string. The following function,
+'join()', accomplishes this task. It is used later in several of the
+application programs (*note Sample Programs::).
+
+ Good function design is important; this function needs to be general,
+but it should also have a reasonable default behavior. It is called
+with an array as well as the beginning and ending indices of the
+elements in the array to be merged. This assumes that the array indices
+are numeric--a reasonable assumption, as the array was likely created
+with 'split()' (*note String Functions::):
+
+ # join.awk --- join an array into a string
+
+ function join(array, start, end, sep, result, i)
+ {
+ if (sep == "")
+ sep = " "
+ else if (sep == SUBSEP) # magic value
+ sep = ""
+ result = array[start]
+ for (i = start + 1; i <= end; i++)
+ result = result sep array[i]
+ return result
+ }
+
+ An optional additional argument is the separator to use when joining
+the strings back together. If the caller supplies a nonempty value,
+'join()' uses it; if it is not supplied, it has a null value. In this
+case, 'join()' uses a single space as a default separator for the
+strings. If the value is equal to 'SUBSEP', then 'join()' joins the
+strings with no separator between them. 'SUBSEP' serves as a "magic"
+value to indicate that there should be no separation between the
+component strings.(1)
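+
+ For example (a sketch, assuming 'join.awk' has been loaded):
+
+     n = split("apple banana cherry", fruits, " ")
+     print join(fruits, 1, n)            # apple banana cherry
+     print join(fruits, 1, n, "-")       # apple-banana-cherry
+     print join(fruits, 1, n, SUBSEP)    # applebananacherry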
+
+ ---------- Footnotes ----------
+
+ (1) It would be nice if 'awk' had an assignment operator for
+concatenation. The lack of an explicit operator for concatenation makes
+string operations more difficult than they really need to be.
+
+
+File: gawk.info, Node: Getlocaltime Function, Next: Readfile Function, Prev: Join Function, Up: General Functions
+
+10.2.7 Managing the Time of Day
+-------------------------------
+
+The 'systime()' and 'strftime()' functions described in *note Time
+Functions:: provide the minimum functionality necessary for dealing with
+the time of day in human-readable form. Although 'strftime()' is
+extensive, the control formats are not necessarily easy to remember or
+intuitively obvious when reading a program.
+
+ The following function, 'getlocaltime()', populates a user-supplied
+array with preformatted time information. It returns a string with the
+current time formatted in the same way as the 'date' utility:
+
+ # getlocaltime.awk --- get the time of day in a usable format
+
+ # Returns a string in the format of output of date(1)
+ # Populates the array argument time with individual values:
+ # time["second"] -- seconds (0 - 59)
+ # time["minute"] -- minutes (0 - 59)
+ # time["hour"] -- hours (0 - 23)
+ # time["althour"] -- hours (0 - 12)
+ # time["monthday"] -- day of month (1 - 31)
+ # time["month"] -- month of year (1 - 12)
+ # time["monthname"] -- name of the month
+ # time["shortmonth"] -- short name of the month
+ # time["year"] -- year modulo 100 (0 - 99)
+ # time["fullyear"] -- full year
+ # time["weekday"] -- day of week (Sunday = 0)
+ # time["altweekday"] -- day of week (Monday = 0)
+ # time["dayname"] -- name of weekday
+ # time["shortdayname"] -- short name of weekday
+ # time["yearday"] -- day of year (0 - 365)
+ # time["timezone"] -- abbreviation of timezone name
+ # time["ampm"] -- AM or PM designation
+ # time["weeknum"] -- week number, Sunday first day
+ # time["altweeknum"] -- week number, Monday first day
+
+ function getlocaltime(time, ret, now, i)
+ {
+ # get time once, avoids unnecessary system calls
+ now = systime()
+
+ # return date(1)-style output
+ ret = strftime("%a %b %e %H:%M:%S %Z %Y", now)
+
+ # clear out target array
+ delete time
+
+ # fill in values, force numeric values to be
+ # numeric by adding 0
+ time["second"] = strftime("%S", now) + 0
+ time["minute"] = strftime("%M", now) + 0
+ time["hour"] = strftime("%H", now) + 0
+ time["althour"] = strftime("%I", now) + 0
+ time["monthday"] = strftime("%d", now) + 0
+ time["month"] = strftime("%m", now) + 0
+ time["monthname"] = strftime("%B", now)
+ time["shortmonth"] = strftime("%b", now)
+ time["year"] = strftime("%y", now) + 0
+ time["fullyear"] = strftime("%Y", now) + 0
+ time["weekday"] = strftime("%w", now) + 0
+ time["altweekday"] = strftime("%u", now) + 0
+ time["dayname"] = strftime("%A", now)
+ time["shortdayname"] = strftime("%a", now)
+ time["yearday"] = strftime("%j", now) + 0
+ time["timezone"] = strftime("%Z", now)
+ time["ampm"] = strftime("%p", now)
+ time["weeknum"] = strftime("%U", now) + 0
+ time["altweeknum"] = strftime("%W", now) + 0
+
+ return ret
+ }
+
+ The string indices are easier to use and read than the various
+formats required by 'strftime()'. The 'alarm' program presented in
+*note Alarm Program:: uses this function. A more general design for the
+'getlocaltime()' function would have allowed the user to supply an
+optional timestamp value to use instead of the current time.
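+
+ A short usage sketch:
+
+     BEGIN {
+         date_string = getlocaltime(tm)
+         printf("%s\n", date_string)
+         printf("It is %s, day %d of %d\n",
+                tm["dayname"], tm["yearday"], tm["fullyear"])
+     }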
+
+
+File: gawk.info, Node: Readfile Function, Next: Shell Quoting, Prev: Getlocaltime Function, Up: General Functions
+
+10.2.8 Reading a Whole File at Once
+-----------------------------------
+
+Often, it is convenient to have the entire contents of a file available
+in memory as a single string. A straightforward but naive way to do
+that might be as follows:
+
+ function readfile(file, tmp, contents)
+ {
+ if ((getline tmp < file) < 0)
+ return
+
+ contents = tmp
+        while ((getline tmp < file) > 0)
+ contents = contents RT tmp
+
+ close(file)
+ return contents
+ }
+
+ This function reads from 'file' one record at a time, building up the
+full contents of the file in the local variable 'contents'. It works,
+but is not necessarily efficient.
+
+ The following function, based on a suggestion by Denis Shirokov,
+reads the entire contents of the named file in one shot:
+
+ # readfile.awk --- read an entire file at once
+
+ function readfile(file, tmp, save_rs)
+ {
+ save_rs = RS
+ RS = "^$"
+ getline tmp < file
+ close(file)
+ RS = save_rs
+
+ return tmp
+ }
+
+ It works by setting 'RS' to '^$', a regular expression that will
+never match if the file has contents. 'gawk' reads data from the file
+into 'tmp', attempting to match 'RS'. The match fails after each read,
+but fails quickly, such that 'gawk' fills 'tmp' with the entire contents
+of the file. (*Note Records:: for information on 'RT' and 'RS'.)
+
+ In the case that 'file' is empty, the return value is the null
+string. Thus, calling code may use something like:
+
+ contents = readfile("/some/path")
+ if (length(contents) == 0)
+ # file was empty ...
+
+ This tests the result to see if it is empty or not. An equivalent
+test would be 'contents == ""'.
+
+ *Note Extension Sample Readfile:: for an extension function that also
+reads an entire file into memory.
+
+
+File: gawk.info, Node: Shell Quoting, Prev: Readfile Function, Up: General Functions
+
+10.2.9 Quoting Strings to Pass to the Shell
+-------------------------------------------
+
+Michael Brennan offers the following programming pattern, which he uses
+frequently:
+
+ #! /bin/sh
+
+ awkp='
+ ...
+ '
+
+ INPUT_PROGRAM | awk "$awkp" | /bin/sh
+
+ For example, a program of his named 'flac-edit' has this form:
+
+ $ flac-edit -song="Whoope! That's Great" file.flac
+
+ It generates the following output, which is to be piped to the shell
+('/bin/sh'):
+
+ chmod +w file.flac
+ metaflac --remove-tag=TITLE file.flac
+ LANG=en_US.88591 metaflac --set-tag=TITLE='Whoope! That'"'"'s Great' file.flac
+ chmod -w file.flac
+
+ Note the need for shell quoting. The function 'shell_quote()' does
+it. 'SINGLE' is the one-character string '"'"' and 'QSINGLE' is the
+three-character string '"\"'\""':
+
+ # shell_quote --- quote an argument for passing to the shell
+
+ function shell_quote(s, # parameter
+ SINGLE, QSINGLE, i, X, n, ret) # locals
+ {
+ if (s == "")
+ return "\"\""
+
+ SINGLE = "\x27" # single quote
+ QSINGLE = "\"\x27\""
+ n = split(s, X, SINGLE)
+
+ ret = SINGLE X[1] SINGLE
+ for (i = 2; i <= n; i++)
+ ret = ret QSINGLE SINGLE X[i] SINGLE
+
+ return ret
+ }
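+
+ For example, here is a sketch of how the 'metaflac' TITLE line shown
+earlier could be generated:
+
+     BEGIN {
+         title = "Whoope! That's Great"
+         print "metaflac --set-tag=TITLE=" shell_quote(title) " file.flac"
+     }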
+
+
+File: gawk.info, Node: Data File Management, Next: Getopt Function, Prev: General Functions, Up: Library Functions
+
+10.3 Data file Management
+=========================
+
+This minor node presents functions that are useful for managing
+command-line data files.
+
+* Menu:
+
+* Filetrans Function:: A function for handling data file transitions.
+* Rewind Function:: A function for rereading the current file.
+* File Checking:: Checking that data files are readable.
+* Empty Files:: Checking for zero-length files.
+* Ignoring Assigns:: Treating assignments as file names.
+
+
+File: gawk.info, Node: Filetrans Function, Next: Rewind Function, Up: Data File Management
+
+10.3.1 Noting Data file Boundaries
+----------------------------------
+
+The 'BEGIN' and 'END' rules are each executed exactly once, at the
+beginning and end of your 'awk' program, respectively (*note
+BEGIN/END::). We (the 'gawk' authors) once had a user who mistakenly
+thought that the 'BEGIN' rules were executed at the beginning of each
+data file and the 'END' rules were executed at the end of each data
+file.
+
+ When informed that this was not the case, the user requested that we
+add new special patterns to 'gawk', named 'BEGIN_FILE' and 'END_FILE',
+that would have the desired behavior. He even supplied us the code to
+do so.
+
+ Adding these special patterns to 'gawk' wasn't necessary; the job can
+be done cleanly in 'awk' itself, as illustrated by the following library
+program. It arranges to call two user-supplied functions, 'beginfile()'
+and 'endfile()', at the beginning and end of each data file. Besides
+solving the problem in only nine(!) lines of code, it does so
+_portably_; this works with any implementation of 'awk':
+
+ # transfile.awk
+ #
+ # Give the user a hook for filename transitions
+ #
+ # The user must supply functions beginfile() and endfile()
+ # that each take the name of the file being started or
+ # finished, respectively.
+
+ FILENAME != _oldfilename {
+ if (_oldfilename != "")
+ endfile(_oldfilename)
+ _oldfilename = FILENAME
+ beginfile(FILENAME)
+ }
+
+ END { endfile(FILENAME) }
+
+ This file must be loaded before the user's "main" program, so that
+the rule it supplies is executed first.
+
+ This rule relies on 'awk''s 'FILENAME' variable, which automatically
+changes for each new data file. The current file name is saved in a
+private variable, '_oldfilename'. If 'FILENAME' does not equal
+'_oldfilename', then a new data file is being processed and it is
+necessary to call 'endfile()' for the old file. Because 'endfile()'
+should only be called if a file has been processed, the program first
+checks to make sure that '_oldfilename' is not the null string. The
+program then assigns the current file name to '_oldfilename' and calls
+'beginfile()' for the file. Because, like all 'awk' variables,
+'_oldfilename' is initialized to the null string, this rule executes
+correctly even for the first data file.
+
+ The program also supplies an 'END' rule to do the final processing
+for the last file. Because this 'END' rule comes before any 'END' rules
+supplied in the "main" program, 'endfile()' is called first. Once
+again, the value of multiple 'BEGIN' and 'END' rules should be clear.
+
+ If the same data file occurs twice in a row on the command line, then
+'endfile()' and 'beginfile()' are not executed at the end of the first
+pass and at the beginning of the second pass. The following version
+solves the problem:
+
+ # ftrans.awk --- handle datafile transitions
+ #
+ # user supplies beginfile() and endfile() functions
+
+ FNR == 1 {
+ if (_filename_ != "")
+ endfile(_filename_)
+ _filename_ = FILENAME
+ beginfile(FILENAME)
+ }
+
+ END { endfile(_filename_) }
+
+ *note Wc Program:: shows how this library function can be used and
+how it simplifies writing the main program.
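+
+ For example, a minimal sketch of user code that supplies the two
+hooks (the file name 'myprog.awk' is hypothetical; run it as 'gawk -f
+ftrans.awk -f myprog.awk FILE ...') might be:
+
+     # myprog.awk --- report file transitions
+
+     function beginfile(name) { print "start of", name }
+     function endfile(name)   { print "end of", name }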
+
+ So Why Does 'gawk' Have 'BEGINFILE' and 'ENDFILE'?
+
+ You are probably wondering, if 'beginfile()' and 'endfile()'
+functions can do the job, why does 'gawk' have 'BEGINFILE' and 'ENDFILE'
+patterns?
+
+ Good question. Normally, if 'awk' cannot open a file, this causes an
+immediate fatal error. In this case, there is no way for a user-defined
+function to deal with the problem, as the mechanism for calling it
+relies on the file being open and at the first record. Thus, the main
+reason for 'BEGINFILE' is to give you a "hook" to catch files that
+cannot be processed. 'ENDFILE' exists for symmetry, and because it
+provides an easy way to do per-file cleanup processing. For more
+information, refer to *note BEGINFILE/ENDFILE::.
+
+
+File: gawk.info, Node: Rewind Function, Next: File Checking, Prev: Filetrans Function, Up: Data File Management
+
+10.3.2 Rereading the Current File
+---------------------------------
+
+Another request for a new built-in function was for a function that
+would make it possible to reread the current file. The requesting user
+didn't want to have to use 'getline' (*note Getline::) inside a loop.
+
+ However, as long as you are not in the 'END' rule, it is quite easy
+to arrange to immediately close the current input file and then start
+over with it from the top. For lack of a better name, we'll call the
+function 'rewind()':
+
+ # rewind.awk --- rewind the current file and start over
+
+ function rewind( i)
+ {
+ # shift remaining arguments up
+ for (i = ARGC; i > ARGIND; i--)
+ ARGV[i] = ARGV[i-1]
+
+ # make sure gawk knows to keep going
+ ARGC++
+
+ # make current file next to get done
+ ARGV[ARGIND+1] = FILENAME
+
+ # do it
+ nextfile
+ }
+
+ The 'rewind()' function relies on the 'ARGIND' variable (*note
+Auto-set::), which is specific to 'gawk'. It also relies on the
+'nextfile' keyword (*note Nextfile Statement::). Because of this, you
+should not call it from an 'ENDFILE' rule. (This isn't necessary
+anyway, because 'gawk' goes to the next file as soon as an 'ENDFILE'
+rule finishes!)
+
+ You need to be careful calling 'rewind()'. You can end up causing
+infinite recursion if you don't pay attention. Here is an example use:
+
+ $ cat data
+ -| a
+ -| b
+ -| c
+ -| d
+ -| e
+
+ $ cat test.awk
+ -| FNR == 3 && ! rewound {
+ -| rewound = 1
+ -| rewind()
+ -| }
+ -|
+ -| { print FILENAME, FNR, $0 }
+
+ $ gawk -f rewind.awk -f test.awk data
+ -| data 1 a
+ -| data 2 b
+ -| data 1 a
+ -| data 2 b
+ -| data 3 c
+ -| data 4 d
+ -| data 5 e
+
+
+File: gawk.info, Node: File Checking, Next: Empty Files, Prev: Rewind Function, Up: Data File Management
+
+10.3.3 Checking for Readable Data files
+---------------------------------------
+
+Normally, if you give 'awk' a data file that isn't readable, it stops
+with a fatal error. There are times when you might want to just ignore
+such files and keep going.(1) You can do this by prepending the
+following program to your 'awk' program:
+
+ # readable.awk --- library file to skip over unreadable files
+
+ BEGIN {
+ for (i = 1; i < ARGC; i++) {
+ if (ARGV[i] ~ /^[a-zA-Z_][a-zA-Z0-9_]*=.*/ \
+ || ARGV[i] == "-" || ARGV[i] == "/dev/stdin")
+ continue # assignment or standard input
+ else if ((getline junk < ARGV[i]) < 0) # unreadable
+ delete ARGV[i]
+ else
+ close(ARGV[i])
+ }
+ }
+
+ This works, because the 'getline' won't be fatal. Removing the
+element from 'ARGV' with 'delete' skips the file (because it's no longer
+in the list). See also *note ARGC and ARGV::.
+
+ Because 'awk' variable names only allow the English letters, the
+regular expression check purposely does not use character classes such
+as '[:alpha:]' and '[:alnum:]' (*note Bracket Expressions::).
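+
+ To use it, simply list 'readable.awk' first on the command line (a
+sketch; the data file names are hypothetical):
+
+     gawk -f readable.awk -f yourprog.awk file1 unreadable-file file2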
+
+ ---------- Footnotes ----------
+
+ (1) The 'BEGINFILE' special pattern (*note BEGINFILE/ENDFILE::)
+provides an alternative mechanism for dealing with files that can't be
+opened. However, the code here provides a portable solution.
+
+
+File: gawk.info, Node: Empty Files, Next: Ignoring Assigns, Prev: File Checking, Up: Data File Management
+
+10.3.4 Checking for Zero-Length Files
+-------------------------------------
+
+All known 'awk' implementations silently skip over zero-length files.
+This is a by-product of 'awk''s implicit
+read-a-record-and-match-against-the-rules loop: when 'awk' tries to read
+a record from an empty file, it immediately receives an end-of-file
+indication, closes the file, and proceeds on to the next command-line
+data file, _without_ executing any user-level 'awk' program code.
+
+ Using 'gawk''s 'ARGIND' variable (*note Built-in Variables::), it is
+possible to detect when an empty data file has been skipped. Similar to
+the library file presented in *note Filetrans Function::, the following
+library file calls a function named 'zerofile()' that the user must
+provide. The arguments passed are the file name and the position in
+'ARGV' where it was found:
+
+ # zerofile.awk --- library file to process empty input files
+
+ BEGIN { Argind = 0 }
+
+ ARGIND > Argind + 1 {
+ for (Argind++; Argind < ARGIND; Argind++)
+ zerofile(ARGV[Argind], Argind)
+ }
+
+ ARGIND != Argind { Argind = ARGIND }
+
+ END {
+ if (ARGIND > Argind)
+ for (Argind++; Argind <= ARGIND; Argind++)
+ zerofile(ARGV[Argind], Argind)
+ }
+
+ The user-level variable 'Argind' allows the 'awk' program to track
+its progress through 'ARGV'. Whenever the program detects that 'ARGIND'
+is greater than 'Argind + 1', it means that one or more empty files were
+skipped. The action then calls 'zerofile()' for each such file,
+incrementing 'Argind' along the way.
+
+ The 'Argind != ARGIND' rule simply keeps 'Argind' up to date in the
+normal case.
+
+ Finally, the 'END' rule catches the case of any empty files at the
+end of the command-line arguments. Note that the test in the condition
+of the 'for' loop uses the '<=' operator, not '<'.
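+
+ A minimal sketch of a 'zerofile()' function that the user might
+supply simply reports each empty file on the standard error output:
+
+     function zerofile(filename, argind)
+     {
+         printf("%s (argument %d) is empty, skipped\n",
+                filename, argind) > "/dev/stderr"
+     }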
+
+
+File: gawk.info, Node: Ignoring Assigns, Prev: Empty Files, Up: Data File Management
+
+10.3.5 Treating Assignments as File names
+-----------------------------------------
+
+Occasionally, you might not want 'awk' to process command-line variable
+assignments (*note Assignment Options::). In particular, if you have a
+file name that contains an '=' character, 'awk' treats the file name as
+an assignment and does not process it.
+
+ Some users have suggested an additional command-line option for
+'gawk' to disable command-line assignments. However, some simple
+programming with a library file does the trick:
+
+ # noassign.awk --- library file to avoid the need for a
+ # special option that disables command-line assignments
+
+ function disable_assigns(argc, argv, i)
+ {
+ for (i = 1; i < argc; i++)
+ if (argv[i] ~ /^[a-zA-Z_][a-zA-Z0-9_]*=.*/)
+ argv[i] = ("./" argv[i])
+ }
+
+ BEGIN {
+ if (No_command_assign)
+ disable_assigns(ARGC, ARGV)
+ }
+
+ You then run your program this way:
+
+ awk -v No_command_assign=1 -f noassign.awk -f yourprog.awk *
+
+ The function works by looping through the arguments. It prepends
+'./' to any argument that matches the form of a variable assignment,
+turning that argument into a file name.
+
+ The use of 'No_command_assign' allows you to disable command-line
+assignments at invocation time, by giving the variable a true value.
+When not set, it is initially zero (i.e., false), so the command-line
+arguments are left alone.
+
+
+File: gawk.info, Node: Getopt Function, Next: Passwd Functions, Prev: Data File Management, Up: Library Functions
+
+10.4 Processing Command-Line Options
+====================================
+
+Most utilities on POSIX-compatible systems take options on the command
+line that can be used to change the way a program behaves. 'awk' is an
+example of such a program (*note Options::). Often, options take
+"arguments" (i.e., data that the program needs to correctly obey the
+command-line option). For example, 'awk''s '-F' option requires a
+string to use as the field separator. The first occurrence on the
+command line of either '--' or a string that does not begin with '-'
+ends the options.
+
+ Modern Unix systems provide a C function named 'getopt()' for
+processing command-line arguments. The programmer provides a string
+describing the one-letter options. If an option requires an argument,
+it is followed in the string with a colon. 'getopt()' is also passed
+the count and values of the command-line arguments and is called in a
+loop. 'getopt()' processes the command-line arguments for option
+letters. Each time around the loop, it returns a single character
+representing the next option letter that it finds, or '?' if it finds an
+invalid option. When it returns -1, there are no options left on the
+command line.
+
+ When using 'getopt()', options that do not take arguments can be
+grouped together. Furthermore, options that take arguments require that
+the argument be present. The argument can immediately follow the option
+letter, or it can be a separate command-line argument.
+
+ Given a hypothetical program that takes three command-line options,
+'-a', '-b', and '-c', where '-b' requires an argument, all of the
+following are valid ways of invoking the program:
+
+ prog -a -b foo -c data1 data2 data3
+ prog -ac -bfoo -- data1 data2 data3
+ prog -acbfoo data1 data2 data3
+
+ Notice that when the argument is grouped with its option, the rest of
+the argument is considered to be the option's argument. In this
+example, '-acbfoo' indicates that all of the '-a', '-b', and '-c'
+options were supplied, and that 'foo' is the argument to the '-b'
+option.
+
+ 'getopt()' provides four external variables that the programmer can
+use:
+
+'optind'
+ The index in the argument value array ('argv') where the first
+ nonoption command-line argument can be found.
+
+'optarg'
+ The string value of the argument to an option.
+
+'opterr'
+ Usually 'getopt()' prints an error message when it finds an invalid
+ option. Setting 'opterr' to zero disables this feature. (An
+ application might want to print its own error message.)
+
+'optopt'
+ The letter representing the command-line option.
+
+ The following C fragment shows how 'getopt()' might process
+command-line arguments for 'awk':
+
+ int
+ main(int argc, char *argv[])
+ {
+ ...
+ /* print our own message */
+ opterr = 0;
+ while ((c = getopt(argc, argv, "v:f:F:W:")) != -1) {
+ switch (c) {
+ case 'f': /* file */
+ ...
+ break;
+ case 'F': /* field separator */
+ ...
+ break;
+ case 'v': /* variable assignment */
+ ...
+ break;
+ case 'W': /* extension */
+ ...
+ break;
+ case '?':
+ default:
+ usage();
+ break;
+ }
+ }
+ ...
+ }
+
+ As a side point, 'gawk' actually uses the GNU 'getopt_long()'
+function to process both normal and GNU-style long options (*note
+Options::).
+
+ The abstraction provided by 'getopt()' is very useful and is quite
+handy in 'awk' programs as well. Following is an 'awk' version of
+'getopt()'. This function highlights one of the greatest weaknesses in
+'awk', which is that it is very poor at manipulating single characters.
+Repeated calls to 'substr()' are necessary for accessing individual
+characters (*note String Functions::).(1)
+
+ The discussion that follows walks through the code a bit at a time:
+
+ # getopt.awk --- Do C library getopt(3) function in awk
+
+ # External variables:
+ # Optind -- index in ARGV of first nonoption argument
+ # Optarg -- string value of argument to current option
+ # Opterr -- if nonzero, print our own diagnostic
+ # Optopt -- current option letter
+
+ # Returns:
+ # -1 at end of options
+ # "?" for unrecognized option
+ # <c> a character representing the current option
+
+ # Private Data:
+ # _opti -- index in multiflag option, e.g., -abc
+
+ The function starts out with comments presenting a list of the global
+variables it uses, what the return values are, what they mean, and any
+global variables that are "private" to this library function. Such
+documentation is essential for any program, and particularly for library
+functions.
+
+ The 'getopt()' function first checks that it was indeed called with a
+string of options (the 'options' parameter). If 'options' has a zero
+length, 'getopt()' immediately returns -1:
+
+ function getopt(argc, argv, options, thisopt, i)
+ {
+ if (length(options) == 0) # no options given
+ return -1
+
+ if (argv[Optind] == "--") { # all done
+ Optind++
+ _opti = 0
+ return -1
+ } else if (argv[Optind] !~ /^-[^:[:space:]]/) {
+ _opti = 0
+ return -1
+ }
+
+ The next thing to check for is the end of the options. A '--' ends
+the command-line options, as does any command-line argument that does
+not begin with a '-'. 'Optind' is used to step through the array of
+command-line arguments; it retains its value across calls to 'getopt()',
+because it is a global variable.
+
+ The regular expression that is used, '/^-[^:[:space:]]/', checks for a
+'-' followed by anything that is not whitespace and not a colon. If the
+current command-line argument does not match this pattern, it is not an
+option, and it ends option processing. Continuing on:
+
+ if (_opti == 0)
+ _opti = 2
+ thisopt = substr(argv[Optind], _opti, 1)
+ Optopt = thisopt
+ i = index(options, thisopt)
+ if (i == 0) {
+ if (Opterr)
+ printf("%c -- invalid option\n", thisopt) > "/dev/stderr"
+ if (_opti >= length(argv[Optind])) {
+ Optind++
+ _opti = 0
+ } else
+ _opti++
+ return "?"
+ }
+
+ The '_opti' variable tracks the position in the current command-line
+argument ('argv[Optind]'). If multiple options are grouped together
+with one '-' (e.g., '-abx'), it is necessary to return them to the user
+one at a time.
+
+ If '_opti' is equal to zero, it is set to two, which is the index in
+the string of the next character to look at (we skip the '-', which is
+at position one). The variable 'thisopt' holds the character, obtained
+with 'substr()'. It is saved in 'Optopt' for the main program to use.
+
+ If 'thisopt' is not in the 'options' string, then it is an invalid
+option. If 'Opterr' is nonzero, 'getopt()' prints an error message on
+the standard error that is similar to the message from the C version of
+'getopt()'.
+
+ Because the option is invalid, it is necessary to skip it and move on
+to the next option character. If '_opti' is greater than or equal to
+the length of the current command-line argument, it is necessary to move
+on to the next argument, so 'Optind' is incremented and '_opti' is reset
+to zero. Otherwise, 'Optind' is left alone and '_opti' is merely
+incremented.
+
+ In any case, because the option is invalid, 'getopt()' returns '"?"'.
+The main program can examine 'Optopt' if it needs to know what the
+invalid option letter actually is. Continuing on:
+
+ if (substr(options, i + 1, 1) == ":") {
+ # get option argument
+ if (length(substr(argv[Optind], _opti + 1)) > 0)
+ Optarg = substr(argv[Optind], _opti + 1)
+ else
+ Optarg = argv[++Optind]
+ _opti = 0
+ } else
+ Optarg = ""
+
+ If the option requires an argument, the option letter is followed by
+a colon in the 'options' string. If there are remaining characters in
+the current command-line argument ('argv[Optind]'), then the rest of
+that string is assigned to 'Optarg'. Otherwise, the next command-line
+argument is used ('-xFOO' versus '-x FOO'). In either case, '_opti' is
+reset to zero, because there are no more characters left to examine in
+the current command-line argument. Continuing:
+
+ if (_opti == 0 || _opti >= length(argv[Optind])) {
+ Optind++
+ _opti = 0
+ } else
+ _opti++
+ return thisopt
+ }
+
+ Finally, if '_opti' is either zero or greater than the length of the
+current command-line argument, it means this element in 'argv' is
+through being processed, so 'Optind' is incremented to point to the next
+element in 'argv'. If neither condition is true, then only '_opti' is
+incremented, so that the next option letter can be processed on the next
+call to 'getopt()'.
+
+ The 'BEGIN' rule initializes both 'Opterr' and 'Optind' to one.
+'Opterr' is set to one, because the default behavior is for 'getopt()'
+to print a diagnostic message upon seeing an invalid option. 'Optind'
+is set to one, because there's no reason to look at the program name,
+which is in 'ARGV[0]':
+
+ BEGIN {
+ Opterr = 1 # default is to diagnose
+ Optind = 1 # skip ARGV[0]
+
+ # test program
+ if (_getopt_test) {
+ while ((_go_c = getopt(ARGC, ARGV, "ab:cd")) != -1)
+ printf("c = <%c>, Optarg = <%s>\n",
+ _go_c, Optarg)
+ printf("non-option arguments:\n")
+ for (; Optind < ARGC; Optind++)
+ printf("\tARGV[%d] = <%s>\n",
+ Optind, ARGV[Optind])
+ }
+ }
+
+ The rest of the 'BEGIN' rule is a simple test program. Here are the
+results of two sample runs of the test program:
+
+ $ awk -f getopt.awk -v _getopt_test=1 -- -a -cbARG bax -x
+ -| c = <a>, Optarg = <>
+ -| c = <c>, Optarg = <>
+ -| c = <b>, Optarg = <ARG>
+ -| non-option arguments:
+ -| ARGV[3] = <bax>
+ -| ARGV[4] = <-x>
+
+ $ awk -f getopt.awk -v _getopt_test=1 -- -a -x -- xyz abc
+ -| c = <a>, Optarg = <>
+ error-> x -- invalid option
+ -| c = <?>, Optarg = <>
+ -| non-option arguments:
+ -| ARGV[4] = <xyz>
+ -| ARGV[5] = <abc>
+
+ In both runs, the first '--' terminates the arguments to 'awk', so
+that it does not try to interpret the '-a', etc., as its own options.
+
+ NOTE: After 'getopt()' is through, user-level code must clear out
+ all the elements of 'ARGV' from 1 to 'Optind', so that 'awk' does
+ not try to process the command-line options as file names.
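+
+ For example, the sample programs later in this Info file do so with a
+loop like the following, which empties out everything up to (but not
+including) 'ARGV[Optind]':
+
+     for (i = 1; i < Optind; i++)
+         ARGV[i] = ""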
+
+ Using '#!' with the '-E' option may help avoid conflicts between your
+program's options and 'gawk''s options, as '-E' causes 'gawk' to abandon
+processing of further options (*note Executable Scripts:: and *note
+Options::).
+
+ Several of the sample programs presented in *note Sample Programs::,
+use 'getopt()' to process their arguments.
+
+ ---------- Footnotes ----------
+
+ (1) This function was written before 'gawk' acquired the ability to
+split strings into single characters using '""' as the separator. We
+have left it alone, as using 'substr()' is more portable.
+
+
+File: gawk.info, Node: Passwd Functions, Next: Group Functions, Prev: Getopt Function, Up: Library Functions
+
+10.5 Reading the User Database
+==============================
+
+The 'PROCINFO' array (*note Built-in Variables::) provides access to the
+current user's real and effective user and group ID numbers, and, if
+available, the user's supplementary group set. However, because these
+are numbers, they do not provide very useful information to the average
+user. There needs to be some way to find the user information
+associated with the user and group ID numbers. This minor node presents
+a suite of functions for retrieving information from the user database.
+*Note Group Functions:: for a similar suite that retrieves information
+from the group database.
+
+ The POSIX standard does not define the file where user information is
+kept. Instead, it provides the '<pwd.h>' header file and several C
+language subroutines for obtaining user information. The primary
+function is 'getpwent()', for "get password entry." The "password"
+comes from the original user database file, '/etc/passwd', which stores
+user information along with the encrypted passwords (hence the name).
+
+ Although an 'awk' program could simply read '/etc/passwd' directly,
+this file may not contain complete information about the system's set of
+users.(1) To be sure you are able to produce a readable and complete
+version of the user database, it is necessary to write a small C program
+that calls 'getpwent()'. 'getpwent()' is defined as returning a pointer
+to a 'struct passwd'. Each time it is called, it returns the next entry
+in the database. When there are no more entries, it returns 'NULL', the
+null pointer. When this happens, the C program should call 'endpwent()'
+to close the database. Following is 'pwcat', a C program that "cats"
+the password database:
+
+ /*
+ * pwcat.c
+ *
+ * Generate a printable version of the password database.
+ */
+ #include <stdio.h>
+ #include <pwd.h>
+
+ int
+ main(int argc, char **argv)
+ {
+ struct passwd *p;
+
+ while ((p = getpwent()) != NULL)
+ printf("%s:%s:%ld:%ld:%s:%s:%s\n",
+ p->pw_name, p->pw_passwd, (long) p->pw_uid,
+ (long) p->pw_gid, p->pw_gecos, p->pw_dir, p->pw_shell);
+
+ endpwent();
+ return 0;
+ }
+
+ If you don't understand C, don't worry about it. The output from
+'pwcat' is the user database, in the traditional '/etc/passwd' format of
+colon-separated fields. The fields are:
+
+Login name
+ The user's login name.
+
+Encrypted password
+ The user's encrypted password. This may not be available on some
+ systems.
+
+User-ID
+ The user's numeric user ID number. (On some systems, it's a C
+ 'long', and not an 'int'. Thus, we cast it to 'long' for all
+ cases.)
+
+Group-ID
+ The user's numeric group ID number. (Similar comments about 'long'
+ versus 'int' apply here.)
+
+Full name
+ The user's full name, and perhaps other information associated with
+ the user.
+
+Home directory
+ The user's login (or "home") directory (familiar to shell
+ programmers as '$HOME').
+
+Login shell
+ The program that is run when the user logs in. This is usually a
+ shell, such as Bash.
+
+ A few lines representative of 'pwcat''s output are as follows:
+
+ $ pwcat
+ -| root:x:0:1:Operator:/:/bin/sh
+ -| nobody:*:65534:65534::/:
+ -| daemon:*:1:1::/:
+ -| sys:*:2:2::/:/bin/csh
+ -| bin:*:3:3::/bin:
+ -| arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh
+ -| miriam:yxaay:112:10:Miriam Robbins:/home/miriam:/bin/sh
+ -| andy:abcca2:113:10:Andy Jacobs:/home/andy:/bin/sh
+ ...
+
+ With that introduction, following is a group of functions for getting
+user information. There are several functions here, corresponding to
+the C functions of the same names:
+
+ # passwd.awk --- access password file information
+
+ BEGIN {
+ # tailor this to suit your system
+ _pw_awklib = "/usr/local/libexec/awk/"
+ }
+
+ function _pw_init( oldfs, oldrs, olddol0, pwcat, using_fw, using_fpat)
+ {
+ if (_pw_inited)
+ return
+
+ oldfs = FS
+ oldrs = RS
+ olddol0 = $0
+ using_fw = (PROCINFO["FS"] == "FIELDWIDTHS")
+ using_fpat = (PROCINFO["FS"] == "FPAT")
+ FS = ":"
+ RS = "\n"
+
+ pwcat = _pw_awklib "pwcat"
+ while ((pwcat | getline) > 0) {
+ _pw_byname[$1] = $0
+ _pw_byuid[$3] = $0
+ _pw_bycount[++_pw_total] = $0
+ }
+ close(pwcat)
+ _pw_count = 0
+ _pw_inited = 1
+ FS = oldfs
+ if (using_fw)
+ FIELDWIDTHS = FIELDWIDTHS
+ else if (using_fpat)
+ FPAT = FPAT
+ RS = oldrs
+ $0 = olddol0
+ }
+
+ The 'BEGIN' rule sets a private variable to the directory where
+'pwcat' is stored. Because it is used to help out an 'awk' library
+routine, we have chosen to put it in '/usr/local/libexec/awk'; however,
+you might want it to be in a different directory on your system.
+
+ The function '_pw_init()' fills three copies of the user information
+into three associative arrays. The arrays are indexed by username
+('_pw_byname'), by user ID number ('_pw_byuid'), and by order of
+occurrence ('_pw_bycount'). The variable '_pw_inited' is used for
+efficiency, as '_pw_init()' needs to be called only once.
+
+ Because this function uses 'getline' to read information from
+'pwcat', it first saves the values of 'FS', 'RS', and '$0'. It notes in
+the variable 'using_fw' whether field splitting with 'FIELDWIDTHS' is in
+effect or not. Doing so is necessary, as these functions could be
+called from anywhere within a user's program, and the user may have his
+or her own way of splitting records and fields. This makes it possible
+to restore the correct field-splitting mechanism later. The test can
+only be true for 'gawk'. It is false if using 'FS' or 'FPAT', or on
+some other 'awk' implementation.
+
+ The code that checks for using 'FPAT', using 'using_fpat' and
+'PROCINFO["FS"]', is similar.
+
+ The main part of the function uses a loop to read database lines,
+split the lines into fields, and then store the lines into each array as
+necessary. When the loop is done, '_pw_init()' cleans up by closing the
+pipeline, setting '_pw_inited' to one, and restoring 'FS' (and
+'FIELDWIDTHS' or 'FPAT' if necessary), 'RS', and '$0'. (Assigning
+'FIELDWIDTHS' or 'FPAT' to itself looks like a no-op, but it causes
+'gawk' to resume that style of field splitting, which the earlier
+assignment to 'FS' had switched off.) The use of '_pw_count' is
+explained shortly.
+
+ The 'getpwnam()' function takes a username as a string argument. If
+that user is in the database, it returns the appropriate line.
+Otherwise, it relies on the array reference to a nonexistent element to
+create the element with the null string as its value:
+
+ function getpwnam(name)
+ {
+ _pw_init()
+ return _pw_byname[name]
+ }
+
+ Similarly, the 'getpwuid()' function takes a user ID number argument.
+If that user number is in the database, it returns the appropriate line.
+Otherwise, it returns the null string:
+
+ function getpwuid(uid)
+ {
+ _pw_init()
+ return _pw_byuid[uid]
+ }
+
+ The 'getpwent()' function simply steps through the database, one
+entry at a time. It uses '_pw_count' to track its current position in
+the '_pw_bycount' array:
+
+ function getpwent()
+ {
+ _pw_init()
+ if (_pw_count < _pw_total)
+ return _pw_bycount[++_pw_count]
+ return ""
+ }
+
+ The 'endpwent()' function resets '_pw_count' to zero, so that
+subsequent calls to 'getpwent()' start over again:
+
+ function endpwent()
+ {
+ _pw_count = 0
+ }
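+
+ As a simple illustration (a sketch only, not part of 'passwd.awk'; the
+file name and username are merely examples), a separate program file
+could look up a user's home directory like this:
+
+     # print_home.awk --- sketch: print the home directory of "arnold"
+     # run with, e.g.:  gawk -f passwd.awk -f print_home.awk
+     BEGIN {
+         pw = getpwnam("arnold")
+         if (pw != "") {
+             split(pw, fields, ":")
+             print fields[6]    # the sixth field is the home directory
+         } else
+             print "user not found" > "/dev/stderr"
+     }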
+
+ A conscious design decision in this suite is that each subroutine
+calls '_pw_init()' to initialize the database arrays. The overhead of
+running a separate process to generate the user database, and the I/O to
+scan it, are only incurred if the user's main program actually calls one
+of these functions. If this library file is loaded along with a user's
+program, but none of the routines are ever called, then there is no
+extra runtime overhead. (The alternative is to move the body of
+'_pw_init()' into a 'BEGIN' rule, which always runs 'pwcat'. This
+simplifies the code but runs an extra process that may never be needed.)
+
+ In turn, calling '_pw_init()' is not too expensive, because the
+'_pw_inited' variable keeps the program from reading the data more than
+once. If you are worried about squeezing every last cycle out of your
+'awk' program, the check of '_pw_inited' could be moved out of
+'_pw_init()' and duplicated in all the other functions. In practice,
+this is not necessary, as most 'awk' programs are I/O-bound, and such a
+change would clutter up the code.
+
+ The 'id' program in *note Id Program:: uses these functions.
+
+ ---------- Footnotes ----------
+
+ (1) It is often the case that password information is stored in a
+network database.
+
+
+File: gawk.info, Node: Group Functions, Next: Walking Arrays, Prev: Passwd Functions, Up: Library Functions
+
+10.6 Reading the Group Database
+===============================
+
+Much of the discussion presented in *note Passwd Functions:: applies to
+the group database as well. Although there has traditionally been a
+well-known file ('/etc/group') in a well-known format, the POSIX
+standard only provides a set of C library routines ('<grp.h>' and
+'getgrent()') for accessing the information. Even though this file may
+exist, it may not have complete information. Therefore, as with the
+user database, it is necessary to have a small C program that generates
+the group database as its output. 'grcat', a C program that "cats" the
+group database, is as follows:
+
+ /*
+ * grcat.c
+ *
+ * Generate a printable version of the group database.
+ */
+ #include <stdio.h>
+ #include <grp.h>
+
+ int
+ main(int argc, char **argv)
+ {
+ struct group *g;
+ int i;
+
+ while ((g = getgrent()) != NULL) {
+ printf("%s:%s:%ld:", g->gr_name, g->gr_passwd,
+ (long) g->gr_gid);
+ for (i = 0; g->gr_mem[i] != NULL; i++) {
+ printf("%s", g->gr_mem[i]);
+ if (g->gr_mem[i+1] != NULL)
+ putchar(',');
+ }
+ putchar('\n');
+ }
+ endgrent();
+ return 0;
+ }
+
+ Each line in the group database represents one group. The fields are
+separated with colons and represent the following information:
+
+Group Name
+ The group's name.
+
+Group Password
+ The group's encrypted password. In practice, this field is never
+ used; it is usually empty or set to '*'.
+
+Group ID Number
+ The group's numeric group ID number; the association of name to
+ number must be unique within the file. (On some systems it's a C
+ 'long', and not an 'int'. Thus, we cast it to 'long' for all
+ cases.)
+
+Group Member List
+ A comma-separated list of usernames. These users are members of
+ the group. Modern Unix systems allow users to be members of
+ several groups simultaneously. If your system does, then there are
+ elements '"group1"' through '"groupN"' in 'PROCINFO' for those
+ group ID numbers. (Note that 'PROCINFO' is a 'gawk' extension;
+ *note Built-in Variables::.)
+
+ Here is what running 'grcat' might produce:
+
+ $ grcat
+ -| wheel:*:0:arnold
+ -| nogroup:*:65534:
+ -| daemon:*:1:
+ -| kmem:*:2:
+ -| staff:*:10:arnold,miriam,andy
+ -| other:*:20:
+ ...
+
+ Here are the functions for obtaining information from the group
+database. There are several, modeled after the C library functions of
+the same names:
+
+ # group.awk --- functions for dealing with the group file
+
+ BEGIN {
+ # Change to suit your system
+ _gr_awklib = "/usr/local/libexec/awk/"
+ }
+
+ function _gr_init( oldfs, oldrs, olddol0, grcat,
+ using_fw, using_fpat, n, a, i)
+ {
+ if (_gr_inited)
+ return
+
+ oldfs = FS
+ oldrs = RS
+ olddol0 = $0
+ using_fw = (PROCINFO["FS"] == "FIELDWIDTHS")
+ using_fpat = (PROCINFO["FS"] == "FPAT")
+ FS = ":"
+ RS = "\n"
+
+ grcat = _gr_awklib "grcat"
+ while ((grcat | getline) > 0) {
+ if ($1 in _gr_byname)
+ _gr_byname[$1] = _gr_byname[$1] "," $4
+ else
+ _gr_byname[$1] = $0
+ if ($3 in _gr_bygid)
+ _gr_bygid[$3] = _gr_bygid[$3] "," $4
+ else
+ _gr_bygid[$3] = $0
+
+ n = split($4, a, "[ \t]*,[ \t]*")
+ for (i = 1; i <= n; i++)
+ if (a[i] in _gr_groupsbyuser)
+ _gr_groupsbyuser[a[i]] = _gr_groupsbyuser[a[i]] " " $1
+ else
+ _gr_groupsbyuser[a[i]] = $1
+
+ _gr_bycount[++_gr_count] = $0
+ }
+ close(grcat)
+ _gr_count = 0
+ _gr_inited++
+ FS = oldfs
+ if (using_fw)
+ FIELDWIDTHS = FIELDWIDTHS
+ else if (using_fpat)
+ FPAT = FPAT
+ RS = oldrs
+ $0 = olddol0
+ }
+
+ The 'BEGIN' rule sets a private variable to the directory where
+'grcat' is stored. Because it is used to help out an 'awk' library
+routine, we have chosen to put it in '/usr/local/libexec/awk'. You
+might want it to be in a different directory on your system.
+
+ These routines follow the same general outline as the user database
+routines (*note Passwd Functions::). The '_gr_inited' variable is used
+to ensure that the database is scanned no more than once. The
+'_gr_init()' function first saves 'FS', 'RS', and '$0', and then sets
+'FS' and 'RS' to the correct values for scanning the group information.
+It also takes care to note whether 'FIELDWIDTHS' or 'FPAT' is being
+used, and to restore the appropriate field-splitting mechanism.
+
+ The group information is stored in several associative arrays. The
+arrays are indexed by group name ('_gr_byname'), by group ID number
+('_gr_bygid'), and by position in the database ('_gr_bycount'). There
+is an additional array indexed by username ('_gr_groupsbyuser'), which
+is a space-separated list of groups to which each user belongs.
+
+ Unlike in the user database, it is possible to have multiple records
+in the database for the same group. This is common when a group has a
+large number of members. A pair of such entries might look like the
+following:
+
+ tvpeople:*:101:johnny,jay,arsenio
+ tvpeople:*:101:david,conan,tom,joan
+
+ For this reason, '_gr_init()' looks to see whether a group name or
+group ID number has already been seen. If so, the usernames are simply
+concatenated onto the previous list of users.(1)
+
+ Finally, '_gr_init()' closes the pipeline to 'grcat', restores 'FS'
+(and 'FIELDWIDTHS' or 'FPAT', if necessary), 'RS', and '$0', initializes
+'_gr_count' to zero (it is used later), and makes '_gr_inited' nonzero.
+
+ The 'getgrnam()' function takes a group name as its argument, and if
+that group exists, it is returned. Otherwise, it relies on the array
+reference to a nonexistent element to create the element with the null
+string as its value:
+
+ function getgrnam(group)
+ {
+ _gr_init()
+ return _gr_byname[group]
+ }
+
+ The 'getgrgid()' function is similar; it takes a numeric group ID and
+looks up the information associated with that group ID:
+
+ function getgrgid(gid)
+ {
+ _gr_init()
+ return _gr_bygid[gid]
+ }
+
+ The 'getgruser()' function does not have a C counterpart. It takes a
+username and returns the list of groups that have the user as a member:
+
+ function getgruser(user)
+ {
+ _gr_init()
+ return _gr_groupsbyuser[user]
+ }
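+
+ For example (a sketch along the same lines, with 'arnold' merely an
+illustrative username), a program could report that user's groups with:
+
+     BEGIN {
+         groups = getgruser("arnold")
+         if (groups != "")
+             print "arnold is in:", groups
+     }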
+
+ The 'getgrent()' function steps through the database one entry at a
+time. It uses '_gr_count' to track its position in the list:
+
+ function getgrent()
+ {
+ _gr_init()
+ if (++_gr_count in _gr_bycount)
+ return _gr_bycount[_gr_count]
+ return ""
+ }
+
+ The 'endgrent()' function resets '_gr_count' to zero so that
+'getgrent()' can start over again:
+
+ function endgrent()
+ {
+ _gr_count = 0
+ }
+
+ As with the user database routines, each function calls '_gr_init()'
+to initialize the arrays. Doing so only incurs the extra overhead of
+running 'grcat' if these functions are used (as opposed to moving the
+body of '_gr_init()' into a 'BEGIN' rule).
+
+ Most of the work is in scanning the database and building the various
+associative arrays. The functions that the user calls are themselves
+very simple, relying on 'awk''s associative arrays to do the work.
+
+ The 'id' program in *note Id Program:: uses these functions.
+
+ ---------- Footnotes ----------
+
+ (1) There is a subtle problem with the code just presented. Suppose
+that the first record seen for a group has no member names; this code
+then adds the names from later records with a leading comma. It also
+doesn't check that there is a '$4' at all.
+
+
+File: gawk.info, Node: Walking Arrays, Next: Library Functions Summary, Prev: Group Functions, Up: Library Functions
+
+10.7 Traversing Arrays of Arrays
+================================
+
+*note Arrays of Arrays:: described how 'gawk' provides arrays of arrays.
+In particular, any element of an array may be either a scalar or another
+array. The 'isarray()' function (*note Type Functions::) lets you
+distinguish an array from a scalar. The following function,
+'walk_array()', recursively traverses an array, printing the element
+indices and values. You call it with the array and a string
+representing the name of the array:
+
+ function walk_array(arr, name, i)
+ {
+ for (i in arr) {
+ if (isarray(arr[i]))
+ walk_array(arr[i], (name "[" i "]"))
+ else
+ printf("%s[%s] = %s\n", name, i, arr[i])
+ }
+ }
+
+It works by looping over each element of the array. If any given
+element is itself an array, the function calls itself recursively,
+passing the subarray and a new string representing the current index.
+Otherwise, the function simply prints the element's name, index, and
+value. Here is a main program to demonstrate:
+
+ BEGIN {
+ a[1] = 1
+ a[2][1] = 21
+ a[2][2] = 22
+ a[3] = 3
+ a[4][1][1] = 411
+ a[4][2] = 42
+
+ walk_array(a, "a")
+ }
+
+ When run, the program produces the following output:
+
+ $ gawk -f walk_array.awk
+ -| a[1] = 1
+ -| a[2][1] = 21
+ -| a[2][2] = 22
+ -| a[3] = 3
+ -| a[4][1][1] = 411
+ -| a[4][2] = 42
+
+ The function just presented simply prints the name and value of each
+scalar array element. However, it is easy to generalize it, by passing
+in the name of a function to call when walking an array. The modified
+function looks like this:
+
+ function process_array(arr, name, process, do_arrays, i, new_name)
+ {
+ for (i in arr) {
+ new_name = (name "[" i "]")
+ if (isarray(arr[i])) {
+ if (do_arrays)
+ @process(new_name, arr[i])
+ process_array(arr[i], new_name, process, do_arrays)
+ } else
+ @process(new_name, arr[i])
+ }
+ }
+
+ The arguments are as follows:
+
+'arr'
+ The array.
+
+'name'
+ The name of the array (a string).
+
+'process'
+ The name of the function to call.
+
+'do_arrays'
+ If this is true, the function can handle elements that are
+ subarrays.
+
+ If subarrays are to be processed, that is done before walking them
+further.
+
+ When run with the following scaffolding, the function produces the
+same results as does the earlier version of 'walk_array()':
+
+ BEGIN {
+ a[1] = 1
+ a[2][1] = 21
+ a[2][2] = 22
+ a[3] = 3
+ a[4][1][1] = 411
+ a[4][2] = 42
+
+ process_array(a, "a", "do_print", 0)
+ }
+
+ function do_print(name, element)
+ {
+ printf "%s = %s\n", name, element
+ }
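+
+ As an illustration of the 'do_arrays' capability (a sketch that is not
+part of the original scaffolding), the following process function also
+reports each subarray as it is encountered; it would be passed with the
+fourth argument set to one:
+
+     function note_element(name, element)
+     {
+         if (isarray(element))
+             printf("%s is a subarray\n", name)
+         else
+             printf("%s = %s\n", name, element)
+     }
+
+     # called as:  process_array(a, "a", "note_element", 1)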
+
+
+File: gawk.info, Node: Library Functions Summary, Next: Library Exercises, Prev: Walking Arrays, Up: Library Functions
+
+10.8 Summary
+============
+
+ * Reading programs is an excellent way to learn Good Programming.
+ The functions and programs provided in this major node and the next
+ are intended to serve that purpose.
+
+ * When writing general-purpose library functions, put some thought
+ into how to name any global variables so that they won't conflict
+ with variables from a user's program.
+
+ * The functions presented here fit into the following categories:
+
+ General problems
+ Number-to-string conversion, testing assertions, rounding,
+ random number generation, converting characters to numbers,
+ joining strings, getting easily usable time-of-day
+ information, and reading a whole file in one shot
+
+ Managing data files
+ Noting data file boundaries, rereading the current file,
+ checking for readable files, checking for zero-length files,
+ and treating assignments as file names
+
+ Processing command-line options
+ An 'awk' version of the standard C 'getopt()' function
+
+ Reading the user and group databases
+ Two sets of routines that parallel the C library versions
+
+ Traversing arrays of arrays
+ Two functions that traverse an array of arrays to any depth
+
+
+File: gawk.info, Node: Library Exercises, Prev: Library Functions Summary, Up: Library Functions
+
+10.9 Exercises
+==============
+
+ 1. In *note Empty Files::, we presented the 'zerofile.awk' program,
+ which made use of 'gawk''s 'ARGIND' variable. Can this problem be
+ solved without relying on 'ARGIND'? If so, how?
+
+ 2. As a related challenge, revise that code to handle the case where
+ an intervening value in 'ARGV' is a variable assignment.
+
+
+File: gawk.info, Node: Sample Programs, Next: Advanced Features, Prev: Library Functions, Up: Top
+
+11 Practical 'awk' Programs
+***************************
+
+*note Library Functions::, presents the idea that reading programs in a
+language contributes to learning that language. This major node
+continues that theme, presenting a potpourri of 'awk' programs for your
+reading enjoyment.
+
+ Many of these programs use library functions presented in *note
+Library Functions::.
+
+* Menu:
+
+* Running Examples:: How to run these examples.
+* Clones:: Clones of common utilities.
+* Miscellaneous Programs:: Some interesting 'awk' programs.
+* Programs Summary:: Summary of programs.
+* Programs Exercises:: Exercises.
+
+
+File: gawk.info, Node: Running Examples, Next: Clones, Up: Sample Programs
+
+11.1 Running the Example Programs
+=================================
+
+To run a given program, you would typically do something like this:
+
+ awk -f PROGRAM -- OPTIONS FILES
+
+Here, PROGRAM is the name of the 'awk' program (such as 'cut.awk'),
+OPTIONS are any command-line options for the program that start with a
+'-', and FILES are the actual data files.
+
+ If your system supports the '#!' executable interpreter mechanism
+(*note Executable Scripts::), you can instead run your program directly:
+
+ cut.awk -c1-8 myfiles > results
+
+ If your 'awk' is not 'gawk', you may instead need to use this:
+
+ cut.awk -- -c1-8 myfiles > results
+
+
+File: gawk.info, Node: Clones, Next: Miscellaneous Programs, Prev: Running Examples, Up: Sample Programs
+
+11.2 Reinventing Wheels for Fun and Profit
+==========================================
+
+This minor node presents a number of POSIX utilities implemented in
+'awk'. Reinventing these programs in 'awk' is often enjoyable, because
+the algorithms can be very clearly expressed, and the code is usually
+very concise and simple. This is true because 'awk' does so much for
+you.
+
+ It should be noted that these programs are not necessarily intended
+to replace the installed versions on your system. Nor may all of these
+programs be fully compliant with the most recent POSIX standard. This
+is not a problem; their purpose is to illustrate 'awk' language
+programming for "real-world" tasks.
+
+ The programs are presented in alphabetical order.
+
+* Menu:
+
+* Cut Program:: The 'cut' utility.
+* Egrep Program:: The 'egrep' utility.
+* Id Program:: The 'id' utility.
+* Split Program:: The 'split' utility.
+* Tee Program:: The 'tee' utility.
+* Uniq Program:: The 'uniq' utility.
+* Wc Program:: The 'wc' utility.
+
+
+File: gawk.info, Node: Cut Program, Next: Egrep Program, Up: Clones
+
+11.2.1 Cutting Out Fields and Columns
+-------------------------------------
+
+The 'cut' utility selects, or "cuts," characters or fields from its
+standard input and sends them to its standard output. Fields are
+separated by TABs by default, but you may supply a command-line option
+to change the field "delimiter" (i.e., the field-separator character).
+'cut''s definition of fields is less general than 'awk''s.
+
+ A common use of 'cut' might be to pull out just the login names of
+logged-on users from the output of 'who'. For example, the following
+pipeline generates a sorted, unique list of the logged-on users:
+
+ who | cut -c1-8 | sort | uniq
+
+ The options for 'cut' are:
+
+'-c LIST'
+ Use LIST as the list of characters to cut out. Items within the
+ list may be separated by commas, and ranges of characters can be
+ separated with dashes. The list '1-8,15,22-35' specifies
+ characters 1 through 8, 15, and 22 through 35.
+
+'-f LIST'
+ Use LIST as the list of fields to cut out.
+
+'-d DELIM'
+ Use DELIM as the field-separator character instead of the TAB
+ character.
+
+'-s'
+ Suppress printing of lines that do not contain the field delimiter.
+
+ The 'awk' implementation of 'cut' uses the 'getopt()' library
+function (*note Getopt Function::) and the 'join()' library function
+(*note Join Function::).
+
+ The program begins with a comment describing the options, the library
+functions needed, and a 'usage()' function that prints out a usage
+message and exits. 'usage()' is called if invalid arguments are
+supplied:
+
+ # cut.awk --- implement cut in awk
+
+ # Options:
+ # -f list Cut fields
+ # -d c Field delimiter character
+ # -c list Cut characters
+ #
+ # -s Suppress lines without the delimiter
+ #
+ # Requires getopt() and join() library functions
+
+ function usage()
+ {
+ print("usage: cut [-f list] [-d c] [-s] [files...]") > "/dev/stderr"
+ print("usage: cut [-c list] [files...]") > "/dev/stderr"
+ exit 1
+ }
+
+ Next comes a 'BEGIN' rule that parses the command-line options. It
+sets 'FS' to a single TAB character, because that is 'cut''s default
+field separator. The rule then sets the output field separator to be
+the same as the input field separator. A loop using 'getopt()' steps
+through the command-line options. Exactly one of the variables
+'by_fields' or 'by_chars' is set to true, to indicate that processing
+should be done by fields or by characters, respectively. When cutting
+by characters, the output field separator is set to the null string:
+
+ BEGIN {
+ FS = "\t" # default
+ OFS = FS
+ while ((c = getopt(ARGC, ARGV, "sf:c:d:")) != -1) {
+ if (c == "f") {
+ by_fields = 1
+ fieldlist = Optarg
+ } else if (c == "c") {
+ by_chars = 1
+ fieldlist = Optarg
+ OFS = ""
+ } else if (c == "d") {
+ if (length(Optarg) > 1) {
+ printf("cut: using first character of %s" \
+ " for delimiter\n", Optarg) > "/dev/stderr"
+ Optarg = substr(Optarg, 1, 1)
+ }
+ fs = FS = Optarg
+ OFS = FS
+ if (FS == " ") # defeat awk semantics
+ FS = "[ ]"
+ } else if (c == "s")
+ suppress = 1
+ else
+ usage()
+ }
+
+ # Clear out options
+ for (i = 1; i < Optind; i++)
+ ARGV[i] = ""
+
+ The code must take special care when the field delimiter is a space.
+Using a single space ('" "') for the value of 'FS' is incorrect--'awk'
+would separate fields with runs of spaces, TABs, and/or newlines, and we
+want them to be separated with individual spaces. To this end, we save
+the original space character in the variable 'fs' for later use; after
+setting 'FS' to '"[ ]"' we can't use it directly to see if the field
+delimiter character is in the string.
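+
+ For example (a small illustration at the shell, not part of
+'cut.awk'), with two spaces between the 'a' and the 'b':
+
+     $ echo 'a  b' | awk -F' ' '{ print NF }'
+     -| 2
+     $ echo 'a  b' | awk -F'[ ]' '{ print NF }'
+     -| 3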
+
+ Also remember that after 'getopt()' is through (as described in *note
+Getopt Function::), we have to clear out all the elements of 'ARGV' from
+1 to 'Optind', so that 'awk' does not try to process the command-line
+options as file names.
+
+ After dealing with the command-line options, the program verifies
+that the options make sense. Only one or the other of '-c' and '-f'
+should be used, and both require a field list. Then the program calls
+either 'set_fieldlist()' or 'set_charlist()' to pull apart the list of
+fields or characters:
+
+ if (by_fields && by_chars)
+ usage()
+
+ if (by_fields == 0 && by_chars == 0)
+ by_fields = 1 # default
+
+ if (fieldlist == "") {
+ print "cut: needs list for -c or -f" > "/dev/stderr"
+ exit 1
+ }
+
+ if (by_fields)
+ set_fieldlist()
+ else
+ set_charlist()
+ }
+
+ 'set_fieldlist()' splits the field list apart at the commas into an
+array. Then, for each element of the array, it looks to see if the
+element is actually a range, and if so, splits it apart. The function
+checks the range to make sure that the first number is smaller than the
+second. Each number in the list is added to the 'flist' array, which
+simply lists the fields that will be printed. Normal field splitting is
+used. The program lets 'awk' handle the job of doing the field
+splitting:
+
+ function set_fieldlist( n, m, i, j, k, f, g)
+ {
+ n = split(fieldlist, f, ",")
+ j = 1 # index in flist
+ for (i = 1; i <= n; i++) {
+ if (index(f[i], "-") != 0) { # a range
+ m = split(f[i], g, "-")
+ if (m != 2 || g[1] >= g[2]) {
+ printf("cut: bad field list: %s\n",
+ f[i]) > "/dev/stderr"
+ exit 1
+ }
+ for (k = g[1]; k <= g[2]; k++)
+ flist[j++] = k
+ } else
+ flist[j++] = f[i]
+ }
+ nfields = j - 1
+ }
+
+ The 'set_charlist()' function is more complicated than
+'set_fieldlist()'. The idea here is to use 'gawk''s 'FIELDWIDTHS'
+variable (*note Constant Size::), which describes constant-width input.
+When using a character list, that is exactly what we have.
+
+ Setting up 'FIELDWIDTHS' is more complicated than simply listing the
+fields that need to be printed. We have to keep track of the fields to
+print and also the intervening characters that have to be skipped. For
+example, suppose you wanted characters 1 through 8, 15, and 22 through
+35. You would use '-c 1-8,15,22-35'. The necessary value for
+'FIELDWIDTHS' is '"8 6 1 6 14"'. This yields five fields, and the
+fields to print are '$1', '$3', and '$5'. The intermediate fields are
+"filler", which is stuff in between the desired data. 'flist' lists the
+fields to print, and 't' tracks the complete field list, including
+filler fields:
+
+ function set_charlist( field, i, j, f, g, n, m, t,
+ filler, last, len)
+ {
+ field = 1 # count total fields
+ n = split(fieldlist, f, ",")
+ j = 1 # index in flist
+ for (i = 1; i <= n; i++) {
+ if (index(f[i], "-") != 0) { # range
+ m = split(f[i], g, "-")
+ if (m != 2 || g[1] >= g[2]) {
+ printf("cut: bad character list: %s\n",
+ f[i]) > "/dev/stderr"
+ exit 1
+ }
+ len = g[2] - g[1] + 1
+ if (g[1] > 1) # compute length of filler
+ filler = g[1] - last - 1
+ else
+ filler = 0
+ if (filler)
+ t[field++] = filler
+ t[field++] = len # length of field
+ last = g[2]
+ flist[j++] = field - 1
+ } else {
+ if (f[i] > 1)
+ filler = f[i] - last - 1
+ else
+ filler = 0
+ if (filler)
+ t[field++] = filler
+ t[field++] = 1
+ last = f[i]
+ flist[j++] = field - 1
+ }
+ }
+ FIELDWIDTHS = join(t, 1, field - 1)
+ nfields = j - 1
+ }
+
+ Next is the rule that processes the data. If the '-s' option is
+given, then 'suppress' is true. The first 'if' statement makes sure
+that the input record does have the field separator. If 'cut' is
+processing fields, 'suppress' is true, and the field separator character
+is not in the record, then the record is skipped.
+
+ If the record is valid, then 'gawk' has split the data into fields,
+either using the character in 'FS' or using fixed-length fields and
+'FIELDWIDTHS'. The loop goes through the list of fields that should be
+printed. The corresponding field is printed if it contains data. If
+the next field also has data, then the separator character is written
+out between the fields:
+
+ {
+ if (by_fields && suppress && index($0, fs) == 0)
+ next
+
+ for (i = 1; i <= nfields; i++) {
+ if ($flist[i] != "") {
+ printf "%s", $flist[i]
+ if (i < nfields && $flist[i+1] != "")
+ printf "%s", OFS
+ }
+ }
+ print ""
+ }
+
+ This version of 'cut' relies on 'gawk''s 'FIELDWIDTHS' variable to do
+the character-based cutting. It is possible in other 'awk'
+implementations to use 'substr()' (*note String Functions::), but it is
+also extremely painful. The 'FIELDWIDTHS' variable supplies an elegant
+solution to the problem of picking the input line apart by characters.
+
+
+File: gawk.info, Node: Egrep Program, Next: Id Program, Prev: Cut Program, Up: Clones
+
+11.2.2 Searching for Regular Expressions in Files
+-------------------------------------------------
+
+The 'egrep' utility searches files for patterns. It uses regular
+expressions that are almost identical to those available in 'awk' (*note
+Regexp::). You invoke it as follows:
+
+ 'egrep' [OPTIONS] ''PATTERN'' FILES ...
+
+ The PATTERN is a regular expression. In typical usage, the regular
+expression is quoted to prevent the shell from expanding any of the
+special characters as file name wildcards. Normally, 'egrep' prints the
+lines that matched. If multiple file names are provided on the command
+line, each output line is preceded by the name of the file and a colon.
+
+ The options to 'egrep' are as follows:
+
+'-c'
+ Print out a count of the lines that matched the pattern, instead of
+ the lines themselves.
+
+'-s'
+ Be silent. No output is produced and the exit value indicates
+ whether the pattern was matched.
+
+'-v'
+ Invert the sense of the test. 'egrep' prints the lines that do
+ _not_ match the pattern and exits successfully if the pattern is
+ not matched.
+
+'-i'
+ Ignore case distinctions in both the pattern and the input data.
+
+'-l'
+ Only print (list) the names of the files that matched, not the
+ lines that matched.
+
+'-e PATTERN'
+ Use PATTERN as the regexp to match. The purpose of the '-e' option
+ is to allow patterns that start with a '-'.
+
+ This version uses the 'getopt()' library function (*note Getopt
+Function::) and the file transition library program (*note Filetrans
+Function::).
+
+ The program begins with a descriptive comment and then a 'BEGIN' rule
+that processes the command-line arguments with 'getopt()'. The '-i'
+(ignore case) option is particularly easy with 'gawk'; we just use the
+'IGNORECASE' predefined variable (*note Built-in Variables::):
+
+ # egrep.awk --- simulate egrep in awk
+ #
+ # Options:
+ # -c count of lines
+ # -s silent - use exit value
+ # -v invert test, success if no match
+ # -i ignore case
+ # -l print filenames only
+ # -e argument is pattern
+ #
+ # Requires getopt and file transition library functions
+
+ BEGIN {
+ while ((c = getopt(ARGC, ARGV, "ce:svil")) != -1) {
+ if (c == "c")
+ count_only++
+ else if (c == "s")
+ no_print++
+ else if (c == "v")
+ invert++
+ else if (c == "i")
+ IGNORECASE = 1
+ else if (c == "l")
+ filenames_only++
+ else if (c == "e")
+ pattern = Optarg
+ else
+ usage()
+ }
+
+ Next comes the code that handles the 'egrep'-specific behavior. If
+no pattern is supplied with '-e', the first nonoption on the command
+line is used. The 'awk' command-line arguments up to 'ARGV[Optind]' are
+cleared, so that 'awk' won't try to process them as files. If no files
+are specified, the standard input is used, and if multiple files are
+specified, we make sure to note this so that the file names can precede
+the matched lines in the output:
+
+ if (pattern == "")
+ pattern = ARGV[Optind++]
+
+ for (i = 1; i < Optind; i++)
+ ARGV[i] = ""
+ if (Optind >= ARGC) {
+ ARGV[1] = "-"
+ ARGC = 2
+ } else if (ARGC - Optind > 1)
+ do_filenames++
+
+ # if (IGNORECASE)
+ # pattern = tolower(pattern)
+ }
+
+ The last two lines are commented out, as they are not needed in
+'gawk'. They should be uncommented if you have to use another version
+of 'awk'.
+
+ The next set of lines should be uncommented if you are not using
+'gawk'. This rule translates all the characters in the input line into
+lowercase if the '-i' option is specified.(1) The rule is commented out
+as it is not necessary with 'gawk':
+
+ #{
+ # if (IGNORECASE)
+ # $0 = tolower($0)
+ #}
+
+ The 'beginfile()' function is called by the rule in 'ftrans.awk' when
+each new file is processed. In this case, it is very simple; all it
+does is initialize a variable 'fcount' to zero. 'fcount' tracks how
+many lines in the current file matched the pattern. Naming the
+parameter 'junk' shows we know that 'beginfile()' is called with a
+parameter, but that we're not interested in its value:
+
+ function beginfile(junk)
+ {
+ fcount = 0
+ }
+
+ The 'endfile()' function is called after each file has been
+processed. It affects the output only when the user wants a count of
+the number of lines that matched. 'no_print' is true only if the exit
+status is desired. 'count_only' is true if line counts are desired.
+'egrep' therefore only prints line counts if printing and counting are
+enabled. The output format must be adjusted depending upon the number
+of files to process. Finally, 'fcount' is added to 'total', so that we
+know the total number of lines that matched the pattern:
+
+ function endfile(file)
+ {
+ if (! no_print && count_only) {
+ if (do_filenames)
+ print file ":" fcount
+ else
+ print fcount
+ }
+
+ total += fcount
+ }
+
+ The 'BEGINFILE' and 'ENDFILE' special patterns (*note
+BEGINFILE/ENDFILE::) could be used, but then the program would be
+'gawk'-specific. Additionally, this example was written before 'gawk'
+acquired 'BEGINFILE' and 'ENDFILE'.
+
+ The following rule does most of the work of matching lines. The
+variable 'matches' is true if the line matched the pattern. If the user
+wants lines that did not match, the sense of 'matches' is inverted using
+the '!' operator. 'fcount' is incremented with the value of 'matches',
+which is either one or zero, depending upon a successful or unsuccessful
+match. If the line does not match, the 'next' statement just moves on
+to the next record.
+
+ A number of additional tests are made, but they are only done if we
+are not counting lines. First, if the user only wants the exit status
+('no_print' is true), then it is enough to know that _one_ line in this
+file matched, and we can skip on to the next file with 'nextfile'.
+Similarly, if we are only printing file names, we can print the file
+name, and then skip to the next file with 'nextfile'. Finally, each
+line is printed, with a leading file name and colon if necessary:
+
+ {
+ matches = ($0 ~ pattern)
+ if (invert)
+ matches = ! matches
+
+ fcount += matches # 1 or 0
+
+ if (! matches)
+ next
+
+ if (! count_only) {
+ if (no_print)
+ nextfile
+
+ if (filenames_only) {
+ print FILENAME
+ nextfile
+ }
+
+ if (do_filenames)
+ print FILENAME ":" $0
+ else
+ print
+ }
+ }
+
+ The 'END' rule takes care of producing the correct exit status. If
+there are no matches, the exit status is one; otherwise, it is zero:
+
+ END {
+ exit (total == 0)
+ }
+
+ The 'usage()' function prints a usage message in case of invalid
+options, and then exits:
+
+ function usage()
+ {
+ print("Usage: egrep [-csvil] [-e pat] [files ...]") > "/dev/stderr"
+ print("\n\tegrep [-csvil] pat [files ...]") > "/dev/stderr"
+ exit 1
+ }
+
+ ---------- Footnotes ----------
+
+ (1) It also introduces a subtle bug; if a match happens, we output
+the translated line, not the original.
+
+
+File: gawk.info, Node: Id Program, Next: Split Program, Prev: Egrep Program, Up: Clones
+
+11.2.3 Printing Out User Information
+------------------------------------
+
+The 'id' utility lists a user's real and effective user ID numbers, real
+and effective group ID numbers, and the user's group set, if any. 'id'
+only prints the effective user ID and group ID if they are different
+from the real ones. If possible, 'id' also supplies the corresponding
+user and group names. The output might look like this:
+
+ $ id
+ -| uid=1000(arnold) gid=1000(arnold) groups=1000(arnold),4(adm),7(lp),27(sudo)
+
+ This information is part of what is provided by 'gawk''s 'PROCINFO'
+array (*note Built-in Variables::). However, the 'id' utility provides
+a more palatable output than just individual numbers.
+
+ Here is a simple version of 'id' written in 'awk'. It uses the user
+database library functions (*note Passwd Functions::) and the group
+database library functions (*note Group Functions::) from *note Library
+Functions::.
+
+ The program is fairly straightforward. All the work is done in the
+'BEGIN' rule. The user and group ID numbers are obtained from
+'PROCINFO'. The code is repetitive. The entry in the user database for
+the real user ID number is split into parts at the ':'. The name is the
+first field. Similar code is used for the effective user ID number and
+the group numbers:
+
+ # id.awk --- implement id in awk
+ #
+ # Requires user and group library functions
+ # output is:
+ # uid=12(foo) euid=34(bar) gid=3(baz) \
+ # egid=5(blat) groups=9(nine),2(two),1(one)
+
+ BEGIN {
+ uid = PROCINFO["uid"]
+ euid = PROCINFO["euid"]
+ gid = PROCINFO["gid"]
+ egid = PROCINFO["egid"]
+
+ printf("uid=%d", uid)
+ pw = getpwuid(uid)
+ pr_first_field(pw)
+
+ if (euid != uid) {
+ printf(" euid=%d", euid)
+ pw = getpwuid(euid)
+ pr_first_field(pw)
+ }
+
+ printf(" gid=%d", gid)
+ pw = getgrgid(gid)
+ pr_first_field(pw)
+
+ if (egid != gid) {
+ printf(" egid=%d", egid)
+ pw = getgrgid(egid)
+ pr_first_field(pw)
+ }
+
+ for (i = 1; ("group" i) in PROCINFO; i++) {
+ if (i == 1)
+ printf(" groups=")
+ group = PROCINFO["group" i]
+ printf("%d", group)
+ pw = getgrgid(group)
+ pr_first_field(pw)
+ if (("group" (i+1)) in PROCINFO)
+ printf(",")
+ }
+
+ print ""
+ }
+
+ function pr_first_field(str, a)
+ {
+ if (str != "") {
+ split(str, a, ":")
+ printf("(%s)", a[1])
+ }
+ }
+
+ The test in the 'for' loop is worth noting. Any supplementary groups
+in the 'PROCINFO' array have the indices '"group1"' through '"groupN"'
+for some N (i.e., the total number of supplementary groups). However,
+we don't know in advance how many of these groups there are.
+
+ This loop works by starting at one, concatenating the value with
+'"group"', and then using 'in' to see if that value is in the array
+(*note Reference to Elements::). Eventually, 'i' is incremented past
+the last group in the array and the loop exits.
+
+ The loop is also correct if there are _no_ supplementary groups; then
+the condition is false the first time it's tested, and the loop body
+never executes.
+
+ The 'pr_first_field()' function simply isolates out some code that is
+used repeatedly, making the whole program shorter and cleaner. In
+particular, moving the check for the empty string into this function
+saves several lines of code.
+
+
+File: gawk.info, Node: Split Program, Next: Tee Program, Prev: Id Program, Up: Clones
+
+11.2.4 Splitting a Large File into Pieces
+-----------------------------------------
+
+The 'split' program splits large text files into smaller pieces. Usage
+is as follows:(1)
+
+ 'split' ['-COUNT'] [FILE] [PREFIX]
+
+ By default, the output files are named 'xaa', 'xab', and so on. Each
+file has 1,000 lines in it, with the likely exception of the last file.
+To change the number of lines in each file, supply a number on the
+command line preceded with a minus sign (e.g., '-500' for files with 500
+lines in them instead of 1,000). To change the names of the output
+files to something like 'myfileaa', 'myfileab', and so on, supply an
+additional argument that specifies the file name prefix.
+
+ Here is a version of 'split' in 'awk'. It uses the 'ord()' and
+'chr()' functions presented in *note Ordinal Functions::.
+
+ The program first sets its defaults, and then tests to make sure
+there are not too many arguments. It then looks at each argument in
+turn. The first argument could be a minus sign followed by a number.
+If it is, this happens to look like a negative number, so it is made
+positive, and that is the count of lines. The data file name is skipped
+over and the final argument is used as the prefix for the output file
+names:
+
+ # split.awk --- do split in awk
+ #
+ # Requires ord() and chr() library functions
+ # usage: split [-count] [file] [outname]
+
+ BEGIN {
+ outfile = "x" # default
+ count = 1000
+ if (ARGC > 4)
+ usage()
+
+ i = 1
+ if (i in ARGV && ARGV[i] ~ /^-[[:digit:]]+$/) {
+ count = -ARGV[i]
+ ARGV[i] = ""
+ i++
+ }
+ # test argv in case reading from stdin instead of file
+ if (i in ARGV)
+ i++ # skip datafile name
+ if (i in ARGV) {
+ outfile = ARGV[i]
+ ARGV[i] = ""
+ }
+
+ s1 = s2 = "a"
+ out = (outfile s1 s2)
+ }
+
+ The next rule does most of the work. 'tcount' (temporary count)
+tracks how many lines have been printed to the output file so far. If
+it is greater than 'count', it is time to close the current file and
+start a new one. 's1' and 's2' track the current suffixes for the file
+name. If they are both 'z', the file is just too big. Otherwise, 's1'
+moves to the next letter in the alphabet and 's2' starts over again at
+'a':
+
+ {
+ if (++tcount > count) {
+ close(out)
+ if (s2 == "z") {
+ if (s1 == "z") {
+ printf("split: %s is too large to split\n",
+ FILENAME) > "/dev/stderr"
+ exit 1
+ }
+ s1 = chr(ord(s1) + 1)
+ s2 = "a"
+ }
+ else
+ s2 = chr(ord(s2) + 1)
+ out = (outfile s1 s2)
+ tcount = 1
+ }
+ print > out
+ }
+
+The 'usage()' function simply prints an error message and exits:
+
+ function usage()
+ {
+ print("usage: split [-num] [file] [outname]") > "/dev/stderr"
+ exit 1
+ }
+
+ This program is a bit sloppy; it relies on 'awk' to automatically
+close the last file instead of doing it in an 'END' rule. It also
+assumes that letters are contiguous in the character set, which isn't
+true for EBCDIC systems.
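+
+ For example, assuming that the 'ord()' and 'chr()' functions are in a
+file named 'ord.awk' (the file names here are only illustrative), you
+could split 'mydata' into 500-line pieces named 'myfileaa', 'myfileab',
+and so on, like this:
+
+     awk -f ord.awk -f split.awk -- -500 mydata myfile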
+
+ ---------- Footnotes ----------
+
+ (1) This is the traditional usage. The POSIX usage is different, but
+not relevant for what the program aims to demonstrate.
+
+
+File: gawk.info, Node: Tee Program, Next: Uniq Program, Prev: Split Program, Up: Clones
+
+11.2.5 Duplicating Output into Multiple Files
+---------------------------------------------
+
+The 'tee' program is known as a "pipe fitting." 'tee' copies its
+standard input to its standard output and also duplicates it to the
+files named on the command line. Its usage is as follows:
+
+ 'tee' ['-a'] FILE ...
+
+ The '-a' option tells 'tee' to append to the named files, instead of
+truncating them and starting over.
+
+ The 'BEGIN' rule first makes a copy of all the command-line arguments
+into an array named 'copy'. 'ARGV[0]' is not needed, so it is not
+copied. 'tee' cannot use 'ARGV' directly, because 'awk' attempts to
+process each file name in 'ARGV' as input data.
+
+ If the first argument is '-a', then the flag variable 'append' is set
+to true, and both 'ARGV[1]' and 'copy[1]' are deleted. If 'ARGC' is
+less than two, then no file names were supplied and 'tee' prints a usage
+message and exits. Finally, 'awk' is forced to read the standard input
+by setting 'ARGV[1]' to '"-"' and 'ARGC' to two:
+
+ # tee.awk --- tee in awk
+ #
+ # Copy standard input to all named output files.
+ # Append content if -a option is supplied.
+ #
+ BEGIN {
+ for (i = 1; i < ARGC; i++)
+ copy[i] = ARGV[i]
+
+ if (ARGV[1] == "-a") {
+ append = 1
+ delete ARGV[1]
+ delete copy[1]
+ ARGC--
+ }
+ if (ARGC < 2) {
+ print "usage: tee [-a] file ..." > "/dev/stderr"
+ exit 1
+ }
+ ARGV[1] = "-"
+ ARGC = 2
+ }
+
+ The following single rule does all the work. Because there is no
+pattern, it is executed for each line of input. The body of the rule
+simply prints the line into each file on the command line, and then to
+the standard output:
+
+ {
+ # moving the if outside the loop makes it run faster
+ if (append)
+ for (i in copy)
+ print >> copy[i]
+ else
+ for (i in copy)
+ print > copy[i]
+ print
+ }
+
+It is also possible to write the loop this way:
+
+ for (i in copy)
+ if (append)
+ print >> copy[i]
+ else
+ print > copy[i]
+
+This is more concise, but it is also less efficient. The 'if' is tested
+for each record and for each output file. By duplicating the loop body,
+the 'if' is only tested once for each input record. If there are N
+input records and M output files, the first method only executes N 'if'
+statements, while the second executes N'*'M 'if' statements.
+
+ Finally, the 'END' rule cleans up by closing all the output files:
+
+ END {
+ for (i in copy)
+ close(copy[i])
+ }
+
+
+File: gawk.info, Node: Uniq Program, Next: Wc Program, Prev: Tee Program, Up: Clones
+
+11.2.6 Printing Nonduplicated Lines of Text
+-------------------------------------------
+
+The 'uniq' utility reads sorted lines of data on its standard input, and
+by default removes duplicate lines. In other words, it only prints
+unique lines--hence the name. 'uniq' has a number of options. The
+usage is as follows:
+
+ 'uniq' ['-udc' ['-N']] ['+N'] [INPUTFILE [OUTPUTFILE]]
+
+ The options for 'uniq' are:
+
+'-d'
+ Print only repeated (duplicated) lines.
+
+'-u'
+ Print only nonrepeated (unique) lines.
+
+'-c'
+ Count lines. This option overrides '-d' and '-u'. Both repeated
+ and nonrepeated lines are counted.
+
+'-N'
+ Skip N fields before comparing lines. The definition of fields is
+ similar to 'awk''s default: nonwhitespace characters separated by
+ runs of spaces and/or TABs.
+
+'+N'
+ Skip N characters before comparing lines. Any fields specified
+ with '-N' are skipped first.
+
+'INPUTFILE'
+ Data is read from the input file named on the command line, instead
+ of from the standard input.
+
+'OUTPUTFILE'
+ The generated output is sent to the named output file, instead of
+ to the standard output.
+
+ Normally 'uniq' behaves as if both the '-d' and '-u' options are
+provided.
+
+ 'uniq' uses the 'getopt()' library function (*note Getopt Function::)
+and the 'join()' library function (*note Join Function::).
+
+ The program begins with a 'usage()' function and then a brief outline
+of the options and their meanings in comments. The 'BEGIN' rule deals
+with the command-line arguments and options. It uses a trick to get
+'getopt()' to handle options of the form '-25', treating such an option
+as the option letter '2' with an argument of '5'. If indeed two or more
+digits are supplied ('Optarg' looks like a number), 'Optarg' is
+concatenated with the option digit and then the result is added to zero
+to make it into a number. If there is only one digit in the option,
+then 'Optarg' is not needed. In this case, 'Optind' must be decremented
+so that 'getopt()' processes it next time. This code is admittedly a
+bit tricky.
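+
+ Here is the trick in isolation (a sketch, not part of 'uniq.awk'),
+showing how the option '-25' turns into the number 25:
+
+     BEGIN {
+         c = "2"                    # option letter seen by getopt()
+         Optarg = "5"               # rest of "-25" becomes the argument
+         fcount = (c Optarg) + 0    # "2" "5" --> "25" --> 25
+         print fcount
+     }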
+
+ If no options are supplied, then the default is taken, to print both
+repeated and nonrepeated lines. The output file, if provided, is
+assigned to 'outputfile'. Early on, 'outputfile' is initialized to the
+standard output, '/dev/stdout':
+
+ # uniq.awk --- do uniq in awk
+ #
+ # Requires getopt() and join() library functions
+
+ function usage()
+ {
+ print("Usage: uniq [-udc [-n]] [+n] [ in [ out ]]") > "/dev/stderr"
+ exit 1
+ }
+
+ # -c count lines. overrides -d and -u
+ # -d only repeated lines
+ # -u only nonrepeated lines
+ # -n skip n fields
+ # +n skip n characters, skip fields first
+
+ BEGIN {
+ count = 1
+ outputfile = "/dev/stdout"
+ opts = "udc0:1:2:3:4:5:6:7:8:9:"
+ while ((c = getopt(ARGC, ARGV, opts)) != -1) {
+ if (c == "u")
+ non_repeated_only++
+ else if (c == "d")
+ repeated_only++
+ else if (c == "c")
+ do_count++
+ else if (index("0123456789", c) != 0) {
+ # getopt() requires args to options
+ # this messes us up for things like -5
+ if (Optarg ~ /^[[:digit:]]+$/)
+ fcount = (c Optarg) + 0
+ else {
+ fcount = c + 0
+ Optind--
+ }
+ } else
+ usage()
+ }
+
+ if (ARGV[Optind] ~ /^\+[[:digit:]]+$/) {
+ charcount = substr(ARGV[Optind], 2) + 0
+ Optind++
+ }
+
+ for (i = 1; i < Optind; i++)
+ ARGV[i] = ""
+
+ if (repeated_only == 0 && non_repeated_only == 0)
+ repeated_only = non_repeated_only = 1
+
+ if (ARGC - Optind == 2) {
+ outputfile = ARGV[ARGC - 1]
+ ARGV[ARGC - 1] = ""
+ }
+ }
+
+ The following function, 'are_equal()', compares the current line,
+'$0', to the previous line, 'last'. It handles skipping fields and
+characters. If no field count and no character count are specified,
+'are_equal()' returns one or zero depending upon the result of a simple
+string comparison of 'last' and '$0'.
+
+ Otherwise, things get more complicated. If fields have to be
+skipped, each line is broken into an array using 'split()' (*note String
+Functions::); the desired fields are then joined back into a line using
+'join()'. The joined lines are stored in 'clast' and 'cline'. If no
+fields are skipped, 'clast' and 'cline' are set to 'last' and '$0',
+respectively. Finally, if characters are skipped, 'substr()' is used to
+strip off the leading 'charcount' characters in 'clast' and 'cline'.
+The two strings are then compared and 'are_equal()' returns the result:
+
+ function are_equal( n, m, clast, cline, alast, aline)
+ {
+ if (fcount == 0 && charcount == 0)
+ return (last == $0)
+
+ if (fcount > 0) {
+ n = split(last, alast)
+ m = split($0, aline)
+ clast = join(alast, fcount+1, n)
+ cline = join(aline, fcount+1, m)
+ } else {
+ clast = last
+ cline = $0
+ }
+ if (charcount) {
+ clast = substr(clast, charcount + 1)
+ cline = substr(cline, charcount + 1)
+ }
+
+ return (clast == cline)
+ }
+
+ The following two rules are the body of the program. The first one
+is executed only for the very first line of data. It sets 'last' equal
+to '$0', so that subsequent lines of text have something to be compared
+to.
+
+ The second rule does the work. The variable 'equal' is one or zero,
+depending upon the results of 'are_equal()''s comparison. If 'uniq' is
+counting repeated lines, and the lines are equal, then it increments the
+'count' variable. Otherwise, it prints the line and resets 'count',
+because the two lines are not equal.
+
+ If 'uniq' is not counting, and if the lines are equal, 'count' is
+incremented. Nothing is printed, as the point is to remove duplicates.
+Otherwise, if 'uniq' is counting repeated lines and more than one line
+is seen, or if 'uniq' is counting nonrepeated lines and only one line is
+seen, then the line is printed, and 'count' is reset.
+
+ Finally, similar logic is used in the 'END' rule to print the final
+line of input data:
+
+ NR == 1 {
+ last = $0
+ next
+ }
+
+ {
+ equal = are_equal()
+
+ if (do_count) { # overrides -d and -u
+ if (equal)
+ count++
+ else {
+ printf("%4d %s\n", count, last) > outputfile
+ last = $0
+ count = 1 # reset
+ }
+ next
+ }
+
+ if (equal)
+ count++
+ else {
+ if ((repeated_only && count > 1) ||
+ (non_repeated_only && count == 1))
+ print last > outputfile
+ last = $0
+ count = 1
+ }
+ }
+
+ END {
+ if (do_count)
+ printf("%4d %s\n", count, last) > outputfile
+ else if ((repeated_only && count > 1) ||
+ (non_repeated_only && count == 1))
+ print last > outputfile
+ close(outputfile)
+ }
+
+
+File: gawk.info, Node: Wc Program, Prev: Uniq Program, Up: Clones
+
+11.2.7 Counting Things
+----------------------
+
+The 'wc' (word count) utility counts lines, words, and characters in one
+or more input files. Its usage is as follows:
+
+ 'wc' ['-lwc'] [FILES ...]
+
+ If no files are specified on the command line, 'wc' reads its
+standard input. If there are multiple files, it also prints total
+counts for all the files. The options and their meanings are as
+follows:
+
+'-l'
+ Count only lines.
+
+'-w'
+ Count only words. A "word" is a contiguous sequence of
+ nonwhitespace characters, separated by spaces and/or TABs.
+ Luckily, this is the normal way 'awk' separates fields in its input
+ data.
+
+'-c'
+ Count only characters.
+
+ Implementing 'wc' in 'awk' is particularly elegant, because 'awk'
+does a lot of the work for us; it splits lines into words (i.e., fields)
+and counts them, it counts lines (i.e., records), and it can easily tell
+us how long a line is.
+
+ This program uses the 'getopt()' library function (*note Getopt
+Function::) and the file-transition functions (*note Filetrans
+Function::).
+
+ This version has one notable difference from traditional versions of
+'wc': it always prints the counts in the order lines, words, and
+characters. Traditional versions note the order of the '-l', '-w', and
+'-c' options on the command line, and print the counts in that order.
+
+ The 'BEGIN' rule does the argument processing. The variable
+'print_total' is true if more than one file is named on the command
+line:
+
+ # wc.awk --- count lines, words, characters
+
+ # Options:
+ # -l only count lines
+ # -w only count words
+ # -c only count characters
+ #
+ # Default is to count lines, words, characters
+ #
+ # Requires getopt() and file transition library functions
+
+ BEGIN {
+ # let getopt() print a message about
+ # invalid options. we ignore them
+ while ((c = getopt(ARGC, ARGV, "lwc")) != -1) {
+ if (c == "l")
+ do_lines = 1
+ else if (c == "w")
+ do_words = 1
+ else if (c == "c")
+ do_chars = 1
+ }
+ for (i = 1; i < Optind; i++)
+ ARGV[i] = ""
+
+ # if no options, do all
+ if (! do_lines && ! do_words && ! do_chars)
+ do_lines = do_words = do_chars = 1
+
+ print_total = (ARGC - i > 1)
+ }
+
+ The 'beginfile()' function is simple; it just resets the counts of
+lines, words, and characters to zero, and saves the current file name in
+'fname':
+
+ function beginfile(file)
+ {
+ lines = words = chars = 0
+ fname = FILENAME
+ }
+
+ The 'endfile()' function adds the current file's numbers to the
+running totals of lines, words, and characters. It then prints out
+those numbers for the file that was just read. It relies on
+'beginfile()' to reset the numbers for the following data file:
+
+ function endfile(file)
+ {
+ tlines += lines
+ twords += words
+ tchars += chars
+ if (do_lines)
+ printf "\t%d", lines
+ if (do_words)
+ printf "\t%d", words
+ if (do_chars)
+ printf "\t%d", chars
+ printf "\t%s\n", fname
+ }
+
+ There is one rule that is executed for each line. It adds the length
+of the record, plus one, to 'chars'.(1) Adding one plus the record
+length is needed because the newline character separating records (the
+value of 'RS') is not part of the record itself, and thus not included
+in its length. Next, 'lines' is incremented for each line read, and
+'words' is incremented by the value of 'NF', which is the number of
+"words" on this line:
+
+ # do per line
+ {
+ chars += length($0) + 1 # get newline
+ lines++
+ words += NF
+ }
+
+ Finally, the 'END' rule simply prints the totals for all the files:
+
+ END {
+ if (print_total) {
+ if (do_lines)
+ printf "\t%d", tlines
+ if (do_words)
+ printf "\t%d", twords
+ if (do_chars)
+ printf "\t%d", tchars
+ print "\ttotal"
+ }
+ }
+
+ ---------- Footnotes ----------
+
+ (1) Because 'gawk' understands multibyte locales, this code counts
+characters, not bytes.
+
+
+File: gawk.info, Node: Miscellaneous Programs, Next: Programs Summary, Prev: Clones, Up: Sample Programs
+
+11.3 A Grab Bag of 'awk' Programs
+=================================
+
+This minor node is a large "grab bag" of miscellaneous programs. We
+hope you find them both interesting and enjoyable.
+
+* Menu:
+
+* Dupword Program:: Finding duplicated words in a document.
+* Alarm Program:: An alarm clock.
+* Translate Program:: A program similar to the 'tr' utility.
+* Labels Program:: Printing mailing labels.
+* Word Sorting:: A program to produce a word usage count.
+* History Sorting:: Eliminating duplicate entries from a history
+ file.
+* Extract Program:: Pulling out programs from Texinfo source
+ files.
+* Simple Sed:: A Simple Stream Editor.
+* Igawk Program:: A wrapper for 'awk' that includes
+ files.
+* Anagram Program:: Finding anagrams from a dictionary.
+* Signature Program:: People do amazing things with too much time on
+ their hands.
+
+
+File: gawk.info, Node: Dupword Program, Next: Alarm Program, Up: Miscellaneous Programs
+
+11.3.1 Finding Duplicated Words in a Document
+---------------------------------------------
+
+A common error when writing large amounts of prose is to accidentally
+duplicate words. Typically you will see this in text as something like
+"the the program does the following..." When the text is online, often
+the duplicated words occur at the end of one line and the beginning of
+another, making them very difficult to spot.
+
+ This program, 'dupword.awk', scans through a file one line at a time
+and looks for adjacent occurrences of the same word. It also saves the
+last word on a line (in the variable 'prev') for comparison with the
+first word on the next line.
+
+ The first statement makes sure that the line is all lowercase, so
+that, for example, "The" and "the" compare equal to each other. The
+next statement replaces nonalphanumeric and nonwhitespace characters
+with spaces, so that punctuation does not affect the comparison either.
+The characters are replaced with spaces so that formatting controls
+don't create nonsense words (e.g., the Texinfo '@code{NF}' becomes
+'codeNF' if punctuation is simply deleted). The record is then resplit
+into fields, yielding just the actual words on the line, and ensuring
+that there are no empty fields.
+
+ If there are no fields left after removing all the punctuation, the
+current record is skipped. Otherwise, the program loops through each
+word, comparing it to the previous one:
+
+ # dupword.awk --- find duplicate words in text
+ {
+ $0 = tolower($0)
+ gsub(/[^[:alnum:][:blank:]]/, " ");
+ $0 = $0 # re-split
+ if (NF == 0)
+ next
+ if ($1 == prev)
+ printf("%s:%d: duplicate %s\n",
+ FILENAME, FNR, $1)
+ for (i = 2; i <= NF; i++)
+ if ($i == $(i-1))
+ printf("%s:%d: duplicate %s\n",
+ FILENAME, FNR, $i)
+ prev = $NF
+ }
+
+
+File: gawk.info, Node: Alarm Program, Next: Translate Program, Prev: Dupword Program, Up: Miscellaneous Programs
+
+11.3.2 An Alarm Clock Program
+-----------------------------
+
+ Nothing cures insomnia like a ringing alarm clock.
+ -- _Arnold Robbins_
+ Sleep is for web developers.
+ -- _Erik Quanstrom_
+
+ The following program is a simple "alarm clock" program. You give it
+a time of day and an optional message. At the specified time, it prints
+the message on the standard output. In addition, you can give it the
+number of times to repeat the message as well as a delay between
+repetitions.
+
+ This program uses the 'getlocaltime()' function from *note
+Getlocaltime Function::.
+
+ All the work is done in the 'BEGIN' rule. The first part is argument
+checking and setting of defaults: the delay, the count, and the message
+to print. If the user supplied a message without the ASCII BEL
+character (known as the "alert" character, '"\a"'), then it is added to
+the message. (On many systems, printing the ASCII BEL generates an
+audible alert. Thus, when the alarm goes off, the system calls
+attention to itself in case the user is not looking at the computer.)
+Just for a change, this program uses a 'switch' statement (*note Switch
+Statement::), but the processing could be done with a series of
+'if'-'else' statements instead. Here is the program:
+
+ # alarm.awk --- set an alarm
+ #
+ # Requires getlocaltime() library function
+ # usage: alarm time [ "message" [ count [ delay ] ] ]
+
+ BEGIN {
+ # Initial argument sanity checking
+ usage1 = "usage: alarm time ['message' [count [delay]]]"
+ usage2 = sprintf("\t(%s) time ::= hh:mm", ARGV[1])
+
+ if (ARGC < 2) {
+ print usage1 > "/dev/stderr"
+ print usage2 > "/dev/stderr"
+ exit 1
+ }
+ switch (ARGC) {
+ case 5:
+ delay = ARGV[4] + 0
+ # fall through
+ case 4:
+ count = ARGV[3] + 0
+ # fall through
+ case 3:
+ message = ARGV[2]
+ break
+ default:
+ if (ARGV[1] !~ /[[:digit:]]?[[:digit:]]:[[:digit:]]{2}/) {
+ print usage1 > "/dev/stderr"
+ print usage2 > "/dev/stderr"
+ exit 1
+ }
+ break
+ }
+
+ # set defaults for once we reach the desired time
+ if (delay == 0)
+ delay = 180 # 3 minutes
+ if (count == 0)
+ count = 5
+ if (message == "")
+ message = sprintf("\aIt is now %s!\a", ARGV[1])
+ else if (index(message, "\a") == 0)
+ message = "\a" message "\a"
+
+ The next minor node of code turns the alarm time into hours and
+minutes, converts it (if necessary) to a 24-hour clock, and then turns
+that time into a count of the seconds since midnight. Next it turns the
+current time into a count of seconds since midnight. The difference
+between the two is how long to wait before setting off the alarm:
+
+ # split up alarm time
+ split(ARGV[1], atime, ":")
+ hour = atime[1] + 0 # force numeric
+ minute = atime[2] + 0 # force numeric
+
+ # get current broken down time
+ getlocaltime(now)
+
+ # if time given is 12-hour hours and it's after that
+ # hour, e.g., `alarm 5:30' at 9 a.m. means 5:30 p.m.,
+ # then add 12 to real hour
+ if (hour < 12 && now["hour"] > hour)
+ hour += 12
+
+ # set target time in seconds since midnight
+ target = (hour * 60 * 60) + (minute * 60)
+
+ # get current time in seconds since midnight
+ current = (now["hour"] * 60 * 60) + \
+ (now["minute"] * 60) + now["second"]
+
+ # how long to sleep for
+ naptime = target - current
+ if (naptime <= 0) {
+ print "alarm: time is in the past!" > "/dev/stderr"
+ exit 1
+ }
+
+ Finally, the program uses the 'system()' function (*note I/O
+Functions::) to call the 'sleep' utility. The 'sleep' utility simply
+pauses for the given number of seconds. If the exit status is not zero,
+the program assumes that 'sleep' was interrupted and exits. If 'sleep'
+exited with an OK status (zero), then the program prints the message in
+a loop, again using 'sleep' to delay for however many seconds are
+necessary:
+
+ # zzzzzz..... go away if interrupted
+ if (system(sprintf("sleep %d", naptime)) != 0)
+ exit 1
+
+ # time to notify!
+ command = sprintf("sleep %d", delay)
+ for (i = 1; i <= count; i++) {
+ print message
+ # if sleep command interrupted, go away
+ if (system(command) != 0)
+ break
+ }
+
+ exit 0
+ }
+
+
+File: gawk.info, Node: Translate Program, Next: Labels Program, Prev: Alarm Program, Up: Miscellaneous Programs
+
+11.3.3 Transliterating Characters
+---------------------------------
+
+The system 'tr' utility transliterates characters. For example, it is
+often used to map uppercase letters into lowercase for further
+processing:
+
+ GENERATE DATA | tr 'A-Z' 'a-z' | PROCESS DATA ...
+
+ 'tr' requires two lists of characters.(1) When processing the input,
+the first character in the first list is replaced with the first
+character in the second list, the second character in the first list is
+replaced with the second character in the second list, and so on. If
+there are more characters in the "from" list than in the "to" list, the
+last character of the "to" list is used for the remaining characters in
+the "from" list.
+
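+ For example, assuming a 'tr' that behaves as just described, a
+shorter "to" list is padded with its final character (this example is
+not part of the program that follows):
+
+ $ echo abcd | tr 'abcd' 'xy'
+ -| xyyy
+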
+ Once upon a time, a user proposed adding a transliteration function
+to 'gawk'. The following program was written to prove that character
+transliteration could be done with a user-level function. This program
+is not as complete as the system 'tr' utility, but it does most of the
+job.
+
+ The 'translate' program was written long before 'gawk' acquired the
+ability to split each character in a string into separate array
+elements. Thus, it makes repeated use of the 'substr()', 'index()', and
+'gsub()' built-in functions (*note String Functions::). There are two
+functions. The first, 'stranslate()', takes three arguments:
+
+'from'
+ A list of characters from which to translate
+
+'to'
+ A list of characters to which to translate
+
+'target'
+ The string on which to do the translation
+
+ Associative arrays make the translation part fairly easy. 't_ar'
+holds the "to" characters, indexed by the "from" characters. Then a
+simple loop goes through 'from', one character at a time. For each
+character in 'from', if the character appears in 'target', it is
+replaced with the corresponding 'to' character.
+
+ The 'translate()' function calls 'stranslate()', using '$0' as the
+target. The main program sets two global variables, 'FROM' and 'TO',
+from the command line, and then changes 'ARGV' so that 'awk' reads from
+the standard input.
+
+ Finally, the processing rule simply calls 'translate()' for each
+record:
+
+ # translate.awk --- do tr-like stuff
+ # Bugs: does not handle things like tr A-Z a-z; it has
+ # to be spelled out. However, if `to' is shorter than `from',
+ # the last character in `to' is used for the rest of `from'.
+
+ function stranslate(from, to, target, lf, lt, ltarget, t_ar, i, c,
+ result)
+ {
+ lf = length(from)
+ lt = length(to)
+ ltarget = length(target)
+ for (i = 1; i <= lt; i++)
+ t_ar[substr(from, i, 1)] = substr(to, i, 1)
+ if (lt < lf)
+ for (; i <= lf; i++)
+ t_ar[substr(from, i, 1)] = substr(to, lt, 1)
+ for (i = 1; i <= ltarget; i++) {
+ c = substr(target, i, 1)
+ if (c in t_ar)
+ c = t_ar[c]
+ result = result c
+ }
+ return result
+ }
+
+ function translate(from, to)
+ {
+ return $0 = stranslate(from, to, $0)
+ }
+
+ # main program
+ BEGIN {
+ if (ARGC < 3) {
+ print "usage: translate from to" > "/dev/stderr"
+ exit
+ }
+ FROM = ARGV[1]
+ TO = ARGV[2]
+ ARGC = 2
+ ARGV[1] = "-"
+ }
+
+ {
+ translate(FROM, TO)
+ print
+ }
+
+ It is possible to do character transliteration in a user-level
+function, but it is not necessarily efficient, and we (the 'gawk'
+developers) started to consider adding a built-in function. However,
+shortly after writing this program, we learned that Brian Kernighan had
+added the 'toupper()' and 'tolower()' functions to his 'awk' (*note
+String Functions::). These functions handle the vast majority of the
+cases where character transliteration is necessary, and so we chose to
+simply add those functions to 'gawk' as well and then leave well enough
+alone.
+
+ An obvious improvement to this program would be to set up the 't_ar'
+array only once, in a 'BEGIN' rule. However, this assumes that the
+"from" and "to" lists will never change throughout the lifetime of the
+program.
+
+ Another obvious improvement is to enable the use of ranges, such as
+'a-z', as allowed by the 'tr' utility. Look at the code for 'cut.awk'
+(*note Cut Program::) for inspiration.
+
+ ---------- Footnotes ----------
+
+ (1) On some older systems, including Solaris, the system version of
+'tr' may require that the lists be written as range expressions enclosed
+in square brackets ('[a-z]') and quoted, to prevent the shell from
+attempting a file name expansion. This is not a feature.
+
+
+File: gawk.info, Node: Labels Program, Next: Word Sorting, Prev: Translate Program, Up: Miscellaneous Programs
+
+11.3.4 Printing Mailing Labels
+------------------------------
+
+Here is a "real-world"(1) program. This script reads lists of names and
+addresses and generates mailing labels. Each page of labels has 20
+labels on it, two across and 10 down. The addresses are guaranteed to
+be no more than five lines of data. Each address is separated from the
+next by a blank line.
+
+ The basic idea is to read 20 labels' worth of data. Each line of
+each label is stored in the 'line' array. The single rule takes care of
+filling the 'line' array and printing the page when 20 labels have been
+read.
+
+ The 'BEGIN' rule simply sets 'RS' to the empty string, so that 'awk'
+splits records at blank lines (*note Records::). It sets 'MAXLINES' to
+100, because 100 is the maximum number of lines on the page (20 * 5 =
+100).
+
+ Most of the work is done in the 'printpage()' function. The label
+lines are stored sequentially in the 'line' array. But they have to
+print horizontally: 'line[1]' next to 'line[6]', 'line[2]' next to
+'line[7]', and so on. Two loops accomplish this. The outer loop,
+controlled by 'i', steps through every 10 lines of data; this is each
+row of labels. The inner loop, controlled by 'j', goes through the
+lines within the row. As 'j' goes from 0 to 4, 'i+j' is the 'j'th line
+in the row, and 'i+j+5' is the entry next to it. The output ends up
+looking something like this:
+
+ line 1 line 6
+ line 2 line 7
+ line 3 line 8
+ line 4 line 9
+ line 5 line 10
+ ...
+
+The 'printf' format string '%-41s' left-aligns the data and prints it
+within a fixed-width field.
+
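+ For example, the same idea with a narrower width, so that the
+padding is easier to see (this one-liner is not part of 'labels.awk'):
+
+ $ gawk 'BEGIN { printf "%-10s|%-10s|\n", "line 1", "line 6" }'
+ -| line 1    |line 6    |
+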
+ As a final note, an extra blank line is printed at lines 21 and 61,
+to keep the output lined up on the labels. This is dependent on the
+particular brand of labels in use when the program was written. You
+will also note that there are two blank lines at the top and two blank
+lines at the bottom.
+
+ The 'END' rule arranges to flush the final page of labels; there may
+not have been an even multiple of 20 labels in the data:
+
+ # labels.awk --- print mailing labels
+
+ # Each label is 5 lines of data that may have blank lines.
+ # The label sheets have 2 blank lines at the top and 2 at
+ # the bottom.
+
+ BEGIN { RS = "" ; MAXLINES = 100 }
+
+ function printpage( i, j)
+ {
+ if (Nlines <= 0)
+ return
+
+ printf "\n\n" # header
+
+ for (i = 1; i <= Nlines; i += 10) {
+ if (i == 21 || i == 61)
+ print ""
+ for (j = 0; j < 5; j++) {
+ if (i + j > MAXLINES)
+ break
+ printf " %-41s %s\n", line[i+j], line[i+j+5]
+ }
+ print ""
+ }
+
+ printf "\n\n" # footer
+
+ delete line
+ }
+
+ # main rule
+ {
+ if (Count >= 20) {
+ printpage()
+ Count = 0
+ Nlines = 0
+ }
+ n = split($0, a, "\n")
+ for (i = 1; i <= n; i++)
+ line[++Nlines] = a[i]
+ for (; i <= 5; i++)
+ line[++Nlines] = ""
+ Count++
+ }
+
+ END {
+ printpage()
+ }
+
+ ---------- Footnotes ----------
+
+ (1) "Real world" is defined as "a program actually used to get
+something done."
+
+
+File: gawk.info, Node: Word Sorting, Next: History Sorting, Prev: Labels Program, Up: Miscellaneous Programs
+
+11.3.5 Generating Word-Usage Counts
+-----------------------------------
+
+When working with large amounts of text, it can be interesting to know
+how often different words appear. For example, an author may overuse
+certain words, in which case he or she might wish to find synonyms to
+substitute for words that appear too often. This node develops a
+program for counting words and presenting the frequency information in a
+useful format.
+
+ At first glance, a program like this would seem to do the job:
+
+ # wordfreq-first-try.awk --- print list of word frequencies
+
+ {
+ for (i = 1; i <= NF; i++)
+ freq[$i]++
+ }
+
+ END {
+ for (word in freq)
+ printf "%s\t%d\n", word, freq[word]
+ }
+
+ The program relies on 'awk''s default field-splitting mechanism to
+break each line up into "words" and uses an associative array named
+'freq', indexed by each word, to count the number of times the word
+occurs. In the 'END' rule, it prints the counts.
+
+ This program has several problems that prevent it from being useful
+on real text files:
+
+ * The 'awk' language considers upper- and lowercase characters to be
+ distinct. Therefore, "bartender" and "Bartender" are not treated
+ as the same word. This is undesirable, because words are
+ capitalized if they begin sentences in normal text, and a frequency
+ analyzer should not be sensitive to capitalization.
+
+ * Words are detected using the 'awk' convention that fields are
+ separated just by whitespace. Other characters in the input
+ (except newlines) don't have any special meaning to 'awk'. This
+ means that punctuation characters count as part of words.
+
+ * The output does not come out in any useful order. You're more
+ likely to be interested in which words occur most frequently or in
+ having an alphabetized table of how frequently each word occurs.
+
+ The first problem can be solved by using 'tolower()' to remove case
+distinctions. The second problem can be solved by using 'gsub()' to
+remove punctuation characters. Finally, we solve the third problem by
+using the system 'sort' utility to process the output of the 'awk'
+script. Here is the new version of the program:
+
+ # wordfreq.awk --- print list of word frequencies
+
+ {
+ $0 = tolower($0) # remove case distinctions
+ # remove punctuation
+ gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
+ for (i = 1; i <= NF; i++)
+ freq[$i]++
+ }
+
+ END {
+ for (word in freq)
+ printf "%s\t%d\n", word, freq[word]
+ }
+
+ The regexp '/[^[:alnum:]_[:blank:]]/' might have been written
+'/[[:punct:]]/', but then underscores would also be removed, and we want
+to keep them.
+
+ Assuming we have saved this program in a file named 'wordfreq.awk',
+and that the data is in 'file1', the following pipeline:
+
+ awk -f wordfreq.awk file1 | sort -k 2nr
+
+produces a table of the words appearing in 'file1' in order of
+decreasing frequency.
+
+ The 'awk' program suitably massages the data and produces a word
+frequency table, which is not ordered. The 'awk' script's output is
+then sorted by the 'sort' utility and printed on the screen.
+
+ The options given to 'sort' specify a sort that uses the second field
+of each input line (skipping one field), that the sort keys should be
+treated as numeric quantities (otherwise '15' would come before '5'),
+and that the sorting should be done in descending (reverse) order.
+
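+ For example, given a tiny two-line table (hypothetical data, shown
+only to illustrate the options):
+
+ $ printf 'of 5\nthe 15\n' | sort -k 2nr
+ -| the 15
+ -| of 5
+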
+ The 'sort' could even be done from within the program, by changing
+the 'END' action to:
+
+ END {
+ sort = "sort -k 2nr"
+ for (word in freq)
+ printf "%s\t%d\n", word, freq[word] | sort
+ close(sort)
+ }
+
+ This way of sorting must be used on systems that do not have true
+pipes at the command-line (or batch-file) level. See the general
+operating system documentation for more information on how to use the
+'sort' program.
+
+
+File: gawk.info, Node: History Sorting, Next: Extract Program, Prev: Word Sorting, Up: Miscellaneous Programs
+
+11.3.6 Removing Duplicates from Unsorted Text
+---------------------------------------------
+
+The 'uniq' program (*note Uniq Program::) removes duplicate lines from
+_sorted_ data.
+
+ Suppose, however, you need to remove duplicate lines from a data file
+but that you want to preserve the order the lines are in. A good
+example of this might be a shell history file. The history file keeps a
+copy of all the commands you have entered, and it is not unusual to
+repeat a command several times in a row. Occasionally you might want to
+compact the history by removing duplicate entries. Yet it is desirable
+to maintain the order of the original commands.
+
+ This simple program does the job. It uses two arrays. The 'data'
+array is indexed by the text of each line. For each line, 'data[$0]' is
+incremented. If a particular line has not been seen before, then
+'data[$0]' is zero. In this case, the text of the line is stored in
+'lines[count]'. Each element of 'lines' is a unique command, and the
+indices of 'lines' indicate the order in which those lines are
+encountered. The 'END' rule simply prints out the lines, in order:
+
+ # histsort.awk --- compact a shell history file
+ # Thanks to Byron Rakitzis for the general idea
+
+ {
+ if (data[$0]++ == 0)
+ lines[++count] = $0
+ }
+
+ END {
+ for (i = 1; i <= count; i++)
+ print lines[i]
+ }
+
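+ For example, a short sample run might look like this (the input is
+hypothetical):
+
+ $ printf 'ls\nls\ncd /tmp\nls\n' | gawk -f histsort.awk
+ -| ls
+ -| cd /tmp
+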
+ This program also provides a foundation for generating other useful
+information. For example, using the following 'print' statement in the
+'END' rule indicates how often a particular command is used:
+
+ print data[lines[i]], lines[i]
+
+This works because 'data[$0]' is incremented each time a line is seen.
+
+
+File: gawk.info, Node: Extract Program, Next: Simple Sed, Prev: History Sorting, Up: Miscellaneous Programs
+
+11.3.7 Extracting Programs from Texinfo Source Files
+----------------------------------------------------
+
+The nodes *note Library Functions::, and *note Sample Programs::, are
+the top level nodes for a large number of 'awk' programs. If you want
+to experiment with these programs, it is tedious to type them in by
+hand. Here we present a program that can extract parts of a Texinfo
+input file into separate files.
+
+ This Info file is written in Texinfo
+(http://www.gnu.org/software/texinfo/), the GNU Project's document
+formatting language. A single Texinfo source file can be used to
+produce both printed documentation, with TeX, and online documentation.
+(The Texinfo language is described fully, starting with *note (Texinfo,
+texinfo,Texinfo---The GNU Documentation Format)Top::.)
+
+ For our purposes, it is enough to know three things about Texinfo
+input files:
+
+ * The "at" symbol ('@') is special in Texinfo, much as the backslash
+ ('\') is in C or 'awk'. Literal '@' symbols are represented in
+ Texinfo source files as '@@'.
+
+ * Comments start with either '@c' or '@comment'. The file-extraction
+ program works by using special comments that start at the beginning
+ of a line.
+
+ * Lines containing '@group' and '@end group' commands bracket example
+ text that should not be split across a page boundary.
+ (Unfortunately, TeX isn't always smart enough to do things exactly
+ right, so we have to give it some help.)
+
+ The following program, 'extract.awk', reads through a Texinfo source
+file and does two things, based on the special comments. Upon seeing
+'@c system ...', it runs a command, by extracting the command text from
+the control line and passing it on to the 'system()' function (*note I/O
+Functions::). Upon seeing '@c file FILENAME', each subsequent line is
+sent to the file FILENAME, until '@c endfile' is encountered. The rules
+in 'extract.awk' match either '@c' or '@comment' by letting the 'omment'
+part be optional. Lines containing '@group' and '@end group' are simply
+removed. 'extract.awk' uses the 'join()' library function (*note Join
+Function::).
+
+ The example programs in the online Texinfo source for 'GAWK:
+Effective AWK Programming' ('gawktexi.in') have all been bracketed
+inside 'file' and 'endfile' lines. The 'gawk' distribution uses a copy
+of 'extract.awk' to extract the sample programs and install many of them
+in a standard directory where 'gawk' can find them. The Texinfo file
+looks something like this:
+
+ ...
+ This program has a @code{BEGIN} rule
+ that prints a nice message:
+
+ @example
+ @c file examples/messages.awk
+ BEGIN @{ print "Don't panic!" @}
+ @c endfile
+ @end example
+
+ It also prints some final advice:
+
+ @example
+ @c file examples/messages.awk
+ END @{ print "Always avoid bored archaeologists!" @}
+ @c endfile
+ @end example
+ ...
+
+ 'extract.awk' begins by setting 'IGNORECASE' to one, so that mixed
+upper- and lowercase letters in the directives won't matter.
+
+ The first rule handles calling 'system()', checking that a command is
+given ('NF' is at least three) and also checking that the command exits
+with a zero exit status, signifying OK:
+
+ # extract.awk --- extract files and run programs from Texinfo files
+
+ BEGIN { IGNORECASE = 1 }
+
+ /^@c(omment)?[ \t]+system/ {
+ if (NF < 3) {
+ e = ("extract: " FILENAME ":" FNR)
+ e = (e ": badly formed `system' line")
+ print e > "/dev/stderr"
+ next
+ }
+ $1 = ""
+ $2 = ""
+ stat = system($0)
+ if (stat != 0) {
+ e = ("extract: " FILENAME ":" FNR)
+ e = (e ": warning: system returned " stat)
+ print e > "/dev/stderr"
+ }
+ }
+
+The variable 'e' is used so that the rule fits nicely on the screen.
+
+ The second rule handles moving data into files. It verifies that a
+file name is given in the directive. If the file named is not the
+current file, then the current file is closed. Keeping the current file
+open until a new file is encountered allows the use of the '>'
+redirection for printing the contents, keeping open-file management
+simple.
+
+ The 'for' loop does the work. It reads lines using 'getline' (*note
+Getline::). For an unexpected end-of-file, it calls the
+'unexpected_eof()' function. If the line is an "endfile" line, then it
+breaks out of the loop. If the line is an '@group' or '@end group'
+line, then it ignores it and goes on to the next line. Similarly,
+comments within examples are also ignored.
+
+ Most of the work is in the following few lines. If the line has no
+'@' symbols, the program can print it directly. Otherwise, each leading
+'@' must be stripped off. To remove the '@' symbols, the line is split
+into separate elements of the array 'a', using the 'split()' function
+(*note String Functions::). The '@' symbol is used as the separator
+character. Each element of 'a' that is empty indicates two successive
+'@' symbols in the original line. For each two empty elements ('@@' in
+the original file), we have to add a single '@' symbol back in.
+
+ When the processing of the array is finished, 'join()' is called with
+the value of 'SUBSEP' (*note Multidimensional::), to rejoin the pieces
+back into a single line. That line is then printed to the output file:
+
+ /^@c(omment)?[ \t]+file/ {
+ if (NF != 3) {
+ e = ("extract: " FILENAME ":" FNR ": badly formed `file' line")
+ print e > "/dev/stderr"
+ next
+ }
+ if ($3 != curfile) {
+ if (curfile != "")
+ close(curfile)
+ curfile = $3
+ }
+
+ for (;;) {
+ if ((getline line) <= 0)
+ unexpected_eof()
+ if (line ~ /^@c(omment)?[ \t]+endfile/)
+ break
+ else if (line ~ /^@(end[ \t]+)?group/)
+ continue
+ else if (line ~ /^@c(omment+)?[ \t]+/)
+ continue
+ if (index(line, "@") == 0) {
+ print line > curfile
+ continue
+ }
+ n = split(line, a, "@")
+ # if a[1] == "", means leading @,
+ # don't add one back in.
+ for (i = 2; i <= n; i++) {
+ if (a[i] == "") { # was an @@
+ a[i] = "@"
+ if (a[i+1] == "")
+ i++
+ }
+ }
+ print join(a, 1, n, SUBSEP) > curfile
+ }
+ }
+
+ An important thing to note is the use of the '>' redirection. Output
+done with '>' only opens the file once; it stays open and subsequent
+output is appended to the file (*note Redirection::). This makes it
+easy to mix program text and explanatory prose for the same sample
+source file (as has been done here!) without any hassle. The file is
+only closed when a new data file name is encountered or at the end of
+the input file.
+
+ Finally, the function 'unexpected_eof()' prints an appropriate error
+message and then exits. The 'END' rule handles the final cleanup,
+closing the open file:
+
+ function unexpected_eof()
+ {
+ printf("extract: %s:%d: unexpected EOF or error\n",
+ FILENAME, FNR) > "/dev/stderr"
+ exit 1
+ }
+
+ END {
+ if (curfile)
+ close(curfile)
+ }
+
+
+File: gawk.info, Node: Simple Sed, Next: Igawk Program, Prev: Extract Program, Up: Miscellaneous Programs
+
+11.3.8 A Simple Stream Editor
+-----------------------------
+
+The 'sed' utility is a "stream editor", a program that reads a stream of
+data, makes changes to it, and passes it on. It is often used to make
+global changes to a large file or to a stream of data generated by a
+pipeline of commands. Although 'sed' is a complicated program in its
+own right, its most common use is to perform global substitutions in the
+middle of a pipeline:
+
+ COMMAND1 < orig.data | sed 's/old/new/g' | COMMAND2 > result
+
+ Here, 's/old/new/g' tells 'sed' to look for the regexp 'old' on each
+input line and globally replace it with the text 'new' (i.e., all the
+occurrences on a line). This is similar to 'awk''s 'gsub()' function
+(*note String Functions::).
+
+ The following program, 'awksed.awk', accepts at least two
+command-line arguments: the pattern to look for and the text to replace
+it with. Any additional arguments are treated as data file names to
+process. If none are provided, the standard input is used:
+
+ # awksed.awk --- do s/foo/bar/g using just print
+ # Thanks to Michael Brennan for the idea
+
+ function usage()
+ {
+ print "usage: awksed pat repl [files...]" > "/dev/stderr"
+ exit 1
+ }
+
+ BEGIN {
+ # validate arguments
+ if (ARGC < 3)
+ usage()
+
+ RS = ARGV[1]
+ ORS = ARGV[2]
+
+ # don't use arguments as files
+ ARGV[1] = ARGV[2] = ""
+ }
+
+ # look ma, no hands!
+ {
+ if (RT == "")
+ printf "%s", $0
+ else
+ print
+ }
+
+ The program relies on 'gawk''s ability to have 'RS' be a regexp, as
+well as on the setting of 'RT' to the actual text that terminates the
+record (*note Records::).
+
+ The idea is to have 'RS' be the pattern to look for. 'gawk'
+automatically sets '$0' to the text between matches of the pattern.
+This is text that we want to keep, unmodified. Then, by setting 'ORS'
+to the replacement text, a simple 'print' statement outputs the text we
+want to keep, followed by the replacement text.
+
+ There is one wrinkle to this scheme, which is what to do if the last
+record doesn't end with text that matches 'RS'. Using a 'print'
+statement unconditionally prints the replacement text, which is not
+correct. However, if the file did not end in text that matches 'RS',
+'RT' is set to the null string. In this case, we can print '$0' using
+'printf' (*note Printf::).
+
+ The 'BEGIN' rule handles the setup, checking for the right number of
+arguments and calling 'usage()' if there is a problem. Then it sets
+'RS' and 'ORS' from the command-line arguments and sets 'ARGV[1]' and
+'ARGV[2]' to the null string, so that they are not treated as file names
+(*note ARGC and ARGV::).
+
+ The 'usage()' function prints an error message and exits. Finally,
+the single rule handles the printing scheme outlined earlier, using
+'print' or 'printf' as appropriate, depending upon the value of 'RT'.
+
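+ For example, a quick run might look like this (hypothetical input,
+shown only to illustrate the idea):
+
+ $ echo 'hello world' | gawk -f awksed.awk 'o' '0'
+ -| hell0 w0rld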
+
+File: gawk.info, Node: Igawk Program, Next: Anagram Program, Prev: Simple Sed, Up: Miscellaneous Programs
+
+11.3.9 An Easy Way to Use Library Functions
+-------------------------------------------
+
+In *note Include Files::, we saw how 'gawk' provides a built-in
+file-inclusion capability. However, this is a 'gawk' extension. This
+minor node provides the motivation for making file inclusion available
+for standard 'awk', and shows how to do it using a combination of shell
+and 'awk' programming.
+
+ Using library functions in 'awk' can be very beneficial. It
+encourages code reuse and the writing of general functions. Programs
+are smaller and therefore clearer. However, using library functions is
+only easy when writing 'awk' programs; it is painful when running them,
+requiring multiple '-f' options. If 'gawk' is unavailable, then so too
+is the 'AWKPATH' environment variable and the ability to put 'awk'
+functions into a library directory (*note Options::). It would be nice
+to be able to write programs in the following manner:
+
+ # library functions
+ @include getopt.awk
+ @include join.awk
+ ...
+
+ # main program
+ BEGIN {
+ while ((c = getopt(ARGC, ARGV, "a:b:cde")) != -1)
+ ...
+ ...
+ }
+
+ The following program, 'igawk.sh', provides this service. It
+simulates 'gawk''s searching of the 'AWKPATH' variable and also allows
+"nested" includes (i.e., a file that is included with '@include' can
+contain further '@include' statements). 'igawk' makes an effort to only
+include files once, so that nested includes don't accidentally include a
+library function twice.
+
+ 'igawk' should behave just like 'gawk' externally. This means it
+should accept all of 'gawk''s command-line arguments, including the
+ability to have multiple source files specified via '-f' and the ability
+to mix command-line and library source files.
+
+ The program is written using the POSIX Shell ('sh') command
+language.(1) It works as follows:
+
+ 1. Loop through the arguments, saving anything that doesn't represent
+ 'awk' source code for later, when the expanded program is run.
+
+ 2. For any arguments that do represent 'awk' text, put the arguments
+ into a shell variable that will be expanded. There are two cases:
+
+ a. Literal text, provided with '-e' or '--source'. This text is
+ just appended directly.
+
+ b. Source file names, provided with '-f'. We use a neat trick
+ and append '@include FILENAME' to the shell variable's
+ contents. Because the file-inclusion program works the way
+ 'gawk' does, this gets the text of the file included in the
+ program at the correct point.
+
+ 3. Run an 'awk' program (naturally) over the shell variable's contents
+ to expand '@include' statements. The expanded program is placed in
+ a second shell variable.
+
+ 4. Run the expanded program with 'gawk' and any other original
+ command-line arguments that the user supplied (such as the data
+ file names).
+
+ This program uses shell variables extensively: for storing
+command-line arguments and the text of the 'awk' program that will
+expand the user's program, for the user's original program, and for the
+expanded program. Doing so removes some potential problems that might
+arise were we to use temporary files instead, at the cost of making the
+script somewhat more complicated.
+
+ The initial part of the program turns on shell tracing if the first
+argument is 'debug'.
+
+ The next part loops through all the command-line arguments. There
+are several cases of interest:
+
+'--'
+ This ends the arguments to 'igawk'. Anything else should be passed
+ on to the user's 'awk' program without being evaluated.
+
+'-W'
+ This indicates that the next option is specific to 'gawk'. To make
+ argument processing easier, the '-W' is appended to the front of
+ the remaining arguments and the loop continues. (This is an 'sh'
+ programming trick. Don't worry about it if you are not familiar
+ with 'sh'.)
+
+'-v', '-F'
+ These are saved and passed on to 'gawk'.
+
+'-f', '--file', '--file=', '-Wfile='
+ The file name is appended to the shell variable 'program' with an
+ '@include' statement. The 'expr' utility is used to remove the
+ leading option part of the argument (e.g., '--file='). (Typical
+ 'sh' usage would be to use the 'echo' and 'sed' utilities to do
+ this work. Unfortunately, some versions of 'echo' evaluate escape
+ sequences in their arguments, possibly mangling the program text.
+ Using 'expr' avoids this problem.)
+
+'--source', '--source=', '-Wsource='
+ The source text is appended to 'program'.
+
+'--version', '-Wversion'
+ 'igawk' prints its version number, runs 'gawk --version' to get the
+ 'gawk' version information, and then exits.
+
+ If none of the '-f', '--file', '-Wfile', '--source', or '-Wsource'
+arguments are supplied, then the first nonoption argument should be the
+'awk' program. If there are no command-line arguments left, 'igawk'
+prints an error message and exits. Otherwise, the first argument is
+appended to 'program'. In any case, after the arguments have been
+processed, the shell variable 'program' contains the complete text of
+the original 'awk' program.
+
+ The program is as follows:
+
+ #! /bin/sh
+ # igawk --- like gawk but do @include processing
+
+ if [ "$1" = debug ]
+ then
+ set -x
+ shift
+ fi
+
+ # A literal newline, so that program text is formatted correctly
+ n='
+ '
+
+ # Initialize variables to empty
+ program=
+ opts=
+
+ while [ $# -ne 0 ] # loop over arguments
+ do
+ case $1 in
+ --) shift
+ break ;;
+
+ -W) shift
+ # The ${x?'message here'} construct prints a
+ # diagnostic if $x is the null string
+ set -- -W"${@?'missing operand'}"
+ continue ;;
+
+ -[vF]) opts="$opts $1 '${2?'missing operand'}'"
+ shift ;;
+
+ -[vF]*) opts="$opts '$1'" ;;
+
+ -f) program="$program$n@include ${2?'missing operand'}"
+ shift ;;
+
+ -f*) f=$(expr "$1" : '-f\(.*\)')
+ program="$program$n@include $f" ;;
+
+ -[W-]file=*)
+ f=$(expr "$1" : '-.file=\(.*\)')
+ program="$program$n@include $f" ;;
+
+ -[W-]file)
+ program="$program$n@include ${2?'missing operand'}"
+ shift ;;
+
+ -[W-]source=*)
+ t=$(expr "$1" : '-.source=\(.*\)')
+ program="$program$n$t" ;;
+
+ -[W-]source)
+ program="$program$n${2?'missing operand'}"
+ shift ;;
+
+ -[W-]version)
+ echo igawk: version 3.0 1>&2
+ gawk --version
+ exit 0 ;;
+
+ -[W-]*) opts="$opts '$1'" ;;
+
+ *) break ;;
+ esac
+ shift
+ done
+
+ if [ -z "$program" ]
+ then
+ program=${1?'missing program'}
+ shift
+ fi
+
+ # At this point, `program' has the program.
+
+ The 'awk' program to process '@include' directives is stored in the
+shell variable 'expand_prog'. Doing this keeps the shell script
+readable. The 'awk' program reads through the user's program, one line
+at a time, using 'getline' (*note Getline::). The input file names and
+'@include' statements are managed using a stack. As each '@include' is
+encountered, the current file name is "pushed" onto the stack and the
+file named in the '@include' directive becomes the current file name.
+As each file is finished, the stack is "popped," and the previous input
+file becomes the current input file again. The process is started by
+making the original file the first one on the stack.
+
+ The 'pathto()' function does the work of finding the full path to a
+file. It simulates 'gawk''s behavior when searching the 'AWKPATH'
+environment variable (*note AWKPATH Variable::). If a file name has a
+'/' in it, no path search is done. Similarly, if the file name is
+'"-"', then that string is used as-is. Otherwise, the file name is
+concatenated with the name of each directory in the path, and an attempt
+is made to open the generated file name. The only way to test if a file
+can be read in 'awk' is to go ahead and try to read it with 'getline';
+this is what 'pathto()' does.(2) If the file can be read, it is closed
+and the file name is returned:
+
+ expand_prog='
+
+ function pathto(file, i, t, junk)
+ {
+ if (index(file, "/") != 0)
+ return file
+
+ if (file == "-")
+ return file
+
+ for (i = 1; i <= ndirs; i++) {
+ t = (pathlist[i] "/" file)
+ if ((getline junk < t) > 0) {
+ # found it
+ close(t)
+ return t
+ }
+ }
+ return ""
+ }
+
+ The main program is contained inside one 'BEGIN' rule. The first
+thing it does is set up the 'pathlist' array that 'pathto()' uses.
+After splitting the path on ':', null elements are replaced with '"."',
+which represents the current directory:
+
+ BEGIN {
+ path = ENVIRON["AWKPATH"]
+ ndirs = split(path, pathlist, ":")
+ for (i = 1; i <= ndirs; i++) {
+ if (pathlist[i] == "")
+ pathlist[i] = "."
+ }
+
+ The stack is initialized with 'ARGV[1]', which will be
+'"/dev/stdin"'. The main loop comes next. Input lines are read in
+succession. Lines that do not start with '@include' are printed
+verbatim. If the line does start with '@include', the file name is in
+'$2'. 'pathto()' is called to generate the full path. If it cannot,
+then the program prints an error message and continues.
+
+ The next thing to check is if the file is included already. The
+'processed' array is indexed by the full file name of each included file
+and it tracks this information for us. If the file is seen again, a
+warning message is printed. Otherwise, the new file name is pushed onto
+the stack and processing continues.
+
+ Finally, when 'getline' encounters the end of the input file, the
+file is closed and the stack is popped. When 'stackptr' is less than
+zero, the program is done:
+
+ stackptr = 0
+ input[stackptr] = ARGV[1] # ARGV[1] is first file
+
+ for (; stackptr >= 0; stackptr--) {
+ while ((getline < input[stackptr]) > 0) {
+ if (tolower($1) != "@include") {
+ print
+ continue
+ }
+ fpath = pathto($2)
+ if (fpath == "") {
+ printf("igawk: %s:%d: cannot find %s\n",
+ input[stackptr], FNR, $2) > "/dev/stderr"
+ continue
+ }
+ if (! (fpath in processed)) {
+ processed[fpath] = input[stackptr]
+ input[++stackptr] = fpath # push onto stack
+ } else
+ print $2, "included in", input[stackptr],
+ "already included in",
+ processed[fpath] > "/dev/stderr"
+ }
+ close(input[stackptr])
+ }
+ }' # close quote ends `expand_prog' variable
+
+ processed_program=$(gawk -- "$expand_prog" /dev/stdin << EOF
+ $program
+ EOF
+ )
+
+ The shell construct 'COMMAND << MARKER' is called a "here document".
+Everything in the shell script up to the MARKER is fed to COMMAND as
+input. The shell processes the contents of the here document for
+variable and command substitution (and possibly other things as well,
+depending upon the shell).
+
+ The shell construct '$(...)' is called "command substitution". The
+output of the command inside the parentheses is substituted into the
+command line. Because the result is used in a variable assignment, it
+is saved as a single string, even if the results contain whitespace.
+
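+ As a small, self-contained illustration of both constructs (this is
+not part of 'igawk.sh'):
+
+ $ from=world
+ $ greeting=$(cat << EOF
+ > Hello, $from
+ > EOF
+ > )
+ $ echo "$greeting"
+ -| Hello, world
+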
+ The expanded program is saved in the variable 'processed_program'.
+It's done in these steps:
+
+ 1. Run 'gawk' with the '@include'-processing program (the value of the
+ 'expand_prog' shell variable) reading standard input.
+
+ 2. Standard input is the contents of the user's program, from the
+ shell variable 'program'. Feed its contents to 'gawk' via a here
+ document.
+
+ 3. Save the results of this processing in the shell variable
+ 'processed_program' by using command substitution.
+
+ The last step is to call 'gawk' with the expanded program, along with
+the original options and command-line arguments that the user supplied:
+
+ eval gawk $opts -- '"$processed_program"' '"$@"'
+
+ The 'eval' command is a shell construct that reruns the shell's
+parsing process. This keeps things properly quoted.
+
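+ For example, here is a tiny demonstration of the difference 'eval'
+makes (the variable 'args' is hypothetical):
+
+ $ args="'hello world'"
+ $ echo $args
+ -| 'hello world'
+ $ eval echo $args
+ -| hello world
+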
+ This version of 'igawk' represents the fifth version of this program.
+There are four key simplifications that make the program work better:
+
+ * Using '@include' even for the files named with '-f' makes building
+ the initial collected 'awk' program much simpler; all the
+ '@include' processing can be done once.
+
+ * Not trying to save the line read with 'getline' in the 'pathto()'
+ function when testing for the file's accessibility for use with the
+ main program simplifies things considerably.
+
+ * Using a 'getline' loop in the 'BEGIN' rule does it all in one
+ place. It is not necessary to call out to a separate loop for
+ processing nested '@include' statements.
+
+ * Instead of saving the expanded program in a temporary file, putting
+ it in a shell variable avoids some potential security problems.
+ This has the disadvantage that the script relies upon more features
+ of the 'sh' language, making it harder to follow for those who
+ aren't familiar with 'sh'.
+
+ Also, this program illustrates that it is often worthwhile to combine
+'sh' and 'awk' programming together. You can usually accomplish quite a
+lot, without having to resort to low-level programming in C or C++, and
+it is frequently easier to do certain kinds of string and argument
+manipulation using the shell than it is in 'awk'.
+
+ Finally, 'igawk' shows that it is not always necessary to add new
+features to a program; they can often be layered on top.(3)
+
+ ---------- Footnotes ----------
+
+ (1) Fully explaining the 'sh' language is beyond the scope of this
+book. We provide some minimal explanations, but see a good shell
+programming book if you wish to understand things in more depth.
+
+ (2) On some very old versions of 'awk', the test 'getline junk < t'
+can loop forever if the file exists but is empty.
+
+ (3) 'gawk' does '@include' processing itself in order to support the
+use of 'awk' programs as Web CGI scripts.
+
+
+File: gawk.info, Node: Anagram Program, Next: Signature Program, Prev: Igawk Program, Up: Miscellaneous Programs
+
+11.3.10 Finding Anagrams from a Dictionary
+------------------------------------------
+
+An interesting programming challenge is to search for "anagrams" in a
+word list (such as '/usr/share/dict/words' on many GNU/Linux systems).
+One word is an anagram of another if both words contain the same letters
+(e.g., "babbling" and "blabbing").
+
+ Column 2, Problem C, of Jon Bentley's 'Programming Pearls', Second
+Edition, presents an elegant algorithm. The idea is to give words that
+are anagrams a common signature, sort all the words together by their
+signatures, and then print them. Dr. Bentley observes that taking the
+letters in each word and sorting them produces those common signatures.
+
+ The following program uses arrays of arrays to bring together words
+with the same signature and array sorting to print the words in sorted
+order:
+
+ # anagram.awk --- An implementation of the anagram-finding algorithm
+ # from Jon Bentley's "Programming Pearls," 2nd edition.
+ # Addison Wesley, 2000, ISBN 0-201-65788-0.
+ # Column 2, Problem C, section 2.8, pp 18-20.
+
+ /'s$/ { next } # Skip possessives
+
+ The program starts with a header, and then a rule to skip possessives
+in the dictionary file. The next rule builds up the data structure.
+The first dimension of the array is indexed by the signature; the second
+dimension is the word itself:
+
+ {
+ key = word2key($1) # Build signature
+ data[key][$1] = $1 # Store word with signature
+ }
+
+ The 'word2key()' function creates the signature. It splits the word
+apart into individual letters, sorts the letters, and then joins them
+back together:
+
+ # word2key --- split word apart into letters, sort, and join back together
+
+ function word2key(word, a, i, n, result)
+ {
+ n = split(word, a, "")
+ asort(a)
+
+ for (i = 1; i <= n; i++)
+ result = result a[i]
+
+ return result
+ }
+
+ Finally, the 'END' rule traverses the array and prints out the
+anagram lists. It sends the output to the system 'sort' command because
+otherwise the anagrams would appear in arbitrary order:
+
+ END {
+ sort = "sort"
+ for (key in data) {
+ # Sort words with same key
+ nwords = asorti(data[key], words)
+ if (nwords == 1)
+ continue
+
+ # And print. Minor glitch: trailing space at end of each line
+ for (j = 1; j <= nwords; j++)
+ printf("%s ", words[j]) | sort
+ print "" | sort
+ }
+ close(sort)
+ }
+
+ Here is some partial output when the program is run:
+
+ $ gawk -f anagram.awk /usr/share/dict/words | grep '^b'
+ ...
+ babbled blabbed
+ babbler blabber brabble
+ babblers blabbers brabbles
+ babbling blabbing
+ babbly blabby
+ babel bable
+ babels beslab
+ babery yabber
+ ...
+
+
+File: gawk.info, Node: Signature Program, Prev: Anagram Program, Up: Miscellaneous Programs
+
+11.3.11 And Now for Something Completely Different
+--------------------------------------------------
+
+The following program was written by Davide Brini and is published on
+his website (http://backreference.org/2011/02/03/obfuscated-awk/). It
+serves as his signature in the Usenet group 'comp.lang.awk'. He
+supplies the following copyright terms:
+
+ Copyright (C) 2008 Davide Brini
+
+ Copying and distribution of the code published in this page, with
+ or without modification, are permitted in any medium without
+ royalty provided the copyright notice and this notice are
+ preserved.
+
+ Here is the program:
+
+ awk 'BEGIN{O="~"~"~";o="=="=="==";o+=+o;x=O""O;while(X++<=x+o+o)c=c"%c";
+ printf c,(x-O)*(x-O),x*(x-o)-o,x*(x-O)+x-O-o,+x*(x-O)-x+o,X*(o*o+O)+x-O,
+ X*(X-x)-o*o,(x+X)*o*o+o,x*(X-x)-O-O,x-O+(O+o+X+x)*(o+O),X*X-X*(x-O)-x+O,
+ O+X*(o*(o+O)+O),+x+O+X*o,x*(x-o),(o+X+x)*o*o-(x-O-O),O+(X-x)*(X+O),x-O}'
+
+ We leave it to you to determine what the program does. (If you are
+truly desperate to understand it, see Chris Johansen's explanation,
+which is embedded in the Texinfo source file for this Info file.)
+
+
+File: gawk.info, Node: Programs Summary, Next: Programs Exercises, Prev: Miscellaneous Programs, Up: Sample Programs
+
+11.4 Summary
+============
+
+ * The programs provided in this major node continue on the theme that
+ reading programs is an excellent way to learn Good Programming.
+
+ * Using '#!' to make 'awk' programs directly runnable makes them
+ easier to use. Otherwise, invoke the program using 'awk -f ...'.
+
+ * Reimplementing standard POSIX programs in 'awk' is a pleasant
+ exercise; 'awk''s expressive power lets you write such programs in
+ relatively few lines of code, yet they are functionally complete
+ and usable.
+
+ * One of standard 'awk''s weaknesses is working with individual
+ characters. The ability to use 'split()' with the empty string as
+ the separator can considerably simplify such tasks.
+
+ * The examples here demonstrate the usefulness of the library
+ functions from *note Library Functions:: for a number of real (if
+ small) programs.
+
+ * Besides reinventing POSIX wheels, other programs solved a selection
+ of interesting problems, such as finding duplicate words in text,
+ printing mailing labels, and finding anagrams.
+
+
+File: gawk.info, Node: Programs Exercises, Prev: Programs Summary, Up: Sample Programs
+
+11.5 Exercises
+==============
+
+ 1. Rewrite 'cut.awk' (*note Cut Program::) using 'split()' with '""'
+ as the separator.
+
+ 2. In *note Egrep Program::, we mentioned that 'egrep -i' could be
+ simulated in versions of 'awk' without 'IGNORECASE' by using
+ 'tolower()' on the line and the pattern. In a footnote there, we
+ also mentioned that this solution has a bug: the translated line is
+ output, and not the original one. Fix this problem.
+
+ 3. The POSIX version of 'id' takes options that control which
+ information is printed. Modify the 'awk' version (*note Id
+ Program::) to accept the same arguments and perform in the same
+ way.
+
+ 4. The 'split.awk' program (*note Split Program::) assumes that
+ letters are contiguous in the character set, which isn't true for
+ EBCDIC systems. Fix this problem. (Hint: Consider a different way
+ to work through the alphabet, without relying on 'ord()' and
+ 'chr()'.)
+
+ 5. In 'uniq.awk' (*note Uniq Program::), the logic for choosing which
+ lines to print represents a "state machine", which is "a device
+ that can be in one of a set number of stable conditions depending
+ on its previous condition and on the present values of its
+ inputs."(1) Brian Kernighan suggests that "an alternative approach
+ to state machines is to just read the input into an array, then use
+ indexing. It's almost always easier code, and for most inputs
+ where you would use this, just as fast." Rewrite the logic to
+ follow this suggestion.
+
+ 6. Why can't the 'wc.awk' program (*note Wc Program::) just use the
+ value of 'FNR' in 'endfile()'? Hint: Examine the code in *note
+ Filetrans Function::.
+
+ 7. Manipulation of individual characters in the 'translate' program
+ (*note Translate Program::) is painful using standard 'awk'
+ functions. Given that 'gawk' can split strings into individual
+ characters using '""' as the separator, how might you use this
+ feature to simplify the program?
+
+ 8. The 'extract.awk' program (*note Extract Program::) was written
+ before 'gawk' had the 'gensub()' function. Use it to simplify the
+ code.
+
+ 9. Compare the performance of the 'awksed.awk' program (*note Simple
+ Sed::) with the more straightforward:
+
+ BEGIN {
+ pat = ARGV[1]
+ repl = ARGV[2]
+ ARGV[1] = ARGV[2] = ""
+ }
+
+ { gsub(pat, repl); print }
+
+ 10. What are the advantages and disadvantages of 'awksed.awk' versus
+ the real 'sed' utility?
+
+ 11. In *note Igawk Program::, we mentioned that not trying to save the
+ line read with 'getline' in the 'pathto()' function when testing
+ for the file's accessibility for use with the main program
+ simplifies things considerably. What problem does this engender
+ though?
+
+ 12. As an additional example of the idea that it is not always
+ necessary to add new features to a program, consider the idea of
+ having two files in a directory in the search path:
+
+ 'default.awk'
+ This file contains a set of default library functions, such as
+ 'getopt()' and 'assert()'.
+
+ 'site.awk'
+ This file contains library functions that are specific to a
+ site or installation; i.e., locally developed functions.
+ Having a separate file allows 'default.awk' to change with new
+ 'gawk' releases, without requiring the system administrator to
+ update it each time by adding the local functions.
+
+ One user suggested that 'gawk' be modified to automatically read
+ these files upon startup. Instead, it would be very simple to
+ modify 'igawk' to do this. Since 'igawk' can process nested
+ '@include' directives, 'default.awk' could simply contain
+ '@include' statements for the desired library functions. Make this
+ change.
+
+ 13. Modify 'anagram.awk' (*note Anagram Program::), to avoid the use
+ of the external 'sort' utility.
+
+ ---------- Footnotes ----------
+
+ (1) This is the definition returned from entering 'define: state
+machine' into Google.
+
+
+File: gawk.info, Node: Advanced Features, Next: Internationalization, Prev: Sample Programs, Up: Top
+
+12 Advanced Features of 'gawk'
+******************************
+
+ Write documentation as if whoever reads it is a violent psychopath
+ who knows where you live.
+ -- _Steve English, as quoted by Peter Langston_
+
+ This major node discusses advanced features in 'gawk'. It's a bit of
+a "grab bag" of items that are otherwise unrelated to each other.
+First, we look at a command-line option that allows 'gawk' to recognize
+nondecimal numbers in input data, not just in 'awk' programs. Then,
+'gawk''s special features for sorting arrays are presented. Next,
+two-way I/O, discussed briefly in earlier parts of this Info file, is
+described in full detail, along with the basics of TCP/IP networking.
+Finally, we see how 'gawk' can "profile" an 'awk' program, making it
+possible to tune it for performance.
+
+ Additional advanced features are discussed in separate major nodes of
+their own:
+
+ * *note Internationalization::, discusses how to internationalize
+ your 'awk' programs, so that they can speak multiple national
+ languages.
+
+ * *note Debugger::, describes 'gawk''s built-in command-line debugger
+ for debugging 'awk' programs.
+
+ * *note Arbitrary Precision Arithmetic::, describes how you can use
+ 'gawk' to perform arbitrary-precision arithmetic.
+
+ * *note Dynamic Extensions::, discusses the ability to dynamically
+ add new built-in functions to 'gawk'.
+
+* Menu:
+
+* Nondecimal Data:: Allowing nondecimal input data.
+* Array Sorting:: Facilities for controlling array traversal and
+ sorting arrays.
+* Two-way I/O:: Two-way communications with another process.
+* TCP/IP Networking:: Using 'gawk' for network programming.
+* Profiling:: Profiling your 'awk' programs.
+* Advanced Features Summary:: Summary of advanced features.
+
+
+File: gawk.info, Node: Nondecimal Data, Next: Array Sorting, Up: Advanced Features
+
+12.1 Allowing Nondecimal Input Data
+===================================
+
+If you run 'gawk' with the '--non-decimal-data' option, you can have
+nondecimal values in your input data:
+
+ $ echo 0123 123 0x123 |
+ > gawk --non-decimal-data '{ printf "%d, %d, %d\n", $1, $2, $3 }'
+ -| 83, 123, 291
+
+ For this feature to work, write your program so that 'gawk' treats
+your data as numeric:
+
+ $ echo 0123 123 0x123 | gawk '{ print $1, $2, $3 }'
+ -| 0123 123 0x123
+
+The 'print' statement treats its expressions as strings. Although the
+fields can act as numbers when necessary, they are still strings, so
+'print' does not try to treat them numerically. You need to add zero to
+a field to force it to be treated as a number. For example:
+
+ $ echo 0123 123 0x123 | gawk --non-decimal-data '
+ > { print $1, $2, $3
+ > print $1 + 0, $2 + 0, $3 + 0 }'
+ -| 0123 123 0x123
+ -| 83 123 291
+
+ Because it is common to have decimal data with leading zeros, and
+because using this facility could lead to surprising results, the
+default is to leave it disabled. If you want it, you must explicitly
+request it.
+
+ CAUTION: _Use of this option is not recommended._ It can break old
+ programs very badly. Instead, use the 'strtonum()' function to
+ convert your data (*note String Functions::). This makes your
+ programs easier to write and easier to read, and leads to less
+ surprising results.
+
+ This option may disappear in a future version of 'gawk'.
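+
+   For instance, here is a sketch of the earlier example redone with
+'strtonum()' instead of the command-line option:
+
+     $ echo 0123 123 0x123 |
+     > gawk '{ print strtonum($1), strtonum($2), strtonum($3) }'
+     -| 83 123 291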
+
+
+File: gawk.info, Node: Array Sorting, Next: Two-way I/O, Prev: Nondecimal Data, Up: Advanced Features
+
+12.2 Controlling Array Traversal and Array Sorting
+==================================================
+
+'gawk' lets you control the order in which a 'for (INDX in ARRAY)' loop
+traverses an array.
+
+ In addition, two built-in functions, 'asort()' and 'asorti()', let
+you sort arrays based on the array values and indices, respectively.
+These two functions also provide control over the sorting criteria used
+to order the elements during sorting.
+
+* Menu:
+
+* Controlling Array Traversal:: How to use PROCINFO["sorted_in"].
+* Array Sorting Functions:: How to use 'asort()' and 'asorti()'.
+
+
+File: gawk.info, Node: Controlling Array Traversal, Next: Array Sorting Functions, Up: Array Sorting
+
+12.2.1 Controlling Array Traversal
+----------------------------------
+
+By default, the order in which a 'for (INDX in ARRAY)' loop scans an
+array is not defined; it is generally based upon the internal
+implementation of arrays inside 'awk'.
+
+ Often, though, it is desirable to be able to loop over the elements
+in a particular order that you, the programmer, choose. 'gawk' lets you
+do this.
+
+ *note Controlling Scanning:: describes how you can assign special,
+predefined values to 'PROCINFO["sorted_in"]' in order to control the
+order in which 'gawk' traverses an array during a 'for' loop.
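+
+   For instance, assuming the predefined '"@ind_str_asc"' value
+described there, a loop such as the following visits the elements of an
+array 'data' in ascending order of its indices, compared as strings:
+
+     PROCINFO["sorted_in"] = "@ind_str_asc"
+     for (idx in data)
+         print idx, data[idx]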
+
+ In addition, the value of 'PROCINFO["sorted_in"]' can be a function
+name.(1) This lets you traverse an array based on any custom criterion.
+The array elements are ordered according to the return value of this
+function. The comparison function should be defined with at least four
+arguments:
+
+ function comp_func(i1, v1, i2, v2)
+ {
+ COMPARE ELEMENTS 1 AND 2 IN SOME FASHION
+ RETURN < 0; 0; OR > 0
+ }
+
+ Here, 'i1' and 'i2' are the indices, and 'v1' and 'v2' are the
+corresponding values of the two elements being compared. Either 'v1' or
+'v2', or both, can be arrays if the array being traversed contains
+subarrays as values. (*Note Arrays of Arrays:: for more information
+about subarrays.) The three possible return values are interpreted as
+follows:
+
+'comp_func(i1, v1, i2, v2) < 0'
+ Index 'i1' comes before index 'i2' during loop traversal.
+
+'comp_func(i1, v1, i2, v2) == 0'
+ Indices 'i1' and 'i2' come together, but the relative order with
+ respect to each other is undefined.
+
+'comp_func(i1, v1, i2, v2) > 0'
+ Index 'i1' comes after index 'i2' during loop traversal.
+
+ Our first comparison function can be used to scan an array in
+numerical order of the indices:
+
+ function cmp_num_idx(i1, v1, i2, v2)
+ {
+ # numerical index comparison, ascending order
+ return (i1 - i2)
+ }
+
+ Our second function traverses an array based on the string order of
+the element values rather than by indices:
+
+ function cmp_str_val(i1, v1, i2, v2)
+ {
+ # string value comparison, ascending order
+ v1 = v1 ""
+ v2 = v2 ""
+ if (v1 < v2)
+ return -1
+ return (v1 != v2)
+ }
+
+ The third comparison function makes all numbers, and numeric strings
+without any leading or trailing spaces, come out first during loop
+traversal:
+
+ function cmp_num_str_val(i1, v1, i2, v2, n1, n2)
+ {
+ # numbers before string value comparison, ascending order
+ n1 = v1 + 0
+ n2 = v2 + 0
+ if (n1 == v1)
+ return (n2 == v2) ? (n1 - n2) : -1
+ else if (n2 == v2)
+ return 1
+ return (v1 < v2) ? -1 : (v1 != v2)
+ }
+
+ Here is a main program to demonstrate how 'gawk' behaves using each
+of the previous functions:
+
+ BEGIN {
+ data["one"] = 10
+ data["two"] = 20
+ data[10] = "one"
+ data[100] = 100
+ data[20] = "two"
+
+ f[1] = "cmp_num_idx"
+ f[2] = "cmp_str_val"
+ f[3] = "cmp_num_str_val"
+ for (i = 1; i <= 3; i++) {
+ printf("Sort function: %s\n", f[i])
+ PROCINFO["sorted_in"] = f[i]
+ for (j in data)
+ printf("\tdata[%s] = %s\n", j, data[j])
+ print ""
+ }
+ }
+
+ Here are the results when the program is run:
+
+ $ gawk -f compdemo.awk
+ -| Sort function: cmp_num_idx Sort by numeric index
+ -| data[two] = 20
+ -| data[one] = 10 Both strings are numerically zero
+ -| data[10] = one
+ -| data[20] = two
+ -| data[100] = 100
+ -|
+ -| Sort function: cmp_str_val Sort by element values as strings
+ -| data[one] = 10
+ -| data[100] = 100 String 100 is less than string 20
+ -| data[two] = 20
+ -| data[10] = one
+ -| data[20] = two
+ -|
+ -| Sort function: cmp_num_str_val Sort all numeric values before all strings
+ -| data[one] = 10
+ -| data[two] = 20
+ -| data[100] = 100
+ -| data[10] = one
+ -| data[20] = two
+
+ Consider sorting the entries of a GNU/Linux system password file
+according to login name. The following program sorts records by a
+specific field position and can be used for this purpose:
+
+ # passwd-sort.awk --- simple program to sort by field position
+ # field position is specified by the global variable POS
+
+ function cmp_field(i1, v1, i2, v2)
+ {
+ # comparison by value, as string, and ascending order
+ return v1[POS] < v2[POS] ? -1 : (v1[POS] != v2[POS])
+ }
+
+ {
+ for (i = 1; i <= NF; i++)
+ a[NR][i] = $i
+ }
+
+ END {
+ PROCINFO["sorted_in"] = "cmp_field"
+ if (POS < 1 || POS > NF)
+ POS = 1
+ for (i in a) {
+ for (j = 1; j <= NF; j++)
+ printf("%s%c", a[i][j], j < NF ? ":" : "")
+ print ""
+ }
+ }
+
+ The first field in each entry of the password file is the user's
+login name, and the fields are separated by colons. Each record defines
+a subarray, with each field as an element in the subarray. Running the
+program produces the following output:
+
+     $ gawk -v POS=1 -F: -f passwd-sort.awk /etc/passwd
+ -| adm:x:3:4:adm:/var/adm:/sbin/nologin
+ -| apache:x:48:48:Apache:/var/www:/sbin/nologin
+ -| avahi:x:70:70:Avahi daemon:/:/sbin/nologin
+ ...
+
+   Normally, the comparison function should return the same value when
+given a specific pair of array elements as its arguments. If
+inconsistent results are returned, then the order is undefined. This
+behavior can be exploited to introduce random order into otherwise
+seemingly ordered data:
+
+ function cmp_randomize(i1, v1, i2, v2)
+ {
+ # random order (caution: this may never terminate!)
+ return (2 - 4 * rand())
+ }
+
+ As already mentioned, the order of the indices is arbitrary if two
+elements compare equal. This is usually not a problem, but letting the
+tied elements come out in arbitrary order can be an issue, especially
+when comparing item values. The partial ordering of the equal elements
+may change the next time the array is traversed, if other elements are
+added to or removed from the array. One way to resolve ties when
+comparing elements with otherwise equal values is to include the indices
+in the comparison rules. Note that doing this may make the loop
+traversal less efficient, so consider it only if necessary. The
+following comparison functions force a deterministic order, and are
+based on the fact that the (string) indices of two elements are never
+equal:
+
+ function cmp_numeric(i1, v1, i2, v2)
+ {
+ # numerical value (and index) comparison, descending order
+ return (v1 != v2) ? (v2 - v1) : (i2 - i1)
+ }
+
+ function cmp_string(i1, v1, i2, v2)
+ {
+ # string value (and index) comparison, descending order
+ v1 = v1 i1
+ v2 = v2 i2
+ return (v1 > v2) ? -1 : (v1 != v2)
+ }
+
+ A custom comparison function can often simplify ordered loop
+traversal, and the sky is really the limit when it comes to designing
+such a function.
+
+ When string comparisons are made during a sort, either for element
+values where one or both aren't numbers, or for element indices handled
+as strings, the value of 'IGNORECASE' (*note Built-in Variables::)
+controls whether the comparisons treat corresponding upper- and
+lowercase letters as equivalent or distinct.
+
+ Another point to keep in mind is that in the case of subarrays, the
+element values can themselves be arrays; a production comparison
+function should use the 'isarray()' function (*note Type Functions::) to
+check for this, and choose a defined sorting order for subarrays.
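+
+   For example, here is a sketch of one possible policy: scalar values
+come out before subarrays, and subarrays are ordered by index:
+
+     function cmp_scalars_first(i1, v1, i2, v2,    a1, a2)
+     {
+         a1 = isarray(v1)
+         a2 = isarray(v2)
+         if (a1 != a2)
+             return a1 - a2      # scalars (0) before subarrays (1)
+         if (a1)                 # both are subarrays; order by index
+             return (i1 < i2) ? -1 : (i1 != i2)
+         return (v1 < v2) ? -1 : (v1 != v2)
+     }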
+
+ All sorting based on 'PROCINFO["sorted_in"]' is disabled in POSIX
+mode, because the 'PROCINFO' array is not special in that case.
+
+ As a side note, sorting the array indices before traversing the array
+has been reported to add a 15% to 20% overhead to the execution time of
+'awk' programs. For this reason, sorted array traversal is not the
+default.
+
+ ---------- Footnotes ----------
+
+ (1) This is why the predefined sorting orders start with an '@'
+character, which cannot be part of an identifier.
+
+
+File: gawk.info, Node: Array Sorting Functions, Prev: Controlling Array Traversal, Up: Array Sorting
+
+12.2.2 Sorting Array Values and Indices with 'gawk'
+---------------------------------------------------
+
+In most 'awk' implementations, sorting an array requires writing a
+'sort()' function. This can be educational for exploring different
+sorting algorithms, but usually that's not the point of the program.
+'gawk' provides the built-in 'asort()' and 'asorti()' functions (*note
+String Functions::) for sorting arrays. For example:
+
+ POPULATE THE ARRAY data
+ n = asort(data)
+ for (i = 1; i <= n; i++)
+ DO SOMETHING WITH data[i]
+
+ After the call to 'asort()', the array 'data' is indexed from 1 to
+some number N, the total number of elements in 'data'. (This count is
+'asort()''s return value.) 'data[1]' <= 'data[2]' <= 'data[3]', and so
+on. The default comparison is based on the type of the elements (*note
+Typing and Comparison::). All numeric values come before all string
+values, which in turn come before all subarrays.
+
+ An important side effect of calling 'asort()' is that _the array's
+original indices are irrevocably lost_. As this isn't always desirable,
+'asort()' accepts a second argument:
+
+ POPULATE THE ARRAY source
+ n = asort(source, dest)
+ for (i = 1; i <= n; i++)
+ DO SOMETHING WITH dest[i]
+
+ In this case, 'gawk' copies the 'source' array into the 'dest' array
+and then sorts 'dest', destroying its indices. However, the 'source'
+array is not affected.
+
+ Often, what's needed is to sort on the values of the _indices_
+instead of the values of the elements. To do that, use the 'asorti()'
+function. The interface and behavior are identical to that of
+'asort()', except that the index values are used for sorting and become
+the values of the result array:
+
+ { source[$0] = some_func($0) }
+
+ END {
+ n = asorti(source, dest)
+ for (i = 1; i <= n; i++) {
+ Work with sorted indices directly:
+ DO SOMETHING WITH dest[i]
+ ...
+ Access original array via sorted indices:
+ DO SOMETHING WITH source[dest[i]]
+ }
+ }
+
+ So far, so good. Now it starts to get interesting. Both 'asort()'
+and 'asorti()' accept a third string argument to control comparison of
+array elements. When we introduced 'asort()' and 'asorti()' in *note
+String Functions::, we ignored this third argument; however, now is the
+time to describe how this argument affects these two functions.
+
+ Basically, the third argument specifies how the array is to be
+sorted. There are two possibilities. As with 'PROCINFO["sorted_in"]',
+this argument may be one of the predefined names that 'gawk' provides
+(*note Controlling Scanning::), or it may be the name of a user-defined
+function (*note Controlling Array Traversal::).
+
+ In the latter case, _the function can compare elements in any way it
+chooses_, taking into account just the indices, just the values, or
+both. This is extremely powerful.
+
+ Once the array is sorted, 'asort()' takes the _values_ in their final
+order and uses them to fill in the result array, whereas 'asorti()'
+takes the _indices_ in their final order and uses them to fill in the
+result array.
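+
+   For example, assuming the predefined '"@val_num_desc"' ordering
+described in *note Controlling Scanning::, a sketch like the following
+fills 'result' with the values of 'data', largest numeric value first:
+
+     n = asort(data, result, "@val_num_desc")
+     for (i = 1; i <= n; i++)
+         print result[i]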
+
+ NOTE: Copying array indices and elements isn't expensive in terms
+ of memory. Internally, 'gawk' maintains "reference counts" to
+ data. For example, when 'asort()' copies the first array to the
+ second one, there is only one copy of the original array elements'
+ data, even though both arrays use the values.
+
+ Because 'IGNORECASE' affects string comparisons, the value of
+'IGNORECASE' also affects sorting for both 'asort()' and 'asorti()'.
+Note also that the locale's sorting order does _not_ come into play;
+comparisons are based on character values only.(1)
+
+ The following example demonstrates the use of a comparison function
+with 'asort()'. The comparison function, 'case_fold_compare()', maps
+both values to lowercase in order to compare them ignoring case.
+
+ # case_fold_compare --- compare as strings, ignoring case
+
+ function case_fold_compare(i1, v1, i2, v2, l, r)
+ {
+ l = tolower(v1)
+ r = tolower(v2)
+
+ if (l < r)
+ return -1
+ else if (l == r)
+ return 0
+ else
+ return 1
+ }
+
+ And here is the test program for it:
+
+ # Test program
+
+ BEGIN {
+ Letters = "abcdefghijklmnopqrstuvwxyz" \
+ "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+ split(Letters, data, "")
+
+ asort(data, result, "case_fold_compare")
+
+ j = length(result)
+ for (i = 1; i <= j; i++) {
+ printf("%s", result[i])
+ if (i % (j/2) == 0)
+ printf("\n")
+ else
+ printf(" ")
+ }
+ }
+
+ When run, we get the following:
+
+ $ gawk -f case_fold_compare.awk
+ -| A a B b c C D d e E F f g G H h i I J j k K l L M m
+ -| n N O o p P Q q r R S s t T u U V v w W X x y Y z Z
+
+ ---------- Footnotes ----------
+
+ (1) This is true because locale-based comparison occurs only when in
+POSIX-compatibility mode, and because 'asort()' and 'asorti()' are
+'gawk' extensions, they are not available in that case.
+
+
+File: gawk.info, Node: Two-way I/O, Next: TCP/IP Networking, Prev: Array Sorting, Up: Advanced Features
+
+12.3 Two-Way Communications with Another Process
+================================================
+
+It is often useful to be able to send data to a separate program for
+processing and then read the result. This can always be done with
+temporary files:
+
+ # Write the data for processing
+ tempfile = ("mydata." PROCINFO["pid"])
+ while (NOT DONE WITH DATA)
+ print DATA | ("subprogram > " tempfile)
+ close("subprogram > " tempfile)
+
+ # Read the results, remove tempfile when done
+ while ((getline newdata < tempfile) > 0)
+ PROCESS newdata APPROPRIATELY
+ close(tempfile)
+ system("rm " tempfile)
+
+This works, but not elegantly. Among other things, it requires that the
+program be run in a directory that cannot be shared among users; for
+example, '/tmp' will not do, as another user might happen to be using a
+temporary file with the same name.(1)
+
+ However, with 'gawk', it is possible to open a _two-way_ pipe to
+another process. The second process is termed a "coprocess", as it runs
+in parallel with 'gawk'. The two-way connection is created using the
+'|&' operator (borrowed from the Korn shell, 'ksh'):(2)
+
+ do {
+ print DATA |& "subprogram"
+ "subprogram" |& getline results
+ } while (DATA LEFT TO PROCESS)
+ close("subprogram")
+
+ The first time an I/O operation is executed using the '|&' operator,
+'gawk' creates a two-way pipeline to a child process that runs the other
+program. Output created with 'print' or 'printf' is written to the
+program's standard input, and output from the program's standard output
+can be read by the 'gawk' program using 'getline'. As is the case with
+processes started by '|', the subprogram can be any program, or pipeline
+of programs, that can be started by the shell.
+
+ There are some cautionary items to be aware of:
+
+ * As the code inside 'gawk' currently stands, the coprocess's
+ standard error goes to the same place that the parent 'gawk''s
+ standard error goes. It is not possible to read the child's
+ standard error separately.
+
+ * I/O buffering may be a problem. 'gawk' automatically flushes all
+ output down the pipe to the coprocess. However, if the coprocess
+ does not flush its output, 'gawk' may hang when doing a 'getline'
+ in order to read the coprocess's results. This could lead to a
+ situation known as "deadlock", where each process is waiting for
+ the other one to do something.
+
+ It is possible to close just one end of the two-way pipe to a
+coprocess, by supplying a second argument to the 'close()' function of
+either '"to"' or '"from"' (*note Close Files And Pipes::). These
+strings tell 'gawk' to close the end of the pipe that sends data to the
+coprocess or the end that reads from it, respectively.
+
+ This is particularly necessary in order to use the system 'sort'
+utility as part of a coprocess; 'sort' must read _all_ of its input data
+before it can produce any output. The 'sort' program does not receive
+an end-of-file indication until 'gawk' closes the write end of the pipe.
+
+ When you have finished writing data to the 'sort' utility, you can
+close the '"to"' end of the pipe, and then start reading sorted data via
+'getline'. For example:
+
+ BEGIN {
+ command = "LC_ALL=C sort"
+ n = split("abcdefghijklmnopqrstuvwxyz", a, "")
+
+ for (i = n; i > 0; i--)
+ print a[i] |& command
+ close(command, "to")
+
+ while ((command |& getline line) > 0)
+ print "got", line
+ close(command)
+ }
+
+ This program writes the letters of the alphabet in reverse order, one
+per line, down the two-way pipe to 'sort'. It then closes the write end
+of the pipe, so that 'sort' receives an end-of-file indication. This
+causes 'sort' to sort the data and write the sorted data back to the
+'gawk' program. Once all of the data has been read, 'gawk' terminates
+the coprocess and exits.
+
+ As a side note, the assignment 'LC_ALL=C' in the 'sort' command
+ensures traditional Unix (ASCII) sorting from 'sort'. This is not
+strictly necessary here, but it's good to know how to do this.
+
+ Be careful when closing the '"from"' end of a two-way pipe; in this
+case 'gawk' waits for the child process to exit, which may cause your
+program to hang. (Thus, this particular feature is of much less use in
+practice than being able to close the '"to"' end.)
+
+ CAUTION: Normally, it is a fatal error to write to the '"to"' end
+ of a two-way pipe which has been closed, and it is also a fatal
+ error to read from the '"from"' end of a two-way pipe that has been
+ closed.
+
+ You may set 'PROCINFO["COMMAND", "NONFATAL"]' to make such
+ operations become nonfatal, in which case you then need to check
+ 'ERRNO' after each 'print', 'printf', or 'getline'. *Note
+ Nonfatal::, for more information.
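+
+   A sketch of that pattern, assuming the coprocess command is saved in
+the variable 'command', might look like this:
+
+     command = "subprogram"
+     PROCINFO[command, "NONFATAL"] = 1   # failed I/O on this pipe is nonfatal
+     ERRNO = ""
+     print DATA |& command
+     if (ERRNO != "")
+         print "write to coprocess failed:", ERRNO > "/dev/stderr"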
+
+ You may also use pseudo-ttys (ptys) for two-way communication instead
+of pipes, if your system supports them. This is done on a per-command
+basis, by setting a special element in the 'PROCINFO' array (*note
+Auto-set::), like so:
+
+ command = "sort -nr" # command, save in convenience variable
+ PROCINFO[command, "pty"] = 1 # update PROCINFO
+ print ... |& command # start two-way pipe
+ ...
+
+If your system does not have ptys, or if all the system's ptys are in
+use, 'gawk' automatically falls back to using regular pipes.
+
+ Using ptys usually avoids the buffer deadlock issues described
+earlier, at some loss in performance. This is because the tty driver
+buffers and sends data line-by-line. On systems that have the 'stdbuf'
+program (part of the GNU Coreutils package
+(http://www.gnu.org/software/coreutils/coreutils.html)), you can use it
+instead of ptys.
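+
+   For example, a sketch along these lines (assuming 'stdbuf''s '-oL'
+option, which requests line-buffered output) avoids the deadlock with a
+coprocess that would otherwise block-buffer its output:
+
+     command = "stdbuf -oL tr a-z A-Z"   # line-buffer the coprocess's output
+     print "hello, world" |& command
+     command |& getline result           # each reply line is flushed promptly
+     print result
+     close(command)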
+
+ Note also that ptys are not fully transparent. Certain binary
+control codes, such as 'Ctrl-d' for end-of-file, are interpreted by the
+tty driver and not passed through.
+
+ CAUTION: Finally, coprocesses open up the possibility of "deadlock"
+ between 'gawk' and the program running in the coprocess. This can
+ occur if you send "too much" data to the coprocess before reading
+     any back; each process is blocked writing data, with no one available
+     to read what has already been written. There is no workaround for
+ deadlock; careful programming and knowledge of the behavior of the
+ coprocess are required.
+
+ ---------- Footnotes ----------
+
+ (1) Michael Brennan suggests the use of 'rand()' to generate unique
+file names. This is a valid point; nevertheless, temporary files remain
+more difficult to use than two-way pipes.
+
+ (2) This is very different from the same operator in the C shell and
+in Bash.
+
+
+File: gawk.info, Node: TCP/IP Networking, Next: Profiling, Prev: Two-way I/O, Up: Advanced Features
+
+12.4 Using 'gawk' for Network Programming
+=========================================
+
+ 'EMRED':
+ A host is a host from coast to coast,
+ and nobody talks to a host that's close,
+ unless the host that isn't close
+ is busy, hung, or dead.
+ -- _Mike O'Brien (aka Mr. Protocol)_
+
+ In addition to being able to open a two-way pipeline to a coprocess
+on the same system (*note Two-way I/O::), it is possible to make a
+two-way connection to another process on another system across an IP
+network connection.
+
+ You can think of this as just a _very long_ two-way pipeline to a
+coprocess. The way 'gawk' decides that you want to use TCP/IP
+networking is by recognizing special file names that begin with one of
+'/inet/', '/inet4/', or '/inet6/'.
+
+ The full syntax of the special file name is
+'/NET-TYPE/PROTOCOL/LOCAL-PORT/REMOTE-HOST/REMOTE-PORT'. The components
+are:
+
+NET-TYPE
+ Specifies the kind of Internet connection to make. Use '/inet4/'
+ to force IPv4, and '/inet6/' to force IPv6. Plain '/inet/' (which
+ used to be the only option) uses the system default, most likely
+ IPv4.
+
+PROTOCOL
+ The protocol to use over IP. This must be either 'tcp', or 'udp',
+ for a TCP or UDP IP connection, respectively. TCP should be used
+ for most applications.
+
+LOCAL-PORT
+ The local TCP or UDP port number to use. Use a port number of '0'
+ when you want the system to pick a port. This is what you should
+ do when writing a TCP or UDP client. You may also use a well-known
+ service name, such as 'smtp' or 'http', in which case 'gawk'
+ attempts to determine the predefined port number using the C
+ 'getaddrinfo()' function.
+
+REMOTE-HOST
+ The IP address or fully qualified domain name of the Internet host
+ to which you want to connect.
+
+REMOTE-PORT
+ The TCP or UDP port number to use on the given REMOTE-HOST. Again,
+ use '0' if you don't care, or else a well-known service name.
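+
+   Putting the components together, a file name such as the following
+(the host name here is purely illustrative) requests an IPv4 TCP
+connection to the HTTP port on 'www.example.com', with the local port
+chosen by the system:
+
+     Service = "/inet4/tcp/0/www.example.com/http"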
+
+ NOTE: Failure in opening a two-way socket will result in a nonfatal
+ error being returned to the calling code. The value of 'ERRNO'
+ indicates the error (*note Auto-set::).
+
+ Consider the following very simple example:
+
+ BEGIN {
+ Service = "/inet/tcp/0/localhost/daytime"
+ Service |& getline
+ print $0
+ close(Service)
+ }
+
+ This program reads the current date and time from the local system's
+TCP 'daytime' server. It then prints the results and closes the
+connection.
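+
+   Going the other direction is similar. As a sketch (the details are
+covered in the separate document mentioned below), supplying a nonzero
+LOCAL-PORT and '0' for both the remote host and remote port makes
+'gawk' act as a simple server, waiting for a connection on that port:
+
+     BEGIN {
+         Service = "/inet/tcp/8888/0/0"   # wait for a connection on port 8888
+         print "Hello from gawk" |& Service
+         close(Service)
+     }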
+
+ Because this topic is extensive, the use of 'gawk' for TCP/IP
+programming is documented separately. See *note (General Introduction,
+gawkinet, TCP/IP Internetworking with 'gawk')Top::, for a much more
+complete introduction and discussion, as well as extensive examples.
+
+ NOTE: 'gawk' can only open direct sockets. There is currently no
+ way to access services available over Secure Socket Layer (SSL);
+ this includes any web service whose URL starts with 'https://'.
+
+
+File: gawk.info, Node: Profiling, Next: Advanced Features Summary, Prev: TCP/IP Networking, Up: Advanced Features
+
+12.5 Profiling Your 'awk' Programs
+==================================
+
+You may produce execution traces of your 'awk' programs. This is done
+by passing the option '--profile' to 'gawk'. When 'gawk' has finished
+running, it creates a profile of your program in a file named
+'awkprof.out'. Because it is profiling, it also executes up to 45%
+slower than 'gawk' normally does.
+
+ As shown in the following example, the '--profile' option can be used
+to change the name of the file where 'gawk' will write the profile:
+
+ gawk --profile=myprog.prof -f myprog.awk data1 data2
+
+In the preceding example, 'gawk' places the profile in 'myprog.prof'
+instead of in 'awkprof.out'.
+
+ Here is a sample session showing a simple 'awk' program, its input
+data, and the results from running 'gawk' with the '--profile' option.
+First, the 'awk' program:
+
+ BEGIN { print "First BEGIN rule" }
+
+ END { print "First END rule" }
+
+ /foo/ {
+ print "matched /foo/, gosh"
+ for (i = 1; i <= 3; i++)
+ sing()
+ }
+
+ {
+ if (/foo/)
+ print "if is true"
+ else
+ print "else is true"
+ }
+
+ BEGIN { print "Second BEGIN rule" }
+
+ END { print "Second END rule" }
+
+ function sing( dummy)
+ {
+ print "I gotta be me!"
+ }
+
+ Following is the input data:
+
+ foo
+ bar
+ baz
+ foo
+ junk
+
+ Here is the 'awkprof.out' that results from running the 'gawk'
+profiler on this program and data (this example also illustrates that
+'awk' programmers sometimes get up very early in the morning to work):
+
+ # gawk profile, created Mon Sep 29 05:16:21 2014
+
+ # BEGIN rule(s)
+
+ BEGIN {
+ 1 print "First BEGIN rule"
+ }
+
+ BEGIN {
+ 1 print "Second BEGIN rule"
+ }
+
+ # Rule(s)
+
+ 5 /foo/ { # 2
+ 2 print "matched /foo/, gosh"
+ 6 for (i = 1; i <= 3; i++) {
+ 6 sing()
+ }
+ }
+
+ 5 {
+ 5 if (/foo/) { # 2
+ 2 print "if is true"
+ 3 } else {
+ 3 print "else is true"
+ }
+ }
+
+ # END rule(s)
+
+ END {
+ 1 print "First END rule"
+ }
+
+ END {
+ 1 print "Second END rule"
+ }
+
+
+ # Functions, listed alphabetically
+
+ 6 function sing(dummy)
+ {
+ 6 print "I gotta be me!"
+ }
+
+ This example illustrates many of the basic features of profiling
+output. They are as follows:
+
+ * The program is printed in the order 'BEGIN' rules, 'BEGINFILE'
+ rules, pattern-action rules, 'ENDFILE' rules, 'END' rules, and
+ functions, listed alphabetically. Multiple 'BEGIN' and 'END' rules
+ retain their separate identities, as do multiple 'BEGINFILE' and
+ 'ENDFILE' rules.
+
+ * Pattern-action rules have two counts. The first count, to the left
+ of the rule, shows how many times the rule's pattern was _tested_.
+ The second count, to the right of the rule's opening left brace in
+ a comment, shows how many times the rule's action was _executed_.
+ The difference between the two indicates how many times the rule's
+ pattern evaluated to false.
+
+ * Similarly, the count for an 'if'-'else' statement shows how many
+ times the condition was tested. To the right of the opening left
+ brace for the 'if''s body is a count showing how many times the
+ condition was true. The count for the 'else' indicates how many
+ times the test failed.
+
+ * The count for a loop header (such as 'for' or 'while') shows how
+ many times the loop test was executed. (Because of this, you can't
+ just look at the count on the first statement in a rule to
+ determine how many times the rule was executed. If the first
+ statement is a loop, the count is misleading.)
+
+ * For user-defined functions, the count next to the 'function'
+ keyword indicates how many times the function was called. The
+ counts next to the statements in the body show how many times those
+ statements were executed.
+
+ * The layout uses "K&R" style with TABs. Braces are used everywhere,
+ even when the body of an 'if', 'else', or loop is only a single
+ statement.
+
+ * Parentheses are used only where needed, as indicated by the
+ structure of the program and the precedence rules. For example,
+ '(3 + 5) * 4' means add three and five, then multiply the total by
+ four. However, '3 + 5 * 4' has no parentheses, and means '3 + (5 *
+ 4)'.
+
+ * Parentheses are used around the arguments to 'print' and 'printf'
+ only when the 'print' or 'printf' statement is followed by a
+ redirection. Similarly, if the target of a redirection isn't a
+ scalar, it gets parenthesized.
+
+ * 'gawk' supplies leading comments in front of the 'BEGIN' and 'END'
+ rules, the 'BEGINFILE' and 'ENDFILE' rules, the pattern-action
+ rules, and the functions.
+
+ The profiled version of your program may not look exactly like what
+you typed when you wrote it. This is because 'gawk' creates the
+profiled version by "pretty-printing" its internal representation of the
+program. The advantage to this is that 'gawk' can produce a standard
+representation. Also, things such as:
+
+ /foo/
+
+come out as:
+
+ /foo/ {
+ print $0
+ }
+
+which is correct, but possibly unexpected.
+
+ Besides creating profiles when a program has completed, 'gawk' can
+produce a profile while it is running. This is useful if your 'awk'
+program goes into an infinite loop and you want to see what has been
+executed. To use this feature, run 'gawk' with the '--profile' option
+in the background:
+
+ $ gawk --profile -f myprog &
+ [1] 13992
+
+The shell prints a job number and process ID number; in this case,
+13992. Use the 'kill' command to send the 'USR1' signal to 'gawk':
+
+ $ kill -USR1 13992
+
+As usual, the profiled version of the program is written to
+'awkprof.out', or to a different file if one was specified with the
+'--profile' option.
+
+ Along with the regular profile, as shown earlier, the profile file
+includes a trace of any active functions:
+
+ # Function Call Stack:
+
+ # 3. baz
+ # 2. bar
+ # 1. foo
+ # -- main --
+
+ You may send 'gawk' the 'USR1' signal as many times as you like.
+Each time, the profile and function call trace are appended to the
+output profile file.
+
+ If you use the 'HUP' signal instead of the 'USR1' signal, 'gawk'
+produces the profile and the function call trace and then exits.
+
+ When 'gawk' runs on MS-Windows systems, it uses the 'INT' and 'QUIT'
+signals for producing the profile, and in the case of the 'INT' signal,
+'gawk' exits. This is because these systems don't support the 'kill'
+command, so the only signals you can deliver to a program are those
+generated by the keyboard. The 'INT' signal is generated by the
+'Ctrl-c' or 'Ctrl-BREAK' key, while the 'QUIT' signal is generated by
+the 'Ctrl-\' key.
+
+ Finally, 'gawk' also accepts another option, '--pretty-print'. When
+called this way, 'gawk' "pretty-prints" the program into 'awkprof.out',
+without any execution counts.
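+
+   For example, to pretty-print 'myprog.awk' (from the earlier
+profiling example) into 'awkprof.out':
+
+     gawk --pretty-print -f myprog.awk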
+
+ NOTE: Once upon a time, the '--pretty-print' option would also run
+     your program. This is no longer the case.
+
+ There is a significant difference between the output created when
+profiling, and that created when pretty-printing. Pretty-printed output
+preserves the original comments that were in the program, although their
+placement may not correspond exactly to their original locations in the
+source code.(1)
+
+ However, as a deliberate design decision, profiling output _omits_
+the original program's comments. This allows you to focus on the
+execution count data and helps you avoid the temptation to use the
+profiler for pretty-printing.
+
+ Additionally, pretty-printed output does not have the leading
+indentation that the profiling output does. This makes it easy to
+pretty-print your code once development is completed, and then use the
+result as the final version of your program.
+
+ Because the internal representation of your program is formatted to
+recreate an 'awk' program, profiling and pretty-printing automatically
+disable 'gawk''s default optimizations.
+
+ ---------- Footnotes ----------
+
+ (1) 'gawk' does the best it can to preserve the distinction between
+comments at the end of a statement and comments on lines by themselves.
+Due to implementation constraints, it does not always do so correctly,
+particularly for 'switch' statements. The 'gawk' maintainers hope to
+improve this in a subsequent release.
+
+
+File: gawk.info, Node: Advanced Features Summary, Prev: Profiling, Up: Advanced Features
+
+12.6 Summary
+============
+
+ * The '--non-decimal-data' option causes 'gawk' to treat octal- and
+ hexadecimal-looking input data as octal and hexadecimal. This
+ option should be used with caution or not at all; use of
+ 'strtonum()' is preferable. Note that this option may disappear in
+ a future version of 'gawk'.
+
+ * You can take over complete control of sorting in 'for (INDX in
+ ARRAY)' array traversal by setting 'PROCINFO["sorted_in"]' to the
+ name of a user-defined function that does the comparison of array
+ elements based on index and value.
+
+ * Similarly, you can supply the name of a user-defined comparison
+ function as the third argument to either 'asort()' or 'asorti()' to
+ control how those functions sort arrays. Or you may provide one of
+ the predefined control strings that work for
+ 'PROCINFO["sorted_in"]'.
+
+ * You can use the '|&' operator to create a two-way pipe to a
+ coprocess. You read from the coprocess with 'getline' and write to
+ it with 'print' or 'printf'. Use 'close()' to close off the
+ coprocess completely, or optionally, close off one side of the
+ two-way communications.
+
+ * By using special file names with the '|&' operator, you can open a
+ TCP/IP (or UDP/IP) connection to remote hosts on the Internet.
+ 'gawk' supports both IPv4 and IPv6.
+
+ * You can generate statement count profiles of your program. This
+ can help you determine which parts of your program may be taking
+ the most time and let you tune them more easily. Sending the
+ 'USR1' signal while profiling causes 'gawk' to dump the profile and
+ keep going, including a function call stack.
+
+ * You can also just "pretty-print" the program.
+
+
+File: gawk.info, Node: Internationalization, Next: Debugger, Prev: Advanced Features, Up: Top
+
+13 Internationalization with 'gawk'
+***********************************
+
+Once upon a time, computer makers wrote software that worked only in
+English. Eventually, hardware and software vendors noticed that if
+their systems worked in the native languages of non-English-speaking
+countries, they were able to sell more systems. As a result,
+internationalization and localization of programs and software systems
+became a common practice.
+
+ For many years, the ability to provide internationalization was
+largely restricted to programs written in C and C++. This major node
+describes the underlying library 'gawk' uses for internationalization,
+as well as how 'gawk' makes internationalization features available at
+the 'awk' program level. Having internationalization available at the
+'awk' level gives software developers additional flexibility--they are
+no longer forced to write in C or C++ when internationalization is a
+requirement.
+
+* Menu:
+
+* I18N and L10N:: Internationalization and Localization.
+* Explaining gettext:: How GNU 'gettext' works.
+* Programmer i18n:: Features for the programmer.
+* Translator i18n:: Features for the translator.
+* I18N Example:: A simple i18n example.
+* Gawk I18N:: 'gawk' is also internationalized.
+* I18N Summary:: Summary of I18N stuff.
+
+
+File: gawk.info, Node: I18N and L10N, Next: Explaining gettext, Up: Internationalization
+
+13.1 Internationalization and Localization
+==========================================
+
+"Internationalization" means writing (or modifying) a program once, in
+such a way that it can use multiple languages without requiring further
+source code changes. "Localization" means providing the data necessary
+for an internationalized program to work in a particular language. Most
+typically, these terms refer to features such as the language used for
+printing error messages, the language used to read responses, and
+information related to how numerical and monetary values are printed and
+read.
+
+
+File: gawk.info, Node: Explaining gettext, Next: Programmer i18n, Prev: I18N and L10N, Up: Internationalization
+
+13.2 GNU 'gettext'
+==================
+
+'gawk' uses GNU 'gettext' to provide its internationalization features.
+The facilities in GNU 'gettext' focus on messages: strings printed by a
+program, either directly or via formatting with 'printf' or
+'sprintf()'.(1)
+
+ When using GNU 'gettext', each application has its own "text domain".
+This is a unique name, such as 'kpilot' or 'gawk', that identifies the
+application. A complete application may have multiple
+components--programs written in C or C++, as well as scripts written in
+'sh' or 'awk'. All of the components use the same text domain.
+
+ To make the discussion concrete, assume we're writing an application
+named 'guide'. Internationalization consists of the following steps, in
+this order:
+
+ 1. The programmer reviews the source for all of 'guide''s components
+ and marks each string that is a candidate for translation. For
+ example, '"`-F': option required"' is a good candidate for
+ translation. A table with strings of option names is not (e.g.,
+ 'gawk''s '--profile' option should remain the same, no matter what
+ the local language).
+
+ 2. The programmer indicates the application's text domain ('"guide"')
+ to the 'gettext' library, by calling the 'textdomain()' function.
+
+ 3. Messages from the application are extracted from the source code
+ and collected into a portable object template file ('guide.pot'),
+ which lists the strings and their translations. The translations
+ are initially empty. The original (usually English) messages serve
+ as the key for lookup of the translations.
+
+ 4. For each language with a translator, 'guide.pot' is copied to a
+ portable object file ('.po') and translations are created and
+ shipped with the application. For example, there might be a
+ 'fr.po' for a French translation.
+
+ 5. Each language's '.po' file is converted into a binary message
+ object ('.gmo') file. A message object file contains the original
+ messages and their translations in a binary format that allows fast
+ lookup of translations at runtime.
+
+ 6. When 'guide' is built and installed, the binary translation files
+ are installed in a standard place.
+
+ 7. For testing and development, it is possible to tell 'gettext' to
+ use '.gmo' files in a different directory than the standard one by
+ using the 'bindtextdomain()' function.
+
+ 8. At runtime, 'guide' looks up each string via a call to 'gettext()'.
+ The returned string is the translated string if available, or the
+ original string if not.
+
+ 9. If necessary, it is possible to access messages from a different
+ text domain than the one belonging to the application, without
+ having to switch the application's default text domain back and
+ forth.
+
+ In C (or C++), the string marking and dynamic translation lookup are
+accomplished by wrapping each string in a call to 'gettext()':
+
+ printf("%s", gettext("Don't Panic!\n"));
+
+ The tools that extract messages from source code pull out all strings
+enclosed in calls to 'gettext()'.
+
+ The GNU 'gettext' developers, recognizing that typing 'gettext(...)'
+over and over again is both painful and ugly to look at, use the macro
+'_' (an underscore) to make things easier:
+
+ /* In the standard header file: */
+ #define _(str) gettext(str)
+
+ /* In the program text: */
+ printf("%s", _("Don't Panic!\n"));
+
+This reduces the typing overhead to just three extra characters per
+string and is considerably easier to read as well.
+
+ There are locale "categories" for different types of locale-related
+information. The defined locale categories that 'gettext' knows about
+are:
+
+'LC_MESSAGES'
+ Text messages. This is the default category for 'gettext'
+ operations, but it is possible to supply a different one
+ explicitly, if necessary. (It is almost never necessary to supply
+ a different category.)
+
+'LC_COLLATE'
+ Text-collation information (i.e., how different characters and/or
+ groups of characters sort in a given language).
+
+'LC_CTYPE'
+ Character-type information (alphabetic, digit, upper- or lowercase,
+ and so on) as well as character encoding. This information is
+ accessed via the POSIX character classes in regular expressions,
+ such as '/[[:alnum:]]/' (*note Bracket Expressions::).
+
+'LC_MONETARY'
+ Monetary information, such as the currency symbol, and whether the
+ symbol goes before or after a number.
+
+'LC_NUMERIC'
+ Numeric information, such as which characters to use for the
+ decimal point and the thousands separator.(2)
+
+'LC_TIME'
+ Time- and date-related information, such as 12- or 24-hour clock,
+ month printed before or after the day in a date, local month
+ abbreviations, and so on.
+
+'LC_ALL'
+ All of the above. (Not too useful in the context of 'gettext'.)
+
+ NOTE: As described in *note Locales::, environment variables with
+ the same name as the locale categories ('LC_CTYPE', 'LC_ALL', etc.)
+ influence 'gawk''s behavior (and that of other utilities).
+
+ Normally, these variables also affect how the 'gettext' library
+ finds translations. However, the 'LANGUAGE' environment variable
+ overrides the 'LC_XXX' variables. Many GNU/Linux systems may
+ define this variable without your knowledge, causing 'gawk' to not
+ find the correct translations. If this happens to you, look to see
+ if 'LANGUAGE' is defined, and if so, use the shell's 'unset'
+ command to remove it.
+
+ For testing translations of 'gawk' itself, you can set the
+'GAWK_LOCALE_DIR' environment variable. See the documentation for the C
+'bindtextdomain()' function and also see *note Other Environment
+Variables::.
+
+ ---------- Footnotes ----------
+
+ (1) For some operating systems, the 'gawk' port doesn't support GNU
+'gettext'. Therefore, these features are not available if you are using
+one of those operating systems. Sorry.
+
+ (2) Americans use a comma every three decimal places and a period for
+the decimal point, while many Europeans do exactly the opposite:
+1,234.56 versus 1.234,56.
+
+
+File: gawk.info, Node: Programmer i18n, Next: Translator i18n, Prev: Explaining gettext, Up: Internationalization
+
+13.3 Internationalizing 'awk' Programs
+======================================
+
+'gawk' provides the following variables for internationalization:
+
+'TEXTDOMAIN'
+ This variable indicates the application's text domain. For
+ compatibility with GNU 'gettext', the default value is
+ '"messages"'.
+
+'_"your message here"'
+ String constants marked with a leading underscore are candidates
+ for translation at runtime. String constants without a leading
+ underscore are not translated.
+
+ 'gawk' provides the following functions for internationalization:
+
+'dcgettext(STRING [, DOMAIN [, CATEGORY]])'
+ Return the translation of STRING in text domain DOMAIN for locale
+ category CATEGORY. The default value for DOMAIN is the current
+ value of 'TEXTDOMAIN'. The default value for CATEGORY is
+ '"LC_MESSAGES"'.
+
+ If you supply a value for CATEGORY, it must be a string equal to
+ one of the known locale categories described in *note Explaining
+ gettext::. You must also supply a text domain. Use 'TEXTDOMAIN'
+ if you want to use the current domain.
+
+ CAUTION: The order of arguments to the 'awk' version of the
+ 'dcgettext()' function is purposely different from the order
+ for the C version. The 'awk' version's order was chosen to be
+ simple and to allow for reasonable 'awk'-style default
+ arguments.
+
+'dcngettext(STRING1, STRING2, NUMBER [, DOMAIN [, CATEGORY]])'
+ Return the plural form used for NUMBER of the translation of
+ STRING1 and STRING2 in text domain DOMAIN for locale category
+ CATEGORY. STRING1 is the English singular variant of a message,
+ and STRING2 is the English plural variant of the same message. The
+ default value for DOMAIN is the current value of 'TEXTDOMAIN'. The
+ default value for CATEGORY is '"LC_MESSAGES"'.
+
+ The same remarks about argument order as for the 'dcgettext()'
+ function apply.
+
+'bindtextdomain(DIRECTORY [, DOMAIN ])'
+ Change the directory in which 'gettext' looks for '.gmo' files, in
+ case they will not or cannot be placed in the standard locations
+ (e.g., during testing). Return the directory in which DOMAIN is
+ "bound."
+
+ The default DOMAIN is the value of 'TEXTDOMAIN'. If DIRECTORY is
+ the null string ('""'), then 'bindtextdomain()' returns the current
+ binding for the given DOMAIN.
+
+ To use these facilities in your 'awk' program, follow these steps:
+
+ 1. Set the variable 'TEXTDOMAIN' to the text domain of your program.
+ This is best done in a 'BEGIN' rule (*note BEGIN/END::), or it can
+ also be done via the '-v' command-line option (*note Options::):
+
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ ...
+ }
+
+ 2. Mark all translatable strings with a leading underscore ('_')
+ character. It _must_ be adjacent to the opening quote of the
+ string. For example:
+
+ print _"hello, world"
+ x = _"you goofed"
+ printf(_"Number of users is %d\n", nusers)
+
+ 3. If you are creating strings dynamically, you can still translate
+ them, using the 'dcgettext()' built-in function:(1)
+
+ if (groggy)
+ message = dcgettext("%d customers disturbing me\n", "adminprog")
+ else
+ message = dcgettext("enjoying %d customers\n", "adminprog")
+ printf(message, ncustomers)
+
+ Here, the call to 'dcgettext()' supplies a different text domain
+ ('"adminprog"') in which to find the message, but it uses the
+ default '"LC_MESSAGES"' category.
+
+ The previous example only works if 'ncustomers' is greater than
+ one. This example would be better done with 'dcngettext()':
+
+ if (groggy)
+ message = dcngettext("%d customer disturbing me\n",
+ "%d customers disturbing me\n", "adminprog")
+ else
+ message = dcngettext("enjoying %d customer\n",
+ "enjoying %d customers\n", "adminprog")
+ printf(message, ncustomers)
+
+ 4. During development, you might want to put the '.gmo' file in a
+ private directory for testing. This is done with the
+ 'bindtextdomain()' built-in function:
+
+ BEGIN {
+ TEXTDOMAIN = "guide" # our text domain
+ if (Testing) {
+ # where to find our files
+ bindtextdomain("testdir")
+ # joe is in charge of adminprog
+ bindtextdomain("../joe/testdir", "adminprog")
+ }
+ ...
+ }
+
+ *Note I18N Example:: for an example program showing the steps to
+create and use translations from 'awk'.
+
+ ---------- Footnotes ----------
+
+ (1) Thanks to Bruno Haible for this example.
+
+
+File: gawk.info, Node: Translator i18n, Next: I18N Example, Prev: Programmer i18n, Up: Internationalization
+
+13.4 Translating 'awk' Programs
+===============================
+
+Once a program's translatable strings have been marked, they must be
+extracted to create the initial '.pot' file. As part of translation, it
+is often helpful to rearrange the order in which arguments to 'printf'
+are output.
+
+ 'gawk''s '--gen-pot' command-line option extracts the messages and is
+discussed next. After that, 'printf''s ability to rearrange the order
+for 'printf' arguments at runtime is covered.
+
+* Menu:
+
+* String Extraction:: Extracting marked strings.
+* Printf Ordering:: Rearranging 'printf' arguments.
+* I18N Portability:: 'awk'-level portability issues.
+
+
+File: gawk.info, Node: String Extraction, Next: Printf Ordering, Up: Translator i18n
+
+13.4.1 Extracting Marked Strings
+--------------------------------
+
+Once your 'awk' program is working, and all the strings have been marked
+and you've set (and perhaps bound) the text domain, it is time to
+produce translations. First, use the '--gen-pot' command-line option to
+create the initial '.pot' file:
+
+ gawk --gen-pot -f guide.awk > guide.pot
+
+ When run with '--gen-pot', 'gawk' does not execute your program.
+Instead, it parses it as usual and prints all marked strings to standard
+output in the format of a GNU 'gettext' Portable Object file. Also
+included in the output are any constant strings that appear as the first
+argument to 'dcgettext()' or as the first and second argument to
+'dcngettext()'.(1) You should distribute the generated '.pot' file with
+your 'awk' program; translators will eventually use it to provide you
+translations that you can also then distribute. *Note I18N Example::
+for the full list of steps to go through to create and test translations
+for 'guide'.
+
+ ---------- Footnotes ----------
+
+ (1) The 'xgettext' utility that comes with GNU 'gettext' can handle
+'.awk' files.
+
+
+File: gawk.info, Node: Printf Ordering, Next: I18N Portability, Prev: String Extraction, Up: Translator i18n
+
+13.4.2 Rearranging 'printf' Arguments
+-------------------------------------
+
+Format strings for 'printf' and 'sprintf()' (*note Printf::) present a
+special problem for translation. Consider the following:(1)
+
+ printf(_"String `%s' has %d characters\n",
+            string, length(string))
+
+ A possible German translation for this might be:
+
+ "%d Zeichen lang ist die Zeichenkette `%s'\n"
+
+ The problem should be obvious: the order of the format specifications
+is different from the original! Even though 'gettext()' can return the
+translated string at runtime, it cannot change the argument order in the
+call to 'printf'.
+
+ To solve this problem, 'printf' format specifiers may have an
+additional optional element, which we call a "positional specifier".
+For example:
+
+ "%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
+
+ Here, the positional specifier consists of an integer count, which
+indicates which argument to use, and a '$'. Counts are one-based, and
+the format string itself is _not_ included. Thus, in the following
+example, 'string' is the first argument and 'length(string)' is the
+second:
+
+ $ gawk 'BEGIN {
+ > string = "Don\47t Panic"
+ > printf "%2$d characters live in \"%1$s\"\n",
+ > string, length(string)
+ > }'
+ -| 11 characters live in "Don't Panic"
+
+ If present, positional specifiers come first in the format
+specification, before the flags, the field width, and/or the precision.
+
+ Positional specifiers can be used with the dynamic field width and
+precision capability:
+
+ $ gawk 'BEGIN {
+ > printf("%*.*s\n", 10, 20, "hello")
+ > printf("%3$*2$.*1$s\n", 20, 10, "hello")
+ > }'
+ -| hello
+ -| hello
+
+ NOTE: When using '*' with a positional specifier, the '*' comes
+ first, then the integer position, and then the '$'. This is
+ somewhat counterintuitive.
+
+ 'gawk' does not allow you to mix regular format specifiers and those
+with positional specifiers in the same string:
+
+ $ gawk 'BEGIN { printf "%d %3$s\n", 1, 2, "hi" }'
+ error-> gawk: cmd. line:1: fatal: must use `count$' on all formats or none
+
+ NOTE: There are some pathological cases that 'gawk' may fail to
+ diagnose. In such cases, the output may not be what you expect.
+ It's still a bad idea to try mixing them, even if 'gawk' doesn't
+ detect it.
+
+ Although positional specifiers can be used directly in 'awk'
+programs, their primary purpose is to help in producing correct
+translations of format strings into languages different from the one in
+which the program is first written.
+
+ ---------- Footnotes ----------
+
+ (1) This example is borrowed from the GNU 'gettext' manual.
+
+
+File: gawk.info, Node: I18N Portability, Prev: Printf Ordering, Up: Translator i18n
+
+13.4.3 'awk' Portability Issues
+-------------------------------
+
+'gawk''s internationalization features were purposely chosen to have as
+little impact as possible on the portability of 'awk' programs that use
+them to other versions of 'awk'. Consider this program:
+
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ if (Test_Guide) # set with -v
+ bindtextdomain("/test/guide/messages")
+ print _"don't panic!"
+ }
+
+As written, it won't work on other versions of 'awk'. However, it is
+actually almost portable, requiring very little change:
+
+ * Assignments to 'TEXTDOMAIN' won't have any effect, because
+ 'TEXTDOMAIN' is not special in other 'awk' implementations.
+
+ * Non-GNU versions of 'awk' treat marked strings as the concatenation
+ of a variable named '_' with the string following it.(1)
+ Typically, the variable '_' has the null string ('""') as its
+ value, leaving the original string constant as the result.
+
+ * By defining "dummy" functions to replace 'dcgettext()',
+ 'dcngettext()', and 'bindtextdomain()', the 'awk' program can be
+ made to run, but all the messages are output in the original
+ language. For example:
+
+ function bindtextdomain(dir, domain)
+ {
+ return dir
+ }
+
+ function dcgettext(string, domain, category)
+ {
+ return string
+ }
+
+ function dcngettext(string1, string2, number, domain, category)
+ {
+ return (number == 1 ? string1 : string2)
+ }
+
+ * The use of positional specifications in 'printf' or 'sprintf()' is
+ _not_ portable. To support 'gettext()' at the C level, many
+ systems' C versions of 'sprintf()' do support positional
+ specifiers. But it works only if enough arguments are supplied in
+ the function call. Many versions of 'awk' pass 'printf' formats
+ and arguments unchanged to the underlying C library version of
+ 'sprintf()', but only one format and argument at a time. What
+ happens if a positional specification is used is anybody's guess.
+ However, because the positional specifications are primarily for
+ use in _translated_ format strings, and because non-GNU 'awk's
+ never retrieve the translated string, this should not be a problem
+ in practice.
+
+ ---------- Footnotes ----------
+
+ (1) This is good fodder for an "Obfuscated 'awk'" contest.
+
+
+File: gawk.info, Node: I18N Example, Next: Gawk I18N, Prev: Translator i18n, Up: Internationalization
+
+13.5 A Simple Internationalization Example
+==========================================
+
+Now let's look at a step-by-step example of how to internationalize and
+localize a simple 'awk' program, using 'guide.awk' as our original
+source:
+
+ BEGIN {
+ TEXTDOMAIN = "guide"
+ bindtextdomain(".") # for testing
+ print _"Don't Panic"
+ print _"The Answer Is", 42
+ print "Pardon me, Zaphod who?"
+ }
+
+Run 'gawk --gen-pot' to create the '.pot' file:
+
+ $ gawk --gen-pot -f guide.awk > guide.pot
+
+This produces:
+
+ #: guide.awk:4
+ msgid "Don't Panic"
+ msgstr ""
+
+ #: guide.awk:5
+ msgid "The Answer Is"
+ msgstr ""
+
+
+ This original portable object template file is saved and reused for
+each language into which the application is translated. The 'msgid' is
+the original string and the 'msgstr' is the translation.
+
+ NOTE: Strings not marked with a leading underscore do not appear in
+ the 'guide.pot' file.
+
+ Next, the messages must be translated. Here is a translation to a
+hypothetical dialect of English, called "Mellow":(1)
+
+ $ cp guide.pot guide-mellow.po
+ ADD TRANSLATIONS TO guide-mellow.po ...
+
+Following are the translations:
+
+ #: guide.awk:4
+ msgid "Don't Panic"
+ msgstr "Hey man, relax!"
+
+ #: guide.awk:5
+ msgid "The Answer Is"
+ msgstr "Like, the scoop is"
+
+
+ The next step is to make the directory to hold the binary message
+object file and then to create the 'guide.mo' file. We pretend that our
+file is to be used in the 'en_US.UTF-8' locale, because we have to use a
+locale name known to the C 'gettext' routines. The directory layout
+shown here is standard for GNU 'gettext' on GNU/Linux systems. Other
+versions of 'gettext' may use a different layout:
+
+ $ mkdir en_US.UTF-8 en_US.UTF-8/LC_MESSAGES
+
+ The 'msgfmt' utility does the conversion from human-readable '.po'
+file to machine-readable '.mo' file. By default, 'msgfmt' creates a
+file named 'messages'. This file must be renamed and placed in the
+proper directory (using the '-o' option) so that 'gawk' can find it:
+
+ $ msgfmt guide-mellow.po -o en_US.UTF-8/LC_MESSAGES/guide.mo
+
+ Finally, we run the program to test it:
+
+ $ gawk -f guide.awk
+ -| Hey man, relax!
+ -| Like, the scoop is 42
+ -| Pardon me, Zaphod who?
+
+ If the three replacement functions for 'dcgettext()', 'dcngettext()',
+and 'bindtextdomain()' (*note I18N Portability::) are in a file named
+'libintl.awk', then we can run 'guide.awk' unchanged as follows:
+
+ $ gawk --posix -f guide.awk -f libintl.awk
+ -| Don't Panic
+ -| The Answer Is 42
+ -| Pardon me, Zaphod who?
+
+ ---------- Footnotes ----------
+
+ (1) Perhaps it would be better if it were called "Hippy." Ah, well.
+
+
+File: gawk.info, Node: Gawk I18N, Next: I18N Summary, Prev: I18N Example, Up: Internationalization
+
+13.6 'gawk' Can Speak Your Language
+===================================
+
+'gawk' itself has been internationalized using the GNU 'gettext'
+package. (GNU 'gettext' is described in complete detail in *note (GNU
+'gettext' utilities, gettext, GNU 'gettext' utilities)Top::.) As of
+this writing, the latest version of GNU 'gettext' is version 0.19.4
+(ftp://ftp.gnu.org/gnu/gettext/gettext-0.19.4.tar.gz).
+
+ If a translation of 'gawk''s messages exists, then 'gawk' produces
+usage messages, warnings, and fatal errors in the local language.
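+
+   For example (a hypothetical invocation; whether it has any visible
+effect depends upon which translations are installed on your system),
+you can ask for 'gawk''s own diagnostics in French like so:
+
+     $ LC_ALL=fr_FR.UTF-8 gawk --no-such-option
+
+If a French translation of 'gawk''s messages is installed, the
+resulting usage message appears in French; otherwise it appears in
+English.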
+
+
+File: gawk.info, Node: I18N Summary, Prev: Gawk I18N, Up: Internationalization
+
+13.7 Summary
+============
+
+ * Internationalization means writing a program such that it can use
+ multiple languages without requiring source code changes.
+ Localization means providing the data necessary for an
+ internationalized program to work in a particular language.
+
+ * 'gawk' uses GNU 'gettext' to let you internationalize and localize
+ 'awk' programs. A program's text domain identifies the program for
+ grouping all messages and other data together.
+
+ * You mark a program's strings for translation by preceding them with
+ an underscore. Once that is done, the strings are extracted into a
+ '.pot' file. This file is copied for each language into a '.po'
+ file, and the '.po' files are compiled into '.gmo' files for use at
+ runtime.
+
+ * You can use positional specifications with 'sprintf()' and 'printf'
+ to rearrange the placement of argument values in formatted strings
+ and output. This is useful for the translation of format control
+ strings.
+
+ * The internationalization features have been designed so that they
+ can be easily worked around in a standard 'awk'.
+
+ * 'gawk' itself has been internationalized and ships with a number of
+ translations for its messages.
+
+
+File: gawk.info, Node: Debugger, Next: Arbitrary Precision Arithmetic, Prev: Internationalization, Up: Top
+
+14 Debugging 'awk' Programs
+***************************
+
+It would be nice if computer programs worked perfectly the first time
+they were run, but in real life, this rarely happens for programs of any
+complexity. Thus, most programming languages have facilities available
+for "debugging" programs, and now 'awk' is no exception.
+
+ The 'gawk' debugger is purposely modeled after the GNU Debugger (GDB)
+(http://www.gnu.org/software/gdb/) command-line debugger. If you are
+familiar with GDB, learning how to use 'gawk' for debugging your program
+is easy.
+
+* Menu:
+
+* Debugging:: Introduction to 'gawk' debugger.
+* Sample Debugging Session:: Sample debugging session.
+* List of Debugger Commands:: Main debugger commands.
+* Readline Support:: Readline support.
+* Limitations:: Limitations and future plans.
+* Debugging Summary:: Debugging summary.
+
+
+File: gawk.info, Node: Debugging, Next: Sample Debugging Session, Up: Debugger
+
+14.1 Introduction to the 'gawk' Debugger
+========================================
+
+This minor node introduces debugging in general and begins the
+discussion of debugging in 'gawk'.
+
+* Menu:
+
+* Debugging Concepts:: Debugging in General.
+* Debugging Terms:: Additional Debugging Concepts.
+* Awk Debugging:: Awk Debugging.
+
+
+File: gawk.info, Node: Debugging Concepts, Next: Debugging Terms, Up: Debugging
+
+14.1.1 Debugging in General
+---------------------------
+
+(If you have used debuggers in other languages, you may want to skip
+ahead to *note Awk Debugging::.)
+
+ Of course, a debugging program cannot remove bugs for you, because it
+has no way of knowing what you or your users consider a "bug" versus a
+"feature." (Sometimes, we humans have a hard time with this ourselves.)
+In that case, what can you expect from such a tool? The answer to that
+depends on the language being debugged, but in general, you can expect
+at least the following:
+
+ * The ability to watch a program execute its instructions one by one,
+ giving you, the programmer, the opportunity to think about what is
+ happening on a time scale of seconds, minutes, or hours, rather
+ than the nanosecond time scale at which the code usually runs.
+
+ * The opportunity to not only passively observe the operation of your
+ program, but to control it and try different paths of execution,
+ without having to change your source files.
+
+ * The chance to see the values of data in the program at any point in
+ execution, and also to change that data on the fly, to see how that
+ affects what happens afterward. (This often includes the ability
+ to look at internal data structures besides the variables you
+ actually defined in your code.)
+
+ * The ability to obtain additional information about your program's
+ state or even its internal structure.
+
+ All of these tools provide a great amount of help in using your own
+skills and understanding of the goals of your program to find where it
+is going wrong (or, for that matter, to better comprehend a perfectly
+functional program that you or someone else wrote).
+
+
+File: gawk.info, Node: Debugging Terms, Next: Awk Debugging, Prev: Debugging Concepts, Up: Debugging
+
+14.1.2 Additional Debugging Concepts
+------------------------------------
+
+Before diving into the details, we need to introduce several important
+concepts that apply to just about all debuggers. The following list
+defines terms used throughout the rest of this major node:
+
+"Stack frame"
+ Programs generally call functions during the course of their
+ execution. One function can call another, or a function can call
+ itself (recursion). You can view the chain of called functions
+ (main program calls A, which calls B, which calls C), as a stack of
+ executing functions: the currently running function is the topmost
+ one on the stack, and when it finishes (returns), the next one down
+ then becomes the active function. Such a stack is termed a "call
+ stack".
+
+ For each function on the call stack, the system maintains a data
+ area that contains the function's parameters, local variables, and
+ return value, as well as any other "bookkeeping" information needed
+ to manage the call stack. This data area is termed a "stack
+ frame".
+
+ 'gawk' also follows this model, and gives you access to the call
+ stack and to each stack frame. You can see the call stack, as well
+ as from where each function on the stack was invoked. Commands
+ that print the call stack print information about each stack frame
+ (as detailed later on).
+
+"Breakpoint"
+ During debugging, you often wish to let the program run until it
+ reaches a certain point, and then continue execution from there one
+ statement (or instruction) at a time. The way to do this is to set
+ a "breakpoint" within the program. A breakpoint is where the
+ execution of the program should break off (stop), so that you can
+ take over control of the program's execution. You can add and
+ remove as many breakpoints as you like.
+
+"Watchpoint"
+ A watchpoint is similar to a breakpoint. The difference is that
+ breakpoints are oriented around the code: stop when a certain point
+ in the code is reached. A watchpoint, however, specifies that
+ program execution should stop when a _data value_ is changed. This
+ is useful, as sometimes it happens that a variable receives an
+ erroneous value, and it's hard to track down where this happens
+ just by looking at the code. By using a watchpoint, you can stop
+ whenever a variable is assigned to, and usually find the errant
+ code quite quickly.
+
+
+File: gawk.info, Node: Awk Debugging, Prev: Debugging Terms, Up: Debugging
+
+14.1.3 'awk' Debugging
+----------------------
+
+Debugging an 'awk' program has some specific aspects that are not shared
+with programs written in other languages.
+
+ First of all, the fact that 'awk' programs usually take input line by
+line from a file or files and operate on those lines using specific
+rules makes it especially useful to organize viewing the execution of
+the program in terms of these rules. As we will see, each 'awk' rule is
+treated almost like a function call, with its own specific block of
+instructions.
+
+ In addition, because 'awk' is by design a very concise language, it
+is easy to lose sight of everything that is going on "inside" each line
+of 'awk' code. The debugger provides the opportunity to look at the
+individual primitive instructions carried out by the higher-level 'awk'
+commands.
+
+
+File: gawk.info, Node: Sample Debugging Session, Next: List of Debugger Commands, Prev: Debugging, Up: Debugger
+
+14.2 Sample 'gawk' Debugging Session
+====================================
+
+In order to illustrate the use of 'gawk' as a debugger, let's look at a
+sample debugging session. We will use the 'awk' implementation of the
+POSIX 'uniq' command described earlier (*note Uniq Program::) as our
+example.
+
+* Menu:
+
+* Debugger Invocation:: How to Start the Debugger.
+* Finding The Bug:: Finding the Bug.
+
+
+File: gawk.info, Node: Debugger Invocation, Next: Finding The Bug, Up: Sample Debugging Session
+
+14.2.1 How to Start the Debugger
+--------------------------------
+
+Starting the debugger is almost exactly like running 'gawk' normally,
+except you have to pass an additional option, '--debug', or the
+corresponding short option, '-D'. The file(s) containing the program
+and any supporting code are given on the command line as arguments to
+one or more '-f' options. ('gawk' is not designed to debug command-line
+programs, only programs contained in files.) In our case, we invoke the
+debugger like this:
+
+ $ gawk -D -f getopt.awk -f join.awk -f uniq.awk -1 inputfile
+
+where both 'getopt.awk' and 'uniq.awk' are in '$AWKPATH'. (Experienced
+users of GDB or similar debuggers should note that this syntax is
+slightly different from what you are used to. With the 'gawk' debugger,
+you give the arguments for running the program in the command line to
+the debugger rather than as part of the 'run' command at the debugger
+prompt.) The '-1' is an option to 'uniq.awk'.
+
+ Instead of immediately running the program on 'inputfile', as 'gawk'
+would ordinarily do, the debugger merely loads all the program source
+files, compiles them internally, and then gives us a prompt:
+
+ gawk>
+
+from which we can issue commands to the debugger. At this point, no
+code has been executed.
+
+
+File: gawk.info, Node: Finding The Bug, Prev: Debugger Invocation, Up: Sample Debugging Session
+
+14.2.2 Finding the Bug
+----------------------
+
+Let's say that we are having a problem using (a faulty version of)
+'uniq.awk' in the "field-skipping" mode, and it doesn't seem to be
+catching lines which should be identical when skipping the first field,
+such as:
+
+ awk is a wonderful program!
+ gawk is a wonderful program!
+
+ This could happen if we were thinking (C-like) of the fields in a
+record as being numbered in a zero-based fashion, so instead of the
+lines:
+
+ clast = join(alast, fcount+1, n)
+ cline = join(aline, fcount+1, m)
+
+we wrote:
+
+ clast = join(alast, fcount, n)
+ cline = join(aline, fcount, m)
+
+ The first thing we usually want to do when trying to investigate a
+problem like this is to put a breakpoint in the program so that we can
+watch it at work and catch what it is doing wrong. A reasonable spot
+for a breakpoint in 'uniq.awk' is at the beginning of the function
+'are_equal()', which compares the current line with the previous one.
+To set the breakpoint, use the 'b' (breakpoint) command:
+
+ gawk> b are_equal
+ -| Breakpoint 1 set at file `awklib/eg/prog/uniq.awk', line 63
+
+ The debugger tells us the file and line number where the breakpoint
+is. Now type 'r' or 'run' and the program runs until it hits the
+breakpoint for the first time:
+
+ gawk> r
+ -| Starting program:
+ -| Stopping in Rule ...
+ -| Breakpoint 1, are_equal(n, m, clast, cline, alast, aline)
+ at `awklib/eg/prog/uniq.awk':63
+ -| 63 if (fcount == 0 && charcount == 0)
+ gawk>
+
+ Now we can look at what's going on inside our program. First of all,
+let's see how we got to where we are. At the prompt, we type 'bt'
+(short for "backtrace"), and the debugger responds with a listing of the
+current stack frames:
+
+ gawk> bt
+ -| #0 are_equal(n, m, clast, cline, alast, aline)
+ at `awklib/eg/prog/uniq.awk':68
+ -| #1 in main() at `awklib/eg/prog/uniq.awk':88
+
+ This tells us that 'are_equal()' was called by the main program at
+line 88 of 'uniq.awk'. (This is not a big surprise, because this is the
+only call to 'are_equal()' in the program, but in more complex programs,
+knowing who called a function and with what parameters can be the key to
+finding the source of the problem.)
+
+ Now that we're in 'are_equal()', we can start looking at the values
+of some variables. Let's say we type 'p n' ('p' is short for "print").
+We would expect to see the value of 'n', a parameter to 'are_equal()'.
+Actually, the debugger gives us:
+
+ gawk> p n
+ -| n = untyped variable
+
+In this case, 'n' is an uninitialized local variable, because the
+function was called without arguments (*note Function Calls::).
+
+ A more useful variable to display might be the current record:
+
+ gawk> p $0
+ -| $0 = "gawk is a wonderful program!"
+
+This might be a bit puzzling at first, as this is the second line of our
+test input. Let's look at 'NR':
+
+ gawk> p NR
+ -| NR = 2
+
+So we can see that 'are_equal()' was only called for the second record
+of the file. Of course, this is because our program contains a rule for
+'NR == 1':
+
+ NR == 1 {
+ last = $0
+ next
+ }
+
+ OK, let's just check that that rule worked correctly:
+
+ gawk> p last
+ -| last = "awk is a wonderful program!"
+
+ Everything we have done so far has verified that the program has
+worked as planned, up to and including the call to 'are_equal()', so the
+problem must be inside this function. To investigate further, we must
+begin "stepping through" the lines of 'are_equal()'. We start by typing
+'n' (for "next"):
+
+ gawk> n
+ -| 66 if (fcount > 0) {
+
+ This tells us that 'gawk' is now ready to execute line 66, which
+decides whether to give the lines the special "field-skipping" treatment
+indicated by the '-1' command-line option. (Notice that we skipped from
+where we were before, at line 63, to here, because the condition in line
+63, 'if (fcount == 0 && charcount == 0)', was false.)
+
+ Continuing to step, we now get to the splitting of the current and
+last records:
+
+ gawk> n
+ -| 67 n = split(last, alast)
+ gawk> n
+ -| 68 m = split($0, aline)
+
+ At this point, we should be curious to see what our records were
+split into, so we try to look:
+
+ gawk> p n m alast aline
+ -| n = 5
+ -| m = untyped variable
+ -| alast = array, 5 elements
+ -| aline = untyped variable
+
+(The 'p' command can take more than one argument, similar to 'awk''s
+'print' statement.)
+
+ This is kind of disappointing, though. All we found out is that
+there are five elements in 'alast'; 'm' and 'aline' don't have values
+because we are at line 68 but haven't executed it yet. This information
+is useful enough (we now know that none of the words were accidentally
+left out), but what if we want to see inside the array?
+
+ The first choice would be to use subscripts:
+
+ gawk> p alast[0]
+ -| "0" not in array `alast'
+
+Oops!
+
+ gawk> p alast[1]
+ -| alast["1"] = "awk"
+
+ This would be kind of slow for a 100-member array, though, so 'gawk'
+provides a shortcut (reminiscent of another language not to be
+mentioned):
+
+ gawk> p @alast
+ -| alast["1"] = "awk"
+ -| alast["2"] = "is"
+ -| alast["3"] = "a"
+ -| alast["4"] = "wonderful"
+ -| alast["5"] = "program!"
+
+ It looks like we got this far OK. Let's take another step or two:
+
+ gawk> n
+ -| 69 clast = join(alast, fcount, n)
+ gawk> n
+ -| 70 cline = join(aline, fcount, m)
+
+ Well, here we are at our error (sorry to spoil the suspense). What
+we had in mind was to join the fields starting from the second one to
+make the virtual record to compare, and if the first field were numbered
+zero, this would work. Let's look at what we've got:
+
+ gawk> p cline clast
+ -| cline = "gawk is a wonderful program!"
+ -| clast = "awk is a wonderful program!"
+
+ Hey, those look pretty familiar! They're just our original,
+unaltered input records. A little thinking (the human brain is still
+the best debugging tool), and we realize that we were off by one!
+
+ We get out of the debugger:
+
+ gawk> q
+ -| The program is running. Exit anyway (y/n)? y
+
+Then we get into an editor:
+
+ clast = join(alast, fcount+1, n)
+ cline = join(aline, fcount+1, m)
+
+and problem solved!
+
+
+File: gawk.info, Node: List of Debugger Commands, Next: Readline Support, Prev: Sample Debugging Session, Up: Debugger
+
+14.3 Main Debugger Commands
+===========================
+
+The 'gawk' debugger command set can be divided into the following
+categories:
+
+ * Breakpoint control
+
+ * Execution control
+
+ * Viewing and changing data
+
+ * Working with the stack
+
+ * Getting information
+
+ * Miscellaneous
+
+   Each of these is discussed in the following subsections.  In the
+following descriptions, commands that may be abbreviated show the
+abbreviation on a second description line. A debugger command name may
+also be truncated if that partial name is unambiguous. The debugger has
+the built-in capability to automatically repeat the previous command
+just by hitting 'Enter'. This works for the commands 'list', 'next',
+'nexti', 'step', 'stepi', and 'continue' executed without any argument.
+
+* Menu:
+
+* Breakpoint Control:: Control of Breakpoints.
+* Debugger Execution Control:: Control of Execution.
+* Viewing And Changing Data:: Viewing and Changing Data.
+* Execution Stack:: Dealing with the Stack.
+* Debugger Info:: Obtaining Information about the Program and
+ the Debugger State.
+* Miscellaneous Debugger Commands:: Miscellaneous Commands.
+
+
+File: gawk.info, Node: Breakpoint Control, Next: Debugger Execution Control, Up: List of Debugger Commands
+
+14.3.1 Control of Breakpoints
+-----------------------------
+
+As we saw earlier, the first thing you probably want to do in a
+debugging session is to get your breakpoints set up, because your
+program will otherwise just run as if it were not under the debugger.
+The commands for controlling breakpoints are:
+
+'break' [[FILENAME':']N | FUNCTION] ['"EXPRESSION"']
+'b' [[FILENAME':']N | FUNCTION] ['"EXPRESSION"']
+ Without any argument, set a breakpoint at the next instruction to
+ be executed in the selected stack frame. Arguments can be one of
+ the following:
+
+ N
+ Set a breakpoint at line number N in the current source file.
+
+ FILENAME':'N
+ Set a breakpoint at line number N in source file FILENAME.
+
+ FUNCTION
+ Set a breakpoint at entry to (the first instruction of)
+ function FUNCTION.
+
+ Each breakpoint is assigned a number that can be used to delete it
+ from the breakpoint list using the 'delete' command.
+
+ With a breakpoint, you may also supply a condition. This is an
+ 'awk' expression (enclosed in double quotes) that the debugger
+ evaluates whenever the breakpoint is reached. If the condition is
+ true, then the debugger stops execution and prompts for a command.
+     Otherwise, it continues executing the program.  (A brief sketch of
+     a conditional breakpoint appears at the end of this minor node.)
+
+'clear' [[FILENAME':']N | FUNCTION]
+ Without any argument, delete any breakpoint at the next instruction
+ to be executed in the selected stack frame. If the program stops
+ at a breakpoint, this deletes that breakpoint so that the program
+ does not stop at that location again. Arguments can be one of the
+ following:
+
+ N
+ Delete breakpoint(s) set at line number N in the current
+ source file.
+
+ FILENAME':'N
+ Delete breakpoint(s) set at line number N in source file
+ FILENAME.
+
+ FUNCTION
+ Delete breakpoint(s) set at entry to function FUNCTION.
+
+'condition' N '"EXPRESSION"'
+ Add a condition to existing breakpoint or watchpoint N. The
+ condition is an 'awk' expression _enclosed in double quotes_ that
+ the debugger evaluates whenever the breakpoint or watchpoint is
+ reached. If the condition is true, then the debugger stops
+ execution and prompts for a command. Otherwise, the debugger
+ continues executing the program. If the condition expression is
+ not specified, any existing condition is removed (i.e., the
+ breakpoint or watchpoint is made unconditional).
+
+'delete' [N1 N2 ...] [N-M]
+'d' [N1 N2 ...] [N-M]
+ Delete specified breakpoints or a range of breakpoints. Delete all
+ defined breakpoints if no argument is supplied.
+
+'disable' [N1 N2 ... | N-M]
+ Disable specified breakpoints or a range of breakpoints. Without
+ any argument, disable all breakpoints.
+
+'enable' ['del' | 'once'] [N1 N2 ...] [N-M]
+'e' ['del' | 'once'] [N1 N2 ...] [N-M]
+ Enable specified breakpoints or a range of breakpoints. Without
+ any argument, enable all breakpoints. Optionally, you can specify
+ how to enable the breakpoints:
+
+ 'del'
+ Enable the breakpoints temporarily, then delete each one when
+ the program stops at it.
+
+ 'once'
+ Enable the breakpoints temporarily, then disable each one when
+ the program stops at it.
+
+'ignore' N COUNT
+ Ignore breakpoint number N the next COUNT times it is hit.
+
+'tbreak' [[FILENAME':']N | FUNCTION]
+'t' [[FILENAME':']N | FUNCTION]
+ Set a temporary breakpoint (enabled for only one stop). The
+ arguments are the same as for 'break'.
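+
+   As a brief sketch (the conditions here are invented purely for
+illustration), a conditional breakpoint on the sample session's
+'are_equal()' function, and a later change to its condition, might
+look like this:
+
+     gawk> break are_equal "n != m"
+     gawk> condition 1 "fcount > 0"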
+
+
+File: gawk.info, Node: Debugger Execution Control, Next: Viewing And Changing Data, Prev: Breakpoint Control, Up: List of Debugger Commands
+
+14.3.2 Control of Execution
+---------------------------
+
+Now that your breakpoints are ready, you can start running the program
+and observing its behavior. There are more commands for controlling
+execution of the program than we saw in our earlier example:
+
+'commands' [N]
+'silent'
+...
+'end'
+ Set a list of commands to be executed upon stopping at a breakpoint
+ or watchpoint. N is the breakpoint or watchpoint number. Without
+ a number, the last one set is used. The actual commands follow,
+ starting on the next line, and terminated by the 'end' command. If
+ the command 'silent' is in the list, the usual messages about
+ stopping at a breakpoint and the source line are not printed. Any
+ command in the list that resumes execution (e.g., 'continue')
+ terminates the list (an implicit 'end'), and subsequent commands
+ are ignored. For example:
+
+ gawk> commands
+ > silent
+ > printf "A silent breakpoint; i = %d\n", i
+ > info locals
+ > set i = 10
+ > continue
+ > end
+ gawk>
+
+'continue' [COUNT]
+'c' [COUNT]
+ Resume program execution. If continued from a breakpoint and COUNT
+ is specified, ignore the breakpoint at that location the next COUNT
+ times before stopping.
+
+'finish'
+ Execute until the selected stack frame returns. Print the returned
+ value.
+
+'next' [COUNT]
+'n' [COUNT]
+ Continue execution to the next source line, stepping over function
+ calls. The argument COUNT controls how many times to repeat the
+ action, as in 'step'.
+
+'nexti' [COUNT]
+'ni' [COUNT]
+ Execute one (or COUNT) instruction(s), stepping over function
+ calls.
+
+'return' [VALUE]
+ Cancel execution of a function call. If VALUE (either a string or
+ a number) is specified, it is used as the function's return value.
+ If used in a frame other than the innermost one (the currently
+ executing function; i.e., frame number 0), discard all inner frames
+ in addition to the selected one, and the caller of that frame
+ becomes the innermost frame.
+
+'run'
+'r'
+ Start/restart execution of the program. When restarting, the
+ debugger retains the current breakpoints, watchpoints, command
+ history, automatic display variables, and debugger options.
+
+'step' [COUNT]
+'s' [COUNT]
+ Continue execution until control reaches a different source line in
+ the current stack frame, stepping inside any function called within
+ the line. If the argument COUNT is supplied, steps that many times
+ before stopping, unless it encounters a breakpoint or watchpoint.
+
+'stepi' [COUNT]
+'si' [COUNT]
+ Execute one (or COUNT) instruction(s), stepping inside function
+ calls. (For illustration of what is meant by an "instruction" in
+ 'gawk', see the output shown under 'dump' in *note Miscellaneous
+ Debugger Commands::.)
+
+'until' [[FILENAME':']N | FUNCTION]
+'u' [[FILENAME':']N | FUNCTION]
+ Without any argument, continue execution until a line past the
+ current line in the current stack frame is reached. With an
+ argument, continue execution until the specified location is
+ reached, or the current stack frame returns.
+
+
+File: gawk.info, Node: Viewing And Changing Data, Next: Execution Stack, Prev: Debugger Execution Control, Up: List of Debugger Commands
+
+14.3.3 Viewing and Changing Data
+--------------------------------
+
+The commands for viewing and changing variables inside of 'gawk' are:
+
+'display' [VAR | '$'N]
+ Add variable VAR (or field '$N') to the display list. The value of
+ the variable or field is displayed each time the program stops.
+ Each variable added to the list is identified by a unique number:
+
+ gawk> display x
+ -| 10: x = 1
+
+ This displays the assigned item number, the variable name, and its
+ current value. If the display variable refers to a function
+ parameter, it is silently deleted from the list as soon as the
+ execution reaches a context where no such variable of the given
+ name exists. Without argument, 'display' displays the current
+ values of items on the list.
+
+'eval "AWK STATEMENTS"'
+ Evaluate AWK STATEMENTS in the context of the running program. You
+ can do anything that an 'awk' program would do: assign values to
+ variables, call functions, and so on.
+
+'eval' PARAM, ...
+AWK STATEMENTS
+'end'
+ This form of 'eval' is similar, but it allows you to define "local
+ variables" that exist in the context of the AWK STATEMENTS, instead
+ of using variables or function parameters defined by the program.
+
+'print' VAR1[',' VAR2 ...]
+'p' VAR1[',' VAR2 ...]
+ Print the value of a 'gawk' variable or field. Fields must be
+ referenced by constants:
+
+ gawk> print $3
+
+ This prints the third field in the input record (if the specified
+ field does not exist, it prints 'Null field'). A variable can be
+ an array element, with the subscripts being constant string values.
+ To print the contents of an array, prefix the name of the array
+ with the '@' symbol:
+
+ gawk> print @a
+
+ This prints the indices and the corresponding values for all
+ elements in the array 'a'.
+
+'printf' FORMAT [',' ARG ...]
+ Print formatted text. The FORMAT may include escape sequences,
+ such as '\n' (*note Escape Sequences::). No newline is printed
+ unless one is specified.
+
+'set' VAR'='VALUE
+ Assign a constant (number or string) value to an 'awk' variable or
+ field. String values must be enclosed between double quotes
+ ('"'...'"').
+
+ You can also set special 'awk' variables, such as 'FS', 'NF', 'NR',
+ and so on.
+
+'watch' VAR | '$'N ['"EXPRESSION"']
+'w' VAR | '$'N ['"EXPRESSION"']
+ Add variable VAR (or field '$N') to the watch list. The debugger
+ then stops whenever the value of the variable or field changes.
+ Each watched item is assigned a number that can be used to delete
+ it from the watch list using the 'unwatch' command.
+
+ With a watchpoint, you may also supply a condition. This is an
+ 'awk' expression (enclosed in double quotes) that the debugger
+ evaluates whenever the watchpoint is reached. If the condition is
+ true, then the debugger stops execution and prompts for a command.
+ Otherwise, 'gawk' continues executing the program.
+
+'undisplay' [N]
+ Remove item number N (or all items, if no argument) from the
+ automatic display list.
+
+'unwatch' [N]
+ Remove item number N (or all items, if no argument) from the watch
+ list.
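+
+   As a short, invented illustration of these commands (the variable
+names are taken from the 'uniq.awk' session shown earlier):
+
+     gawk> display last
+     gawk> watch fcount
+     gawk> set last = "some other value"
+     gawk> eval "print last, fcount"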
+
+
+File: gawk.info, Node: Execution Stack, Next: Debugger Info, Prev: Viewing And Changing Data, Up: List of Debugger Commands
+
+14.3.4 Working with the Stack
+-----------------------------
+
+Whenever you run a program that contains any function calls, 'gawk'
+maintains a stack of all of the function calls leading up to where the
+program is right now. You can see how you got to where you are, and
+also move around in the stack to see what the state of things was in the
+functions that called the one you are in. The commands for doing this
+are:
+
+'backtrace' [COUNT]
+'bt' [COUNT]
+'where' [COUNT]
+ Print a backtrace of all function calls (stack frames), or
+ innermost COUNT frames if COUNT > 0. Print the outermost COUNT
+ frames if COUNT < 0. The backtrace displays the name and arguments
+ to each function, the source file name, and the line number. The
+ alias 'where' for 'backtrace' is provided for longtime GDB users
+ who may be used to that command.
+
+'down' [COUNT]
+ Move COUNT (default 1) frames down the stack toward the innermost
+ frame. Then select and print the frame.
+
+'frame' [N]
+'f' [N]
+ Select and print stack frame N. Frame 0 is the currently
+ executing, or "innermost", frame (function call); frame 1 is the
+ frame that called the innermost one. The highest-numbered frame is
+ the one for the main program. The printed information consists of
+ the frame number, function and argument names, source file, and the
+ source line.
+
+'up' [COUNT]
+ Move COUNT (default 1) frames up the stack toward the outermost
+ frame. Then select and print the frame.
+
+
+File: gawk.info, Node: Debugger Info, Next: Miscellaneous Debugger Commands, Prev: Execution Stack, Up: List of Debugger Commands
+
+14.3.5 Obtaining Information About the Program and the Debugger State
+---------------------------------------------------------------------
+
+Besides looking at the values of variables, there is often a need to get
+other sorts of information about the state of your program and of the
+debugging environment itself. The 'gawk' debugger has one command that
+provides this information, appropriately called 'info'. 'info' is used
+with one of a number of arguments that tell it exactly what you want to
+know:
+
+'info' WHAT
+'i' WHAT
+ The value for WHAT should be one of the following:
+
+ 'args'
+ List arguments of the selected frame.
+
+ 'break'
+ List all currently set breakpoints.
+
+ 'display'
+ List all items in the automatic display list.
+
+ 'frame'
+ Give a description of the selected stack frame.
+
+ 'functions'
+ List all function definitions including source file names and
+ line numbers.
+
+ 'locals'
+ List local variables of the selected frame.
+
+ 'source'
+ Print the name of the current source file. Each time the
+ program stops, the current source file is the file containing
+ the current instruction. When the debugger first starts, the
+ current source file is the first file included via the '-f'
+ option. The 'list FILENAME:LINENO' command can be used at any
+ time to change the current source.
+
+ 'sources'
+ List all program sources.
+
+ 'variables'
+ List all global variables.
+
+ 'watch'
+ List all items in the watch list.
+
+ Additional commands give you control over the debugger, the ability
+to save the debugger's state, and the ability to run debugger commands
+from a file. The commands are:
+
+'option' [NAME['='VALUE]]
+'o' [NAME['='VALUE]]
+ Without an argument, display the available debugger options and
+ their current values. 'option NAME' shows the current value of the
+ named option. 'option NAME=VALUE' assigns a new value to the named
+ option. The available options are:
+
+ 'history_size'
+ Set the maximum number of lines to keep in the history file
+ './.gawk_history'. The default is 100.
+
+ 'listsize'
+ Specify the number of lines that 'list' prints. The default
+ is 15.
+
+ 'outfile'
+ Send 'gawk' output to a file; debugger output still goes to
+ standard output. An empty string ('""') resets output to
+ standard output.
+
+ 'prompt'
+ Change the debugger prompt. The default is 'gawk> '.
+
+ 'save_history' ['on' | 'off']
+ Save command history to file './.gawk_history'. The default
+ is 'on'.
+
+ 'save_options' ['on' | 'off']
+ Save current options to file './.gawkrc' upon exit. The
+ default is 'on'. Options are read back into the next session
+ upon startup.
+
+ 'trace' ['on' | 'off']
+ Turn instruction tracing on or off. The default is 'off'.
+
+'save' FILENAME
+ Save the commands from the current session to the given file name,
+ so that they can be replayed using the 'source' command.
+
+'source' FILENAME
+ Run command(s) from a file; an error in any command does not
+ terminate execution of subsequent commands. Comments (lines
+ starting with '#') are allowed in a command file. Empty lines are
+ ignored; they do _not_ repeat the last command. You can't restart
+ the program by having more than one 'run' command in the file.
+ Also, the list of commands may include additional 'source'
+ commands; however, the 'gawk' debugger will not source the same
+ file more than once in order to avoid infinite recursion.
+
+ In addition to, or instead of, the 'source' command, you can use
+ the '-D FILE' or '--debug=FILE' command-line options to execute
+ commands from a file non-interactively (*note Options::).
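+
+   For instance (the file name is arbitrary), you might enlarge the
+'list' window, then save the session's commands for later replay:
+
+     gawk> option listsize=20
+     gawk> save mysession.dbg
+
+A later session could reissue those commands with 'source
+mysession.dbg'.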
+
+
+File: gawk.info, Node: Miscellaneous Debugger Commands, Prev: Debugger Info, Up: List of Debugger Commands
+
+14.3.6 Miscellaneous Commands
+-----------------------------
+
+There are a few more commands that do not fit into the previous
+categories, as follows:
+
+'dump' [FILENAME]
+ Dump byte code of the program to standard output or to the file
+ named in FILENAME. This prints a representation of the internal
+ instructions that 'gawk' executes to implement the 'awk' commands
+ in a program. This can be very enlightening, as the following
+ partial dump of Davide Brini's obfuscated code (*note Signature
+ Program::) demonstrates:
+
+ gawk> dump
+ -| # BEGIN
+ -|
+ -| [ 1:0xfcd340] Op_rule : [in_rule = BEGIN] [source_file = brini.awk]
+ -| [ 1:0xfcc240] Op_push_i : "~" [MALLOC|STRING|STRCUR]
+ -| [ 1:0xfcc2a0] Op_push_i : "~" [MALLOC|STRING|STRCUR]
+ -| [ 1:0xfcc280] Op_match :
+ -| [ 1:0xfcc1e0] Op_store_var : O
+ -| [ 1:0xfcc2e0] Op_push_i : "==" [MALLOC|STRING|STRCUR]
+ -| [ 1:0xfcc340] Op_push_i : "==" [MALLOC|STRING|STRCUR]
+ -| [ 1:0xfcc320] Op_equal :
+ -| [ 1:0xfcc200] Op_store_var : o
+ -| [ 1:0xfcc380] Op_push : o
+ -| [ 1:0xfcc360] Op_plus_i : 0 [MALLOC|NUMCUR|NUMBER]
+ -| [ 1:0xfcc220] Op_push_lhs : o [do_reference = true]
+ -| [ 1:0xfcc300] Op_assign_plus :
+ -| [ :0xfcc2c0] Op_pop :
+ -| [ 1:0xfcc400] Op_push : O
+ -| [ 1:0xfcc420] Op_push_i : "" [MALLOC|STRING|STRCUR]
+ -| [ :0xfcc4a0] Op_no_op :
+ -| [ 1:0xfcc480] Op_push : O
+ -| [ :0xfcc4c0] Op_concat : [expr_count = 3] [concat_flag = 0]
+ -| [ 1:0xfcc3c0] Op_store_var : x
+ -| [ 1:0xfcc440] Op_push_lhs : X [do_reference = true]
+ -| [ 1:0xfcc3a0] Op_postincrement :
+ -| [ 1:0xfcc4e0] Op_push : x
+ -| [ 1:0xfcc540] Op_push : o
+ -| [ 1:0xfcc500] Op_plus :
+ -| [ 1:0xfcc580] Op_push : o
+ -| [ 1:0xfcc560] Op_plus :
+ -| [ 1:0xfcc460] Op_leq :
+ -| [ :0xfcc5c0] Op_jmp_false : [target_jmp = 0xfcc5e0]
+ -| [ 1:0xfcc600] Op_push_i : "%c" [MALLOC|STRING|STRCUR]
+ -| [ :0xfcc660] Op_no_op :
+ -| [ 1:0xfcc520] Op_assign_concat : c
+ -| [ :0xfcc620] Op_jmp : [target_jmp = 0xfcc440]
+ -|
+ ...
+ -|
+ -| [ 2:0xfcc5a0] Op_K_printf : [expr_count = 17] [redir_type = ""]
+ -| [ :0xfcc140] Op_no_op :
+ -| [ :0xfcc1c0] Op_atexit :
+ -| [ :0xfcc640] Op_stop :
+ -| [ :0xfcc180] Op_no_op :
+ -| [ :0xfcd150] Op_after_beginfile :
+ -| [ :0xfcc160] Op_no_op :
+ -| [ :0xfcc1a0] Op_after_endfile :
+ gawk>
+
+'exit'
+ Exit the debugger. See the entry for 'quit', later in this list.
+
+'help'
+'h'
+ Print a list of all of the 'gawk' debugger commands with a short
+ summary of their usage. 'help COMMAND' prints the information
+ about the command COMMAND.
+
+'list' ['-' | '+' | N | FILENAME':'N | N-M | FUNCTION]
+'l' ['-' | '+' | N | FILENAME':'N | N-M | FUNCTION]
+ Print the specified lines (default 15) from the current source file
+ or the file named FILENAME. The possible arguments to 'list' are
+ as follows:
+
+ '-' (Minus)
+ Print lines before the lines last printed.
+
+ '+'
+ Print lines after the lines last printed. 'list' without any
+ argument does the same thing.
+
+ N
+ Print lines centered around line number N.
+
+ N-M
+ Print lines from N to M.
+
+ FILENAME':'N
+ Print lines centered around line number N in source file
+ FILENAME. This command may change the current source file.
+
+ FUNCTION
+ Print lines centered around the beginning of the function
+ FUNCTION. This command may change the current source file.
+
+'quit'
+'q'
+ Exit the debugger. Debugging is great fun, but sometimes we all
+ have to tend to other obligations in life, and sometimes we find
+ the bug and are free to go on to the next one! As we saw earlier,
+ if you are running a program, the debugger warns you when you type
+ 'q' or 'quit', to make sure you really want to quit.
+
+'trace' ['on' | 'off']
+ Turn on or off continuous printing of the instructions that are
+ about to be executed, along with the 'awk' lines they implement.
+ The default is 'off'.
+
+ It is to be hoped that most of the "opcodes" in these instructions
+ are fairly self-explanatory, and using 'stepi' and 'nexti' while
+ 'trace' is on will make them into familiar friends.
+
+
+File: gawk.info, Node: Readline Support, Next: Limitations, Prev: List of Debugger Commands, Up: Debugger
+
+14.4 Readline Support
+=====================
+
+If 'gawk' is compiled with the GNU Readline library
+(http://cnswww.cns.cwru.edu/php/chet/readline/readline.html), you can
+take advantage of that library's command completion and history
+expansion features. The following types of completion are available:
+
+Command completion
+ Command names.
+
+Source file name completion
+ Source file names. Relevant commands are 'break', 'clear', 'list',
+ 'tbreak', and 'until'.
+
+Argument completion
+ Non-numeric arguments to a command. Relevant commands are 'enable'
+ and 'info'.
+
+Variable name completion
+ Global variable names, and function arguments in the current
+ context if the program is running. Relevant commands are
+ 'display', 'print', 'set', and 'watch'.
+
+
+File: gawk.info, Node: Limitations, Next: Debugging Summary, Prev: Readline Support, Up: Debugger
+
+14.5 Limitations
+================
+
+We hope you find the 'gawk' debugger useful and enjoyable to work with,
+but as with any program, especially in its early releases, it still has
+some limitations.  A few worth being aware of are:
+
+ * At this point, the debugger does not give a detailed explanation of
+ what you did wrong when you type in something it doesn't like.
+ Rather, it just responds 'syntax error'. When you do figure out
+ what your mistake was, though, you'll feel like a real guru.
+
+ * If you perused the dump of opcodes in *note Miscellaneous Debugger
+ Commands:: (or if you are already familiar with 'gawk' internals),
+ you will realize that much of the internal manipulation of data in
+ 'gawk', as in many interpreters, is done on a stack. 'Op_push',
+ 'Op_pop', and the like are the "bread and butter" of most 'gawk'
+ code.
+
+ Unfortunately, as of now, the 'gawk' debugger does not allow you to
+ examine the stack's contents. That is, the intermediate results of
+ expression evaluation are on the stack, but cannot be printed.
+ Rather, only variables that are defined in the program can be
+ printed. Of course, a workaround for this is to use more explicit
+ variables at the debugging stage and then change back to obscure,
+ perhaps more optimal code later.
+
+ * There is no way to look "inside" the process of compiling regular
+ expressions to see if you got it right. As an 'awk' programmer,
+ you are expected to know the meaning of '/[^[:alnum:][:blank:]]/'.
+
+ * The 'gawk' debugger is designed to be used by running a program
+ (with all its parameters) on the command line, as described in
+ *note Debugger Invocation::. There is no way (as of now) to attach
+ or "break into" a running program. This seems reasonable for a
+ language that is used mainly for quickly executing, short programs.
+
+ * The 'gawk' debugger only accepts source code supplied with the '-f'
+ option.
+
+ One other point is worth discussing. Conventional debuggers run in a
+separate process (and thus address space) from the programs that they
+debug (the "debuggee", if you will).
+
+ The 'gawk' debugger is different; it is an integrated part of 'gawk'
+itself. This makes it possible, in rare cases, for 'gawk' to become an
+excellent demonstrator of Heisenberg Uncertainty physics, where the mere
+act of observing something can change it. Consider the following:(1)
+
+ $ cat test.awk
+ -| { print typeof($1), typeof($2) }
+ $ cat test.data
+ -| abc 123
+ $ gawk -f test.awk test.data
+ -| strnum strnum
+
+ This is all as expected: field data has the STRNUM attribute (*note
+Variable Typing::). Now watch what happens when we run this program
+under the debugger:
+
+ $ gawk -D -f test.awk test.data
+ gawk> w $1 Set watchpoint on $1
+ -| Watchpoint 1: $1
+ gawk> w $2 Set watchpoint on $2
+ -| Watchpoint 2: $2
+ gawk> r Start the program
+ -| Starting program:
+ -| Stopping in Rule ...
+ -| Watchpoint 1: $1 Watchpoint fires
+ -| Old value: ""
+ -| New value: "abc"
+ -| main() at `test.awk':1
+ -| 1 { print typeof($1), typeof($2) }
+ gawk> n Keep going ...
+ -| Watchpoint 2: $2 Watchpoint fires
+ -| Old value: ""
+ -| New value: "123"
+ -| main() at `test.awk':1
+ -| 1 { print typeof($1), typeof($2) }
+ gawk> n Get result from typeof()
+ -| strnum number Result for $2 isn't right
+ -| Program exited normally with exit value: 0
+ gawk> quit
+
+ In this case, the act of comparing the new value of '$2' with the old
+one caused 'gawk' to evaluate it and determine that it is indeed a
+number, and this is reflected in the result of 'typeof()'.
+
+ Cases like this where the debugger is not transparent to the
+program's execution should be rare. If you encounter one, please report
+it (*note Bugs::).
+
+ ---------- Footnotes ----------
+
+ (1) Thanks to Hermann Peifer for this example.
+
+
+File: gawk.info, Node: Debugging Summary, Prev: Limitations, Up: Debugger
+
+14.6 Summary
+============
+
+ * Programs rarely work correctly the first time. Finding bugs is
+ called debugging, and a program that helps you find bugs is a
+ debugger. 'gawk' has a built-in debugger that works very similarly
+ to the GNU Debugger, GDB.
+
+ * Debuggers let you step through your program one statement at a
+ time, examine and change variable and array values, and do a number
+ of other things that let you understand what your program is
+ actually doing (as opposed to what it is supposed to do).
+
+ * Like most debuggers, the 'gawk' debugger works in terms of stack
+ frames, and lets you set both breakpoints (stop at a point in the
+ code) and watchpoints (stop when a data value changes).
+
+ * The debugger command set is fairly complete, providing control over
+ breakpoints, execution, viewing and changing data, working with the
+ stack, getting information, and other tasks.
+
+ * If the GNU Readline library is available when 'gawk' is compiled,
+ it is used by the debugger to provide command-line history and
+ editing.
+
+   * Usually, the debugger does not affect the program being
+ debugged, but occasionally it can.
+
+
+File: gawk.info, Node: Arbitrary Precision Arithmetic, Next: Dynamic Extensions, Prev: Debugger, Up: Top
+
+15 Arithmetic and Arbitrary-Precision Arithmetic with 'gawk'
+************************************************************
+
+This major node introduces some basic concepts relating to how computers
+do arithmetic and defines some important terms. It then proceeds to
+describe floating-point arithmetic, which is what 'awk' uses for all its
+computations, including a discussion of arbitrary-precision
+floating-point arithmetic, which is a feature available only in 'gawk'.
+It continues on to present arbitrary-precision integers, and concludes
+with a description of some points where 'gawk' and the POSIX standard
+are not quite in agreement.
+
+ NOTE: Most users of 'gawk' can safely skip this chapter. But if
+ you want to do scientific calculations with 'gawk', this is the
+ place to be.
+
+* Menu:
+
+* Computer Arithmetic:: A quick intro to computer math.
+* Math Definitions:: Defining terms used.
+* MPFR features:: The MPFR features in 'gawk'.
+* FP Math Caution:: Things to know.
+* Arbitrary Precision Integers:: Arbitrary Precision Integer Arithmetic with
+ 'gawk'.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+* Floating point summary:: Summary of floating point discussion.
+
+
+File: gawk.info, Node: Computer Arithmetic, Next: Math Definitions, Up: Arbitrary Precision Arithmetic
+
+15.1 A General Description of Computer Arithmetic
+=================================================
+
+Until now, we have worked with data as either numbers or strings.
+Ultimately, however, computers represent everything in terms of "binary
+digits", or "bits". A decimal digit can take on any of 10 values: zero
+through nine.  A binary digit can take on either of two values, zero or
+one. Using binary, computers (and computer software) can represent and
+manipulate numerical and character data. In general, the more bits you
+can use to represent a particular thing, the greater the range of
+possible values it can take on.
+
+ Modern computers support at least two, and often more, ways to do
+arithmetic. Each kind of arithmetic uses a different representation
+(organization of the bits) for the numbers. The kinds of arithmetic
+that interest us are:
+
+Decimal arithmetic
+ This is the kind of arithmetic you learned in elementary school,
+ using paper and pencil (and/or a calculator). In theory, numbers
+ can have an arbitrary number of digits on either side (or both
+ sides) of the decimal point, and the results of a computation are
+ always exact.
+
+ Some modern systems can do decimal arithmetic in hardware, but
+ usually you need a special software library to provide access to
+ these instructions. There are also libraries that do decimal
+ arithmetic entirely in software.
+
+ Despite the fact that some users expect 'gawk' to be performing
+ decimal arithmetic,(1) it does not do so.
+
+Integer arithmetic
+ In school, integer values were referred to as "whole" numbers--that
+ is, numbers without any fractional part, such as 1, 42, or -17.
+ The advantage to integer numbers is that they represent values
+ exactly. The disadvantage is that their range is limited.
+
+ In computers, integer values come in two flavors: "signed" and
+ "unsigned". Signed values may be negative or positive, whereas
+ unsigned values are always greater than or equal to zero.
+
+ In computer systems, integer arithmetic is exact, but the possible
+ range of values is limited. Integer arithmetic is generally faster
+ than floating-point arithmetic.
+
+Floating-point arithmetic
+ Floating-point numbers represent what were called in school "real"
+ numbers (i.e., those that have a fractional part, such as
+ 3.1415927). The advantage to floating-point numbers is that they
+ can represent a much larger range of values than can integers. The
+ disadvantage is that there are numbers that they cannot represent
+ exactly.
+
+ Modern systems support floating-point arithmetic in hardware, with
+ a limited range of values. There are software libraries that allow
+ the use of arbitrary-precision floating-point calculations.
+
+ POSIX 'awk' uses "double-precision" floating-point numbers, which
+ can hold more digits than "single-precision" floating-point
+ numbers. 'gawk' has facilities for performing arbitrary-precision
+ floating-point arithmetic, which we describe in more detail
+ shortly.
+
+ Computers work with integer and floating-point values of different
+ranges. Integer values are usually either 32 or 64 bits in size.
+Single-precision floating-point values occupy 32 bits, whereas
+double-precision floating-point values occupy 64 bits. Floating-point
+values are always signed. The possible ranges of values are shown in
+*note Table 15.1: table-numeric-ranges.
+
+Numeric representation       Minimum value                Maximum value
+---------------------------------------------------------------------------
+32-bit signed integer        -2,147,483,648               2,147,483,647
+32-bit unsigned integer      0                            4,294,967,295
+64-bit signed integer        -9,223,372,036,854,775,808   9,223,372,036,854,775,807
+64-bit unsigned integer      0                            18,446,744,073,709,551,615
+Single-precision             1.175494e-38                 3.402823e38
+floating point (approximate)
+Double-precision             2.225074e-308                1.797693e308
+floating point (approximate)
+
+Table 15.1: Value ranges for different numeric representations
+
+ ---------- Footnotes ----------
+
+ (1) We don't know why they expect this, but they do.
+
+
+File: gawk.info, Node: Math Definitions, Next: MPFR features, Prev: Computer Arithmetic, Up: Arbitrary Precision Arithmetic
+
+15.2 Other Stuff to Know
+========================
+
+The rest of this major node uses a number of terms. Here are some
+informal definitions that should help you work your way through the
+material here:
+
+"Accuracy"
+ A floating-point calculation's accuracy is how close it comes to
+ the real (paper and pencil) value.
+
+"Error"
+ The difference between what the result of a computation "should be"
+ and what it actually is. It is best to minimize error as much as
+ possible.
+
+"Exponent"
+ The order of magnitude of a value; some number of bits in a
+ floating-point value store the exponent.
+
+"Inf"
+ A special value representing infinity. Operations involving
+ another number and infinity produce infinity.
+
+"NaN"
+ "Not a number."(1) A special value that results from attempting a
+ calculation that has no answer as a real number. In such a case,
+ programs can either receive a floating-point exception, or get
+ 'NaN' back as the result. The IEEE 754 standard recommends that
+ systems return 'NaN'. Some examples:
+
+ 'sqrt(-1)'
+ This makes sense in the range of complex numbers, but not in
+ the range of real numbers, so the result is 'NaN'.
+
+ 'log(-8)'
+ -8 is out of the domain of 'log()', so the result is 'NaN'.
+
+"Normalized"
+ How the significand (see later in this list) is usually stored.
+ The value is adjusted so that the first bit is one, and then that
+ leading one is assumed instead of physically stored. This provides
+ one extra bit of precision.
+
+"Precision"
+ The number of bits used to represent a floating-point number. The
+ more bits, the more digits you can represent. Binary and decimal
+ precisions are related approximately, according to the formula:
+
+ PREC = 3.322 * DPS
+
+     Here, _prec_ denotes the binary precision (measured in bits) and
+     _dps_ (short for decimal places) denotes the number of decimal
+     digits.  For example, the 53 bits of precision in an IEEE 754
+     double correspond to roughly 53 / 3.322, or about 15.9, decimal
+     digits.
+
+"Rounding mode"
+ How numbers are rounded up or down when necessary. More details
+ are provided later.
+
+"Significand"
+ A floating-point value consists of the significand multiplied by 10
+ to the power of the exponent. For example, in '1.2345e67', the
+ significand is '1.2345'.
+
+"Stability"
+ From the Wikipedia article on numerical stability
+ (http://en.wikipedia.org/wiki/Numerical_stability): "Calculations
+ that can be proven not to magnify approximation errors are called
+ "numerically stable"."
+
+ See the Wikipedia article on accuracy and precision
+(http://en.wikipedia.org/wiki/Accuracy_and_precision) for more
+information on some of those terms.
+
+ On modern systems, floating-point hardware uses the representation
+and operations defined by the IEEE 754 standard. Three of the standard
+IEEE 754 types are 32-bit single precision, 64-bit double precision, and
+128-bit quadruple precision. The standard also specifies extended
+precision formats to allow greater precisions and larger exponent
+ranges. ('awk' uses only the 64-bit double-precision format.)
+
+ *note Table 15.2: table-ieee-formats. lists the precision and
+exponent field values for the basic IEEE 754 binary formats.
+
+Name Total bits Precision Minimum Maximum
+ exponent exponent
+---------------------------------------------------------------------------
+Single 32 24 -126 +127
+Double 64 53 -1022 +1023
+Quadruple 128 113 -16382 +16383
+
+Table 15.2: Basic IEEE format values
+
+ NOTE: The precision numbers include the implied leading one that
+ gives them one extra bit of significand.
+
+ ---------- Footnotes ----------
+
+ (1) Thanks to Michael Brennan for this description, which we have
+paraphrased, and for the examples.
+
+
+File: gawk.info, Node: MPFR features, Next: FP Math Caution, Prev: Math Definitions, Up: Arbitrary Precision Arithmetic
+
+15.3 Arbitrary-Precision Arithmetic Features in 'gawk'
+======================================================
+
+By default, 'gawk' uses the double-precision floating-point values
+supplied by the hardware of the system it runs on. However, if it was
+compiled to do so, and the '-M' command-line option is supplied, 'gawk'
+uses the GNU MPFR (http://www.mpfr.org) and GNU MP (http://gmplib.org)
+(GMP) libraries for arbitrary-precision arithmetic on numbers. You can
+see if MPFR support is available like so:
+
+ $ gawk --version
+ -| GNU Awk 4.1.2, API: 1.1 (GNU MPFR 3.1.0-p3, GNU MP 5.0.2)
+ -| Copyright (C) 1989, 1991-2015 Free Software Foundation.
+ ...
+
+(You may see different version numbers than what's shown here. That's
+OK; what's important is to see that GNU MPFR and GNU MP are listed in
+the output.)
+
+ Additionally, there are a few elements available in the 'PROCINFO'
+array to provide information about the MPFR and GMP libraries (*note
+Auto-set::).
+
+ The MPFR library provides precise control over precisions and
+rounding modes, and gives correctly rounded, reproducible,
+platform-independent results. With the '-M' command-line option, all
+floating-point arithmetic operators and numeric functions can yield
+results to any desired precision level supported by MPFR.
+
+ Two predefined variables, 'PREC' and 'ROUNDMODE', provide control
+over the working precision and the rounding mode. The precision and the
+rounding mode are set globally for every operation to follow. *Note
+Setting precision:: and *note Setting the rounding mode:: for more
+information.
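+
+   As a small sketch (the particular values are arbitrary), a program
+might raise the working precision and select truncation toward zero
+before doing any arithmetic:
+
+     BEGIN {
+         PREC = 113            # bits of working precision
+         ROUNDMODE = "Z"       # round toward zero
+         printf("%0.20f\n", 1 / 3)
+     }
+
+Remember that these variables have an effect only when 'gawk' is run
+with the '-M' option.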
+
+
+File: gawk.info, Node: FP Math Caution, Next: Arbitrary Precision Integers, Prev: MPFR features, Up: Arbitrary Precision Arithmetic
+
+15.4 Floating-Point Arithmetic: Caveat Emptor!
+==============================================
+
+ Math class is tough!
+ -- _Teen Talk Barbie, July 1992_
+
+ This minor node provides a high-level overview of the issues involved
+when doing lots of floating-point arithmetic.(1) The discussion applies
+to both hardware and arbitrary-precision floating-point arithmetic.
+
+ CAUTION: The material here is purposely general. If you need to do
+ serious computer arithmetic, you should do some research first, and
+ not rely just on what we tell you.
+
+* Menu:
+
+* Inexactness of computations:: Floating point math is not exact.
+* Getting Accuracy:: Getting more accuracy takes some work.
+* Try To Round:: Add digits and round.
+* Setting precision:: How to set the precision.
+* Setting the rounding mode:: How to set the rounding mode.
+
+ ---------- Footnotes ----------
+
+ (1) There is a very nice paper on floating-point arithmetic
+(http://www.validlab.com/goldberg/paper.pdf) by David Goldberg, "What
+Every Computer Scientist Should Know About Floating-Point Arithmetic,"
+'ACM Computing Surveys' *23*, 1 (1991-03): 5-48. This is worth reading
+if you are interested in the details, but it does require a background
+in computer science.
+
+
+File: gawk.info, Node: Inexactness of computations, Next: Getting Accuracy, Up: FP Math Caution
+
+15.4.1 Floating-Point Arithmetic Is Not Exact
+---------------------------------------------
+
+Binary floating-point representations and arithmetic are inexact.
+Simple values like 0.1 cannot be precisely represented using binary
+floating-point numbers, and the limited precision of floating-point
+numbers means that slight changes in the order of operations or the
+precision of intermediate storage can change the result. To make
+matters worse, with arbitrary-precision floating-point arithmetic, you
+can set the precision before starting a computation, but then you cannot
+be sure of the number of significant decimal places in the final result.
+
+* Menu:
+
+* Inexact representation:: Numbers are not exactly represented.
+* Comparing FP Values:: How to compare floating point values.
+* Errors accumulate:: Errors get bigger as they go.
+
+
+File: gawk.info, Node: Inexact representation, Next: Comparing FP Values, Up: Inexactness of computations
+
+15.4.1.1 Many Numbers Cannot Be Represented Exactly
+...................................................
+
+So, before you start to write any code, you should think about what you
+really want and what's really happening. Consider the two numbers in
+the following example:
+
+ x = 0.875 # 1/2 + 1/4 + 1/8
+ y = 0.425
+
+ Unlike the number in 'y', the number stored in 'x' is exactly
+representable in binary because it can be written as a finite sum of one
+or more fractions whose denominators are all powers of two. When 'gawk'
+reads a floating-point number from program source, it automatically
+rounds that number to whatever precision your machine supports. If you
+try to print the numeric content of a variable using an output format
+string of '"%.17g"', it may not produce the same number as you assigned
+to it:
+
+ $ gawk 'BEGIN { x = 0.875; y = 0.425
+ > printf("%0.17g, %0.17g\n", x, y) }'
+ -| 0.875, 0.42499999999999999
+
+ Often the error is so small you do not even notice it, and if you do,
+you can always specify how much precision you would like in your output.
+Usually this is a format string like '"%.15g"', which, when used in the
+previous example, produces an output identical to the input.
+
+
+File: gawk.info, Node: Comparing FP Values, Next: Errors accumulate, Prev: Inexact representation, Up: Inexactness of computations
+
+15.4.1.2 Be Careful Comparing Values
+....................................
+
+Because the underlying representation can be a little bit off from the
+exact value, comparing floating-point values to see if they are exactly
+equal is generally a bad idea. Here is an example where it does not
+work like you would expect:
+
+ $ gawk 'BEGIN { print (0.1 + 12.2 == 12.3) }'
+ -| 0
+
+ The general wisdom when comparing floating-point values is to see if
+they are within some small range of each other (called a "delta", or
+"tolerance"). You have to decide how small a delta is important to you.
+Code to do this looks something like the following:
+
+ delta = 0.00001 # for example
+ difference = abs(a - b) # absolute value of the difference
+ if (difference < delta)
+ # all ok
+ else
+ # not ok
+
+(We assume that you have a simple absolute value function named 'abs()'
+defined elsewhere in your program.)
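+
+ Such a function is not built into 'awk', but it is easy to write. A
+minimal version (a sketch; adapt it to your needs) looks like this:
+
+ function abs(val)
+ {
+ return (val < 0 ? -val : val)
+ }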
+
+
+File: gawk.info, Node: Errors accumulate, Prev: Comparing FP Values, Up: Inexactness of computations
+
+15.4.1.3 Errors Accumulate
+..........................
+
+The loss of accuracy during a single computation with floating-point
+numbers usually isn't enough to worry about. However, if you compute a
+value that is the result of a sequence of floating-point operations, the
+error can accumulate and greatly affect the computation itself. Here is
+an attempt to compute the value of pi using one of its many series
+representations:
+
+ BEGIN {
+ x = 1.0 / sqrt(3.0)
+ n = 6
+ for (i = 1; i < 30; i++) {
+ n = n * 2.0
+ x = (sqrt(x * x + 1) - 1) / x
+ printf("%.15f\n", n * x)
+ }
+ }
+
+ When run, the early errors propagate through later computations,
+causing the loop to terminate prematurely after attempting to divide by
+zero:
+
+ $ gawk -f pi.awk
+ -| 3.215390309173475
+ -| 3.159659942097510
+ -| 3.146086215131467
+ -| 3.142714599645573
+ ...
+ -| 3.224515243534819
+ -| 2.791117213058638
+ -| 0.000000000000000
+ error-> gawk: pi.awk:6: fatal: division by zero attempted
+
+ Here is an additional example where the inaccuracies in internal
+representations yield an unexpected result:
+
+ $ gawk 'BEGIN {
+ > for (d = 1.1; d <= 1.5; d += 0.1) # loop five times (?)
+ > i++
+ > print i
+ > }'
+ -| 4
+
+
+File: gawk.info, Node: Getting Accuracy, Next: Try To Round, Prev: Inexactness of computations, Up: FP Math Caution
+
+15.4.2 Getting the Accuracy You Need
+------------------------------------
+
+Can arbitrary-precision arithmetic give exact results? There are no
+easy answers. The standard rules of algebra often do not apply when
+using floating-point arithmetic. Among other things, the distributive
+and associative laws do not hold completely, and order of operation may
+be important for your computation. Rounding error, cumulative precision
+loss, and underflow are often troublesome.
+
+ When 'gawk' tests the expressions '0.1 + 12.2' and '12.3' for
+equality using the machine double-precision arithmetic, it decides that
+they are not equal! (*Note Comparing FP Values::.) You can get the
+result you want by increasing the precision; 56 bits in this case does
+the job:
+
+ $ gawk -M -v PREC=56 'BEGIN { print (0.1 + 12.2 == 12.3) }'
+ -| 1
+
+ If adding more bits is good, perhaps adding even more bits of
+precision is better? Here is what happens if we use an even larger
+value of 'PREC':
+
+ $ gawk -M -v PREC=201 'BEGIN { print (0.1 + 12.2 == 12.3) }'
+ -| 0
+
+ This is not a bug in 'gawk' or in the MPFR library. It is easy to
+forget that the finite number of bits used to store the value is often
+just an approximation after proper rounding. The test for equality
+succeeds if and only if _all_ bits in the two operands are exactly the
+same. Because this is not necessarily true after floating-point
+computations with a particular precision and effective rounding mode, a
+straight test for equality may not work. Instead, compare the two
+numbers to see if they are within the desired delta of each other.
+
+ In applications where 15 or fewer decimal places suffice, hardware
+double-precision arithmetic can be adequate, and is usually much faster.
+But you need to keep in mind that every floating-point operation can
+suffer a new rounding error with catastrophic consequences, as
+illustrated by our earlier attempt to compute the value of pi. Extra
+precision can greatly enhance the stability and the accuracy of your
+computation in such cases.
+
+ Additionally, you should understand that repeated addition is not
+necessarily equivalent to multiplication in floating-point arithmetic.
+In the example in *note Errors accumulate:::
+
+ $ gawk 'BEGIN {
+ > for (d = 1.1; d <= 1.5; d += 0.1) # loop five times (?)
+ > i++
+ > print i
+ > }'
+ -| 4
+
+you may or may not succeed in getting the correct result by choosing an
+arbitrarily large value for 'PREC'. Reformulation of the problem at
+hand is often the correct approach in such situations.
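+
+ One such reformulation (a sketch, not the only possibility) is to step
+with an integer counter and derive the fractional value from it, so
+that no rounding error can creep into the loop control:
+
+ $ gawk 'BEGIN {
+ > for (j = 11; j <= 15; j++) { # loop five times, exactly
+ > d = j / 10
+ > i++
+ > }
+ > print i
+ > }'
+ -| 5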
+
+
+File: gawk.info, Node: Try To Round, Next: Setting precision, Prev: Getting Accuracy, Up: FP Math Caution
+
+15.4.3 Try a Few Extra Bits of Precision and Rounding
+-----------------------------------------------------
+
+Instead of arbitrary-precision floating-point arithmetic, often all you
+need is an adjustment of your logic or a different order for the
+operations in your calculation. The stability and the accuracy of the
+computation of pi in the earlier example can be enhanced by using the
+following simple algebraic transformation:
+
+ (sqrt(x * x + 1) - 1) / x == x / (sqrt(x * x + 1) + 1)
+
+After making this change, the program converges to pi in under 30
+iterations:
+
+ $ gawk -f pi2.awk
+ -| 3.215390309173473
+ -| 3.159659942097501
+ -| 3.146086215131436
+ -| 3.142714599645370
+ -| 3.141873049979825
+ ...
+ -| 3.141592653589797
+ -| 3.141592653589797
+
+
+File: gawk.info, Node: Setting precision, Next: Setting the rounding mode, Prev: Try To Round, Up: FP Math Caution
+
+15.4.4 Setting the Precision
+----------------------------
+
+'gawk' uses a global working precision; it does not keep track of the
+precision or accuracy of individual numbers. Performing an arithmetic
+operation or calling a built-in function rounds the result to the
+current working precision. The default working precision is 53 bits,
+which you can modify using the predefined variable 'PREC'. You can also
+set the value to one of the predefined case-insensitive strings shown in
+*note Table 15.3: table-predefined-precision-strings, to emulate an IEEE
+754 binary format.
+
+'PREC' IEEE 754 binary format
+---------------------------------------------------
+'"half"' 16-bit half-precision
+'"single"' Basic 32-bit single precision
+'"double"' Basic 64-bit double precision
+'"quad"' Basic 128-bit quadruple precision
+'"oct"' 256-bit octuple precision
+
+Table 15.3: Predefined precision strings for 'PREC'
+
+ The following example illustrates the effects of changing precision
+on arithmetic operations:
+
+ $ gawk -M -v PREC=100 'BEGIN { x = 1.0e-400; print x + 0
+ > PREC = "double"; print x + 0 }'
+ -| 1e-400
+ -| 0
+
+ CAUTION: Be wary of floating-point constants! When reading a
+ floating-point constant from program source code, 'gawk' uses the
+ default precision (that of a C 'double'), unless overridden by an
+ assignment to the special variable 'PREC' on the command line, to
+ store it internally as an MPFR number. Changing the precision
+ using 'PREC' in the program text does _not_ change the precision of
+ a constant.
+
+ If you need to represent a floating-point constant at a higher
+ precision than the default and cannot use a command-line assignment
+ to 'PREC', you should either specify the constant as a string, or
+ as a rational number, whenever possible. The following example
+ illustrates the differences among various ways to print a
+ floating-point constant:
+
+ $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 0.1) }'
+ -| 0.1000000000000000055511151
+ $ gawk -M -v PREC=113 'BEGIN { printf("%0.25f\n", 0.1) }'
+ -| 0.1000000000000000000000000
+ $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", "0.1") }'
+ -| 0.1000000000000000000000000
+ $ gawk -M 'BEGIN { PREC = 113; printf("%0.25f\n", 1/10) }'
+ -| 0.1000000000000000000000000
+
+
+File: gawk.info, Node: Setting the rounding mode, Prev: Setting precision, Up: FP Math Caution
+
+15.4.5 Setting the Rounding Mode
+--------------------------------
+
+The 'ROUNDMODE' variable provides program-level control over the
+rounding mode. The correspondence between 'ROUNDMODE' and the IEEE
+rounding modes is shown in *note Table 15.4: table-gawk-rounding-modes.
+
+Rounding mode IEEE name 'ROUNDMODE'
+---------------------------------------------------------------------------
+Round to nearest, ties to even 'roundTiesToEven' '"N"' or '"n"'
+Round toward positive infinity 'roundTowardPositive' '"U"' or '"u"'
+Round toward negative infinity 'roundTowardNegative' '"D"' or '"d"'
+Round toward zero 'roundTowardZero' '"Z"' or '"z"'
+Round to nearest, ties away 'roundTiesToAway' '"A"' or '"a"'
+from zero
+
+Table 15.4: 'gawk' rounding modes
+
+ 'ROUNDMODE' has the default value '"N"', which selects the IEEE 754
+rounding mode 'roundTiesToEven'. In *note Table 15.4:
+table-gawk-rounding-modes, the value '"A"' selects 'roundTiesToAway'.
+This is only available if your version of the MPFR library supports it;
+otherwise, setting 'ROUNDMODE' to '"A"' has no effect.
+
+ The default mode 'roundTiesToEven' is the most preferred, but the
+least intuitive. This method does the obvious thing for most values, by
+rounding them up or down to the nearest digit. For example, rounding
+1.132 to two digits yields 1.13, and rounding 1.157 yields 1.16.
+
+ However, when it comes to rounding a value that is exactly halfway
+between, things do not work the way you probably learned in school. In
+this case, the number is rounded to the nearest even digit. So rounding
+0.125 to two digits rounds down to 0.12, but rounding 0.6875 to three
+digits rounds up to 0.688. You probably have already encountered this
+rounding mode when using 'printf' to format floating-point numbers. For
+example:
+
+ BEGIN {
+ x = -4.5
+ for (i = 1; i < 10; i++) {
+ x += 1.0
+ printf("%4.1f => %2.0f\n", x, x)
+ }
+ }
+
+produces the following output when run on the author's system:(1)
+
+ -3.5 => -4
+ -2.5 => -2
+ -1.5 => -2
+ -0.5 => 0
+ 0.5 => 0
+ 1.5 => 2
+ 2.5 => 2
+ 3.5 => 4
+ 4.5 => 4
+
+ The theory behind 'roundTiesToEven' is that it more or less evenly
+distributes upward and downward rounds of exact halves, which might
+cause any accumulating round-off error to cancel itself out. This is
+the default rounding mode for IEEE 754 computing functions and
+operators.
+
+ The other rounding modes are rarely used. Rounding toward positive
+infinity ('roundTowardPositive') and toward negative infinity
+('roundTowardNegative') are often used to implement interval arithmetic,
+where you adjust the rounding mode to calculate upper and lower bounds
+for the range of output. The 'roundTowardZero' mode can be used for
+converting floating-point numbers to integers. The rounding mode
+'roundTiesToAway' rounds the result to the nearest number and selects
+the number with the larger magnitude if a tie occurs.
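+
+ For example, running the same computation once under each of the two
+directed rounding modes brackets the exact result (a sketch; the exact
+digits printed depend on the working precision):
+
+ gawk -M -v ROUNDMODE="D" 'BEGIN { printf("%.20f\n", 1 / 3) }'
+ gawk -M -v ROUNDMODE="U" 'BEGIN { printf("%.20f\n", 1 / 3) }'
+
+The quotient computed under '"D"' is never larger than the exact value
+of 1/3, and the one computed under '"U"' is never smaller, so the two
+results bound the true value.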
+
+ Some numerical analysts will tell you that your choice of rounding
+style has tremendous impact on the final outcome, and advise you to wait
+until final output for any rounding. Instead, you can often avoid
+round-off error problems by setting the precision initially to some
+value sufficiently larger than the final desired precision, so that the
+accumulation of round-off error does not influence the outcome. If you
+suspect that results from your computation are sensitive to accumulation
+of round-off error, look for a significant difference in output when you
+change the rounding mode to be sure.
+
+ ---------- Footnotes ----------
+
+ (1) It is possible for the output to be completely different if the C
+library in your system does not use the IEEE 754 even-rounding rule to
+round halfway cases for 'printf'.
+
+
+File: gawk.info, Node: Arbitrary Precision Integers, Next: POSIX Floating Point Problems, Prev: FP Math Caution, Up: Arbitrary Precision Arithmetic
+
+15.5 Arbitrary-Precision Integer Arithmetic with 'gawk'
+=======================================================
+
+When given the '-M' option, 'gawk' performs all integer arithmetic using
+GMP arbitrary-precision integers. Any number that looks like an integer
+in a source or data file is stored as an arbitrary-precision integer.
+The size of the integer is limited only by the available memory. For
+example, the following computes 5^4^3^2, the result of which is beyond
+the limits of ordinary hardware double-precision floating-point values:
+
+ $ gawk -M 'BEGIN {
+ > x = 5^4^3^2
+ > print "number of digits =", length(x)
+ > print substr(x, 1, 20), "...", substr(x, length(x) - 19, 20)
+ > }'
+ -| number of digits = 183231
+ -| 62060698786608744707 ... 92256259918212890625
+
+ If instead you were to compute the same value using
+arbitrary-precision floating-point values, the precision needed for
+correct output (using the formula 'prec = 3.322 * dps') would be 3.322 x
+183231, or 608693.
+
+ The result from an arithmetic operation with an integer and a
+floating-point value is a floating-point value with a precision equal to
+the working precision. The following program calculates the eighth term
+in Sylvester's sequence(1) using a recurrence:
+
+ $ gawk -M 'BEGIN {
+ > s = 2.0
+ > for (i = 1; i <= 7; i++)
+ > s = s * (s - 1) + 1
+ > print s
+ > }'
+ -| 113423713055421845118910464
+
+ The output differs from the actual number,
+113,423,713,055,421,844,361,000,443, because the default precision of 53
+bits is not enough to represent the floating-point results exactly. You
+can either increase the precision (100 bits is enough in this case), or
+replace the floating-point constant '2.0' with an integer, to perform
+all computations using integer arithmetic to get the correct output.
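+
+ For example, with the constant written as the integer '2', every
+intermediate value is an exact arbitrary-precision integer and the
+final result is the true eighth term:
+
+ $ gawk -M 'BEGIN {
+ > s = 2
+ > for (i = 1; i <= 7; i++)
+ > s = s * (s - 1) + 1
+ > print s
+ > }'
+ -| 113423713055421844361000443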
+
+ Sometimes 'gawk' must implicitly convert an arbitrary-precision
+integer into an arbitrary-precision floating-point value. This is
+primarily because the MPFR library does not always provide the relevant
+interface to process arbitrary-precision integers or mixed-mode numbers
+as needed by an operation or function. In such a case, the precision is
+set to the minimum value necessary for exact conversion, and the working
+precision is not used for this purpose. If this is not what you need or
+want, you can employ a subterfuge and convert the integer to floating
+point first, like this:
+
+ gawk -M 'BEGIN { n = 13; print (n + 0.0) % 2.0 }'
+
+ You can avoid this issue altogether by specifying the number as a
+floating-point value to begin with:
+
+ gawk -M 'BEGIN { n = 13.0; print n % 2.0 }'
+
+ Note that for this particular example, it is likely best to just use
+the following:
+
+ gawk -M 'BEGIN { n = 13; print n % 2 }'
+
+ When dividing two arbitrary precision integers with either '/' or
+'%', the result is typically an arbitrary precision floating point value
+(unless the denominator evenly divides into the numerator). In order to
+do integer division or remainder with arbitrary precision integers, use
+the built-in 'intdiv()' function (*note Numeric Functions::).
+
+ You can simulate the 'intdiv()' function in standard 'awk' using this
+user-defined function:
+
+ # intdiv --- do integer division
+
+ function intdiv(numerator, denominator, result)
+ {
+ split("", result)
+
+ numerator = int(numerator)
+ denominator = int(denominator)
+ result["quotient"] = int(numerator / denominator)
+ result["remainder"] = int(numerator % denominator)
+
+ return 0.0
+ }
+
+ The following example program, contributed by Katie Wasserman, uses
+'intdiv()' to compute the digits of pi to as many places as you choose
+to set:
+
+ # pi.awk --- compute the digits of pi
+
+ BEGIN {
+ digits = 100000
+ two = 2 * 10 ^ digits
+ pi = two
+ for (m = digits * 4; m > 0; --m) {
+ d = m * 2 + 1
+ x = pi * m
+ intdiv(x, d, result)
+ pi = result["quotient"]
+ pi = pi + two
+ }
+ print pi
+ }
+
+ When asked about the algorithm used, Katie replied:
+
+ It's not that well known but it's not that obscure either. It's
+ Euler's modification to Newton's method for calculating pi. Take a
+ look at lines (23) - (25) here:
+ <http://mathworld.wolfram.com/PiFormulas.html>.
+
+ The algorithm I wrote simply expands the multiply by 2 and works
+ from the innermost expression outwards. I used this to program HP
+ calculators because it's quite easy to modify for tiny memory
+ devices with smallish word sizes. See
+ <http://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/articles.cgi?read=899>.
+
+ ---------- Footnotes ----------
+
+ (1) Weisstein, Eric W. 'Sylvester's Sequence'. From MathWorld--A
+Wolfram Web Resource
+(<http://mathworld.wolfram.com/SylvestersSequence.html>).
+
+
+File: gawk.info, Node: POSIX Floating Point Problems, Next: Floating point summary, Prev: Arbitrary Precision Integers, Up: Arbitrary Precision Arithmetic
+
+15.6 Standards Versus Existing Practice
+=======================================
+
+Historically, 'awk' has converted any nonnumeric-looking string to the
+numeric value zero, when required. Furthermore, the original definition
+of the language and the original POSIX standards specified that 'awk'
+only understands decimal numbers (base 10), and not octal (base 8) or
+hexadecimal numbers (base 16).
+
+ Changes in the language of the 2001 and 2004 POSIX standards can be
+interpreted to imply that 'awk' should support additional features.
+These features are:
+
+ * Interpretation of floating-point data values specified in
+ hexadecimal notation (e.g., '0xDEADBEEF'). (Note: data values,
+ _not_ source code constants.)
+
+ * Support for the special IEEE 754 floating-point values "not a
+ number" (NaN), positive infinity ("inf"), and negative infinity
+ ("-inf"). In particular, the format for these values is as
+ specified by the ISO C 1999 standard, which ignores case, may allow
+ implementation-dependent additional characters after 'nan', and
+ accepts either 'inf' or 'infinity'.
+
+ The first problem is that both of these are clear changes to
+historical practice:
+
+ * The 'gawk' maintainer feels that supporting hexadecimal
+ floating-point values, in particular, is ugly, and was never
+ intended by the original designers to be part of the language.
+
+ * Allowing completely alphabetic strings to have valid numeric values
+ is also a very severe departure from historical practice.
+
+ The second problem is that the 'gawk' maintainer feels that this
+interpretation of the standard, which required a certain amount of
+"language lawyering" to arrive at in the first place, was not even
+intended by the standard developers. In other words, "We see how you
+got where you are, but we don't think that that's where you want to be."
+
+ Recognizing these issues, but attempting to provide compatibility
+with the earlier versions of the standard, the 2008 POSIX standard added
+explicit wording to allow, but not require, that 'awk' support
+hexadecimal floating-point values and special values for "not a number"
+and infinity.
+
+ Although the 'gawk' maintainer continues to feel that providing those
+features is inadvisable, nevertheless, on systems that support IEEE
+floating point, it seems reasonable to provide _some_ way to support NaN
+and infinity values. The solution implemented in 'gawk' is as follows:
+
+ * With the '--posix' command-line option, 'gawk' becomes "hands off."
+ String values are passed directly to the system library's
+ 'strtod()' function, and if it successfully returns a numeric
+ value, that is what's used.(1) By definition, the results are not
+ portable across different systems. They are also a little
+ surprising:
+
+ $ echo nanny | gawk --posix '{ print $1 + 0 }'
+ -| nan
+ $ echo 0xDeadBeef | gawk --posix '{ print $1 + 0 }'
+ -| 3735928559
+
+ * Without '--posix', 'gawk' interprets the four string values '+inf',
+ '-inf', '+nan', and '-nan' specially, producing the corresponding
+ special numeric values. The leading sign acts as a signal to 'gawk'
+ (and the user) that the value is really numeric. Hexadecimal
+ floating point is not supported (unless you also use
+ '--non-decimal-data', which is _not_ recommended). For example:
+
+ $ echo nanny | gawk '{ print $1 + 0 }'
+ -| 0
+ $ echo +nan | gawk '{ print $1 + 0 }'
+ -| nan
+ $ echo 0xDeadBeef | gawk '{ print $1 + 0 }'
+ -| 0
+
+ 'gawk' ignores case in the four special values. Thus, '+nan' and
+ '+NaN' are the same.
+
+ ---------- Footnotes ----------
+
+ (1) You asked for it, you got it.
+
+
+File: gawk.info, Node: Floating point summary, Prev: POSIX Floating Point Problems, Up: Arbitrary Precision Arithmetic
+
+15.7 Summary
+============
+
+ * Most computer arithmetic is done using either integers or
+ floating-point values. Standard 'awk' uses double-precision
+ floating-point values.
+
+ * In the early 1990s Barbie mistakenly said, "Math class is tough!"
+ Although math isn't tough, floating-point arithmetic isn't the same
+ as pencil-and-paper math, and care must be taken:
+
+ - Not all numbers can be represented exactly.
+
+ - Comparing values should use a delta, instead of being done
+ directly with '==' and '!='.
+
+ - Errors accumulate.
+
+ - Operations are not always truly associative or distributive.
+
+ * Increasing the accuracy can help, but it is not a panacea.
+
+ * Often, increasing the accuracy and then rounding to the desired
+ number of digits produces reasonable results.
+
+ * Use '-M' (or '--bignum') to enable MPFR arithmetic. Use 'PREC' to
+ set the precision in bits, and 'ROUNDMODE' to set the IEEE 754
+ rounding mode.
+
+ * With '-M', 'gawk' performs arbitrary-precision integer arithmetic
+ using the GMP library. This is faster and more space-efficient
+ than using MPFR for the same calculations.
+
+ * There are several areas with respect to floating-point numbers
+ where 'gawk' disagrees with the POSIX standard. It pays to be
+ aware of them.
+
+ * Overall, there is no need to be unduly suspicious about the results
+ from floating-point arithmetic. The lesson to remember is that
+ floating-point arithmetic is always more complex than arithmetic
+ using pencil and paper. In order to take advantage of the power of
+ floating-point arithmetic, you need to know its limitations and
+ work within them. For most casual use of floating-point
+ arithmetic, you will often get the expected result if you simply
+ round the display of your final results to the correct number of
+ significant decimal digits.
+
+ * As general advice, avoid presenting numerical data in a manner that
+ implies better precision than is actually the case.
+
+
+File: gawk.info, Node: Dynamic Extensions, Next: Language History, Prev: Arbitrary Precision Arithmetic, Up: Top
+
+16 Writing Extensions for 'gawk'
+********************************
+
+It is possible to add new functions written in C or C++ to 'gawk' using
+dynamically loaded libraries. This facility is available on systems
+that support the C 'dlopen()' and 'dlsym()' functions. This major node
+describes how to create extensions using code written in C or C++.
+
+ If you don't know anything about C programming, you can safely skip
+this major node, although you may wish to review the documentation on
+the extensions that come with 'gawk' (*note Extension Samples::), and
+the information on the 'gawkextlib' project (*note gawkextlib::). The
+sample extensions are automatically built and installed when 'gawk' is.
+
+ NOTE: When '--sandbox' is specified, extensions are disabled (*note
+ Options::).
+
+* Menu:
+
+* Extension Intro:: What is an extension.
+* Plugin License:: A note about licensing.
+* Extension Mechanism Outline:: An outline of how it works.
+* Extension API Description:: A full description of the API.
+* Finding Extensions:: How 'gawk' finds compiled extensions.
+* Extension Example:: Example C code for an extension.
+* Extension Samples:: The sample extensions that ship with
+ 'gawk'.
+* gawkextlib:: The 'gawkextlib' project.
+* Extension summary:: Extension summary.
+* Extension Exercises:: Exercises.
+
+
+File: gawk.info, Node: Extension Intro, Next: Plugin License, Up: Dynamic Extensions
+
+16.1 Introduction
+=================
+
+An "extension" (sometimes called a "plug-in") is a piece of external
+compiled code that 'gawk' can load at runtime to provide additional
+functionality, over and above the built-in capabilities described in the
+rest of this Info file.
+
+ Extensions are useful because they allow you (of course) to extend
+'gawk''s functionality. For example, they can provide access to system
+calls (such as 'chdir()' to change directory) and to other C library
+routines that could be of use. As with most software, "the sky is the
+limit"; if you can imagine something that you might want to do and can
+write in C or C++, you can write an extension to do it!
+
+ Extensions are written in C or C++, using the "application
+programming interface" (API) defined for this purpose by the 'gawk'
+developers. The rest of this major node explains the facilities that
+the API provides and how to use them, and presents a small example
+extension. In addition, it documents the sample extensions included in
+the 'gawk' distribution and describes the 'gawkextlib' project. *Note
+Extension Design::, for a discussion of the extension mechanism goals
+and design.
+
+
+File: gawk.info, Node: Plugin License, Next: Extension Mechanism Outline, Prev: Extension Intro, Up: Dynamic Extensions
+
+16.2 Extension Licensing
+========================
+
+Every dynamic extension must be distributed under a license that is
+compatible with the GNU GPL (*note Copying::).
+
+ In order for the extension to tell 'gawk' that it is properly
+licensed, the extension must define the global symbol
+'plugin_is_GPL_compatible'. If this symbol does not exist, 'gawk' emits
+a fatal error and exits when it tries to load your extension.
+
+ The declared type of the symbol should be 'int'. It does not need to
+be in any allocated section, though. The code merely asserts that the
+symbol exists in the global scope. Something like this is enough:
+
+ int plugin_is_GPL_compatible;
+
+
+File: gawk.info, Node: Extension Mechanism Outline, Next: Extension API Description, Prev: Plugin License, Up: Dynamic Extensions
+
+16.3 How It Works at a High Level
+=================================
+
+Communication between 'gawk' and an extension is two-way. First, when
+an extension is loaded, 'gawk' passes it a pointer to a 'struct' whose
+fields are function pointers. This is shown in *note Figure 16.1:
+figure-load-extension.
+
+
+ Struct
+ +---+
+ | |
+ +---+
+ +---------------| |
+ | +---+ dl_load(api_p, id);
+ | | | ___________________
+ | +---+ |
+ | +---------| | __________________ |
+ | | +---+ ||
+ | | | | ||
+ | | +---+ ||
+ | | +---| | ||
+ | | | +---+ \\ || /
+ | | | \\ /
+ v v v \\/
++-------+-+---+-+---+-+------------------+--------------------+
+| |x| |x| |x| |OOOOOOOOOOOOOOOOOOOO|
+| |x| |x| |x| |OOOOOOOOOOOOOOOOOOOO|
+| |x| |x| |x| |OOOOOOOOOOOOOOOOOOOO|
++-------+-+---+-+---+-+------------------+--------------------+
+
+ gawk Main Program Address Space Extension
+
+Figure 16.1: Loading the extension
+
+ The extension can call functions inside 'gawk' through these function
+pointers, at runtime, without needing (link-time) access to 'gawk''s
+symbols. One of these function pointers is to a function for
+"registering" new functions. This is shown in *note Figure 16.2:
+figure-register-new-function.
+
+
+
+ +--------------------------------------------+
+ | |
+ V |
++-------+-+---+-+---+-+------------------+--------------+-+---+
+| |x| |x| |x| |OOOOOOOOOOOOOO|X|OOO|
+| |x| |x| |x| |OOOOOOOOOOOOOO|X|OOO|
+| |x| |x| |x| |OOOOOOOOOOOOOO|X|OOO|
++-------+-+---+-+---+-+------------------+--------------+-+---+
+
+ gawk Main Program Address Space Extension
+
+Figure 16.2: Registering a new function
+
+ In the other direction, the extension registers its new functions
+with 'gawk' by passing function pointers to the functions that provide
+the new feature ('do_chdir()', for example). 'gawk' associates the
+function pointer with a name and can then call it, using a defined
+calling convention. This is shown in *note Figure 16.3:
+figure-call-new-function.
+
+
+ chdir("/path") (*fnptr)(1);
+ }
+ +--------------------------------------------+
+ | |
+ | V
++-------+-+---+-+---+-+------------------+--------------+-+---+
+| |x| |x| |x| |OOOOOOOOOOOOOO|X|OOO|
+| |x| |x| |x| |OOOOOOOOOOOOOO|X|OOO|
+| |x| |x| |x| |OOOOOOOOOOOOOO|X|OOO|
++-------+-+---+-+---+-+------------------+--------------+-+---+
+
+ gawk Main Program Address Space Extension
+
+Figure 16.3: Calling the new function
+
+ The 'do_XXX()' function, in turn, then uses the function pointers in
+the API 'struct' to do its work, such as updating variables or arrays,
+printing messages, setting 'ERRNO', and so on.
+
+ Convenience macros make calling through the function pointers look
+like regular function calls so that extension code is quite readable and
+understandable.
+
+ Although all of this sounds somewhat complicated, the result is that
+extension code is quite straightforward to write and to read. You can
+see this in the sample extension 'filefuncs.c' (*note Extension
+Example::) and also in the 'testext.c' code for testing the APIs.
+
+ Some other bits and pieces:
+
+ * The API provides access to 'gawk''s 'do_XXX' values, reflecting
+ command-line options, like 'do_lint', 'do_profiling', and so on
+ (*note Extension API Variables::). These are informational: an
+ extension cannot affect their values inside 'gawk'. In addition,
+ attempting to assign to them produces a compile-time error.
+
+ * The API also provides major and minor version numbers, so that an
+ extension can check if the 'gawk' it is loaded with supports the
+ facilities it was compiled with. (Version mismatches "shouldn't"
+ happen, but we all know how _that_ goes.) *Note Extension
+ Versioning:: for details.
+
+
+File: gawk.info, Node: Extension API Description, Next: Finding Extensions, Prev: Extension Mechanism Outline, Up: Dynamic Extensions
+
+16.4 API Description
+====================
+
+C or C++ code for an extension must include the header file 'gawkapi.h',
+which declares the functions and defines the data types used to
+communicate with 'gawk'. This (rather large) minor node describes the
+API in detail.
+
+* Menu:
+
+* Extension API Functions Introduction:: Introduction to the API functions.
+* General Data Types:: The data types.
+* Memory Allocation Functions:: Functions for allocating memory.
+* Constructor Functions:: Functions for creating values.
+* Registration Functions:: Functions to register things with
+ 'gawk'.
+* Printing Messages:: Functions for printing messages.
+* Updating ERRNO:: Functions for updating 'ERRNO'.
+* Requesting Values:: How to get a value.
+* Accessing Parameters:: Functions for accessing parameters.
+* Symbol Table Access:: Functions for accessing global
+ variables.
+* Array Manipulation:: Functions for working with arrays.
+* Redirection API:: How to access and manipulate redirections.
+* Extension API Variables:: Variables provided by the API.
+* Extension API Boilerplate:: Boilerplate code for using the API.
+
+
+File: gawk.info, Node: Extension API Functions Introduction, Next: General Data Types, Up: Extension API Description
+
+16.4.1 Introduction
+-------------------
+
+Access to facilities within 'gawk' is achieved by calling through
+function pointers passed into your extension.
+
+ API function pointers are provided for the following kinds of
+operations:
+
+ * Allocating, reallocating, and releasing memory.
+
+ * Registration functions. You may register:
+
+ - Extension functions
+ - Exit callbacks
+ - A version string
+ - Input parsers
+ - Output wrappers
+ - Two-way processors
+
+ All of these are discussed in detail later in this major node.
+
+ * Printing fatal, warning, and "lint" warning messages.
+
+ * Updating 'ERRNO', or unsetting it.
+
+ * Accessing parameters, including converting an undefined parameter
+ into an array.
+
+ * Symbol table access: retrieving a global variable, creating one, or
+ changing one.
+
+ * Creating and releasing cached values; this provides an efficient
+ way to use values for multiple variables and can be a big
+ performance win.
+
+ * Manipulating arrays:
+
+ - Retrieving, adding, deleting, and modifying elements
+
+ - Getting the count of elements in an array
+
+ - Creating a new array
+
+ - Clearing an array
+
+ - Flattening an array for easy C-style looping over all its
+ indices and elements
+
+ * Accessing and manipulating redirections.
+
+ Some points about using the API:
+
+ * The following types, macros, and/or functions are referenced in
+ 'gawkapi.h'. For correct use, you must therefore include the
+ corresponding standard header file _before_ including 'gawkapi.h':
+
+ C entity Header file
+ -------------------------------------------
+ 'EOF' '<stdio.h>'
+ Values for 'errno' '<errno.h>'
+ 'FILE' '<stdio.h>'
+ 'NULL' '<stddef.h>'
+ 'memcpy()' '<string.h>'
+ 'memset()' '<string.h>'
+ 'size_t' '<sys/types.h>'
+ 'struct stat' '<sys/stat.h>'
+
+ Due to portability concerns, especially to systems that are not
+ fully standards-compliant, it is your responsibility to include the
+ correct files in the correct way. This requirement is necessary in
+ order to keep 'gawkapi.h' clean, instead of becoming a portability
+ hodge-podge as can be seen in some parts of the 'gawk' source code.
+
+ * The 'gawkapi.h' file may be included more than once without ill
+ effect. Doing so, however, is poor coding practice.
+
+ * Although the API only uses ISO C 90 features, there is an
+ exception; the "constructor" functions use the 'inline' keyword.
+ If your compiler does not support this keyword, you should either
+ place '-Dinline=''' on your command line or use the GNU Autotools
+ and include a 'config.h' file in your extensions.
+
+ * All pointers filled in by 'gawk' point to memory managed by 'gawk'
+ and should be treated by the extension as read-only. Memory for
+ _all_ strings passed into 'gawk' from the extension _must_ come
+ from calling one of 'gawk_malloc()', 'gawk_calloc()', or
+ 'gawk_realloc()', and is managed by 'gawk' from then on.
+
+ * The API defines several simple 'struct's that map values as seen
+ from 'awk'. A value can be a 'double', a string, or an array (as
+ in multidimensional arrays, or when creating a new array). String
+ values maintain both pointer and length, because embedded NUL
+ characters are allowed.
+
+ NOTE: By intent, strings are maintained using the current
+ multibyte encoding (as defined by 'LC_XXX' environment
+ variables) and not using wide characters. This matches how
+ 'gawk' stores strings internally and also how characters are
+ likely to be input into and output from files.
+
+ * When retrieving a value (such as a parameter or that of a global
+ variable or array element), the extension requests a specific type
+ (number, string, scalar, value cookie, array, or "undefined").
+ When the request is "undefined," the returned value will have the
+ real underlying type.
+
+ However, if the request and actual type don't match, the access
+ function returns "false" and fills in the type of the actual value
+ that is there, so that the extension can, e.g., print an error
+ message (such as "scalar passed where array expected").
+
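+ For example, based on the table in the first item above, a typical
+extension source file might begin with an include preamble along these
+lines (a sketch; a real extension may need additional headers of its
+own):
+
+ #include <stdio.h>
+ #include <errno.h>
+ #include <stddef.h>
+ #include <string.h>
+ #include <sys/types.h>
+ #include <sys/stat.h>
+
+ #include "gawkapi.h"
+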
+ You may call the API functions by using the function pointers
+directly, but the interface is not so pretty. To make extension code
+look more like regular code, the 'gawkapi.h' header file defines several
+macros that you should use in your code. This minor node presents the
+macros as if they were functions.
+
+
+File: gawk.info, Node: General Data Types, Next: Memory Allocation Functions, Prev: Extension API Functions Introduction, Up: Extension API Description
+
+16.4.2 General-Purpose Data Types
+---------------------------------
+
+ I have a true love/hate relationship with unions.
+ -- _Arnold Robbins_
+
+ That's the thing about unions: the compiler will arrange things so
+ they can accommodate both love and hate.
+ -- _Chet Ramey_
+
+ The extension API defines a number of simple types and structures for
+general-purpose use. Additional, more specialized, data structures are
+introduced in subsequent minor nodes, together with the functions that
+use them.
+
+ The general-purpose types and structures are as follows:
+
+'typedef void *awk_ext_id_t;'
+ A value of this type is received from 'gawk' when an extension is
+ loaded. That value must then be passed back to 'gawk' as the first
+ parameter of each API function.
+
+'#define awk_const ...'
+ This macro expands to 'const' when compiling an extension, and to
+ nothing when compiling 'gawk' itself. This makes certain fields in
+ the API data structures unwritable from extension code, while
+ allowing 'gawk' to use them as it needs to.
+
+'typedef enum awk_bool {'
+' awk_false = 0,'
+' awk_true'
+'} awk_bool_t;'
+ A simple Boolean type.
+
+'typedef struct awk_string {'
+' char *str; /* data */'
+' size_t len; /* length thereof, in chars */'
+'} awk_string_t;'
+ This represents a mutable string. 'gawk' owns the memory pointed
+ to if it supplied the value. Otherwise, it takes ownership of the
+ memory pointed to. _Such memory must come from calling one of the
+ 'gawk_malloc()', 'gawk_calloc()', or 'gawk_realloc()' functions!_
+
+ As mentioned earlier, strings are maintained using the current
+ multibyte encoding.
+
+'typedef enum {'
+' AWK_UNDEFINED,'
+' AWK_NUMBER,'
+' AWK_STRING,'
+' AWK_ARRAY,'
+' AWK_SCALAR, /* opaque access to a variable */'
+' AWK_VALUE_COOKIE /* for updating a previously created value */'
+'} awk_valtype_t;'
+ This 'enum' indicates the type of a value. It is used in the
+ following 'struct'.
+
+'typedef struct awk_value {'
+' awk_valtype_t val_type;'
+' union {'
+' awk_string_t s;'
+' double d;'
+' awk_array_t a;'
+' awk_scalar_t scl;'
+' awk_value_cookie_t vc;'
+' } u;'
+'} awk_value_t;'
+ An "'awk' value." The 'val_type' member indicates what kind of
+ value the 'union' holds, and each member is of the appropriate
+ type.
+
+'#define str_value u.s'
+'#define num_value u.d'
+'#define array_cookie u.a'
+'#define scalar_cookie u.scl'
+'#define value_cookie u.vc'
+ Using these macros makes accessing the fields of the 'awk_value_t'
+ more readable.
+
+'typedef void *awk_scalar_t;'
+ Scalars can be represented as an opaque type. These values are
+ obtained from 'gawk' and then passed back into it. This is
+ discussed in a general fashion in the text following this list, and
+ in more detail in *note Symbol table by cookie::.
+
+'typedef void *awk_value_cookie_t;'
+ A "value cookie" is an opaque type representing a cached value.
+ This is also discussed in a general fashion in the text following
+ this list, and in more detail in *note Cached values::.
+
+ Scalar values in 'awk' are either numbers or strings. The
+'awk_value_t' struct represents values. The 'val_type' member indicates
+what is in the 'union'.
+
+ Representing numbers is easy--the API uses a C 'double'. Strings
+require more work. Because 'gawk' allows embedded NUL bytes in string
+values, a string must be represented as a pair containing a data pointer
+and length. This is the 'awk_string_t' type.
+
+ Identifiers (i.e., the names of global variables) can be associated
+with either scalar values or with arrays. In addition, 'gawk' provides
+true arrays of arrays, where any given array element can itself be an
+array. Discussion of arrays is delayed until *note Array
+Manipulation::.
+
+ The various macros listed earlier make it easier to use the elements
+of the 'union' as if they were fields in a 'struct'; this is a common
+coding practice in C. Such code is easier to write and to read, but it
+remains _your_ responsibility to make sure that the 'val_type' member
+correctly reflects the type of the value in the 'awk_value_t' struct.
+
+ Conceptually, the first three members of the 'union' (number, string,
+and array) are all that is needed for working with 'awk' values.
+However, because the API provides routines for accessing and changing
+the value of a global scalar variable only by using the variable's name,
+there is a performance penalty: 'gawk' must find the variable each time
+it is accessed and changed. This turns out to be a real issue, not just
+a theoretical one.
+
+ Thus, if you know that your extension will spend considerable time
+reading and/or changing the value of one or more scalar variables, you
+can obtain a "scalar cookie"(1) object for that variable, and then use
+the cookie for getting the variable's value or for changing the
+variable's value. The 'awk_scalar_t' type holds a scalar cookie, and
+the 'scalar_cookie' macro provides access to the value of that type in
+the 'awk_value_t' struct. Given a scalar cookie, 'gawk' can directly
+retrieve or modify the value, as required, without having to find it
+first.
+
+ The 'awk_value_cookie_t' type and 'value_cookie' macro are similar.
+If you know that you wish to use the same numeric or string _value_ for
+one or more variables, you can create the value once, retaining a "value
+cookie" for it, and then pass in that value cookie whenever you wish to
+set the value of a variable. This saves storage space within the
+running 'gawk' process and reduces the time needed to create the value.
+
+ ---------- Footnotes ----------
+
+ (1) See the "cookie" entry in the Jargon file
+(http://catb.org/jargon/html/C/cookie.html) for a definition of
+"cookie", and the "magic cookie" entry in the Jargon file
+(http://catb.org/jargon/html/M/magic-cookie.html) for a nice example.
+See also the entry for "Cookie" in the *note Glossary::.
+
+
+File: gawk.info, Node: Memory Allocation Functions, Next: Constructor Functions, Prev: General Data Types, Up: Extension API Description
+
+16.4.3 Memory Allocation Functions and Convenience Macros
+---------------------------------------------------------
+
+The API provides a number of "memory allocation" functions for
+allocating memory that can be passed to 'gawk', as well as a number of
+convenience macros. This node presents them all as function prototypes,
+in the way that extension code would use them:
+
+'void *gawk_malloc(size_t size);'
+ Call the correct version of 'malloc()' to allocate storage that may
+ be passed to 'gawk'.
+
+'void *gawk_calloc(size_t nmemb, size_t size);'
+ Call the correct version of 'calloc()' to allocate storage that may
+ be passed to 'gawk'.
+
+'void *gawk_realloc(void *ptr, size_t size);'
+ Call the correct version of 'realloc()' to allocate storage that
+ may be passed to 'gawk'.
+
+'void gawk_free(void *ptr);'
+ Call the correct version of 'free()' to release storage that was
+ allocated with 'gawk_malloc()', 'gawk_calloc()', or
+ 'gawk_realloc()'.
+
+ The API has to provide these functions because it is possible for an
+extension to be compiled and linked against a different version of the C
+library than was used for the 'gawk' executable.(1) If 'gawk' were to
+use its version of 'free()' when the memory came from an unrelated
+version of 'malloc()', unexpected behavior would likely result.
+
+ Two convenience macros may be used for allocating storage from
+'gawk_malloc()' and 'gawk_realloc()'. If the allocation fails, they
+cause 'gawk' to exit with a fatal error message. They should be used as
+if they were procedure calls that do not return a value:
+
+'#define emalloc(pointer, type, size, message) ...'
+ The arguments to this macro are as follows:
+
+ 'pointer'
+ The pointer variable to point at the allocated storage.
+
+ 'type'
+ The type of the pointer variable. This is used to create a
+ cast for the call to 'gawk_malloc()'.
+
+ 'size'
+ The total number of bytes to be allocated.
+
+ 'message'
+ A message to be prefixed to the fatal error message.
+ Typically this is the name of the function using the macro.
+
+ For example, you might allocate a string value like so:
+
+ awk_value_t result;
+ char *message;
+ const char greet[] = "Don't Panic!";
+
+ emalloc(message, char *, sizeof(greet), "myfunc");
+ strcpy(message, greet);
+ make_malloced_string(message, strlen(message), & result);
+
+'#define erealloc(pointer, type, size, message) ...'
+ This is like 'emalloc()', but it calls 'gawk_realloc()' instead of
+ 'gawk_malloc()'. The arguments are the same as for the 'emalloc()'
+ macro.
+
+ ---------- Footnotes ----------
+
+ (1) This is more common on MS-Windows systems, but it can happen on
+Unix-like systems as well.
+
+
+File: gawk.info, Node: Constructor Functions, Next: Registration Functions, Prev: Memory Allocation Functions, Up: Extension API Description
+
+16.4.4 Constructor Functions
+----------------------------
+
+The API provides a number of "constructor" functions for creating string
+and numeric values, as well as a number of convenience macros. This
+node presents them all as function prototypes, in the way that extension
+code would use them:
+
+'static inline awk_value_t *'
+'make_const_string(const char *string, size_t length, awk_value_t *result);'
+ This function creates a string value in the 'awk_value_t' variable
+ pointed to by 'result'. It expects 'string' to be a C string
+ constant (or other string data), and automatically creates a _copy_
+ of the data for storage in 'result'. It returns 'result'.
+
+'static inline awk_value_t *'
+'make_malloced_string(const char *string, size_t length, awk_value_t *result);'
+ This function creates a string value in the 'awk_value_t' variable
+ pointed to by 'result'. It expects 'string' to be a 'char *' value
+ pointing to data previously obtained from 'gawk_malloc()',
+ 'gawk_calloc()', or 'gawk_realloc()'. The idea here is that the
+ data is passed directly to 'gawk', which assumes responsibility for
+ it. It returns 'result'.
+
+'static inline awk_value_t *'
+'make_null_string(awk_value_t *result);'
+ This specialized function creates a null string (the "undefined"
+ value) in the 'awk_value_t' variable pointed to by 'result'. It
+ returns 'result'.
+
+'static inline awk_value_t *'
+'make_number(double num, awk_value_t *result);'
+ This function simply creates a numeric value in the 'awk_value_t'
+ variable pointed to by 'result'.
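+
+ For example, a hypothetical extension function 'do_hello()' might use
+'make_const_string()' to build its return value (a sketch only):
+
+ static awk_value_t *
+ do_hello(int num_actual_args, awk_value_t *result)
+ {
+ /* make_const_string() copies the string data for gawk */
+ return make_const_string("hello, world", 12, result);
+ }
+
+A function that returns a number calls 'make_number()' in the same way,
+passing the 'double' value as the first argument.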
+
+
+File: gawk.info, Node: Registration Functions, Next: Printing Messages, Prev: Constructor Functions, Up: Extension API Description
+
+16.4.5 Registration Functions
+-----------------------------
+
+This minor node describes the API functions for registering parts of
+your extension with 'gawk'.
+
+* Menu:
+
+* Extension Functions:: Registering extension functions.
+* Exit Callback Functions:: Registering an exit callback.
+* Extension Version String:: Registering a version string.
+* Input Parsers:: Registering an input parser.
+* Output Wrappers:: Registering an output wrapper.
+* Two-way processors:: Registering a two-way processor.
+
+
+File: gawk.info, Node: Extension Functions, Next: Exit Callback Functions, Up: Registration Functions
+
+16.4.5.1 Registering An Extension Function
+..........................................
+
+Extension functions are described by the following record:
+
+ typedef struct awk_ext_func {
+ const char *name;
+ awk_value_t *(*function)(int num_actual_args, awk_value_t *result);
+ size_t max_expected_args;
+ } awk_ext_func_t;
+
+ The fields are:
+
+'const char *name;'
+ The name of the new function. 'awk'-level code calls the function
+ by this name. This is a regular C string.
+
+ Function names must obey the rules for 'awk' identifiers. That is,
+ they must begin with either an English letter or an underscore,
+ which may be followed by any number of letters, digits, and
+ underscores. Letter case in function names is significant.
+
+'awk_value_t *(*function)(int num_actual_args, awk_value_t *result);'
+ This is a pointer to the C function that provides the extension's
+ functionality. The function must fill in '*result' with either a
+ number or a string. 'gawk' takes ownership of any string memory.
+ As mentioned earlier, string memory _must_ come from one of
+ 'gawk_malloc()', 'gawk_calloc()', or 'gawk_realloc()'.
+
+ The 'num_actual_args' argument tells the C function how many actual
+ parameters were passed from the calling 'awk' code.
+
+ The function must return the value of 'result'. This is for the
+ convenience of the calling code inside 'gawk'.
+
+'size_t max_expected_args;'
+ This is the maximum number of arguments the function expects to
+ receive. Each extension function may decide what to do if the
+ number of arguments isn't what it expected. As with real 'awk'
+ functions, it is likely OK to ignore extra arguments. This value
+ does not affect actual program execution.
+
+ Extension functions should compare this value to the number of
+ actual arguments passed and possibly issue a lint warning if there
+ is an undesirable mismatch. Of course, if '--lint=fatal' is used,
+ this would cause the program to exit.
+
+ Once you have a record representing your extension function, you
+register it with 'gawk' using this API function:
+
+'awk_bool_t add_ext_func(const char *namespace, const awk_ext_func_t *func);'
+ This function returns true upon success, false otherwise. The
+ 'namespace' parameter is currently not used; you should pass in an
+ empty string ('""'). The 'func' pointer is the address of a
+ 'struct' representing your function, as just described.
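+
+ Putting this together, registering a hypothetical 'do_hello()'
+extension function might look like this; the registration call normally
+goes in the extension's initialization code (a sketch only):
+
+ static awk_ext_func_t hello_func = {
+ "hello", /* name used in awk code */
+ do_hello, /* the C implementation */
+ 0 /* max_expected_args */
+ };
+
+ ...
+
+ if (! add_ext_func("", & hello_func)) {
+ /* registration failed; handle the error */
+ }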
+
+
+File: gawk.info, Node: Exit Callback Functions, Next: Extension Version String, Prev: Extension Functions, Up: Registration Functions
+
+16.4.5.2 Registering An Exit Callback Function
+..............................................
+
+An "exit callback" function is a function that 'gawk' calls before it
+exits. Such functions are useful if you have general "cleanup" tasks
+that should be performed in your extension (such as closing database
+connections or other resource deallocations). You can register such a
+function with 'gawk' using the following function:
+
+'void awk_atexit(void (*funcp)(void *data, int exit_status),'
+' void *arg0);'
+ The parameters are:
+
+ 'funcp'
+ A pointer to the function to be called before 'gawk' exits.
+ The 'data' parameter will be the original value of 'arg0'.
+ The 'exit_status' parameter is the exit status value that
+ 'gawk' intends to pass to the 'exit()' system call.
+
+ 'arg0'
+ A pointer to private data that 'gawk' saves in order to pass
+ to the function pointed to by 'funcp'.
+
+ Exit callback functions are called in last-in, first-out (LIFO)
+order--that is, in the reverse order in which they are registered with
+'gawk'.
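+
+ For example, an extension that opens a database connection might
+arrange to shut it down cleanly like so (a sketch; 'close_connection()'
+and 'db_handle' are hypothetical names):
+
+ static void
+ shut_down(void *data, int exit_status)
+ {
+ close_connection(data); /* release the resource */
+ }
+
+ ...
+
+ awk_atexit(shut_down, db_handle);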
+
+
+File: gawk.info, Node: Extension Version String, Next: Input Parsers, Prev: Exit Callback Functions, Up: Registration Functions
+
+16.4.5.3 Registering An Extension Version String
+................................................
+
+You can register a version string that indicates the name and version of
+your extension with 'gawk', as follows:
+
+'void register_ext_version(const char *version);'
+ Register the string pointed to by 'version' with 'gawk'. Note that
+ 'gawk' does _not_ copy the 'version' string, so it should not be
+ changed.
+
+ 'gawk' prints all registered extension version strings when it is
+invoked with the '--version' option.
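+
+ For example (a sketch; the contents of the string are up to you):
+
+ static const char *ext_version = "hello extension: version 1.0";
+
+ ...
+
+ register_ext_version(ext_version);
+
+Because 'gawk' keeps the pointer, the string is given static storage
+duration instead of being built in a local array that might go away.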
+
+
+File: gawk.info, Node: Input Parsers, Next: Output Wrappers, Prev: Extension Version String, Up: Registration Functions
+
+16.4.5.4 Customized Input Parsers
+.................................
+
+By default, 'gawk' reads text files as its input. It uses the value of
+'RS' to find the end of the record, and then uses 'FS' (or 'FIELDWIDTHS'
+or 'FPAT') to split it into fields (*note Reading Files::).
+Additionally, it sets the value of 'RT' (*note Built-in Variables::).
+
+ If you want, you can provide your own custom input parser. An input
+parser's job is to return a record to the 'gawk' record-processing code,
+along with indicators for the value and length of the data to be used
+for 'RT', if any.
+
+ To provide an input parser, you must first provide two functions
+(where XXX is a prefix name for your extension):
+
+'awk_bool_t XXX_can_take_file(const awk_input_buf_t *iobuf);'
+ This function examines the information available in 'iobuf' (which
+ we discuss shortly). Based on the information there, it decides if
+ the input parser should be used for this file. If so, it should
+ return true. Otherwise, it should return false. It should not
+ change any state (variable values, etc.) within 'gawk'.
+
+'awk_bool_t XXX_take_control_of(awk_input_buf_t *iobuf);'
+ When 'gawk' decides to hand control of the file over to the input
+ parser, it calls this function. This function in turn must fill in
+ certain fields in the 'awk_input_buf_t' structure and ensure that
+ certain conditions are true. It should then return true. If an
+ error of some kind occurs, it should not fill in any fields and
+ should return false; then 'gawk' will not use the input parser.
+ The details are presented shortly.
+
+ Your extension should package these functions inside an
+'awk_input_parser_t', which looks like this:
+
+ typedef struct awk_input_parser {
+ const char *name; /* name of parser */
+ awk_bool_t (*can_take_file)(const awk_input_buf_t *iobuf);
+ awk_bool_t (*take_control_of)(awk_input_buf_t *iobuf);
+ awk_const struct awk_input_parser *awk_const next; /* for gawk */
+ } awk_input_parser_t;
+
+ The fields are:
+
+'const char *name;'
+ The name of the input parser. This is a regular C string.
+
+'awk_bool_t (*can_take_file)(const awk_input_buf_t *iobuf);'
+ A pointer to your 'XXX_can_take_file()' function.
+
+'awk_bool_t (*take_control_of)(awk_input_buf_t *iobuf);'
+ A pointer to your 'XXX_take_control_of()' function.
+
+'awk_const struct awk_input_parser *awk_const next;'
+ This is for use by 'gawk'; therefore it is marked 'awk_const' so
+ that the extension cannot modify it.
+
+ The steps are as follows:
+
+ 1. Create a 'static awk_input_parser_t' variable and initialize it
+ appropriately.
+
+ 2. When your extension is loaded, register your input parser with
+ 'gawk' using the 'register_input_parser()' API function (described
+ next).
+
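+ In outline, the two steps look something like this (a sketch; the
+'XXX_*' functions are the ones you supply, as described earlier):
+
+ static awk_input_parser_t XXX_parser = {
+ "XXX", /* name of the input parser */
+ XXX_can_take_file,
+ XXX_take_control_of,
+ NULL /* next; reserved for use by gawk */
+ };
+
+ ...
+
+ /* in the extension's initialization code */
+ register_input_parser(& XXX_parser);
+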
+ An 'awk_input_buf_t' looks like this:
+
+ typedef struct awk_input {
+ const char *name; /* filename */
+ int fd; /* file descriptor */
+ #define INVALID_HANDLE (-1)
+ void *opaque; /* private data for input parsers */
+ int (*get_record)(char **out, struct awk_input *iobuf,
+ int *errcode, char **rt_start, size_t *rt_len);
+ ssize_t (*read_func)();
+ void (*close_func)(struct awk_input *iobuf);
+ struct stat sbuf; /* stat buf */
+ } awk_input_buf_t;
+
+ The fields can be divided into two categories: those for use
+(initially, at least) by 'XXX_can_take_file()', and those for use by
+'XXX_take_control_of()'. The first group of fields and their uses are
+as follows:
+
+'const char *name;'
+ The name of the file.
+
+'int fd;'
+ A file descriptor for the file. If 'gawk' was able to open the
+ file, then 'fd' will _not_ be equal to 'INVALID_HANDLE'.
+ Otherwise, it will.
+
+'struct stat sbuf;'
+ If the file descriptor is valid, then 'gawk' will have filled in
+ this structure via a call to the 'fstat()' system call.
+
+ The 'XXX_can_take_file()' function should examine these fields and
+decide if the input parser should be used for the file. The decision
+can be made based upon 'gawk' state (the value of a variable defined
+previously by the extension and set by 'awk' code), the name of the
+file, whether or not the file descriptor is valid, the information in
+the 'struct stat', or any combination of these factors.
+
+ Once 'XXX_can_take_file()' has returned true, and 'gawk' has decided
+to use your input parser, it calls 'XXX_take_control_of()'. That
+function then fills either the 'get_record' field or the 'read_func'
+field in the 'awk_input_buf_t'. It must also ensure that 'fd' is _not_
+set to 'INVALID_HANDLE'. The following list describes the fields that
+may be filled by 'XXX_take_control_of()':
+
+'void *opaque;'
+ This is used to hold any state information needed by the input
+ parser for this file. It is "opaque" to 'gawk'. The input parser
+ is not required to use this pointer.
+
+'int (*get_record)(char **out,'
+' struct awk_input *iobuf,'
+' int *errcode,'
+' char **rt_start,'
+' size_t *rt_len);'
+ This function pointer should point to a function that creates the
+ input records. Said function is the core of the input parser. Its
+ behavior is described in the text following this list.
+
+'ssize_t (*read_func)();'
+ This function pointer should point to a function that has the same
+ behavior as the standard POSIX 'read()' system call. It is an
+ alternative to the 'get_record' pointer. Its behavior is also
+ described in the text following this list.
+
+'void (*close_func)(struct awk_input *iobuf);'
+ This function pointer should point to a function that does the
+ "teardown." It should release any resources allocated by
+ 'XXX_take_control_of()'. It may also close the file. If it does
+ so, it should set the 'fd' field to 'INVALID_HANDLE'.
+
+ If 'fd' is still not 'INVALID_HANDLE' after the call to this
+ function, 'gawk' calls the regular 'close()' system call.
+
+ Having a "teardown" function is optional. If your input parser
+ does not need it, do not set this field. Then, 'gawk' calls the
+ regular 'close()' system call on the file descriptor, so it should
+ be valid.
+
+ The 'XXX_get_record()' function does the work of creating input
+records. The parameters are as follows:
+
+'char **out'
+ This is a pointer to a 'char *' variable that is set to point to
+ the record. 'gawk' makes its own copy of the data, so the
+ extension must manage this storage.
+
+'struct awk_input *iobuf'
+ This is the 'awk_input_buf_t' for the file. The fields should be
+ used for reading data ('fd') and for managing private state
+ ('opaque'), if any.
+
+'int *errcode'
+ If an error occurs, '*errcode' should be set to an appropriate code
+ from '<errno.h>'.
+
+'char **rt_start'
+'size_t *rt_len'
+ If the concept of a "record terminator" makes sense, then
+ '*rt_start' should be set to point to the data to be used for 'RT',
+ and '*rt_len' should be set to the length of the data. Otherwise,
+ '*rt_len' should be set to zero. 'gawk' makes its own copy of this
+ data, so the extension must manage this storage.
+
+ The return value is the length of the buffer pointed to by '*out', or
+'EOF' if end-of-file was reached or an error occurred.
+
+ It is guaranteed that 'errcode' is a valid pointer, so there is no
+need to test for a 'NULL' value. 'gawk' sets '*errcode' to zero, so
+there is no need to set it unless an error occurs.
+
+ If an error does occur, the function should return 'EOF' and set
+'*errcode' to a value greater than zero. In that case, if '*errcode'
+does not equal zero, 'gawk' automatically updates the 'ERRNO' variable
+based on the value of '*errcode'. (In general, setting '*errcode =
+errno' should do the right thing.)
+
+ As an alternative to supplying a function that returns an input
+record, you may instead supply a function that simply reads bytes, and
+let 'gawk' parse the data into records. If you do so, the data should
+be returned in the multibyte encoding of the current locale. Such a
+function should follow the same behavior as the 'read()' system call,
+and you fill in the 'read_func' pointer with its address in the
+'awk_input_buf_t' structure.
+
+ By default, 'gawk' sets the 'read_func' pointer to point to the
+'read()' system call. So your extension need not set this field
+explicitly.
+
+ NOTE: You must choose one method or the other: either a function
+ that returns a record, or one that returns raw data. In
+ particular, if you supply a function to get a record, 'gawk' will
+ call it, and will never call the raw read function.
+
+ 'gawk' ships with a sample extension that reads directories,
+returning records for each entry in a directory (*note Extension Sample
+Readdir::). You may wish to use that code as a guide for writing your
+own input parser.
+
+ When writing an input parser, you should think about (and document)
+how it is expected to interact with 'awk' code. You may want it to
+always be called, and to take effect as appropriate (as the 'readdir'
+extension does). Or you may want it to take effect based upon the value
+of an 'awk' variable, as the XML extension from the 'gawkextlib' project
+does (*note gawkextlib::). In the latter case, code in a 'BEGINFILE'
+rule can look at 'FILENAME' and 'ERRNO' to decide whether or not to
+activate an input parser (*note BEGINFILE/ENDFILE::).
+
+ You register your input parser with the following function:
+
+'void register_input_parser(awk_input_parser_t *input_parser);'
+ Register the input parser pointed to by 'input_parser' with 'gawk'.
+
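+
+   To make the pieces concrete, here is a minimal sketch (not part of
+any shipped extension; all names prefixed with 'demo_' are
+hypothetical) of an input parser that takes over files whose names end
+in '.demo' and returns up to the first 8,192 bytes of each such file as
+a single record.  It assumes the usual extension boilerplate and
+headers ('<string.h>', '<unistd.h>', and '<errno.h>') are already in
+place, and keeps error handling to a bare minimum:
+
+     /* demo input parser --- illustrative sketch only */
+
+     static int
+     demo_get_record(char **out, struct awk_input *iobuf, int *errcode,
+                     char **rt_start, size_t *rt_len)
+     {
+         static char buf[8192];      /* one shared buffer; enough for a sketch */
+         ssize_t n;
+
+         if (iobuf->opaque != NULL)  /* the single record was already returned */
+             return EOF;
+
+         n = read(iobuf->fd, buf, sizeof(buf));
+         if (n <= 0) {
+             if (n < 0)
+                 *errcode = errno;
+             return EOF;
+         }
+
+         iobuf->opaque = buf;        /* mark this file as done */
+         *out = buf;
+         *rt_len = 0;                /* no record terminator */
+         return (int) n;
+     }
+
+     static awk_bool_t
+     demo_can_take_file(const awk_input_buf_t *iobuf)
+     {
+         size_t len = strlen(iobuf->name);
+
+         /* arbitrary rule for the sketch: file names ending in ".demo" */
+         return (len > 5 && strcmp(iobuf->name + len - 5, ".demo") == 0)
+                    ? awk_true : awk_false;
+     }
+
+     static awk_bool_t
+     demo_take_control_of(awk_input_buf_t *iobuf)
+     {
+         iobuf->get_record = demo_get_record;
+         return awk_true;
+     }
+
+     static awk_input_parser_t demo_parser = {
+         "demo", demo_can_take_file, demo_take_control_of, NULL
+     };
+
+The parser would then be registered from the extension's initialization
+function with 'register_input_parser(& demo_parser)'.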
+
+File: gawk.info, Node: Output Wrappers, Next: Two-way processors, Prev: Input Parsers, Up: Registration Functions
+
+16.4.5.5 Customized Output Wrappers
+...................................
+
+An "output wrapper" is the mirror image of an input parser. It allows
+an extension to take over the output to a file opened with the '>' or
+'>>' I/O redirection operators (*note Redirection::).
+
+ The output wrapper is very similar to the input parser structure:
+
+ typedef struct awk_output_wrapper {
+ const char *name; /* name of the wrapper */
+ awk_bool_t (*can_take_file)(const awk_output_buf_t *outbuf);
+ awk_bool_t (*take_control_of)(awk_output_buf_t *outbuf);
+ awk_const struct awk_output_wrapper *awk_const next; /* for gawk */
+ } awk_output_wrapper_t;
+
+ The members are as follows:
+
+'const char *name;'
+ This is the name of the output wrapper.
+
+'awk_bool_t (*can_take_file)(const awk_output_buf_t *outbuf);'
+ This points to a function that examines the information in the
+ 'awk_output_buf_t' structure pointed to by 'outbuf'. It should
+ return true if the output wrapper wants to take over the file, and
+ false otherwise. It should not change any state (variable values,
+ etc.) within 'gawk'.
+
+'awk_bool_t (*take_control_of)(awk_output_buf_t *outbuf);'
+ The function pointed to by this field is called when 'gawk' decides
+ to let the output wrapper take control of the file. It should fill
+ in appropriate members of the 'awk_output_buf_t' structure, as
+ described next, and return true if successful, false otherwise.
+
+'awk_const struct output_wrapper *awk_const next;'
+ This is for use by 'gawk'; therefore it is marked 'awk_const' so
+ that the extension cannot modify it.
+
+ The 'awk_output_buf_t' structure looks like this:
+
+ typedef struct awk_output_buf {
+ const char *name; /* name of output file */
+ const char *mode; /* mode argument to fopen */
+ FILE *fp; /* stdio file pointer */
+ awk_bool_t redirected; /* true if a wrapper is active */
+ void *opaque; /* for use by output wrapper */
+ size_t (*gawk_fwrite)(const void *buf, size_t size, size_t count,
+ FILE *fp, void *opaque);
+ int (*gawk_fflush)(FILE *fp, void *opaque);
+ int (*gawk_ferror)(FILE *fp, void *opaque);
+ int (*gawk_fclose)(FILE *fp, void *opaque);
+ } awk_output_buf_t;
+
+ Here too, your extension will define 'XXX_can_take_file()' and
+'XXX_take_control_of()' functions that examine and update data members
+in the 'awk_output_buf_t'. The data members are as follows:
+
+'const char *name;'
+ The name of the output file.
+
+'const char *mode;'
+ The mode string (as would be used in the second argument to
+ 'fopen()') with which the file was opened.
+
+'FILE *fp;'
+ The 'FILE' pointer from '<stdio.h>'. 'gawk' opens the file before
+ attempting to find an output wrapper.
+
+'awk_bool_t redirected;'
+ This field must be set to true by the 'XXX_take_control_of()'
+ function.
+
+'void *opaque;'
+ This pointer is opaque to 'gawk'. The extension should use it to
+ store a pointer to any private data associated with the file.
+
+'size_t (*gawk_fwrite)(const void *buf, size_t size, size_t count,'
+' FILE *fp, void *opaque);'
+'int (*gawk_fflush)(FILE *fp, void *opaque);'
+'int (*gawk_ferror)(FILE *fp, void *opaque);'
+'int (*gawk_fclose)(FILE *fp, void *opaque);'
+     These pointers should be set to point to functions that perform
+     the same task as the corresponding '<stdio.h>' functions, if
+     appropriate.  'gawk' uses these function pointers for all output.
+ 'gawk' initializes the pointers to point to internal "pass-through"
+ functions that just call the regular '<stdio.h>' functions, so an
+ extension only needs to redefine those functions that are
+ appropriate for what it does.
+
+ The 'XXX_can_take_file()' function should make a decision based upon
+the 'name' and 'mode' fields, and any additional state (such as 'awk'
+variable values) that is appropriate.
+
+ When 'gawk' calls 'XXX_take_control_of()', that function should fill
+in the other fields as appropriate, except for 'fp', which it should
+just use normally.
+
+ You register your output wrapper with the following function:
+
+'void register_output_wrapper(awk_output_wrapper_t *output_wrapper);'
+ Register the output wrapper pointed to by 'output_wrapper' with
+ 'gawk'.
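+
+   As an illustration, here is a comparable minimal sketch (again, the
+'demo_' names are hypothetical) of an output wrapper that takes over
+one fixed file name and prefixes every buffer written to it.  Only
+'gawk_fwrite' is replaced; the other function pointers keep the default
+pass-through behavior described above:
+
+     /* demo output wrapper --- illustrative sketch only */
+
+     static size_t
+     demo_fwrite(const void *buf, size_t size, size_t count,
+                 FILE *fp, void *opaque)
+     {
+         fputs("LOG: ", fp);                  /* decoration added by the wrapper */
+         return fwrite(buf, size, count, fp); /* then the data itself */
+     }
+
+     static awk_bool_t
+     demo_out_can_take_file(const awk_output_buf_t *outbuf)
+     {
+         /* arbitrary rule for the sketch: only this one file name */
+         return strcmp(outbuf->name, "/tmp/demo.log") == 0
+                    ? awk_true : awk_false;
+     }
+
+     static awk_bool_t
+     demo_out_take_control_of(awk_output_buf_t *outbuf)
+     {
+         outbuf->redirected = awk_true;
+         outbuf->gawk_fwrite = demo_fwrite;
+         return awk_true;
+     }
+
+     static awk_output_wrapper_t demo_wrapper = {
+         "demo_out", demo_out_can_take_file, demo_out_take_control_of, NULL
+     };
+
+As with input parsers, the wrapper is registered from the extension's
+initialization function, here with
+'register_output_wrapper(& demo_wrapper)'.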
+
+
+File: gawk.info, Node: Two-way processors, Prev: Output Wrappers, Up: Registration Functions
+
+16.4.5.6 Customized Two-way Processors
+......................................
+
+A "two-way processor" combines an input parser and an output wrapper for
+two-way I/O with the '|&' operator (*note Redirection::). It makes
+identical use of the 'awk_input_buf_t' and 'awk_output_buf_t'
+structures as described earlier.
+
+ A two-way processor is represented by the following structure:
+
+ typedef struct awk_two_way_processor {
+ const char *name; /* name of the two-way processor */
+ awk_bool_t (*can_take_two_way)(const char *name);
+ awk_bool_t (*take_control_of)(const char *name,
+ awk_input_buf_t *inbuf,
+ awk_output_buf_t *outbuf);
+ awk_const struct awk_two_way_processor *awk_const next; /* for gawk */
+ } awk_two_way_processor_t;
+
+ The fields are as follows:
+
+'const char *name;'
+ The name of the two-way processor.
+
+'awk_bool_t (*can_take_two_way)(const char *name);'
+ The function pointed to by this field should return true if it
+ wants to take over two-way I/O for this file name. It should not
+ change any state (variable values, etc.) within 'gawk'.
+
+'awk_bool_t (*take_control_of)(const char *name,'
+' awk_input_buf_t *inbuf,'
+' awk_output_buf_t *outbuf);'
+ The function pointed to by this field should fill in the
+ 'awk_input_buf_t' and 'awk_output_buf_t' structures pointed to by
+ 'inbuf' and 'outbuf', respectively. These structures were
+ described earlier.
+
+'awk_const struct two_way_processor *awk_const next;'
+ This is for use by 'gawk'; therefore it is marked 'awk_const' so
+ that the extension cannot modify it.
+
+   As with the input parser and output wrapper, you provide "yes I can
+take this" and "take over for this" functions, 'XXX_can_take_two_way()'
+and 'XXX_take_control_of()'.
+
+ You register your two-way processor with the following function:
+
+'void register_two_way_processor(awk_two_way_processor_t *two_way_processor);'
+ Register the two-way processor pointed to by 'two_way_processor'
+ with 'gawk'.
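+
+   A minimal sketch of a two-way processor (hypothetical 'demo_' names;
+real code would need more error checking) might take over one "magic"
+file name and simply open it for both reading and writing, assuming
+'<fcntl.h>', '<unistd.h>', and '<stdio.h>' are included:
+
+     /* demo two-way processor --- illustrative sketch only */
+
+     static awk_bool_t
+     demo_can_take_two_way(const char *name)
+     {
+         return strcmp(name, "/magic/two-way") == 0 ? awk_true : awk_false;
+     }
+
+     static awk_bool_t
+     demo_two_way_take_control(const char *name,
+                               awk_input_buf_t *inbuf,
+                               awk_output_buf_t *outbuf)
+     {
+         int fd = open(name, O_RDWR);
+
+         if (fd < 0)
+             return awk_false;
+
+         inbuf->fd = fd;          /* input side; the default read_func is used */
+         if ((outbuf->fp = fdopen(fd, "w")) == NULL) {
+             close(fd);
+             return awk_false;
+         }
+         outbuf->redirected = awk_true;
+         return awk_true;
+     }
+
+     static awk_two_way_processor_t demo_two_way = {
+         "demo2way", demo_can_take_two_way, demo_two_way_take_control, NULL
+     };
+
+The processor is then registered from the extension's initialization
+function with 'register_two_way_processor(& demo_two_way)'.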
+
+
+File: gawk.info, Node: Printing Messages, Next: Updating ERRNO, Prev: Registration Functions, Up: Extension API Description
+
+16.4.6 Printing Messages
+------------------------
+
+You can print different kinds of warning messages from your extension,
+as described here. Note that for these functions, you must pass in the
+extension ID received from 'gawk' when the extension was loaded:(1)
+
+'void fatal(awk_ext_id_t id, const char *format, ...);'
+ Print a message and then cause 'gawk' to exit immediately.
+
+'void nonfatal(awk_ext_id_t id, const char *format, ...);'
+ Print a nonfatal error message.
+
+'void warning(awk_ext_id_t id, const char *format, ...);'
+ Print a warning message.
+
+'void lintwarn(awk_ext_id_t id, const char *format, ...);'
+ Print a "lint warning." Normally this is the same as printing a
+ warning message, but if 'gawk' was invoked with '--lint=fatal',
+ then lint warnings become fatal error messages.
+
+ All of these functions are otherwise like the C 'printf()' family of
+functions, where the 'format' parameter is a string with literal
+characters and formatting codes intermixed.
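+
+   For example, an extension function (here a fragment from a
+hypothetical 'do_demo()') might report a usage problem like this:
+
+     if (nargs != 2) {
+         nonfatal(ext_id, "demo: expected 2 arguments, got %d", nargs);
+         return make_number(-1.0, result);
+     }
+
+Here 'ext_id' is the extension ID saved by the boilerplate code (*note
+Extension API Boilerplate::).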
+
+ ---------- Footnotes ----------
+
+ (1) Because the API uses only ISO C 90 features, it cannot make use
+of the ISO C 99 variadic macro feature to hide that parameter. More's
+the pity.
+
+
+File: gawk.info, Node: Updating ERRNO, Next: Requesting Values, Prev: Printing Messages, Up: Extension API Description
+
+16.4.7 Updating 'ERRNO'
+-----------------------
+
+The following functions allow you to update the 'ERRNO' variable:
+
+'void update_ERRNO_int(int errno_val);'
+ Set 'ERRNO' to the string equivalent of the error code in
+ 'errno_val'. The value should be one of the defined error codes in
+ '<errno.h>', and 'gawk' turns it into a (possibly translated)
+ string using the C 'strerror()' function.
+
+'void update_ERRNO_string(const char *string);'
+     Set 'ERRNO' directly to the string value of 'string'.  'gawk' makes
+ a copy of the value of 'string'.
+
+'void unset_ERRNO(void);'
+ Unset 'ERRNO'.
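+
+   For example, an extension function that wraps a system call might do
+something like the following sketch (where 'newdir' is a string
+obtained from the function's arguments):
+
+     if (chdir(newdir) < 0) {
+         update_ERRNO_int(errno);    /* make the awk-level ERRNO reflect errno */
+         return make_number(-1.0, result);
+     }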
+
+
+File: gawk.info, Node: Requesting Values, Next: Accessing Parameters, Prev: Updating ERRNO, Up: Extension API Description
+
+16.4.8 Requesting Values
+------------------------
+
+All of the functions that return values from 'gawk' work in the same
+way. You pass in an 'awk_valtype_t' value to indicate what kind of
+value you expect. If the actual value matches what you requested, the
+function returns true and fills in the 'awk_value_t' result. Otherwise,
+the function returns false, and the 'val_type' member indicates the type
+of the actual value. You may then print an error message or reissue the
+request for the actual value type, as appropriate. This behavior is
+summarized in *note Table 16.1: table-value-types-returned.
+
+                              Type of Actual Value
+--------------------------------------------------------------------
+                       String       Number      Array      Undefined
+--------------------------------------------------------------------
+           String      String       String      False      False
+           Number      Number if    Number      False      False
+                       can be
+                       converted,
+                       else false
+Type       Array       False        False       Array      False
+Requested  Scalar      Scalar       Scalar      False      False
+           Undefined   String       Number      Array      Undefined
+           Value       False        False       False      False
+           cookie
+
+Table 16.1: API value types returned
+
+
+File: gawk.info, Node: Accessing Parameters, Next: Symbol Table Access, Prev: Requesting Values, Up: Extension API Description
+
+16.4.9 Accessing and Updating Parameters
+----------------------------------------
+
+Two functions give you access to the arguments (parameters) passed to
+your extension function. They are:
+
+'awk_bool_t get_argument(size_t count,'
+' awk_valtype_t wanted,'
+' awk_value_t *result);'
+ Fill in the 'awk_value_t' structure pointed to by 'result' with the
+ 'count'th argument. Return true if the actual type matches
+ 'wanted', and false otherwise. In the latter case,
+ 'result->val_type' indicates the actual type (*note Table 16.1:
+ table-value-types-returned.). Counts are zero-based--the first
+     argument is numbered zero, the second is numbered one, and so on.
+     'wanted' indicates the type of value expected.
+
+'awk_bool_t set_argument(size_t count, awk_array_t array);'
+ Convert a parameter that was undefined into an array; this provides
+ call by reference for arrays. Return false if 'count' is too big,
+ or if the argument's type is not undefined. *Note Array
+ Manipulation:: for more information on creating arrays.
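+
+   For example, a fragment from a hypothetical extension function that
+prefers a numeric first argument, but falls back to the string form if
+the value cannot be converted, might look like this:
+
+     awk_value_t arg;
+
+     if (! get_argument(0, AWK_NUMBER, & arg)) {
+         if (arg.val_type == AWK_STRING) {
+             /* a string that does not look numeric: take it as a string */
+             (void) get_argument(0, AWK_STRING, & arg);
+         } else {
+             warning(ext_id, "demo: argument 0 has an unexpected type");
+             return make_number(-1.0, result);
+         }
+     }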
+
+
+File: gawk.info, Node: Symbol Table Access, Next: Array Manipulation, Prev: Accessing Parameters, Up: Extension API Description
+
+16.4.10 Symbol Table Access
+---------------------------
+
+Two sets of routines provide access to global variables, and one set
+allows you to create and release cached values.
+
+* Menu:
+
+* Symbol table by name:: Accessing variables by name.
+* Symbol table by cookie:: Accessing variables by "cookie".
+* Cached values:: Creating and using cached values.
+
+
+File: gawk.info, Node: Symbol table by name, Next: Symbol table by cookie, Up: Symbol Table Access
+
+16.4.10.1 Variable Access and Update by Name
+............................................
+
+The following routines provide the ability to access and update global
+'awk'-level variables by name. In compiler terminology, identifiers of
+different kinds are termed "symbols", thus the "sym" in the routines'
+names. The data structure that stores information about symbols is
+termed a "symbol table". The functions are as follows:
+
+'awk_bool_t sym_lookup(const char *name,'
+' awk_valtype_t wanted,'
+' awk_value_t *result);'
+ Fill in the 'awk_value_t' structure pointed to by 'result' with the
+ value of the variable named by the string 'name', which is a
+ regular C string. 'wanted' indicates the type of value expected.
+ Return true if the actual type matches 'wanted', and false
+ otherwise. In the latter case, 'result->val_type' indicates the
+ actual type (*note Table 16.1: table-value-types-returned.).
+
+'awk_bool_t sym_update(const char *name, awk_value_t *value);'
+ Update the variable named by the string 'name', which is a regular
+ C string. The variable is added to 'gawk''s symbol table if it is
+ not there. Return true if everything worked, and false otherwise.
+
+ Changing types (scalar to array or vice versa) of an existing
+ variable is _not_ allowed, nor may this routine be used to update
+ an array. This routine cannot be used to update any of the
+ predefined variables (such as 'ARGC' or 'NF').
+
+ An extension can look up the value of 'gawk''s special variables.
+However, with the exception of the 'PROCINFO' array, an extension cannot
+change any of those variables.
+
+ CAUTION: It is possible for the lookup of 'PROCINFO' to fail. This
+ happens if the 'awk' program being run does not reference
+ 'PROCINFO'; in this case, 'gawk' doesn't bother to create the array
+ and populate it.
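+
+   As a short illustration, the following fragment (using the
+hypothetical variable name 'demo_count') installs a scalar and later
+reads it back:
+
+     awk_value_t val;
+
+     /* create (or update) an awk-level variable named demo_count */
+     if (! sym_update("demo_count", make_number(0.0, & val)))
+         warning(ext_id, "demo: could not install demo_count");
+
+     /* later, fetch its current value as a number */
+     if (sym_lookup("demo_count", AWK_NUMBER, & val))
+         printf("demo_count is now %g\n", val.num_value);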
+
+
+File: gawk.info, Node: Symbol table by cookie, Next: Cached values, Prev: Symbol table by name, Up: Symbol Table Access
+
+16.4.10.2 Variable Access and Update by Cookie
+..............................................
+
+A "scalar cookie" is an opaque handle that provides access to a global
+variable or array. It is an optimization that avoids looking up
+variables in 'gawk''s symbol table every time access is needed. This
+was discussed earlier, in *note General Data Types::.
+
+ The following functions let you work with scalar cookies:
+
+'awk_bool_t sym_lookup_scalar(awk_scalar_t cookie,'
+' awk_valtype_t wanted,'
+' awk_value_t *result);'
+ Retrieve the current value of a scalar cookie. Once you have
+ obtained a scalar cookie using 'sym_lookup()', you can use this
+ function to get its value more efficiently. Return false if the
+ value cannot be retrieved.
+
+'awk_bool_t sym_update_scalar(awk_scalar_t cookie, awk_value_t *value);'
+ Update the value associated with a scalar cookie. Return false if
+ the new value is not of type 'AWK_STRING' or 'AWK_NUMBER'. Here
+ too, the predefined variables may not be updated.
+
+ It is not obvious at first glance how to work with scalar cookies or
+what their raison d'e^tre really is. In theory, the 'sym_lookup()' and
+'sym_update()' routines are all you really need to work with variables.
+For example, you might have code that looks up the value of a variable,
+evaluates a condition, and then possibly changes the value of the
+variable based on the result of that evaluation, like so:
+
+ /* do_magic --- do something really great */
+
+ static awk_value_t *
+ do_magic(int nargs, awk_value_t *result)
+ {
+ awk_value_t value;
+
+ if ( sym_lookup("MAGIC_VAR", AWK_NUMBER, & value)
+ && some_condition(value.num_value)) {
+ value.num_value += 42;
+ sym_update("MAGIC_VAR", & value);
+ }
+
+ return make_number(0.0, result);
+ }
+
+This code looks (and is) simple and straightforward. So what's the
+problem?
+
+ Well, consider what happens if 'awk'-level code associated with your
+extension calls the 'magic()' function (implemented in C by
+'do_magic()'), once per record, while processing hundreds of thousands
+or millions of records. The 'MAGIC_VAR' variable is looked up in the
+symbol table once or twice per function call!
+
+ The symbol table lookup is really pure overhead; it is considerably
+more efficient to get a cookie that represents the variable, and use
+that to get the variable's value and update it as needed.(1)
+
+ Thus, the way to use cookies is as follows. First, install your
+extension's variable in 'gawk''s symbol table using 'sym_update()', as
+usual. Then get a scalar cookie for the variable using 'sym_lookup()':
+
+ static awk_scalar_t magic_var_cookie; /* cookie for MAGIC_VAR */
+
+ static void
+ my_extension_init()
+ {
+ awk_value_t value;
+
+ /* install initial value */
+ sym_update("MAGIC_VAR", make_number(42.0, & value));
+
+ /* get the cookie */
+ sym_lookup("MAGIC_VAR", AWK_SCALAR, & value);
+
+ /* save the cookie */
+ magic_var_cookie = value.scalar_cookie;
+ ...
+ }
+
+ Next, use the routines in this minor node for retrieving and updating
+the value through the cookie. Thus, 'do_magic()' now becomes something
+like this:
+
+ /* do_magic --- do something really great */
+
+ static awk_value_t *
+ do_magic(int nargs, awk_value_t *result)
+ {
+ awk_value_t value;
+
+ if ( sym_lookup_scalar(magic_var_cookie, AWK_NUMBER, & value)
+ && some_condition(value.num_value)) {
+ value.num_value += 42;
+ sym_update_scalar(magic_var_cookie, & value);
+ }
+ ...
+
+ return make_number(0.0, result);
+ }
+
+ NOTE: The previous code omitted error checking for presentation
+ purposes. Your extension code should be more robust and carefully
+ check the return values from the API functions.
+
+ ---------- Footnotes ----------
+
+ (1) The difference is measurable and quite real. Trust us.
+
+
+File: gawk.info, Node: Cached values, Prev: Symbol table by cookie, Up: Symbol Table Access
+
+16.4.10.3 Creating and Using Cached Values
+..........................................
+
+The routines in this minor node allow you to create and release cached
+values. Like scalar cookies, in theory, cached values are not
+necessary. You can create numbers and strings using the functions in
+*note Constructor Functions::. You can then assign those values to
+variables using 'sym_update()' or 'sym_update_scalar()', as you like.
+
+ However, you can understand the point of cached values if you
+remember that _every_ string value's storage _must_ come from
+'gawk_malloc()', 'gawk_calloc()', or 'gawk_realloc()'. If you have 20
+variables, all of which have the same string value, you must create 20
+identical copies of the string.(1)
+
+ It is clearly more efficient, if possible, to create a value once,
+and then tell 'gawk' to reuse the value for multiple variables. That is
+what the routines in this minor node let you do. The functions are as
+follows:
+
+'awk_bool_t create_value(awk_value_t *value, awk_value_cookie_t *result);'
+ Create a cached string or numeric value from 'value' for efficient
+ later assignment. Only values of type 'AWK_NUMBER' and
+ 'AWK_STRING' are allowed. Any other type is rejected.
+ 'AWK_UNDEFINED' could be allowed, but doing so would result in
+ inferior performance.
+
+'awk_bool_t release_value(awk_value_cookie_t vc);'
+ Release the memory associated with a value cookie obtained from
+ 'create_value()'.
+
+ You use value cookies in a fashion similar to the way you use scalar
+cookies. In the extension initialization routine, you create the value
+cookie:
+
+ static awk_value_cookie_t answer_cookie; /* static value cookie */
+
+ static void
+ my_extension_init()
+ {
+ awk_value_t value;
+ char *long_string;
+ size_t long_string_len;
+
+ /* code from earlier */
+ ...
+ /* ... fill in long_string and long_string_len ... */
+ make_malloced_string(long_string, long_string_len, & value);
+ create_value(& value, & answer_cookie); /* create cookie */
+ ...
+ }
+
+ Once the value is created, you can use it as the value of any number
+of variables:
+
+ static awk_value_t *
+ do_magic(int nargs, awk_value_t *result)
+ {
+         awk_value_t value;
+
+ ... /* as earlier */
+
+ value.val_type = AWK_VALUE_COOKIE;
+ value.value_cookie = answer_cookie;
+ sym_update("VAR1", & value);
+ sym_update("VAR2", & value);
+ ...
+ sym_update("VAR100", & value);
+ ...
+ }
+
+Using value cookies in this way saves considerable storage, as all of
+'VAR1' through 'VAR100' share the same value.
+
+ You might be wondering, "Is this sharing problematic? What happens
+if 'awk' code assigns a new value to 'VAR1'; are all the others changed
+too?"
+
+ That's a great question. The answer is that no, it's not a problem.
+Internally, 'gawk' uses "reference-counted strings". This means that
+many variables can share the same string value, and 'gawk' keeps track
+of the usage. When a variable's value changes, 'gawk' simply decrements
+the reference count on the old value and updates the variable to use the
+new value.
+
+ Finally, as part of your cleanup action (*note Exit Callback
+Functions::) you should release any cached values that you created,
+using 'release_value()'.
+
+ ---------- Footnotes ----------
+
+ (1) Numeric values are clearly less problematic, requiring only a C
+'double' to store.
+
+
+File: gawk.info, Node: Array Manipulation, Next: Redirection API, Prev: Symbol Table Access, Up: Extension API Description
+
+16.4.11 Array Manipulation
+--------------------------
+
+The primary data structure(1) in 'awk' is the associative array (*note
+Arrays::). Extensions need to be able to manipulate 'awk' arrays. The
+API provides a number of data structures for working with arrays,
+functions for working with individual elements, and functions for
+working with arrays as a whole. This includes the ability to "flatten"
+an array so that it is easy for C code to traverse every element in an
+array. The array data structures integrate nicely with the data
+structures for values to make it easy to both work with and create true
+arrays of arrays (*note General Data Types::).
+
+* Menu:
+
+* Array Data Types:: Data types for working with arrays.
+* Array Functions:: Functions for working with arrays.
+* Flattening Arrays:: How to flatten arrays.
+* Creating Arrays:: How to create and populate arrays.
+
+ ---------- Footnotes ----------
+
+ (1) OK, the only data structure.
+
+
+File: gawk.info, Node: Array Data Types, Next: Array Functions, Up: Array Manipulation
+
+16.4.11.1 Array Data Types
+..........................
+
+The data types associated with arrays are as follows:
+
+'typedef void *awk_array_t;'
+ If you request the value of an array variable, you get back an
+ 'awk_array_t' value. This value is opaque(1) to the extension; it
+ uniquely identifies the array but can only be used by passing it
+ into API functions or receiving it from API functions. This is
+     very similar to the way 'FILE *' values are used with the
+     '<stdio.h>' library routines.
+
+'typedef struct awk_element {'
+' /* convenience linked list pointer, not used by gawk */'
+' struct awk_element *next;'
+' enum {'
+' AWK_ELEMENT_DEFAULT = 0, /* set by gawk */'
+' AWK_ELEMENT_DELETE = 1 /* set by extension */'
+' } flags;'
+' awk_value_t index;'
+' awk_value_t value;'
+'} awk_element_t;'
+ The 'awk_element_t' is a "flattened" array element. 'awk' produces
+ an array of these inside the 'awk_flat_array_t' (see the next
+ item). Individual elements may be marked for deletion. New
+ elements must be added individually, one at a time, using the
+ separate API for that purpose. The fields are as follows:
+
+ 'struct awk_element *next;'
+ This pointer is for the convenience of extension writers. It
+ allows an extension to create a linked list of new elements
+ that can then be added to an array in a loop that traverses
+ the list.
+
+ 'enum { ... } flags;'
+ A set of flag values that convey information between the
+ extension and 'gawk'. Currently there is only one:
+ 'AWK_ELEMENT_DELETE'. Setting it causes 'gawk' to delete the
+ element from the original array upon release of the flattened
+ array.
+
+ 'index'
+ 'value'
+ The index and value of the element, respectively. _All_
+ memory pointed to by 'index' and 'value' belongs to 'gawk'.
+
+'typedef struct awk_flat_array {'
+' awk_const void *awk_const opaque1; /* for use by gawk */'
+' awk_const void *awk_const opaque2; /* for use by gawk */'
+' awk_const size_t count; /* how many elements */'
+' awk_element_t elements[1]; /* will be extended */'
+'} awk_flat_array_t;'
+ This is a flattened array. When an extension gets one of these
+ from 'gawk', the 'elements' array is of actual size 'count'. The
+ 'opaque1' and 'opaque2' pointers are for use by 'gawk'; therefore
+ they are marked 'awk_const' so that the extension cannot modify
+ them.
+
+ ---------- Footnotes ----------
+
+ (1) It is also a "cookie," but the 'gawk' developers did not wish to
+overuse this term.
+
+
+File: gawk.info, Node: Array Functions, Next: Flattening Arrays, Prev: Array Data Types, Up: Array Manipulation
+
+16.4.11.2 Array Functions
+.........................
+
+The following functions relate to individual array elements:
+
+'awk_bool_t get_element_count(awk_array_t a_cookie, size_t *count);'
+ For the array represented by 'a_cookie', place in '*count' the
+ number of elements it contains. A subarray counts as a single
+ element. Return false if there is an error.
+
+'awk_bool_t get_array_element(awk_array_t a_cookie,'
+' const awk_value_t *const index,'
+' awk_valtype_t wanted,'
+' awk_value_t *result);'
+ For the array represented by 'a_cookie', return in '*result' the
+ value of the element whose index is 'index'. 'wanted' specifies
+ the type of value you wish to retrieve. Return false if 'wanted'
+ does not match the actual type or if 'index' is not in the array
+ (*note Table 16.1: table-value-types-returned.).
+
+ The value for 'index' can be numeric, in which case 'gawk' converts
+ it to a string. Using nonintegral values is possible, but requires
+ that you understand how such values are converted to strings (*note
+ Conversion::); thus, using integral values is safest.
+
+ As with _all_ strings passed into 'gawk' from an extension, the
+ string value of 'index' must come from 'gawk_malloc()',
+ 'gawk_calloc()', or 'gawk_realloc()', and 'gawk' releases the
+ storage.
+
+'awk_bool_t set_array_element(awk_array_t a_cookie,'
+' const awk_value_t *const index,'
+' const awk_value_t *const value);'
+ In the array represented by 'a_cookie', create or modify the
+ element whose index is given by 'index'. The 'ARGV' and 'ENVIRON'
+ arrays may not be changed, although the 'PROCINFO' array can be.
+
+'awk_bool_t set_array_element_by_elem(awk_array_t a_cookie,'
+' awk_element_t element);'
+ Like 'set_array_element()', but take the 'index' and 'value' from
+ 'element'. This is a convenience macro.
+
+'awk_bool_t del_array_element(awk_array_t a_cookie,'
+' const awk_value_t* const index);'
+ Remove the element with the given index from the array represented
+ by 'a_cookie'. Return true if the element was removed, or false if
+ the element did not exist in the array.
+
+ The following functions relate to arrays as a whole:
+
+'awk_array_t create_array(void);'
+ Create a new array to which elements may be added. *Note Creating
+ Arrays:: for a discussion of how to create a new array and add
+ elements to it.
+
+'awk_bool_t clear_array(awk_array_t a_cookie);'
+ Clear the array represented by 'a_cookie'. Return false if there
+ was some kind of problem, true otherwise. The array remains an
+ array, but after calling this function, it has no elements. This
+ is equivalent to using the 'delete' statement (*note Delete::).
+
+'awk_bool_t flatten_array(awk_array_t a_cookie, awk_flat_array_t **data);'
+ For the array represented by 'a_cookie', create an
+ 'awk_flat_array_t' structure and fill it in. Set the pointer whose
+ address is passed as 'data' to point to this structure. Return
+ true upon success, or false otherwise. *Note Flattening Arrays::,
+ for a discussion of how to flatten an array and work with it.
+
+'awk_bool_t release_flattened_array(awk_array_t a_cookie,'
+' awk_flat_array_t *data);'
+ When done with a flattened array, release the storage using this
+ function. You must pass in both the original array cookie and the
+ address of the created 'awk_flat_array_t' structure. The function
+ returns true upon success, false otherwise.
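+
+   As a small example, the following fragment (a sketch; 'a_cookie' is
+assumed to hold an array cookie obtained from 'sym_lookup()' or
+'get_argument()') reports how many elements an array has and then
+empties it:
+
+     size_t count;
+
+     if (! get_element_count(a_cookie, & count)) {
+         warning(ext_id, "demo: get_element_count failed");
+     } else if (count > 0) {
+         warning(ext_id, "demo: clearing %lu elements",
+                 (unsigned long) count);
+         (void) clear_array(a_cookie);
+     }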
+
+
+File: gawk.info, Node: Flattening Arrays, Next: Creating Arrays, Prev: Array Functions, Up: Array Manipulation
+
+16.4.11.3 Working With All The Elements of an Array
+...................................................
+
+To "flatten" an array is to create a structure that represents the full
+array in a fashion that makes it easy for C code to traverse the entire
+array. Some of the code in 'extension/testext.c' does this, and also
+serves as a nice example showing how to use the APIs.
+
+ We walk through that part of the code one step at a time. First, the
+'gawk' script that drives the test extension:
+
+ @load "testext"
+ BEGIN {
+ n = split("blacky rusty sophie raincloud lucky", pets)
+ printf("pets has %d elements\n", length(pets))
+ ret = dump_array_and_delete("pets", "3")
+ printf("dump_array_and_delete(pets) returned %d\n", ret)
+ if ("3" in pets)
+ printf("dump_array_and_delete() did NOT remove index \"3\"!\n")
+ else
+ printf("dump_array_and_delete() did remove index \"3\"!\n")
+ print ""
+ }
+
+This code creates an array with 'split()' (*note String Functions::) and
+then calls 'dump_array_and_delete()'. That function looks up the array
+whose name is passed as the first argument, and deletes the element at
+the index passed in the second argument. The 'awk' code then prints the
+return value and checks if the element was indeed deleted. Here is the
+C code that implements 'dump_array_and_delete()'. It has been edited
+slightly for presentation.
+
+ The first part declares variables, sets up the default return value
+in 'result', and checks that the function was called with the correct
+number of arguments:
+
+ static awk_value_t *
+ dump_array_and_delete(int nargs, awk_value_t *result)
+ {
+ awk_value_t value, value2, value3;
+ awk_flat_array_t *flat_array;
+ size_t count;
+ char *name;
+ int i;
+
+ assert(result != NULL);
+ make_number(0.0, result);
+
+ if (nargs != 2) {
+ printf("dump_array_and_delete: nargs not right "
+ "(%d should be 2)\n", nargs);
+ goto out;
+ }
+
+ The function then proceeds in steps, as follows. First, retrieve the
+name of the array, passed as the first argument, followed by the array
+itself. If either operation fails, print an error message and return:
+
+ /* get argument named array as flat array and print it */
+ if (get_argument(0, AWK_STRING, & value)) {
+ name = value.str_value.str;
+ if (sym_lookup(name, AWK_ARRAY, & value2))
+ printf("dump_array_and_delete: sym_lookup of %s passed\n",
+ name);
+ else {
+ printf("dump_array_and_delete: sym_lookup of %s failed\n",
+ name);
+ goto out;
+ }
+ } else {
+ printf("dump_array_and_delete: get_argument(0) failed\n");
+ goto out;
+ }
+
+ For testing purposes and to make sure that the C code sees the same
+number of elements as the 'awk' code, the second step is to get the
+count of elements in the array and print it:
+
+ if (! get_element_count(value2.array_cookie, & count)) {
+ printf("dump_array_and_delete: get_element_count failed\n");
+ goto out;
+ }
+
+ printf("dump_array_and_delete: incoming size is %lu\n",
+ (unsigned long) count);
+
+ The third step is to actually flatten the array, and then to
+double-check that the count in the 'awk_flat_array_t' is the same as the
+count just retrieved:
+
+ if (! flatten_array(value2.array_cookie, & flat_array)) {
+ printf("dump_array_and_delete: could not flatten array\n");
+ goto out;
+ }
+
+ if (flat_array->count != count) {
+ printf("dump_array_and_delete: flat_array->count (%lu)"
+ " != count (%lu)\n",
+ (unsigned long) flat_array->count,
+ (unsigned long) count);
+ goto out;
+ }
+
+ The fourth step is to retrieve the index of the element to be
+deleted, which was passed as the second argument. Remember that
+argument counts passed to 'get_argument()' are zero-based, and thus the
+second argument is numbered one:
+
+ if (! get_argument(1, AWK_STRING, & value3)) {
+ printf("dump_array_and_delete: get_argument(1) failed\n");
+ goto out;
+ }
+
+ The fifth step is where the "real work" is done. The function loops
+over every element in the array, printing the index and element values.
+In addition, upon finding the element with the index that is supposed to
+be deleted, the function sets the 'AWK_ELEMENT_DELETE' bit in the
+'flags' field of the element. When the array is released, 'gawk'
+traverses the flattened array, and deletes any elements that have this
+flag bit set:
+
+ for (i = 0; i < flat_array->count; i++) {
+ printf("\t%s[\"%.*s\"] = %s\n",
+ name,
+ (int) flat_array->elements[i].index.str_value.len,
+ flat_array->elements[i].index.str_value.str,
+ valrep2str(& flat_array->elements[i].value));
+
+ if (strcmp(value3.str_value.str,
+ flat_array->elements[i].index.str_value.str) == 0) {
+ flat_array->elements[i].flags |= AWK_ELEMENT_DELETE;
+ printf("dump_array_and_delete: marking element \"%s\" "
+ "for deletion\n",
+ flat_array->elements[i].index.str_value.str);
+ }
+ }
+
+ The sixth step is to release the flattened array. This tells 'gawk'
+that the extension is no longer using the array, and that it should
+delete any elements marked for deletion. 'gawk' also frees any storage
+that was allocated, so you should not use the pointer ('flat_array' in
+this code) once you have called 'release_flattened_array()':
+
+ if (! release_flattened_array(value2.array_cookie, flat_array)) {
+ printf("dump_array_and_delete: could not release flattened array\n");
+ goto out;
+ }
+
+ Finally, because everything was successful, the function sets the
+return value to success, and returns:
+
+ make_number(1.0, result);
+ out:
+ return result;
+ }
+
+ Here is the output from running this part of the test:
+
+ pets has 5 elements
+ dump_array_and_delete: sym_lookup of pets passed
+ dump_array_and_delete: incoming size is 5
+ pets["1"] = "blacky"
+ pets["2"] = "rusty"
+ pets["3"] = "sophie"
+ dump_array_and_delete: marking element "3" for deletion
+ pets["4"] = "raincloud"
+ pets["5"] = "lucky"
+ dump_array_and_delete(pets) returned 1
+ dump_array_and_delete() did remove index "3"!
+
+
+File: gawk.info, Node: Creating Arrays, Prev: Flattening Arrays, Up: Array Manipulation
+
+16.4.11.4 How To Create and Populate Arrays
+...........................................
+
+Besides working with arrays created by 'awk' code, you can create arrays
+and populate them as you see fit, and then 'awk' code can access them
+and manipulate them.
+
+ There are two important points about creating arrays from extension
+code:
+
+ * You must install a new array into 'gawk''s symbol table immediately
+ upon creating it. Once you have done so, you can then populate the
+ array.
+
+ Similarly, if installing a new array as a subarray of an existing
+ array, you must add the new array to its parent before adding any
+ elements to it.
+
+ Thus, the correct way to build an array is to work "top down."
+ Create the array, and immediately install it in 'gawk''s symbol
+ table using 'sym_update()', or install it as an element in a
+ previously existing array using 'set_array_element()'. We show
+ example code shortly.
+
+ * Due to 'gawk' internals, after using 'sym_update()' to install an
+ array into 'gawk', you have to retrieve the array cookie from the
+ value passed in to 'sym_update()' before doing anything else with
+ it, like so:
+
+ awk_value_t val;
+ awk_array_t new_array;
+
+ new_array = create_array();
+ val.val_type = AWK_ARRAY;
+ val.array_cookie = new_array;
+
+ /* install array in the symbol table */
+ sym_update("array", & val);
+
+ new_array = val.array_cookie; /* YOU MUST DO THIS */
+
+ If installing an array as a subarray, you must also retrieve the
+     value of the array cookie after the call to 'set_array_element()'.
+
+ The following C code is a simple test extension to create an array
+with two regular elements and with a subarray. The leading '#include'
+directives and boilerplate variable declarations (*note Extension API
+Boilerplate::) are omitted for brevity. The first step is to create a
+new array and then install it in the symbol table:
+
+ /* create_new_array --- create a named array */
+
+ static void
+ create_new_array()
+ {
+ awk_array_t a_cookie;
+ awk_array_t subarray;
+ awk_value_t index, value;
+
+ a_cookie = create_array();
+ value.val_type = AWK_ARRAY;
+ value.array_cookie = a_cookie;
+
+ if (! sym_update("new_array", & value))
+ printf("create_new_array: sym_update(\"new_array\") failed!\n");
+ a_cookie = value.array_cookie;
+
+Note how 'a_cookie' is reset from the 'array_cookie' field in the
+'value' structure.
+
+ The second step is to install two regular values into 'new_array':
+
+ (void) make_const_string("hello", 5, & index);
+ (void) make_const_string("world", 5, & value);
+ if (! set_array_element(a_cookie, & index, & value)) {
+             printf("create_new_array: set_array_element failed\n");
+ return;
+ }
+
+ (void) make_const_string("answer", 6, & index);
+ (void) make_number(42.0, & value);
+ if (! set_array_element(a_cookie, & index, & value)) {
+             printf("create_new_array: set_array_element failed\n");
+ return;
+ }
+
+ The third step is to create the subarray and install it:
+
+ (void) make_const_string("subarray", 8, & index);
+ subarray = create_array();
+ value.val_type = AWK_ARRAY;
+ value.array_cookie = subarray;
+ if (! set_array_element(a_cookie, & index, & value)) {
+             printf("create_new_array: set_array_element failed\n");
+ return;
+ }
+ subarray = value.array_cookie;
+
+ The final step is to populate the subarray with its own element:
+
+ (void) make_const_string("foo", 3, & index);
+ (void) make_const_string("bar", 3, & value);
+ if (! set_array_element(subarray, & index, & value)) {
+             printf("create_new_array: set_array_element failed\n");
+ return;
+ }
+ }
+
+ Here is a sample script that loads the extension and then dumps the
+array:
+
+ @load "subarray"
+
+ function dumparray(name, array, i)
+ {
+ for (i in array)
+ if (isarray(array[i]))
+ dumparray(name "[\"" i "\"]", array[i])
+ else
+ printf("%s[\"%s\"] = %s\n", name, i, array[i])
+ }
+
+ BEGIN {
+ dumparray("new_array", new_array);
+ }
+
+ Here is the result of running the script:
+
+ $ AWKLIBPATH=$PWD ./gawk -f subarray.awk
+ -| new_array["subarray"]["foo"] = bar
+ -| new_array["hello"] = world
+ -| new_array["answer"] = 42
+
+(*Note Finding Extensions:: for more information on the 'AWKLIBPATH'
+environment variable.)
+
+
+File: gawk.info, Node: Redirection API, Next: Extension API Variables, Prev: Array Manipulation, Up: Extension API Description
+
+16.4.12 Accessing and Manipulating Redirections
+-----------------------------------------------
+
+The following function allows extensions to access and manipulate
+redirections.
+
+'awk_bool_t get_file(const char *name,'
+' size_t name_len,'
+' const char *filetype,'
+' int fd,'
+' const awk_input_buf_t **ibufp,'
+' const awk_output_buf_t **obufp);'
+ Look up a file in 'gawk''s internal redirection table. If 'name'
+ is 'NULL' or 'name_len' is zero, return data for the currently open
+ input file corresponding to 'FILENAME'. (This does not access the
+     'filetype' argument, so that may be undefined.)  If the file is not
+ already open, attempt to open it. The 'filetype' argument must be
+ zero-terminated and should be one of:
+
+ '">"'
+ A file opened for output.
+
+ '">>"'
+ A file opened for append.
+
+ '"<"'
+ A file opened for input.
+
+ '"|>"'
+ A pipe opened for output.
+
+ '"|<"'
+ A pipe opened for input.
+
+ '"|&"'
+ A two-way coprocess.
+
+ On error, return a 'false' value. Otherwise, return 'true', and
+ return additional information about the redirection in the 'ibufp'
+ and 'obufp' pointers. For input redirections, the '*ibufp' value
+ should be non-'NULL', and '*obufp' should be 'NULL'. For output
+ redirections, the '*obufp' value should be non-'NULL', and '*ibufp'
+ should be 'NULL'. For two-way coprocesses, both values should be
+ non-'NULL'.
+
+ In the usual case, the extension is interested in '(*ibufp)->fd'
+ and/or 'fileno((*obufp)->fp)'. If the file is not already open,
+ and the 'fd' argument is non-negative, 'gawk' will use that file
+ descriptor instead of opening the file in the usual way. If 'fd'
+     is non-negative, but the file is already open, 'gawk' ignores 'fd'
+ and returns the existing file. It is the caller's responsibility
+ to notice that neither the 'fd' in the returned 'awk_input_buf_t'
+ nor the 'fd' in the returned 'awk_output_buf_t' matches the
+ requested value.
+
+ Note that supplying a file descriptor is currently _not_ supported
+ for pipes. However, supplying a file descriptor should work for
+ input, output, append, and two-way (coprocess) sockets. If
+ 'filetype' is two-way, 'gawk' assumes that it is a socket! Note
+ that in the two-way case, the input and output file descriptors may
+ differ. To check for success, you must check whether either
+ matches.
+
+ It is anticipated that this API function will be used to implement
+I/O multiplexing and a socket library.
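+
+   For instance, an extension might use 'get_file()' along the
+following lines to obtain the file descriptor of an output file (the
+name '/tmp/demo.out' is hypothetical, and error checking is minimal):
+
+     const awk_input_buf_t *ibuf = NULL;
+     const awk_output_buf_t *obuf = NULL;
+     const char *fname = "/tmp/demo.out";
+
+     /* open (or find) the file for output; -1 means no descriptor supplied */
+     if (get_file(fname, strlen(fname), ">", -1, & ibuf, & obuf)
+         && obuf != NULL)
+         printf("output file descriptor is %d\n", fileno(obuf->fp));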
+
+
+File: gawk.info, Node: Extension API Variables, Next: Extension API Boilerplate, Prev: Redirection API, Up: Extension API Description
+
+16.4.13 API Variables
+---------------------
+
+The API provides two sets of variables. The first provides information
+about the version of the API (both with which the extension was
+compiled, and with which 'gawk' was compiled). The second provides
+information about how 'gawk' was invoked.
+
+* Menu:
+
+* Extension Versioning:: API Version information.
+* Extension API Informational Variables:: Variables providing information about
+ 'gawk''s invocation.
+
+
+File: gawk.info, Node: Extension Versioning, Next: Extension API Informational Variables, Up: Extension API Variables
+
+16.4.13.1 API Version Constants and Variables
+.............................................
+
+The API provides both a "major" and a "minor" version number. The API
+versions are available at compile time as C preprocessor defines to
+support conditional compilation, and as enum constants to facilitate
+debugging:
+
+API Version C preprocessor define enum constant
+---------------------------------------------------------------------------
+Major gawk_api_major_version GAWK_API_MAJOR_VERSION
+Minor gawk_api_minor_version GAWK_API_MINOR_VERSION
+
+Table 16.2: gawk API version constants
+
+ The minor version increases when new functions are added to the API.
+Such new functions are always added to the end of the API 'struct'.
+
+ The major version increases (and the minor version is reset to zero)
+if any of the data types change size or member order, or if any of the
+existing functions change signature.
+
+ It could happen that an extension may be compiled against one version
+of the API but loaded by a version of 'gawk' using a different version.
+For this reason, the major and minor API versions of the running 'gawk'
+are included in the API 'struct' as read-only constant integers:
+
+'api->major_version'
+ The major version of the running 'gawk'
+
+'api->minor_version'
+ The minor version of the running 'gawk'
+
+ It is up to the extension to decide if there are API
+incompatibilities. Typically, a check like this is enough:
+
+ if (api->major_version != GAWK_API_MAJOR_VERSION
+ || api->minor_version < GAWK_API_MINOR_VERSION) {
+ fprintf(stderr, "foo_extension: version mismatch with gawk!\n");
+ fprintf(stderr, "\tmy version (%d, %d), gawk version (%d, %d)\n",
+ GAWK_API_MAJOR_VERSION, GAWK_API_MINOR_VERSION,
+ api->major_version, api->minor_version);
+ exit(1);
+ }
+
+ Such code is included in the boilerplate 'dl_load_func()' macro
+provided in 'gawkapi.h' (discussed in *note Extension API
+Boilerplate::).
+
+
+File: gawk.info, Node: Extension API Informational Variables, Prev: Extension Versioning, Up: Extension API Variables
+
+16.4.13.2 Informational Variables
+.................................
+
+The API provides access to several variables that describe whether the
+corresponding command-line options were enabled when 'gawk' was invoked.
+The variables are:
+
+'do_debug'
+     This variable is true if 'gawk' was invoked with the '--debug'
+     option.
+
+'do_lint'
+     This variable is true if 'gawk' was invoked with the '--lint'
+     option.
+
+'do_mpfr'
+     This variable is true if 'gawk' was invoked with the '--bignum'
+     option.
+
+'do_profile'
+     This variable is true if 'gawk' was invoked with the '--profile'
+ option.
+
+'do_sandbox'
+     This variable is true if 'gawk' was invoked with the '--sandbox'
+ option.
+
+'do_traditional'
+     This variable is true if 'gawk' was invoked with the
+     '--traditional' option.
+
+ The value of 'do_lint' can change if 'awk' code modifies the 'LINT'
+predefined variable (*note Built-in Variables::). The others should not
+change during execution.
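+
+   For example, an extension might emit an extra diagnostic only when
+lint checking was requested:
+
+     if (do_lint)
+         lintwarn(ext_id, "demo: this function is deprecated");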
+
+
+File: gawk.info, Node: Extension API Boilerplate, Prev: Extension API Variables, Up: Extension API Description
+
+16.4.14 Boilerplate Code
+------------------------
+
+As mentioned earlier (*note Extension Mechanism Outline::), the function
+definitions as presented are really macros. To use these macros, your
+extension must provide a small amount of boilerplate code (variables and
+functions) toward the top of your source file, using predefined names as
+described here. The boilerplate needed is also provided in comments in
+the 'gawkapi.h' header file:
+
+ /* Boilerplate code: */
+ int plugin_is_GPL_compatible;
+
+ static gawk_api_t *const api;
+ static awk_ext_id_t ext_id;
+ static const char *ext_version = NULL; /* or ... = "some string" */
+
+ static awk_ext_func_t func_table[] = {
+ { "name", do_name, 1 },
+ /* ... */
+ };
+
+ /* EITHER: */
+
+ static awk_bool_t (*init_func)(void) = NULL;
+
+ /* OR: */
+
+ static awk_bool_t
+ init_my_extension(void)
+ {
+ ...
+ }
+
+ static awk_bool_t (*init_func)(void) = init_my_extension;
+
+ dl_load_func(func_table, some_name, "name_space_in_quotes")
+
+ These variables and functions are as follows:
+
+'int plugin_is_GPL_compatible;'
+ This asserts that the extension is compatible with the GNU GPL
+ (*note Copying::). If your extension does not have this, 'gawk'
+ will not load it (*note Plugin License::).
+
+'static gawk_api_t *const api;'
+ This global 'static' variable should be set to point to the
+ 'gawk_api_t' pointer that 'gawk' passes to your 'dl_load()'
+ function. This variable is used by all of the macros.
+
+'static awk_ext_id_t ext_id;'
+ This global static variable should be set to the 'awk_ext_id_t'
+ value that 'gawk' passes to your 'dl_load()' function. This
+ variable is used by all of the macros.
+
+'static const char *ext_version = NULL; /* or ... = "some string" */'
+ This global 'static' variable should be set either to 'NULL', or to
+ point to a string giving the name and version of your extension.
+
+'static awk_ext_func_t func_table[] = { ... };'
+ This is an array of one or more 'awk_ext_func_t' structures, as
+ described earlier (*note Extension Functions::). It can then be
+ looped over for multiple calls to 'add_ext_func()'.
+
+'static awk_bool_t (*init_func)(void) = NULL;'
+' OR'
+'static awk_bool_t init_my_extension(void) { ... }'
+'static awk_bool_t (*init_func)(void) = init_my_extension;'
+ If you need to do some initialization work, you should define a
+ function that does it (creates variables, opens files, etc.) and
+ then define the 'init_func' pointer to point to your function. The
+ function should return 'awk_false' upon failure, or 'awk_true' if
+ everything goes well.
+
+ If you don't need to do any initialization, define the pointer and
+ initialize it to 'NULL'.
+
+'dl_load_func(func_table, some_name, "name_space_in_quotes")'
+ This macro expands to a 'dl_load()' function that performs all the
+ necessary initializations.
+
+ The point of all the variables and arrays is to let the 'dl_load()'
+function (from the 'dl_load_func()' macro) do all the standard work. It
+does the following:
+
+ 1. Check the API versions. If the extension major version does not
+ match 'gawk''s, or if the extension minor version is greater than
+ 'gawk''s, it prints a fatal error message and exits.
+
+ 2. Load the functions defined in 'func_table'. If any of them fails
+ to load, it prints a warning message but continues on.
+
+ 3. If the 'init_func' pointer is not 'NULL', call the function it
+ points to. If it returns 'awk_false', print a warning message.
+
+ 4. If 'ext_version' is not 'NULL', register the version string with
+ 'gawk'.
+
+
+File: gawk.info, Node: Finding Extensions, Next: Extension Example, Prev: Extension API Description, Up: Dynamic Extensions
+
+16.5 How 'gawk' Finds Extensions
+================================
+
+Compiled extensions have to be installed in a directory where 'gawk' can
+find them. If 'gawk' is configured and built in the default fashion,
+the directory in which to find extensions is '/usr/local/lib/gawk'. You
+can also specify a search path with a list of directories to search for
+compiled extensions. *Note AWKLIBPATH Variable:: for more information.
+
+
+File: gawk.info, Node: Extension Example, Next: Extension Samples, Prev: Finding Extensions, Up: Dynamic Extensions
+
+16.6 Example: Some File Functions
+=================================
+
+ No matter where you go, there you are.
+ -- _Buckaroo Banzai_
+
+ Two useful functions that are not in 'awk' are 'chdir()' (so that an
+'awk' program can change its directory) and 'stat()' (so that an 'awk'
+program can gather information about a file). In order to illustrate
+the API in action, this minor node implements these functions for 'gawk'
+in an extension.
+
+* Menu:
+
+* Internal File Description:: What the new functions will do.
+* Internal File Ops:: The code for internal file operations.
+* Using Internal File Ops:: How to use an external extension.
+
+
+File: gawk.info, Node: Internal File Description, Next: Internal File Ops, Up: Extension Example
+
+16.6.1 Using 'chdir()' and 'stat()'
+-----------------------------------
+
+This minor node shows how to use the new functions at the 'awk' level
+once they've been integrated into the running 'gawk' interpreter. Using
+'chdir()' is very straightforward. It takes one argument, the new
+directory to change to:
+
+ @load "filefuncs"
+ ...
+ newdir = "/home/arnold/funstuff"
+ ret = chdir(newdir)
+ if (ret < 0) {
+ printf("could not change to %s: %s\n", newdir, ERRNO) > "/dev/stderr"
+ exit 1
+ }
+ ...
+
+ The return value is negative if the 'chdir()' failed, and 'ERRNO'
+(*note Built-in Variables::) is set to a string indicating the error.
+
+ Using 'stat()' is a bit more complicated. The C 'stat()' function
+fills in a structure that has a fair amount of information. The right
+way to model this in 'awk' is to fill in an associative array with the
+appropriate information:
+
+ file = "/home/arnold/.profile"
+ ret = stat(file, fdata)
+ if (ret < 0) {
+ printf("could not stat %s: %s\n",
+ file, ERRNO) > "/dev/stderr"
+ exit 1
+ }
+ printf("size of %s is %d bytes\n", file, fdata["size"])
+
+ The 'stat()' function always clears the data array, even if the
+'stat()' fails. It fills in the following elements:
+
+'"name"'
+ The name of the file that was 'stat()'ed.
+
+'"dev"'
+'"ino"'
+ The file's device and inode numbers, respectively.
+
+'"mode"'
+ The file's mode, as a numeric value. This includes both the file's
+ type and its permissions.
+
+'"nlink"'
+ The number of hard links (directory entries) the file has.
+
+'"uid"'
+'"gid"'
+ The numeric user and group ID numbers of the file's owner.
+
+'"size"'
+ The size in bytes of the file.
+
+'"blocks"'
+ The number of disk blocks the file actually occupies. This may not
+ be a function of the file's size if the file has holes.
+
+'"atime"'
+'"mtime"'
+'"ctime"'
+ The file's last access, modification, and inode update times,
+ respectively. These are numeric timestamps, suitable for
+ formatting with 'strftime()' (*note Time Functions::).
+
+'"pmode"'
+ The file's "printable mode." This is a string representation of
+ the file's type and permissions, such as is produced by 'ls
+ -l'--for example, '"drwxr-xr-x"'.
+
+'"type"'
+ A printable string representation of the file's type. The value is
+ one of the following:
+
+ '"blockdev"'
+ '"chardev"'
+ The file is a block or character device ("special file").
+
+ '"directory"'
+ The file is a directory.
+
+ '"fifo"'
+ The file is a named pipe (also known as a FIFO).
+
+ '"file"'
+ The file is just a regular file.
+
+ '"socket"'
+ The file is an 'AF_UNIX' ("Unix domain") socket in the
+ filesystem.
+
+ '"symlink"'
+ The file is a symbolic link.
+
+'"devbsize"'
+ The size of a block for the element indexed by '"blocks"'. This
+ information is derived from either the 'DEV_BSIZE' constant defined
+ in '<sys/param.h>' on most systems, or the 'S_BLKSIZE' constant in
+ '<sys/stat.h>' on BSD systems. For some other systems, "a priori"
+ knowledge is used to provide a value. Where no value can be
+ determined, it defaults to 512.
+
+ Several additional elements may be present, depending upon the
+operating system and the type of the file. You can test for them in
+your 'awk' program by using the 'in' operator (*note Reference to
+Elements::), as sketched in the example following this list:
+
+'"blksize"'
+ The preferred block size for I/O to the file. This field is not
+ present on all POSIX-like systems in the C 'stat' structure.
+
+'"linkval"'
+ If the file is a symbolic link, this element is the name of the
+ file the link points to (i.e., the value of the link).
+
+'"rdev"'
+'"major"'
+'"minor"'
+ If the file is a block or character device file, then these values
+ represent the numeric device number and the major and minor
+ components of that number, respectively.
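+
+   For instance, here is a minimal sketch (reusing the 'file' and
+'fdata' names from the earlier 'stat()' example) that checks for the
+optional '"linkval"' element:
+
+     if ("linkval" in fdata)
+         printf("%s is a symbolic link to %s\n", file, fdata["linkval"])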
+
+
+File: gawk.info, Node: Internal File Ops, Next: Using Internal File Ops, Prev: Internal File Description, Up: Extension Example
+
+16.6.2 C Code for 'chdir()' and 'stat()'
+----------------------------------------
+
+Here is the C code for these extensions.(1)
+
+ The file includes a number of standard header files, and then
+includes the 'gawkapi.h' header file, which provides the API
+definitions. Those are followed by the necessary variable declarations
+to make use of the API macros and boilerplate code (*note Extension API
+Boilerplate::):
+
+ #ifdef HAVE_CONFIG_H
+ #include <config.h>
+ #endif
+
+ #include <stdio.h>
+ #include <assert.h>
+ #include <errno.h>
+ #include <stdlib.h>
+ #include <string.h>
+ #include <unistd.h>
+
+ #include <sys/types.h>
+ #include <sys/stat.h>
+
+ #include "gawkapi.h"
+
+ #include "gettext.h"
+ #define _(msgid) gettext(msgid)
+ #define N_(msgid) msgid
+
+ #include "gawkfts.h"
+ #include "stack.h"
+
+ static const gawk_api_t *api; /* for convenience macros to work */
+ static awk_ext_id_t *ext_id;
+ static awk_bool_t init_filefuncs(void);
+ static awk_bool_t (*init_func)(void) = init_filefuncs;
+ static const char *ext_version = "filefuncs extension: version 1.0";
+
+ int plugin_is_GPL_compatible;
+
+ By convention, for an 'awk' function 'foo()', the C function that
+implements it is called 'do_foo()'. The function should have two
+arguments. The first is an 'int', usually called 'nargs', that
+represents the number of actual arguments for the function. The second
+is a pointer to an 'awk_value_t' structure, usually named 'result':
+
+ /* do_chdir --- provide dynamically loaded chdir() function for gawk */
+
+ static awk_value_t *
+ do_chdir(int nargs, awk_value_t *result)
+ {
+ awk_value_t newdir;
+ int ret = -1;
+
+ assert(result != NULL);
+
+ if (do_lint && nargs != 1)
+ lintwarn(ext_id,
+ _("chdir: called with incorrect number of arguments, "
+ "expecting 1"));
+
+ The 'newdir' variable represents the new directory to change to,
+which is retrieved with 'get_argument()'. Note that the first argument
+is numbered zero.
+
+ If the argument is retrieved successfully, the function calls the
+'chdir()' system call. If the 'chdir()' fails, 'ERRNO' is updated:
+
+ if (get_argument(0, AWK_STRING, & newdir)) {
+ ret = chdir(newdir.str_value.str);
+ if (ret < 0)
+ update_ERRNO_int(errno);
+ }
+
+ Finally, the function returns the return value to the 'awk' level:
+
+ return make_number(ret, result);
+ }
+
+ The 'stat()' extension is more involved. First comes a function that
+turns a numeric mode into a printable representation (e.g., octal '0644'
+becomes '-rw-r--r--'). This is omitted here for brevity:
+
+ /* format_mode --- turn a stat mode field into something readable */
+
+ static char *
+ format_mode(unsigned long fmode)
+ {
+ ...
+ }
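+
+   Purely as an illustration, here is a stripped-down sketch of such a
+function.  It is _not_ the distribution version; it distinguishes only
+directories and the nine basic permission bits:
+
+     /* format_mode_sketch --- simplified example, not the real code;
+        it handles only 'd' vs. '-' and the nine permission bits */
+
+     static char *
+     format_mode_sketch(unsigned long fmode)
+     {
+         static char buf[11];
+         static const unsigned long bits[] = {
+             S_IRUSR, S_IWUSR, S_IXUSR,
+             S_IRGRP, S_IWGRP, S_IXGRP,
+             S_IROTH, S_IWOTH, S_IXOTH
+         };
+         static const char letters[] = "rwxrwxrwx";
+         int i;
+
+         buf[0] = S_ISDIR(fmode) ? 'd' : '-';
+         for (i = 0; i < 9; i++)
+             buf[i + 1] = ((fmode & bits[i]) != 0) ? letters[i] : '-';
+         buf[10] = '\0';
+         return buf;
+     }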
+
+ Next comes a function for reading symbolic links, which is also
+omitted here for brevity:
+
+ /* read_symlink --- read a symbolic link into an allocated buffer.
+ ... */
+
+ static char *
+ read_symlink(const char *fname, size_t bufsize, ssize_t *linksize)
+ {
+ ...
+ }
+
+ Two helper functions simplify entering values in the array that will
+contain the result of the 'stat()':
+
+ /* array_set --- set an array element */
+
+ static void
+ array_set(awk_array_t array, const char *sub, awk_value_t *value)
+ {
+ awk_value_t index;
+
+ set_array_element(array,
+ make_const_string(sub, strlen(sub), & index),
+ value);
+
+ }
+
+ /* array_set_numeric --- set an array element with a number */
+
+ static void
+ array_set_numeric(awk_array_t array, const char *sub, double num)
+ {
+ awk_value_t tmp;
+
+ array_set(array, sub, make_number(num, & tmp));
+ }
+
+ The following function does most of the work to fill in the
+'awk_array_t' result array with values obtained from a valid 'struct
+stat'. This work is done in a separate function to support the 'stat()'
+function for 'gawk' and also to support the 'fts()' extension, which is
+included in the same file but whose code is not shown here (*note
+Extension Sample File Functions::).
+
+ The first part of the function is variable declarations, including a
+table to map file types to strings:
+
+ /* fill_stat_array --- do the work to fill an array with stat info */
+
+ static int
+ fill_stat_array(const char *name, awk_array_t array, struct stat *sbuf)
+ {
+ char *pmode; /* printable mode */
+ const char *type = "unknown";
+ awk_value_t tmp;
+ static struct ftype_map {
+ unsigned int mask;
+ const char *type;
+ } ftype_map[] = {
+ { S_IFREG, "file" },
+ { S_IFBLK, "blockdev" },
+ { S_IFCHR, "chardev" },
+ { S_IFDIR, "directory" },
+ #ifdef S_IFSOCK
+ { S_IFSOCK, "socket" },
+ #endif
+ #ifdef S_IFIFO
+ { S_IFIFO, "fifo" },
+ #endif
+ #ifdef S_IFLNK
+ { S_IFLNK, "symlink" },
+ #endif
+ #ifdef S_IFDOOR /* Solaris weirdness */
+ { S_IFDOOR, "door" },
+ #endif /* S_IFDOOR */
+ };
+ int j, k;
+
+ The destination array is cleared, and then code fills in various
+elements based on values in the 'struct stat':
+
+ /* empty out the array */
+ clear_array(array);
+
+ /* fill in the array */
+ array_set(array, "name", make_const_string(name, strlen(name),
+ & tmp));
+ array_set_numeric(array, "dev", sbuf->st_dev);
+ array_set_numeric(array, "ino", sbuf->st_ino);
+ array_set_numeric(array, "mode", sbuf->st_mode);
+ array_set_numeric(array, "nlink", sbuf->st_nlink);
+ array_set_numeric(array, "uid", sbuf->st_uid);
+ array_set_numeric(array, "gid", sbuf->st_gid);
+ array_set_numeric(array, "size", sbuf->st_size);
+ array_set_numeric(array, "blocks", sbuf->st_blocks);
+ array_set_numeric(array, "atime", sbuf->st_atime);
+ array_set_numeric(array, "mtime", sbuf->st_mtime);
+ array_set_numeric(array, "ctime", sbuf->st_ctime);
+
+ /* for block and character devices, add rdev,
+ major and minor numbers */
+ if (S_ISBLK(sbuf->st_mode) || S_ISCHR(sbuf->st_mode)) {
+ array_set_numeric(array, "rdev", sbuf->st_rdev);
+ array_set_numeric(array, "major", major(sbuf->st_rdev));
+ array_set_numeric(array, "minor", minor(sbuf->st_rdev));
+ }
+
+The latter part of the function makes selective additions to the
+destination array, depending upon the availability of certain members
+and/or the type of the file. It then returns zero, for success:
+
+ #ifdef HAVE_STRUCT_STAT_ST_BLKSIZE
+ array_set_numeric(array, "blksize", sbuf->st_blksize);
+ #endif /* HAVE_STRUCT_STAT_ST_BLKSIZE */
+
+ pmode = format_mode(sbuf->st_mode);
+ array_set(array, "pmode", make_const_string(pmode, strlen(pmode),
+ & tmp));
+
+ /* for symbolic links, add a linkval field */
+ if (S_ISLNK(sbuf->st_mode)) {
+ char *buf;
+ ssize_t linksize;
+
+ if ((buf = read_symlink(name, sbuf->st_size,
+ & linksize)) != NULL)
+ array_set(array, "linkval",
+ make_malloced_string(buf, linksize, & tmp));
+ else
+ warning(ext_id, _("stat: unable to read symbolic link `%s'"),
+ name);
+ }
+
+ /* add a type field */
+ type = "unknown"; /* shouldn't happen */
+ for (j = 0, k = sizeof(ftype_map)/sizeof(ftype_map[0]); j < k; j++) {
+ if ((sbuf->st_mode & S_IFMT) == ftype_map[j].mask) {
+ type = ftype_map[j].type;
+ break;
+ }
+ }
+
+ array_set(array, "type", make_const_string(type, strlen(type), & tmp));
+
+ return 0;
+ }
+
+ The third argument to 'stat()' was not discussed previously. This
+argument is optional. If present, it causes 'do_stat()' to use the
+'stat()' system call instead of the 'lstat()' system call. This is done
+by using a function pointer: 'statfunc'. 'statfunc' is initialized to
+point to 'lstat()' (instead of 'stat()') to get the file information, in
+case the file is a symbolic link. However, if the third argument is
+included, 'statfunc' is set to point to 'stat()', instead.
+
+ Here is the 'do_stat()' function, which starts with variable
+declarations and argument checking:
+
+ /* do_stat --- provide a stat() function for gawk */
+
+ static awk_value_t *
+ do_stat(int nargs, awk_value_t *result)
+ {
+ awk_value_t file_param, array_param;
+ char *name;
+ awk_array_t array;
+ int ret;
+ struct stat sbuf;
+ /* default is lstat() */
+ int (*statfunc)(const char *path, struct stat *sbuf) = lstat;
+
+ assert(result != NULL);
+
+ if (nargs != 2 && nargs != 3) {
+ if (do_lint)
+ lintwarn(ext_id,
+ _("stat: called with wrong number of arguments"));
+ return make_number(-1, result);
+ }
+
+ Then comes the actual work. First, the function gets the arguments.
+Next, it gets the information for the file. If the called function
+('lstat()' or 'stat()') returns an error, the code sets 'ERRNO' and
+returns:
+
+ /* file is first arg, array to hold results is second */
+ if ( ! get_argument(0, AWK_STRING, & file_param)
+ || ! get_argument(1, AWK_ARRAY, & array_param)) {
+ warning(ext_id, _("stat: bad parameters"));
+ return make_number(-1, result);
+ }
+
+ if (nargs == 3) {
+ statfunc = stat;
+ }
+
+ name = file_param.str_value.str;
+ array = array_param.array_cookie;
+
+ /* always empty out the array */
+ clear_array(array);
+
+ /* stat the file; if error, set ERRNO and return */
+ ret = statfunc(name, & sbuf);
+ if (ret < 0) {
+ update_ERRNO_int(errno);
+ return make_number(ret, result);
+ }
+
+ The tedious work is done by 'fill_stat_array()', shown earlier. When
+done, the function returns the result from 'fill_stat_array()':
+
+ ret = fill_stat_array(name, array, & sbuf);
+
+ return make_number(ret, result);
+ }
+
+ Finally, it's necessary to provide the "glue" that loads the new
+function(s) into 'gawk'.
+
+ The 'filefuncs' extension also provides an 'fts()' function, which we
+omit here (*note Extension Sample File Functions::). For its sake,
+there is an initialization function:
+
+ /* init_filefuncs --- initialization routine */
+
+ static awk_bool_t
+ init_filefuncs(void)
+ {
+ ...
+ }
+
+ We are almost done. We need an array of 'awk_ext_func_t' structures
+for loading each function into 'gawk':
+
+ static awk_ext_func_t func_table[] = {
+ { "chdir", do_chdir, 1 },
+ { "stat", do_stat, 2 },
+ #ifndef __MINGW32__
+ { "fts", do_fts, 3 },
+ #endif
+ };
+
+ Each extension must have a routine named 'dl_load()' to load
+everything that needs to be loaded. It is simplest to use the
+'dl_load_func()' macro in 'gawkapi.h':
+
+ /* define the dl_load() function using the boilerplate macro */
+
+ dl_load_func(func_table, filefuncs, "")
+
+ And that's it!
+
+ ---------- Footnotes ----------
+
+ (1) This version is edited slightly for presentation. See
+'extension/filefuncs.c' in the 'gawk' distribution for the complete
+version.
+
+
+File: gawk.info, Node: Using Internal File Ops, Prev: Internal File Ops, Up: Extension Example
+
+16.6.3 Integrating the Extensions
+---------------------------------
+
+Now that the code is written, it must be possible to add it at runtime
+to the running 'gawk' interpreter. First, the code must be compiled.
+Assuming that the functions are in a file named 'filefuncs.c', and IDIR
+is the location of the 'gawkapi.h' header file, the following steps(1)
+create a GNU/Linux shared library:
+
+ $ gcc -fPIC -shared -DHAVE_CONFIG_H -c -O -g -IIDIR filefuncs.c
+ $ gcc -o filefuncs.so -shared filefuncs.o
+
+ Once the library exists, it is loaded by using the '@load' keyword:
+
+ # file testff.awk
+ @load "filefuncs"
+
+ BEGIN {
+ "pwd" | getline curdir # save current directory
+ close("pwd")
+
+ chdir("/tmp")
+ system("pwd") # test it
+ chdir(curdir) # go back
+
+ print "Info for testff.awk"
+ ret = stat("testff.awk", data)
+ print "ret =", ret
+ for (i in data)
+ printf "data[\"%s\"] = %s\n", i, data[i]
+ print "testff.awk modified:",
+ strftime("%m %d %Y %H:%M:%S", data["mtime"])
+
+ print "\nInfo for JUNK"
+ ret = stat("JUNK", data)
+ print "ret =", ret
+ for (i in data)
+ printf "data[\"%s\"] = %s\n", i, data[i]
+ print "JUNK modified:", strftime("%m %d %Y %H:%M:%S", data["mtime"])
+ }
+
+ The 'AWKLIBPATH' environment variable tells 'gawk' where to find
+extensions (*note Finding Extensions::). We set it to the current
+directory and run the program:
+
+ $ AWKLIBPATH=$PWD gawk -f testff.awk
+ -| /tmp
+ -| Info for testff.awk
+ -| ret = 0
+ -| data["blksize"] = 4096
+ -| data["devbsize"] = 512
+ -| data["mtime"] = 1412004710
+ -| data["mode"] = 33204
+ -| data["type"] = file
+ -| data["dev"] = 2053
+ -| data["gid"] = 1000
+ -| data["ino"] = 10358899
+ -| data["ctime"] = 1412004710
+ -| data["blocks"] = 8
+ -| data["nlink"] = 1
+ -| data["name"] = testff.awk
+ -| data["atime"] = 1412004716
+ -| data["pmode"] = -rw-rw-r--
+ -| data["size"] = 666
+ -| data["uid"] = 1000
+ -| testff.awk modified: 09 29 2014 18:31:50
+ -|
+ -| Info for JUNK
+ -| ret = -1
+ -| JUNK modified: 01 01 1970 02:00:00
+
+ ---------- Footnotes ----------
+
+ (1) In practice, you would probably want to use the GNU Autotools
+(Automake, Autoconf, Libtool, and 'gettext') to configure and build your
+libraries. Instructions for doing so are beyond the scope of this Info
+file. *Note gawkextlib:: for Internet links to the tools.
+
+
+File: gawk.info, Node: Extension Samples, Next: gawkextlib, Prev: Extension Example, Up: Dynamic Extensions
+
+16.7 The Sample Extensions in the 'gawk' Distribution
+=====================================================
+
+This minor node provides a brief overview of the sample extensions that
+come in the 'gawk' distribution. Some of them are intended for
+production use (e.g., the 'filefuncs', 'readdir', and 'inplace'
+extensions). Others mainly provide example code that shows how to use
+the extension API.
+
+* Menu:
+
+* Extension Sample File Functions:: The file functions sample.
+* Extension Sample Fnmatch:: An interface to 'fnmatch()'.
+* Extension Sample Fork:: An interface to 'fork()' and other
+ process functions.
+* Extension Sample Inplace:: Enabling in-place file editing.
+* Extension Sample Ord:: Character to value to character
+ conversions.
+* Extension Sample Readdir:: An interface to 'readdir()'.
+* Extension Sample Revout:: Reversing output sample output wrapper.
+* Extension Sample Rev2way:: Reversing data sample two-way processor.
+* Extension Sample Read write array:: Serializing an array to a file.
+* Extension Sample Readfile:: Reading an entire file into a string.
+* Extension Sample Time:: An interface to 'gettimeofday()'
+ and 'sleep()'.
+* Extension Sample API Tests:: Tests for the API.
+
+
+File: gawk.info, Node: Extension Sample File Functions, Next: Extension Sample Fnmatch, Up: Extension Samples
+
+16.7.1 File-Related Functions
+-----------------------------
+
+The 'filefuncs' extension provides three different functions, as
+follows. The usage is:
+
+'@load "filefuncs"'
+ This is how you load the extension.
+
+'result = chdir("/some/directory")'
+ The 'chdir()' function is a direct hook to the 'chdir()' system
+ call to change the current directory. It returns zero upon success
+ or a value less than zero upon error. In the latter case, it
+ updates 'ERRNO'.
+
+'result = stat("/some/path", statdata' [', follow']')'
+ The 'stat()' function provides a hook into the 'stat()' system
+ call. It returns zero upon success or a value less than zero upon
+ error. In the latter case, it updates 'ERRNO'.
+
+ By default, it uses the 'lstat()' system call. However, if passed
+ a third argument, it uses 'stat()' instead.
+
+ In all cases, it clears the 'statdata' array. When the call is
+ successful, 'stat()' fills the 'statdata' array with information
+ retrieved from the filesystem, as follows:
+
+ Subscript Field in 'struct stat' File type
+ ----------------------------------------------------------------
+ '"name"' The file name All
+ '"dev"' 'st_dev' All
+ '"ino"' 'st_ino' All
+ '"mode"' 'st_mode' All
+ '"nlink"' 'st_nlink' All
+ '"uid"' 'st_uid' All
+ '"gid"' 'st_gid' All
+ '"size"' 'st_size' All
+ '"atime"' 'st_atime' All
+ '"mtime"' 'st_mtime' All
+ '"ctime"' 'st_ctime' All
+ '"rdev"' 'st_rdev' Device files
+     '"major"'         'major(st_rdev)'                   Device files
+     '"minor"'         'minor(st_rdev)'                   Device files
+ '"blksize"' 'st_blksize' All
+ '"pmode"' A human-readable version of the All
+ mode value, like that printed by
+ 'ls' (for example, '"-rwxr-xr-x"')
+ '"linkval"' The value of the symbolic link Symbolic
+ links
+ '"type"' The type of the file as a All
+ string--one of '"file"',
+ '"blockdev"', '"chardev"',
+ '"directory"', '"socket"',
+ '"fifo"', '"symlink"', '"door"',
+ or '"unknown"' (not all systems
+ support all file types)
+
+'flags = or(FTS_PHYSICAL, ...)'
+'result = fts(pathlist, flags, filedata)'
+ Walk the file trees provided in 'pathlist' and fill in the
+ 'filedata' array, as described next. 'flags' is the bitwise OR of
+ several predefined values, also described in a moment. Return zero
+ if there were no errors, otherwise return -1.
+
+ The 'fts()' function provides a hook to the C library 'fts()'
+routines for traversing file hierarchies. Instead of returning data
+about one file at a time in a stream, it fills in a multidimensional
+array with data about each file and directory encountered in the
+requested hierarchies.
+
+ The arguments are as follows:
+
+'pathlist'
+ An array of file names. The element values are used; the index
+ values are ignored.
+
+'flags'
+ This should be the bitwise OR of one or more of the following
+ predefined constant flag values. At least one of 'FTS_LOGICAL' or
+ 'FTS_PHYSICAL' must be provided; otherwise 'fts()' returns an error
+ value and sets 'ERRNO'. The flags are:
+
+ 'FTS_LOGICAL'
+ Do a "logical" file traversal, where the information returned
+ for a symbolic link refers to the linked-to file, and not to
+ the symbolic link itself. This flag is mutually exclusive
+ with 'FTS_PHYSICAL'.
+
+ 'FTS_PHYSICAL'
+ Do a "physical" file traversal, where the information returned
+ for a symbolic link refers to the symbolic link itself. This
+ flag is mutually exclusive with 'FTS_LOGICAL'.
+
+ 'FTS_NOCHDIR'
+ As a performance optimization, the C library 'fts()' routines
+ change directory as they traverse a file hierarchy. This flag
+ disables that optimization.
+
+ 'FTS_COMFOLLOW'
+ Immediately follow a symbolic link named in 'pathlist',
+ whether or not 'FTS_LOGICAL' is set.
+
+ 'FTS_SEEDOT'
+ By default, the C library 'fts()' routines do not return
+ entries for '.' (dot) and '..' (dot-dot). This option causes
+ entries for dot-dot to also be included. (The extension
+ always includes an entry for dot; more on this in a moment.)
+
+ 'FTS_XDEV'
+ During a traversal, do not cross onto a different mounted
+ filesystem.
+
+'filedata'
+ The 'filedata' array holds the results. 'fts()' first clears it.
+ Then it creates an element in 'filedata' for every element in
+ 'pathlist'. The index is the name of the directory or file given
+ in 'pathlist'. The element for this index is itself an array.
+ There are two cases:
+
+ _The path is a file_
+ In this case, the array contains two or three elements:
+
+ '"path"'
+ The full path to this file, starting from the "root" that
+ was given in the 'pathlist' array.
+
+ '"stat"'
+ This element is itself an array, containing the same
+ information as provided by the 'stat()' function
+ described earlier for its 'statdata' argument. The
+ element may not be present if the 'stat()' system call
+ for the file failed.
+
+ '"error"'
+ If some kind of error was encountered, the array will
+ also contain an element named '"error"', which is a
+ string describing the error.
+
+ _The path is a directory_
+ In this case, the array contains one element for each entry in
+ the directory. If an entry is a file, that element is the
+ same as for files, just described. If the entry is a
+ directory, that element is (recursively) an array describing
+ the subdirectory. If 'FTS_SEEDOT' was provided in the flags,
+ then there will also be an element named '".."'. This element
+ will be an array containing the data as provided by 'stat()'.
+
+ In addition, there will be an element whose index is '"."'.
+ This element is an array containing the same two or three
+ elements as for a file: '"path"', '"stat"', and '"error"'.
+
+ The 'fts()' function returns zero if there were no errors.
+Otherwise, it returns -1.
+
+ NOTE: The 'fts()' extension does not exactly mimic the interface of
+ the C library 'fts()' routines, choosing instead to provide an
+ interface that is based on associative arrays, which is more
+ comfortable to use from an 'awk' program. This includes the lack
+ of a comparison function, because 'gawk' already provides powerful
+ array sorting facilities. Although an 'fts_read()'-like interface
+ could have been provided, this felt less natural than simply
+ creating a multidimensional array to represent the file hierarchy
+ and its information.
+
+ See 'test/fts.awk' in the 'gawk' distribution for an example use of
+the 'fts()' extension function.
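+
+   As a rough sketch (much simpler than that test program), a call
+might look like this:
+
+     @load "filefuncs"
+     ...
+     BEGIN {
+         pathlist[1] = "/etc"        # any directory will do
+         flags = FTS_PHYSICAL
+         if (fts(pathlist, flags, filedata) < 0)
+             printf("fts failed: %s\n", ERRNO) > "/dev/stderr"
+         else
+             for (entry in filedata["/etc"])
+                 print entry         # entries in the top directory
+     }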
+
+
+File: gawk.info, Node: Extension Sample Fnmatch, Next: Extension Sample Fork, Prev: Extension Sample File Functions, Up: Extension Samples
+
+16.7.2 Interface to 'fnmatch()'
+-------------------------------
+
+This extension provides an interface to the C library 'fnmatch()'
+function. The usage is:
+
+'@load "fnmatch"'
+ This is how you load the extension.
+
+'result = fnmatch(pattern, string, flags)'
+ The return value is zero on success, 'FNM_NOMATCH' if the string
+ did not match the pattern, or a different nonzero value if an error
+ occurred.
+
+ In addition to the 'fnmatch()' function, the 'fnmatch' extension adds
+one constant ('FNM_NOMATCH'), and an array of flag values named 'FNM'.
+
+ The arguments to 'fnmatch()' are:
+
+'pattern'
+ The file name wildcard to match
+
+'string'
+ The file name string
+
+'flags'
+ Either zero, or the bitwise OR of one or more of the flags in the
+ 'FNM' array
+
+ The flags are as follows:
+
+Array element           Corresponding flag defined by 'fnmatch()'
+--------------------------------------------------------------------------
+'FNM["CASEFOLD"]'       'FNM_CASEFOLD'
+'FNM["FILE_NAME"]'      'FNM_FILE_NAME'
+'FNM["LEADING_DIR"]'    'FNM_LEADING_DIR'
+'FNM["NOESCAPE"]'       'FNM_NOESCAPE'
+'FNM["PATHNAME"]'       'FNM_PATHNAME'
+'FNM["PERIOD"]'         'FNM_PERIOD'
+
+ Here is an example:
+
+ @load "fnmatch"
+ ...
+ flags = or(FNM["PERIOD"], FNM["NOESCAPE"])
+ if (fnmatch("*.a", "foo.c", flags) == FNM_NOMATCH)
+ print "no match"
+
+
+File: gawk.info, Node: Extension Sample Fork, Next: Extension Sample Inplace, Prev: Extension Sample Fnmatch, Up: Extension Samples
+
+16.7.3 Interface to 'fork()', 'wait()', and 'waitpid()'
+-------------------------------------------------------
+
+The 'fork' extension adds three functions, as follows:
+
+'@load "fork"'
+ This is how you load the extension.
+
+'pid = fork()'
+ This function creates a new process. The return value is zero in
+ the child and the process ID number of the child in the parent, or
+ -1 upon error. In the latter case, 'ERRNO' indicates the problem.
+ In the child, 'PROCINFO["pid"]' and 'PROCINFO["ppid"]' are updated
+ to reflect the correct values.
+
+'ret = waitpid(pid)'
+ This function takes a numeric argument, which is the process ID to
+ wait for. The return value is that of the 'waitpid()' system call.
+
+'ret = wait()'
+ This function waits for the first child to die. The return value
+ is that of the 'wait()' system call.
+
+ There is no corresponding 'exec()' function.
+
+ Here is an example:
+
+ @load "fork"
+ ...
+ if ((pid = fork()) == 0)
+ print "hello from the child"
+ else
+ print "hello from the parent"
+
+
+File: gawk.info, Node: Extension Sample Inplace, Next: Extension Sample Ord, Prev: Extension Sample Fork, Up: Extension Samples
+
+16.7.4 Enabling In-Place File Editing
+-------------------------------------
+
+The 'inplace' extension emulates GNU 'sed''s '-i' option, which performs
+"in-place" editing of each input file. It uses the bundled
+'inplace.awk' include file to invoke the extension properly:
+
+ # inplace --- load and invoke the inplace extension.
+
+ @load "inplace"
+
+ # Please set INPLACE_SUFFIX to make a backup copy. For example, you may
+ # want to set INPLACE_SUFFIX to .bak on the command line or in a BEGIN rule.
+
+ # By default, each filename on the command line will be edited inplace.
+ # But you can selectively disable this by adding an inplace=0 argument
+ # prior to files that you do not want to process this way. You can then
+ # reenable it later on the commandline by putting inplace=1 before files
+ # that you wish to be subject to inplace editing.
+
+ # N.B. We call inplace_end() in the BEGINFILE and END rules so that any
+ # actions in an ENDFILE rule will be redirected as expected.
+
+ BEGIN {
+ inplace = 1 # enabled by default
+ }
+
+ BEGINFILE {
+ if (_inplace_filename != "")
+ inplace_end(_inplace_filename, INPLACE_SUFFIX)
+ if (inplace)
+ inplace_begin(_inplace_filename = FILENAME, INPLACE_SUFFIX)
+ else
+ _inplace_filename = ""
+ }
+
+ END {
+ if (_inplace_filename != "")
+ inplace_end(_inplace_filename, INPLACE_SUFFIX)
+ }
+
+ For each regular file that is processed, the extension redirects
+standard output to a temporary file configured to have the same owner
+and permissions as the original. After the file has been processed, the
+extension restores standard output to its original destination. If
+'INPLACE_SUFFIX' is not an empty string, the original file is linked to
+a backup file name created by appending that suffix. Finally, the
+temporary file is renamed to the original file name.
+
+ Note that the use of this feature can be controlled by placing
+'inplace=0' on the command line prior to listing files that should not
+be processed this way. You can reenable inplace editing by adding an
+'inplace=1' argument prior to files that should be subject to inplace
+editing.
+
+ The '_inplace_filename' variable serves to keep track of the current
+filename so as to not invoke 'inplace_end()' before processing the first
+file.
+
+ If any error occurs, the extension issues a fatal error to terminate
+processing immediately without damaging the original file.
+
+ Here are some simple examples:
+
+ $ gawk -i inplace '{ gsub(/foo/, "bar") }; { print }' file1 file2 file3
+
+ To keep a backup copy of the original files, try this:
+
+ $ gawk -i inplace -v INPLACE_SUFFIX=.bak '{ gsub(/foo/, "bar") }
+ > { print }' file1 file2 file3
+
+ Please note that, while the extension does attempt to preserve
+ownership and permissions, it makes no attempt to copy the ACLs from the
+original file.
+
+ If the program dies prematurely, as might happen if an unhandled
+signal is received, a temporary file may be left behind.
+
+
+File: gawk.info, Node: Extension Sample Ord, Next: Extension Sample Readdir, Prev: Extension Sample Inplace, Up: Extension Samples
+
+16.7.5 Character and Numeric values: 'ord()' and 'chr()'
+--------------------------------------------------------
+
+The 'ordchr' extension adds two functions, named 'ord()' and 'chr()', as
+follows:
+
+'@load "ordchr"'
+ This is how you load the extension.
+
+'number = ord(string)'
+ Return the numeric value of the first character in 'string'.
+
+'char = chr(number)'
+ Return a string whose first character is that represented by
+ 'number'.
+
+ These functions are inspired by the Pascal language functions of the
+same name. Here is an example:
+
+ @load "ordchr"
+ ...
+ printf("The numeric value of 'A' is %d\n", ord("A"))
+ printf("The string value of 65 is %s\n", chr(65))
+
+
+File: gawk.info, Node: Extension Sample Readdir, Next: Extension Sample Revout, Prev: Extension Sample Ord, Up: Extension Samples
+
+16.7.6 Reading Directories
+--------------------------
+
+The 'readdir' extension adds an input parser for directories. The usage
+is as follows:
+
+ @load "readdir"
+
+ When this extension is in use, instead of skipping directories named
+on the command line (or with 'getline'), they are read, with each entry
+returned as a record.
+
+ The record consists of three fields. The first two are the inode
+number and the file name, separated by a forward slash character. On
+systems where the directory entry contains the file type, the record has
+a third field (also separated by a slash), which is a single letter
+indicating the type of the file. The letters and their corresponding
+file types are shown in *note Table 16.3: table-readdir-file-types.
+
+Letter File type
+--------------------------------------------------------------------------
+'b' Block device
+'c' Character device
+'d' Directory
+'f' Regular file
+'l' Symbolic link
+'p' Named pipe (FIFO)
+'s' Socket
+'u' Anything else (unknown)
+
+Table 16.3: File types returned by the 'readdir' extension
+
+ On systems without the file type information, the third field is
+always 'u'.
+
+ NOTE: On GNU/Linux systems, there are filesystems that don't
+ support the 'd_type' entry (see the readdir(3) manual page), and so
+ the file type is always 'u'. You can use the 'filefuncs' extension
+ to call 'stat()' in order to get correct type information.
+
+ Here is an example:
+
+ @load "readdir"
+ ...
+ BEGIN { FS = "/" }
+ { print "file name is", $2 }
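+
+   If your filesystem does not supply 'd_type', here is a rough sketch
+of combining 'readdir' with the 'filefuncs' extension to recover the
+file type.  'FILENAME' is the directory being read, so 'FILENAME "/"
+$2' is the path of each entry:
+
+     @load "readdir"
+     @load "filefuncs"
+     BEGIN { FS = "/" }
+     {
+         if ($3 == "u" && stat(FILENAME "/" $2, fdata) == 0)
+             $3 = fdata["type"]      # e.g., "file", "directory", ...
+         print $2, $3
+     }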
+
+
+File: gawk.info, Node: Extension Sample Revout, Next: Extension Sample Rev2way, Prev: Extension Sample Readdir, Up: Extension Samples
+
+16.7.7 Reversing Output
+-----------------------
+
+The 'revoutput' extension adds a simple output wrapper that reverses the
+characters in each output line. Its main purpose is to show how to
+write an output wrapper, although it may be mildly amusing for the
+unwary. Here is an example:
+
+ @load "revoutput"
+
+ BEGIN {
+ REVOUT = 1
+ print "don't panic" > "/dev/stdout"
+ }
+
+ The output from this program is 'cinap t'nod'.
+
+
+File: gawk.info, Node: Extension Sample Rev2way, Next: Extension Sample Read write array, Prev: Extension Sample Revout, Up: Extension Samples
+
+16.7.8 Two-Way I/O Example
+--------------------------
+
+The 'revtwoway' extension adds a simple two-way processor that reverses
+the characters in each line sent to it for reading back by the 'awk'
+program. Its main purpose is to show how to write a two-way processor,
+although it may also be mildly amusing. The following example shows how
+to use it:
+
+ @load "revtwoway"
+
+ BEGIN {
+ cmd = "/magic/mirror"
+ print "don't panic" |& cmd
+ cmd |& getline result
+ print result
+ close(cmd)
+ }
+
+ The output from this program is: 'cinap t'nod'.
+
+
+File: gawk.info, Node: Extension Sample Read write array, Next: Extension Sample Readfile, Prev: Extension Sample Rev2way, Up: Extension Samples
+
+16.7.9 Dumping and Restoring an Array
+-------------------------------------
+
+The 'rwarray' extension adds two functions, named 'writea()' and
+'reada()', as follows:
+
+'@load "rwarray"'
+ This is how you load the extension.
+
+'ret = writea(file, array)'
+ This function takes a string argument, which is the name of the
+ file to which to dump the array, and the array itself as the second
+ argument. 'writea()' understands arrays of arrays. It returns one
+ on success, or zero upon failure.
+
+'ret = reada(file, array)'
+ 'reada()' is the inverse of 'writea()'; it reads the file named as
+ its first argument, filling in the array named as the second
+ argument. It clears the array first. Here too, the return value
+ is one on success, or zero upon failure.
+
+ The array created by 'reada()' is identical to that written by
+'writea()' in the sense that the contents are the same. However, due to
+implementation issues, the array traversal order of the re-created array
+is likely to be different from that of the original array. As array
+traversal order in 'awk' is by default undefined, this is (technically)
+not a problem. If you need to guarantee a particular traversal order,
+use the array sorting features in 'gawk' to do so (*note Array
+Sorting::).
+
+ The file contains binary data. All integral values are written in
+network byte order. However, double-precision floating-point values are
+written as native binary data. Thus, arrays containing only string data
+can theoretically be dumped on systems with one byte order and restored
+on systems with a different one, but this has not been tried.
+
+ Here is an example:
+
+ @load "rwarray"
+ ...
+ ret = writea("arraydump.bin", array)
+ ...
+ ret = reada("arraydump.bin", array)
+
+
+File: gawk.info, Node: Extension Sample Readfile, Next: Extension Sample Time, Prev: Extension Sample Read write array, Up: Extension Samples
+
+16.7.10 Reading an Entire File
+------------------------------
+
+The 'readfile' extension adds a single function named 'readfile()', and
+an input parser:
+
+'@load "readfile"'
+ This is how you load the extension.
+
+'result = readfile("/some/path")'
+ The argument is the name of the file to read. The return value is
+ a string containing the entire contents of the requested file.
+ Upon error, the function returns the empty string and sets 'ERRNO'.
+
+'BEGIN { PROCINFO["readfile"] = 1 }'
+ In addition, the extension adds an input parser that is activated
+ if 'PROCINFO["readfile"]' exists. When activated, each input file
+ is returned in its entirety as '$0'. 'RT' is set to the null
+ string.
+
+ Here is an example:
+
+ @load "readfile"
+ ...
+ contents = readfile("/path/to/file");
+ if (contents == "" && ERRNO != "") {
+ print("problem reading file", ERRNO) > "/dev/stderr"
+ ...
+ }
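+
+   And here is a minimal sketch of the input parser mode, relying on
+the behavior just described (each input file arrives as a single
+record):
+
+     @load "readfile"
+     BEGIN { PROCINFO["readfile"] = 1 }
+     { printf("%s contains %d characters\n", FILENAME, length($0)) }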
+
+
+File: gawk.info, Node: Extension Sample Time, Next: Extension Sample API Tests, Prev: Extension Sample Readfile, Up: Extension Samples
+
+16.7.11 Extension Time Functions
+--------------------------------
+
+The 'time' extension adds two functions, named 'gettimeofday()' and
+'sleep()', as follows:
+
+'@load "time"'
+ This is how you load the extension.
+
+'the_time = gettimeofday()'
+ Return the time in seconds that has elapsed since 1970-01-01 UTC as
+ a floating-point value. If the time is unavailable on this
+ platform, return -1 and set 'ERRNO'. The returned time should have
+ sub-second precision, but the actual precision may vary based on
+ the platform. If the standard C 'gettimeofday()' system call is
+ available on this platform, then it simply returns the value.
+ Otherwise, if on MS-Windows, it tries to use
+ 'GetSystemTimeAsFileTime()'.
+
+'result = sleep(SECONDS)'
+ Attempt to sleep for SECONDS seconds. If SECONDS is negative, or
+ the attempt to sleep fails, return -1 and set 'ERRNO'. Otherwise,
+ return zero after sleeping for the indicated amount of time. Note
+ that SECONDS may be a floating-point (nonintegral) value.
+ Implementation details: depending on platform availability, this
+ function tries to use 'nanosleep()' or 'select()' to implement the
+ delay.
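+
+   Here is a simple sketch that uses both functions, relying only on
+the behavior described above:
+
+     @load "time"
+     ...
+     BEGIN {
+         start = gettimeofday()
+         sleep(0.5)          # pause for half a second
+         printf("elapsed: %.3f seconds\n", gettimeofday() - start)
+     }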
+
+
+File: gawk.info, Node: Extension Sample API Tests, Prev: Extension Sample Time, Up: Extension Samples
+
+16.7.12 API Tests
+-----------------
+
+The 'testext' extension exercises parts of the extension API that are
+not tested by the other samples. The 'extension/testext.c' file
+contains both the C code for the extension and 'awk' test code inside C
+comments that run the tests. The testing framework extracts the 'awk'
+code and runs the tests. See the source file for more information.
+
+
+File: gawk.info, Node: gawkextlib, Next: Extension summary, Prev: Extension Samples, Up: Dynamic Extensions
+
+16.8 The 'gawkextlib' Project
+=============================
+
+The 'gawkextlib' (http://sourceforge.net/projects/gawkextlib/) project
+provides a number of 'gawk' extensions, including one for processing XML
+files. This is the evolution of the original 'xgawk' (XML 'gawk')
+project.
+
+   As of this writing, the project provides the following extensions:
+
+ * 'errno' extension
+
+ * GD graphics library extension
+
+ * MPFR library extension (this provides access to a number of MPFR
+ functions that 'gawk''s native MPFR support does not)
+
+ * PDF extension
+
+ * PostgreSQL extension
+
+ * Redis extension
+
+ * Select extension
+
+ * XML parser extension, using the Expat
+ (http://expat.sourceforge.net) XML parsing library
+
+ You can check out the code for the 'gawkextlib' project using the Git
+(http://git-scm.com) distributed source code control system. The
+command is as follows:
+
+ git clone git://git.code.sf.net/p/gawkextlib/code gawkextlib-code
+
+ You will need to have the Expat (http://expat.sourceforge.net) XML
+parser library installed in order to build and use the XML extension.
+
+ In addition, you must have the GNU Autotools installed (Autoconf
+(http://www.gnu.org/software/autoconf), Automake
+(http://www.gnu.org/software/automake), Libtool
+(http://www.gnu.org/software/libtool), and GNU 'gettext'
+(http://www.gnu.org/software/gettext)).
+
+ The simple recipe for building and testing 'gawkextlib' is as
+follows. First, build and install 'gawk':
+
+ cd .../path/to/gawk/code
+ ./configure --prefix=/tmp/newgawk Install in /tmp/newgawk for now
+ make && make check Build and check that all is OK
+ make install Install gawk
+
+ Next, go to <http://sourceforge.net/projects/gawkextlib/files> to
+download 'gawkextlib' and any extensions that you would like to build.
+The 'README' file at that site explains how to build the code. If you
+installed 'gawk' in a non-standard location, you will need to specify
+'./configure --with-gawk=/PATH/TO/GAWK' to find it. You may need to use
+the 'sudo' utility to install both 'gawk' and 'gawkextlib', depending
+upon how your system works.
+
+ If you write an extension that you wish to share with other 'gawk'
+users, consider doing so through the 'gawkextlib' project. See the
+project's website for more information.
+
+
+File: gawk.info, Node: Extension summary, Next: Extension Exercises, Prev: gawkextlib, Up: Dynamic Extensions
+
+16.9 Summary
+============
+
+ * You can write extensions (sometimes called plug-ins) for 'gawk' in
+ C or C++ using the application programming interface (API) defined
+ by the 'gawk' developers.
+
+ * Extensions must have a license compatible with the GNU General
+ Public License (GPL), and they must assert that fact by declaring a
+ variable named 'plugin_is_GPL_compatible'.
+
+ * Communication between 'gawk' and an extension is two-way. 'gawk'
+ passes a 'struct' to the extension that contains various data
+ fields and function pointers. The extension can then call into
+ 'gawk' via the supplied function pointers to accomplish certain
+ tasks.
+
+ * One of these tasks is to "register" the name and implementation of
+ new 'awk'-level functions with 'gawk'. The implementation takes
+ the form of a C function pointer with a defined signature. By
+ convention, implementation functions are named 'do_XXXX()' for some
+ 'awk'-level function 'XXXX()'.
+
+ * The API is defined in a header file named 'gawkapi.h'. You must
+ include a number of standard header files _before_ including it in
+ your source file.
+
+ * API function pointers are provided for the following kinds of
+ operations:
+
+ * Allocating, reallocating, and releasing memory
+
+ * Registration functions (you may register extension functions,
+ exit callbacks, a version string, input parsers, output
+ wrappers, and two-way processors)
+
+ * Printing fatal, nonfatal, warning, and "lint" warning messages
+
+ * Updating 'ERRNO', or unsetting it
+
+ * Accessing parameters, including converting an undefined
+ parameter into an array
+
+ * Symbol table access (retrieving a global variable, creating
+ one, or changing one)
+
+ * Creating and releasing cached values; this provides an
+ efficient way to use values for multiple variables and can be
+ a big performance win
+
+ * Manipulating arrays (retrieving, adding, deleting, and
+ modifying elements; getting the count of elements in an array;
+ creating a new array; clearing an array; and flattening an
+ array for easy C-style looping over all its indices and
+ elements)
+
+ * The API defines a number of standard data types for representing
+ 'awk' values, array elements, and arrays.
+
+ * The API provides convenience functions for constructing values. It
+ also provides memory management functions to ensure compatibility
+ between memory allocated by 'gawk' and memory allocated by an
+ extension.
+
+ * _All_ memory passed from 'gawk' to an extension must be treated as
+ read-only by the extension.
+
+ * _All_ memory passed from an extension to 'gawk' must come from the
+ API's memory allocation functions. 'gawk' takes responsibility for
+ the memory and releases it when appropriate.
+
+ * The API provides information about the running version of 'gawk' so
+ that an extension can make sure it is compatible with the 'gawk'
+ that loaded it.
+
+ * It is easiest to start a new extension by copying the boilerplate
+ code described in this major node. Macros in the 'gawkapi.h'
+ header file make this easier to do.
+
+ * The 'gawk' distribution includes a number of small but useful
+ sample extensions. The 'gawkextlib' project includes several more
+ (larger) extensions. If you wish to write an extension and
+ contribute it to the community of 'gawk' users, the 'gawkextlib'
+ project is the place to do so.
+
+
+File: gawk.info, Node: Extension Exercises, Prev: Extension summary, Up: Dynamic Extensions
+
+16.10 Exercises
+===============
+
+ 1. Add functions to implement system calls such as 'chown()',
+ 'chmod()', and 'umask()' to the file operations extension presented
+ in *note Internal File Ops::.
+
+  2. Write an input parser that prints a prompt if the input is from a
+ "terminal" device. You can use the 'isatty()' function to tell if
+ the input file is a terminal. (Hint: this function is usually
+ expensive to call; try to call it just once.) The content of the
+ prompt should come from a variable settable by 'awk'-level code.
+ You can write the prompt to standard error. However, for best
+ results, open a new file descriptor (or file pointer) on '/dev/tty'
+ and print the prompt there, in case standard error has been
+ redirected.
+
+ Why is standard error a better choice than standard output for
+ writing the prompt? Which reading mechanism should you replace,
+ the one to get a record, or the one to read raw bytes?
+
+ 3. (Hard.) How would you provide namespaces in 'gawk', so that the
+ names of functions in different extensions don't conflict with each
+ other? If you come up with a really good scheme, contact the
+ 'gawk' maintainer to tell him about it.
+
+ 4. Write a wrapper script that provides an interface similar to 'sed
+ -i' for the "inplace" extension presented in *note Extension Sample
+ Inplace::.
+
+
+File: gawk.info, Node: Language History, Next: Installation, Prev: Dynamic Extensions, Up: Top
+
+Appendix A The Evolution of the 'awk' Language
+**********************************************
+
+This Info file describes the GNU implementation of 'awk', which follows
+the POSIX specification. Many longtime 'awk' users learned 'awk'
+programming with the original 'awk' implementation in Version 7 Unix.
+(This implementation was the basis for 'awk' in Berkeley Unix, through
+4.3-Reno. Subsequent versions of Berkeley Unix, and, for a while, some
+systems derived from 4.4BSD-Lite, used various versions of 'gawk' for
+their 'awk'.) This major node briefly describes the evolution of the
+'awk' language, with cross-references to other parts of the Info file
+where you can find more information.
+
+* Menu:
+
+* V7/SVR3.1:: The major changes between V7 and System V
+ Release 3.1.
+* SVR4:: Minor changes between System V Releases 3.1
+ and 4.
+* POSIX:: New features from the POSIX standard.
+* BTL:: New features from Brian Kernighan's version of
+ 'awk'.
+* POSIX/GNU:: The extensions in 'gawk' not in POSIX
+ 'awk'.
+* Feature History:: The history of the features in 'gawk'.
+* Common Extensions:: Common Extensions Summary.
+* Ranges and Locales:: How locales used to affect regexp ranges.
+* Contributors:: The major contributors to 'gawk'.
+* History summary:: History summary.
+
+
+File: gawk.info, Node: V7/SVR3.1, Next: SVR4, Up: Language History
+
+A.1 Major Changes Between V7 and SVR3.1
+=======================================
+
+The 'awk' language evolved considerably between the release of Version 7
+Unix (1978) and the new version that was first made generally available
+in System V Release 3.1 (1987). This minor node summarizes the changes,
+with cross-references to further details:
+
+ * The requirement for ';' to separate rules on a line (*note
+ Statements/Lines::)
+
+ * User-defined functions and the 'return' statement (*note
+ User-defined::)
+
+ * The 'delete' statement (*note Delete::)
+
+ * The 'do'-'while' statement (*note Do Statement::)
+
+ * The built-in functions 'atan2()', 'cos()', 'sin()', 'rand()', and
+ 'srand()' (*note Numeric Functions::)
+
+ * The built-in functions 'gsub()', 'sub()', and 'match()' (*note
+ String Functions::)
+
+ * The built-in functions 'close()' and 'system()' (*note I/O
+ Functions::)
+
+ * The 'ARGC', 'ARGV', 'FNR', 'RLENGTH', 'RSTART', and 'SUBSEP'
+ predefined variables (*note Built-in Variables::)
+
+ * Assignable '$0' (*note Changing Fields::)
+
+ * The conditional expression using the ternary operator '?:' (*note
+ Conditional Exp::)
+
+ * The expression 'INDX in ARRAY' outside of 'for' statements (*note
+ Reference to Elements::)
+
+ * The exponentiation operator '^' (*note Arithmetic Ops::) and its
+ assignment operator form '^=' (*note Assignment Ops::)
+
+ * C-compatible operator precedence, which breaks some old 'awk'
+ programs (*note Precedence::)
+
+ * Regexps as the value of 'FS' (*note Field Separators::) and as the
+ third argument to the 'split()' function (*note String
+ Functions::), rather than using only the first character of 'FS'
+
+ * Dynamic regexps as operands of the '~' and '!~' operators (*note
+ Computed Regexps::)
+
+ * The escape sequences '\b', '\f', and '\r' (*note Escape
+ Sequences::)
+
+ * Redirection of input for the 'getline' function (*note Getline::)
+
+ * Multiple 'BEGIN' and 'END' rules (*note BEGIN/END::)
+
+ * Multidimensional arrays (*note Multidimensional::)
+
+
+File: gawk.info, Node: SVR4, Next: POSIX, Prev: V7/SVR3.1, Up: Language History
+
+A.2 Changes Between SVR3.1 and SVR4
+===================================
+
+The System V Release 4 (1989) version of Unix 'awk' added these features
+(some of which originated in 'gawk'):
+
+ * The 'ENVIRON' array (*note Built-in Variables::)
+
+ * Multiple '-f' options on the command line (*note Options::)
+
+ * The '-v' option for assigning variables before program execution
+ begins (*note Options::)
+
+ * The '--' signal for terminating command-line options
+
+ * The '\a', '\v', and '\x' escape sequences (*note Escape
+ Sequences::)
+
+ * A defined return value for the 'srand()' built-in function (*note
+ Numeric Functions::)
+
+ * The 'toupper()' and 'tolower()' built-in string functions for case
+ translation (*note String Functions::)
+
+ * A cleaner specification for the '%c' format-control letter in the
+ 'printf' function (*note Control Letters::)
+
+ * The ability to dynamically pass the field width and precision
+ ('"%*.*d"') in the argument list of 'printf' and 'sprintf()' (*note
+ Control Letters::)
+
+ * The use of regexp constants, such as '/foo/', as expressions, where
+ they are equivalent to using the matching operator, as in '$0 ~
+ /foo/' (*note Using Constant Regexps::)
+
+ * Processing of escape sequences inside command-line variable
+ assignments (*note Assignment Options::)
+
+
+File: gawk.info, Node: POSIX, Next: BTL, Prev: SVR4, Up: Language History
+
+A.3 Changes Between SVR4 and POSIX 'awk'
+========================================
+
+The POSIX Command Language and Utilities standard for 'awk' (1992)
+introduced the following changes into the language:
+
+ * The use of '-W' for implementation-specific options (*note
+ Options::)
+
+ * The use of 'CONVFMT' for controlling the conversion of numbers to
+ strings (*note Conversion::)
+
+ * The concept of a numeric string and tighter comparison rules to go
+ with it (*note Typing and Comparison::)
+
+ * The use of predefined variables as function parameter names is
+ forbidden (*note Definition Syntax::)
+
+ * More complete documentation of many of the previously undocumented
+ features of the language
+
+ In 2012, a number of extensions that had been commonly available for
+many years were finally added to POSIX. They are:
+
+ * The 'fflush()' built-in function for flushing buffered output
+ (*note I/O Functions::)
+
+ * The 'nextfile' statement (*note Nextfile Statement::)
+
+ * The ability to delete all of an array at once with 'delete ARRAY'
+ (*note Delete::)
+
+ *Note Common Extensions:: for a list of common extensions not
+permitted by the POSIX standard.
+
+ The 2008 POSIX standard can be found online at
+<http://www.opengroup.org/onlinepubs/9699919799/>.
+
+
+File: gawk.info, Node: BTL, Next: POSIX/GNU, Prev: POSIX, Up: Language History
+
+A.4 Extensions in Brian Kernighan's 'awk'
+=========================================
+
+Brian Kernighan has made his version available via his home page (*note
+Other Versions::).
+
+ This minor node describes common extensions that originally appeared
+in his version of 'awk':
+
+ * The '**' and '**=' operators (*note Arithmetic Ops:: and *note
+ Assignment Ops::)
+
+ * The use of 'func' as an abbreviation for 'function' (*note
+ Definition Syntax::)
+
+ * The 'fflush()' built-in function for flushing buffered output
+ (*note I/O Functions::)
+
+ *Note Common Extensions:: for a full list of the extensions available
+in his 'awk'.
+
+
+File: gawk.info, Node: POSIX/GNU, Next: Feature History, Prev: BTL, Up: Language History
+
+A.5 Extensions in 'gawk' Not in POSIX 'awk'
+===========================================
+
+The GNU implementation, 'gawk', adds a large number of features. They
+can all be disabled with either the '--traditional' or '--posix' options
+(*note Options::).
+
+ A number of features have come and gone over the years. This minor
+node summarizes the additional features over POSIX 'awk' that are in the
+current version of 'gawk'.
+
+ * Additional predefined variables:
+
+ - The 'ARGIND', 'BINMODE', 'ERRNO', 'FIELDWIDTHS', 'FPAT',
+ 'IGNORECASE', 'LINT', 'PROCINFO', 'RT', and 'TEXTDOMAIN'
+ variables (*note Built-in Variables::)
+
+ * Special files in I/O redirections:
+
+ - The '/dev/stdin', '/dev/stdout', '/dev/stderr', and
+ '/dev/fd/N' special file names (*note Special Files::)
+
+ - The '/inet', '/inet4', and '/inet6' special files for TCP/IP
+ networking using '|&' to specify which version of the IP
+ protocol to use (*note TCP/IP Networking::)
+
+ * Changes and/or additions to the language:
+
+ - The '\x' escape sequence (*note Escape Sequences::)
+
+ - Full support for both POSIX and GNU regexps (*note Regexp::)
+
+ - The ability for 'FS' and for the third argument to 'split()'
+ to be null strings (*note Single Character Fields::)
+
+ - The ability for 'RS' to be a regexp (*note Records::)
+
+ - The ability to use octal and hexadecimal constants in 'awk'
+ program source code (*note Nondecimal-numbers::)
+
+ - The '|&' operator for two-way I/O to a coprocess (*note
+ Two-way I/O::)
+
+ - Indirect function calls (*note Indirect Calls::)
+
+ - Directories on the command line produce a warning and are
+ skipped (*note Command-line directories::)
+
+ - Output with 'print' and 'printf' need not be fatal (*note
+ Nonfatal::)
+
+ * New keywords:
+
+ - The 'BEGINFILE' and 'ENDFILE' special patterns (*note
+ BEGINFILE/ENDFILE::)
+
+ - The 'switch' statement (*note Switch Statement::)
+
+ * Changes to standard 'awk' functions:
+
+ - The optional second argument to 'close()' that allows closing
+ one end of a two-way pipe to a coprocess (*note Two-way I/O::)
+
+ - POSIX compliance for 'gsub()' and 'sub()' with '--posix'
+
+ - The 'length()' function accepts an array argument and returns
+ the number of elements in the array (*note String Functions::)
+
+ - The optional third argument to the 'match()' function for
+ capturing text-matching subexpressions within a regexp (*note
+ String Functions::)
+
+ - Positional specifiers in 'printf' formats for making
+ translations easier (*note Printf Ordering::)
+
+ - The 'split()' function's additional optional fourth argument,
+ which is an array to hold the text of the field separators
+ (*note String Functions::)
+
+ * Additional functions only in 'gawk':
+
+ - The 'gensub()', 'patsplit()', and 'strtonum()' functions for
+ more powerful text manipulation (*note String Functions::)
+
+ - The 'asort()' and 'asorti()' functions for sorting arrays
+ (*note Array Sorting::)
+
+ - The 'mktime()', 'systime()', and 'strftime()' functions for
+ working with timestamps (*note Time Functions::)
+
+ - The 'and()', 'compl()', 'lshift()', 'or()', 'rshift()', and
+ 'xor()' functions for bit manipulation (*note Bitwise
+ Functions::)
+
+ - The 'isarray()' function to check if a variable is an array or
+ not (*note Type Functions::)
+
+ - The 'bindtextdomain()', 'dcgettext()', and 'dcngettext()'
+ functions for internationalization (*note Programmer i18n::)
+
+ - The 'intdiv()' function for doing integer division and
+ remainder (*note Numeric Functions::)
+
+ * Changes and/or additions in the command-line options:
+
+ - The 'AWKPATH' environment variable for specifying a path
+ search for the '-f' command-line option (*note Options::)
+
+ - The 'AWKLIBPATH' environment variable for specifying a path
+ search for the '-l' command-line option (*note Options::)
+
+ - The '-b', '-c', '-C', '-d', '-D', '-e', '-E', '-g', '-h',
+ '-i', '-l', '-L', '-M', '-n', '-N', '-o', '-O', '-p', '-P',
+ '-r', '-s', '-S', '-t', and '-V' short options. Also, the
+ ability to use GNU-style long-named options that start with
+ '--', and the '--assign', '--bignum', '--characters-as-bytes',
+ '--copyright', '--debug', '--dump-variables', '--exec',
+ '--field-separator', '--file', '--gen-pot', '--help',
+ '--include', '--lint', '--lint-old', '--load',
+ '--non-decimal-data', '--optimize', '--no-optimize',
+ '--posix', '--pretty-print', '--profile', '--re-interval',
+ '--sandbox', '--source', '--traditional', '--use-lc-numeric',
+ and '--version' long options (*note Options::).
+
+ * Support for the following obsolete systems was removed from the
+ code and the documentation for 'gawk' version 4.0:
+
+ - Amiga
+
+ - Atari
+
+ - BeOS
+
+ - Cray
+
+ - MIPS RiscOS
+
+ - MS-DOS with the Microsoft Compiler
+
+ - MS-Windows with the Microsoft Compiler
+
+ - NeXT
+
+ - SunOS 3.x, Sun 386 (Road Runner)
+
+ - Tandem (non-POSIX)
+
+ - Prestandard VAX C compiler for VAX/VMS
+
+ - GCC for VAX and Alpha has not been tested for a while.
+
+ * Support for the following obsolete system was removed from the code
+ for 'gawk' version 4.1:
+
+ - Ultrix
+
+ * Support for the following systems was removed from the code for
+ 'gawk' version 4.2:
+
+ - MirBSD
+
+
+File: gawk.info, Node: Feature History, Next: Common Extensions, Prev: POSIX/GNU, Up: Language History
+
+A.6 History of 'gawk' Features
+==============================
+
+This minor node describes the features in 'gawk' over and above those in
+POSIX 'awk', in the order they were added to 'gawk'.
+
+ Version 2.10 of 'gawk' introduced the following features:
+
+ * The 'AWKPATH' environment variable for specifying a path search for
+ the '-f' command-line option (*note Options::).
+
+ * The 'IGNORECASE' variable and its effects (*note
+ Case-sensitivity::).
+
+ * The '/dev/stdin', '/dev/stdout', '/dev/stderr' and '/dev/fd/N'
+ special file names (*note Special Files::).
+
+ Version 2.13 of 'gawk' introduced the following features:
+
+ * The 'FIELDWIDTHS' variable and its effects (*note Constant Size::).
+
+ * The 'systime()' and 'strftime()' built-in functions for obtaining
+ and printing timestamps (*note Time Functions::).
+
+ * Additional command-line options (*note Options::):
+
+ - The '-W lint' option to provide error and portability checking
+ for both the source code and at runtime.
+
+ - The '-W compat' option to turn off the GNU extensions.
+
+ - The '-W posix' option for full POSIX compliance.
+
+ Version 2.14 of 'gawk' introduced the following feature:
+
+ * The 'next file' statement for skipping to the next data file (*note
+ Nextfile Statement::).
+
+ Version 2.15 of 'gawk' introduced the following features:
+
+ * New variables (*note Built-in Variables::):
+
+ - 'ARGIND', which tracks the movement of 'FILENAME' through
+ 'ARGV'.
+
+ - 'ERRNO', which contains the system error message when
+ 'getline' returns -1 or 'close()' fails.
+
+ * The '/dev/pid', '/dev/ppid', '/dev/pgrpid', and '/dev/user' special
+ file names. These have since been removed.
+
+ * The ability to delete all of an array at once with 'delete ARRAY'
+ (*note Delete::).
+
+ * Command-line option changes (*note Options::):
+
+ - The ability to use GNU-style long-named options that start
+ with '--'.
+
+ - The '--source' option for mixing command-line and library-file
+ source code.
+
+ Version 3.0 of 'gawk' introduced the following features:
+
+ * New or changed variables:
+
+ - 'IGNORECASE' changed, now applying to string comparison as
+ well as regexp operations (*note Case-sensitivity::).
+
+ - 'RT', which contains the input text that matched 'RS' (*note
+ Records::).
+
+ * Full support for both POSIX and GNU regexps (*note Regexp::).
+
+ * The 'gensub()' function for more powerful text manipulation (*note
+ String Functions::).
+
+ * The 'strftime()' function acquired a default time format, allowing
+ it to be called with no arguments (*note Time Functions::).
+
+ * The ability for 'FS' and for the third argument to 'split()' to be
+ null strings (*note Single Character Fields::).
+
+ * The ability for 'RS' to be a regexp (*note Records::).
+
+ * The 'next file' statement became 'nextfile' (*note Nextfile
+ Statement::).
+
+ * The 'fflush()' function from BWK 'awk' (then at Bell Laboratories;
+ *note I/O Functions::).
+
+ * New command-line options:
+
+ - The '--lint-old' option to warn about constructs that are not
+ available in the original Version 7 Unix version of 'awk'
+ (*note V7/SVR3.1::).
+
+     - The '-m' option from BWK 'awk'.  (Brian was still at Bell
+       Laboratories at the time.)  This was later removed from both
+       his 'awk' and 'gawk'.
+
+ - The '--re-interval' option to provide interval expressions in
+ regexps (*note Regexp Operators::).
+
+ - The '--traditional' option was added as a better name for
+ '--compat' (*note Options::).
+
+ * The use of GNU Autoconf to control the configuration process (*note
+ Quick Installation::).
+
+ * Amiga support. This has since been removed.
+
+ Version 3.1 of 'gawk' introduced the following features:
+
+ * New variables (*note Built-in Variables::):
+
+ - 'BINMODE', for non-POSIX systems, which allows binary I/O for
+ input and/or output files (*note PC Using::).
+
+ - 'LINT', which dynamically controls lint warnings.
+
+ - 'PROCINFO', an array for providing process-related
+ information.
+
+ - 'TEXTDOMAIN', for setting an application's
+ internationalization text domain (*note
+ Internationalization::).
+
+ * The ability to use octal and hexadecimal constants in 'awk' program
+ source code (*note Nondecimal-numbers::).
+
+ * The '|&' operator for two-way I/O to a coprocess (*note Two-way
+ I/O::).
+
+ * The '/inet' special files for TCP/IP networking using '|&' (*note
+ TCP/IP Networking::).
+
+ * The optional second argument to 'close()' that allows closing one
+ end of a two-way pipe to a coprocess (*note Two-way I/O::).
+
+ * The optional third argument to the 'match()' function for capturing
+ text-matching subexpressions within a regexp (*note String
+ Functions::).
+
+ * Positional specifiers in 'printf' formats for making translations
+ easier (*note Printf Ordering::).
+
+ * A number of new built-in functions:
+
+ - The 'asort()' and 'asorti()' functions for sorting arrays
+ (*note Array Sorting::).
+
+ - The 'bindtextdomain()', 'dcgettext()' and 'dcngettext()'
+ functions for internationalization (*note Programmer i18n::).
+
+ - The 'extension()' function and the ability to add new built-in
+ functions dynamically (*note Dynamic Extensions::).
+
+ - The 'mktime()' function for creating timestamps (*note Time
+ Functions::).
+
+ - The 'and()', 'or()', 'xor()', 'compl()', 'lshift()',
+ 'rshift()', and 'strtonum()' functions (*note Bitwise
+ Functions::).
+
+ * The support for 'next file' as two words was removed completely
+ (*note Nextfile Statement::).
+
+ * Additional command-line options (*note Options::):
+
+ - The '--dump-variables' option to print a list of all global
+ variables.
+
+ - The '--exec' option, for use in CGI scripts.
+
+ - The '--gen-po' command-line option and the use of a leading
+ underscore to mark strings that should be translated (*note
+ String Extraction::).
+
+ - The '--non-decimal-data' option to allow non-decimal input
+ data (*note Nondecimal Data::).
+
+ - The '--profile' option and 'pgawk', the profiling version of
+ 'gawk', for producing execution profiles of 'awk' programs
+ (*note Profiling::).
+
+ - The '--use-lc-numeric' option to force 'gawk' to use the
+ locale's decimal point for parsing input data (*note
+ Conversion::).
+
+ * The use of GNU Automake to help in standardizing the configuration
+ process (*note Quick Installation::).
+
+ * The use of GNU 'gettext' for 'gawk''s own message output (*note
+ Gawk I18N::).
+
+ * BeOS support. This was later removed.
+
+ * Tandem support. This was later removed.
+
+ * The Atari port became officially unsupported and was later removed
+ entirely.
+
+ * The source code changed to use ISO C standard-style function
+ definitions.
+
+ * POSIX compliance for 'sub()' and 'gsub()' (*note Gory Details::).
+
+ * The 'length()' function was extended to accept an array argument
+ and return the number of elements in the array (*note String
+ Functions::).
+
+ * The 'strftime()' function acquired a third argument to enable
+ printing times as UTC (*note Time Functions::).
+
+ Version 4.0 of 'gawk' introduced the following features:
+
+ * Variable additions:
+
+ - 'FPAT', which allows you to specify a regexp that matches the
+ fields, instead of matching the field separator (*note
+ Splitting By Content::).
+
+ - If 'PROCINFO["sorted_in"]' exists, 'for(iggy in foo)' loops
+ sort the indices before looping over them. The value of this
+ element provides control over how the indices are sorted
+ before the loop traversal starts (*note Controlling
+ Scanning::).
+
+ - 'PROCINFO["strftime"]', which holds the default format for
+ 'strftime()' (*note Time Functions::).
+
+ * The special files '/dev/pid', '/dev/ppid', '/dev/pgrpid' and
+ '/dev/user' were removed.
+
+ * Support for IPv6 was added via the '/inet6' special file. '/inet4'
+ forces IPv4 and '/inet' chooses the system default, which is
+ probably IPv4 (*note TCP/IP Networking::).
+
+ * The use of '\s' and '\S' escape sequences in regular expressions
+ (*note GNU Regexp Operators::).
+
+ * Interval expressions became part of default regular expressions
+ (*note Regexp Operators::).
+
+ * POSIX character classes work even with '--traditional' (*note
+ Regexp Operators::).
+
+ * 'break' and 'continue' became invalid outside a loop, even with
+ '--traditional' (*note Break Statement::, and also see *note
+ Continue Statement::).
+
+ * 'fflush()', 'nextfile', and 'delete ARRAY' are allowed if '--posix'
+ or '--traditional', since they are all now part of POSIX.
+
+ * An optional third argument to 'asort()' and 'asorti()', specifying
+ how to sort (*note String Functions::).
+
+ * The behavior of 'fflush()' changed to match BWK 'awk' and for
+ POSIX; now both 'fflush()' and 'fflush("")' flush all open output
+ redirections (*note I/O Functions::).
+
+   * The 'isarray()' function, which determines whether an item is an
+     array, making it possible to traverse arrays of arrays (*note Type
+     Functions::).
+
+   * The 'patsplit()' function, which provides the same splitting
+     capability as 'FPAT' (*note String Functions::).
+
+ * An optional fourth argument to the 'split()' function, which is an
+ array to hold the values of the separators (*note String
+ Functions::).
+
+ * Arrays of arrays (*note Arrays of Arrays::).
+
+ * The 'BEGINFILE' and 'ENDFILE' special patterns (*note
+ BEGINFILE/ENDFILE::).
+
+ * Indirect function calls (*note Indirect Calls::).
+
+ * 'switch' / 'case' are enabled by default (*note Switch
+ Statement::).
+
+ * Command-line option changes (*note Options::):
+
+ - The '-b' and '--characters-as-bytes' options which prevent
+ 'gawk' from treating input as a multibyte string.
+
+ - The redundant '--compat', '--copyleft', and '--usage' long
+ options were removed.
+
+ - The '--gen-po' option was finally renamed to the correct
+ '--gen-pot'.
+
+ - The '--sandbox' option which disables certain features.
+
+ - All long options acquired corresponding short options, for use
+ in '#!' scripts.
+
+ * Directories named on the command line now produce a warning, not a
+ fatal error, unless '--posix' or '--traditional' are used (*note
+ Command-line directories::).
+
+ * The 'gawk' internals were rewritten, bringing the 'dgawk' debugger
+ and possibly improved performance (*note Debugger::).
+
+ * Per the GNU Coding Standards, dynamic extensions must now define a
+ global symbol indicating that they are GPL-compatible (*note Plugin
+ License::).
+
+ * In POSIX mode, string comparisons use 'strcoll()' / 'wcscoll()'
+ (*note POSIX String Comparison::).
+
+ * The option for raw sockets was removed, since it was never
+ implemented (*note TCP/IP Networking::).
+
+ * Ranges of the form '[d-h]' are treated as if they were in the C
+ locale, no matter what kind of regexp is being used, and even if
+ '--posix' (*note Ranges and Locales::).
+
+ * Support was removed for the following systems:
+
+ - Atari
+
+ - Amiga
+
+ - BeOS
+
+ - Cray
+
+ - MIPS RiscOS
+
+ - MS-DOS with Microsoft Compiler
+
+ - MS-Windows with Microsoft Compiler
+
+ - NeXT
+
+ - SunOS 3.x, Sun 386 (Road Runner)
+
+ - Tandem (non-POSIX)
+
+ - Prestandard VAX C compiler for VAX/VMS
+
+ Version 4.1 of 'gawk' introduced the following features:
+
+ * Three new arrays: 'SYMTAB', 'FUNCTAB', and
+ 'PROCINFO["identifiers"]' (*note Auto-set::).
+
+   * The three executables 'gawk', 'pgawk', and 'dgawk' were merged
+     into one, named just 'gawk'.  As a result, the command-line
+     options changed.
+
+ * Command-line option changes (*note Options::):
+
+ - The '-D' option invokes the debugger.
+
+ - The '-i' and '--include' options load 'awk' library files.
+
+ - The '-l' and '--load' options load compiled dynamic
+ extensions.
+
+ - The '-M' and '--bignum' options enable MPFR.
+
+ - The '-o' option only does pretty-printing.
+
+ - The '-p' option is used for profiling.
+
+ - The '-R' option was removed.
+
+ * Support for high precision arithmetic with MPFR (*note Arbitrary
+ Precision Arithmetic::).
+
+   * The 'and()', 'or()', and 'xor()' functions changed to allow any
+ number of arguments, with a minimum of two (*note Bitwise
+ Functions::).
+
+ * The dynamic extension interface was completely redone (*note
+ Dynamic Extensions::).
+
+ * Redirected 'getline' became allowed inside 'BEGINFILE' and
+ 'ENDFILE' (*note BEGINFILE/ENDFILE::).
+
+ * The 'where' command was added to the debugger (*note Execution
+ Stack::).
+
+ * Support for Ultrix was removed.
+
+ Version 4.2 introduced the following changes:
+
+ * Changes to 'ENVIRON' are reflected into 'gawk''s environment and
+ that of programs that it runs. *Note Auto-set::.
+
+ * The '--pretty-print' option no longer runs the 'awk' program too.
+ *Note Options::.
+
+ * The 'igawk' program and its manual page are no longer installed
+ when 'gawk' is built. *Note Igawk Program::.
+
+ * The 'intdiv()' function. *Note Numeric Functions::.
+
+ * The maximum number of hexadecimal digits in '\x' escapes is now
+ two. *Note Escape Sequences::.
+
+ * Nonfatal output with 'print' and 'printf'. *Note Nonfatal::.
+
+ * For many years, POSIX specified that default field splitting only
+ allowed spaces and tabs to separate fields, and this was how 'gawk'
+ behaved with '--posix'. As of 2013, the standard restored
+ historical behavior, and now default field splitting with '--posix'
+ also allows newlines to separate fields.
+
+ * Support for MirBSD was removed.
+
+ * Support for GNU/Linux on Alpha was removed.
+
+
+File: gawk.info, Node: Common Extensions, Next: Ranges and Locales, Prev: Feature History, Up: Language History
+
+A.7 Common Extensions Summary
+=============================
+
+The following table summarizes the common extensions supported by
+'gawk', Brian Kernighan's 'awk', and 'mawk', the three most widely used
+freely available versions of 'awk' (*note Other Versions::).
+
+Feature                      BWK 'awk'   'mawk'   'gawk'   Now standard
+-------------------------------------------------------------------------
+'\x' escape sequence            X          X        X
+'FS' as null string             X          X        X
+'/dev/stdin' special file       X          X        X
+'/dev/stdout' special file      X          X        X
+'/dev/stderr' special file      X          X        X
+'delete' without subscript      X          X        X           X
+'fflush()' function             X          X        X           X
+'length()' of an array          X          X        X
+'nextfile' statement            X          X        X           X
+'**' and '**=' operators        X                   X
+'func' keyword                  X                   X
+'BINMODE' variable                         X        X
+'RS' as regexp                             X        X
+Time-related functions                     X        X
+
+
+File: gawk.info, Node: Ranges and Locales, Next: Contributors, Prev: Common Extensions, Up: Language History
+
+A.8 Regexp Ranges and Locales: A Long Sad Story
+===============================================
+
+This minor node describes the confusing history of ranges within regular
+expressions and their interactions with locales, and how this affected
+different versions of 'gawk'.
+
+ The original Unix tools that worked with regular expressions defined
+character ranges (such as '[a-z]') to match any character between the
+first character in the range and the last character in the range,
+inclusive. Ordering was based on the numeric value of each character in
+the machine's native character set. Thus, on ASCII-based systems,
+'[a-z]' matched all the lowercase letters, and only the lowercase
+letters, as the numeric values for the letters from 'a' through 'z' were
+contiguous. (On an EBCDIC system, the range '[a-z]' includes additional
+nonalphabetic characters as well.)
+
+ Almost all introductory Unix literature explained range expressions
+as working in this fashion, and in particular, would teach that the
+"correct" way to match lowercase letters was with '[a-z]', and that
+'[A-Z]' was the "correct" way to match uppercase letters. And indeed,
+this was true.(1)
+
+ The 1992 POSIX standard introduced the idea of locales (*note
+Locales::). Because many locales include other letters besides the
+plain 26 letters of the English alphabet, the POSIX standard added
+character classes (*note Bracket Expressions::) as a way to match
+different kinds of characters besides the traditional ones in the ASCII
+character set.
+
+ However, the standard _changed_ the interpretation of range
+expressions. In the '"C"' and '"POSIX"' locales, a range expression
+like '[a-dx-z]' is still equivalent to '[abcdxyz]', as in ASCII. But
+outside those locales, the ordering was defined to be based on
+"collation order".
+
+ What does that mean? In many locales, 'A' and 'a' are both less than
+'B'. In other words, these locales sort characters in dictionary order,
+and '[a-dx-z]' is typically not equivalent to '[abcdxyz]'; instead, it
+might be equivalent to '[ABCXYabcdxyz]', for example.
+
+ This point needs to be emphasized: much literature teaches that you
+should use '[a-z]' to match a lowercase character. But on systems with
+non-ASCII locales, this also matches all of the uppercase characters
+except 'A' or 'Z'! This was a continuous cause of confusion, even well
+into the twenty-first century.
+
+ To demonstrate these issues, the following example uses the 'sub()'
+function, which does text replacement (*note String Functions::). Here,
+the intent is to remove trailing uppercase characters:
+
+ $ echo something1234abc | gawk-3.1.8 '{ sub("[A-Z]*$", ""); print }'
+ -| something1234a
+
+This output is unexpected, as the 'bc' at the end of 'something1234abc'
+should not normally match '[A-Z]*'. This result is due to the locale
+setting (and thus you may not see it on your system).
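+
+   A more robust way to express the intent, independent of the locale,
+is to use a POSIX character class (*note Bracket Expressions::) instead
+of an explicit range.  For example:
+
+     $ echo something1234abc | gawk '{ sub("[[:upper:]]*$", ""); print }'
+     -| something1234abc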
+
+ Similar considerations apply to other ranges. For example, '["-/]'
+is perfectly valid in ASCII, but is not valid in many Unicode locales,
+such as 'en_US.UTF-8'.
+
+ Early versions of 'gawk' used regexp matching code that was not
+locale-aware, so ranges had their traditional interpretation.
+
+ When 'gawk' switched to using locale-aware regexp matchers, the
+problems began; especially as both GNU/Linux and commercial Unix vendors
+started implementing non-ASCII locales, _and making them the default_.
+Perhaps the most frequently asked question became something like, "Why
+does '[A-Z]' match lowercase letters?!?"
+
+ This situation existed for close to 10 years, if not more, and the
+'gawk' maintainer grew weary of trying to explain that 'gawk' was being
+nicely standards-compliant, and that the issue was in the user's locale.
+During the development of version 4.0, he modified 'gawk' to always
+treat ranges in the original, pre-POSIX fashion, unless '--posix' was
+used (*note Options::).(2)
+
+ Fortunately, shortly before the final release of 'gawk' 4.0, the
+maintainer learned that the 2008 standard had changed the definition of
+ranges, such that outside the '"C"' and '"POSIX"' locales, the meaning
+of range expressions was _undefined_.(3)
+
+ By using this lovely technical term, the standard gives license to
+implementers to implement ranges in whatever way they choose. The
+'gawk' maintainer chose to apply the pre-POSIX meaning both with the
+default regexp matching and when '--traditional' or '--posix' are used.
+In all cases 'gawk' remains POSIX-compliant.
+
+ ---------- Footnotes ----------
+
+ (1) And Life was good.
+
+ (2) And thus was born the Campaign for Rational Range Interpretation
+(or RRI). A number of GNU tools have already implemented this change, or
+will soon. Thanks to Karl Berry for coining the phrase "Rational Range
+Interpretation."
+
+ (3) See the standard
+(http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05)
+and its rationale
+(http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html#tag_21_09_03_05).
+
+
+File: gawk.info, Node: Contributors, Next: History summary, Prev: Ranges and Locales, Up: Language History
+
+A.9 Major Contributors to 'gawk'
+================================
+
+ Always give credit where credit is due.
+ -- _Anonymous_
+
+ This minor node names the major contributors to 'gawk' and/or this
+Info file, in approximate chronological order:
+
+ * Dr. Alfred V. Aho, Dr. Peter J. Weinberger, and Dr. Brian W.
+ Kernighan, all of Bell Laboratories, designed and implemented Unix
+ 'awk', from which 'gawk' gets the majority of its feature set.
+
+ * Paul Rubin did the initial design and implementation in 1986, and
+ wrote the first draft (around 40 pages) of this Info file.
+
+ * Jay Fenlason finished the initial implementation.
+
+ * Diane Close revised the first draft of this Info file, bringing it
+ to around 90 pages.
+
+ * Richard Stallman helped finish the implementation and the initial
+ draft of this Info file. He is also the founder of the FSF and the
+ GNU Project.
+
+ * John Woods contributed parts of the code (mostly fixes) in the
+ initial version of 'gawk'.
+
+ * In 1988, David Trueman took over primary maintenance of 'gawk',
+ making it compatible with "new" 'awk', and greatly improving its
+ performance.
+
+ * Conrad Kwok, Scott Garfinkle, and Kent Williams did the initial
+ ports to MS-DOS with various versions of MSC.
+
+ * Pat Rankin provided the VMS port and its documentation.
+
+ * Hal Peterson provided help in porting 'gawk' to Cray systems.
+ (This is no longer supported.)
+
+ * Kai Uwe Rommel provided the initial port to OS/2 and its
+ documentation.
+
+ * Michal Jaegermann provided the port to Atari systems and its
+ documentation. (This port is no longer supported.) He continues
+ to provide portability checking, and has done a lot of work to make
+ sure 'gawk' works on non-32-bit systems.
+
+ * Fred Fish provided the port to Amiga systems and its documentation.
+ (With Fred's sad passing, this is no longer supported.)
+
+ * Scott Deifik maintained the MS-DOS port using DJGPP.
+
+ * Eli Zaretskii currently maintains the MS-Windows port using MinGW.
+
+ * Juan Grigera provided a port to Windows32 systems. (This is no
+ longer supported.)
+
+ * For many years, Dr. Darrel Hankerson acted as coordinator for the
+ various ports to different PC platforms and created binary
+ distributions for various PC operating systems. He was also
+ instrumental in keeping the documentation up to date for the
+ various PC platforms.
+
+ * Christos Zoulas provided the 'extension()' built-in function for
+ dynamically adding new functions. (This was obsoleted at 'gawk'
+ 4.1.)
+
+ * Ju"rgen Kahrs contributed the initial version of the TCP/IP
+ networking code and documentation, and motivated the inclusion of
+ the '|&' operator.
+
+ * Stephen Davies provided the initial port to Tandem systems and its
+ documentation. (However, this is no longer supported.) He was
+ also instrumental in the initial work to integrate the byte-code
+ internals into the 'gawk' code base.
+
+ * Matthew Woehlke provided improvements for Tandem's POSIX-compliant
+ systems.
+
+ * Martin Brown provided the port to BeOS and its documentation.
+ (This is no longer supported.)
+
+ * Arno Peters did the initial work to convert 'gawk' to use GNU
+ Automake and GNU 'gettext'.
+
+ * Alan J. Broder provided the initial version of the 'asort()'
+ function as well as the code for the optional third argument to the
+ 'match()' function.
+
+ * Andreas Buening updated the 'gawk' port for OS/2.
+
+ * Isamu Hasegawa, of IBM in Japan, contributed support for multibyte
+ characters.
+
+ * Michael Benzinger contributed the initial code for 'switch'
+ statements.
+
+ * Patrick T.J. McPhee contributed the code for dynamic loading in
+ Windows32 environments. (This is no longer supported.)
+
+ * Anders Wallin helped keep the VMS port going for several years.
+
+ * Assaf Gordon contributed the code to implement the '--sandbox'
+ option.
+
+ * John Haque made the following contributions:
+
+ - The modifications to convert 'gawk' into a byte-code
+ interpreter, including the debugger
+
+ - The addition of true arrays of arrays
+
+ - The additional modifications for support of
+ arbitrary-precision arithmetic
+
+ - The initial text of *note Arbitrary Precision Arithmetic::
+
+ - The work to merge the three versions of 'gawk' into one, for
+ the 4.1 release
+
+ - Improved array internals for arrays indexed by integers
+
+ - The improved array sorting features were also driven by John,
+ together with Pat Rankin
+
+ * Panos Papadopoulos contributed the original text for *note Include
+ Files::.
+
+ * Efraim Yawitz contributed the original text for *note Debugger::.
+
+ * The development of the extension API first released with 'gawk' 4.1
+ was driven primarily by Arnold Robbins and Andrew Schorr, with
+ notable contributions from the rest of the development team.
+
+ * John Malmberg contributed significant improvements to the OpenVMS
+ port and the related documentation.
+
+ * Antonio Giovanni Colombo rewrote a number of examples in the early
+ chapters that were severely dated, for which I am incredibly
+ grateful.
+
+ * Arnold Robbins has been working on 'gawk' since 1988, at first
+ helping David Trueman, and as the primary maintainer since around
+ 1994.
+
+
+File: gawk.info, Node: History summary, Prev: Contributors, Up: Language History
+
+A.10 Summary
+============
+
+ * The 'awk' language has evolved over time. The first release was
+ with V7 Unix, circa 1978. In 1987, for System V Release 3.1, major
+ additions, including user-defined functions, were made to the
+ language. Additional changes were made for System V Release 4, in
+ 1989. Since then, further minor changes have happened under the
+ auspices of the POSIX standard.
+
+ * Brian Kernighan's 'awk' provides a small number of extensions that
+ are implemented in common with other versions of 'awk'.
+
+ * 'gawk' provides a large number of extensions over POSIX 'awk'.
+ They can be disabled with either the '--traditional' or '--posix'
+ options.
+
+ * The interaction of POSIX locales and regexp matching in 'gawk' has
+ been confusing over the years. Today, 'gawk' implements Rational
+ Range Interpretation, where ranges of the form '[a-z]' match _only_
+     the characters numerically between 'a' and 'z' in the machine's
+ native character set. Usually this is ASCII, but it can be EBCDIC
+ on IBM S/390 systems.
+
+ * Many people have contributed to 'gawk' development over the years.
+ We hope that the list provided in this major node is complete and
+ gives the appropriate credit where credit is due.
+
+
+File: gawk.info, Node: Installation, Next: Notes, Prev: Language History, Up: Top
+
+Appendix B Installing 'gawk'
+****************************
+
+This appendix provides instructions for installing 'gawk' on the various
+platforms that are supported by the developers. The primary developer
+supports GNU/Linux (and Unix), whereas the other ports are contributed.
+*Note Bugs:: for the email addresses of the people who maintain the
+respective ports.
+
+* Menu:
+
+* Gawk Distribution:: What is in the 'gawk' distribution.
+* Unix Installation:: Installing 'gawk' under various
+ versions of Unix.
+* Non-Unix Installation:: Installation on Other Operating Systems.
+* Bugs:: Reporting Problems and Bugs.
+* Other Versions:: Other freely available 'awk'
+ implementations.
+* Installation summary:: Summary of installation.
+
+
+File: gawk.info, Node: Gawk Distribution, Next: Unix Installation, Up: Installation
+
+B.1 The 'gawk' Distribution
+===========================
+
+This minor node describes how to get the 'gawk' distribution, how to
+extract it, and then what is in the various files and subdirectories.
+
+* Menu:
+
+* Getting:: How to get the distribution.
+* Extracting:: How to extract the distribution.
+* Distribution contents:: What is in the distribution.
+
+
+File: gawk.info, Node: Getting, Next: Extracting, Up: Gawk Distribution
+
+B.1.1 Getting the 'gawk' Distribution
+-------------------------------------
+
+There are two ways to get GNU software:
+
+ * Copy it from someone else who already has it.
+
+ * Retrieve 'gawk' from the Internet host 'ftp.gnu.org', in the
+ directory '/gnu/gawk'. Both anonymous 'ftp' and 'http' access are
+ supported. If you have the 'wget' program, you can use a command
+ like the following:
+
+ wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.4.tar.gz
+
+ The GNU software archive is mirrored around the world. The
+up-to-date list of mirror sites is available from the main FSF website
+(http://www.gnu.org/order/ftp.html). Try to use one of the mirrors;
+they will be less busy, and you can usually find one closer to your
+site.
+
+ You may also retrieve the 'gawk' source code from the official Git
+repository; for more information see *note Accessing The Source::.
+
+
+File: gawk.info, Node: Extracting, Next: Distribution contents, Prev: Getting, Up: Gawk Distribution
+
+B.1.2 Extracting the Distribution
+---------------------------------
+
+'gawk' is distributed as several 'tar' files compressed with different
+compression programs: 'gzip', 'bzip2', and 'xz'. For simplicity, the
+rest of these instructions assume you are using the one compressed with
+the GNU Gzip program ('gzip').
+
+ Once you have the distribution (e.g., 'gawk-4.1.4.tar.gz'), use
+'gzip' to expand the file and then use 'tar' to extract it. You can use
+the following pipeline to produce the 'gawk' distribution:
+
+ gzip -d -c gawk-4.1.4.tar.gz | tar -xvpf -
+
+ On a system with GNU 'tar', you can let 'tar' do the decompression
+for you:
+
+ tar -xvpzf gawk-4.1.4.tar.gz
+
+Extracting the archive creates a directory named 'gawk-4.1.4' in the
+current directory.
+
+ The distribution file name is of the form 'gawk-V.R.P.tar.gz'. The V
+represents the major version of 'gawk', the R represents the current
+release of version V, and the P represents a "patch level", meaning that
+minor bugs have been fixed in the release. The current patch level is
+4, but when retrieving distributions, you should get the version with
+the highest version, release, and patch level. (Note, however, that
+patch levels greater than or equal to 70 denote "beta" or nonproduction
+software; you might not want to retrieve such a version unless you don't
+mind experimenting.) If you are not on a Unix or GNU/Linux system, you
+need to make other arrangements for getting and extracting the 'gawk'
+distribution. You should consult a local expert.
+
+
+File: gawk.info, Node: Distribution contents, Prev: Extracting, Up: Gawk Distribution
+
+B.1.3 Contents of the 'gawk' Distribution
+-----------------------------------------
+
+The 'gawk' distribution has a number of C source files, documentation
+files, subdirectories, and files related to the configuration process
+(*note Unix Installation::), as well as several subdirectories related
+to different non-Unix operating systems:
+
+Various '.c', '.y', and '.h' files
+ These files contain the actual 'gawk' source code.
+
+'ABOUT-NLS'
+ A file containing information about GNU 'gettext' and translations.
+
+'AUTHORS'
+ A file with some information about the authorship of 'gawk'. It
+ exists only to satisfy the pedants at the Free Software Foundation.
+
+'README'
+'README_d/README.*'
+ Descriptive files: 'README' for 'gawk' under Unix and the rest for
+ the various hardware and software combinations.
+
+'INSTALL'
+ A file providing an overview of the configuration and installation
+ process.
+
+'ChangeLog'
+ A detailed list of source code changes as bugs are fixed or
+ improvements made.
+
+'ChangeLog.0'
+ An older list of source code changes.
+
+'NEWS'
+ A list of changes to 'gawk' since the last release or patch.
+
+'NEWS.0'
+ An older list of changes to 'gawk'.
+
+'COPYING'
+ The GNU General Public License.
+
+'POSIX.STD'
+ A description of behaviors in the POSIX standard for 'awk' that are
+ left undefined, or where 'gawk' may not comply fully, as well as a
+ list of things that the POSIX standard should describe but does
+ not.
+
+'doc/awkforai.txt'
+ Pointers to the original draft of a short article describing why
+ 'gawk' is a good language for artificial intelligence (AI)
+ programming.
+
+'doc/bc_notes'
+ A brief description of 'gawk''s "byte code" internals.
+
+'doc/README.card'
+'doc/ad.block'
+'doc/awkcard.in'
+'doc/cardfonts'
+'doc/colors'
+'doc/macros'
+'doc/no.colors'
+'doc/setter.outline'
+ The 'troff' source for a five-color 'awk' reference card. A modern
+ version of 'troff' such as GNU 'troff' ('groff') is needed to
+ produce the color version. See the file 'README.card' for
+ instructions if you have an older 'troff'.
+
+'doc/gawk.1'
+ The 'troff' source for a manual page describing 'gawk'. This is
+ distributed for the convenience of Unix users.
+
+'doc/gawktexi.in'
+'doc/sidebar.awk'
+ The Texinfo source file for this Info file. It should be processed
+ by 'doc/sidebar.awk' before processing with 'texi2dvi' or
+ 'texi2pdf' to produce a printed document, and with 'makeinfo' to
+ produce an Info or HTML file. The 'Makefile' takes care of this
+ processing and produces printable output via 'texi2dvi' or
+ 'texi2pdf'.
+
+'doc/gawk.texi'
+ The file produced after processing 'gawktexi.in' with
+ 'sidebar.awk'.
+
+'doc/gawk.info'
+ The generated Info file for this Info file.
+
+'doc/gawkinet.texi'
+ The Texinfo source file for *note (General Introduction, gawkinet,
+ TCP/IP Internetworking with 'gawk')Top::. It should be processed
+ with TeX (via 'texi2dvi' or 'texi2pdf') to produce a printed
+ document and with 'makeinfo' to produce an Info or HTML file.
+
+'doc/gawkinet.info'
+ The generated Info file for 'TCP/IP Internetworking with 'gawk''.
+
+'doc/igawk.1'
+ The 'troff' source for a manual page describing the 'igawk' program
+ presented in *note Igawk Program::. (Since 'gawk' can do its own
+ '@include' processing, neither 'igawk' nor 'igawk.1' are
+ installed.)
+
+'doc/Makefile.in'
+ The input file used during the configuration process to generate
+ the actual 'Makefile' for creating the documentation.
+
+'Makefile.am'
+'*/Makefile.am'
+ Files used by the GNU Automake software for generating the
+ 'Makefile.in' files used by Autoconf and 'configure'.
+
+'Makefile.in'
+'aclocal.m4'
+'bisonfix.awk'
+'config.guess'
+'configh.in'
+'configure.ac'
+'configure'
+'custom.h'
+'depcomp'
+'install-sh'
+'missing_d/*'
+'mkinstalldirs'
+'m4/*'
+ These files and subdirectories are used when configuring and
+ compiling 'gawk' for various Unix systems. Most of them are
+ explained in *note Unix Installation::. The rest are there to
+ support the main infrastructure.
+
+'po/*'
+ The 'po' library contains message translations.
+
+'awklib/extract.awk'
+'awklib/Makefile.am'
+'awklib/Makefile.in'
+'awklib/eg/*'
+ The 'awklib' directory contains a copy of 'extract.awk' (*note
+ Extract Program::), which can be used to extract the sample
+ programs from the Texinfo source file for this Info file. It also
+ contains a 'Makefile.in' file, which 'configure' uses to generate a
+ 'Makefile'. 'Makefile.am' is used by GNU Automake to create
+ 'Makefile.in'. The library functions from *note Library
+ Functions::, are included as ready-to-use files in the 'gawk'
+ distribution. They are installed as part of the installation
+ process. The rest of the programs in this Info file are available
+ in appropriate subdirectories of 'awklib/eg'.
+
+'extension/*'
+ The source code, manual pages, and infrastructure files for the
+ sample extensions included with 'gawk'. *Note Dynamic
+ Extensions::, for more information.
+
+'extras/*'
+ Additional non-essential files. Currently, this directory contains
+ some shell startup files to be installed in '/etc/profile.d' to aid
+ in manipulating the 'AWKPATH' and 'AWKLIBPATH' environment
+ variables. *Note Shell Startup Files::, for more information.
+
+'posix/*'
+ Files needed for building 'gawk' on POSIX-compliant systems.
+
+'pc/*'
+ Files needed for building 'gawk' under MS-Windows (*note PC
+ Installation:: for details).
+
+'vms/*'
+ Files needed for building 'gawk' under Vax/VMS and OpenVMS (*note
+ VMS Installation:: for details).
+
+'test/*'
+ A test suite for 'gawk'. You can use 'make check' from the
+ top-level 'gawk' directory to run your version of 'gawk' against
+ the test suite. If 'gawk' successfully passes 'make check', then
+ you can be confident of a successful port.
+
+
+File: gawk.info, Node: Unix Installation, Next: Non-Unix Installation, Prev: Gawk Distribution, Up: Installation
+
+B.2 Compiling and Installing 'gawk' on Unix-Like Systems
+========================================================
+
+Usually, you can compile and install 'gawk' by typing only two commands.
+However, if you use an unusual system, you may need to configure 'gawk'
+for your system yourself.
+
+* Menu:
+
+* Quick Installation:: Compiling 'gawk' under Unix.
+* Shell Startup Files:: Shell convenience functions.
+* Additional Configuration Options:: Other compile-time options.
+* Configuration Philosophy:: How it's all supposed to work.
+
+
+File: gawk.info, Node: Quick Installation, Next: Shell Startup Files, Up: Unix Installation
+
+B.2.1 Compiling 'gawk' for Unix-Like Systems
+--------------------------------------------
+
+The normal installation steps should work on all modern commercial
+Unix-derived systems, GNU/Linux, BSD-based systems, and the Cygwin
+environment for MS-Windows.
+
+ After you have extracted the 'gawk' distribution, 'cd' to
+'gawk-4.1.4'. As with most GNU software, you configure 'gawk' for your
+system by running the 'configure' program. This program is a Bourne
+shell script that is generated automatically using GNU Autoconf. (The
+Autoconf software is described fully starting with *note (Autoconf,
+autoconf,Autoconf---Generating Automatic Configuration Scripts)Top::.)
+
+ To configure 'gawk', simply run 'configure':
+
+ sh ./configure
+
+ This produces a 'Makefile' and 'config.h' tailored to your system.
+The 'config.h' file describes various facts about your system. You
+might want to edit the 'Makefile' to change the 'CFLAGS' variable, which
+controls the command-line options that are passed to the C compiler
+(such as optimization levels or compiling for debugging).
+
+ Alternatively, you can add your own values for most 'make' variables
+on the command line, such as 'CC' and 'CFLAGS', when running
+'configure':
+
+ CC=cc CFLAGS=-g sh ./configure
+
+See the file 'INSTALL' in the 'gawk' distribution for all the details.
+
+ After you have run 'configure' and possibly edited the 'Makefile',
+type:
+
+ make
+
+Shortly thereafter, you should have an executable version of 'gawk'.
+That's all there is to it! To verify that 'gawk' is working properly,
+run 'make check'. All of the tests should succeed. If these steps do
+not work, or if any of the tests fail, check the files in the 'README_d'
+directory to see if you've found a known problem. If the failure is not
+described there, send in a bug report (*note Bugs::).
+
+ Of course, once you've built 'gawk', it is likely that you will wish
+to install it. To do so, you need to run the command 'make install', as
+a user with the appropriate permissions. How to do this varies by
+system, but on many systems you can use the 'sudo' command to do so.
+The command then becomes 'sudo make install'. It is likely that you
+will be asked for your password, and you will have to have been set up
+previously as a user who is allowed to run the 'sudo' command.
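+
+   To summarize, on most systems the complete sequence of commands to
+configure, build, test, and install 'gawk' looks like this:
+
+     ./configure
+     make
+     make check
+     sudo make install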
+
+
+File: gawk.info, Node: Shell Startup Files, Next: Additional Configuration Options, Prev: Quick Installation, Up: Unix Installation
+
+B.2.2 Shell Startup Files
+-------------------------
+
+The distribution contains shell startup files 'gawk.sh' and 'gawk.csh'
+containing functions to aid in manipulating the 'AWKPATH' and
+'AWKLIBPATH' environment variables. On a Fedora system, these files
+should be installed in '/etc/profile.d'; on other platforms, the
+appropriate location may be different.
+
+'gawkpath_default'
+ Reset the 'AWKPATH' environment variable to its default value.
+
+'gawkpath_prepend'
+ Add the argument to the front of the 'AWKPATH' environment
+ variable.
+
+'gawkpath_append'
+ Add the argument to the end of the 'AWKPATH' environment variable.
+
+'gawklibpath_default'
+ Reset the 'AWKLIBPATH' environment variable to its default value.
+
+'gawklibpath_prepend'
+ Add the argument to the front of the 'AWKLIBPATH' environment
+ variable.
+
+'gawklibpath_append'
+ Add the argument to the end of the 'AWKLIBPATH' environment
+ variable.
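+
+   For example, after the appropriate startup file has been read by the
+shell, a command such as the following (the directory name here is
+purely illustrative) adds a directory to the front of 'gawk''s search
+path:
+
+     gawkpath_prepend /usr/local/share/awk-local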
+
+
+File: gawk.info, Node: Additional Configuration Options, Next: Configuration Philosophy, Prev: Shell Startup Files, Up: Unix Installation
+
+B.2.3 Additional Configuration Options
+--------------------------------------
+
+There are several additional options you may use on the 'configure'
+command line when compiling 'gawk' from scratch, including:
+
+'--disable-extensions'
+ Disable configuring and building the sample extensions in the
+ 'extension' directory. This is useful for cross-compiling. The
+ default action is to dynamically check if the extensions can be
+ configured and compiled.
+
+'--disable-lint'
+ Disable all lint checking within 'gawk'. The '--lint' and
+ '--lint-old' options (*note Options::) are accepted, but silently
+ do nothing. Similarly, setting the 'LINT' variable (*note
+ User-modified::) has no effect on the running 'awk' program.
+
+ When used with the GNU Compiler Collection's (GCC's) automatic
+ dead-code-elimination, this option cuts almost 23K bytes off the
+ size of the 'gawk' executable on GNU/Linux x86_64 systems. Results
+ on other systems and with other compilers are likely to vary.
+ Using this option may bring you some slight performance
+ improvement.
+
+ CAUTION: Using this option will cause some of the tests in the
+ test suite to fail. This option may be removed at a later
+ date.
+
+'--disable-nls'
+ Disable all message-translation facilities. This is usually not
+ desirable, but it may bring you some slight performance
+ improvement.
+
+'--with-whiny-user-strftime'
+ Force use of the included version of the C 'strftime()' function
+ for deficient systems.
+
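+   For example, to skip configuring and building the sample extensions
+(as you might when cross-compiling), use:
+
+     ./configure --disable-extensions
+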
+ Use the command './configure --help' to see the full list of options
+supplied by 'configure'.
+
+
+File: gawk.info, Node: Configuration Philosophy, Prev: Additional Configuration Options, Up: Unix Installation
+
+B.2.4 The Configuration Process
+-------------------------------
+
+This minor node is of interest only if you know something about using
+the C language and Unix-like operating systems.
+
+ The source code for 'gawk' generally attempts to adhere to formal
+standards wherever possible. This means that 'gawk' uses library
+routines that are specified by the ISO C standard and by the POSIX
+operating system interface standard. The 'gawk' source code requires
+using an ISO C compiler (the 1990 standard).
+
+ Many Unix systems do not support all of either the ISO or the POSIX
+standards. The 'missing_d' subdirectory in the 'gawk' distribution
+contains replacement versions of those functions that are most likely to
+be missing.
+
+ The 'config.h' file that 'configure' creates contains definitions
+that describe features of the particular operating system where you are
+attempting to compile 'gawk'. The three things described by this file
+are: what header files are available, so that they can be correctly
+included; what (supposedly) standard functions are actually available in
+your C libraries; and various miscellaneous facts about your operating
+system.  For example, there may not be an 'st_blksize' element in the
+'stat' structure. In this case, 'HAVE_STRUCT_STAT_ST_BLKSIZE' is
+undefined.
+
+ It is possible for your C compiler to lie to 'configure'. It may do
+so by not exiting with an error when a library function is not
+available. To get around this, edit the 'custom.h' file. Use an
+'#ifdef' that is appropriate for your system, and either '#define' any
+constants that 'configure' should have defined but didn't, or '#undef'
+any constants that 'configure' defined and should not have. The
+'custom.h' file is automatically included by the 'config.h' file.
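+
+   For example, an entry in 'custom.h' might look like the following
+sketch (the system test and the 'HAVE_...' names shown here are purely
+illustrative placeholders, not real symbols from the 'gawk'
+distribution):
+
+     #ifdef SOME_SYSTEM
+     /* 'configure' defined this, but the function does not really work: */
+     #undef HAVE_SOME_FUNCTION
+     /* 'configure' missed this one: */
+     #define HAVE_OTHER_FUNCTION 1
+     #endif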
+
+ It is also possible that the 'configure' program generated by
+Autoconf will not work on your system in some other fashion. If you do
+have a problem, the 'configure.ac' file is the input for Autoconf. You
+may be able to change this file and generate a new version of
+'configure' that works on your system (*note Bugs:: for information on
+how to report problems in configuring 'gawk'). The same mechanism may
+be used to send in updates to 'configure.ac' and/or 'custom.h'.
+
+
+File: gawk.info, Node: Non-Unix Installation, Next: Bugs, Prev: Unix Installation, Up: Installation
+
+B.3 Installation on Other Operating Systems
+===========================================
+
+This minor node describes how to install 'gawk' on various non-Unix
+systems.
+
+* Menu:
+
+* PC Installation:: Installing and Compiling 'gawk' on
+ Microsoft Windows.
+* VMS Installation:: Installing 'gawk' on VMS.
+
+
+File: gawk.info, Node: PC Installation, Next: VMS Installation, Up: Non-Unix Installation
+
+B.3.1 Installation on MS-Windows
+--------------------------------
+
+This minor node covers installation and usage of 'gawk' on Intel
+architecture machines running any version of MS-Windows. In this minor
+node, the term "Windows32" refers to any of Microsoft Windows
+95/98/ME/NT/2000/XP/Vista/7/8/10.
+
+ See also the 'README_d/README.pc' file in the distribution.
+
+* Menu:
+
+* PC Binary Installation:: Installing a prepared distribution.
+* PC Compiling:: Compiling 'gawk' for Windows32.
+* PC Using:: Running 'gawk' on Windows32.
+* Cygwin:: Building and running 'gawk' for
+ Cygwin.
+* MSYS:: Using 'gawk' In The MSYS Environment.
+
+
+File: gawk.info, Node: PC Binary Installation, Next: PC Compiling, Up: PC Installation
+
+B.3.1.1 Installing a Prepared Distribution for MS-Windows Systems
+.................................................................
+
+The only supported binary distribution for MS-Windows systems is that
+provided by Eli Zaretskii's "ezwinports"
+(https://sourceforge.net/projects/ezwinports/) project. Install the
+compiled 'gawk' from there.
+
+
+File: gawk.info, Node: PC Compiling, Next: PC Using, Prev: PC Binary Installation, Up: PC Installation
+
+B.3.1.2 Compiling 'gawk' for PC Operating Systems
+.................................................
+
+'gawk' can be compiled for Windows32 using MinGW.  The file
+'README_d/README.pc' in the 'gawk' distribution contains additional
+notes, and 'pc/Makefile' contains important information on compilation
+options.
+
+ To build 'gawk' for Windows32, copy the files in the 'pc' directory
+(_except_ for 'ChangeLog') to the directory with the rest of the 'gawk'
+sources, then invoke 'make' with the appropriate target name as an
+argument to build 'gawk'. The 'Makefile' copied from the 'pc' directory
+contains a configuration section with comments and may need to be edited
+in order to work with your 'make' utility.
+
+ The 'Makefile' supports a number of targets for building various
+MS-DOS and Windows32 versions. A list of targets is printed if the
+'make' command is given without a target. As an example, to build a
+native MS-Windows binary of 'gawk' using the MinGW tools, type 'make
+mingw32'.
+
+
+File: gawk.info, Node: PC Using, Next: Cygwin, Prev: PC Compiling, Up: PC Installation
+
+B.3.1.3 Using 'gawk' on PC Operating Systems
+............................................
+
+Under MS-Windows, the Cygwin and MinGW environments support both the
+'|&' operator and TCP/IP networking (*note TCP/IP Networking::).
+
+ The MS-Windows version of 'gawk' searches for program files as
+described in *note AWKPATH Variable::. However, semicolons (rather than
+colons) separate elements in the 'AWKPATH' variable. If 'AWKPATH' is
+not set or is empty, then the default search path is
+'.;c:/lib/awk;c:/gnu/lib/awk'.
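+
+   For example, from the Windows command prompt you might set 'AWKPATH'
+as follows (the last directory name is purely illustrative):
+
+     set AWKPATH=.;c:/lib/awk;c:/users/me/awk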
+
+ Under MS-Windows, 'gawk' (and many other text programs) silently
+translates end-of-line '\r\n' to '\n' on input and '\n' to '\r\n' on
+output. A special 'BINMODE' variable (c.e.) allows control over these
+translations and is interpreted as follows:
+
+ * If 'BINMODE' is '"r"' or one, then binary mode is set on read
+ (i.e., no translations on reads).
+
+ * If 'BINMODE' is '"w"' or two, then binary mode is set on write
+ (i.e., no translations on writes).
+
+ * If 'BINMODE' is '"rw"' or '"wr"' or three, binary mode is set for
+ both read and write.
+
+ * 'BINMODE=NON-NULL-STRING' is the same as 'BINMODE=3' (i.e., no
+ translations on reads or writes). However, 'gawk' issues a warning
+ message if the string is not one of '"rw"' or '"wr"'.
+
+The modes for standard input and standard output are set one time only
+(after the command line is read, but before processing any of the 'awk'
+program). Setting 'BINMODE' for standard input or standard output is
+accomplished by using an appropriate '-v BINMODE=N' option on the
+command line. 'BINMODE' is set at the time a file or pipe is opened and
+cannot be changed midstream.
+
+ The name 'BINMODE' was chosen to match 'mawk' (*note Other
+Versions::). 'mawk' and 'gawk' handle 'BINMODE' similarly; however,
+'mawk' adds a '-W BINMODE=N' option and an environment variable that can
+set 'BINMODE', 'RS', and 'ORS'. The files 'binmode[1-3].awk' (under
+'gnu/lib/awk' in some of the prepared binary distributions) have been
+chosen to match 'mawk''s '-W BINMODE=N' option. These can be changed or
+discarded; in particular, the setting of 'RS' giving the fewest
+"surprises" is open to debate. 'mawk' uses 'RS = "\r\n"' if binary mode
+is set on read, which is appropriate for files with the MS-DOS-style
+end-of-line.
+
+ To illustrate, the following examples set binary mode on writes for
+standard output and other files, and set 'ORS' as the "usual"
+MS-DOS-style end-of-line:
+
+ gawk -v BINMODE=2 -v ORS="\r\n" ...
+
+or:
+
+ gawk -v BINMODE=w -f binmode2.awk ...
+
+These give the same result as the '-W BINMODE=2' option in 'mawk'. The
+following changes the record separator to '"\r\n"' and sets binary mode
+on reads, but does not affect the mode on standard input:
+
+ gawk -v RS="\r\n" -e "BEGIN { BINMODE = 1 }" ...
+
+or:
+
+ gawk -f binmode1.awk ...
+
+With proper quoting, in the first example the setting of 'RS' can be
+moved into the 'BEGIN' rule.
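+
+   For example, one possible form is the following (the exact quoting
+required depends upon the command shell in use):
+
+     gawk -e "BEGIN { BINMODE = 1; RS = \"\r\n\" }" ...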
+
+
+File: gawk.info, Node: Cygwin, Next: MSYS, Prev: PC Using, Up: PC Installation
+
+B.3.1.4 Using 'gawk' In The Cygwin Environment
+..............................................
+
+'gawk' can be built and used "out of the box" under MS-Windows if you
+are using the Cygwin environment (http://www.cygwin.com). This
+environment provides an excellent simulation of GNU/Linux, using Bash,
+GCC, GNU Make, and other GNU programs. Compilation and installation for
+Cygwin is the same as for a Unix system:
+
+ tar -xvpzf gawk-4.1.4.tar.gz
+ cd gawk-4.1.4
+ ./configure
+ make && make check
+
+ When compared to GNU/Linux on the same system, the 'configure' step
+on Cygwin takes considerably longer. However, it does finish, and then
+the 'make' proceeds as usual.
+
+
+File: gawk.info, Node: MSYS, Prev: Cygwin, Up: PC Installation
+
+B.3.1.5 Using 'gawk' In The MSYS Environment
+............................................
+
+In the MSYS environment under MS-Windows, 'gawk' automatically uses
+binary mode for reading and writing files. Thus, there is no need to
+use the 'BINMODE' variable.
+
+ This can cause problems with other Unix-like components that have
+been ported to MS-Windows that expect 'gawk' to do automatic translation
+of '"\r\n"', because it won't.
+
+
+File: gawk.info, Node: VMS Installation, Prev: PC Installation, Up: Non-Unix Installation
+
+B.3.2 Compiling and Installing 'gawk' on Vax/VMS and OpenVMS
+------------------------------------------------------------
+
+This node describes how to compile and install 'gawk' under VMS. The
+older designation "VMS" is used throughout to refer to OpenVMS.
+
+* Menu:
+
+* VMS Compilation:: How to compile 'gawk' under VMS.
+* VMS Dynamic Extensions:: Compiling 'gawk' dynamic extensions on
+ VMS.
+* VMS Installation Details:: How to install 'gawk' under VMS.
+* VMS Running:: How to run 'gawk' under VMS.
+* VMS GNV:: The VMS GNV Project.
+* VMS Old Gawk:: An old version comes with some VMS systems.
+
+
+File: gawk.info, Node: VMS Compilation, Next: VMS Dynamic Extensions, Up: VMS Installation
+
+B.3.2.1 Compiling 'gawk' on VMS
+...............................
+
+To compile 'gawk' under VMS, there is a 'DCL' command procedure that
+issues all the necessary 'CC' and 'LINK' commands. There is also a
+'Makefile' for use with the 'MMS' and 'MMK' utilities. From the source
+directory, use either:
+
+ $ @[.vms]vmsbuild.com
+
+or:
+
+ $ MMS/DESCRIPTION=[.vms]descrip.mms gawk
+
+or:
+
+ $ MMK/DESCRIPTION=[.vms]descrip.mms gawk
+
+ 'MMK' is an open source, free, near-clone of 'MMS' and can better
+handle ODS-5 volumes with upper- and lowercase file names. 'MMK' is
+available from <https://github.com/endlesssoftware/mmk>.
+
+ With ODS-5 volumes and extended parsing enabled, the case of the
+target parameter may need to be exact.
+
+ 'gawk' has been tested under VAX/VMS 7.3 and Alpha/VMS 7.3-1 using
+Compaq C V6.4, and under Alpha/VMS 7.3, Alpha/VMS 7.3-2, and IA64/VMS
+8.3.  The most recent builds used HP C V7.3 on Alpha VMS 8.3; both the
+Alpha and IA64 builds on VMS 8.4 also used HP C V7.3.(1)
+
+ *Note VMS GNV:: for information on building 'gawk' as a PCSI kit that
+is compatible with the GNV product.
+
+ ---------- Footnotes ----------
+
+ (1) The IA64 architecture is also known as "Itanium."
+
+
+File: gawk.info, Node: VMS Dynamic Extensions, Next: VMS Installation Details, Prev: VMS Compilation, Up: VMS Installation
+
+B.3.2.2 Compiling 'gawk' Dynamic Extensions on VMS
+..................................................
+
+The extensions that have been ported to VMS can be built using one of
+the following commands:
+
+ $ MMS/DESCRIPTION=[.vms]descrip.mms extensions
+
+or:
+
+ $ MMK/DESCRIPTION=[.vms]descrip.mms extensions
+
+ 'gawk' uses 'AWKLIBPATH' as either an environment variable or a
+logical name to find the dynamic extensions.
+
+ Dynamic extensions need to be compiled with the same compiler options
+for floating-point, pointer size, and symbol name handling as were used
+to compile 'gawk' itself. Alpha and Itanium should use IEEE floating
+point. The pointer size is 32 bits, and the symbol name handling should
+be exact case, with CRC shortening for symbol names longer than 32
+characters.
+
+ For Alpha and Itanium:
+
+ /name=(as_is,short)
+ /float=ieee/ieee_mode=denorm_results
+
+ For VAX:
+
+ /name=(as_is,short)
+
+ Compile-time macros need to be defined before the first VMS-supplied
+header file is included, as follows:
+
+ #if (__CRTL_VER >= 70200000) && !defined (__VAX)
+ #define _LARGEFILE 1
+ #endif
+
+ #ifndef __VAX
+ #ifdef __CRTL_VER
+ #if __CRTL_VER >= 80200000
+ #define _USE_STD_STAT 1
+ #endif
+ #endif
+ #endif
+
+ If you are writing your own extensions to run on VMS, you must supply
+these definitions yourself. The 'config.h' file created when building
+'gawk' on VMS does this for you; if instead you use that file or a
+similar one, then you must remember to include it before any
+VMS-supplied header files.
+
+
+File: gawk.info, Node: VMS Installation Details, Next: VMS Running, Prev: VMS Dynamic Extensions, Up: VMS Installation
+
+B.3.2.3 Installing 'gawk' on VMS
+................................
+
+To use 'gawk', all you need is a "foreign" command, which is a 'DCL'
+symbol whose value begins with a dollar sign. For example:
+
+ $ GAWK :== $disk1:[gnubin]gawk
+
+Substitute the actual location of 'gawk.exe' for '$disk1:[gnubin]'. The
+symbol should be placed in the 'login.com' of any user who wants to run
+'gawk', so that it is defined every time the user logs on.
+Alternatively, the symbol may be placed in the system-wide 'sylogin.com'
+procedure, which allows all users to run 'gawk'.
+
+ If your 'gawk' was installed by a PCSI kit into the 'GNV$GNU:'
+directory tree, the program will be known as 'GNV$GNU:[bin]gnv$gawk.exe'
+and the help file will be 'GNV$GNU:[vms_help]gawk.hlp'.
+
+ The PCSI kit also installs a 'GNV$GNU:[vms_bin]gawk_verb.cld' file
+that can be used to add 'gawk' and 'awk' as DCL commands.
+
+ For just the current process you can use:
+
+ $ set command gnv$gnu:[vms_bin]gawk_verb.cld
+
+   Or the system manager can use 'GNV$GNU:[vms_bin]gawk_verb.cld' to add
+'gawk' and 'awk' as commands to the system-wide 'DCLTABLES'.
+
+ The DCL syntax is documented in the 'gawk.hlp' file.
+
+ Optionally, the 'gawk.hlp' entry can be loaded into a VMS help
+library:
+
+ $ LIBRARY/HELP sys$help:helplib [.vms]gawk.hlp
+
+(You may want to substitute a site-specific help library rather than the
+standard VMS library 'HELPLIB'.) After loading the help text, the
+command:
+
+ $ HELP GAWK
+
+provides information about both the 'gawk' implementation and the 'awk'
+programming language.
+
+ The logical name 'AWK_LIBRARY' can designate a default location for
+'awk' program files. For the '-f' option, if the specified file name
+has no device or directory path information in it, 'gawk' looks in the
+current directory first, then in the directory specified by the
+translation of 'AWK_LIBRARY' if the file is not found. If, after
+searching in both directories, the file still is not found, 'gawk'
+appends the suffix '.awk' to the file name and retries the file search.
+If 'AWK_LIBRARY' has no definition, a default value of 'SYS$LIBRARY:' is
+used for it.
+
+
+File: gawk.info, Node: VMS Running, Next: VMS GNV, Prev: VMS Installation Details, Up: VMS Installation
+
+B.3.2.4 Running 'gawk' on VMS
+.............................
+
+Command-line parsing and quoting conventions are significantly different
+on VMS, so examples in this Info file or from other sources often need
+minor changes. They _are_ minor though, and all 'awk' programs should
+run correctly.
+
+ Here are a couple of trivial tests:
+
+ $ gawk -- "BEGIN {print ""Hello, World!""}"
+ $ gawk -"W" version
+ ! could also be -"W version" or "-W version"
+
+Note that uppercase and mixed-case text must be quoted.
+
+ The VMS port of 'gawk' includes a 'DCL'-style interface in addition
+to the original shell-style interface (see the help entry for details).
+One side effect of dual command-line parsing is that if there is only a
+single parameter (as in the quoted string program), the command becomes
+ambiguous. To work around this, the normally optional '--' flag is
+required to force Unix-style parsing rather than 'DCL' parsing. If any
+other dash-type options (or multiple parameters such as data files to
+process) are present, there is no ambiguity and '--' can be omitted.
+
+ The 'exit' value is a Unix-style value and is encoded into a VMS exit
+status value when the program exits.
+
+ The VMS severity bits will be set based on the 'exit' value. A
+failure is indicated by 1, and VMS sets the 'ERROR' status. A fatal
+error is indicated by 2, and VMS sets the 'FATAL' status. All other
+values will have the 'SUCCESS' status. The exit value is encoded to
+comply with VMS coding standards: it contains the 'C_FACILITY_NO' of
+'0x350000', plus the constant '0xA000', plus the exit value shifted
+left by 3 bits to make room for the severity code in the low bits.
+
+ To extract the actual 'gawk' exit code from the VMS status, use:
+
+ unix_status = (vms_status .and. %x7f8) / 8
+
+A C program that uses 'exec()' to call 'gawk' will get the original
+Unix-style exit value.
+
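+   As an illustration only, here is a small, hypothetical C sketch of
+the encoding just described. The macro and function names are invented
+for this example; only the numeric values ('0x350000', '0xA000', the
+3-bit shift, and the '0x7f8' mask) come from the description above.
+
+     /* Hypothetical sketch; not part of gawk itself.  Severity values
+        follow the usual VMS convention (1 success, 2 error, 4 fatal). */
+     #include <stdio.h>
+
+     #define GAWK_FACILITY  0x350000     /* C_FACILITY_NO */
+     #define GAWK_MSG_BASE  0xA000       /* added to the status value */
+
+     static unsigned int
+     encode_vms_status(unsigned int unix_code, unsigned int severity)
+     {
+             /* exit code goes in bits 3-10, severity in bits 0-2 */
+             return GAWK_FACILITY + GAWK_MSG_BASE
+                    + (unix_code << 3) + severity;
+     }
+
+     static unsigned int
+     decode_unix_status(unsigned int vms_status)
+     {
+             /* the same operation as the DCL expression shown above */
+             return (vms_status & 0x7f8) >> 3;
+     }
+
+     int
+     main(void)
+     {
+             unsigned int vms = encode_vms_status(2, 4);  /* fatal error */
+
+             printf("%u\n", decode_unix_status(vms));     /* prints 2 */
+             return 0;
+     }
+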
+   Older versions of 'gawk' for VMS encoded a Unix exit code of 0 as
+the VMS status 1, a failure as 2, a fatal error as 4, and passed all
+other numbers through unchanged. This violated the VMS exit status
+coding requirements.
+
+ VAX/VMS floating point uses unbiased rounding. *Note Round
+Function::.
+
+ VMS reports time values in GMT unless one of the 'SYS$TIMEZONE_RULE'
+or 'TZ' logical names is set. Older versions of VMS, such as VAX/VMS
+7.3, do not set these logical names.
+
+ The default search path, when looking for 'awk' program files
+specified by the '-f' option, is '"SYS$DISK:[],AWK_LIBRARY:"'. The
+logical name 'AWKPATH' can be used to override this default. The format
+of 'AWKPATH' is a comma-separated list of directory specifications.
+When defining it, the value should be quoted so that it retains a single
+translation and not a multitranslation 'RMS' searchlist.
+
+   If you are redirecting data to a VMS command or utility, the current
+implementation requires setting up a VMS foreign command that runs a
+command file before invoking 'gawk'. (This restriction may be removed
+in a future release of 'gawk' on VMS.) This restriction also applies to
+running 'gawk' under GNV, as redirection is always to a DCL command.
+
+   Without this command file, the input data will also appear prepended
+to the output data.
+
+   This approach also allows simulating POSIX commands that are not
+found on VMS, or the use of GNV utilities.
+
+ The example below is for 'gawk' redirecting data to the VMS 'sort'
+command.
+
+ $ sort = "@device:[dir]vms_gawk_sort.com"
+
+ The command file needs to be of the format in the example below.
+
+   The first line inhibits the passed input data from also showing up
+in the output; it must appear exactly as shown.
+
+   The next line creates a foreign command that overrides the outer
+foreign command, which prevents an infinite recursion of command files.
+
+   The next-to-last command redirects 'sys$input' to be 'sys$command',
+in order to pick up the data that is being redirected to the command.
+
+   The last line runs the actual command. It must be the last command,
+as the data redirected from 'gawk' will be read when the command file
+ends.
+
+ $!'f$verify(0,0)'
+ $ sort := sort
+ $ define/user sys$input sys$command:
+ $ sort sys$input: sys$output:
+
+
+File: gawk.info, Node: VMS GNV, Next: VMS Old Gawk, Prev: VMS Running, Up: VMS Installation
+
+B.3.2.5 The VMS GNV Project
+...........................
+
+The VMS GNV package provides a build environment similar to POSIX with
+ports of a collection of open source tools. The 'gawk' found in the GNV
+base kit is an older port. Currently, the GNV project is being
+reorganized to supply individual PCSI packages for each component. See
+<https://sourceforge.net/p/gnv/wiki/InstallingGNVPackages/>.
+
+ The normal build procedure for 'gawk' produces a program that is
+suitable for use with GNV.
+
+ The file 'vms/gawk_build_steps.txt' in the distribution documents the
+procedure for building a VMS PCSI kit that is compatible with GNV.
+
+
+File: gawk.info, Node: VMS Old Gawk, Prev: VMS GNV, Up: VMS Installation
+
+B.3.2.6 Some VMS Systems Have An Old Version of 'gawk'
+......................................................
+
+Some versions of VMS have an old version of 'gawk'. To access it,
+define a symbol, as follows:
+
+ $ gawk :== $sys$common:[syshlp.examples.tcpip.snmp]gawk.exe
+
+ This is apparently version 2.15.6, which is extremely old. We
+recommend compiling and using the current version.
+
+
+File: gawk.info, Node: Bugs, Next: Other Versions, Prev: Non-Unix Installation, Up: Installation
+
+B.4 Reporting Problems and Bugs
+===============================
+
+ There is nothing more dangerous than a bored archaeologist.
+ -- _Douglas Adams, 'The Hitchhiker's Guide to the Galaxy'_
+
+ If you have problems with 'gawk' or think that you have found a bug,
+report it to the developers; we cannot promise to do anything, but we
+might well want to fix it.
+
+* Menu:
+
+* Bug address:: Where to send reports to.
+* Usenet:: Where not to send reports to.
+* Maintainers:: Maintainers of non-*nix ports.
+
+
+File: gawk.info, Node: Bug address, Next: Usenet, Up: Bugs
+
+B.4.1 Submitting Bug Reports
+----------------------------
+
+Before reporting a bug, make sure you have really found a genuine bug.
+Carefully reread the documentation and see if it says you can do what
+you're trying to do. If it's not clear whether you should be able to do
+something or not, report that too; it's a bug in the documentation!
+
+ Before reporting a bug or trying to fix it yourself, try to isolate
+it to the smallest possible 'awk' program and input data file that
+reproduce the problem. Then send us the program and data file, some
+idea of what kind of Unix system you're using, the compiler you used to
+compile 'gawk', and the exact results 'gawk' gave you. Also say what
+you expected to occur; this helps us decide whether the problem is
+really in the documentation.
+
+ Make sure to include the version number of 'gawk' you are using. You
+can get this information with the command 'gawk --version'.
+
+ Once you have a precise problem description, send email to
+<bug-gawk@gnu.org>.
+
+ The 'gawk' maintainers subscribe to this address, and thus they will
+receive your bug report. Although you can send mail to the maintainers
+directly, the bug reporting address is preferred because the email list
+is archived at the GNU Project. _All email must be in English. This is
+the only language understood in common by all the maintainers._ In
+addition, please be sure to send all mail in _plain text_, not (or not
+exclusively) in HTML.
+
+ NOTE: Many distributions of GNU/Linux and the various BSD-based
+ operating systems have their own bug reporting systems. If you
+ report a bug using your distribution's bug reporting system, you
+ should also send a copy to <bug-gawk@gnu.org>.
+
+ This is for two reasons. First, although some distributions
+ forward bug reports "upstream" to the GNU mailing list, many don't,
+ so there is a good chance that the 'gawk' maintainers won't even
+ see the bug report! Second, mail to the GNU list is archived, and
+ having everything at the GNU Project keeps things self-contained
+ and not dependent on other organizations.
+
+ Non-bug suggestions are always welcome as well. If you have
+questions about things that are unclear in the documentation or are just
+obscure features, ask on the bug list; we will try to help you out if we
+can.
+
+
+File: gawk.info, Node: Usenet, Next: Maintainers, Prev: Bug address, Up: Bugs
+
+B.4.2 Please Don't Post Bug Reports to USENET
+---------------------------------------------
+
+ I gave up on Usenet a couple of years ago and haven't really looked
+ back. It's like sports talk radio--you feel smarter for not having
+ read it.
+ -- _Chet Ramey_
+
+ Please do _not_ try to report bugs in 'gawk' by posting to the
+Usenet/Internet newsgroup 'comp.lang.awk'. Although some of the 'gawk'
+developers occasionally read this newsgroup, the primary 'gawk'
+maintainer no longer does. Thus it's virtually guaranteed that he will
+_not_ see your posting. The steps described here are the only
+officially recognized way for reporting bugs. Really.
+
+
+File: gawk.info, Node: Maintainers, Prev: Usenet, Up: Bugs
+
+B.4.3 Reporting Problems with Non-Unix Ports
+--------------------------------------------
+
+If you find bugs in one of the non-Unix ports of 'gawk', send an email
+to the bug list, with a copy to the person who maintains that port. The
+maintainers are named in the following list, as well as in the 'README'
+file in the 'gawk' distribution. Information in the 'README' file
+should be considered authoritative if it conflicts with this Info file.
+
+ The people maintaining the various 'gawk' ports are:
+
+Unix and POSIX Arnold Robbins, <arnold@skeeve.com>
+systems
+MS-Windows with MinGW Eli Zaretskii, <eliz@gnu.org>
+
+OS/2 Andreas Buening, <andreas.buening@nexgo.de>
+
+VMS John Malmberg, <wb8tyw@qsl.net>
+
+z/OS (OS/390) Daniel Richard G. <skunk@iSKUNK.ORG>
+ Dave Pitts (Maintainer Emeritus), <dpitts@cozx.com>
+
+ If your bug is also reproducible under Unix, send a copy of your
+report to the <bug-gawk@gnu.org> email list as well.
+
+ The DJGPP port is no longer supported; it will remain in the code
+base for a while in case a volunteer wishes to take it over. If this
+does not happen, then eventually code for this port will be removed.
+
+
+File: gawk.info, Node: Other Versions, Next: Installation summary, Prev: Bugs, Up: Installation
+
+B.5 Other Freely Available 'awk' Implementations
+================================================
+
+ It's kind of fun to put comments like this in your awk code:
+ '// Do C++ comments work? answer: yes! of course'
+ -- _Michael Brennan_
+
+ There are a number of other freely available 'awk' implementations.
+This minor node briefly describes where to get them:
+
+Unix 'awk'
+ Brian Kernighan, one of the original designers of Unix 'awk', has
+ made his implementation of 'awk' freely available. You can
+ retrieve this version via his home page
+ (http://www.cs.princeton.edu/~bwk). It is available in several
+ archive formats:
+
+ Shell archive
+ <http://www.cs.princeton.edu/~bwk/btl.mirror/awk.shar>
+
+ Compressed 'tar' file
+ <http://www.cs.princeton.edu/~bwk/btl.mirror/awk.tar.gz>
+
+ Zip file
+ <http://www.cs.princeton.edu/~bwk/btl.mirror/awk.zip>
+
+ You can also retrieve it from GitHub:
+
+ git clone git://github.com/onetrueawk/awk bwkawk
+
+ This command creates a copy of the Git (http://git-scm.com)
+ repository in a directory named 'bwkawk'. If you leave that
+ argument off the 'git' command line, the repository copy is created
+ in a directory named 'awk'.
+
+ This version requires an ISO C (1990 standard) compiler; the C
+ compiler from GCC (the GNU Compiler Collection) works quite nicely.
+
+ *Note Common Extensions:: for a list of extensions in this 'awk'
+ that are not in POSIX 'awk'.
+
+ As a side note, Dan Bornstein has created a Git repository tracking
+ all the versions of BWK 'awk' that he could find. It's available
+ at <git://github.com/danfuzz/one-true-awk>.
+
+'mawk'
+ Michael Brennan wrote an independent implementation of 'awk',
+ called 'mawk'. It is available under the GPL (*note Copying::),
+ just as 'gawk' is.
+
+ The original distribution site for the 'mawk' source code no longer
+ has it. A copy is available at
+ <http://www.skeeve.com/gawk/mawk1.3.3.tar.gz>.
+
+ In 2009, Thomas Dickey took on 'mawk' maintenance. Basic
+ information is available on the project's web page
+ (http://www.invisible-island.net/mawk). The download URL is
+ <http://invisible-island.net/datafiles/release/mawk.tar.gz>.
+
+ Once you have it, 'gunzip' may be used to decompress this file.
+ Installation is similar to 'gawk''s (*note Unix Installation::).
+
+ *Note Common Extensions:: for a list of extensions in 'mawk' that
+ are not in POSIX 'awk'.
+
+'awka'
+ Written by Andrew Sumner, 'awka' translates 'awk' programs into C,
+ compiles them, and links them with a library of functions that
+ provide the core 'awk' functionality. It also has a number of
+ extensions.
+
+ The 'awk' translator is released under the GPL, and the library is
+ under the LGPL.
+
+ To get 'awka', go to <http://sourceforge.net/projects/awka>.
+
+ The project seems to be frozen; no new code changes have been made
+ since approximately 2001.
+
+'pawk'
+ Nelson H.F. Beebe at the University of Utah has modified BWK 'awk'
+ to provide timing and profiling information. It is different from
+ 'gawk' with the '--profile' option (*note Profiling::) in that it
+ uses CPU-based profiling, not line-count profiling. You may find
+ it at either
+ <ftp://ftp.math.utah.edu/pub/pawk/pawk-20030606.tar.gz> or
+ <http://www.math.utah.edu/pub/pawk/pawk-20030606.tar.gz>.
+
+BusyBox 'awk'
+ BusyBox is a GPL-licensed program providing small versions of many
+ applications within a single executable. It is aimed at embedded
+ systems. It includes a full implementation of POSIX 'awk'. When
+ building it, be careful not to do 'make install' as it will
+ overwrite copies of other applications in your '/usr/local/bin'.
+ For more information, see the project's home page
+ (http://busybox.net).
+
+The OpenSolaris POSIX 'awk'
+ The versions of 'awk' in '/usr/xpg4/bin' and '/usr/xpg6/bin' on
+ Solaris are more or less POSIX-compliant. They are based on the
+ 'awk' from Mortice Kern Systems for PCs. We were able to make this
+ code compile and work under GNU/Linux with 1-2 hours of work.
+ Making it more generally portable (using GNU Autoconf and/or
+ Automake) would take more work, and this has not been done, at
+ least to our knowledge.
+
+ The source code used to be available from the OpenSolaris website.
+ However, that project was ended and the website shut down.
+ Fortunately, the Illumos project
+ (http://wiki.illumos.org/display/illumos/illumos+Home) makes this
+ implementation available. You can view the files one at a time
+ from
+ <https://github.com/joyent/illumos-joyent/blob/master/usr/src/cmd/awk_xpg4>.
+
+'jawk'
+ This is an interpreter for 'awk' written in Java. It claims to be
+ a full interpreter, although because it uses Java facilities for
+ I/O and for regexp matching, the language it supports is different
+ from POSIX 'awk'. More information is available on the project's
+ home page (http://jawk.sourceforge.net).
+
+Libmawk
+ This is an embeddable 'awk' interpreter derived from 'mawk'. For
+ more information, see <http://repo.hu/projects/libmawk/>.
+
+'pawk'
+ This is a Python module that claims to bring 'awk'-like features to
+ Python. See <https://github.com/alecthomas/pawk> for more
+ information. (This is not related to Nelson Beebe's modified
+ version of BWK 'awk', described earlier.)
+
+QSE 'awk'
+ This is an embeddable 'awk' interpreter. For more information, see
+ <http://code.google.com/p/qse/> and <http://awk.info/?tools/qse>.
+
+'QTawk'
+ This is an independent implementation of 'awk' distributed under
+ the GPL. It has a large number of extensions over standard 'awk'
+ and may not be 100% syntactically compatible with it. See
+ <http://www.quiktrim.org/QTawk.html> for more information,
+ including the manual. The download link there is out of date; see
+ <http://www.quiktrim.org/#AdditionalResources> for the latest
+ download link.
+
+ The project may also be frozen; no new code changes have been made
+ since approximately 2014.
+
+Other versions
+ See also the "Versions and implementations" section of the
+ Wikipedia article
+ (http://en.wikipedia.org/wiki/Awk_language#Versions_and_implementations)
+ on 'awk' for information on additional versions.
+
+
+File: gawk.info, Node: Installation summary, Prev: Other Versions, Up: Installation
+
+B.6 Summary
+===========
+
+ * The 'gawk' distribution is available from the GNU Project's main
+ distribution site, 'ftp.gnu.org'. The canonical build recipe is:
+
+ wget http://ftp.gnu.org/gnu/gawk/gawk-4.1.4.tar.gz
+ tar -xvpzf gawk-4.1.4.tar.gz
+ cd gawk-4.1.4
+ ./configure && make && make check
+
+ * 'gawk' may be built on non-POSIX systems as well. The currently
+ supported systems are MS-Windows using MSYS, MinGW, and Cygwin, and
+    both VAX/VMS and OpenVMS. Instructions for each system are included
+ in this major node.
+
+ * Bug reports should be sent via email to <bug-gawk@gnu.org>. Bug
+ reports should be in English and should include the version of
+ 'gawk', how it was compiled, and a short program and data file that
+ demonstrate the problem.
+
+ * There are a number of other freely available 'awk' implementations.
+ Many are POSIX-compliant; others are less so.
+
+
+File: gawk.info, Node: Notes, Next: Basic Concepts, Prev: Installation, Up: Top
+
+Appendix C Implementation Notes
+*******************************
+
+This appendix contains information mainly of interest to implementers
+and maintainers of 'gawk'. Everything in it applies specifically to
+'gawk' and not to other implementations.
+
+* Menu:
+
+* Compatibility Mode:: How to disable certain 'gawk'
+ extensions.
+* Additions:: Making Additions To 'gawk'.
+* Future Extensions:: New features that may be implemented one day.
+* Implementation Limitations:: Some limitations of the implementation.
+* Extension Design:: Design notes about the extension API.
+* Old Extension Mechanism:: Some compatibility for old extensions.
+* Notes summary:: Summary of implementation notes.
+
+
+File: gawk.info, Node: Compatibility Mode, Next: Additions, Up: Notes
+
+C.1 Downward Compatibility and Debugging
+========================================
+
+*Note POSIX/GNU::, for a summary of the GNU extensions to the 'awk'
+language and program. All of these features can be turned off by
+invoking 'gawk' with the '--traditional' option or with the '--posix'
+option.
+
+ If 'gawk' is compiled for debugging with '-DDEBUG', then there is one
+more option available on the command line:
+
+'-Y'
+'--parsedebug'
+ Print out the parse stack information as the program is being
+ parsed.
+
+ This option is intended only for serious 'gawk' developers and not
+for the casual user. It probably has not even been compiled into your
+version of 'gawk', since it slows down execution.
+
+
+File: gawk.info, Node: Additions, Next: Future Extensions, Prev: Compatibility Mode, Up: Notes
+
+C.2 Making Additions to 'gawk'
+==============================
+
+If you find that you want to enhance 'gawk' in a significant fashion,
+you are perfectly free to do so. That is the point of having free
+software; the source code is available and you are free to change it as
+you want (*note Copying::).
+
+ This minor node discusses the ways you might want to change 'gawk' as
+well as any considerations you should bear in mind.
+
+* Menu:
+
+* Accessing The Source:: Accessing the Git repository.
+* Adding Code:: Adding code to the main body of
+ 'gawk'.
+* New Ports:: Porting 'gawk' to a new operating
+ system.
+* Derived Files:: Why derived files are kept in the Git
+ repository.
+
+
+File: gawk.info, Node: Accessing The Source, Next: Adding Code, Up: Additions
+
+C.2.1 Accessing The 'gawk' Git Repository
+-----------------------------------------
+
+As 'gawk' is Free Software, the source code is always available. *note
+Gawk Distribution:: describes how to get and build the formal, released
+versions of 'gawk'.
+
+ However, if you want to modify 'gawk' and contribute back your
+changes, you will probably wish to work with the development version.
+To do so, you will need to access the 'gawk' source code repository.
+The code is maintained using the Git distributed version control system
+(http://git-scm.com). You will need to install it if your system
+doesn't have it. Once you have done so, use the command:
+
+ git clone git://git.savannah.gnu.org/gawk.git
+
+This clones the 'gawk' repository. If you are behind a firewall that
+does not allow you to use the Git native protocol, you can still access
+the repository using:
+
+ git clone http://git.savannah.gnu.org/r/gawk.git
+
+ Once you have made changes, you can use 'git diff' to produce a
+patch, and send that to the 'gawk' maintainer; see *note Bugs::, for how
+to do that.
+
+   Once upon a time there was a Git-CVS gateway for use by people who
+could not install Git. However, this gateway no longer works, so you
+may have better luck using a more modern version control system like
+Bazaar, which has a Git plug-in for working with Git repositories.
+
+
+File: gawk.info, Node: Adding Code, Next: New Ports, Prev: Accessing The Source, Up: Additions
+
+C.2.2 Adding New Features
+-------------------------
+
+You are free to add any new features you like to 'gawk'. However, if
+you want your changes to be incorporated into the 'gawk' distribution,
+there are several steps that you need to take in order to make it
+possible to include them:
+
+ 1. Before building the new feature into 'gawk' itself, consider
+ writing it as an extension (*note Dynamic Extensions::). If that's
+ not possible, continue with the rest of the steps in this list.
+
+ 2. Be prepared to sign the appropriate paperwork. In order for the
+ FSF to distribute your changes, you must either place those changes
+ in the public domain and submit a signed statement to that effect,
+ or assign the copyright in your changes to the FSF. Both of these
+ actions are easy to do and _many_ people have done so already. If
+ you have questions, please contact me (*note Bugs::), or
+ <assign@gnu.org>.
+
+ 3. Get the latest version. It is much easier for me to integrate
+ changes if they are relative to the most recent distributed version
+ of 'gawk', or better yet, relative to the latest code in the Git
+ repository. If your version of 'gawk' is very old, I may not be
+ able to integrate your changes at all. (*Note Getting::, for
+ information on getting the latest version of 'gawk'.)
+
+  4. See *note (standards)Top::, the 'GNU Coding Standards'. This
+ document describes how GNU software should be written. If you
+ haven't read it, please do so, preferably _before_ starting to
+ modify 'gawk'. (The 'GNU Coding Standards' are available from the
+ GNU Project's website (http://www.gnu.org/prep/standards/).
+ Texinfo, Info, and DVI versions are also available.)
+
+ 5. Use the 'gawk' coding style. The C code for 'gawk' follows the
+ instructions in the 'GNU Coding Standards', with minor exceptions.
+ The code is formatted using the traditional "K&R" style,
+ particularly as regards to the placement of braces and the use of
+     TABs. In brief, the coding rules for 'gawk' are as follows (a
+     short, hypothetical sketch illustrating several of them appears
+     after these numbered steps):
+
+ * Use ANSI/ISO style (prototype) function headers when defining
+ functions.
+
+ * Put the name of the function at the beginning of its own line.
+
+ * Use '#elif' instead of nesting '#if' inside '#else'.
+
+ * Put the return type of the function, even if it is 'int', on
+ the line above the line with the name and arguments of the
+ function.
+
+ * Put spaces around parentheses used in control structures
+ ('if', 'while', 'for', 'do', 'switch', and 'return').
+
+ * Do not put spaces in front of parentheses used in function
+ calls.
+
+ * Put spaces around all C operators and after commas in function
+ calls.
+
+ * Do not use the comma operator to produce multiple side
+ effects, except in 'for' loop initialization and increment
+ parts, and in macro bodies.
+
+ * Use real TABs for indenting, not spaces.
+
+ * Use the "K&R" brace layout style.
+
+ * Use comparisons against 'NULL' and ''\0'' in the conditions of
+ 'if', 'while', and 'for' statements, as well as in the 'case's
+ of 'switch' statements, instead of just the plain pointer or
+ character value.
+
+ * Use 'true' and 'false' for 'bool' values, the 'NULL' symbolic
+ constant for pointer values, and the character constant ''\0''
+ where appropriate, instead of '1' and '0'.
+
+ * Provide one-line descriptive comments for each function.
+
+ * Do not use the 'alloca()' function for allocating memory off
+ the stack. Its use causes more portability trouble than is
+ worth the minor benefit of not having to free the storage.
+ Instead, use 'malloc()' and 'free()'.
+
+ * Do not use comparisons of the form '! strcmp(a, b)' or
+ similar. As Henry Spencer once said, "'strcmp()' is not a
+ boolean!" Instead, use 'strcmp(a, b) == 0'.
+
+      * If adding new bit flag values, use explicit hexadecimal
+        constants ('0x001', '0x002', '0x004', and so on) instead of
+        shifting one left by successive amounts ('(1<<0)', '(1<<1)',
+        and so on).
+
+ NOTE: If I have to reformat your code to follow the coding
+ style used in 'gawk', I may not bother to integrate your
+ changes at all.
+
+ 6. Update the documentation. Along with your new code, please supply
+ new sections and/or chapters for this Info file. If at all
+ possible, please use real Texinfo, instead of just supplying
+ unformatted ASCII text (although even that is better than no
+ documentation at all). Conventions to be followed in 'GAWK:
+ Effective AWK Programming' are provided after the '@bye' at the end
+ of the Texinfo source file. If possible, please update the 'man'
+ page as well.
+
+ You will also have to sign paperwork for your documentation
+ changes.
+
+ 7. Submit changes as unified diffs. Use 'diff -u -r -N' to compare
+ the original 'gawk' source tree with your version. I recommend
+ using the GNU version of 'diff', or best of all, 'git diff' or 'git
+ format-patch'. Send the output produced by 'diff' to me when you
+ submit your changes. (*Note Bugs::, for the electronic mail
+ information.)
+
+ Using this format makes it easy for me to apply your changes to the
+ master version of the 'gawk' source code (using 'patch'). If I
+ have to apply the changes manually, using a text editor, I may not
+ do so, particularly if there are lots of changes.
+
+ 8. Include an entry for the 'ChangeLog' file with your submission.
+ This helps further minimize the amount of work I have to do, making
+ it easier for me to accept patches. It is simplest if you just
+ make this part of your diff.
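+
+   The following short fragment is _not_ taken from 'gawk'; the
+function and data are invented purely to illustrate several of the
+layout rules from the list in step 5 above: the return type on its own
+line, the function name at the start of a line, "K&R" braces, spaces
+between control-structure keywords and their parentheses, and explicit
+comparisons against 'NULL' and the result of 'strcmp()'. (Indentation
+is shown here with spaces; real 'gawk' code uses TABs.)
+
+     #include <stdio.h>
+     #include <string.h>
+
+     /* count_matches --- count how many strings in list equal key */
+
+     static int
+     count_matches(const char **list, const char *key)
+     {
+             int count = 0;
+             int i;
+
+             if (list == NULL || key == NULL)
+                     return 0;
+
+             for (i = 0; list[i] != NULL; i++) {
+                     if (strcmp(list[i], key) == 0)
+                             count++;
+             }
+
+             return count;
+     }
+
+     int
+     main(void)
+     {
+             const char *names[] = { "gawk", "mawk", "gawk", NULL };
+
+             printf("%d\n", count_matches(names, "gawk"));  /* prints 2 */
+             return 0;
+     }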
+
+ Although this sounds like a lot of work, please remember that while
+you may write the new code, I have to maintain it and support it. If it
+isn't possible for me to do that with a minimum of extra work, then I
+probably will not.
+
+
+File: gawk.info, Node: New Ports, Next: Derived Files, Prev: Adding Code, Up: Additions
+
+C.2.3 Porting 'gawk' to a New Operating System
+----------------------------------------------
+
+If you want to port 'gawk' to a new operating system, there are several
+steps:
+
+ 1. Follow the guidelines in *note Adding Code::, concerning coding
+ style, submission of diffs, and so on.
+
+ 2. Be prepared to sign the appropriate paperwork. In order for the
+ FSF to distribute your code, you must either place your code in the
+ public domain and submit a signed statement to that effect, or
+ assign the copyright in your code to the FSF. Both of these actions
+ are easy to do and _many_ people have done so already. If you have
+ questions, please contact me, or <gnu@gnu.org>.
+
+ 3. When doing a port, bear in mind that your code must coexist
+ peacefully with the rest of 'gawk' and the other ports. Avoid
+ gratuitous changes to the system-independent parts of the code. If
+ at all possible, avoid sprinkling '#ifdef's just for your port
+ throughout the code.
+
+ If the changes needed for a particular system affect too much of
+ the code, I probably will not accept them. In such a case, you
+ can, of course, distribute your changes on your own, as long as you
+ comply with the GPL (*note Copying::).
+
+ 4. A number of the files that come with 'gawk' are maintained by other
+ people. Thus, you should not change them unless it is for a very
+ good reason; i.e., changes are not out of the question, but changes
+ to these files are scrutinized extra carefully. The files are
+ 'dfa.c', 'dfa.h', 'getopt.c', 'getopt.h', 'getopt1.c',
+ 'getopt_int.h', 'gettext.h', 'regcomp.c', 'regex.c', 'regex.h',
+ 'regex_internal.c', 'regex_internal.h', and 'regexec.c'.
+
+ 5. A number of other files are provided by the GNU Autotools
+ (Autoconf, Automake, and GNU 'gettext'). You should not change
+ them either, unless it is for a very good reason. The files are
+ 'ABOUT-NLS', 'config.guess', 'config.rpath', 'config.sub',
+ 'depcomp', 'INSTALL', 'install-sh', 'missing', 'mkinstalldirs',
+ 'xalloc.h', and 'ylwrap'.
+
+ 6. Be willing to continue to maintain the port. Non-Unix operating
+ systems are supported by volunteers who maintain the code needed to
+ compile and run 'gawk' on their systems. If no-one volunteers to
+ maintain a port, it becomes unsupported and it may be necessary to
+ remove it from the distribution.
+
+ 7. Supply an appropriate 'gawkmisc.???' file. Each port has its own
+ 'gawkmisc.???' that implements certain operating system specific
+ functions. This is cleaner than a plethora of '#ifdef's scattered
+ throughout the code. The 'gawkmisc.c' in the main source directory
+ includes the appropriate 'gawkmisc.???' file from each
+ subdirectory. Be sure to update it as well.
+
+ Each port's 'gawkmisc.???' file has a suffix reminiscent of the
+ machine or operating system for the port--for example,
+ 'pc/gawkmisc.pc' and 'vms/gawkmisc.vms'. The use of separate
+ suffixes, instead of plain 'gawkmisc.c', makes it possible to move
+ files from a port's subdirectory into the main subdirectory,
+ without accidentally destroying the real 'gawkmisc.c' file.
+ (Currently, this is only an issue for the PC operating system
+ ports.)
+
+ 8. Supply a 'Makefile' as well as any other C source and header files
+ that are necessary for your operating system. All your code should
+ be in a separate subdirectory, with a name that is the same as, or
+ reminiscent of, either your operating system or the computer
+ system. If possible, try to structure things so that it is not
+ necessary to move files out of the subdirectory into the main
+ source directory. If that is not possible, then be sure to avoid
+ using names for your files that duplicate the names of files in the
+ main source directory.
+
+ 9. Update the documentation. Please write a section (or sections) for
+ this Info file describing the installation and compilation steps
+ needed to compile and/or install 'gawk' for your system.
+
+ Following these steps makes it much easier to integrate your changes
+into 'gawk' and have them coexist happily with other operating systems'
+code that is already there.
+
+ In the code that you supply and maintain, feel free to use a coding
+style and brace layout that suits your taste.
+
+
+File: gawk.info, Node: Derived Files, Prev: New Ports, Up: Additions
+
+C.2.4 Why Generated Files Are Kept In Git
+-----------------------------------------
+
+If you look at the 'gawk' source in the Git repository, you will notice
+that it includes files that are automatically generated by GNU
+infrastructure tools, such as 'Makefile.in' from Automake and even
+'configure' from Autoconf.
+
+ This is different from many Free Software projects that do not store
+the derived files, because that keeps the repository less cluttered, and
+it is easier to see the substantive changes when comparing versions and
+trying to understand what changed between commits.
+
+ However, there are several reasons why the 'gawk' maintainer likes to
+have everything in the repository.
+
+ First, because it is then easy to reproduce any given version
+completely, without relying upon the availability of (older, likely
+obsolete, and maybe even impossible to find) other tools.
+
+ As an extreme example, if you ever even think about trying to
+compile, oh, say, the V7 'awk', you will discover that not only do you
+have to bootstrap the V7 'yacc' to do so, but you also need the V7
+'lex'. And the latter is pretty much impossible to bring up on a modern
+GNU/Linux system.(1)
+
+ (Or, let's say 'gawk' 1.2 required 'bison' whatever-it-was in 1989
+and that there was no 'awkgram.c' file in the repository. Is there a
+guarantee that we could find that 'bison' version? Or that _it_ would
+build?)
+
+ If the repository has all the generated files, then it's easy to just
+check them out and build. (Or _easier_, depending upon how far back we
+go.)
+
+ And that brings us to the second (and stronger) reason why all the
+files really need to be in Git. It boils down to whom you cater
+to--the 'gawk' developer(s), or the user who just wants to check out a
+version and try it out?
+
+   The 'gawk' maintainer wants it to be possible for any interested
+'awk' user in the world to just clone the repository, check out the
+branch of interest, and build it, without having to have the correct
+version(s) of the autotools.(2) That is the point of the
+'bootstrap.sh' file. It touches the various other files in the right
+order such that
+
+ # The canonical incantation for building GNU software:
+ ./bootstrap.sh && ./configure && make
+
+will _just work_.
+
+ This is extremely important for the 'master' and 'gawk-X.Y-stable'
+branches.
+
+ Further, the 'gawk' maintainer would argue that it's also important
+for the 'gawk' developers. When he tried to check out the 'xgawk'
+branch(3) to build it, he couldn't. (No 'ltmain.sh' file, and he had no
+idea how to create it, and that was not the only problem.)
+
+ He felt _extremely_ frustrated. With respect to that branch, the
+maintainer is no different than Jane User who wants to try to build
+'gawk-4.1-stable' or 'master' from the repository.
+
+ Thus, the maintainer thinks that it's not just important, but
+critical, that for any given branch, the above incantation _just works_.
+
+ A third reason to have all the files is that without them, using 'git
+bisect' to try to find the commit that introduced a bug is exceedingly
+difficult. The maintainer tried to do that on another project that
+requires running bootstrapping scripts just to create 'configure' and so
+on; it was really painful. When the repository is self-contained, using
+'git bisect' in it is very easy.
+
+ What are some of the consequences and/or actions to take?
+
+ 1. We don't mind that there are differing files in the different
+ branches as a result of different versions of the autotools.
+
+ A. It's the maintainer's job to merge them and he will deal with
+ it.
+
+ B. He is really good at 'git diff x y > /tmp/diff1 ; gvim
+ /tmp/diff1' to remove the diffs that aren't of interest in
+ order to review code.
+
+ 2. It would certainly help if everyone used the same versions of the
+ GNU tools as he does, which in general are the latest released
+ versions of Automake, Autoconf, 'bison', and GNU 'gettext'.
+
+ Installing from source is quite easy. It's how the maintainer
+ worked for years (and still works). He had '/usr/local/bin' at the
+ front of his 'PATH' and just did:
+
+ wget http://ftp.gnu.org/gnu/PACKAGE/PACKAGE-X.Y.Z.tar.gz
+ tar -xpzvf PACKAGE-X.Y.Z.tar.gz
+ cd PACKAGE-X.Y.Z
+ ./configure && make && make check
+ make install # as root
+
+ Most of the above was originally written by the maintainer to other
+'gawk' developers. It raised the objection from one of the developers
+"... that anybody pulling down the source from Git is not an end user."
+
+ However, this is not true. There are "power 'awk' users" who can
+build 'gawk' (using the magic incantation shown previously) but who
+can't program in C. Thus, the major branches should be kept buildable
+all the time.
+
+   It was then suggested that there be a 'cron' job to create nightly
+tarballs of "the source." Here, the problem is that there are multiple
+source trees, one corresponding to each of the various branches! So,
+nightly tarballs aren't the answer, especially as the repository can go
+for weeks without significant change being introduced.
+
+ Fortunately, the Git server can meet this need. For any given branch
+named BRANCHNAME, use:
+
+ wget http://git.savannah.gnu.org/cgit/gawk.git/snapshot/gawk-BRANCHNAME.tar.gz
+
+to retrieve a snapshot of the given branch.
+
+ ---------- Footnotes ----------
+
+ (1) We tried. It was painful.
+
+ (2) There is one GNU program that is (in our opinion) severely
+difficult to bootstrap from the Git repository. For example, on the
+author's old (but still working) PowerPC Macintosh with Mac OS X 10.5,
+it was necessary to bootstrap a ton of software, starting with Git
+itself, in order to try to work with the latest code. It's not
+pleasant, and especially on older systems, it's a big waste of time.
+
+ Starting with the latest tarball was no picnic either. The
+maintainers had dropped '.gz' and '.bz2' files and only distribute
+'.tar.xz' files. It was necessary to bootstrap 'xz' first!
+
+ (3) A branch (since removed) created by one of the other developers
+that did not include the generated files.
+
+
+File: gawk.info, Node: Future Extensions, Next: Implementation Limitations, Prev: Additions, Up: Notes
+
+C.3 Probable Future Extensions
+==============================
+
+ AWK is a language similar to PERL, only considerably more elegant.
+ -- _Arnold Robbins_
+
+ Hey!
+ -- _Larry Wall_
+
+ The 'TODO' file in the 'master' branch of the 'gawk' Git repository
+lists possible future enhancements. Some of these relate to the source
+code, and others to possible new features. Please see that file for the
+list. *Note Additions::, if you are interested in tackling any of the
+projects listed there.
+
+
+File: gawk.info, Node: Implementation Limitations, Next: Extension Design, Prev: Future Extensions, Up: Notes
+
+C.4 Some Limitations of the Implementation
+==========================================
+
+The following table describes the limits of 'gawk' on a Unix-like
+system (although they can vary even there). Other systems may have
+different limits.
+
+Item Limit
+--------------------------------------------------------------------------
+Characters in a character 2^(number of bits per byte)
+class
+Length of input record 'MAX_INT'
+Length of output record Unlimited
+Length of source line Unlimited
+Number of fields in a 'MAX_LONG'
+record
+Number of file redirections Unlimited
+Number of input records in 'MAX_LONG'
+one file
+Number of input records 'MAX_LONG'
+total
+Number of pipe redirections min(number of processes per user, number
+ of open files)
+Numeric values Double-precision floating point (if not
+ using MPFR)
+Size of a field 'MAX_INT'
+Size of a literal string 'MAX_INT'
+Size of a printf string 'MAX_INT'
+
+
+File: gawk.info, Node: Extension Design, Next: Old Extension Mechanism, Prev: Implementation Limitations, Up: Notes
+
+C.5 Extension API Design
+========================
+
+This minor node documents the design of the extension API, including a
+discussion of some of the history and problems that needed to be solved.
+
+ The first version of extensions for 'gawk' was developed in the
+mid-1990s and released with 'gawk' 3.1 in the late 1990s. The basic
+mechanisms and design remained unchanged for close to 15 years, until
+2012.
+
+ The old extension mechanism used data types and functions from 'gawk'
+itself, with a "clever hack" to install extension functions.
+
+ 'gawk' included some sample extensions, of which a few were really
+useful. However, it was clear from the outset that the extension
+mechanism was bolted onto the side and was not really well thought out.
+
+* Menu:
+
+* Old Extension Problems:: Problems with the old mechanism.
+* Extension New Mechanism Goals:: Goals for the new mechanism.
+* Extension Other Design Decisions:: Some other design decisions.
+* Extension Future Growth:: Some room for future growth.
+
+
+File: gawk.info, Node: Old Extension Problems, Next: Extension New Mechanism Goals, Up: Extension Design
+
+C.5.1 Problems With The Old Mechanism
+-------------------------------------
+
+The old extension mechanism had several problems:
+
+ * It depended heavily upon 'gawk' internals. Any time the 'NODE'
+ structure(1) changed, an extension would have to be recompiled.
+ Furthermore, to really write extensions required understanding
+ something about 'gawk''s internal functions. There was some
+ documentation in this Info file, but it was quite minimal.
+
+ * Being able to call into 'gawk' from an extension required linker
+ facilities that are common on Unix-derived systems but that did not
+ work on MS-Windows systems; users wanting extensions on MS-Windows
+ had to statically link them into 'gawk', even though MS-Windows
+ supports dynamic loading of shared objects.
+
+ * The API would change occasionally as 'gawk' changed; no
+ compatibility between versions was ever offered or planned for.
+
+ Despite the drawbacks, the 'xgawk' project developers forked 'gawk'
+and developed several significant extensions. They also enhanced
+'gawk''s facilities relating to file inclusion and shared object access.
+
+ A new API was desired for a long time, but only in 2012 did the
+'gawk' maintainer and the 'xgawk' developers finally start working on it
+together. More information about the 'xgawk' project is provided in
+*note gawkextlib::.
+
+ ---------- Footnotes ----------
+
+ (1) A critical central data structure inside 'gawk'.
+
+
+File: gawk.info, Node: Extension New Mechanism Goals, Next: Extension Other Design Decisions, Prev: Old Extension Problems, Up: Extension Design
+
+C.5.2 Goals For A New Mechanism
+-------------------------------
+
+Some goals for the new API were:
+
+ * The API should be independent of 'gawk' internals. Changes in
+ 'gawk' internals should not be visible to the writer of an
+ extension function.
+
+ * The API should provide _binary_ compatibility across 'gawk'
+ releases as long as the API itself does not change.
+
+ * The API should enable extensions written in C or C++ to have
+ roughly the same "appearance" to 'awk'-level code as 'awk'
+ functions do. This means that extensions should have:
+
+ - The ability to access function parameters.
+
+ - The ability to turn an undefined parameter into an array (call
+ by reference).
+
+ - The ability to create, access and update global variables.
+
+ - Easy access to all the elements of an array at once ("array
+        flattening") in order to loop over all the elements in an easy
+ fashion for C code.
+
+ - The ability to create arrays (including 'gawk''s true arrays
+ of arrays).
+
+ Some additional important goals were:
+
+ * The API should use only features in ISO C 90, so that extensions
+ can be written using the widest range of C and C++ compilers. The
+     header should include the appropriate '#ifdef __cplusplus' and
+     'extern "C"' magic, sketched after this list, so that a C++
+     compiler could be used. (If using C++, the runtime system has to
+     be smart enough to call any constructors and destructors, as
+     'gawk' is a C program. As of this writing, this has not been
+     tested.)
+
+ * The API mechanism should not require access to 'gawk''s symbols(1)
+ by the compile-time or dynamic linker, in order to enable creation
+ of extensions that also work on MS-Windows.
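+
+   As a purely generic illustration (this is the standard C/C++ idiom,
+not a copy of the actual 'gawkapi.h' text), the guard in such a header
+looks like this:
+
+     /* Generic sketch of the usual guard; not copied from gawkapi.h. */
+     #ifdef __cplusplus
+     extern "C" {
+     #endif
+
+     /* ... C declarations for the extension API go here ... */
+
+     #ifdef __cplusplus
+     }   /* close the extern "C" block */
+     #endif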
+
+ During development, it became clear that there were other features
+that should be available to extensions, which were also subsequently
+provided:
+
+ * Extensions should have the ability to hook into 'gawk''s I/O
+ redirection mechanism. In particular, the 'xgawk' developers
+ provided a so-called "open hook" to take over reading records.
+ During development, this was generalized to allow extensions to
+ hook into input processing, output processing, and two-way I/O.
+
+ * An extension should be able to provide a "call back" function to
+ perform cleanup actions when 'gawk' exits.
+
+ * An extension should be able to provide a version string so that
+ 'gawk''s '--version' option can provide information about
+ extensions as well.
+
+ The requirement to avoid access to 'gawk''s symbols is, at first
+glance, a difficult one to meet.
+
+ One design, apparently used by Perl and Ruby and maybe others, would
+be to make the mainline 'gawk' code into a library, with the 'gawk'
+utility a small C 'main()' function linked against the library.
+
+ This seemed like the tail wagging the dog, complicating build and
+installation and making a simple copy of the 'gawk' executable from one
+system to another (or one place to another on the same system!) into a
+chancy operation.
+
+ Pat Rankin suggested the solution that was adopted. *Note Extension
+Mechanism Outline::, for the details.
+
+ ---------- Footnotes ----------
+
+ (1) The "symbols" are the variables and functions defined inside
+'gawk'. Access to these symbols by code external to 'gawk' loaded
+dynamically at runtime is problematic on MS-Windows.
+
+
+File: gawk.info, Node: Extension Other Design Decisions, Next: Extension Future Growth, Prev: Extension New Mechanism Goals, Up: Extension Design
+
+C.5.3 Other Design Decisions
+----------------------------
+
+As an arbitrary design decision, extensions can read the values of
+predefined variables and arrays (such as 'ARGV' and 'FS'), but cannot
+change them, with the exception of 'PROCINFO'.
+
+ The reason for this is to prevent an extension function from
+affecting the flow of an 'awk' program outside its control. While a
+real 'awk' function can do what it likes, that is at the discretion of
+the programmer. An extension function should provide a service or make
+a C API available for use within 'awk', and not mess with 'FS' or 'ARGC'
+and 'ARGV'.
+
+ In addition, it becomes easy to start down a slippery slope. How
+much access to 'gawk' facilities do extensions need? Do they need
+'getline'? What about calling 'gsub()' or compiling regular
+expressions? What about calling into 'awk' functions? (_That_ would be
+messy.)
+
+ In order to avoid these issues, the 'gawk' developers chose to start
+with the simplest, most basic features that are still truly useful.
+
+ Another decision is that although 'gawk' provides nice things like
+MPFR, and arrays indexed internally by integers, these features are not
+being brought out to the API in order to keep things simple and close to
+traditional 'awk' semantics. (In fact, arrays indexed internally by
+integers are so transparent that they aren't even documented!)
+
+ Additionally, all functions in the API check that their pointer input
+parameters are not 'NULL'. If they are, they return an error. (It is a
+good idea for extension code to verify that pointers received from
+'gawk' are not 'NULL'. Such a thing should not happen, but the 'gawk'
+developers are only human, and they have been known to occasionally make
+mistakes.)
+
+ With time, the API will undoubtedly evolve; the 'gawk' developers
+expect this to be driven by user needs. For now, the current API seems
+to provide a minimal yet powerful set of features for creating
+extensions.
+
+
+File: gawk.info, Node: Extension Future Growth, Prev: Extension Other Design Decisions, Up: Extension Design
+
+C.5.4 Room For Future Growth
+----------------------------
+
+The API can later be expanded, in two ways:
+
+ * 'gawk' passes an "extension id" into the extension when it first
+ loads the extension. The extension then passes this id back to
+ 'gawk' with each function call. This mechanism allows 'gawk' to
+ identify the extension calling into it, should it need to know.
+
+ * Similarly, the extension passes a "name space" into 'gawk' when it
+ registers each extension function. This accommodates a possible
+ future mechanism for grouping extension functions and possibly
+ avoiding name conflicts.
+
+ Of course, as of this writing, no decisions have been made with
+respect to any of the above.
+
+
+File: gawk.info, Node: Old Extension Mechanism, Next: Notes summary, Prev: Extension Design, Up: Notes
+
+C.6 Compatibility For Old Extensions
+====================================
+
+*note Dynamic Extensions::, describes the supported API and mechanisms
+for writing extensions for 'gawk'. This API was introduced in version
+4.1. However, for many years 'gawk' provided an extension mechanism
+that required knowledge of 'gawk' internals and that was not as well
+designed.
+
+ In order to provide a transition period, 'gawk' version 4.1 continues
+to support the original extension mechanism. This will be true for the
+life of exactly one major release. This support will be withdrawn, and
+removed from the source code, at the next major release.
+
+ Briefly, original-style extensions should be compiled by including
+the 'awk.h' header file in the extension source code. Additionally, you
+must define the identifier 'GAWK' when building (use '-DGAWK' with
+Unix-style compilers). Otherwise, the definitions in 'gawkapi.h' will
+cause conflicts with those in 'awk.h' and your extension will not
+compile.
+
+ Just as in previous versions, you load an old-style extension with
+the 'extension()' built-in function (which is not otherwise documented).
+This function in turn finds and loads the shared object file containing
+the extension and calls its 'dl_load()' C routine.
+
+ Because original-style and new-style extensions use different
+initialization routines ('dl_load()' versus 'dlload()'), they may safely
+be installed in the same directory (to be found by 'AWKLIBPATH') without
+conflict.
+
+ The 'gawk' development team strongly recommends that you convert any
+old extensions that you may have to use the new API described in *note
+Dynamic Extensions::.
+
+
+File: gawk.info, Node: Notes summary, Prev: Old Extension Mechanism, Up: Notes
+
+C.7 Summary
+===========
+
+ * 'gawk''s extensions can be disabled with either the '--traditional'
+ option or with the '--posix' option. The '--parsedebug' option is
+ available if 'gawk' is compiled with '-DDEBUG'.
+
+ * The source code for 'gawk' is maintained in a publicly accessible
+ Git repository. Anyone may check it out and view the source.
+
+ * Contributions to 'gawk' are welcome. Following the steps outlined
+ in this major node will make it easier to integrate your
+ contributions into the code base. This applies both to new feature
+ contributions and to ports to additional operating systems.
+
+ * 'gawk' has some limits--generally those that are imposed by the
+ machine architecture.
+
+ * The extension API design was intended to solve a number of problems
+ with the previous extension mechanism, enable features needed by
+ the 'xgawk' project, and provide binary compatibility going
+ forward.
+
+ * The previous extension mechanism is still supported in version 4.1
+ of 'gawk', but it _will_ be removed in the next major release.
+
+
+File: gawk.info, Node: Basic Concepts, Next: Glossary, Prev: Notes, Up: Top
+
+Appendix D Basic Programming Concepts
+*************************************
+
+This major node attempts to define some of the basic concepts and terms
+that are used throughout the rest of this Info file. As this Info file
+is specifically about 'awk', and not about computer programming in
+general, the coverage here is by necessity fairly cursory and
+simplistic. (If you need more background, there are many other
+introductory texts that you should refer to instead.)
+
+* Menu:
+
+* Basic High Level:: The high level view.
+* Basic Data Typing:: A very quick intro to data types.
+
+
+File: gawk.info, Node: Basic High Level, Next: Basic Data Typing, Up: Basic Concepts
+
+D.1 What a Program Does
+=======================
+
+At the most basic level, the job of a program is to process some input
+data and produce results. See *note Figure D.1: figure-general-flow.
+
+
++------+            / \            +---------+
+| Data | -----> < Program > -----> | Results |
++------+         \_______/         +---------+
+
+Figure D.1: General Program Flow
+
+ The "program" in the figure can be either a compiled program(1) (such
+as 'ls'), or it may be "interpreted". In the latter case, a
+machine-executable program such as 'awk' reads your program, and then
+uses the instructions in your program to process the data.
+
+ When you write a program, it usually consists of the following, very
+basic set of steps, as shown in *note Figure D.2: figure-process-flow.:
+
+
++----------------+          / More \  No      +----------+
+| Initialization | -------> < Data > -------> | Clean Up |
++----------------+    ^     \   ?  /          +----------+
+                      |      +--+-+
+                      |         | Yes
+                      |         |
+                      |         V
+                      |     +---------+
+                      +-----+ Process |
+                            +---------+
+
+Figure D.2: Basic Program Steps
+
+Initialization
+ These are the things you do before actually starting to process
+ data, such as checking arguments, initializing any data you need to
+ work with, and so on. This step corresponds to 'awk''s 'BEGIN'
+ rule (*note BEGIN/END::).
+
+ If you were baking a cake, this might consist of laying out all the
+ mixing bowls and the baking pan, and making sure you have all the
+ ingredients that you need.
+
+Processing
+ This is where the actual work is done. Your program reads data,
+ one logical chunk at a time, and processes it as appropriate.
+
+ In most programming languages, you have to manually manage the
+ reading of data, checking to see if there is more each time you
+ read a chunk. 'awk''s pattern-action paradigm (*note Getting
+ Started::) handles the mechanics of this for you.
+
+ In baking a cake, the processing corresponds to the actual labor:
+ breaking eggs, mixing the flour, water, and other ingredients, and
+ then putting the cake into the oven.
+
+Clean Up
+ Once you've processed all the data, you may have things you need to
+ do before exiting. This step corresponds to 'awk''s 'END' rule
+ (*note BEGIN/END::).
+
+ After the cake comes out of the oven, you still have to wrap it in
+ plastic wrap to keep anyone from tasting it, as well as wash the
+ mixing bowls and utensils.
+
+ An "algorithm" is a detailed set of instructions necessary to
+accomplish a task, or process data. It is much the same as a recipe for
+baking a cake. Programs implement algorithms. Often, it is up to you
+to design the algorithm and implement it, simultaneously.
+
+ The "logical chunks" we talked about previously are called "records",
+similar to the records a company keeps on employees, a school keeps for
+students, or a doctor keeps for patients. Each record has many
+component parts, such as first and last names, date of birth, address,
+and so on. The component parts are referred to as the "fields" of the
+record.
+
+ The act of reading data is termed "input", and that of generating
+results, not too surprisingly, is termed "output". They are often
+referred to together as "input/output," and even more often, as "I/O"
+for short. (You will also see "input" and "output" used as verbs.)
+
+ 'awk' manages the reading of data for you, as well as the breaking it
+up into records and fields. Your program's job is to tell 'awk' what to
+do with the data. You do this by describing "patterns" in the data to
+look for, and "actions" to execute when those patterns are seen. This
+"data-driven" nature of 'awk' programs usually makes them both easier to
+write and easier to read.
+
+ ---------- Footnotes ----------
+
+ (1) Compiled programs are typically written in lower-level languages
+such as C, C++, or Ada, and then translated, or "compiled", into a form
+that the computer can execute directly.
+
+
+File: gawk.info, Node: Basic Data Typing, Prev: Basic High Level, Up: Basic Concepts
+
+D.2 Data Values in a Computer
+=============================
+
+In a program, you keep track of information and values in things called
+"variables". A variable is just a name for a given value, such as
+'first_name', 'last_name', 'address', and so on. 'awk' has several
+predefined variables, and it has special names to refer to the current
+input record and the fields of the record. You may also group multiple
+associated values under one name, as an array.
+
+ Data, particularly in 'awk', consists of either numeric values, such
+as 42 or 3.1415927, or string values. String values are essentially
+anything that's not a number, such as a name. Strings are sometimes
+referred to as "character data", since they store the individual
+characters that comprise them. Individual variables, as well as numeric
+and string variables, are referred to as "scalar" values. Groups of
+values, such as arrays, are not scalars.
+
+ *note Computer Arithmetic::, provided a basic introduction to numeric
+types (integer and floating-point) and how they are used in a computer.
+Please review that information, including a number of caveats that were
+presented.
+
+ While you are probably used to the idea of a number without a value
+(i.e., zero), it takes a bit more getting used to the idea of
+zero-length character data. Nevertheless, such a thing exists. It is
+called the "null string". The null string is character data that has no
+value. In other words, it is empty. It is written in 'awk' programs
+like this: '""'.
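+
+   For instance, a rule such as the following (the choice of field is
+arbitrary) tests whether a field holds the null string:
+
+     $3 == "" { print "record", NR, "has an empty third field" }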
+
+ Humans are used to working in decimal; i.e., base 10. In base 10,
+numbers go from 0 to 9, and then "roll over" into the next column.
+(Remember grade school? 42 = 4 x 10 + 2.)
+
+ There are other number bases though. Computers commonly use base 2
+or "binary", base 8 or "octal", and base 16 or "hexadecimal". In
+binary, each column represents two times the value in the column to its
+right. Each column may contain either a 0 or a 1. Thus, binary 1010
+represents (1 x 8) + (0 x 4) + (1 x 2) + (0 x 1), or decimal 10. Octal
+and hexadecimal are discussed more in *note Nondecimal-numbers::.
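+
+   For example, 'gawk''s 'strtonum()' function understands these
+notations, treating a leading '0x' as hexadecimal and a leading '0' as
+octal:
+
+     $ gawk 'BEGIN { print strtonum("0x12"), strtonum("012") }'
+     -| 18 10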
+
+ At the very lowest level, computers store values as groups of binary
+digits, or "bits". Modern computers group bits into groups of eight,
+called "bytes". Advanced applications sometimes have to manipulate bits
+directly, and 'gawk' provides functions for doing so.
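+
+   For example, 'gawk''s 'and()', 'or()', and 'lshift()' functions
+(*note Bitwise Functions::) operate directly on the bits of their
+numeric arguments:
+
+     $ gawk 'BEGIN { print and(12, 10), or(12, 10), lshift(1, 3) }'
+     -| 8 14 8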
+
+ Programs are written in programming languages. Hundreds, if not
+thousands, of programming languages exist. One of the most popular is
+the C programming language. The C language had a very strong influence
+on the design of the 'awk' language.
+
+ There have been several versions of C. The first is often referred to
+as "K&R" C, after the initials of Brian Kernighan and Dennis Ritchie,
+the authors of the first book on C. (Dennis Ritchie created the
+language, and Brian Kernighan was one of the creators of 'awk'.)
+
+ In the mid-1980s, an effort began to produce an international
+standard for C. This work culminated in 1989, with the production of the
+ANSI standard for C. This standard became an ISO standard in 1990. In
+1999, a revised ISO C standard was approved and released. Where it
+makes sense, POSIX 'awk' is compatible with 1999 ISO C.
+
+
+File: gawk.info, Node: Glossary, Next: Copying, Prev: Basic Concepts, Up: Top
+
+Glossary
+********
+
+Action
+ A series of 'awk' statements attached to a rule. If the rule's
+ pattern matches an input record, 'awk' executes the rule's action.
+ Actions are always enclosed in braces. (*Note Action Overview::.)
+
+Ada
+ A programming language originally defined by the U.S. Department of
+ Defense for embedded programming. It was designed to enforce good
+ Software Engineering practices.
+
+Amazing 'awk' Assembler
+ Henry Spencer at the University of Toronto wrote a retargetable
+ assembler completely as 'sed' and 'awk' scripts. It is thousands
+ of lines long, including machine descriptions for several eight-bit
+ microcomputers. It is a good example of a program that would have
+ been better written in another language. You can get it from
+ <http://awk.info/?awk100/aaa>.
+
+Amazingly Workable Formatter ('awf')
+ Henry Spencer at the University of Toronto wrote a formatter that
+ accepts a large subset of the 'nroff -ms' and 'nroff -man'
+ formatting commands, using 'awk' and 'sh'. It is available from
+ <http://awk.info/?tools/awf>.
+
+Anchor
+ The regexp metacharacters '^' and '$', which force the match to the
+ beginning or end of the string, respectively.
+
+ANSI
+ The American National Standards Institute. This organization
+ produces many standards, among them the standards for the C and C++
+ programming languages. These standards often become international
+ standards as well. See also "ISO."
+
+Argument
+ An argument can be two different things. It can be an option or a
+ file name passed to a command while invoking it from the command
+ line, or it can be something passed to a "function" inside a
+     program, e.g., inside 'awk'.
+
+ In the latter case, an argument can be passed to a function in two
+ ways. Either it is given to the called function by value, i.e., a
+ copy of the value of the variable is made available to the called
+ function, but the original variable cannot be modified by the
+ function itself; or it is given by reference, i.e., a pointer to
+ the interested variable is passed to the function, which can then
+ directly modify it. In 'awk' scalars are passed by value, and
+ arrays are passed by reference. See "Pass By Value/Reference."
+
+Array
+ A grouping of multiple values under the same name. Most languages
+ just provide sequential arrays. 'awk' provides associative arrays.
+
+Assertion
+ A statement in a program that a condition is true at this point in
+ the program. Useful for reasoning about how a program is supposed
+ to behave.
+
+Assignment
+ An 'awk' expression that changes the value of some 'awk' variable
+ or data object. An object that you can assign to is called an
+ "lvalue". The assigned values are called "rvalues". *Note
+ Assignment Ops::.
+
+Associative Array
+ Arrays in which the indices may be numbers or strings, not just
+ sequential integers in a fixed range.
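+
+     For example, this illustrative pair of rules counts how many times
+     each value of the first field occurs:
+
+          { count[$1]++ }
+          END { for (word in count) print word, count[word] }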
+
+'awk' Language
+ The language in which 'awk' programs are written.
+
+'awk' Program
+ An 'awk' program consists of a series of "patterns" and "actions",
+ collectively known as "rules". For each input record given to the
+ program, the program's rules are all processed in turn. 'awk'
+ programs may also contain function definitions.
+
+'awk' Script
+ Another name for an 'awk' program.
+
+Bash
+ The GNU version of the standard shell (the Bourne-Again SHell).
+ See also "Bourne Shell."
+
+Binary
+ Base-two notation, where the digits are '0'-'1'. Since electronic
+ circuitry works "naturally" in base 2 (just think of Off/On),
+ everything inside a computer is calculated using base 2. Each
+ digit represents the presence (or absence) of a power of 2 and is
+ called a "bit". So, for example, the base-two number '10101' is
+ the same as decimal 21, ((1 x 16) + (1 x 4) + (1 x 1)).
+
+ Since base-two numbers quickly become very long to read and write,
+ they are usually grouped by 3 (i.e., they are read as octal
+ numbers), or by 4 (i.e., they are read as hexadecimal numbers).
+ There is no direct way to insert base 2 numbers in a C program. If
+ need arises, such numbers are usually inserted as octal or
+ hexadecimal numbers. The number of base-two digits that fit into
+ registers used for representing integer numbers in computers is a
+ rough indication of the computing power of the computer itself.
+ Most computers nowadays use 64 bits for representing integer
+ numbers in their registers, but 32-bit, 16-bit and 8-bit registers
+ have been widely used in the past. *Note Nondecimal-numbers::.
+Bit
+ Short for "Binary Digit." All values in computer memory ultimately
+ reduce to binary digits: values that are either zero or one.
+ Groups of bits may be interpreted differently--as integers,
+ floating-point numbers, character data, addresses of other memory
+ objects, or other data. 'awk' lets you work with floating-point
+ numbers and strings. 'gawk' lets you manipulate bit values with
+ the built-in functions described in *note Bitwise Functions::.
+
+ Computers are often defined by how many bits they use to represent
+ integer values. Typical systems are 32-bit systems, but 64-bit
+ systems are becoming increasingly popular, and 16-bit systems have
+ essentially disappeared.
+
+Boolean Expression
+ Named after the English mathematician Boole. See also "Logical
+ Expression."
+
+Bourne Shell
+ The standard shell ('/bin/sh') on Unix and Unix-like systems,
+ originally written by Steven R. Bourne at Bell Laboratories. Many
+ shells (Bash, 'ksh', 'pdksh', 'zsh') are generally upwardly
+ compatible with the Bourne shell.
+
+Braces
+ The characters '{' and '}'. Braces are used in 'awk' for
+ delimiting actions, compound statements, and function bodies.
+
+Bracket Expression
+ Inside a "regular expression", an expression included in square
+ brackets, meant to designate a single character as belonging to a
+ specified character class. A bracket expression can contain a list
+ of one or more characters, like '[abc]', a range of characters,
+ like '[A-Z]', or a name, delimited by ':', that designates a known
+ set of characters, like '[:digit:]'. The form of bracket
+ expression enclosed between ':' is independent of the underlying
+     representation of the characters themselves, which could utilize the
+     ASCII, EBCDIC, or Unicode codesets, depending on the architecture
+ of the computer system, and on localization. See also "Regular
+ Expression."
+
+Built-in Function
+ The 'awk' language provides built-in functions that perform various
+ numerical, I/O-related, and string computations. Examples are
+ 'sqrt()' (for the square root of a number) and 'substr()' (for a
+ substring of a string). 'gawk' provides functions for timestamp
+ management, bit manipulation, array sorting, type checking, and
+ runtime string translation. (*Note Built-in::.)
+
+Built-in Variable
+ 'ARGC', 'ARGV', 'CONVFMT', 'ENVIRON', 'FILENAME', 'FNR', 'FS',
+ 'NF', 'NR', 'OFMT', 'OFS', 'ORS', 'RLENGTH', 'RSTART', 'RS', and
+ 'SUBSEP' are the variables that have special meaning to 'awk'. In
+ addition, 'ARGIND', 'BINMODE', 'ERRNO', 'FIELDWIDTHS', 'FPAT',
+ 'IGNORECASE', 'LINT', 'PROCINFO', 'RT', and 'TEXTDOMAIN' are the
+ variables that have special meaning to 'gawk'. Changing some of
+ them affects 'awk''s running environment. (*Note Built-in
+ Variables::.)
+
+C
+ The system programming language that most GNU software is written
+ in. The 'awk' programming language has C-like syntax, and this
+ Info file points out similarities between 'awk' and C when
+ appropriate.
+
+ In general, 'gawk' attempts to be as similar to the 1990 version of
+ ISO C as makes sense.
+
+C Shell
+ The C Shell ('csh' or its improved version, 'tcsh') is a Unix shell
+ that was created by Bill Joy in the late 1970s. The C shell was
+ differentiated from other shells by its interactive features and
+ overall style, which looks more like C. The C Shell is not backward
+ compatible with the Bourne Shell, so special attention is required
+ when converting scripts written for other Unix shells to the C
+ shell, especially with regard to the management of shell variables.
+ See also "Bourne Shell."
+
+C++
+ A popular object-oriented programming language derived from C.
+
+Character Class
+ See "Bracket Expression."
+
+Character List
+ See "Bracket Expression."
+
+Character Set
+ The set of numeric codes used by a computer system to represent the
+ characters (letters, numbers, punctuation, etc.) of a particular
+ country or place. The most common character set in use today is
+ ASCII (American Standard Code for Information Interchange). Many
+ European countries use an extension of ASCII known as ISO-8859-1
+ (ISO Latin-1). The Unicode character set (http://www.unicode.org)
+ is increasingly popular and standard, and is particularly widely
+ used on GNU/Linux systems.
+
+CHEM
+ A preprocessor for 'pic' that reads descriptions of molecules and
+ produces 'pic' input for drawing them. It was written in 'awk' by
+ Brian Kernighan and Jon Bentley, and is available from
+ <http://netlib.org/typesetting/chem>.
+
+Comparison Expression
+ A relation that is either true or false, such as 'a < b'.
+ Comparison expressions are used in 'if', 'while', 'do', and 'for'
+ statements, and in patterns to select which input records to
+ process. (*Note Typing and Comparison::.)
+
+Compiler
+ A program that translates human-readable source code into
+ machine-executable object code. The object code is then executed
+ directly by the computer. See also "Interpreter."
+
+Complemented Bracket Expression
+ The negation of a "bracket expression". All that is _not_
+ described by a given bracket expression. The symbol '^' precedes
+     the negated bracket expression.  E.g.: '[^[:digit:]]' designates
+ whatever character is not a digit. '[^bad]' designates whatever
+ character is not one of the letters 'b', 'a', or 'd'. See "Bracket
+ Expression."
+
+Compound Statement
+ A series of 'awk' statements, enclosed in curly braces. Compound
+ statements may be nested. (*Note Statements::.)
+
+Computed Regexps
+ See "Dynamic Regular Expressions."
+
+Concatenation
+ Concatenating two strings means sticking them together, one after
+ another, producing a new string. For example, the string 'foo'
+ concatenated with the string 'bar' gives the string 'foobar'.
+ (*Note Concatenation::.)
+
+Conditional Expression
+ An expression using the '?:' ternary operator, such as 'EXPR1 ?
+ EXPR2 : EXPR3'. The expression EXPR1 is evaluated; if the result
+ is true, the value of the whole expression is the value of EXPR2;
+ otherwise the value is EXPR3. In either case, only one of EXPR2
+ and EXPR3 is evaluated. (*Note Conditional Exp::.)
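+
+     For instance, in the illustrative fragment below, 'max' receives
+     the larger of 'a' and 'b':
+
+          max = (a > b) ? a : b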
+
+Control Statement
+ A control statement is an instruction to perform a given operation
+ or a set of operations inside an 'awk' program, if a given
+ condition is true. Control statements are: 'if', 'for', 'while',
+ and 'do' (*note Statements::).
+
+Cookie
+ A peculiar goodie, token, saying or remembrance produced by or
+ presented to a program. (With thanks to Professor Doug McIlroy.)
+
+Coprocess
+     A subordinate program with which two-way communication is
+     possible.
+
+Curly Braces
+ See "Braces."
+
+Dark Corner
+ An area in the language where specifications often were (or still
+ are) not clear, leading to unexpected or undesirable behavior.
+ Such areas are marked in this Info file with "(d.c.)" in the text
+ and are indexed under the heading "dark corner."
+
+Data Driven
+ A description of 'awk' programs, where you specify the data you are
+ interested in processing, and what to do when that data is seen.
+
+Data Objects
+ These are numbers and strings of characters. Numbers are converted
+ into strings and vice versa, as needed. (*Note Conversion::.)
+
+Deadlock
+ The situation in which two communicating processes are each waiting
+ for the other to perform an action.
+
+Debugger
+ A program used to help developers remove "bugs" from (de-bug) their
+ programs.
+
+Double Precision
+ An internal representation of numbers that can have fractional
+ parts. Double precision numbers keep track of more digits than do
+ single precision numbers, but operations on them are sometimes more
+ expensive. This is the way 'awk' stores numeric values. It is the
+ C type 'double'.
+
+Dynamic Regular Expression
+ A dynamic regular expression is a regular expression written as an
+ ordinary expression. It could be a string constant, such as
+ '"foo"', but it may also be an expression whose value can vary.
+ (*Note Computed Regexps::.)
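+
+     For example, in this illustrative fragment the regexp is built at
+     run time from the variable 'prefix':
+
+          BEGIN { pat = "^" prefix }
+          $0 ~ pat { print }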
+
+Empty String
+ See "Null String."
+
+Environment
+ A collection of strings, of the form 'NAME=VAL', that each program
+ has available to it. Users generally place values into the
+ environment in order to provide information to various programs.
+ Typical examples are the environment variables 'HOME' and 'PATH'.
+
+Epoch
+ The date used as the "beginning of time" for timestamps. Time
+ values in most systems are represented as seconds since the epoch,
+ with library functions available for converting these values into
+ standard date and time formats.
+
+ The epoch on Unix and POSIX systems is 1970-01-01 00:00:00 UTC. See
+ also "GMT" and "UTC."
+
+Escape Sequences
+ A special sequence of characters used for describing nonprinting
+ characters, such as '\n' for newline or '\033' for the ASCII ESC
+ (Escape) character. (*Note Escape Sequences::.)
+
+Extension
+ An additional feature or change to a programming language or
+ utility not defined by that language's or utility's standard.
+ 'gawk' has (too) many extensions over POSIX 'awk'.
+
+FDL
+ See "Free Documentation License."
+
+Field
+ When 'awk' reads an input record, it splits the record into pieces
+ separated by whitespace (or by a separator regexp that you can
+ change by setting the predefined variable 'FS'). Such pieces are
+ called fields. If the pieces are of fixed length, you can use the
+ built-in variable 'FIELDWIDTHS' to describe their lengths. If you
+ wish to specify the contents of fields instead of the field
+ separator, you can use the predefined variable 'FPAT' to do so.
+ (*Note Field Separators::, *note Constant Size::, and *note
+ Splitting By Content::.)
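+
+     For example, this illustrative program splits each record on
+     colons and prints the first field:
+
+          BEGIN { FS = ":" }
+          { print $1 }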
+
+Flag
+ A variable whose truth value indicates the existence or
+ nonexistence of some condition.
+
+Floating-Point Number
+ Often referred to in mathematical terms as a "rational" or real
+ number, this is just a number that can have a fractional part. See
+ also "Double Precision" and "Single Precision."
+
+Format
+ Format strings control the appearance of output in the 'strftime()'
+ and 'sprintf()' functions, and in the 'printf' statement as well.
+ Also, data conversions from numbers to strings are controlled by
+ the format strings contained in the predefined variables 'CONVFMT'
+ and 'OFMT'. (*Note Control Letters::.)
+
+Fortran
+ Shorthand for FORmula TRANslator, one of the first programming
+ languages available for scientific calculations. It was created by
+ John Backus, and has been available since 1957. It is still in use
+ today.
+
+Free Documentation License
+ This document describes the terms under which this Info file is
+ published and may be copied. (*Note GNU Free Documentation
+ License::.)
+
+Free Software Foundation
+ A nonprofit organization dedicated to the production and
+ distribution of freely distributable software. It was founded by
+ Richard M. Stallman, the author of the original Emacs editor. GNU
+ Emacs is the most widely used version of Emacs today.
+
+FSF
+ See "Free Software Foundation."
+
+Function
+ A part of an 'awk' program that can be invoked from every point of
+ the program, to perform a task. 'awk' has several built-in
+     functions.  Users can define their own functions anywhere in the
+     program.  Functions can be recursive, i.e., they may invoke
+ themselves. *Note Functions::. In 'gawk' it is also possible to
+ have functions shared among different programs, and included where
+ required using the '@include' directive (*note Include Files::).
+ In 'gawk' the name of the function that should be invoked can be
+ generated at run time, i.e., dynamically. The 'gawk' extension API
+ provides constructor functions (*note Constructor Functions::).
+
+'gawk'
+ The GNU implementation of 'awk'.
+
+General Public License
+ This document describes the terms under which 'gawk' and its source
+ code may be distributed. (*Note Copying::.)
+
+GMT
+ "Greenwich Mean Time." This is the old term for UTC. It is the
+ time of day used internally for Unix and POSIX systems. See also
+ "Epoch" and "UTC."
+
+GNU
+ "GNU's not Unix". An on-going project of the Free Software
+ Foundation to create a complete, freely distributable,
+ POSIX-compliant computing environment.
+
+GNU/Linux
+ A variant of the GNU system using the Linux kernel, instead of the
+ Free Software Foundation's Hurd kernel. The Linux kernel is a
+ stable, efficient, full-featured clone of Unix that has been ported
+ to a variety of architectures. It is most popular on PC-class
+ systems, but runs well on a variety of other systems too. The
+ Linux kernel source code is available under the terms of the GNU
+ General Public License, which is perhaps its most important aspect.
+
+GPL
+ See "General Public License."
+
+Hexadecimal
+ Base 16 notation, where the digits are '0'-'9' and 'A'-'F', with
+ 'A' representing 10, 'B' representing 11, and so on, up to 'F' for
+ 15. Hexadecimal numbers are written in C using a leading '0x', to
+ indicate their base. Thus, '0x12' is 18 ((1 x 16) + 2). *Note
+ Nondecimal-numbers::.
+
+I/O
+ Abbreviation for "Input/Output," the act of moving data into and/or
+ out of a running program.
+
+Input Record
+ A single chunk of data that is read in by 'awk'. Usually, an 'awk'
+ input record consists of one line of text. (*Note Records::.)
+
+Integer
+ A whole number, i.e., a number that does not have a fractional
+ part.
+
+Internationalization
+ The process of writing or modifying a program so that it can use
+ multiple languages without requiring further source code changes.
+
+Interpreter
+ A program that reads human-readable source code directly, and uses
+ the instructions in it to process data and produce results. 'awk'
+ is typically (but not always) implemented as an interpreter. See
+ also "Compiler."
+
+Interval Expression
+ A component of a regular expression that lets you specify repeated
+ matches of some part of the regexp. Interval expressions were not
+ originally available in 'awk' programs.
+
+ISO
+ The International Organization for Standardization. This
+ organization produces international standards for many things,
+ including programming languages, such as C and C++. In the
+ computer arena, important standards like those for C, C++, and
+ POSIX become both American national and ISO international standards
+ simultaneously. This Info file refers to Standard C as "ISO C"
+ throughout. See the ISO website
+ (http://www.iso.org/iso/home/about.htm) for more information about
+ the name of the organization and its language-independent
+ three-letter acronym.
+
+Java
+ A modern programming language originally developed by Sun
+ Microsystems (now Oracle) supporting Object-Oriented programming.
+ Although usually implemented by compiling to the instructions for a
+ standard virtual machine (the JVM), the language can be compiled to
+ native code.
+
+Keyword
+ In the 'awk' language, a keyword is a word that has special
+ meaning. Keywords are reserved and may not be used as variable
+ names.
+
+ 'gawk''s keywords are: 'BEGIN', 'BEGINFILE', 'END', 'ENDFILE',
+     'break', 'case', 'continue', 'default', 'delete', 'do...while',
+ 'else', 'exit', 'for...in', 'for', 'function', 'func', 'if',
+ 'next', 'nextfile', 'switch', and 'while'.
+
+Korn Shell
+ The Korn Shell ('ksh') is a Unix shell which was developed by David
+ Korn at Bell Laboratories in the early 1980s. The Korn Shell is
+ backward-compatible with the Bourne shell and includes many
+ features of the C shell. See also "Bourne Shell."
+
+Lesser General Public License
+ This document describes the terms under which binary library
+ archives or shared objects, and their source code may be
+ distributed.
+
+LGPL
+ See "Lesser General Public License."
+
+Linux
+ See "GNU/Linux."
+
+Localization
+ The process of providing the data necessary for an
+ internationalized program to work in a particular language.
+
+Logical Expression
+ An expression using the operators for logic, AND, OR, and NOT,
+ written '&&', '||', and '!' in 'awk'. Often called Boolean
+ expressions, after the mathematician who pioneered this kind of
+ mathematical logic.
+
+Lvalue
+ An expression that can appear on the left side of an assignment
+ operator. In most languages, lvalues can be variables or array
+ elements. In 'awk', a field designator can also be used as an
+ lvalue.
+
+Matching
+ The act of testing a string against a regular expression. If the
+ regexp describes the contents of the string, it is said to "match"
+ it.
+
+Metacharacters
+ Characters used within a regexp that do not stand for themselves.
+ Instead, they denote regular expression operations, such as
+ repetition, grouping, or alternation.
+
+Nesting
+ Nesting is where information is organized in layers, or where
+ objects contain other similar objects. In 'gawk' the '@include'
+ directive can be nested. The "natural" nesting of arithmetic and
+ logical operations can be changed using parentheses (*note
+ Precedence::).
+
+No-op
+ An operation that does nothing.
+
+Null String
+ A string with no characters in it. It is represented explicitly in
+ 'awk' programs by placing two double quote characters next to each
+ other ('""'). It can appear in input data by having two successive
+ occurrences of the field separator appear next to each other.
+
+Number
+ A numeric-valued data object. Modern 'awk' implementations use
+ double precision floating-point to represent numbers. Ancient
+ 'awk' implementations used single precision floating-point.
+
+Octal
+ Base-eight notation, where the digits are '0'-'7'. Octal numbers
+ are written in C using a leading '0', to indicate their base.
+ Thus, '013' is 11 ((1 x 8) + 3). *Note Nondecimal-numbers::.
+
+Output Record
+ A single chunk of data that is written out by 'awk'. Usually, an
+ 'awk' output record consists of one or more lines of text. *Note
+ Records::.
+
+Pattern
+ Patterns tell 'awk' which input records are interesting to which
+ rules.
+
+ A pattern is an arbitrary conditional expression against which
+ input is tested. If the condition is satisfied, the pattern is
+ said to "match" the input record. A typical pattern might compare
+ the input record against a regular expression. (*Note Pattern
+ Overview::.)
+
+PEBKAC
+ An acronym describing what is possibly the most frequent source of
+ computer usage problems. (Problem Exists Between Keyboard And
+ Chair.)
+
+Plug-in
+     See "Extension."
+
+POSIX
+ The name for a series of standards that specify a Portable
+ Operating System interface. The "IX" denotes the Unix heritage of
+ these standards. The main standard of interest for 'awk' users is
+ 'IEEE Standard for Information Technology, Standard 1003.1-2008'.
+ The 2008 POSIX standard can be found online at
+ <http://www.opengroup.org/onlinepubs/9699919799/>.
+
+Precedence
+ The order in which operations are performed when operators are used
+ without explicit parentheses.
+
+Private
+ Variables and/or functions that are meant for use exclusively by
+ library functions and not for the main 'awk' program. Special care
+ must be taken when naming such variables and functions. (*Note
+ Library Names::.)
+
+Range (of input lines)
+ A sequence of consecutive lines from the input file(s). A pattern
+ can specify ranges of input lines for 'awk' to process or it can
+ specify single lines. (*Note Pattern Overview::.)
+
+Record
+ See "Input record" and "Output record."
+
+Recursion
+ When a function calls itself, either directly or indirectly. If
+ this is clear, stop, and proceed to the next entry. Otherwise,
+ refer to the entry for "recursion."
+
+Redirection
+ Redirection means performing input from something other than the
+ standard input stream, or performing output to something other than
+ the standard output stream.
+
+ You can redirect input to the 'getline' statement using the '<',
+ '|', and '|&' operators. You can redirect the output of the
+ 'print' and 'printf' statements to a file or a system command,
+ using the '>', '>>', '|', and '|&' operators. (*Note Getline::,
+ and *note Redirection::.)
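+
+     For example (the file and command names are only illustrative):
+
+          { print $0 > "saved.txt" }                  # output to a file
+          BEGIN { "date" | getline now; print now }   # input from a command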
+
+Reference Counts
+ An internal mechanism in 'gawk' to minimize the amount of memory
+ needed to store the value of string variables. If the value
+ assumed by a variable is used in more than one place, only one copy
+ of the value itself is kept, and the associated reference count is
+ increased when the same value is used by an additional variable,
+ and decreased when the related variable is no longer in use. When
+ the reference count goes to zero, the memory space used to store
+ the value of the variable is freed.
+
+Regexp
+ See "Regular Expression."
+
+Regular Expression
+ A regular expression ("regexp" for short) is a pattern that denotes
+ a set of strings, possibly an infinite set. For example, the
+ regular expression 'R.*xp' matches any string starting with the
+ letter 'R' and ending with the letters 'xp'. In 'awk', regular
+ expressions are used in patterns and in conditional expressions.
+ Regular expressions may contain escape sequences. (*Note
+ Regexp::.)
+
+Regular Expression Constant
+ A regular expression constant is a regular expression written
+ within slashes, such as '/foo/'. This regular expression is chosen
+ when you write the 'awk' program and cannot be changed during its
+ execution. (*Note Regexp Usage::.)
+
+Regular Expression Operators
+ See "Metacharacters."
+
+Rounding
+ Rounding the result of an arithmetic operation can be tricky. More
+ than one way of rounding exists, and in 'gawk' it is possible to
+ choose which method should be used in a program. *Note Setting the
+ rounding mode::.
+
+Rule
+ A segment of an 'awk' program that specifies how to process single
+ input records. A rule consists of a "pattern" and an "action".
+ 'awk' reads an input record; then, for each rule, if the input
+ record satisfies the rule's pattern, 'awk' executes the rule's
+ action. Otherwise, the rule does nothing for that input record.
+
+Rvalue
+ A value that can appear on the right side of an assignment
+ operator. In 'awk', essentially every expression has a value.
+ These values are rvalues.
+
+Scalar
+ A single value, be it a number or a string. Regular variables are
+ scalars; arrays and functions are not.
+
+Search Path
+ In 'gawk', a list of directories to search for 'awk' program source
+ files. In the shell, a list of directories to search for
+ executable programs.
+
+'sed'
+ See "Stream Editor."
+
+Seed
+ The initial value, or starting point, for a sequence of random
+ numbers.
+
+Shell
+ The command interpreter for Unix and POSIX-compliant systems. The
+ shell works both interactively, and as a programming language for
+ batch files, or shell scripts.
+
+Short-Circuit
+ The nature of the 'awk' logical operators '&&' and '||'. If the
+ value of the entire expression is determinable from evaluating just
+ the lefthand side of these operators, the righthand side is not
+ evaluated. (*Note Boolean Ops::.)
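+
+     For instance, in the illustrative rule below, '$3' is compared
+     only when 'NF > 2' is true:
+
+          NF > 2 && $3 == "yes" { print $1 }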
+
+Side Effect
+ A side effect occurs when an expression has an effect aside from
+ merely producing a value. Assignment expressions, increment and
+ decrement expressions, and function calls have side effects.
+ (*Note Assignment Ops::.)
+
+Single Precision
+ An internal representation of numbers that can have fractional
+ parts. Single precision numbers keep track of fewer digits than do
+ double precision numbers, but operations on them are sometimes less
+ expensive in terms of CPU time. This is the type used by some
+ ancient versions of 'awk' to store numeric values. It is the C
+ type 'float'.
+
+Space
+ The character generated by hitting the space bar on the keyboard.
+
+Special File
+ A file name interpreted internally by 'gawk', instead of being
+ handed directly to the underlying operating system--for example,
+ '/dev/stderr'. (*Note Special Files::.)
+
+Statement
+ An expression inside an 'awk' program in the action part of a
+ pattern-action rule, or inside an 'awk' function. A statement can
+ be a variable assignment, an array operation, a loop, etc.
+
+Stream Editor
+ A program that reads records from an input stream and processes
+ them one or more at a time. This is in contrast with batch
+     programs, which may expect to read their input files in their
+     entirety before starting to do anything, as well as with
+     interactive programs, which require input from the user.
+
+String
+ A datum consisting of a sequence of characters, such as 'I am a
+ string'. Constant strings are written with double quotes in the
+ 'awk' language and may contain escape sequences. (*Note Escape
+ Sequences::.)
+
+Tab
+ The character generated by hitting the 'TAB' key on the keyboard.
+ It usually expands to up to eight spaces upon output.
+
+Text Domain
+ A unique name that identifies an application. Used for grouping
+ messages that are translated at runtime into the local language.
+
+Timestamp
+ A value in the "seconds since the epoch" format used by Unix and
+ POSIX systems. Used for the 'gawk' functions 'mktime()',
+ 'strftime()', and 'systime()'. See also "Epoch," "GMT," and "UTC."
+
+Unix
+ A computer operating system originally developed in the early
+ 1970's at AT&T Bell Laboratories. It initially became popular in
+ universities around the world and later moved into commercial
+ environments as a software development system and network server
+ system. There are many commercial versions of Unix, as well as
+ several work-alike systems whose source code is freely available
+ (such as GNU/Linux, NetBSD (http://www.netbsd.org), FreeBSD
+ (http://www.freebsd.org), and OpenBSD (http://www.openbsd.org)).
+
+UTC
+     The accepted abbreviation for "Coordinated Universal Time."  This
+ is standard time in Greenwich, England, which is used as a
+ reference time for day and date calculations. See also "Epoch" and
+ "GMT."
+
+Variable
+ A name for a value. In 'awk', variables may be either scalars or
+ arrays.
+
+Whitespace
+ A sequence of space, TAB, or newline characters occurring inside an
+ input record or a string.
+
+
+File: gawk.info, Node: Copying, Next: GNU Free Documentation License, Prev: Glossary, Up: Top
+
+GNU General Public License
+**************************
+
+ Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
+
+ Everyone is permitted to copy and distribute verbatim copies of this
+ license document, but changing it is not allowed.
+
+Preamble
+========
+
+The GNU General Public License is a free, copyleft license for software
+and other kinds of works.
+
+ The licenses for most software and other practical works are designed
+to take away your freedom to share and change the works. By contrast,
+the GNU General Public License is intended to guarantee your freedom to
+share and change all versions of a program--to make sure it remains free
+software for all its users. We, the Free Software Foundation, use the
+GNU General Public License for most of our software; it applies also to
+any other work released this way by its authors. You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+them if you wish), that you receive source code or can get it if you
+want it, that you can change the software or use pieces of it in new
+free programs, and that you know you can do these things.
+
+ To protect your rights, we need to prevent others from denying you
+these rights or asking you to surrender the rights. Therefore, you have
+certain responsibilities if you distribute copies of the software, or if
+you modify it: responsibilities to respect the freedom of others.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must pass on to the recipients the same
+freedoms that you received. You must make sure that they, too, receive
+or can get the source code. And you must show them these terms so they
+know their rights.
+
+ Developers that use the GNU GPL protect your rights with two steps:
+(1) assert copyright on the software, and (2) offer you this License
+giving you legal permission to copy, distribute and/or modify it.
+
+ For the developers' and authors' protection, the GPL clearly explains
+that there is no warranty for this free software. For both users' and
+authors' sake, the GPL requires that modified versions be marked as
+changed, so that their problems will not be attributed erroneously to
+authors of previous versions.
+
+ Some devices are designed to deny users access to install or run
+modified versions of the software inside them, although the manufacturer
+can do so. This is fundamentally incompatible with the aim of
+protecting users' freedom to change the software. The systematic
+pattern of such abuse occurs in the area of products for individuals to
+use, which is precisely where it is most unacceptable. Therefore, we
+have designed this version of the GPL to prohibit the practice for those
+products. If such problems arise substantially in other domains, we
+stand ready to extend this provision to those domains in future versions
+of the GPL, as needed to protect the freedom of users.
+
+ Finally, every program is threatened constantly by software patents.
+States should not allow patents to restrict development and use of
+software on general-purpose computers, but in those that do, we wish to
+avoid the special danger that patents applied to a free program could
+make it effectively proprietary. To prevent this, the GPL assures that
+patents cannot be used to render the program non-free.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+TERMS AND CONDITIONS
+====================
+
+ 0. Definitions.
+
+ "This License" refers to version 3 of the GNU General Public
+ License.
+
+ "Copyright" also means copyright-like laws that apply to other
+ kinds of works, such as semiconductor masks.
+
+ "The Program" refers to any copyrightable work licensed under this
+ License. Each licensee is addressed as "you". "Licensees" and
+ "recipients" may be individuals or organizations.
+
+ To "modify" a work means to copy from or adapt all or part of the
+ work in a fashion requiring copyright permission, other than the
+ making of an exact copy. The resulting work is called a "modified
+ version" of the earlier work or a work "based on" the earlier work.
+
+ A "covered work" means either the unmodified Program or a work
+ based on the Program.
+
+ To "propagate" a work means to do anything with it that, without
+ permission, would make you directly or secondarily liable for
+ infringement under applicable copyright law, except executing it on
+ a computer or modifying a private copy. Propagation includes
+ copying, distribution (with or without modification), making
+ available to the public, and in some countries other activities as
+ well.
+
+ To "convey" a work means any kind of propagation that enables other
+ parties to make or receive copies. Mere interaction with a user
+ through a computer network, with no transfer of a copy, is not
+ conveying.
+
+ An interactive user interface displays "Appropriate Legal Notices"
+ to the extent that it includes a convenient and prominently visible
+ feature that (1) displays an appropriate copyright notice, and (2)
+ tells the user that there is no warranty for the work (except to
+ the extent that warranties are provided), that licensees may convey
+ the work under this License, and how to view a copy of this
+ License. If the interface presents a list of user commands or
+ options, such as a menu, a prominent item in the list meets this
+ criterion.
+
+ 1. Source Code.
+
+ The "source code" for a work means the preferred form of the work
+ for making modifications to it. "Object code" means any non-source
+ form of a work.
+
+ A "Standard Interface" means an interface that either is an
+ official standard defined by a recognized standards body, or, in
+ the case of interfaces specified for a particular programming
+ language, one that is widely used among developers working in that
+ language.
+
+ The "System Libraries" of an executable work include anything,
+ other than the work as a whole, that (a) is included in the normal
+ form of packaging a Major Component, but which is not part of that
+ Major Component, and (b) serves only to enable use of the work with
+ that Major Component, or to implement a Standard Interface for
+ which an implementation is available to the public in source code
+ form. A "Major Component", in this context, means a major
+ essential component (kernel, window system, and so on) of the
+ specific operating system (if any) on which the executable work
+ runs, or a compiler used to produce the work, or an object code
+ interpreter used to run it.
+
+ The "Corresponding Source" for a work in object code form means all
+ the source code needed to generate, install, and (for an executable
+ work) run the object code and to modify the work, including scripts
+ to control those activities. However, it does not include the
+ work's System Libraries, or general-purpose tools or generally
+ available free programs which are used unmodified in performing
+ those activities but which are not part of the work. For example,
+ Corresponding Source includes interface definition files associated
+ with source files for the work, and the source code for shared
+ libraries and dynamically linked subprograms that the work is
+ specifically designed to require, such as by intimate data
+ communication or control flow between those subprograms and other
+ parts of the work.
+
+ The Corresponding Source need not include anything that users can
+ regenerate automatically from other parts of the Corresponding
+ Source.
+
+ The Corresponding Source for a work in source code form is that
+ same work.
+
+ 2. Basic Permissions.
+
+ All rights granted under this License are granted for the term of
+ copyright on the Program, and are irrevocable provided the stated
+ conditions are met. This License explicitly affirms your unlimited
+ permission to run the unmodified Program. The output from running
+ a covered work is covered by this License only if the output, given
+ its content, constitutes a covered work. This License acknowledges
+ your rights of fair use or other equivalent, as provided by
+ copyright law.
+
+ You may make, run and propagate covered works that you do not
+ convey, without conditions so long as your license otherwise
+ remains in force. You may convey covered works to others for the
+ sole purpose of having them make modifications exclusively for you,
+ or provide you with facilities for running those works, provided
+ that you comply with the terms of this License in conveying all
+ material for which you do not control copyright. Those thus making
+ or running the covered works for you must do so exclusively on your
+ behalf, under your direction and control, on terms that prohibit
+ them from making any copies of your copyrighted material outside
+ their relationship with you.
+
+ Conveying under any other circumstances is permitted solely under
+ the conditions stated below. Sublicensing is not allowed; section
+ 10 makes it unnecessary.
+
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
+
+ No covered work shall be deemed part of an effective technological
+ measure under any applicable law fulfilling obligations under
+ article 11 of the WIPO copyright treaty adopted on 20 December
+ 1996, or similar laws prohibiting or restricting circumvention of
+ such measures.
+
+ When you convey a covered work, you waive any legal power to forbid
+ circumvention of technological measures to the extent such
+ circumvention is effected by exercising rights under this License
+ with respect to the covered work, and you disclaim any intention to
+ limit operation or modification of the work as a means of
+ enforcing, against the work's users, your or third parties' legal
+ rights to forbid circumvention of technological measures.
+
+ 4. Conveying Verbatim Copies.
+
+ You may convey verbatim copies of the Program's source code as you
+ receive it, in any medium, provided that you conspicuously and
+ appropriately publish on each copy an appropriate copyright notice;
+ keep intact all notices stating that this License and any
+ non-permissive terms added in accord with section 7 apply to the
+ code; keep intact all notices of the absence of any warranty; and
+ give all recipients a copy of this License along with the Program.
+
+ You may charge any price or no price for each copy that you convey,
+ and you may offer support or warranty protection for a fee.
+
+ 5. Conveying Modified Source Versions.
+
+ You may convey a work based on the Program, or the modifications to
+ produce it from the Program, in the form of source code under the
+ terms of section 4, provided that you also meet all of these
+ conditions:
+
+ a. The work must carry prominent notices stating that you
+ modified it, and giving a relevant date.
+
+ b. The work must carry prominent notices stating that it is
+ released under this License and any conditions added under
+ section 7. This requirement modifies the requirement in
+ section 4 to "keep intact all notices".
+
+ c. You must license the entire work, as a whole, under this
+ License to anyone who comes into possession of a copy. This
+ License will therefore apply, along with any applicable
+ section 7 additional terms, to the whole of the work, and all
+ its parts, regardless of how they are packaged. This License
+ gives no permission to license the work in any other way, but
+ it does not invalidate such permission if you have separately
+ received it.
+
+ d. If the work has interactive user interfaces, each must display
+ Appropriate Legal Notices; however, if the Program has
+ interactive interfaces that do not display Appropriate Legal
+ Notices, your work need not make them do so.
+
+ A compilation of a covered work with other separate and independent
+ works, which are not by their nature extensions of the covered
+ work, and which are not combined with it such as to form a larger
+ program, in or on a volume of a storage or distribution medium, is
+ called an "aggregate" if the compilation and its resulting
+ copyright are not used to limit the access or legal rights of the
+ compilation's users beyond what the individual works permit.
+ Inclusion of a covered work in an aggregate does not cause this
+ License to apply to the other parts of the aggregate.
+
+ 6. Conveying Non-Source Forms.
+
+ You may convey a covered work in object code form under the terms
+ of sections 4 and 5, provided that you also convey the
+ machine-readable Corresponding Source under the terms of this
+ License, in one of these ways:
+
+ a. Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by the
+ Corresponding Source fixed on a durable physical medium
+ customarily used for software interchange.
+
+ b. Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by a
+ written offer, valid for at least three years and valid for as
+ long as you offer spare parts or customer support for that
+ product model, to give anyone who possesses the object code
+ either (1) a copy of the Corresponding Source for all the
+ software in the product that is covered by this License, on a
+ durable physical medium customarily used for software
+ interchange, for a price no more than your reasonable cost of
+ physically performing this conveying of source, or (2) access
+ to copy the Corresponding Source from a network server at no
+ charge.
+
+ c. Convey individual copies of the object code with a copy of the
+ written offer to provide the Corresponding Source. This
+ alternative is allowed only occasionally and noncommercially,
+ and only if you received the object code with such an offer,
+ in accord with subsection 6b.
+
+ d. Convey the object code by offering access from a designated
+ place (gratis or for a charge), and offer equivalent access to
+ the Corresponding Source in the same way through the same
+ place at no further charge. You need not require recipients
+ to copy the Corresponding Source along with the object code.
+ If the place to copy the object code is a network server, the
+ Corresponding Source may be on a different server (operated by
+ you or a third party) that supports equivalent copying
+ facilities, provided you maintain clear directions next to the
+ object code saying where to find the Corresponding Source.
+ Regardless of what server hosts the Corresponding Source, you
+ remain obligated to ensure that it is available for as long as
+ needed to satisfy these requirements.
+
+ e. Convey the object code using peer-to-peer transmission,
+ provided you inform other peers where the object code and
+ Corresponding Source of the work are being offered to the
+ general public at no charge under subsection 6d.
+
+ A separable portion of the object code, whose source code is
+ excluded from the Corresponding Source as a System Library, need
+ not be included in conveying the object code work.
+
+ A "User Product" is either (1) a "consumer product", which means
+ any tangible personal property which is normally used for personal,
+ family, or household purposes, or (2) anything designed or sold for
+ incorporation into a dwelling. In determining whether a product is
+ a consumer product, doubtful cases shall be resolved in favor of
+ coverage. For a particular product received by a particular user,
+ "normally used" refers to a typical or common use of that class of
+ product, regardless of the status of the particular user or of the
+ way in which the particular user actually uses, or expects or is
+ expected to use, the product. A product is a consumer product
+ regardless of whether the product has substantial commercial,
+ industrial or non-consumer uses, unless such uses represent the
+ only significant mode of use of the product.
+
+ "Installation Information" for a User Product means any methods,
+ procedures, authorization keys, or other information required to
+ install and execute modified versions of a covered work in that
+ User Product from a modified version of its Corresponding Source.
+ The information must suffice to ensure that the continued
+ functioning of the modified object code is in no case prevented or
+ interfered with solely because modification has been made.
+
+ If you convey an object code work under this section in, or with,
+ or specifically for use in, a User Product, and the conveying
+ occurs as part of a transaction in which the right of possession
+ and use of the User Product is transferred to the recipient in
+ perpetuity or for a fixed term (regardless of how the transaction
+ is characterized), the Corresponding Source conveyed under this
+ section must be accompanied by the Installation Information. But
+ this requirement does not apply if neither you nor any third party
+ retains the ability to install modified object code on the User
+ Product (for example, the work has been installed in ROM).
+
+ The requirement to provide Installation Information does not
+ include a requirement to continue to provide support service,
+ warranty, or updates for a work that has been modified or installed
+ by the recipient, or for the User Product in which it has been
+ modified or installed. Access to a network may be denied when the
+ modification itself materially and adversely affects the operation
+ of the network or violates the rules and protocols for
+ communication across the network.
+
+ Corresponding Source conveyed, and Installation Information
+ provided, in accord with this section must be in a format that is
+ publicly documented (and with an implementation available to the
+ public in source code form), and must require no special password
+ or key for unpacking, reading or copying.
+
+ 7. Additional Terms.
+
+ "Additional permissions" are terms that supplement the terms of
+ this License by making exceptions from one or more of its
+ conditions. Additional permissions that are applicable to the
+ entire Program shall be treated as though they were included in
+ this License, to the extent that they are valid under applicable
+ law. If additional permissions apply only to part of the Program,
+ that part may be used separately under those permissions, but the
+ entire Program remains governed by this License without regard to
+ the additional permissions.
+
+ When you convey a copy of a covered work, you may at your option
+ remove any additional permissions from that copy, or from any part
+ of it. (Additional permissions may be written to require their own
+ removal in certain cases when you modify the work.) You may place
+ additional permissions on material, added by you to a covered work,
+ for which you have or can give appropriate copyright permission.
+
+ Notwithstanding any other provision of this License, for material
+ you add to a covered work, you may (if authorized by the copyright
+ holders of that material) supplement the terms of this License with
+ terms:
+
+ a. Disclaiming warranty or limiting liability differently from
+ the terms of sections 15 and 16 of this License; or
+
+ b. Requiring preservation of specified reasonable legal notices
+ or author attributions in that material or in the Appropriate
+ Legal Notices displayed by works containing it; or
+
+ c. Prohibiting misrepresentation of the origin of that material,
+ or requiring that modified versions of such material be marked
+ in reasonable ways as different from the original version; or
+
+ d. Limiting the use for publicity purposes of names of licensors
+ or authors of the material; or
+
+ e. Declining to grant rights under trademark law for use of some
+ trade names, trademarks, or service marks; or
+
+ f. Requiring indemnification of licensors and authors of that
+ material by anyone who conveys the material (or modified
+ versions of it) with contractual assumptions of liability to
+ the recipient, for any liability that these contractual
+ assumptions directly impose on those licensors and authors.
+
+ All other non-permissive additional terms are considered "further
+ restrictions" within the meaning of section 10. If the Program as
+ you received it, or any part of it, contains a notice stating that
+ it is governed by this License along with a term that is a further
+ restriction, you may remove that term. If a license document
+ contains a further restriction but permits relicensing or conveying
+ under this License, you may add to a covered work material governed
+ by the terms of that license document, provided that the further
+ restriction does not survive such relicensing or conveying.
+
+ If you add terms to a covered work in accord with this section, you
+ must place, in the relevant source files, a statement of the
+ additional terms that apply to those files, or a notice indicating
+ where to find the applicable terms.
+
+ Additional terms, permissive or non-permissive, may be stated in
+ the form of a separately written license, or stated as exceptions;
+ the above requirements apply either way.
+
+ 8. Termination.
+
+ You may not propagate or modify a covered work except as expressly
+ provided under this License. Any attempt otherwise to propagate or
+ modify it is void, and will automatically terminate your rights
+ under this License (including any patent licenses granted under the
+ third paragraph of section 11).
+
+ However, if you cease all violation of this License, then your
+ license from a particular copyright holder is reinstated (a)
+ provisionally, unless and until the copyright holder explicitly and
+ finally terminates your license, and (b) permanently, if the
+ copyright holder fails to notify you of the violation by some
+ reasonable means prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+ reinstated permanently if the copyright holder notifies you of the
+ violation by some reasonable means, this is the first time you have
+ received notice of violation of this License (for any work) from
+ that copyright holder, and you cure the violation prior to 30 days
+ after your receipt of the notice.
+
+ Termination of your rights under this section does not terminate
+ the licenses of parties who have received copies or rights from you
+ under this License. If your rights have been terminated and not
+ permanently reinstated, you do not qualify to receive new licenses
+ for the same material under section 10.
+
+ 9. Acceptance Not Required for Having Copies.
+
+ You are not required to accept this License in order to receive or
+ run a copy of the Program. Ancillary propagation of a covered work
+ occurring solely as a consequence of using peer-to-peer
+ transmission to receive a copy likewise does not require
+ acceptance. However, nothing other than this License grants you
+ permission to propagate or modify any covered work. These actions
+ infringe copyright if you do not accept this License. Therefore,
+ by modifying or propagating a covered work, you indicate your
+ acceptance of this License to do so.
+
+ 10. Automatic Licensing of Downstream Recipients.
+
+ Each time you convey a covered work, the recipient automatically
+ receives a license from the original licensors, to run, modify and
+ propagate that work, subject to this License. You are not
+ responsible for enforcing compliance by third parties with this
+ License.
+
+ An "entity transaction" is a transaction transferring control of an
+ organization, or substantially all assets of one, or subdividing an
+ organization, or merging organizations. If propagation of a
+ covered work results from an entity transaction, each party to that
+ transaction who receives a copy of the work also receives whatever
+ licenses to the work the party's predecessor in interest had or
+ could give under the previous paragraph, plus a right to possession
+ of the Corresponding Source of the work from the predecessor in
+ interest, if the predecessor has it or can get it with reasonable
+ efforts.
+
+ You may not impose any further restrictions on the exercise of the
+ rights granted or affirmed under this License. For example, you
+ may not impose a license fee, royalty, or other charge for exercise
+ of rights granted under this License, and you may not initiate
+ litigation (including a cross-claim or counterclaim in a lawsuit)
+ alleging that any patent claim is infringed by making, using,
+ selling, offering for sale, or importing the Program or any portion
+ of it.
+
+ 11. Patents.
+
+ A "contributor" is a copyright holder who authorizes use under this
+ License of the Program or a work on which the Program is based.
+ The work thus licensed is called the contributor's "contributor
+ version".
+
+ A contributor's "essential patent claims" are all patent claims
+ owned or controlled by the contributor, whether already acquired or
+ hereafter acquired, that would be infringed by some manner,
+ permitted by this License, of making, using, or selling its
+ contributor version, but do not include claims that would be
+ infringed only as a consequence of further modification of the
+ contributor version. For purposes of this definition, "control"
+ includes the right to grant patent sublicenses in a manner
+ consistent with the requirements of this License.
+
+ Each contributor grants you a non-exclusive, worldwide,
+ royalty-free patent license under the contributor's essential
+ patent claims, to make, use, sell, offer for sale, import and
+ otherwise run, modify and propagate the contents of its contributor
+ version.
+
+ In the following three paragraphs, a "patent license" is any
+ express agreement or commitment, however denominated, not to
+ enforce a patent (such as an express permission to practice a
+ patent or covenant not to sue for patent infringement). To "grant"
+ such a patent license to a party means to make such an agreement or
+ commitment not to enforce a patent against the party.
+
+ If you convey a covered work, knowingly relying on a patent
+ license, and the Corresponding Source of the work is not available
+ for anyone to copy, free of charge and under the terms of this
+ License, through a publicly available network server or other
+ readily accessible means, then you must either (1) cause the
+ Corresponding Source to be so available, or (2) arrange to deprive
+ yourself of the benefit of the patent license for this particular
+ work, or (3) arrange, in a manner consistent with the requirements
+ of this License, to extend the patent license to downstream
+ recipients. "Knowingly relying" means you have actual knowledge
+ that, but for the patent license, your conveying the covered work
+ in a country, or your recipient's use of the covered work in a
+ country, would infringe one or more identifiable patents in that
+ country that you have reason to believe are valid.
+
+ If, pursuant to or in connection with a single transaction or
+ arrangement, you convey, or propagate by procuring conveyance of, a
+ covered work, and grant a patent license to some of the parties
+ receiving the covered work authorizing them to use, propagate,
+ modify or convey a specific copy of the covered work, then the
+ patent license you grant is automatically extended to all
+ recipients of the covered work and works based on it.
+
+ A patent license is "discriminatory" if it does not include within
+ the scope of its coverage, prohibits the exercise of, or is
+ conditioned on the non-exercise of one or more of the rights that
+ are specifically granted under this License. You may not convey a
+ covered work if you are a party to an arrangement with a third
+ party that is in the business of distributing software, under which
+ you make payment to the third party based on the extent of your
+ activity of conveying the work, and under which the third party
+ grants, to any of the parties who would receive the covered work
+ from you, a discriminatory patent license (a) in connection with
+ copies of the covered work conveyed by you (or copies made from
+ those copies), or (b) primarily for and in connection with specific
+ products or compilations that contain the covered work, unless you
+ entered into that arrangement, or that patent license was granted,
+ prior to 28 March 2007.
+
+ Nothing in this License shall be construed as excluding or limiting
+ any implied license or other defenses to infringement that may
+ otherwise be available to you under applicable patent law.
+
+ 12. No Surrender of Others' Freedom.
+
+ If conditions are imposed on you (whether by court order, agreement
+ or otherwise) that contradict the conditions of this License, they
+ do not excuse you from the conditions of this License. If you
+ cannot convey a covered work so as to satisfy simultaneously your
+ obligations under this License and any other pertinent obligations,
+ then as a consequence you may not convey it at all. For example,
+ if you agree to terms that obligate you to collect a royalty for
+ further conveying from those to whom you convey the Program, the
+ only way you could satisfy both those terms and this License would
+ be to refrain entirely from conveying the Program.
+
+ 13. Use with the GNU Affero General Public License.
+
+ Notwithstanding any other provision of this License, you have
+ permission to link or combine any covered work with a work licensed
+ under version 3 of the GNU Affero General Public License into a
+ single combined work, and to convey the resulting work. The terms
+ of this License will continue to apply to the part which is the
+ covered work, but the special requirements of the GNU Affero
+ General Public License, section 13, concerning interaction through
+ a network will apply to the combination as such.
+
+ 14. Revised Versions of this License.
+
+ The Free Software Foundation may publish revised and/or new
+ versions of the GNU General Public License from time to time. Such
+ new versions will be similar in spirit to the present version, but
+ may differ in detail to address new problems or concerns.
+
+ Each version is given a distinguishing version number. If the
+ Program specifies that a certain numbered version of the GNU
+ General Public License "or any later version" applies to it, you
+ have the option of following the terms and conditions either of
+ that numbered version or of any later version published by the Free
+ Software Foundation. If the Program does not specify a version
+ number of the GNU General Public License, you may choose any
+ version ever published by the Free Software Foundation.
+
+ If the Program specifies that a proxy can decide which future
+ versions of the GNU General Public License can be used, that
+ proxy's public statement of acceptance of a version permanently
+ authorizes you to choose that version for the Program.
+
+ Later license versions may give you additional or different
+ permissions. However, no additional obligations are imposed on any
+ author or copyright holder as a result of your choosing to follow a
+ later version.
+
+ 15. Disclaimer of Warranty.
+
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+ APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE
+ COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED,
+ INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE
+ RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.
+ SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
+ NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+ 16. Limitation of Liability.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
+ WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES
+ AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR
+ DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
+ CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE
+ THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA
+ BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+ PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+ PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF
+ THE POSSIBILITY OF SUCH DAMAGES.
+
+ 17. Interpretation of Sections 15 and 16.
+
+ If the disclaimer of warranty and limitation of liability provided
+ above cannot be given local legal effect according to their terms,
+ reviewing courts shall apply local law that most closely
+ approximates an absolute waiver of all civil liability in
+ connection with the Program, unless a warranty or assumption of
+ liability accompanies a copy of the Program in return for a fee.
+
+END OF TERMS AND CONDITIONS
+===========================
+
+How to Apply These Terms to Your New Programs
+=============================================
+
+If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these
+terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+state the exclusion of warranty; and each file should have at least the
+"copyright" line and a pointer to where the full notice is found.
+
+ ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
+ Copyright (C) YEAR NAME OF AUTHOR
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or (at
+ your option) any later version.
+
+ This program is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+ Also add information on how to contact you by electronic and paper
+mail.
+
+ If the program does terminal interaction, make it output a short
+notice like this when it starts in an interactive mode:
+
+ PROGRAM Copyright (C) YEAR NAME OF AUTHOR
+ This program comes with ABSOLUTELY NO WARRANTY; for details type 'show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type 'show c' for details.
+
+ The hypothetical commands 'show w' and 'show c' should show the
+appropriate parts of the General Public License. Of course, your
+program's commands might be different; for a GUI interface, you would
+use an "about box".
+
+ You should also get your employer (if you work as a programmer) or
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. For more information on this, and how to apply and follow
+the GNU GPL, see <http://www.gnu.org/licenses/>.
+
+ The GNU General Public License does not permit incorporating your
+program into proprietary programs. If your program is a subroutine
+library, you may consider it more useful to permit linking proprietary
+applications with the library. If this is what you want to do, use the
+GNU Lesser General Public License instead of this License. But first,
+please read <http://www.gnu.org/philosophy/why-not-lgpl.html>.
+
+
+File: gawk.info, Node: GNU Free Documentation License, Next: Index, Prev: Copying, Up: Top
+
+GNU Free Documentation License
+******************************
+
+ Version 1.3, 3 November 2008
+
+ Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
+ <http://fsf.org/>
+
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ 0. PREAMBLE
+
+ The purpose of this License is to make a manual, textbook, or other
+ functional and useful document "free" in the sense of freedom: to
+ assure everyone the effective freedom to copy and redistribute it,
+ with or without modifying it, either commercially or
+ noncommercially. Secondarily, this License preserves for the
+ author and publisher a way to get credit for their work, while not
+ being considered responsible for modifications made by others.
+
+ This License is a kind of "copyleft", which means that derivative
+ works of the document must themselves be free in the same sense.
+ It complements the GNU General Public License, which is a copyleft
+ license designed for free software.
+
+ We have designed this License in order to use it for manuals for
+ free software, because free software needs free documentation: a
+ free program should come with manuals providing the same freedoms
+ that the software does. But this License is not limited to
+ software manuals; it can be used for any textual work, regardless
+ of subject matter or whether it is published as a printed book. We
+ recommend this License principally for works whose purpose is
+ instruction or reference.
+
+ 1. APPLICABILITY AND DEFINITIONS
+
+ This License applies to any manual or other work, in any medium,
+ that contains a notice placed by the copyright holder saying it can
+ be distributed under the terms of this License. Such a notice
+ grants a world-wide, royalty-free license, unlimited in duration,
+ to use that work under the conditions stated herein. The
+ "Document", below, refers to any such manual or work. Any member
+ of the public is a licensee, and is addressed as "you". You accept
+ the license if you copy, modify or distribute the work in a way
+ requiring permission under copyright law.
+
+ A "Modified Version" of the Document means any work containing the
+ Document or a portion of it, either copied verbatim, or with
+ modifications and/or translated into another language.
+
+ A "Secondary Section" is a named appendix or a front-matter section
+ of the Document that deals exclusively with the relationship of the
+ publishers or authors of the Document to the Document's overall
+ subject (or to related matters) and contains nothing that could
+ fall directly within that overall subject. (Thus, if the Document
+ is in part a textbook of mathematics, a Secondary Section may not
+ explain any mathematics.) The relationship could be a matter of
+ historical connection with the subject or with related matters, or
+ of legal, commercial, philosophical, ethical or political position
+ regarding them.
+
+ The "Invariant Sections" are certain Secondary Sections whose
+ titles are designated, as being those of Invariant Sections, in the
+ notice that says that the Document is released under this License.
+ If a section does not fit the above definition of Secondary then it
+ is not allowed to be designated as Invariant. The Document may
+ contain zero Invariant Sections. If the Document does not identify
+ any Invariant Sections then there are none.
+
+ The "Cover Texts" are certain short passages of text that are
+ listed, as Front-Cover Texts or Back-Cover Texts, in the notice
+ that says that the Document is released under this License. A
+ Front-Cover Text may be at most 5 words, and a Back-Cover Text may
+ be at most 25 words.
+
+ A "Transparent" copy of the Document means a machine-readable copy,
+ represented in a format whose specification is available to the
+ general public, that is suitable for revising the document
+ straightforwardly with generic text editors or (for images composed
+ of pixels) generic paint programs or (for drawings) some widely
+ available drawing editor, and that is suitable for input to text
+ formatters or for automatic translation to a variety of formats
+ suitable for input to text formatters. A copy made in an otherwise
+ Transparent file format whose markup, or absence of markup, has
+ been arranged to thwart or discourage subsequent modification by
+ readers is not Transparent. An image format is not Transparent if
+ used for any substantial amount of text. A copy that is not
+ "Transparent" is called "Opaque".
+
+ Examples of suitable formats for Transparent copies include plain
+ ASCII without markup, Texinfo input format, LaTeX input format,
+ SGML or XML using a publicly available DTD, and standard-conforming
+ simple HTML, PostScript or PDF designed for human modification.
+ Examples of transparent image formats include PNG, XCF and JPG.
+ Opaque formats include proprietary formats that can be read and
+ edited only by proprietary word processors, SGML or XML for which
+ the DTD and/or processing tools are not generally available, and
+ the machine-generated HTML, PostScript or PDF produced by some word
+ processors for output purposes only.
+
+ The "Title Page" means, for a printed book, the title page itself,
+ plus such following pages as are needed to hold, legibly, the
+ material this License requires to appear in the title page. For
+ works in formats which do not have any title page as such, "Title
+ Page" means the text near the most prominent appearance of the
+ work's title, preceding the beginning of the body of the text.
+
+ The "publisher" means any person or entity that distributes copies
+ of the Document to the public.
+
+ A section "Entitled XYZ" means a named subunit of the Document
+ whose title either is precisely XYZ or contains XYZ in parentheses
+ following text that translates XYZ in another language. (Here XYZ
+ stands for a specific section name mentioned below, such as
+ "Acknowledgements", "Dedications", "Endorsements", or "History".)
+ To "Preserve the Title" of such a section when you modify the
+ Document means that it remains a section "Entitled XYZ" according
+ to this definition.
+
+ The Document may include Warranty Disclaimers next to the notice
+ which states that this License applies to the Document. These
+ Warranty Disclaimers are considered to be included by reference in
+ this License, but only as regards disclaiming warranties: any other
+ implication that these Warranty Disclaimers may have is void and
+ has no effect on the meaning of this License.
+
+ 2. VERBATIM COPYING
+
+ You may copy and distribute the Document in any medium, either
+ commercially or noncommercially, provided that this License, the
+ copyright notices, and the license notice saying this License
+ applies to the Document are reproduced in all copies, and that you
+ add no other conditions whatsoever to those of this License. You
+ may not use technical measures to obstruct or control the reading
+ or further copying of the copies you make or distribute. However,
+ you may accept compensation in exchange for copies. If you
+ distribute a large enough number of copies you must also follow the
+ conditions in section 3.
+
+ You may also lend copies, under the same conditions stated above,
+ and you may publicly display copies.
+
+ 3. COPYING IN QUANTITY
+
+ If you publish printed copies (or copies in media that commonly
+ have printed covers) of the Document, numbering more than 100, and
+ the Document's license notice requires Cover Texts, you must
+ enclose the copies in covers that carry, clearly and legibly, all
+ these Cover Texts: Front-Cover Texts on the front cover, and
+ Back-Cover Texts on the back cover. Both covers must also clearly
+ and legibly identify you as the publisher of these copies. The
+ front cover must present the full title with all words of the title
+ equally prominent and visible. You may add other material on the
+ covers in addition. Copying with changes limited to the covers, as
+ long as they preserve the title of the Document and satisfy these
+ conditions, can be treated as verbatim copying in other respects.
+
+ If the required texts for either cover are too voluminous to fit
+ legibly, you should put the first ones listed (as many as fit
+ reasonably) on the actual cover, and continue the rest onto
+ adjacent pages.
+
+ If you publish or distribute Opaque copies of the Document
+ numbering more than 100, you must either include a machine-readable
+ Transparent copy along with each Opaque copy, or state in or with
+ each Opaque copy a computer-network location from which the general
+ network-using public has access to download using public-standard
+ network protocols a complete Transparent copy of the Document, free
+ of added material. If you use the latter option, you must take
+ reasonably prudent steps, when you begin distribution of Opaque
+ copies in quantity, to ensure that this Transparent copy will
+ remain thus accessible at the stated location until at least one
+ year after the last time you distribute an Opaque copy (directly or
+ through your agents or retailers) of that edition to the public.
+
+ It is requested, but not required, that you contact the authors of
+ the Document well before redistributing any large number of copies,
+ to give them a chance to provide you with an updated version of the
+ Document.
+
+ 4. MODIFICATIONS
+
+ You may copy and distribute a Modified Version of the Document
+ under the conditions of sections 2 and 3 above, provided that you
+ release the Modified Version under precisely this License, with the
+ Modified Version filling the role of the Document, thus licensing
+ distribution and modification of the Modified Version to whoever
+ possesses a copy of it. In addition, you must do these things in
+ the Modified Version:
+
+ A. Use in the Title Page (and on the covers, if any) a title
+ distinct from that of the Document, and from those of previous
+ versions (which should, if there were any, be listed in the
+ History section of the Document). You may use the same title
+ as a previous version if the original publisher of that
+ version gives permission.
+
+ B. List on the Title Page, as authors, one or more persons or
+ entities responsible for authorship of the modifications in
+ the Modified Version, together with at least five of the
+ principal authors of the Document (all of its principal
+ authors, if it has fewer than five), unless they release you
+ from this requirement.
+
+ C. State on the Title page the name of the publisher of the
+ Modified Version, as the publisher.
+
+ D. Preserve all the copyright notices of the Document.
+
+ E. Add an appropriate copyright notice for your modifications
+ adjacent to the other copyright notices.
+
+ F. Include, immediately after the copyright notices, a license
+ notice giving the public permission to use the Modified
+ Version under the terms of this License, in the form shown in
+ the Addendum below.
+
+ G. Preserve in that license notice the full lists of Invariant
+ Sections and required Cover Texts given in the Document's
+ license notice.
+
+ H. Include an unaltered copy of this License.
+
+ I. Preserve the section Entitled "History", Preserve its Title,
+ and add to it an item stating at least the title, year, new
+ authors, and publisher of the Modified Version as given on the
+ Title Page. If there is no section Entitled "History" in the
+ Document, create one stating the title, year, authors, and
+ publisher of the Document as given on its Title Page, then add
+ an item describing the Modified Version as stated in the
+ previous sentence.
+
+ J. Preserve the network location, if any, given in the Document
+ for public access to a Transparent copy of the Document, and
+ likewise the network locations given in the Document for
+ previous versions it was based on. These may be placed in the
+ "History" section. You may omit a network location for a work
+ that was published at least four years before the Document
+ itself, or if the original publisher of the version it refers
+ to gives permission.
+
+ K. For any section Entitled "Acknowledgements" or "Dedications",
+ Preserve the Title of the section, and preserve in the section
+ all the substance and tone of each of the contributor
+ acknowledgements and/or dedications given therein.
+
+ L. Preserve all the Invariant Sections of the Document, unaltered
+ in their text and in their titles. Section numbers or the
+ equivalent are not considered part of the section titles.
+
+ M. Delete any section Entitled "Endorsements". Such a section
+ may not be included in the Modified Version.
+
+ N. Do not retitle any existing section to be Entitled
+ "Endorsements" or to conflict in title with any Invariant
+ Section.
+
+ O. Preserve any Warranty Disclaimers.
+
+ If the Modified Version includes new front-matter sections or
+ appendices that qualify as Secondary Sections and contain no
+ material copied from the Document, you may at your option designate
+ some or all of these sections as invariant. To do this, add their
+ titles to the list of Invariant Sections in the Modified Version's
+ license notice. These titles must be distinct from any other
+ section titles.
+
+ You may add a section Entitled "Endorsements", provided it contains
+ nothing but endorsements of your Modified Version by various
+ parties--for example, statements of peer review or that the text
+ has been approved by an organization as the authoritative
+ definition of a standard.
+
+ You may add a passage of up to five words as a Front-Cover Text,
+ and a passage of up to 25 words as a Back-Cover Text, to the end of
+ the list of Cover Texts in the Modified Version. Only one passage
+ of Front-Cover Text and one of Back-Cover Text may be added by (or
+ through arrangements made by) any one entity. If the Document
+ already includes a cover text for the same cover, previously added
+ by you or by arrangement made by the same entity you are acting on
+ behalf of, you may not add another; but you may replace the old
+ one, on explicit permission from the previous publisher that added
+ the old one.
+
+ The author(s) and publisher(s) of the Document do not by this
+ License give permission to use their names for publicity for or to
+ assert or imply endorsement of any Modified Version.
+
+ 5. COMBINING DOCUMENTS
+
+ You may combine the Document with other documents released under
+ this License, under the terms defined in section 4 above for
+ modified versions, provided that you include in the combination all
+ of the Invariant Sections of all of the original documents,
+ unmodified, and list them all as Invariant Sections of your
+ combined work in its license notice, and that you preserve all
+ their Warranty Disclaimers.
+
+ The combined work need only contain one copy of this License, and
+ multiple identical Invariant Sections may be replaced with a single
+ copy. If there are multiple Invariant Sections with the same name
+ but different contents, make the title of each such section unique
+ by adding at the end of it, in parentheses, the name of the
+ original author or publisher of that section if known, or else a
+ unique number. Make the same adjustment to the section titles in
+ the list of Invariant Sections in the license notice of the
+ combined work.
+
+ In the combination, you must combine any sections Entitled
+ "History" in the various original documents, forming one section
+ Entitled "History"; likewise combine any sections Entitled
+ "Acknowledgements", and any sections Entitled "Dedications". You
+ must delete all sections Entitled "Endorsements."
+
+ 6. COLLECTIONS OF DOCUMENTS
+
+ You may make a collection consisting of the Document and other
+ documents released under this License, and replace the individual
+ copies of this License in the various documents with a single copy
+ that is included in the collection, provided that you follow the
+ rules of this License for verbatim copying of each of the documents
+ in all other respects.
+
+ You may extract a single document from such a collection, and
+ distribute it individually under this License, provided you insert
+ a copy of this License into the extracted document, and follow this
+ License in all other respects regarding verbatim copying of that
+ document.
+
+ 7. AGGREGATION WITH INDEPENDENT WORKS
+
+ A compilation of the Document or its derivatives with other
+ separate and independent documents or works, in or on a volume of a
+ storage or distribution medium, is called an "aggregate" if the
+ copyright resulting from the compilation is not used to limit the
+ legal rights of the compilation's users beyond what the individual
+ works permit. When the Document is included in an aggregate, this
+ License does not apply to the other works in the aggregate which
+ are not themselves derivative works of the Document.
+
+ If the Cover Text requirement of section 3 is applicable to these
+ copies of the Document, then if the Document is less than one half
+ of the entire aggregate, the Document's Cover Texts may be placed
+ on covers that bracket the Document within the aggregate, or the
+ electronic equivalent of covers if the Document is in electronic
+ form. Otherwise they must appear on printed covers that bracket
+ the whole aggregate.
+
+ 8. TRANSLATION
+
+ Translation is considered a kind of modification, so you may
+ distribute translations of the Document under the terms of section
+ 4. Replacing Invariant Sections with translations requires special
+ permission from their copyright holders, but you may include
+ translations of some or all Invariant Sections in addition to the
+ original versions of these Invariant Sections. You may include a
+ translation of this License, and all the license notices in the
+ Document, and any Warranty Disclaimers, provided that you also
+ include the original English version of this License and the
+ original versions of those notices and disclaimers. In case of a
+ disagreement between the translation and the original version of
+ this License or a notice or disclaimer, the original version will
+ prevail.
+
+ If a section in the Document is Entitled "Acknowledgements",
+ "Dedications", or "History", the requirement (section 4) to
+ Preserve its Title (section 1) will typically require changing the
+ actual title.
+
+ 9. TERMINATION
+
+ You may not copy, modify, sublicense, or distribute the Document
+ except as expressly provided under this License. Any attempt
+ otherwise to copy, modify, sublicense, or distribute it is void,
+ and will automatically terminate your rights under this License.
+
+ However, if you cease all violation of this License, then your
+ license from a particular copyright holder is reinstated (a)
+ provisionally, unless and until the copyright holder explicitly and
+ finally terminates your license, and (b) permanently, if the
+ copyright holder fails to notify you of the violation by some
+ reasonable means prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+ reinstated permanently if the copyright holder notifies you of the
+ violation by some reasonable means, this is the first time you have
+ received notice of violation of this License (for any work) from
+ that copyright holder, and you cure the violation prior to 30 days
+ after your receipt of the notice.
+
+ Termination of your rights under this section does not terminate
+ the licenses of parties who have received copies or rights from you
+ under this License. If your rights have been terminated and not
+ permanently reinstated, receipt of a copy of some or all of the
+ same material does not give you any rights to use it.
+
+ 10. FUTURE REVISIONS OF THIS LICENSE
+
+ The Free Software Foundation may publish new, revised versions of
+ the GNU Free Documentation License from time to time. Such new
+ versions will be similar in spirit to the present version, but may
+ differ in detail to address new problems or concerns. See
+ <http://www.gnu.org/copyleft/>.
+
+ Each version of the License is given a distinguishing version
+ number. If the Document specifies that a particular numbered
+ version of this License "or any later version" applies to it, you
+ have the option of following the terms and conditions either of
+ that specified version or of any later version that has been
+ published (not as a draft) by the Free Software Foundation. If the
+ Document does not specify a version number of this License, you may
+ choose any version ever published (not as a draft) by the Free
+ Software Foundation. If the Document specifies that a proxy can
+ decide which future versions of this License can be used, that
+ proxy's public statement of acceptance of a version permanently
+ authorizes you to choose that version for the Document.
+
+ 11. RELICENSING
+
+ "Massive Multiauthor Collaboration Site" (or "MMC Site") means any
+ World Wide Web server that publishes copyrightable works and also
+ provides prominent facilities for anybody to edit those works. A
+ public wiki that anybody can edit is an example of such a server.
+ A "Massive Multiauthor Collaboration" (or "MMC") contained in the
+ site means any set of copyrightable works thus published on the MMC
+ site.
+
+ "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
+ license published by Creative Commons Corporation, a not-for-profit
+ corporation with a principal place of business in San Francisco,
+ California, as well as future copyleft versions of that license
+ published by that same organization.
+
+ "Incorporate" means to publish or republish a Document, in whole or
+ in part, as part of another Document.
+
+ An MMC is "eligible for relicensing" if it is licensed under this
+ License, and if all works that were first published under this
+ License somewhere other than this MMC, and subsequently
+ incorporated in whole or in part into the MMC, (1) had no cover
+ texts or invariant sections, and (2) were thus incorporated prior
+ to November 1, 2008.
+
+ The operator of an MMC Site may republish an MMC contained in the
+ site under CC-BY-SA on the same site at any time before August 1,
+ 2009, provided the MMC is eligible for relicensing.
+
+ADDENDUM: How to use this License for your documents
+====================================================
+
+To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and license
+notices just after the title page:
+
+ Copyright (C) YEAR YOUR NAME.
+ Permission is granted to copy, distribute and/or modify this document
+ under the terms of the GNU Free Documentation License, Version 1.3
+ or any later version published by the Free Software Foundation;
+ with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+ Texts. A copy of the license is included in the section entitled ``GNU
+ Free Documentation License''.
+
+ If you have Invariant Sections, Front-Cover Texts and Back-Cover
+Texts, replace the "with...Texts." line with this:
+
+ with the Invariant Sections being LIST THEIR TITLES, with
+ the Front-Cover Texts being LIST, and with the Back-Cover Texts
+ being LIST.
+
+ If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+
+ If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of free
+software license, such as the GNU General Public License, to permit
+their use in free software.
+
+
+File: gawk.info, Node: Index, Prev: GNU Free Documentation License, Up: Top
+
+Index
+*****
+
+
+* Menu:
+
+* ! (exclamation point), ! operator: Boolean Ops. (line 69)
+* ! (exclamation point), ! operator <1>: Precedence. (line 51)
+* ! (exclamation point), ! operator <2>: Ranges. (line 47)
+* ! (exclamation point), ! operator <3>: Egrep Program. (line 174)
+* ! (exclamation point), != operator: Comparison Operators.
+ (line 11)
+* ! (exclamation point), != operator <1>: Precedence. (line 64)
+* ! (exclamation point), !~ operator: Regexp Usage. (line 19)
+* ! (exclamation point), !~ operator <1>: Computed Regexps. (line 6)
+* ! (exclamation point), !~ operator <2>: Case-sensitivity. (line 26)
+* ! (exclamation point), !~ operator <3>: Regexp Constants. (line 6)
+* ! (exclamation point), !~ operator <4>: Comparison Operators.
+ (line 11)
+* ! (exclamation point), !~ operator <5>: Comparison Operators.
+ (line 98)
+* ! (exclamation point), !~ operator <6>: Precedence. (line 79)
+* ! (exclamation point), !~ operator <7>: Expression Patterns.
+ (line 24)
+* " (double quote), in regexp constants: Computed Regexps. (line 30)
+* " (double quote), in shell commands: Quoting. (line 54)
+* # (number sign), #! (executable scripts): Executable Scripts.
+ (line 6)
+* # (number sign), commenting: Comments. (line 6)
+* $ (dollar sign), $ field operator: Fields. (line 19)
+* $ (dollar sign), $ field operator <1>: Precedence. (line 42)
+* $ (dollar sign), incrementing fields and arrays: Increment Ops.
+ (line 30)
+* $ (dollar sign), regexp operator: Regexp Operators. (line 35)
+* % (percent sign), % operator: Precedence. (line 54)
+* % (percent sign), %= operator: Assignment Ops. (line 129)
+* % (percent sign), %= operator <1>: Precedence. (line 94)
+* & (ampersand), && operator: Boolean Ops. (line 59)
+* & (ampersand), && operator <1>: Precedence. (line 85)
+* & (ampersand), gsub()/gensub()/sub() functions and: Gory Details.
+ (line 6)
+* ' (single quote): One-shot. (line 15)
+* ' (single quote) in gawk command lines: Long. (line 35)
+* ' (single quote), in shell commands: Quoting. (line 48)
+* ' (single quote), vs. apostrophe: Comments. (line 27)
+* ' (single quote), with double quotes: Quoting. (line 73)
+* () (parentheses), in a profile: Profiling. (line 146)
+* () (parentheses), regexp operator: Regexp Operators. (line 81)
+* * (asterisk), * operator, as multiplication operator: Precedence.
+ (line 54)
+* * (asterisk), * operator, as regexp operator: Regexp Operators.
+ (line 89)
+* * (asterisk), * operator, null strings, matching: String Functions.
+ (line 537)
+* * (asterisk), ** operator: Arithmetic Ops. (line 81)
+* * (asterisk), ** operator <1>: Precedence. (line 48)
+* * (asterisk), **= operator: Assignment Ops. (line 129)
+* * (asterisk), **= operator <1>: Precedence. (line 94)
+* * (asterisk), *= operator: Assignment Ops. (line 129)
+* * (asterisk), *= operator <1>: Precedence. (line 94)
+* + (plus sign), + operator: Precedence. (line 51)
+* + (plus sign), + operator <1>: Precedence. (line 57)
+* + (plus sign), ++ operator: Increment Ops. (line 11)
+* + (plus sign), ++ operator <1>: Increment Ops. (line 40)
+* + (plus sign), ++ operator <2>: Precedence. (line 45)
+* + (plus sign), += operator: Assignment Ops. (line 81)
+* + (plus sign), += operator <1>: Precedence. (line 94)
+* + (plus sign), regexp operator: Regexp Operators. (line 105)
+* , (comma), in range patterns: Ranges. (line 6)
+* - (hyphen), - operator: Precedence. (line 51)
+* - (hyphen), - operator <1>: Precedence. (line 57)
+* - (hyphen), -- operator: Increment Ops. (line 48)
+* - (hyphen), -- operator <1>: Precedence. (line 45)
+* - (hyphen), -= operator: Assignment Ops. (line 129)
+* - (hyphen), -= operator <1>: Precedence. (line 94)
+* - (hyphen), filenames beginning with: Options. (line 60)
+* - (hyphen), in bracket expressions: Bracket Expressions. (line 25)
+* --assign option: Options. (line 32)
+* --bignum option: Options. (line 203)
+* --characters-as-bytes option: Options. (line 69)
+* --copyright option: Options. (line 89)
+* --debug option: Options. (line 108)
+* --disable-extensions configuration option: Additional Configuration Options.
+ (line 9)
+* --disable-lint configuration option: Additional Configuration Options.
+ (line 15)
+* --disable-nls configuration option: Additional Configuration Options.
+ (line 32)
+* --dump-variables option: Options. (line 94)
+* --dump-variables option, using for library functions: Library Names.
+ (line 45)
+* --exec option: Options. (line 125)
+* --field-separator option: Options. (line 21)
+* --file option: Options. (line 25)
+* --gen-pot option: Options. (line 147)
+* --gen-pot option <1>: String Extraction. (line 6)
+* --gen-pot option <2>: String Extraction. (line 6)
+* --help option: Options. (line 154)
+* --include option: Options. (line 159)
+* --lint option: Command Line. (line 20)
+* --lint option <1>: Options. (line 184)
+* --lint-old option: Options. (line 299)
+* --load option: Options. (line 172)
+* --no-optimize option: Options. (line 285)
+* --non-decimal-data option: Options. (line 209)
+* --non-decimal-data option <1>: Nondecimal Data. (line 6)
+* --non-decimal-data option, strtonum() function and: Nondecimal Data.
+ (line 35)
+* --optimize option: Options. (line 234)
+* --posix option: Options. (line 257)
+* --posix option, --traditional option and: Options. (line 272)
+* --pretty-print option: Options. (line 223)
+* --profile option: Options. (line 245)
+* --profile option <1>: Profiling. (line 12)
+* --re-interval option: Options. (line 278)
+* --sandbox option: Options. (line 290)
+* --sandbox option, disabling system() function: I/O Functions.
+ (line 129)
+* --sandbox option, input redirection with getline: Getline. (line 19)
+* --sandbox option, output redirection with print, printf: Redirection.
+ (line 6)
+* --source option: Options. (line 117)
+* --traditional option: Options. (line 82)
+* --traditional option, --posix option and: Options. (line 272)
+* --use-lc-numeric option: Options. (line 218)
+* --version option: Options. (line 304)
+* --with-whiny-user-strftime configuration option: Additional Configuration Options.
+ (line 37)
+* -b option: Options. (line 69)
+* -c option: Options. (line 82)
+* -C option: Options. (line 89)
+* -d option: Options. (line 94)
+* -D option: Options. (line 108)
+* -e option: Options. (line 117)
+* -E option: Options. (line 125)
+* -e option <1>: Options. (line 340)
+* -f option: Long. (line 12)
+* -F option: Options. (line 21)
+* -f option <1>: Options. (line 25)
+* -F option, -Ft sets FS to TAB: Options. (line 312)
+* -F option, command-line: Command Line Field Separator.
+ (line 6)
+* -f option, multiple uses: Options. (line 317)
+* -g option: Options. (line 147)
+* -h option: Options. (line 154)
+* -i option: Options. (line 159)
+* -l option: Options. (line 172)
+* -l option <1>: Options. (line 184)
+* -L option: Options. (line 299)
+* -M option: Options. (line 203)
+* -n option: Options. (line 209)
+* -N option: Options. (line 218)
+* -o option: Options. (line 223)
+* -O option: Options. (line 234)
+* -p option: Options. (line 245)
+* -P option: Options. (line 257)
+* -r option: Options. (line 278)
+* -s option: Options. (line 285)
+* -S option: Options. (line 290)
+* -v option: Options. (line 32)
+* -V option: Options. (line 304)
+* -v option <1>: Assignment Options. (line 12)
+* -W option: Options. (line 47)
+* . (period), regexp operator: Regexp Operators. (line 44)
+* .gmo files: Explaining gettext. (line 42)
+* .gmo files, specifying directory of: Explaining gettext. (line 54)
+* .gmo files, specifying directory of <1>: Programmer i18n. (line 48)
+* .mo files, converting from .po: I18N Example. (line 66)
+* .po files: Explaining gettext. (line 37)
+* .po files <1>: Translator i18n. (line 6)
+* .po files, converting to .mo: I18N Example. (line 66)
+* .pot files: Explaining gettext. (line 31)
+* / (forward slash) to enclose regular expressions: Regexp. (line 10)
+* / (forward slash), / operator: Precedence. (line 54)
+* / (forward slash), /= operator: Assignment Ops. (line 129)
+* / (forward slash), /= operator <1>: Precedence. (line 94)
+* / (forward slash), /= operator, vs. /=.../ regexp constant: Assignment Ops.
+ (line 149)
+* / (forward slash), patterns and: Expression Patterns. (line 24)
+* /= operator vs. /=.../ regexp constant: Assignment Ops. (line 149)
+* /dev/... special files: Special FD. (line 48)
+* /dev/fd/N special files (gawk): Special FD. (line 48)
+* /inet/... special files (gawk): TCP/IP Networking. (line 6)
+* /inet4/... special files (gawk): TCP/IP Networking. (line 6)
+* /inet6/... special files (gawk): TCP/IP Networking. (line 6)
+* ; (semicolon), AWKPATH variable and: PC Using. (line 9)
+* ; (semicolon), separating statements in actions: Statements/Lines.
+ (line 90)
+* ; (semicolon), separating statements in actions <1>: Action Overview.
+ (line 19)
+* ; (semicolon), separating statements in actions <2>: Statements.
+ (line 10)
+* < (left angle bracket), < operator: Comparison Operators.
+ (line 11)
+* < (left angle bracket), < operator <1>: Precedence. (line 64)
+* < (left angle bracket), < operator (I/O): Getline/File. (line 6)
+* < (left angle bracket), <= operator: Comparison Operators.
+ (line 11)
+* < (left angle bracket), <= operator <1>: Precedence. (line 64)
+* = (equals sign), = operator: Assignment Ops. (line 6)
+* = (equals sign), == operator: Comparison Operators.
+ (line 11)
+* = (equals sign), == operator <1>: Precedence. (line 64)
+* > (right angle bracket), > operator: Comparison Operators.
+ (line 11)
+* > (right angle bracket), > operator <1>: Precedence. (line 64)
+* > (right angle bracket), > operator (I/O): Redirection. (line 22)
+* > (right angle bracket), >= operator: Comparison Operators.
+ (line 11)
+* > (right angle bracket), >= operator <1>: Precedence. (line 64)
+* > (right angle bracket), >> operator (I/O): Redirection. (line 50)
+* > (right angle bracket), >> operator (I/O) <1>: Precedence. (line 64)
+* ? (question mark), ?: operator: Precedence. (line 91)
+* ? (question mark), regexp operator: Regexp Operators. (line 111)
+* ? (question mark), regexp operator <1>: GNU Regexp Operators.
+ (line 62)
+* @-notation for indirect function calls: Indirect Calls. (line 47)
+* @include directive: Include Files. (line 8)
+* @load directive: Loading Shared Libraries.
+ (line 8)
+* [] (square brackets), regexp operator: Regexp Operators. (line 56)
+* \ (backslash): Comments. (line 50)
+* \ (backslash), as field separator: Command Line Field Separator.
+ (line 24)
+* \ (backslash), continuing lines and: Statements/Lines. (line 19)
+* \ (backslash), continuing lines and, comments and: Statements/Lines.
+ (line 75)
+* \ (backslash), continuing lines and, in csh: Statements/Lines.
+ (line 43)
+* \ (backslash), gsub()/gensub()/sub() functions and: Gory Details.
+ (line 6)
+* \ (backslash), in bracket expressions: Bracket Expressions. (line 25)
+* \ (backslash), in escape sequences: Escape Sequences. (line 6)
+* \ (backslash), in escape sequences <1>: Escape Sequences. (line 103)
+* \ (backslash), in escape sequences, POSIX and: Escape Sequences.
+ (line 108)
+* \ (backslash), in regexp constants: Computed Regexps. (line 30)
+* \ (backslash), in shell commands: Quoting. (line 48)
+* \ (backslash), regexp operator: Regexp Operators. (line 18)
+* \ (backslash), \" escape sequence: Escape Sequences. (line 85)
+* \ (backslash), \' operator (gawk): GNU Regexp Operators.
+ (line 59)
+* \ (backslash), \/ escape sequence: Escape Sequences. (line 76)
+* \ (backslash), \< operator (gawk): GNU Regexp Operators.
+ (line 33)
+* \ (backslash), \> operator (gawk): GNU Regexp Operators.
+ (line 37)
+* \ (backslash), \a escape sequence: Escape Sequences. (line 34)
+* \ (backslash), \b escape sequence: Escape Sequences. (line 38)
+* \ (backslash), \B operator (gawk): GNU Regexp Operators.
+ (line 46)
+* \ (backslash), \f escape sequence: Escape Sequences. (line 41)
+* \ (backslash), \n escape sequence: Escape Sequences. (line 44)
+* \ (backslash), \NNN escape sequence: Escape Sequences. (line 56)
+* \ (backslash), \r escape sequence: Escape Sequences. (line 47)
+* \ (backslash), \s operator (gawk): GNU Regexp Operators.
+ (line 13)
+* \ (backslash), \S operator (gawk): GNU Regexp Operators.
+ (line 17)
+* \ (backslash), \t escape sequence: Escape Sequences. (line 50)
+* \ (backslash), \v escape sequence: Escape Sequences. (line 53)
+* \ (backslash), \w operator (gawk): GNU Regexp Operators.
+ (line 22)
+* \ (backslash), \W operator (gawk): GNU Regexp Operators.
+ (line 28)
+* \ (backslash), \x escape sequence: Escape Sequences. (line 61)
+* \ (backslash), \y operator (gawk): GNU Regexp Operators.
+ (line 41)
+* \ (backslash), \` operator (gawk): GNU Regexp Operators.
+ (line 57)
+* ^ (caret), in bracket expressions: Bracket Expressions. (line 25)
+* ^ (caret), in FS: Regexp Field Splitting.
+ (line 59)
+* ^ (caret), regexp operator: Regexp Operators. (line 22)
+* ^ (caret), regexp operator <1>: GNU Regexp Operators.
+ (line 62)
+* ^ (caret), ^ operator: Precedence. (line 48)
+* ^ (caret), ^= operator: Assignment Ops. (line 129)
+* ^ (caret), ^= operator <1>: Precedence. (line 94)
+* _ (underscore), C macro: Explaining gettext. (line 71)
+* _ (underscore), in names of private variables: Library Names.
+ (line 29)
+* _ (underscore), translatable string: Programmer i18n. (line 69)
+* _gr_init() user-defined function: Group Functions. (line 83)
+* _ord_init() user-defined function: Ordinal Functions. (line 16)
+* _pw_init() user-defined function: Passwd Functions. (line 105)
+* {} (braces): Profiling. (line 142)
+* {} (braces), actions and: Action Overview. (line 19)
+* {} (braces), statements, grouping: Statements. (line 10)
+* | (vertical bar): Regexp Operators. (line 70)
+* | (vertical bar), | operator (I/O): Getline/Pipe. (line 10)
+* | (vertical bar), | operator (I/O) <1>: Redirection. (line 57)
+* | (vertical bar), | operator (I/O) <2>: Precedence. (line 64)
+* | (vertical bar), |& operator (I/O): Getline/Coprocess. (line 6)
+* | (vertical bar), |& operator (I/O) <1>: Redirection. (line 96)
+* | (vertical bar), |& operator (I/O) <2>: Precedence. (line 64)
+* | (vertical bar), |& operator (I/O) <3>: Two-way I/O. (line 27)
+* | (vertical bar), |& operator (I/O), pipes, closing: Close Files And Pipes.
+ (line 120)
+* | (vertical bar), || operator: Boolean Ops. (line 59)
+* | (vertical bar), || operator <1>: Precedence. (line 88)
+* ~ (tilde), ~ operator: Regexp Usage. (line 19)
+* ~ (tilde), ~ operator <1>: Computed Regexps. (line 6)
+* ~ (tilde), ~ operator <2>: Case-sensitivity. (line 26)
+* ~ (tilde), ~ operator <3>: Regexp Constants. (line 6)
+* ~ (tilde), ~ operator <4>: Comparison Operators.
+ (line 11)
+* ~ (tilde), ~ operator <5>: Comparison Operators.
+ (line 98)
+* ~ (tilde), ~ operator <6>: Precedence. (line 79)
+* ~ (tilde), ~ operator <7>: Expression Patterns. (line 24)
+* accessing fields: Fields. (line 6)
+* accessing global variables from extensions: Symbol Table Access.
+ (line 6)
+* account information: Passwd Functions. (line 16)
+* account information <1>: Group Functions. (line 6)
+* actions: Action Overview. (line 6)
+* actions, control statements in: Statements. (line 6)
+* actions, default: Very Simple. (line 35)
+* actions, empty: Very Simple. (line 40)
+* Ada programming language: Glossary. (line 11)
+* adding, features to gawk: Adding Code. (line 6)
+* adding, fields: Changing Fields. (line 53)
+* advanced features, fixed-width data: Constant Size. (line 6)
+* advanced features, gawk: Advanced Features. (line 6)
+* advanced features, network programming: TCP/IP Networking. (line 6)
+* advanced features, nondecimal input data: Nondecimal Data. (line 6)
+* advanced features, processes, communicating with: Two-way I/O.
+ (line 6)
+* advanced features, specifying field content: Splitting By Content.
+ (line 9)
+* Aho, Alfred: History. (line 17)
+* Aho, Alfred <1>: Contributors. (line 12)
+* alarm clock example program: Alarm Program. (line 11)
+* alarm.awk program: Alarm Program. (line 31)
+* algorithms: Basic High Level. (line 57)
+* allocating memory for extensions: Memory Allocation Functions.
+ (line 6)
+* amazing awk assembler (aaa): Glossary. (line 16)
+* amazingly workable formatter (awf): Glossary. (line 24)
+* ambiguity, syntactic: /= operator vs. /=.../ regexp constant: Assignment Ops.
+ (line 149)
+* ampersand (&), && operator: Boolean Ops. (line 59)
+* ampersand (&), && operator <1>: Precedence. (line 85)
+* ampersand (&), gsub()/gensub()/sub() functions and: Gory Details.
+ (line 6)
+* anagram.awk program: Anagram Program. (line 21)
+* anagrams, finding: Anagram Program. (line 6)
+* and: Bitwise Functions. (line 40)
+* AND bitwise operation: Bitwise Functions. (line 6)
+* and Boolean-logic operator: Boolean Ops. (line 6)
+* ANSI: Glossary. (line 34)
+* API informational variables: Extension API Informational Variables.
+ (line 6)
+* API version: Extension Versioning.
+ (line 6)
+* arbitrary precision: Arbitrary Precision Arithmetic.
+ (line 6)
+* arbitrary precision integers: Arbitrary Precision Integers.
+ (line 6)
+* archaeologists: Bugs. (line 6)
+* arctangent: Numeric Functions. (line 12)
+* ARGC/ARGV variables: Auto-set. (line 15)
+* ARGC/ARGV variables, command-line arguments: Other Arguments.
+ (line 15)
+* ARGC/ARGV variables, how to use: ARGC and ARGV. (line 6)
+* ARGC/ARGV variables, portability and: Executable Scripts. (line 59)
+* ARGIND variable: Auto-set. (line 44)
+* ARGIND variable, command-line arguments: Other Arguments. (line 15)
+* arguments, command-line: Other Arguments. (line 6)
+* arguments, command-line <1>: Auto-set. (line 15)
+* arguments, command-line <2>: ARGC and ARGV. (line 6)
+* arguments, command-line, invoking awk: Command Line. (line 6)
+* arguments, in function calls: Function Calls. (line 18)
+* arguments, processing: Getopt Function. (line 6)
+* ARGV array, indexing into: Other Arguments. (line 15)
+* arithmetic operators: Arithmetic Ops. (line 6)
+* array manipulation in extensions: Array Manipulation. (line 6)
+* array members: Reference to Elements.
+ (line 6)
+* array scanning order, controlling: Controlling Scanning.
+ (line 14)
+* array, number of elements: String Functions. (line 200)
+* arrays: Arrays. (line 6)
+* arrays of arrays: Arrays of Arrays. (line 6)
+* arrays, an example of using: Array Example. (line 6)
+* arrays, and IGNORECASE variable: Array Intro. (line 100)
+* arrays, as parameters to functions: Pass By Value/Reference.
+ (line 44)
+* arrays, associative: Array Intro. (line 48)
+* arrays, associative, library functions and: Library Names. (line 58)
+* arrays, deleting entire contents: Delete. (line 39)
+* arrays, elements that don't exist: Reference to Elements.
+ (line 23)
+* arrays, elements, assigning values: Assigning Elements. (line 6)
+* arrays, elements, deleting: Delete. (line 6)
+* arrays, elements, order of access by in operator: Scanning an Array.
+ (line 48)
+* arrays, elements, retrieving number of: String Functions. (line 42)
+* arrays, for statement and: Scanning an Array. (line 20)
+* arrays, indexing: Array Intro. (line 48)
+* arrays, merging into strings: Join Function. (line 6)
+* arrays, multidimensional: Multidimensional. (line 10)
+* arrays, multidimensional, scanning: Multiscanning. (line 11)
+* arrays, numeric subscripts: Numeric Array Subscripts.
+ (line 6)
+* arrays, referencing elements: Reference to Elements.
+ (line 6)
+* arrays, scanning: Scanning an Array. (line 6)
+* arrays, sorting: Array Sorting Functions.
+ (line 6)
+* arrays, sorting, and IGNORECASE variable: Array Sorting Functions.
+ (line 83)
+* arrays, sparse: Array Intro. (line 76)
+* arrays, subscripts, uninitialized variables as: Uninitialized Subscripts.
+ (line 6)
+* arrays, unassigned elements: Reference to Elements.
+ (line 18)
+* artificial intelligence, gawk and: Distribution contents.
+ (line 52)
+* ASCII: Ordinal Functions. (line 45)
+* ASCII <1>: Glossary. (line 196)
+* asort: String Functions. (line 42)
+* asort <1>: Array Sorting Functions.
+ (line 6)
+* asort() function (gawk), arrays, sorting: Array Sorting Functions.
+ (line 6)
+* asorti: String Functions. (line 42)
+* asorti <1>: Array Sorting Functions.
+ (line 6)
+* asorti() function (gawk), arrays, sorting: Array Sorting Functions.
+ (line 6)
+* assert() function (C library): Assert Function. (line 6)
+* assert() user-defined function: Assert Function. (line 28)
+* assertions: Assert Function. (line 6)
+* assign values to variables, in debugger: Viewing And Changing Data.
+ (line 58)
+* assignment operators: Assignment Ops. (line 6)
+* assignment operators, evaluation order: Assignment Ops. (line 110)
+* assignment operators, lvalues/rvalues: Assignment Ops. (line 31)
+* assignments as filenames: Ignoring Assigns. (line 6)
+* associative arrays: Array Intro. (line 48)
+* asterisk (*), * operator, as multiplication operator: Precedence.
+ (line 54)
+* asterisk (*), * operator, as regexp operator: Regexp Operators.
+ (line 89)
+* asterisk (*), * operator, null strings, matching: String Functions.
+ (line 537)
+* asterisk (*), ** operator: Arithmetic Ops. (line 81)
+* asterisk (*), ** operator <1>: Precedence. (line 48)
+* asterisk (*), **= operator: Assignment Ops. (line 129)
+* asterisk (*), **= operator <1>: Precedence. (line 94)
+* asterisk (*), *= operator: Assignment Ops. (line 129)
+* asterisk (*), *= operator <1>: Precedence. (line 94)
+* atan2: Numeric Functions. (line 12)
+* automatic displays, in debugger: Debugger Info. (line 24)
+* awf (amazingly workable formatter) program: Glossary. (line 24)
+* awk debugging, enabling: Options. (line 108)
+* awk language, POSIX version: Assignment Ops. (line 138)
+* awk profiling, enabling: Options. (line 245)
+* awk programs: Getting Started. (line 12)
+* awk programs <1>: Executable Scripts. (line 6)
+* awk programs <2>: Two Rules. (line 6)
+* awk programs, complex: When. (line 27)
+* awk programs, documenting: Comments. (line 6)
+* awk programs, documenting <1>: Library Names. (line 6)
+* awk programs, examples of: Sample Programs. (line 6)
+* awk programs, execution of: Next Statement. (line 16)
+* awk programs, internationalizing: I18N Functions. (line 6)
+* awk programs, internationalizing <1>: Programmer i18n. (line 6)
+* awk programs, lengthy: Long. (line 6)
+* awk programs, lengthy, assertions: Assert Function. (line 6)
+* awk programs, location of: Options. (line 25)
+* awk programs, location of <1>: Options. (line 125)
+* awk programs, location of <2>: Options. (line 159)
+* awk programs, one-line examples: Very Simple. (line 46)
+* awk programs, profiling: Profiling. (line 6)
+* awk programs, running: Running gawk. (line 6)
+* awk programs, running <1>: Long. (line 6)
+* awk programs, running, from shell scripts: One-shot. (line 22)
+* awk programs, running, without input files: Read Terminal. (line 16)
+* awk programs, shell variables in: Using Shell Variables.
+ (line 6)
+* awk, function of: Getting Started. (line 6)
+* awk, gawk and: Preface. (line 21)
+* awk, gawk and <1>: This Manual. (line 14)
+* awk, history of: History. (line 17)
+* awk, implementation issues, pipes: Redirection. (line 129)
+* awk, implementations: Other Versions. (line 6)
+* awk, implementations, limits: Getline Notes. (line 14)
+* awk, invoking: Command Line. (line 6)
+* awk, new vs. old: Names. (line 6)
+* awk, new vs. old, OFMT variable: Strings And Numbers. (line 56)
+* awk, POSIX and: Preface. (line 21)
+* awk, POSIX and, See Also POSIX awk: Preface. (line 21)
+* awk, regexp constants and: Comparison Operators.
+ (line 103)
+* awk, See Also gawk: Preface. (line 34)
+* awk, terms describing: This Manual. (line 6)
+* awk, uses for: Preface. (line 21)
+* awk, uses for <1>: Getting Started. (line 12)
+* awk, uses for <2>: When. (line 6)
+* awk, versions of: V7/SVR3.1. (line 6)
+* awk, versions of, changes between SVR3.1 and SVR4: SVR4. (line 6)
+* awk, versions of, changes between SVR4 and POSIX awk: POSIX.
+ (line 6)
+* awk, versions of, changes between V7 and SVR3.1: V7/SVR3.1. (line 6)
+* awk, versions of, See Also Brian Kernighan's awk: BTL. (line 6)
+* awk, versions of, See Also Brian Kernighan's awk <1>: Other Versions.
+ (line 13)
+* awka compiler for awk: Other Versions. (line 68)
+* AWKLIBPATH environment variable: AWKLIBPATH Variable. (line 6)
+* AWKPATH environment variable: AWKPATH Variable. (line 6)
+* AWKPATH environment variable <1>: PC Using. (line 9)
+* awkprof.out file: Profiling. (line 6)
+* awksed.awk program: Simple Sed. (line 25)
+* awkvars.out file: Options. (line 94)
+* b debugger command (alias for break): Breakpoint Control. (line 11)
+* backslash (\): Comments. (line 50)
+* backslash (\), as field separator: Command Line Field Separator.
+ (line 24)
+* backslash (\), continuing lines and: Statements/Lines. (line 19)
+* backslash (\), continuing lines and, comments and: Statements/Lines.
+ (line 75)
+* backslash (\), continuing lines and, in csh: Statements/Lines.
+ (line 43)
+* backslash (\), gsub()/gensub()/sub() functions and: Gory Details.
+ (line 6)
+* backslash (\), in bracket expressions: Bracket Expressions. (line 25)
+* backslash (\), in escape sequences: Escape Sequences. (line 6)
+* backslash (\), in escape sequences <1>: Escape Sequences. (line 103)
+* backslash (\), in escape sequences, POSIX and: Escape Sequences.
+ (line 108)
+* backslash (\), in regexp constants: Computed Regexps. (line 30)
+* backslash (\), in shell commands: Quoting. (line 48)
+* backslash (\), regexp operator: Regexp Operators. (line 18)
+* backslash (\), \" escape sequence: Escape Sequences. (line 85)
+* backslash (\), \' operator (gawk): GNU Regexp Operators.
+ (line 59)
+* backslash (\), \/ escape sequence: Escape Sequences. (line 76)
+* backslash (\), \< operator (gawk): GNU Regexp Operators.
+ (line 33)
+* backslash (\), \> operator (gawk): GNU Regexp Operators.
+ (line 37)
+* backslash (\), \a escape sequence: Escape Sequences. (line 34)
+* backslash (\), \b escape sequence: Escape Sequences. (line 38)
+* backslash (\), \B operator (gawk): GNU Regexp Operators.
+ (line 46)
+* backslash (\), \f escape sequence: Escape Sequences. (line 41)
+* backslash (\), \n escape sequence: Escape Sequences. (line 44)
+* backslash (\), \NNN escape sequence: Escape Sequences. (line 56)
+* backslash (\), \r escape sequence: Escape Sequences. (line 47)
+* backslash (\), \s operator (gawk): GNU Regexp Operators.
+ (line 13)
+* backslash (\), \S operator (gawk): GNU Regexp Operators.
+ (line 17)
+* backslash (\), \t escape sequence: Escape Sequences. (line 50)
+* backslash (\), \v escape sequence: Escape Sequences. (line 53)
+* backslash (\), \w operator (gawk): GNU Regexp Operators.
+ (line 22)
+* backslash (\), \W operator (gawk): GNU Regexp Operators.
+ (line 28)
+* backslash (\), \x escape sequence: Escape Sequences. (line 61)
+* backslash (\), \y operator (gawk): GNU Regexp Operators.
+ (line 41)
+* backslash (\), \` operator (gawk): GNU Regexp Operators.
+ (line 57)
+* backtrace debugger command: Execution Stack. (line 13)
+* Beebe, Nelson H.F.: Acknowledgments. (line 60)
+* Beebe, Nelson H.F. <1>: Other Versions. (line 82)
+* BEGIN pattern: Field Separators. (line 44)
+* BEGIN pattern <1>: BEGIN/END. (line 6)
+* BEGIN pattern <2>: Using BEGIN/END. (line 6)
+* BEGIN pattern, and profiling: Profiling. (line 62)
+* BEGIN pattern, assert() user-defined function and: Assert Function.
+ (line 83)
+* BEGIN pattern, Boolean patterns and: Expression Patterns. (line 70)
+* BEGIN pattern, exit statement and: Exit Statement. (line 12)
+* BEGIN pattern, getline and: Getline Notes. (line 19)
+* BEGIN pattern, headings, adding: Print Examples. (line 42)
+* BEGIN pattern, next/nextfile statements and: I/O And BEGIN/END.
+ (line 36)
+* BEGIN pattern, next/nextfile statements and <1>: Next Statement.
+ (line 44)
+* BEGIN pattern, OFS/ORS variables, assigning values to: Output Separators.
+ (line 20)
+* BEGIN pattern, operators and: Using BEGIN/END. (line 17)
+* BEGIN pattern, print statement and: I/O And BEGIN/END. (line 15)
+* BEGIN pattern, pwcat program: Passwd Functions. (line 143)
+* BEGIN pattern, running awk programs and: Cut Program. (line 63)
+* BEGIN pattern, TEXTDOMAIN variable and: Programmer i18n. (line 60)
+* BEGINFILE pattern: BEGINFILE/ENDFILE. (line 6)
+* BEGINFILE pattern, Boolean patterns and: Expression Patterns.
+ (line 70)
+* beginfile() user-defined function: Filetrans Function. (line 62)
+* Bentley, Jon: Glossary. (line 206)
+* Benzinger, Michael: Contributors. (line 98)
+* Berry, Karl: Acknowledgments. (line 33)
+* Berry, Karl <1>: Acknowledgments. (line 75)
+* Berry, Karl <2>: Ranges and Locales. (line 74)
+* binary input/output: User-modified. (line 15)
+* bindtextdomain: I18N Functions. (line 11)
+* bindtextdomain <1>: Programmer i18n. (line 48)
+* bindtextdomain() function (C library): Explaining gettext. (line 50)
+* bindtextdomain() function (gawk), portability and: I18N Portability.
+ (line 33)
+* BINMODE variable: User-modified. (line 15)
+* BINMODE variable <1>: PC Using. (line 16)
+* bit-manipulation functions: Bitwise Functions. (line 6)
+* bits2str() user-defined function: Bitwise Functions. (line 69)
+* bitwise AND: Bitwise Functions. (line 40)
+* bitwise complement: Bitwise Functions. (line 44)
+* bitwise OR: Bitwise Functions. (line 50)
+* bitwise XOR: Bitwise Functions. (line 57)
+* bitwise, complement: Bitwise Functions. (line 25)
+* bitwise, operations: Bitwise Functions. (line 6)
+* bitwise, shift: Bitwise Functions. (line 32)
+* body, in actions: Statements. (line 10)
+* body, in loops: While Statement. (line 14)
+* Boolean expressions: Boolean Ops. (line 6)
+* Boolean expressions, as patterns: Expression Patterns. (line 39)
+* Boolean operators, See Boolean expressions: Boolean Ops. (line 6)
+* Bourne shell, quoting rules for: Quoting. (line 18)
+* braces ({}): Profiling. (line 142)
+* braces ({}), actions and: Action Overview. (line 19)
+* braces ({}), statements, grouping: Statements. (line 10)
+* bracket expressions: Regexp Operators. (line 56)
+* bracket expressions <1>: Bracket Expressions. (line 6)
+* bracket expressions, character classes: Bracket Expressions.
+ (line 40)
+* bracket expressions, collating elements: Bracket Expressions.
+ (line 86)
+* bracket expressions, collating symbols: Bracket Expressions.
+ (line 93)
+* bracket expressions, complemented: Regexp Operators. (line 64)
+* bracket expressions, equivalence classes: Bracket Expressions.
+ (line 99)
+* bracket expressions, non-ASCII: Bracket Expressions. (line 86)
+* bracket expressions, range expressions: Bracket Expressions.
+ (line 6)
+* break debugger command: Breakpoint Control. (line 11)
+* break statement: Break Statement. (line 6)
+* breakpoint: Debugging Terms. (line 33)
+* breakpoint at location, how to delete: Breakpoint Control. (line 36)
+* breakpoint commands: Debugger Execution Control.
+ (line 10)
+* breakpoint condition: Breakpoint Control. (line 54)
+* breakpoint, delete by number: Breakpoint Control. (line 64)
+* breakpoint, how to disable or enable: Breakpoint Control. (line 69)
+* breakpoint, setting: Breakpoint Control. (line 11)
+* Brennan, Michael: Foreword3. (line 84)
+* Brennan, Michael <1>: Foreword4. (line 33)
+* Brennan, Michael <2>: Acknowledgments. (line 79)
+* Brennan, Michael <3>: Delete. (line 56)
+* Brennan, Michael <4>: Simple Sed. (line 25)
+* Brennan, Michael <5>: Other Versions. (line 6)
+* Brennan, Michael <6>: Other Versions. (line 48)
+* Brian Kernighan's awk: When. (line 21)
+* Brian Kernighan's awk <1>: Escape Sequences. (line 112)
+* Brian Kernighan's awk <2>: GNU Regexp Operators.
+ (line 85)
+* Brian Kernighan's awk <3>: Regexp Field Splitting.
+ (line 67)
+* Brian Kernighan's awk <4>: Getline/Pipe. (line 62)
+* Brian Kernighan's awk <5>: Concatenation. (line 36)
+* Brian Kernighan's awk <6>: I/O And BEGIN/END. (line 15)
+* Brian Kernighan's awk <7>: Break Statement. (line 51)
+* Brian Kernighan's awk <8>: Continue Statement. (line 44)
+* Brian Kernighan's awk <9>: Nextfile Statement. (line 47)
+* Brian Kernighan's awk <10>: Delete. (line 51)
+* Brian Kernighan's awk <11>: String Functions. (line 493)
+* Brian Kernighan's awk <12>: Gory Details. (line 19)
+* Brian Kernighan's awk <13>: I/O Functions. (line 43)
+* Brian Kernighan's awk, extensions: BTL. (line 6)
+* Brian Kernighan's awk, source code: Other Versions. (line 13)
+* Brini, Davide: Signature Program. (line 6)
+* Brink, Jeroen: DOS Quoting. (line 10)
+* Broder, Alan J.: Contributors. (line 89)
+* Brown, Martin: Contributors. (line 83)
+* BSD-based operating systems: Glossary. (line 748)
+* bt debugger command (alias for backtrace): Execution Stack. (line 13)
+* Buening, Andreas: Acknowledgments. (line 60)
+* Buening, Andreas <1>: Contributors. (line 93)
+* Buening, Andreas <2>: Maintainers. (line 14)
+* buffering, input/output: I/O Functions. (line 166)
+* buffering, input/output <1>: Two-way I/O. (line 53)
+* buffering, interactive vs. noninteractive: I/O Functions. (line 76)
+* buffers, flushing: I/O Functions. (line 32)
+* buffers, flushing <1>: I/O Functions. (line 166)
+* buffers, operators for: GNU Regexp Operators.
+ (line 51)
+* bug reports, email address, bug-gawk@gnu.org: Bug address. (line 22)
+* bug-gawk@gnu.org bug reporting address: Bug address. (line 22)
+* built-in functions: Functions. (line 6)
+* built-in functions, evaluation order: Calling Built-in. (line 30)
+* BusyBox Awk: Other Versions. (line 92)
+* c.e., See common extensions: Conventions. (line 51)
+* call by reference: Pass By Value/Reference.
+ (line 44)
+* call by value: Pass By Value/Reference.
+ (line 15)
+* call stack, display in debugger: Execution Stack. (line 13)
+* caret (^), in bracket expressions: Bracket Expressions. (line 25)
+* caret (^), regexp operator: Regexp Operators. (line 22)
+* caret (^), regexp operator <1>: GNU Regexp Operators.
+ (line 62)
+* caret (^), ^ operator: Precedence. (line 48)
+* caret (^), ^= operator: Assignment Ops. (line 129)
+* caret (^), ^= operator <1>: Precedence. (line 94)
+* case keyword: Switch Statement. (line 6)
+* case sensitivity, and regexps: User-modified. (line 76)
+* case sensitivity, and string comparisons: User-modified. (line 76)
+* case sensitivity, array indices and: Array Intro. (line 100)
+* case sensitivity, converting case: String Functions. (line 523)
+* case sensitivity, example programs: Library Functions. (line 53)
+* case sensitivity, gawk: Case-sensitivity. (line 26)
+* case sensitivity, regexps and: Case-sensitivity. (line 6)
+* CGI, awk scripts for: Options. (line 125)
+* character classes, See bracket expressions: Regexp Operators.
+ (line 56)
+* character lists in regular expression: Bracket Expressions. (line 6)
+* character lists, See bracket expressions: Regexp Operators. (line 56)
+* character sets (machine character encodings): Ordinal Functions.
+ (line 45)
+* character sets (machine character encodings) <1>: Glossary. (line 196)
+* character sets, See Also bracket expressions: Regexp Operators.
+ (line 56)
+* characters, counting: Wc Program. (line 6)
+* characters, transliterating: Translate Program. (line 6)
+* characters, values of as numbers: Ordinal Functions. (line 6)
+* Chassell, Robert J.: Acknowledgments. (line 33)
+* chdir() extension function: Extension Sample File Functions.
+ (line 12)
+* chem utility: Glossary. (line 206)
+* chr() extension function: Extension Sample Ord.
+ (line 15)
+* chr() user-defined function: Ordinal Functions. (line 16)
+* clear debugger command: Breakpoint Control. (line 36)
+* Cliff random numbers: Cliff Random Function.
+ (line 6)
+* cliff_rand() user-defined function: Cliff Random Function.
+ (line 12)
+* close: Close Files And Pipes.
+ (line 18)
+* close <1>: I/O Functions. (line 10)
+* close file or coprocess: I/O Functions. (line 10)
+* close() function, portability: Close Files And Pipes.
+ (line 81)
+* close() function, return value: Close Files And Pipes.
+ (line 132)
+* close() function, two-way pipes and: Two-way I/O. (line 60)
+* Close, Diane: Manual History. (line 34)
+* Close, Diane <1>: Contributors. (line 21)
+* Collado, Manuel: Acknowledgments. (line 60)
+* collating elements: Bracket Expressions. (line 86)
+* collating symbols: Bracket Expressions. (line 93)
+* Colombo, Antonio: Acknowledgments. (line 60)
+* Colombo, Antonio <1>: Contributors. (line 141)
+* columns, aligning: Print Examples. (line 69)
+* columns, cutting: Cut Program. (line 6)
+* comma (,), in range patterns: Ranges. (line 6)
+* command completion, in debugger: Readline Support. (line 6)
+* command line, arguments: Other Arguments. (line 6)
+* command line, arguments <1>: Auto-set. (line 15)
+* command line, arguments <2>: ARGC and ARGV. (line 6)
+* command line, directories on: Command-line directories.
+ (line 6)
+* command line, formats: Running gawk. (line 12)
+* command line, FS on, setting: Command Line Field Separator.
+ (line 6)
+* command line, invoking awk from: Command Line. (line 6)
+* command line, option -f: Long. (line 12)
+* command line, options: Options. (line 6)
+* command line, options, end of: Options. (line 55)
+* command line, variables, assigning on: Assignment Options. (line 6)
+* command-line options, processing: Getopt Function. (line 6)
+* command-line options, string extraction: String Extraction. (line 6)
+* commands debugger command: Debugger Execution Control.
+ (line 10)
+* commands to execute at breakpoint: Debugger Execution Control.
+ (line 10)
+* commenting: Comments. (line 6)
+* commenting, backslash continuation and: Statements/Lines. (line 75)
+* common extensions, ** operator: Arithmetic Ops. (line 30)
+* common extensions, **= operator: Assignment Ops. (line 138)
+* common extensions, /dev/stderr special file: Special FD. (line 48)
+* common extensions, /dev/stdin special file: Special FD. (line 48)
+* common extensions, /dev/stdout special file: Special FD. (line 48)
+* common extensions, BINMODE variable: PC Using. (line 16)
+* common extensions, delete to delete entire arrays: Delete. (line 39)
+* common extensions, func keyword: Definition Syntax. (line 99)
+* common extensions, length() applied to an array: String Functions.
+ (line 200)
+* common extensions, RS as a regexp: gawk split records. (line 6)
+* common extensions, single character fields: Single Character Fields.
+ (line 6)
+* common extensions, \x escape sequence: Escape Sequences. (line 61)
+* comp.lang.awk newsgroup: Usenet. (line 11)
+* comparison expressions: Typing and Comparison.
+ (line 9)
+* comparison expressions, as patterns: Expression Patterns. (line 14)
+* comparison expressions, string vs. regexp: Comparison Operators.
+ (line 79)
+* compatibility mode (gawk), extensions: POSIX/GNU. (line 6)
+* compatibility mode (gawk), file names: Special Caveats. (line 9)
+* compatibility mode (gawk), hexadecimal numbers: Nondecimal-numbers.
+ (line 59)
+* compatibility mode (gawk), octal numbers: Nondecimal-numbers.
+ (line 59)
+* compatibility mode (gawk), specifying: Options. (line 82)
+* compiled programs: Basic High Level. (line 13)
+* compiled programs <1>: Glossary. (line 218)
+* compiling gawk for Cygwin: Cygwin. (line 6)
+* compiling gawk for MS-Windows: PC Compiling. (line 11)
+* compiling gawk for VMS: VMS Compilation. (line 6)
+* compl: Bitwise Functions. (line 44)
+* complement, bitwise: Bitwise Functions. (line 25)
+* compound statements, control statements and: Statements. (line 10)
+* concatenating: Concatenation. (line 9)
+* condition debugger command: Breakpoint Control. (line 54)
+* conditional expressions: Conditional Exp. (line 6)
+* configuration option, --disable-extensions: Additional Configuration Options.
+ (line 9)
+* configuration option, --disable-lint: Additional Configuration Options.
+ (line 15)
+* configuration option, --disable-nls: Additional Configuration Options.
+ (line 32)
+* configuration option, --with-whiny-user-strftime: Additional Configuration Options.
+ (line 37)
+* configuration options, gawk: Additional Configuration Options.
+ (line 6)
+* constant regexps: Regexp Usage. (line 57)
+* constants, nondecimal: Nondecimal Data. (line 6)
+* constants, numeric: Scalar Constants. (line 6)
+* constants, types of: Constants. (line 6)
+* continue program, in debugger: Debugger Execution Control.
+ (line 33)
+* continue statement: Continue Statement. (line 6)
+* control statements: Statements. (line 6)
+* controlling array scanning order: Controlling Scanning.
+ (line 14)
+* convert string to lower case: String Functions. (line 524)
+* convert string to number: String Functions. (line 391)
+* convert string to upper case: String Functions. (line 530)
+* converting integer array subscripts: Numeric Array Subscripts.
+ (line 31)
+* converting, dates to timestamps: Time Functions. (line 76)
+* converting, numbers to strings: Strings And Numbers. (line 6)
+* converting, numbers to strings <1>: Bitwise Functions. (line 108)
+* converting, strings to numbers: Strings And Numbers. (line 6)
+* converting, strings to numbers <1>: Bitwise Functions. (line 108)
+* CONVFMT variable: Strings And Numbers. (line 29)
+* CONVFMT variable <1>: User-modified. (line 30)
+* CONVFMT variable, and array subscripts: Numeric Array Subscripts.
+ (line 6)
+* cookie: Glossary. (line 257)
+* coprocesses: Redirection. (line 96)
+* coprocesses <1>: Two-way I/O. (line 27)
+* coprocesses, closing: Close Files And Pipes.
+ (line 6)
+* coprocesses, getline from: Getline/Coprocess. (line 6)
+* cos: Numeric Functions. (line 16)
+* cosine: Numeric Functions. (line 16)
+* counting: Wc Program. (line 6)
+* csh utility: Statements/Lines. (line 43)
+* csh utility, POSIXLY_CORRECT environment variable: Options. (line 358)
+* csh utility, |& operator, comparison with: Two-way I/O. (line 27)
+* ctime() user-defined function: Function Example. (line 74)
+* currency symbols, localization: Explaining gettext. (line 104)
+* current system time: Time Functions. (line 66)
+* custom.h file: Configuration Philosophy.
+ (line 30)
+* customized input parser: Input Parsers. (line 6)
+* customized output wrapper: Output Wrappers. (line 6)
+* customized two-way processor: Two-way processors. (line 6)
+* cut utility: Cut Program. (line 6)
+* cut utility <1>: Cut Program. (line 6)
+* cut.awk program: Cut Program. (line 45)
+* d debugger command (alias for delete): Breakpoint Control. (line 64)
+* d.c., See dark corner: Conventions. (line 42)
+* dark corner: Conventions. (line 42)
+* dark corner <1>: Glossary. (line 268)
+* dark corner, "0" is actually true: Truth Values. (line 24)
+* dark corner, /= operator vs. /=.../ regexp constant: Assignment Ops.
+ (line 149)
+* dark corner, array subscripts: Uninitialized Subscripts.
+ (line 43)
+* dark corner, break statement: Break Statement. (line 51)
+* dark corner, close() function: Close Files And Pipes.
+ (line 132)
+* dark corner, command-line arguments: Assignment Options. (line 43)
+* dark corner, continue statement: Continue Statement. (line 44)
+* dark corner, CONVFMT variable: Strings And Numbers. (line 39)
+* dark corner, escape sequences: Other Arguments. (line 38)
+* dark corner, escape sequences, for metacharacters: Escape Sequences.
+ (line 144)
+* dark corner, exit statement: Exit Statement. (line 30)
+* dark corner, field separators: Full Line Fields. (line 22)
+* dark corner, FILENAME variable: Getline Notes. (line 19)
+* dark corner, FILENAME variable <1>: Auto-set. (line 108)
+* dark corner, FNR/NR variables: Auto-set. (line 357)
+* dark corner, format-control characters: Control Letters. (line 18)
+* dark corner, format-control characters <1>: Control Letters.
+ (line 93)
+* dark corner, FS as null string: Single Character Fields.
+ (line 20)
+* dark corner, input files: awk split records. (line 110)
+* dark corner, invoking awk: Command Line. (line 16)
+* dark corner, length() function: String Functions. (line 186)
+* dark corner, locale's decimal point character: Locale influences conversions.
+ (line 17)
+* dark corner, multiline records: Multiple Line. (line 35)
+* dark corner, NF variable, decrementing: Changing Fields. (line 107)
+* dark corner, OFMT variable: OFMT. (line 27)
+* dark corner, regexp as second argument to index(): String Functions.
+ (line 164)
+* dark corner, regexp constants: Using Constant Regexps.
+ (line 6)
+* dark corner, regexp constants, /= operator and: Assignment Ops.
+ (line 149)
+* dark corner, regexp constants, as arguments to user-defined functions: Using Constant Regexps.
+ (line 43)
+* dark corner, split() function: String Functions. (line 361)
+* dark corner, strings, storing: gawk split records. (line 82)
+* dark corner, value of ARGV[0]: Auto-set. (line 39)
+* dark corner, ^, in FS: Regexp Field Splitting.
+ (line 59)
+* data, fixed-width: Constant Size. (line 6)
+* data-driven languages: Basic High Level. (line 74)
+* database, group, reading: Group Functions. (line 6)
+* database, users, reading: Passwd Functions. (line 6)
+* date utility, GNU: Time Functions. (line 17)
+* date utility, POSIX: Time Functions. (line 253)
+* dates, converting to timestamps: Time Functions. (line 76)
+* dates, information related to, localization: Explaining gettext.
+ (line 112)
+* Davies, Stephen: Acknowledgments. (line 60)
+* Davies, Stephen <1>: Contributors. (line 75)
+* Day, Robert P.J.: Acknowledgments. (line 79)
+* dcgettext: I18N Functions. (line 21)
+* dcgettext <1>: Programmer i18n. (line 20)
+* dcgettext() function (gawk), portability and: I18N Portability.
+ (line 33)
+* dcngettext: I18N Functions. (line 27)
+* dcngettext <1>: Programmer i18n. (line 37)
+* dcngettext() function (gawk), portability and: I18N Portability.
+ (line 33)
+* deadlocks: Two-way I/O. (line 53)
+* debugger commands, b (break): Breakpoint Control. (line 11)
+* debugger commands, backtrace: Execution Stack. (line 13)
+* debugger commands, break: Breakpoint Control. (line 11)
+* debugger commands, bt (backtrace): Execution Stack. (line 13)
+* debugger commands, c (continue): Debugger Execution Control.
+ (line 33)
+* debugger commands, clear: Breakpoint Control. (line 36)
+* debugger commands, commands: Debugger Execution Control.
+ (line 10)
+* debugger commands, condition: Breakpoint Control. (line 54)
+* debugger commands, continue: Debugger Execution Control.
+ (line 33)
+* debugger commands, d (delete): Breakpoint Control. (line 64)
+* debugger commands, delete: Breakpoint Control. (line 64)
+* debugger commands, disable: Breakpoint Control. (line 69)
+* debugger commands, display: Viewing And Changing Data.
+ (line 8)
+* debugger commands, down: Execution Stack. (line 23)
+* debugger commands, dump: Miscellaneous Debugger Commands.
+ (line 9)
+* debugger commands, e (enable): Breakpoint Control. (line 73)
+* debugger commands, enable: Breakpoint Control. (line 73)
+* debugger commands, end: Debugger Execution Control.
+ (line 10)
+* debugger commands, eval: Viewing And Changing Data.
+ (line 23)
+* debugger commands, f (frame): Execution Stack. (line 27)
+* debugger commands, finish: Debugger Execution Control.
+ (line 39)
+* debugger commands, frame: Execution Stack. (line 27)
+* debugger commands, h (help): Miscellaneous Debugger Commands.
+ (line 69)
+* debugger commands, help: Miscellaneous Debugger Commands.
+ (line 69)
+* debugger commands, i (info): Debugger Info. (line 13)
+* debugger commands, ignore: Breakpoint Control. (line 87)
+* debugger commands, info: Debugger Info. (line 13)
+* debugger commands, l (list): Miscellaneous Debugger Commands.
+ (line 75)
+* debugger commands, list: Miscellaneous Debugger Commands.
+ (line 75)
+* debugger commands, n (next): Debugger Execution Control.
+ (line 43)
+* debugger commands, next: Debugger Execution Control.
+ (line 43)
+* debugger commands, nexti: Debugger Execution Control.
+ (line 49)
+* debugger commands, ni (nexti): Debugger Execution Control.
+ (line 49)
+* debugger commands, o (option): Debugger Info. (line 57)
+* debugger commands, option: Debugger Info. (line 57)
+* debugger commands, p (print): Viewing And Changing Data.
+ (line 35)
+* debugger commands, print: Viewing And Changing Data.
+ (line 35)
+* debugger commands, printf: Viewing And Changing Data.
+ (line 53)
+* debugger commands, q (quit): Miscellaneous Debugger Commands.
+ (line 102)
+* debugger commands, quit: Miscellaneous Debugger Commands.
+ (line 102)
+* debugger commands, r (run): Debugger Execution Control.
+ (line 62)
+* debugger commands, return: Debugger Execution Control.
+ (line 54)
+* debugger commands, run: Debugger Execution Control.
+ (line 62)
+* debugger commands, s (step): Debugger Execution Control.
+ (line 68)
+* debugger commands, set: Viewing And Changing Data.
+ (line 58)
+* debugger commands, si (stepi): Debugger Execution Control.
+ (line 75)
+* debugger commands, silent: Debugger Execution Control.
+ (line 10)
+* debugger commands, step: Debugger Execution Control.
+ (line 68)
+* debugger commands, stepi: Debugger Execution Control.
+ (line 75)
+* debugger commands, t (tbreak): Breakpoint Control. (line 90)
+* debugger commands, tbreak: Breakpoint Control. (line 90)
+* debugger commands, trace: Miscellaneous Debugger Commands.
+ (line 110)
+* debugger commands, u (until): Debugger Execution Control.
+ (line 82)
+* debugger commands, undisplay: Viewing And Changing Data.
+ (line 79)
+* debugger commands, until: Debugger Execution Control.
+ (line 82)
+* debugger commands, unwatch: Viewing And Changing Data.
+ (line 83)
+* debugger commands, up: Execution Stack. (line 36)
+* debugger commands, w (watch): Viewing And Changing Data.
+ (line 66)
+* debugger commands, watch: Viewing And Changing Data.
+ (line 66)
+* debugger commands, where (backtrace): Execution Stack. (line 13)
+* debugger default list amount: Debugger Info. (line 69)
+* debugger history file: Debugger Info. (line 81)
+* debugger history size: Debugger Info. (line 65)
+* debugger options: Debugger Info. (line 57)
+* debugger prompt: Debugger Info. (line 78)
+* debugger, how to start: Debugger Invocation. (line 6)
+* debugger, read commands from a file: Debugger Info. (line 97)
+* debugging awk programs: Debugger. (line 6)
+* debugging gawk, bug reports: Bugs. (line 9)
+* decimal point character, locale specific: Options. (line 269)
+* decrement operators: Increment Ops. (line 35)
+* default keyword: Switch Statement. (line 6)
+* Deifik, Scott: Acknowledgments. (line 60)
+* Deifik, Scott <1>: Contributors. (line 54)
+* Deifik, Scott <2>: Maintainers. (line 14)
+* delete ARRAY: Delete. (line 39)
+* delete breakpoint at location: Breakpoint Control. (line 36)
+* delete breakpoint by number: Breakpoint Control. (line 64)
+* delete debugger command: Breakpoint Control. (line 64)
+* delete statement: Delete. (line 6)
+* delete watchpoint: Viewing And Changing Data.
+ (line 83)
+* deleting elements in arrays: Delete. (line 6)
+* deleting entire arrays: Delete. (line 39)
+* Demaille, Akim: Acknowledgments. (line 60)
+* describe call stack frame, in debugger: Debugger Info. (line 27)
+* differences between gawk and awk: String Functions. (line 200)
+* differences in awk and gawk, ARGC/ARGV variables: ARGC and ARGV.
+ (line 89)
+* differences in awk and gawk, ARGIND variable: Auto-set. (line 44)
+* differences in awk and gawk, array elements, deleting: Delete.
+ (line 39)
+* differences in awk and gawk, AWKLIBPATH environment variable: AWKLIBPATH Variable.
+ (line 6)
+* differences in awk and gawk, AWKPATH environment variable: AWKPATH Variable.
+ (line 6)
+* differences in awk and gawk, BEGIN/END patterns: I/O And BEGIN/END.
+ (line 15)
+* differences in awk and gawk, BEGINFILE/ENDFILE patterns: BEGINFILE/ENDFILE.
+ (line 6)
+* differences in awk and gawk, BINMODE variable: User-modified.
+ (line 15)
+* differences in awk and gawk, BINMODE variable <1>: PC Using.
+ (line 16)
+* differences in awk and gawk, close() function: Close Files And Pipes.
+ (line 81)
+* differences in awk and gawk, close() function <1>: Close Files And Pipes.
+ (line 132)
+* differences in awk and gawk, command-line directories: Command-line directories.
+ (line 6)
+* differences in awk and gawk, ERRNO variable: Auto-set. (line 87)
+* differences in awk and gawk, error messages: Special FD. (line 19)
+* differences in awk and gawk, FIELDWIDTHS variable: User-modified.
+ (line 37)
+* differences in awk and gawk, FPAT variable: User-modified. (line 43)
+* differences in awk and gawk, FUNCTAB variable: Auto-set. (line 134)
+* differences in awk and gawk, function arguments (gawk): Calling Built-in.
+ (line 16)
+* differences in awk and gawk, getline command: Getline. (line 19)
+* differences in awk and gawk, IGNORECASE variable: User-modified.
+ (line 76)
+* differences in awk and gawk, implementation limitations: Getline Notes.
+ (line 14)
+* differences in awk and gawk, implementation limitations <1>: Redirection.
+ (line 129)
+* differences in awk and gawk, indirect function calls: Indirect Calls.
+ (line 6)
+* differences in awk and gawk, input/output operators: Getline/Coprocess.
+ (line 6)
+* differences in awk and gawk, input/output operators <1>: Redirection.
+ (line 96)
+* differences in awk and gawk, line continuations: Conditional Exp.
+ (line 34)
+* differences in awk and gawk, LINT variable: User-modified. (line 87)
+* differences in awk and gawk, match() function: String Functions.
+ (line 262)
+* differences in awk and gawk, print/printf statements: Format Modifiers.
+ (line 13)
+* differences in awk and gawk, PROCINFO array: Auto-set. (line 148)
+* differences in awk and gawk, read timeouts: Read Timeout. (line 6)
+* differences in awk and gawk, record separators: awk split records.
+ (line 124)
+* differences in awk and gawk, regexp constants: Using Constant Regexps.
+ (line 43)
+* differences in awk and gawk, regular expressions: Case-sensitivity.
+ (line 26)
+* differences in awk and gawk, retrying input: Retrying Input.
+ (line 6)
+* differences in awk and gawk, RS/RT variables: gawk split records.
+ (line 58)
+* differences in awk and gawk, RT variable: Auto-set. (line 295)
+* differences in awk and gawk, single-character fields: Single Character Fields.
+ (line 6)
+* differences in awk and gawk, split() function: String Functions.
+ (line 348)
+* differences in awk and gawk, strings: Scalar Constants. (line 20)
+* differences in awk and gawk, strings, storing: gawk split records.
+ (line 76)
+* differences in awk and gawk, SYMTAB variable: Auto-set. (line 299)
+* differences in awk and gawk, TEXTDOMAIN variable: User-modified.
+ (line 152)
+* differences in awk and gawk, trunc-mod operation: Arithmetic Ops.
+ (line 66)
+* directories, command-line: Command-line directories.
+ (line 6)
+* directories, searching: Programs Exercises. (line 70)
+* directories, searching for loadable extensions: AWKLIBPATH Variable.
+ (line 6)
+* directories, searching for source files: AWKPATH Variable. (line 6)
+* disable breakpoint: Breakpoint Control. (line 69)
+* disable debugger command: Breakpoint Control. (line 69)
+* display debugger command: Viewing And Changing Data.
+ (line 8)
+* display debugger options: Debugger Info. (line 57)
+* division: Arithmetic Ops. (line 44)
+* do-while statement: Do Statement. (line 6)
+* do-while statement, use of regexps in: Regexp Usage. (line 19)
+* documentation, of awk programs: Library Names. (line 6)
+* documentation, online: Manual History. (line 11)
+* documents, searching: Dupword Program. (line 6)
+* dollar sign ($), $ field operator: Fields. (line 19)
+* dollar sign ($), $ field operator <1>: Precedence. (line 42)
+* dollar sign ($), incrementing fields and arrays: Increment Ops.
+ (line 30)
+* dollar sign ($), regexp operator: Regexp Operators. (line 35)
+* double quote ("), in regexp constants: Computed Regexps. (line 30)
+* double quote ("), in shell commands: Quoting. (line 54)
+* down debugger command: Execution Stack. (line 23)
+* Drepper, Ulrich: Acknowledgments. (line 52)
+* Duman, Patrice: Acknowledgments. (line 75)
+* dump all variables of a program: Options. (line 94)
+* dump debugger command: Miscellaneous Debugger Commands.
+ (line 9)
+* dupword.awk program: Dupword Program. (line 31)
+* dynamic profiling: Profiling. (line 177)
+* dynamically loaded extensions: Dynamic Extensions. (line 6)
+* e debugger command (alias for enable): Breakpoint Control. (line 73)
+* EBCDIC: Ordinal Functions. (line 45)
+* effective group ID of gawk user: Auto-set. (line 153)
+* effective user ID of gawk user: Auto-set. (line 161)
+* egrep utility: Bracket Expressions. (line 34)
+* egrep utility <1>: Egrep Program. (line 6)
+* egrep.awk program: Egrep Program. (line 53)
+* elements in arrays, assigning values: Assigning Elements. (line 6)
+* elements in arrays, deleting: Delete. (line 6)
+* elements in arrays, order of access by in operator: Scanning an Array.
+ (line 48)
+* elements in arrays, scanning: Scanning an Array. (line 6)
+* elements of arrays: Reference to Elements.
+ (line 6)
+* email address for bug reports, bug-gawk@gnu.org: Bug address.
+ (line 22)
+* empty array elements: Reference to Elements.
+ (line 18)
+* empty pattern: Empty. (line 6)
+* empty strings: awk split records. (line 114)
+* empty strings, See null strings: Regexp Field Splitting.
+ (line 43)
+* EMRED: TCP/IP Networking. (line 6)
+* enable breakpoint: Breakpoint Control. (line 73)
+* enable debugger command: Breakpoint Control. (line 73)
+* end debugger command: Debugger Execution Control.
+ (line 10)
+* END pattern: BEGIN/END. (line 6)
+* END pattern <1>: Using BEGIN/END. (line 6)
+* END pattern, and profiling: Profiling. (line 62)
+* END pattern, assert() user-defined function and: Assert Function.
+ (line 75)
+* END pattern, Boolean patterns and: Expression Patterns. (line 70)
+* END pattern, exit statement and: Exit Statement. (line 12)
+* END pattern, next/nextfile statements and: I/O And BEGIN/END.
+ (line 36)
+* END pattern, next/nextfile statements and <1>: Next Statement.
+ (line 44)
+* END pattern, operators and: Using BEGIN/END. (line 17)
+* END pattern, print statement and: I/O And BEGIN/END. (line 15)
+* ENDFILE pattern: BEGINFILE/ENDFILE. (line 6)
+* ENDFILE pattern, Boolean patterns and: Expression Patterns. (line 70)
+* endfile() user-defined function: Filetrans Function. (line 62)
+* endgrent() function (C library): Group Functions. (line 213)
+* endgrent() user-defined function: Group Functions. (line 216)
+* endpwent() function (C library): Passwd Functions. (line 208)
+* endpwent() user-defined function: Passwd Functions. (line 211)
+* English, Steve: Advanced Features. (line 6)
+* ENVIRON array: Auto-set. (line 59)
+* environment variables used by gawk: Environment Variables.
+ (line 6)
+* environment variables, in ENVIRON array: Auto-set. (line 59)
+* epoch, definition of: Glossary. (line 312)
+* equals sign (=), = operator: Assignment Ops. (line 6)
+* equals sign (=), == operator: Comparison Operators.
+ (line 11)
+* equals sign (=), == operator <1>: Precedence. (line 64)
+* EREs (Extended Regular Expressions): Bracket Expressions. (line 34)
+* ERRNO variable: Auto-set. (line 87)
+* ERRNO variable <1>: TCP/IP Networking. (line 54)
+* ERRNO variable, with BEGINFILE pattern: BEGINFILE/ENDFILE. (line 26)
+* ERRNO variable, with close() function: Close Files And Pipes.
+ (line 140)
+* ERRNO variable, with getline command: Getline. (line 19)
+* error handling: Special FD. (line 19)
+* error handling, ERRNO variable and: Auto-set. (line 87)
+* error output: Special FD. (line 6)
+* escape processing, gsub()/gensub()/sub() functions: Gory Details.
+ (line 6)
+* escape sequences, in strings: Escape Sequences. (line 6)
+* eval debugger command: Viewing And Changing Data.
+ (line 23)
+* evaluate expressions, in debugger: Viewing And Changing Data.
+ (line 23)
+* evaluation order: Increment Ops. (line 60)
+* evaluation order, concatenation: Concatenation. (line 41)
+* evaluation order, functions: Calling Built-in. (line 30)
+* examining fields: Fields. (line 6)
+* exclamation point (!), ! operator: Boolean Ops. (line 69)
+* exclamation point (!), ! operator <1>: Precedence. (line 51)
+* exclamation point (!), ! operator <2>: Egrep Program. (line 174)
+* exclamation point (!), != operator: Comparison Operators.
+ (line 11)
+* exclamation point (!), != operator <1>: Precedence. (line 64)
+* exclamation point (!), !~ operator: Regexp Usage. (line 19)
+* exclamation point (!), !~ operator <1>: Computed Regexps. (line 6)
+* exclamation point (!), !~ operator <2>: Case-sensitivity. (line 26)
+* exclamation point (!), !~ operator <3>: Regexp Constants. (line 6)
+* exclamation point (!), !~ operator <4>: Comparison Operators.
+ (line 11)
+* exclamation point (!), !~ operator <5>: Comparison Operators.
+ (line 98)
+* exclamation point (!), !~ operator <6>: Precedence. (line 79)
+* exclamation point (!), !~ operator <7>: Expression Patterns.
+ (line 24)
+* exit debugger command: Miscellaneous Debugger Commands.
+ (line 66)
+* exit statement: Exit Statement. (line 6)
+* exit status, of gawk: Exit Status. (line 6)
+* exit status, of VMS: VMS Running. (line 28)
+* exit the debugger: Miscellaneous Debugger Commands.
+ (line 66)
+* exit the debugger <1>: Miscellaneous Debugger Commands.
+ (line 102)
+* exp: Numeric Functions. (line 19)
+* expand utility: Very Simple. (line 73)
+* Expat XML parser library: gawkextlib. (line 37)
+* exponent: Numeric Functions. (line 19)
+* expressions: Expressions. (line 6)
+* expressions, as patterns: Expression Patterns. (line 6)
+* expressions, assignment: Assignment Ops. (line 6)
+* expressions, Boolean: Boolean Ops. (line 6)
+* expressions, comparison: Typing and Comparison.
+ (line 9)
+* expressions, conditional: Conditional Exp. (line 6)
+* expressions, matching, See comparison expressions: Typing and Comparison.
+ (line 9)
+* expressions, selecting: Conditional Exp. (line 6)
+* Extended Regular Expressions (EREs): Bracket Expressions. (line 34)
+* extension API: Extension API Description.
+ (line 6)
+* extension API informational variables: Extension API Informational Variables.
+ (line 6)
+* extension API version: Extension Versioning.
+ (line 6)
+* extension API, version number: Auto-set. (line 246)
+* extension example: Extension Example. (line 6)
+* extension registration: Registration Functions.
+ (line 6)
+* extension search path: Finding Extensions. (line 6)
+* extensions distributed with gawk: Extension Samples. (line 6)
+* extensions, allocating memory: Memory Allocation Functions.
+ (line 6)
+* extensions, Brian Kernighan's awk: BTL. (line 6)
+* extensions, Brian Kernighan's awk <1>: Common Extensions. (line 6)
+* extensions, common, ** operator: Arithmetic Ops. (line 30)
+* extensions, common, **= operator: Assignment Ops. (line 138)
+* extensions, common, /dev/stderr special file: Special FD. (line 48)
+* extensions, common, /dev/stdin special file: Special FD. (line 48)
+* extensions, common, /dev/stdout special file: Special FD. (line 48)
+* extensions, common, BINMODE variable: PC Using. (line 16)
+* extensions, common, delete to delete entire arrays: Delete. (line 39)
+* extensions, common, fflush() function: I/O Functions. (line 43)
+* extensions, common, func keyword: Definition Syntax. (line 99)
+* extensions, common, length() applied to an array: String Functions.
+ (line 200)
+* extensions, common, RS as a regexp: gawk split records. (line 6)
+* extensions, common, single character fields: Single Character Fields.
+ (line 6)
+* extensions, common, \x escape sequence: Escape Sequences. (line 61)
+* extensions, in gawk, not in POSIX awk: POSIX/GNU. (line 6)
+* extensions, loading, @load directive: Loading Shared Libraries.
+ (line 8)
+* extensions, mawk: Common Extensions. (line 6)
+* extensions, where to find: gawkextlib. (line 6)
+* extract.awk program: Extract Program. (line 79)
+* extraction, of marked strings (internationalization): String Extraction.
+ (line 6)
+* f debugger command (alias for frame): Execution Stack. (line 27)
+* false, logical: Truth Values. (line 6)
+* FDL (Free Documentation License): GNU Free Documentation License.
+ (line 8)
+* features, adding to gawk: Adding Code. (line 6)
+* features, deprecated: Obsolete. (line 6)
+* features, undocumented: Undocumented. (line 6)
+* Fenlason, Jay: History. (line 30)
+* Fenlason, Jay <1>: Contributors. (line 19)
+* fflush: I/O Functions. (line 28)
+* field numbers: Nonconstant Fields. (line 6)
+* field operator $: Fields. (line 19)
+* field operators, dollar sign as: Fields. (line 19)
+* field separator, in multiline records: Multiple Line. (line 41)
+* field separator, on command line: Command Line Field Separator.
+ (line 6)
+* field separator, POSIX and: Full Line Fields. (line 16)
+* field separators: Field Separators. (line 15)
+* field separators <1>: User-modified. (line 50)
+* field separators <2>: User-modified. (line 113)
+* field separators, choice of: Field Separators. (line 50)
+* field separators, FIELDWIDTHS variable and: User-modified. (line 37)
+* field separators, FPAT variable and: User-modified. (line 43)
+* field separators, regular expressions as: Field Separators. (line 50)
+* field separators, regular expressions as <1>: Regexp Field Splitting.
+ (line 6)
+* field separators, See Also OFS: Changing Fields. (line 64)
+* field separators, spaces as: Cut Program. (line 103)
+* fields: Reading Files. (line 14)
+* fields <1>: Fields. (line 6)
+* fields <2>: Basic High Level. (line 62)
+* fields, adding: Changing Fields. (line 53)
+* fields, changing contents of: Changing Fields. (line 6)
+* fields, cutting: Cut Program. (line 6)
+* fields, examining: Fields. (line 6)
+* fields, number of: Fields. (line 33)
+* fields, numbers: Nonconstant Fields. (line 6)
+* fields, printing: Print Examples. (line 20)
+* fields, separating: Field Separators. (line 15)
+* fields, separating <1>: Field Separators. (line 15)
+* fields, single-character: Single Character Fields.
+ (line 6)
+* FIELDWIDTHS variable: Constant Size. (line 22)
+* FIELDWIDTHS variable <1>: User-modified. (line 37)
+* file descriptors: Special FD. (line 6)
+* file inclusion, @include directive: Include Files. (line 8)
+* file names, distinguishing: Auto-set. (line 55)
+* file names, in compatibility mode: Special Caveats. (line 9)
+* file names, standard streams in gawk: Special FD. (line 48)
+* FILENAME variable: Reading Files. (line 6)
+* FILENAME variable <1>: Auto-set. (line 108)
+* FILENAME variable, getline, setting with: Getline Notes. (line 19)
+* filenames, assignments as: Ignoring Assigns. (line 6)
+* files, .gmo: Explaining gettext. (line 42)
+* files, .gmo, specifying directory of: Explaining gettext. (line 54)
+* files, .gmo, specifying directory of <1>: Programmer i18n. (line 48)
+* files, .mo, converting from .po: I18N Example. (line 66)
+* files, .po: Explaining gettext. (line 37)
+* files, .po <1>: Translator i18n. (line 6)
+* files, .po, converting to .mo: I18N Example. (line 66)
+* files, .pot: Explaining gettext. (line 31)
+* files, /dev/... special files: Special FD. (line 48)
+* files, /inet/... (gawk): TCP/IP Networking. (line 6)
+* files, /inet4/... (gawk): TCP/IP Networking. (line 6)
+* files, /inet6/... (gawk): TCP/IP Networking. (line 6)
+* files, awk programs in: Long. (line 6)
+* files, awkprof.out: Profiling. (line 6)
+* files, awkvars.out: Options. (line 94)
+* files, closing: I/O Functions. (line 10)
+* files, descriptors, See file descriptors: Special FD. (line 6)
+* files, group: Group Functions. (line 6)
+* files, initialization and cleanup: Filetrans Function. (line 6)
+* files, input, See input files: Read Terminal. (line 16)
+* files, log, timestamps in: Time Functions. (line 6)
+* files, managing: Data File Management.
+ (line 6)
+* files, managing, data file boundaries: Filetrans Function. (line 6)
+* files, message object: Explaining gettext. (line 42)
+* files, message object, converting from portable object files: I18N Example.
+ (line 66)
+* files, message object, specifying directory of: Explaining gettext.
+ (line 54)
+* files, message object, specifying directory of <1>: Programmer i18n.
+ (line 48)
+* files, multiple passes over: Other Arguments. (line 56)
+* files, multiple, duplicating output into: Tee Program. (line 6)
+* files, output, See output files: Close Files And Pipes.
+ (line 6)
+* files, password: Passwd Functions. (line 16)
+* files, portable object: Explaining gettext. (line 37)
+* files, portable object <1>: Translator i18n. (line 6)
+* files, portable object template: Explaining gettext. (line 31)
+* files, portable object, converting to message object files: I18N Example.
+ (line 66)
+* files, portable object, generating: Options. (line 147)
+* files, processing, ARGIND variable and: Auto-set. (line 50)
+* files, reading: Rewind Function. (line 6)
+* files, reading, multiline records: Multiple Line. (line 6)
+* files, searching for regular expressions: Egrep Program. (line 6)
+* files, skipping: File Checking. (line 6)
+* files, source, search path for: Programs Exercises. (line 70)
+* files, splitting: Split Program. (line 6)
+* files, Texinfo, extracting programs from: Extract Program. (line 6)
+* find substring in string: String Functions. (line 155)
+* finding extensions: Finding Extensions. (line 6)
+* finish debugger command: Debugger Execution Control.
+ (line 39)
+* Fish, Fred: Contributors. (line 51)
+* fixed-width data: Constant Size. (line 6)
+* flag variables: Boolean Ops. (line 69)
+* flag variables <1>: Tee Program. (line 20)
+* floating-point, numbers, arbitrary precision: Arbitrary Precision Arithmetic.
+ (line 6)
+* floating-point, VAX/VMS: VMS Running. (line 50)
+* flush buffered output: I/O Functions. (line 28)
+* fnmatch() extension function: Extension Sample Fnmatch.
+ (line 12)
+* FNR variable: Records. (line 6)
+* FNR variable <1>: Auto-set. (line 118)
+* FNR variable, changing: Auto-set. (line 357)
+* for statement: For Statement. (line 6)
+* for statement, looping over arrays: Scanning an Array. (line 20)
+* fork() extension function: Extension Sample Fork.
+ (line 11)
+* format specifiers: Basic Printf. (line 15)
+* format specifiers, mixing regular with positional specifiers: Printf Ordering.
+ (line 57)
+* format specifiers, printf statement: Control Letters. (line 6)
+* format specifiers, strftime() function (gawk): Time Functions.
+ (line 89)
+* format time string: Time Functions. (line 48)
+* formats, numeric output: OFMT. (line 6)
+* formatting output: Printf. (line 6)
+* formatting strings: String Functions. (line 384)
+* forward slash (/) to enclose regular expressions: Regexp. (line 10)
+* forward slash (/), / operator: Precedence. (line 54)
+* forward slash (/), /= operator: Assignment Ops. (line 129)
+* forward slash (/), /= operator <1>: Precedence. (line 94)
+* forward slash (/), /= operator, vs. /=.../ regexp constant: Assignment Ops.
+ (line 149)
+* forward slash (/), patterns and: Expression Patterns. (line 24)
+* FPAT variable: Splitting By Content.
+ (line 25)
+* FPAT variable <1>: User-modified. (line 43)
+* frame debugger command: Execution Stack. (line 27)
+* Free Documentation License (FDL): GNU Free Documentation License.
+ (line 8)
+* Free Software Foundation (FSF): Manual History. (line 6)
+* Free Software Foundation (FSF) <1>: Getting. (line 10)
+* Free Software Foundation (FSF) <2>: Glossary. (line 372)
+* Free Software Foundation (FSF) <3>: Glossary. (line 405)
+* FreeBSD: Glossary. (line 748)
+* FS variable: Field Separators. (line 15)
+* FS variable <1>: User-modified. (line 50)
+* FS variable, --field-separator option and: Options. (line 21)
+* FS variable, as null string: Single Character Fields.
+ (line 20)
+* FS variable, as TAB character: Options. (line 266)
+* FS variable, changing value of: Field Separators. (line 34)
+* FS variable, running awk programs and: Cut Program. (line 63)
+* FS variable, setting from command line: Command Line Field Separator.
+ (line 6)
+* FS, containing ^: Regexp Field Splitting.
+ (line 59)
+* FS, in multiline records: Multiple Line. (line 41)
+* FSF (Free Software Foundation): Manual History. (line 6)
+* FSF (Free Software Foundation) <1>: Getting. (line 10)
+* FSF (Free Software Foundation) <2>: Glossary. (line 372)
+* FSF (Free Software Foundation) <3>: Glossary. (line 405)
+* fts() extension function: Extension Sample File Functions.
+ (line 60)
+* FUNCTAB array: Auto-set. (line 134)
+* function calls: Function Calls. (line 6)
+* function calls, indirect: Indirect Calls. (line 6)
+* function calls, indirect, @-notation for: Indirect Calls. (line 47)
+* function definition example: Function Example. (line 6)
+* function pointers: Indirect Calls. (line 6)
+* functions, arrays as parameters to: Pass By Value/Reference.
+ (line 44)
+* functions, built-in: Function Calls. (line 10)
+* functions, built-in <1>: Functions. (line 6)
+* functions, built-in, evaluation order: Calling Built-in. (line 30)
+* functions, defining: Definition Syntax. (line 10)
+* functions, library: Library Functions. (line 6)
+* functions, library, assertions: Assert Function. (line 6)
+* functions, library, associative arrays and: Library Names. (line 58)
+* functions, library, C library: Getopt Function. (line 6)
+* functions, library, character values as numbers: Ordinal Functions.
+ (line 6)
+* functions, library, Cliff random numbers: Cliff Random Function.
+ (line 6)
+* functions, library, command-line options: Getopt Function. (line 6)
+* functions, library, example program for using: Igawk Program.
+ (line 6)
+* functions, library, group database, reading: Group Functions.
+ (line 6)
+* functions, library, managing data files: Data File Management.
+ (line 6)
+* functions, library, managing time: Getlocaltime Function.
+ (line 6)
+* functions, library, merging arrays into strings: Join Function.
+ (line 6)
+* functions, library, rounding numbers: Round Function. (line 6)
+* functions, library, user database, reading: Passwd Functions.
+ (line 6)
+* functions, names of: Definition Syntax. (line 24)
+* functions, recursive: Definition Syntax. (line 89)
+* functions, string-translation: I18N Functions. (line 6)
+* functions, undefined: Pass By Value/Reference.
+ (line 68)
+* functions, user-defined: User-defined. (line 6)
+* functions, user-defined, calling: Function Caveats. (line 6)
+* functions, user-defined, counts, in a profile: Profiling. (line 137)
+* functions, user-defined, library of: Library Functions. (line 6)
+* functions, user-defined, next/nextfile statements and: Next Statement.
+ (line 44)
+* functions, user-defined, next/nextfile statements and <1>: Nextfile Statement.
+ (line 47)
+* G-d: Acknowledgments. (line 94)
+* G., Daniel Richard: Acknowledgments. (line 60)
+* G., Daniel Richard <1>: Maintainers. (line 14)
+* Garfinkle, Scott: Contributors. (line 35)
+* gawk program, dynamic profiling: Profiling. (line 177)
+* gawk version: Auto-set. (line 221)
+* gawk, ARGIND variable in: Other Arguments. (line 15)
+* gawk, awk and: Preface. (line 21)
+* gawk, awk and <1>: This Manual. (line 14)
+* gawk, bitwise operations in: Bitwise Functions. (line 40)
+* gawk, break statement in: Break Statement. (line 51)
+* gawk, character classes and: Bracket Expressions. (line 108)
+* gawk, coding style in: Adding Code. (line 37)
+* gawk, command-line options, and regular expressions: GNU Regexp Operators.
+ (line 73)
+* gawk, configuring: Configuration Philosophy.
+ (line 6)
+* gawk, configuring, options: Additional Configuration Options.
+ (line 6)
+* gawk, continue statement in: Continue Statement. (line 44)
+* gawk, distribution: Distribution contents.
+ (line 6)
+* gawk, ERRNO variable in: Getline. (line 19)
+* gawk, ERRNO variable in <1>: Close Files And Pipes.
+ (line 140)
+* gawk, ERRNO variable in <2>: BEGINFILE/ENDFILE. (line 26)
+* gawk, ERRNO variable in <3>: Auto-set. (line 87)
+* gawk, ERRNO variable in <4>: TCP/IP Networking. (line 54)
+* gawk, escape sequences: Escape Sequences. (line 121)
+* gawk, extensions, disabling: Options. (line 257)
+* gawk, features, adding: Adding Code. (line 6)
+* gawk, features, advanced: Advanced Features. (line 6)
+* gawk, field separators and: User-modified. (line 71)
+* gawk, FIELDWIDTHS variable in: Constant Size. (line 22)
+* gawk, FIELDWIDTHS variable in <1>: User-modified. (line 37)
+* gawk, file names in: Special Files. (line 6)
+* gawk, format-control characters: Control Letters. (line 18)
+* gawk, format-control characters <1>: Control Letters. (line 93)
+* gawk, FPAT variable in: Splitting By Content.
+ (line 25)
+* gawk, FPAT variable in <1>: User-modified. (line 43)
+* gawk, FUNCTAB array in: Auto-set. (line 134)
+* gawk, function arguments and: Calling Built-in. (line 16)
+* gawk, hexadecimal numbers and: Nondecimal-numbers. (line 41)
+* gawk, IGNORECASE variable in: Case-sensitivity. (line 26)
+* gawk, IGNORECASE variable in <1>: User-modified. (line 76)
+* gawk, IGNORECASE variable in <2>: Array Intro. (line 100)
+* gawk, IGNORECASE variable in <3>: String Functions. (line 58)
+* gawk, IGNORECASE variable in <4>: Array Sorting Functions.
+ (line 83)
+* gawk, implementation issues: Notes. (line 6)
+* gawk, implementation issues, debugging: Compatibility Mode. (line 6)
+* gawk, implementation issues, downward compatibility: Compatibility Mode.
+ (line 6)
+* gawk, implementation issues, limits: Getline Notes. (line 14)
+* gawk, implementation issues, pipes: Redirection. (line 129)
+* gawk, installing: Installation. (line 6)
+* gawk, internationalization and, See internationalization: Internationalization.
+ (line 13)
+* gawk, interpreter, adding code to: Using Internal File Ops.
+ (line 6)
+* gawk, interval expressions and: Regexp Operators. (line 139)
+* gawk, line continuation in: Conditional Exp. (line 34)
+* gawk, LINT variable in: User-modified. (line 87)
+* gawk, list of contributors to: Contributors. (line 6)
+* gawk, MS-Windows version of: PC Using. (line 9)
+* gawk, newlines in: Statements/Lines. (line 12)
+* gawk, octal numbers and: Nondecimal-numbers. (line 41)
+* gawk, predefined variables and: Built-in Variables. (line 14)
+* gawk, PROCINFO array in: Auto-set. (line 148)
+* gawk, PROCINFO array in <1>: Time Functions. (line 47)
+* gawk, PROCINFO array in <2>: Two-way I/O. (line 114)
+* gawk, regexp constants and: Using Constant Regexps.
+ (line 28)
+* gawk, regular expressions, case sensitivity: Case-sensitivity.
+ (line 26)
+* gawk, regular expressions, operators: GNU Regexp Operators.
+ (line 6)
+* gawk, regular expressions, precedence: Regexp Operators. (line 161)
+* gawk, RT variable in: awk split records. (line 124)
+* gawk, RT variable in <1>: Multiple Line. (line 130)
+* gawk, RT variable in <2>: Auto-set. (line 295)
+* gawk, See Also awk: Preface. (line 34)
+* gawk, source code, obtaining: Getting. (line 6)
+* gawk, splitting fields and: Constant Size. (line 86)
+* gawk, string-translation functions: I18N Functions. (line 6)
+* gawk, SYMTAB array in: Auto-set. (line 299)
+* gawk, TEXTDOMAIN variable in: User-modified. (line 152)
+* gawk, timestamps: Time Functions. (line 6)
+* gawk, uses for: Preface. (line 34)
+* gawk, versions of, information about, printing: Options. (line 304)
+* gawk, VMS version of: VMS Installation. (line 6)
+* gawk, word-boundary operator: GNU Regexp Operators.
+ (line 66)
+* gawkextlib: gawkextlib. (line 6)
+* gawkextlib project: gawkextlib. (line 6)
+* gawklibpath_append shell function: Shell Startup Files. (line 29)
+* gawklibpath_default shell function: Shell Startup Files. (line 22)
+* gawklibpath_prepend shell function: Shell Startup Files. (line 25)
+* gawkpath_append shell function: Shell Startup Files. (line 19)
+* gawkpath_default shell function: Shell Startup Files. (line 12)
+* gawkpath_prepend shell function: Shell Startup Files. (line 15)
+* General Public License (GPL): Glossary. (line 396)
+* General Public License, See GPL: Manual History. (line 11)
+* generate time values: Time Functions. (line 25)
+* gensub: Using Constant Regexps.
+ (line 43)
+* gensub <1>: String Functions. (line 89)
+* gensub() function (gawk), escape processing: Gory Details. (line 6)
+* getaddrinfo() function (C library): TCP/IP Networking. (line 39)
+* getgrent() function (C library): Group Functions. (line 6)
+* getgrent() function (C library) <1>: Group Functions. (line 202)
+* getgrent() user-defined function: Group Functions. (line 6)
+* getgrent() user-defined function <1>: Group Functions. (line 205)
+* getgrgid() function (C library): Group Functions. (line 184)
+* getgrgid() user-defined function: Group Functions. (line 187)
+* getgrnam() function (C library): Group Functions. (line 173)
+* getgrnam() user-defined function: Group Functions. (line 178)
+* getgruser() function (C library): Group Functions. (line 193)
+* getgruser() function, user-defined: Group Functions. (line 196)
+* getline command: Reading Files. (line 20)
+* getline command, coprocesses, using from: Getline/Coprocess.
+ (line 6)
+* getline command, coprocesses, using from <1>: Close Files And Pipes.
+ (line 6)
+* getline command, deadlock and: Two-way I/O. (line 53)
+* getline command, explicit input with: Getline. (line 6)
+* getline command, FILENAME variable and: Getline Notes. (line 19)
+* getline command, return values: Getline. (line 19)
+* getline command, variants: Getline Summary. (line 6)
+* getline command, _gr_init() user-defined function: Group Functions.
+ (line 83)
+* getline command, _pw_init() function: Passwd Functions. (line 154)
+* getline from a file: Getline/File. (line 6)
+* getline into a variable: Getline/Variable. (line 6)
+* getline statement, BEGINFILE/ENDFILE patterns and: BEGINFILE/ENDFILE.
+ (line 53)
+* getlocaltime() user-defined function: Getlocaltime Function.
+ (line 16)
+* getopt() function (C library): Getopt Function. (line 15)
+* getopt() user-defined function: Getopt Function. (line 108)
+* getopt() user-defined function <1>: Getopt Function. (line 134)
+* getpwent() function (C library): Passwd Functions. (line 16)
+* getpwent() function (C library) <1>: Passwd Functions. (line 196)
+* getpwent() user-defined function: Passwd Functions. (line 16)
+* getpwent() user-defined function <1>: Passwd Functions. (line 200)
+* getpwnam() function (C library): Passwd Functions. (line 175)
+* getpwnam() user-defined function: Passwd Functions. (line 180)
+* getpwuid() function (C library): Passwd Functions. (line 186)
+* getpwuid() user-defined function: Passwd Functions. (line 190)
+* gettext library: Explaining gettext. (line 6)
+* gettext library, locale categories: Explaining gettext. (line 81)
+* gettext() function (C library): Explaining gettext. (line 63)
+* gettimeofday() extension function: Extension Sample Time.
+ (line 12)
+* git utility: gawkextlib. (line 31)
+* git utility <1>: Other Versions. (line 29)
+* git utility <2>: Accessing The Source.
+ (line 10)
+* git utility <3>: Adding Code. (line 112)
+* Git, use of for gawk source code: Derived Files. (line 6)
+* GNITS mailing list: Acknowledgments. (line 52)
+* GNU awk, See gawk: Preface. (line 51)
+* GNU Free Documentation License: GNU Free Documentation License.
+ (line 8)
+* GNU General Public License: Glossary. (line 396)
+* GNU Lesser General Public License: Glossary. (line 491)
+* GNU long options: Command Line. (line 13)
+* GNU long options <1>: Options. (line 6)
+* GNU long options, printing list of: Options. (line 154)
+* GNU Project: Manual History. (line 11)
+* GNU Project <1>: Glossary. (line 405)
+* GNU/Linux: Manual History. (line 28)
+* GNU/Linux <1>: I18N Example. (line 57)
+* GNU/Linux <2>: Glossary. (line 748)
+* Gordon, Assaf: Contributors. (line 106)
+* GPL (General Public License): Manual History. (line 11)
+* GPL (General Public License) <1>: Glossary. (line 396)
+* GPL (General Public License), printing: Options. (line 89)
+* grcat program: Group Functions. (line 16)
+* Grigera, Juan: Contributors. (line 58)
+* group database, reading: Group Functions. (line 6)
+* group file: Group Functions. (line 6)
+* group ID of gawk user: Auto-set. (line 170)
+* groups, information about: Group Functions. (line 6)
+* gsub: Using Constant Regexps.
+ (line 43)
+* gsub <1>: String Functions. (line 139)
+* gsub() function, arguments of: String Functions. (line 463)
+* gsub() function, escape processing: Gory Details. (line 6)
+* h debugger command (alias for help): Miscellaneous Debugger Commands.
+ (line 69)
+* Hankerson, Darrel: Acknowledgments. (line 60)
+* Hankerson, Darrel <1>: Contributors. (line 61)
+* Haque, John: Contributors. (line 109)
+* Hartholz, Elaine: Acknowledgments. (line 38)
+* Hartholz, Marshall: Acknowledgments. (line 38)
+* Hasegawa, Isamu: Contributors. (line 95)
+* help debugger command: Miscellaneous Debugger Commands.
+ (line 69)
+* hexadecimal numbers: Nondecimal-numbers. (line 6)
+* hexadecimal values, enabling interpretation of: Options. (line 209)
+* history expansion, in debugger: Readline Support. (line 6)
+* histsort.awk program: History Sorting. (line 25)
+* Hughes, Phil: Acknowledgments. (line 43)
+* HUP signal, for dynamic profiling: Profiling. (line 209)
+* hyphen (-), - operator: Precedence. (line 51)
+* hyphen (-), - operator <1>: Precedence. (line 57)
+* hyphen (-), -- operator: Increment Ops. (line 48)
+* hyphen (-), -- operator <1>: Precedence. (line 45)
+* hyphen (-), -= operator: Assignment Ops. (line 129)
+* hyphen (-), -= operator <1>: Precedence. (line 94)
+* hyphen (-), filenames beginning with: Options. (line 60)
+* hyphen (-), in bracket expressions: Bracket Expressions. (line 25)
+* i debugger command (alias for info): Debugger Info. (line 13)
+* id utility: Id Program. (line 6)
+* id.awk program: Id Program. (line 31)
+* if statement: If Statement. (line 6)
+* if statement, actions, changing: Ranges. (line 25)
+* if statement, use of regexps in: Regexp Usage. (line 19)
+* igawk.sh program: Igawk Program. (line 124)
+* ignore breakpoint: Breakpoint Control. (line 87)
+* ignore debugger command: Breakpoint Control. (line 87)
+* IGNORECASE variable: User-modified. (line 76)
+* IGNORECASE variable, and array indices: Array Intro. (line 100)
+* IGNORECASE variable, and array sorting functions: Array Sorting Functions.
+ (line 83)
+* IGNORECASE variable, in example programs: Library Functions.
+ (line 53)
+* IGNORECASE variable, with ~ and !~ operators: Case-sensitivity.
+ (line 26)
+* Illumos: Other Versions. (line 109)
+* Illumos, POSIX-compliant awk: Other Versions. (line 109)
+* implementation issues, gawk: Notes. (line 6)
+* implementation issues, gawk, debugging: Compatibility Mode. (line 6)
+* implementation issues, gawk, limits: Getline Notes. (line 14)
+* implementation issues, gawk, limits <1>: Redirection. (line 129)
+* in operator: Comparison Operators.
+ (line 11)
+* in operator <1>: Precedence. (line 82)
+* in operator <2>: For Statement. (line 75)
+* in operator, index existence in multidimensional arrays: Multidimensional.
+ (line 41)
+* in operator, order of array access: Scanning an Array. (line 48)
+* in operator, testing if array element exists: Reference to Elements.
+ (line 38)
+* in operator, use in loops: Scanning an Array. (line 17)
+* including files, @include directive: Include Files. (line 8)
+* increment operators: Increment Ops. (line 6)
+* index: String Functions. (line 155)
+* indexing arrays: Array Intro. (line 48)
+* indirect function calls: Indirect Calls. (line 6)
+* indirect function calls, @-notation: Indirect Calls. (line 47)
+* infinite precision: Arbitrary Precision Arithmetic.
+ (line 6)
+* info debugger command: Debugger Info. (line 13)
+* initialization, automatic: More Complex. (line 39)
+* inplace extension: Extension Sample Inplace.
+ (line 6)
+* input files: Reading Files. (line 6)
+* input files, closing: Close Files And Pipes.
+ (line 6)
+* input files, counting elements in: Wc Program. (line 6)
+* input files, examples: Sample Data Files. (line 6)
+* input files, reading: Reading Files. (line 6)
+* input files, running awk without: Read Terminal. (line 6)
+* input files, running awk without <1>: Read Terminal. (line 16)
+* input files, variable assignments and: Other Arguments. (line 26)
+* input pipeline: Getline/Pipe. (line 10)
+* input record, length of: String Functions. (line 177)
+* input redirection: Getline/File. (line 6)
+* input, data, nondecimal: Nondecimal Data. (line 6)
+* input, explicit: Getline. (line 6)
+* input, files, See input files: Multiple Line. (line 6)
+* input, multiline records: Multiple Line. (line 6)
+* input, splitting into records: Records. (line 6)
+* input, standard: Read Terminal. (line 6)
+* input, standard <1>: Special FD. (line 6)
+* input/output functions: I/O Functions. (line 6)
+* input/output, binary: User-modified. (line 15)
+* input/output, from BEGIN and END: I/O And BEGIN/END. (line 6)
+* input/output, two-way: Two-way I/O. (line 27)
+* insomnia, cure for: Alarm Program. (line 6)
+* installation, VMS: VMS Installation. (line 6)
+* installing gawk: Installation. (line 6)
+* instruction tracing, in debugger: Debugger Info. (line 90)
+* int: Numeric Functions. (line 24)
+* INT signal (MS-Windows): Profiling. (line 212)
+* intdiv: Numeric Functions. (line 29)
+* integer array indices: Numeric Array Subscripts.
+ (line 31)
+* integers, arbitrary precision: Arbitrary Precision Integers.
+ (line 6)
+* integers, unsigned: Computer Arithmetic. (line 41)
+* interacting with other programs: I/O Functions. (line 107)
+* internationalization: I18N Functions. (line 6)
+* internationalization <1>: I18N and L10N. (line 6)
+* internationalization, localization: User-modified. (line 152)
+* internationalization, localization <1>: Internationalization.
+ (line 13)
+* internationalization, localization, character classes: Bracket Expressions.
+ (line 108)
+* internationalization, localization, gawk and: Internationalization.
+ (line 13)
+* internationalization, localization, locale categories: Explaining gettext.
+ (line 81)
+* internationalization, localization, marked strings: Programmer i18n.
+ (line 13)
+* internationalization, localization, portability and: I18N Portability.
+ (line 6)
+* internationalizing a program: Explaining gettext. (line 6)
+* interpreted programs: Basic High Level. (line 13)
+* interpreted programs <1>: Glossary. (line 445)
+* interval expressions, regexp operator: Regexp Operators. (line 116)
+* inventory-shipped file: Sample Data Files. (line 32)
+* invoke shell command: I/O Functions. (line 107)
+* isarray: Type Functions. (line 11)
+* ISO: Glossary. (line 456)
+* ISO 8859-1: Glossary. (line 196)
+* ISO Latin-1: Glossary. (line 196)
+* Jacobs, Andrew: Passwd Functions. (line 90)
+* Jaegermann, Michal: Acknowledgments. (line 60)
+* Jaegermann, Michal <1>: Contributors. (line 46)
+* Java implementation of awk: Other Versions. (line 117)
+* Java programming language: Glossary. (line 468)
+* jawk: Other Versions. (line 117)
+* Jedi knights: Undocumented. (line 6)
+* Johansen, Chris: Signature Program. (line 25)
+* join() user-defined function: Join Function. (line 18)
+* Kahrs, Jürgen: Acknowledgments. (line 60)
+* Kahrs, Jürgen <1>: Contributors. (line 71)
+* Kasal, Stepan: Acknowledgments. (line 60)
+* Kenobi, Obi-Wan: Undocumented. (line 6)
+* Kernighan, Brian: History. (line 17)
+* Kernighan, Brian <1>: Conventions. (line 38)
+* Kernighan, Brian <2>: Acknowledgments. (line 79)
+* Kernighan, Brian <3>: Getline/Pipe. (line 6)
+* Kernighan, Brian <4>: Concatenation. (line 6)
+* Kernighan, Brian <5>: Library Functions. (line 12)
+* Kernighan, Brian <6>: BTL. (line 6)
+* Kernighan, Brian <7>: Contributors. (line 12)
+* Kernighan, Brian <8>: Other Versions. (line 13)
+* Kernighan, Brian <9>: Basic Data Typing. (line 54)
+* Kernighan, Brian <10>: Glossary. (line 206)
+* kill command, dynamic profiling: Profiling. (line 186)
+* Knights, jedi: Undocumented. (line 6)
+* Kwok, Conrad: Contributors. (line 35)
+* l debugger command (alias for list): Miscellaneous Debugger Commands.
+ (line 75)
+* labels.awk program: Labels Program. (line 51)
+* Langston, Peter: Advanced Features. (line 6)
+* LANGUAGE environment variable: Explaining gettext. (line 120)
+* languages, data-driven: Basic High Level. (line 74)
+* LC_ALL locale category: Explaining gettext. (line 117)
+* LC_COLLATE locale category: Explaining gettext. (line 94)
+* LC_CTYPE locale category: Explaining gettext. (line 98)
+* LC_MESSAGES locale category: Explaining gettext. (line 88)
+* LC_MESSAGES locale category, bindtextdomain() function (gawk): Programmer i18n.
+ (line 101)
+* LC_MONETARY locale category: Explaining gettext. (line 104)
+* LC_NUMERIC locale category: Explaining gettext. (line 108)
+* LC_TIME locale category: Explaining gettext. (line 112)
+* left angle bracket (<), < operator: Comparison Operators.
+ (line 11)
+* left angle bracket (<), < operator <1>: Precedence. (line 64)
+* left angle bracket (<), < operator (I/O): Getline/File. (line 6)
+* left angle bracket (<), <= operator: Comparison Operators.
+ (line 11)
+* left angle bracket (<), <= operator <1>: Precedence. (line 64)
+* left shift: Bitwise Functions. (line 47)
+* left shift, bitwise: Bitwise Functions. (line 32)
+* leftmost longest match: Multiple Line. (line 26)
+* length: String Functions. (line 170)
+* length of input record: String Functions. (line 177)
+* length of string: String Functions. (line 170)
+* Lesser General Public License (LGPL): Glossary. (line 491)
+* LGPL (Lesser General Public License): Glossary. (line 491)
+* libmawk: Other Versions. (line 125)
+* libraries of awk functions: Library Functions. (line 6)
+* libraries of awk functions, assertions: Assert Function. (line 6)
+* libraries of awk functions, associative arrays and: Library Names.
+ (line 58)
+* libraries of awk functions, character values as numbers: Ordinal Functions.
+ (line 6)
+* libraries of awk functions, command-line options: Getopt Function.
+ (line 6)
+* libraries of awk functions, example program for using: Igawk Program.
+ (line 6)
+* libraries of awk functions, group database, reading: Group Functions.
+ (line 6)
+* libraries of awk functions, managing, data files: Data File Management.
+ (line 6)
+* libraries of awk functions, managing, time: Getlocaltime Function.
+ (line 6)
+* libraries of awk functions, merging arrays into strings: Join Function.
+ (line 6)
+* libraries of awk functions, rounding numbers: Round Function.
+ (line 6)
+* libraries of awk functions, user database, reading: Passwd Functions.
+ (line 6)
+* line breaks: Statements/Lines. (line 6)
+* line continuations: Boolean Ops. (line 64)
+* line continuations, gawk: Conditional Exp. (line 34)
+* line continuations, in print statement: Print Examples. (line 75)
+* line continuations, with C shell: More Complex. (line 31)
+* lines, blank, printing: Print. (line 22)
+* lines, counting: Wc Program. (line 6)
+* lines, duplicate, removing: History Sorting. (line 6)
+* lines, matching ranges of: Ranges. (line 6)
+* lines, skipping between markers: Ranges. (line 43)
+* lint checking: User-modified. (line 87)
+* lint checking, array elements: Delete. (line 34)
+* lint checking, array subscripts: Uninitialized Subscripts.
+ (line 43)
+* lint checking, empty programs: Command Line. (line 16)
+* lint checking, issuing warnings: Options. (line 184)
+* lint checking, POSIXLY_CORRECT environment variable: Options.
+ (line 343)
+* lint checking, undefined functions: Pass By Value/Reference.
+ (line 85)
+* LINT variable: User-modified. (line 87)
+* Linux: Manual History. (line 28)
+* Linux <1>: I18N Example. (line 57)
+* Linux <2>: Glossary. (line 748)
+* list all global variables, in debugger: Debugger Info. (line 48)
+* list debugger command: Miscellaneous Debugger Commands.
+ (line 75)
+* list function definitions, in debugger: Debugger Info. (line 30)
+* loading extensions, @load directive: Loading Shared Libraries.
+ (line 8)
+* loading, extensions: Options. (line 172)
+* local variables, in a function: Variable Scope. (line 6)
+* locale categories: Explaining gettext. (line 81)
+* locale decimal point character: Options. (line 269)
+* locale, definition of: Locales. (line 6)
+* localization: I18N and L10N. (line 6)
+* localization, See internationalization, localization: I18N and L10N.
+ (line 6)
+* log: Numeric Functions. (line 44)
+* log files, timestamps in: Time Functions. (line 6)
+* logarithm: Numeric Functions. (line 44)
+* logical false/true: Truth Values. (line 6)
+* logical operators, See Boolean expressions: Boolean Ops. (line 6)
+* login information: Passwd Functions. (line 16)
+* long options: Command Line. (line 13)
+* loops: While Statement. (line 6)
+* loops, break statement and: Break Statement. (line 6)
+* loops, continue statements and: For Statement. (line 64)
+* loops, count for header, in a profile: Profiling. (line 131)
+* loops, do-while: Do Statement. (line 6)
+* loops, exiting: Break Statement. (line 6)
+* loops, for, array scanning: Scanning an Array. (line 6)
+* loops, for, iterative: For Statement. (line 6)
+* loops, See Also while statement: While Statement. (line 6)
+* loops, while: While Statement. (line 6)
+* ls utility: More Complex. (line 15)
+* lshift: Bitwise Functions. (line 47)
+* lvalues/rvalues: Assignment Ops. (line 31)
+* mail-list file: Sample Data Files. (line 6)
+* mailing labels, printing: Labels Program. (line 6)
+* mailing list, GNITS: Acknowledgments. (line 52)
+* Malmberg, John: Acknowledgments. (line 60)
+* Malmberg, John <1>: Maintainers. (line 14)
+* Malmberg, John E.: Contributors. (line 138)
+* mark parity: Ordinal Functions. (line 45)
+* marked string extraction (internationalization): String Extraction.
+ (line 6)
+* marked strings, extracting: String Extraction. (line 6)
+* Marx, Groucho: Increment Ops. (line 60)
+* match: String Functions. (line 210)
+* match regexp in string: String Functions. (line 210)
+* match() function, RSTART/RLENGTH variables: String Functions.
+ (line 227)
+* matching, expressions, See comparison expressions: Typing and Comparison.
+ (line 9)
+* matching, leftmost longest: Multiple Line. (line 26)
+* matching, null strings: String Functions. (line 537)
+* mawk utility: Escape Sequences. (line 121)
+* mawk utility <1>: Getline/Pipe. (line 62)
+* mawk utility <2>: Concatenation. (line 36)
+* mawk utility <3>: Nextfile Statement. (line 47)
+* mawk utility <4>: Other Versions. (line 48)
+* maximum precision supported by MPFR library: Auto-set. (line 235)
+* McIlroy, Doug: Glossary. (line 257)
+* McPhee, Patrick: Contributors. (line 101)
+* message object files: Explaining gettext. (line 42)
+* message object files, converting from portable object files: I18N Example.
+ (line 66)
+* message object files, specifying directory of: Explaining gettext.
+ (line 54)
+* message object files, specifying directory of <1>: Programmer i18n.
+ (line 48)
+* messages from extensions: Printing Messages. (line 6)
+* metacharacters in regular expressions: Regexp Operators. (line 6)
+* metacharacters, escape sequences for: Escape Sequences. (line 140)
+* minimum precision required by MPFR library: Auto-set. (line 238)
+* mktime: Time Functions. (line 25)
+* modifiers, in format specifiers: Format Modifiers. (line 6)
+* monetary information, localization: Explaining gettext. (line 104)
+* Moore, Duncan: Getline Notes. (line 40)
+* msgfmt utility: I18N Example. (line 66)
+* multiple precision: Arbitrary Precision Arithmetic.
+ (line 6)
+* multiple-line records: Multiple Line. (line 6)
+* n debugger command (alias for next): Debugger Execution Control.
+ (line 43)
+* names, arrays/variables: Library Names. (line 6)
+* names, functions: Definition Syntax. (line 24)
+* names, functions <1>: Library Names. (line 6)
+* namespace issues: Library Names. (line 6)
+* namespace issues, functions: Definition Syntax. (line 24)
+* NetBSD: Glossary. (line 748)
+* networks, programming: TCP/IP Networking. (line 6)
+* networks, support for: Special Network. (line 6)
+* newlines: Statements/Lines. (line 6)
+* newlines <1>: Options. (line 263)
+* newlines <2>: Boolean Ops. (line 69)
+* newlines, as record separators: awk split records. (line 12)
+* newlines, in dynamic regexps: Computed Regexps. (line 60)
+* newlines, in regexp constants: Computed Regexps. (line 70)
+* newlines, printing: Print Examples. (line 11)
+* newlines, separating statements in actions: Action Overview.
+ (line 19)
+* newlines, separating statements in actions <1>: Statements. (line 10)
+* next debugger command: Debugger Execution Control.
+ (line 43)
+* next file statement: Feature History. (line 168)
+* next statement: Boolean Ops. (line 95)
+* next statement <1>: Next Statement. (line 6)
+* next statement, BEGIN/END patterns and: I/O And BEGIN/END. (line 36)
+* next statement, BEGINFILE/ENDFILE patterns and: BEGINFILE/ENDFILE.
+ (line 49)
+* next statement, user-defined functions and: Next Statement. (line 44)
+* nextfile statement: Nextfile Statement. (line 6)
+* nextfile statement, BEGIN/END patterns and: I/O And BEGIN/END.
+ (line 36)
+* nextfile statement, BEGINFILE/ENDFILE patterns and: BEGINFILE/ENDFILE.
+ (line 26)
+* nextfile statement, user-defined functions and: Nextfile Statement.
+ (line 47)
+* nexti debugger command: Debugger Execution Control.
+ (line 49)
+* NF variable: Fields. (line 33)
+* NF variable <1>: Auto-set. (line 123)
+* NF variable, decrementing: Changing Fields. (line 107)
+* ni debugger command (alias for nexti): Debugger Execution Control.
+ (line 49)
+* noassign.awk program: Ignoring Assigns. (line 15)
+* non-existent array elements: Reference to Elements.
+ (line 23)
+* not Boolean-logic operator: Boolean Ops. (line 6)
+* NR variable: Records. (line 6)
+* NR variable <1>: Auto-set. (line 143)
+* NR variable, changing: Auto-set. (line 357)
+* null strings: awk split records. (line 114)
+* null strings <1>: Regexp Field Splitting.
+ (line 43)
+* null strings <2>: Truth Values. (line 6)
+* null strings <3>: Basic Data Typing. (line 26)
+* null strings in gawk arguments, quoting and: Quoting. (line 82)
+* null strings, and deleting array elements: Delete. (line 27)
+* null strings, as array subscripts: Uninitialized Subscripts.
+ (line 43)
+* null strings, converting numbers to strings: Strings And Numbers.
+ (line 21)
+* null strings, matching: String Functions. (line 537)
+* number as string of bits: Bitwise Functions. (line 108)
+* number of array elements: String Functions. (line 200)
+* number sign (#), #! (executable scripts): Executable Scripts.
+ (line 6)
+* number sign (#), commenting: Comments. (line 6)
+* numbers, as array subscripts: Numeric Array Subscripts.
+ (line 6)
+* numbers, as values of characters: Ordinal Functions. (line 6)
+* numbers, Cliff random: Cliff Random Function.
+ (line 6)
+* numbers, converting: Strings And Numbers. (line 6)
+* numbers, converting <1>: Bitwise Functions. (line 108)
+* numbers, converting, to strings: User-modified. (line 30)
+* numbers, converting, to strings <1>: User-modified. (line 104)
+* numbers, hexadecimal: Nondecimal-numbers. (line 6)
+* numbers, octal: Nondecimal-numbers. (line 6)
+* numbers, rounding: Round Function. (line 6)
+* numeric constants: Scalar Constants. (line 6)
+* numeric functions: Numeric Functions. (line 6)
+* numeric, output format: OFMT. (line 6)
+* numeric, strings: Variable Typing. (line 6)
+* o debugger command (alias for option): Debugger Info. (line 57)
+* obsolete features: Obsolete. (line 6)
+* octal numbers: Nondecimal-numbers. (line 6)
+* octal values, enabling interpretation of: Options. (line 209)
+* OFMT variable: OFMT. (line 15)
+* OFMT variable <1>: Strings And Numbers. (line 56)
+* OFMT variable <2>: User-modified. (line 104)
+* OFMT variable, POSIX awk and: OFMT. (line 27)
+* OFS variable: Changing Fields. (line 64)
+* OFS variable <1>: Output Separators. (line 6)
+* OFS variable <2>: User-modified. (line 113)
+* OpenBSD: Glossary. (line 748)
+* OpenSolaris: Other Versions. (line 100)
+* operating systems, BSD-based: Manual History. (line 28)
+* operating systems, PC, gawk on: PC Using. (line 6)
+* operating systems, PC, gawk on, installing: PC Installation.
+ (line 6)
+* operating systems, porting gawk to: New Ports. (line 6)
+* operating systems, See Also GNU/Linux, PC operating systems, Unix: Installation.
+ (line 6)
+* operations, bitwise: Bitwise Functions. (line 6)
+* operators, arithmetic: Arithmetic Ops. (line 6)
+* operators, assignment: Assignment Ops. (line 6)
+* operators, assignment <1>: Assignment Ops. (line 31)
+* operators, assignment, evaluation order: Assignment Ops. (line 110)
+* operators, Boolean, See Boolean expressions: Boolean Ops. (line 6)
+* operators, decrement/increment: Increment Ops. (line 6)
+* operators, GNU-specific: GNU Regexp Operators.
+ (line 6)
+* operators, input/output: Getline/File. (line 6)
+* operators, input/output <1>: Getline/Pipe. (line 10)
+* operators, input/output <2>: Getline/Coprocess. (line 6)
+* operators, input/output <3>: Redirection. (line 22)
+* operators, input/output <4>: Redirection. (line 96)
+* operators, input/output <5>: Precedence. (line 64)
+* operators, input/output <6>: Precedence. (line 64)
+* operators, input/output <7>: Precedence. (line 64)
+* operators, logical, See Boolean expressions: Boolean Ops. (line 6)
+* operators, precedence: Increment Ops. (line 60)
+* operators, precedence <1>: Precedence. (line 6)
+* operators, relational, See operators, comparison: Typing and Comparison.
+ (line 9)
+* operators, short-circuit: Boolean Ops. (line 59)
+* operators, string: Concatenation. (line 9)
+* operators, string-matching: Regexp Usage. (line 19)
+* operators, string-matching, for buffers: GNU Regexp Operators.
+ (line 51)
+* operators, word-boundary (gawk): GNU Regexp Operators.
+ (line 66)
+* option debugger command: Debugger Info. (line 57)
+* options, command-line: Options. (line 6)
+* options, command-line, end of: Options. (line 55)
+* options, command-line, invoking awk: Command Line. (line 6)
+* options, command-line, processing: Getopt Function. (line 6)
+* options, deprecated: Obsolete. (line 6)
+* options, long: Command Line. (line 13)
+* options, long <1>: Options. (line 6)
+* options, printing list of: Options. (line 154)
+* or: Bitwise Functions. (line 50)
+* OR bitwise operation: Bitwise Functions. (line 6)
+* or Boolean-logic operator: Boolean Ops. (line 6)
+* ord() extension function: Extension Sample Ord.
+ (line 12)
+* ord() user-defined function: Ordinal Functions. (line 16)
+* order of evaluation, concatenation: Concatenation. (line 41)
+* ORS variable: Output Separators. (line 20)
+* ORS variable <1>: User-modified. (line 119)
+* output field separator, See OFS variable: Changing Fields. (line 64)
+* output record separator, See ORS variable: Output Separators.
+ (line 20)
+* output redirection: Redirection. (line 6)
+* output wrapper: Output Wrappers. (line 6)
+* output, buffering: I/O Functions. (line 32)
+* output, buffering <1>: I/O Functions. (line 166)
+* output, duplicating into files: Tee Program. (line 6)
+* output, files, closing: Close Files And Pipes.
+ (line 6)
+* output, format specifier, OFMT: OFMT. (line 15)
+* output, formatted: Printf. (line 6)
+* output, pipes: Redirection. (line 57)
+* output, printing, See printing: Printing. (line 6)
+* output, records: Output Separators. (line 20)
+* output, standard: Special FD. (line 6)
+* p debugger command (alias for print): Viewing And Changing Data.
+ (line 35)
+* Papadopoulos, Panos: Contributors. (line 129)
+* parent process ID of gawk process: Auto-set. (line 210)
+* parentheses (), in a profile: Profiling. (line 146)
+* parentheses (), regexp operator: Regexp Operators. (line 81)
+* password file: Passwd Functions. (line 16)
+* patsplit: String Functions. (line 296)
+* patterns: Patterns and Actions.
+ (line 6)
+* patterns, comparison expressions as: Expression Patterns. (line 14)
+* patterns, counts, in a profile: Profiling. (line 118)
+* patterns, default: Very Simple. (line 35)
+* patterns, empty: Empty. (line 6)
+* patterns, expressions as: Regexp Patterns. (line 6)
+* patterns, ranges in: Ranges. (line 6)
+* patterns, regexp constants as: Expression Patterns. (line 34)
+* patterns, types of: Pattern Overview. (line 15)
+* pawk (profiling version of Brian Kernighan's awk): Other Versions.
+ (line 82)
+* pawk, awk-like facilities for Python: Other Versions. (line 129)
+* PC operating systems, gawk on: PC Using. (line 6)
+* PC operating systems, gawk on, installing: PC Installation. (line 6)
+* percent sign (%), % operator: Precedence. (line 54)
+* percent sign (%), %= operator: Assignment Ops. (line 129)
+* percent sign (%), %= operator <1>: Precedence. (line 94)
+* period (.), regexp operator: Regexp Operators. (line 44)
+* Perl: Future Extensions. (line 6)
+* Peters, Arno: Contributors. (line 86)
+* Peterson, Hal: Contributors. (line 40)
+* pipe, closing: Close Files And Pipes.
+ (line 6)
+* pipe, input: Getline/Pipe. (line 10)
+* pipe, output: Redirection. (line 57)
+* Pitts, Dave: Acknowledgments. (line 60)
+* Pitts, Dave <1>: Maintainers. (line 14)
+* Plauger, P.J.: Library Functions. (line 12)
+* plug-in: Extension Intro. (line 6)
+* plus sign (+), + operator: Precedence. (line 51)
+* plus sign (+), + operator <1>: Precedence. (line 57)
+* plus sign (+), ++ operator: Increment Ops. (line 11)
+* plus sign (+), ++ operator <1>: Increment Ops. (line 40)
+* plus sign (+), ++ operator <2>: Precedence. (line 45)
+* plus sign (+), += operator: Assignment Ops. (line 81)
+* plus sign (+), += operator <1>: Precedence. (line 94)
+* plus sign (+), regexp operator: Regexp Operators. (line 105)
+* pointers to functions: Indirect Calls. (line 6)
+* portability: Escape Sequences. (line 103)
+* portability, #! (executable scripts): Executable Scripts. (line 33)
+* portability, ** operator and: Arithmetic Ops. (line 81)
+* portability, **= operator and: Assignment Ops. (line 144)
+* portability, ARGV variable: Executable Scripts. (line 59)
+* portability, backslash continuation and: Statements/Lines. (line 30)
+* portability, backslash in escape sequences: Escape Sequences.
+ (line 108)
+* portability, close() function and: Close Files And Pipes.
+ (line 81)
+* portability, data files as single record: gawk split records.
+ (line 65)
+* portability, deleting array elements: Delete. (line 56)
+* portability, example programs: Library Functions. (line 42)
+* portability, functions, defining: Definition Syntax. (line 114)
+* portability, gawk: New Ports. (line 6)
+* portability, gettext library and: Explaining gettext. (line 11)
+* portability, internationalization and: I18N Portability. (line 6)
+* portability, length() function: String Functions. (line 179)
+* portability, new awk vs. old awk: Strings And Numbers. (line 56)
+* portability, next statement in user-defined functions: Pass By Value/Reference.
+ (line 88)
+* portability, NF variable, decrementing: Changing Fields. (line 115)
+* portability, operators: Increment Ops. (line 60)
+* portability, operators, not in POSIX awk: Precedence. (line 97)
+* portability, POSIXLY_CORRECT environment variable: Options. (line 363)
+* portability, substr() function: String Functions. (line 513)
+* portable object files: Explaining gettext. (line 37)
+* portable object files <1>: Translator i18n. (line 6)
+* portable object files, converting to message object files: I18N Example.
+ (line 66)
+* portable object files, generating: Options. (line 147)
+* portable object template files: Explaining gettext. (line 31)
+* porting gawk: New Ports. (line 6)
+* positional specifiers, printf statement: Format Modifiers. (line 13)
+* positional specifiers, printf statement <1>: Printf Ordering.
+ (line 6)
+* positional specifiers, printf statement, mixing with regular formats: Printf Ordering.
+ (line 57)
+* POSIX awk: This Manual. (line 14)
+* POSIX awk <1>: Assignment Ops. (line 138)
+* POSIX awk, ** operator and: Precedence. (line 97)
+* POSIX awk, **= operator and: Assignment Ops. (line 144)
+* POSIX awk, < operator and: Getline/File. (line 26)
+* POSIX awk, arithmetic operators and: Arithmetic Ops. (line 30)
+* POSIX awk, backslashes in string constants: Escape Sequences.
+ (line 108)
+* POSIX awk, BEGIN/END patterns: I/O And BEGIN/END. (line 15)
+* POSIX awk, bracket expressions and: Bracket Expressions. (line 34)
+* POSIX awk, bracket expressions and, character classes: Bracket Expressions.
+ (line 40)
+* POSIX awk, bracket expressions and, character classes <1>: Bracket Expressions.
+ (line 108)
+* POSIX awk, break statement and: Break Statement. (line 51)
+* POSIX awk, changes in awk versions: POSIX. (line 6)
+* POSIX awk, continue statement and: Continue Statement. (line 44)
+* POSIX awk, CONVFMT variable and: User-modified. (line 30)
+* POSIX awk, date utility and: Time Functions. (line 253)
+* POSIX awk, field separators and: Full Line Fields. (line 16)
+* POSIX awk, function keyword in: Definition Syntax. (line 99)
+* POSIX awk, functions and, gsub()/sub(): Gory Details. (line 90)
+* POSIX awk, functions and, length(): String Functions. (line 179)
+* POSIX awk, GNU long options and: Options. (line 15)
+* POSIX awk, interval expressions in: Regexp Operators. (line 135)
+* POSIX awk, next/nextfile statements and: Next Statement. (line 44)
+* POSIX awk, numeric strings and: Variable Typing. (line 6)
+* POSIX awk, OFMT variable and: OFMT. (line 27)
+* POSIX awk, OFMT variable and <1>: Strings And Numbers. (line 56)
+* POSIX awk, period (.), using: Regexp Operators. (line 51)
+* POSIX awk, printf format strings and: Format Modifiers. (line 157)
+* POSIX awk, regular expressions and: Regexp Operators. (line 161)
+* POSIX awk, timestamps and: Time Functions. (line 6)
+* POSIX awk, | I/O operator and: Getline/Pipe. (line 56)
+* POSIX mode: Options. (line 257)
+* POSIX mode <1>: Options. (line 343)
+* POSIX, awk and: Preface. (line 21)
+* POSIX, gawk extensions not included in: POSIX/GNU. (line 6)
+* POSIX, programs, implementing in awk: Clones. (line 6)
+* POSIXLY_CORRECT environment variable: Options. (line 343)
+* PREC variable: User-modified. (line 124)
+* precedence: Increment Ops. (line 60)
+* precedence <1>: Precedence. (line 6)
+* precedence, regexp operators: Regexp Operators. (line 156)
+* predefined variables: Built-in Variables. (line 6)
+* predefined variables, -v option, setting with: Options. (line 41)
+* predefined variables, conveying information: Auto-set. (line 6)
+* predefined variables, user-modifiable: User-modified. (line 6)
+* print debugger command: Viewing And Changing Data.
+ (line 35)
+* print statement: Printing. (line 16)
+* print statement, BEGIN/END patterns and: I/O And BEGIN/END. (line 15)
+* print statement, commas, omitting: Print Examples. (line 30)
+* print statement, I/O operators in: Precedence. (line 70)
+* print statement, line continuations and: Print Examples. (line 75)
+* print statement, OFMT variable and: User-modified. (line 113)
+* print statement, See Also redirection, of output: Redirection.
+ (line 17)
+* print statement, sprintf() function and: Round Function. (line 6)
+* print variables, in debugger: Viewing And Changing Data.
+ (line 35)
+* printf debugger command: Viewing And Changing Data.
+ (line 53)
+* printf statement: Printing. (line 16)
+* printf statement <1>: Printf. (line 6)
+* printf statement, columns, aligning: Print Examples. (line 69)
+* printf statement, format-control characters: Control Letters.
+ (line 6)
+* printf statement, I/O operators in: Precedence. (line 70)
+* printf statement, modifiers: Format Modifiers. (line 6)
+* printf statement, positional specifiers: Format Modifiers. (line 13)
+* printf statement, positional specifiers <1>: Printf Ordering.
+ (line 6)
+* printf statement, positional specifiers, mixing with regular formats: Printf Ordering.
+ (line 57)
+* printf statement, See Also redirection, of output: Redirection.
+ (line 17)
+* printf statement, sprintf() function and: Round Function. (line 6)
+* printf statement, syntax of: Basic Printf. (line 6)
+* printing: Printing. (line 6)
+* printing messages from extensions: Printing Messages. (line 6)
+* printing, list of options: Options. (line 154)
+* printing, mailing labels: Labels Program. (line 6)
+* printing, unduplicated lines of text: Uniq Program. (line 6)
+* printing, user information: Id Program. (line 6)
+* private variables: Library Names. (line 11)
+* process group ID of gawk process: Auto-set. (line 204)
+* process ID of gawk process: Auto-set. (line 207)
+* processes, two-way communications with: Two-way I/O. (line 6)
+* processing data: Basic High Level. (line 6)
+* PROCINFO array: Auto-set. (line 148)
+* PROCINFO array <1>: Time Functions. (line 47)
+* PROCINFO array <2>: Passwd Functions. (line 6)
+* PROCINFO array, and communications via ptys: Two-way I/O. (line 114)
+* PROCINFO array, and group membership: Group Functions. (line 6)
+* PROCINFO array, and user and group ID numbers: Id Program. (line 15)
+* PROCINFO array, testing the field splitting: Passwd Functions.
+ (line 154)
+* PROCINFO, values of sorted_in: Controlling Scanning.
+ (line 26)
+* profiling awk programs: Profiling. (line 6)
+* profiling awk programs, dynamically: Profiling. (line 177)
+* program identifiers: Auto-set. (line 173)
+* program, definition of: Getting Started. (line 21)
+* programming conventions, --non-decimal-data option: Nondecimal Data.
+ (line 35)
+* programming conventions, ARGC/ARGV variables: Auto-set. (line 35)
+* programming conventions, exit statement: Exit Statement. (line 38)
+* programming conventions, function parameters: Return Statement.
+ (line 44)
+* programming conventions, functions, calling: Calling Built-in.
+ (line 10)
+* programming conventions, functions, writing: Definition Syntax.
+ (line 71)
+* programming conventions, gawk extensions: Internal File Ops.
+ (line 45)
+* programming conventions, private variable names: Library Names.
+ (line 23)
+* programming language, recipe for: History. (line 6)
+* programming languages, Ada: Glossary. (line 11)
+* programming languages, data-driven vs. procedural: Getting Started.
+ (line 12)
+* programming languages, Java: Glossary. (line 468)
+* programming, basic steps: Basic High Level. (line 18)
+* programming, concepts: Basic Concepts. (line 6)
+* programming, concepts <1>: Basic Concepts. (line 6)
+* pwcat program: Passwd Functions. (line 23)
+* q debugger command (alias for quit): Miscellaneous Debugger Commands.
+ (line 102)
+* QSE awk: Other Versions. (line 135)
+* Quanstrom, Erik: Alarm Program. (line 8)
+* question mark (?), ?: operator: Precedence. (line 91)
+* question mark (?), regexp operator: Regexp Operators. (line 111)
+* question mark (?), regexp operator <1>: GNU Regexp Operators.
+ (line 62)
+* QuikTrim Awk: Other Versions. (line 139)
+* quit debugger command: Miscellaneous Debugger Commands.
+ (line 102)
+* QUIT signal (MS-Windows): Profiling. (line 212)
+* quoting in gawk command lines: Long. (line 26)
+* quoting in gawk command lines, tricks for: Quoting. (line 91)
+* quoting, for small awk programs: Comments. (line 27)
+* r debugger command (alias for run): Debugger Execution Control.
+ (line 62)
+* Rakitzis, Byron: History Sorting. (line 25)
+* Ramey, Chet: Acknowledgments. (line 60)
+* Ramey, Chet <1>: General Data Types. (line 6)
+* rand: Numeric Functions. (line 49)
+* random numbers, Cliff: Cliff Random Function.
+ (line 6)
+* random numbers, rand()/srand() functions: Numeric Functions.
+ (line 49)
+* random numbers, seed of: Numeric Functions. (line 79)
+* range expressions (regexps): Bracket Expressions. (line 6)
+* range patterns: Ranges. (line 6)
+* range patterns, line continuation and: Ranges. (line 64)
+* Rankin, Pat: Acknowledgments. (line 60)
+* Rankin, Pat <1>: Assignment Ops. (line 99)
+* Rankin, Pat <2>: Contributors. (line 38)
+* reada() extension function: Extension Sample Read write array.
+ (line 18)
+* readable data files, checking: File Checking. (line 6)
+* readable.awk program: File Checking. (line 11)
+* readdir extension: Extension Sample Readdir.
+ (line 9)
+* readfile() extension function: Extension Sample Readfile.
+ (line 12)
+* readfile() user-defined function: Readfile Function. (line 30)
+* reading input files: Reading Files. (line 6)
+* recipe for a programming language: History. (line 6)
+* record separators: awk split records. (line 6)
+* record separators <1>: User-modified. (line 133)
+* record separators, changing: awk split records. (line 85)
+* record separators, regular expressions as: awk split records.
+ (line 124)
+* record separators, with multiline records: Multiple Line. (line 10)
+* records: Reading Files. (line 14)
+* records <1>: Basic High Level. (line 62)
+* records, multiline: Multiple Line. (line 6)
+* records, printing: Print. (line 22)
+* records, splitting input into: Records. (line 6)
+* records, terminating: awk split records. (line 124)
+* records, treating files as: gawk split records. (line 92)
+* recursive functions: Definition Syntax. (line 89)
+* redirect gawk output, in debugger: Debugger Info. (line 73)
+* redirection of input: Getline/File. (line 6)
+* redirection of output: Redirection. (line 6)
+* redirection on VMS: VMS Running. (line 64)
+* reference counting, sorting arrays: Array Sorting Functions.
+ (line 77)
+* regexp: Regexp. (line 6)
+* regexp constants: Regexp Usage. (line 57)
+* regexp constants <1>: Regexp Constants. (line 6)
+* regexp constants <2>: Comparison Operators.
+ (line 103)
+* regexp constants, /=.../, /= operator and: Assignment Ops. (line 149)
+* regexp constants, as patterns: Expression Patterns. (line 34)
+* regexp constants, in gawk: Using Constant Regexps.
+ (line 28)
+* regexp constants, slashes vs. quotes: Computed Regexps. (line 30)
+* regexp constants, vs. string constants: Computed Regexps. (line 40)
+* register extension: Registration Functions.
+ (line 6)
+* regular expressions: Regexp. (line 6)
+* regular expressions as field separators: Field Separators. (line 50)
+* regular expressions, anchors in: Regexp Operators. (line 22)
+* regular expressions, as field separators: Regexp Field Splitting.
+ (line 6)
+* regular expressions, as patterns: Regexp Usage. (line 6)
+* regular expressions, as patterns <1>: Regexp Patterns. (line 6)
+* regular expressions, as record separators: awk split records.
+ (line 124)
+* regular expressions, case sensitivity: Case-sensitivity. (line 6)
+* regular expressions, case sensitivity <1>: User-modified. (line 76)
+* regular expressions, computed: Computed Regexps. (line 6)
+* regular expressions, constants, See regexp constants: Regexp Usage.
+ (line 57)
+* regular expressions, dynamic: Computed Regexps. (line 6)
+* regular expressions, dynamic, with embedded newlines: Computed Regexps.
+ (line 60)
+* regular expressions, gawk, command-line options: GNU Regexp Operators.
+ (line 73)
+* regular expressions, interval expressions and: Options. (line 278)
+* regular expressions, leftmost longest match: Leftmost Longest.
+ (line 6)
+* regular expressions, operators: Regexp Usage. (line 19)
+* regular expressions, operators <1>: Regexp Operators. (line 6)
+* regular expressions, operators, for buffers: GNU Regexp Operators.
+ (line 51)
+* regular expressions, operators, for words: GNU Regexp Operators.
+ (line 6)
+* regular expressions, operators, gawk: GNU Regexp Operators.
+ (line 6)
+* regular expressions, operators, precedence of: Regexp Operators.
+ (line 156)
+* regular expressions, searching for: Egrep Program. (line 6)
+* relational operators, See comparison operators: Typing and Comparison.
+ (line 9)
+* replace in string: String Functions. (line 409)
+* retrying input: Retrying Input. (line 6)
+* return debugger command: Debugger Execution Control.
+ (line 54)
+* return statement, user-defined functions: Return Statement. (line 6)
+* return value, close() function: Close Files And Pipes.
+ (line 132)
+* rev() user-defined function: Function Example. (line 54)
+* revoutput extension: Extension Sample Revout.
+ (line 11)
+* revtwoway extension: Extension Sample Rev2way.
+ (line 12)
+* rewind() user-defined function: Rewind Function. (line 15)
+* right angle bracket (>), > operator: Comparison Operators.
+ (line 11)
+* right angle bracket (>), > operator <1>: Precedence. (line 64)
+* right angle bracket (>), > operator (I/O): Redirection. (line 22)
+* right angle bracket (>), >= operator: Comparison Operators.
+ (line 11)
+* right angle bracket (>), >= operator <1>: Precedence. (line 64)
+* right angle bracket (>), >> operator (I/O): Redirection. (line 50)
+* right angle bracket (>), >> operator (I/O) <1>: Precedence. (line 64)
+* right shift: Bitwise Functions. (line 54)
+* right shift, bitwise: Bitwise Functions. (line 32)
+* Ritchie, Dennis: Basic Data Typing. (line 54)
+* RLENGTH variable: Auto-set. (line 282)
+* RLENGTH variable, match() function and: String Functions. (line 227)
+* Robbins, Arnold: Command Line Field Separator.
+ (line 71)
+* Robbins, Arnold <1>: Getline/Pipe. (line 40)
+* Robbins, Arnold <2>: Passwd Functions. (line 90)
+* Robbins, Arnold <3>: Alarm Program. (line 6)
+* Robbins, Arnold <4>: General Data Types. (line 6)
+* Robbins, Arnold <5>: Contributors. (line 145)
+* Robbins, Arnold <6>: Maintainers. (line 14)
+* Robbins, Arnold <7>: Future Extensions. (line 6)
+* Robbins, Bill: Getline/Pipe. (line 40)
+* Robbins, Harry: Acknowledgments. (line 94)
+* Robbins, Jean: Acknowledgments. (line 94)
+* Robbins, Miriam: Acknowledgments. (line 94)
+* Robbins, Miriam <1>: Getline/Pipe. (line 40)
+* Robbins, Miriam <2>: Passwd Functions. (line 90)
+* Rommel, Kai Uwe: Contributors. (line 43)
+* round to nearest integer: Numeric Functions. (line 24)
+* round() user-defined function: Round Function. (line 16)
+* rounding numbers: Round Function. (line 6)
+* ROUNDMODE variable: User-modified. (line 128)
+* RS variable: awk split records. (line 12)
+* RS variable <1>: User-modified. (line 133)
+* RS variable, multiline records and: Multiple Line. (line 17)
+* rshift: Bitwise Functions. (line 54)
+* RSTART variable: Auto-set. (line 288)
+* RSTART variable, match() function and: String Functions. (line 227)
+* RT variable: awk split records. (line 124)
+* RT variable <1>: Multiple Line. (line 130)
+* RT variable <2>: Auto-set. (line 295)
+* Rubin, Paul: History. (line 30)
+* Rubin, Paul <1>: Contributors. (line 16)
+* rule, definition of: Getting Started. (line 21)
+* run debugger command: Debugger Execution Control.
+ (line 62)
+* rvalues/lvalues: Assignment Ops. (line 31)
+* s debugger command (alias for step): Debugger Execution Control.
+ (line 68)
+* sample debugging session: Sample Debugging Session.
+ (line 6)
+* sandbox mode: Options. (line 290)
+* save debugger options: Debugger Info. (line 85)
+* scalar or array: Type Functions. (line 11)
+* scalar values: Basic Data Typing. (line 13)
+* scanning arrays: Scanning an Array. (line 6)
+* scanning multidimensional arrays: Multiscanning. (line 11)
+* Schorr, Andrew: Acknowledgments. (line 60)
+* Schorr, Andrew <1>: Auto-set. (line 327)
+* Schorr, Andrew <2>: Contributors. (line 134)
+* Schreiber, Bert: Acknowledgments. (line 38)
+* Schreiber, Rita: Acknowledgments. (line 38)
+* search and replace in strings: String Functions. (line 89)
+* search in string: String Functions. (line 155)
+* search paths: Programs Exercises. (line 70)
+* search paths <1>: PC Using. (line 9)
+* search paths <2>: VMS Running. (line 57)
+* search paths, for loadable extensions: AWKLIBPATH Variable. (line 6)
+* search paths, for source files: AWKPATH Variable. (line 6)
+* search paths, for source files <1>: Programs Exercises. (line 70)
+* search paths, for source files <2>: PC Using. (line 9)
+* search paths, for source files <3>: VMS Running. (line 57)
+* searching, files for regular expressions: Egrep Program. (line 6)
+* searching, for words: Dupword Program. (line 6)
+* sed utility: Full Line Fields. (line 22)
+* sed utility <1>: Simple Sed. (line 6)
+* sed utility <2>: Glossary. (line 16)
+* seeding random number generator: Numeric Functions. (line 79)
+* semicolon (;), AWKPATH variable and: PC Using. (line 9)
+* semicolon (;), separating statements in actions: Statements/Lines.
+ (line 90)
+* semicolon (;), separating statements in actions <1>: Action Overview.
+ (line 19)
+* semicolon (;), separating statements in actions <2>: Statements.
+ (line 10)
+* separators, field: User-modified. (line 50)
+* separators, field <1>: User-modified. (line 113)
+* separators, field, FIELDWIDTHS variable and: User-modified. (line 37)
+* separators, field, FPAT variable and: User-modified. (line 43)
+* separators, for records: awk split records. (line 6)
+* separators, for records <1>: awk split records. (line 85)
+* separators, for records <2>: User-modified. (line 133)
+* separators, for records, regular expressions as: awk split records.
+ (line 124)
+* separators, for statements in actions: Action Overview. (line 19)
+* separators, subscript: User-modified. (line 146)
+* set breakpoint: Breakpoint Control. (line 11)
+* set debugger command: Viewing And Changing Data.
+ (line 58)
+* set directory of message catalogs: I18N Functions. (line 11)
+* set watchpoint: Viewing And Changing Data.
+ (line 66)
+* shadowing of variable values: Definition Syntax. (line 77)
+* shell quoting, rules for: Quoting. (line 6)
+* shells, piping commands into: Redirection. (line 136)
+* shells, quoting: Using Shell Variables.
+ (line 12)
+* shells, quoting, rules for: Quoting. (line 18)
+* shells, scripts: One-shot. (line 22)
+* shells, sea: Undocumented. (line 9)
+* shells, variables: Using Shell Variables.
+ (line 6)
+* shift, bitwise: Bitwise Functions. (line 32)
+* short-circuit operators: Boolean Ops. (line 59)
+* show all source files, in debugger: Debugger Info. (line 45)
+* show breakpoints: Debugger Info. (line 21)
+* show function arguments, in debugger: Debugger Info. (line 18)
+* show local variables, in debugger: Debugger Info. (line 34)
+* show name of current source file, in debugger: Debugger Info.
+ (line 37)
+* show watchpoints: Debugger Info. (line 51)
+* si debugger command (alias for stepi): Debugger Execution Control.
+ (line 75)
+* side effects: Concatenation. (line 41)
+* side effects <1>: Increment Ops. (line 11)
+* side effects <2>: Increment Ops. (line 75)
+* side effects, array indexing: Reference to Elements.
+ (line 43)
+* side effects, asort() function: Array Sorting Functions.
+ (line 24)
+* side effects, assignment expressions: Assignment Ops. (line 22)
+* side effects, Boolean operators: Boolean Ops. (line 30)
+* side effects, conditional expressions: Conditional Exp. (line 22)
+* side effects, decrement/increment operators: Increment Ops. (line 11)
+* side effects, FILENAME variable: Getline Notes. (line 19)
+* side effects, function calls: Function Calls. (line 57)
+* side effects, statements: Action Overview. (line 32)
+* sidebar, A Constant's Base Does Not Affect Its Value: Nondecimal-numbers.
+ (line 63)
+* sidebar, Backslash Before Regular Characters: Escape Sequences.
+ (line 106)
+* sidebar, Beware The Smoke and Mirrors!: Bitwise Functions. (line 126)
+* sidebar, Changing FS Does Not Affect the Fields: Full Line Fields.
+ (line 14)
+* sidebar, Changing NR and FNR: Auto-set. (line 355)
+* sidebar, Controlling Output Buffering with system(): I/O Functions.
+ (line 164)
+* sidebar, Escape Sequences for Metacharacters: Escape Sequences.
+ (line 138)
+* sidebar, FS and IGNORECASE: Field Splitting Summary.
+ (line 37)
+* sidebar, Interactive Versus Noninteractive Buffering: I/O Functions.
+ (line 74)
+* sidebar, Matching the Null String: String Functions. (line 535)
+* sidebar, Operator Evaluation Order: Increment Ops. (line 58)
+* sidebar, Piping into sh: Redirection. (line 134)
+* sidebar, Pre-POSIX awk Used OFMT for String Conversion: Strings And Numbers.
+ (line 54)
+* sidebar, Recipe for a Programming Language: History. (line 6)
+* sidebar, RS = "\0" Is Not Portable: gawk split records. (line 63)
+* sidebar, So Why Does gawk Have BEGINFILE and ENDFILE?: Filetrans Function.
+ (line 83)
+* sidebar, Syntactic Ambiguities Between /= and Regular Expressions: Assignment Ops.
+ (line 147)
+* sidebar, Understanding #!: Executable Scripts. (line 31)
+* sidebar, Understanding $0: Changing Fields. (line 134)
+* sidebar, Using close()'s Return Value: Close Files And Pipes.
+ (line 130)
+* sidebar, Using \n in Bracket Expressions of Dynamic Regexps: Computed Regexps.
+ (line 58)
+* SIGHUP signal, for dynamic profiling: Profiling. (line 209)
+* SIGINT signal (MS-Windows): Profiling. (line 212)
+* signals, HUP/SIGHUP, for profiling: Profiling. (line 209)
+* signals, INT/SIGINT (MS-Windows): Profiling. (line 212)
+* signals, QUIT/SIGQUIT (MS-Windows): Profiling. (line 212)
+* signals, USR1/SIGUSR1, for profiling: Profiling. (line 186)
+* signature program: Signature Program. (line 6)
+* SIGQUIT signal (MS-Windows): Profiling. (line 212)
+* SIGUSR1 signal, for dynamic profiling: Profiling. (line 186)
+* silent debugger command: Debugger Execution Control.
+ (line 10)
+* sin: Numeric Functions. (line 90)
+* sine: Numeric Functions. (line 90)
+* single quote ('): One-shot. (line 15)
+* single quote (') in gawk command lines: Long. (line 35)
+* single quote ('), in shell commands: Quoting. (line 48)
+* single quote ('), vs. apostrophe: Comments. (line 27)
+* single quote ('), with double quotes: Quoting. (line 73)
+* single-character fields: Single Character Fields.
+ (line 6)
+* single-step execution, in the debugger: Debugger Execution Control.
+ (line 43)
+* Skywalker, Luke: Undocumented. (line 6)
+* sleep utility: Alarm Program. (line 109)
+* sleep() extension function: Extension Sample Time.
+ (line 22)
+* Solaris, POSIX-compliant awk: Other Versions. (line 100)
+* sort array: String Functions. (line 42)
+* sort array indices: String Functions. (line 42)
+* sort function, arrays, sorting: Array Sorting Functions.
+ (line 6)
+* sort utility: Word Sorting. (line 50)
+* sort utility, coprocesses and: Two-way I/O. (line 66)
+* sorting characters in different languages: Explaining gettext.
+ (line 94)
+* source code, awka: Other Versions. (line 68)
+* source code, Brian Kernighan's awk: Other Versions. (line 13)
+* source code, BusyBox Awk: Other Versions. (line 92)
+* source code, gawk: Gawk Distribution. (line 6)
+* source code, Illumos awk: Other Versions. (line 109)
+* source code, jawk: Other Versions. (line 117)
+* source code, libmawk: Other Versions. (line 125)
+* source code, mawk: Other Versions. (line 48)
+* source code, mixing: Options. (line 117)
+* source code, pawk: Other Versions. (line 82)
+* source code, pawk (Python version): Other Versions. (line 129)
+* source code, QSE awk: Other Versions. (line 135)
+* source code, QuikTrim Awk: Other Versions. (line 139)
+* source code, Solaris awk: Other Versions. (line 100)
+* source files, search path for: Programs Exercises. (line 70)
+* sparse arrays: Array Intro. (line 76)
+* Spencer, Henry: Glossary. (line 16)
+* split: String Functions. (line 315)
+* split string into array: String Functions. (line 296)
+* split utility: Split Program. (line 6)
+* split() function, array elements, deleting: Delete. (line 61)
+* split.awk program: Split Program. (line 30)
+* sprintf: OFMT. (line 15)
+* sprintf <1>: String Functions. (line 384)
+* sprintf() function, OFMT variable and: User-modified. (line 113)
+* sprintf() function, print/printf statements and: Round Function.
+ (line 6)
+* sqrt: Numeric Functions. (line 93)
+* square brackets ([]), regexp operator: Regexp Operators. (line 56)
+* square root: Numeric Functions. (line 93)
+* srand: Numeric Functions. (line 97)
+* stack frame: Debugging Terms. (line 10)
+* Stallman, Richard: Manual History. (line 6)
+* Stallman, Richard <1>: Acknowledgments. (line 18)
+* Stallman, Richard <2>: Contributors. (line 24)
+* Stallman, Richard <3>: Glossary. (line 372)
+* standard error: Special FD. (line 6)
+* standard input: Read Terminal. (line 6)
+* standard input <1>: Special FD. (line 6)
+* standard output: Special FD. (line 6)
+* starting the debugger: Debugger Invocation. (line 6)
+* stat() extension function: Extension Sample File Functions.
+ (line 18)
+* statements, compound, control statements and: Statements. (line 10)
+* statements, control, in actions: Statements. (line 6)
+* statements, multiple: Statements/Lines. (line 90)
+* step debugger command: Debugger Execution Control.
+ (line 68)
+* stepi debugger command: Debugger Execution Control.
+ (line 75)
+* stop automatic display, in debugger: Viewing And Changing Data.
+ (line 79)
+* stream editors: Full Line Fields. (line 22)
+* stream editors <1>: Simple Sed. (line 6)
+* strftime: Time Functions. (line 48)
+* string constants: Scalar Constants. (line 15)
+* string constants, vs. regexp constants: Computed Regexps. (line 40)
+* string extraction (internationalization): String Extraction.
+ (line 6)
+* string length: String Functions. (line 170)
+* string operators: Concatenation. (line 9)
+* string, regular expression match: String Functions. (line 210)
+* string-manipulation functions: String Functions. (line 6)
+* string-matching operators: Regexp Usage. (line 19)
+* string-translation functions: I18N Functions. (line 6)
+* strings splitting, example: String Functions. (line 334)
+* strings, converting: Strings And Numbers. (line 6)
+* strings, converting <1>: Bitwise Functions. (line 108)
+* strings, converting letter case: String Functions. (line 523)
+* strings, converting, numbers to: User-modified. (line 30)
+* strings, converting, numbers to <1>: User-modified. (line 104)
+* strings, empty, See null strings: awk split records. (line 114)
+* strings, extracting: String Extraction. (line 6)
+* strings, for localization: Programmer i18n. (line 13)
+* strings, length limitations: Scalar Constants. (line 20)
+* strings, merging arrays into: Join Function. (line 6)
+* strings, null: Regexp Field Splitting.
+ (line 43)
+* strings, numeric: Variable Typing. (line 6)
+* strtonum: String Functions. (line 391)
+* strtonum() function (gawk), --non-decimal-data option and: Nondecimal Data.
+ (line 35)
+* sub: Using Constant Regexps.
+ (line 43)
+* sub <1>: String Functions. (line 409)
+* sub() function, arguments of: String Functions. (line 463)
+* sub() function, escape processing: Gory Details. (line 6)
+* subscript separators: User-modified. (line 146)
+* subscripts in arrays, multidimensional: Multidimensional. (line 10)
+* subscripts in arrays, multidimensional, scanning: Multiscanning.
+ (line 11)
+* subscripts in arrays, numbers as: Numeric Array Subscripts.
+ (line 6)
+* subscripts in arrays, uninitialized variables as: Uninitialized Subscripts.
+ (line 6)
+* SUBSEP variable: User-modified. (line 146)
+* SUBSEP variable, and multidimensional arrays: Multidimensional.
+ (line 16)
+* substitute in string: String Functions. (line 89)
+* substr: String Functions. (line 482)
+* substring: String Functions. (line 482)
+* Sumner, Andrew: Other Versions. (line 68)
+* supplementary groups of gawk process: Auto-set. (line 251)
+* switch statement: Switch Statement. (line 6)
+* SYMTAB array: Auto-set. (line 299)
+* syntactic ambiguity: /= operator vs. /=.../ regexp constant: Assignment Ops.
+ (line 149)
+* system: I/O Functions. (line 107)
+* systime: Time Functions. (line 66)
+* t debugger command (alias for tbreak): Breakpoint Control. (line 90)
+* tbreak debugger command: Breakpoint Control. (line 90)
+* Tcl: Library Names. (line 58)
+* TCP/IP: TCP/IP Networking. (line 6)
+* TCP/IP, support for: Special Network. (line 6)
+* tee utility: Tee Program. (line 6)
+* tee.awk program: Tee Program. (line 26)
+* temporary breakpoint: Breakpoint Control. (line 90)
+* terminating records: awk split records. (line 124)
+* testbits.awk program: Bitwise Functions. (line 69)
+* testext extension: Extension Sample API Tests.
+ (line 6)
+* Texinfo: Conventions. (line 6)
+* Texinfo <1>: Library Functions. (line 33)
+* Texinfo <2>: Dupword Program. (line 17)
+* Texinfo <3>: Extract Program. (line 12)
+* Texinfo <4>: Distribution contents.
+ (line 77)
+* Texinfo <5>: Adding Code. (line 100)
+* Texinfo, chapter beginnings in files: Regexp Operators. (line 22)
+* Texinfo, extracting programs from source files: Extract Program.
+ (line 6)
+* text, printing: Print. (line 22)
+* text, printing, unduplicated lines of: Uniq Program. (line 6)
+* TEXTDOMAIN variable: User-modified. (line 152)
+* TEXTDOMAIN variable <1>: Programmer i18n. (line 8)
+* TEXTDOMAIN variable, BEGIN pattern and: Programmer i18n. (line 60)
+* TEXTDOMAIN variable, portability and: I18N Portability. (line 20)
+* textdomain() function (C library): Explaining gettext. (line 28)
+* tilde (~), ~ operator: Regexp Usage. (line 19)
+* tilde (~), ~ operator <1>: Computed Regexps. (line 6)
+* tilde (~), ~ operator <2>: Case-sensitivity. (line 26)
+* tilde (~), ~ operator <3>: Regexp Constants. (line 6)
+* tilde (~), ~ operator <4>: Comparison Operators.
+ (line 11)
+* tilde (~), ~ operator <5>: Comparison Operators.
+ (line 98)
+* tilde (~), ~ operator <6>: Precedence. (line 79)
+* tilde (~), ~ operator <7>: Expression Patterns. (line 24)
+* time functions: Time Functions. (line 6)
+* time, alarm clock example program: Alarm Program. (line 11)
+* time, localization and: Explaining gettext. (line 112)
+* time, managing: Getlocaltime Function.
+ (line 6)
+* time, retrieving: Time Functions. (line 17)
+* timeout, reading input: Read Timeout. (line 6)
+* timestamps: Time Functions. (line 6)
+* timestamps <1>: Time Functions. (line 66)
+* timestamps, converting dates to: Time Functions. (line 76)
+* timestamps, formatted: Getlocaltime Function.
+ (line 6)
+* tolower: String Functions. (line 524)
+* toupper: String Functions. (line 530)
+* tr utility: Translate Program. (line 6)
+* trace debugger command: Miscellaneous Debugger Commands.
+ (line 110)
+* traceback, display in debugger: Execution Stack. (line 13)
+* translate string: I18N Functions. (line 21)
+* translate.awk program: Translate Program. (line 55)
+* treating files, as single records: gawk split records. (line 92)
+* troubleshooting, --non-decimal-data option: Options. (line 209)
+* troubleshooting, == operator: Comparison Operators.
+ (line 37)
+* troubleshooting, awk uses FS not IFS: Field Separators. (line 29)
+* troubleshooting, backslash before nonspecial character: Escape Sequences.
+ (line 108)
+* troubleshooting, division: Arithmetic Ops. (line 44)
+* troubleshooting, fatal errors, field widths, specifying: Constant Size.
+ (line 22)
+* troubleshooting, fatal errors, printf format strings: Format Modifiers.
+ (line 157)
+* troubleshooting, fflush() function: I/O Functions. (line 63)
+* troubleshooting, function call syntax: Function Calls. (line 30)
+* troubleshooting, gawk: Compatibility Mode. (line 6)
+* troubleshooting, gawk, bug reports: Bugs. (line 9)
+* troubleshooting, gawk, fatal errors, function arguments: Calling Built-in.
+ (line 16)
+* troubleshooting, getline function: File Checking. (line 25)
+* troubleshooting, gsub()/sub() functions: String Functions. (line 473)
+* troubleshooting, match() function: String Functions. (line 291)
+* troubleshooting, print statement, omitting commas: Print Examples.
+ (line 30)
+* troubleshooting, printing: Redirection. (line 112)
+* troubleshooting, quotes with file names: Special FD. (line 62)
+* troubleshooting, readable data files: File Checking. (line 6)
+* troubleshooting, regexp constants vs. string constants: Computed Regexps.
+ (line 40)
+* troubleshooting, string concatenation: Concatenation. (line 27)
+* troubleshooting, substr() function: String Functions. (line 500)
+* troubleshooting, system() function: I/O Functions. (line 129)
+* troubleshooting, typographical errors, global variables: Options.
+ (line 99)
+* true, logical: Truth Values. (line 6)
+* Trueman, David: History. (line 30)
+* Trueman, David <1>: Acknowledgments. (line 47)
+* Trueman, David <2>: Contributors. (line 31)
+* trunc-mod operation: Arithmetic Ops. (line 66)
+* truth values: Truth Values. (line 6)
+* type conversion: Strings And Numbers. (line 21)
+* type, of variable: Type Functions. (line 14)
+* typeof: Type Functions. (line 14)
+* u debugger command (alias for until): Debugger Execution Control.
+ (line 82)
+* unassigned array elements: Reference to Elements.
+ (line 18)
+* undefined functions: Pass By Value/Reference.
+ (line 68)
+* underscore (_), C macro: Explaining gettext. (line 71)
+* underscore (_), in names of private variables: Library Names.
+ (line 29)
+* underscore (_), translatable string: Programmer i18n. (line 69)
+* undisplay debugger command: Viewing And Changing Data.
+ (line 79)
+* undocumented features: Undocumented. (line 6)
+* Unicode: Ordinal Functions. (line 45)
+* Unicode <1>: Ranges and Locales. (line 61)
+* Unicode <2>: Glossary. (line 196)
+* uninitialized variables, as array subscripts: Uninitialized Subscripts.
+ (line 6)
+* uniq utility: Uniq Program. (line 6)
+* uniq.awk program: Uniq Program. (line 65)
+* Unix: Glossary. (line 748)
+* Unix awk, backslashes in escape sequences: Escape Sequences.
+ (line 121)
+* Unix awk, close() function and: Close Files And Pipes.
+ (line 132)
+* Unix awk, password files, field separators and: Command Line Field Separator.
+ (line 62)
+* Unix, awk scripts and: Executable Scripts. (line 6)
+* unsigned integers: Computer Arithmetic. (line 41)
+* until debugger command: Debugger Execution Control.
+ (line 82)
+* unwatch debugger command: Viewing And Changing Data.
+ (line 83)
+* up debugger command: Execution Stack. (line 36)
+* user database, reading: Passwd Functions. (line 6)
+* user-defined functions: User-defined. (line 6)
+* user-defined, functions, counts, in a profile: Profiling. (line 137)
+* user-defined, variables: Variables. (line 6)
+* user-modifiable variables: User-modified. (line 6)
+* users, information about, printing: Id Program. (line 6)
+* users, information about, retrieving: Passwd Functions. (line 16)
+* USR1 signal, for dynamic profiling: Profiling. (line 186)
+* values, numeric: Basic Data Typing. (line 13)
+* values, string: Basic Data Typing. (line 13)
+* variable assignments and input files: Other Arguments. (line 26)
+* variable type: Type Functions. (line 14)
+* variable typing: Typing and Comparison.
+ (line 9)
+* variables: Other Features. (line 6)
+* variables <1>: Basic Data Typing. (line 6)
+* variables, assigning on command line: Assignment Options. (line 6)
+* variables, built-in: Using Variables. (line 23)
+* variables, flag: Boolean Ops. (line 69)
+* variables, getline command into, using: Getline/Variable. (line 6)
+* variables, getline command into, using <1>: Getline/Variable/File.
+ (line 6)
+* variables, getline command into, using <2>: Getline/Variable/Pipe.
+ (line 6)
+* variables, getline command into, using <3>: Getline/Variable/Coprocess.
+ (line 6)
+* variables, global, for library functions: Library Names. (line 11)
+* variables, global, printing list of: Options. (line 94)
+* variables, initializing: Using Variables. (line 23)
+* variables, local to a function: Variable Scope. (line 6)
+* variables, predefined: Built-in Variables. (line 6)
+* variables, predefined -v option, setting with: Options. (line 41)
+* variables, predefined conveying information: Auto-set. (line 6)
+* variables, private: Library Names. (line 11)
+* variables, setting: Options. (line 32)
+* variables, shadowing: Definition Syntax. (line 77)
+* variables, types of: Assignment Ops. (line 39)
+* variables, types of, comparison expressions and: Typing and Comparison.
+ (line 9)
+* variables, uninitialized, as array subscripts: Uninitialized Subscripts.
+ (line 6)
+* variables, user-defined: Variables. (line 6)
+* version of gawk: Auto-set. (line 221)
+* version of gawk extension API: Auto-set. (line 246)
+* version of GNU MP library: Auto-set. (line 229)
+* version of GNU MPFR library: Auto-set. (line 231)
+* vertical bar (|): Regexp Operators. (line 70)
+* vertical bar (|), | operator (I/O): Getline/Pipe. (line 10)
+* vertical bar (|), | operator (I/O) <1>: Precedence. (line 64)
+* vertical bar (|), |& operator (I/O): Getline/Coprocess. (line 6)
+* vertical bar (|), |& operator (I/O) <1>: Precedence. (line 64)
+* vertical bar (|), |& operator (I/O) <2>: Two-way I/O. (line 27)
+* vertical bar (|), || operator: Boolean Ops. (line 59)
+* vertical bar (|), || operator <1>: Precedence. (line 88)
+* Vinschen, Corinna: Acknowledgments. (line 60)
+* w debugger command (alias for watch): Viewing And Changing Data.
+ (line 66)
+* w utility: Constant Size. (line 22)
+* wait() extension function: Extension Sample Fork.
+ (line 22)
+* waitpid() extension function: Extension Sample Fork.
+ (line 18)
+* walk_array() user-defined function: Walking Arrays. (line 14)
+* Wall, Larry: Array Intro. (line 6)
+* Wall, Larry <1>: Future Extensions. (line 6)
+* Wallin, Anders: Contributors. (line 104)
+* warnings, issuing: Options. (line 184)
+* watch debugger command: Viewing And Changing Data.
+ (line 66)
+* watchpoint: Debugging Terms. (line 42)
+* wc utility: Wc Program. (line 6)
+* wc.awk program: Wc Program. (line 46)
+* Weinberger, Peter: History. (line 17)
+* Weinberger, Peter <1>: Contributors. (line 12)
+* where debugger command: Execution Stack. (line 13)
+* where debugger command (alias for backtrace): Execution Stack.
+ (line 13)
+* while statement: While Statement. (line 6)
+* while statement, use of regexps in: Regexp Usage. (line 19)
+* whitespace, as field separators: Default Field Splitting.
+ (line 6)
+* whitespace, functions, calling: Calling Built-in. (line 10)
+* whitespace, newlines as: Options. (line 263)
+* Williams, Kent: Contributors. (line 35)
+* Woehlke, Matthew: Contributors. (line 80)
+* Woods, John: Contributors. (line 28)
+* word boundaries, matching: GNU Regexp Operators.
+ (line 41)
+* word, regexp definition of: GNU Regexp Operators.
+ (line 6)
+* word-boundary operator (gawk): GNU Regexp Operators.
+ (line 66)
+* wordfreq.awk program: Word Sorting. (line 56)
+* words, counting: Wc Program. (line 6)
+* words, duplicate, searching for: Dupword Program. (line 6)
+* words, usage counts, generating: Word Sorting. (line 6)
+* writea() extension function: Extension Sample Read write array.
+ (line 12)
+* xgettext utility: String Extraction. (line 13)
+* xor: Bitwise Functions. (line 57)
+* XOR bitwise operation: Bitwise Functions. (line 6)
+* Yawitz, Efraim: Contributors. (line 132)
+* Zaretskii, Eli: Acknowledgments. (line 60)
+* Zaretskii, Eli <1>: Contributors. (line 56)
+* Zaretskii, Eli <2>: Maintainers. (line 14)
+* zerofile.awk program: Empty Files. (line 20)
+* Zoulas, Christos: Contributors. (line 67)
+
+
+
+Tag Table:
+Node: Top1200
+Node: Foreword342530
+Node: Foreword446972
+Node: Preface48504
+Ref: Preface-Footnote-151363
+Ref: Preface-Footnote-251470
+Ref: Preface-Footnote-351704
+Node: History51846
+Node: Names54198
+Ref: Names-Footnote-155292
+Node: This Manual55439
+Ref: This Manual-Footnote-161924
+Node: Conventions62024
+Node: Manual History64378
+Ref: Manual History-Footnote-167373
+Ref: Manual History-Footnote-267414
+Node: How To Contribute67488
+Node: Acknowledgments68617
+Node: Getting Started73503
+Node: Running gawk75942
+Node: One-shot77132
+Node: Read Terminal78395
+Node: Long80388
+Node: Executable Scripts81901
+Ref: Executable Scripts-Footnote-184696
+Node: Comments84799
+Node: Quoting87283
+Node: DOS Quoting92800
+Node: Sample Data Files93475
+Node: Very Simple96070
+Node: Two Rules100972
+Node: More Complex102857
+Node: Statements/Lines105723
+Ref: Statements/Lines-Footnote-1110182
+Node: Other Features110447
+Node: When111383
+Ref: When-Footnote-1113137
+Node: Intro Summary113202
+Node: Invoking Gawk114086
+Node: Command Line115600
+Node: Options116398
+Ref: Options-Footnote-1132497
+Ref: Options-Footnote-2132727
+Node: Other Arguments132752
+Node: Naming Standard Input135699
+Node: Environment Variables136792
+Node: AWKPATH Variable137350
+Ref: AWKPATH Variable-Footnote-1140761
+Ref: AWKPATH Variable-Footnote-2140795
+Node: AWKLIBPATH Variable141056
+Node: Other Environment Variables142313
+Node: Exit Status146134
+Node: Include Files146811
+Node: Loading Shared Libraries150406
+Node: Obsolete151834
+Node: Undocumented152526
+Node: Invoking Summary152823
+Node: Regexp154483
+Node: Regexp Usage156002
+Node: Escape Sequences158039
+Node: Regexp Operators164271
+Ref: Regexp Operators-Footnote-1171687
+Ref: Regexp Operators-Footnote-2171834
+Node: Bracket Expressions171932
+Ref: table-char-classes174408
+Node: Leftmost Longest177545
+Node: Computed Regexps178848
+Node: GNU Regexp Operators182275
+Node: Case-sensitivity185954
+Ref: Case-sensitivity-Footnote-1188850
+Ref: Case-sensitivity-Footnote-2189085
+Node: Strong Regexp Constants189193
+Node: Regexp Summary189982
+Node: Reading Files191457
+Node: Records193620
+Node: awk split records194353
+Node: gawk split records199284
+Ref: gawk split records-Footnote-1203824
+Node: Fields203861
+Node: Nonconstant Fields206602
+Ref: Nonconstant Fields-Footnote-1208838
+Node: Changing Fields209042
+Node: Field Separators214970
+Node: Default Field Splitting217668
+Node: Regexp Field Splitting218786
+Node: Single Character Fields222139
+Node: Command Line Field Separator223199
+Node: Full Line Fields226417
+Ref: Full Line Fields-Footnote-1227939
+Ref: Full Line Fields-Footnote-2227985
+Node: Field Splitting Summary228086
+Node: Constant Size230160
+Node: Splitting By Content234738
+Ref: Splitting By Content-Footnote-1238709
+Node: Multiple Line238872
+Ref: Multiple Line-Footnote-1244754
+Node: Getline244933
+Node: Plain Getline247400
+Node: Getline/Variable250039
+Node: Getline/File251188
+Node: Getline/Variable/File252574
+Ref: Getline/Variable/File-Footnote-1254177
+Node: Getline/Pipe254265
+Node: Getline/Variable/Pipe256970
+Node: Getline/Coprocess258103
+Node: Getline/Variable/Coprocess259368
+Node: Getline Notes260108
+Node: Getline Summary262903
+Ref: table-getline-variants263325
+Node: Read Timeout264073
+Ref: Read Timeout-Footnote-1267979
+Node: Retrying Input268037
+Node: Command-line directories269236
+Node: Input Summary270142
+Node: Input Exercises273314
+Node: Printing274042
+Node: Print275876
+Node: Print Examples277333
+Node: Output Separators280113
+Node: OFMT282130
+Node: Printf283486
+Node: Basic Printf284271
+Node: Control Letters285845
+Node: Format Modifiers289833
+Node: Printf Examples295848
+Node: Redirection298334
+Node: Special FD305175
+Ref: Special FD-Footnote-1308343
+Node: Special Files308417
+Node: Other Inherited Files309034
+Node: Special Network310035
+Node: Special Caveats310895
+Node: Close Files And Pipes311844
+Ref: table-close-pipe-return-values318751
+Ref: Close Files And Pipes-Footnote-1319534
+Ref: Close Files And Pipes-Footnote-2319682
+Node: Nonfatal319834
+Node: Output Summary322159
+Node: Output Exercises323381
+Node: Expressions324060
+Node: Values325248
+Node: Constants325926
+Node: Scalar Constants326617
+Ref: Scalar Constants-Footnote-1327481
+Node: Nondecimal-numbers327731
+Node: Regexp Constants330744
+Node: Using Constant Regexps331270
+Node: Variables334433
+Node: Using Variables335090
+Node: Assignment Options337000
+Node: Conversion338873
+Node: Strings And Numbers339397
+Ref: Strings And Numbers-Footnote-1342460
+Node: Locale influences conversions342569
+Ref: table-locale-affects345327
+Node: All Operators345945
+Node: Arithmetic Ops346574
+Node: Concatenation349080
+Ref: Concatenation-Footnote-1351927
+Node: Assignment Ops352034
+Ref: table-assign-ops357025
+Node: Increment Ops358338
+Node: Truth Values and Conditions361798
+Node: Truth Values362872
+Node: Typing and Comparison363920
+Node: Variable Typing364740
+Node: Comparison Operators368364
+Ref: table-relational-ops368783
+Node: POSIX String Comparison372278
+Ref: POSIX String Comparison-Footnote-1373973
+Ref: POSIX String Comparison-Footnote-2374112
+Node: Boolean Ops374196
+Ref: Boolean Ops-Footnote-1378678
+Node: Conditional Exp378770
+Node: Function Calls380506
+Node: Precedence384383
+Node: Locales388042
+Node: Expressions Summary389674
+Node: Patterns and Actions392247
+Node: Pattern Overview393367
+Node: Regexp Patterns395044
+Node: Expression Patterns395586
+Node: Ranges399367
+Node: BEGIN/END402475
+Node: Using BEGIN/END403236
+Ref: Using BEGIN/END-Footnote-1405972
+Node: I/O And BEGIN/END406078
+Node: BEGINFILE/ENDFILE408392
+Node: Empty411299
+Node: Using Shell Variables411616
+Node: Action Overview413890
+Node: Statements416215
+Node: If Statement418063
+Node: While Statement419558
+Node: Do Statement421586
+Node: For Statement422734
+Node: Switch Statement425892
+Node: Break Statement428278
+Node: Continue Statement430370
+Node: Next Statement432197
+Node: Nextfile Statement434580
+Node: Exit Statement437232
+Node: Built-in Variables439635
+Node: User-modified440768
+Node: Auto-set448354
+Ref: Auto-set-Footnote-1463007
+Ref: Auto-set-Footnote-2463213
+Node: ARGC and ARGV463269
+Node: Pattern Action Summary467482
+Node: Arrays469912
+Node: Array Basics471241
+Node: Array Intro472085
+Ref: figure-array-elements474060
+Ref: Array Intro-Footnote-1476764
+Node: Reference to Elements476892
+Node: Assigning Elements479356
+Node: Array Example479847
+Node: Scanning an Array481606
+Node: Controlling Scanning484628
+Ref: Controlling Scanning-Footnote-1490027
+Node: Numeric Array Subscripts490343
+Node: Uninitialized Subscripts492527
+Node: Delete494146
+Ref: Delete-Footnote-1496898
+Node: Multidimensional496955
+Node: Multiscanning500050
+Node: Arrays of Arrays501641
+Node: Arrays Summary506408
+Node: Functions508501
+Node: Built-in509539
+Node: Calling Built-in510620
+Node: Numeric Functions512616
+Ref: Numeric Functions-Footnote-1517449
+Ref: Numeric Functions-Footnote-2517806
+Ref: Numeric Functions-Footnote-3517854
+Node: String Functions518126
+Ref: String Functions-Footnote-1541630
+Ref: String Functions-Footnote-2541758
+Ref: String Functions-Footnote-3542006
+Node: Gory Details542093
+Ref: table-sub-escapes543884
+Ref: table-sub-proposed545403
+Ref: table-posix-sub546766
+Ref: table-gensub-escapes548307
+Ref: Gory Details-Footnote-1549130
+Node: I/O Functions549284
+Ref: table-system-return-values555866
+Ref: I/O Functions-Footnote-1557846
+Ref: I/O Functions-Footnote-2557994
+Node: Time Functions558114
+Ref: Time Functions-Footnote-1568636
+Ref: Time Functions-Footnote-2568704
+Ref: Time Functions-Footnote-3568862
+Ref: Time Functions-Footnote-4568973
+Ref: Time Functions-Footnote-5569085
+Ref: Time Functions-Footnote-6569312
+Node: Bitwise Functions569578
+Ref: table-bitwise-ops570172
+Ref: Bitwise Functions-Footnote-1576217
+Ref: Bitwise Functions-Footnote-2576390
+Node: Type Functions576581
+Node: I18N Functions579113
+Node: User-defined580764
+Node: Definition Syntax581569
+Ref: Definition Syntax-Footnote-1587256
+Node: Function Example587327
+Ref: Function Example-Footnote-1590249
+Node: Function Caveats590271
+Node: Calling A Function590789
+Node: Variable Scope591747
+Node: Pass By Value/Reference594741
+Node: Return Statement598240
+Node: Dynamic Typing601219
+Node: Indirect Calls602149
+Ref: Indirect Calls-Footnote-1612400
+Node: Functions Summary612528
+Node: Library Functions615233
+Ref: Library Functions-Footnote-1618840
+Ref: Library Functions-Footnote-2618983
+Node: Library Names619154
+Ref: Library Names-Footnote-1622614
+Ref: Library Names-Footnote-2622837
+Node: General Functions622923
+Node: Strtonum Function624026
+Node: Assert Function627048
+Node: Round Function630374
+Node: Cliff Random Function631915
+Node: Ordinal Functions632931
+Ref: Ordinal Functions-Footnote-1635994
+Ref: Ordinal Functions-Footnote-2636246
+Node: Join Function636456
+Ref: Join Function-Footnote-1638226
+Node: Getlocaltime Function638426
+Node: Readfile Function642168
+Node: Shell Quoting644140
+Node: Data File Management645541
+Node: Filetrans Function646173
+Node: Rewind Function650269
+Node: File Checking652175
+Ref: File Checking-Footnote-1653509
+Node: Empty Files653710
+Node: Ignoring Assigns655689
+Node: Getopt Function657239
+Ref: Getopt Function-Footnote-1668708
+Node: Passwd Functions668908
+Ref: Passwd Functions-Footnote-1677747
+Node: Group Functions677835
+Ref: Group Functions-Footnote-1685733
+Node: Walking Arrays685940
+Node: Library Functions Summary688948
+Node: Library Exercises690354
+Node: Sample Programs690819
+Node: Running Examples691589
+Node: Clones692317
+Node: Cut Program693541
+Node: Egrep Program703470
+Ref: Egrep Program-Footnote-1710982
+Node: Id Program711092
+Node: Split Program714772
+Ref: Split Program-Footnote-1718231
+Node: Tee Program718360
+Node: Uniq Program721150
+Node: Wc Program728576
+Ref: Wc Program-Footnote-1732831
+Node: Miscellaneous Programs732925
+Node: Dupword Program734138
+Node: Alarm Program736168
+Node: Translate Program741023
+Ref: Translate Program-Footnote-1745588
+Node: Labels Program745858
+Ref: Labels Program-Footnote-1749209
+Node: Word Sorting749293
+Node: History Sorting753365
+Node: Extract Program755200
+Node: Simple Sed762729
+Node: Igawk Program765803
+Ref: Igawk Program-Footnote-1780134
+Ref: Igawk Program-Footnote-2780336
+Ref: Igawk Program-Footnote-3780458
+Node: Anagram Program780573
+Node: Signature Program783635
+Node: Programs Summary784882
+Node: Programs Exercises786096
+Ref: Programs Exercises-Footnote-1790225
+Node: Advanced Features790316
+Node: Nondecimal Data792306
+Node: Array Sorting793897
+Node: Controlling Array Traversal794597
+Ref: Controlling Array Traversal-Footnote-1802964
+Node: Array Sorting Functions803082
+Ref: Array Sorting Functions-Footnote-1808173
+Node: Two-way I/O808369
+Ref: Two-way I/O-Footnote-1814919
+Ref: Two-way I/O-Footnote-2815106
+Node: TCP/IP Networking815188
+Node: Profiling818306
+Ref: Profiling-Footnote-1826799
+Node: Advanced Features Summary827122
+Node: Internationalization828966
+Node: I18N and L10N830446
+Node: Explaining gettext831133
+Ref: Explaining gettext-Footnote-1837025
+Ref: Explaining gettext-Footnote-2837210
+Node: Programmer i18n837375
+Ref: Programmer i18n-Footnote-1842230
+Node: Translator i18n842279
+Node: String Extraction843073
+Ref: String Extraction-Footnote-1844205
+Node: Printf Ordering844291
+Ref: Printf Ordering-Footnote-1847077
+Node: I18N Portability847141
+Ref: I18N Portability-Footnote-1849597
+Node: I18N Example849660
+Ref: I18N Example-Footnote-1852466
+Node: Gawk I18N852539
+Node: I18N Summary853184
+Node: Debugger854525
+Node: Debugging855547
+Node: Debugging Concepts855988
+Node: Debugging Terms857797
+Node: Awk Debugging860372
+Node: Sample Debugging Session861278
+Node: Debugger Invocation861812
+Node: Finding The Bug863198
+Node: List of Debugger Commands869676
+Node: Breakpoint Control871009
+Node: Debugger Execution Control874703
+Node: Viewing And Changing Data878065
+Node: Execution Stack881439
+Node: Debugger Info883076
+Node: Miscellaneous Debugger Commands887147
+Node: Readline Support892235
+Node: Limitations893131
+Ref: Limitations-Footnote-1897362
+Node: Debugging Summary897413
+Node: Arbitrary Precision Arithmetic898692
+Node: Computer Arithmetic900108
+Ref: table-numeric-ranges903699
+Ref: Computer Arithmetic-Footnote-1904421
+Node: Math Definitions904478
+Ref: table-ieee-formats907792
+Ref: Math Definitions-Footnote-1908395
+Node: MPFR features908500
+Node: FP Math Caution910217
+Ref: FP Math Caution-Footnote-1911289
+Node: Inexactness of computations911658
+Node: Inexact representation912618
+Node: Comparing FP Values913978
+Node: Errors accumulate915060
+Node: Getting Accuracy916493
+Node: Try To Round919203
+Node: Setting precision920102
+Ref: table-predefined-precision-strings920799
+Node: Setting the rounding mode922629
+Ref: table-gawk-rounding-modes923003
+Ref: Setting the rounding mode-Footnote-1926411
+Node: Arbitrary Precision Integers926590
+Ref: Arbitrary Precision Integers-Footnote-1931507
+Node: POSIX Floating Point Problems931656
+Ref: POSIX Floating Point Problems-Footnote-1935538
+Node: Floating point summary935576
+Node: Dynamic Extensions937766
+Node: Extension Intro939319
+Node: Plugin License940585
+Node: Extension Mechanism Outline941382
+Ref: figure-load-extension941821
+Ref: figure-register-new-function943386
+Ref: figure-call-new-function944478
+Node: Extension API Description946540
+Node: Extension API Functions Introduction948072
+Node: General Data Types952931
+Ref: General Data Types-Footnote-1958886
+Node: Memory Allocation Functions959185
+Ref: Memory Allocation Functions-Footnote-1962030
+Node: Constructor Functions962129
+Node: Registration Functions963874
+Node: Extension Functions964559
+Node: Exit Callback Functions967182
+Node: Extension Version String968432
+Node: Input Parsers969095
+Node: Output Wrappers978977
+Node: Two-way processors983489
+Node: Printing Messages985754
+Ref: Printing Messages-Footnote-1986925
+Node: Updating ERRNO987078
+Node: Requesting Values987817
+Ref: table-value-types-returned988554
+Node: Accessing Parameters989437
+Node: Symbol Table Access990672
+Node: Symbol table by name991184
+Node: Symbol table by cookie993205
+Ref: Symbol table by cookie-Footnote-1997357
+Node: Cached values997421
+Ref: Cached values-Footnote-11000928
+Node: Array Manipulation1001019
+Ref: Array Manipulation-Footnote-11002110
+Node: Array Data Types1002147
+Ref: Array Data Types-Footnote-11004805
+Node: Array Functions1004897
+Node: Flattening Arrays1008755
+Node: Creating Arrays1015663
+Node: Redirection API1020432
+Node: Extension API Variables1023263
+Node: Extension Versioning1023896
+Ref: gawk-api-version1024333
+Node: Extension API Informational Variables1026089
+Node: Extension API Boilerplate1027153
+Node: Finding Extensions1030967
+Node: Extension Example1031526
+Node: Internal File Description1032324
+Node: Internal File Ops1036404
+Ref: Internal File Ops-Footnote-11048166
+Node: Using Internal File Ops1048306
+Ref: Using Internal File Ops-Footnote-11050689
+Node: Extension Samples1050963
+Node: Extension Sample File Functions1052492
+Node: Extension Sample Fnmatch1060141
+Node: Extension Sample Fork1061628
+Node: Extension Sample Inplace1062846
+Node: Extension Sample Ord1066056
+Node: Extension Sample Readdir1066892
+Ref: table-readdir-file-types1067781
+Node: Extension Sample Revout1068586
+Node: Extension Sample Rev2way1069175
+Node: Extension Sample Read write array1069915
+Node: Extension Sample Readfile1071857
+Node: Extension Sample Time1072952
+Node: Extension Sample API Tests1074300
+Node: gawkextlib1074792
+Node: Extension summary1077239
+Node: Extension Exercises1080941
+Node: Language History1082439
+Node: V7/SVR3.11084095
+Node: SVR41086247
+Node: POSIX1087681
+Node: BTL1089060
+Node: POSIX/GNU1089789
+Node: Feature History1095651
+Node: Common Extensions1110021
+Node: Ranges and Locales1111304
+Ref: Ranges and Locales-Footnote-11115920
+Ref: Ranges and Locales-Footnote-21115947
+Ref: Ranges and Locales-Footnote-31116182
+Node: Contributors1116403
+Node: History summary1121963
+Node: Installation1123343
+Node: Gawk Distribution1124287
+Node: Getting1124771
+Node: Extracting1125732
+Node: Distribution contents1127370
+Node: Unix Installation1133455
+Node: Quick Installation1134137
+Node: Shell Startup Files1136551
+Node: Additional Configuration Options1137629
+Node: Configuration Philosophy1139434
+Node: Non-Unix Installation1141803
+Node: PC Installation1142263
+Node: PC Binary Installation1143101
+Node: PC Compiling1143536
+Node: PC Using1144653
+Node: Cygwin1147698
+Node: MSYS1148468
+Node: VMS Installation1148969
+Node: VMS Compilation1149760
+Ref: VMS Compilation-Footnote-11150989
+Node: VMS Dynamic Extensions1151047
+Node: VMS Installation Details1152732
+Node: VMS Running1154985
+Node: VMS GNV1159264
+Node: VMS Old Gawk1159999
+Node: Bugs1160470
+Node: Bug address1161133
+Node: Usenet1163530
+Node: Maintainers1164305
+Node: Other Versions1165681
+Node: Installation summary1172265
+Node: Notes1173300
+Node: Compatibility Mode1174165
+Node: Additions1174947
+Node: Accessing The Source1175872
+Node: Adding Code1177307
+Node: New Ports1183526
+Node: Derived Files1188014
+Ref: Derived Files-Footnote-11193499
+Ref: Derived Files-Footnote-21193534
+Ref: Derived Files-Footnote-31194132
+Node: Future Extensions1194246
+Node: Implementation Limitations1194904
+Node: Extension Design1196087
+Node: Old Extension Problems1197241
+Ref: Old Extension Problems-Footnote-11198759
+Node: Extension New Mechanism Goals1198816
+Ref: Extension New Mechanism Goals-Footnote-11202180
+Node: Extension Other Design Decisions1202369
+Node: Extension Future Growth1204482
+Node: Old Extension Mechanism1205318
+Node: Notes summary1207081
+Node: Basic Concepts1208263
+Node: Basic High Level1208944
+Ref: figure-general-flow1209226
+Ref: figure-process-flow1209911
+Ref: Basic High Level-Footnote-11213212
+Node: Basic Data Typing1213397
+Node: Glossary1216725
+Node: Copying1248672
+Node: GNU Free Documentation License1286211
+Node: Index1311329
+
+End Tag Table
diff --git a/doc/gawkinet.info b/doc/gawkinet.info
new file mode 100644
index 00000000..d5a7abf8
--- /dev/null
+++ b/doc/gawkinet.info
@@ -0,0 +1,4406 @@
+This is gawkinet.info, produced by makeinfo version 6.1 from
+gawkinet.texi.
+
+This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
+4.1.4 (or later) version of the GNU implementation of AWK.
+
+
+ Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
+Foundation, Inc.
+
+
+ Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License", the Front-Cover
+texts being (a) (see below), and with the Back-Cover Texts being (b)
+(see below). A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+ a. "A GNU Manual"
+
+ b. "You have the freedom to copy and modify this GNU manual. Buying
+ copies from the FSF supports it in developing GNU and promoting
+ software freedom."
+INFO-DIR-SECTION Network applications
+START-INFO-DIR-ENTRY
+* Gawkinet: (gawkinet). TCP/IP Internetworking With 'gawk'.
+END-INFO-DIR-ENTRY
+
+ This file documents the networking features in GNU 'awk'.
+
+ This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
+4.1.4 (or later) version of the GNU implementation of AWK.
+
+
+ Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
+Foundation, Inc.
+
+
+ Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License", the Front-Cover
+texts being (a) (see below), and with the Back-Cover Texts being (b)
+(see below). A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+ a. "A GNU Manual"
+
+ b. "You have the freedom to copy and modify this GNU manual. Buying
+ copies from the FSF supports it in developing GNU and promoting
+ software freedom."
+
+
+File: gawkinet.info, Node: Top, Next: Preface, Prev: (dir), Up: (dir)
+
+General Introduction
+********************
+
+This file documents the networking features in GNU Awk ('gawk') version
+4.0 and later.
+
+ This is Edition 1.4 of 'TCP/IP Internetworking with 'gawk'', for the
+4.1.4 (or later) version of the GNU implementation of AWK.
+
+
+ Copyright (C) 2000, 2001, 2002, 2004, 2009, 2010, 2016 Free Software
+Foundation, Inc.
+
+
+ Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being "GNU General Public License", the Front-Cover
+texts being (a) (see below), and with the Back-Cover Texts being (b)
+(see below). A copy of the license is included in the section entitled
+"GNU Free Documentation License".
+
+ a. "A GNU Manual"
+
+ b. "You have the freedom to copy and modify this GNU manual. Buying
+ copies from the FSF supports it in developing GNU and promoting
+ software freedom."
+
+* Menu:
+
+* Preface:: About this document.
+* Introduction:: About networking.
+* Using Networking:: Some examples.
+* Some Applications and Techniques:: More extended examples.
+* Links:: Where to find the stuff mentioned in this
+ document.
+* GNU Free Documentation License:: The license for this document.
+* Index:: The index.
+
+* Stream Communications:: Sending data streams.
+* Datagram Communications:: Sending self-contained messages.
+* The TCP/IP Protocols:: How these models work in the Internet.
+* Basic Protocols:: The basic protocols.
+* Ports:: The idea behind ports.
+* Making Connections:: Making TCP/IP connections.
+* Gawk Special Files:: How to do 'gawk' networking.
+* Special File Fields:: The fields in the special file name.
+* Comparing Protocols:: Differences between the protocols.
+* File /inet/tcp:: The TCP special file.
+* File /inet/udp:: The UDP special file.
+* TCP Connecting:: Making a TCP connection.
+* Troubleshooting:: Troubleshooting TCP/IP connections.
+* Interacting:: Interacting with a service.
+* Setting Up:: Setting up a service.
+* Email:: Reading email.
+* Web page:: Reading a Web page.
+* Primitive Service:: A primitive Web service.
+* Interacting Service:: A Web service with interaction.
+* CGI Lib:: A simple CGI library.
+* Simple Server:: A simple Web server.
+* Caveats:: Network programming caveats.
+* Challenges:: Where to go from here.
+* PANIC:: An Emergency Web Server.
+* GETURL:: Retrieving Web Pages.
+* REMCONF:: Remote Configuration Of Embedded Systems.
+* URLCHK:: Look For Changed Web Pages.
+* WEBGRAB:: Extract Links From A Page.
+* STATIST:: Graphing A Statistical Distribution.
+* MAZE:: Walking Through A Maze In Virtual Reality.
+* MOBAGWHO:: A Simple Mobile Agent.
+* STOXPRED:: Stock Market Prediction As A Service.
+* PROTBASE:: Searching Through A Protein Database.
+
+
+File: gawkinet.info, Node: Preface, Next: Introduction, Prev: Top, Up: Top
+
+Preface
+*******
+
+In May of 1997, Ju"rgen Kahrs felt the need for network access from
+'awk', and, with a little help from me, set about adding features to do
+this for 'gawk'. At that time, he wrote the bulk of this Info file.
+
+ The code and documentation were added to the 'gawk' 3.1 development
+tree, and languished somewhat until I could finally get down to some
+serious work on that version of 'gawk'. This finally happened in the
+middle of 2000.
+
+ Meantime, Ju"rgen wrote an article about the Internet special files
+and '|&' operator for 'Linux Journal', and made a networking patch for
+the production versions of 'gawk' available from his home page. In
+August of 2000 (for 'gawk' 3.0.6), this patch also made it to the main
+GNU 'ftp' distribution site.
+
+ For release with 'gawk', I edited Ju"rgen's prose for English grammar
+and style, as he is not a native English speaker. I also rearranged the
+material somewhat for what I felt was a better order of presentation,
+and (re)wrote some of the introductory material.
+
+ The majority of this document and the code are his work, and the high
+quality and interesting ideas speak for themselves. It is my hope that
+these features will be of significant value to the 'awk' community.
+
+
+Arnold Robbins
+Nof Ayalon, ISRAEL
+March, 2001
+
+
+File: gawkinet.info, Node: Introduction, Next: Using Networking, Prev: Preface, Up: Top
+
+1 Networking Concepts
+*********************
+
+This major node provides a (necessarily) brief introduction to computer
+networking concepts. For many applications of 'gawk' to TCP/IP
+networking, we hope that this is enough. For more advanced tasks, you
+will need deeper background, and it may be necessary to switch to
+lower-level programming in C or C++.
+
+ There are two real-life models for the way computers send messages to
+each other over a network. While the analogies are not perfect, they
+are close enough to convey the major concepts. These two models are the
+phone system (reliable byte-stream communications), and the postal
+system (best-effort datagrams).
+
+* Menu:
+
+* Stream Communications:: Sending data streams.
+* Datagram Communications:: Sending self-contained messages.
+* The TCP/IP Protocols:: How these models work in the Internet.
+* Making Connections:: Making TCP/IP connections.
+
+
+File: gawkinet.info, Node: Stream Communications, Next: Datagram Communications, Prev: Introduction, Up: Introduction
+
+1.1 Reliable Byte-streams (Phone Calls)
+=======================================
+
+When you make a phone call, the following steps occur:
+
+ 1. You dial a number.
+
+ 2. The phone system connects to the called party, telling them there
+ is an incoming call. (Their phone rings.)
+
+ 3. The other party answers the call, or, in the case of a computer
+ network, refuses to answer the call.
+
+ 4. Assuming the other party answers, the connection between you is now
+ a "duplex" (two-way), "reliable" (no data lost), sequenced (data
+ comes out in the order sent) data stream.
+
+ 5. You and your friend may now talk freely, with the phone system
+ moving the data (your voices) from one end to the other. From your
+ point of view, you have a direct end-to-end connection with the
+ person on the other end.
+
+ The same steps occur in a duplex reliable computer networking
+connection. There is considerably more overhead in setting up the
+communications, but once it's done, data moves in both directions,
+reliably, in sequence.
+
+
+File: gawkinet.info, Node: Datagram Communications, Next: The TCP/IP Protocols, Prev: Stream Communications, Up: Introduction
+
+1.2 Best-effort Datagrams (Mailed Letters)
+==========================================
+
+Suppose you mail three different documents to your office on the other
+side of the country on two different days. Doing so entails the
+following.
+
+ 1. Each document travels in its own envelope.
+
+ 2. Each envelope contains both the sender and the recipient address.
+
+ 3. Each envelope may travel a different route to its destination.
+
+ 4. The envelopes may arrive in a different order from the one in which
+ they were sent.
+
+ 5. One or more may get lost in the mail. (Although, fortunately, this
+ does not occur very often.)
+
+ 6. In a computer network, one or more "packets" may also arrive
+ multiple times. (This doesn't happen with the postal system!)
+
+ The important characteristics of datagram communications, like those
+of the postal system, are thus:
+
+ * Delivery is "best effort"; the data may never get there.
+
+ * Each message is self-contained, including the source and
+ destination addresses.
+
+ * Delivery is _not_ sequenced; packets may arrive out of order,
+ and/or multiple times.
+
+ * Unlike the phone system, overhead is considerably lower. It is not
+ necessary to set up the call first.
+
+ The price the user pays for the lower overhead of datagram
+communications is exactly the lower reliability; it is often necessary
+for user-level protocols that use datagram communications to add their
+own reliability features on top of the basic communications.
+
+
+File: gawkinet.info, Node: The TCP/IP Protocols, Next: Making Connections, Prev: Datagram Communications, Up: Introduction
+
+1.3 The Internet Protocols
+==========================
+
+The Internet Protocol Suite (usually referred to as just TCP/IP)(1)
+consists of a number of different protocols at different levels or
+"layers." For our purposes, three protocols provide the fundamental
+communications mechanisms. All other defined protocols are referred to
+as user-level protocols (e.g., HTTP, used later in this Info file).
+
+* Menu:
+
+* Basic Protocols:: The basic protocols.
+* Ports:: The idea behind ports.
+
+ ---------- Footnotes ----------
+
+ (1) It should be noted that although the Internet seems to have
+conquered the world, there are other networking protocol suites in
+existence and in use.
+
+
+File: gawkinet.info, Node: Basic Protocols, Next: Ports, Prev: The TCP/IP Protocols, Up: The TCP/IP Protocols
+
+1.3.1 The Basic Internet Protocols
+----------------------------------
+
+IP
+ The Internet Protocol. This protocol is almost never used directly
+ by applications. It provides the basic packet delivery and routing
+ infrastructure of the Internet. Much like the phone company's
+ switching centers or the Post Office's trucks, it is not of much
+ day-to-day interest to the regular user (or programmer). It
+ happens to be a best effort datagram protocol. In the early
+ twenty-first century, there are two versions of this protocol in
+ use:
+
+ IPv4
+ The original version of the Internet Protocol, with 32-bit
+ addresses, on which most of the current Internet is based.
+
+ IPv6
+ The "next generation" of the Internet Protocol, with 128-bit
+ addresses. This protocol is in wide use in certain parts of
+ the world, but has not yet replaced IPv4.(1)
+
+ Versions of the other protocols that sit "atop" IP exist for both
+ IPv4 and IPv6. However, as the IPv6 versions are fundamentally the
+ same as the original IPv4 versions, we will not distinguish further
+ between them.
+
+UDP
+ The User Datagram Protocol. This is a best effort datagram
+ protocol. It provides a small amount of extra reliability over IP,
+ and adds the notion of "ports", described in *note TCP and UDP
+ Ports: Ports.
+
+TCP
+ The Transmission Control Protocol. This is a duplex, reliable,
+ sequenced byte-stream protocol, again layered on top of IP, and
+ also providing the notion of ports. This is the protocol that you
+ will most likely use when using 'gawk' for network programming.
+
+ All other user-level protocols use either TCP or UDP to do their
+basic communications. Examples are SMTP (Simple Mail Transfer
+Protocol), FTP (File Transfer Protocol), and HTTP (HyperText Transfer
+Protocol).
+
+ ---------- Footnotes ----------
+
+ (1) There isn't an IPv5.
+
+
+File: gawkinet.info, Node: Ports, Prev: Basic Protocols, Up: The TCP/IP Protocols
+
+1.3.2 TCP and UDP Ports
+-----------------------
+
+In the postal system, the address on an envelope indicates a physical
+location, such as a residence or office building. But there may be more
+than one person at the location; thus you have to further qualify the
+recipient by putting a person or company name on the envelope.
+
+ In the phone system, one phone number may represent an entire
+company, in which case you need a person's extension number in order to
+reach that individual directly. Or, when you call a home, you have to
+say, "May I please speak to ..." before talking to the person directly.
+
+ IP networking provides the concept of addressing. An IP address
+represents a particular computer, but no more. In order to reach the
+mail service on a system, or the FTP or WWW service on a system, you
+must have some way to further specify which service you want. In the
+Internet Protocol suite, this is done with "port numbers", which
+represent the services, much like an extension number used with a phone
+number.
+
+ Port numbers are 16-bit integers. Unix and Unix-like systems reserve
+ports below 1024 for "well known" services, such as SMTP, FTP, and HTTP.
+Numbers 1024 and above may be used by any application, although there is
+no promise made that a particular port number is always available.
+
+
+File: gawkinet.info, Node: Making Connections, Prev: The TCP/IP Protocols, Up: Introduction
+
+1.4 Making TCP/IP Connections (And Some Terminology)
+====================================================
+
+Two terms come up repeatedly when discussing networking: "client" and
+"server". For now, we'll discuss these terms at the "connection level",
+when first establishing connections between two processes on different
+systems over a network. (Once the connection is established, the higher
+level, or "application level" protocols, such as HTTP or FTP, determine
+who is the client and who is the server. Often, it turns out that the
+client and server are the same in both roles.)
+
+ The "server" is the system providing the service, such as the web
+server or email server. It is the "host" (system) which is _connected
+to_ in a transaction. For this to work, though, the server must be
+expecting connections. Much as there has to be someone at the office
+building to answer the phone(1), the server process (usually) has to be
+started first and be waiting for a connection.
+
+ The "client" is the system requesting the service. It is the system
+_initiating the connection_ in a transaction. (Just as when you pick up
+the phone to call an office or store.)
+
+ In the TCP/IP framework, each end of a connection is represented by
+an (ADDRESS, PORT) pair. For the duration of the connection, the
+ports in use at each end are unique, and cannot be used simultaneously
+by other processes on the same system. (Only after closing a connection
+can a new one be built up on the same port. This is contrary to the
+usual behavior of fully developed web servers which have to avoid
+situations in which they are not reachable. We have to pay this price
+in order to enjoy the benefits of a simple communication paradigm in
+'gawk'.)
+
+ Furthermore, once the connection is established, communications are
+"synchronous".(2) I.e., each end waits on the other to finish
+transmitting, before replying. This is much like two people in a phone
+conversation. While both could talk simultaneously, doing so usually
+doesn't work too well.
+
+ In the case of TCP, the synchronicity is enforced by the protocol
+when sending data. Data writes "block" until the data have been
+received on the other end. For both TCP and UDP, data reads block until
+there is incoming data waiting to be read. This is summarized in the
+following table, where an "X" indicates that the given action blocks.
+
+            Reads     Writes
+TCP         X         X
+UDP         X
+
+ ---------- Footnotes ----------
+
+ (1) In the days before voice mail systems!
+
+ (2) For the technically savvy, data reads block--if there's no
+incoming data, the program is made to wait until there is, instead of
+receiving a "there's no data" error return.
+
+
+File: gawkinet.info, Node: Using Networking, Next: Some Applications and Techniques, Prev: Introduction, Up: Top
+
+2 Networking With 'gawk'
+************************
+
+The 'awk' programming language was originally developed as a
+pattern-matching language for writing short programs to perform data
+manipulation tasks. 'awk''s strength is the manipulation of textual
+data that is stored in files. It was never meant to be used for
+networking purposes. To exploit its features in a networking context,
+it's necessary to use an access mode for network connections that
+resembles the access of files as closely as possible.
+
+ 'awk' is also meant to be a prototyping language. It is used to
+demonstrate feasibility and to play with features and user interfaces.
+This can be done with file-like handling of network connections. 'gawk'
+trades away many of the advanced features of the TCP/IP family of
+protocols for the convenience of simple connection handling. The
+advanced features are available when programming in C or Perl. In fact,
+the network programming in this major node is very similar to what is
+described in books such as 'Internet Programming with Python', 'Advanced
+Perl Programming', or 'Web Client Programming with Perl'.
+
+ However, you can do the programming here without first having to
+learn object-oriented ideology; underlying languages such as Tcl/Tk,
+Perl, Python; or all of the libraries necessary to extend these
+languages before they are ready for the Internet.
+
+ This major node demonstrates how to use the TCP protocol. The UDP
+protocol is much less important for most users.
+
+* Menu:
+
+* Gawk Special Files:: How to do 'gawk' networking.
+* TCP Connecting:: Making a TCP connection.
+* Troubleshooting:: Troubleshooting TCP/IP connections.
+* Interacting:: Interacting with a service.
+* Setting Up:: Setting up a service.
+* Email:: Reading email.
+* Web page:: Reading a Web page.
+* Primitive Service:: A primitive Web service.
+* Interacting Service:: A Web service with interaction.
+* Simple Server:: A simple Web server.
+* Caveats:: Network programming caveats.
+* Challenges:: Where to go from here.
+
+
+File: gawkinet.info, Node: Gawk Special Files, Next: TCP Connecting, Prev: Using Networking, Up: Using Networking
+
+2.1 'gawk''s Networking Mechanisms
+==================================
+
+The '|&' operator for use in communicating with a "coprocess" is
+described in *note Two-way Communications With Another Process:
+(gawk)Two-way I/O. It shows how to do two-way I/O to a separate process,
+sending it data with 'print' or 'printf' and reading data with
+'getline'. If you haven't read it already, you should detour there to
+do so.
+
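+ As a quick, purely illustrative sketch of that mechanism (this
+example is not from the manual; it simply uses the ordinary 'sort'
+utility as the coprocess):
+
+     BEGIN {
+         Cmd = "sort"              # any filter program can be a coprocess
+         print "banana" |& Cmd
+         print "apple"  |& Cmd
+         close(Cmd, "to")          # send EOF so 'sort' can produce output
+         while ((Cmd |& getline Line) > 0)
+             print Line
+         close(Cmd)
+     }
+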
+ 'gawk' transparently extends the two-way I/O mechanism to simple
+networking through the use of special file names. When the command
+given for a "coprocess" is one of the special file names we are about
+to describe, 'gawk' creates the appropriate network connection, and
+then two-way I/O proceeds as usual.
+
+ At the C, C++, and Perl level, networking is accomplished via
+"sockets", an Application Programming Interface (API) originally
+developed at the University of California at Berkeley that is now used
+almost universally for TCP/IP networking. Socket level programming,
+while fairly straightforward, requires paying attention to a number of
+details, as well as using binary data. It is not well-suited for use
+from a high-level language like 'awk'. The special files provided in
+'gawk' hide the details from the programmer, making things much simpler
+and easier to use.
+
+ The special file name for network access is made up of several
+fields, all of which are mandatory:
+
+ /NET-TYPE/PROTOCOL/LOCALPORT/HOSTNAME/REMOTEPORT
+
+ The NET-TYPE field lets you specify IPv4 versus IPv6, or lets you
+allow the system to choose.
+
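+ For example, a minimal sketch of such a connection (the 'daytime'
+service name and the host 'localhost' are illustrative assumptions;
+the meaning of each field is explained in the next node) reads one
+line from a TCP service exactly as it would from a coprocess:
+
+     BEGIN {
+         Service = "/inet/tcp/0/localhost/daytime"
+         Service |& getline Line     # blocks until the server replies
+         print Line
+         close(Service)
+     }
+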
+* Menu:
+
+* Special File Fields:: The fields in the special file name.
+* Comparing Protocols:: Differences between the protocols.
+
+
+File: gawkinet.info, Node: Special File Fields, Next: Comparing Protocols, Prev: Gawk Special Files, Up: Gawk Special Files
+
+2.1.1 The Fields of the Special File Name
+-----------------------------------------
+
+This node explains the meaning of all the other fields, as well as the
+range of values and the defaults. All of the fields are mandatory. To
+let the system pick a value, or if the field doesn't apply to the
+protocol, specify it as '0':
+
+NET-TYPE
+ This is one of 'inet4' for IPv4, 'inet6' for IPv6, or 'inet' to use
+ the system default (which is likely to be IPv4). For the rest of
+ this document, we will use the generic '/inet' in our descriptions
+ of how 'gawk''s networking works.
+
+PROTOCOL
+ Determines which member of the TCP/IP family of protocols is
+ selected to transport the data across the network. There are two
+ possible values (always written in lowercase): 'tcp' and 'udp'.
+ The exact meaning of each is explained later in this node.
+
+LOCALPORT
+ Determines which port on the local machine is used to communicate
+ across the network. Application-level clients usually use '0' to
+ indicate they do not care which local port is used--instead they
+ specify a remote port to connect to. It is vital for
+ application-level servers to use a number different from '0' here
+ because their service has to be available at a specific publicly
+ known port number. It is possible to use a name from
+ '/etc/services' here.
+
+HOSTNAME
+ Determines which remote host is to be at the other end of the
+ connection. Application-level servers must fill this field with a
+ '0' to indicate that they accept connections from any other host;
+ this is what enforces connection-level server behavior. It is
+ not possible for an application-level server to restrict its
+ availability to one remote host by entering a host name here.
+ Application-level clients must enter a name different from '0'.
+ The name can be either symbolic (e.g., 'jpl-devvax.jpl.nasa.gov')
+ or numeric (e.g., '128.149.1.143').
+
+REMOTEPORT
+ Determines which port on the remote machine is used to communicate
+ across the network. For '/inet/tcp' and '/inet/udp',
+ application-level clients _must_ use a number other than '0' to
+ indicate to which port on the remote machine they want to connect.
+ Application-level servers must fill this field with a '0'; the port
+ at which they serve clients is the one given in the LOCALPORT field,
+ and that is the port to which clients connect. It is
+ possible to use a name from '/etc/services' here.
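+
+   As a concrete illustration (the remote host name and the port number
+'8888' below are placeholders of our own choosing), a client connecting
+to the 'daytime' service of some remote host and a server waiting on
+local port 8888 would use special file names like these:
+
+     "/inet/tcp/0/time.example.com/daytime"   # client, any local port
+     "/inet/tcp/8888/0/0"                     # server on local port 8888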
+
+ Experts in network programming will notice that the usual
+client/server asymmetry found at the level of the socket API is not
+visible here. This is for the sake of simplicity of the high-level
+concept. If this asymmetry is necessary for your application, use
+another language. For 'gawk', it is more important to enable users to
+write a client program with a minimum of code. What happens when first
+accessing a network connection is seen in the following pseudocode:
+
+     if ((name of remote host given) && (other side accepts connection)) {
+         rendez-vous successful; transmit with getline or print
+     } else {
+         if ((other side did not accept) && (localport == 0))
+             exit unsuccessful
+         if (TCP) {
+             set up a server accepting connections
+             this means waiting for the client on the other side to connect
+         } else
+             ready
+     }
+
+ The exact behavior of this algorithm depends on the values of the
+fields of the special file name. When in doubt, *note Table 2.1:
+table-inet-components. gives you the combinations of values and their
+meaning. If this table is too complicated, focus on the three lines
+printed in *bold*. All the examples in *note Networking With 'gawk':
+Using Networking, use only the patterns printed in bold letters.
+
+PROTOCOL    LOCAL PORT   HOST NAME   REMOTE PORT   RESULTING CONNECTION-LEVEL
+                                                   BEHAVIOR
+------------------------------------------------------------------------------
+*tcp*       *0*          *x*         *x*           *Dedicated client, fails if
+                                                   immediately connecting to a
+                                                   server on the other side
+                                                   fails*
+udp         0            x           x             Dedicated client
+*tcp, udp*  *x*          *x*         *x*           *Client, switches to
+                                                   dedicated server if
+                                                   necessary*
+*tcp, udp*  *x*          *0*         *0*           *Dedicated server*
+tcp, udp    x            x           0             Invalid
+tcp, udp    0            0           x             Invalid
+tcp, udp    x            0           x             Invalid
+tcp, udp    0            0           0             Invalid
+tcp, udp    0            x           0             Invalid
+
+Table 2.1: /inet Special File Components
+
+ In general, TCP is the preferred mechanism to use. It is the
+simplest protocol to understand and to use. Use UDP only if
+circumstances demand low overhead.
+
+
+File: gawkinet.info, Node: Comparing Protocols, Prev: Special File Fields, Up: Gawk Special Files
+
+2.1.2 Comparing Protocols
+-------------------------
+
+This node develops a pair of programs (sender and receiver) that do
+nothing but send a timestamp from one machine to another. The sender
+and the receiver are implemented with each of the two protocols
+available and demonstrate the differences between them.
+
+* Menu:
+
+* File /inet/tcp:: The TCP special file.
+* File /inet/udp:: The UDP special file.
+
+
+File: gawkinet.info, Node: File /inet/tcp, Next: File /inet/udp, Prev: Comparing Protocols, Up: Comparing Protocols
+
+2.1.2.1 '/inet/tcp'
+...................
+
+Once again, always use TCP. (Use UDP only when low overhead is a
+necessity.) The first example is the sender program:
+
+ # Server
+ BEGIN {
+ print strftime() |& "/inet/tcp/8888/0/0"
+ close("/inet/tcp/8888/0/0")
+ }
+
+ The receiver is very simple:
+
+ # Client
+ BEGIN {
+ "/inet/tcp/0/localhost/8888" |& getline
+ print $0
+ close("/inet/tcp/0/localhost/8888")
+ }
+
+ TCP guarantees that the bytes arrive at the receiving end in exactly
+the same order that they were sent. No byte is lost (except for broken
+connections), doubled, or out of order. Some overhead is necessary to
+accomplish this, but this is the price to pay for a reliable service.
+It does matter which side starts first. The sender/server has to be
+started first, and it waits for the receiver to read a line.
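+
+   To try this out, store the two programs in files of your own
+choosing (the names below are merely examples) and start the server
+before the client, each in its own terminal window:
+
+     $ gawk -f tcpserver.awk     # blocks until the client connects
+     $ gawk -f tcpclient.awk     # prints the transmitted timestamp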
+
+
+File: gawkinet.info, Node: File /inet/udp, Prev: File /inet/tcp, Up: Comparing Protocols
+
+2.1.2.2 '/inet/udp'
+...................
+
+The server and client programs that use UDP are almost identical to
+their TCP counterparts; only the PROTOCOL has changed. As before, it
+does matter which side starts first. The receiving side blocks and
+waits for the sender. In this case, the receiver/client has to be
+started first:
+
+ # Server
+ BEGIN {
+ print strftime() |& "/inet/udp/8888/0/0"
+ close("/inet/udp/8888/0/0")
+ }
+
+ The receiver is almost identical to the TCP receiver:
+
+ # Client
+ BEGIN {
+ print "hi!" |& "/inet/udp/0/localhost/8888"
+ "/inet/udp/0/localhost/8888" |& getline
+ print $0
+ close("/inet/udp/0/localhost/8888")
+ }
+
+ In the case of UDP, the initial 'print' command is the one that
+actually sends data so that there is a connection. UDP and "connection"
+sounds strange to anyone who has learned that UDP is a connectionless
+protocol. Here, "connection" means that the 'connect()' system call has
+completed its work and completed the "association" between a certain
+socket and an IP address. Thus there are subtle differences between
+'connect()' for TCP and UDP; see the man page for details.(1)
+
+ UDP cannot guarantee that the datagrams at the receiving end will
+arrive in exactly the same order they were sent. Some datagrams could
+be lost, some doubled, and some out of order. In exchange, UDP imposes
+no overhead for such guarantees. This unreliable behavior is good enough
+for tasks such as data acquisition, logging, and even stateless services
+like the original versions of NFS.
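+
+   Running the UDP pair works just as with TCP, except that, as
+explained above, the receiver/client has to be started first (the file
+names are again our own choice):
+
+     $ gawk -f udpclient.awk     # start this one first; it blocks and waits
+     $ gawk -f udpserver.awk     # then the sender transmits the timestamp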
+
+ ---------- Footnotes ----------
+
+ (1) This subtlety is just one of many details that are hidden in the
+socket API, invisible and intractable for the 'gawk' user. The
+developers are currently considering how to rework the network
+facilities to make them easier to understand and use.
+
+
+File: gawkinet.info, Node: TCP Connecting, Next: Troubleshooting, Prev: Gawk Special Files, Up: Using Networking
+
+2.2 Establishing a TCP Connection
+=================================
+
+Let's observe a network connection at work. Type in the following
+program and watch the output. Within a second, it connects via TCP
+('/inet/tcp') to the machine it is running on ('localhost') and asks the
+service 'daytime' on the machine what time it is:
+
+ BEGIN {
+ "/inet/tcp/0/localhost/daytime" |& getline
+ print $0
+ close("/inet/tcp/0/localhost/daytime")
+ }
+
+ Even experienced 'awk' users will find the second line strange in two
+respects:
+
+ * A special file is used as a shell command that pipes its output
+ into 'getline'. One would rather expect to see the special file
+ being read like any other file ('getline <
+ "/inet/tcp/0/localhost/daytime"').
+
+ * The operator '|&' has not been part of any 'awk' implementation
+ (until now). It is actually the only extension of the 'awk'
+ language needed (apart from the special files) to introduce network
+ access.
+
+ The '|&' operator was introduced in 'gawk' 3.1 in order to overcome
+the crucial restriction that access to files and pipes in 'awk' is
+always unidirectional. It was formerly impossible to use both access
+modes on the same file or pipe. Instead of changing the whole concept
+of file access, the '|&' operator behaves exactly like the usual pipe
+operator except for two additions:
+
+ * Normal shell commands connected to their 'gawk' program with a '|&'
+ pipe can be accessed bidirectionally. The '|&' turns out to be a
+ quite general, useful, and natural extension of 'awk'.
+
+ * Pipes that consist of a special file name for network connections
+ are not executed as shell commands. Instead, they can be read and
+ written to, just like a full-duplex network connection.
+
+ In the earlier example, the '|&' operator tells 'getline' to read a
+line from the special file '/inet/tcp/0/localhost/daytime'. We could
+also have printed a line into the special file. But instead we just
+read a line with the time, printed it, and closed the connection.
+(While we could just let 'gawk' close the connection by finishing the
+program, in this Info file we are pedantic and always explicitly close
+the connections.)
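+
+   The same operator also works with an ordinary shell command as the
+coprocess, as mentioned above. The following sketch (the choice of the
+'rev' utility is ours, purely for illustration) writes a line to the
+coprocess, closes the sending side so that the coprocess sees
+end-of-file, and then reads the answer back:
+
+     BEGIN {
+         Coprocess = "rev"
+         print "hello, world" |& Coprocess
+         close(Coprocess, "to")        # send EOF to the coprocess
+         Coprocess |& getline Result   # read its reply
+         print Result                  # prints "dlrow ,olleh"
+         close(Coprocess)
+     }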
+
+
+File: gawkinet.info, Node: Troubleshooting, Next: Interacting, Prev: TCP Connecting, Up: Using Networking
+
+2.3 Troubleshooting Connection Problems
+=======================================
+
+It may well be that for some reason the program shown in the previous
+example does not run on your machine. When looking at possible reasons
+for this, you will learn much about typical problems that arise in
+network programming. First of all, your implementation of 'gawk' may
+not support network access because it is a pre-3.1 version or you do not
+have a network interface in your machine. Perhaps your machine uses
+some other protocol, such as DECnet or Novell's IPX. For the rest of
+this major node, we will assume you work on a Unix machine that supports
+TCP/IP. If the previous example program does not run on your machine, it
+may help to replace the name 'localhost' with the name of your machine
+or its IP address. If that works, you could replace 'localhost' with the
+name of another machine in your vicinity--this way, the program connects
+to another machine. Now you should see the date and time being printed
+by the program, otherwise your machine may not support the 'daytime'
+service. Try changing the service to 'chargen' or 'ftp'. This way, the
+program connects to other services that should give you some response.
+If you are curious, you should have a look at your '/etc/services' file.
+It could look like this:
+
+     # /etc/services:
+     #
+     # Network services, Internet style
+     #
+     # Name      Number/Protocol  Alternate name    # Comments
+
+     echo        7/tcp
+     echo        7/udp
+     discard     9/tcp            sink null
+     discard     9/udp            sink null
+     daytime    13/tcp
+     daytime    13/udp
+     chargen    19/tcp            ttytst source
+     chargen    19/udp            ttytst source
+     ftp        21/tcp
+     telnet     23/tcp
+     smtp       25/tcp            mail
+     finger     79/tcp
+     www        80/tcp            http              # WorldWideWeb HTTP
+     www        80/udp                              # HyperText Transfer Protocol
+     pop-2     109/tcp            postoffice        # POP version 2
+     pop-2     109/udp
+     pop-3     110/tcp                              # POP version 3
+     pop-3     110/udp
+     nntp      119/tcp            readnews untp     # USENET News
+     irc       194/tcp                              # Internet Relay Chat
+     irc       194/udp
+     ...
+
+ Here, you find a list of services that traditional Unix machines
+usually support. If your GNU/Linux machine does not do so, it may be
+that these services are switched off in some startup script. Systems
+running some flavor of Microsoft Windows usually do _not_ support these
+services. Nevertheless, it _is_ possible to do networking with 'gawk'
+on Microsoft Windows.(1) The first column of the file gives the name of
+the service, and the second column gives a unique number and the
+protocol that one can use to connect to this service. The rest of the
+line is treated as a comment. You see that some services ('echo')
+support TCP as well as UDP.
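+
+   Since '/etc/services' is an ordinary text file, 'gawk' itself can be
+used to inspect it. A small sketch of our own (not part of any standard
+tool) that prints the name and port number of every TCP service listed
+there might look like this:
+
+     # print the name and port of each TCP service in /etc/services
+     # run with: gawk -f listtcp.awk /etc/services
+     $1 !~ /^#/ && $2 ~ /\/tcp$/ {
+         split($2, parts, "/")        # "21/tcp" -> "21" and "tcp"
+         printf "%-15s %5d\n", $1, parts[1]
+     }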
+
+ ---------- Footnotes ----------
+
+ (1) Microsoft preferred to ignore the TCP/IP family of protocols
+until 1995. Then came the rise of the Netscape browser as a landmark
+"killer application." Microsoft added TCP/IP support and their own
+browser to Microsoft Windows 95 at the last minute. They even
+back-ported their TCP/IP implementation to Microsoft Windows for
+Workgroups 3.11, but it was a rather rudimentary and half-hearted
+implementation. Nevertheless, the equivalent of '/etc/services' resides
+under 'C:\WINNT\system32\drivers\etc\services' on Microsoft Windows 2000
+and Microsoft Windows XP.
+
+
+File: gawkinet.info, Node: Interacting, Next: Setting Up, Prev: Troubleshooting, Up: Using Networking
+
+2.4 Interacting with a Network Service
+======================================
+
+The next program makes use of the possibility to really interact with a
+network service by printing something into the special file. It asks
+the so-called 'finger' service if a user of the machine is logged in.
+When testing this program, try to change 'localhost' to some other
+machine name in your local network:
+
+ BEGIN {
+ NetService = "/inet/tcp/0/localhost/finger"
+ print "NAME" |& NetService
+ while ((NetService |& getline) > 0)
+ print $0
+ close(NetService)
+ }
+
+ After telling the service on the machine which user to look for, the
+program repeatedly reads lines that come as a reply. When no more lines
+are coming (because the service has closed the connection), the program
+also closes the connection. Try replacing '"NAME"' with your login name
+(or the name of someone else logged in). For a list of all users
+currently logged in, replace NAME with an empty string ('""').
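+
+   To avoid editing the program for every query, the user name can also
+be passed in from the command line (a small variation of our own; the
+variable name 'Name' is arbitrary):
+
+     # run with: gawk -v Name=YOURLOGIN -f finger.awk
+     BEGIN {
+         NetService = "/inet/tcp/0/localhost/finger"
+         print Name |& NetService
+         while ((NetService |& getline) > 0)
+             print $0
+         close(NetService)
+     }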
+
+ The final 'close()' command could be safely deleted from the above
+script, because the operating system closes any open connection by
+default when a script reaches the end of execution. In order to avoid
+portability problems, it is best to always close connections explicitly.
+With the Linux kernel, for example, proper closing results in flushing
+of buffers. Letting the close happen by default may result in
+discarding buffers.
+
+ When looking at '/etc/services' you may have noticed that the
+'daytime' service is also available with 'udp'. In the earlier example,
+change 'tcp' to 'udp', and change 'finger' to 'daytime'. After starting
+the modified program, you see the expected day and time message. The
+program then hangs, because it waits for more lines coming from the
+service. However, they never come. This behavior is a consequence of
+the differences between TCP and UDP. When using UDP, neither party is
+automatically informed about the other closing the connection.
+Continuing to experiment this way reveals many other subtle differences
+between TCP and UDP. To avoid such trouble, one should always remember
+the advice Douglas E. Comer and David Stevens give in Volume III of
+their series 'Internetworking With TCP/IP' (page 14):
+
+ When designing client-server applications, beginners are strongly
+ advised to use TCP because it provides reliable,
+ connection-oriented communication. Programs only use UDP if the
+ application protocol handles reliability, the application requires
+ hardware broadcast or multicast, or the application cannot tolerate
+ virtual circuit overhead.
+
+
+File: gawkinet.info, Node: Setting Up, Next: Email, Prev: Interacting, Up: Using Networking
+
+2.5 Setting Up a Service
+========================
+
+The preceding programs behaved as clients that connect to a server
+somewhere on the Internet and request a particular service. Now we set
+up such a service to mimic the behavior of the 'daytime' service. Such
+a server does not know in advance who is going to connect to it over the
+network. Therefore, we cannot insert a name for the host to connect to
+in our special file name.
+
+ Start the following program in one window. Notice that the service
+does not have the name 'daytime', but the number '8888'. From looking
+at '/etc/services', you know that names like 'daytime' are just
+mnemonics for predetermined 16-bit integers. Only the system
+administrator ('root') could enter our new service into '/etc/services'
+with an appropriate name. Also notice that the service name has to be
+entered into a different field of the special file name because we are
+setting up a server, not a client:
+
+ BEGIN {
+ print strftime() |& "/inet/tcp/8888/0/0"
+ close("/inet/tcp/8888/0/0")
+ }
+
+ Now open another window on the same machine. Copy the client program
+given as the first example (*note Establishing a TCP Connection: TCP
+Connecting.) to a new file and edit it, changing the name 'daytime' to
+'8888'. Then start the modified client. You should get a reply like
+this:
+
+ Sat Sep 27 19:08:16 CEST 1997
+
+Both programs explicitly close the connection.
+
+ Now we will intentionally make a mistake to see what happens when the
+name '8888' (the so-called port) is already used by another service.
+Start the server program in both windows. The first one works, but the
+second one complains that it could not open the connection. Each port
+on a single machine can only be used by one server program at a time.
+Now terminate the server program and change the name '8888' to 'echo'.
+After restarting it, the server program does not run any more, and you
+know why: there is already an 'echo' service running on your machine.
+But even if this isn't true, you would not get your own 'echo' server
+running on a Unix machine, because the ports with numbers smaller than
+1024 ('echo' is at port 7) are reserved for 'root'. On machines running
+some flavor of Microsoft Windows, there is no restriction that reserves
+ports 1 to 1024 for a privileged user; hence, you can start an 'echo'
+server there.
+
+ Turning this short server program into something really useful is
+simple. Imagine a server that first reads a file name from the client
+through the network connection, then does something with the file and
+sends a result back to the client. The server-side processing could be:
+
+ BEGIN {
+ NetService = "/inet/tcp/8888/0/0"
+ NetService |& getline # sets $0 and the fields
+ CatPipe = ("cat " $1)
+ while ((CatPipe | getline) > 0)
+ print $0 |& NetService
+ close(NetService)
+ }
+
+and we would have a remote copying facility. Such a server reads the
+name of a file from any client that connects to it and transmits the
+contents of the named file across the net. The server-side processing
+could also be the execution of a command that is transmitted across the
+network. From this example, you can see how simple it is to open up a
+security hole on your machine. If you allow clients to connect to your
+machine and execute arbitrary commands, anyone would be free to do 'rm
+-rf *'.
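+
+   A matching client for this little file server could look like the
+following sketch (the requested file name '/etc/motd' is just an
+example):
+
+     # Client: ask the server above for a file and print its contents
+     BEGIN {
+         NetService = "/inet/tcp/0/localhost/8888"
+         print "/etc/motd" |& NetService
+         while ((NetService |& getline) > 0)
+             print $0
+         close(NetService)
+     }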
+
+
+File: gawkinet.info, Node: Email, Next: Web page, Prev: Setting Up, Up: Using Networking
+
+2.6 Reading Email
+=================
+
+The distribution of email is usually done by dedicated email servers
+that communicate with your machine using special protocols. To receive
+email, we will use the Post Office Protocol (POP). Sending can be done
+with the much older Simple Mail Transfer Protocol (SMTP).
+
+ When you type in the following program, replace the EMAILHOST by the
+name of your local email server. Ask your administrator if the server
+has a POP service, and then use its name or number in the program below.
+Now the program is ready to connect to your email server, but it will
+not succeed in retrieving your mail because it does not yet know your
+login name or password. Replace them in the program and it shows you
+the first email the server has in store:
+
+ BEGIN {
+ POPService = "/inet/tcp/0/EMAILHOST/pop3"
+ RS = ORS = "\r\n"
+ print "user NAME" |& POPService
+ POPService |& getline
+ print "pass PASSWORD" |& POPService
+ POPService |& getline
+ print "retr 1" |& POPService
+ POPService |& getline
+ if ($1 != "+OK") exit
+ print "quit" |& POPService
+ RS = "\r\n\\.\r\n"
+ POPService |& getline
+ print $0
+ close(POPService)
+ }
+
+ The record separators 'RS' and 'ORS' are redefined because the
+protocol (POP) requires CR-LF to separate lines. After identifying
+yourself to the email service, the command 'retr 1' instructs the
+service to send the first of all your email messages in line. If the
+service replies with something other than '+OK', the program exits;
+maybe there is no email. Otherwise, the program first announces that it
+intends to finish reading email, and then redefines 'RS' in order to
+read the entire email as multiline input in one record. From the POP
+RFC, we know that the body of the email always ends with a single line
+containing a single dot. The program looks for this using 'RS =
+"\r\n\\.\r\n"'. When it finds this sequence in the mail message, it
+quits. You can invoke this program as often as you like; it does not
+delete the message it reads, but instead leaves it on the server.
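+
+   Instead of editing the program for every user, the login data could
+also be supplied on the command line (a variation of our own; the
+variable names 'Login' and 'Pass' are arbitrary):
+
+     gawk -v Login=NAME -v Pass=PASSWORD -f getpop.awk
+
+In the program itself, the two corresponding lines then become 'print
+"user " Login |& POPService' and 'print "pass " Pass |& POPService'.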
+
+
+File: gawkinet.info, Node: Web page, Next: Primitive Service, Prev: Email, Up: Using Networking
+
+2.7 Reading a Web Page
+======================
+
+Retrieving a web page from a web server is as simple as retrieving email
+from an email server. We only have to use a similar, but not identical,
+protocol and a different port. The name of the protocol is HyperText
+Transfer Protocol (HTTP) and the port number is usually 80. As in the
+preceding node, ask your administrator about the name of your local web
+server or proxy web server and its port number for HTTP requests.
+
+ The following program employs a rather crude approach toward
+retrieving a web page. It uses the prehistoric syntax of HTTP 0.9,
+which almost all web servers still support. The most noticeable thing
+about it is that the program directs the request to the local proxy
+server whose name you insert in the special file name (which in turn
+calls 'www.yahoo.com'):
+
+ BEGIN {
+ RS = ORS = "\r\n"
+ HttpService = "/inet/tcp/0/PROXY/80"
+ print "GET http://www.yahoo.com" |& HttpService
+ while ((HttpService |& getline) > 0)
+ print $0
+ close(HttpService)
+ }
+
+ Again, lines are separated by a redefined 'RS' and 'ORS'. The 'GET'
+request that we send to the server is the only kind of HTTP request that
+existed when the web was created in the early 1990s. HTTP calls this
+'GET' request a "method," which tells the service to transmit a web page
+(here the home page of the Yahoo! search engine). Version 1.0 added
+the request methods 'HEAD' and 'POST'. The current version of HTTP is
+1.1,(1) and knows the additional request methods 'OPTIONS', 'PUT',
+'DELETE', and 'TRACE'. You can fill in any valid web address, and the
+program prints the HTML code of that page to your screen.
+
+ Notice the similarity between the responses of the POP and HTTP
+services. First, you get a header that is terminated by an empty line,
+and then you get the body of the page in HTML. The lines of the headers
+also have the same form as in POP. There is the name of a parameter,
+then a colon, and finally the value of that parameter.
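+
+   This regular structure makes the header easy to pick apart. The
+reading loop of the previous program could, for example, be replaced by
+the following sketch (the variable names are our own), which collects
+the header parameters into an array and prints only the body:
+
+     InBody = 0
+     while ((HttpService |& getline) > 0) {
+         if (InBody)
+             print $0                      # part of the HTML body
+         else if ($0 == "")
+             InBody = 1                    # empty line ends the header
+         else if ((n = index($0, ":")) > 0)
+             Header[substr($0, 1, n - 1)] = substr($0, n + 2)
+     }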
+
+ Images ('.png' or '.gif' files) can also be retrieved this way, but
+then you get binary data that should be redirected into a file. Another
+application is calling a CGI (Common Gateway Interface) script on some
+server. CGI scripts are used when the contents of a web page are not
+constant, but generated instantly at the moment you send a request for
+the page. For example, to get a detailed report about the current
+quotes of Motorola stock shares, call a CGI script at Yahoo! with the
+following:
+
+ get = "GET http://quote.yahoo.com/q?s=MOT&d=t"
+ print get |& HttpService
+
+ You can also request weather reports this way.
+
+ ---------- Footnotes ----------
+
+ (1) Version 1.0 of HTTP was defined in RFC 1945. HTTP 1.1 was
+initially specified in RFC 2068. In June 1999, RFC 2068 was made
+obsolete by RFC 2616, an update without any substantial changes.
+
+
+File: gawkinet.info, Node: Primitive Service, Next: Interacting Service, Prev: Web page, Up: Using Networking
+
+2.8 A Primitive Web Service
+===========================
+
+Now we know enough about HTTP to set up a primitive web service that
+just says '"Hello, world"' when someone connects to it with a browser.
+Compared to the situation in the preceding node, our program changes
+roles: it now tries to behave just like the server we have observed. Since
+we are setting up a server here, we have to insert the port number in
+the 'localport' field of the special file name. The other two fields
+(HOSTNAME and REMOTEPORT) have to contain a '0' because we do not know
+in advance which host will connect to our service.
+
+ In the early 1990s, all a server had to do was send an HTML document
+and close the connection. Here, we adhere to the modern syntax of HTTP.
+The steps are as follows:
+
+ 1. Send a status line telling the web browser that everything is okay.
+
+ 2. Send a line to tell the browser how many bytes follow in the body
+ of the message. This was not necessary earlier because both
+ parties knew that the document ended when the connection closed.
+ Nowadays it is possible to stay connected after the transmission of
+ one web page. This is to avoid the network traffic necessary for
+ repeatedly establishing TCP connections for requesting several
+ images. Thus, there is the need to tell the receiving party how
+ many bytes will be sent. The header is terminated as usual with an
+ empty line.
+
+ 3. Send the '"Hello, world"' body in HTML. The useless 'while' loop
+ swallows the request of the browser. We could actually omit the
+ loop, and on most machines the program would still work. First,
+ start the following program:
+
+ BEGIN {
+ RS = ORS = "\r\n"
+ HttpService = "/inet/tcp/8080/0/0"
+ Hello = "<HTML><HEAD>" \
+ "<TITLE>A Famous Greeting</TITLE></HEAD>" \
+ "<BODY><H1>Hello, world</H1></BODY></HTML>"
+ Len = length(Hello) + length(ORS)
+ print "HTTP/1.0 200 OK" |& HttpService
+ print "Content-Length: " Len ORS |& HttpService
+ print Hello |& HttpService
+ while ((HttpService |& getline) > 0)
+ continue;
+ close(HttpService)
+ }
+
+ Now, on the same machine, start your favorite browser and let it
+point to <http://localhost:8080> (the browser needs to know on which
+port our server is listening for requests). If this does not work, the
+browser probably tries to connect to a proxy server that does not know
+your machine. If so, change the browser's configuration so that the
+browser does not try to use a proxy to connect to your machine.
+
+
+File: gawkinet.info, Node: Interacting Service, Next: Simple Server, Prev: Primitive Service, Up: Using Networking
+
+2.9 A Web Service with Interaction
+==================================
+
+This node shows how to set up a simple web server. The subnode is a
+library file that we will use with all the examples in *note Some
+Applications and Techniques::.
+
+* Menu:
+
+* CGI Lib:: A simple CGI library.
+
+ Setting up a web service that allows user interaction is more
+difficult and shows us the limits of network access in 'gawk'. In this
+node, we develop a main program (a 'BEGIN' pattern and its action) that
+will become the core of event-driven execution controlled by a graphical
+user interface (GUI). Each HTTP event that the user triggers by some
+action within the browser is received in this central procedure.
+Parameters and menu choices are extracted from this request, and an
+appropriate measure is taken according to the user's choice. For
+example:
+
+ BEGIN {
+ if (MyHost == "") {
+ "uname -n" | getline MyHost
+ close("uname -n")
+ }
+ if (MyPort == 0) MyPort = 8080
+ HttpService = "/inet/tcp/" MyPort "/0/0"
+ MyPrefix = "http://" MyHost ":" MyPort
+ SetUpServer()
+ while ("awk" != "complex") {
+ # header lines are terminated this way
+ RS = ORS = "\r\n"
+ Status = 200 # this means OK
+ Reason = "OK"
+ Header = TopHeader
+ Document = TopDoc
+ Footer = TopFooter
+ if (GETARG["Method"] == "GET") {
+ HandleGET()
+ } else if (GETARG["Method"] == "HEAD") {
+ # not yet implemented
+ } else if (GETARG["Method"] != "") {
+ print "bad method", GETARG["Method"]
+ }
+ Prompt = Header Document Footer
+ print "HTTP/1.0", Status, Reason |& HttpService
+ print "Connection: Close" |& HttpService
+ print "Pragma: no-cache" |& HttpService
+ len = length(Prompt) + length(ORS)
+ print "Content-length:", len |& HttpService
+ print ORS Prompt |& HttpService
+ # ignore all the header lines
+ while ((HttpService |& getline) > 0)
+ ;
+ # stop talking to this client
+ close(HttpService)
+ # wait for new client request
+ HttpService |& getline
+ # do some logging
+ print systime(), strftime(), $0
+ # read request parameters
+ CGI_setup($1, $2, $3)
+ }
+ }
+
+ This web server presents menu choices in the form of HTML links.
+Therefore, it has to tell the browser the name of the host it is
+residing on. When starting the server, the user may supply the name of
+the host from the command line with 'gawk -v MyHost="Rumpelstilzchen"'.
+If the user does not do this, the server looks up the name of the host
+it is running on for later use as a web address in HTML documents. The
+same applies to the port number. These values are inserted later into
+the HTML content of the web pages to refer to the home system.
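+
+   Assuming the core and its 'SetUpServer()' function are stored in a
+file named (by us) 'webserver.awk', the server could thus be started on
+port 8888 of the host 'Rumpelstilzchen' with:
+
+     gawk -v MyHost="Rumpelstilzchen" -v MyPort=8888 -f webserver.awk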
+
+ Each server that is built around this core has to initialize some
+application-dependent variables (such as the default home page) in a
+procedure 'SetUpServer()', which is called immediately before entering
+the infinite loop of the server. For now, we will write an instance
+that initiates a trivial interaction. With this home page, the client
+user can click on two possible choices, and receive the current date
+either in human-readable format or in seconds since 1970:
+
+ function SetUpServer() {
+ TopHeader = "<HTML><HEAD>"
+ TopHeader = TopHeader \
+ "<title>My name is GAWK, GNU AWK</title></HEAD>"
+ TopDoc = "<BODY><h2>\
+ Do you prefer your date <A HREF=" MyPrefix \
+ "/human>human</A> or \
+ <A HREF=" MyPrefix "/POSIX>POSIXed</A>?</h2>" ORS ORS
+ TopFooter = "</BODY></HTML>"
+ }
+
+ On the first run through the main loop, the default line terminators
+are set and the default home page is copied to the actual home page.
+Since this is the first run, 'GETARG["Method"]' is not initialized yet,
+hence the case selection over the method does nothing. Now that the
+home page is initialized, the server can start communicating to a client
+browser.
+
+ It does so by printing the HTTP header into the network connection
+('print ... |& HttpService'). This command blocks execution of the
+server script until a client connects. If this server script is
+compared with the primitive one we wrote before, you will notice two
+additional lines in the header. The first instructs the browser to
+close the connection after each request. The second tells the browser
+that it should never try to _remember_ earlier requests that had
+identical web addresses (no caching). Otherwise, it could happen that
+the browser retrieves the time of day in the previous example just once,
+and later it takes the web page from the cache, always displaying the
+same time of day although time advances each second.
+
+ Having supplied the initial home page to the browser with a valid
+document stored in the parameter 'Prompt', it closes the connection and
+waits for the next request. When the request comes, a log line is
+printed that allows us to see which request the server receives. The
+final step in the loop is to call the function 'CGI_setup()', which
+reads all the lines of the request (coming from the browser), processes
+them, and stores the transmitted parameters in the array 'PARAM'. The
+complete text of these application-independent functions can be found in
+*note A Simple CGI Library: CGI Lib. For now, we use a simplified
+version of 'CGI_setup()':
+
+ function CGI_setup( method, uri, version, i) {
+ delete GETARG; delete MENU; delete PARAM
+ GETARG["Method"] = $1
+ GETARG["URI"] = $2
+ GETARG["Version"] = $3
+ i = index($2, "?")
+ # is there a "?" indicating a CGI request?
+ if (i > 0) {
+ split(substr($2, 1, i-1), MENU, "[/:]")
+ split(substr($2, i+1), PARAM, "&")
+ for (i in PARAM) {
+ j = index(PARAM[i], "=")
+ GETARG[substr(PARAM[i], 1, j-1)] = \
+ substr(PARAM[i], j+1)
+ }
+ } else { # there is no "?", no need for splitting PARAMs
+ split($2, MENU, "[/:]")
+ }
+ }
+
+ At first, the function clears all variables used for global storage
+of request parameters. The rest of the function serves the purpose of
+filling the global parameters with the extracted new values. To
+accomplish this, the name of the requested resource is split into parts
+and stored for later evaluation. If the request contains a '?', then
+the request has CGI variables seamlessly appended to the web address.
+Everything in front of the '?' is split up into menu items, and
+everything behind the '?' is a list of 'VARIABLE=VALUE' pairs (separated
+by '&') that also need splitting. This way, CGI variables are isolated
+and stored. This procedure lacks recognition of special characters that
+are transmitted in coded form(1). Here, any optional request header and
+body parts are ignored. We do not need header parameters and the
+request body. However, when refining our approach or working with the
+'POST' and 'PUT' methods, reading the header and body becomes
+inevitable. Header parameters should then be stored in a global array
+as well as the body.
+
+ On each subsequent run through the main loop, one request from a
+browser is received, evaluated, and answered according to the user's
+choice. This can be done by letting the value of the HTTP method guide
+the main loop into execution of the procedure 'HandleGET()', which
+evaluates the user's choice. In this case, we have only one
+hierarchical level of menus, but in the general case, menus are nested.
+The menu choices at each level are separated by '/', just as in file
+names. Notice how simple it is to construct menus of arbitrary depth:
+
+ function HandleGET() {
+ if ( MENU[2] == "human") {
+ Footer = strftime() TopFooter
+ } else if (MENU[2] == "POSIX") {
+ Footer = systime() TopFooter
+ }
+ }
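+
+   A second menu level would simply look at 'MENU[3]' as well. As a
+sketch of our own (the home page above does not offer this link), the
+human-readable date could additionally be served in UTC:
+
+     function HandleGET() {
+         if (MENU[2] == "human") {
+             if (MENU[3] == "utc")
+                 Footer = strftime("%c", systime(), 1) TopFooter
+             else
+                 Footer = strftime() TopFooter
+         } else if (MENU[2] == "POSIX") {
+             Footer = systime() TopFooter
+         }
+     }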
+
+ The disadvantage of this approach is that our server is slow and can
+handle only one request at a time. Its main advantage, however, is that
+the server consists of just one 'gawk' program. No need for installing
+an 'httpd', and no need for static separate HTML files, CGI scripts, or
+'root' privileges. This is rapid prototyping. This program can be
+started on the same host that runs your browser. Then let your browser
+point to <http://localhost:8080>.
+
+ It is also possible to include images into the HTML pages. Most
+browsers support the not very well-known '.xbm' format, which may
+contain only monochrome pictures but is an ASCII format. Binary images
+are possible but not so easy to handle. Another way of including images
+is to generate them with a tool such as GNUPlot, by calling the tool
+with the 'system()' function or through a pipe.
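+
+   Such a call through a pipe might look like the following sketch (it
+assumes a 'gnuplot' binary with PNG support is installed; the output
+file name 'plot.png' is our own choice):
+
+     # generate plot.png with GNUPlot through a pipe
+     cmd = "gnuplot"
+     print "set terminal png"      | cmd
+     print "set output 'plot.png'" | cmd
+     print "plot sin(x)"           | cmd
+     close(cmd)                    # wait for gnuplot to finish writing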
+
+ ---------- Footnotes ----------
+
+ (1) As defined in RFC 2068.
+
+
+File: gawkinet.info, Node: CGI Lib, Prev: Interacting Service, Up: Interacting Service
+
+2.9.1 A Simple CGI Library
+--------------------------
+
+ HTTP is like being married: you have to be able to handle whatever
+ you're given, while being very careful what you send back.
+ Phil Smith III,
+ <http://www.netfunny.com/rhf/jokes/99/Mar/http.html>
+
+ In *note A Web Service with Interaction: Interacting Service, we saw
+the function 'CGI_setup()' as part of the web server "core logic"
+framework. The code presented there handles almost everything necessary
+for CGI requests. One thing it doesn't do is handle encoded characters
+in the requests. For example, an '&' is encoded as a percent sign
+followed by the hexadecimal value: '%26'. These encoded values should
+be decoded. Following is a simple library to perform these tasks. This
+code is used for all web server examples used throughout the rest of
+this Info file. If you want to use it for your own web server, store
+the source code into a file named 'inetlib.awk'. Then you can include
+these functions into your code by placing the following statement into
+your program (on the first line of your script):
+
+ @include inetlib.awk
+
+But beware, this mechanism is only possible if you invoke your web
+server script with 'igawk' instead of the usual 'awk' or 'gawk'. Here
+is the code:
+
+ # CGI Library and core of a web server
+ # Global arrays
+ # GETARG --- arguments to CGI GET command
+ # MENU --- menu items (path names)
+ # PARAM --- parameters of form x=y
+
+ # Optional variable MyHost contains host address
+ # Optional variable MyPort contains port number
+ # Needs TopHeader, TopDoc, TopFooter
+ # Sets MyPrefix, HttpService, Status, Reason
+
+ BEGIN {
+ if (MyHost == "") {
+ "uname -n" | getline MyHost
+ close("uname -n")
+ }
+ if (MyPort == 0) MyPort = 8080
+ HttpService = "/inet/tcp/" MyPort "/0/0"
+ MyPrefix = "http://" MyHost ":" MyPort
+ SetUpServer()
+ while ("awk" != "complex") {
+ # header lines are terminated this way
+ RS = ORS = "\r\n"
+ Status = 200 # this means OK
+ Reason = "OK"
+ Header = TopHeader
+ Document = TopDoc
+ Footer = TopFooter
+ if (GETARG["Method"] == "GET") {
+ HandleGET()
+ } else if (GETARG["Method"] == "HEAD") {
+ # not yet implemented
+ } else if (GETARG["Method"] != "") {
+ print "bad method", GETARG["Method"]
+ }
+ Prompt = Header Document Footer
+ print "HTTP/1.0", Status, Reason |& HttpService
+ print "Connection: Close" |& HttpService
+ print "Pragma: no-cache" |& HttpService
+ len = length(Prompt) + length(ORS)
+ print "Content-length:", len |& HttpService
+ print ORS Prompt |& HttpService
+ # ignore all the header lines
+ while ((HttpService |& getline) > 0)
+ continue
+ # stop talking to this client
+ close(HttpService)
+ # wait for new client request
+ HttpService |& getline
+ # do some logging
+ print systime(), strftime(), $0
+ CGI_setup($1, $2, $3)
+ }
+ }
+
+ function CGI_setup( method, uri, version, i)
+ {
+ delete GETARG
+ delete MENU
+ delete PARAM
+ GETARG["Method"] = method
+ GETARG["URI"] = uri
+ GETARG["Version"] = version
+
+ i = index(uri, "?")
+ if (i > 0) { # is there a "?" indicating a CGI request?
+ split(substr(uri, 1, i-1), MENU, "[/:]")
+ split(substr(uri, i+1), PARAM, "&")
+ for (i in PARAM) {
+ PARAM[i] = _CGI_decode(PARAM[i])
+ j = index(PARAM[i], "=")
+ GETARG[substr(PARAM[i], 1, j-1)] = \
+ substr(PARAM[i], j+1)
+ }
+ } else { # there is no "?", no need for splitting PARAMs
+ split(uri, MENU, "[/:]")
+ }
+ for (i in MENU) # decode characters in path
+ if (i > 4) # but not those in host name
+ MENU[i] = _CGI_decode(MENU[i])
+ }
+
+ This isolates details in a single function, 'CGI_setup()'. Decoding
+of encoded characters is pushed off to a helper function,
+'_CGI_decode()'. The use of the leading underscore ('_') in the
+function name is intended to indicate that it is an "internal" function,
+although there is nothing to enforce this:
+
+ function _CGI_decode(str, hexdigs, i, pre, code1, code2,
+ val, result)
+ {
+ hexdigs = "123456789abcdef"
+
+ i = index(str, "%")
+ if (i == 0) # no work to do
+ return str
+
+ do {
+ pre = substr(str, 1, i-1) # part before %xx
+ code1 = substr(str, i+1, 1) # first hex digit
+ code2 = substr(str, i+2, 1) # second hex digit
+ str = substr(str, i+3) # rest of string
+
+ code1 = tolower(code1)
+ code2 = tolower(code2)
+ val = index(hexdigs, code1) * 16 \
+ + index(hexdigs, code2)
+
+ result = result pre sprintf("%c", val)
+ i = index(str, "%")
+ } while (i != 0)
+ if (length(str) > 0)
+ result = result str
+ return result
+ }
+
+ This works by splitting the string apart around an encoded character.
+The two digits are converted to lowercase characters and looked up in a
+string of hex digits. Note that '0' is not in the string on purpose;
+'index()' returns zero when it's not found, automatically giving the
+correct value! Once the hexadecimal value is converted from characters
+in a string into a numerical value, 'sprintf()' converts the value back
+into a real character. The following is a simple test harness for the
+above functions:
+
+ BEGIN {
+ CGI_setup("GET",
+ "http://www.gnu.org/cgi-bin/foo?p1=stuff&p2=stuff%26junk" \
+ "&percent=a %25 sign",
+ "1.0")
+ for (i in MENU)
+ printf "MENU[\"%s\"] = %s\n", i, MENU[i]
+ for (i in PARAM)
+ printf "PARAM[\"%s\"] = %s\n", i, PARAM[i]
+ for (i in GETARG)
+ printf "GETARG[\"%s\"] = %s\n", i, GETARG[i]
+ }
+
+ And this is the result when we run it:
+
+ $ gawk -f testserv.awk
+ -| MENU["4"] = www.gnu.org
+ -| MENU["5"] = cgi-bin
+ -| MENU["6"] = foo
+ -| MENU["1"] = http
+ -| MENU["2"] =
+ -| MENU["3"] =
+ -| PARAM["1"] = p1=stuff
+ -| PARAM["2"] = p2=stuff&junk
+ -| PARAM["3"] = percent=a % sign
+ -| GETARG["p1"] = stuff
+ -| GETARG["percent"] = a % sign
+ -| GETARG["p2"] = stuff&junk
+ -| GETARG["Method"] = GET
+ -| GETARG["Version"] = 1.0
+ -| GETARG["URI"] = http://www.gnu.org/cgi-bin/foo?p1=stuff&
+ p2=stuff%26junk&percent=a %25 sign
+
+
+File: gawkinet.info, Node: Simple Server, Next: Caveats, Prev: Interacting Service, Up: Using Networking
+
+2.10 A Simple Web Server
+========================
+
+In the preceding node, we built the core logic for event-driven GUIs.
+In this node, we finally extend the core to a real application. No one
+would actually write a commercial web server in 'gawk', but it is
+instructive to see that it is feasible in principle.
+
+ The application is ELIZA, the famous program by Joseph Weizenbaum
+that mimics the behavior of a professional psychotherapist when talking
+to you. Weizenbaum would certainly object to this description, but this
+is part of the legend around ELIZA. Take the site-independent core logic
+and append the following code:
+
+ function SetUpServer() {
+ SetUpEliza()
+ TopHeader = \
+ "<HTML><title>An HTTP-based System with GAWK</title>\
+ <HEAD><META HTTP-EQUIV=\"Content-Type\"\
+ CONTENT=\"text/html; charset=iso-8859-1\"></HEAD>\
+ <BODY BGCOLOR=\"#ffffff\" TEXT=\"#000000\"\
+ LINK=\"#0000ff\" VLINK=\"#0000ff\"\
+ ALINK=\"#0000ff\"> <A NAME=\"top\">"
+ TopDoc = "\
+ <h2>Please choose one of the following actions:</h2>\
+ <UL>\
+ <LI>\
+ <A HREF=" MyPrefix "/AboutServer>About this server</A>\
+ </LI><LI>\
+ <A HREF=" MyPrefix "/AboutELIZA>About Eliza</A></LI>\
+ <LI>\
+ <A HREF=" MyPrefix \
+ "/StartELIZA>Start talking to Eliza</A></LI></UL>"
+ TopFooter = "</BODY></HTML>"
+ }
+
+ 'SetUpServer()' is similar to the previous example, except for
+calling another function, 'SetUpEliza()'. This approach can be used to
+implement other kinds of servers. The only changes needed to do so are
+hidden in the functions 'SetUpServer()' and 'HandleGET()'. Perhaps it
+might be necessary to implement other HTTP methods. The 'igawk' program
+that comes with 'gawk' may be useful for this process.
+
+ When extending this example to a complete application, the first
+thing to do is to implement the function 'SetUpServer()' to initialize
+the HTML pages and some variables. These initializations determine the
+way your HTML pages look (colors, titles, menu items, etc.).
+
+ The function 'HandleGET()' is a nested case selection that decides
+which page the user wants to see next. Each nesting level refers to a
+menu level of the GUI. Each case implements a certain action of the
+menu. On the deepest level of case selection, the handler essentially
+knows what the user wants and stores the answer into the variable that
+holds the HTML page contents:
+
+ function HandleGET() {
+ # A real HTTP server would treat some parts of the URI as a file name.
+ # We take parts of the URI as menu choices and go on accordingly.
+ if(MENU[2] == "AboutServer") {
+ Document = "This is not a CGI script.\
+ This is an httpd, an HTML file, and a CGI script all \
+ in one GAWK script. It needs no separate www-server, \
+ no installation, and no root privileges.\
+ <p>To run it, do this:</p><ul>\
+ <li> start this script with \"gawk -f httpserver.awk\",</li>\
+ <li> and on the same host let your www browser open location\
+ \"http://localhost:8080\"</li>\
+ </ul><p>Details of HTTP come from:</p><ul>\
+ <li>Hethmon: Illustrated Guide to HTTP</li>\
+ <li>RFC 2068</li></ul><p>JK 14.9.1997</p>"
+ } else if (MENU[2] == "AboutELIZA") {
+ Document = "This is an implementation of the famous ELIZA\
+ program by Joseph Weizenbaum. It is written in GAWK and\
+ uses an HTML GUI."
+ } else if (MENU[2] == "StartELIZA") {
+ gsub(/\+/, " ", GETARG["YouSay"])
+ # Here we also have to substitute coded special characters
+ Document = "<form method=GET>" \
+ "<h3>" ElizaSays(GETARG["YouSay"]) "</h3>\
+ <p><input type=text name=YouSay value=\"\" size=60>\
+ <br><input type=submit value=\"Tell her about it\"></p></form>"
+ }
+ }
+
+ Now we are down to the heart of ELIZA, so you can see how it works.
+Initially the user does not say anything; then ELIZA resets its money
+counter and asks the user to tell what comes to mind open heartedly.
+The subsequent answers are converted to uppercase characters and stored
+for later comparison. ELIZA presents the bill when being confronted
+with a sentence that contains the phrase "shut up." Otherwise, it looks
+for keywords in the sentence, conjugates the rest of the sentence,
+remembers the keyword for later use, and finally selects an answer from
+the set of possible answers:
+
+ function ElizaSays(YouSay) {
+ if (YouSay == "") {
+ cost = 0
+ answer = "HI, IM ELIZA, TELL ME YOUR PROBLEM"
+ } else {
+ q = toupper(YouSay)
+ gsub("'", "", q)
+ if(q == qold) {
+ answer = "PLEASE DONT REPEAT YOURSELF !"
+ } else {
+ if (index(q, "SHUT UP") > 0) {
+ answer = "WELL, PLEASE PAY YOUR BILL. ITS EXACTLY ... $"\
+ int(100*rand()+30+cost/100)
+ } else {
+ qold = q
+ w = "-" # no keyword recognized yet
+ for (i in k) { # search for keywords
+ if (index(q, i) > 0) {
+ w = i
+ break
+ }
+ }
+ if (w == "-") { # no keyword, take old subject
+ w = wold
+ subj = subjold
+ } else { # find subject
+ subj = substr(q, index(q, w) + length(w)+1)
+ wold = w
+ subjold = subj # remember keyword and subject
+ }
+ for (i in conj)
+ gsub(i, conj[i], q) # conjugation
+ # from all answers to this keyword, select one randomly
+ answer = r[indices[int(split(k[w], indices) * rand()) + 1]]
+ # insert subject into answer
+ gsub("_", subj, answer)
+ }
+ }
+ }
+ cost += length(answer) # for later payment : 1 cent per character
+ return answer
+ }
+
+ In the long but simple function 'SetUpEliza()', you can see tables
+for conjugation, keywords, and answers.(1) The associative array 'k'
+contains indices into the array of answers 'r'. To choose an answer,
+ELIZA just picks an index randomly:
+
+ function SetUpEliza() {
+ srand()
+ wold = "-"
+ subjold = " "
+
+ # table for conjugation
+ conj[" ARE " ] = " AM "
+ conj["WERE " ] = "WAS "
+ conj[" YOU " ] = " I "
+ conj["YOUR " ] = "MY "
+ conj[" IVE " ] =\
+ conj[" I HAVE " ] = " YOU HAVE "
+ conj[" YOUVE " ] =\
+ conj[" YOU HAVE "] = " I HAVE "
+ conj[" IM " ] =\
+ conj[" I AM " ] = " YOU ARE "
+ conj[" YOURE " ] =\
+ conj[" YOU ARE " ] = " I AM "
+
+ # table of all answers
+ r[1] = "DONT YOU BELIEVE THAT I CAN _"
+ r[2] = "PERHAPS YOU WOULD LIKE TO BE ABLE TO _ ?"
+ ...
+
+ # table for looking up answers that
+ # fit to a certain keyword
+ k["CAN YOU"] = "1 2 3"
+ k["CAN I"] = "4 5"
+ k["YOU ARE"] =\
+ k["YOURE"] = "6 7 8 9"
+ ...
+ }
+
+ Some interesting remarks and details (including the original source
+code of ELIZA) are found on Mark Humphrys' home page. Yahoo! also has
+a page with a collection of ELIZA-like programs. Many of them are
+written in Java, some of them disclosing the Java source code, and a few
+even explain how to modify the Java source code.
+
+ ---------- Footnotes ----------
+
+ (1) The version shown here is abbreviated. The full version comes
+with the 'gawk' distribution.
+
+
+File: gawkinet.info, Node: Caveats, Next: Challenges, Prev: Simple Server, Up: Using Networking
+
+2.11 Network Programming Caveats
+================================
+
+By now it should be clear that debugging a networked application is more
+complicated than debugging a single-process single-hosted application.
+The behavior of a networked application sometimes looks noncausal
+because it is not reproducible in a strong sense. Whether a network
+application works or not sometimes depends on the following:
+
+ * How crowded the underlying network is
+
+ * Whether the party at the other end is running or not
+
+ * The state of the party at the other end
+
+ The most difficult problems for a beginner arise from the hidden
+states of the underlying network. After closing a TCP connection, it's
+often necessary to wait a short while before reopening the connection.
+Even more difficult is the establishment of a connection that previously
+ended with a "broken pipe." Those connections have to "time out" for a
+minute or so before they can reopen. Check this with the command
+'netstat -a', which provides a list of still "active" connections.
+
+
+File: gawkinet.info, Node: Challenges, Prev: Caveats, Up: Using Networking
+
+2.12 Where To Go From Here
+==========================
+
+Now, you have learned enough to build your own application. You could,
+for example, take part in the Loebner Contest to win $100,000. The
+Loebner Prize is the first formal instantiation of a Turing Test. Hugh
+Loebner agreed with The Cambridge Center for Behavioral Studies to
+underwrite a contest designed to implement the Turing Test. Dr. Loebner
+pledged a Grand Prize of $100,000 for the first computer whose responses
+were indistinguishable from a human's. Each year an annual prize of
+$2000 and a bronze medal are awarded to the _most_ human computer. The
+winner of the annual contest is the best entry relative to other entries
+that year, irrespective of how good it is in an absolute sense. Here is
+an example of a conversation with the winning program of 1997:
+
+ (c)1997 Cambridge Center For Behavioral Studies all rights reserved [CONVERSE]
+ [Intelligent Research Ltd.] Start at: [1997/04/29 08:59:40]
+ PROGRAM[10:39:20]Hi, my name is Catherine, what's yours?
+ JUDGE02[10:39:36]My name is Nathan.
+ PROGRAM[10:39:37]Well, it's nice to meet you, Nathan.
+ PROGRAM[10:39:45]Did you see that story on CNN last night about the lesbian
+ PROGRAM[10:40:03]couple who came out at a White House party on Sunday?
+ JUDGE02[10:40:45]No. I'm sorry I missed that. How did the Prez take it?
+ PROGRAM[10:40:47]Ellen Degeneres was one of them - she was kissing her lover
+ PROGRAM[10:41:08]in front of Bill Clinton. Do you think he was very
+ PROGRAM[10:41:21]embarrassed?
+ JUDGE02[10:41:37]You tell me. I didn't see it.
+ PROGRAM[10:41:38]Are you still a Friend of Bill Clinton?
+ JUDGE02[10:42:22]Yes. I would like to talk about something else. Do you know anything about sculpture.
+ PROGRAM[10:42:24]Well, I suppose he's doing his best like most of us, but he
+ PROGRAM[10:42:35]does seem to attract an UNUSUAL amount of bad publicity,
+ PROGRAM[10:42:51]wouldnt you say--can it all be bad luck?
+ JUDGE02[10:43:20]I asked you if you knew anything about sculpture.
+ PROGRAM[10:43:21]OK, but what happens now--does it all drift on, with
+ PROGRAM[10:43:36]everyone waiting to see how much his friends have said about
+ PROGRAM[10:43:47]him under threat of lots and lots of jail-or will something
+ PROGRAM[10:44:01]dramatic actually happen?
+ JUDGE02[10:44:19]Do you know what sculpture is?
+ PROGRAM[10:44:20]Well, youre more a fan of his than I am I guess---though I
+ PROGRAM[10:44:34]cant SWEAR what Id have done in that locked hotel room, I
+ PROGRAM[10:44:51]must say.
+ JUDGE02[10:45:26]Have you ever been to the Salmagundi Club?
+ PROGRAM[10:45:29]Never. to go back to the original discussion Time to get
+ PROGRAM[10:45:43]off this one I think and onto something more down to earth!
+ JUDGE02[10:46:20]What is more down to earth than art?
+ PROGRAM[10:46:21]I don't know what -- to be honest. -
+ PROGRAM[10:46:30]I suppose it depends on where you're coming from, but as the
+ PROGRAM[10:46:46]song goes in My Fair Lady, "Why can't a woman be more like a
+ PROGRAM[10:47:03]man? "
+ JUDGE02[10:47:52]I liked that song but I prefer "Why can't children be more like we are"
+ JUDGE02[10:48:10]from another play the name of which I cannot remember.
+ PROGRAM[10:48:12]What, if anything, do you enjoy about being a man?
+ JUDGE02[10:49:01]I would enjoy all aspects of life no matter what my gender.
+ PROGRAM[10:49:02]Do you think of yourself as being attractive?
+
+ This program insists on always speaking about the same story around
+Bill Clinton. You see, even a program with a rather narrow mind can
+behave so much like a human being that it can win this prize. It is
+quite common to let these programs talk to each other via network
+connections. But during the competition itself, the program and its
+computer have to be present at the place the competition is held. We
+all would love to see a 'gawk' program win in such an event. Maybe it
+is up to you to accomplish this?
+
+ Some other ideas for useful networked applications:
+ * Read the file 'doc/awkforai.txt' in the 'gawk' distribution. It
+ was written by Ronald P. Loui (at the time, Associate Professor of
+ Computer Science, at Washington University in St. Louis,
+ <loui@ai.wustl.edu>) and summarizes why he taught 'gawk' to
+ students of Artificial Intelligence. Here are some passages from
+ the text:
+
+ The GAWK manual can be consumed in a single lab session and
+ the language can be mastered by the next morning by the
+ average student. GAWK's automatic initialization, implicit
+ coercion, I/O support and lack of pointers forgive many of the
+ mistakes that young programmers are likely to make. Those who
+ have seen C but not mastered it are happy to see that GAWK
+ retains some of the same sensibilities while adding what must
+ be regarded as spoonsful of syntactic sugar.
+ ...
+ There are further simple answers. Probably the best is the
+ fact that increasingly, undergraduate AI programming is
+ involving the Web. Oren Etzioni (University of Washington,
+ Seattle) has for a while been arguing that the "softbot" is
+ replacing the mechanical engineers' robot as the most
+ glamorous AI testbed. If the artifact whose behavior needs to
+ be controlled in an intelligent way is the software agent,
+ then a language that is well-suited to controlling the
+ software environment is the appropriate language. That would
+ imply a scripting language. If the robot is KAREL, then the
+ right language is "turn left; turn right." If the robot is
+ Netscape, then the right language is something that can
+ generate 'netscape -remote
+ 'openURL(http://cs.wustl.edu/~loui)'' with elan.
+ ...
+ AI programming requires high-level thinking. There have
+ always been a few gifted programmers who can write high-level
+ programs in assembly language. Most however need the ambient
+ abstraction to have a higher floor.
+ ...
+ Second, inference is merely the expansion of notation. No
+ matter whether the logic that underlies an AI program is
+ fuzzy, probabilistic, deontic, defeasible, or deductive, the
+ logic merely defines how strings can be transformed into other
+ strings. A language that provides the best support for string
+ processing in the end provides the best support for logic, for
+ the exploration of various logics, and for most forms of
+ symbolic processing that AI might choose to call "reasoning"
+ instead of "logic." The implication is that PROLOG, which
+ saves the AI programmer from having to write a unifier, saves
+ perhaps two dozen lines of GAWK code at the expense of
+ strongly biasing the logic and representational expressiveness
+ of any approach.
+
+ Now that 'gawk' itself can connect to the Internet, it should be
+ obvious that it is suitable for writing intelligent web agents.
+
+ * 'awk' is strong at pattern recognition and string processing. So,
+ it is well suited to the classic problem of language translation.
+ A first try could be a program that knows the 100 most frequent
+ English words and their counterparts in German or French. The
+ service could be implemented by regularly reading email with the
+ program above, replacing each word by its translation and sending
+ the translation back via SMTP. Users would send English email to
+ their translation service and get back a translated email message
+ in return. As soon as this works, more effort can be spent on a
+ real translation program.
+
+ * Another dialogue-oriented application (on the verge of ridicule) is
+ the email "support service." Troubled customers write an email to
+ an automatic 'gawk' service that reads the email. It looks for
+ keywords in the mail and assembles a reply email accordingly. By
+ carefully investigating the email header, and repeating these
+ keywords through the reply email, it is rather simple to give the
+ customer a feeling that someone cares. Ideally, such a service
+ would search a database of previous cases for solutions. If none
+ exists, the database could, for example, consist of all the
+ newsgroups, mailing lists and FAQs on the Internet.
+
+
+File: gawkinet.info, Node: Some Applications and Techniques, Next: Links, Prev: Using Networking, Up: Top
+
+3 Some Applications and Techniques
+**********************************
+
+In this major node, we look at a number of self-contained scripts, with
+an emphasis on concise networking. Along the way, we work towards
+creating building blocks that encapsulate often needed functions of the
+networking world, show new techniques that broaden the scope of problems
+that can be solved with 'gawk', and explore leading edge technology that
+may shape the future of networking.
+
+ We often refer to the site-independent core of the server that we
+built in *note A Simple Web Server: Simple Server. When building new
+and nontrivial servers, we always copy this building block and append
+new instances of the two functions 'SetUpServer()' and 'HandleGET()'.
+
+ This makes a lot of sense, since this scheme of event-driven
+execution provides 'gawk' with an interface to the most widely accepted
+standard for GUIs: the web browser. Now, 'gawk' can rival even Tcl/Tk.
+
+ Tcl and 'gawk' have much in common. Both are simple scripting
+languages that allow us to quickly solve problems with short programs.
+But Tcl has Tk on top of it, and 'gawk' had nothing comparable up to
+now. While Tcl needs a large and ever-changing library (Tk, which was
+bound to the X Window System until recently), 'gawk' needs just the
+networking interface and some kind of browser on the client's side.
+Besides better portability, the most important advantage of this
+approach (embracing well-established standards such as HTTP and HTML) is
+that _we do not need to change the language_. We let others do the work
+of fighting over protocols and standards. We can use HTML, JavaScript,
+VRML, or whatever else comes along to do our work.
+
+* Menu:
+
+* PANIC:: An Emergency Web Server.
+* GETURL:: Retrieving Web Pages.
+* REMCONF:: Remote Configuration Of Embedded Systems.
+* URLCHK:: Look For Changed Web Pages.
+* WEBGRAB:: Extract Links From A Page.
+* STATIST:: Graphing A Statistical Distribution.
+* MAZE:: Walking Through A Maze In Virtual Reality.
+* MOBAGWHO:: A Simple Mobile Agent.
+* STOXPRED:: Stock Market Prediction As A Service.
+* PROTBASE:: Searching Through A Protein Database.
+
+
+File: gawkinet.info, Node: PANIC, Next: GETURL, Prev: Some Applications and Techniques, Up: Some Applications and Techniques
+
+3.1 PANIC: An Emergency Web Server
+==================================
+
+At first glance, the '"Hello, world"' example in *note A Primitive Web
+Service: Primitive Service, seems useless. By adding just a few lines,
+we can turn it into something useful.
+
+ The PANIC program tells everyone who connects that the local site is
+not working. When a web server breaks down, it makes a difference if
+customers get a strange "network unreachable" message, or a short
+message telling them that the server has a problem. In such an
+emergency, the hard disk and everything on it (including the regular web
+service) may be unavailable. Rebooting the web server off a diskette
+makes sense in this setting.
+
+ To use the PANIC program as an emergency web server, all you need are
+the 'gawk' executable and the program below on a diskette. By default,
+it connects to port 8080. A different value may be supplied on the
+command line:
+
+ BEGIN {
+ RS = ORS = "\r\n"
+ if (MyPort == 0) MyPort = 8080
+ HttpService = "/inet/tcp/" MyPort "/0/0"
+ Hello = "<HTML><HEAD><TITLE>Out Of Service</TITLE>" \
+ "</HEAD><BODY><H1>" \
+ "This site is temporarily out of service." \
+ "</H1></BODY></HTML>"
+ Len = length(Hello) + length(ORS)
+ while ("awk" != "complex") {
+ print "HTTP/1.0 200 OK" |& HttpService
+ print "Content-Length: " Len ORS |& HttpService
+ print Hello |& HttpService
+ while ((HttpService |& getline) > 0)
+ continue;
+ close(HttpService)
+ }
+ }
+
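+   To start the server on a different port, assign 'MyPort' on the
+command line; 'gawk''s '-v' option sets the variable before the 'BEGIN'
+rule runs. A hedged example (the file name 'panic.awk' is only
+illustrative) is:
+
+     gawk -v MyPort=8888 -f panic.awk
+
+Pointing a browser at 'http://localhost:8888/' should then produce the
+"out of service" message.
+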
+
+File: gawkinet.info, Node: GETURL, Next: REMCONF, Prev: PANIC, Up: Some Applications and Techniques
+
+3.2 GETURL: Retrieving Web Pages
+================================
+
+GETURL is a versatile building block for shell scripts that need to
+retrieve files from the Internet. It takes a web address as a
+command-line parameter and tries to retrieve the contents of this
+address. The contents are printed to standard output, while the header
+is printed to '/dev/stderr'. A surrounding shell script could analyze
+the contents and extract the text or the links. An ASCII browser could
+be written around GETURL. But more interestingly, web robots are
+straightforward to write on top of GETURL. On the Internet, you can find
+several programs of the same name that do the same job. They are
+usually much more complex internally and at least 10 times longer.
+
+ At first, GETURL checks if it was called with exactly one web
+address. Then, it checks if the user chose to use a special proxy
+server whose name is handed over in a variable. By default, it is
+assumed that the local machine serves as proxy. GETURL uses the 'GET'
+method by default to access the web page. By handing over the name of a
+different method (such as 'HEAD'), it is possible to choose a different
+behavior. With the 'HEAD' method, the user does not receive the body of
+the page content, but does receive the header:
+
+ BEGIN {
+ if (ARGC != 2) {
+ print "GETURL - retrieve Web page via HTTP 1.0"
+ print "IN:\n the URL as a command-line parameter"
+ print "PARAM(S):\n -v Proxy=MyProxy"
+ print "OUT:\n the page content on stdout"
+ print " the page header on stderr"
+ print "JK 16.05.1997"
+ print "ADR 13.08.2000"
+ exit
+ }
+ URL = ARGV[1]; ARGV[1] = ""
+ if (Proxy == "") Proxy = "127.0.0.1"
+ if (ProxyPort == 0) ProxyPort = 80
+ if (Method == "") Method = "GET"
+ HttpService = "/inet/tcp/0/" Proxy "/" ProxyPort
+ ORS = RS = "\r\n\r\n"
+ print Method " " URL " HTTP/1.0" |& HttpService
+ HttpService |& getline Header
+ print Header > "/dev/stderr"
+ while ((HttpService |& getline) > 0)
+ printf "%s", $0
+ close(HttpService)
+ }
+
+ This program can be changed as needed, but be careful with the last
+lines. Make sure transmission of binary data is not corrupted by
+additional line breaks. Even as it is now, the byte sequence
+'"\r\n\r\n"' would disappear if it were contained in binary data. Don't
+get caught in a trap when trying a quick fix on this one.
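+
+   A typical invocation might look like the following hedged sketch (the
+file names are illustrative). The page body goes into the file, while
+the header appears on the screen because it is written to '/dev/stderr';
+with the 'HEAD' method, only the header is of interest:
+
+     gawk -f geturl.awk http://www.suse.de/ > suse.html
+     gawk -v Method=HEAD -f geturl.awk http://www.suse.de/ 2> header.txt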
+
+
+File: gawkinet.info, Node: REMCONF, Next: URLCHK, Prev: GETURL, Up: Some Applications and Techniques
+
+3.3 REMCONF: Remote Configuration of Embedded Systems
+=====================================================
+
+Today, you often find powerful processors in embedded systems.
+Dedicated network routers and controllers for all kinds of machinery are
+examples of embedded systems. Processors like the Intel 80x86 or the
+AMD Elan are able to run multitasking operating systems, such as XINU or
+GNU/Linux in embedded PCs. These systems are small and usually do not
+have a keyboard or a display. Therefore it is difficult to set up their
+configuration. There are several widespread ways to set them up:
+
+ * DIP switches
+
+ * Read Only Memories such as EPROMs
+
+ * Serial lines or some kind of keyboard
+
+ * Network connections via 'telnet' or SNMP
+
+ * HTTP connections with HTML GUIs
+
+ In this node, we look at a solution that uses HTTP connections to
+control variables of an embedded system that are stored in a file.
+Since embedded systems have tight limits on resources like memory, it is
+difficult to employ advanced techniques such as SNMP and HTTP servers.
+'gawk' fits in quite nicely with its single executable which needs just
+a short script to start working. The following program stores the
+variables in a file, and a concurrent process in the embedded system may
+read the file. The program uses the site-independent part of the simple
+web server that we developed in *note A Web Service with Interaction:
+Interacting Service. As mentioned there, all we have to do is to write
+two new procedures 'SetUpServer()' and 'HandleGET()':
+
+ function SetUpServer() {
+ TopHeader = "<HTML><title>Remote Configuration</title>"
+ TopDoc = "<BODY>\
+ <h2>Please choose one of the following actions:</h2>\
+ <UL>\
+ <LI><A HREF=" MyPrefix "/AboutServer>About this server</A></LI>\
+ <LI><A HREF=" MyPrefix "/ReadConfig>Read Configuration</A></LI>\
+ <LI><A HREF=" MyPrefix "/CheckConfig>Check Configuration</A></LI>\
+ <LI><A HREF=" MyPrefix "/ChangeConfig>Change Configuration</A></LI>\
+ <LI><A HREF=" MyPrefix "/SaveConfig>Save Configuration</A></LI>\
+ </UL>"
+ TopFooter = "</BODY></HTML>"
+ if (ConfigFile == "") ConfigFile = "config.asc"
+ }
+
+ The function 'SetUpServer()' initializes the top level HTML texts as
+usual. It also initializes the name of the file that contains the
+configuration parameters and their values. In case the user supplies a
+name from the command line, that name is used. The file is expected to
+contain one parameter per line, with the name of the parameter in column
+one and the value in column two.
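+
+   For example, a minimal 'config.asc' might look like the following
+sketch (the parameter names and values are, of course, only
+illustrative):
+
+     Speed    115200
+     Protocol PPP
+     Timeout  60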
+
+ The function 'HandleGET()' reflects the structure of the menu tree as
+usual. The first menu choice tells the user what this is all about.
+The second choice reads the configuration file line by line and stores
+the parameters and their values. Notice that the record separator for
+this file is '"\n"', in contrast to the record separator for HTTP. The
+third menu choice builds an HTML table to show the contents of the
+configuration file just read. The fourth choice does the real work of
+changing parameters, and the last one just saves the configuration into
+a file:
+
+ function HandleGET() {
+ if(MENU[2] == "AboutServer") {
+ Document = "This is a GUI for remote configuration of an\
+           embedded system. It is implemented as one GAWK script."
+ } else if (MENU[2] == "ReadConfig") {
+ RS = "\n"
+ while ((getline < ConfigFile) > 0)
+ config[$1] = $2;
+ close(ConfigFile)
+ RS = "\r\n"
+ Document = "Configuration has been read."
+ } else if (MENU[2] == "CheckConfig") {
+ Document = "<TABLE BORDER=1 CELLPADDING=5>"
+ for (i in config)
+ Document = Document "<TR><TD>" i "</TD>" \
+ "<TD>" config[i] "</TD></TR>"
+ Document = Document "</TABLE>"
+ } else if (MENU[2] == "ChangeConfig") {
+ if ("Param" in GETARG) { # any parameter to set?
+ if (GETARG["Param"] in config) { # is parameter valid?
+ config[GETARG["Param"]] = GETARG["Value"]
+ Document = (GETARG["Param"] " = " GETARG["Value"] ".")
+ } else {
+ Document = "Parameter <b>" GETARG["Param"] "</b> is invalid."
+ }
+ } else {
+ Document = "<FORM method=GET><h4>Change one parameter</h4>\
+ <TABLE BORDER CELLPADDING=5>\
+ <TR><TD>Parameter</TD><TD>Value</TD></TR>\
+ <TR><TD><input type=text name=Param value=\"\" size=20></TD>\
+ <TD><input type=text name=Value value=\"\" size=40></TD>\
+ </TR></TABLE><input type=submit value=\"Set\"></FORM>"
+ }
+ } else if (MENU[2] == "SaveConfig") {
+ for (i in config)
+ printf("%s %s\n", i, config[i]) > ConfigFile
+ close(ConfigFile)
+ Document = "Configuration has been saved."
+ }
+ }
+
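+   Because the form uses the 'GET' method, changing a parameter amounts
+to requesting a URL with the proper query string attached. A hedged
+example (host name and parameter are illustrative) looks roughly like
+this:
+
+     http://embedded.example.com:8080/ChangeConfig?Param=Speed&Value=115200
+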
+ We could also view the configuration file as a database. From this
+point of view, the previous program acts like a primitive database
+server. Real SQL database systems also make a service available by
+providing a TCP port that clients can connect to. But the application
+level protocols they use are usually proprietary and also change from
+time to time. This is also true for the protocol that MiniSQL uses.
+
+
+File: gawkinet.info, Node: URLCHK, Next: WEBGRAB, Prev: REMCONF, Up: Some Applications and Techniques
+
+3.4 URLCHK: Look for Changed Web Pages
+======================================
+
+Most people who make heavy use of Internet resources have a large
+bookmark file with pointers to interesting web sites. It is impossible
+to regularly check by hand if any of these sites have changed. A
+program is needed to automatically look at the headers of web pages and
+tell which ones have changed. URLCHK does the comparison after using
+GETURL with the 'HEAD' method to retrieve the header.
+
+ Like GETURL, this program first checks that it is called with exactly
+one command-line parameter. URLCHK also takes the same command-line
+variables 'Proxy' and 'ProxyPort' as GETURL, because these variables are
+handed over to GETURL for each URL that gets checked. The one and only
+parameter is the name of a file that contains one line for each URL. In
+the first column, we find the URL, and the second and third columns hold
+the lengths of the URL's body from the last two times it was checked. Now,
+we follow this plan:
+
+ 1. Read the URLs from the file and remember their most recent lengths
+
+ 2. Delete the contents of the file
+
+ 3. For each URL, check its new length and write it into the file
+
+ 4. If the most recent and the new length differ, tell the user
+
+ It may seem a bit peculiar to read the URLs from a file together with
+their two most recent lengths, but this approach has several advantages.
+You can call the program again and again with the same file. After
+running the program, you can regenerate the changed URLs by extracting
+those lines that differ in their second and third columns:
+
+ BEGIN {
+ if (ARGC != 2) {
+ print "URLCHK - check if URLs have changed"
+ print "IN:\n the file with URLs as a command-line parameter"
+ print " file contains URL, old length, new length"
+ print "PARAMS:\n -v Proxy=MyProxy -v ProxyPort=8080"
+ print "OUT:\n same as file with URLs"
+ print "JK 02.03.1998"
+ exit
+ }
+ URLfile = ARGV[1]; ARGV[1] = ""
+ if (Proxy != "") Proxy = " -v Proxy=" Proxy
+ if (ProxyPort != "") ProxyPort = " -v ProxyPort=" ProxyPort
+ while ((getline < URLfile) > 0)
+ Length[$1] = $3 + 0
+ close(URLfile) # now, URLfile is read in and can be updated
+ GetHeader = "gawk " Proxy ProxyPort " -v Method=\"HEAD\" -f geturl.awk "
+ for (i in Length) {
+ GetThisHeader = GetHeader i " 2>&1"
+ while ((GetThisHeader | getline) > 0)
+ if (toupper($0) ~ /CONTENT-LENGTH/) NewLength = $2 + 0
+ close(GetThisHeader)
+ print i, Length[i], NewLength > URLfile
+ if (Length[i] != NewLength) # report only changed URLs
+ print i, Length[i], NewLength
+ }
+ close(URLfile)
+ }
+
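+   For concreteness, a file passed to this program might look like the
+following hedged sketch (URLs and lengths are only illustrative), and
+the program could then be invoked as 'gawk -f urlchk.awk urls.txt',
+assuming that 'geturl.awk' is present in the current directory:
+
+     http://www.suse.de/ 8931 8931
+     http://www.tuwien.ac.at/ 0 12843
+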
+ Another thing that may look strange is the way GETURL is called.
+Before calling GETURL, we have to check if the proxy variables need to
+be passed on. If so, we prepare strings that will become part of the
+command line later. In the variable 'GetHeader', we store these strings
+together with the longest part of the command line. Later, in the loop
+over the URLs, the URL and a redirection operator are appended to
+'GetHeader' to form the command that reads the URL's header over the
+Internet.
+GETURL always produces the headers over '/dev/stderr'. That is the
+reason why we need the redirection operator to have the header piped in.
+
+   This program is not perfect because it assumes that a change in a
+page's content also changes the content's length, which is not
+necessarily true. A more
+advanced approach is to look at some other header line that holds time
+information. But, as always when things get a bit more complicated,
+this is left as an exercise to the reader.
+
+
+File: gawkinet.info, Node: WEBGRAB, Next: STATIST, Prev: URLCHK, Up: Some Applications and Techniques
+
+3.5 WEBGRAB: Extract Links from a Page
+======================================
+
+Sometimes it is necessary to extract links from web pages. Browsers do
+it, web robots do it, and sometimes even humans do it. Since we have a
+tool like GETURL at hand, we can solve this problem with some help from
+the Bourne shell:
+
+ BEGIN { RS = "http://[#%&\\+\\-\\./0-9\\:;\\?A-Z_a-z\\~]*" }
+ RT != "" {
+ command = ("gawk -v Proxy=MyProxy -f geturl.awk " RT \
+ " > doc" NR ".html")
+ print command
+ }
+
+ Notice that the regular expression for URLs is rather crude. A
+precise regular expression is much more complex. But this one works
+rather well. One problem is that it is unable to find internal links of
+an HTML document. Another problem is that 'ftp', 'telnet', 'news',
+'mailto', and other kinds of links are missing in the regular
+expression. However, it is straightforward to add them, if doing so is
+necessary for other tasks.
+
+ This program reads an HTML file and prints all the HTTP links that it
+finds. It relies on 'gawk''s ability to use regular expressions as
+record separators. With 'RS' set to a regular expression that matches
+links, the second action is executed each time a non-empty link is
+found. We can find the matching link itself in 'RT'.
+
+ The action could use the 'system()' function to let another GETURL
+retrieve the page, but here we use a different approach. This simple
+program prints shell commands that can be piped into 'sh' for execution.
+This way it is possible to first extract the links, wrap shell commands
+around them, and pipe all the shell commands into a file. After editing
+the file, execution of the file retrieves exactly those files that we
+really need. In case we do not want to edit, we can retrieve all the
+pages like this:
+
+ gawk -f geturl.awk http://www.suse.de | gawk -f webgrab.awk | sh
+
+ After this, you will find the contents of all referenced documents in
+files named 'doc*.html' even if they do not contain HTML code. The most
+annoying thing is that we always have to pass the proxy to GETURL. If
+you do not like to see the headers of the web pages appear on the
+screen, you can redirect them to '/dev/null'. Watching the headers
+appear can be quite interesting, because it reveals interesting details
+such as which web server the companies use. Now, it is clear how the
+clever marketing people use web robots to determine the market shares of
+Microsoft and Netscape in the web server market.
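+
+   For example, a hedged variant of the pipeline above that hides all
+the headers might look like this:
+
+     gawk -f geturl.awk http://www.suse.de 2> /dev/null |
+         gawk -f webgrab.awk | sh 2> /dev/null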
+
+ Port 80 of any web server is like a small hole in a repellent
+firewall. After attaching a browser to port 80, we usually catch a
+glimpse of the bright side of the server (its home page). With a tool
+like GETURL at hand, we are able to discover some of the more concealed
+or even "indecent" services (i.e., lacking conformity to standards of
+quality). It can be exciting to see the fancy CGI scripts that lie
+there, revealing the inner workings of the server, ready to be called:
+
+ * With a command such as:
+
+ gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/
+
+ some servers give you a directory listing of the CGI files.
+ Knowing the names, you can try to call some of them and watch for
+ useful results. Sometimes there are executables in such
+ directories (such as Perl interpreters) that you may call remotely.
+ If there are subdirectories with configuration data of the web
+ server, this can also be quite interesting to read.
+
+ * The well-known Apache web server usually has its CGI files in the
+ directory '/cgi-bin'. There you can often find the scripts
+ 'test-cgi' and 'printenv'. Both tell you some things about the
+ current connection and the installation of the web server. Just
+ call:
+
+ gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/test-cgi
+ gawk -f geturl.awk http://any.host.on.the.net/cgi-bin/printenv
+
+ * Sometimes it is even possible to retrieve system files like the web
+ server's log file--possibly containing customer data--or even the
+ file '/etc/passwd'. (We don't recommend this!)
+
+ *Caution:* Although this may sound funny or simply irrelevant, we are
+talking about severe security holes. Try to explore your own system
+this way and make sure that none of the above reveals too much
+information about your system.
+
+
+File: gawkinet.info, Node: STATIST, Next: MAZE, Prev: WEBGRAB, Up: Some Applications and Techniques
+
+3.6 STATIST: Graphing a Statistical Distribution
+================================================
+
+In the HTTP server examples we've shown thus far, we never present an
+image to the browser and its user. Presenting images is one task.
+Generating images that reflect some user input and presenting these
+dynamically generated images is another. In this node, we use GNUPlot
+for generating '.png', '.ps', or '.gif' files.(1)
+
+ The program we develop takes the statistical parameters of two
+samples and computes the t-test statistics. As a result, we get the
+probabilities that the means and the variances of both samples are the
+same. In order to let the user check plausibility, the program presents
+an image of the distributions. The statistical computation follows
+'Numerical Recipes in C: The Art of Scientific Computing' by William H.
+Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery.
+Since 'gawk' does not have a built-in function for the computation of
+the beta function, we use the 'ibeta()' function of GNUPlot. As a side
+effect, we learn how to use GNUPlot as a sophisticated calculator. The
+comparison of means is done as in 'tutest', paragraph 14.2, page 613,
+and the comparison of variances is done as in 'ftest', page 611 in
+'Numerical Recipes'.
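+
+   For reference, the quantities computed in 'HandleGET()' below
+correspond to the approximate t-test for two samples with possibly
+unequal variances (Welch's test). With means m1 and m2, variances v1
+and v2, and sample sizes n1 and n2, the program evaluates
+
+     t  = (m1 - m2) / sqrt(v1/n1 + v2/n2)
+
+     df = (v1/n1 + v2/n2)^2
+          / ( (v1/n1)^2/(n1-1) + (v2/n2)^2/(n2-1) )
+
+and obtains the significance levels from the incomplete beta function,
+just as the 'tutest' and 'ftest' routines of 'Numerical Recipes' do.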
+
+ As usual, we take the site-independent code for servers and append
+our own functions 'SetUpServer()' and 'HandleGET()':
+
+ function SetUpServer() {
+ TopHeader = "<HTML><title>Statistics with GAWK</title>"
+ TopDoc = "<BODY>\
+ <h2>Please choose one of the following actions:</h2>\
+ <UL>\
+ <LI><A HREF=" MyPrefix "/AboutServer>About this server</A></LI>\
+ <LI><A HREF=" MyPrefix "/EnterParameters>Enter Parameters</A></LI>\
+ </UL>"
+ TopFooter = "</BODY></HTML>"
+ GnuPlot = "gnuplot 2>&1"
+ m1=m2=0; v1=v2=1; n1=n2=10
+ }
+
+ Here, you see the menu structure that the user sees. Later, we will
+see how the program structure of the 'HandleGET()' function reflects the
+menu structure. What is missing here is the link for the image we
+generate. In an event-driven environment, request, generation, and
+delivery of images are separated.
+
+ Notice the way we initialize the 'GnuPlot' command string for the
+pipe. By default, GNUPlot outputs the generated image via standard
+output, as well as the results of 'print'(ed) calculations via standard
+error. The redirection causes standard error to be mixed into standard
+output, enabling us to read results of calculations with 'getline'. By
+initializing the statistical parameters with some meaningful defaults,
+we make sure the user gets an image the first time he uses the program.
+
+ Following is the rather long function 'HandleGET()', which implements
+the contents of this service by reacting to the different kinds of
+requests from the browser. Before you start playing with this script,
+make sure that your browser supports JavaScript and that it also has
+this option switched on. The script uses a short snippet of JavaScript
+code for delayed opening of a window with an image. A more detailed
+explanation follows:
+
+ function HandleGET() {
+ if(MENU[2] == "AboutServer") {
+ Document = "This is a GUI for a statistical computation.\
+ It compares means and variances of two distributions.\
+ It is implemented as one GAWK script and uses GNUPLOT."
+ } else if (MENU[2] == "EnterParameters") {
+ Document = ""
+ if ("m1" in GETARG) { # are there parameters to compare?
+ Document = Document "<SCRIPT LANGUAGE=\"JavaScript\">\
+ setTimeout(\"window.open(\\\"" MyPrefix "/Image" systime()\
+ "\\\",\\\"dist\\\", \\\"status=no\\\");\", 1000); </SCRIPT>"
+ m1 = GETARG["m1"]; v1 = GETARG["v1"]; n1 = GETARG["n1"]
+ m2 = GETARG["m2"]; v2 = GETARG["v2"]; n2 = GETARG["n2"]
+ t = (m1-m2)/sqrt(v1/n1+v2/n2)
+ df = (v1/n1+v2/n2)*(v1/n1+v2/n2)/((v1/n1)*(v1/n1)/(n1-1) \
+ + (v2/n2)*(v2/n2) /(n2-1))
+ if (v1>v2) {
+ f = v1/v2
+ df1 = n1 - 1
+ df2 = n2 - 1
+ } else {
+ f = v2/v1
+ df1 = n2 - 1
+ df2 = n1 - 1
+ }
+ print "pt=ibeta(" df/2 ",0.5," df/(df+t*t) ")" |& GnuPlot
+ print "pF=2.0*ibeta(" df2/2 "," df1/2 "," \
+ df2/(df2+df1*f) ")" |& GnuPlot
+ print "print pt, pF" |& GnuPlot
+ RS="\n"; GnuPlot |& getline; RS="\r\n" # $1 is pt, $2 is pF
+ print "invsqrt2pi=1.0/sqrt(2.0*pi)" |& GnuPlot
+ print "nd(x)=invsqrt2pi/sd*exp(-0.5*((x-mu)/sd)**2)" |& GnuPlot
+ print "set term png small color" |& GnuPlot
+ #print "set term postscript color" |& GnuPlot
+ #print "set term gif medium size 320,240" |& GnuPlot
+ print "set yrange[-0.3:]" |& GnuPlot
+ print "set label 'p(m1=m2) =" $1 "' at 0,-0.1 left" |& GnuPlot
+ print "set label 'p(v1=v2) =" $2 "' at 0,-0.2 left" |& GnuPlot
+ print "plot mu=" m1 ",sd=" sqrt(v1) ", nd(x) title 'sample 1',\
+ mu=" m2 ",sd=" sqrt(v2) ", nd(x) title 'sample 2'" |& GnuPlot
+ print "quit" |& GnuPlot
+ GnuPlot |& getline Image
+ while ((GnuPlot |& getline) > 0)
+ Image = Image RS $0
+ close(GnuPlot)
+ }
+ Document = Document "\
+ <h3>Do these samples have the same Gaussian distribution?</h3>\
+ <FORM METHOD=GET> <TABLE BORDER CELLPADDING=5>\
+ <TR>\
+             <TD>1. Mean </TD>\
+             <TD><input type=text name=m1 value=" m1 " size=8></TD>\
+             <TD>1. Variance</TD>\
+             <TD><input type=text name=v1 value=" v1 " size=8></TD>\
+             <TD>1. Count </TD>\
+             <TD><input type=text name=n1 value=" n1 " size=8></TD>\
+             </TR><TR>\
+             <TD>2. Mean </TD>\
+             <TD><input type=text name=m2 value=" m2 " size=8></TD>\
+             <TD>2. Variance</TD>\
+             <TD><input type=text name=v2 value=" v2 " size=8></TD>\
+             <TD>2. Count </TD>\
+             <TD><input type=text name=n2 value=" n2 " size=8></TD>\
+ </TR> <input type=submit value=\"Compute\">\
+ </TABLE></FORM><BR>"
+ } else if (MENU[2] ~ "Image") {
+ Reason = "OK" ORS "Content-type: image/png"
+ #Reason = "OK" ORS "Content-type: application/x-postscript"
+ #Reason = "OK" ORS "Content-type: image/gif"
+ Header = Footer = ""
+ Document = Image
+ }
+ }
+
+ As usual, we give a short description of the service in the first
+menu choice. The third menu choice shows us that generation and
+presentation of an image are two separate actions. While the latter
+takes place quite instantly in the third menu choice, the former takes
+place in the much longer second choice. Image data passes from the
+generating action to the presenting action via the variable 'Image' that
+contains a complete '.png' image, which is otherwise stored in a file.
+If you prefer '.ps' or '.gif' images over the default '.png' images, you
+may select these options by uncommenting the appropriate lines. But
+remember to do so in two places: when telling GNUPlot which kind of
+images to generate, and when transmitting the image at the end of the
+program.
+
+ Looking at the end of the program, the way we pass the 'Content-type'
+to the browser is a bit unusual. It is appended to the 'OK' of the
+first header line to make sure the type information becomes part of the
+header. The other variables that get transmitted across the network are
+made empty, because in this case we do not have an HTML document to
+transmit, but rather raw image data to contain in the body.
+
+ Most of the work is done in the second menu choice. It starts with a
+strange JavaScript code snippet. When first implementing this server,
+we used a short '"<IMG SRC=" MyPrefix "/Image>"' here. But then
+browsers got smarter and tried to improve on speed by requesting the
+image and the HTML code at the same time. When doing this, the browser
+tries to build up a connection for the image request while the request
+for the HTML text is not yet completed. The browser tries to connect to
+the 'gawk' server on port 8080 while port 8080 is still in use for
+transmission of the HTML text. The connection for the image cannot be
+built up, so the image appears as "broken" in the browser window. We
+solved this problem by telling the browser to open a separate window for
+the image, but only after a delay of 1000 milliseconds. By this time,
+the server should be ready for serving the next request.
+
+ But there is one more subtlety in the JavaScript code. Each time the
+JavaScript code opens a window for the image, the name of the image is
+appended with a timestamp ('systime()'). Why this constant change of
+name for the image? Initially, we always named the image 'Image', but
+then the Netscape browser noticed the name had _not_ changed since the
+previous request and displayed the previous image (caching behavior).
+The server core is implemented so that browsers are told _not_ to cache
+anything. Obviously HTTP requests do not always work as expected. One
+way to circumvent the cache of such overly smart browsers is to change
+the name of the image with each request. These three lines of
+JavaScript caused us a lot of trouble.
+
+ The rest can be broken down into two phases. At first, we check if
+there are statistical parameters. When the program is first started,
+there usually are no parameters because it enters the page coming from
+the top menu. Then, we only have to present the user a form that he can
+use to change statistical parameters and submit them. Subsequently, the
+submission of the form causes the execution of the first phase because
+_now_ there _are_ parameters to handle.
+
+ Now that we have parameters, we know there will be an image
+available. Therefore we insert the JavaScript code here to initiate the
+opening of the image in a separate window. Then, we prepare some
+variables that will be passed to GNUPlot for calculation of the
+probabilities. Prior to reading the results, we must temporarily change
+'RS' because GNUPlot separates lines with newlines. After instructing
+GNUPlot to generate a '.png' (or '.ps' or '.gif') image, we initiate the
+insertion of some text, explaining the resulting probabilities. The
+final 'plot' command actually generates the image data. This raw binary
+has to be read in carefully without adding, changing, or deleting a
+single byte. Hence the unusual initialization of 'Image' and completion
+with a 'while' loop.
+
+ When using this server, it soon becomes clear that it is far from
+being perfect. It mixes source code of six scripting languages or
+protocols:
+
+ * GNU 'awk' implements a server for the protocol:
+ * HTTP which transmits:
+ * HTML text which contains a short piece of:
+ * JavaScript code opening a separate window.
+ * A Bourne shell script is used for piping commands into:
+ * GNUPlot to generate the image to be opened.
+
+ After all this work, the GNUPlot image opens in the JavaScript window
+where it can be viewed by the user.
+
+ It is probably better not to mix up so many different languages. The
+result is not very readable. Furthermore, the statistical part of the
+server does not take care of invalid input. Among other things, using
+negative variances will cause invalid results.
+
+ ---------- Footnotes ----------
+
+ (1) Due to licensing problems, the default installation of GNUPlot
+disables the generation of '.gif' files. If your installed version does
+not accept 'set term gif', just download and install the most recent
+version of GNUPlot and the GD library (http://www.boutell.com/gd/) by
+Thomas Boutell. Otherwise you still have the chance to generate some
+ASCII-art style images with GNUPlot by using 'set term dumb'. (We tried
+it and it worked.)
+
+
+File: gawkinet.info, Node: MAZE, Next: MOBAGWHO, Prev: STATIST, Up: Some Applications and Techniques
+
+3.7 MAZE: Walking Through a Maze in Virtual Reality
+===================================================
+
+ In the long run, every program becomes rococo, and then rubble.
+ Alan Perlis
+
+ By now, we know how to present arbitrary 'Content-type's to a
+browser. In this node, our server will present a 3D world to our
+browser. The 3D world is described in a scene description language
+(VRML, Virtual Reality Modeling Language) that allows us to travel
+through a perspective view of a 2D maze with our browser. Browsers with
+a VRML plugin enable exploration of this technology. We could do one of
+those boring 'Hello world' examples here that are usually presented
+when introducing novices to VRML. If you have never written any VRML
+code, have a look at the VRML FAQ. Presenting a static VRML scene is a
+bit trivial; in order to expose 'gawk''s new capabilities, we will
+present a dynamically generated VRML scene. The function
+'SetUpServer()' is very simple because it only sets the default HTML
+page and initializes the random number generator. As usual, the
+surrounding server lets you browse the maze.
+
+ function SetUpServer() {
+ TopHeader = "<HTML><title>Walk through a maze</title>"
+ TopDoc = "\
+ <h2>Please choose one of the following actions:</h2>\
+ <UL>\
+ <LI><A HREF=" MyPrefix "/AboutServer>About this server</A>\
+ <LI><A HREF=" MyPrefix "/VRMLtest>Watch a simple VRML scene</A>\
+ </UL>"
+ TopFooter = "</HTML>"
+ srand()
+ }
+
+ The function 'HandleGET()' is a bit longer because it first computes
+the maze and afterwards generates the VRML code that is sent across the
+network. As shown in the STATIST example (*note STATIST::), we set the
+type of the content to VRML and then store the VRML representation of
+the maze as the page content. We assume that the maze is stored in a 2D
+array. Initially, the maze consists of walls only. Then, we add an
+entry and an exit to the maze and let the rest of the work be done by
+the function 'MakeMaze()'. Now, only the wall fields are left in the
+maze. By iterating over these fields, we generate one line of VRML
+code for each wall field.
+
+ function HandleGET() {
+ if (MENU[2] == "AboutServer") {
+ Document = "If your browser has a VRML 2 plugin,\
+ this server shows you a simple VRML scene."
+ } else if (MENU[2] == "VRMLtest") {
+ XSIZE = YSIZE = 11 # initially, everything is wall
+ for (y = 0; y < YSIZE; y++)
+ for (x = 0; x < XSIZE; x++)
+ Maze[x, y] = "#"
+ delete Maze[0, 1] # entry is not wall
+ delete Maze[XSIZE-1, YSIZE-2] # exit is not wall
+ MakeMaze(1, 1)
+ Document = "\
+ #VRML V2.0 utf8\n\
+ Group {\n\
+ children [\n\
+ PointLight {\n\
+ ambientIntensity 0.2\n\
+ color 0.7 0.7 0.7\n\
+ location 0.0 8.0 10.0\n\
+ }\n\
+ DEF B1 Background {\n\
+ skyColor [0 0 0, 1.0 1.0 1.0 ]\n\
+ skyAngle 1.6\n\
+ groundColor [1 1 1, 0.8 0.8 0.8, 0.2 0.2 0.2 ]\n\
+ groundAngle [ 1.2 1.57 ]\n\
+ }\n\
+ DEF Wall Shape {\n\
+ geometry Box {size 1 1 1}\n\
+ appearance Appearance { material Material { diffuseColor 0 0 1 } }\n\
+ }\n\
+ DEF Entry Viewpoint {\n\
+ position 0.5 1.0 5.0\n\
+ orientation 0.0 0.0 -1.0 0.52\n\
+ }\n"
+ for (i in Maze) {
+ split(i, t, SUBSEP)
+ Document = Document " Transform { translation "
+ Document = Document t[1] " 0 -" t[2] " children USE Wall }\n"
+ }
+ Document = Document " ] # end of group for world\n}"
+ Reason = "OK" ORS "Content-type: model/vrml"
+ Header = Footer = ""
+ }
+ }
+
+ Finally, we have a look at 'MakeMaze()', the function that generates
+the 'Maze' array. When entered, this function assumes that the array
+has been initialized so that each element represents a wall element and
+the maze is initially full of wall elements. Only the entrance and the
+exit of the maze should have been left free. The parameters of the
+function tell us which element must be marked as not being a wall.
+After this, we take a look at the four neighboring elements and remember
+which we have already treated. Of all the neighboring elements, we take
+one at random and walk in that direction. Therefore, the wall element
+in that direction has to be removed and then, we call the function
+recursively for that element. The maze is only completed if we iterate
+the above procedure for _all_ neighboring elements (in random order) and
+for our present element by recursively calling the function for the
+present element. This last iteration could have been done in a loop,
+but it is much simpler to do it recursively.
+
+   Notice that elements whose coordinates are both odd are assumed to be
+on our way through the maze, and the generating process cannot terminate
+as long as any such element has not been 'delete'd. All
+other elements are potentially part of the wall.
+
+ function MakeMaze(x, y) {
+ delete Maze[x, y] # here we are, we have no wall here
+ p = 0 # count unvisited fields in all directions
+ if (x-2 SUBSEP y in Maze) d[p++] = "-x"
+ if (x SUBSEP y-2 in Maze) d[p++] = "-y"
+ if (x+2 SUBSEP y in Maze) d[p++] = "+x"
+ if (x SUBSEP y+2 in Maze) d[p++] = "+y"
+ if (p>0) { # if there are unvisited fields, go there
+ p = int(p*rand()) # choose one unvisited field at random
+ if (d[p] == "-x") { delete Maze[x - 1, y]; MakeMaze(x - 2, y)
+ } else if (d[p] == "-y") { delete Maze[x, y - 1]; MakeMaze(x, y - 2)
+ } else if (d[p] == "+x") { delete Maze[x + 1, y]; MakeMaze(x + 2, y)
+ } else if (d[p] == "+y") { delete Maze[x, y + 1]; MakeMaze(x, y + 2)
+ } # we are back from recursion
+ MakeMaze(x, y); # try again while there are unvisited fields
+ }
+ }
+
+
+File: gawkinet.info, Node: MOBAGWHO, Next: STOXPRED, Prev: MAZE, Up: Some Applications and Techniques
+
+3.8 MOBAGWHO: a Simple Mobile Agent
+===================================
+
+ There are two ways of constructing a software design: One way is to
+ make it so simple that there are obviously no deficiencies, and the
+ other way is to make it so complicated that there are no obvious
+ deficiencies.
+ C. A. R. Hoare
+
+ A "mobile agent" is a program that can be dispatched from a computer
+and transported to a remote server for execution. This is called
+"migration", which means that a process on another system is started
+that is independent from its originator. Ideally, it wanders through a
+network while working for its creator or owner. In places like the UMBC
+Agent Web, people are quite confident that (mobile) agents are a
+software engineering paradigm that enables us to significantly increase
+the efficiency of our work. Mobile agents could become the mediators
+between users and the networking world. For an unbiased view of this
+technology, see the remarkable paper 'Mobile Agents: Are they a good
+idea?'.(1)
+
+ When trying to migrate a process from one system to another, a server
+process is needed on the receiving side. Several ways of implementation
+come to mind; which one is appropriate depends upon the kind of server
+process used:
+
+ * HTTP can be used as the protocol for delivery of the migrating
+ process. In this case, we use a common web server as the receiving
+ server process. A universal CGI script mediates between migrating
+ process and web server. Each server willing to accept migrating
+ agents makes this universal service available. HTTP supplies the
+ 'POST' method to transfer some data to a file on the web server.
+ When a CGI script is called remotely with the 'POST' method instead
+ of the usual 'GET' method, data is transmitted from the client
+ process to the standard input of the server's CGI script. So, to
+ implement a mobile agent, we must not only write the agent program
+ to start on the client side, but also the CGI script to receive the
+ agent on the server side.
+
+ * The 'PUT' method can also be used for migration. HTTP does not
+ require a CGI script for migration via 'PUT'. However, with common
+ web servers there is no advantage to this solution, because web
+ servers such as Apache require explicit activation of a special
+ 'PUT' script.
+
+ * 'Agent Tcl' pursues a different course; it relies on a dedicated
+ server process with a dedicated protocol specialized for receiving
+ mobile agents.
+
+ Our agent example abuses a common web server as a migration tool.
+So, it needs a universal CGI script on the receiving side (the web
+server). The receiving script is activated with a 'POST' request when
+placed into a location like '/httpd/cgi-bin/PostAgent.sh'. Make sure
+that the server system uses a version of 'gawk' that supports network
+access (Version 3.1 or later; verify with 'gawk --version').
+
+ #!/bin/sh
+ MobAg=/tmp/MobileAgent.$$
+ # direct script to mobile agent file
+ cat > $MobAg
+ # execute agent concurrently
+ gawk -f $MobAg $MobAg > /dev/null &
+ # HTTP header, terminator and body
+ gawk 'BEGIN { print "\r\nAgent started" }'
+ rm $MobAg # delete script file of agent
+
+ By making its process id ('$$') part of the unique file name, the
+script avoids conflicts between concurrent instances of the script.
+First, all lines from standard input (the mobile agent's source code)
+are copied into this unique file. Then, the agent is started as a
+concurrent process and a short message reporting this fact is sent to
+the submitting client. Finally, the script file of the mobile agent is
+removed because it is no longer needed. Although it is a short script,
+there are several noteworthy points:
+
+Security
+ _There is none_. In fact, the CGI script should never be made
+ available on a server that is part of the Internet because everyone
+ would be allowed to execute arbitrary commands with it. This
+ behavior is acceptable only when performing rapid prototyping.
+
+Self-Reference
+ Each migrating instance of an agent is started in a way that
+ enables it to read its own source code from standard input and use
+ the code for subsequent migrations. This is necessary because it
+ needs to treat the agent's code as data to transmit. 'gawk' is not
+ the ideal language for such a job. Lisp and Tcl are more suitable
+ because they do not make a distinction between program code and
+ data.
+
+Independence
+ After migration, the agent is not linked to its former home in any
+ way. By reporting 'Agent started', it waves "Goodbye" to its
+ origin. The originator may choose to terminate or not.
+
+ The originating agent itself is started just like any other
+command-line script, and reports the results on standard output. By
+letting the name of the original host migrate with the agent, the agent
+that migrates to a host far away from its origin can report the result
+back home. Having arrived at the end of the journey, the agent
+establishes a connection and reports the results. This is the reason
+for determining the name of the host with 'uname -n' and storing it in
+'MyOrigin' for later use. We may also set variables with the '-v'
+option from the command line. This interactivity is only of importance
+in the context of starting a mobile agent; therefore this 'BEGIN'
+pattern and its action do not take part in migration:
+
+ BEGIN {
+ if (ARGC != 2) {
+ print "MOBAG - a simple mobile agent"
+ print "CALL:\n gawk -f mobag.awk mobag.awk"
+ print "IN:\n the name of this script as a command-line parameter"
+ print "PARAM:\n -v MyOrigin=myhost.com"
+ print "OUT:\n the result on stdout"
+ print "JK 29.03.1998 01.04.1998"
+ exit
+ }
+ if (MyOrigin == "") {
+ "uname -n" | getline MyOrigin
+ close("uname -n")
+ }
+ }
+
+ Since 'gawk' cannot manipulate and transmit parts of the program
+directly, the source code is read and stored in strings. Therefore, the
+program scans itself for the beginning and the ending of functions.
+Each line in between is appended to the code string until the end of the
+function has been reached. A special case is this part of the program
+itself: it is not a function, but placing a similar framework of marker
+comments around it causes it to be treated like one. Notice that this
+mechanism
+works for all the functions of the source code, but it cannot guarantee
+that the order of the functions is preserved during migration:
+
+ #ReadMySelf
+ /^function / { FUNC = $2 }
+ /^END/ || /^#ReadMySelf/ { FUNC = $1 }
+ FUNC != "" { MOBFUN[FUNC] = MOBFUN[FUNC] RS $0 }
+ (FUNC != "") && (/^}/ || /^#EndOfMySelf/) \
+ { FUNC = "" }
+ #EndOfMySelf
+
+ The web server code in *note A Web Service with Interaction:
+Interacting Service, was first developed as a site-independent core.
+Likewise, the 'gawk'-based mobile agent starts with an agent-independent
+core, to which can be appended application-dependent functions. What
+follows is the only application-independent function needed for the
+mobile agent:
+
+ function migrate(Destination, MobCode, Label) {
+ MOBVAR["Label"] = Label
+ MOBVAR["Destination"] = Destination
+ RS = ORS = "\r\n"
+ HttpService = "/inet/tcp/0/" Destination
+ for (i in MOBFUN)
+ MobCode = (MobCode "\n" MOBFUN[i])
+ MobCode = MobCode "\n\nBEGIN {"
+ for (i in MOBVAR)
+ MobCode = (MobCode "\n MOBVAR[\"" i "\"] = \"" MOBVAR[i] "\"")
+ MobCode = MobCode "\n}\n"
+ print "POST /cgi-bin/PostAgent.sh HTTP/1.0" |& HttpService
+ print "Content-length:", length(MobCode) ORS |& HttpService
+ printf "%s", MobCode |& HttpService
+ while ((HttpService |& getline) > 0)
+ print $0
+ close(HttpService)
+ }
+
+ The 'migrate()' function prepares the aforementioned strings
+containing the program code and transmits them to a server. A
+consequence of this modular approach is that the 'migrate()' function
+takes some parameters that aren't needed in this application, but that
+will be in future ones. Its mandatory parameter 'Destination' holds the
+name (or IP address) of the server that the agent wants as a host for
+its code. The optional parameter 'MobCode' may contain some 'gawk' code
+that is inserted during migration in front of all other code. The
+optional parameter 'Label' may contain a string that tells the agent
+what to do in program execution after arrival at its new home site. One
+of the serious obstacles in implementing a framework for mobile agents
+is that it does not suffice to migrate the code. It is also necessary
+to migrate the state of execution of the agent. In contrast to 'Agent
+Tcl', this program does not try to migrate the complete set of
+variables. The following conventions are used:
+
+ * Each variable in an agent program is local to the current host and
+ does _not_ migrate.
+
+ * The array 'MOBFUN' shown above is an exception. It is handled by
+ the function 'migrate()' and does migrate with the application.
+
+ * The other exception is the array 'MOBVAR'. Each variable that
+ takes part in migration has to be an element of this array.
+ 'migrate()' also takes care of this.
+
+ Now it's clear what happens to the 'Label' parameter of the function
+'migrate()'. It is copied into 'MOBVAR["Label"]' and travels alongside
+the other data. Since travelling takes place via HTTP, records must be
+separated with '"\r\n"' in 'RS' and 'ORS' as usual. The code assembly
+for migration takes place in three steps:
+
+ * Iterate over 'MOBFUN' to collect all functions verbatim.
+
+ * Prepare a 'BEGIN' pattern and put assignments to mobile variables
+     into the action part (illustrated after this list).
+
+ * Transmission itself resembles GETURL: the header with the request
+ and the 'Content-length' is followed by the body. In case there is
+ any reply over the network, it is read completely and echoed to
+ standard output to avoid irritating the server.
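+
+   The code appended in the second step might look roughly like the
+following hedged sketch; the values shown are modeled on 'MyInit()'
+below, and the order of the assignments depends on how 'for (i in
+MOBVAR)' traverses the array:
+
+     BEGIN {
+        MOBVAR["Label"] = ""
+        MOBVAR["Destination"] = "localhost/80"
+        MOBVAR["MyOrigin"] = "castor"
+        MOBVAR["Machines"] = "localhost/80 max/80 moritz/80 castor/80"
+     }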
+
+ The application-independent framework is now almost complete. What
+follows is the 'END' pattern that is executed when the mobile agent has
+finished reading its own code. First, it checks whether it is already
+running on a remote host or not. In case initialization has not yet
+taken place, it starts 'MyInit()'. Otherwise (later, on a remote host),
+it starts 'MyJob()':
+
+ END {
+ if (ARGC != 2) exit # stop when called with wrong parameters
+ if (MyOrigin != "") # is this the originating host?
+ MyInit() # if so, initialize the application
+ else # we are on a host with migrated data
+ MyJob() # so we do our job
+ }
+
+ All that's left to extend the framework into a complete application
+is to write two application-specific functions: 'MyInit()' and
+'MyJob()'. Keep in mind that the former is executed once on the
+originating host, while the latter is executed after each migration:
+
+ function MyInit() {
+ MOBVAR["MyOrigin"] = MyOrigin
+ MOBVAR["Machines"] = "localhost/80 max/80 moritz/80 castor/80"
+ split(MOBVAR["Machines"], Machines) # which host is the first?
+ migrate(Machines[1], "", "") # go to the first host
+ while (("/inet/tcp/8080/0/0" |& getline) > 0) # wait for result
+ print $0 # print result
+ close("/inet/tcp/8080/0/0")
+ }
+
+ As mentioned earlier, this agent takes the name of its origin
+('MyOrigin') with it. Then, it takes the name of its first destination
+and goes there for further work. Notice that this name has the port
+number of the web server appended to the name of the server, because the
+function 'migrate()' needs it this way to create the 'HttpService'
+variable. Finally, it waits for the result to arrive. The 'MyJob()'
+function runs on the remote host:
+
+ function MyJob() {
+ # forget this host
+ sub(MOBVAR["Destination"], "", MOBVAR["Machines"])
+ MOBVAR["Result"]=MOBVAR["Result"] SUBSEP SUBSEP MOBVAR["Destination"] ":"
+ while (("who" | getline) > 0) # who is logged in?
+ MOBVAR["Result"] = MOBVAR["Result"] SUBSEP $0
+ close("who")
+ if (index(MOBVAR["Machines"], "/") > 0) { # any more machines to visit?
+ split(MOBVAR["Machines"], Machines) # which host is next?
+ migrate(Machines[1], "", "") # go there
+ } else { # no more machines
+ gsub(SUBSEP, "\n", MOBVAR["Result"]) # send result to origin
+ print MOBVAR["Result"] |& "/inet/tcp/0/" MOBVAR["MyOrigin"] "/8080"
+ close("/inet/tcp/0/" MOBVAR["MyOrigin"] "/8080")
+ }
+ }
+
+ After migrating, the first thing to do in 'MyJob()' is to delete the
+name of the current host from the list of hosts to visit. Now, it is
+time to start the real work by appending the host's name to the result
+string, and reading line by line who is logged in on this host. A very
+annoying circumstance is the fact that the elements of 'MOBVAR' cannot
+hold the newline character ('"\n"'). If they did, migration of this
+string would not work because the string would not obey the syntax
+rules for a string in 'gawk'. 'SUBSEP' is used as a temporary
+replacement. If the
+list of hosts to visit holds at least one more entry, the agent migrates
+to that place to go on working there. Otherwise, we replace the
+'SUBSEP's with a newline character in the resulting string, and report
+it to the originating host, whose name is stored in
+'MOBVAR["MyOrigin"]'.
+
+ ---------- Footnotes ----------
+
+ (1) <http://www.research.ibm.com/massive/mobag.ps>
+
+
+File: gawkinet.info, Node: STOXPRED, Next: PROTBASE, Prev: MOBAGWHO, Up: Some Applications and Techniques
+
+3.9 STOXPRED: Stock Market Prediction as a Service
+==================================================
+
+ Far out in the uncharted backwaters of the unfashionable end of the
+ Western Spiral arm of the Galaxy lies a small unregarded yellow
+ sun.
+
+ Orbiting this at a distance of roughly ninety-two million miles is
+ an utterly insignificant little blue-green planet whose
+ ape-descendent life forms are so amazingly primitive that they
+ still think digital watches are a pretty neat idea.
+
+ This planet has -- or rather had -- a problem, which was this: most
+ of the people living on it were unhappy for pretty much of the
+ time. Many solutions were suggested for this problem, but most of
+ these were largely concerned with the movements of small green
+ pieces of paper, which is odd because it wasn't the small green
+ pieces of paper that were unhappy.
+ Douglas Adams, 'The Hitch Hiker's Guide to the Galaxy'
+
+ Valuable services on the Internet are usually _not_ implemented as
+mobile agents. There are much simpler ways of implementing services.
+All Unix systems provide, for example, the 'cron' service. Unix system
+users can write a list of tasks to be done each day, each week, twice a
+day, or just once. The list is entered into a file named 'crontab'.
+For example, to distribute a newsletter on a daily basis this way, use
+'cron' for calling a script each day early in the morning.
+
+ # run at 8 am on weekdays, distribute the newsletter
+ 0 8 * * 1-5 $HOME/bin/daily.job >> $HOME/log/newsletter 2>&1
+
+ The script first looks for interesting information on the Internet,
+assembles it in a nice form and sends the results via email to the
+customers.
+
+ The following is an example of a primitive newsletter on stock market
+prediction. It is a report which first tries to predict the change of
+each share in the Dow Jones Industrial Index for the particular day.
+Then it mentions some especially promising shares as well as some shares
+which look remarkably bad on that day. The report ends with the usual
+disclaimer which tells every child _not_ to try this at home and hurt
+anybody.
+
+ Good morning Uncle Scrooge,
+
+ This is your daily stock market report for Monday, October 16, 2000.
+ Here are the predictions for today:
+
+ AA neutral
+ GE up
+ JNJ down
+ MSFT neutral
+ ...
+ UTX up
+ DD down
+ IBM up
+ MO down
+ WMT up
+ DIS up
+ INTC up
+ MRK down
+ XOM down
+ EK down
+ IP down
+
+ The most promising shares for today are these:
+
+ INTC http://biz.yahoo.com/n/i/intc.html
+
+ The stock shares to avoid today are these:
+
+ EK http://biz.yahoo.com/n/e/ek.html
+ IP http://biz.yahoo.com/n/i/ip.html
+ DD http://biz.yahoo.com/n/d/dd.html
+ ...
+
+ The script as a whole is rather long. In order to ease the pain of
+studying other people's source code, we have broken the script up into
+meaningful parts which are invoked one after the other. The basic
+structure of the script is as follows:
+
+ BEGIN {
+ Init()
+ ReadQuotes()
+ CleanUp()
+ Prediction()
+ Report()
+ SendMail()
+ }
+
+ The earlier parts store data into variables and arrays which are
+subsequently used by later parts of the script. The 'Init()' function
+first checks if the script is invoked correctly (without any
+parameters). If not, it informs the user of the correct usage. What
+follows are preparations for the retrieval of the historical quote data.
+The names of the 30 stock shares are stored in an array 'name' along
+with the current date in 'day', 'month', and 'year'.
+
+ All users who are separated from the Internet by a firewall and have
+to direct their Internet accesses to a proxy must supply the name of the
+proxy to this script with the '-v Proxy=NAME' option. For most users,
+the default proxy and port number should suffice.
+
+ function Init() {
+ if (ARGC != 1) {
+ print "STOXPRED - daily stock share prediction"
+ print "IN:\n no parameters, nothing on stdin"
+ print "PARAM:\n -v Proxy=MyProxy -v ProxyPort=80"
+ print "OUT:\n commented predictions as email"
+ print "JK 09.10.2000"
+ exit
+ }
+ # Remember ticker symbols from Dow Jones Industrial Index
+ StockCount = split("AA GE JNJ MSFT AXP GM JPM PG BA HD KO \
+ SBC C HON MCD T CAT HWP MMM UTX DD IBM MO WMT DIS INTC \
+ MRK XOM EK IP", name);
+ # Remember the current date as the end of the time series
+ day = strftime("%d")
+ month = strftime("%m")
+ year = strftime("%Y")
+ if (Proxy == "") Proxy = "chart.yahoo.com"
+ if (ProxyPort == 0) ProxyPort = 80
+ YahooData = "/inet/tcp/0/" Proxy "/" ProxyPort
+ }
+
+ There are two really interesting parts in the script. One is the
+function which reads the historical stock quotes from an Internet
+server. The other is the one that does the actual prediction. In the
+following function we see how the quotes are read from the Yahoo server.
+The data which comes from the server is in CSV format (comma-separated
+values):
+
+ Date,Open,High,Low,Close,Volume
+ 9-Oct-00,22.75,22.75,21.375,22.375,7888500
+ 6-Oct-00,23.8125,24.9375,21.5625,22,10701100
+ 5-Oct-00,24.4375,24.625,23.125,23.50,5810300
+
+   Each line contains the values for one point in time, while the
+columns, separated by commas, contain the kind of data that is described
+in the header (first) line. At first, 'gawk' is instructed to separate
+columns by commas ('FS = ","'). In the loop that follows, a connection
+to the Yahoo server is first opened, then a download takes place, and
+finally the connection is closed. All this happens once for each ticker
+symbol. In the body of this loop, an Internet address is built up as a
+string according to the rules of the Yahoo server. The starting and
+ending date are chosen to be exactly the same, but one year apart in the
+past. All the action is initiated within the 'printf' command which
+transmits the request for data to the Yahoo server.
+
+ In the inner loop, the server's data is first read and then scanned
+line by line. Only lines which have six columns and the name of a month
+in the first column contain relevant data. This data is stored in the
+two-dimensional array 'quote'; one dimension being time, the other being
+the ticker symbol. During retrieval of the first stock's data, the
+calendar dates of the time instants are stored in the array 'days'
+because we need them later.
+
+ function ReadQuotes() {
+ # Retrieve historical data for each ticker symbol
+ FS = ","
+ for (stock = 1; stock <= StockCount; stock++) {
+ URL = "http://chart.yahoo.com/table.csv?s=" name[stock] \
+ "&a=" month "&b=" day "&c=" year-1 \
+ "&d=" month "&e=" day "&f=" year \
+              "&g=d&q=q&y=0&z=" name[stock] "&x=.csv"
+ printf("GET " URL " HTTP/1.0\r\n\r\n") |& YahooData
+ while ((YahooData |& getline) > 0) {
+ if (NF == 6 && $1 ~ /Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec/) {
+ if (stock == 1)
+ days[++daycount] = $1;
+ quote[$1, stock] = $5
+ }
+ }
+ close(YahooData)
+ }
+ FS = " "
+ }
+
+ Now that we _have_ the data, it can be checked once again to make
+sure that no individual stock is missing or invalid, and that all the
+stock quotes are aligned correctly. Furthermore, we renumber the time
+instances. The most recent day gets day number 1 and all other days get
+consecutive numbers. All quotes are rounded toward the nearest whole
+number in US Dollars.
+
+ function CleanUp() {
+ # clean up time series; eliminate incomplete data sets
+ for (d = 1; d <= daycount; d++) {
+ for (stock = 1; stock <= StockCount; stock++)
+ if (! ((days[d], stock) in quote))
+ stock = StockCount + 10
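+ # a missing quote ends the inner loop early; the next test notices this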
+ if (stock > StockCount + 1)
+ continue
+ datacount++
+ for (stock = 1; stock <= StockCount; stock++)
+ data[datacount, stock] = int(0.5 + quote[days[d], stock])
+ }
+ delete quote
+ delete days
+ }
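+
+ The expression 'int(0.5 + ...)' rounds a positive price to the
+nearest whole dollar. Applied to two of the sample closing prices shown
+earlier, it yields 24 and 22:
+
+     gawk 'BEGIN { print int(0.5 + 23.50), int(0.5 + 22.375) }'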
+
+ Now we have arrived at the second really interesting part of the
+whole affair. What we present here is a very primitive prediction
+algorithm: _If a stock fell yesterday, assume it will also fall today;
+if it rose yesterday, assume it will rise today_. (Feel free to replace
+this algorithm with a smarter one.) If a stock changed in the same
+direction on two consecutive days, this is an indication which should be
+highlighted. Two-day advances are stored in 'hot' and two-day declines
+in 'avoid'.
+
+ The rest of the function is a sanity check. It counts the number of
+correct predictions in relation to the total number of predictions one
+could have made in the year before.
+
+ function Prediction() {
+ # Predict each ticker symbol by prolonging yesterday's trend
+ for (stock = 1; stock <= StockCount; stock++) {
+ if (data[1, stock] > data[2, stock]) {
+ predict[stock] = "up"
+ } else if (data[1, stock] < data[2, stock]) {
+ predict[stock] = "down"
+ } else {
+ predict[stock] = "neutral"
+ }
+ if ((data[1, stock] > data[2, stock]) && (data[2, stock] > data[3, stock]))
+ hot[stock] = 1
+ if ((data[1, stock] < data[2, stock]) && (data[2, stock] < data[3, stock]))
+ avoid[stock] = 1
+ }
+ # Do a plausibility check: how many predictions proved correct?
+ for (s = 1; s <= StockCount; s++) {
+ for (d = 1; d <= datacount-2; d++) {
+ if (data[d+1, s] > data[d+2, s]) {
+ UpCount++
+ } else if (data[d+1, s] < data[d+2, s]) {
+ DownCount++
+ } else {
+ NeutralCount++
+ }
+ if (((data[d, s] > data[d+1, s]) && (data[d+1, s] > data[d+2, s])) ||
+ ((data[d, s] < data[d+1, s]) && (data[d+1, s] < data[d+2, s])) ||
+ ((data[d, s] == data[d+1, s]) && (data[d+1, s] == data[d+2, s])))
+ CorrectCount++
+ }
+ }
+ }
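+
+ Stripped of the surrounding bookkeeping, the trend rule amounts to
+nothing more than the following sketch, run here with three made-up
+prices (day 1 being the most recent day); it prints 'up' and 'hot':
+
+     gawk 'BEGIN {
+         data[1] = 24; data[2] = 22; data[3] = 21   # made-up closing prices
+         if      (data[1] > data[2]) print "up"
+         else if (data[1] < data[2]) print "down"
+         else                        print "neutral"
+         if (data[1] > data[2] && data[2] > data[3]) print "hot"
+         if (data[1] < data[2] && data[2] < data[3]) print "avoid"
+     }'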
+
+ At this point the hard work has been done: the array 'predict'
+contains the predictions for all the ticker symbols. It is up to the
+function 'Report()' to find some nice words to introduce the desired
+information.
+
+ function Report() {
+ # Generate report
+ report = "\nThis is your daily "
+ report = report "stock market report for "strftime("%A, %B %d, %Y")".\n"
+ report = report "Here are the predictions for today:\n\n"
+ for (stock = 1; stock <= StockCount; stock++)
+ report = report "\t" name[stock] "\t" predict[stock] "\n"
+ for (stock in hot) {
+ if (HotCount++ == 0)
+ report = report "\nThe most promising shares for today are these:\n\n"
+ report = report "\t" name[stock] "\t\thttp://biz.yahoo.com/n/" \
+ tolower(substr(name[stock], 1, 1)) "/" tolower(name[stock]) ".html\n"
+ }
+ for (stock in avoid) {
+ if (AvoidCount++ == 0)
+ report = report "\nThe stock shares to avoid today are these:\n\n"
+ report = report "\t" name[stock] "\t\thttp://biz.yahoo.com/n/" \
+ tolower(substr(name[stock], 1, 1)) "/" tolower(name[stock]) ".html\n"
+ }
+ report = report "\nThis sums up to " HotCount+0 " winners and " AvoidCount+0
+ report = report " losers. When using this kind\nof prediction scheme for"
+ report = report " the 12 months which lie behind us,\nwe get " UpCount
+ report = report " 'ups' and " DownCount " 'downs' and " NeutralCount
+ report = report " 'neutrals'. Of all\nthese " UpCount+DownCount+NeutralCount
+ report = report " predictions " CorrectCount " proved correct next day.\n"
+ report = report "A success rate of "\
+ int(100*CorrectCount/(UpCount+DownCount+NeutralCount)) "%.\n"
+ report = report "Random choice would have produced a 33% success rate.\n"
+ report = report "Disclaimer: Like every other prediction of the stock\n"
+ report = report "market, this report is, of course, complete nonsense.\n"
+ report = report "If you are stupid enough to believe these predictions\n"
+ report = report "you should visit a doctor who can treat your ailment."
+ }
+
+ The function 'SendMail()' goes through the list of customers and
+opens a pipe to the 'mail' command for each of them. Each one receives
+an email message with a proper subject heading and is addressed with his
+full name.
+
+ function SendMail() {
+ # send report to customers
+ customer["uncle.scrooge@ducktown.gov"] = "Uncle Scrooge"
+ customer["more@utopia.org" ] = "Sir Thomas More"
+ customer["spinoza@denhaag.nl" ] = "Baruch de Spinoza"
+ customer["marx@highgate.uk" ] = "Karl Marx"
+ customer["keynes@the.long.run" ] = "John Maynard Keynes"
+ customer["bierce@devil.hell.org" ] = "Ambrose Bierce"
+ customer["laplace@paris.fr" ] = "Pierre Simon de Laplace"
+ for (c in customer) {
+ MailPipe = "mail -s 'Daily Stock Prediction Newsletter'" c
+ print "Good morning " customer[c] "," | MailPipe
+ print report "\n.\n" | MailPipe
+ close(MailPipe)
+ }
+ }
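+
+ Note the space at the end of the quoted command string; for one of
+the customers above, 'MailPipe' expands to a shell command like this
+one:
+
+     mail -s 'Daily Stock Prediction Newsletter' uncle.scrooge@ducktown.gov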
+
+ Be patient when running the script by hand. Retrieving the data for
+all the ticker symbols and sending the emails may take several minutes
+to complete, depending upon network traffic and the speed of the
+available Internet link. The quality of the prediction algorithm is
+likely to be disappointing. Try to find a better one. Should you find
+one with a success rate of more than 50%, please tell us about it! It
+is only for the sake of curiosity, of course. ':-)'
+
+
+File: gawkinet.info, Node: PROTBASE, Prev: STOXPRED, Up: Some Applications and Techniques
+
+3.10 PROTBASE: Searching Through A Protein Database
+===================================================
+
+ Hoare's Law of Large Problems: Inside every large problem is a
+ small problem struggling to get out.
+
+ Yahoo's database of stock market data is just one among the many
+large databases on the Internet. Another one is located at NCBI
+(National Center for Biotechnology Information). Established in 1988 as
+a national resource for molecular biology information, NCBI creates
+public databases, conducts research in computational biology, develops
+software tools for analyzing genome data, and disseminates biomedical
+information. In this section, we look at one of NCBI's public services,
+which is called BLAST (Basic Local Alignment Search Tool).
+
+ You probably know that the information necessary for reproducing
+living cells is encoded in the genetic material of the cells. The
+genetic material is a very long chain of four base nucleotides. It is
+the order of appearance (the sequence) of nucleotides which contains the
+information about the substance to be produced. Scientists in
+biotechnology often find a specific fragment, determine the nucleotide
+sequence, and need to know where the sequence at hand comes from. This
+is where the large databases enter the game. At NCBI, databases store
+the knowledge about which sequences have ever been found and where they
+have been found. When the scientist sends his sequence to the BLAST
+service, the server looks for regions of genetic material in its
+database that look most similar to the submitted nucleotide sequence.
+After a search of a few seconds or minutes, the server sends an answer
+to the scientist. In order to make access simple, NCBI
+chose to offer their database service through popular Internet
+protocols. There are four basic ways to use the so-called BLAST
+services:
+
+ * The easiest way to use BLAST is through the web. Users may simply
+ point their browsers at the NCBI home page and link to the BLAST
+ pages. NCBI provides a stable URL that may be used to perform
+ BLAST searches without interactive use of a web browser. This is
+ what we will do later in this section. A demonstration client and
+ a 'README' file show how to access this URL.
+
+ * Currently, 'blastcl3' is the standard network BLAST client. You
+ can download 'blastcl3' from the anonymous FTP location.
+
+ * BLAST 2.0 can be run locally as a full executable and can be used
+ to run BLAST searches against private local databases, or
+ downloaded copies of the NCBI databases. BLAST 2.0 executables may
+ be found on the NCBI anonymous FTP server.
+
+ * The NCBI BLAST Email server is the best option for people without
+ convenient access to the web. A similarity search can be performed
+ by sending a properly formatted mail message containing the
+ nucleotide or protein query sequence to <blast@ncbi.nlm.nih.gov>.
+ The query sequence is compared against the specified database using
+ the BLAST algorithm and the results are returned in an email
+ message. For more information on formulating email BLAST searches,
+ you can send a message consisting of the word "HELP" to the same
+ address, <blast@ncbi.nlm.nih.gov>.
+
+ Our starting point is the demonstration client mentioned in the first
+option. The 'README' file that comes along with the client explains the
+whole process in a nutshell. In the rest of this section, we first show
+what such requests look like. Then we show how to use 'gawk' to
+implement a client in about 10 lines of code. Finally, we show how to
+interpret the result returned from the service.
+
+ Sequences are expected to be represented in the standard IUB/IUPAC
+amino acid and nucleic acid codes, with these exceptions: lower-case
+letters are accepted and are mapped into upper-case; a single hyphen or
+dash can be used to represent a gap of indeterminate length; and in
+amino acid sequences, 'U' and '*' are acceptable letters (see below).
+Before submitting a request, any numerical digits in the query sequence
+should either be removed or replaced by appropriate letter codes (e.g.,
+'N' for unknown nucleic acid residue or 'X' for unknown amino acid
+residue). The nucleic acid codes supported are:
+
+ A --> adenosine M --> A C (amino)
+ C --> cytidine S --> G C (strong)
+ G --> guanine W --> A T (weak)
+ T --> thymidine B --> G T C
+ U --> uridine D --> G A T
+ R --> G A (purine) H --> A C T
+ Y --> T C (pyrimidine) V --> G C A
+ K --> G T (keto) N --> A G C T (any)
+ - gap of indeterminate length
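+
+ If a raw sequence still contains digits or white space, a tiny
+'gawk' filter can normalize it before it goes into the request. The
+following sketch simply deletes digits and white space and maps
+lower-case letters to upper case; the file names are arbitrary:
+
+     gawk '{ gsub(/[0-9 \t]/, ""); print toupper($0) }' rawseq > cleanseq   # file names are placeholders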
+
+ Now you know the alphabet of nucleotide sequences. The last two
+lines of the following example query show you such a sequence, which is
+obviously made up only of elements of the alphabet just described.
+Store this example query into a file named 'protbase.request'. You are
+now ready to send it to the server with the demonstration client.
+
+ PROGRAM blastn
+ DATALIB month
+ EXPECT 0.75
+ BEGIN
+ >GAWK310 the gawking gene GNU AWK
+ tgcttggctgaggagccataggacgagagcttcctggtgaagtgtgtttcttgaaatcat
+ caccaccatggacagcaaa
+
+ The actual search request begins with the mandatory parameter
+'PROGRAM' in the first column followed by the value 'blastn' (the name
+of the program) for searching nucleic acids. The next line contains the
+mandatory search parameter 'DATALIB' with the value 'month' for the
+newest nucleic acid sequences. The third line contains an optional
+'EXPECT' parameter and the value desired for it. The fourth line
+contains the mandatory 'BEGIN' directive, followed by the query sequence
+in FASTA/Pearson format. Each line of information must be less than 80
+characters in length.
+
+ The "month" database contains all new or revised sequences released
+in the last 30 days and is useful for searching against new sequences.
+There are five different BLAST programs, 'blastn' being the one that
+compares a nucleotide query sequence against a nucleotide sequence
+database.
+
+ The last server directive that must appear in every request is the
+'BEGIN' directive. The query sequence should immediately follow the
+'BEGIN' directive and must appear in FASTA/Pearson format. A sequence
+in FASTA/Pearson format begins with a single-line description. The
+description line, which is required, is distinguished from the lines of
+sequence data that follow it by having a greater-than ('>') symbol in
+the first column. For the purposes of the BLAST server, the text of the
+description is arbitrary.
+
+ If you prefer to use a client written in 'gawk', just store the
+following 10 lines of code into a file named 'protbase.awk' and use this
+client instead. Invoke it with 'gawk -f protbase.awk protbase.request'.
+Then wait a minute and watch the result coming in. In order to
+replicate the demonstration client's behavior as closely as possible,
+this client does not use a proxy server. We could also have extended
+the client program in *note Retrieving Web Pages: GETURL, to implement
+the client request from 'protbase.awk' as a special case.
+
+ { request = request "\n" $0 }
+
+ END {
+ BLASTService = "/inet/tcp/0/www.ncbi.nlm.nih.gov/80"
+ printf "POST /cgi-bin/BLAST/nph-blast_report HTTP/1.0\n" |& BLASTService
+ printf "Content-Length: " length(request) "\n\n" |& BLASTService
+ printf request |& BLASTService
+ while ((BLASTService |& getline) > 0)
+ print $0
+ close(BLASTService)
+ }
+
+ The demonstration client from NCBI is 214 lines long (written in C)
+and it is not immediately obvious what it does. Our client is so short
+that it _is_ obvious what it does. First it loops over all lines of the
+query and stores the whole query into a variable. Then the script
+establishes an Internet connection to the NCBI server and transmits the
+query by framing it with a proper HTTP request. Finally it receives and
+prints the complete result coming from the server.
+
+ Now, let us look at the result. It begins with an HTTP header, which
+you can ignore. Then there are some comments about the query having
+been filtered to avoid spuriously high scores. After this, there is a
+reference to the paper that describes the software being used for
+searching the database. After a repetition of the original query's
+description we find the list of significant alignments:
+
+ Sequences producing significant alignments: (bits) Value
+
+ gb|AC021182.14|AC021182 Homo sapiens chromosome 7 clone RP11-733... 38 0.20
+ gb|AC021056.12|AC021056 Homo sapiens chromosome 3 clone RP11-115... 38 0.20
+ emb|AL160278.10|AL160278 Homo sapiens chromosome 9 clone RP11-57... 38 0.20
+ emb|AL391139.11|AL391139 Homo sapiens chromosome X clone RP11-35... 38 0.20
+ emb|AL365192.6|AL365192 Homo sapiens chromosome 6 clone RP3-421H... 38 0.20
+ emb|AL138812.9|AL138812 Homo sapiens chromosome 11 clone RP1-276... 38 0.20
+ gb|AC073881.3|AC073881 Homo sapiens chromosome 15 clone CTD-2169... 38 0.20
+
+ This means that the query sequence was found in seven human
+chromosomes. But the value 0.20 means that the probability of an
+accidental match is rather high in all cases and should be taken
+into account. You may wonder what the first column means. It is a key
+to the specific database in which this occurrence was found. The unique
+sequence identifiers reported in the search results can be used as
+sequence retrieval keys via the NCBI server. The syntax of sequence
+header lines used by the NCBI BLAST server depends on the database from
+which each sequence was obtained. The table below lists the identifiers
+for the databases from which the sequences were derived.
+
+ Database Name Identifier Syntax
+ ============================ ========================
+ GenBank gb|accession|locus
+ EMBL Data Library emb|accession|locus
+ DDBJ, DNA Database of Japan dbj|accession|locus
+ NBRF PIR pir||entry
+ Protein Research Foundation prf||name
+ SWISS-PROT sp|accession|entry name
+ Brookhaven Protein Data Bank pdb|entry|chain
+ Kabat's Sequences of Immuno... gnl|kabat|identifier
+ Patents pat|country|number
+ GenInfo Backbone Id bbs|number
+
+ For example, an identifier might be 'gb|AC021182.14|AC021182', where
+the 'gb' tag indicates that the identifier refers to a GenBank sequence,
+'AC021182.14' is its GenBank ACCESSION, and 'AC021182' is the GenBank
+LOCUS. The identifier contains no spaces, so that a space indicates the
+end of the identifier.
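+
+ Such an identifier is easy to take apart with 'split()'. A small
+sketch, using the identifier above (a single-character separator is
+taken literally, so the '|' needs no escaping):
+
+     gawk 'BEGIN {
+         id = "gb|AC021182.14|AC021182"
+         split(id, part, "|")
+         print "database:  " part[1]
+         print "accession: " part[2]
+         print "locus:     " part[3]
+     }'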
+
+ Let us continue in the result listing. Each of the seven alignments
+mentioned above is subsequently described in detail. We will have a
+closer look at the first of them.
+
+ >gb|AC021182.14|AC021182 Homo sapiens chromosome 7 clone RP11-733N23, WORKING DRAFT SEQUENCE, 4
+ unordered pieces
+ Length = 176383
+
+ Score = 38.2 bits (19), Expect = 0.20
+ Identities = 19/19 (100%)
+ Strand = Plus / Plus
+
+ Query: 35 tggtgaagtgtgtttcttg 53
+ |||||||||||||||||||
+ Sbjct: 69786 tggtgaagtgtgtttcttg 69804
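+
+ Incidentally, you can check the Query coordinates directly against
+the query sequence from 'protbase.request'; positions 35 through 53 of
+its first sequence line are exactly the 19 matching nucleotides:
+
+     gawk 'BEGIN {
+         seq = "tgcttggctgaggagccataggacgagagcttcctggtgaagtgtgtttcttgaaatcat"
+         print substr(seq, 35, 53 - 35 + 1)     # prints tggtgaagtgtgtttcttg
+     }'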
+
+ This alignment was located on human chromosome 7. The fragment on
+which part of the query was found had a total length of 176383. Only
+19 of the nucleotides matched; the matching sequence runs from
+character 35 to 53 in the query sequence and from position 69786 to
+69804 in the fragment on chromosome 7. If you are still reading at
+this point, you are probably interested in finding out more about
+Computational Biology and you might appreciate the following hints.
+
+ 1. There is a book called 'Introduction to Computational Biology' by
+ Michael S. Waterman, which is worth reading if you are seriously
+ interested. You can find a good book review on the Internet.
+
+ 2. While Waterman's book can explain to you the algorithms employed
+ internally in the database search engines, most practitioners
+ prefer to approach the subject differently. The applied side of
+ Computational Biology is called Bioinformatics, and emphasizes the
+ tools available for day-to-day work as well as how to actually
+ _use_ them. One of the very few affordable books on Bioinformatics
+ is 'Developing Bioinformatics Computer Skills'.
+
+ 3. The sequences _gawk_ and _gnuawk_ are in widespread use in the
+ genetic material of virtually every earthly living being. Let us
+ take this as a clear indication that the divine creator has
+ intended 'gawk' to prevail over other scripting languages such as
+ 'perl', 'tcl', or 'python' which are not even proper sequences.
+ (:-)
+
+
+File: gawkinet.info, Node: Links, Next: GNU Free Documentation License, Prev: Some Applications and Techniques, Up: Top
+
+4 Related Links
+***************
+
+This section lists the URLs for various items discussed in this major
+node. They are presented in the order in which they appear.
+
+'Internet Programming with Python'
+ <http://www.fsbassociates.com/books/python.htm>
+
+'Advanced Perl Programming'
+ <http://www.oreilly.com/catalog/advperl>
+
+'Web Client Programming with Perl'
+ <http://www.oreilly.com/catalog/webclient>
+
+Richard Stevens's home page and book
+ <http://www.kohala.com/~rstevens>
+
+The SPAK home page
+ <http://www.userfriendly.net/linux/RPM/contrib/libc6/i386/spak-0.6b-1.i386.html>
+
+Volume III of 'Internetworking with TCP/IP', by Comer and Stevens
+ <http://www.cs.purdue.edu/homes/dec/tcpip3s.cont.html>
+
+XBM Graphics File Format
+ <http://www.wotsit.org/download.asp?f=xbm>
+
+GNUPlot
+ <http://www.cs.dartmouth.edu/gnuplot_info.html>
+
+Mark Humphrys' Eliza page
+ <http://www.compapp.dcu.ie/~humphrys/eliza.html>
+
+Yahoo! Eliza Information
+ <http://dir.yahoo.com/Recreation/Games/Computer_Games/Internet_Games/Web_Games/Artificial_Intelligence>
+
+Java versions of Eliza
+ <http://www.tjhsst.edu/Psych/ch1/eliza.html>
+
+Java versions of Eliza with source code
+ <http://home.adelphia.net/~lifeisgood/eliza/eliza.htm>
+
+Eliza Programs with Explanations
+ <http://chayden.net/chayden/eliza/Eliza.shtml>
+
+Loebner Contest
+ <http://acm.org/~loebner/loebner-prize.htmlx>
+
+Tcl/Tk Information
+ <http://www.scriptics.com/>
+
+Intel 80x86 Processors
+ <http://developer.intel.com/design/platform/embedpc/what_is.htm>
+
+AMD Elan Processors
+ <http://www.amd.com/products/epd/processors/4.32bitcont/32bitcont/index.html>
+
+XINU
+ <http://willow.canberra.edu.au/~chrisc/xinu.html>
+
+GNU/Linux
+ <http://uclinux.lineo.com/>
+
+Embedded PCs
+ <http://dir.yahoo.com/Business_and_Economy/Business_to_Business/Computers/Hardware/Embedded_Control/>
+
+MiniSQL
+ <http://www.hughes.com.au/library/>
+
+Market Share Surveys
+ <http://www.netcraft.com/survey>
+
+'Numerical Recipes in C: The Art of Scientific Computing'
+ <http://www.nr.com>
+
+VRML
+ <http://www.vrml.org>
+
+The VRML FAQ
+ <http://www.vrml.org/technicalinfo/specifications/specifications.htm#FAQ>
+
+The UMBC Agent Web
+ <http://www.cs.umbc.edu/agents>
+
+Apache Web Server
+ <http://www.apache.org>
+
+National Center for Biotechnology Information (NCBI)
+ <http://www.ncbi.nlm.nih.gov>
+
+Basic Local Alignment Search Tool (BLAST)
+ <http://www.ncbi.nlm.nih.gov/BLAST/blast_overview.html>
+
+NCBI Home Page
+ <http://www.ncbi.nlm.nih.gov>
+
+BLAST Pages
+ <http://www.ncbi.nlm.nih.gov/BLAST>
+
+BLAST Demonstration Client
+ <ftp://ncbi.nlm.nih.gov/blast/blasturl/>
+
+BLAST anonymous FTP location
+ <ftp://ncbi.nlm.nih.gov/blast/network/netblast/>
+
+BLAST 2.0 Executables
+ <ftp://ncbi.nlm.nih.gov/blast/executables/>
+
+IUB/IUPAC Amino Acid and Nucleic Acid Codes
+ <http://www.uthscsa.edu/geninfo/blastmail.html#item6>
+
+FASTA/Pearson Format
+ <http://www.ncbi.nlm.nih.gov/BLAST/fasta.html>
+
+Fasta/Pearson Sequence in Java
+ <http://www.kazusa.or.jp/java/codon_table_java/>
+
+Book Review of 'Introduction to Computational Biology'
+ <http://www.acm.org/crossroads/xrds5-1/introcb.html>
+
+'Developing Bioinformatics Computer Skills'
+ <http://www.oreilly.com/catalog/bioskills/>
+
+
+File: gawkinet.info, Node: GNU Free Documentation License, Next: Index, Prev: Links, Up: Top
+
+GNU Free Documentation License
+******************************
+
+ Version 1.3, 3 November 2008
+
+ Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
+ <http://fsf.org/>
+
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ 0. PREAMBLE
+
+ The purpose of this License is to make a manual, textbook, or other
+ functional and useful document "free" in the sense of freedom: to
+ assure everyone the effective freedom to copy and redistribute it,
+ with or without modifying it, either commercially or
+ noncommercially. Secondarily, this License preserves for the
+ author and publisher a way to get credit for their work, while not
+ being considered responsible for modifications made by others.
+
+ This License is a kind of "copyleft", which means that derivative
+ works of the document must themselves be free in the same sense.
+ It complements the GNU General Public License, which is a copyleft
+ license designed for free software.
+
+ We have designed this License in order to use it for manuals for
+ free software, because free software needs free documentation: a
+ free program should come with manuals providing the same freedoms
+ that the software does. But this License is not limited to
+ software manuals; it can be used for any textual work, regardless
+ of subject matter or whether it is published as a printed book. We
+ recommend this License principally for works whose purpose is
+ instruction or reference.
+
+ 1. APPLICABILITY AND DEFINITIONS
+
+ This License applies to any manual or other work, in any medium,
+ that contains a notice placed by the copyright holder saying it can
+ be distributed under the terms of this License. Such a notice
+ grants a world-wide, royalty-free license, unlimited in duration,
+ to use that work under the conditions stated herein. The
+ "Document", below, refers to any such manual or work. Any member
+ of the public is a licensee, and is addressed as "you". You accept
+ the license if you copy, modify or distribute the work in a way
+ requiring permission under copyright law.
+
+ A "Modified Version" of the Document means any work containing the
+ Document or a portion of it, either copied verbatim, or with
+ modifications and/or translated into another language.
+
+ A "Secondary Section" is a named appendix or a front-matter section
+ of the Document that deals exclusively with the relationship of the
+ publishers or authors of the Document to the Document's overall
+ subject (or to related matters) and contains nothing that could
+ fall directly within that overall subject. (Thus, if the Document
+ is in part a textbook of mathematics, a Secondary Section may not
+ explain any mathematics.) The relationship could be a matter of
+ historical connection with the subject or with related matters, or
+ of legal, commercial, philosophical, ethical or political position
+ regarding them.
+
+ The "Invariant Sections" are certain Secondary Sections whose
+ titles are designated, as being those of Invariant Sections, in the
+ notice that says that the Document is released under this License.
+ If a section does not fit the above definition of Secondary then it
+ is not allowed to be designated as Invariant. The Document may
+ contain zero Invariant Sections. If the Document does not identify
+ any Invariant Sections then there are none.
+
+ The "Cover Texts" are certain short passages of text that are
+ listed, as Front-Cover Texts or Back-Cover Texts, in the notice
+ that says that the Document is released under this License. A
+ Front-Cover Text may be at most 5 words, and a Back-Cover Text may
+ be at most 25 words.
+
+ A "Transparent" copy of the Document means a machine-readable copy,
+ represented in a format whose specification is available to the
+ general public, that is suitable for revising the document
+ straightforwardly with generic text editors or (for images composed
+ of pixels) generic paint programs or (for drawings) some widely
+ available drawing editor, and that is suitable for input to text
+ formatters or for automatic translation to a variety of formats
+ suitable for input to text formatters. A copy made in an otherwise
+ Transparent file format whose markup, or absence of markup, has
+ been arranged to thwart or discourage subsequent modification by
+ readers is not Transparent. An image format is not Transparent if
+ used for any substantial amount of text. A copy that is not
+ "Transparent" is called "Opaque".
+
+ Examples of suitable formats for Transparent copies include plain
+ ASCII without markup, Texinfo input format, LaTeX input format,
+ SGML or XML using a publicly available DTD, and standard-conforming
+ simple HTML, PostScript or PDF designed for human modification.
+ Examples of transparent image formats include PNG, XCF and JPG.
+ Opaque formats include proprietary formats that can be read and
+ edited only by proprietary word processors, SGML or XML for which
+ the DTD and/or processing tools are not generally available, and
+ the machine-generated HTML, PostScript or PDF produced by some word
+ processors for output purposes only.
+
+ The "Title Page" means, for a printed book, the title page itself,
+ plus such following pages as are needed to hold, legibly, the
+ material this License requires to appear in the title page. For
+ works in formats which do not have any title page as such, "Title
+ Page" means the text near the most prominent appearance of the
+ work's title, preceding the beginning of the body of the text.
+
+ The "publisher" means any person or entity that distributes copies
+ of the Document to the public.
+
+ A section "Entitled XYZ" means a named subunit of the Document
+ whose title either is precisely XYZ or contains XYZ in parentheses
+ following text that translates XYZ in another language. (Here XYZ
+ stands for a specific section name mentioned below, such as
+ "Acknowledgements", "Dedications", "Endorsements", or "History".)
+ To "Preserve the Title" of such a section when you modify the
+ Document means that it remains a section "Entitled XYZ" according
+ to this definition.
+
+ The Document may include Warranty Disclaimers next to the notice
+ which states that this License applies to the Document. These
+ Warranty Disclaimers are considered to be included by reference in
+ this License, but only as regards disclaiming warranties: any other
+ implication that these Warranty Disclaimers may have is void and
+ has no effect on the meaning of this License.
+
+ 2. VERBATIM COPYING
+
+ You may copy and distribute the Document in any medium, either
+ commercially or noncommercially, provided that this License, the
+ copyright notices, and the license notice saying this License
+ applies to the Document are reproduced in all copies, and that you
+ add no other conditions whatsoever to those of this License. You
+ may not use technical measures to obstruct or control the reading
+ or further copying of the copies you make or distribute. However,
+ you may accept compensation in exchange for copies. If you
+ distribute a large enough number of copies you must also follow the
+ conditions in section 3.
+
+ You may also lend copies, under the same conditions stated above,
+ and you may publicly display copies.
+
+ 3. COPYING IN QUANTITY
+
+ If you publish printed copies (or copies in media that commonly
+ have printed covers) of the Document, numbering more than 100, and
+ the Document's license notice requires Cover Texts, you must
+ enclose the copies in covers that carry, clearly and legibly, all
+ these Cover Texts: Front-Cover Texts on the front cover, and
+ Back-Cover Texts on the back cover. Both covers must also clearly
+ and legibly identify you as the publisher of these copies. The
+ front cover must present the full title with all words of the title
+ equally prominent and visible. You may add other material on the
+ covers in addition. Copying with changes limited to the covers, as
+ long as they preserve the title of the Document and satisfy these
+ conditions, can be treated as verbatim copying in other respects.
+
+ If the required texts for either cover are too voluminous to fit
+ legibly, you should put the first ones listed (as many as fit
+ reasonably) on the actual cover, and continue the rest onto
+ adjacent pages.
+
+ If you publish or distribute Opaque copies of the Document
+ numbering more than 100, you must either include a machine-readable
+ Transparent copy along with each Opaque copy, or state in or with
+ each Opaque copy a computer-network location from which the general
+ network-using public has access to download using public-standard
+ network protocols a complete Transparent copy of the Document, free
+ of added material. If you use the latter option, you must take
+ reasonably prudent steps, when you begin distribution of Opaque
+ copies in quantity, to ensure that this Transparent copy will
+ remain thus accessible at the stated location until at least one
+ year after the last time you distribute an Opaque copy (directly or
+ through your agents or retailers) of that edition to the public.
+
+ It is requested, but not required, that you contact the authors of
+ the Document well before redistributing any large number of copies,
+ to give them a chance to provide you with an updated version of the
+ Document.
+
+ 4. MODIFICATIONS
+
+ You may copy and distribute a Modified Version of the Document
+ under the conditions of sections 2 and 3 above, provided that you
+ release the Modified Version under precisely this License, with the
+ Modified Version filling the role of the Document, thus licensing
+ distribution and modification of the Modified Version to whoever
+ possesses a copy of it. In addition, you must do these things in
+ the Modified Version:
+
+ A. Use in the Title Page (and on the covers, if any) a title
+ distinct from that of the Document, and from those of previous
+ versions (which should, if there were any, be listed in the
+ History section of the Document). You may use the same title
+ as a previous version if the original publisher of that
+ version gives permission.
+
+ B. List on the Title Page, as authors, one or more persons or
+ entities responsible for authorship of the modifications in
+ the Modified Version, together with at least five of the
+ principal authors of the Document (all of its principal
+ authors, if it has fewer than five), unless they release you
+ from this requirement.
+
+ C. State on the Title page the name of the publisher of the
+ Modified Version, as the publisher.
+
+ D. Preserve all the copyright notices of the Document.
+
+ E. Add an appropriate copyright notice for your modifications
+ adjacent to the other copyright notices.
+
+ F. Include, immediately after the copyright notices, a license
+ notice giving the public permission to use the Modified
+ Version under the terms of this License, in the form shown in
+ the Addendum below.
+
+ G. Preserve in that license notice the full lists of Invariant
+ Sections and required Cover Texts given in the Document's
+ license notice.
+
+ H. Include an unaltered copy of this License.
+
+ I. Preserve the section Entitled "History", Preserve its Title,
+ and add to it an item stating at least the title, year, new
+ authors, and publisher of the Modified Version as given on the
+ Title Page. If there is no section Entitled "History" in the
+ Document, create one stating the title, year, authors, and
+ publisher of the Document as given on its Title Page, then add
+ an item describing the Modified Version as stated in the
+ previous sentence.
+
+ J. Preserve the network location, if any, given in the Document
+ for public access to a Transparent copy of the Document, and
+ likewise the network locations given in the Document for
+ previous versions it was based on. These may be placed in the
+ "History" section. You may omit a network location for a work
+ that was published at least four years before the Document
+ itself, or if the original publisher of the version it refers
+ to gives permission.
+
+ K. For any section Entitled "Acknowledgements" or "Dedications",
+ Preserve the Title of the section, and preserve in the section
+ all the substance and tone of each of the contributor
+ acknowledgements and/or dedications given therein.
+
+ L. Preserve all the Invariant Sections of the Document, unaltered
+ in their text and in their titles. Section numbers or the
+ equivalent are not considered part of the section titles.
+
+ M. Delete any section Entitled "Endorsements". Such a section
+ may not be included in the Modified Version.
+
+ N. Do not retitle any existing section to be Entitled
+ "Endorsements" or to conflict in title with any Invariant
+ Section.
+
+ O. Preserve any Warranty Disclaimers.
+
+ If the Modified Version includes new front-matter sections or
+ appendices that qualify as Secondary Sections and contain no
+ material copied from the Document, you may at your option designate
+ some or all of these sections as invariant. To do this, add their
+ titles to the list of Invariant Sections in the Modified Version's
+ license notice. These titles must be distinct from any other
+ section titles.
+
+ You may add a section Entitled "Endorsements", provided it contains
+ nothing but endorsements of your Modified Version by various
+ parties--for example, statements of peer review or that the text
+ has been approved by an organization as the authoritative
+ definition of a standard.
+
+ You may add a passage of up to five words as a Front-Cover Text,
+ and a passage of up to 25 words as a Back-Cover Text, to the end of
+ the list of Cover Texts in the Modified Version. Only one passage
+ of Front-Cover Text and one of Back-Cover Text may be added by (or
+ through arrangements made by) any one entity. If the Document
+ already includes a cover text for the same cover, previously added
+ by you or by arrangement made by the same entity you are acting on
+ behalf of, you may not add another; but you may replace the old
+ one, on explicit permission from the previous publisher that added
+ the old one.
+
+ The author(s) and publisher(s) of the Document do not by this
+ License give permission to use their names for publicity for or to
+ assert or imply endorsement of any Modified Version.
+
+ 5. COMBINING DOCUMENTS
+
+ You may combine the Document with other documents released under
+ this License, under the terms defined in section 4 above for
+ modified versions, provided that you include in the combination all
+ of the Invariant Sections of all of the original documents,
+ unmodified, and list them all as Invariant Sections of your
+ combined work in its license notice, and that you preserve all
+ their Warranty Disclaimers.
+
+ The combined work need only contain one copy of this License, and
+ multiple identical Invariant Sections may be replaced with a single
+ copy. If there are multiple Invariant Sections with the same name
+ but different contents, make the title of each such section unique
+ by adding at the end of it, in parentheses, the name of the
+ original author or publisher of that section if known, or else a
+ unique number. Make the same adjustment to the section titles in
+ the list of Invariant Sections in the license notice of the
+ combined work.
+
+ In the combination, you must combine any sections Entitled
+ "History" in the various original documents, forming one section
+ Entitled "History"; likewise combine any sections Entitled
+ "Acknowledgements", and any sections Entitled "Dedications". You
+ must delete all sections Entitled "Endorsements."
+
+ 6. COLLECTIONS OF DOCUMENTS
+
+ You may make a collection consisting of the Document and other
+ documents released under this License, and replace the individual
+ copies of this License in the various documents with a single copy
+ that is included in the collection, provided that you follow the
+ rules of this License for verbatim copying of each of the documents
+ in all other respects.
+
+ You may extract a single document from such a collection, and
+ distribute it individually under this License, provided you insert
+ a copy of this License into the extracted document, and follow this
+ License in all other respects regarding verbatim copying of that
+ document.
+
+ 7. AGGREGATION WITH INDEPENDENT WORKS
+
+ A compilation of the Document or its derivatives with other
+ separate and independent documents or works, in or on a volume of a
+ storage or distribution medium, is called an "aggregate" if the
+ copyright resulting from the compilation is not used to limit the
+ legal rights of the compilation's users beyond what the individual
+ works permit. When the Document is included in an aggregate, this
+ License does not apply to the other works in the aggregate which
+ are not themselves derivative works of the Document.
+
+ If the Cover Text requirement of section 3 is applicable to these
+ copies of the Document, then if the Document is less than one half
+ of the entire aggregate, the Document's Cover Texts may be placed
+ on covers that bracket the Document within the aggregate, or the
+ electronic equivalent of covers if the Document is in electronic
+ form. Otherwise they must appear on printed covers that bracket
+ the whole aggregate.
+
+ 8. TRANSLATION
+
+ Translation is considered a kind of modification, so you may
+ distribute translations of the Document under the terms of section
+ 4. Replacing Invariant Sections with translations requires special
+ permission from their copyright holders, but you may include
+ translations of some or all Invariant Sections in addition to the
+ original versions of these Invariant Sections. You may include a
+ translation of this License, and all the license notices in the
+ Document, and any Warranty Disclaimers, provided that you also
+ include the original English version of this License and the
+ original versions of those notices and disclaimers. In case of a
+ disagreement between the translation and the original version of
+ this License or a notice or disclaimer, the original version will
+ prevail.
+
+ If a section in the Document is Entitled "Acknowledgements",
+ "Dedications", or "History", the requirement (section 4) to
+ Preserve its Title (section 1) will typically require changing the
+ actual title.
+
+ 9. TERMINATION
+
+ You may not copy, modify, sublicense, or distribute the Document
+ except as expressly provided under this License. Any attempt
+ otherwise to copy, modify, sublicense, or distribute it is void,
+ and will automatically terminate your rights under this License.
+
+ However, if you cease all violation of this License, then your
+ license from a particular copyright holder is reinstated (a)
+ provisionally, unless and until the copyright holder explicitly and
+ finally terminates your license, and (b) permanently, if the
+ copyright holder fails to notify you of the violation by some
+ reasonable means prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+ reinstated permanently if the copyright holder notifies you of the
+ violation by some reasonable means, this is the first time you have
+ received notice of violation of this License (for any work) from
+ that copyright holder, and you cure the violation prior to 30 days
+ after your receipt of the notice.
+
+ Termination of your rights under this section does not terminate
+ the licenses of parties who have received copies or rights from you
+ under this License. If your rights have been terminated and not
+ permanently reinstated, receipt of a copy of some or all of the
+ same material does not give you any rights to use it.
+
+ 10. FUTURE REVISIONS OF THIS LICENSE
+
+ The Free Software Foundation may publish new, revised versions of
+ the GNU Free Documentation License from time to time. Such new
+ versions will be similar in spirit to the present version, but may
+ differ in detail to address new problems or concerns. See
+ <http://www.gnu.org/copyleft/>.
+
+ Each version of the License is given a distinguishing version
+ number. If the Document specifies that a particular numbered
+ version of this License "or any later version" applies to it, you
+ have the option of following the terms and conditions either of
+ that specified version or of any later version that has been
+ published (not as a draft) by the Free Software Foundation. If the
+ Document does not specify a version number of this License, you may
+ choose any version ever published (not as a draft) by the Free
+ Software Foundation. If the Document specifies that a proxy can
+ decide which future versions of this License can be used, that
+ proxy's public statement of acceptance of a version permanently
+ authorizes you to choose that version for the Document.
+
+ 11. RELICENSING
+
+ "Massive Multiauthor Collaboration Site" (or "MMC Site") means any
+ World Wide Web server that publishes copyrightable works and also
+ provides prominent facilities for anybody to edit those works. A
+ public wiki that anybody can edit is an example of such a server.
+ A "Massive Multiauthor Collaboration" (or "MMC") contained in the
+ site means any set of copyrightable works thus published on the MMC
+ site.
+
+ "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
+ license published by Creative Commons Corporation, a not-for-profit
+ corporation with a principal place of business in San Francisco,
+ California, as well as future copyleft versions of that license
+ published by that same organization.
+
+ "Incorporate" means to publish or republish a Document, in whole or
+ in part, as part of another Document.
+
+ An MMC is "eligible for relicensing" if it is licensed under this
+ License, and if all works that were first published under this
+ License somewhere other than this MMC, and subsequently
+ incorporated in whole or in part into the MMC, (1) had no cover
+ texts or invariant sections, and (2) were thus incorporated prior
+ to November 1, 2008.
+
+ The operator of an MMC Site may republish an MMC contained in the
+ site under CC-BY-SA on the same site at any time before August 1,
+ 2009, provided the MMC is eligible for relicensing.
+
+ADDENDUM: How to use this License for your documents
+====================================================
+
+To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and license
+notices just after the title page:
+
+ Copyright (C) YEAR YOUR NAME.
+ Permission is granted to copy, distribute and/or modify this document
+ under the terms of the GNU Free Documentation License, Version 1.3
+ or any later version published by the Free Software Foundation;
+ with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+ Texts. A copy of the license is included in the section entitled ``GNU
+ Free Documentation License''.
+
+ If you have Invariant Sections, Front-Cover Texts and Back-Cover
+Texts, replace the "with...Texts." line with this:
+
+ with the Invariant Sections being LIST THEIR TITLES, with
+ the Front-Cover Texts being LIST, and with the Back-Cover Texts
+ being LIST.
+
+ If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+
+ If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of free
+software license, such as the GNU General Public License, to permit
+their use in free software.
+
+
+File: gawkinet.info, Node: Index, Prev: GNU Free Documentation License, Up: Top
+
+Index
+*****
+
+
+* Menu:
+
+* /inet/ files (gawk): Gawk Special Files. (line 34)
+* /inet/tcp special files (gawk): File /inet/tcp. (line 6)
+* /inet/udp special files (gawk): File /inet/udp. (line 6)
+* | (vertical bar), |& operator (I/O): TCP Connecting. (line 25)
+* advanced features, network connections: Troubleshooting. (line 6)
+* agent: Challenges. (line 75)
+* agent <1>: MOBAGWHO. (line 6)
+* AI: Challenges. (line 75)
+* apache: WEBGRAB. (line 72)
+* apache <1>: MOBAGWHO. (line 42)
+* Bioinformatics: PROTBASE. (line 227)
+* BLAST, Basic Local Alignment Search Tool: PROTBASE. (line 6)
+* blocking: Making Connections. (line 35)
+* Boutell, Thomas: STATIST. (line 6)
+* CGI (Common Gateway Interface): MOBAGWHO. (line 42)
+* CGI (Common Gateway Interface), dynamic web pages and: Web page.
+ (line 45)
+* CGI (Common Gateway Interface), library: CGI Lib. (line 11)
+* clients: Making Connections. (line 21)
+* Clinton, Bill: Challenges. (line 58)
+* Common Gateway Interface, See CGI: Web page. (line 45)
+* Computational Biology: PROTBASE. (line 227)
+* contest: Challenges. (line 6)
+* cron utility: STOXPRED. (line 23)
+* CSV format: STOXPRED. (line 128)
+* Dow Jones Industrial Index: STOXPRED. (line 44)
+* ELIZA program: Simple Server. (line 11)
+* ELIZA program <1>: Simple Server. (line 178)
+* email: Email. (line 11)
+* FASTA/Pearson format: PROTBASE. (line 102)
+* FDL (Free Documentation License): GNU Free Documentation License.
+ (line 6)
+* filenames, for network access: Gawk Special Files. (line 29)
+* files, /inet/ (gawk): Gawk Special Files. (line 34)
+* files, /inet/tcp (gawk): File /inet/tcp. (line 6)
+* files, /inet/udp (gawk): File /inet/udp. (line 6)
+* finger utility: Setting Up. (line 22)
+* Free Documentation License (FDL): GNU Free Documentation License.
+ (line 6)
+* FTP (File Transfer Protocol): Basic Protocols. (line 45)
+* gawk, networking: Using Networking. (line 6)
+* gawk, networking, connections: Special File Fields. (line 53)
+* gawk, networking, connections <1>: TCP Connecting. (line 6)
+* gawk, networking, filenames: Gawk Special Files. (line 29)
+* gawk, networking, See Also email: Email. (line 6)
+* gawk, networking, service, establishing: Setting Up. (line 6)
+* gawk, networking, troubleshooting: Caveats. (line 6)
+* gawk, web and, See web service: Interacting Service. (line 6)
+* getline command: TCP Connecting. (line 11)
+* GETURL program: GETURL. (line 6)
+* GIF image format: Web page. (line 45)
+* GIF image format <1>: STATIST. (line 6)
+* GNU Free Documentation License: GNU Free Documentation License.
+ (line 6)
+* GNU/Linux: Troubleshooting. (line 54)
+* GNU/Linux <1>: Interacting. (line 27)
+* GNU/Linux <2>: REMCONF. (line 6)
+* GNUPlot utility: Interacting Service. (line 189)
+* GNUPlot utility <1>: STATIST. (line 6)
+* Hoare, C.A.R.: MOBAGWHO. (line 6)
+* Hoare, C.A.R. <1>: PROTBASE. (line 6)
+* hostname field: Special File Fields. (line 34)
+* HTML (Hypertext Markup Language): Web page. (line 29)
+* HTTP (Hypertext Transfer Protocol): Basic Protocols. (line 45)
+* HTTP (Hypertext Transfer Protocol) <1>: Web page. (line 6)
+* HTTP (Hypertext Transfer Protocol), record separators and: Web page.
+ (line 29)
+* HTTP server, core logic: Interacting Service. (line 6)
+* HTTP server, core logic <1>: Interacting Service. (line 24)
+* Humphrys, Mark: Simple Server. (line 178)
+* Hypertext Markup Language (HTML): Web page. (line 29)
+* Hypertext Transfer Protocol, See HTTP: Web page. (line 6)
+* image format: STATIST. (line 6)
+* images, in web pages: Interacting Service. (line 189)
+* images, retrieving over networks: Web page. (line 45)
+* input/output, two-way, See Also gawk, networking: Gawk Special Files.
+ (line 19)
+* Internet, See networks: Interacting. (line 48)
+* JavaScript: STATIST. (line 56)
+* Linux: Troubleshooting. (line 54)
+* Linux <1>: Interacting. (line 27)
+* Linux <2>: REMCONF. (line 6)
+* Lisp: MOBAGWHO. (line 98)
+* localport field: Gawk Special Files. (line 34)
+* Loebner, Hugh: Challenges. (line 6)
+* Loui, Ronald: Challenges. (line 75)
+* MAZE: MAZE. (line 6)
+* Microsoft Windows: WEBGRAB. (line 43)
+* Microsoft Windows, networking: Troubleshooting. (line 54)
+* Microsoft Windows, networking, ports: Setting Up. (line 37)
+* MiniSQL: REMCONF. (line 109)
+* MOBAGWHO program: MOBAGWHO. (line 6)
+* NCBI, National Center for Biotechnology Information: PROTBASE.
+ (line 6)
+* network type field: Special File Fields. (line 11)
+* networks, gawk and: Using Networking. (line 6)
+* networks, gawk and, connections: Special File Fields. (line 53)
+* networks, gawk and, connections <1>: TCP Connecting. (line 6)
+* networks, gawk and, filenames: Gawk Special Files. (line 29)
+* networks, gawk and, See Also email: Email. (line 6)
+* networks, gawk and, service, establishing: Setting Up. (line 6)
+* networks, gawk and, troubleshooting: Caveats. (line 6)
+* networks, ports, reserved: Setting Up. (line 37)
+* networks, ports, specifying: Special File Fields. (line 24)
+* networks, See Also web pages: PANIC. (line 6)
+* Numerical Recipes: STATIST. (line 24)
+* ORS variable, HTTP and: Web page. (line 29)
+* ORS variable, POP and: Email. (line 36)
+* PANIC program: PANIC. (line 6)
+* Perl: Using Networking. (line 14)
+* Perl, gawk networking and: Using Networking. (line 24)
+* Perlis, Alan: MAZE. (line 6)
+* pipes, networking and: TCP Connecting. (line 30)
+* PNG image format: Web page. (line 45)
+* PNG image format <1>: STATIST. (line 6)
+* POP (Post Office Protocol): Email. (line 6)
+* POP (Post Office Protocol) <1>: Email. (line 36)
+* Post Office Protocol (POP): Email. (line 6)
+* PostScript: STATIST. (line 138)
+* PROLOG: Challenges. (line 75)
+* PROTBASE: PROTBASE. (line 6)
+* protocol field: Special File Fields. (line 17)
+* PS image format: STATIST. (line 6)
+* Python: Using Networking. (line 14)
+* Python, gawk networking and: Using Networking. (line 24)
+* record separators, HTTP and: Web page. (line 29)
+* record separators, POP and: Email. (line 36)
+* REMCONF program: REMCONF. (line 6)
+* remoteport field: Gawk Special Files. (line 34)
+* RFC 1939: Email. (line 6)
+* RFC 1939 <1>: Email. (line 36)
+* RFC 1945: Web page. (line 29)
+* RFC 2068: Web page. (line 6)
+* RFC 2068 <1>: Interacting Service. (line 104)
+* RFC 2616: Web page. (line 6)
+* RFC 821: Email. (line 6)
+* robot: Challenges. (line 84)
+* robot <1>: WEBGRAB. (line 6)
+* RS variable, HTTP and: Web page. (line 29)
+* RS variable, POP and: Email. (line 36)
+* servers: Making Connections. (line 14)
+* servers <1>: Setting Up. (line 22)
+* servers, as hosts: Special File Fields. (line 34)
+* servers, HTTP: Interacting Service. (line 6)
+* servers, web: Simple Server. (line 6)
+* Simple Mail Transfer Protocol (SMTP): Email. (line 6)
+* SMTP (Simple Mail Transfer Protocol): Basic Protocols. (line 45)
+* SMTP (Simple Mail Transfer Protocol) <1>: Email. (line 6)
+* STATIST program: STATIST. (line 6)
+* STOXPRED program: STOXPRED. (line 6)
+* synchronous communications: Making Connections. (line 35)
+* Tcl/Tk: Using Networking. (line 14)
+* Tcl/Tk, gawk and: Using Networking. (line 24)
+* Tcl/Tk, gawk and <1>: Some Applications and Techniques.
+ (line 22)
+* TCP (Transmission Control Protocol): Using Networking. (line 29)
+* TCP (Transmission Control Protocol) <1>: File /inet/tcp. (line 6)
+* TCP (Transmission Control Protocol), connection, establishing: TCP Connecting.
+ (line 6)
+* TCP (Transmission Control Protocol), UDP and: Interacting. (line 48)
+* TCP/IP, network type, selecting: Special File Fields. (line 11)
+* TCP/IP, protocols, selecting: Special File Fields. (line 17)
+* TCP/IP, sockets and: Gawk Special Files. (line 19)
+* Transmission Control Protocol, See TCP: Using Networking. (line 29)
+* troubleshooting, gawk, networks: Caveats. (line 6)
+* troubleshooting, networks, connections: Troubleshooting. (line 6)
+* troubleshooting, networks, timeouts: Caveats. (line 18)
+* UDP (User Datagram Protocol): File /inet/udp. (line 6)
+* UDP (User Datagram Protocol), TCP and: Interacting. (line 48)
+* Unix, network ports and: Setting Up. (line 37)
+* URLCHK program: URLCHK. (line 6)
+* User Datagram Protocol, See UDP: File /inet/udp. (line 6)
+* vertical bar (|), |& operator (I/O): TCP Connecting. (line 25)
+* VRML: MAZE. (line 6)
+* web browsers, See web service: Interacting Service. (line 6)
+* web pages: Web page. (line 6)
+* web pages, images in: Interacting Service. (line 189)
+* web pages, retrieving: GETURL. (line 6)
+* web servers: Simple Server. (line 6)
+* web service: Primitive Service. (line 6)
+* web service <1>: PANIC. (line 6)
+* WEBGRAB program: WEBGRAB. (line 6)
+* Weizenbaum, Joseph: Simple Server. (line 11)
+* XBM image format: Interacting Service. (line 189)
+* Yahoo!: REMCONF. (line 6)
+* Yahoo! <1>: STOXPRED. (line 6)
+
+
+
+Tag Table:
+Node: Top2022
+Node: Preface5665
+Node: Introduction7040
+Node: Stream Communications8066
+Node: Datagram Communications9240
+Node: The TCP/IP Protocols10870
+Ref: The TCP/IP Protocols-Footnote-111554
+Node: Basic Protocols11711
+Ref: Basic Protocols-Footnote-113756
+Node: Ports13785
+Node: Making Connections15192
+Ref: Making Connections-Footnote-117750
+Ref: Making Connections-Footnote-217797
+Node: Using Networking17978
+Node: Gawk Special Files20301
+Node: Special File Fields22110
+Ref: table-inet-components26003
+Node: Comparing Protocols27312
+Node: File /inet/tcp27846
+Node: File /inet/udp28874
+Ref: File /inet/udp-Footnote-130573
+Node: TCP Connecting30827
+Node: Troubleshooting33173
+Ref: Troubleshooting-Footnote-136232
+Node: Interacting36805
+Node: Setting Up39545
+Node: Email43048
+Node: Web page45380
+Ref: Web page-Footnote-148197
+Node: Primitive Service48395
+Node: Interacting Service51136
+Ref: Interacting Service-Footnote-160303
+Node: CGI Lib60335
+Node: Simple Server67310
+Ref: Simple Server-Footnote-175053
+Node: Caveats75154
+Node: Challenges76299
+Node: Some Applications and Techniques84997
+Node: PANIC87462
+Node: GETURL89186
+Node: REMCONF91819
+Node: URLCHK97314
+Node: WEBGRAB101166
+Node: STATIST105628
+Ref: STATIST-Footnote-1117377
+Node: MAZE117822
+Node: MOBAGWHO124029
+Ref: MOBAGWHO-Footnote-1138047
+Node: STOXPRED138102
+Node: PROTBASE152390
+Node: Links165506
+Node: GNU Free Documentation License168939
+Node: Index194059
+
+End Tag Table