Diffstat (limited to 'doc/m4.texi')
-rw-r--r-- | doc/m4.texi | 10411 |
1 files changed, 0 insertions, 10411 deletions
diff --git a/doc/m4.texi b/doc/m4.texi deleted file mode 100644 index 5108e20e..00000000 --- a/doc/m4.texi +++ /dev/null @@ -1,10411 +0,0 @@ -\input texinfo @c -*- texinfo -*- -@comment ======================================================== -@comment %**start of header -@setfilename m4.info -@include version.texi -@settitle GNU M4 @value{VERSION} macro processor -@setchapternewpage odd -@finalout - -@set beta - -@c @tabchar{} -@c ---------- -@c The testsuite expects literal tab output in some examples, but -@c literal tabs in texinfo leads to formatting issues. -@macro tabchar -@ @c -@end macro - -@c @ovar{ARG} -@c ------------------- -@c The ARG is an optional argument. To be used for macro arguments in -@c their documentation (@defmac). -@macro ovar{varname} -@r{[}@var{\varname\}@r{]}@c -@end macro - -@c @dvar{ARG, DEFAULT} -@c ------------------- -@c The ARG is an optional argument, defaulting to DEFAULT. To be used -@c for macro arguments in their documentation (@defmac). -@macro dvar{varname, default} -@r{[}@var{\varname\} = @samp{\default\}@r{]}@c -@end macro - -@comment %**end of header -@comment ======================================================== - -@copying - -This manual (@value{UPDATED}) is for GNU M4 (version -@value{VERSION}), a package containing an implementation of the m4 macro -language. - -Copyright @copyright{} 1989-1994, 2004-2011, 2013-2014, 2017 Free -Software Foundation, Inc. - -@quotation -Permission is granted to copy, distribute and/or modify this document -under the terms of the GNU Free Documentation License, -Version 1.3 or any later version published by the Free Software -Foundation; with no Invariant Sections, no Front-Cover Texts, and no -Back-Cover Texts. A copy of the license is included in the section -entitled ``GNU Free Documentation License.'' -@end quotation -@end copying - -@dircategory Text creation and manipulation -@direntry -* M4: (m4). A powerful macro processor. 
-@end direntry - -@titlepage -@title GNU M4, version @value{VERSION} -@subtitle A powerful macro processor -@subtitle Edition @value{EDITION}, @value{UPDATED} -@author by Ren@'e Seindal, Fran@,{c}ois Pinard, -@author Gary V. Vaughan, and Eric Blake -@author (@email{bug-m4@@gnu.org}) - -@page -@vskip 0pt plus 1filll -@insertcopying -@end titlepage - -@contents - -@ifnottex -@node Top -@top GNU M4 -@insertcopying -@end ifnottex - -GNU @code{m4} is an implementation of the traditional UNIX macro -processor. It is mostly SVR4 compatible, although it has some -extensions (for example, handling more than 9 positional parameters -to macros). @code{m4} also has builtin functions for including -files, running shell commands, doing arithmetic, etc. Autoconf needs -GNU @code{m4} for generating @file{configure} scripts, but not for -running them. - -GNU @code{m4} was originally written by Ren@'e Seindal, with -subsequent changes by Fran@,{c}ois Pinard and other volunteers -on the Internet. All names and email addresses can be found in the -files @file{m4-@value{VERSION}/@/AUTHORS} and -@file{m4-@value{VERSION}/@/THANKS} from the GNU M4 -distribution. - -@ifclear beta -This is release @value{VERSION}. It is now considered stable: future -releases on this branch are only meant to fix bugs, increase speed, or -improve documentation. -@end ifclear - -@ifset beta -This is BETA release @value{VERSION}. This is a development release, -and as such, is prone to bugs, crashes, unforeseen features, incomplete -documentation@dots{}, therefore, use at your own peril. In case of -problems, please do not hesitate to report them (see the -@file{m4-@value{VERSION}/@/README} file in the distribution). -@xref{Experiments}. 
-@end ifset - -@menu -* Preliminaries:: Introduction and preliminaries -* Invoking m4:: Invoking @code{m4} -* Syntax:: Lexical and syntactic conventions - -* Macros:: How to invoke macros -* Definitions:: How to define new macros -* Conditionals:: Conditionals, loops, and recursion - -* Debugging:: How to debug macros and input - -* Input Control:: Input control -* File Inclusion:: File inclusion -* Diversions:: Diverting and undiverting output - -* Modules:: Extending M4 with dynamic runtime modules - -* Text handling:: Macros for text handling -* Arithmetic:: Macros for doing arithmetic -* Shell commands:: Macros for running shell commands -* Miscellaneous:: Miscellaneous builtin macros -* Frozen files:: Fast loading of frozen state - -* Compatibility:: Compatibility with other versions of @code{m4} -* Answers:: Correct version of some examples - -* Copying This Package:: How to make copies of the overall M4 package -* Copying This Manual:: How to make copies of this manual -* Indices:: Indices of concepts and macros - -@detailmenu - --- The Detailed Node Listing --- - -Introduction and preliminaries - -* Intro:: Introduction to @code{m4} -* History:: Historical references -* Bugs:: Problems and bugs -* Manual:: Using this manual - -Invoking @code{m4} - -* Operation modes:: Command line options for operation modes -* Preprocessor features:: Command line options for preprocessor features -* Limits control:: Command line options for limits control -* Frozen state:: Command line options for frozen state -* Debugging options:: Command line options for debugging -* Command line files:: Specifying input files on the command line - -Lexical and syntactic conventions - -* Names:: Macro names -* Quoted strings:: Quoting input to @code{m4} -* Comments:: Comments in @code{m4} input -* Other tokens:: Other kinds of input tokens -* Input processing:: How @code{m4} copies input to output -* Regular expression syntax:: How @code{m4} interprets regular expressions - -How to 
invoke macros - -* Invocation:: Macro invocation -* Inhibiting Invocation:: Preventing macro invocation -* Macro Arguments:: Macro arguments -* Quoting Arguments:: On Quoting Arguments to macros -* Macro expansion:: Expanding macros - -How to define new macros - -* Define:: Defining a new macro -* Arguments:: Arguments to macros -* Pseudo Arguments:: Special arguments to macros -* Undefine:: Deleting a macro -* Defn:: Renaming macros -* Pushdef:: Temporarily redefining macros -* Renamesyms:: Renaming macros with regular expressions - -* Indir:: Indirect call of macros -* Builtin:: Indirect call of builtins -* M4symbols:: Getting the defined macro names - -Conditionals, loops, and recursion - -* Ifdef:: Testing if a macro is defined -* Ifelse:: If-else construct, or multibranch -* Shift:: Recursion in @code{m4} -* Forloop:: Iteration by counting -* Foreach:: Iteration by list contents -* Stacks:: Working with definition stacks -* Composition:: Building macros with macros - -How to debug macros and input - -* Dumpdef:: Displaying macro definitions -* Trace:: Tracing macro calls -* Debugmode:: Controlling debugging options -* Debuglen:: Limiting debug output -* Debugfile:: Saving debugging output - -Input control - -* Dnl:: Deleting whitespace in input -* Changequote:: Changing the quote characters -* Changecom:: Changing the comment delimiters -* Changeresyntax:: Changing the regular expression syntax -* Changesyntax:: Changing the lexical structure of the input -* M4wrap:: Saving text until end of input - -File inclusion - -* Include:: Including named files -* Search Path:: Searching for include files - -Diverting and undiverting output - -* Divert:: Diverting output -* Undivert:: Undiverting output -* Divnum:: Diversion numbers -* Cleardivert:: Discarding diverted text - -Extending M4 with dynamic runtime modules - -* M4modules:: Listing loaded modules -* Standard Modules:: Standard bundled modules - -Macros for text handling - -* Len:: Calculating length of 
strings -* Index macro:: Searching for substrings -* Regexp:: Searching for regular expressions -* Substr:: Extracting substrings -* Translit:: Translating characters -* Patsubst:: Substituting text by regular expression -* Format:: Formatting strings (printf-like) - -Macros for doing arithmetic - -* Incr:: Decrement and increment operators -* Eval:: Evaluating integer expressions -* Mpeval:: Multiple precision arithmetic - -Macros for running shell commands - -* Platform macros:: Determining the platform -* Syscmd:: Executing simple commands -* Esyscmd:: Reading the output of commands -* Sysval:: Exit status -* Mkstemp:: Making temporary files -* Mkdtemp:: Making temporary directories - -Miscellaneous builtin macros - -* Errprint:: Printing error messages -* Location:: Printing current location -* M4exit:: Exiting from @code{m4} -* Syncoutput:: Turning on and off sync lines - -Fast loading of frozen state - -* Using frozen files:: Using frozen files -* Frozen file format 1:: Frozen file format 1 -* Frozen file format 2:: Frozen file format 2 - -Compatibility with other versions of @code{m4} - -* Extensions:: Extensions in GNU M4 -* Incompatibilities:: Other incompatibilities -* Experiments:: Experimental features in GNU M4 - -Correct version of some examples - -* Improved exch:: Solution for @code{exch} -* Improved forloop:: Solution for @code{forloop} -* Improved foreach:: Solution for @code{foreach} -* Improved copy:: Solution for @code{copy} -* Improved m4wrap:: Solution for @code{m4wrap} -* Improved cleardivert:: Solution for @code{cleardivert} -* Improved capitalize:: Solution for @code{capitalize} -* Improved fatal_error:: Solution for @code{fatal_error} - -How to make copies of the overall M4 package - -* GNU General Public License:: License for copying the M4 package - -How to make copies of this manual - -* GNU Free Documentation License:: License for copying this manual - -Indices of concepts and macros - -* Macro index:: Index for all @code{m4} macros 
-* Concept index:: Index for many concepts - -@end detailmenu -@end menu - -@node Preliminaries -@chapter Introduction and preliminaries - -This first chapter explains what GNU @code{m4} is, where @code{m4} -comes from, how to read and use this documentation, how to call the -@code{m4} program, and how to report bugs about it. It concludes by -giving tips for reading the remainder of the manual. - -The following chapters then detail all the features of the @code{m4} -language, as shipped in the GNU M4 package. - -@menu -* Intro:: Introduction to @code{m4} -* History:: Historical references -* Bugs:: Problems and bugs -* Manual:: Using this manual -@end menu - -@node Intro -@section Introduction to @code{m4} - -@cindex overview of @code{m4} -@code{m4} is a macro processor, in the sense that it copies its -input to the output, expanding macros as it goes. Macros are either -builtin or user-defined, and can take any number of arguments. -Besides just doing macro expansion, @code{m4} has builtin functions -for including named files, running shell commands, doing integer -arithmetic, manipulating text in various ways, performing recursion, -etc.@dots{} @code{m4} can be used either as a front-end to a compiler, -or as a macro processor in its own right. - -The @code{m4} macro processor is widely available on all UNIXes, and has -been standardized by POSIX. -Usually, only a small percentage of users are aware of its existence. -However, those who find it often become committed users. The -popularity of GNU Autoconf, which requires GNU -@code{m4} for @emph{generating} @file{configure} scripts, is an incentive -for many to install it, while these people will not themselves -program in @code{m4}. GNU @code{m4} is mostly compatible with the -System V, Release 4 version, except for some minor differences. -@xref{Compatibility}, for more details. - -Some people find @code{m4} to be fairly addictive. 
They first use -@code{m4} for simple problems, then take bigger and bigger challenges, -learning how to write complex sets of @code{m4} macros along the way. -Once really addicted, users pursue writing of sophisticated @code{m4} -applications even to solve simple problems, devoting more time -debugging their @code{m4} scripts than doing real work. Beware that -@code{m4} may be dangerous for the health of compulsive programmers. - -@node History -@section Historical references - -@cindex history of @code{m4} -@cindex GNU M4, history of -Macro languages were invented early in the history of computing. In the -1950s Alan Perlis suggested that the macro language be independent of the -language being processed. Techniques such as conditional and recursive -macros, and using macros to define other macros, were described by Doug -McIlroy of Bell Labs in ``Macro Instruction Extensions of Compiler -Languages'', @emph{Communications of the ACM} 3, 4 (1960), 214--20, -@url{http://dx.doi.org/10.1145/367177.367223}. - -An important precursor of @code{m4} was GPM; see C. Strachey, -@c The title uses lower case and has no space between "macro" and "generator". -``A general purpose macrogenerator'', @emph{Computer Journal} 8, 3 -(1965), 225--41, @url{http://dx.doi.org/10.1093/comjnl/8.3.225}. GPM is -also succinctly described in David Gries's book @emph{Compiler -Construction for Digital Computers}, Wiley (1971). Strachey was a -brilliant programmer: GPM fit into 250 machine instructions! - -Inspired by GPM while visiting Strachey's Lab in 1968, McIlroy wrote a -model preprocessor in that fit into a page of Snobol 3 code, and McIlroy -and Robert Morris developed a series of further models at Bell Labs. -Andrew D. 
Hall followed up with M6, a general purpose macro processor -used to port the Fortran source code of the Altran computer algebra -system; see Hall's ``The M6 Macro Processor'', Computing Science -Technical Report #2, Bell Labs (1972), -@url{http://cm.bell-labs.com/cm/cs/cstr/2.pdf}. M6's source code -consisted of about 600 Fortran statements. Its name was the first of -the @code{m4} line. - -The Brian Kernighan and P.J. Plauger book @emph{Software Tools}, -Addison-Wesley (1976), describes and implements a Unix -macro-processor language, which inspired Dennis Ritchie to write -@code{m3}, a macro processor for the AP-3 minicomputer. - -Kernighan and Ritchie then joined forces to develop the original -@code{m4}, described in ``The M4 Macro Processor'', Bell Laboratories -(1977), @url{http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf}. -It had only 21 builtin macros. - -While @code{GPM} was more @emph{pure}, @code{m4} is meant to deal with -the true intricacies of real life: macros can be recognized without -being pre-announced, skipping whitespace or end-of-lines is easier, -more constructs are builtin instead of derived, etc. - -Originally, the Kernighan and Plauger macro-processor, and then -@code{m3}, formed the engine for the Rational FORTRAN preprocessor, -that is, the @code{Ratfor} equivalent of @code{cpp}. Later, @code{m4} -was used as a front-end for @code{Ratfor}, @code{C} and @code{Cobol}. - -Ren@'e Seindal released his implementation of @code{m4}, GNU -@code{m4}, -in 1990, with the aim of removing the artificial limitations in many -of the traditional @code{m4} implementations, such as maximum line -length, macro size, or number of macros. - -The late Professor A. Dain Samples described and implemented a further -evolution in the form of @code{M5}: ``User's Guide to the M5 Macro -Language: 2nd edition'', Electronic Announcement on comp.compilers -newsgroup (1992). 
- -Fran@,{c}ois Pinard took over maintenance of GNU @code{m4} in -1992, until 1994 when he released GNU @code{m4} 1.4, which was -the stable release for 10 years. It was at this time that GNU -Autoconf decided to require GNU @code{m4} as its underlying -engine, since all other implementations of @code{m4} had too many -limitations. - -More recently, in 2004, Paul Eggert released 1.4.1 and 1.4.2 which -addressed some long standing bugs in the venerable 1.4 release. Then in -2005, Gary V. Vaughan collected together the many patches to -GNU @code{m4} 1.4 that were floating around the net and -released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and -prepared patches for the release of 1.4.5, 1.4.6, 1.4.7, and 1.4.8. -More bug fixes were incorporated in 2007, with releases 1.4.9 and -1.4.10. Eric continued with some portability fixes for 1.4.11 and -1.4.12 in 2008, 1.4.13 in 2009, 1.4.14 and 1.4.15 in 2010, and 1.4.16 -in 2011. Following a long hiatus, Gary released 1.4.17 after upgrading -to the latest autotools (and gnulib) along with all the small fixes they -had accumulated. - -Additionally, in 2008, Eric rewrote the scanning engine to reduce -recursive evaluation from quadratic to linear complexity. This was -released as M4 1.6 in 2009. The 1.x branch series remains open for bug -fixes. - -Meanwhile, development was underway for new features for @code{m4}, -such as dynamic module loading and additional builtins, practically -rewriting the entire code base. This development has spurred -improvements to other GNU software, such as GNU -Libtool. GNU M4 2.0 is the result of this effort. - -@node Bugs -@section Problems and bugs - -@cindex reporting bugs -@cindex bug reports -@cindex suggestions, reporting -If you have problems with GNU M4 or think you've found a bug, -please report it. Before reporting a bug, make sure you've actually -found a real bug. Carefully reread the documentation and see if it -really says you can do what you're trying to do. 
If it's not clear -whether you should be able to do something or not, report that too; it's -a bug in the documentation! - -Before reporting a bug or trying to fix it yourself, try to isolate it -to the smallest possible input file that reproduces the problem. Then -send us the input file and the exact results @code{m4} gave you. Also -say what you expected to occur; this will help us decide whether the -problem was really in the documentation. - -Once you've got a precise problem, send e-mail to -@email{bug-m4@@gnu.org}. Please include the version number of @code{m4} -you are using. You can get this information with the command -@kbd{m4 --version}. You can also run @kbd{make check} to generate the -file @file{tests/@/testsuite.log}, useful for including in your report. - -Non-bug suggestions are always welcome as well. If you have questions -about things that are unclear in the documentation or are just obscure -features, please report them too. - -@node Manual -@section Using this manual - -@cindex examples, understanding -This manual contains a number of examples of @code{m4} input and output, -and a simple notation is used to distinguish input, output and error -messages from @code{m4}. Examples are set out from the normal text, and -shown in a fixed width font, like this - -@comment ignore -@example -This is an example of an example! -@end example - -To distinguish input from output, all output from @code{m4} is prefixed -by the string @samp{@result{}}, and all error messages by the string -@samp{@error{}}. When showing how command line options affect matters, -the command line is shown with a prompt @samp{$ @kbd{like this}}, -otherwise, you can assume that a simple @kbd{m4} invocation will work. -Thus: - -@comment ignore -@example -$ @kbd{command line to invoke m4} -Example of input line -@result{}Output line from m4 -@error{}and an error message -@end example - -The sequence @samp{^D} in an example indicates the end of the input -file. 
The sequence @samp{@key{NL}} refers to the newline character. -The majority of these examples are self-contained, and you can run them -with similar results. In fact, the testsuite that is bundled in the -GNU M4 package consists in part of the examples -in this document! Some of the examples assume that your current -directory is located where you unpacked the installation, so if you plan -on following along, you may find it helpful to do this now: - -@comment ignore -@example -$ @kbd{cd m4-@value{VERSION}} -@end example - -As each of the predefined macros in @code{m4} is described, a prototype -call of the macro will be shown, giving descriptive names to the -arguments, e.g., - -@deffn {Composite (none)} example (@var{string}, @dvar{count, 1}, @ - @ovar{argument}@dots{}) -This is a sample prototype. There is not really a macro named -@code{example}, but this documents that if there were, it would be a -Composite macro, rather than a Builtin, and would be provided by the -module @code{none}. - -It requires at least one argument, @var{string}. Remember that in -@code{m4}, there must not be a space between the macro name and the -opening parenthesis, unless it was intended to call the macro without -any arguments. The brackets around @var{count} and @var{argument} show -that these arguments are optional. If @var{count} is omitted, the macro -behaves as if count were @samp{1}, whereas if @var{argument} is omitted, -the macro behaves as if it were the empty string. A blank argument is -not the same as an omitted argument. For example, @samp{example(`a')}, -@samp{example(`a',`1')}, and @samp{example(`a',`1',)} would behave -identically with @var{count} set to @samp{1}; while @samp{example(`a',)} -and @samp{example(`a',`')} would explicitly pass the empty string for -@var{count}. The ellipses (@samp{@dots{}}) show that the macro -processes additional arguments after @var{argument}, rather than -ignoring them. 
-@end deffn - -Each builtin definition will list, in parentheses, the module that must -be loaded to use that macro. The standard modules include -@samp{m4} (which is always available), @samp{gnu} (for GNU specific -m4 extensions), and @samp{traditional} (for compatibility with System V -m4). @xref{Modules}. - -@cindex numbers -All macro arguments in @code{m4} are strings, but some are given -special interpretation, e.g., as numbers, file names, regular -expressions, etc. The documentation for each macro will state how the -parameters are interpreted, and what happens if the argument cannot be -parsed according to the desired interpretation. Unless specified -otherwise, a parameter specified to be a number is parsed as a decimal, -even if the argument has leading zeros; and parsing the empty string as -a number results in 0 rather than an error, although a warning will be -issued. - -This document consistently writes and uses @dfn{builtin}, without a -hyphen, as if it were an English word. This is how the @code{builtin} -primitive is spelled within @code{m4}. - -@node Invoking m4 -@chapter Invoking @code{m4} - -@cindex command line -@cindex invoking @code{m4} -The format of the @code{m4} command is: - -@comment ignore -@example -@code{m4} @r{[}@var{option}@dots{}@r{]} @r{[}@var{file}@dots{}@r{]} -@end example - -@cindex command line, options -@cindex options, command line -@cindex @env{POSIXLY_CORRECT} -All options begin with @samp{-}, or if long option names are used, with -@samp{--}. A long option name need not be written completely, any -unambiguous prefix is sufficient. POSIX requires @code{m4} to -recognize arguments intermixed with files, even when -@env{POSIXLY_CORRECT} is set in the environment. Most options take -effect at startup regardless of their position, but some are documented -below as taking effect after any files that occurred earlier in the -command line. The argument @option{--} is a marker to denote the end of -options. 
- -With short options, options that do not take arguments may be combined -into a single command line argument with subsequent options, options -with mandatory arguments may be provided either as a single command line -argument or as two arguments, and options with optional arguments must -be provided as a single argument. In other words, -@kbd{m4 -QPDfoo -d a -d+f} is equivalent to -@kbd{m4 -Q -P -D foo -d ./a -d+f}, although the latter form is -considered canonical. - -With long options, options with mandatory arguments may be provided with -an equal sign (@samp{=}) in a single argument, or as two arguments, and -options with optional arguments must be provided as a single argument. -In other words, @kbd{m4 --def foo --debug a} is equivalent to -@kbd{m4 --define=foo --debug= -- ./a}, although the latter form is -considered canonical (not to mention more robust, in case a future -version of @code{m4} introduces an option named @option{--default}). - -@code{m4} understands the following options, grouped by functionality. - -@menu -* Operation modes:: Command line options for operation modes -* Preprocessor features:: Command line options for preprocessor features -* Limits control:: Command line options for limits control -* Frozen state:: Command line options for frozen state -* Debugging options:: Command line options for debugging -* Command line files:: Specifying input files on the command line -@end menu - -@node Operation modes -@section Command line options for operation modes - -Several options control the overall operation of @code{m4}: - -@table @code -@item --help -Print a help summary on standard output, then immediately exit -@code{m4} without reading any input files or performing any other -actions. - -@item --version -Print the version number of the program on standard output, then -immediately exit @code{m4} without reading any input files or -performing any other actions. 
- -@item -b -@itemx --batch -Makes this invocation of @code{m4} non-interactive. This means that -output will be buffered, and an interrupt or pipe write error will halt -execution. If neither -@option{-b} nor @option{-i} are specified, this is activated by default -when any input files are specified, or when either standard input or -standard error is not a terminal. Note that this means that @kbd{m4} -alone might be interactive, but @kbd{m4 -} is not, even though both -commands process only standard input. If both @option{-b} and -@option{-i} are specified, only the last one takes effect. - -@item -c -@itemx --discard-comments -Discard all comments instead of copying them to the output. - -@item -E -@itemx --fatal-warnings -@cindex errors, fatal -@cindex fatal errors -Controls the effect of warnings. If unspecified, then execution -continues and exit status is unaffected when a warning is printed. If -specified exactly once, warnings become fatal; when one is issued, -execution continues, but the exit status will be non-zero. If specified -multiple times, then execution halts with non-zero status the first time -a warning is issued. The introduction of behavior levels is new to M4 -1.4.9; for behavior consistent with earlier versions, you should specify -@option{-E} twice. - - -For backwards compatibility reasons, using @option{-E} behaves as if an -implicit @option{--debug=-d} option is also present. This is so that -scripts written for older M4 versions will not fail if they used -constructs that were previously silently allowed, but would now trigger -a warning. 
- -@example -$ @kbd{m4} -defn(`oops') -@error{}m4:stdin:1: warning: defn: undefined macro 'oops' -@result{} -^D -@end example - -@comment ignore -@example -$ @kbd{echo $?} -@result{}0 -@end example - -@comment options: -E -@example -$ @kbd{m4 -E} -defn(`oops') -@result{} -^D -@end example - -@comment ignore -@example -$ @kbd{echo $?} -@result{}0 -@end example - -@comment options: -E -d -@comment status: 1 -@example -$ @kbd{m4 -E -d} -defn(`oops') -@error{}m4:stdin:1: warning: defn: undefined macro 'oops' -@result{} -^D -@end example - -@comment ignore -@example -$ @kbd{echo $?} -@result{}1 -@end example - -@item -i -@itemx --interactive -@itemx -e -Makes this invocation of @code{m4} interactive. This means that all -output will be unbuffered, interrupts will be ignored, and behavior on -pipe write errors is inherited from the parent process. If neither -@option{-b} nor @option{-i} are specified, this is activated by default -when no input files are specified, and when both standard input and -standard error are terminals (similar to the way that /bin/sh determines -when to be interactive). If both @option{-b} and @option{-i} are -specified, only the last one takes effect. The spelling @option{-e} -exists for compatibility with other @code{m4} implementations, and -issues a warning because it may be withdrawn in a future version of -GNU M4. - -@item -P -@itemx --prefix-builtins -Internally modify @emph{all} builtin macro names so they all start with -the prefix @samp{m4_}. For example, using this option, one should write -@samp{m4_define} instead of @samp{define}, and @samp{@w{m4___file__}} -instead of @samp{@w{__file__}}. This option has no effect if @option{-R} -is also specified. - -@item -Q -@itemx --quiet -@itemx --silent -Suppress warnings, such as missing or superfluous arguments in macro -calls, or treating the empty string as zero. Error messages are still -printed. 
The distinction between error and warning is fuzzy, and if -you encounter a situation where the message output did not match your -expectations, please report that as a bug. This option is implied if -@env{POSIXLY_CORRECT} is set in the environment. - -@item -r@r{[}@var{resyntax-spec}@r{]} -@itemx --regexp-syntax@r{[}=@var{resyntax-spec}@r{]} -Set the regular expression syntax according to @var{resyntax-spec}. -When this option is not given, or @var{resyntax-spec} is omitted, -GNU M4 uses the flavor @code{GNU_M4}, which provides -emacs-compatible regular expressions. @xref{Changeresyntax}, for more -details on the format and meaning of @var{resyntax-spec}. This option -may be given more than once, and order with respect to file names is -significant. - -@item --safer -Cripple the following builtins, since each can perform potentially -unsafe actions: @code{maketemp}, @code{mkstemp} (@pxref{Mkstemp}), -@code{mkdtemp} (@pxref{Mkdtemp}), @code{debugfile} (@pxref{Debugfile}), -@code{syscmd} (@pxref{Syscmd}), and @code{esyscmd} (@pxref{Esyscmd}). -An attempt to use any of these macros will result in an error. This -option is intended to make it safer to preprocess an input file of -unknown origin. - -@item -W -@itemx --warnings -Enable warnings. Warnings are on by default unless -@env{POSIXLY_CORRECT} was set in the environment; this option exists to -allow overriding @option{--silent}. -@comment FIXME should we accept -Wall, -Wnone, -Wcategory, -@comment -Wno-category...? -@end table - -@node Preprocessor features -@section Command line options for preprocessor features - -@cindex macro definitions, on the command line -@cindex command line, macro definitions on the -@cindex preprocessor features -Several options allow @code{m4} to behave more like a preprocessor. -Macro definitions and deletions can be made on the command line, the -search path can be altered, and the output file can track where the -input came from. 
These features occur with the following options: - -@table @code -@item -B @var{directory} -@itemx --prepend-include=@var{directory} -Make @code{m4} search @var{directory} for included files, prior to -searching the current working directory. @xref{Search Path}, for more -details. This option may be given more than once. Some other -implementations of @code{m4} use @option{-B @var{number}} to change their -hard-coded limits, but that is unnecessary in GNU where the -only limit is your hardware capability. So although it is unlikely that -you will want to include a relative directory whose name is purely -numeric, GNU @code{m4} will warn you about this potential -compatibility issue; you can avoid the warning by using the long -spelling, or by using @samp{./@var{number}} if you really meant it. - -@item -D @var{name}@r{[}=@var{value}@r{]} -@itemx --define=@var{name}@r{[}=@var{value}@r{]} -This enters @var{name} into the symbol table. If @samp{=@var{value}} is -missing, the value is taken to be the empty string. The @var{value} can -be any string, and the macro can be defined to take arguments, just as -if it was defined from within the input. This option may be given more -than once; order with respect to file names is significant, and -redefining the same @var{name} loses the previous value. - -@item --import-environment -Imports every variable in the environment as a macro. This is done -before @option{-D} and @option{-U}, so they can override the -environment. - -@item -I @var{directory} -@itemx --include=@var{directory} -Make @code{m4} search @var{directory} for included files that are not -found in the current working directory. @xref{Search Path}, for more -details. This option may be given more than once. - -@item --popdef=@var{name} -This deletes the top-most meaning @var{name} might have. Obviously, -only predefined macros can be deleted in this way. 
This option may be -given more than once; popping a @var{name} that does not have a -definition is silently ignored. Order is significant with respect to -file names. - -@item -p @var{name}@r{[}=@var{value}@r{]} -@itemx --pushdef=@var{name}@r{[}=@var{value}@r{]} -This enters @var{name} into the symbol table. If @samp{=@var{value}} is -missing, the value is taken to be the empty string. The @var{value} can -be any string, and the macro can be defined to take arguments, just as -if it was defined from within the input. This option may be given more -than once; order with respect to file names is significant, and -redefining the same @var{name} adds another definition to its stack. - -@item -s -@itemx --synclines -Short for @option{--syncoutput=1}, turning on synchronization lines -(sometimes called @dfn{synclines}). - -@item --syncoutput@r{[}=@var{state}@r{]} -@cindex synchronization lines -@cindex location, input -@cindex input location -Control the generation of synchronization lines from the command line. -Synchronization lines are for use by the C preprocessor or other -similar tools. Order is significant with respect to file names. This -option is useful, for example, when @code{m4} is used as a -front end to a compiler. Source file name and line number information -is conveyed by directives of the form @samp{#line @var{linenum} -"@var{file}"}, which are inserted as needed into the middle of the -output. Such directives mean that the following line originated or was -expanded from the contents of input file @var{file} at line -@var{linenum}. The @samp{"@var{file}"} part is often omitted when -the file name did not change from the previous directive. - -Synchronization directives are always given on complete lines by -themselves. When a synchronization discrepancy occurs in the middle of -an output line, the associated synchronization directive is delayed -until the next newline that does not occur in the middle of a quoted -string or comment. 
@xref{Syncoutput}, for runtime control. @var{state} -is interpreted the same as the argument to @code{syncoutput}; if -@var{state} is omitted, or @option{--syncoutput} is not used, -synchronization lines are disabled. - -@item -U @var{name} -@itemx --undefine=@var{name} -This deletes any predefined meaning @var{name} might have. Obviously, -only predefined macros can be deleted in this way. This option may be -given more than once; undefining a @var{name} that does not have a -definition is silently ignored. Order is significant with respect to -file names. -@end table - -@node Limits control -@section Command line options for limits control - -There are some limits within @code{m4} that can be tuned. For -compatibility, @code{m4} also accepts some options that control limits -in other implementations, but which are automatically unbounded (limited -only by your hardware and operating system constraints) in GNU -@code{m4}. - -@table @code -@item -g -@itemx --gnu -Enable all the extensions in this implementation. This is on by -default unless @env{POSIXLY_CORRECT} is set in the environment; it -exists to allow overriding @option{--traditional}. - -@item -G -@itemx --posix -@itemx --traditional -Suppress all the extensions made in this implementation, compared to the -System V version. @xref{Compatibility}, for a list of these. This -loads the @samp{traditional} module in place of the @samp{gnu} module. -It is implied if @env{POSIXLY_CORRECT} is set in the environment. - -@item -L @var{num} -@itemx --nesting-limit=@var{num} -@cindex nesting limit -@cindex limit, nesting -Artificially limit the nesting of macro calls to @var{num} levels, -stopping program execution if this limit is ever exceeded. When not -specified, nesting is limited to 1024 levels. A value of zero means -unlimited; but then heavily nested code could potentially cause a stack -overflow. @var{num} can have an optional scaling suffix. 
-@comment FIXME - need a node on what scaling suffixes are supported (see -@comment [info coreutils 'block size'] for ideas), and need to consider -@comment whether builtins should also understand scaling suffixes: -@comment eval, mpeval, perhaps format - -The precise effect of this option might be more correctly associated -with textual nesting than dynamic recursion. It has been useful -when some complex @code{m4} input was generated by mechanical means. -Most users would never need this option. If shown to be obtrusive, -this option (which is still experimental) might well disappear. - -@cindex rescanning -This option does @emph{not} have the ability to break endless -rescanning loops, since these do not necessarily consume much memory -or stack space. Through clever usage of rescanning loops, one can -request complex, time-consuming computations from @code{m4} with useful -results. Putting limitations in this area would break @code{m4} power. -There are many pathological cases: @w{@samp{define(`a', `a')a}} is -only the simplest example (but @pxref{Compatibility}). Expecting GNU -@code{m4} to detect these would be a little like expecting a compiler -system to detect and diagnose endless loops: it is a quite @emph{hard} -problem in general, if not undecidable! - -@item -H @var{num} -@itemx --hashsize=@var{num} -@itemx --word-regexp=@var{regexp} -These options are present only for compatibility with previous versions -of GNU @code{m4}. They do nothing except issue a warning, because the -symbol table size is not fixed anymore, and because the new -@code{changesyntax} feature is more efficient than the withdrawn -experimental @code{changeword}. These options will eventually disappear -in future releases. - -@item -S @var{num} -@itemx -T @var{num} -These options are present for compatibility with System V @code{m4}, but -do nothing in this implementation. They may disappear in future -releases, and issue a warning to that effect. 
-@end table - -@node Frozen state -@section Command line options for frozen state - -GNU @code{m4} comes with a feature of freezing internal state -(@pxref{Frozen files}). This can be used to speed up @code{m4} -execution when reusing a common initialization script. - -@table @code -@item -F @var{file} -@itemx --freeze-state=@var{file} -Once execution is finished, write out the frozen state on the specified -@var{file}. It is conventional, but not required, for @var{file} to end -in @samp{.m4f}. - -@item -R @var{file} -@itemx --reload-state=@var{file} -Before execution starts, recover the internal state from the specified -frozen @var{file}. The options @option{-D}, @option{-U}, @option{-t}, -@option{-m}, @option{-r}, and @option{--import-environment} take effect -after state is reloaded, but before the input files are read. -@end table - -@node Debugging options -@section Command line options for debugging - -Finally, there are several options for aiding in debugging @code{m4} -scripts. - -@table @code -@item -d@r{[}@r{[}-@r{|}+@r{]}@var{flags}@r{]} -@itemx --debug@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]} -@itemx --debugmode@r{[}=@r{[}-@r{|}+@r{]}@var{flags}@r{]} -Set the debug-level according to the flags @var{flags}. The debug-level -controls the format and amount of information presented by the debugging -functions. @xref{Debugmode}, for more details on the format and -meaning of @var{flags}. If omitted, @var{flags} defaults to -@samp{+adeq}. If the option occurs multiple times, @var{flags} starting -with @samp{-} or @samp{+} are cumulative, while @var{flags} starting -with a letter override all earlier settings. The debug-level starts -with @samp{d} enabled and all other flags disabled. To disable all -previously set flags, specify an explicit @var{flags} of @samp{-V}. For -backward compatibility reasons, the option @option{--fatal-warnings} -implies @samp{--debug=-d} as part of its effects. 
The spelling -@option{--debug} is recognized as an unambiguous option for -compatibility with earlier versions of GNU M4, but for -consistency with the builtin name, you can also use the spelling -@option{--debugmode}. Order is significant with respect to file names. - -The cumulative effect of the various options in this example is -equivalent to a single invocation of @code{debugmode(`adlqx')}: - -@comment options: -d-V -d+lx --debug --debugmode=-e -@example -$ @kbd{m4 -d+lx --debug --debugmode=-e} -traceon(`len') -@result{} -len(`123') -@error{}m4trace:2: -1- id 2: len(`123') -@result{}3 -@end example - -@item --debugfile@r{[}=@var{file}@r{]} -@itemx -o @var{file} -@itemx --error-output=@var{file} -Redirect debug messages and trace output to the -named @var{file}. Warnings, error messages, and @code{errprint} output -are still printed to standard error. Output from @code{dumpdef} goes to -this file when the debug level @code{o} is not set (@pxref{Debugmode}). -If these options are not used, or -if @var{file} is unspecified (only possible for @option{--debugfile}), -debug output goes to standard error; if @var{file} is the empty string, -debug output is discarded. @xref{Debugfile}, for more details. The -option @option{--debugfile} may be given more than once, and order is -significant with respect to file names. The spellings @option{-o} and -@option{--error-output} are misleading and -inconsistent with other GNU tools; using those spellings will -evoke a warning, and they may be withdrawn or change semantics in a -future release. - -@item -l @var{num} -@itemx --debuglen=@var{num} -@itemx --arglength=@var{num} -Restrict the size of the output generated by macro tracing or by -@code{dumpdef} to @var{num} characters per string. If unspecified or -zero, output is unlimited. @xref{Debuglen}, for more details. -@var{num} can have an optional scaling suffix. 
The spelling -@option{--arglength} is deprecated, since it does not match the -@code{debuglen} macro; using it will evoke a warning, and it may be -withdrawn in a future release. -@comment FIXME - Should we add an option that controls whether output -@comment strings are sanitized with escape sequences, so that dumpdef is -@comment truly one line per macro? -@comment FIXME - see comment on --nesting-limit about NUM. - -@item -t @var{name} -@itemx --trace=@var{name} -@itemx --traceon=@var{name} -This enables tracing for the macro @var{name}, at any point where it is -defined. @var{name} need not be defined when this option is given. -This option may be given more than once, and order is significant with -respect to file names. @xref{Trace}, for more details. - -@item --traceoff=@var{name} -This disables tracing for the macro @var{name}, at any point where it is -defined. @var{name} need not be defined when this option is given. -This option may be given more than once, and order is significant with -respect to file names. @xref{Trace}, for more details. -@end table - -@node Command line files -@section Specifying input files on the command line - -@cindex command line, file names on the -@cindex file names, on the command line -The remaining arguments on the command line are taken to be input file -names or module names (@pxref{Modules}). Whether or not any modules -are loaded from command line arguments, when no actual input file names -are given, then standard input is read. A file name of @file{-} can be -used to denote standard input. It is conventional, but not required, -for input file names to end in @samp{.m4} and for module names to end -in @samp{.la}. The input files and modules are attended to in the -sequence given. - -Standard input can be read more than once, so the file name @file{-} -may appear multiple times on the command line; this makes a difference -when input is from a terminal or other special file type. 
It is an -error if an input file ends in the middle of argument collection, a -comment, or a quoted string. -@comment FIXME - it would be nicer if we let these three things -@comment continue across file boundaries, provided that we warn in -@comment interactive use when switching to stdin in a non-default parse -@comment state. - -Various options, such as @option{--define} (@option{-D}), @option{--undefine} -(@option{-U}), @option{--synclines} (@option{-s}), @option{--trace} -(@option{-t}), and @option{--regexp-syntax} (@option{-r}), only take -effect after processing input from any file names that occur earlier -on the command line. For example, assume the file @file{foo} contains: - -@comment file: foo -@example -$ @kbd{cat foo} -bar -@end example - -The text @samp{bar} can then be redefined over multiple uses of -@file{foo}: - -@comment options: -Dbar=hello foo -Dbar=world foo -@example -$ @kbd{m4 -Dbar=hello foo -Dbar=world foo} -@result{}hello -@result{}world -@end example - -@cindex command line, module names on the -@cindex module names, on the command line -The use of loadable runtime modules in any sense is a GNU M4 -extension, so if @option{-G} is also passed or if the @env{POSIXLY_CORRECT} -environment variable is set, even otherwise valid module names will be -treated as though they were input file names (and no doubt cause havoc as -M4 tries to scan and expand the contents as if it were written in @code{m4}). - -If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the -exit status of @code{m4} will be 0 for success, 1 for general failure -(such as problems with reading an input file), and 63 for version -mismatch (@pxref{Using frozen files}). - -If you need to read a file whose name starts with a @file{-}, you can -specify it as @samp{./-file}, or use @option{--} to mark the end of -options. 
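The ordering rule described earlier in this section can be sketched with a small input file. This is only an illustration: the file name @file{probe.m4} is hypothetical, and the behavior described below assumes that @option{-D} and @option{-U} take effect in command line order, as stated above.

```m4
dnl Contents of probe.m4, a hypothetical input file.
dnl The chosen branch is rescanned, so bar expands to its current value.
ifdef(`bar', `value: bar', `no value')
```

Under those assumptions, @samp{m4 -Dbar=hello probe.m4 -Ubar probe.m4} should print @samp{value: hello} for the first pass over the file and @samp{no value} for the second, because @option{-U} only takes effect after the first copy of the file has been read.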
- -@ignore -@comment Test that 'm4 file/' detects that file is not a directory; we -@comment can assume that the current directory contains a Makefile. -@comment mingw fails with EINVAL rather than ENOTDIR. - -@comment status: 1 -@comment xerr: ignore -@comment options: Makefile/ -@example -@error{}m4: cannot open file 'Makefile/': No such file or directory -@end example - -@comment Test that closed stderr does not cause a crash. Not all -@comment systems have the same message for EBADF. - -@comment xerr: ignore -@example -ifdef(`__unix__', , - `errprint(` skipping: syscmd does not have unix semantics -')m4exit(`77')')dnl -syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0', - `errprint(` skipping: system does not allow closing stdout -')m4exit(`77')')dnl -changequote(`[', `]')dnl -syscmd([echo | ']__program__[' >&-])dnl -@error{}m4: write error: Bad file descriptor -sysval -@result{}1 -@end example - -@example -ifdef(`__unix__', , - `errprint(` skipping: syscmd does not have unix semantics -')m4exit(`77')')dnl -syscmd(`echo | cat >&- 2>/dev/null')ifelse(sysval, `0', - `errprint(` skipping: system does not allow closing stdout -')m4exit(`77')')dnl -changequote(`[', `]')dnl -syscmd([echo 'esyscmd(echo hi >&2 && echo err"print(bye -)d"nl)dnl' > tmp.m4 \ - && ']__program__[' tmp.m4 <&- >&- \ - && rm tmp.m4])sysval -@error{}hi -@error{}bye -@result{}0 -@end example - -@comment Test that we obey POSIX semantics with -D interspersed with -@comment files, even with POSIXLY_CORRECT (BSD getopt gets it wrong). 
- -$ @kbd{m4 } -@example -ifdef(`__unix__', , - `errprint(` skipping: syscmd does not have unix semantics -')m4exit(`77')')dnl -changequote(`[', `]')dnl -syscmd([POSIXLY_CORRECT=1 ']__program__[' -Dbar=hello foo -Dbar=world foo])dnl -@result{}hello -@result{}world -sysval -@result{}0 -@end example -@end ignore - -@node Syntax -@chapter Lexical and syntactic conventions - -@cindex input tokens -@cindex tokens -As @code{m4} reads its input, it separates it into @dfn{tokens}. A -token is either a name, a quoted string, or any single character, that -is not a part of either a name or a string. Input to @code{m4} can also -contain comments. GNU @code{m4} does not yet understand -multibyte locales; all operations are byte-oriented rather than -character-oriented (although if your locale uses a single byte -encoding, such as @sc{ISO-8859-1}, you will not notice a difference). -However, @code{m4} is eight-bit clean, so you can -use non-@sc{ascii} characters in quoted strings (@pxref{Changequote}), -comments (@pxref{Changecom}), and macro names (@pxref{Indir}), with the -exception of the @sc{nul} character (the zero byte @samp{'\0'}). - -@comment FIXME - each builtin needs to document how it handles NUL, then -@comment update the above paragraph to mention that NUL is now handled -@comment transparently. - -@menu -* Names:: Macro names -* Quoted strings:: Quoting input to @code{m4} -* Comments:: Comments in @code{m4} input -* Other tokens:: Other kinds of input tokens -* Input processing:: How @code{m4} copies input to output -* Regular expression syntax:: How @code{m4} interprets regular expressions -@end menu - -@node Names -@section Macro names - -@cindex names -@cindex words -A name is any sequence of letters, digits, and the character @samp{_} -(underscore), where the first character is not a digit. @code{m4} will -use the longest such sequence found in the input. If a name has a -macro definition, it will be subject to macro expansion -(@pxref{Macros}). 
Names are case-sensitive. - -Examples of legal names are: @samp{foo}, @samp{_tmp}, and @samp{name01}. - -The definitions of letters, digits and other input characters can be -changed at any time, using the builtin macro @code{changesyntax}. -@xref{Changesyntax}, for more information. - -@node Quoted strings -@section Quoting input to @code{m4} - -@cindex quoted string -@cindex string, quoted -A quoted string is a sequence of characters surrounded by quote -strings, defaulting to -@samp{`} (grave-accent, also known as back-tick, with UCS value U0060) -and @samp{'} (apostrophe, also known as single-quote, with UCS value -U0027), where the nested begin and end quotes within the -string are balanced. The value of a string token is the text, with one -level of quotes stripped off. Thus - -@comment ignore -@example -`' -@result{} -@end example - -@noindent -is the empty string, and double-quoting turns into single-quoting. - -@comment ignore -@example -``quoted'' -@result{}`quoted' -@end example - -The quote characters can be changed at any time, using the builtin macros -@code{changequote} (@pxref{Changequote}) or @code{changesyntax} -(@pxref{Changesyntax}). - -@node Comments -@section Comments in @code{m4} input - -@cindex comments -Comments in @code{m4} are normally delimited by the characters @samp{#} -and newline. All characters between the comment delimiters are ignored, -but the entire comment (including the delimiters) is passed through to -the output, unless you supply the @option{--discard-comments} or -@option{-c} option at the command line (@pxref{Operation modes, , -Invoking m4}). When discarding comments, the comment delimiters are -discarded, even if the close-comment string is a newline. - -Comments cannot be nested, so the first newline after a @samp{#} ends -the comment. The commenting effect of the begin-comment string -can be inhibited by quoting it. 
- -@example -$ @kbd{m4} -`quoted text' # `commented text' -@result{}quoted text # `commented text' -`quoting inhibits' `#' `comments' -@result{}quoting inhibits # comments -@end example - -@comment options: -c -@example -$ @kbd{m4 -c} -`quoted text' # `commented text' -`quoting inhibits' `#' `comments' -@result{}quoted text quoting inhibits # comments -@end example - -The comment delimiters can be changed to any string at any time, using -the builtin macros @code{changecom} (@pxref{Changecom}) or -@code{changesyntax} (@pxref{Changesyntax}). - -@node Other tokens -@section Other kinds of input tokens - -@cindex tokens, special -Any character, that is neither a part of a name, nor of a quoted string, -nor a comment, is a token by itself. When not in the context of macro -expansion, all of these tokens are just copied to output. However, -during macro expansion, whitespace characters (space, tab, newline, -formfeed, carriage return, vertical tab), parentheses (@samp{(} and -@samp{)}), comma (@samp{,}), and dollar (@samp{$}) have additional -roles, explained later. Which characters actually perform these roles -can be adjusted with @code{changesyntax} (@pxref{Changesyntax}). - -@node Input processing -@section How @code{m4} copies input to output - -As @code{m4} reads the input token by token, it will copy each token -directly to the output immediately. - -The exception is when it finds a word with a macro definition. In that -case @code{m4} will calculate the macro's expansion, possibly reading -more input to get the arguments. It then inserts the expansion in front -of the remaining input. In other words, the resulting text from a macro -call will be read and parsed into tokens again. - -@code{m4} expands a macro as soon as possible. If it finds a macro call -when collecting the arguments to another, it will expand the second call -first. This process continues until there are no more macro calls to -expand and all the input has been consumed. 
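The rescanning rules above can be sketched with a few user-defined macros. The names @code{name}, @code{greeting}, and @code{twice} are hypothetical and exist only for this illustration; the @samp{dnl =>} lines are comments showing the expected output of the preceding line.

```m4
dnl A minimal sketch; name, greeting and twice are hypothetical macros.
define(`name', `world')dnl
define(`greeting', `hello, name')dnl
define(`twice', `$1$1')dnl
dnl The expansion of greeting is pushed back onto the input and rescanned,
dnl so the name inside it is expanded in turn.
greeting
dnl => hello, world
dnl An unquoted argument is expanded while the arguments are collected.
twice(name)
dnl => worldworld
dnl A quoted argument is passed literally; the rescanned text is the single
dnl word namename, which has no definition and is copied to the output.
twice(`name')
dnl => namename
```

The last two calls differ only in quoting, which shows that whether a nested name is expanded depends on when it is scanned, not on where it appears in the source.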
- -For a running example, examine how @code{m4} handles this input: - -@comment ignore -@example -format(`Result is %d', eval(`2**15')) -@end example - -@noindent -First, @code{m4} sees that the token @samp{format} is a macro name, so -it collects the tokens @samp{(}, @samp{`Result is %d'}, @samp{,}, -and @samp{@w{ }}, before encountering another potential macro. Sure -enough, @samp{eval} is a macro name, so the nested argument collection -picks up @samp{(}, @samp{`2**15'}, and @samp{)}, invoking the eval macro -with the lone argument of @samp{2**15}. The expansion of -@samp{eval(2**15)} is @samp{32768}, which is then rescanned as the five -tokens @samp{3}, @samp{2}, @samp{7}, @samp{6}, and @samp{8}; and -combined with the next @samp{)}, the format macro now has all its -arguments, as if the user had typed: - -@comment ignore -@example -format(`Result is %d', 32768) -@end example - -@noindent -The format macro expands to @samp{Result is 32768}, and we have another -round of scanning for the tokens @samp{Result}, @samp{@w{ }}, -@samp{is}, @samp{@w{ }}, @samp{3}, @samp{2}, @samp{7}, @samp{6}, and -@samp{8}. None of these are macros, so the final output is - -@comment ignore -@example -@result{}Result is 32768 -@end example - -As a more complicated example, we will contrast an actual code example -from the Gnulib project@footnote{Derived from a patch in -@uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html}, -and a followup patch in -@uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}}, -showing both a buggy approach and the desired results. The user desires -to output a shell assignment statement that takes its argument and turns -it into a shell variable by converting it to uppercase and prepending a -prefix. 
The original attempt looks like this: - -@example -changequote([,])dnl -define([gl_STRING_MODULE_INDICATOR], - [ - dnl comment - GNULIB_]translit([$1],[a-z],[A-Z])[=1 - ])dnl - gl_STRING_MODULE_INDICATOR([strcase]) -@result{} @w{ } -@result{} GNULIB_strcase=1 -@result{} @w{ } -@end example - -Oops -- the argument did not get capitalized. And although the manual -is not able to easily show it, both lines that appear empty actually -contain two trailing spaces. By stepping through the parse, it is easy -to see what happened. First, @code{m4} sees the token -@samp{changequote}, which it recognizes as a macro, followed by -@samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the -argument list. The macro expands to the empty string, but changes the -quoting characters to something more useful for generating shell code -(unbalanced @samp{`} and @samp{'} appear all the time in shell scripts, -but unbalanced @samp{[]} tend to be rare). Also in the first line, -@code{m4} sees the token @samp{dnl}, which it recognizes as a builtin -macro that consumes the rest of the line, resulting in no output for -that line. - -The second line starts a macro definition. @code{m4} sees the token -@samp{define}, which it recognizes as a macro, followed by a @samp{(}, -@samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted -comma was encountered, the first argument is known to be the expansion -of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}. -Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this -whitespace is discarded as part of argument collection. Then comes a -rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl -comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token -@samp{translit}, which @code{m4} recognizes as a macro name, so a nested -macro expansion has started. 
- -The arguments to @code{translit} are found by the tokens @samp{(}, -@samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally -@samp{)}. All three string arguments are expanded (or in other words, -the quotes are stripped), and since neither @samp{$} nor @samp{1} needs -capitalization, the result of the macro is @samp{$1}. This expansion is -rescanned, resulting in the two literal characters @samp{$} and -@samp{1}. - -Scanning of the outer macro resumes, and picks up with -@samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of -expanded text are concatenated, with the end result that the macro -@samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence -@samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }. -Once again, @samp{dnl} is recognized and avoids a newline in the output. - -The final line is then parsed, beginning with @samp{ } and @samp{ } -that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is -recognized as a macro name, with an argument list of @samp{(}, -@samp{[strcase]}, and @samp{)}. Since the definition of the macro -contains the sequence @samp{$1}, that sequence is replaced with the -argument @samp{strcase} prior to starting the rescan. The rescan sees -@samp{@key{NL}} and four spaces, which are output literally, then -@samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next -come four more spaces, also output literally, and the token -@samp{GNULIB_strcase}, which resulted from the earlier parameter -substitution. Since that is not a macro name, it is output literally, -followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and -two more spaces. Finally, the original @samp{@key{NL}} seen after the -macro invocation is scanned and output literally. - -Now for a corrected approach. 
This rearranges the use of newlines and -whitespace so that less whitespace is output (which, although harmless -to shell scripts, can be visually unappealing), and fixes the quoting -issues so that the capitalization occurs when the macro -@samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is -defined. It also adds another layer of quoting to the first argument of -@code{translit}, to ensure that the output will be rescanned as a string -rather than a potential uppercase macro name needing further expansion. - -@example -changequote([,])dnl -define([gl_STRING_MODULE_INDICATOR], - [dnl comment - GNULIB_[]translit([[$1]], [a-z], [A-Z])=1dnl -])dnl - gl_STRING_MODULE_INDICATOR([strcase]) -@result{} GNULIB_STRCASE=1 -@end example - -The parsing of the first line is unchanged. The second line sees the -name of the macro to define, then sees the discarded @samp{@key{NL}} -and two spaces, as before. But this time, the next token is -@samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([[$1]], [a-z], -[A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by -@samp{)} to end the macro definition and @samp{dnl} to skip the -newline. No early expansion of @code{translit} occurs, so the entire -string becomes the definition of the macro. - -The final line is then parsed, beginning with two spaces that are -output literally, and an invocation of -@code{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}. -Again, the @samp{$1} in the macro definition is substituted prior to -rescanning. Rescanning first encounters @samp{dnl}, and discards -@samp{ comment@key{NL}}. Then two spaces are output literally. Next -comes the token @samp{GNULIB_}, but that is not a macro, so it is -output literally. The token @samp{[]} is an empty string, so it does -not affect output. Then the token @samp{translit} is encountered. 
- -This time, the arguments to @code{translit} are parsed as @samp{(}, -@samp{[[strcase]]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ }, -@samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and the -@code{translit} call produces the desired result @samp{[STRCASE]}. This is -rescanned, but since it is a string, the quotes are stripped and the -only output is a literal @samp{STRCASE}. -Then the scanner sees @samp{=} and @samp{1}, which are output -literally, followed by @samp{dnl} which discards the rest of the -definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the -end of output is the literal @samp{@key{NL}} that appeared after the -invocation of the macro. - -The order in which @code{m4} expands the macros can be further explored -using the trace facilities of GNU @code{m4} (@pxref{Trace}). - -@node Regular expression syntax -@section How @code{m4} interprets regular expressions - -There are several contexts where @code{m4} parses an argument as a -regular expression. This section describes the various flavors of -regular expressions. @xref{Changeresyntax}. - -@include regexprops-generic.texi - -@node Macros -@chapter How to invoke macros - -This chapter covers macro invocation, macro arguments and how macro -expansion is treated. - -@menu -* Invocation:: Macro invocation -* Inhibiting Invocation:: Preventing macro invocation -* Macro Arguments:: Macro arguments -* Quoting Arguments:: On Quoting Arguments to macros -* Macro expansion:: Expanding macros -@end menu - -@node Invocation -@section Macro invocation - -@cindex macro invocation -@cindex invoking macros -A macro invocation has one of the forms - -@comment ignore -@example -name -@end example - -@noindent -which is a macro invocation without any arguments, or - -@comment ignore -@example -name(arg1, arg2, @dots{}, arg@var{n}) -@end example - -@noindent -which is a macro invocation with @var{n} arguments. Macros can have any -number of arguments. 
All arguments are strings, but different macros -might interpret the arguments in different ways. - -The opening parenthesis @emph{must} follow the @var{name} directly, with -no spaces in between. If it does not, the macro is called with no -arguments at all. - -For a macro call to have no arguments, the parentheses @emph{must} be -left out. The macro call - -@comment ignore -@example -name() -@end example - -@noindent -is a macro call with one argument, which is the empty string, not a call -with no arguments. - -@node Inhibiting Invocation -@section Preventing macro invocation - -An innovation of the @code{m4} language, compared to some of its -predecessors (like Strachey's @code{GPM}, for example), is the ability -to recognize macro calls without resorting to any special, prefixed -invocation character. While generally useful, this feature might -sometimes be the source of spurious, unwanted macro calls. So, GNU -@code{m4} offers several mechanisms or techniques for inhibiting the -recognition of names as macro calls. - -@cindex GNU extensions -@cindex blind macro -@cindex macro, blind -First of all, many builtin macros cannot meaningfully be called without -arguments. As a GNU extension, for any of these macros, -whenever an opening parenthesis does not immediately follow their name, -the builtin macro call is not triggered. This solves the most usual -cases, like for @samp{include} or @samp{eval}. Later in this document, -the sentence ``This macro is recognized only with parameters'' refers to -this specific provision of GNU M4, also known as a blind -builtin macro. For the builtins defined by POSIX that bear -this disclaimer, POSIX specifically states that invoking those -builtins without arguments is unspecified, because many other -implementations simply invoke the builtin as though it were given one -empty argument instead. 
- -@example -$ @kbd{m4} -eval -@result{}eval -eval(`1') -@result{}1 -@end example - -There is also a command line option (@option{--prefix-builtins}, or -@option{-P}, @pxref{Operation modes, , Invoking m4}) that renames all -builtin macros with a prefix of @samp{m4_} at startup. The option has -no effect whatsoever on user defined macros. For example, with this option, -one has to write @code{m4_dnl} and even @code{m4_m4exit}. It also has -no effect on whether a macro requires parameters. - -@comment options: -P -@example -$ @kbd{m4 -P} -eval -@result{}eval -eval(`1') -@result{}eval(1) -m4_eval -@result{}m4_eval -m4_eval(`1') -@result{}1 -@end example - -Another alternative is to redefine problematic macros to a name less -likely to cause conflicts, using @ref{Definitions}. Or the parsing -engine can be changed to redefine what constitutes a valid macro name, -using @ref{Changesyntax}. - -Of course, the simplest way to prevent a name from being interpreted -as a call to an existing macro is to quote it. The remainder of -this section studies a little more deeply how quoting affects macro -invocation, and how quoting can be used to inhibit macro invocation. - -Even if quoting is usually done over the whole macro name, it can also -be done over only a few characters of this name (provided, of course, -that the unquoted portions are not also a macro). It is also possible -to quote the empty string, but this works only @emph{inside} the name. -For example: - -@example -`divert' -@result{}divert -`d'ivert -@result{}divert -di`ver't -@result{}divert -div`'ert -@result{}divert -@end example - -@noindent -all yield the string @samp{divert}. While in both: - -@example -`'divert -@result{} -divert`' -@result{} -@end example - -@noindent -the @code{divert} builtin macro will be called, which expands to the -empty string. - -@cindex rescanning -The output of macro evaluations is always rescanned. 
In the following -example, the input @samp{x`'y} yields the string @samp{bCD}, exactly as -if @code{m4} -had been given @w{@samp{substr(ab`'cde, `1', `3')}} as input: - -@example -define(`cde', `CDE') -@result{} -define(`x', `substr(ab') -@result{} -define(`y', `cde, `1', `3')') -@result{} -x`'y -@result{}bCD -@end example - -Unquoted strings on either side of a quoted string are subject to -being recognized as macro names. In the following example, quoting the -empty string allows for the second @code{macro} to be recognized as such: - -@example -define(`macro', `m') -@result{} -macro(`m')macro -@result{}mmacro -macro(`m')`'macro -@result{}mm -@end example - -Quoting may prevent recognizing as a macro name the concatenation of a -macro expansion with the surrounding characters. In this example: - -@example -define(`macro', `di$1') -@result{} -macro(`v')`ert' -@result{}divert -macro(`v')ert -@result{} -@end example - -@noindent -the first input line produces the string @samp{divert}. Once the quotes are -removed, the @code{divert} builtin is called instead. - -@node Macro Arguments -@section Macro arguments - -@cindex macros, arguments to -@cindex arguments to macros -When a name is seen, and it has a macro definition, it will be expanded -as a macro. - -If the name is followed by an opening parenthesis, the arguments will be -collected before the macro is called. If too few arguments are -supplied, the missing arguments are taken to be the empty string. -However, some builtins are documented to behave differently for a -missing optional argument than for an explicit empty string. If there -are too many arguments, the excess arguments are ignored. Unquoted -leading whitespace is stripped off all arguments, but whitespace -generated by a macro expansion or occurring after a macro that expanded -to an empty string remains intact. Whitespace includes space, tab, -newline, carriage return, vertical tab, and formfeed. 
- -@example -define(`macro', `$1') -@result{} -macro( unquoted leading space lost) -@result{}unquoted leading space lost -macro(` quoted leading space kept') -@result{} quoted leading space kept -macro( - divert `unquoted space kept after expansion') -@result{} unquoted space kept after expansion -macro(macro(` -')`whitespace from expansion kept') -@result{} -@result{}whitespace from expansion kept -macro(`unquoted trailing whitespace kept' -) -@result{}unquoted trailing whitespace kept -@result{} -@end example - -@cindex warnings, suppressing -@cindex suppressing warnings -Normally @code{m4} will issue warnings if a builtin macro is called -with an inappropriate number of arguments, but it can be suppressed with -the @option{--quiet} command line option (or @option{--silent}, or -@option{-Q}, @pxref{Operation modes, , Invoking m4}). For user -defined macros, there is no check of the number of arguments given. - -@example -$ @kbd{m4} -index(`abc') -@error{}m4:stdin:1: warning: index: too few arguments: 1 < 2 -@result{}0 -index(`abc',) -@result{}0 -index(`abc', `b', `0', `ignored') -@error{}m4:stdin:3: warning: index: extra arguments ignored: 4 > 3 -@result{}1 -@end example - -@comment options: -Q -@example -$ @kbd{m4 -Q} -index(`abc') -@result{}0 -index(`abc',) -@result{}0 -index(`abc', `b', `', `ignored') -@result{}1 -@end example - -Macros are expanded normally during argument collection, and whatever -commas, quotes and parentheses that might show up in the resulting -expanded text will serve to define the arguments as well. Thus, if -@var{foo} expands to @samp{, b, c}, the macro call - -@comment ignore -@example -bar(a foo, d) -@end example - -@noindent -is a macro call with four arguments, which are @samp{a }, @samp{b}, -@samp{c} and @samp{d}. To understand why the first argument contains -whitespace, remember that unquoted leading whitespace is never part -of an argument, but trailing whitespace always is. 
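-
-As a minimal sketch of this rule, the definitions of @code{foo} and
-@code{bar} below are assumptions made for illustration (the text above
-only names the two macros). Here @code{bar} is defined to print its
-argument count followed by each argument in brackets, which makes the
-trailing space kept in the first argument visible.
-
-@example
-define(`foo', `, b, c')
-@result{}
-define(`bar', `$#: [$1] [$2] [$3] [$4]')
-@result{}
-bar(a foo, d)
-@result{}4: [a ] [b] [c] [d]
-@end example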
- -It is possible for a macro's definition to change during argument -collection, in which case the expansion uses the definition that was in -effect at the time the opening @samp{(} was seen. - -@example -define(`f', `1') -@result{} -f(define(`f', `2')) -@result{}1 -f -@result{}2 -@end example - -It is an error if the end of file occurs while collecting arguments. - -@comment status: 1 -@example -hello world -@result{}hello world -define( -^D -@error{}m4:stdin:2: define: end of file in argument list -@end example - -@node Quoting Arguments -@section On Quoting Arguments to macros - -@cindex quoted macro arguments -@cindex macros, quoted arguments to -@cindex arguments, quoted macro -Each argument has unquoted leading whitespace removed. Within each -argument, all unquoted parentheses must match. For example, if -@var{foo} is a macro, - -@comment ignore -@example -foo(() (`(') `(') -@end example - -@noindent -is a macro call, with one argument, whose value is @samp{() (() (}. -Commas separate arguments, except when they occur inside quotes, -comments, or unquoted parentheses. @xref{Pseudo Arguments}, for -examples. - -It is common practice to quote all arguments to macros, unless you are -sure you want the arguments expanded. Thus, in the above -example with the parentheses, the `right' way to do it is like this: - -@comment ignore -@example -foo(`() (() (') -@end example - -@cindex quoting rule of thumb -@cindex rule of thumb, quoting -It is, however, in certain cases necessary (because nested expansion -must occur to create the arguments for the outer macro) or convenient -(because it uses fewer characters) to leave out quotes for some -arguments, and there is nothing wrong in doing it. It just makes life a -bit harder, if you are not careful to follow a consistent quoting style. 
-For consistency, this manual follows the rule of thumb that each layer -of parentheses introduces another layer of single quoting, except when -showing the consequences of quoting rules. This is done even when the -quoted string cannot be a macro, such as with integers when you have not -changed the syntax via @code{changesyntax} (@pxref{Changesyntax}). - -The quoting rule of thumb of one level of quoting per layer of parentheses has a -nice property: when a macro name appears inside parentheses, you can -determine when it will be expanded. If it is not quoted, it will be -expanded prior to the outer macro, so that its expansion becomes the -argument. If it is single-quoted, it will be expanded after the outer -macro. And if it is double-quoted, it will be used as literal text -instead of a macro name. - -@example -define(`active', `ACT, IVE') -@result{} -define(`show', `$1 $1') -@result{} -show(active) -@result{}ACT ACT -show(`active') -@result{}ACT, IVE ACT, IVE -show(``active'') -@result{}active active -@end example - -@node Macro expansion -@section Macro expansion - -@cindex macros, expansion of -@cindex expansion of macros -When the arguments, if any, to a macro call have been collected, the -macro is expanded, and the expansion text is pushed back onto the input -(unquoted), and reread. The expansion text from one macro call might -therefore result in more macros being called, if the calls are included, -completely or partially, in the first macro call's expansion. - -Taking a very simple example, if @var{foo} expands to @samp{bar}, and -@var{bar} expands to @samp{Hello world}, the input - -@comment options: -Dbar='Hello world' -Dfoo=bar -@example -$ @kbd{m4 -Dbar="Hello world" -Dfoo=bar} -foo -@result{}Hello world -@end example - -@noindent -will expand first to @samp{bar}, and when this is reread and -expanded, into @samp{Hello world}. 
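-
-The rescanning step can also be observed without command line
-definitions. In the following sketch, @code{baz} is a hypothetical
-macro added only for contrast: its expansion text carries one extra
-level of quotes, so when that text is reread the quotes are stripped
-and @samp{bar} is output literally instead of being expanded again.
-
-@example
-define(`bar', `Hello world')
-@result{}
-define(`foo', `bar')
-@result{}
-define(`baz', ``bar'')
-@result{}
-foo
-@result{}Hello world
-baz
-@result{}bar
-@end example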
- -@node Definitions -@chapter How to define new macros - -@cindex macros, how to define new -@cindex defining new macros -Macros can be defined, redefined and deleted in several different ways. -Also, it is possible to redefine a macro without losing a previous -value, and bring back the original value at a later time. - -@menu -* Define:: Defining a new macro -* Arguments:: Arguments to macros -* Pseudo Arguments:: Special arguments to macros -* Undefine:: Deleting a macro -* Defn:: Renaming macros -* Pushdef:: Temporarily redefining macros -* Renamesyms:: Renaming macros with regular expressions - -* Indir:: Indirect call of macros -* Builtin:: Indirect call of builtins -* M4symbols:: Getting the defined macro names -@end menu - -@node Define -@section Defining a macro - -The normal way to define or redefine macros is to use the builtin -@code{define}: - -@deffn {Builtin (m4)} define (@var{name}, @ovar{expansion}) -Defines @var{name} to expand to @var{expansion}. If -@var{expansion} is not given, it is taken to be empty. - -The expansion of @code{define} is void. -The macro @code{define} is recognized only with parameters. -@end deffn -@comment Other implementations, such as Solaris, can define a macro -@comment with a builtin token attached to text: -@comment define(foo, a`'defn(`divnum')b) -@comment defn(`foo') => ab -@comment dumpdef(`foo') => foo: a<divnum>b -@comment len(defn(`foo')) => 3 -@comment index(defn(`foo'), defn(`divnum')) => 1 -@comment foo => a0b -@comment It may be worth making some changes to support this behavior, -@comment or something similar to it. -@comment -@comment But be sure it has sane semantics, with potentially deferred -@comment expansion of builtins. 
For example, this should not warn -@comment about trying to access the definition of an undefined macro: -@comment define(`foo', `ifdef(`$1', 'defn(`defn')`)')foo(`oops') -@comment Also, think how to handle conflicting argument counts: -@comment define(`bar', defn(`dnl', `len')) - -The following example defines the macro @var{foo} to expand to the text -@samp{Hello world.}. - -@example -define(`foo', `Hello world.') -@result{} -foo -@result{}Hello world. -@end example - -The empty line in the output is there because the newline is not -a part of the macro definition, and it is consequently copied to -the output. This can be avoided by use of the macro @code{dnl}. -@xref{Dnl}, for details. - -The first argument to @code{define} should be quoted; otherwise, if the -macro is already defined, you will be defining a different macro. This -example shows the problems with underquoting, since we did not want to -redefine @code{one}: - -@example -define(foo, one) -@result{} -define(foo, two) -@result{} -one -@result{}two -@end example - -@cindex GNU extensions -GNU @code{m4} normally replaces only the @emph{topmost} -definition of a macro if it has several definitions from @code{pushdef} -(@pxref{Pushdef}). Some other implementations of @code{m4} replace all -definitions of a macro with @code{define}. @xref{Incompatibilities}, -for more details. - -As a GNU extension, the first argument to @code{define} does -not have to be a simple word. -It can be any text string, even the empty string. A macro with a -non-standard name cannot be invoked in the normal way, as the name is -not recognized. It can only be referenced by the builtins @code{indir} -(@pxref{Indir}) and @code{defn} (@pxref{Defn}). - -@cindex arrays -Arrays and associative arrays can be simulated by using non-standard -macro names. - -@deffn Composite array (@var{index}) -@deffnx Composite array_set (@var{index}, @ovar{value}) -Provide access to entries within an array. 
@code{array} reads the entry -at location @var{index}, and @code{array_set} assigns @var{value} to -location @var{index}. -@end deffn - -@example -define(`array', `defn(format(``array[%d]'', `$1'))') -@result{} -define(`array_set', `define(format(``array[%d]'', `$1'), `$2')') -@result{} -array_set(`4', `array element no. 4') -@result{} -array_set(`17', `array element no. 17') -@result{} -array(`4') -@result{}array element no. 4 -array(eval(`10 + 7')) -@result{}array element no. 17 -@end example - -Change the @samp{%d} to @samp{%s} and it is an associative array. - -@node Arguments -@section Arguments to macros - -@cindex macros, arguments to -@cindex arguments to macros -Macros can have arguments. The @var{n}th argument is denoted by -@code{$n} in the expansion text, and is replaced by the @var{n}th actual -argument, when the macro is expanded. Replacement of arguments happens -before rescanning, regardless of how many nesting levels of quoting -appear in the expansion. Here is an example of a macro with -two arguments. - -@deffn Composite exch (@var{arg1}, @var{arg2}) -Expands to @var{arg2} followed by @var{arg1}, effectively exchanging -their order. -@end deffn - -@example -define(`exch', `$2, $1') -@result{} -exch(`arg1', `arg2') -@result{}arg2, arg1 -@end example - -This can be used, for example, if you like the arguments to -@code{define} to be reversed. - -@example -define(`exch', `$2, $1') -@result{} -define(exch(``expansion text'', ``macro'')) -@result{} -macro -@result{}expansion text -@end example - -@xref{Quoting Arguments}, for an explanation of the double quotes. -(You should try and improve this example so that clients of @code{exch} -do not have to double quote; or @pxref{Improved exch, , Answers}). - -@cindex GNU extensions -GNU @code{m4} allows the number following the @samp{$} to -consist of one -or more digits, allowing macros to have any number of arguments. This -is not so in UNIX implementations of @code{m4}, which only recognize -one digit. 
-@comment FIXME - See Austin group XCU ERN 111. POSIX says that $11 must -@comment be the first argument concatenated with 1, and instead reserves -@comment ${11} for implementation use. Once this is implemented, the -@comment documentation needs to reflect how these extended arguments -@comment are handled, as well as backwards compatibility issues with -@comment 1.4.x. Also, consider adding further extensions such as -@comment ${1-default}, which expands to `default' if $1 is empty. - -As a special case, the zeroth argument, @code{$0}, is always the name -of the macro being expanded. - -@example -define(`test', ``Macro name: $0'') -@result{} -test -@result{}Macro name: test -@end example - -If you want quoted text to appear as part of the expansion text, -remember that quotes can be nested in quoted strings. Thus, in - -@example -define(`foo', `This is macro `foo'.') -@result{} -foo -@result{}This is macro foo. -@end example - -@noindent -The @samp{foo} in the expansion text is @emph{not} expanded, since it is -a quoted string, and not a name. - -@node Pseudo Arguments -@section Special arguments to macros - -@cindex special arguments to macros -@cindex macros, special arguments to -@cindex arguments to macros, special -There is a special notation for the number of actual arguments supplied, -and for all the actual arguments. - -The number of actual arguments in a macro call is denoted by @code{$#} -in the expansion text. - -@deffn Composite nargs (@dots{}) -Expands to a count of the number of arguments supplied. 
-@end deffn - -@example -define(`nargs', `$#') -@result{} -nargs -@result{}0 -nargs() -@result{}1 -nargs(`arg1', `arg2', `arg3') -@result{}3 -nargs(`commas can be quoted, like this') -@result{}1 -nargs(arg1#inside comments, commas do not separate arguments -still arg1) -@result{}1 -nargs((unquoted parentheses, like this, group arguments)) -@result{}1 -@end example - -Remember that @samp{#} defaults to the comment character; if you forget -quotes to inhibit the comment behavior, your macro definition may not -end where you expected. - -@example -dnl Attempt to define a macro to just `$#' -define(underquoted, $#) -oops) -@result{} -underquoted -@result{}0) -@result{}oops -@end example - -The notation @code{$*} can be used in the expansion text to denote all -the actual arguments, unquoted, with commas in between. For example: - -@example -define(`echo', `$*') -@result{} -echo(arg1, arg2, arg3 , arg4) -@result{}arg1,arg2,arg3 ,arg4 -@end example - -Often each argument should be quoted, and the notation @code{$@@} handles -that. It is just like @code{$*}, except that it quotes each argument. -A simple example of that is: - -@example -define(`echo', `$@@') -@result{} -echo(arg1, arg2, arg3 , arg4) -@result{}arg1,arg2,arg3 ,arg4 -@end example - -Where did the quotes go? Of course, they were eaten when the expanded -text was reread by @code{m4}. To show the difference, try - -@example -define(`echo1', `$*') -@result{} -define(`echo2', `$@@') -@result{} -define(`foo', `This is macro `foo'.') -@result{} -echo1(foo) -@result{}This is macro This is macro foo.. -echo1(`foo') -@result{}This is macro foo. -echo2(foo) -@result{}This is macro foo. -echo2(`foo') -@result{}foo -@end example - -@noindent -@xref{Trace}, if you do not understand this. As another example of the -difference, remember that comments encountered in arguments are passed -untouched to the macro, and that quoting disables comments. 
- -@example -define(`echo1', `$*') -@result{} -define(`echo2', `$@@') -@result{} -define(`foo', `bar') -@result{} -echo1(#foo'foo -foo) -@result{}#foo'foo -@result{}bar -echo2(#foo'foo -foo) -@result{}#foobar -@result{}bar' -@end example - -A @samp{$} sign in the expansion text, that is not followed by anything -@code{m4} understands, is simply copied to the macro expansion, as any -other text is. - -@example -define(`foo', `$$$ hello $$$') -@result{} -foo -@result{}$$$ hello $$$ -@end example - -@cindex rescanning -@cindex literal output -@cindex output, literal -If you want a macro to expand to something like @samp{$12}, the -judicious use of nested quoting can put a safe character between the -@code{$} and the next character, relying on the rescanning to remove the -nested quote. This will prevent @code{m4} from interpreting the -@code{$} sign as a reference to an argument. - -@example -define(`foo', `no nested quote: $1') -@result{} -foo(`arg') -@result{}no nested quote: arg -define(`foo', `nested quote around $: `$'1') -@result{} -foo(`arg') -@result{}nested quote around $: $1 -define(`foo', `nested empty quote after $: $`'1') -@result{} -foo(`arg') -@result{}nested empty quote after $: $1 -define(`foo', `nested quote around next character: $`1'') -@result{} -foo(`arg') -@result{}nested quote around next character: $1 -define(`foo', `nested quote around both: `$1'') -@result{} -foo(`arg') -@result{}nested quote around both: arg -@end example - -@node Undefine -@section Deleting a macro - -@cindex macros, how to delete -@cindex deleting macros -@cindex undefining macros -A macro definition can be removed with @code{undefine}: - -@deffn {Builtin (m4)} undefine (@var{name}@dots{}) -For each argument, remove the macro @var{name}. The macro names must -necessarily be quoted, since they will be expanded otherwise. If an -argument is not a defined macro, then the @samp{d} debug level controls -whether a warning is issued (@pxref{Debugmode}). 
- -The expansion of @code{undefine} is void. -The macro @code{undefine} is recognized only with parameters. -@end deffn - -@example -foo bar blah -@result{}foo bar blah -define(`foo', `some')define(`bar', `other')define(`blah', `text') -@result{} -foo bar blah -@result{}some other text -undefine(`foo') -@result{} -foo bar blah -@result{}foo other text -undefine(`bar', `blah') -@result{} -foo bar blah -@result{}foo bar blah -@end example - -Undefining a macro inside that macro's expansion is safe; the macro -still expands to the definition that was in effect at the @samp{(}. - -@example -define(`f', ``$0':$1') -@result{} -f(f(f(undefine(`f')`hello world'))) -@result{}f:f:f:hello world -f(`bye') -@result{}f(bye) -@end example - -As of M4 1.6, @code{undefine} can warn if @var{name} is not a macro, by -using @code{debugmode} (@pxref{Debugmode}) or the command line option -@option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking -m4}). - -@example -$ @kbd{m4} -undefine(`a') -@error{}m4:stdin:1: warning: undefine: undefined macro 'a' -@result{} -debugmode(`-d') -@result{} -undefine(`a') -@result{} -@end example - -@node Defn -@section Renaming macros - -@cindex macros, how to rename -@cindex renaming macros -@cindex macros, displaying definitions -@cindex definitions, displaying macro -It is possible to rename an already defined macro. To do this, you need -the builtin @code{defn}: - -@deffn {Builtin (m4)} defn (@var{name}@dots{}) -Expands to the @emph{quoted definition} of each @var{name}. If an -argument is not a defined macro, the expansion for that argument is -empty, and the @samp{d} debug level controls whether a warning is issued -(@pxref{Debugmode}). - -If @var{name} is a user-defined macro, the quoted definition is simply -the quoted expansion text. If, instead, @var{name} is a builtin, the -expansion is a special token, which points to the builtin's internal -definition. 
This token is meaningful primarily as the second argument to -@code{define} (and @code{pushdef}), and is silently converted to an -empty string in many other contexts. - -The macro @code{defn} is recognized only with parameters. -@end deffn - -Its normal use is best understood through an example, which shows how to -rename @code{undefine} to @code{zap}: - -@example -define(`zap', defn(`undefine')) -@result{} -zap(`undefine') -@result{} -undefine(`zap') -@result{}undefine(zap) -@end example - -In this way, @code{defn} can be used to copy macro definitions, and also -definitions of builtin macros. Even if the original macro is removed, -the other name can still be used to access the definition. - -The fact that macro definitions can be transferred also explains why you -should use @code{$0}, rather than retyping a macro's name in its -definition: - -@example -define(`foo', `This is `$0'') -@result{} -define(`bar', defn(`foo')) -@result{} -bar -@result{}This is bar -@end example - -Macros used as string variables should be referred to through @code{defn}, -to avoid unwanted expansion of the text: - -@example -define(`string', `The macro dnl is very useful -') -@result{} -string -@result{}The macro@w{ } -defn(`string') -@result{}The macro dnl is very useful -@result{} -@end example - -@cindex rescanning -However, it is important to remember that @code{m4} rescanning is purely -textual. If an unbalanced end-quote string occurs in a macro -definition, the rescan will see that embedded quote as the termination -of the quoted string, and the remainder of the macro's definition will -be rescanned unquoted. Thus it is a good idea to avoid unbalanced -end-quotes in macro definitions or arguments to macros. 
- -@example -define(`foo', a'a) -@result{} -define(`a', `A') -@result{} -define(`echo', `$@@') -@result{} -foo -@result{}A'A -defn(`foo') -@result{}aA' -echo(foo) -@result{}AA' -@end example - -On the other hand, it is possible to exploit the fact that @code{defn} -can concatenate multiple macros prior to the rescanning phase, in order -to join the definitions of macros that, in isolation, have unbalanced -quotes. This is particularly useful when one has used several macros to -accumulate text that M4 should rescan as a whole. In the example below, -note how the use of @code{defn} on @code{l} in isolation opens a string, -which is not closed until the next line; but used on @code{l} and -@code{r} together results in nested quoting. - -@example -define(`l', `<[>')define(`r', `<]>') -@result{} -changequote(`[', `]') -@result{} -defn([l])defn([r]) -]) -@result{}<[>]defn([r]) -@result{}) -defn([l], [r]) -@result{}<[>][<]> -@end example - -@cindex builtins, special tokens -@cindex tokens, builtin macro -Using @code{defn} to generate special tokens for builtin macros will -generate a warning in contexts where a macro name is expected. But in -contexts that operate on text, the builtin token is just silently -converted to an empty string. As of M4 1.6, expansion of user macros -will also preserve builtin tokens. However, any use of builtin tokens -outside of the second argument to @code{define} and @code{pushdef} is -generally not portable, since earlier GNU M4 versions, as well -as other @code{m4} implementations, vary on how such tokens are treated. 
- -@example -$ @kbd{m4 -d} -defn(`defn') -@result{} -define(defn(`divnum'), `cannot redefine a builtin token') -@error{}m4:stdin:2: warning: define: invalid macro name ignored -@result{} -divnum -@result{}0 -len(defn(`divnum')) -@result{}0 -define(`echo', `$@@') -@result{} -define(`mydivnum', shift(echo(`', defn(`divnum')))) -@result{} -mydivnum -@result{}0 -define(`', `empty-$1') -@result{} -defn(defn(`divnum')) -@error{}m4:stdin:9: warning: defn: invalid macro name ignored -@result{} -pushdef(defn(`divnum'), `oops') -@error{}m4:stdin:10: warning: pushdef: invalid macro name ignored -@result{} -traceon(defn(`divnum')) -@error{}m4:stdin:11: warning: traceon: invalid macro name ignored -@result{} -indir(defn(`divnum'), `string') -@error{}m4:stdin:12: warning: indir: invalid macro name ignored -@result{} -indir(`', `string') -@result{}empty-string -traceoff(defn(`divnum')) -@error{}m4:stdin:14: warning: traceoff: invalid macro name ignored -@result{} -popdef(defn(`divnum')) -@error{}m4:stdin:15: warning: popdef: invalid macro name ignored -@result{} -dumpdef(defn(`divnum')) -@error{}m4:stdin:16: warning: dumpdef: invalid macro name ignored -@result{} -undefine(defn(`divnum')) -@error{}m4:stdin:17: warning: undefine: invalid macro name ignored -@result{} -dumpdef(`') -@error{}:@tabchar{}`empty-$1' -@result{} -m4symbols(defn(`divnum')) -@error{}m4:stdin:19: warning: m4symbols: invalid macro name ignored -@result{} -define(`foo', `define(`$1', $2)')dnl -foo(`bar', defn(`divnum')) -@result{} -bar -@result{}0 -@end example - -As of M4 1.6, @code{defn} can warn if @var{name} is not a macro, by -using @code{debugmode} (@pxref{Debugmode}) or the command line option -@option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking -m4}). Also, @code{defn} with multiple arguments can join text with -builtin tokens. 
However, when defining a macro via @code{define} or -@code{pushdef}, a warning is issued and the builtin token ignored if the -builtin token does not occur in isolation. A future version of -GNU M4 may lift this restriction. - -@example -$ @kbd{m4 -d} -defn(`foo') -@error{}m4:stdin:1: warning: defn: undefined macro 'foo' -@result{} -debugmode(`-d') -@result{} -defn(`foo') -@result{} -define(`a', `A')define(`AA', `b') -@result{} -traceon(`defn', `define') -@result{} -defn(`a', `divnum', `a') -@error{}m4trace: -1- defn(`a', `divnum', `a') -> ``A'<divnum>`A'' -@result{}AA -define(`mydivnum', defn(`divnum', `divnum'))mydivnum -@error{}m4trace: -2- defn(`divnum', `divnum') -> `<divnum><divnum>' -@error{}m4:stdin:7: warning: define: cannot concatenate builtins -@error{}m4trace: -1- define(`mydivnum', `<divnum><divnum>') -> `' -@result{} -traceoff(`defn', `define')dumpdef(`mydivnum') -@error{}mydivnum:@tabchar{}`' -@result{} -define(`mydivnum', defn(`divnum')defn(`divnum'))mydivnum -@error{}m4:stdin:9: warning: define: cannot concatenate builtins -@result{} -define(`mydivnum', defn(`divnum')`a')mydivnum -@error{}m4:stdin:10: warning: define: cannot concatenate builtins -@result{}A -define(`mydivnum', `a'defn(`divnum'))mydivnum -@error{}m4:stdin:11: warning: define: cannot concatenate builtins -@result{}A -define(`q', ``$@@'') -@result{} -define(`foo', q(`a', defn(`divnum')))foo -@error{}m4:stdin:13: warning: define: cannot concatenate builtins -@result{}a, -ifdef(`foo', `yes', `no') -@result{}yes -@end example - -@node Pushdef -@section Temporarily redefining macros - -@cindex macros, temporary redefinition of -@cindex temporary redefinition of macros -@cindex redefinition of macros, temporary -@cindex definition stack -@cindex pushdef stack -@cindex stack, macro definition -It is possible to redefine a macro temporarily, reverting to the -previous definition at a later time. 
This is done with the builtins -@code{pushdef} and @code{popdef}: - -@deffn {Builtin (m4)} pushdef (@var{name}, @ovar{expansion}) -@deffnx {Builtin (m4)} popdef (@var{name}@dots{}) -Analogous to @code{define} and @code{undefine}. - -These macros work in a stack-like fashion. A macro is temporarily -redefined with @code{pushdef}, which replaces an existing definition of -@var{name}, while saving the previous definition, before the new one is -installed. If there is no previous definition, @code{pushdef} behaves -exactly like @code{define}. - -If a macro has several definitions (of which only one is accessible), -the topmost definition can be removed with @code{popdef}. If there is -no previous definition, @code{popdef} behaves like @code{undefine}, and -if there is no definition at all, the @samp{d} debug level controls -whether a warning is issued (@pxref{Debugmode}). - -The expansion of both @code{pushdef} and @code{popdef} is void. -The macros @code{pushdef} and @code{popdef} are recognized only with -parameters. -@end deffn - -@example -define(`foo', `Expansion one.') -@result{} -foo -@result{}Expansion one. -pushdef(`foo', `Expansion two.') -@result{} -foo -@result{}Expansion two. -pushdef(`foo', `Expansion three.') -@result{} -pushdef(`foo', `Expansion four.') -@result{} -popdef(`foo') -@result{} -foo -@result{}Expansion three. -popdef(`foo', `foo') -@result{} -foo -@result{}Expansion one. -popdef(`foo') -@result{} -foo -@result{}foo -@end example - -If a macro with several definitions is redefined with @code{define}, the -topmost definition is @emph{replaced} with the new definition. If it is -removed with @code{undefine}, @emph{all} the definitions are removed, -and not only the topmost one. 
However, POSIX allows other -implementations that treat @code{define} as replacing an entire stack -of definitions with a single new definition, so to be portable to other -implementations, it may be worth explicitly using @code{popdef} and -@code{pushdef} rather than relying on the GNU behavior of -@code{define}. - -@example -define(`foo', `Expansion one.') -@result{} -foo -@result{}Expansion one. -pushdef(`foo', `Expansion two.') -@result{} -foo -@result{}Expansion two. -define(`foo', `Second expansion two.') -@result{} -foo -@result{}Second expansion two. -undefine(`foo') -@result{} -foo -@result{}foo -@end example - -@cindex local variables -@cindex variables, local -Local variables within macros are made with @code{pushdef} and -@code{popdef}. At the start of the macro a new definition is pushed, -within the macro it is manipulated and at the end it is popped, -revealing the former definition. - -It is possible to temporarily redefine a builtin with @code{pushdef} -and @code{defn}. - -As of M4 1.6, @code{popdef} can warn if @var{name} is not a macro, by -using @code{debugmode} (@pxref{Debugmode}) or the command line option -@option{-d} (@option{--debugmode}, @pxref{Debugging options, , Invoking -m4}). - -@example -define(`a', `1') -@result{} -popdef -@result{}popdef -popdef(`a', `a') -@error{}m4:stdin:3: warning: popdef: undefined macro 'a' -@result{} -debugmode(`-d') -@result{} -popdef(`a') -@result{} -@end example - -@node Renamesyms -@section Renaming macros with regular expressions - -@cindex regular expressions -@cindex macros, how to rename -@cindex renaming macros -@cindex GNU extensions -Sometimes it is desirable to rename multiple symbols without having to -use a long sequence of calls to @code{define}. 
The @code{renamesyms} -builtin allows this: - -@deffn {Builtin (gnu)} renamesyms (@var{regexp}, @var{replacement}, @ - @ovar{resyntax}) -Global renaming of macros is done by @code{renamesyms}, which selects -all macros with names that match @var{regexp}, and renames each match -according to @var{replacement}. It is unspecified what happens if the -rename causes multiple macros to map to the same name. -@comment FIXME - right now, collisions cause a core dump on some platforms: -@comment define(bar,1)define(baz,2)renamesyms(^ba., baa)dumpdef(`baa') - -If @var{resyntax} is given, the particular flavor of regular -expression understood with respect to @var{regexp} can be changed from -the current default. @xref{Changeresyntax}, for details of the values -that can be given for this argument. - -A macro that does not have a name that matches @var{regexp} is left -with its original name. If only part of the name matches, any part of -the name that is not covered by @var{regexp} is copied to the -replacement name. Whenever a match is found in the name, the search -proceeds from the end of the match, so no character in the original -name can be substituted twice. If @var{regexp} matches a string of -zero length, the start position for the continued search is -incremented to avoid infinite loops. - -Where a replacement is to be made, @var{replacement} replaces the -matched text in the original name, with @samp{\@var{n}} substituted by -the text matched by the @var{n}th parenthesized sub-expression of -@var{regexp}, and @samp{\&} being the text matched by the entire -regular expression. - -The expansion of @code{renamesyms} is void. -The macro @code{renamesyms} is recognized only with parameters. -This macro was added in M4 2.0. -@end deffn - -The following example starts with a rename similar to the -@option{--prefix-builtins} option (or @option{-P}), prefixing every -macro with @code{m4_}. 
However, note that @option{-P} only renames M4 -builtin macros, even if other macros were defined previously, while -@code{renamesyms} will rename any macros that match when it runs, -including text macros. The rest of the example demonstrates the -behavior of unanchored regular expressions in symbol renaming. - -@comment options: -Dfoo=bar -P -@example -$ @kbd{m4 -Dfoo=bar -P} -foo -@result{}bar -m4_foo -@result{}m4_foo -m4_defn(`foo') -@result{}bar -@end example - -@example -$ @kbd{m4} -define(`foo', `bar') -@result{} -renamesyms(`^.*$', `m4_\&') -@result{} -foo -@result{}foo -m4_foo -@result{}bar -m4_defn(`m4_foo') -@result{}bar -m4_renamesyms(`f', `g') -@result{} -m4_igdeg(`m4_goo', `m4_goo') -@result{}bar -@end example - -If @var{resyntax} is given, @var{regexp} must be given according to -the syntax chosen, though the default regular expression syntax -remains unchanged for other invocations. Here is a more realistic -example that performs a similar renaming on macros, except that it -ignores macros with names that begin with @samp{_}, and avoids creating -macros with names that begin with @samp{m4_m4}. - -@example -renamesyms(`^[^_]\w*$', `m4_\&') -@result{} -m4_renamesyms(`^m4_m4(\w*)$', `m4_\1', `POSIX_EXTENDED') -@result{} -m4_wrap(__line__ -) -@result{} -^D -@result{}3 -@end example - -When a symbol has multiple definitions, thanks to @code{pushdef}, the -entire stack is renamed. 
- -@example -pushdef(`foo', `1')pushdef(`foo', `2') -@result{} -renamesyms(`^foo$', `bar') -@result{} -bar -@result{}2 -popdef(`bar')bar -@result{}1 -popdef(`bar')bar -@result{}bar -@end example - -@node Indir -@section Indirect call of macros - -@cindex indirect call of macros -@cindex call of macros, indirect -@cindex macros, indirect call of -@cindex GNU extensions -Any macro can be called indirectly with @code{indir}: - -@deffn {Builtin (gnu)} indir (@var{name}, @ovar{args@dots{}}) -Results in a call to the macro @var{name}, which is passed the rest of -the arguments @var{args}. If @var{name} is not defined, the expansion -is void, and the @samp{d} debug level controls whether a warning is -issued (@pxref{Debugmode}). - -The macro @code{indir} is recognized only with parameters. -@end deffn - -This can be used to call macros with computed or ``invalid'' -names (@code{define} allows such names to be defined): - -@example -define(`$$internal$macro', `Internal macro (name `$0')') -@result{} -$$internal$macro -@result{}$$internal$macro -indir(`$$internal$macro') -@result{}Internal macro (name $$internal$macro) -@end example - -The point here is that larger macro packages can define private macros -that will not be called by accident. They can @emph{only} be -called through the builtin @code{indir}. - -One other point to observe is that argument collection occurs before -@code{indir} invokes @var{name}, so if argument collection changes the -definition of @var{name}, that will be reflected in the final expansion. -This differs from the behavior when invoking macros directly, -where the definition that was in effect before argument collection is -used. 
- -@example -$ @kbd{m4 -d} -define(`f', `1') -@result{} -f(define(`f', `2')) -@result{}1 -indir(`f', define(`f', `3')) -@result{}3 -indir(`f', undefine(`f')) -@error{}m4:stdin:4: warning: indir: undefined macro 'f' -@result{} -debugmode(`-d') -@result{} -indir(`f') -@result{} -@end example - -When handed the result of @code{defn} (@pxref{Defn}) as one of its -arguments, @code{indir} defers to the invoked @var{name} for whether a -token representing a builtin is recognized or flattened to the empty -string. - -@example -$ @kbd{m4 -d} -indir(defn(`defn'), `divnum') -@error{}m4:stdin:1: warning: indir: invalid macro name ignored -@result{} -indir(`define', defn(`defn'), `divnum') -@error{}m4:stdin:2: warning: define: invalid macro name ignored -@result{} -indir(`define', `foo', defn(`divnum')) -@result{} -foo -@result{}0 -indir(`divert', defn(`foo')) -@error{}m4:stdin:5: warning: divert: empty string treated as 0 -@result{} -@end example - -Warning messages issued on behalf of an indirect macro use an -unambiguous representation of the macro name, using escape sequences -similar to C strings, and with colons also quoted. - -@example -define(`%%:\ -odd', defn(`divnum')) -@result{} -indir(`%%:\ -odd', `extra') -@error{}m4:stdin:3: warning: %%\:\\\nodd: extra arguments ignored: 1 > 0 -@result{}0 -@end example - -@node Builtin -@section Indirect call of builtins - -@cindex indirect call of builtins -@cindex call of builtins, indirect -@cindex builtins, indirect call of -@cindex GNU extensions -Builtin macros can be called indirectly with @code{builtin}: - -@deffn {Builtin (gnu)} builtin (@var{name}, @ovar{args@dots{}}) -@deffnx {Builtin (gnu)} builtin (@code{defn(`builtin')}, @var{name1}) -Results in a call to the builtin @var{name}, which is passed the -rest of the arguments @var{args}. If @var{name} does not name a -builtin, the expansion is void, and the @samp{d} debug level controls -whether a warning is issued (@pxref{Debugmode}). 
- -As a special case, if @var{name} is exactly the special token -representing the @code{builtin} macro, as obtained by @code{defn} -(@pxref{Defn}), then @var{args} must consist of a single @var{name1}, -and the expansion is the special token representing the builtin macro -named by @var{name1}. - -The macro @code{builtin} is recognized only with parameters. -@end deffn - -This can be used even if @var{name} has been given another definition -that has covered the original, or been undefined so that no macro -maps to the builtin. - -@example -pushdef(`define', `hidden') -@result{} -undefine(`undefine') -@result{} -define(`foo', `bar') -@result{}hidden -foo -@result{}foo -builtin(`define', `foo', defn(`divnum')) -@result{} -foo -@result{}0 -builtin(`define', `foo', `BAR') -@result{} -foo -@result{}BAR -undefine(`foo') -@result{}undefine(foo) -foo -@result{}BAR -builtin(`undefine', `foo') -@result{} -foo -@result{}foo -@end example - -The @var{name} argument only matches the original name of the builtin, -even when the @option{--prefix-builtins} option (or @option{-P}, -@pxref{Operation modes, , Invoking m4}) is in effect. This is different -from @code{indir}, which only tracks current macro names. - -@comment options: -P -@example -$ @kbd{m4 -P} -m4_builtin(`divnum') -@result{}0 -m4_builtin(`m4_divnum') -@error{}m4:stdin:2: warning: m4_builtin: undefined builtin 'm4_divnum' -@result{} -m4_indir(`divnum') -@error{}m4:stdin:3: warning: m4_indir: undefined macro 'divnum' -@result{} -m4_indir(`m4_divnum') -@result{}0 -m4_debugmode(`-d') -@result{} -m4_builtin(`m4_divnum') -@result{} -@end example - -Note that @code{indir} and @code{builtin} can be used to invoke builtins -without arguments, even when they normally require parameters to be -recognized; but it will provoke a warning, and the expansion will behave -as though empty strings had been passed as the required arguments. 
- -@example -builtin -@result{}builtin -builtin() -@error{}m4:stdin:2: warning: builtin: undefined builtin '' -@result{} -builtin(`builtin') -@error{}m4:stdin:3: warning: builtin: too few arguments: 0 < 1 -@result{} -builtin(`builtin',) -@error{}m4:stdin:4: warning: builtin: undefined builtin '' -@result{} -builtin(`builtin', ``' -') -@error{}m4:stdin:5: warning: builtin: undefined builtin '`\'\n' -@result{} -indir(`index') -@error{}m4:stdin:7: warning: index: too few arguments: 0 < 2 -@result{}0 -@end example - -Normally, once a builtin macro is undefined, the only way to retrieve -its functionality is by defining a new macro that expands to -@code{builtin} under the hood. But this extra layer of expansion is -slightly inefficient, not to mention the fact that it is not robust to -changes in the current quoting scheme due to @code{changequote} -(@pxref{Changequote}). On the other hand, defining a macro to the -special token produced by @code{defn} (@pxref{Defn}) is very efficient, -and avoids the need for quoting within the macro definition; but -@code{defn} only works if the desired macro is already defined by some -other name. So @code{builtin} provides a special case where it is -possible to retrieve the same special token representing a builtin as -what @code{defn} would provide, were the desired macro still defined. -This feature is activated by passing @code{defn(`builtin')} as the first -argument to builtin. Normally, passing a special token representing a -macro as @var{name} results in a warning and an empty expansion, but in -this case, if the second argument @var{name1} names a valid builtin, -there is no warning and the expansion is the appropriate special -token. In fact, with just the @code{builtin} macro accessible, it is -possible to reconstitute the entire startup state of @code{m4}. - -In the example below, compare the number of macro invocations performed -by @code{defn1} and @code{defn2}, and the differences once quoting is -changed. 
- -@example -$ @kbd{m4 -d} -undefine(`defn') -@result{} -define(`foo', `bar') -@result{} -define(`defn1', `builtin(`defn', $@@)') -@result{} -define(`defn2', builtin(builtin(`defn', `builtin'), `defn')) -@result{} -dumpdef(`defn1', `defn2') -@error{}defn1:@tabchar{}`builtin(`defn', $@@)' -@error{}defn2:@tabchar{}<defn> -@result{} -traceon -@result{} -defn1(`foo') -@error{}m4trace: -1- defn1(`foo') -> `builtin(`defn', `foo')' -@error{}m4trace: -1- builtin(`defn', `foo') -> ``bar'' -@result{}bar -defn2(`foo') -@error{}m4trace: -1- defn2(`foo') -> ``bar'' -@result{}bar -traceoff -@error{}m4trace: -1- traceoff -> `' -@result{} -changequote(`[', `]') -@result{} -defn1([foo]) -@error{}m4:stdin:11: warning: builtin: undefined builtin '`defn\'' -@result{} -defn2([foo]) -@result{}bar -define([defn1], [builtin([defn], $@@)]) -@result{} -defn1([foo]) -@result{}bar -changequote -@result{} -defn1(`foo') -@error{}m4:stdin:16: warning: builtin: undefined builtin '[defn]' -@result{} -@end example - -@node M4symbols -@section Getting the defined macro names - -@cindex macro names, listing -@cindex listing macro names -@cindex currently defined macros -@cindex GNU extensions -The names of the currently defined macros can be accessed with -@code{m4symbols}: - -@deffn {Builtin (gnu)} m4symbols (@ovar{names@dots{}}) -Without arguments, @code{m4symbols} expands to a sorted list of quoted -strings, separated by commas. This contrasts with @code{dumpdef} -(@pxref{Dumpdef}), whose output cannot be accessed by @code{m4} -programs. - -When given arguments, @code{m4symbols} returns the sorted subset of the -@var{names} currently defined, and silently ignores the rest. -This macro was added in M4 2.0. -@end deffn - -@example -m4symbols(`ifndef', `ifdef', `define', `undef') -@result{}define,ifdef -@end example - -@node Conditionals -@chapter Conditionals, loops, and recursion - -Macros, expanding to plain text, perhaps with arguments, are not quite -enough. 
We would like to have macros expand to different things, based -on decisions taken at run-time. For that, we need some kind of conditionals. -Also, we would like to have some kind of loop construct, so we could do -something a number of times, or while some condition is true. - -@menu -* Ifdef:: Testing if a macro is defined -* Ifelse:: If-else construct, or multibranch -* Shift:: Recursion in @code{m4} -* Forloop:: Iteration by counting -* Foreach:: Iteration by list contents -* Stacks:: Working with definition stacks -* Composition:: Building macros with macros -@end menu - -@node Ifdef -@section Testing if a macro is defined - -@cindex conditionals -There are two different builtin conditionals in @code{m4}. The first is -@code{ifdef}: - -@deffn {Builtin (m4)} ifdef (@var{name}, @var{string-1}, @ovar{string-2}) -If @var{name} is defined as a macro, @code{ifdef} expands to -@var{string-1}, otherwise to @var{string-2}. If @var{string-2} is -omitted, it is taken to be the empty string (according to the normal -rules). - -The macro @code{ifdef} is recognized only with parameters. -@end deffn - -@example -ifdef(`foo', ``foo' is defined', ``foo' is not defined') -@result{}foo is not defined -define(`foo', `') -@result{} -ifdef(`foo', ``foo' is defined', ``foo' is not defined') -@result{}foo is defined -ifdef(`no_such_macro', `yes', `no', `extra argument') -@error{}m4:stdin:4: warning: ifdef: extra arguments ignored: 4 > 3 -@result{}no -@end example - -As of M4 1.6, @code{ifdef} transparently handles builtin tokens -generated by @code{defn} (@pxref{Defn}) that occur in either -@var{string}, although a warning is issued for invalid macro names. 
- -@example -define(`', `empty') -@result{} -ifdef(defn(`defn'), `yes', `no') -@error{}m4:stdin:2: warning: ifdef: invalid macro name ignored -@result{}no -define(`foo', ifdef(`divnum', defn(`divnum'), `undefined')) -@result{} -foo -@result{}0 -@end example - -@node Ifelse -@section If-else construct, or multibranch - -@cindex comparing strings -@cindex discarding input -@cindex input, discarding -The other conditional, @code{ifelse}, is much more powerful. It can be -used as a way to introduce a long comment, as an if-else construct, or -as a multibranch, depending on the number of arguments supplied: - -@deffn {Builtin (m4)} ifelse (@var{comment}) -@deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal}, @ - @ovar{not-equal}) -@deffnx {Builtin (m4)} ifelse (@var{string-1}, @var{string-2}, @var{equal-1}, @ - @var{string-3}, @var{string-4}, @var{equal-2}, @dots{}, @ovar{not-equal}) -Used with only one argument, the @code{ifelse} simply discards it and -produces no output. - -If called with three or four arguments, @code{ifelse} expands into -@var{equal}, if @var{string-1} and @var{string-2} are equal (character -for character), otherwise it expands to @var{not-equal}. A final fifth -argument is ignored, after triggering a warning. - -If called with six or more arguments, and @var{string-1} and -@var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, -otherwise the first three arguments are discarded and the processing -starts again. - -The macro @code{ifelse} is recognized only with parameters. -@end deffn - -Using only one argument is a common @code{m4} idiom for introducing a -block comment, as an alternative to repeatedly using @code{dnl}. This -special usage is recognized by GNU @code{m4}, so that in this -case, the warning about missing arguments is never triggered. 
- -@example -ifelse(`some comments') -@result{} -ifelse(`foo', `bar') -@error{}m4:stdin:2: warning: ifelse: too few arguments: 2 < 3 -@result{} -@end example - -Using three or four arguments provides decision points. - -@example -ifelse(`foo', `bar', `true') -@result{} -ifelse(`foo', `foo', `true') -@result{}true -define(`foo', `bar') -@result{} -ifelse(foo, `bar', `true', `false') -@result{}true -ifelse(foo, `foo', `true', `false') -@result{}false -@end example - -@cindex macro, blind -@cindex blind macro -Notice how the first argument was used unquoted; it is common to compare -the expansion of a macro with a string. With this macro, you can now -reproduce the behavior of blind builtins, where the macro is recognized -only with arguments. - -@example -define(`foo', `ifelse(`$#', `0', ``$0'', `arguments:$#')') -@result{} -foo -@result{}foo -foo() -@result{}arguments:1 -foo(`a', `b', `c') -@result{}arguments:3 -@end example - -For an example of a way to make defining blind macros easier, see -@ref{Composition}. - -@cindex multibranches -@cindex switch statement -@cindex case statement -The macro @code{ifelse} can take more than four arguments. If given more -than four arguments, @code{ifelse} works like a @code{case} or @code{switch} -statement in traditional programming languages. If @var{string-1} and -@var{string-2} are equal, @code{ifelse} expands into @var{equal-1}, otherwise -the procedure is repeated with the first three arguments discarded. 
This -calls for an example: - -@example -ifelse(`foo', `bar', `third', `gnu', `gnats') -@error{}m4:stdin:1: warning: ifelse: extra arguments ignored: 5 > 4 -@result{}gnu -ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth') -@result{} -ifelse(`foo', `bar', `third', `gnu', `gnats', `sixth', `seventh') -@result{}seventh -ifelse(`foo', `bar', `3', `gnu', `gnats', `6', `7', `8') -@error{}m4:stdin:4: warning: ifelse: extra arguments ignored: 8 > 7 -@result{}7 -@end example - -As of M4 1.6, @code{ifelse} transparently handles builtin tokens -generated by @code{defn} (@pxref{Defn}). Because of this, it is always -safe to compare two macro definitions, without worrying whether the -macro might be a builtin. - -@example -ifelse(defn(`defn'), `', `yes', `no') -@result{}no -ifelse(defn(`defn'), defn(`divnum'), `yes', `no') -@result{}no -ifelse(defn(`defn'), defn(`defn'), `yes', `no') -@result{}yes -define(`foo', ifelse(`', `', defn(`divnum'))) -@result{} -foo -@result{}0 -@end example - -Naturally, the normal case will be slightly more advanced than these -examples. A common use of @code{ifelse} is in macros implementing loops -of various kinds. - -@node Shift -@section Recursion in @code{m4} - -@cindex recursive macros -@cindex macros, recursive -There is no direct support for loops in @code{m4}, but macros can be -recursive. There is no limit on the number of recursion levels, other -than those enforced by your hardware and operating system. - -@cindex loops -Loops can be programmed using recursion and the conditionals described -previously. - -There is a builtin macro, @code{shift}, which can, among other things, -be used for iterating through the actual arguments to a macro: - -@deffn {Builtin (m4)} shift (@var{arg1}, @dots{}) -Takes any number of arguments, and expands to all its arguments except -@var{arg1}, separated by commas, with each argument quoted. - -The macro @code{shift} is recognized only with parameters. 
-@end deffn - -@example -shift -@result{}shift -shift(`bar') -@result{} -shift(`foo', `bar', `baz') -@result{}bar,baz -@end example - -An example of the use of @code{shift} is this macro: - -@cindex reversing arguments -@cindex arguments, reversing -@deffn Composite reverse (@dots{}) -Takes any number of arguments, and reverses their order. -@end deffn - -It is implemented as: - -@example -define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'', - `reverse(shift($@@)), `$1'')') -@result{} -reverse -@result{} -reverse(`foo') -@result{}foo -reverse(`foo', `bar', `gnats', `and gnus') -@result{}and gnus, gnats, bar, foo -@end example - -While not a very interesting macro, it does show how simple loops can be -made with @code{shift}, @code{ifelse} and recursion. It also shows -that @code{shift} is usually used with @samp{$@@}. Another example of -this is an implementation of a short-circuiting conditional operator. - -@cindex short-circuiting conditional -@cindex conditional, short-circuiting -@deffn Composite cond (@var{test-1}, @var{string-1}, @var{equal-1}, @ - @ovar{test-2}, @ovar{string-2}, @ovar{equal-2}, @dots{}, @ovar{not-equal}) -Similar to @code{ifelse}, where an equal comparison between the first -two strings results in the third, otherwise the first three arguments -are discarded and the process repeats. The difference is that each -@var{test-<n>} is expanded only when it is encountered. This means that -every third argument to @code{cond} is normally given one more level of -quoting than the corresponding argument to @code{ifelse}. -@end deffn - -Here is the implementation of @code{cond}, along with a demonstration of -how it can short-circuit the side effects in @code{side}. Notice how -all the unquoted side effects happen regardless of how many comparisons -are made with @code{ifelse}, compared with only the relevant effects -with @code{cond}. 
- -@example -define(`cond', -`ifelse(`$#', `1', `$1', - `ifelse($1, `$2', `$3', - `$0(shift(shift(shift($@@))))')')')dnl -define(`side', `define(`counter', incr(counter))$1')dnl -define(`example1', -`define(`counter', `0')dnl -ifelse(side(`$1'), `yes', `one comparison: ', - side(`$1'), `no', `two comparisons: ', - side(`$1'), `maybe', `three comparisons: ', - `side(`default answer: ')')counter')dnl -define(`example2', -`define(`counter', `0')dnl -cond(`side(`$1')', `yes', `one comparison: ', - `side(`$1')', `no', `two comparisons: ', - `side(`$1')', `maybe', `three comparisons: ', - `side(`default answer: ')')counter')dnl -example1(`yes') -@result{}one comparison: 3 -example1(`no') -@result{}two comparisons: 3 -example1(`maybe') -@result{}three comparisons: 3 -example1(`feeling rather indecisive today') -@result{}default answer: 4 -example2(`yes') -@result{}one comparison: 1 -example2(`no') -@result{}two comparisons: 2 -example2(`maybe') -@result{}three comparisons: 3 -example2(`feeling rather indecisive today') -@result{}default answer: 4 -@end example - -@cindex joining arguments -@cindex arguments, joining -@cindex concatenating arguments -Another common task that requires iteration is joining a list of -arguments into a single string. - -@deffn Composite join (@ovar{separator}, @ovar{args@dots{}}) -@deffnx Composite joinall (@ovar{separator}, @ovar{args@dots{}}) -Generate a single-quoted string, consisting of each @var{arg} separated -by @var{separator}. While @code{joinall} always outputs a -@var{separator} between arguments, @code{join} avoids the -@var{separator} for an empty @var{arg}. 
-@end deffn - -Here are some examples of its usage, based on the implementation -@file{m4-@value{VERSION}/@/doc/examples/@/join.m4} distributed in this -package: - -@comment examples -@example -$ @kbd{m4 -I examples} -include(`join.m4') -@result{} -join,join(`-'),join(`-', `'),join(`-', `', `') -@result{},,, -joinall,joinall(`-'),joinall(`-', `'),joinall(`-', `', `') -@result{},,,- -join(`-', `1') -@result{}1 -join(`-', `1', `2', `3') -@result{}1-2-3 -join(`', `1', `2', `3') -@result{}123 -join(`-', `', `1', `', `', `2', `') -@result{}1-2 -joinall(`-', `', `1', `', `', `2', `') -@result{}-1---2- -join(`,', `1', `2', `3') -@result{}1,2,3 -define(`nargs', `$#')dnl -nargs(join(`,', `1', `2', `3')) -@result{}1 -@end example - -Examining the implementation shows some interesting points about several -m4 programming idioms. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`join.m4')dnl -@result{}divert(`-1') -@result{}# join(sep, args) - join each non-empty ARG into a single -@result{}# string, with each element separated by SEP -@result{}define(`join', -@result{}`ifelse(`$#', `2', ``$2'', -@result{} `ifelse(`$2', `', `', ``$2'_')$0(`$1', shift(shift($@@)))')') -@result{}define(`_join', -@result{}`ifelse(`$#$2', `2', `', -@result{} `ifelse(`$2', `', `', ``$1$2'')$0(`$1', shift(shift($@@)))')') -@result{}# joinall(sep, args) - join each ARG, including empty ones, -@result{}# into a single string, with each element separated by SEP -@result{}define(`joinall', ``$2'_$0(`$1', shift($@@))') -@result{}define(`_joinall', -@result{}`ifelse(`$#', `2', `', ``$1$3'$0(`$1', shift(shift($@@)))')') -@result{}divert`'dnl -@end example - -First, notice that this implementation creates helper macros -@code{_join} and @code{_joinall}. 
This division of labor makes it -easier to output the correct number of @var{separator} instances: -@code{join} and @code{joinall} are responsible for the first argument, -without a separator, while @code{_join} and @code{_joinall} are -responsible for all remaining arguments, always outputting a separator -when outputting an argument. - -Next, observe how @code{join} decides to iterate to itself, because the -first @var{arg} was empty, or to output the argument and swap over to -@code{_join}. If the argument is non-empty, then the nested -@code{ifelse} results in an unquoted @samp{_}, which is concatenated -with the @samp{$0} to form the next macro name to invoke. The -@code{joinall} implementation is simpler since it does not have to -suppress empty @var{arg}; it always executes once then defers to -@code{_joinall}. - -Another important idiom is the idea that @var{separator} is reused for -each iteration. Each iteration has one less argument, but rather than -discarding @samp{$1} by iterating with @code{$0(shift($@@))}, the macro -discards @samp{$2} by using @code{$0(`$1', shift(shift($@@)))}. - -Next, notice that it is possible to compare more than one condition in a -single @code{ifelse} test. The test of @samp{$#$2} against @samp{2} -allows @code{_join} to iterate for two separate reasons---either there -are still more than two arguments, or there are exactly two arguments -but the last argument is not empty. - -Finally, notice that these macros require exactly two arguments to -terminate recursion, but that they still correctly result in empty -output when given no @var{args} (i.e., zero or one macro argument). On -the first pass when there are too few arguments, the @code{shift} -results in no output, but leaves an empty string to serve as the -required second argument for the second pass. Put another way, -@samp{`$1', shift($@@)} is not the same as @samp{$@@}, since only the -former guarantees at least two arguments. 
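
The last point can be checked directly. The following is only a sketch:
the helper macros @code{count}, @code{direct} and @code{padded} are
hypothetical names chosen for illustration and are not part of
@file{join.m4}. It compares how many arguments a callee sees when it is
handed @samp{$@@} and when it is handed @samp{`$1', shift($@@)}.

@example
define(`count', `$#')
@result{}
define(`direct', `count($@@)')
@result{}
define(`padded', `count(`$1', shift($@@))')
@result{}
direct(`a')
@result{}1
padded(`a')
@result{}2
@end example

With a single argument, @samp{$@@} passes one argument along, while
@samp{`$1', shift($@@)} passes two, the second being the empty string
left behind by @code{shift}.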
- -@cindex quote manipulation -@cindex manipulating quotes -Sometimes, a recursive algorithm requires adding quotes to each element, -or treating multiple arguments as a single element: - -@deffn Composite quote (@dots{}) -@deffnx Composite dquote (@dots{}) -@deffnx Composite dquote_elt (@dots{}) -Takes any number of arguments, and adds quoting. With @code{quote}, -only one level of quoting is added, effectively removing whitespace -after commas and turning multiple arguments into a single string. With -@code{dquote}, two levels of quoting are added, one around each element, -and one around the list. And with @code{dquote_elt}, two levels of -quoting are added around each element. -@end deffn - -An actual implementation of these three macros is distributed as -@file{m4-@value{VERSION}/@/doc/examples/@/quote.m4} in this package. -First, let's examine their usage: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`quote.m4') -@result{} --quote-dquote-dquote_elt- -@result{}---- --quote()-dquote()-dquote_elt()- -@result{}--`'-`'- --quote(`1')-dquote(`1')-dquote_elt(`1')- -@result{}-1-`1'-`1'- --quote(`1', `2')-dquote(`1', `2')-dquote_elt(`1', `2')- -@result{}-1,2-`1',`2'-`1',`2'- -define(`n', `$#')dnl --n(quote(`1', `2'))-n(dquote(`1', `2'))-n(dquote_elt(`1', `2'))- -@result{}-1-1-2- -dquote(dquote_elt(`1', `2')) -@result{}``1'',``2'' -dquote_elt(dquote(`1', `2')) -@result{}``1',`2'' -@end example - -The last two lines show that when given two arguments, @code{dquote} -results in one string, while @code{dquote_elt} results in two. Now, -examine the implementation. Note that @code{quote} and -@code{dquote_elt} make decisions based on their number of arguments, so -that when called without arguments, they result in nothing instead of a -quoted empty string; this is so that it is possible to distinguish -between no arguments and an empty first argument. 
@code{dquote}, on the -other hand, results in a string no matter what, since it is still -possible to tell whether it was invoked without arguments based on the -resulting string. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`quote.m4')dnl -@result{}divert(`-1') -@result{}# quote(args) - convert args to single-quoted string -@result{}define(`quote', `ifelse(`$#', `0', `', ``$*'')') -@result{}# dquote(args) - convert args to quoted list of quoted strings -@result{}define(`dquote', ``$@@'') -@result{}# dquote_elt(args) - convert args to list of double-quoted strings -@result{}define(`dquote_elt', `ifelse(`$#', `0', `', `$#', `1', ```$1''', -@result{} ```$1'',$0(shift($@@))')') -@result{}divert`'dnl -@end example - -It is worth pointing out that @samp{quote(@var{args})} is more efficient -than @samp{joinall(`,', @var{args})} for producing the same output. - -@cindex nine arguments, more than -@cindex more than nine arguments -@cindex arguments, more than nine -One more useful macro based on @code{shift} allows portably selecting -an arbitrary argument (usually greater than the ninth argument), without -relying on the GNU extension of multi-digit arguments -(@pxref{Arguments}). - -@deffn Composite argn (@var{n}, @dots{}) -Expands to argument @var{n} out of the remaining arguments. @var{n} -must be a positive number. Usually invoked as -@samp{argn(`@var{n}',$@@)}. -@end deffn - -It is implemented as: - -@example -define(`argn', `ifelse(`$1', 1, ``$2'', - `argn(decr(`$1'), shift(shift($@@)))')') -@result{} -argn(`1', `a') -@result{}a -define(`foo', `argn(`11', $@@)') -@result{} -foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l') -@result{}k -@end example - -@node Forloop -@section Iteration by counting - -@cindex for loops -@cindex loops, counting -@cindex counting loops -Here is an example of a loop macro that implements a simple for loop. 
- -@deffn Composite forloop (@var{iterator}, @var{start}, @var{end}, @var{text}) -Takes the name in @var{iterator}, which must be a valid macro name, and -successively assigns it each integer value from @var{start} to @var{end}, -inclusive. For each assignment to @var{iterator}, appends @var{text} to -the expansion of the @code{forloop}. @var{text} may refer to -@var{iterator}. Any definition of @var{iterator} prior to this -invocation is restored. -@end deffn - -It can, for example, be used for simple counting: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`forloop.m4') -@result{} -forloop(`i', `1', `8', `i ') -@result{}1 2 3 4 5 6 7 8@w{ } -@end example - -For-loops can be nested: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`forloop.m4') -@result{} -forloop(`i', `1', `4', `forloop(`j', `1', `8', ` (i, j)') -') -@result{} (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6) (1, 7) (1, 8) -@result{} (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6) (2, 7) (2, 8) -@result{} (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6) (3, 7) (3, 8) -@result{} (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6) (4, 7) (4, 8) -@result{} -@end example - -The implementation of the @code{forloop} macro is fairly -straightforward. The @code{forloop} macro itself is simply a wrapper, -which saves the previous definition of the first argument, calls the -internal macro @code{@w{_forloop}}, and re-establishes the saved -definition of the first argument. - -The macro @code{@w{_forloop}} expands the fourth argument once, and -tests to see if the iterator has reached the final value. If it has -not finished, it increments the iterator (using the predefined macro -@code{incr}, @pxref{Incr}), and recurses. 
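
The save-and-restore step performed by the wrapper can be tried by hand.
The following is only a sketch, not the distributed implementation. It
assumes a macro @code{i} that already has a definition (the value
@samp{outer} is made up for illustration), and it relies on the GNU
behavior that @code{define} replaces only the topmost definition on the
stack.

@example
define(`i', `outer')
@result{}
pushdef(`i', `1')i`'define(`i', incr(i)) i`'popdef(`i') i
@result{}1 2 outer
@end example

The temporary definition shadows the outer one while counting, and
@code{popdef} reveals the outer definition again once the counting is
done.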
- -Here is an actual implementation of @code{forloop}, distributed as -@file{m4-@value{VERSION}/@/doc/examples/@/forloop.m4} in this package: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`forloop.m4')dnl -@result{}divert(`-1') -@result{}# forloop(var, from, to, stmt) - simple version -@result{}define(`forloop', `pushdef(`$1', `$2')_forloop($@@)popdef(`$1')') -@result{}define(`_forloop', -@result{} `$4`'ifelse($1, `$3', `', `define(`$1', incr($1))$0($@@)')') -@result{}divert`'dnl -@end example - -Notice the careful use of quotes. Certain macro arguments are left -unquoted, each for its own reason. Try to find out @emph{why} these -arguments are left unquoted, and see what happens if they are quoted. -(As presented, these two macros are useful but not very robust for -general use. They lack even basic error handling for cases like -@var{start} greater than @var{end}, @var{end} not numeric, or -@var{iterator} not being a macro name. See if you can improve these -macros; or @pxref{Improved forloop, , Answers}). - -@node Foreach -@section Iteration by list contents - -@cindex for each loops -@cindex loops, list iteration -@cindex iterating over lists -Here is an example of a loop macro that implements list iteration. - -@deffn Composite foreach (@var{iterator}, @var{paren-list}, @var{text}) -@deffnx Composite foreachq (@var{iterator}, @var{quote-list}, @var{text}) -Takes the name in @var{iterator}, which must be a valid macro name, and -successively assigns it each value from @var{paren-list} or -@var{quote-list}. In @code{foreach}, @var{paren-list} is a -comma-separated list of elements contained in parentheses. In -@code{foreachq}, @var{quote-list} is a comma-separated list of elements -contained in a quoted string. For each assignment to @var{iterator}, -appends @var{text} to the overall expansion. @var{text} may refer to -@var{iterator}. Any definition of @var{iterator} prior to this -invocation is restored. 
-@end deffn - -As an example, this displays each word in a list inside of a sentence, -using an implementation of @code{foreach} distributed as -@file{m4-@value{VERSION}/@/doc/examples/@/foreach.m4}, and @code{foreachq} -in @file{m4-@value{VERSION}/@/doc/examples/@/foreachq.m4}. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`foreach.m4') -@result{} -foreach(`x', (foo, bar, foobar), `Word was: x -')dnl -@result{}Word was: foo -@result{}Word was: bar -@result{}Word was: foobar -include(`foreachq.m4') -@result{} -foreachq(`x', `foo, bar, foobar', `Word was: x -')dnl -@result{}Word was: foo -@result{}Word was: bar -@result{}Word was: foobar -@end example - -It is possible to be more complex; each element of the @var{paren-list} -or @var{quote-list} can itself be a list, to pass as further arguments -to a helper macro. This example generates a shell case statement: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`foreach.m4') -@result{} -define(`_case', ` $1) - $2=" $1";; -')dnl -define(`_cat', `$1$2')dnl -case $`'1 in -@result{}case $1 in -foreach(`x', `(`(`a', `vara')', `(`b', `varb')', `(`c', `varc')')', - `_cat(`_case', x)')dnl -@result{} a) -@result{} vara=" a";; -@result{} b) -@result{} varb=" b";; -@result{} c) -@result{} varc=" c";; -esac -@result{}esac -@end example - -The implementation of the @code{foreach} macro is a bit more involved; -it is a wrapper around two helper macros. First, @code{@w{_arg1}} is -needed to grab the first element of a list. Second, -@code{@w{_foreach}} implements the recursion, successively walking -through the original list. 
Here is a simple implementation of -@code{foreach}: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`foreach.m4')dnl -@result{}divert(`-1') -@result{}# foreach(x, (item_1, item_2, ..., item_n), stmt) -@result{}# parenthesized list, simple version -@result{}define(`foreach', `pushdef(`$1')_foreach($@@)popdef(`$1')') -@result{}define(`_arg1', `$1') -@result{}define(`_foreach', `ifelse(`$2', `()', `', -@result{} `define(`$1', _arg1$2)$3`'$0(`$1', (shift$2), `$3')')') -@result{}divert`'dnl -@end example - -Unfortunately, that implementation is not robust to macro names as list -elements. Each iteration of @code{@w{_foreach}} is stripping another -layer of quotes, leading to erratic results if list elements are not -already fully expanded. The first cut at implementing @code{foreachq} -takes this into account. Also, when using quoted elements in a -@var{paren-list}, the overall list must be quoted. A @var{quote-list} -has the nice property of requiring fewer characters to create a list -containing the same quoted elements. 
To see the difference between the -two macros, we attempt to pass double-quoted macro names in a list, -expecting the macro name on output after one layer of quotes is removed -during list iteration and the final layer removed during the final -rescan: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -define(`a', `1')define(`b', `2')define(`c', `3') -@result{} -include(`foreach.m4') -@result{} -include(`foreachq.m4') -@result{} -foreach(`x', `(``a'', ``(b'', ``c)'')', `x -') -@result{}1 -@result{}(2)1 -@result{} -@result{}, x -@result{}) -foreachq(`x', ```a'', ``(b'', ``c)''', `x -')dnl -@result{}a -@result{}(b -@result{}c) -@end example - -Obviously, @code{foreachq} did a better job; here is its implementation: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`foreachq.m4')dnl -@result{}include(`quote.m4')dnl -@result{}divert(`-1') -@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt) -@result{}# quoted list, simple version -@result{}define(`foreachq', `pushdef(`$1')_foreachq($@@)popdef(`$1')') -@result{}define(`_arg1', `$1') -@result{}define(`_foreachq', `ifelse(quote($2), `', `', -@result{} `define(`$1', `_arg1($2)')$3`'$0(`$1', `shift($2)', `$3')')') -@result{}divert`'dnl -@end example - -Notice that @code{@w{_foreachq}} had to use the helper macro -@code{quote} defined earlier (@pxref{Shift}), to ensure that the -embedded @code{ifelse} call does not go haywire if a list element -contains a comma. Unfortunately, this implementation of @code{foreachq} -has its own severe flaw. Whereas the @code{foreach} implementation was -linear, this macro is quadratic in the number of list elements, and is -much more likely to trip up the limit set by the command line option -@option{--nesting-limit} (or @option{-L}, @pxref{Limits control, , -Invoking m4}). Additionally, this implementation does not expand -@samp{defn(`@var{iterator}')} very well, when compared with -@code{foreach}. 
- -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`foreach.m4')include(`foreachq.m4') -@result{} -foreach(`name', `(`a', `b')', ` defn(`name')') -@result{} a b -foreachq(`name', ``a', `b'', ` defn(`name')') -@result{} _arg1(`a', `b') _arg1(shift(`a', `b')) -@end example - -It is possible to have robust iteration with linear behavior and sane -@var{iterator} contents for either list style. See if you can learn -from the best elements of both of these implementations to create robust -macros (or @pxref{Improved foreach, , Answers}). - -@node Stacks -@section Working with definition stacks - -@cindex definition stack -@cindex pushdef stack -@cindex stack, macro definition -Thanks to @code{pushdef}, manipulation of a stack is an intrinsic -operation in @code{m4}. Normally, only the topmost definition in a -stack is important, but sometimes, it is desirable to manipulate the -entire definition stack. - -@deffn Composite stack_foreach (@var{macro}, @var{action}) -@deffnx Composite stack_foreach_lifo (@var{macro}, @var{action}) -For each of the @code{pushdef} definitions associated with @var{macro}, -invoke the macro @var{action} with a single argument of that definition. -@code{stack_foreach} visits the oldest definition first, while -@code{stack_foreach_lifo} visits the current definition first. -@var{action} should not modify or dereference @var{macro}. There are a -few special macros, such as @code{defn}, which cannot be used as the -@var{macro} parameter. -@end deffn - -A sample implementation of these macros is distributed in the file -@file{m4-@value{VERSION}/@/doc/examples/@/stack.m4}. 
- -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`stack.m4') -@result{} -pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3') -@result{} -define(`show', ``$1' -') -@result{} -stack_foreach(`a', `show')dnl -@result{}1 -@result{}2 -@result{}3 -stack_foreach_lifo(`a', `show')dnl -@result{}3 -@result{}2 -@result{}1 -@end example - -Now for the implementation. Note the definition of a helper macro, -@code{_stack_reverse}, which destructively swaps the contents of one -stack of definitions into the reverse order in the temporary macro -@samp{tmp-$1}. By calling the helper twice, the original order is -restored back into the macro @samp{$1}; since the operation is -destructive, this explains why @samp{$1} must not be modified or -dereferenced during the traversal. The caller can then inject -additional code to pass the definition currently being visited to -@samp{$2}. The choice of helper names is intentional; since @samp{-} is -not valid as part of a macro name, there is no risk of conflict with a -valid macro name, and the code is guaranteed to use @code{defn} where -necessary. Finally, note that any macro used in the traversal of a -@code{pushdef} stack, such as @code{pushdef} or @code{defn}, cannot be -handled by @code{stack_foreach}, since the macro would temporarily be -undefined during the algorithm. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`stack.m4')dnl -@result{}divert(`-1') -@result{}# stack_foreach(macro, action) -@result{}# Invoke ACTION with a single argument of each definition -@result{}# from the definition stack of MACRO, starting with the oldest. -@result{}define(`stack_foreach', -@result{}`_stack_reverse(`$1', `tmp-$1')'dnl -@result{}`_stack_reverse(`tmp-$1', `$1', `$2(defn(`$1'))')') -@result{}# stack_foreach_lifo(macro, action) -@result{}# Invoke ACTION with a single argument of each definition -@result{}# from the definition stack of MACRO, starting with the newest. 
-@result{}define(`stack_foreach_lifo', -@result{}`_stack_reverse(`$1', `tmp-$1', `$2(defn(`$1'))')'dnl -@result{}`_stack_reverse(`tmp-$1', `$1')') -@result{}define(`_stack_reverse', -@result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0($@@)')') -@result{}divert`'dnl -@end example - -@node Composition -@section Building macros with macros - -@cindex macro composition -@cindex composing macros -Since m4 is a macro language, it is possible to write macros that -can build other macros. First on the list is a way to automate the -creation of blind macros. - -@cindex macro, blind -@cindex blind macro -@deffn Composite define_blind (@var{name}, @ovar{value}) -Defines @var{name} as a blind macro, such that @var{name} will expand to -@var{value} only when given explicit arguments. @var{value} should not -be the result of @code{defn} (@pxref{Defn}). This macro is only -recognized with parameters, and results in an empty string. -@end deffn - -Defining a macro to define another macro can be a bit tricky. We want -to use a literal @samp{$#} in the argument to the nested @code{define}. -However, if @samp{$} and @samp{#} are adjacent in the definition of -@code{define_blind}, then it would be expanded as the number of -arguments to @code{define_blind} rather than the intended number of -arguments to @var{name}. The solution is to pass the difficult -characters through extra arguments to a helper macro -@code{_define_blind}. When composing macros, it is a common idiom to -need a helper macro to concatenate text that forms parameters in the -composed macro, rather than interpreting the text as a parameter of the -composing macro. - -As for the limitation against using @code{defn}, there are two reasons. -If a macro was previously defined with @code{define_blind}, then it can -safely be renamed to a new blind macro using plain @code{define}; using -@code{define_blind} to rename it just adds another layer of -@code{ifelse}, occupying memory and slowing down execution. 
And if a -macro is a builtin, then it would result in an attempt to define a macro -consisting of both text and a builtin token; this is not supported, and -the builtin token is flattened to an empty string. - -With that explanation, here's the definition, and some sample usage. -Notice that @code{define_blind} is itself a blind macro. - -@example -$ @kbd{m4 -d} -define(`define_blind', `ifelse(`$#', `0', ``$0'', -`_$0(`$1', `$2', `$'`#', `$'`0')')') -@result{} -define(`_define_blind', `define(`$1', -`ifelse(`$3', `0', ``$4'', `$2')')') -@result{} -define_blind -@result{}define_blind -define_blind(`foo', `arguments were $*') -@result{} -foo -@result{}foo -foo(`bar') -@result{}arguments were bar -define(`blah', defn(`foo')) -@result{} -blah -@result{}blah -blah(`a', `b') -@result{}arguments were a,b -defn(`blah') -@result{}ifelse(`$#', `0', ``$0'', `arguments were $*') -@end example - -@cindex currying arguments -@cindex argument currying -Another interesting composition tactic is argument @dfn{currying}, or -factoring a macro that takes multiple arguments for use in a context -that provides exactly one argument. - -@deffn Composite curry (@var{macro}, @dots{}) -Expand to a macro call that takes exactly one argument, then appends -that argument to the original arguments and invokes @var{macro} with the -resulting list of arguments. -@end deffn - -A demonstration of currying makes the intent of this macro a little more -obvious. The macro @code{stack_foreach} mentioned earlier is an example -of a context that provides exactly one argument to a macro name. But -coupled with currying, we can invoke @code{reverse} with two arguments -for each definition of a macro stack. This example uses the file -@file{m4-@value{VERSION}/@/doc/examples/@/curry.m4} included in the -distribution. 
- -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`curry.m4')include(`stack.m4') -@result{} -define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'', - `reverse(shift($@@)), `$1'')') -@result{} -pushdef(`a', `1')pushdef(`a', `2')pushdef(`a', `3') -@result{} -stack_foreach(`a', `:curry(`reverse', `4')') -@result{}:1, 4:2, 4:3, 4 -curry(`curry', `reverse', `1')(`2')(`3') -@result{}3, 2, 1 -@end example - -Now for the implementation. Notice how @code{curry} leaves off with a -macro name but no open parenthesis, while still in the middle of -collecting arguments for @samp{$1}. The macro @code{_curry} is the -helper macro that takes one argument, then adds it to the list and -finally supplies the closing parenthesis. The use of a comma inside the -@code{shift} call allows currying to also work for a macro that takes -one argument, although it often makes more sense to invoke that macro -directly rather than going through @code{curry}. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`curry.m4')dnl -@result{}divert(`-1') -@result{}# curry(macro, args) -@result{}# Expand to a macro call that takes one argument, then invoke -@result{}# macro(args, extra). -@result{}define(`curry', `$1(shift($@@,)_$0') -@result{}define(`_curry', ``$1')') -@result{}divert`'dnl -@end example - -Unfortunately, with M4 1.4.x, @code{curry} is unable to handle builtin -tokens, which are silently flattened to the empty string when passed -through another text macro. The following example demonstrates a usage -of @code{curry} that works in M4 1.6, but is not portable to earlier -versions: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`curry.m4') -@result{} -curry(`define', `mylen')(defn(`len')) -@result{} -mylen(`abc') -@result{}3 -@end example - -@cindex renaming macros -@cindex copying macros -@cindex macros, copying -Putting the last few concepts together, it is possible to copy or rename -an entire stack of macro definitions. 
- -@deffn Composite copy (@var{source}, @var{dest}) -@deffnx Composite rename (@var{source}, @var{dest}) -Ensure that @var{dest} is undefined, then define it to the same stack of -definitions currently in @var{source}. @code{copy} leaves @var{source} -unchanged, while @code{rename} undefines @var{source}. There are only a -few macros, such as @code{copy} or @code{defn}, which cannot be copied -via this macro. -@end deffn - -The implementation is relatively straightforward (although since it uses -@code{curry}, it is unable to copy builtin macros when used with M4 -1.4.x. See if you can design a portable version that works across all -M4 versions, or @pxref{Improved copy, , Answers}). - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`curry.m4')include(`stack.m4') -@result{} -define(`rename', `copy($@@)undefine(`$1')')dnl -define(`copy', `ifdef(`$2', `errprint(`$2 already defined -')m4exit(`1')', - `stack_foreach(`$1', `curry(`pushdef', `$2')')')')dnl -pushdef(`a', `1')pushdef(`a', defn(`divnum'))pushdef(`a', `2') -@result{} -copy(`a', `b') -@result{} -rename(`b', `c') -@result{} -a b c -@result{}2 b 2 -popdef(`a', `c')a c -@result{}0 0 -popdef(`a', `c')a c -@result{}1 1 -@end example - -@node Debugging -@chapter How to debug macros and input - -@cindex debugging macros -@cindex macros, debugging -When writing macros for @code{m4}, they often do not work as intended on -the first try (as is the case with most programming languages). -Fortunately, there is support for macro debugging in @code{m4}. 
- -@menu -* Dumpdef:: Displaying macro definitions -* Trace:: Tracing macro calls -* Debugmode:: Controlling debugging options -* Debuglen:: Limiting debug output -* Debugfile:: Saving debugging output -@end menu - -@node Dumpdef -@section Displaying macro definitions - -@cindex displaying macro definitions -@cindex macros, displaying definitions -@cindex definitions, displaying macro -@cindex standard error, output to -If you want to see what a name expands into, you can use the builtin -@code{dumpdef}: - -@deffn {Builtin (m4)} dumpdef (@ovar{name@dots{}}) -Accepts any number of arguments. If called without any arguments, it -displays the definitions of all known names; otherwise, it displays the -definitions of each @var{name} given, sorted by name. If a @var{name} -is undefined, the @samp{d} debug level controls whether a warning is -issued (@pxref{Debugmode}). Likewise, the @samp{o} debug level controls -whether the output is issued to standard error or the current debug -file (@pxref{Debugfile}). - -The expansion of @code{dumpdef} is void. -@end deffn - -@example -$ @kbd{m4 -d} -define(`foo', `Hello world.') -@result{} -dumpdef(`foo') -@error{}foo:@tabchar{}`Hello world.' -@result{} -dumpdef(`define') -@error{}define:@tabchar{}<define> -@result{} -@end example - -The last example shows how builtin macro definitions are displayed. -The definition that is dumped corresponds to what would occur if the -macro were to be called at that point, even if other definitions are -still live due to redefining a macro during argument collection. 
- -@example -$ @kbd{m4 -d} -pushdef(`f', ``$0'1')pushdef(`f', ``$0'2') -@result{} -f(popdef(`f')dumpdef(`f')) -@error{}f:@tabchar{}``$0'1' -@result{}f2 -f(popdef(`f')dumpdef(`f')) -@error{}m4:stdin:3: warning: dumpdef: undefined macro 'f' -@result{}f1 -debugmode(`-d') -@result{} -dumpdef(`f') -@result{} -@end example - -@xref{Debugmode}, for information on how the @samp{m}, @samp{q}, and -@samp{s} flags affect the details of the display. Remember, the -@samp{q} flag is implied when the @option{--debug} option (@option{-d}, -@pxref{Debugging options, , Invoking m4}) is used in the command line -without arguments. Also, @option{--debuglen} (@pxref{Debuglen}) can affect -output, by truncating longer strings (but not builtin and module names). - -@comment options: -ds -l3 -@example -$ @kbd{m4 -ds -l 3} -pushdef(`foo', `1 long string') -@result{} -pushdef(`foo', defn(`divnum')) -@result{} -pushdef(`foo', `3') -@result{} -debugmode(`+m') -@result{} -dumpdef(`foo', `dnl', `indir', `__gnu__') -@error{}__gnu__:@tabchar{}@{gnu@} -@error{}dnl:@tabchar{}<dnl>@{m4@} -@error{}foo:@tabchar{}3, <divnum>@{m4@}, 1 l... -@error{}indir:@tabchar{}<indir>@{gnu@} -@result{} -debugmode(`-ms')debugmode(`+q') -@result{} -dumpdef(`foo') -@error{}foo:@tabchar{}`3' -@result{} -@end example - -@node Trace -@section Tracing macro calls - -@cindex tracing macro expansion -@cindex macro expansion, tracing -@cindex expansion, tracing macro -@cindex standard error, output to -It is possible to trace macro calls and expansions through the builtins -@code{traceon} and @code{traceoff}: - -@deffn {Builtin (m4)} traceon (@ovar{names@dots{}}) -@deffnx {Builtin (m4)} traceoff (@ovar{names@dots{}}) -When called without any arguments, @code{traceon} and @code{traceoff} -will turn tracing on and off, respectively, for all macros, identical to -using the @samp{t} flag of @code{debugmode} (@pxref{Debugmode}). 
- -When called with arguments, only the macros listed in @var{names} are -affected, whether or not they are currently defined. A macro's -expansion will be traced if global tracing is on, or if the individual -macro tracing flag is set; to avoid tracing a macro, both the global -flag and the macro must have tracing off. - -The expansion of @code{traceon} and @code{traceoff} is void. -@end deffn - -Whenever a traced macro is called and the arguments have been collected, -the call is displayed. If the expansion of the macro call is not void, -the expansion can be displayed after the call. The output is printed -to the current debug file (defaulting to standard error, -@pxref{Debugfile}). - -@example -$ @kbd{m4 -d} -define(`foo', `Hello World.') -@result{} -define(`echo', `$@@') -@result{} -traceon(`foo', `echo') -@result{} -foo -@error{}m4trace: -1- foo -> `Hello World.' -@result{}Hello World. -echo(`gnus', `and gnats') -@error{}m4trace: -1- echo(`gnus', `and gnats') -> ``gnus',`and gnats'' -@result{}gnus,and gnats -@end example - -The number between dashes is the depth of the expansion. It is one most -of the time, signifying an expansion at the outermost level, but it -increases when macro arguments contain unquoted macro calls. The -maximum number that will appear between dashes is controlled by the -option @option{--nesting-limit} (or @option{-L}, @pxref{Limits control, -, Invoking m4}). Additionally, the option @option{--trace} (or -@option{-t}) can be used to invoke @code{traceon(@var{name})} before -parsing input. 
- -@comment options: -d-V -L3 -tifelse -@comment status: 1 -@example -$ @kbd{m4 -L 3 -t ifelse} -ifelse(`one level') -@error{}m4trace: -1- ifelse -@result{} -ifelse(ifelse(ifelse(`three levels'))) -@error{}m4trace: -3- ifelse -@error{}m4trace: -2- ifelse -@error{}m4trace: -1- ifelse -@result{} -ifelse(ifelse(ifelse(ifelse(`four levels')))) -@error{}m4:stdin:3: recursion limit of 3 exceeded, use -L<N> to change it -@end example - -Tracing by name is an attribute that is preserved whether the macro is -defined or not. This allows the selection of macros to trace before -those macros are defined. - -@example -$ @kbd{m4 -d} -traceoff(`foo') -@result{} -traceon(`foo') -@result{} -foo -@result{}foo -defn(`foo') -@error{}m4:stdin:4: warning: defn: undefined macro 'foo' -@result{} -undefine(`foo') -@error{}m4:stdin:5: warning: undefine: undefined macro 'foo' -@result{} -pushdef(`foo') -@result{} -popdef(`foo') -@result{} -popdef(`foo') -@error{}m4:stdin:8: warning: popdef: undefined macro 'foo' -@result{} -define(`foo', `bar') -@result{} -foo -@error{}m4trace: -1- foo -> `bar' -@result{}bar -undefine(`foo') -@result{} -ifdef(`foo', `yes', `no') -@result{}no -indir(`foo') -@error{}m4:stdin:13: warning: indir: undefined macro 'foo' -@result{} -define(`foo', `blah') -@result{} -foo -@error{}m4trace: -1- foo -> `blah' -@result{}blah -@end example - -Tracing even works on builtins. However, @code{defn} (@pxref{Defn}) -does not transfer tracing status. 
- -@example -$ @kbd{m4 -d} -traceon(`traceon') -@result{} -traceon(`traceoff') -@error{}m4trace: -1- traceon(`traceoff') -> `' -@result{} -traceoff(`traceoff') -@error{}m4trace: -1- traceoff(`traceoff') -> `' -@result{} -traceoff(`traceon') -@result{} -traceon(`eval', `m4_divnum') -@result{} -define(`m4_eval', defn(`eval')) -@result{} -define(`m4_divnum', defn(`divnum')) -@result{} -eval(divnum) -@error{}m4trace: -1- eval(`0') -> `0' -@result{}0 -m4_eval(m4_divnum) -@error{}m4trace: -2- m4_divnum -> `0' -@result{}0 -@end example - -As of GNU M4 2.0, named macro tracing is independent of global -tracing status; calling @code{traceoff} without arguments turns off the -global trace flag, but does not turn off tracing for macros where -tracing was requested by name. Likewise, calling @code{traceon} without -arguments will affect tracing of macros that are not defined yet. This -behavior matches traditional implementations of @code{m4}. - -@example -$ @kbd{m4 -d} -traceon -@result{} -define(`foo', `bar') -@error{}m4trace: -1- define(`foo', `bar') -> `' -@result{} -foo # traced, even though foo was not defined at traceon -@error{}m4trace: -1- foo -> `bar' -@result{}bar # traced, even though foo was not defined at traceon -traceoff(`foo') -@error{}m4trace: -1- traceoff(`foo') -> `' -@result{} -foo # traced, since global tracing is still on -@error{}m4trace: -1- foo -> `bar' -@result{}bar # traced, since global tracing is still on -traceon(`foo') -@error{}m4trace: -1- traceon(`foo') -> `' -@result{} -traceoff -@error{}m4trace: -1- traceoff -> `' -@result{} -foo # traced, since foo is now traced by name -@error{}m4trace: -1- foo -> `bar' -@result{}bar # traced, since foo is now traced by name -traceoff(`foo') -@result{} -foo # untraced -@result{}bar # untraced -@end example - -However, GNU M4 prior to 2.0 had slightly different -semantics, where @code{traceon} without arguments only affected symbols -that were defined at that moment, and @code{traceoff} without arguments 
-stopped all tracing, even when tracing was requested by macro name. The -addition of the macro @code{m4symbols} (@pxref{M4symbols}) in 2.0 makes it -possible to write a file that approximates the older semantics -regardless of which version of GNU M4 is in use. - -@comment options: -d-V -@example -$ @kbd{m4} -ifdef(`m4symbols', - `define(`traceon', `ifelse(`$#', `0', `builtin(`traceon', m4symbols)', - `builtin(`traceon', $@@)')')dnl -define(`traceoff', `ifelse(`$#', `0', - `builtin(`traceoff')builtin(`traceoff', m4symbols)', - `builtin(`traceoff', $@@)')')')dnl -define(`a', `1') -@result{} -traceon # called before b is defined, so b is not traced -@result{} # called before b is defined, so b is not traced -define(`b', `2') -@error{}m4trace: -1- define -@result{} -a b -@error{}m4trace: -1- a -@result{}1 2 -traceon(`b') -@error{}m4trace: -1- traceon -@error{}m4trace: -1- ifelse -@error{}m4trace: -1- builtin -@result{} -a b -@error{}m4trace: -1- a -@error{}m4trace: -1- b -@result{}1 2 -traceoff # stops tracing b, even though it was traced by name -@error{}m4trace: -1- traceoff -@error{}m4trace: -1- ifelse -@error{}m4trace: -1- builtin -@error{}m4trace: -2- m4symbols -@error{}m4trace: -1- builtin -@result{} # stops tracing b, even though it was traced by name -a b -@result{}1 2 -@end example - -@xref{Debugmode}, for information on controlling the details of the -display. The format of the trace output is not specified by -POSIX, and varies between implementations of @code{m4}. - -Starting with M4 1.6, tracing also works via @code{indir} -(@pxref{Indir}). However, since tracing is an attribute tracked by -macro names, and @code{builtin} bypasses macro names (@pxref{Builtin}), -it is not possible for @code{builtin} to trace which subsidiary builtin -it invokes. If you are worried about tracking all invocations of a -given builtin, you should also trace @code{builtin}, or enable global -tracing (the @samp{t} debug level, @pxref{Debugmode}). 
- -@example -$ @kbd{m4 -d} -define(`my_defn', defn(`defn'))undefine(`defn') -@result{} -define(`foo', `bar')traceon(`foo', `defn', `my_defn') -@result{} -foo -@error{}m4trace: -1- foo -> `bar' -@result{}bar -indir(`foo') -@error{}m4trace: -1- foo -> `bar' -@result{}bar -my_defn(`foo') -@error{}m4trace: -1- my_defn(`foo') -> ``bar'' -@result{}bar -indir(`my_defn', `foo') -@error{}m4trace: -1- my_defn(`foo') -> ``bar'' -@result{}bar -builtin(`defn', `foo') -@result{}bar -debugmode(`+cxt') -@result{} -builtin(`defn', builtin(`shift', `', `foo')) -@error{}m4trace: -1- id 12: builtin ... = <builtin> -@error{}m4trace: -2- id 13: builtin ... = <builtin> -@error{}m4trace: -2- id 13: builtin(`shift', `', `foo') -> ``foo'' -@error{}m4trace: -1- id 12: builtin(`defn', `foo') -> ``bar'' -@result{}bar -indir(`my_defn', indir(`shift', `', `foo')) -@error{}m4trace: -1- id 14: indir ... = <indir> -@error{}m4trace: -2- id 15: indir ... = <indir> -@error{}m4trace: -2- id 15: shift ... = <shift> -@error{}m4trace: -2- id 15: shift(`', `foo') -> ``foo'' -@error{}m4trace: -2- id 15: indir(`shift', `', `foo') -> ``foo'' -@error{}m4trace: -1- id 14: my_defn ... = <defn> -@error{}m4trace: -1- id 14: my_defn(`foo') -> ``bar'' -@error{}m4trace: -1- id 14: indir(`my_defn', `foo') -> ``bar'' -@result{}bar -@end example - -@node Debugmode -@section Controlling debugging options - -@cindex controlling debugging output -@cindex debugging output, controlling -The @option{--debug} option to @code{m4} (also spelled -@option{--debugmode} or @option{-d}, @pxref{Debugging options, , -Invoking m4}) controls the amount of detail presented in three -categories of output. Trace output is requested by @code{traceon} -(@pxref{Trace}), and each line is prefixed by @samp{m4trace:} in -relation to a macro invocation. Debug output tracks useful events not -associated with a macro invocation, and each line is prefixed by -@samp{m4debug:}. 
Finally, @code{dumpdef} (@pxref{Dumpdef}) output is -affected, with no prefix added to the output lines. - -The @var{flags} following the option can be one or more of the -following: - -@table @code -@item a -In trace output, show the actual arguments that were collected before -invoking the macro. Arguments are subject to length truncation -specified by @code{debuglen} (@pxref{Debuglen}). - -@item c -In trace output, show an additional line for each macro call, when the -macro is seen, but before the arguments are collected, and show the -definition of the macro that will be used for the expansion. By -default, only one line is printed, after all arguments are collected and -the expansion determined. The definition is subject to length -truncation specified by @code{debuglen} (@pxref{Debuglen}). This is -often used with the @samp{x} flag. - -@item d -Output a warning on any attempt to dereference an undefined macro via -@code{builtin}, @code{defn}, @code{dumpdef}, @code{indir}, -@code{popdef}, or @code{undefine}. Note that @code{ifdef}, -@code{m4symbols}, -@code{traceon}, and @code{traceoff} do not dereference undefined macros. -Like any other warning, the warnings enabled by this flag go to standard -error regardless of the current @code{debugfile} setting, and will -change exit status if the command line option @option{--fatal-warnings} -was specified. This flag is useful in diagnosing spelling mistakes in -macro names. It is enabled by default when neither @option{--debug} nor -@option{--fatal-warnings} is specified on the command line. - -@item e -In trace output, show the expansion of each macro call. The expansion -is subject to length truncation specified by @code{debuglen} -(@pxref{Debuglen}). - -@item f -In debug and trace output, include the name of the current input file in -the output line. - -@item i -In debug output, print a message each time the current input file is -changed. 
- -@item l -In debug and trace output, include the current input line number in the -output line. - -@item m -In debug output, print a message each time a module is manipulated -(@pxref{Modules}). In trace output when the @samp{c} flag is in effect, -and in dumpdef output, follow builtin macros with their module name, -surrounded by braces (@samp{@{@}}). - -@item o -Output @code{dumpdef} data to standard error instead of the current -debug file. This can be useful when post-processing trace output, where -interleaving dumpdef and trace output can cause ambiguities. - -@item p -In debug output, print a message when a named file is found through the -path search mechanism (@pxref{Search Path}), giving the actual file name -used. - -@item q -In trace and dumpdef output, quote actual arguments and macro expansions -in the display with the current quotes. This is useful in connection -with the @samp{a} and @samp{e} flags above. - -@item s -In dumpdef output, show the entire stack of definitions associated with -a symbol via @code{pushdef}. - -@item t -In trace output, trace all macro calls made in this invocation of -@code{m4}. This is equivalent to using @code{traceon} without -arguments. - -@item x -In trace output, add a unique `macro call id' to each line of the trace -output. This is useful in connection with the @samp{c} flag above, to -match where a macro is first recognized with where it is finally -expanded, in spite of intermediate expansions that occur while -collecting arguments. It can also be used in isolation to determine how -many macros have been expanded. - -@item V -A shorthand for all of the above flags. -@end table - -As special cases, if @var{flags} starts with a @samp{+}, the named flags -are enabled without impacting other flags, and if it starts with a -@samp{-}, the named flags are disabled without impacting other flags. -Without either of these starting characters, @var{flags} simply replaces -the previous setting. 
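The three forms can be compared in a short session. This is a sketch,
not one of the distributed examples: it assumes that @code{m4} was
started with @option{-d}, so that the @samp{a}, @samp{e}, and @samp{q}
flags are initially enabled, and that only @code{foo}, a macro made up
for this illustration, is traced. The trace lines shown are the
expected output under those assumptions:

@example
$ @kbd{m4 -d}
define(`foo', `FOO$1')traceon(`foo')dnl foo is a hypothetical macro
foo(`x')
@error{}m4trace: -1- foo(`x') -> `FOOx'
@result{}FOOx
debugmode(`-q')dnl disable quoting, leave the other flags alone
foo(`x')
@error{}m4trace: -1- foo(x) -> FOOx
@result{}FOOx
debugmode(`+q')dnl enable quoting again, leave the other flags alone
foo(`x')
@error{}m4trace: -1- foo(`x') -> `FOOx'
@result{}FOOx
debugmode(`a')dnl no leading sign: replace the whole setting with a
foo(`x')
@error{}m4trace: -1- foo(x)
@result{}FOOx
@end example

The last call drops both the expansion and the quoting from the trace
line, since @samp{e} and @samp{q} are no longer part of the setting.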
-@comment FIXME - should we accept usage like debugmode(+fl-q)? Also, -@comment should we add debugmode(?) which expands to the current -@comment enabled flags, and debugmode(e?) which expands to e if e is -@comment currently enabled? - -If no flags are specified with the @option{--debug} option, the default is -@samp{+adeq}. Many examples in this manual show their output using -default flags. - -@cindex GNU extensions -There is a builtin macro @code{debugmode}, which allows on-the-fly control of -the debugging output format: - -@deffn {Builtin (gnu)} debugmode (@ovar{flags}) -The argument @var{flags} should be a subset of the letters listed above. -If no argument is present, all debugging flags are cleared (as if -@var{flags} were an explicit @samp{-V}). With an empty argument, the -most common flags are enabled (as if @var{flags} were an explicit -@samp{+adeq}). If an unknown flag is encountered, an error is issued. - -The expansion of @code{debugmode} is void. -@end deffn - -@comment options: -d-V -@example -$ @kbd{m4} -define(`foo', `FOO$1') -@result{} -traceon(`foo', `divnum') -@result{} -debugmode()dnl same as debugmode(`+adeq') -foo -@error{}m4trace: -1- foo -> `FOO' -@result{}FOO -debugmode(`V')debugmode(`-q') -@error{}m4trace:stdin:5: -1- id 7: debugmode ... = <debugmode>@{gnu@} -@error{}m4trace:stdin:5: -1- id 7: debugmode(`-q') -> `' -@result{} -foo( -`BAR') -@error{}m4trace:stdin:6: -1- id 8: foo ... = FOO$1 -@error{}m4trace:stdin:6: -1- id 8: foo(BAR) -> FOOBAR -@result{}FOOBAR -debugmode`'dnl same as debugmode(`-V') -@error{}m4trace:stdin:8: -1- id 9: debugmode ... = <debugmode>@{gnu@} -@error{}m4trace:stdin:8: -1- id 9: debugmode ->@w{ } -foo -@error{}m4trace: -1- foo -@result{}FOO -debugmode(`+clmx') -@result{} -foo(divnum) -@error{}m4trace:11: -1- id 13: foo ... = FOO$1 -@error{}m4trace:11: -2- id 14: divnum ... 
= <divnum>@{m4@} -@error{}m4trace:11: -2- id 14: divnum -@error{}m4trace:11: -1- id 13: foo -@result{}FOO0 -debugmode(`-m') -@result{} -@end example - -This example shows the effects of the debug flags that are not related -to macro tracing. - -@comment examples -@comment options: -dip -@example -$ @kbd{m4 -dip -I doc/examples} -@error{}m4debug: input read from 'stdin' -define(`foo', `m4wrap(`wrapped text -')dnl') -@result{} -include(`incl.m4')dnl -@error{}m4debug: path search for 'incl.m4' found 'doc/examples/incl.m4' -@error{}m4debug: input read from 'doc/examples/incl.m4' -@result{}Include file start -@result{}Include file end -@error{}m4debug: input reverted to stdin, line 3 -^D -@error{}m4debug: input exhausted -@error{}m4debug: input from m4wrap recursion level 1 -@result{}wrapped text -@error{}m4debug: input from m4wrap exhausted -@end example - -@node Debuglen -@section Limiting debug output - -@cindex GNU extensions -@cindex arglength -@cindex debuglen -@cindex limiting trace output length -@cindex trace output, limiting length -@cindex dumpdef output, limiting length -When debugging, sometimes it is desirable to reduce the clutter of -arbitrary-length strings, because the prefix carries enough information -to understand the issues. The builtin macro @code{debuglen}, along with -the command line option counterpart @option{--debuglen} (or @option{-l}, -@pxref{Debugging options, , Invoking m4}), allow on-the-fly control of -debugging string lengths: - -@deffn {Builtin (gnu)} debuglen (@var{len}) -The argument @var{len} is an integer that controls how much of -arbitrary-length strings should be output during trace and dumpdef -output. If specified to a non-zero value, then strings longer than that -length are truncated, and @samp{...} included in the output to show that -truncation took place. A warning is issued if @var{len} cannot be -parsed as an integer. -@comment FIXME - make this understand an optional suffix, similar to how -@comment --debuglen does. 
Also, we need a section documenting scaling -@comment suffixes. -@comment FIXME - should we allow len to be `?', meaning expand to the -@comment current value? - -The macro @code{debuglen} is recognized only with parameters. -@end deffn - -The following example demonstrates the behavior of length truncation. -Note that each argument and the final result are individually truncated. -Also, the special tokens for builtin functions are not truncated. - -@comment options: -l6 -techo -tdefn -@example -$ @kbd{m4 -d -l 6 -t echo -t defn} -debuglen(`oops') -@error{}m4:stdin:1: warning: debuglen: non-numeric argument 'oops' -@result{} -define(`echo', `$@@') -@result{} -echo(`1', `long string') -@error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...' -@result{}1,long string -indir(`echo', defn(`changequote')) -@error{}m4trace: -2- defn(`change...') -> `<changequote>' -@error{}m4trace: -1- echo(<changequote>) -> ``<changequote>'' -@result{} -debuglen -@result{}debuglen -debuglen(`0') -@result{} -echo(`long string') -@error{}m4trace: -1- echo(`long string') -> ``long string'' -@result{}long string -debuglen(`12') -@result{} -echo(`long string') -@error{}m4trace: -1- echo(`long string') -> ``long string...' -@result{}long string -@end example - -@node Debugfile -@section Saving debugging output - -@cindex saving debugging output -@cindex debugging output, saving -@cindex output, saving debugging -@cindex GNU extensions -Debug and tracing output can be redirected to files using either the -@option{--debugfile} option to @code{m4} (@pxref{Debugging options, , -Invoking m4}), or with the builtin macro @code{debugfile}: - -@deffn {Builtin (gnu)} debugfile (@ovar{file}) -Send all further debug and trace output to @var{file}, opened in append -mode. If @var{file} is the empty string, debug and trace output are -discarded. If @code{debugfile} is called without any arguments, debug -and trace output are sent to standard error. 
Output from @code{dumpdef} -is sent to this file if the debug level @code{o} is not set -(@pxref{Debugmode}). This does not affect -warnings, error messages, or @code{errprint} output, which are -always sent to standard error. If @var{file} cannot be opened, the -current debug file is unchanged, and an error is issued. - -When the @option{--safer} option (@pxref{Operation modes, , Invoking -m4}) is in effect, @var{file} must be empty or omitted, since otherwise -an input file could cause the modification of arbitrary files. - -The expansion of @code{debugfile} is void. -@end deffn - -@example -$ @kbd{m4 -d} -traceon(`divnum') -@result{} -divnum(`extra') -@error{}m4:stdin:2: warning: divnum: extra arguments ignored: 1 > 0 -@error{}m4trace: -1- divnum(`extra') -> `0' -@result{}0 -debugfile() -@result{} -divnum(`extra') -@error{}m4:stdin:4: warning: divnum: extra arguments ignored: 1 > 0 -@result{}0 -debugfile -@result{} -divnum -@error{}m4trace: -1- divnum -> `0' -@result{}0 -@end example - -Although the @option{--safer} option cripples @code{debugfile} to a -limited subset of capabilities, you may still use the @option{--debugfile} -option from the command line with no restrictions. - -@comment options: --safer --debugfile=trace -tfoo -Dfoo=bar -d+l -@comment status: 1 -@example -$ @kbd{m4 --safer --debugfile trace -t foo -D foo=bar -daelq} -foo # traced to `trace' -@result{}bar # traced to `trace' -debugfile(`file') -@error{}m4:stdin:2: debugfile: disabled by --safer -@result{} -foo # traced to `trace' -@result{}bar # traced to `trace' -debugfile() -@result{} -foo # trace discarded -@result{}bar # trace discarded -debugfile -@result{} -foo # traced to stderr -@error{}m4trace:7: -1- foo -> `bar' -@result{}bar # traced to stderr -undivert(`trace')dnl -@result{}m4trace:1: -1- foo -> `bar' -@result{}m4trace:3: -1- foo -> `bar' -@end example - -Sometimes it is useful to post-process trace output, even though there -is no standardized format for trace output. 
In this situation, forcing -@code{dumpdef} to output to standard error instead of the default of the -current debug file will avoid any ambiguities between the two types of -output; it also allows debugging via @code{dumpdef} when debug output is -discarded. - -@example -$ @kbd{m4 -d} -traceon(`divnum') -@result{} -divnum -@error{}m4trace: -1- divnum -> `0' -@result{}0 -dumpdef(`divnum') -@error{}divnum:@tabchar{}<divnum> -@result{} -debugfile(`') -@result{} -divnum -@result{}0 -dumpdef(`divnum') -@result{} -debugmode(`+o') -@result{} -divnum -@result{}0 -dumpdef(`divnum') -@error{}divnum:@tabchar{}<divnum> -@result{} -@end example - -@node Input Control -@chapter Input control - -This chapter describes various builtin macros for controlling the input -to @code{m4}. - -@menu -* Dnl:: Deleting whitespace in input -* Changequote:: Changing the quote characters -* Changecom:: Changing the comment delimiters -* Changeresyntax:: Changing the regular expression syntax -* Changesyntax:: Changing the lexical structure of the input -* M4wrap:: Saving text until end of input -@end menu - -@node Dnl -@section Deleting whitespace in input - -@cindex deleting whitespace in input -@cindex discarding input -@cindex input, discarding -The builtin @code{dnl} stands for ``Discard to Next Line'': - -@deffn {Builtin (m4)} dnl -All characters, up to and including the next newline, are discarded -without performing any macro expansion. A warning is issued if the end -of the file is encountered without a newline. - -The expansion of @code{dnl} is void. -@end deffn - -It is often used in connection with @code{define}, to remove the -newline that follows the call to @code{define}. Thus - -@example -define(`foo', `Macro `foo'.')dnl A very simple macro, indeed. -foo -@result{}Macro foo. 
-@end example - -The input up to and including the next newline is discarded, as opposed -to the way comments are treated (@pxref{Comments}), when the command -line option @option{--discard-comments} is not in effect -(@pxref{Operation modes, , Invoking m4}). - -Usually, @code{dnl} is immediately followed by an end of line or some -other whitespace. GNU @code{m4} will produce a warning diagnostic if -@code{dnl} is followed by an open parenthesis. In this case, @code{dnl} -will collect and process all arguments, looking for a matching close -parenthesis. All predictable side effects resulting from this -collection will take place. @code{dnl} will return no output. The -input following the matching close parenthesis up to and including the -next newline, on whichever line contains it, will still be discarded. - -@example -dnl(`args are ignored, but side effects occur', -define(`foo', `like this')) while this text is ignored: undefine(`foo') -@error{}m4:stdin:1: warning: dnl: extra arguments ignored: 2 > 0 -See how `foo' was defined, foo? -@result{}See how foo was defined, like this? -@end example - -If the end of file is encountered without a newline character, a -warning is issued and @code{dnl} stops consuming input. - -@example -m4wrap(`m4wrap(`2 hi -')0 hi dnl 1 hi') -@result{} -define(`hi', `HI') -@result{} -^D -@error{}m4:stdin:1: warning: dnl: end of file treated as newline -@result{}0 HI 2 HI -@end example - -@node Changequote -@section Changing the quote characters - -@cindex changing quote delimiters -@cindex quote delimiters, changing -@cindex delimiters, changing -The default quote delimiters can be changed with the builtin -@code{changequote}: - -@deffn {Builtin (m4)} changequote (@dvar{start, `}, @dvar{end, '}) -This sets @var{start} as the new begin-quote delimiter and @var{end} as -the new end-quote delimiter. If both arguments are missing, the default -quotes (@code{`} and @code{'}) are used. If @var{start} is void, then -quoting is disabled. 
Otherwise, if @var{end} is missing or void, the -default end-quote delimiter (@code{'}) is used. The quote delimiters -can be of any length. - -The expansion of @code{changequote} is void. -@end deffn - -@example -changequote(`[', `]') -@result{} -define([foo], [Macro [foo].]) -@result{} -foo -@result{}Macro foo. -@end example - -The quotation strings can safely contain eight-bit characters. -If no single character is appropriate, @var{start} and @var{end} can be -of any length. Other implementations cap the delimiter length to five -characters, but GNU has no inherent limit. - -@example -changequote(`[[[', `]]]') -@result{} -define([[[foo]]], [[[Macro [[[[[foo]]]]].]]]) -@result{} -foo -@result{}Macro [[foo]]. -@end example - -Calling @code{changequote} with @var{start} as the empty string will -effectively disable the quoting mechanism, leaving no way to quote text. -However, using an empty string is not portable, as some other -implementations of @code{m4} revert to the default quoting, while others -preserve the prior non-empty delimiter. If @var{start} is not empty, -then an empty @var{end} will use the default end-quote delimiter of -@samp{'}, as otherwise, it would be impossible to end a quoted string. -Again, this is not portable, as some other @code{m4} implementations -reuse @var{start} as the end-quote delimiter, while others preserve the -previous non-empty value. Omitting both arguments restores the default -begin-quote and end-quote delimiters; fortunately this behavior is -portable to all implementations of @code{m4}. - -@example -define(`foo', `Macro `FOO'.') -@result{} -changequote(`', `') -@result{} -foo -@result{}Macro `FOO'. -`foo' -@result{}`Macro `FOO'.' -changequote(`,) -@result{} -foo -@result{}Macro FOO. -@end example - -There is no way in @code{m4} to quote a string containing an unmatched -begin-quote, except using @code{changequote} to change the current -quotes. 
- -If the quotes should be changed from, say, @samp{[} to @samp{[[}, -temporary quote characters have to be defined. To achieve this, two -calls of @code{changequote} must be made, one for the temporary quotes -and one for the new quotes. - -Macros are recognized in preference to the begin-quote string, so if a -prefix of @var{start} can be recognized as part of a potential macro -name, the quoting mechanism is effectively disabled. Unless you use -@code{changesyntax} (@pxref{Changesyntax}), this means that @var{start} -should not begin with a letter, digit, or @samp{_} (underscore). -However, even though quoted strings are not recognized, the quote -characters can still be discerned in macro expansion and in trace -output. - -@example -define(`echo', `$@@') -@result{} -define(`hi', `HI') -@result{} -changequote(`q', `Q') -@result{} -q hi Q hi -@result{}q HI Q HI -echo(hi) -@result{}qHIQ -changequote -@result{} -changequote(`-', `EOF') -@result{} -- hi EOF hi -@result{} hi HI -changequote -@result{} -changequote(`1', `2') -@result{} -hi1hi2 -@result{}hi1hi2 -hi 1hi2 -@result{}HI hi -@end example - -Quotes are recognized in preference to argument collection. In -particular, if @var{start} is a single @samp{(}, then argument -collection is effectively disabled. For portability with other -implementations, it is a good idea to avoid @samp{(}, @samp{,}, and -@samp{)} as the first character in @var{start}. 
- -@example -define(`echo', `$#:$@@:') -@result{} -define(`hi', `HI') -@result{} -changequote(`(',`)') -@result{} -echo(hi) -@result{}0::hi -changequote -@result{} -changequote(`((', `))') -@result{} -echo(hi) -@result{}1:HI: -echo((hi)) -@result{}0::hi -changequote -@result{} -changequote(`,', `)') -@result{} -echo(hi,hi)bye) -@result{}1:HIhibye: -@end example - -However, if you are not worried about portability, using @samp{(} and -@samp{)} as quoting characters has an interesting property---you can use -it to compute a quoted string containing the expansion of any quoted -text, as long as the expansion results in both balanced quotes and -balanced parentheses. The trick is realizing @code{expand} uses -@samp{$1} unquoted, to trigger its expansion using the normal quoting -characters, but uses extra parentheses to group unquoted commas that -occur in the expansion without consuming whitespace following those -commas. Then @code{_expand} uses @code{changequote} to convert the -extra parentheses back into quoting characters. Note that it takes two -more @code{changequote} invocations to restore the original quotes. -Contrast the behavior on whitespace when using @samp{$*}, via -@code{quote}, to attempt the same task. - -@example -changequote(`[', `]')dnl -define([a], [1, (b)])dnl -define([b], [2])dnl -define([quote], [[$*]])dnl -define([expand], [_$0(($1))])dnl -define([_expand], - [changequote([(], [)])$1changequote`'changequote(`[', `]')])dnl -expand([a, a, [a, a], [[a, a]]]) -@result{}1, (2), 1, (2), a, a, [a, a] -quote(a, a, [a, a], [[a, a]]) -@result{}1,(2),1,(2),a, a,[a, a] -@end example - -If @var{end} is a prefix of @var{start}, the end-quote will be -recognized in preference to a nested begin-quote. In particular, -changing the quotes to have the same string for @var{start} and -@var{end} disables nesting of quotes. 
When quote nesting is disabled, -it is impossible to double-quote strings across macro expansions, so -using the same string is not done very often. - -@example -define(`hi', `HI') -@result{} -changequote(`""', `"') -@result{} -""hi"""hi" -@result{}hihi -""hi" ""hi" -@result{}hi hi -""hi"" "hi" -@result{}hi" "HI" -changequote -@result{} -`hi`hi'hi' -@result{}hi`hi'hi -changequote(`"', `"') -@result{} -"hi"hi"hi" -@result{}hiHIhi -@end example - -It is an error if the end of file occurs within a quoted string. - -@comment status: 1 -@example -`hello world' -@result{}hello world -`dangling quote -^D -@error{}m4:stdin:2: end of file in string -@end example - -@comment status: 1 -@example -ifelse(`dangling quote -^D -@error{}m4:stdin:1: ifelse: end of file in string -@end example - -@node Changecom -@section Changing the comment delimiters - -@cindex changing comment delimiters -@cindex comment delimiters, changing -@cindex delimiters, changing -The default comment delimiters can be changed with the builtin -macro @code{changecom}: - -@deffn {Builtin (m4)} changecom (@ovar{start}, @dvar{end, @key{NL}}) -This sets @var{start} as the new begin-comment delimiter and @var{end} -as the new end-comment delimiter. If both arguments are missing, or -@var{start} is void, then comments are disabled. Otherwise, if -@var{end} is missing or void, the default end-comment delimiter of -newline is used. The comment delimiters can be of any length. - -The expansion of @code{changecom} is void. -@end deffn - -@example -define(`comment', `COMMENT') -@result{} -# A normal comment -@result{}# A normal comment -changecom(`/*', `*/') -@result{} -# Not a comment anymore -@result{}# Not a COMMENT anymore -But: /* this is a comment now */ while this is not a comment -@result{}But: /* this is a comment now */ while this is not a COMMENT -@end example - -@cindex comments, copied to output -Note how comments are copied to the output, much as if they were quoted -strings. 
If you want the text inside a comment expanded, quote the -begin-comment delimiter. - -Calling @code{changecom} without any arguments, or with @var{start} as -the empty string, will effectively disable the commenting mechanism. To -restore the original comment start of @samp{#}, you must explicitly ask -for it. If @var{start} is not empty, then an empty @var{end} will use -the default end-comment delimiter of newline, as otherwise, it would be -impossible to end a comment. However, this is not portable, as some -other @code{m4} implementations preserve the previous non-empty -delimiters instead. - -@example -define(`comment', `COMMENT') -@result{} -changecom -@result{} -# Not a comment anymore -@result{}# Not a COMMENT anymore -changecom(`#', `') -@result{} -# comment again -@result{}# comment again -@end example - -The comment strings can safely contain eight-bit characters. -If no single character is appropriate, @var{start} and @var{end} can be -of any length. Other implementations cap the delimiter length to five -characters, but GNU has no inherent limit. - -As of M4 1.6, macros and quotes are recognized in preference to -comments, so if a prefix of @var{start} can be recognized as part of a -potential macro name, or confused with a quoted string, the comment -mechanism is effectively disabled (earlier versions of GNU M4 -favored comments, but this was inconsistent with other implementations). -Unless you use @code{changesyntax} (@pxref{Changesyntax}), this means -that @var{start} should not begin with a letter, digit, or @samp{_} -(underscore), and that neither the start-quote nor the start-comment -string should be a prefix of the other. 
- -@example -define(`hi', `HI') -@result{} -define(`hi1hi2', `hello') -@result{} -changecom(`q', `Q') -@result{} -q hi Q hi -@result{}q HI Q HI -changecom(`1', `2') -@result{} -hi1hi2 -@result{}hello -hi 1hi2 -@result{}HI 1hi2 -changecom(`[[', `]]') -@result{} -changequote(`[[[', `]]]') -@result{} -[hi] -@result{}[HI] -[[hi]] -@result{}[[hi]] -[[[hi]]] -@result{}hi -changequote -@result{} -changecom(`[[[', `]]]') -@result{} -changequote(`[[', `]]') -@result{} -[[hi]] -@result{}hi -[[[hi]]] -@result{}[hi] -@end example - -Comments are recognized in preference to argument collection. In -particular, if @var{start} is a single @samp{(}, then argument -collection is effectively disabled. For portability with other -implementations, it is a good idea to avoid @samp{(}, @samp{,}, and -@samp{)} as the first character in @var{start}. - -@example -define(`echo', `$#:$*:$@@:') -@result{} -define(`hi', `HI') -@result{} -changecom(`(',`)') -@result{} -echo(hi) -@result{}0:::(hi) -changecom -@result{} -changecom(`((', `))') -@result{} -echo(hi) -@result{}1:HI:HI: -echo((hi)) -@result{}0:::((hi)) -changecom(`,', `)') -@result{} -echo(hi,hi)bye) -@result{}1:HI,hi)bye:HI,hi)bye: -changecom -@result{} -echo(hi,`,`'hi',hi) -@result{}3:HI,,HI,HI:HI,,`'hi,HI: -echo(hi,`,`'hi',hi`'changecom(`,,', `hi')) -@result{}3:HI,,`'hi,HI:HI,,`'hi,HI: -@end example - -It is an error if the end of file occurs within a comment. 
- -@comment status: 1 -@example -changecom(`/*', `*/') -@result{} -/*dangling comment -^D -@error{}m4:stdin:2: end of file in comment -@end example - -@comment status: 1 -@example -changecom(`/*', `*/') -@result{} -len(/*dangling comment -^D -@error{}m4:stdin:2: len: end of file in comment -@end example - -@node Changeresyntax -@section Changing the regular expression syntax - -@cindex regular expression syntax, changing -@cindex basic regular expressions -@cindex extended regular expressions -@cindex regular expressions -@cindex expressions, regular -@cindex syntax, changing regular expression -@cindex flavors of regular expressions -@cindex GNU extensions -The GNU extensions @code{patsubst}, @code{regexp}, and, more -recently, @code{renamesyms} each deal with regular expressions. There -are multiple flavors of regular expressions, so the -@code{changeresyntax} builtin exists to allow choosing the default -flavor: - -@deffn {Builtin (gnu)} changeresyntax (@var{resyntax}) -Changes the default regular expression syntax used by M4 according to -the value of @var{resyntax}, equivalent to passing @var{resyntax} as the -argument to the command line option @option{--regexp-syntax} -(@pxref{Operation modes, , Invoking m4}). If @var{resyntax} is empty, -the default flavor reverts to the @code{GNU_M4} style, compatible -with Emacs. - -@var{resyntax} can be any one of the values in the table below. Case is -not important, and @samp{-} or @samp{ } can be substituted for @samp{_} in -the given names. If @var{resyntax} is unrecognized, a warning is -issued and the default flavor is not changed. - -@table @dfn -@item AWK -@xref{awk regular expression syntax}, for details. - -@item BASIC -@itemx ED -@itemx POSIX_BASIC -@itemx SED -@xref{posix-basic regular expression syntax}, for details. - -@item BSD_M4 -@itemx EXTENDED -@itemx POSIX_EXTENDED -@xref{posix-extended regular expression syntax}, for details. 
- -@item GNU_AWK -@itemx GAWK -@xref{gnu-awk regular expression syntax}, for details. - -@item GNU_EGREP -@itemx EGREP -@xref{egrep regular expression syntax}, for details. - -@item GNU_M4 -@itemx EMACS -@itemx GNU_EMACS -@xref{emacs regular expression syntax}, for details. This is the -default regular expression flavor. - -@item GREP -@xref{grep regular expression syntax}, for details. - -@item MINIMAL -@itemx POSIX_MINIMAL -@itemx POSIX_MINIMAL_BASIC -@xref{posix-minimal-basic regular expression syntax}, for details. - -@item POSIX_AWK -@xref{posix-awk regular expression syntax}, for details. - -@item POSIX_EGREP -@xref{posix-egrep regular expression syntax}, for details. -@end table - -The expansion of @code{changeresyntax} is void. -The macro @code{changeresyntax} is recognized only with parameters. -This macro was added in M4 2.0. -@end deffn - -As an example of how @var{resyntax} is recognized, the first three -usages select the @samp{GNU_M4} regular expression flavor, while the -fourth is rejected with a warning: - -@example -changeresyntax(`gnu m4') -@result{} -changeresyntax(`GNU-m4') -@result{} -changeresyntax(`Gnu_M4') -@result{} -changeresyntax(`unknown') -@error{}m4:stdin:4: warning: changeresyntax: bad syntax-spec: 'unknown' -@result{} -@end example - -Using @code{changeresyntax} makes it possible to omit the optional -@var{resyntax} parameter to other macros, while still using a different -regular expression flavor. 
- -@example -patsubst(`ab', `a|b', `c') -@result{}ab -patsubst(`ab', `a\|b', `c') -@result{}cc -patsubst(`ab', `a|b', `c', `EXTENDED') -@result{}cc -changeresyntax(`EXTENDED') -@result{} -patsubst(`ab', `a|b', `c') -@result{}cc -patsubst(`ab', `a\|b', `c') -@result{}ab -@end example - -@node Changesyntax -@section Changing the lexical structure of the input - -@cindex lexical structure of the input -@cindex input, lexical structure of the -@cindex syntax table -@cindex changing syntax -@cindex GNU extensions -@quotation -The macro @code{changesyntax} and all associated functionality are -experimental (@pxref{Experiments}). The functionality might change in -the future. Please send comments about it the same way you would -report bugs. -@end quotation - -The input to @code{m4} is read character by character, and these -characters are grouped together to form input tokens (such as macro -names, strings, comments, etc.). - -Each token is parsed according to certain rules. For example, a macro -name starts with a letter or @samp{_} and consists of the longest -possible string of letters, @samp{_} and digits. But who is to decide -which characters are letters, digits, quotes, white space? Earlier, the -operating system decided; now you do. The builtin macro -@code{changesyntax} is used to change the way @code{m4} parses the input -stream into tokens. - -@deffn {Builtin (gnu)} changesyntax (@var{syntax-spec}, @dots{}) -Each @var{syntax-spec} is a two-part string. The first part is a -command, consisting of a single character describing a syntax category, -and an optional one-character action. The action can be @samp{-} to -remove the listed characters from that category, @samp{=} to set the -category to the listed characters -and reassign all other characters previously in that category to -`Other', or @samp{+} to add the listed characters to the category -without affecting other characters. 
If an action is not specified, but -additional characters are present, then @samp{=} is assumed. - -The remaining characters of each @var{syntax-spec} form the set of -characters to perform the action on for that syntax category. Character -ranges are expanded as for @code{translit} (@pxref{Translit}). To start -the character set with @samp{-}, @samp{+}, or @samp{=}, an action must -be specified. - -If @var{syntax-spec} is just a category, and no action or characters -were specified, then all characters in that category are reset to their -default state. A warning is issued if the category character is not -valid. If @var{syntax-spec} is the empty string, then all categories -are reset to their default state. - -Syntax categories are divided into basic and context. Every input -byte belongs to exactly one basic syntax category. Additionally, any -byte can be assigned to a context category regardless of its current -basic category. Context categories exist because a character can -behave differently when parsed in isolation than when it occurs in -context to close out a token started by another basic category (for -example, @kbd{newline} defaults to the basic category `Whitespace' as -well as the context category `End comment'). - -The following table describes the case-insensitive designation for each -syntax category (the first byte in @var{syntax-spec}), and a description -of what each category controls. - -@multitable @columnfractions .06 .20 .13 .55 -@headitem Code @tab Category @tab Type @tab Description - -@item @kbd{W} @tab @dfn{Words} @tab Basic -@tab Characters that can start a macro name. Defaults to the letters as -defined by the locale, and the character @samp{_}. - -@item @kbd{D} @tab @dfn{Digits} @tab Basic -@tab Characters that, together with the letters, form the remainder of a -macro name. Defaults to the ten digits @samp{0}@dots{}@samp{9}, and any -other digits defined by the locale. 
- -@item @kbd{S} @tab @dfn{White space} @tab Basic -@tab Characters that should be trimmed from the beginning of each argument to -a macro call. The defaults are space, tab, newline, carriage return, -form feed, and vertical tab, and any others as defined by the locale. - -@item @kbd{(} @tab @dfn{Open parenthesis} @tab Basic -@tab Characters that open the argument list of a macro call. The default is -the single character @samp{(}. - -@item @kbd{)} @tab @dfn{Close parenthesis} @tab Basic -@tab Characters that close the argument list of a macro call. The default -is the single character @samp{)}. - -@item @kbd{,} @tab @dfn{Argument separator} @tab Basic -@tab Characters that separate the arguments of a macro call. The default is -the single character @samp{,}. - -@item @kbd{L} @tab @dfn{Left quote} @tab Basic -@tab The set of characters that can start a single-character quoted string. -The default is the single character @samp{`}. For multiple-character -quote delimiters, use @code{changequote} (@pxref{Changequote}). - -@item @kbd{R} @tab @dfn{Right quote} @tab Context -@tab The set of characters that can end a single-character quoted string. -The default is the single character @samp{'}. For multiple-character -quote delimiters, use @code{changequote} (@pxref{Changequote}). Note -that @samp{'} also defaults to the syntax category `Other', when it -appears in isolation. - -@item @kbd{B} @tab @dfn{Begin comment} @tab Basic -@tab The set of characters that can start a single-character comment. The -default is the single character @samp{#}. For multiple-character -comment delimiters, use @code{changecom} (@pxref{Changecom}). - -@item @kbd{E} @tab @dfn{End comment} @tab Context -@tab The set of characters that can end a single-character comment. The -default is the single character @kbd{newline}. For multiple-character -comment delimiters, use @code{changecom} (@pxref{Changecom}). 
Note that -newline also defaults to the syntax category `White space', when it -appears in isolation. - -@item @kbd{$} @tab @dfn{Dollar} @tab Context -@tab Characters that can introduce an argument reference in the body of a -macro. The default is the single character @samp{$}. - -@comment FIXME - implement ${10} argument parsing. -@item @kbd{@{} @tab @dfn{Left brace} @tab Context -@tab Characters that introduce an extended argument reference in the body of -a macro immediately after a character in the Dollar category. The -default is the single character @samp{@{}. - -@item @kbd{@}} @tab @dfn{Right brace} @tab Context -@tab Characters that conclude an extended argument reference in the body of a -macro. The default is the single character @samp{@}}. - -@item @kbd{O} @tab @dfn{Other} @tab Basic -@tab Characters that have no special syntactical meaning to @code{m4}. -Defaults to all characters except those in the categories above. - -@item @kbd{A} @tab @dfn{Active} @tab Basic -@tab Characters that themselves, alone, form macro names. This is a -GNU extension, and active characters have lower precedence -than comments. By default, no characters are active. - -@item @kbd{@@} @tab @dfn{Escape} @tab Basic -@tab Characters that must precede macro names for them to be recognized. -This is a GNU extension. When an escape character is defined, -then macros are not recognized unless the escape character is present; -however, the macro name, visible by @samp{$0} in macro definitions, does -not include the escape character. By default, no characters are -escapes. - -@comment FIXME - we should also consider supporting: -@comment @item @kbd{I} @tab @dfn{Ignore} @tab Basic -@comment @tab Characters that are ignored if they appear in -@comment the input; perhaps defaulting to '\0'. -@end multitable - -The expansion of @code{changesyntax} is void. -The macro @code{changesyntax} is recognized only with parameters. 
Use -this macro with caution, as it is possible to change the syntax in such -a way that no further macros can be recognized by @code{m4}. -This macro was added in M4 2.0. -@end deffn - -With @code{changesyntax} we can modify what characters form a word. For -example, we can make @samp{.} a valid character in a macro name, or even -start a macro name with a number. - -@example -define(`test.1', `TEST ONE') -@result{} -define(`1', `one') -@result{} -__file__ -@result{}stdin -test.1 -@result{}test.1 -dnl Add `.' and remove `_'. -changesyntax(`W+.', `W-_') -@result{} -__file__ -@result{}__file__ -test.1 -@result{}TEST ONE -dnl Set words to include numbers. -changesyntax(`W=a-zA-Z0-9_') -@result{} -__file__ -@result{}stdin -test.1 -@result{}test.one -dnl Reset words to default (a-zA-Z_). -changesyntax(`W') -@result{} -__file__ -@result{}stdin -test.1 -@result{}test.1 -@end example - -Another possibility is to change the syntax of a macro call. - -@example -define(`test', `$#') -@result{} -test(a, b, c) -@result{}3 -dnl Change macro syntax. -changesyntax(`(<', `,|', `)>') -@result{} -test(a, b, c) -@result{}0(a, b, c) -test<a|b|c> -@result{}3 -@end example - -Leading spaces are always removed from macro arguments in @code{m4}, but -by changing the syntax categories we can avoid it. The use of -@code{format} is an alternative to using a literal tab character. - -@example -define(`test', `$1$2$3') -@result{} -test(`a', `b', `c') -@result{}abc -dnl Don't ignore whitespace. -changesyntax(`O 'format(``%c'', `9')` -') -@result{} -test(a, b, -c) -@result{}a b -@result{}c -@end example - -It is possible to redefine the @samp{$} used to indicate macro arguments -in user defined macros. Dollar class syntax elements are copied to the -output if there is no valid expansion. - -@example -define(`argref', `Dollar: $#, Question: ?#') -@result{} -argref(1, 2, 3) -@result{}Dollar: 3, Question: ?# -dnl Change argument identifier. 
-changesyntax(`$?') -@result{} -argref(1,2,3) -@result{}Dollar: $#, Question: 3 -define(`escape', `$?`'1$?1?') -@result{} -escape(foo) -@result{}$?1$foo? -dnl Multiple argument identifiers. -changesyntax(`$+$') -@result{} -argref(1, 2, 3) -@result{}Dollar: 3, Question: 3 -@end example - -Macro calls can be given a @TeX{} or Texinfo like syntax using an -escape. If one or more characters are defined as escapes, macro names -are only recognized if preceded by an escape character. - -If the escape is not followed by what is normally a word (a letter -optionally followed by letters and/or numerals), that single character -is returned as a macro name. - -As always, words without a macro definition cause no error message. -They and the escape character are simply output. - -@example -define(`foo', `bar') -@result{} -dnl Require @@ escape before any macro. -changesyntax(`@@@@') -@result{} -foo -@result{}foo -@@foo -@result{}bar -@@bar -@result{}@@bar -@@dnl Change escape character. -@@changesyntax(`@@\', `O@@') -@result{} -foo -@result{}foo -@@foo -@result{}@@foo -\foo -@result{}bar -define(`#', `No comment') -@result{}define(#, No comment) -\define(`#', `No comment') -@result{} -\# \foo # Comment \foo -@result{}No comment bar # Comment \foo -@end example - -Active characters are known from @TeX{}. In @code{m4} an active -character is always seen as a one-letter word, and so, if it has a macro -definition, the macro will be called. - -@example -define(`@@', `TEST') -@result{} -define(`a@@a', `hello') -@result{} -define(`a', `A') -@result{} -@@ -@result{}@@ -a@@a -@result{}A@@A -dnl Make @@ active. -changesyntax(`A@@') -@result{} -@@ -@result{}TEST -a@@a -@result{}ATESTa -@end example - -There is obviously an overlap between @code{changesyntax} and -@code{changequote}, since there are now two ways to modify quote -delimiters. 
To avoid incompatibilities, if the quotes are modified by -@code{changequote}, any characters previously set to either quote -delimiter by @code{changesyntax} are first demoted to the other category -(@samp{O}), so the result is only a single set of quotes. In the other -direction, if quotes were already disabled, or if both the start and end -delimiter set by @code{changequote} are single bytes, then -@code{changesyntax} preserves those settings. But if either delimiter -occupies multiple bytes, @code{changesyntax} first disables both -delimiters. Quotes can be disabled via @code{changesyntax} by emptying -the left quote basic category (@samp{L}). Meanwhile, the right quote -context category (@samp{R}) will never be empty; if a -@code{changesyntax} action would otherwise leave that category empty, -then the default end delimiter from @code{changequote} (@samp{'}) is -used; thus, it is never possible to get @code{m4} in a state where a -quoted string cannot be terminated. These interactions apply to comment -delimiters as well, @i{mutatis mutandis} with @code{changecom}. - -@example -define(`test', `TEST') -@result{} -dnl Add additional single-byte delimiters. -changesyntax(`L+<', `R+>') -@result{} -<test> `test' [test] <<test>> -@result{}test test [TEST] <test> -dnl Use standard interface, overriding changesyntax settings. -changequote(<[>, `]') -@result{} -<test> `test' [test] <<test>> -@result{}<TEST> `TEST' test <<TEST>> -dnl Introduce multi-byte delimiters. -changequote([<<], [>>]) -@result{} -<test> `test' [test] <<test>> -@result{}<TEST> `TEST' [TEST] test -dnl Change end quote, effectively disabling quotes. -changesyntax(<<R]>>) -@result{} -<test> `test' [test] <<test>> -@result{}<TEST> `TEST' [TEST] <<TEST>> -dnl Change beginning quote, make ] normal, thus making ' end quote. -changesyntax(L`, R-]) -@result{} -<test> `test' [test] <<test>> -@result{}<TEST> test [TEST] <<TEST>> -dnl Set multi-byte quote; unrelated changes don't impact it. 
-changequote(`<<', `>>')changesyntax(<<@@\>>) -@result{} -<\test> `\test' [\test] <<\test>> -@result{}<TEST> `TEST' [TEST] \test -@end example - -If several characters are assigned to a category that forms single -character tokens, all such characters are treated as equal. Any open -parenthesis will match any close parenthesis, etc. - -@example -dnl Go crazy with symbols. -changesyntax(`(@{<', `)@}>', `,;:', `O(,)') -@result{} -eval@{2**4-1; 2: 8> -@result{}00001111 -@end example - -The syntax table is initialized to be backwards compatible, so if you -never call @code{changesyntax}, nothing will have changed. - -For now, debugging output continues to use @kbd{(}, @kbd{,} and @kbd{)} -to show macro calls; and macro expansions that result in a list of -arguments (such as @samp{$@@} or @code{shift}) use @samp{,}, regardless -of the current syntax settings. However, this is likely to change in a -future release, so it should not be relied on, particularly since it is -next to impossible to write recursive macros if the argument separator -doesn't match between expansion and rescanning. - -@c FIXME - changing syntax of , should not break iterative macros. -@example -$ @kbd{m4 -d} -changesyntax(`,=|')traceon(`foo')define(`foo'|`$#:$@@') -@result{} -foo(foo(1|2|3)) -@error{}m4trace: -2- foo(`1', `2', `3') -> `3:`1',`2',`3'' -@error{}m4trace: -1- foo(`3:1,2,3') -> `1:`3:1,2,3'' -@result{}1:3:1,2,3 -@end example - -@node M4wrap -@section Saving text until end of input - -@cindex saving input -@cindex input, saving -@cindex deferring expansion -@cindex expansion, deferring -It is possible to `save' some text until the end of the normal input has -been seen. Text can be saved, to be read again by @code{m4} when the -normal input has been exhausted. This feature is normally used to -initiate cleanup actions before normal exit, e.g., deleting temporary -files. 
- -To save input text, use the builtin @code{m4wrap}: - -@deffn {Builtin (m4)} m4wrap (@var{string}, @dots{}) -Stores @var{string} in a safe place, to be reread when end of input is -reached. As a GNU extension, additional arguments are -concatenated with a space to the @var{string}. - -Successive invocations of @code{m4wrap} accumulate saved text in -first-in, first-out order, as required by POSIX. - -The expansion of @code{m4wrap} is void. -The macro @code{m4wrap} is recognized only with parameters. -@end deffn - -@example -define(`cleanup', `This is the `cleanup' action. -') -@result{} -m4wrap(`cleanup') -@result{} -This is the first and last normal input line. -@result{}This is the first and last normal input line. -^D -@result{}This is the cleanup action. -@end example - -The saved input is only reread when the end of normal input is seen, and -not if @code{m4exit} is used to exit @code{m4}. - -It is safe to call @code{m4wrap} from wrapped text, where all the -recursively wrapped text is deferred until the current wrapped text is -exhausted. As of M4 1.6, when @code{m4wrap} is not used recursively, -the saved pieces of text are reread in the same order in which they were -saved (FIFO---first in, first out), as required by POSIX. - -@example -m4wrap(`1 -') -@result{} -m4wrap(`2', `3 -') -@result{} -^D -@result{}1 -@result{}2 3 -@end example - -However, earlier versions had reverse ordering (LIFO---last in, first -out), as this behavior is more like the semantics of the C function -@code{atexit}. It is possible to emulate POSIX behavior even -with older versions of GNU M4 by including the file -@file{m4-@value{VERSION}/@/doc/examples/@/wrapfifo.m4} from the -distribution: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`wrapfifo.m4')dnl -@result{}dnl Redefine m4wrap to have FIFO semantics. 
-@result{}define(`_m4wrap_level', `0')dnl -@result{}define(`m4wrap', -@result{}`ifdef(`m4wrap'_m4wrap_level, -@result{} `define(`m4wrap'_m4wrap_level, -@result{} defn(`m4wrap'_m4wrap_level)`$1')', -@result{} `builtin(`m4wrap', `define(`_m4wrap_level', -@result{} incr(_m4wrap_level))dnl -@result{}m4wrap'_m4wrap_level)dnl -@result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl -include(`wrapfifo.m4') -@result{} -m4wrap(`a`'m4wrap(`c -', `d')')m4wrap(`b') -@result{} -^D -@result{}abc -@end example - -It is likewise possible to emulate LIFO behavior without resorting to -the GNU M4 extension of @code{builtin}, by including the file -@file{m4-@value{VERSION}/@/doc/examples/@/wraplifo.m4} from the -distribution. (Unfortunately, both examples shown here share some -subtle bugs. See if you can find and correct them; or @pxref{Improved -m4wrap, , Answers}). - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`wraplifo.m4')dnl -@result{}dnl Redefine m4wrap to have LIFO semantics. -@result{}define(`_m4wrap_level', `0')dnl -@result{}define(`_m4wrap', defn(`m4wrap'))dnl -@result{}define(`m4wrap', -@result{}`ifdef(`m4wrap'_m4wrap_level, -@result{} `define(`m4wrap'_m4wrap_level, -@result{} `$1'defn(`m4wrap'_m4wrap_level))', -@result{} `_m4wrap(`define(`_m4wrap_level', incr(_m4wrap_level))dnl -@result{}m4wrap'_m4wrap_level)dnl -@result{}define(`m4wrap'_m4wrap_level, `$1')')')dnl -include(`wraplifo.m4') -@result{} -m4wrap(`a`'m4wrap(`c -', `d')')m4wrap(`b') -@result{} -^D -@result{}bac -@end example - -Here is an example of implementing a factorial function using -@code{m4wrap}: - -@example -define(`f', `ifelse(`$1', `0', `Answer: 0!=1 -', eval(`$1>1'), `0', `Answer: $2$1=eval(`$2$1') -', `m4wrap(`f(decr(`$1'), `$2$1*')')')') -@result{} -f(`10') -@result{} -^D -@result{}Answer: 10*9*8*7*6*5*4*3*2*1=3628800 -@end example - -Invocations of @code{m4wrap} at the same recursion level are -concatenated and rescanned as usual: - -@example -define(`ab', `AB -') -@result{} 
-m4wrap(`a')m4wrap(`b') -@result{} -^D -@result{}AB -@end example - -@noindent -however, the transition between recursion levels behaves like an end of -file condition between two input files. - -@comment status: 1 -@example -m4wrap(`m4wrap(`)')len(abc') -@result{} -^D -@error{}m4:stdin:1: len: end of file in argument list -@end example - -As of M4 1.6, @code{m4wrap} transparently handles builtin tokens -generated by @code{defn} (@pxref{Defn}). However, for portability, it -is better to defer the evaluation of @code{defn} along with the rest of -the wrapped text, as is done for @code{foo} in the example below, rather -than computing the builtin token up front, as is done for @code{bar}. - -@example -m4wrap(`define(`foo', defn(`divnum'))foo -') -@result{} -m4wrap(`define(`bar', ')m4wrap(defn(`divnum'))m4wrap(`)bar -') -@result{} -^D -@result{}0 -@result{}0 -@end example - -@node File Inclusion -@chapter File inclusion - -@cindex file inclusion -@cindex inclusion, of files -@code{m4} allows you to include named files at any point in the input. - -@menu -* Include:: Including named files and modules -* Search Path:: Searching for include files -@end menu - -@node Include -@section Including named files and modules - -There are two builtin macros in @code{m4} for including files: - -@deffn {Builtin (m4)} include (@var{file}) -@deffnx {Builtin (m4)} sinclude (@var{file}) -Both macros cause the file named @var{file} to be read by -@code{m4}. When the end of the file is reached, input is resumed from -the previous input file. - -The expansion of @code{include} and @code{sinclude} is therefore the -contents of @var{file}. - -If @var{file} does not exist, is a directory, or cannot otherwise be -read, the expansion is void, -and @code{include} will fail with an error while @code{sinclude} is -silent. The empty string counts as a file that does not exist. - -The macros @code{include} and @code{sinclude} are recognized only with -parameters. 
-@end deffn - -@comment status: 1 -@example -include(`n') -@error{}m4:stdin:1: include: cannot open file 'n': No such file or directory -@result{} -include() -@error{}m4:stdin:2: include: cannot open file '': No such file or directory -@result{} -sinclude(`n') -@result{} -sinclude() -@result{} -@end example - -This section uses the @option{--include} command-line option (or -@option{-I}, @pxref{Preprocessor features, , Invoking m4}) to grab -files from the @file{m4-@value{VERSION}/@/doc/examples} -directory shipped as part of the GNU @code{m4} package. The -file @file{m4-@value{VERSION}/@/doc/examples/@/incl.m4} in the distribution -contains the lines: - -@comment ignore -@example -$ @kbd{cat doc/examples/incl.m4} -@result{}Include file start -@result{}foo -@result{}Include file end -@end example - -Normally file inclusion is used to insert the contents of a file -into the input stream. The contents of the file will be read by -@code{m4} and macro calls in the file will be expanded: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -define(`foo', `FOO') -@result{} -include(`incl.m4') -@result{}Include file start -@result{}FOO -@result{}Include file end -@result{} -@end example - -The fact that @code{include} and @code{sinclude} expand to the contents -of the file can be used to define macros that operate on entire files. -Here is an example, which defines @samp{bar} to expand to the contents -of @file{incl.m4}: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -define(`bar', include(`incl.m4')) -@result{} -This is `bar': >>bar<< -@result{}This is bar: >>Include file start -@result{}foo -@result{}Include file end -@result{}<< -@end example - -This use of @code{include} is not trivial, though, as files can contain -quotes, commas, and parentheses, which can interfere with the way the -@code{m4} parser works. 
GNU M4 seamlessly concatenates -the file contents with the next character, even if the included file -ended in the middle of a comment, string, or macro call. These -conditions are treated as end of file errors only when they occur in -input files named on the command line. - -In GNU M4, an alternative method of reading files is -using @code{undivert} (@pxref{Undivert}) on a named file. - -In addition, as a GNU M4 extension, if the included file cannot -be found exactly as given, various standard suffixes are appended. -If the included file name is absolute (a full path from the root directory -is given) then additional search directories are not examined, although -suffixes will be tried if the file is not found exactly as given. -For each directory that is searched (according to the absolute directory -given in the file name, or else by directories listed in @env{M4PATH} and -given with the @option{-I} and @option{-B} options), first the unchanged -file name is tried, and then again with the suffixes @samp{.m4f} and -@samp{.m4}. - -Furthermore, if no matching file has yet been found, before moving on to -the next directory, @samp{.la} and the usual binary module suffix for -the host platform (usually @samp{.so}) are also tried. Matching with one -of those suffixes will attempt to load the matched file as a dynamic -module. @xref{Modules}, for more details. - -@node Search Path -@section Searching for include files - -@cindex search path for included files -@cindex included files, search path for -@cindex GNU extensions -GNU @code{m4} allows included files to be found in directories -other than the current working directory. - -@cindex @env{M4PATH} -If the @option{--prepend-include} or @option{-B} command-line option was -provided (@pxref{Preprocessor features, , Invoking m4}), those -directories are searched first, in the reverse of the order in which -those options were listed on the command line. Then @code{m4} looks in the current working -directory.
Next come the directories specified with the -@option{--include} or @option{-I} option, in the order found on the -command line. Finally, if the @env{M4PATH} environment variable is set, -it is expected to contain a colon-separated list of directories, which -will be searched in order. - -If the automatic search for include files causes trouble, the @samp{p} -debug flag (@pxref{Debugmode}) can help isolate the problem. - -@node Diversions -@chapter Diverting and undiverting output - -@cindex deferring output -Diversions are a way of temporarily saving output. The output of -@code{m4} can at any time be diverted to a temporary file, and be -reinserted into the output stream, @dfn{undiverted}, again at a later -time. - -@cindex @env{TMPDIR} -Numbered diversions are counted from 0 upwards, diversion number 0 -being the normal output stream. GNU -@code{m4} tries to keep diversions in memory. However, there is a -limit to the overall memory usable by all diversions taken together -(512K, currently). When this maximum is about to be exceeded, -a temporary file is opened to receive the contents of the biggest -diversion still in memory, freeing this memory for other diversions. -When creating the temporary file, @code{m4} honors the value of the -environment variable @env{TMPDIR}, and falls back to @file{/tmp}. -Thus, the amount of available disk space provides the only real limit on -the number and aggregate size of diversions. - -Diversions make it possible to generate output in a different order than -the input was read. This makes it possible to implement topological -sorting of dependencies. For example, GNU Autoconf makes use of -diversions under the hood to ensure that the expansion of a prerequisite -macro appears in the output prior to the expansion of a dependent macro, -regardless of the order in which the two macros were invoked in the user's -input file.
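-
-As a minimal sketch of such reordering, which relies only on
-@code{divert} (@pxref{Divert}) and on the automatic undiversion at end of
-input, the text below is written body first and header second, yet comes
-out in numerical order of the diversions. The sample sentences are purely
-illustrative.
-
-@example
-dnl Collect the body in diversion 2 before the header exists.
-divert(`2')dnl
-This line was written first.
-dnl Collect the header in diversion 1.
-divert(`1')dnl
-This line was written second.
-dnl Return to the normal output stream.
-divert`'dnl
-^D
-@result{}This line was written second.
-@result{}This line was written first.
-@end example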
- -@menu -* Divert:: Diverting output -* Undivert:: Undiverting output -* Divnum:: Diversion numbers -* Cleardivert:: Discarding diverted text -@end menu - -@node Divert -@section Diverting output - -@cindex diverting output to files -@cindex output, diverting to files -@cindex files, diverting output to -Output is diverted using @code{divert}: - -@deffn {Builtin (m4)} divert (@dvar{number, 0}, @ovar{text}) -The current diversion is changed to @var{number}. If @var{number} is left -out or empty, it is assumed to be zero. If @var{number} cannot be -parsed, the diversion is unchanged. - -@cindex GNU extensions -As a GNU extension, if optional @var{text} is supplied and -@var{number} was valid, then @var{text} is immediately output to the -new diversion, regardless of whether the expansion of @code{divert} -occurred while collecting arguments for another macro. - -The expansion of @code{divert} is void. -@end deffn - -When all the @code{m4} input has been processed, all existing -diversions are automatically undiverted, in numerical order. - -@example -divert(`1') -This text is diverted. -divert -@result{} -This text is not diverted. -@result{}This text is not diverted. -^D -@result{} -@result{}This text is diverted. -@end example - -Several calls of @code{divert} with the same argument do not overwrite -the previous diverted text, but append to it. Diversions are printed -after any wrapped text is expanded. - -@example -define(`text', `TEXT') -@result{} -divert(`1')`diverted text.' -divert -@result{} -m4wrap(`Wrapped text precedes ') -@result{} -^D -@result{}Wrapped TEXT precedes diverted text. -@end example - -@cindex discarding input -@cindex input, discarding -If output is diverted to a negative diversion, it is simply discarded. -This can be used to suppress unwanted output. A common example of -unwanted output is the trailing newlines after macro definitions. Here -is a common programming idiom in @code{m4} for avoiding them.
- -@example -divert(`-1') -define(`foo', `Macro `foo'.') -define(`bar', `Macro `bar'.') -divert -@result{} -@end example - -@cindex GNU extensions -Traditional implementations only supported ten diversions. But as a -GNU extension, diversion numbers can be as large as positive -integers will allow, rather than treating a multi-digit diversion number -as a request to discard text. - -@example -divert(eval(`1<<28'))world -divert(`2')hello -^D -@result{}hello -@result{}world -@end example - -The ability to immediately output extra text is a GNU -extension, but it can prove useful for ensuring that text goes to a -particular diversion no matter how many pending macro expansions are in -progress. For a demonstration of why this is useful, it is important to -understand in the example below why @samp{one} is output in diversion 2, -not diversion 1, while @samp{three} and @samp{five} both end up in the -correctly numbered diversion. The key point is that when @code{divert} -is executed unquoted as part of the argument collection of another -macro, the side effect takes place immediately, but the text @samp{one} -is not passed to any diversion until after the @samp{divert(`2')} and -the enclosing @code{echo} have also taken place. The example with -@samp{three} shows how following the quoting rule of thumb delays the -invocation of @code{divert} until it is not nested in any argument -collection context, while the example with @samp{five} shows the use of -the optional argument to speed up the output process. - -@example -define(`echo', `$1') -@result{} -echo(divert(`1')`one'divert(`2'))`'dnl -echo(`divert(`3')three`'divert(`4')')`'dnl -echo(divert(`5', `five')divert(`6'))`'dnl -divert -@result{} -undivert(`1') -@result{} -undivert(`2') -@result{}one -undivert(`3') -@result{}three -undivert(`4') -@result{} -undivert(`5') -@result{}five -undivert(`6') -@result{} -@end example - -Note that @code{divert} is an English word, but also an active macro -without arguments. 
When processing plain text, the word might appear in -normal text and be unintentionally swallowed as a macro invocation. One -way to avoid this is to use the @option{-P} option to rename all -builtins (@pxref{Operation modes, , Invoking m4}). Another is to write -a wrapper that requires a parameter to be recognized. - -@example -We decided to divert the stream for irrigation. -@result{}We decided to the stream for irrigation. -define(`divert', `ifelse(`$#', `0', ``$0'', `builtin(`$0', $@@)')') -@result{} -divert(`-1') -Ignored text. -divert(`0') -@result{} -We decided to divert the stream for irrigation. -@result{}We decided to divert the stream for irrigation. -@end example - -@node Undivert -@section Undiverting output - -Diverted text can be undiverted explicitly using the builtin -@code{undivert}: - -@deffn {Builtin (m4)} undivert (@ovar{diversions@dots{}}) -Undiverts the numeric @var{diversions} given by the arguments, in the -order given. If no arguments are supplied, all diversions are -undiverted, in numerical order. - -@cindex file inclusion -@cindex inclusion, of files -@cindex GNU extensions -As a GNU extension, @var{diversions} may contain non-numeric -strings, which are treated as the names of files to copy into the output -without expansion. A warning is issued if a file could not be opened. - -The expansion of @code{undivert} is void. -@end deffn - -@example -divert(`1') -This text is diverted. -divert -@result{} -This text is not diverted. -@result{}This text is not diverted. -undivert(`1') -@result{} -@result{}This text is diverted. -@result{} -@end example - -Notice the last two blank lines. One of them comes from the newline -following @code{undivert}, the other from the newline that followed the -@code{divert}! A diversion often starts with a blank line like this. 
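-
-One way to avoid the leading blank line, sketched here on the assumption
-that the newline after each call is not wanted in the output, is to end
-the calls with @code{dnl} so that the rest of the input line is
-discarded:
-
-@example
-dnl Each trailing dnl swallows the newline that follows the call.
-divert(`1')dnl
-This text is diverted.
-divert`'dnl
-undivert(`1')dnl
-@result{}This text is diverted.
-@end example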
- -When diverted text is undiverted, it is @emph{not} reread by @code{m4}, -but rather copied directly to the current output, and it is therefore -not an error to undivert into a diversion. Undiverting the empty string -is the same as specifying diversion 0; in either case nothing happens -since the output has already been flushed. - -@example -divert(`1')diverted text -divert -@result{} -undivert() -@result{} -undivert(`0') -@result{} -undivert -@result{}diverted text -@result{} -divert(`1')more -divert(`2')undivert(`1')diverted text`'divert -@result{} -undivert(`1') -@result{} -undivert(`2') -@result{}more -@result{}diverted text -@end example - -When a diversion has been undiverted, the diverted text is discarded, -and it is not possible to bring back diverted text more than once. - -@example -divert(`1') -This text is diverted first. -divert(`0')undivert(`1')dnl -@result{} -@result{}This text is diverted first. -undivert(`1') -@result{} -divert(`1') -This text is also diverted but not appended. -divert(`0')undivert(`1')dnl -@result{} -@result{}This text is also diverted but not appended. -@end example - -Attempts to undivert the current diversion are silently ignored. Thus, -when the current diversion is not 0, the current diversion does not get -rearranged among the other diversions. - -@example -divert(`1')one -divert(`2')two -divert(`3')three -divert(`4')four -divert(`5')five -divert(`2')undivert(`5', `2', `4')dnl -undivert`'dnl effectively undivert(`1', `2', `3', `4', `5') -divert`'undivert`'dnl -@result{}two -@result{}five -@result{}four -@result{}one -@result{}three -@end example - -@cindex GNU extensions -@cindex file inclusion -@cindex inclusion, of files -GNU @code{m4} allows named files to be undiverted. Given a -non-numeric argument, the contents of the file named will be copied, -uninterpreted, to the current output. This complements the builtin -@code{include} (@pxref{Include}). 
To illustrate the difference, assume -the file @file{foo} contains: - -@comment file: foo -@example -$ @kbd{cat foo} -bar -@end example - -@noindent -then - -@example -define(`bar', `BAR') -@result{} -undivert(`foo') -@result{}bar -@result{} -include(`foo') -@result{}BAR -@result{} -@end example - -If the file is not found (or cannot be read), an error message is -issued, and the expansion is void. It is possible to intermix files -and diversion numbers. - -@example -divert(`1')diversion one -divert(`2')undivert(`foo')dnl -divert(`3')diversion three -divert`'dnl -undivert(`1', `2', `foo', `3')dnl -@result{}diversion one -@result{}bar -@result{}bar -@result{}diversion three -@end example - -@node Divnum -@section Diversion numbers - -@cindex diversion numbers -The current diversion is tracked by the builtin @code{divnum}: - -@deffn {Builtin (m4)} divnum -Expands to the number of the current diversion. -@end deffn - -@example -Initial divnum -@result{}Initial 0 -divert(`1') -Diversion one: divnum -divert(`2') -Diversion two: divnum -^D -@result{} -@result{}Diversion one: 1 -@result{} -@result{}Diversion two: 2 -@end example - -@node Cleardivert -@section Discarding diverted text - -@cindex discarding diverted text -@cindex diverted text, discarding -Often it is not known, when output is diverted, whether the diverted -text is actually needed. Since all non-empty diversions are brought back -to the main output stream when the end of input is seen, a method of -discarding a diversion is needed. If all diversions should be -discarded, the easiest way is to end the input to @code{m4} with -@samp{divert(`-1')} followed by an explicit @samp{undivert}: - -@example -divert(`1') -Diversion one: divnum -divert(`2') -Diversion two: divnum -divert(`-1') -undivert -^D -@end example - -@noindent -No output is produced at all.
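-
-The same approach can discard a single diversion while keeping the rest,
-since text undiverted while the current diversion is negative is thrown
-away. The following is a minimal sketch in which only diversion 1 is
-dropped; the sample sentences are purely illustrative.
-
-@example
-divert(`1')dnl
-Draft text, not wanted after all.
-divert(`2')dnl
-Final text.
-dnl Undivert diversion 1 while output is being discarded.
-divert(`-1')undivert(`1')divert`'dnl
-^D
-@result{}Final text.
-@end example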
- -Clearing selected diversions can be done with the following macro: - -@deffn Composite cleardivert (@ovar{diversions@dots{}}) -Discard the contents of each of the listed numeric @var{diversions}. -@end deffn - -@example -define(`cleardivert', -`pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')') -@result{} -@end example - -It is called just like @code{undivert}, but the effect is to clear the -diversions, given by the arguments. (This macro has a nasty bug! You -should try to see if you can find it and correct it; or @pxref{Improved -cleardivert, , Answers}). - -@node Modules -@chapter Extending M4 with dynamic runtime modules - -@cindex modules -@cindex dynamic modules -@cindex loadable modules -GNU M4 1.4.x had a monolithic architecture. All of its -functionality was contained in a single binary, and additional macros -could be added only by writing more code in the M4 language, or at the -extreme by hacking the sources and recompiling the whole thing to make -a custom M4 installation. - -Starting with release 2.0, M4 supports and is composed of loadable modules. -Additional modules can be loaded into the running M4 interpreter as it is -started up at the command line, or during normal expansion of macros. This -facilitates runtime extension of the M4 builtin macro list using compiled C -code linked against a new shared library, typically named @file{libm4.so}. - -For example, you might want to add a @code{setenv} builtin to M4, to -use before invoking @code{esyscmd}. 
We might write a @file{setenv.c} -something like this: - -@comment ignore -@example -#include "m4module.h" - -M4BUILTIN(setenv); - -m4_builtin m4_builtin_table[] = -@{ - /* name handler flags minargs maxargs */ - @{ "setenv", builtin_setenv, M4_BUILTIN_BLIND, 2, 3 @}, - - @{ NULL, NULL, 0, 0, 0 @} -@}; - -/** - * setenv(NAME, VALUE, [OVERWRITE]) - **/ -M4BUILTIN_HANDLER (setenv) -@{ - int overwrite = 1; - - if (argc >= 4) - if (!m4_numeric_arg (context, argc, argv, 3, &overwrite)) - return; - - setenv (M4ARG (1), M4ARG (2), overwrite); -@} -@end example - -Then, having compiled and linked the module, in (somewhat contrived) -M4 code: - -@comment ignore -@example -$ @kbd{m4 setenv} -setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin') -@result{} -esyscmd(`ifconfig -a')dnl -@result{}@dots{} -@end example - -Or instead of loading the module from the M4 invocation, you can use -the @code{include} builtin: - -@comment ignore -@example -$ @kbd{m4} -include(`setenv') -@result{} -setenv(`PATH', `/sbin:/bin:/usr/sbin:/usr/bin') -@result{} -@end example - -Also, at run time, you can choose which core modules to load. SUSv3 M4 -functionality is contained in the module @samp{m4}, GNU extensions in the -module @samp{gnu}, and so on. All of the builtin descriptions in this manual -are annotated with the module from which they are loaded -- mostly from the -module @samp{m4}. - -When you start GNU M4, the modules @samp{m4} and @samp{gnu} are -loaded by default. If you supply the @option{-G} option at startup, the -module @samp{traditional} is loaded instead of @samp{gnu}. -@xref{Compatibility}, for more details on the differences between these -two modes of startup. - -@menu -* M4modules:: Listing loaded modules -* Standard Modules:: Standard bundled modules -@end menu - -@node M4modules -@section Listing loaded modules - -@deffn {Builtin (gnu)} m4modules -Expands to a quoted ordered list of currently loaded modules, -with the most recently loaded module at the front of the list. 
Loading -a module multiple times will not affect the order of this list; the -position depends on when the module was @emph{first} loaded. -@end deffn - -For example, after GNU @code{m4} is started with no additional modules, -@code{m4modules} will yield the following: - -@example -$ @kbd{m4} -m4modules -@result{}gnu,m4 -@end example - -@node Standard Modules -@section Standard bundled modules - -GNU @code{m4} ships with several bundled modules as standard. -By convention, these modules define a text macro that can be tested -with @code{ifdef} when they are loaded; only the @code{m4} module lacks -this feature test macro, since it is not permitted by POSIX. -Each of the feature test macros is intended to be used without -arguments. - -@table @code -@item m4 -Provides all of the builtins defined by POSIX. This module -is always loaded --- GNU @code{m4} would only be a very slow -version of @command{cat} without the builtins supplied by this module. - -@item gnu -Provides all of the GNU extensions, as defined by -GNU M4 through the 1.4.x release series. It also provides a -couple of feature test macros: - -@deffn {Macro (gnu)} __gnu__ -Expands to the empty string, as an indication that the @samp{gnu} -module is loaded. -@end deffn - -@deffn {Macro (gnu)} __m4_version__ -Expands to an unquoted string containing the release version number of -the running GNU @code{m4} executable. -@end deffn - -This module is always loaded, unless the @option{-G} command line -option is supplied at startup (@pxref{Limits control, , Invoking m4}). - -@item traditional -This module provides compatibility with System V @code{m4}, for anything -not specified by POSIX, and is loaded instead of the -@samp{gnu} module if the @option{-G} command line option is specified. - -@deffn {Macro (traditional)} __traditional__ -Expands to the empty string, as an indication that the -@samp{traditional} module is loaded.
-@end deffn - -@item mpeval -This module provides the implementation for the experimental -@code{mpeval} feature. If the host machine does not have the -GNU gmp library, the builtin will generate an error if called. -@xref{Mpeval}, for more details. The module also defines the following -macro: - -@deffn {Macro (mpeval)} __mpeval__ -Expands to the empty string, as an indication that the @samp{mpeval} -module is loaded. -@end deffn -@end table - -Here is an example of using the feature test macros. - -@example -$ @kbd{m4} -__gnu__-__traditional__ -@result{}-__traditional__ -ifdef(`__gnu__', `Extensions are active', `Minimal features') -@result{}Extensions are active -__gnu__(`ignored') -@error{}m4:stdin:3: warning: __gnu__: extra arguments ignored: 1 > 0 -@result{} -@end example - -@comment options: -G -@example -$ @kbd{m4 --traditional} -__gnu__-__traditional__ -@result{}__gnu__- -ifdef(`__gnu__', `Extensions are active', `Minimal features') -@result{}Minimal features -@end example - -Since the version string is unquoted and can potentially contain macro -names (for example, a beta release could be numbered @samp{1.9b}), or be -impacted by the use of @code{changesyntax}), the -@code{__m4_version__} macro should generally be used via @code{defn} -rather than directly invoked (@pxref{Defn}). In general, feature tests -are more reliable than version number checks, so exercise caution when -using this macro. - -@comment This test is excluded from the testsuite since it depends on a -@comment texinfo macro; but builtins.at covers the same thing. -@comment ignore -@example -defn(`__m4_version__') -@result{}@value{VERSION} -@end example - -@node Text handling -@chapter Macros for text handling - -There are a number of builtins in @code{m4} for manipulating text in -various ways, extracting substrings, searching, substituting, and so on. 
- -@menu -* Len:: Calculating length of strings -* Index macro:: Searching for substrings -* Regexp:: Searching for regular expressions -* Substr:: Extracting substrings -* Translit:: Translating characters -* Patsubst:: Substituting text by regular expression -* Format:: Formatting strings (printf-like) -@end menu - -@node Len -@section Calculating length of strings - -@cindex length of strings -@cindex strings, length of -The length of a string can be calculated by @code{len}: - -@deffn {Builtin (m4)} len (@var{string}) -Expands to the length of @var{string}, as a decimal number. - -The macro @code{len} is recognized only with parameters. -@end deffn - -@example -len() -@result{}0 -len(`abcdef') -@result{}6 -@end example - -@node Index macro -@section Searching for substrings - -@cindex substrings, locating -Searching for substrings is done with @code{index}: - -@deffn {Builtin (m4)} index (@var{string}, @var{substring}, @ovar{offset}) -Expands to the index of the first occurrence of @var{substring} in -@var{string}. The first character in @var{string} has index 0. If -@var{substring} does not occur in @var{string}, @code{index} expands to -@samp{-1}. If @var{offset} is provided, it determines the index at -which the search starts; a negative @var{offset} specifies the offset -relative to the end of @var{string}. - -The macro @code{index} is recognized only with parameters. -@end deffn - -@example -index(`gnus, gnats, and armadillos', `nat') -@result{}7 -index(`gnus, gnats, and armadillos', `dag') -@result{}-1 -@end example - -Omitting @var{substring} evokes a warning, but still produces output; -contrast this with an empty @var{substring}. - -@example -index(`abc') -@error{}m4:stdin:1: warning: index: too few arguments: 1 < 2 -@result{}0 -index(`abc', `') -@result{}0 -index(`abc', `b') -@result{}1 -@end example - -@cindex GNU extensions -As an extension, an @var{offset} can be provided to limit the search to -the tail of the @var{string}. 
A negative offset is interpreted relative -to the end of @var{string}, and it is not an error if @var{offset} -exceeds the bounds of @var{string}. - -@example -index(`aba', `a', `1') -@result{}2 -index(`ababa', `ba', `-3') -@result{}3 -index(`abc', `ab', `4') -@result{}-1 -index(`abc', `bc', `-4') -@result{}1 -@end example - -@ignore -@comment Expose a bug in the strstr() algorithm present in glibc -@comment 2.9 through 2.12 and in gnulib up to Sep 2010. - -@example -index(`;:11-:12-:12-:12-:12-:12-:12-:12-:12.:12.:12.:12.:12.:12.:12.:12.:12-:', -`:12-:12-:12-:12-:12-:12-:12-:12-') -@result{}-1 -@end example - -@comment Expose a bug in the gnulib replacement strstr() algorithm -@comment present from Jun 2010 to Feb 2011, including m4 1.4.15. - -@example -index(`..wi.d.', `.d.') -@result{}4 -@end example -@end ignore - -@node Regexp -@section Searching for regular expressions - -@cindex regular expressions -@cindex expressions, regular -@cindex GNU extensions -Searching for regular expressions is done with the builtin -@code{regexp}: - -@deffn {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @var{resyntax}) -@deffnx {Builtin (gnu)} regexp (@var{string}, @var{regexp}, @ - @ovar{replacement}, @ovar{resyntax}) -Searches for @var{regexp} in @var{string}. - -If @var{resyntax} is given, the particular flavor of regular expression -understood with respect to @var{regexp} can be changed from the current -default. @xref{Changeresyntax}, for details of the values that can be -given for this argument. If exactly three arguments are given, then the -third argument is treated as @var{resyntax} only if it matches a known -syntax name; otherwise it is treated as @var{replacement}. - -If @var{replacement} is omitted, @code{regexp} expands to the index of -the first match of @var{regexp} in @var{string}. If @var{regexp} does -not match anywhere in @var{string}, it expands to -1.
- -If @var{replacement} is supplied, and there was a match, @code{regexp} -changes the expansion to this argument, with @samp{\@var{n}} substituted -by the text matched by the @var{n}th parenthesized sub-expression of -@var{regexp}, up to nine sub-expressions. The escape @samp{\&} is -replaced by the text of the entire regular expression matched. For -all other characters, @samp{\} treats the next character literally. A -warning is issued if there were fewer sub-expressions than the -@samp{\@var{n}} requested, or if there is a trailing @samp{\}. If there -was no match, @code{regexp} expands to the empty string. - -The macro @code{regexp} is recognized only with parameters. -@end deffn - -@example -regexp(`GNUs not Unix', `\<[a-z]\w+') -@result{}5 -regexp(`GNUs not Unix', `\<Q\w*') -@result{}-1 -regexp(`GNUs not Unix', `\w\(\w+\)$', `*** \& *** \1 ***') -@result{}*** Unix *** nix *** -regexp(`GNUs not Unix', `\<Q\w*', `*** \& *** \1 ***') -@result{} -@end example - -Here are some more examples on the handling of backslash: - -@example -regexp(`abc', `\(b\)', `\\\10\a') -@result{}\b0a -regexp(`abc', `b', `\1\') -@error{}m4:stdin:2: warning: regexp: sub-expression 1 not present -@error{}m4:stdin:2: warning: regexp: trailing \ ignored in replacement -@result{} -regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6') -@error{}m4:stdin:3: warning: regexp: sub-expression 4 not present -@error{}m4:stdin:3: warning: regexp: sub-expression 5 not present -@error{}m4:stdin:3: warning: regexp: sub-expression 6 not present -@result{}c -@end example - -Omitting @var{regexp} evokes a warning, but still produces output; -contrast this with an empty @var{regexp} argument. 
- -@example -regexp(`abc') -@error{}m4:stdin:1: warning: regexp: too few arguments: 1 < 2 -@result{}0 -regexp(`abc', `') -@result{}0 -regexp(`abc', `', `\\def') -@result{}\def -@end example - -If @var{resyntax} is given, @var{regexp} must be given according to -the syntax chosen, though the default regular expression syntax -remains unchanged for other invocations: - -@example -regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***', - `POSIX_EXTENDED') -@result{}*** Unix *** nix *** -regexp(`GNUs not Unix', `\w(\w+)$', `*** \& *** \1 ***') -@result{} -@end example - -Occasionally, you might want to pass an @var{resyntax} argument without -wishing to give @var{replacement}. If there are exactly three -arguments, and the last argument is a valid @var{resyntax}, it is used -as such, rather than as a replacement. - -@example -regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED') -@result{}9 -regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `POSIX_EXTENDED') -@result{}POSIX_EXTENDED -regexp(`GNUs not Unix', `\w(\w+)$', `POSIX_EXTENDED', `') -@result{} -regexp(`GNUs not Unix', `\w\(\w+\)$', `POSIX_EXTENDED', `') -@result{}POSIX_EXTENDED -@end example - -@node Substr -@section Extracting substrings - -@cindex extracting substrings -@cindex substrings, extracting -Substrings are extracted with @code{substr}: - -@deffn {Builtin (m4)} substr (@var{string}, @var{from}, @ovar{length}, @ - @ovar{replace}) -Performs a substring operation on @var{string}. If @var{from} is -positive, it represents the 0-based index where the substring begins. -If @var{length} is omitted, the substring ends at the end of -@var{string}; if it is positive, @var{length} is added to the starting -index to determine the ending index. - -@cindex GNU extensions -As a GNU extension, if @var{from} is negative, it is added to -the length of @var{string} to determine the starting index; if it is -empty, the start of the string is used. 
Likewise, if @var{length} is -negative, it is added to the length of @var{string} to determine the -ending index, and an empty @var{length} behaves like an omitted -@var{length}. It is not an error if either of the resulting indices lies -outside the string, but the selected substring only contains the bytes -of @var{string} that overlap the selected indices. If the end point -lies before the beginning point, the substring chosen is the empty -string located at the starting index. - -If @var{replace} is omitted, then the expansion is only the selected -substring, which may be empty. As a GNU extension, if -@var{replace} is provided, then the expansion is the original -@var{string} with the selected substring replaced by @var{replace}. The -expansion is empty and a warning is issued if @var{from} or @var{length} -cannot be parsed, or if @var{replace} is provided but the selected -indices do not overlap with @var{string}. - -The macro @code{substr} is recognized only with parameters. -@end deffn - -@example -substr(`gnus, gnats, and armadillos', `6') -@result{}gnats, and armadillos -substr(`gnus, gnats, and armadillos', `6', `5') -@result{}gnats -@end example - -Omitting @var{from} evokes a warning, but still produces output. On the -other hand, selecting a @var{from} or @var{length} that lies beyond -@var{string} is not a problem. - -@example -substr(`abc') -@error{}m4:stdin:1: warning: substr: too few arguments: 1 < 2 -@result{}abc -substr(`abc', `') -@result{}abc -substr(`abc', `4') -@result{} -substr(`abc', `1', `4') -@result{}bc -@end example - -Negative values for @var{from} or @var{length} are GNU -extensions, useful for accessing a fixed size tail of an -arbitrary-length string. Prior to M4 1.6, using these values would -silently result in the empty string.
Some other implementations crash -on negative values, and many treat an explicitly empty @var{length} as -0, which is different from the omitted @var{length} implying the rest of -the original @var{string}. - -@example -substr(`abcde', `2', `') -@result{}cde -substr(`abcde', `-3') -@result{}cde -substr(`abcde', `', `-3') -@result{}ab -substr(`abcde', `-6') -@result{}abcde -substr(`abcde', `-6', `5') -@result{}abcd -substr(`abcde', `-7', `1') -@result{} -substr(`abcde', `1', `-2') -@result{}bc -substr(`abcde', `-4', `-1') -@result{}bcd -substr(`abcde', `4', `-3') -@result{} -substr(`abcdefghij', `-09', `08') -@result{}bcdefghi -@end example - -Another useful GNU extension, also added in M4 1.6, is the -ability to replace a substring within the original @var{string}. An -empty length substring at the beginning or end of @var{string} is valid, -but selecting a substring that does not overlap @var{string} causes a -warning. - -@example -substr(`abcde', `1', `3', `t') -@result{}ate -substr(`abcde', `5', `', `f') -@result{}abcdef -substr(`abcde', `-3', `-4', `f') -@result{}abfcde -substr(`abcde', `-6', `1', `f') -@result{}fabcde -substr(`abcde', `-7', `1', `f') -@error{}m4:stdin:5: warning: substr: substring out of range -@result{} -substr(`abcde', `6', `', `f') -@error{}m4:stdin:6: warning: substr: substring out of range -@result{} -@end example - -If backward compatibility with M4 1.4.x behavior is necessary, the -following macro is sufficient to do the job (mimicking warnings about -empty @var{from} or @var{length} or an ignored fourth argument is left -as an exercise to the reader).
- -@example -define(`substr', `ifelse(`$#', `0', ``$0'', - eval(`2 < $#')`$3', `1', `', - index(`$2$3', `-'), `-1', `builtin(`$0', `$1', `$2', `$3')')') -@result{} -substr(`abcde', `3') -@result{}de -substr(`abcde', `3', `') -@result{} -substr(`abcde', `-1') -@result{} -substr(`abcde', `1', `-1') -@result{} -substr(`abcde', `2', `1', `C') -@result{}c -@end example - -On the other hand, it is possible to portably emulate the GNU -extension of negative @var{from} and @var{length} arguments across all -@code{m4} implementations, albeit with a lot more overhead. This -example uses @code{incr} and @code{decr} to normalize @samp{-08} to -something that a later @code{eval} will treat as a decimal value, rather -than looking like an invalid octal number, while avoiding using these -macros on an empty string. The helper macro @code{_substr_normalize} is -recursive, since it is easier to fix @var{length} after @var{from} has -been normalized, with the final iteration supplying two non-negative -arguments to the original builtin, now named @code{_substr}. 
- -@comment options: -daq -t_substr -@example -$ @kbd{m4 -daq -t _substr} -define(`_substr', defn(`substr'))dnl -define(`substr', `ifelse(`$#', `0', ``$0'', - `_$0(`$1', _$0_normalize(len(`$1'), - ifelse(`$2', `', `0', `incr(decr(`$2'))'), - ifelse(`$3', `', `', `incr(decr(`$3'))')))')')dnl -define(`_substr_normalize', `ifelse( - eval(`$2 < 0 && $1 + $2 >= 0'), `1', - `$0(`$1', eval(`$1 + $2'), `$3')', - eval(`$2 < 0')`$3', `1', ``0', `$1'', - eval(`$2 < 0 && $3 - 0 >= 0 && $1 + $2 + $3 - 0 >= 0'), `1', - `$0(`$1', `0', eval(`$1 + $2 + $3 - 0'))', - eval(`$2 < 0 && $3 - 0 >= 0'), `1', ``0', `0'', - eval(`$2 < 0'), `1', `$0(`$1', `0', `$3')', - `$3', `', ``$2', `$1'', - eval(`$3 - 0 < 0 && $1 - $2 + $3 - 0 >= 0'), `1', - ``$2', eval(`$1 - $2 + $3')', - eval(`$3 - 0 < 0'), `1', ``$2', `0'', - ``$2', `$3'')')dnl -substr(`abcde', `2', `') -@error{}m4trace: -1- _substr(`abcde', `2', `5') -@result{}cde -substr(`abcde', `-3') -@error{}m4trace: -1- _substr(`abcde', `2', `5') -@result{}cde -substr(`abcde', `', `-3') -@error{}m4trace: -1- _substr(`abcde', `0', `2') -@result{}ab -substr(`abcde', `-6') -@error{}m4trace: -1- _substr(`abcde', `0', `5') -@result{}abcde -substr(`abcde', `-6', `5') -@error{}m4trace: -1- _substr(`abcde', `0', `4') -@result{}abcd -substr(`abcde', `-7', `1') -@error{}m4trace: -1- _substr(`abcde', `0', `0') -@result{} -substr(`abcde', `1', `-2') -@error{}m4trace: -1- _substr(`abcde', `1', `2') -@result{}bc -substr(`abcde', `-4', `-1') -@error{}m4trace: -1- _substr(`abcde', `1', `3') -@result{}bcd -substr(`abcde', `4', `-3') -@error{}m4trace: -1- _substr(`abcde', `4', `0') -@result{} -substr(`abcdefghij', `-09', `08') -@error{}m4trace: -1- _substr(`abcdefghij', `1', `8') -@result{}bcdefghi -@end example - -@node Translit -@section Translating characters - -@cindex translating characters -@cindex characters, translating -Character translation is done with @code{translit}: - -@deffn {Builtin (m4)} translit (@var{string}, @var{chars}, @ovar{replacement}) 
-Expands to @var{string}, with each character that occurs in -@var{chars} translated into the character from @var{replacement} with -the same index. - -If @var{replacement} is shorter than @var{chars}, the excess characters -of @var{chars} are deleted from the expansion; if @var{chars} is -shorter, the excess characters in @var{replacement} are silently -ignored. If @var{replacement} is omitted, all characters in -@var{string} that are present in @var{chars} are deleted from the -expansion. If a character appears more than once in @var{chars}, only -the first instance is used in making the translation. Only a single -translation pass is made, even if characters in @var{replacement} also -appear in @var{chars}. - -As a GNU extension, both @var{chars} and @var{replacement} can -contain character-ranges, e.g., @samp{a-z} (meaning all lowercase -letters) or @samp{0-9} (meaning all digits). To include a dash @samp{-} -in @var{chars} or @var{replacement}, place it first or last in the -entire string, or as the last character of a range. Back-to-back ranges -can share a common endpoint. It is not an error for the last character -in the range to be `larger' than the first. In that case, the range -runs backwards, i.e., @samp{9-0} means the string @samp{9876543210}. -The expansion of a range is dependent on the underlying encoding of -characters, so using ranges is not always portable between machines. - -The macro @code{translit} is recognized only with parameters. 
-@end deffn - -@example -translit(`GNUs not Unix', `A-Z') -@result{}s not nix -translit(`GNUs not Unix', `a-z', `A-Z') -@result{}GNUS NOT UNIX -translit(`GNUs not Unix', `A-Z', `z-a') -@result{}tmfs not fnix -translit(`+,-12345', `+--1-5', `<;>a-c-a') -@result{}<;>abcba -translit(`abcdef', `aabdef', `bcged') -@result{}bgced -@end example - -In the @sc{ascii} encoding, the first example deletes all uppercase -letters, the second converts lowercase to uppercase, and the third -`mirrors' all uppercase letters, while converting them to lowercase. -The first two cases are by far the most common, even though they are not -portable to @sc{ebcdic} or other encodings. The fourth example shows a -range ending in @samp{-}, as well as back-to-back ranges. The final -example shows that @samp{a} is mapped to @samp{b}, not @samp{c}; the -resulting @samp{b} is not further remapped to @samp{g}; the @samp{d} and -@samp{e} are swapped, and the @samp{f} is discarded. - -Omitting @var{chars} evokes a warning, but still produces output. - -@example -translit(`abc') -@error{}m4:stdin:1: warning: translit: too few arguments: 1 < 2 -@result{}abc -@end example - -@node Patsubst -@section Substituting text by regular expression - -@cindex regular expressions -@cindex expressions, regular -@cindex pattern substitution -@cindex substitution by regular expression -@cindex GNU extensions -Global substitution in a string is done by @code{patsubst}: - -@deffn {Builtin (gnu)} patsubst (@var{string}, @var{regexp}, @ - @ovar{replacement}, @ovar{resyntax}) -Searches @var{string} for matches of @var{regexp}, and substitutes -@var{replacement} for each match. - -If @var{resyntax} is given, the particular flavor of regular expression -understood with respect to @var{regexp} can be changed from the current -default. @xref{Changeresyntax}, for details of the values that can be -given for this argument.
Unlike @code{regexp}, if exactly three -arguments are given, the third argument is always treated as -@var{replacement}, even if it matches a known syntax name. - -The parts of @var{string} that are not covered by any match of -@var{regexp} are copied to the expansion. Whenever a match is found, the -search proceeds from the end of the match, so a character from -@var{string} will never be substituted twice. If @var{regexp} matches a -string of zero length, the start position for the search is incremented, -to avoid infinite loops. - -When a replacement is to be made, @var{replacement} is inserted into -the expansion, with @samp{\@var{n}} substituted by the text matched by -the @var{n}th parenthesized sub-expression of @var{regexp}, for up to -nine sub-expressions. The escape @samp{\&} is replaced by the text of -the entire regular expression matched. For all other characters, -@samp{\} treats the next character literally. A warning is issued if -there were fewer sub-expressions than the @samp{\@var{n}} requested, or -if there is a trailing @samp{\}. - -The @var{replacement} argument can be omitted, in which case the text -matched by @var{regexp} is deleted. - -The macro @code{patsubst} is recognized only with parameters.
-@end deffn - -When used with two arguments, @code{regexp} returns the position of the -match, but @code{patsubst} deletes the match: - -@example -patsubst(`GNUs not Unix', `^', `OBS: ') -@result{}OBS: GNUs not Unix -patsubst(`GNUs not Unix', `\<', `OBS: ') -@result{}OBS: GNUs OBS: not OBS: Unix -patsubst(`GNUs not Unix', `\w*', `(\&)') -@result{}(GNUs)() (not)() (Unix)() -patsubst(`GNUs not Unix', `\w+', `(\&)') -@result{}(GNUs) (not) (Unix) -patsubst(`GNUs not Unix', `[A-Z][a-z]+') -@result{}GN not@w{ } -patsubst(`GNUs not Unix', `not', `NOT\') -@error{}m4:stdin:6: warning: patsubst: trailing \ ignored in replacement -@result{}GNUs NOT Unix -@end example - -Here is a slightly more realistic example, which capitalizes individual -words or whole sentences, by substituting calls of the macros -@code{upcase} and @code{downcase} into the strings. - -@deffn Composite upcase (@var{text}) -@deffnx Composite downcase (@var{text}) -@deffnx Composite capitalize (@var{text}) -Expand to @var{text}, but with capitalization changed: @code{upcase} -changes all letters to upper case, @code{downcase} changes all letters -to lower case, and @code{capitalize} changes the first character of each -word to upper case and the remaining characters to lower case. -@end deffn - -First, an example of their usage, using implementations distributed in -@file{m4-@value{VERSION}/@/doc/examples/@/capitalize.m4}. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`capitalize.m4') -@result{} -upcase(`GNUs not Unix') -@result{}GNUS NOT UNIX -downcase(`GNUs not Unix') -@result{}gnus not unix -capitalize(`GNUs not Unix') -@result{}Gnus Not Unix -@end example - -Now for the implementation. There is a helper macro @code{_capitalize} -which puts only its first word in mixed case. Then @code{capitalize} -merely parses out the words, and replaces them with an invocation of -@code{_capitalize}. (As presented here, the @code{capitalize} macro has -some subtle flaws. 
You should try to see if you can find and correct -them; or @pxref{Improved capitalize, , Answers}). - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`capitalize.m4')dnl -@result{}divert(`-1') -@result{}# upcase(text) -@result{}# downcase(text) -@result{}# capitalize(text) -@result{}# change case of text, simple version -@result{}define(`upcase', `translit(`$*', `a-z', `A-Z')') -@result{}define(`downcase', `translit(`$*', `A-Z', `a-z')') -@result{}define(`_capitalize', -@result{} `regexp(`$1', `^\(\w\)\(\w*\)', -@result{} `upcase(`\1')`'downcase(`\2')')') -@result{}define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')') -@result{}divert`'dnl -@end example - -If @var{resyntax} is given, @var{regexp} must be given according to -the syntax chosen, though the default regular expression syntax -remains unchanged for other invocations: - -@example -define(`epatsubst', - `builtin(`patsubst', `$1', `$2', `$3', `POSIX_EXTENDED')')dnl -epatsubst(`bar foo baz Foo', `(\w*) (foo|Foo)', `_\1_') -@result{}_bar_ _baz_ -patsubst(`bar foo baz Foo', `\(\w*\) \(foo\|Foo\)', `_\1_') -@result{}_bar_ _baz_ -@end example - -While @code{regexp} replaces the whole input with the replacement as -soon as there is a match, @code{patsubst} replaces each -@emph{occurrence} of a match and preserves non-matching pieces: - -@example -define(`patreg', -`patsubst($@@) -regexp($@@)')dnl -patreg(`bar foo baz Foo', `foo\|Foo', `FOO') -@result{}bar FOO baz FOO -@result{}FOO -patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2') -@result{}bab abb 212 -@result{}bab -@end example - -Omitting @var{regexp} evokes a warning, but still produces output; -contrast this with an empty @var{regexp} argument. 
- -@example -patsubst(`abc') -@error{}m4:stdin:1: warning: patsubst: too few arguments: 1 < 2 -@result{}abc -patsubst(`abc', `') -@result{}abc -patsubst(`abc', `', `\\-') -@result{}\-a\-b\-c\- -@end example - -@node Format -@section Formatting strings (printf-like) - -@cindex formatted output -@cindex output, formatted -@cindex GNU extensions -Formatted output can be made with @code{format}: - -@deffn {Builtin (gnu)} format (@var{format-string}, @dots{}) -Works much like the C function @code{printf}. The first argument -@var{format-string} can contain @samp{%} specifications which are -satisfied by additional arguments, and the expansion of @code{format} is -the formatted string. - -The macro @code{format} is recognized only with parameters. -@end deffn - -Its use is best described by a few examples: - -@comment This test is a bit fragile, if someone tries to port to a -@comment platform without infinity. -@example -define(`foo', `The brown fox jumped over the lazy dog') -@result{} -format(`The string "%s" uses %d characters', foo, len(foo)) -@result{}The string "The brown fox jumped over the lazy dog" uses 38 characters -format(`%*.*d', `-1', `-1', `1') -@result{}1 -format(`%.0f', `56789.9876') -@result{}56790 -len(format(`%-*X', `5000', `1')) -@result{}5000 -ifelse(format(`%010F', `infinity'), ` INF', `success', - format(`%010F', `infinity'), ` INFINITY', `success', - format(`%010F', `infinity')) -@result{}success -ifelse(format(`%.1A', `1.999'), `0X1.0P+1', `success', - format(`%.1A', `1.999'), `0X2.0P+0', `success', - format(`%.1A', `1.999')) -@result{}success -format(`%g', `0xa.P+1') -@result{}20 -@end example - -Using the @code{forloop} macro defined earlier (@pxref{Forloop}), this -example shows how @code{format} can be used to produce tabular output. 
- -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`forloop.m4') -@result{} -forloop(`i', `1', `10', `format(`%6d squared is %10d -', i, eval(i**2))') -@result{} 1 squared is 1 -@result{} 2 squared is 4 -@result{} 3 squared is 9 -@result{} 4 squared is 16 -@result{} 5 squared is 25 -@result{} 6 squared is 36 -@result{} 7 squared is 49 -@result{} 8 squared is 64 -@result{} 9 squared is 81 -@result{} 10 squared is 100 -@result{} -@end example - -The builtin @code{format} is modeled after the ANSI C @samp{printf} -function, and supports these @samp{%} specifiers: @samp{c}, @samp{s}, -@samp{d}, @samp{o}, @samp{x}, @samp{X}, @samp{u}, @samp{a}, @samp{A}, -@samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g}, @samp{G}, and -@samp{%}; it supports field widths and precisions, and the flags -@samp{+}, @samp{-}, @samp{ }, @samp{0}, @samp{#}, and @samp{'}. For -integer specifiers, the width modifiers @samp{hh}, @samp{h}, and -@samp{l} are recognized, and for floating point specifiers, the width -modifier @samp{l} is recognized. Items not yet supported include -positional arguments, the @samp{n}, @samp{p}, @samp{S}, and @samp{C} -specifiers, the @samp{z}, @samp{t}, @samp{j}, @samp{L} and @samp{ll} -modifiers, and any platform extensions available in the native -@code{printf}. For more details on the functioning of @code{printf}, -see the C Library Manual, or the POSIX specification (for -example, @samp{%a} is supported even on platforms that haven't yet -implemented C99 hexadecimal floating point output natively). - -@c FIXME - format still needs some improvements. -Warnings are issued for unrecognized specifiers, an improper number of -arguments, or difficulty parsing an argument according to the format -string (such as overflow or extra characters). It is anticipated that a -future release of GNU @code{m4} will support more specifiers. -Likewise, escape sequences are not yet recognized. 
- -@example -format(`%p', `0') -@error{}m4:stdin:1: warning: format: unrecognized specifier in '%p' -@result{} -format(`%*d', `') -@error{}m4:stdin:2: warning: format: empty string treated as 0 -@error{}m4:stdin:2: warning: format: too few arguments: 2 < 3 -@result{}0 -format(`%.1f', `2a') -@error{}m4:stdin:3: warning: format: non-numeric argument '2a' -@result{}2.0 -@end example - -@ignore -@comment Expose a crash with a bad format string fixed in 1.4.15. -@comment Unfortunately, 8-bit bytes are hard to check for; but the -@comment exit status is enough to sniff the crash in broken versions. - -@example -format(`%'format(`%c', `128')) -@result{} -@error{}ignore -@end example -@end ignore - -@node Arithmetic -@chapter Macros for doing arithmetic - -@cindex arithmetic -@cindex integer arithmetic -Integer arithmetic is included in @code{m4}, with a C-like syntax. As -convenient shorthands, there are builtins for simple increment and -decrement operations. - -@menu -* Incr:: Decrement and increment operators -* Eval:: Evaluating integer expressions -* Mpeval:: Multiple precision arithmetic -@end menu - -@node Incr -@section Decrement and increment operators - -@cindex decrement operator -@cindex increment operator -Increment and decrement of integers are supported using the builtins -@code{incr} and @code{decr}: - -@deffn {Builtin (m4)} incr (@var{number}) -@deffnx {Builtin (m4)} decr (@var{number}) -Expand to the numerical value of @var{number}, incremented -or decremented, respectively, by one. Except for the empty string, the -expansion is empty if @var{number} could not be parsed. - -The macros @code{incr} and @code{decr} are recognized only with -parameters. 
-@end deffn - -@example -incr(`4') -@result{}5 -decr(`7') -@result{}6 -incr() -@error{}m4:stdin:3: warning: incr: empty string treated as 0 -@result{}1 -decr() -@error{}m4:stdin:4: warning: decr: empty string treated as 0 -@result{}-1 -@end example - -The builtin macros @code{incr} and @code{decr} are recognized only when -given arguments. - -@node Eval -@section Evaluating integer expressions - -@cindex integer expression evaluation -@cindex evaluation, of integer expressions -@cindex expressions, evaluation of integer -Integer expressions are evaluated with @code{eval}: - -@deffn {Builtin (m4)} eval (@var{expression}, @dvar{radix, 10}, @ovar{width}) -Expands to the value of @var{expression}. The expansion is empty -if a problem is encountered while parsing the arguments. If specified, -@var{radix} and @var{width} control the format of the output. - -Calculations are done with signed numbers, using at least 31-bit -precision, but as a GNU extension, @code{m4} will use wider -integers if available. Precision is finite, based on the platform's -notion of @code{intmax_t}, and overflow silently results in wraparound. -A warning is issued if division by zero is attempted, or if -@var{expression} could not be parsed. - -Expressions can contain the following operators, listed in order of -decreasing precedence. - -@table @samp -@item () -Parentheses -@item + - ~ ! -Unary plus and minus, and bitwise and logical negation -@item ** -Exponentiation -@item * / % \ -Multiplication, division, modulo, and ratio -@item + - -Addition and subtraction -@item << >> >>> -Shift left, shift right, unsigned shift right -@item > >= < <= -Relational operators -@item == != -Equality operators -@item & -Bitwise and -@item ^ -Bitwise exclusive-or -@item | -Bitwise or -@item && -Logical and -@item || -Logical or -@item ?: -Conditional ternary -@item , -Sequential evaluation -@end table - -The macro @code{eval} is recognized only with parameters. 
-@end deffn - -All binary operators, except exponentiation, are left associative. C -operators that perform variable assignment, such as @samp{+=} or -@samp{--}, are not implemented, since @code{eval} only operates on -constants, not variables. Attempting to use them results in an error. -@comment FIXME - since XCU ERN 137 is approved, we could provide an -@comment extension that supported assignment operators. - -Note that some older @code{m4} implementations use @samp{^} as an -alternate operator for the exponentiation, although POSIX -requires the C behavior of bitwise exclusive-or. The precedence of the -negation operators, @samp{~} and @samp{!}, was traditionally lower than -equality. The unary operators could not be used reliably more than once -on the same term without intervening parentheses. The traditional -precedence of the equality operators @samp{==} and @samp{!=} was -identical instead of lower than the relational operators such as -@samp{<}, even through GNU M4 1.4.8. Starting with version -1.4.9, GNU M4 correctly follows POSIX precedence -rules. M4 scripts designed to be portable between releases must be -aware that parentheses may be required to enforce C precedence rules. -Likewise, division by zero, even in the unused branch of a -short-circuiting operator, is not always well-defined in other -implementations. - -Following are some examples where the current version of M4 follows C -precedence rules, but where older versions and some other -implementations of @code{m4} require explicit parentheses to get the -correct result: - -@example -eval(`1 == 2 > 0') -@result{}1 -eval(`(1 == 2) > 0') -@result{}0 -eval(`! 0 * 2') -@result{}2 -eval(`! (0 * 2)') -@result{}1 -eval(`1 | 1 ^ 1') -@result{}1 -eval(`(1 | 1) ^ 1') -@result{}0 -eval(`+ + - ~ ! 
~ 0') -@result{}1 -eval(`++0') -@error{}m4:stdin:8: warning: eval: invalid operator: '++0' -@result{} -eval(`1 = 1') -@error{}m4:stdin:9: warning: eval: invalid operator: '1 = 1' -@result{} -eval(`0 |= 1') -@error{}m4:stdin:10: warning: eval: invalid operator: '0 |= 1' -@result{} -eval(`2 || 1 / 0') -@result{}1 -eval(`0 || 1 / 0') -@error{}m4:stdin:12: warning: eval: divide by zero: '0 || 1 / 0' -@result{} -eval(`0 && 1 % 0') -@result{}0 -eval(`2 && 1 % 0') -@error{}m4:stdin:14: warning: eval: modulo by zero: '2 && 1 % 0' -@result{} -@end example - -@cindex GNU extensions -As a GNU extension, @code{eval} supports several operators -that do not appear in C@. A right-associative exponentiation operator -@samp{**} computes the value of the left argument raised to the right, -modulo the numeric precision width. If evaluated, the exponent must be -non-negative, and at least one of the arguments must be non-zero, or a -warning is issued. An unsigned shift operator @samp{>>>} allows -shifting a negative number as though it were an unsigned bit pattern, -which shifts in 0 bits rather than twos-complement sign-extension. A -ratio operator @samp{\} behaves like normal division @samp{/} on -integers, but is provided for symmetry with @code{mpeval}. -Additionally, the C operators @samp{,} and @samp{?:} are supported. - -@example -eval(`2 ** 3 ** 2') -@result{}512 -eval(`(2 ** 3) ** 2') -@result{}64 -eval(`0 ** 1') -@result{}0 -eval(`2 ** 0') -@result{}1 -eval(`0 ** 0') -@result{} -@error{}m4:stdin:5: warning: eval: divide by zero: '0 ** 0' -eval(`4 ** -2') -@error{}m4:stdin:6: warning: eval: negative exponent: '4 ** -2' -@result{} -eval(`2 || 4 ** -2') -@result{}1 -eval(`(-1 >> 1) == -1') -@result{}1 -eval(`(-1 >>> 1) > (1 << 30)') -@result{}1 -eval(`6 \ 3') -@result{}2 -eval(`1 ? 2 : 3') -@result{}2 -eval(`0 ? 2 : 3') -@result{}3 -eval(`1 ? 2 : 1/0') -@result{}2 -eval(`0 ? 
1/0 : 3') -@result{}3 -eval(`4, 5') -@result{}5 -@end example - -Within @var{expression} (but not @var{radix} or @var{width}), numbers -without a special prefix are decimal. A simple @samp{0} prefix -introduces an octal number. @samp{0x} introduces a hexadecimal number. -As GNU extensions, @samp{0b} introduces a binary number. -@samp{0r} introduces a number expressed in any radix between 1 and 36: -the prefix should be immediately followed by the decimal expression of -the radix, a colon, then the digits making the number. For radix 1, -leading zeros are ignored, and all remaining digits must be @samp{1}; -for all other radices, the digits are @samp{0}, @samp{1}, @samp{2}, -@dots{}. Beyond @samp{9}, the digits are @samp{a}, @samp{b} @dots{} up -to @samp{z}. Lower and upper case letters can be used interchangeably -in number prefixes and as number digits. - -Parentheses may be used to group subexpressions whenever needed. For the -relational operators, a true relation returns @code{1}, and a false -relation returns @code{0}. - -Here are a few examples of the use of @code{eval}. - -@example -eval(`-3 * 5') -@result{}-15 -eval(`-99 / 10') -@result{}-9 -eval(`-99 % 10') -@result{}-9 -eval(`99 % -10') -@result{}9 -eval(index(`Hello world', `llo') >= 0) -@result{}1 -eval(`0r1:0111 + 0b100 + 0r3:12') -@result{}12 -define(`square', `eval(`($1) ** 2')') -@result{} -square(`9') -@result{}81 -square(square(`5')` + 1') -@result{}676 -define(`foo', `666') -@result{} -eval(`foo / 6') -@error{}m4:stdin:11: warning: eval: bad expression: 'foo / 6' -@result{} -eval(foo / 6) -@result{}111 -@end example - -As the last two lines show, @code{eval} does not handle macro -names, even if they expand to a valid expression (or part of a valid -expression). Therefore all macros must be expanded before they are -passed to @code{eval}. -@comment update this if we add support for variables. 
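The number prefixes described earlier in this section can be modeled outside of m4. The following Python sketch is only an illustration of the documented rules, not the parser used by m4; the names parse_literal and DIGITS are hypothetical, and the sketch raises an exception where m4 would issue a warning.

```python
import string

# Digits in increasing value: '0'..'9' followed by 'a'..'z'.
DIGITS = string.digits + string.ascii_lowercase


def parse_literal(text):
    """Return the value of one numeric literal, following the documented
    prefix rules (a model only; assumed to cover a single unsigned number)."""
    s = text.lower()  # prefixes and digits are case-insensitive
    if s.startswith("0r"):
        # 0rRADIX:DIGITS, with RADIX written in decimal
        radix_text, colon, digits = s[2:].partition(":")
        if not colon or not radix_text.isdigit():
            raise ValueError("malformed 0r literal: %r" % text)
        radix = int(radix_text)
        if not 1 <= radix <= 36:
            raise ValueError("radix out of range: %d" % radix)
    elif s.startswith("0x"):
        radix, digits = 16, s[2:]
    elif s.startswith("0b"):
        radix, digits = 2, s[2:]
    elif s.startswith("0") and len(s) > 1:
        radix, digits = 8, s[1:]
    else:
        radix, digits = 10, s
    if not digits:
        raise ValueError("no digits in literal: %r" % text)
    if radix == 1:
        # Radix 1: leading zeros are ignored, the rest must all be '1'.
        ones = digits.lstrip("0")
        if ones.strip("1"):
            raise ValueError("bad radix-1 digits: %r" % text)
        return len(ones)
    value = 0
    for ch in digits:
        d = DIGITS.find(ch)
        if d < 0 or d >= radix:
            raise ValueError("bad digit %r for radix %d" % (ch, radix))
        value = value * radix + d
    return value


# The three terms of the documented example 0r1:0111 + 0b100 + 0r3:12
print(parse_literal("0r1:0111") + parse_literal("0b100") + parse_literal("0r3:12"))
```

Under these assumptions the three literals are 3, 4 and 5, which agrees with the sum of 12 shown in the example above.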
- -Some calculations are not portable to other implementations, since they -have undefined semantics in C, but GNU @code{m4} has -well-defined behavior on overflow. When shifting, an out-of-range shift -amount is implicitly brought into the range of the precision using -modulo arithmetic (for example, on 32-bit integers, this would be an -implicit bit-wise and with 0x1f). This example should work whether your -platform uses 32-bit integers, 64-bit integers, or even some other -atypical size. - -@example -define(`max_int', eval(`-1 >>> 1')) -@result{} -define(`min_int', eval(max_int` + 1')) -@result{} -eval(min_int` < 0') -@result{}1 -eval(max_int` > 0') -@result{}1 -ifelse(eval(min_int` / -1'), min_int, `overflow occurred') -@result{}overflow occurred -eval(`0x80000000 % -1') -@result{}0 -eval(`-4 >> 1') -@result{}-2 -eval(`-4 >> 'eval(len(eval(max_int, `2'))` + 2')) -@result{}-2 -@end example - -If @var{radix} is specified, it specifies the radix to be used in the -expansion. The default radix is 10; this is also the case if -@var{radix} is the empty string. A warning results if the radix is -outside the range of 1 through 36, inclusive. The result of @code{eval} -is always taken to be signed. No radix prefix is output, and for -radices greater than 10, the digits are lower case (although some -other implementations use upper case). The output is unquoted, and -subject to further macro expansion. The @var{width} -argument specifies the minimum output width, excluding any negative -sign. The result is zero-padded to extend the expansion to the -requested width. A warning results if the width is negative. If -@var{radix} or @var{width} is out of bounds, the expansion of -@code{eval} is empty. 
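The interaction of @var{radix} and @var{width} can be pictured with a small model. The Python sketch below is an illustration of the rules in the preceding paragraph under stated assumptions, not the code used by m4: the name format_eval is hypothetical, and out-of-range arguments raise an exception here, whereas m4 issues a warning and expands to the empty string.

```python
import string

# Output digits are lower case: '0'..'9' followed by 'a'..'z'.
DIGITS = string.digits + string.ascii_lowercase


def format_eval(value, radix=10, width=0):
    """Model of the documented output format: signed value, no radix
    prefix, zero padding to 'width' digits not counting the sign."""
    if not 1 <= radix <= 36:
        raise ValueError("radix out of range: %d" % radix)
    if width < 0:
        raise ValueError("negative width: %d" % width)
    sign = "-" if value < 0 else ""
    magnitude = abs(value)
    if radix == 1:
        # Unary: one '1' per unit of magnitude.
        digits = "1" * magnitude
    else:
        digits = ""
        while True:
            magnitude, d = divmod(magnitude, radix)
            digits = DIGITS[d] + digits
            if magnitude == 0:
                break
    # Pad the digits first, then attach the sign, so that the sign is
    # excluded from the width.
    return sign + digits.rjust(width, "0")


print(format_eval(666, 6, 10))
print(format_eval(-666, 6, 10))
print(format_eval(10, 1, 11))
```

Padding before attaching the sign is what makes the width count digits only, so a negative result is one character longer than the requested width.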
- -@example -eval(`666', `10') -@result{}666 -eval(`666', `11') -@result{}556 -eval(`666', `6') -@result{}3030 -eval(`666', `6', `10') -@result{}0000003030 -eval(`-666', `6', `10') -@result{}-0000003030 -eval(`10', `', `0') -@result{}10 -`0r1:'eval(`10', `1', `11') -@result{}0r1:01111111111 -eval(`10', `16') -@result{}a -eval(`1', `37') -@error{}m4:stdin:9: warning: eval: radix out of range: 37 -@result{} -eval(`1', , `-1') -@error{}m4:stdin:10: warning: eval: negative width: -1 -@result{} -eval() -@error{}m4:stdin:11: warning: eval: empty string treated as 0 -@result{}0 -eval(` ') -@error{}m4:stdin:12: warning: eval: empty string treated as 0 -@result{}0 -define(`a', `hi')eval(` 10 ', `16') -@result{}hi -@end example - -@node Mpeval -@section Multiple precision arithmetic - -When @code{m4} is compiled with a multiple precision arithmetic library -(@pxref{Experiments}), a builtin @code{mpeval} is defined. - -@deffn {Builtin (mpeval)} mpeval (@var{expression}, @dvar{radix, 10}, @ - @ovar{width}) -Behaves similarly to @code{eval}, except the calculations are done with -infinite precision, and rational numbers are supported. Numbers may be -of any length. - -The macro @code{mpeval} is recognized only with parameters. -@end deffn - -For the most part, using @code{mpeval} is similar to using @code{eval}: - -@comment options: mpeval - -@example -$ @kbd{m4 mpeval -} -mpeval(`(1 << 70) + 2 ** 68 * 3', `16') -@result{}700000000000000000 -`0r24:'mpeval(`0r36:zYx', `24', `5') -@result{}0r24:038m9 -@end example - -The ratio operator, @samp{\}, is provided with the same precedence as -division, and rationally divides two numbers and canonicalizes the -result, whereas the division operator @samp{/} always returns the -integer quotient of the division. To convert a rational value to -integral, divide (@samp{/}) by 1. 
Some operators, such as @samp{%}, -@samp{<<}, @samp{>>}, @samp{~}, @samp{&}, @samp{|} and @samp{^} operate -only on integers and will truncate any rational remainder. The unsigned -shift operator, @samp{>>>}, behaves identically with regular right -shifts, @samp{>>}, since with infinite precision, it is not possible to -convert a negative number to a positive using shifts. The -exponentiation operator, @samp{**}, assumes that the exponent is -integral, but allows negative exponents. With the short-circuit logical -operators, @samp{||} and @samp{&&}, a non-zero result preserves the -value of the argument that ended evaluation, rather than collapsing to -@samp{1}. The operators @samp{?:} and @samp{,} are always available, -even in POSIX mode, since @code{mpeval} does not have to -conform to the POSIX rules for @code{eval}. - -@comment options: mpeval - -@example -$ @kbd{m4 mpeval -} -mpeval(`2 / 4') -@result{}0 -mpeval(`2 \ 4') -@result{}1\2 -mpeval(`2 || 3') -@result{}2 -mpeval(`1 && 3') -@result{}3 -mpeval(`-1 >> 1') -@result{}-1 -mpeval(`-1 >>> 1') -@result{}-1 -@end example - -@node Shell commands -@chapter Macros for running shell commands - -@cindex UNIX commands, running -@cindex executing shell commands -@cindex running shell commands -@cindex shell commands, running -@cindex commands, running shell -There are a few builtin macros in @code{m4} that allow you to run shell -commands from within @code{m4}. - -Note that the definition of a valid shell command is system dependent. -On UNIX systems, this is the typical @command{/bin/sh}. But on other -systems, such as native Windows, the shell has a different syntax of -commands that it understands. Some examples in this chapter assume -@command{/bin/sh}, and also demonstrate how to quit early with a known -exit value if this is not the case. 
- -@menu -* Platform macros:: Determining the platform -* Syscmd:: Executing simple commands -* Esyscmd:: Reading the output of commands -* Sysval:: Exit status -* Mkstemp:: Making temporary files -* Mkdtemp:: Making temporary directories -@end menu - -@node Platform macros -@section Determining the platform - -@cindex platform macros -Sometimes it is desirable for an input file to know which platform -@code{m4} is running on. GNU @code{m4} provides several -macros that are predefined to expand to the empty string; checking for -their existence will confirm platform details. - -@deffn {Optional builtin (gnu)} __os2__ -@deffnx {Optional builtin (traditional)} os2 -@deffnx {Optional builtin (gnu)} __unix__ -@deffnx {Optional builtin (traditional)} unix -@deffnx {Optional builtin (gnu)} __windows__ -@deffnx {Optional builtin (traditional)} windows -Each of these macros is conditionally defined as needed to describe the -environment of @code{m4}. If defined, each macro expands to the empty -string. -@end deffn - -On UNIX systems, GNU @code{m4} will define @code{@w{__unix__}} -in the @samp{gnu} module, and @code{unix} in the @samp{traditional} -module. - -On native Windows systems, GNU @code{m4} will define -@code{@w{__windows__}} in the @samp{gnu} module, and @code{windows} in -the @samp{traditional} module. - -On OS/2 systems, GNU @code{m4} will define @code{@w{__os2__}} -in the @samp{gnu} module, and @code{os2} in the @samp{traditional} -module. - -If GNU M4 does not provide a platform macro for your system, -please report that as a bug. 
- -@example -define(`provided', `0') -@result{} -ifdef(`__unix__', `define(`provided', incr(provided))') -@result{} -ifdef(`__windows__', `define(`provided', incr(provided))') -@result{} -ifdef(`__os2__', `define(`provided', incr(provided))') -@result{} -provided -@result{}1 -@end example - -@node Syscmd -@section Executing simple commands - -Any shell command can be executed, using @code{syscmd}: - -@deffn {Builtin (m4)} syscmd (@var{shell-command}) -Executes @var{shell-command} as a shell command. - -The expansion of @code{syscmd} is void, @emph{not} the output from -@var{shell-command}! Output or error messages from @var{shell-command} -are not read by @code{m4}. @xref{Esyscmd}, if you need to process the -command output. - -Prior to executing the command, @code{m4} flushes its buffers. -The default standard input, output and error of @var{shell-command} are -the same as those of @code{m4}. - -By default, the @var{shell-command} will be used as the argument to the -@option{-c} option of the @command{/bin/sh} shell (or the version of -@command{sh} specified by @samp{command -p getconf PATH}, if your system -supports that). If you prefer a different shell, the -@command{configure} script can be given the option -@option{--with-syscmd-shell=@var{location}} to set the location of an -alternative shell at GNU @code{m4} installation; the -alternative shell must still support @option{-c}. - -When the @option{--safer} option (@pxref{Operation modes, , Invoking -m4}) is in effect, @code{syscmd} results in an error, since otherwise an -input file could execute arbitrary code. - -The macro @code{syscmd} is recognized only with parameters. -@end deffn - -@example -define(`foo', `FOO') -@result{} -syscmd(`echo foo') -@result{}foo -@result{} -@end example - -Note how the expansion of @code{syscmd} keeps the trailing newline of -the command, as well as using the newline that appeared after the macro. 
- -The following is an example of @var{shell-command} using the same -standard input as @code{m4}: - -@comment The testsuite does not know how to parse pipes from the -@comment texinfo. Fortunately, there are other tests in the testsuite -@comment that test this same feature. -@comment ignore -@example -$ @kbd{echo "m4wrap(\`syscmd(\`cat')')" | m4} -@result{} -@end example - -It tells @code{m4} to read all of its input before executing the wrapped -text, then hands a valid (albeit emptied) pipe as standard input for the -@code{cat} subcommand. Therefore, you should be careful when using -standard input (either by specifying no files, or by passing @samp{-} as -a file name on the command line, @pxref{Command line files, , Invoking -m4}), and also invoking subcommands via @code{syscmd} or @code{esyscmd} -that consume data from standard input. When standard input is a -seekable file, the subprocess will pick up with the next character not -yet processed by @code{m4}; when it is a pipe or other non-seekable -file, there is no guarantee how much data will already be buffered by -@code{m4} and thus unavailable to the child. - -Following is an example of how potentially unsafe actions can be -suppressed. - -@comment options: --safer -@comment status: 1 -@example -$ @kbd{m4 --safer} -syscmd(`echo hi') -@error{}m4:stdin:1: syscmd: disabled by --safer -@result{} -@end example - -@node Esyscmd -@section Reading the output of commands - -@cindex GNU extensions -If you want @code{m4} to read the output of a shell command, use -@code{esyscmd}: - -@deffn {Builtin (gnu)} esyscmd (@var{shell-command}) -Expands to the standard output of the shell command -@var{shell-command}. - -Prior to executing the command, @code{m4} flushes its buffers. -The default standard input and standard error of @var{shell-command} are -the same as those of @code{m4}. The error output of @var{shell-command} -is not a part of the expansion: it will appear along with the error -output of @code{m4}. 
- -By default, the @var{shell-command} will be used as the argument to the -@option{-c} option of the @command{/bin/sh} shell (or the version of -@command{sh} specified by @samp{command -p getconf PATH}, if your system -supports that). If you prefer a different shell, the -@command{configure} script can be given the option -@option{--with-syscmd-shell=@var{location}} to set the location of an -alternative shell at GNU @code{m4} installation; the -alternative shell must still support @option{-c}. - -When the @option{--safer} option (@pxref{Operation modes, , Invoking -m4}) is in effect, @code{esyscmd} results in an error, since otherwise -an input file could execute arbitrary code. - -The macro @code{esyscmd} is recognized only with parameters. -@end deffn - -@example -define(`foo', `FOO') -@result{} -esyscmd(`echo foo') -@result{}FOO -@result{} -@end example - -Note how the expansion of @code{esyscmd} keeps the trailing newline of -the command, as well as using the newline that appeared after the macro. - -Just as with @code{syscmd}, care must be exercised when sharing standard -input between @code{m4} and the child process of @code{esyscmd}. -Likewise, potentially unsafe actions can be suppressed. - -@comment options: --safer -@comment status: 1 -@example -$ @kbd{m4 --safer} -esyscmd(`echo hi') -@error{}m4:stdin:1: esyscmd: disabled by --safer -@result{} -@end example - -@node Sysval -@section Exit status - -@cindex UNIX commands, exit status from -@cindex exit status from shell commands -@cindex shell commands, exit status from -@cindex commands, exit status from shell -@cindex status of shell commands -To see whether a shell command succeeded, use @code{sysval}: - -@deffn {Builtin (m4)} sysval -Expands to the exit status of the last shell command run with -@code{syscmd} or @code{esyscmd}. Expands to 0 if no command has been -run yet. 
-@end deffn - -@example -sysval -@result{}0 -syscmd(`false') -@result{} -ifelse(sysval, `0', `zero', `non-zero') -@result{}non-zero -syscmd(`exit 2') -@result{} -sysval -@result{}2 -syscmd(`true') -@result{} -sysval -@result{}0 -esyscmd(`false') -@result{} -ifelse(sysval, `0', `zero', `non-zero') -@result{}non-zero -esyscmd(`echo dnl && exit 127') -@result{} -sysval -@result{}127 -esyscmd(`true') -@result{} -sysval -@result{}0 -@end example - -@code{sysval} results in 127 if there was a problem executing the -command, for example, if the system-imposed argument length is exceeded, -or if there were not enough resources to fork. It is not possible to -distinguish between failed execution and successful execution that had -an exit status of 127, unless there was output from the child process. - -On UNIX platforms, where it is possible to detect when command execution -is terminated by a signal, rather than a normal exit, the result is the -signal number shifted left by eight bits. - -@comment This test has difficulties being portable, even on platforms -@comment where syscmd invokes /bin/sh. Kill is not portable with signal -@comment names. According to autoconf, the only portable signal numbers -@comment are 1 (HUP), 2 (INT), 9 (KILL), 13 (PIPE) and 15 (TERM). But -@comment all shells handle SIGINT, and ksh handles HUP (as in, the shell -@comment exits normally rather than letting the signal terminate it). -@comment Also, TERM is flaky, as it can also kill the running m4 on -@comment systems where /bin/sh does not create its own process group. -@comment And PIPE is unreliable, since people tend to run with it -@comment ignored, with m4 inheriting that choice. That leaves KILL as -@comment the only signal we can reliably test. -@example -dnl This test assumes kill is a shell builtin, and that signals are -dnl recognizable. 
-ifdef(`__unix__', , - `errprint(` skipping: syscmd does not have unix semantics -')m4exit(`77')')dnl -syscmd(`kill -9 $$') -@result{} -sysval -@result{}2304 -syscmd() -@result{} -sysval -@result{}0 -esyscmd(`kill -9 $$') -@result{} -sysval -@result{}2304 -@end example - -When the @option{--safer} option (@pxref{Operation modes, , Invoking -m4}) is in effect, @code{sysval} will always remain at its default value -of zero. - -@comment options: --safer -@comment status: 1 -@example -$ @kbd{m4 --safer} -sysval -@result{}0 -syscmd(`false') -@error{}m4:stdin:2: syscmd: disabled by --safer -@result{} -sysval -@result{}0 -@end example - -@node Mkstemp -@section Making temporary files - -@cindex temporary file names -@cindex files, names of temporary -Commands specified to @code{syscmd} or @code{esyscmd} might need a -temporary file, for output or for some other purpose. There is a -builtin macro, @code{mkstemp}, for making a temporary file: - -@deffn {Builtin (m4)} mkstemp (@var{template}) -@deffnx {Builtin (m4)} maketemp (@var{template}) -Expands to the quoted name of a new, empty file, made from the string -@var{template}, which should end with the string @samp{XXXXXX}. The six -@samp{X} characters are then replaced with random characters matching -the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the file -name unique. If fewer than six @samp{X} characters are found at the end -of @code{template}, the result will be longer than the template. The -created file will have access permissions as if by @kbd{chmod =rw,go=}, -meaning that the current umask of the @code{m4} process is taken into -account, and at most only the current user can read and write the file. - -The traditional behavior, standardized by POSIX, is that -@code{maketemp} merely replaces the trailing @samp{X} with the process -id, without creating a file or quoting the expansion, and without -ensuring that the resulting -string is a unique file name. 
In part, this means that using the same -@var{template} twice in the same input file will result in the same -expansion. This behavior is a security hole, as it is very easy for -another process to guess the name that will be generated, and thus -interfere with a subsequent use of @code{syscmd} trying to manipulate -that file name. Hence, POSIX has recommended that all new -implementations of @code{m4} provide the secure @code{mkstemp} builtin, -and that users of @code{m4} check for its existence. - -The expansion is void and an error issued if a temporary file could -not be created. - -When the @option{--safer} option (@pxref{Operation modes, Invoking m4}) -is in effect, @code{mkstemp} and GNU-mode @code{maketemp} -result in an error, since otherwise an input file could perform a mild -denial-of-service attack by filling up a disk with multiple empty files. - -The macros @code{mkstemp} and @code{maketemp} are recognized only with -parameters. -@end deffn - -If you try this next example, you will most likely get different output -for the two file names, since the replacement characters are randomly -chosen: - -@comment ignore -@example -$ @kbd{m4} -define(`tmp', `oops') -@result{} -maketemp(`/tmp/fooXXXXXX') -@error{}m4:stdin:1: warning: maketemp: recommend using mkstemp instead -@result{}/tmp/fooa07346 -ifdef(`mkstemp', `define(`maketemp', defn(`mkstemp'))', - `define(`mkstemp', defn(`maketemp'))dnl -errprint(`warning: potentially insecure maketemp implementation -')') -@result{} -mkstemp(`doc') -@result{}docQv83Uw -@end example - -@comment options: --safer -@comment status: 1 -@example -$ @kbd{m4 --safer} -maketemp(`/tmp/fooXXXXXX') -@error{}m4:stdin:1: warning: maketemp: recommend using mkstemp instead -@error{}m4:stdin:1: maketemp: disabled by --safer -@result{} -mkstemp(`/tmp/fooXXXXXX') -@error{}m4:stdin:2: mkstemp: disabled by --safer -@result{} -@end example - -@cindex GNU extensions -Unless you use the @option{--traditional} command line option (or 
-@option{-G}, @pxref{Limits control, , Invoking m4}), the GNU -version of @code{maketemp} is secure. This means that using the same -template to multiple calls will generate multiple files. However, we -recommend that you use the new @code{mkstemp} macro, introduced in -GNU M4 1.4.8, which is secure even in traditional mode. Also, -as of M4 1.4.11, the secure implementation quotes the resulting file -name, so that you are guaranteed to know what file was created even if -the random file name happens to match an existing macro. Notice that -this example is careful to use @code{defn} to avoid unintended expansion -of @samp{foo}. - -@example -$ @kbd{m4} -define(`foo', `errprint(`oops')') -@result{} -syscmd(`rm -f foo-??????')sysval -@result{}0 -define(`file1', maketemp(`foo-XXXXXX'))dnl -@error{}m4:stdin:3: warning: maketemp: recommend using mkstemp instead -ifelse(esyscmd(`echo \` foo-?????? \''), `foo-??????', - `no file', `created') -@result{}created -define(`file2', maketemp(`foo-XX'))dnl -@error{}m4:stdin:6: warning: maketemp: recommend using mkstemp instead -define(`file3', mkstemp(`foo-XXXXXX'))dnl -ifelse(len(defn(`file1')), len(defn(`file2')), - `same length', `different') -@result{}same length -ifelse(defn(`file1'), defn(`file2'), `same', `different file') -@result{}different file -ifelse(defn(`file2'), defn(`file3'), `same', `different file') -@result{}different file -ifelse(defn(`file1'), defn(`file3'), `same', `different file') -@result{}different file -syscmd(`rm 'defn(`file1') defn(`file2') defn(`file3')) -@result{} -sysval -@result{}0 -@end example - -@comment options: -G -@example -$ @kbd{m4 -G} -syscmd(`rm -f foo-*')sysval -@result{}0 -define(`file1', maketemp(`foo-XXXXXX'))dnl -@error{}m4:stdin:2: warning: maketemp: recommend using mkstemp instead -define(`file2', maketemp(`foo-XXXXXX'))dnl -@error{}m4:stdin:3: warning: maketemp: recommend using mkstemp instead -ifelse(file1, file2, `same', `different file') -@result{}same 
-len(maketemp(`foo-XXXXX')) -@error{}m4:stdin:5: warning: maketemp: recommend using mkstemp instead -@result{}9 -define(`abc', `def') -@result{} -maketemp(`foo-abc') -@result{}foo-def -@error{}m4:stdin:7: warning: maketemp: recommend using mkstemp instead -syscmd(`test -f foo-*')sysval -@result{}1 -@end example - -@node Mkdtemp -@section Making temporary directories - -@cindex temporary directory -@cindex directories, temporary -@cindex GNU extensions -Commands specified to @code{syscmd} or @code{esyscmd} might need a -temporary directory, for holding multiple temporary files; such a -directory can be created with @code{mkdtemp}: - -@deffn {Builtin (gnu)} mkdtemp (@var{template}) -Expands to the quoted name of a new, empty directory, made from the string -@var{template}, which should end with the string @samp{XXXXXX}. The six -@samp{X} characters are then replaced with random characters matching -the regular expression @samp{[a-zA-Z0-9._-]}, in order to make the name -unique. If fewer than six @samp{X} characters are found at the end of -@code{template}, the result will be longer than the template. The -created directory will have access permissions as if by @kbd{chmod -=rwx,go=}, meaning that the current umask of the @code{m4} process is -taken into account, and at most only the current user can read, write, -and search the directory. - -The expansion is void and an error issued if a temporary directory could -not be created. - -When the @option{--safer} option (@pxref{Operation modes, Invoking m4}) -is in effect, @code{mkdtemp} results in an error, since otherwise an -input file could perform a mild denial-of-service attack by filling up a -disk with multiple directories. - -The macro @code{mkdtemp} is recognized only with parameters. -This macro was added in M4 2.0. 
-@end deffn - -If you try this next example, you will most likely get different output -for the directory names, since the replacement characters are randomly -chosen: - -@comment ignore -@example -$ @kbd{m4} -define(`tmp', `oops') -@result{} -mkdtemp(`/tmp/fooXXXXXX') -@result{}/tmp/foo2h89Vo -mkdtemp(`dir') -@result{}dirrg079A -@end example - -@comment options: --safer -@comment status: 1 -@example -$ @kbd{m4 --safer} -mkdtemp(`/tmp/fooXXXXXX') -@error{}m4:stdin:1: mkdtemp: disabled by --safer -@result{} -@end example - -Multiple calls with the same template will generate multiple -directories. - -@example -$ @kbd{m4} -syscmd(`echo foo??????')dnl -@result{}foo?????? -define(`dir1', mkdtemp(`fooXXXXXX'))dnl -ifelse(esyscmd(`echo foo??????'), `foo??????', `no dir', `created') -@result{}created -define(`dir2', mkdtemp(`fooXXXXXX'))dnl -ifelse(dir1, dir2, `same', `different directories') -@result{}different directories -syscmd(`rmdir 'dir1 dir2) -@result{} -sysval -@result{}0 -@end example - -@node Miscellaneous -@chapter Miscellaneous builtin macros - -This chapter describes various builtins that do not really belong in -any of the previous chapters. - -@menu -* Errprint:: Printing error messages -* Location:: Printing current location -* M4exit:: Exiting from @code{m4} -* Syncoutput:: Turning on and off sync lines -@end menu - -@node Errprint -@section Printing error messages - -@cindex printing error messages -@cindex error messages, printing -@cindex messages, printing error -@cindex standard error, output to -You can print error messages using @code{errprint}: - -@deffn {Builtin (m4)} errprint (@var{message}, @dots{}) -Prints @var{message} and the rest of the arguments to standard error, -separated by spaces. Standard error is used, regardless of the -@option{--debugfile} option (@pxref{Debugging options, , Invoking m4}). - -The expansion of @code{errprint} is void. -The macro @code{errprint} is recognized only with parameters. 
-@end deffn - -@example -errprint(`Invalid arguments to forloop -') -@error{}Invalid arguments to forloop -@result{} -errprint(`1')errprint(`2',`3 -') -@error{}12 3 -@result{} -@end example - -A trailing newline is @emph{not} printed automatically, so it should be -supplied as part of the argument, as in the example. Unfortunately, the -exact output of @code{errprint} is not very portable to other @code{m4} -implementations: POSIX requires that all arguments be printed, -but some implementations of @code{m4} only print the first. -Furthermore, some BSD implementations always append a newline -for each @code{errprint} call, regardless of whether the last argument -already had one, and POSIX is silent on whether this is -acceptable. - -@node Location -@section Printing current location - -@cindex location, input -@cindex input location -To make it possible to specify the location of an error, three -utility builtins exist: - -@deffn {Builtin (gnu)} __file__ -@deffnx {Builtin (gnu)} __line__ -@deffnx {Builtin (gnu)} __program__ -Expand to the quoted name of the current input file, the -current input line number in that file, and the quoted name of the -current invocation of @code{m4}. -@end deffn - -@example -errprint(__program__:__file__:__line__: `input error -') -@error{}m4:stdin:1: input error -@result{} -@end example - -Line numbers start at 1 for each file. If the file was found due to the -@option{-I} option or @env{M4PATH} environment variable, that is -reflected in the file name. Synclines, via @code{syncoutput} -(@pxref{Syncoutput}) or the command line option @option{--synclines} -(or @option{-s}, @pxref{Preprocessor features, , Invoking m4}), and the -@samp{f} and @samp{l} flags of @code{debugmode} (@pxref{Debugmode}), -also use this notion of current file and line. Redefining the three -location macros has no effect on syncline, debug, warning, or error -message output. 
- -This example reuses the file @file{incl.m4} mentioned earlier -(@pxref{Include}): - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -define(`foo', ``$0' called at __file__:__line__') -@result{} -foo -@result{}foo called at stdin:2 -include(`incl.m4') -@result{}Include file start -@result{}foo called at doc/examples/incl.m4:2 -@result{}Include file end -@result{} -@end example - -The location of macros invoked during the rescanning of macro expansion -text corresponds to the location in the file where the expansion was -triggered, regardless of how many newline characters the expansion text -contains. As of GNU M4 1.4.8, the location of text wrapped -with @code{m4wrap} (@pxref{M4wrap}) is the point at which the -@code{m4wrap} was invoked. Previous versions, however, behaved as -though wrapped text came from line 0 of the file ``''. - -@example -define(`echo', `$@@') -@result{} -define(`foo', `echo(__line__ -__line__)') -@result{} -echo(__line__ -__line__) -@result{}4 -@result{}5 -m4wrap(`foo -') -@result{} -foo(errprint(__line__ -__line__ -)) -@error{}8 -@error{}9 -@result{}8 -@result{}8 -__line__ -@result{}11 -m4wrap(`__line__ -') -@result{} -^D -@result{}6 -@result{}6 -@result{}12 -@end example - -The @code{@w{__program__}} macro behaves like @samp{$0} in shell -terminology. If you invoke @code{m4} through an absolute path or a link -with a different spelling, rather than by relying on a @env{PATH} search -for plain @samp{m4}, it will affect how @code{@w{__program__}} expands. -The intent is that you can use it to produce error messages with the -same formatting that @code{m4} produces internally. It can also be used -within @code{syscmd} (@pxref{Syscmd}) to pick the same version of -@code{m4} that is currently running, rather than whatever version of -@code{m4} happens to be first in @env{PATH}. It was first introduced in -GNU M4 1.4.6. 
- -@node M4exit -@section Exiting from @code{m4} - -@cindex exiting from @code{m4} -@cindex status, setting @code{m4} exit -If you need to exit from @code{m4} before the entire input has been -read, you can use @code{m4exit}: - -@deffn {Builtin (m4)} m4exit (@ovar{code}) -Causes @code{m4} to exit, with exit status @var{code}. If @var{code} is -left out, the exit status is zero. If @var{code} cannot be parsed, or -is outside the range of 0 to 255, the exit status is one. No further -input is read, and all wrapped and diverted text is discarded. -@end deffn - -@example -m4wrap(`This text is lost due to `m4exit'.') -@result{} -divert(`1') So is this. -divert -@result{} -m4exit And this is never read. -@end example - -A common use of this is to abort processing: - -@deffn Composite fatal_error (@var{message}) -Abort processing with an error message and non-zero status. Prefix -@var{message} with details about where the error occurred, and print the -resulting string to standard error. -@end deffn - -@comment status: 1 -@example -define(`fatal_error', - `errprint(__program__:__file__:__line__`: fatal error: $* -')m4exit(`1')') -@result{} -fatal_error(`this is a BAD one, buster') -@error{}m4:stdin:4: fatal error: this is a BAD one, buster -@end example - -After this macro call, @code{m4} will exit with exit status 1. This macro -is only intended for error exits, since the normal exit procedures are -not followed, i.e., diverted text is not undiverted, and saved text -(@pxref{M4wrap}) is not reread. (This macro could be made more robust -to earlier versions of @code{m4}. You should try to see if you can find -weaknesses and correct them; or @pxref{Improved fatal_error, , Answers}). - -Note that it is still possible for the exit status to be different than -what was requested by @code{m4exit}. If @code{m4} detects some other -error, such as a write error on standard output, the exit status will be -non-zero even if @code{m4exit} requested zero. 
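The status rule given for @code{m4exit} above can be restated as a small model. The following Python sketch is not part of @code{m4}: the function name @code{m4exit_status} is hypothetical, and treating only an optionally signed decimal string as parsable is an assumption about what ``cannot be parsed'' covers. Real @code{m4} may accept or reject other spellings.

```python
import re

# Assumption: only an optionally signed run of decimal digits counts as
# "parsable"; the manual does not spell out the accepted syntax.
_DECIMAL = re.compile(r"[+-]?[0-9]+")


def m4exit_status(code=None):
    """Return the exit status the manual documents for m4exit(code)."""
    if code is None:
        # The argument was left out entirely.
        return 0
    if not _DECIMAL.fullmatch(code):
        # The argument cannot be parsed as a number.
        return 1
    value = int(code)
    if 0 <= value <= 255:
        return value
    # The number is outside the range 0 to 255.
    return 1


if __name__ == "__main__":
    for arg in (None, "3", "255", "256", "-1", "abc"):
        print(repr(arg), "->", m4exit_status(arg))
```

The model covers only the requested status. As the preceding paragraph notes, the status that @code{m4} actually reports can still differ, for instance after a write error on standard output.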
- -If standard input is seekable, then the file will be positioned at the -next unread character. If it is a pipe or other non-seekable file, -then there are no guarantees how much data @code{m4} might have read -into buffers, and thus discarded. - -@node Syncoutput -@section Turning on and off sync lines - -@cindex toggling synchronization lines -@cindex synchronization lines -@cindex location, input -@cindex input location -It is possible to adjust whether synclines are printed to output: - -@deffn {Builtin (gnu)} syncoutput (@var{truth}) -If @var{truth} matches the extended regular expression -@samp{^[1yY]|^([oO][nN])}, it causes @code{m4} to emit sync lines of the -form: @samp{#line <number> ["<file>"]}. - -If @var{truth} is empty, or matches the extended regular expression -@samp{^[0nN]|^([oO][fF])}, it causes @code{m4} to turn sync lines off. - -All other arguments are ignored and issue a warning. - -The macro @code{syncoutput} is recognized only with parameters. -This macro was added in M4 2.0. -@end deffn - -@example -define(`twoline', `1 -2') -@result{} -changecom(`/*', `*/') -@result{} -define(`comment', `/*1 -2*/') -@result{} -twoline -@result{}1 -@result{}2 -dnl no line -syncoutput(`on') -@result{}#line 8 "stdin" -@result{} -twoline -@result{}1 -@result{}#line 9 -@result{}2 -dnl no line -hello -@result{}#line 11 -@result{}hello -comment -@result{}/*1 -@result{}2*/ -one comment `two -three' -@result{}#line 13 -@result{}one /*1 -@result{}2*/ two -@result{}three -goodbye -@result{}#line 15 -@result{}goodbye -syncoutput(`off') -@result{} -twoline -@result{}1 -@result{}2 -syncoutput(`blah') -@error{}m4:stdin:18: warning: syncoutput: unknown directive 'blah' -@result{} -@end example - -Notice that a syncline is output any time a single source line expands -to multiple output lines, or any time multiple source lines expand to a -single output line. When there is a one-for-one correspondence, no -additional synclines are needed. 
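The two regular expressions in the description of @code{syncoutput} are anchored only at the start, so only the leading characters of @var{truth} matter. The Python sketch below applies them in the documented order; the function name @code{classify_truth} and the three result labels are hypothetical, chosen for illustration, and Python's @code{re} module stands in for the extended regular expression matcher that @code{m4} uses.

```python
import re

# The two extended regular expressions quoted in the syncoutput description.
ON = re.compile(r"^[1yY]|^([oO][nN])")
OFF = re.compile(r"^[0nN]|^([oO][fF])")


def classify_truth(truth):
    """Classify a syncoutput argument as 'on', 'off', or 'ignored'."""
    if ON.search(truth):
        return "on"
    if truth == "" or OFF.search(truth):
        # An empty argument also turns sync lines off.
        return "off"
    # Anything else is ignored; m4 issues a warning in this case.
    return "ignored"


if __name__ == "__main__":
    for arg in ("on", "yes", "1", "", "off", "no", "0", "blah"):
        print(repr(arg), "->", classify_truth(arg))
```

Under these patterns @samp{yes} and @samp{1} enable sync lines just as @samp{on} does, @samp{no} and @samp{0} disable them, and @samp{blah} falls through to the warning case shown at the end of the example above.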
- -Synchronization lines can be used to track where input comes from; an -optional file designation is printed when the syncline algorithm -detects that consecutive output lines come from different files. You -can also use the @option{--synclines} command-line option (or -@option{-s}, @pxref{Preprocessor features, , Invoking m4}) to start -with synchronization on. This example reuses the file @file{incl.m4} -mentioned earlier (@pxref{Include}): - -@comment examples -@comment options: -s -@example -$ @kbd{m4 --synclines -I doc/examples} -include(`incl.m4') -@result{}#line 1 "doc/examples/incl.m4" -@result{}Include file start -@result{}foo -@result{}Include file end -@result{}#line 1 "stdin" -@result{} -@end example - -@node Frozen files -@chapter Fast loading of frozen state - -Some bigger @code{m4} applications may be built over a common base -containing hundreds of definitions and other costly initializations. -Usually, the common base is kept in one or more declarative files, -which files are listed on each @code{m4} invocation prior to the -user's input file, or else each input file uses @code{include}. - -Reading the common base of a big application, over and over again, may -be time consuming. GNU @code{m4} offers some machinery to -speed up the start of an application using lengthy common bases. 
- -@menu -* Using frozen files:: Using frozen files -* Frozen file format 1:: Frozen file format 1 -* Frozen file format 2:: Frozen file format 2 -@end menu - -@node Using frozen files -@section Using frozen files - -@cindex fast loading of frozen files -@cindex frozen files for fast loading -@cindex initialization, frozen state -@cindex dumping into frozen file -@cindex reloading a frozen file -@cindex GNU extensions -Suppose a user has a library of @code{m4} initializations in -@file{base.m4}, which is then used with multiple input files: - -@comment ignore -@example -$ @kbd{m4 base.m4 input1.m4} -$ @kbd{m4 base.m4 input2.m4} -$ @kbd{m4 base.m4 input3.m4} -@end example - -Rather than spending time parsing the fixed contents of @file{base.m4} -every time, the user might rather execute: - -@comment ignore -@example -$ @kbd{m4 -F base.m4f base.m4} -@end example - -@noindent -once, and further execute, as often as needed: - -@comment ignore -@example -$ @kbd{m4 -R base.m4f input1.m4} -$ @kbd{m4 -R base.m4f input2.m4} -$ @kbd{m4 -R base.m4f input3.m4} -@end example - -@noindent -with the varying input. The first call, containing the @option{-F} -option, only reads and executes file @file{base.m4}, defining -various application macros and computing other initializations. -Once the input file @file{base.m4} has been completely processed, GNU -@code{m4} produces in @file{base.m4f} a @dfn{frozen} file, that is, a -file which contains a kind of snapshot of the @code{m4} internal state. - -Later calls, containing the @option{-R} option, are able to reload -the internal state of @code{m4}, from @file{base.m4f}, -@emph{prior} to reading any other input files. This means -instead of starting with a virgin copy of @code{m4}, input will be -read after having effectively recovered the effect of a prior run. -In our example, the effect is the same as if file @file{base.m4} has -been read anew. However, this effect is achieved a lot faster. 
- -Only one frozen file may be created or read in any one @code{m4} -invocation. It is not possible to recover two frozen files at once. -However, frozen files may be updated incrementally, through using -@option{-R} and @option{-F} options simultaneously. For example, if -some care is taken, the command: - -@comment ignore -@example -$ @kbd{m4 file1.m4 file2.m4 file3.m4 file4.m4} -@end example - -@noindent -could be broken down in the following sequence, accumulating the same -output: - -@comment ignore -@example -$ @kbd{m4 -F file1.m4f file1.m4} -$ @kbd{m4 -R file1.m4f -F file2.m4f file2.m4} -$ @kbd{m4 -R file2.m4f -F file3.m4f file3.m4} -$ @kbd{m4 -R file3.m4f file4.m4} -@end example - -Some care is necessary because the frozen file does not save all state -information. Stacks of macro definitions via @code{pushdef} are -accurately stored, along with all renamed or undefined builtins, as are -the current syntax rules such as from @code{changequote}. However, the -value of @code{sysval} and text saved in @code{m4wrap} are not currently -preserved. Also, changing command line options between runs may cause -unexpected behavior. A future release of GNU M4 may improve -on the quality of frozen files. - -When an @code{m4} run is to be frozen, the automatic undiversion -which takes place at end of execution is inhibited. Instead, all -positively numbered diversions are saved into the frozen file. -The active diversion number is also transmitted. - -A frozen file to be reloaded need not reside in the current directory. -It is looked up the same way as an @code{include} file (@pxref{Search -Path}). - -If the frozen file was generated with a newer version of @code{m4}, and -contains directives that an older @code{m4} cannot parse, attempting to -load the frozen file with option @option{-R} will cause @code{m4} to -exit with status 63 to indicate version mismatch. 
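The incremental sequence shown earlier in this section follows a fixed pattern: every step but the first reloads the frozen file written by the previous step, and every step but the last writes a new one. The Python sketch below only builds the command lines and does not run @code{m4}; the function name @code{incremental_commands} is hypothetical, and deriving each frozen file name by swapping the suffix for @file{.m4f} is an assumption based on the naming used in the example.

```python
from pathlib import Path


def incremental_commands(files):
    """Build one m4 command per input file, chaining frozen state."""
    commands = []
    previous = None  # frozen file written by the prior step, if any
    for index, name in enumerate(files):
        argv = ["m4"]
        if previous is not None:
            # Reload the state frozen by the prior step.
            argv += ["-R", previous]
        if index < len(files) - 1:
            # Freeze the state for the next step.
            previous = str(Path(name).with_suffix(".m4f"))
            argv += ["-F", previous]
        argv.append(name)
        commands.append(argv)
    return commands


if __name__ == "__main__":
    inputs = ["file1.m4", "file2.m4", "file3.m4", "file4.m4"]
    for argv in incremental_commands(inputs):
        print(" ".join(argv))
```

For the four files of the example this reproduces the four commands listed above. The caveats stated in this section still apply: state that a frozen file does not preserve, such as @code{sysval} or text saved with @code{m4wrap}, is not carried from one step to the next.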
- -@node Frozen file format 1 -@section Frozen file format 1 - -@cindex frozen file format 1 -@cindex file format, frozen file version 1 -Frozen files are sharable across architectures. It is safe to write -a frozen file on one machine and read it on another, given that the -second machine uses the same or newer version of GNU @code{m4}. -It is conventional, but not required, to give a frozen file the suffix -of @code{.m4f}. - -Older versions of GNU @code{m4} create frozen files with -syntax version 1. These files can be read by the current version, but -are no longer produced. Version 1 files are mostly text files, although -any macros or diversions that contained nonprintable characters or long -lines cause the resulting frozen file to do likewise, since there are no -escape sequences. The file can be edited to change the state that -@code{m4} will start with. It is composed of several directives, each -starting with a single letter and ending with a newline (@key{NL}). -Wherever a directive is expected, the character @samp{#} can be used -instead to introduce a comment line; empty lines are also ignored if -they are not part of an embedded string. - -In the following descriptions, each @var{len} refers to the length of a -corresponding subsequent @var{str}. Numbers are always expressed in -decimal, and an omitted number defaults to 0. The valid directives in -version 1 are: - -@table @code -@item V @var{number} @key{NL} -Confirms the format of the file. Version 1 is recognized when -@var{number} is 1. This directive must be the first non-comment in the -file, and may not appear more than once. - -@item C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL} -Uses @var{str1} and @var{str2} as the begin-comment and -end-comment strings. If omitted, then @samp{#} and @key{NL} are the -comment delimiters. 
- -@item D @var{number}, @var{len} @key{NL} @var{str} @key{NL} -Selects diversion @var{number}, making it current, then copies @var{str} -into the current diversion. @var{number} may be a negative number for a -diversion that discards text. To merely specify an active selection, -use this command with an empty @var{str}. With 0 as the diversion -@var{number}, @var{str} will be issued on standard output at reload -time. GNU @code{m4} will not produce the @samp{D} directive -with non-zero length for diversion 0, but this can be done with manual -edits. This directive may appear more than once for the same diversion, -in which case the diversion is the concatenation of the various uses. -If omitted, then diversion 0 is current. - -@item F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL} -Defines, through @code{pushdef}, a definition for @var{str1} expanding -to the function whose builtin name is @var{str2}. If the builtin does -not exist (for example, if the frozen file was produced by a copy of -@code{m4} compiled with the now-abandoned @code{changeword} support), -the reload is silent, but any subsequent use of the definition of -@var{str1} will result in a warning. This directive may appear more -than once for the same name, and its order, along with @samp{T}, is -important. If omitted, you will have no access to any builtins. - -@item Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL} -Uses @var{str1} and @var{str2} as the begin-quote and end-quote -strings. If omitted, then @samp{`} and @samp{'} are the quote -delimiters. - -@item T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL} -Defines, through @code{pushdef}, a definition for @var{str1} -expanding to the text given by @var{str2}. This directive may appear -more than once for the same name, and its order, along with @samp{F}, is -important. 
-@end table - -When loading format 1, the syntax categories @samp{@{} and @samp{@}} are -disabled (reverting braces to be treated like plain characters). This -is because frozen files created with M4 1.4.x did not understand -@samp{$@{@dots{}@}} extended argument notation, and a frozen macro that -contained this character sequence should not behave differently just -because a newer version of M4 reloaded the file. - -@node Frozen file format 2 -@section Frozen file format 2 - -@cindex frozen file format 2 -@cindex file format, frozen file version 2 -The syntax of version 1 has some drawbacks; if any macro or diversion -contained non-printable characters or long lines, the resulting frozen -file would not qualify as a text file, making it harder to edit with -some vendor tools. The concatenation of multiple strings on a single -line, such as for the @samp{T} directive, makes distinguishing the two -strings a bit more difficult. Finally, the format lacks support for -several items of @code{m4} state, such that a reloaded file did not -always behave the same as the original file. - -These shortcomings have been addressed in version 2 of the frozen file -syntax. New directives have been added, and existing directives have -additional, and sometimes optional, parameters. All @var{str} instances -in the grammar are now followed by @key{NL}, which makes the split -between consecutive strings easier to recognize. Strings may now -contain escape sequences modeled after C, such as @samp{\n} for newline -or @samp{\0} for @sc{nul}, so that the frozen file can be pure -@sc{ascii} (although when hand-editing a frozen file, it is still -acceptable to use the original byte rather than an escape sequence for -all bytes except @samp{\}). Also in the context of a @var{str}, the -escape sequence @samp{\@key{NL}} is discarded, allowing a user to split -lines that are too long for some platform tools. - -@table @code -@item V @var{number} @key{NL} -Confirms the format of the file. 
@code{m4} @value{VERSION} only creates -frozen files where @var{number} is 2. This directive must be the first -non-comment in the file, and may not appear more than once. - -@item C @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} -Uses @var{str1} and @var{str2} as the begin-comment and -end-comment strings. If omitted, then @samp{#} and @key{NL} are the -comment delimiters. - -@item d @var{len} @key{NL} @var{str} @key{NL} -Sets the debug flags, using @var{str} as the argument to -@code{debugmode}. If omitted, then the debug flags start in their -default disabled state. - -@item D @var{number} , @var{len} @key{NL} @var{str} @key{NL} -Selects diversion @var{number}, making it current, then copies @var{str} -into the current diversion. @var{number} may be a negative number for a -diversion that discards text. To merely specify an active selection, -use this command with an empty @var{str}. With 0 as the diversion -@var{number}, @var{str} will be issued on standard output at reload -time. GNU @code{m4} will not produce the @samp{D} directive -with non-zero length for diversion 0, but this can be done with manual -edits. This directive may appear more than once for the same diversion, -in which case the diversion is the concatenation of the various uses. -If omitted, then diversion 0 is current. - -@comment FIXME - the first usage, with only one string, is not supported -@comment in the current code -@c @item F @var{len1} @key{NL} @var{str1} @key{NL} -@item F @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} -@itemx F @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL} -Defines, through @code{pushdef}, a definition for @var{str1} expanding -to the function whose builtin name is given by @var{str2} (defaulting to -@var{str1} if not present). 
With two arguments, the builtin name is -searched for among the intrinsic builtin functions only; with three -arguments, the builtin name is searched for among the builtin -functions defined by the module named by @var{str3}. - -@item M @var{len} @key{NL} @var{str} @key{NL} -Names a module which will be searched for according to the module search -path and loaded. Modules loaded from a frozen file don't add their -builtin entries to the symbol table. Modules must be loaded prior to -specifying module-specific builtins via the three-argument @code{F} or -@code{T}. - -@item Q @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} -Uses @var{str1} and @var{str2} as the begin-quote and end-quote strings. -If omitted, then @samp{`} and @samp{'} are the quote delimiters. - -@item R @var{len} @key{NL} @var{str} @key{NL} -Sets the default regexp syntax, where @var{str} encodes one of the -regular expression syntaxes supported by GNU M4. -@xref{Changeresyntax}, for more details. - -@item S @var{syntax-code} @var{len} @key{NL} @var{str} @key{NL} -Defines, through @code{changesyntax}, a syntax category for each of the -characters in @var{str}. The @var{syntax-code} must be one of the -characters described in @ref{Changesyntax}. - -@item t @var{len} @key{NL} @var{str} @key{NL} -Enables tracing for any macro named @var{str}, similar to using the -@code{traceon} builtin. This directive may occur more than once for -multiple macros; if omitted, no macro starts out as traced. - -@item T @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} -@itemx T @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1} @key{NL} @var{str2} @key{NL} @var{str3} @key{NL} -Defines, through @code{pushdef}, a definition for @var{str1} expanding to -the text given by @var{str2}. This directive may appear more than once -for the same name, and its order, along with @samp{F}, is important. 
If -present, the optional third argument associates the macro with a module -named by @var{str3}. -@end table - -@node Compatibility -@chapter Compatibility with other versions of @code{m4} - -@cindex compatibility -This chapter describes many of the differences between this -implementation of @code{m4} and other implementations found under -UNIX, such as System V Release 4, Solaris, and BSD flavors. -In particular, it lists the known differences and extensions to -POSIX. However, the list is not necessarily comprehensive. - -At the time of this writing, POSIX 2001 (also known as IEEE -Std 1003.1-2001) is the latest standard, although a new version of -POSIX is under development and includes several proposals for -modifying what @code{m4} is required to do. The requirements for -@code{m4} are shared between SUSv3 and POSIX, and -can be viewed at -@uref{http://www.opengroup.org/onlinepubs/@/000095399/@/utilities/@/m4.html}. - -@menu -* Extensions:: Extensions in GNU M4 -* Incompatibilities:: Other incompatibilities -* Experiments:: Experimental features in GNU M4 -@end menu - -@node Extensions -@section Extensions in GNU M4 - -@cindex GNU extensions -@cindex POSIX -@cindex @env{POSIXLY_CORRECT} -This version of @code{m4} contains a few facilities that do not exist -in System V @code{m4}. These extra facilities are all suppressed by -using the @option{-G} command line option, unless overridden by other -command line options. -Most of these extensions are compatible with -@uref{http://www.unix.org/single_unix_specification/, -POSIX}; the few exceptions are suppressed if the -@env{POSIXLY_CORRECT} environment variable is set. - -@itemize @bullet -@item -In the @code{$@var{n}} notation for macro arguments, @var{n} can contain -several digits, while the System V @code{m4} only accepts one digit. -This allows macros in GNU @code{m4} to take any number of -arguments, and not only nine (@pxref{Arguments}). 
-POSIX does not allow this extension, so it is disabled if -@env{POSIXLY_CORRECT} is set. -@c FIXME - update this bullet when ${11} is implemented. - -@item -The @code{divert} (@pxref{Divert}) macro can manage more than 9 -diversions. GNU @code{m4} treats all positive numbers as valid -diversions, rather than discarding diversions greater than 9. - -@item -Files included with @code{include} and @code{sinclude} are sought in a -user specified search path, if they are not found in the working -directory. The search path is specified by the @option{-I} option and the -@samp{M4PATH} environment variable (@pxref{Search Path}). - -@item -Arguments to @code{undivert} can be non-numeric, in which case the named -file will be included uninterpreted in the output (@pxref{Undivert}). - -@item -Formatted output is supported through the @code{format} builtin, which -is modeled after the C library function @code{printf} (@pxref{Format}). - -@item -Searches and text substitution through regular expressions are supported -by the @code{regexp} (@pxref{Regexp}) and @code{patsubst} -(@pxref{Patsubst}) builtins. - -The syntax of regular expressions in M4 has never been clearly -formalized. While OpenBSD M4 uses extended regular -expressions for @code{regexp} and @code{patsubst}, GNU M4 -defaults to basic regular expressions, but provides -@code{changeresyntax} (@pxref{Changeresyntax}) to change the flavor of -regular expression syntax in use. - -@item -The output of shell commands can be read into @code{m4} with -@code{esyscmd} (@pxref{Esyscmd}). - -@item -There is indirect access to any builtin macro with @code{builtin} -(@pxref{Builtin}). - -@item -Macros can be called indirectly through @code{indir} (@pxref{Indir}). - -@item -The name of the program, the current input file, and the current input -line number are accessible through the builtins @code{@w{__program__}}, -@code{@w{__file__}}, and @code{@w{__line__}} (@pxref{Location}). 
- -@item -The generation of sync lines can be controlled through @code{syncoutput} -(@pxref{Syncoutput}). - -@item -The format of the output from @code{dumpdef} and macro tracing can be -controlled with @code{debugmode} (@pxref{Debugmode}). - -@item -The destination of trace and debug output can be controlled with -@code{debugfile} (@pxref{Debugfile}). - -@item -The @code{maketemp} (@pxref{Mkstemp}) macro behaves like @code{mkstemp}, -creating a new file with a unique name on every invocation, rather than -following the insecure behavior of replacing the trailing @samp{X} -characters with the @code{m4} process id. POSIX does not -allow this extension, so @code{maketemp} is insecure if -@env{POSIXLY_CORRECT} is set, but you should be using @code{mkstemp} in -the first place. - -@item -POSIX only requires support for the command line options -@option{-s}, @option{-D}, and @option{-U}, so all other options accepted -by GNU M4 are extensions. @xref{Invoking m4}, for a -description of these options. - -@item -The debugging and tracing facilities in GNU @code{m4} are much -more extensive than in most other versions of @code{m4}. - -@item -Some traditional implementations only allow reading standard input -once, but GNU @code{m4} correctly handles multiple instances -of @samp{-} on the command line. - -@item -POSIX requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO -(first-in, first-out) order, and most other implementations obey this. -However, versions of GNU @code{m4} earlier than 1.6 used -LIFO order. Furthermore, POSIX states that only the first -argument to @code{m4wrap} is saved for later evaluation, but -GNU @code{m4} saves and processes all arguments, with output -separated by spaces. - -@item -POSIX states that builtins that require arguments, but are -called without arguments, have undefined behavior. Traditional -implementations simply behave as though empty strings had been passed. -For example, @code{a`'define`'b} would expand to @code{ab}. 
But -GNU @code{m4} ignores certain builtins if they have missing -arguments, giving @code{adefineb} for the above example. -@end itemize - -@node Incompatibilities -@section Other incompatibilities - -There are a few other incompatibilities between this implementation of -@code{m4}, and what POSIX requires, or what the System V -version implemented. - -@itemize @bullet -@item -Traditional implementations handle @code{define(`f',`1')} (@pxref{Define}) -by undefining the entire stack of previous definitions, as if doing -@code{undefine(`f')} first. GNU @code{m4} replaces just the top -definition on the stack, as if doing @code{popdef(`f')} followed by -@code{pushdef(`f',`1')}. POSIX allows either behavior. - -@item -At one point, POSIX required @code{changequote(@var{arg})} -(@pxref{Changequote}) to use newline as the close quote, but this was a -bug, and the next version of POSIX is anticipated to state -that using empty strings or just one argument is unspecified. -Meanwhile, the GNU @code{m4} behavior of treating an empty -end-quote delimiter as @samp{'} is not portable, as Solaris treats it as -repeating the start-quote delimiter, and BSD treats it as leaving the -previous end-quote delimiter unchanged. For predictable results, never -call changequote with just one argument, or with empty strings for -arguments. - -@item -At one point, POSIX required @code{changecom(@var{arg},)} -(@pxref{Changecom}) to make it impossible to end a comment, but this is -a bug, and the next version of POSIX is anticipated to state -that using empty strings is unspecified. Meanwhile, the GNU -@code{m4} behavior of treating an empty end-comment delimiter as newline -is not portable, as BSD treats it as leaving the previous end-comment -delimiter unchanged. It is also impossible in BSD implementations to -disable comments, even though that is required by POSIX. For -predictable results, never call changecom with empty strings for -arguments. 
- -@item -Traditional implementations allow argument collection, but not string -and comment processing, to span file boundaries. Thus, if @file{a.m4} -contains @samp{len(}, and @file{b.m4} contains @samp{abc)}, -@kbd{m4 a.m4 b.m4} outputs @samp{3} with traditional @code{m4}, but -gives an error message that the end of file was encountered inside a -macro with GNU @code{m4}. On the other hand, traditional -implementations do end of file processing for files included with -@code{include} or @code{sinclude} (@pxref{Include}), while GNU -@code{m4} seamlessly integrates the content of those files. Thus -@code{include(`a.m4')include(`b.m4')} will output @samp{3} instead of -giving an error. - -@item -POSIX requires @code{eval} (@pxref{Eval}) to treat all -operators with the same precedence as C@. However, earlier versions of -GNU @code{m4} followed the traditional behavior of other -@code{m4} implementations, where bitwise and logical negation (@samp{~} -and @samp{!}) have lower precedence than equality operators; and where -equality operators (@samp{==} and @samp{!=}) had the same precedence as -relational operators (such as @samp{<}). Use explicit parentheses to -ensure proper precedence. As extensions to POSIX, -GNU @code{m4} gives well-defined semantics to operations that -C leaves undefined, such as when overflow occurs, when shifting negative -numbers, or when performing division by zero. POSIX also -requires @samp{=} to cause an error, but many traditional -implementations allowed it as an alias for @samp{==}. - -@item -POSIX 2001 requires @code{translit} (@pxref{Translit}) to -treat each character of the second and third arguments literally. -However, it is anticipated that the next version of POSIX will -allow the GNU @code{m4} behavior of treating @samp{-} as a -range operator. 
- -@item -POSIX requires @code{m4} to honor the locale environment -variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE}, -@env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been -implemented in GNU @code{m4}. - -@item -GNU @code{m4} implements sync lines differently from System V -@code{m4}, when text is being diverted. GNU @code{m4} outputs -the sync lines when the text is being diverted, and System V @code{m4} -when the diverted text is being brought back. - -The problem is which lines and file names should be attached to text -that is being, or has been, diverted. System V @code{m4} regards all -the diverted text as being generated by the source line containing the -@code{undivert} call, whereas GNU @code{m4} regards the -diverted text as being generated at the time it is diverted. - -The sync line option is used mostly when using @code{m4} as -a front end to a compiler. If a diverted line causes a compiler error, -the error messages should most probably refer to the place where the -diversion was made, and not where it was inserted again. - -@comment options: -s -@example -divert(2)2 -divert(1)1 -divert`'0 -@result{}#line 3 "stdin" -@result{}0 -^D -@result{}#line 2 "stdin" -@result{}1 -@result{}#line 1 "stdin" -@result{}2 -@end example - -@comment FIXME - this needs to be fixed before 2.0. -The current @code{m4} implementation has a limitation that the syncline -output at the start of each diversion occurs no matter what, even if the -previous diversion did not end with a newline. This goes contrary to -the claim that synclines appear on a line by themselves, so this -limitation may be corrected in a future version of @code{m4}. In the -meantime, when using @option{-s}, it is wisest to make sure all -diversions end with newline. 
- -@item -GNU @code{m4} makes no attempt at prohibiting self-referential -definitions like: - -@comment ignore -@example -define(`x', `x') -@result{} -define(`x', `x ') -@result{} -@end example - -@cindex rescanning -There is nothing inherently wrong with defining @samp{x} to -return @samp{x}. The wrong thing is to expand @samp{x} unquoted, -because that would cause an infinite rescan loop. -In @code{m4}, one might use macros to hold strings, as we do for -variables in other programming languages, further checking them with: - -@comment ignore -@example -ifelse(defn(`@var{holder}'), `@var{value}', @dots{}) -@end example - -@noindent -In cases like this one, an interdiction for a macro to hold its own name -would be a useless limitation. Of course, this leaves more rope for the -GNU @code{m4} user to hang himself! Rescanning hangs may be -avoided through careful programming, a little like for endless loops in -traditional programming languages. - -@item -POSIX states that only unquoted leading newlines and blanks -(that is, space and tab) are ignored when collecting macro arguments. -However, this appears to be a bug in POSIX, since most -traditional implementations also ignore all whitespace (formfeed, -carriage return, and vertical tab). GNU @code{m4} follows -tradition and ignores all leading unquoted whitespace. -@end itemize - -@node Experiments -@section Experimental features in GNU M4 - -Certain features of GNU @code{m4} are experimental. - -Some are only available if activated by an option given to -@file{m4-@value{VERSION}/@/configure} at GNU @code{m4} installation -time. The functionality -might change or even go away in the future. @emph{Do not rely on it}. -Please direct your comments about it the same way you would do for bugs. - -@section Changesyntax - -An experimental feature, which improves the flexibility of @code{m4}, -allows for changing the way the input is parsed (@pxref{Changesyntax}). -No compile time option is needed for @code{changesyntax}. 
The -implementation is careful to not slow down @code{m4} parsing, unlike the -withdrawn experiment of @code{changeword} that appeared earlier in M4 -1.4.x. - -@section Multiple precision arithmetic - -Another experimental feature, which would improve @code{m4} usefulness, -allows for multiple precision rational arithmetic similar to -@code{eval}. You must have the GNU multi-precision (gmp) -library installed, and should use @kbd{./configure --with-gmp} if you -want this feature compiled in. The current implementation is unproven -and might go away. Do not count on it yet. - -@node Answers -@chapter Correct version of some examples - -Some of the examples in this manual are buggy or not very robust, for -demonstration purposes. Improved versions of these composite macros are -presented here. - -@menu -* Improved exch:: Solution for @code{exch} -* Improved forloop:: Solution for @code{forloop} -* Improved foreach:: Solution for @code{foreach} -* Improved copy:: Solution for @code{copy} -* Improved m4wrap:: Solution for @code{m4wrap} -* Improved cleardivert:: Solution for @code{cleardivert} -* Improved capitalize:: Solution for @code{capitalize} -* Improved fatal_error:: Solution for @code{fatal_error} -@end menu - -@node Improved exch -@section Solution for @code{exch} - -The @code{exch} macro (@pxref{Arguments}) as presented requires clients -to double quote their arguments. A nicer definition, which lets -clients follow the rule of thumb of one level of quoting per level of -parentheses, involves adding quotes in the definition of @code{exch}, as -follows: - -@example -define(`exch', ``$2', `$1'') -@result{} -define(exch(`expansion text', `macro')) -@result{} -macro -@result{}expansion text -@end example - -@node Improved forloop -@section Solution for @code{forloop} - -The @code{forloop} macro (@pxref{Forloop}) as presented earlier can go -into an infinite loop if given an iterator that is not parsed as a macro -name.
It does not do any sanity checking on its numeric bounds, and -only permits decimal numbers for bounds. Here is an improved version, -shipped as @file{m4-@value{VERSION}/@/doc/examples/@/forloop2.m4}; this -version also optimizes overhead by calling four macros instead of six -per iteration (excluding those in @var{text}), by not dereferencing the -@var{iterator} in the helper @code{@w{_forloop}}. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -undivert(`forloop2.m4')dnl -@result{}divert(`-1') -@result{}# forloop(var, from, to, stmt) - improved version: -@result{}# works even if VAR is not a strict macro name -@result{}# performs sanity check that FROM is larger than TO -@result{}# allows complex numerical expressions in TO and FROM -@result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1', -@result{} `pushdef(`$1')_$0(`$1', eval(`$2'), -@result{} eval(`$3'), `$4')popdef(`$1')')') -@result{}define(`_forloop', -@result{} `define(`$1', `$2')$4`'ifelse(`$2', `$3', `', -@result{} `$0(`$1', incr(`$2'), `$3', `$4')')') -@result{}divert`'dnl -include(`forloop2.m4') -@result{} -forloop(`i', `2', `1', `no iteration occurs') -@result{} -forloop(`', `1', `2', ` odd iterator name') -@result{} odd iterator name odd iterator name -forloop(`i', `5 + 5', `0xc', ` 0x`'eval(i, `16')') -@result{} 0xa 0xb 0xc -forloop(`i', `a', `b', `non-numeric bounds') -@error{}m4:stdin:6: warning: eval: bad input: '(a) <= (b)' -@result{} -@end example - -One other change to notice is that the improved version used @samp{_$0} -rather than @samp{_forloop} to invoke the helper routine. In general, -this is a good practice to follow, because then the set of macros can be -uniformly transformed. The following example shows a transformation -that doubles the current quoting and appends a suffix @samp{2} to each -transformed macro.
If @code{forloop} refers to the literal -@samp{_forloop}, then @code{forloop2} invokes @code{_forloop} instead of -the intended @code{_forloop2}, and the mixing of quoting paradigms leads -to an infinite recursion loop in this example. - -@comment options: -L9 -@comment status: 1 -@comment examples -@example -$ @kbd{m4 -d -L 9 -I doc/examples} -define(`arg1', `$1')include(`forloop2.m4')include(`quote.m4') -@result{} -define(`double', `define(`$1'`2', - arg1(patsubst(dquote(defn(`$1')), `[`']', `\&\&')))') -@result{} -double(`forloop')double(`_forloop')defn(`forloop2') -@result{}ifelse(eval(``($2) <= ($3)''), ``1'', -@result{} ``pushdef(``$1'')_$0(``$1'', eval(``$2''), -@result{} eval(``$3''), ``$4'')popdef(``$1'')'') -forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)') -@result{} -changequote(`[', `]')changequote([``], ['']) -@result{} -forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'') -@result{} -changequote`'include(`forloop.m4') -@result{} -double(`forloop')double(`_forloop')defn(`forloop2') -@result{}pushdef(``$1'', ``$2'')_forloop($@@)popdef(``$1'') -forloop(i, 1, 5, `ifelse(')forloop(i, 1, 5, `)') -@result{} -changequote(`[', `]')changequote([``], ['']) -@result{} -forloop2(i, 1, 5, ``ifelse('')forloop2(i, 1, 5, ``)'') -@error{}m4:stdin:12: recursion limit of 9 exceeded, use -L<N> to change it -@end example - -One more optimization is still possible. Instead of repeatedly -assigning a variable then invoking or dereferencing it, it is possible -to pass the current iterator value as a single argument. Coupled with -@code{curry} if other arguments are needed (@pxref{Composition}), or -with helper macros if the argument is needed in more than one place in -the expansion, the output can be generated with three, rather than four, -macros of overhead per iteration. Notice how the file -@file{m4-@value{VERSION}/@/doc/examples/@/forloop3.m4} rearranges the -arguments of the helper @code{_forloop} to take two arguments that are -placed around the current value.
By splitting a balanced set of -parentheses across multiple arguments, the helper macro can now be -shared by @code{forloop} and the new @code{forloop_arg}. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`forloop3.m4') -@result{} -undivert(`forloop3.m4')dnl -@result{}divert(`-1') -@result{}# forloop_arg(from, to, macro) - invoke MACRO(value) for -@result{}# each value between FROM and TO, without define overhead -@result{}define(`forloop_arg', `ifelse(eval(`($1) <= ($2)'), `1', -@result{} `_forloop(`$1', eval(`$2'), `$3(', `)')')') -@result{}# forloop(var, from, to, stmt) - refactored to share code -@result{}define(`forloop', `ifelse(eval(`($2) <= ($3)'), `1', -@result{} `pushdef(`$1')_forloop(eval(`$2'), eval(`$3'), -@result{} `define(`$1',', `)$4')popdef(`$1')')') -@result{}define(`_forloop', -@result{} `$3`$1'$4`'ifelse(`$1', `$2', `', -@result{} `$0(incr(`$1'), `$2', `$3', `$4')')') -@result{}divert`'dnl -forloop(`i', `1', `3', ` i') -@result{} 1 2 3 -define(`echo', `$@@') -@result{} -forloop_arg(`1', `3', ` echo') -@result{} 1 2 3 -include(`curry.m4') -@result{} -forloop_arg(`1', `3', `curry(`pushdef', `a')') -@result{} -a -@result{}3 -popdef(`a')a -@result{}2 -popdef(`a')a -@result{}1 -popdef(`a')a -@result{}a -@end example - -Of course, it is possible to make even more improvements, such as -adding an optional step argument, or allowing iteration through -descending sequences. GNU Autoconf provides some of these -additional bells and whistles in its @code{m4_for} macro. - -@node Improved foreach -@section Solution for @code{foreach} - -The @code{foreach} and @code{foreachq} macros (@pxref{Foreach}) as -presented earlier each have flaws.
First, we will examine and fix the -quadratic behavior of @code{foreachq}: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`foreachq.m4') -@result{} -traceon(`shift')debugmode(`aq') -@result{} -foreachq(`x', ``1', `2', `3', `4'', `x -')dnl -@result{}1 -@error{}m4trace: -3- shift(`1', `2', `3', `4') -@error{}m4trace: -2- shift(`1', `2', `3', `4') -@result{}2 -@error{}m4trace: -4- shift(`1', `2', `3', `4') -@error{}m4trace: -3- shift(`2', `3', `4') -@error{}m4trace: -3- shift(`1', `2', `3', `4') -@error{}m4trace: -2- shift(`2', `3', `4') -@result{}3 -@error{}m4trace: -5- shift(`1', `2', `3', `4') -@error{}m4trace: -4- shift(`2', `3', `4') -@error{}m4trace: -3- shift(`3', `4') -@error{}m4trace: -4- shift(`1', `2', `3', `4') -@error{}m4trace: -3- shift(`2', `3', `4') -@error{}m4trace: -2- shift(`3', `4') -@result{}4 -@error{}m4trace: -6- shift(`1', `2', `3', `4') -@error{}m4trace: -5- shift(`2', `3', `4') -@error{}m4trace: -4- shift(`3', `4') -@error{}m4trace: -3- shift(`4') -@end example - -@cindex quadratic behavior, avoiding -@cindex avoiding quadratic behavior -Each successive iteration was adding more quoted @code{shift} -invocations, and the entire list contents were passing through every -iteration. In general, when recursing, it is a good idea to make the -recursion use fewer arguments, rather than adding additional quoted -uses of @code{shift}. By doing so, @code{m4} uses less memory, invokes -fewer macros, is less likely to run into machine limits, and most -importantly, performs faster. 
The fixed version of @code{foreachq} can -be found in @file{m4-@value{VERSION}/@/doc/examples/@/foreachq2.m4}: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`foreachq2.m4') -@result{} -undivert(`foreachq2.m4')dnl -@result{}include(`quote.m4')dnl -@result{}divert(`-1') -@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt) -@result{}# quoted list, improved version -@result{}define(`foreachq', `pushdef(`$1')_$0($@@)popdef(`$1')') -@result{}define(`_arg1q', ``$1'') -@result{}define(`_rest', `ifelse(`$#', `1', `', `dquote(shift($@@))')') -@result{}define(`_foreachq', `ifelse(`$2', `', `', -@result{} `define(`$1', _arg1q($2))$3`'$0(`$1', _rest($2), `$3')')') -@result{}divert`'dnl -traceon(`shift')debugmode(`aq') -@result{} -foreachq(`x', ``1', `2', `3', `4'', `x -')dnl -@result{}1 -@error{}m4trace: -3- shift(`1', `2', `3', `4') -@result{}2 -@error{}m4trace: -3- shift(`2', `3', `4') -@result{}3 -@error{}m4trace: -3- shift(`3', `4') -@result{}4 -@end example - -Note that the fixed version calls unquoted helper macros in -@code{@w{_foreachq}} to trim elements immediately; those helper macros -in turn must re-supply the layer of quotes lost in the macro invocation. -Contrast the use of @code{@w{_arg1q}}, which quotes the first list -element, with @code{@w{_arg1}} of the earlier implementation that -returned the first list element directly. Additionally, by calling the -helper method immediately, the @samp{defn(`@var{iterator}')} no longer -contains unexpanded macros. - -The astute m4 programmer might notice that the solution above still uses -more macro invocations than strictly necessary. Note that @samp{$2}, -which contains an arbitrarily long quoted list, is expanded and -rescanned three times per iteration of @code{_foreachq}. Furthermore, -every iteration of the algorithm effectively unboxes then reboxes the -list, which costs a couple of macro invocations. 
It is possible to -rewrite the algorithm by swapping the order of the arguments to -@code{_foreachq} in order to operate on an unboxed list in the first -place, and by using the fixed-length @samp{$#} instead of an arbitrary -length list as the key to end recursion. The result is an overhead of -six macro invocations per loop (excluding any macros in @var{text}), -instead of eight. This alternative approach is available as -@file{m4-@value{VERSION}/@/doc/examples/@/foreachq3.m4}: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`foreachq3.m4') -@result{} -undivert(`foreachq3.m4')dnl -@result{}divert(`-1') -@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt) -@result{}# quoted list, alternate improved version -@result{}define(`foreachq', `ifelse(`$2', `', `', -@result{} `pushdef(`$1')_$0(`$1', `$3', `', $2)popdef(`$1')')') -@result{}define(`_foreachq', `ifelse(`$#', `3', `', -@result{} `define(`$1', `$4')$2`'$0(`$1', `$2', -@result{} shift(shift(shift($@@))))')') -@result{}divert`'dnl -traceon(`shift')debugmode(`aq') -@result{} -foreachq(`x', ``1', `2', `3', `4'', `x -')dnl -@result{}1 -@error{}m4trace: -4- shift(`x', `x -@error{}', `', `1', `2', `3', `4') -@error{}m4trace: -3- shift(`x -@error{}', `', `1', `2', `3', `4') -@error{}m4trace: -2- shift(`', `1', `2', `3', `4') -@result{}2 -@error{}m4trace: -4- shift(`x', `x -@error{}', `1', `2', `3', `4') -@error{}m4trace: -3- shift(`x -@error{}', `1', `2', `3', `4') -@error{}m4trace: -2- shift(`1', `2', `3', `4') -@result{}3 -@error{}m4trace: -4- shift(`x', `x -@error{}', `2', `3', `4') -@error{}m4trace: -3- shift(`x -@error{}', `2', `3', `4') -@error{}m4trace: -2- shift(`2', `3', `4') -@result{}4 -@error{}m4trace: -4- shift(`x', `x -@error{}', `3', `4') -@error{}m4trace: -3- shift(`x -@error{}', `3', `4') -@error{}m4trace: -2- shift(`3', `4') -@end example - -Prior to M4 1.6, every instance of @samp{$@@} was rescanned as it was -encountered.
Thus, the @file{foreachq3.m4} alternative used much less -memory than @file{foreachq2.m4}, and executed as much as 10% faster, -since each iteration encountered fewer @samp{$@@}. However, the -implementation of rescanning every byte in @samp{$@@} was quadratic in -the number of bytes scanned (for example, making the broken version in -@file{foreachq.m4} cubic, rather than quadratic, in behavior). Once the -underlying M4 implementation was improved in 1.6 to reuse results of -previous scans, both styles of @code{foreachq} became linear in the -number of bytes scanned, but the @file{foreachq3.m4} version remains -noticeably faster because of fewer macro invocations. Notice how the -implementation injects an empty argument prior to expanding @samp{$2} -within @code{foreachq}; the helper macro @code{_foreachq} then ignores -the third argument altogether, and ends recursion when there are three -arguments left because there is nothing left to pass through -@code{shift}. Thus, each iteration only needs one @code{ifelse}, rather -than the two conditionals used in the version from @file{foreachq2.m4}. - -@cindex nine arguments, more than -@cindex more than nine arguments -@cindex arguments, more than nine -So far, all of the implementations of @code{foreachq} presented have -been quadratic with M4 1.4.x. But @code{forloop} is linear, because -each iteration parses a constant number of arguments. So, it is -possible to design a variant that uses @code{forloop} to do the -iteration, then uses @samp{$@@} only once at the end, giving a linear -result even with older M4 implementations. This implementation relies -on the GNU extension that @samp{$10} expands to the tenth -argument rather than the first argument concatenated with @samp{0}. The -trick is to define an intermediate macro that repeats the text -@code{define(`$1', `$@var{n}')$2`'}, with @var{n} set to successive -integers corresponding to each argument.
The helper macro -@code{_foreachq_} is needed in order to generate the literal sequences -such as @samp{$1} into the intermediate macro, rather than expanding -them as the arguments of @code{_foreachq}. With this approach, no -@code{shift} calls are even needed! However, when linear recursion is -available in new enough M4, the time and memory cost of using -@code{forloop} to build an intermediate macro outweigh the costs of any -of the previous implementations (there are seven macros of overhead per -iteration instead of six in @file{foreachq3.m4}, and the entire -intermediate macro must be built in memory before any iteration is -expanded). Additionally, this approach will need adjustment when a -future version of M4 follows POSIX by no longer treating -@samp{$10} as the tenth argument; the anticipation is that -@samp{$@{10@}} can be used instead, although that alternative syntax is -not yet supported. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`foreachq4.m4') -@result{} -undivert(`foreachq4.m4')dnl -@result{}include(`forloop2.m4')dnl -@result{}divert(`-1') -@result{}# foreachq(x, `item_1, item_2, ..., item_n', stmt) -@result{}# quoted list, version based on forloop -@result{}define(`foreachq', -@result{}`ifelse(`$2', `', `', `_$0(`$1', `$3', $2)')') -@result{}define(`_foreachq', -@result{}`pushdef(`$1', forloop(`$1', `3', `$#', -@result{} `$0_(`1', `2', indir(`$1'))')`popdef( -@result{} `$1')')indir(`$1', $@@)') -@result{}define(`_foreachq_', -@result{}``define(`$$1', `$$3')$$2`''') -@result{}divert`'dnl -traceon(`shift')debugmode(`aq') -@result{} -foreachq(`x', ``1', `2', `3', `4'', `x -')dnl -@result{}1 -@result{}2 -@result{}3 -@result{}4 -@end example - -For yet another approach, the improved version of @code{foreach}, -available in @file{m4-@value{VERSION}/@/doc/examples/@/foreach2.m4}, -simply overquotes the arguments to @code{@w{_foreach}} to begin with, -using @code{dquote_elt}. 
Then @code{@w{_foreach}} can just use -@code{@w{_arg1}} to remove the extra layer of quoting that was added up -front: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`foreach2.m4') -@result{} -undivert(`foreach2.m4')dnl -@result{}include(`quote.m4')dnl -@result{}divert(`-1') -@result{}# foreach(x, (item_1, item_2, ..., item_n), stmt) -@result{}# parenthesized list, improved version -@result{}define(`foreach', `pushdef(`$1')_$0(`$1', -@result{} (dquote(dquote_elt$2)), `$3')popdef(`$1')') -@result{}define(`_arg1', `$1') -@result{}define(`_foreach', `ifelse(`$2', `(`')', `', -@result{} `define(`$1', _arg1$2)$3`'$0(`$1', (dquote(shift$2)), `$3')')') -@result{}divert`'dnl -traceon(`shift')debugmode(`aq') -@result{} -foreach(`x', `(`1', `2', `3', `4')', `x -')dnl -@error{}m4trace: -4- shift(`1', `2', `3', `4') -@error{}m4trace: -4- shift(`2', `3', `4') -@error{}m4trace: -4- shift(`3', `4') -@result{}1 -@error{}m4trace: -3- shift(``1'', ``2'', ``3'', ``4'') -@result{}2 -@error{}m4trace: -3- shift(``2'', ``3'', ``4'') -@result{}3 -@error{}m4trace: -3- shift(``3'', ``4'') -@result{}4 -@error{}m4trace: -3- shift(``4'') -@end example - -It is likewise possible to write a variant of @code{foreach} that -performs in linear time on M4 1.4.x; the easiest method is probably -writing a version of @code{foreach} that unboxes its list, then invokes -@code{_foreachq} as previously defined in @file{foreachq4.m4}. - -@cindex filtering defined symbols -@cindex subset of defined symbols -@cindex defined symbols, filtering -With a robust @code{foreachq} implementation, it is possible to create a -filter on a list of defined symbols. This next example will find all -symbols that contain @samp{if} or @samp{def}, via two different -approaches. In the first approach, @code{dquote_elt} is used to -overquote each list element, then @code{dquote} forms the list; that -way, the iterator @code{macro} can be expanded in place because its -contents are already quoted. 
This approach also uses a self-modifying -macro @code{sep} to provide the correct number of commas. In the second -approach, the iterator @code{macro} contains live text, so it must be -used with @code{defn} to avoid unintentional expansion. The correct -number of commas is achieved by using @code{shift} to ignore the first -one, although a leading space still remains. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`quote.m4')include(`foreachq2.m4') -@result{} -pushdef(`sep', `define(`sep', ``, '')') -@result{} -foreachq(`macro', dquote(dquote_elt(m4symbols)), - `regexp(macro, `.*if.*', `sep`\&'')') -@result{}ifdef, ifelse, shift -popdef(`sep') -@result{} -shift(foreachq(`macro', dquote(m4symbols), - `regexp(defn(`macro'), `def', `,` ''dquote(defn(`macro')))')) -@result{} define, defn, dumpdef, ifdef, popdef, pushdef, undefine -@end example - -In summary, recursion over list elements is trickier than it appeared at -first glance, but provides a powerful idiom within @code{m4} processing. -As a final demonstration, both list styles are now able to handle -several scenarios that would wreak havoc on one or both of the original -implementations. This points out one other difference between the -list styles. @code{foreach} evaluates unquoted list elements only once, -in preparation for calling @code{@w{_foreach}}; the same holds for -@code{foreachq} as provided by @file{foreachq3.m4} or -@file{foreachq4.m4}. But -@code{foreachq}, as provided by @file{foreachq2.m4}, -evaluates unquoted list elements twice while visiting the first list -element, once in @code{@w{_arg1q}} and once in @code{@w{_rest}}. When -deciding which list style to use, one must take into account whether -repeating the side effects of unquoted list elements will have any -detrimental effects.
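The effect of expanding an unquoted element more than once can be seen in isolation with a minimal sketch. The macros @code{count}, @code{tick}, @code{first}, @code{use_once}, and @code{use_twice} are hypothetical names chosen for this illustration and are not part of the shipped examples; @code{use_twice} merely mimics a helper that rescans its argument two times, as @file{foreachq2.m4} does with @code{_arg1q} and @code{_rest}.

```m4
dnl Hypothetical helpers, for illustration only.
dnl tick has a side effect: each expansion increments count.
define(`count', `0')dnl
define(`tick', `define(`count', incr(count))item')dnl
define(`first', `$1')dnl
dnl use_once rescans its argument one time, use_twice two times.
define(`use_once', `first($1)')dnl
define(`use_twice', `first($1)first($1)')dnl
use_once(`tick')
dnl => item
count
dnl => 1
use_twice(`tick')
dnl => itemitem
count
dnl => 3
```

If the side effect were something less benign than a counter, such as a @code{popdef} or a diversion change, the second expansion could plausibly change the result of the loop.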
- -@comment examples -@example -$ @kbd{m4 -d -I doc/examples} -include(`foreach2.m4') -@result{} -include(`foreachq2.m4') -@result{} -dnl 0-element list: -foreach(`x', `', `<x>') / foreachq(`x', `', `<x>') -@result{} /@w{ } -dnl 1-element list of empty element -foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>') -@result{}<> / <> -dnl 2-element list of empty elements -foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>') -@result{}<><> / <><> -dnl 1-element list of a comma -foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>') -@result{}<,> / <,> -dnl 2-element list of unbalanced parentheses -foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>') -@result{}<(><)> / <(><)> -define(`ab', `oops')dnl using defn(`iterator') -foreach(`x', `(`a', `b')', `defn(`x')') /dnl - foreachq(`x', ``a', `b'', `defn(`x')') -@result{}ab / ab -define(`active', `ACT, IVE') -@result{} -traceon(`active') -@result{} -dnl list of unquoted macros; expansion occurs before recursion -foreach(`x', `(active, active)', `<x> -')dnl -@error{}m4trace: -4- active -> `ACT, IVE' -@error{}m4trace: -4- active -> `ACT, IVE' -@result{}<ACT> -@result{}<IVE> -@result{}<ACT> -@result{}<IVE> -foreachq(`x', `active, active', `<x> -')dnl -@error{}m4trace: -3- active -> `ACT, IVE' -@error{}m4trace: -3- active -> `ACT, IVE' -@result{}<ACT> -@error{}m4trace: -3- active -> `ACT, IVE' -@error{}m4trace: -3- active -> `ACT, IVE' -@result{}<IVE> -@result{}<ACT> -@result{}<IVE> -dnl list of quoted macros; expansion occurs during recursion -foreach(`x', `(`active', `active')', `<x> -')dnl -@error{}m4trace: -1- active -> `ACT, IVE' -@result{}<ACT, IVE> -@error{}m4trace: -1- active -> `ACT, IVE' -@result{}<ACT, IVE> -foreachq(`x', ``active', `active'', `<x> -')dnl -@error{}m4trace: -1- active -> `ACT, IVE' -@result{}<ACT, IVE> -@error{}m4trace: -1- active -> `ACT, IVE' -@result{}<ACT, IVE> -dnl list of double-quoted macro names; no expansion -foreach(`x', `(``active'', ``active'')', `<x> 
-')dnl -@result{}<active> -@result{}<active> -foreachq(`x', ```active'', ``active''', `<x> -')dnl -@result{}<active> -@result{}<active> -@end example - -@node Improved copy -@section Solution for @code{copy} - -The macro @code{copy} presented above works with M4 1.6 and newer, but -is unable to handle builtin tokens with M4 1.4.x, because it tries to -pass the builtin token through the macro @code{curry}, where it is -silently flattened to an empty string (@pxref{Composition}). Rather -than using the problematic @code{curry} to work around the limitation -that @code{stack_foreach} expects to invoke a macro that takes exactly -one argument, we can write a new macro that lets us form the exact -two-argument @code{pushdef} call sequence needed, so that we are no -longer passing a builtin token through a text macro. - -@deffn Composite stack_foreach_sep (@var{macro}, @var{pre}, @var{post}, @ - @var{sep}) -@deffnx Composite stack_foreach_sep_lifo (@var{macro}, @var{pre}, @ - @var{post}, @var{sep}) -For each of the @code{pushdef} definitions associated with @var{macro}, -expand the sequence @samp{@var{pre}`'definition`'@var{post}}. -Additionally, expand @var{sep} between definitions. -@code{stack_foreach_sep} visits the oldest definition first, while -@code{stack_foreach_sep_lifo} visits the current definition first. The -expansion may dereference @var{macro}, but should not modify it. There -are a few special macros, such as @code{defn}, which cannot be used as -the @var{macro} parameter. -@end deffn - -Note that @code{stack_foreach(`@var{macro}', `@var{action}')} is -equivalent to @code{stack_foreach_sep(`@var{macro}', `@var{action}(', -`)')}. By supplying explicit parentheses, split among the @var{pre} and -@var{post} arguments to @code{stack_foreach_sep}, it is now possible to -construct macro calls with more than one argument, without passing -builtin tokens through a macro call. 
It is likewise possible to -directly reference the stack definitions without a macro call, by -leaving @var{pre} and @var{post} empty. Thus, in addition to fixing -@code{copy} on builtin tokens, it also executes with fewer macro -invocations. - -The new macro also adds a separator that is only output after the first -iteration of the helper @code{_stack_reverse_sep}, implemented by -prepending the original @var{sep} to @var{pre} and omitting a @var{sep} -argument in subsequent iterations. Note that the empty string that -separates @var{sep} from @var{pre} is provided as part of the fourth -argument when originally calling @code{_stack_reverse_sep}, and not by -writing @code{$4`'$3} as the third argument in the recursive call; while -the other approach would give the same output, it does so at the expense -of increasing the argument size on each iteration of -@code{_stack_reverse_sep}, which results in quadratic instead of linear -execution time. The improved stack walking macros are available in -@file{m4-@value{VERSION}/@/doc/examples/@/stack_sep.m4}: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`stack_sep.m4') -@result{} -define(`copy', `ifdef(`$2', `errprint(`$2 already defined -')m4exit(`1')', - `stack_foreach_sep(`$1', `pushdef(`$2',', `)')')')dnl -pushdef(`a', `1')pushdef(`a', defn(`divnum')) -@result{} -copy(`a', `b') -@result{} -b -@result{}0 -popdef(`b') -@result{} -b -@result{}1 -pushdef(`c', `1')pushdef(`c', `2') -@result{} -stack_foreach_sep_lifo(`c', `', `', `, ') -@result{}2, 1 -undivert(`stack_sep.m4')dnl -@result{}divert(`-1') -@result{}# stack_foreach_sep(macro, pre, post, sep) -@result{}# Invoke PRE`'defn`'POST with a single argument of each definition -@result{}# from the definition stack of MACRO, starting with the oldest, and -@result{}# separated by SEP between definitions. 
-@result{}define(`stack_foreach_sep', -@result{}`_stack_reverse_sep(`$1', `tmp-$1')'dnl -@result{}`_stack_reverse_sep(`tmp-$1', `$1', `$2`'defn(`$1')$3', `$4`'')') -@result{}# stack_foreach_sep_lifo(macro, pre, post, sep) -@result{}# Like stack_foreach_sep, but starting with the newest definition. -@result{}define(`stack_foreach_sep_lifo', -@result{}`_stack_reverse_sep(`$1', `tmp-$1', `$2`'defn(`$1')$3', `$4`'')'dnl -@result{}`_stack_reverse_sep(`tmp-$1', `$1')') -@result{}define(`_stack_reverse_sep', -@result{}`ifdef(`$1', `pushdef(`$2', defn(`$1'))$3`'popdef(`$1')$0( -@result{} `$1', `$2', `$4$3')')') -@result{}divert`'dnl -@end example - -@node Improved m4wrap -@section Solution for @code{m4wrap} - -The replacement @code{m4wrap} versions presented above, designed to -guarantee FIFO or LIFO order regardless of the underlying M4 -implementation, share a bug when dealing with wrapped text that looks -like parameter expansion. Note how the invocation of -@code{m4wrap@var{n}} interprets these parameters, while using the -builtin preserves them for their intended use. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`wraplifo.m4') -@result{} -m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b') -') -@result{} -builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b') -') -@result{} -^D -@result{}m4wrap0:---0- -@result{}bar:-a-a,b-2- -@end example - -Additionally, the computation of @code{_m4wrap_level} and creation of -multiple @code{m4wrap@var{n}} placeholders in the original examples is -more expensive in time and memory than strictly necessary. Notice how -the improved version grabs the wrapped text via @code{defn} to avoid -parameter expansion, then undefines @code{_m4wrap_text}, before -stripping a level of quotes with @code{_arg1} to expand the text. That -way, each level of wrapping reuses the single placeholder, which starts -each nesting level in an undefined state. 
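The difference between invoking a placeholder and fetching its text with @code{defn} can be sketched on its own. Here @code{saved} is a hypothetical stand-in for the placeholder that holds the wrapped text, and @code{_arg1} is defined as in the improved example; this is an illustration under those assumptions rather than an excerpt from the shipped file.

```m4
dnl saved is a hypothetical placeholder holding text that looks
dnl like parameter references.
define(`saved', `text with $1 and $#')dnl
define(`_arg1', `$1')dnl
dnl Invoking the macro substitutes its (absent) parameters:
saved
dnl => text with  and 0
dnl defn returns the quoted definition, so nothing is substituted:
defn(`saved')
dnl => text with $1 and $#
dnl _arg1 strips one level of quotes and rescans the text as a unit:
_arg1(defn(`saved'))
dnl => text with $1 and $#
```

Parameter substitution applies only to the definition text of the macro being expanded, which is why the @samp{$1} delivered through @code{defn} survives intact.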
- -Finally, it is worth emulating the GNU M4 extension of saving -all arguments to @code{m4wrap}, separated by a space, rather than saving -just the first argument. This is done with the @code{join} macro -documented previously (@pxref{Shift}). The improved LIFO example is -shipped as @file{m4-@value{VERSION}/@/doc/examples/@/wraplifo2.m4}, and -can easily be converted to a FIFO solution by swapping the adjacent -invocations of @code{joinall} and @code{defn}. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`wraplifo2.m4') -@result{} -undivert(`wraplifo2.m4')dnl -@result{}dnl Redefine m4wrap to have LIFO semantics, improved example. -@result{}include(`join.m4')dnl -@result{}define(`_m4wrap', defn(`m4wrap'))dnl -@result{}define(`_arg1', `$1')dnl -@result{}define(`m4wrap', -@result{}`ifdef(`_$0_text', -@result{} `define(`_$0_text', joinall(` ', $@@)defn(`_$0_text'))', -@result{} `_$0(`_arg1(defn(`_$0_text')undefine(`_$0_text'))')dnl -@result{}define(`_$0_text', joinall(` ', $@@))')')dnl -m4wrap(`define(`foo', ``$0:'-$1-$*-$#-')foo(`a', `b') -') -@result{} -m4wrap(`lifo text -m4wrap(`nested', `', `$@@ -')') -@result{} -^D -@result{}lifo text -@result{}foo:-a-a,b-2- -@result{}nested $@@ -@end example - -@node Improved cleardivert -@section Solution for @code{cleardivert} - -The @code{cleardivert} macro (@pxref{Cleardivert}) cannot, as it stands, be -called without arguments to clear all pending diversions. That is -because using undivert with an empty string for an argument is different -than using it with no arguments at all. 
Compare the earlier definition -with one that takes the number of arguments into account: - -@example -define(`cleardivert', - `pushdef(`_n', divnum)divert(`-1')undivert($@@)divert(_n)popdef(`_n')') -@result{} -divert(`1')one -divert -@result{} -cleardivert -@result{} -undivert -@result{}one -@result{} -define(`cleardivert', - `pushdef(`_num', divnum)divert(`-1')ifelse(`$#', `0', - `undivert`'', `undivert($@@)')divert(_num)popdef(`_num')') -@result{} -divert(`2')two -divert -@result{} -cleardivert -@result{} -undivert -@result{} -@end example - -@node Improved capitalize -@section Solution for @code{capitalize} - -The @code{capitalize} macro (@pxref{Patsubst}) as presented earlier does -not allow clients to follow the quoting rule of thumb. Consider the -three macros @code{active}, @code{Active}, and @code{ACTIVE}, and the -difference between calling @code{capitalize} with the expansion of a -macro, expanding the result of a case change, and changing the case of a -double-quoted string: - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`capitalize.m4')dnl -define(`active', `act1, ive')dnl -define(`Active', `Act2, Ive')dnl -define(`ACTIVE', `ACT3, IVE')dnl -upcase(active) -@result{}ACT1,IVE -upcase(`active') -@result{}ACT3, IVE -upcase(``active'') -@result{}ACTIVE -downcase(ACTIVE) -@result{}act3,ive -downcase(`ACTIVE') -@result{}act1, ive -downcase(``ACTIVE'') -@result{}active -capitalize(active) -@result{}Act1 -capitalize(`active') -@result{}Active -capitalize(``active'') -@result{}_capitalize(`active') -define(`A', `OOPS') -@result{} -capitalize(active) -@result{}OOPSct1 -capitalize(`active') -@result{}OOPSctive -@end example - -First, when @code{capitalize} is called with more than one argument, it -was throwing away later arguments, whereas @code{upcase} and -@code{downcase} used @samp{$*} to collect them all. The fix is simple: -use @samp{$*} consistently. 
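The contrast between @samp{$1} and @samp{$*} can be shown with a small sketch. The names @code{upcase_first} and @code{upcase_all} are hypothetical and exist only for this comparison; the second follows the pattern that @code{upcase} already uses.

```m4
dnl Hypothetical variants, for illustration only.
dnl Using $1 silently drops every argument after the first.
define(`upcase_first', `translit(`$1', `a-z', `A-Z')')dnl
dnl Using $* keeps all arguments, joined by commas.
define(`upcase_all', `translit(`$*', `a-z', `A-Z')')dnl
upcase_first(`one', `two')
dnl => ONE
upcase_all(`one', `two')
dnl => ONE,TWO
```

Note that @samp{$*} joins the arguments with a bare comma, so whitespace that followed the comma in the original call is not preserved.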
- -Next, with single-quoting, @code{capitalize} outputs a single character, -a set of quotes, then the rest of the characters, making it impossible -to invoke @code{Active} after the fact, and allowing the alternate macro -@code{A} to interfere. Here, the solution is to use additional quoting -in the helper macros, then pass the final over-quoted output string -through @code{_arg1} to remove the extra quoting and finally invoke the -concatenated portions as a single string. - -Finally, when passed a double-quoted string, the nested macro -@code{_capitalize} is never invoked because it ended up nested inside -quotes. This one is the toughest to fix. In short, we have no idea how -many levels of quotes are in effect on the substring being altered by -@code{patsubst}. If the replacement string cannot be expressed entirely -in terms of literal text and backslash substitutions, then we need a -mechanism to guarantee that the helper macros are invoked outside of -quotes. In other words, this sounds like a job for @code{changequote} -(@pxref{Changequote}). By changing the active quoting characters, we -can guarantee that replacement text injected by @code{patsubst} always -occurs in the middle of a string that has exactly one level of -over-quoting using alternate quotes; so the replacement text closes the -quoted string, invokes the helper macros, then reopens the quoted -string. In turn, that means the replacement text has unbalanced quotes, -necessitating another round of @code{changequote}. - -In the fixed version below (also shipped as -@file{m4-@value{VERSION}/@/doc/examples/@/capitalize2.m4}), -@code{capitalize} uses the alternate quotes of @samp{<<[} and @samp{]>>} -(the longer strings are chosen so as to be less likely to appear in the -text being converted). The helpers @code{_to_alt} and @code{_from_alt} -merely reduce the number of characters required to perform a -@code{changequote}, since the definition changes twice.
The outermost -pair means that @code{patsubst} and @code{_capitalize_alt} are invoked -with alternate quoting; the innermost pair is used so that the third -argument to @code{patsubst} can contain an unbalanced -@samp{]>>}/@samp{<<[} pair. Note that @code{upcase} and @code{downcase} -must be redefined as @code{_upcase_alt} and @code{_downcase_alt}, since -they contain nested quotes but are invoked with the alternate quoting -scheme in effect. - -@comment examples -@example -$ @kbd{m4 -I doc/examples} -include(`capitalize2.m4')dnl -define(`active', `act1, ive')dnl -define(`Active', `Act2, Ive')dnl -define(`ACTIVE', `ACT3, IVE')dnl -define(`A', `OOPS')dnl -capitalize(active; `active'; ``active''; ```actIVE''') -@result{}Act1,Ive; Act2, Ive; Active; `Active' -undivert(`capitalize2.m4')dnl -@result{}divert(`-1') -@result{}# upcase(text) -@result{}# downcase(text) -@result{}# capitalize(text) -@result{}# change case of text, improved version -@result{}define(`upcase', `translit(`$*', `a-z', `A-Z')') -@result{}define(`downcase', `translit(`$*', `A-Z', `a-z')') -@result{}define(`_arg1', `$1') -@result{}define(`_to_alt', `changequote(`<<[', `]>>')') -@result{}define(`_from_alt', `changequote(<<[`]>>, <<[']>>)') -@result{}define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)') -@result{}define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)') -@result{}define(`_capitalize_alt', -@result{} `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>, -@result{} <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)') -@result{}define(`capitalize', -@result{} `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>, -@result{} _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())') -@result{}divert`'dnl -@end example - -@node Improved fatal_error -@section Solution for @code{fatal_error} - -The @code{fatal_error} macro (@pxref{M4exit}) is not robust to versions -of GNU M4 earlier than 1.4.8, where invoking @code{@w{__file__}} -(@pxref{Location}) inside @code{m4wrap} 
would result in an empty string, -and @code{@w{__line__}} resulted in @samp{0} even though all files start -at line 1. Furthermore, versions earlier than 1.4.6 did not support the -@code{@w{__program__}} macro. If you want @code{fatal_error} to work -across the entire 1.4.x release series, a better implementation would -be: - -@comment status: 1 -@example -define(`fatal_error', - `errprint(ifdef(`__program__', `__program__', ``m4'')'dnl -`:ifelse(__line__, `0', `', - `__file__:__line__:')` fatal error: $* -')m4exit(`1')') -@result{} -m4wrap(`divnum(`demo of internal message') -fatal_error(`inside wrapped text')') -@result{} -^D -@error{}m4:stdin:6: warning: divnum: extra arguments ignored: 1 > 0 -@result{}0 -@error{}m4:stdin:6: fatal error: inside wrapped text -@end example - -@c ========================================================== Appendices - -@node Copying This Package -@appendix How to make copies of the overall M4 package -@cindex License, code - -This appendix covers the license for copying the source code of the -overall M4 package. This manual is under a different set of -restrictions, covered later (@pxref{Copying This Manual}). - -@menu -* GNU General Public License:: License for copying the M4 package -@end menu - -@node GNU General Public License -@appendixsec License for copying the M4 package -@cindex GPL, GNU General Public License -@cindex GNU General Public License -@cindex General Public License (GPL), GNU -@include gpl-3.0.texi - -@node Copying This Manual -@appendix How to make copies of this manual -@cindex License, manual - -This appendix covers the license for copying this manual. Note that -some of the longer examples in this manual are also distributed in the -directory @file{m4-@value{VERSION}/@/doc/examples/}, where a more -permissive license is in effect when copying just the examples. 
- -@menu -* GNU Free Documentation License:: License for copying this manual -@end menu - -@node GNU Free Documentation License -@appendixsec License for copying this manual -@cindex FDL, GNU Free Documentation License -@cindex GNU Free Documentation License -@cindex Free Documentation License (FDL), GNU -@include fdl-1.3.texi - -@node Indices -@appendix Indices of concepts and macros - -@menu -* Macro index:: Index for all @code{m4} macros -* Concept index:: Index for many concepts -@end menu - -@node Macro index -@appendixsec Index for all @code{m4} macros - -This index covers all @code{m4} builtins, as well as several useful -composite macros. References are exclusively to the places where a -macro is introduced the first time. - -@printindex fn - -@node Concept index -@appendixsec Index for many concepts - -@printindex cp - -@bye - -@c Local Variables: -@c fill-column: 72 -@c ispell-local-dictionary: "american" -@c indent-tabs-mode: nil -@c whitespace-check-buffer-indent: nil -@c End: |