From ee3cabe6c57323f807ef50baf6341231801d0396 Mon Sep 17 00:00:00 2001
From: Philip Hazel
Date: Fri, 24 Jul 2009 15:28:45 +0000
Subject: Imported from /Users/nigel/Work/x/xfpt-0.07.tar.bz2.

---
 doc/xfpt.html       | 185 ++++++++++++++++++++++++++++++----------------------
 doc/xfpt.pdf        | Bin 62566 -> 64878 bytes
 doc/xfpt.xfpt       |  58 +++++++++++-----
 share/stdmacs       |  50 ++++++++++++++
 src/globals.c       |   4 +-
 testing/infiles/04  |   9 +++
 testing/outfiles/04 |  16 +++++
 testing/runtest     |   2 +-
 8 files changed, 229 insertions(+), 95 deletions(-)
 create mode 100644 testing/infiles/04
 create mode 100644 testing/outfiles/04

diff --git a/doc/xfpt.html b/doc/xfpt.html
index 062c0803e..b18242810 100644
--- a/doc/xfpt.html
+++ b/doc/xfpt.html
@@ -1,4 +1,3 @@
- The xfpt plain text to XML processor - +
@@ -54,7 +53,7 @@ The xfpt plain text to XML processor

- + The xfpt plain text to XML processor

@@ -70,7 +69,7 @@ Hazel
+Copyright © 2009 University of Cambridge

xfpt is a program that reads a marked-up ASCII source file, and converts it into XML. It was written with DocBook XML in mind, but can also be used for other -forms of XML. Unlike AsciiDoc (http://www.methods.co.nz/asciidoc/), +forms of XML. Unlike AsciiDoc (http://www.methods.co.nz/asciidoc/), xfpt does not try to produce XML from a document that is also usable as a freestanding ASCII document. The input for xfpt is very definitely “marked up”. This makes it less ambiguous for large and/or complicated documents. xfpt @@ -530,18 +541,18 @@ output, the appropriate XML entity reference (&xfpt [options] [input source]

If no input is specified, the standard input is read. There are four options: -

-help

+

-help

This option causes xfpt to output its “usage” message, and exit. -

-o <output destination>

+

-o <output destination>

This option overrides the default destination. If the standard input is being read, the default destination is the standard output. Otherwise, the default destination is the name of the input file with the extension .xml, replacing its existing extension if there is one. A single hyphen character can be given as an output destination to refer to the standard output. -

-S <directory path>

+

-S <directory path>

This option overrides the path to xfpt’s library directory that is built into the program. This makes it possible to use or test alternate libraries. -

-v

+

-v

This option causes xfpt to output its version number and exit.

Here is a very short example of a complete xfpt input file that uses some of the standard macros and flags: @@ -570,7 +581,7 @@ standard macros and flags: “modes”):

  • In the default mode, text is processed paragraph by paragraph. -[1] +[1] The end of a paragraph is indicated by the end of the input, a blank line, or by an occurrence of the .literal directive. Other directives (for example, .include) do not of themselves terminate a paragraph. Most of the standard @@ -640,7 +651,7 @@ An unrecognized directive line normally causes an error; however, in the literal text and literal XML modes, an unrecognized line that starts with a dot is treated as a data line.

Macros are defined by the .macro directive, which is described in section -3.11. There are two ways of calling a macro. It can be called in the +3.11. There are two ways of calling a macro. It can be called in the same way as a directive, or it can be called from within text that is being processed. The second case is called an “inline macro call”.

@@ -674,10 +685,10 @@ A macro that can be called inline can always be called as a directive, but the opposite is not always true. Macros are usually designed to be called either one way or the other. However, the .new and .index macros in the standard library are examples of macros that are designed to be called either way. -



[1] -There is, however, a special case when a paragraph contains one or more +



[1] +There is, however, a special case when a paragraph contains one or more footnotes. In that situation, each part of the outer paragraph is processed -independently. +independently.

Only one flag sequence is built-into the code itself. If an input line ends with three ampersands (ignoring trailing white space), the ampersands are removed, and the next input line, with any leading white space removed, is @@ -718,7 +729,7 @@ without modification. For example:

If an ampersand is followed by a sequence of alphanumeric characters starting with a letter, terminated by an opening parenthesis, the characters between the ampersand and the parenthesis are interpreted as the name of a macro. See -section 1.5 for more details. +section 1.5 for more details.

Any other flag sequences that are needed must be defined by means of the .flag directive. These are of two types, standalone and paired. Both cases define replacement text. This is always literal; it is not itself scanned for @@ -786,7 +797,7 @@ sequence of literal XML, changing to the literal XML mode may be more convenient.

The directives that are built into the code of xfpt are now described in alphabetical order. You can see more examples of their use in the descriptions -of the standard macros in chapter 4. +of the standard macros in chapter 4.

This directive may appear only within the body of a macro. It must be followed by a single number, optionally preceded by a minus sign. If the number is positive (no minus sign), subsequent lines, up to a .endarg directive, are @@ -869,9 +880,9 @@ way to revert to the previous definition.

This directive must be followed by a single string argument that is the path to a file. The contents of the file are read and incorporated into the input at this point. If the string does not contain any slashes, the path to the xfpt -library is prepended. Otherwise, the path is used unaltered. If +library is prepended. Otherwise, the path is used unaltered. If .include is used inside a macro, it is evaluated each time the macro is -called, and thus can be used to include a different file on each occasion. +called, and thus can be used to include a different file on each occasion.

This directive may appear only within the body of a macro. It must be followed by one of the words “layout”, “text”, “off”, or “xml”. If the current literal mode does not correspond to the word, subsequent lines, up to a @@ -881,14 +892,14 @@ nested. “xml”. It forces an end to a previous paragraph, if there is one, and then switches between processing modes. The default mode is the “off” mode, in which text is processed paragraph by paragraph, and flags are recognized. -Section 1.3 describes how input lines are processed in +Section 1.3 describes how input lines are processed in the four modes.

This directive is used to define macros. It must be followed by a macro name, and then, optionally, by any number of arguments. The macro name can be any sequence of non-whitespace characters. The arguments in the definition provide default values. The following lines, up to .endmacro, form the body of the macro. They are not processed in any way when the macro is defined; they are -processed only when the macro is called (see section 1.5). +processed only when the macro is called (see section 1.5).

Within the body of a macro, argument substitutions can be specified by means of a dollar character and an argument number, for example, $3 for the third @@ -951,7 +962,7 @@ remembered and xfpt then reverts to the d At the end of a nested sequence, if a paragraph has been started, it is terminated, and then xfpt reverts to the previous state.

This directive must be followed by a single string argument. It is processed -as an input line without a newline at the end. This facility is useful +as an input line without a newline at the end. This facility is useful in macros when constructing a single data line from several text fragments. See for example the .new macro in the standard macros.

xfpt keeps a stack of text strings that are manipulated by the .push and @@ -1009,7 +1020,7 @@ though it does still seem to process it correctly.

For handling the most common case (setting and unsetting “changed”), the standard macros .new and .wen are provided (see section -4.11). +4.13).

This directive must be followed by a name and a text string. It defines a user variable and gives it a name. A reference to the name in the style of an XML entity causes the string to be substituted, without further processing. For @@ -1042,7 +1053,18 @@ file the standard header material for a DocBook XML file, which is:

The .book macro has no arguments. It generates <book> and pushes </book> onto the stack so that it will be output at the end. -

Chapters, sections, and subsections are supported by three macros that all +

XML processing instructions such as <?sdop toc_sections="no"?> can, of +course, be written literally between .literal xml and +.literal off. If there are a lot of them, this is perhaps the most +convenient approach. A macro called .pi is provided as an easy way of +setting up a short processing instruction. Its first argument is the name of +the processor for which the instruction is intended, and its second argument is +the contents of the instruction, for example: +

+ .pi sdop 'toc_sections="yes,yes,no"'
+

+This generates <?sdop toc_sections="yes,yes,no"?>. +

Chapters, sections, and subsections are supported by three macros that all operate in the same way. They are .chapter, .section, and .subsection. They take either one, two, or three arguments. The first argument is the title. If a second argument is present, and is not an empty @@ -1070,14 +1092,23 @@ argument can be an empty string. For example:

Where and when the abbreviation is used in place of the full title is controlled by the stylesheet when the XML is processed. -

-These three macros use the stack to ensure that each chapter, section, and -subsection is terminated at the correct point. For example, starting a new -section automatically terminates an open subsection and a previous section. -

The macros .preface, .appendix, and .colophon operate in the same +

The macros .preface, .appendix, and .colophon operate in the same way as .chapter, except that the first and the last have the default title strings “Preface” and “Colophon”. -

The url macro generates URL references, and is intended to be called inline +

The macros for chapters, sections, appendixes, etc. use the stack to ensure +that each one is terminated at the correct point, without the need for an +explicit terminator. For example, starting a new section automatically +terminates an open subsection and a previous section. +

+Occasionally, however, there is a need to force an explicit termination. The +.endchapter, .endsection, .endsubsection, .endpreface, +.endappendix, and .endcolophon macros provide this facility. For +example, if you want to include an XML processing instruction after a preface, +but before the start of the following chapter, you must terminate the preface +with .endpreface. Otherwise a processing instruction that precedes the next +.chapter will end up inside the <preface> element. You should not +include any actual text items at these points. +

The url macro generates URL references, and is intended to be called inline within the text that is being processed. It generates a <ulink> element, and has either one or two arguments. The first argument is the URL, and the second is the text that describes it. For example: @@ -1092,7 +1123,7 @@ If the second argument is absent, the contents of the first argument are used instead. If url is called as a directive, there will be a newline in the output after </ulink>, which in most cases (such as the example above), you do not want. -

The .ilist macro marks the start of an itemized list, the items of which +

The .ilist macro marks the start of an itemized list, the items of which are normally rendered with bullets or similar markings. The macro can optionally be called with one argument, for which there is no default. If the argument is present, it is used to add a mark= attribute to the @@ -1111,7 +1142,7 @@ For example: .endlist

There may be more than one paragraph in an item. -

The .olist macro marks the start of an ordered list, the items of which are +

The .olist macro marks the start of an ordered list, the items of which are numbered. If no argument is given, arabic numerals are used. One of the following words can be given as the macro’s argument to specify the numeration:

@@ -1132,7 +1163,7 @@ For example: .endlist

There may be more than one paragraph in an item. -

A variable list is one in which each entry is composed of a set of one or more +

A variable list is one in which each entry is composed of a set of one or more terms and an associated description. Typically, the terms are printed in a style that makes them stand out, and the description is indented underneath. The start of a variable list is indicated by the .vlist macro, which has @@ -1150,12 +1181,12 @@ This is followed by the body of the entry. The list is terminated by the .endlist

As for the other lists, there may be more than one paragraph in an item. -

Lists may be nested as required. Some DocBook processors automatically choose +

Lists may be nested as required. Some DocBook processors automatically choose different bullets for nested itemized lists, but others do not. The .endlist macro has no useful arguments. Any text that follows it is treated as a comment. This can provide an annotation facility that may make the input easier to understand when lists are nested. -

In displayed text each non-directive input line generates one output line. The +

In displayed text each non-directive input line generates one output line. The <literallayout> DocBook element is used to achieve this. Two kinds of displayed text are supported by the standard macros. They differ in their handling of the text itself. @@ -1186,10 +1217,10 @@ monospaced font. For example:

As the examples illustrate, both kinds of display are terminated by the .endd macro. -

The macro pair .blockquote and .endblockquote are used to wrap the +

The macro pair .blockquote and .endblockquote are used to wrap the lines between them in a <blockquote> element. -

Two macros are provided to simplify setting and unsetting the “changed” -revision marking (see section 3.16). When the revised text is +

Two macros are provided to simplify setting and unsetting the “changed” +revision marking (see section 3.16). When the revised text is substantial (for example, a complete paragraph, table, display, or section), it can be placed between .new and .wen, as in this example:

@@ -1239,7 +1270,7 @@ literal text mode.
 If you want to add revision indications to part of a table, you must use an
 inline call of new within an argument of the .row macro (see below).
 This is the only usage that works in this case.
-

The .itable macro starts an informal (untitled) table with some basic +

The .itable macro starts an informal (untitled) table with some basic parameterization. If you are working on a large document that has many tables with the same parameters, the best approach is to define your own table macros, possibly calling the standard one with specific arguments. @@ -1287,7 +1318,7 @@ The .row macro does not set the new macro within an entry to generate a <phrase> element with revisionflag set. -

The .table macro starts a formal table, that is, a table that has a title, +

The .table macro starts a formal table, that is, a table that has a title, and which can be cross referenced. The first argument of this macro is the table’s title; the second is an identifier for cross-referencing. If you are not going to reference the table, an empty string must be supplied. From the @@ -1298,22 +1329,22 @@ For example: .row "cell 11" "cell 12" .row "cell 21" "cell 22" .endtable -

A figure is enclosed between .figure and .endfigure macros. The first +

A figure is enclosed between .figure and .endfigure macros. The first argument of .figure provides a title for the figure. The second is optional; if present, it is a tag for references to the figure.

A figure normally contains an image. The .image macro can be used in simple cases. It generates a <mediaobject> element containing an <imageobject>. The first argument is the name of the file containing the -image. The remaining arguments are optional; an empty string must be +image. The remaining arguments are optional; an empty string must be supplied as a placeholder when one that is not required is followed by one that -is set. +is set.

  • The second argument specifies a scaling factor for the image, as a percentage. Thus, a value of 50 reduces the image to half size.

  • The third argument specifies an alignment for the image. It must be one of -left (default), right or center (or even centre if the +left (default), right or center (or even centre if the DocBook processor you are using can handle it).

  • The fourth and fifth arguments specify the depth and width, respectively. How @@ -1331,7 +1362,7 @@ Here is another example, where the figure is reduced to 80% and centred: .figure "A reduced figure" .image figure02.eps 80 center .endfigure -

Footnotes can be specified between .footnote and .endnote macros. +

Footnotes can be specified between .footnote and .endnote macros. Within a footnote there can be any kind of text item, including displays and tables. When a footnote occurs in the middle of a paragraph, paired flags must not straddle the footnote. This example is wrong: @@ -1349,7 +1380,7 @@ The correct markup for this example is: That's really fast. .endf &'brown'& fox. -

The .index macro generates <indexterm> elements (index entries) in the +

The .index macro generates <indexterm> elements (index entries) in the output. It takes one or two arguments. The first is the text for the primary index term, and the second, if present, specifies a secondary index term. This macro can be called either from a directive line, or inline. However, it is diff --git a/doc/xfpt.pdf b/doc/xfpt.pdf index caa54eadf..2cda4081a 100644 Binary files a/doc/xfpt.pdf and b/doc/xfpt.pdf differ diff --git a/doc/xfpt.xfpt b/doc/xfpt.xfpt index e23ac9d67..093ccc330 100644 --- a/doc/xfpt.xfpt +++ b/doc/xfpt.xfpt @@ -13,14 +13,14 @@ The xfpt plain text to XML processor xfpt -06 February 2008 +22 July 2009 Philip Hazel PH -0.0606 February 2008PH -2008University of Cambridge +0.0722 July 2009PH +2009University of Cambridge .literal off @@ -152,9 +152,9 @@ standard macros and flags: .ilist In the default mode, text is processed paragraph by paragraph. .footnote -&new("There is, however, a special case when a paragraph contains one or more +There is, however, a special case when a paragraph contains one or more footnotes. In that situation, each part of the outer paragraph is processed -independently.") +independently. .endnote The end of a paragraph is indicated by the end of the input, a blank line, or by an occurrence of the &*.literal*& directive. Other directives (for example, @@ -374,6 +374,7 @@ you from generating invalid XML. For example, DocBook does not allow &``& within &``&, though it does allow &``& within &``&. + .section "Unrecognized flag sequences" ID10 If an ampersand is not followed by a character sequence in one of the forms described in the preceding sections, an error occurs. @@ -521,9 +522,9 @@ way to revert to the previous definition. This directive must be followed by a single string argument that is the path to a file. The contents of the file are read and incorporated into the input at this point. If the string does not contain any slashes, the path to the &X; -library is prepended. 
Otherwise, the path is used unaltered. &new("If +library is prepended. Otherwise, the path is used unaltered. If &*.include*& is used inside a macro, it is evaluated each time the macro is -called, and thus can be used to include a different file on each occasion.") +called, and thus can be used to include a different file on each occasion. .section "The &*.inliteral*& directive" ID21 @@ -619,7 +620,7 @@ terminated, and then &X; reverts to the previous state. .section "The &*.nonl*& directive" ID25 This directive must be followed by a single string argument. It is processed -as &new("an input line") without a newline at the end. This facility is useful +as an input line without a newline at the end. This facility is useful in macros when constructing a single data line from several text fragments. See for example the &*.new*& macro in the standard macros. @@ -731,6 +732,20 @@ The &*.book*& macro has no arguments. It generates &``& and pushes &``& onto the stack so that it will be output at the end. +.section "Processing instructions" +XML processing instructions such as &``& can, of +course, be written written literally between &`.literal`& &`xml`& and +&`.literal`& &`off`&. If there are a lot of them, this is perhaps the most +convenient approach. A macro called &*.pi*& is provided as an easy way of +setting up a short processing instruction. Its first argument is the name of +the processor for which the instruction is intended, and its second argument is +the contents of the instruction, for example: +.code + .pi sdop 'toc_sections="yes,yes,no"' +.endd +This generates &``&. + + .section "Chapters, sections, and subsections" ID32 Chapters, sections, and subsections are supported by three macros that all operate in the same way. They are &*.chapter*&, &*.section*&, and @@ -761,10 +776,6 @@ argument can be an empty string. For example: Where and when the abbreviation is used in place of the full title is controlled by the stylesheet when the XML is processed. 
-These three macros use the stack to ensure that each chapter, section, and -subsection is terminated at the correct point. For example, starting a new -section automatically terminates an open subsection and a previous section. - .section "Prefaces, appendixes, and colophons" ID33 The macros &*.preface*&, &*.appendix*&, and &*.colophon*& operate in the same @@ -772,6 +783,23 @@ way as &*.chapter*&, except that the first and the last have the default title strings &"Preface"& and &"Colophon"&. +.section "Terminating chapters, etc." +The macros for chapters, sections, appendixes, etc. use the stack to ensure +that each one is terminated at the correct point, without the need for an +explicit terminator. For example, starting a new section automatically +terminates an open subsection and a previous section. + +Occasionally, however, there is a need to force an explicit termination. The +&*.endchapter*&, &*.endsection*&, &*.endsubsection*&, &*.endpreface*&, +&*.endappendix*&, and &*.endcolophon*& macros provide this facility. For +example, if you want to include an XML processing instruction after a preface, +but before the start of the following chapter, you must terminate the preface +with &*.endpreface*&. Otherwise a processing instruction that precedes the next +&*.chapter*& will end up inside the &``& element. You should not +include any actual text items at these points. + + + .section "URL references" ID34 The &*url*& macro generates URL references, and is intended to be called inline within the text that is being processed. It generates a &``& element, @@ -1044,16 +1072,16 @@ optional; if present, it is a tag for references to the figure. A figure normally contains an image. The &*.image*& macro can be used in simple cases. It generates a &``& element containing an &``&. The first argument is the name of the file containing the -image. The remaining arguments are optional; &new("an empty string must be +image. 
The remaining arguments are optional; an empty string must be supplied as a placeholder when one that is not required is followed by one that -is set.") +is set. .ilist The second argument specifies a scaling factor for the image, as a percentage. Thus, a value of 50 reduces the image to half size. .next The third argument specifies an alignment for the image. It must be one of -&`left`& &new("(default)"), &`right`& or &`center`& (or even &`centre`& if the +&`left`& (default), &`right`& or &`center`& (or even &`centre`& if the DocBook processor you are using can handle it). .next The fourth and fifth arguments specify the depth and width, respectively. How diff --git a/share/stdmacs b/share/stdmacs index 351d23fd7..967c0c206 100644 --- a/share/stdmacs +++ b/share/stdmacs @@ -30,6 +30,13 @@ .literal off .endmacro +.macro endpreface +.literal layout +.pop C +.push C +.literal off +.endmacro + .macro chapter .literal layout .pop C @@ -45,6 +52,13 @@ .literal off .endmacro +.macro endchapter +.literal layout +.pop C +.push C +.literal off +.endmacro + .macro appendix .literal layout .pop C @@ -60,6 +74,13 @@ .literal off .endmacro +.macro endappendix +.literal layout +.pop C +.push C +.literal off +.endmacro + .macro colophon "Colophon" .literal layout .pop C @@ -73,6 +94,13 @@ .literal off .endmacro +.macro endcolophon +.literal layout +.pop C +.push C +.literal off +.endmacro + .macro section .literal layout .pop S @@ -87,6 +115,13 @@ .literal off .endmacro +.macro endsection +.literal layout +.pop S +.push S +.literal off +.endmacro + .macro subsection .literal layout .pop U @@ -101,6 +136,13 @@ .literal off .endmacro +.macro endsubsection +.literal layout +.pop U +.push U +.literal off +.endmacro + . =============== Lists =============== .macro ilist @@ -385,4 +427,12 @@ .literal off .endmacro +. =============== Processing instructions ============= + +.macro pi +.literal layout +&& +.literal off +.endmacro + . 
End
diff --git a/src/globals.c b/src/globals.c
index 44297f127..56d5a28fe 100644
--- a/src/globals.c
+++ b/src/globals.c
@@ -2,7 +2,7 @@
 * xfpt - Simple ASCII->Docbook processor *
 *************************************************/
-/* Copyright (c) University of Cambridge, 2008 */
+/* Copyright (c) University of Cambridge, 2009 */
 /* Written by Philip Hazel. */
 /* Allocate storage and initialize global variables */
@@ -11,7 +11,7 @@
 uschar *xfpt_share = US DATADIR;
-uschar *xfpt_version = US "0.06 06-February-2008";
+uschar *xfpt_version = US "0.07 22-July-2009";
 tree_node *entities = NULL;
diff --git a/testing/infiles/04 b/testing/infiles/04
new file mode 100644
index 000000000..917b2cac5
--- /dev/null
+++ b/testing/infiles/04
@@ -0,0 +1,9 @@
+.include stdflags
+.include stdmacs
+
+.preface "A first preface"
+This is text in the preface.
+.endpreface
+.pi sdop 'toc_sections="no"'
+.chapter "A first chapter"
+This is text in a chapter
diff --git a/testing/outfiles/04 b/testing/outfiles/04
new file mode 100644
index 000000000..5712bce99
--- /dev/null
+++ b/testing/outfiles/04
@@ -0,0 +1,16 @@
+
+A first preface
+
+This is text in the preface.
+
+
+
+
+
+
+A first chapter
+
+This is text in a chapter
+
+
+
diff --git a/testing/runtest b/testing/runtest
index 1cd36a638..a125ce798 100755
--- a/testing/runtest
+++ b/testing/runtest
@@ -3,7 +3,7 @@
 # Controlling script for xfpt tests
 $xfpt = "../src/xfpt -S ../share";
-$cf = (-f "/usr/local/bin/cf")? "cf" : "diff -u";
+$cf = (-f "/usr/local/bin/cf")? "cf" : "diff";
 $force_update = 0;
 $starttest = undef;
--
cgit v1.2.1