author Jim Blandy <jimb@red-bean.com> 1997-06-24 17:19:51 +0000
committer Jim Blandy <jimb@red-bean.com> 1997-06-24 17:19:51 +0000
commit 94982a4ee13a7b6d58e3f03a4c1045bfad0000ea
tree b881ac29b1a2e1b23baad765f27fdf3945ec4fef /NEWS
parent f4f9904695e5e1113f5fbb6f9a1fdfdf3bd93462
New sections on regexps.
Move Gary's syscall notes into the scheme section.
Diffstat (limited to 'NEWS')
-rw-r--r-- NEWS 326
1 files changed, 292 insertions, 34 deletions
diff --git a/NEWS b/NEWS
index 5ced2acc8..c00101199 100644
--- a/NEWS
+++ b/NEWS
@@ -6,8 +6,6 @@ Please send Guile bug reports to bug-guile@prep.ai.mit.edu.
Changes in Guile 1.2:
-[[trim out any sections we don't need]]
-
* Changes to the distribution
** Nightly snapshots are now available from ftp.red-bean.com.
@@ -28,11 +26,22 @@ source directory. See the `INSTALL' file for examples.
* Changes to the procedure for linking libguile with your programs
-** Like Guile 1.0, Guile 1.2 will now use the Rx regular expression
-library, if it is installed on your system. When you are linking
-libguile into your own programs, this means you will have to link
-against -lguile, -lqt (if you configured Guile with thread support),
-and -lrx.
+** The standard Guile load path for Scheme code now includes
+$(datadir)/guile (usually /usr/local/share/guile). This means that
+you can install your own Scheme files there, and Guile will find them.
+(Previous versions of Guile only checked a directory whose name
+contained the Guile version number, so you had to re-install or move
+your Scheme sources each time you installed a fresh version of Guile.)
+
+The load path also includes $(datadir)/guile/site; we recommend
+putting individual Scheme files there. If you want to install a
+package with multiple source files, create a directory for them under
+$(datadir)/guile.
+
+** Guile 1.2 will now use the Rx regular expression library, if it is
+installed on your system. When you are linking libguile into your own
+programs, this means you will have to link against -lguile, -lqt (if
+you configured Guile with thread support), and -lrx.
If you are using autoconf to generate configuration scripts for your
application, the following lines should suffice to add the appropriate
@@ -43,6 +52,10 @@ AC_CHECK_LIB(rx, main)
AC_CHECK_LIB(qt, main)
AC_CHECK_LIB(guile, scm_shell)
+The Guile 1.2 distribution does not contain sources for the Rx
+library, as Guile 1.0 did. If you want to use Rx, you'll need to
+retrieve it from a GNU FTP site and install it separately.
+
* Changes to Scheme functions and syntax
** The dynamic linking features of Guile are now enabled by default.
@@ -161,38 +174,265 @@ symbols.)
functions for matching regular expressions, based on the Rx library.
In Guile 1.1, the Guile/Rx interface was removed to simplify the
distribution, and thus Guile had no regular expression support. Guile
-1.2 now adds back the most commonly used functions, and supports all
-of SCSH's regular expression functions. They are:
+1.2 again supports the most commonly used functions, and supports all
+of SCSH's regular expression functions.
-*** [[get stuff from Tim's documentation]]
-*** [[mention the regexp/mumble flags]]
+If your system does not include a POSIX regular expression library,
+and you have not linked Guile with a third-party regexp library such as
+Rx, these functions will not be available. You can tell whether your
+Guile installation includes regular expression support by checking
+whether the `*features*' list includes the `regex' symbol.
-** Guile now provides information on how it was built, via the new
-global variable, %guile-build-info. This variable records the values
-of the standard GNU makefile directory variables as an association
-list, mapping variable names (symbols) onto directory paths (strings).
-For example, to find out where the Guile link libraries were
-installed, you can say:
+*** regexp functions
-guile -c "(display (assq-ref %guile-build-info 'libdir)) (newline)"
+By default, Guile supports POSIX extended regular expressions. That
+means that the characters `(', `)', `+' and `?' are special, and must
+be escaped if you wish to match the literal characters.
-
-* Changes to the gh_ interface
-
-* Changes to the scm_ interface
-
-** The new function scm_handle_by_message_noexit is just like the
-existing scm_handle_by_message function, except that it doesn't call
-exit to terminate the process. Instead, it prints a message and just
-returns #f. This might be a more appropriate catch-all handler for
-new dynamic roots and threads.
-
-* Changes to system call interfaces:
-
-** The value returned by `raise' is now unspecified. It throws an exception
+This regular expression interface was modeled after that implemented
+by SCSH, the Scheme Shell. It is intended to be upwardly compatible
+with SCSH regular expressions.
+
+**** Function: string-match PATTERN STR [START]
+ Compile the string PATTERN into a regular expression and compare
+ it with STR. The optional numeric argument START specifies the
+ position of STR at which to begin matching.
+
+ `string-match' returns a "match structure" which describes what,
+ if anything, was matched by the regular expression. *Note Match
+ Structures::. If STR does not match PATTERN at all,
+ `string-match' returns `#f'.
+
+ Each time `string-match' is called, it must compile its PATTERN
+argument into a regular expression structure. This operation is
+expensive, which makes `string-match' inefficient if the same regular
+expression is used several times (for example, in a loop). For better
+performance, you can compile a regular expression in advance and then
+match strings against the compiled regexp.
+
+**** Function: make-regexp STR [FLAGS]
+ Compile the regular expression described by STR, and return the
+ compiled regexp structure. If STR does not describe a legal
+ regular expression, `make-regexp' throws a
+ `regular-expression-syntax' error.
+
+ FLAGS may be the bitwise-or of one or more of the following:
+
+**** Constant: regexp/extended
+ Use POSIX Extended Regular Expression syntax when interpreting
+ STR. If not set, POSIX Basic Regular Expression syntax is used.
+ If the FLAGS argument is omitted, we assume regexp/extended.
+
+**** Constant: regexp/icase
+ Do not differentiate case. Subsequent searches using the
+ returned regular expression will be case insensitive.
+
+**** Constant: regexp/newline
+ Match-any-character operators don't match a newline.
+
+ A non-matching list ([^...]) not containing a newline matches a
+ newline.
+
+ Match-beginning-of-line operator (^) matches the empty string
+ immediately after a newline, regardless of whether the FLAGS
+ passed to regexp-exec contain regexp/notbol.
+
+ Match-end-of-line operator ($) matches the empty string
+ immediately before a newline, regardless of whether the FLAGS
+ passed to regexp-exec contain regexp/noteol.
+
+**** Function: regexp-exec REGEXP STR [START [FLAGS]]
+ Match the compiled regular expression REGEXP against STR. If
+ the optional integer START argument is provided, begin matching
+ from that position in the string. Return a match structure
+ describing the results of the match, or `#f' if no match could be
+ found.
+
+ FLAGS may be the bitwise-or of one or more of the following:
+
+**** Constant: regexp/notbol
+ The match-beginning-of-line operator always fails to match (but
+ see the compilation flag regexp/newline above). This flag may be
+ used when different portions of a string are passed to
+ regexp-exec and the beginning of the string should not be
+ interpreted as the beginning of the line.
+
+**** Constant: regexp/noteol
+ The match-end-of-line operator always fails to match (but see the
+ compilation flag regexp/newline above).
+
+**** Function: regexp? OBJ
+ Return `#t' if OBJ is a compiled regular expression, or `#f'
+ otherwise.
+
+ Regular expressions are commonly used to find patterns in one string
+and replace them with the contents of another string.
+
+**** Function: regexp-substitute PORT MATCH [ITEM...]
+ Write to the output port PORT selected contents of the match
+ structure MATCH. Each ITEM specifies what should be written, and
+ may be one of the following arguments:
+
+ * A string. String arguments are written out verbatim.
+
+ * An integer. The submatch with that number is written.
+
+ * The symbol `pre'. The portion of the matched string preceding
+ the regexp match is written.
+
+ * The symbol `post'. The portion of the matched string
+ following the regexp match is written.
+
+ PORT may be `#f', in which case nothing is written; instead,
+ `regexp-substitute' constructs a string from the specified ITEMs
+ and returns that.
+
+**** Function: regexp-substitute/global PORT REGEXP TARGET [ITEM...]
+ Similar to `regexp-substitute', but can be used to perform global
+ substitutions on TARGET. Instead of taking a match structure as an
+ argument, `regexp-substitute/global' takes two string arguments: a
+ REGEXP string describing a regular expression, and a TARGET string
+ which should be matched against this regular expression.
+
+ Each ITEM behaves as in `regexp-substitute', with the following
+ exceptions:
+
+ * A function may be supplied. When this function is called, it
+ will be passed one argument: a match structure for a given
+ regular expression match. It should return a string to be
+ written out to PORT.
+
+ * The `post' symbol causes `regexp-substitute/global' to recurse
+ on the unmatched portion of TARGET. This *must* be supplied in
+ order to perform global search-and-replace on TARGET; if it is
+ not present among the ITEMs, then `regexp-substitute/global'
+ will return after processing a single match.
+
+*** Match Structures
+
+ A "match structure" is the object returned by `string-match' and
+`regexp-exec'. It describes which portion of a string, if any, matched
+the given regular expression. Match structures include: a reference to
+the string that was checked for matches; the starting and ending
+positions of the regexp match; and, if the regexp included any
+parenthesized subexpressions, the starting and ending positions of each
+submatch.
+
+ In each of the regexp match functions described below, the `match'
+argument must be a match structure returned by a previous call to
+`string-match' or `regexp-exec'. Most of these functions return some
+information about the original target string that was matched against a
+regular expression; we will call that string TARGET for easy reference.
+
+**** Function: regexp-match? OBJ
+ Return `#t' if OBJ is a match structure returned by a previous
+ call to `regexp-exec', or `#f' otherwise.
+
+**** Function: match:substring MATCH [N]
+ Return the portion of TARGET matched by subexpression number N.
+ Submatch 0 (the default) represents the entire regexp match. If
+ the regular expression as a whole matched, but the subexpression
+ number N did not match, return `#f'.
+
+**** Function: match:start MATCH [N]
+ Return the starting position of submatch number N.
+
+**** Function: match:end MATCH [N]
+ Return the ending position of submatch number N.
+
+**** Function: match:prefix MATCH
+ Return the unmatched portion of TARGET preceding the regexp match.
+
+**** Function: match:suffix MATCH
+ Return the unmatched portion of TARGET following the regexp match.
+
+**** Function: match:count MATCH
+ Return the number of parenthesized subexpressions from MATCH.
+ Note that the entire regular expression match itself counts as a
+ subexpression, and failed submatches are included in the count.
+
+**** Function: match:string MATCH
+ Return the original TARGET string.
+
+*** Backslash Escapes
+
+ Sometimes you will want a regexp to match characters like `*' or `$'
+exactly. For example, to check whether a particular string represents
+a menu entry from an Info node, it would be useful to match it against
+a regexp like `^* [^:]*::'. However, this won't work; because the
+asterisk is a metacharacter, it won't match the `*' at the beginning of
+the string. In this case, we want to make the first asterisk un-magic.
+
+ You can do this by preceding the metacharacter with a backslash
+character `\'. (This is also called "quoting" the metacharacter, and
+is known as a "backslash escape".) When Guile sees a backslash in a
+regular expression, it considers the following glyph to be an ordinary
+character, no matter what special meaning it would ordinarily have.
+Therefore, we can make the above example work by changing the regexp to
+`^\* [^:]*::'. The `\*' sequence tells the regular expression engine
+to match only a single asterisk in the target string.
+
+ Since the backslash is itself a metacharacter, you may force a
+regexp to match a backslash in the target string by preceding the
+backslash with itself. For example, to find variable references in a
+TeX program, you might want to find occurrences of the string `\let\'
+followed by any number of alphabetic characters. The regular expression
+`\\let\\[A-Za-z]*' would do this: the double backslashes in the regexp
+each match a single backslash in the target string.
+
+**** Function: regexp-quote STR
+ Quote each special character found in STR with a backslash, and
+ return the resulting string.
+
+ *Very important:* Using backslash escapes in Guile source code (as
+in Emacs Lisp or C) can be tricky, because the backslash character has
+special meaning for the Guile reader. For example, if Guile encounters
+the character sequence `\n' in the middle of a string while processing
+Scheme code, it replaces those characters with a newline character.
+Similarly, the character sequence `\t' is replaced by a horizontal tab.
+Several of these "escape sequences" are processed by the Guile reader
+before your code is executed. Unrecognized escape sequences are
+ignored: if the characters `\*' appear in a string, they will be
+translated to the single character `*'.
+
+ This translation is obviously undesirable for regular expressions,
+since we want to be able to include backslashes in a string in order to
+escape regexp metacharacters. Therefore, to make sure that a backslash
+is preserved in a string in your Guile program, you must use *two*
+consecutive backslashes:
+
+ (define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))
+
+ The string in this example is preprocessed by the Guile reader before
+any code is executed. The resulting argument to `make-regexp' is the
+string `^\* [^:]*', which is what we really want.
+
+ This also means that in order to write a regular expression that
+matches a single backslash character, the regular expression string in
+the source code must include *four* backslashes. Each consecutive pair
+of backslashes gets translated by the Guile reader to a single
+backslash, and the resulting double-backslash is interpreted by the
+regexp engine as matching a single backslash character. Hence:
+
+ (define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))
+
+ The reason for the unwieldiness of this syntax is historical. Both
+regular expression pattern matchers and Unix string processing systems
+have traditionally used backslashes with the special meanings described
+above. The POSIX regular expression specification and ANSI C standard
+both require these semantics. Attempting to abandon either convention
+would cause other kinds of compatibility problems, possibly more severe
+ones. Therefore, without extending the Scheme reader to support
+strings with different quoting conventions (an ungainly and confusing
+extension when implemented in other languages), we must adhere to this
+cumbersome escape syntax.
+
+** Changes to system call interfaces:
+
+*** The value returned by `raise' is now unspecified. It throws an exception
if an error occurs.
-** A new procedure `sigaction' can be used to install signal handlers
+*** A new procedure `sigaction' can be used to install signal handlers
(sigaction signum [action] [flags])
@@ -219,9 +459,27 @@ facility. Maybe this is not needed, since the thread support may
provide solutions to the problem of consistent access to data
structures.
-** A new procedure `flush-all-ports' is equivalent to running
+*** A new procedure `flush-all-ports' is equivalent to running
`force-output' on every port open for output.
+** Guile now provides information on how it was built, via the new
+global variable, %guile-build-info. This variable records the values
+of the standard GNU makefile directory variables as an association
+list, mapping variable names (symbols) onto directory paths (strings).
+For example, to find out where the Guile link libraries were
+installed, you can say:
+
+guile -c "(display (assq-ref %guile-build-info 'libdir)) (newline)"
+
+
+* Changes to the scm_ interface
+
+** The new function scm_handle_by_message_noexit is just like the
+existing scm_handle_by_message function, except that it doesn't call
+exit to terminate the process. Instead, it prints a message and just
+returns #f. This might be a more appropriate catch-all handler for
+new dynamic roots and threads.
+
Changes in Guile 1.1 (Fri May 16 1997):
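
Editor's note, not part of the patch: the regexp interface documented in
the hunks above can be tried out with a short session like the one below.
It is a minimal sketch, assuming a Guile whose `*features*' list includes
`regex'. The `(ice-9 regex)' module name is taken from later Guile
releases and is an assumption here, since the NEWS text does not name a
module. All variable names (`m', `menu-rx', `tex-rx' and so on) are made
up for illustration.

```scheme
;; Sketch only; (ice-9 regex) is an assumption borrowed from later
;; Guile releases, where the match:* accessors are defined.
(use-modules (ice-9 regex))

;; Feature test described in the NEWS entry.
(define have-regex? (and (memq 'regex *features*) #t))

;; One-shot matching: the pattern string is compiled on every call.
(define m (string-match "([0-9]+)-([0-9]+)" "pages 10-42 only"))
(define whole    (match:substring m))     ; entire match (submatch 0)
(define range-lo (match:substring m 1))   ; first parenthesized group
(define range-hi (match:substring m 2))   ; second parenthesized group
(define before   (match:prefix m))        ; text preceding the match
(define after    (match:suffix m))        ; text following the match

;; regexp-substitute with PORT = #f builds and returns a string.
(define swapped (regexp-substitute #f m 'pre 2 "-" 1 'post))

;; Compile once, match many times.  Two backslashes in the source
;; become one backslash in the regexp, making the first `*' literal.
(define menu-rx (make-regexp "^\\* [^:]*::"))
(define (menu-entry? line)
  (and (regexp-exec menu-rx line) #t))

;; Case-insensitive compilation flag.
(define guile-rx (make-regexp "guile" regexp/icase))

;; Global search-and-replace: 'post makes the substitution recurse
;; on the remainder of the target string.
(define dashed
  (regexp-substitute/global #f "[ \t]+" "a  b   c" 'pre "-" 'post))

;; Four backslashes in the source match one backslash in the target.
(define tex-rx (make-regexp "\\\\let\\\\[A-Za-z]*"))
(define tex-hit (match:substring (regexp-exec tex-rx "x \\let\\foo y")))

;; regexp-quote yields a pattern that matches its argument literally.
(define literal-pattern (regexp-quote "1+1=2"))
```

Compiling with `make-regexp' outside a loop and reusing the result with
`regexp-exec' avoids the repeated compilation cost that the NEWS text
attributes to `string-match'.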