Diffstat (limited to 'doc')
-rw-r--r-- doc/manual.txt | 241
1 files changed, 115 insertions, 126 deletions
diff --git a/doc/manual.txt b/doc/manual.txt
index d7be557..890f9df 100644
--- a/doc/manual.txt
+++ b/doc/manual.txt
@@ -50,10 +50,10 @@ Notes
4. All functions that take a regular expression pattern as an argument will
generate an error if that pattern is found invalid by the used
- POSIX_ / PCRE_ / Oniguruma_ / TRE_ library.
+ POSIX_ / PCRE_ / GNU_ / Oniguruma_ / TRE_ library.
5. All functions that take a string-type regex argument accept a compiled regex
- too. In this case, the cf_, locale_ and syntax_ arguments are ignored (should
+ too. In this case, the cf_ and larg_ arguments are ignored (should
be either supplied as nils or omitted).
.. _cf:
@@ -64,6 +64,8 @@ Notes
* REG_EXTENDED for POSIX and TRE
* 0 for PCRE
* ONIG_OPTION_NONE for Oniguruma
+ * GNU does not use this argument; it is ignored (but must be
+ present if further arguments are used)
**PCRE**, **Oniguruma**: *cf* may also be supplied as a string, whose
characters stand for compilation flags. Combinations of the following
@@ -90,41 +92,10 @@ Notes
e.g. Spencer's.
* 0 for PCRE, Oniguruma and TRE
-.. _locale:
+.. _larg:
-8. **PCRE:** parameter *locale* (*lo*) can be either a string (e.g.,
- "French_France.1252"), or a userdata obtained from a call to maketables_.
- The default value, used when the parameter is not supplied or ``nil``,
- is the built-in PCRE set of character tables.
-
- **Oniguruma:** this parameter (which actually should be named "encoding"
- rather than "locale") must be one of the predefined strings that are formed
- from the ONIG_ENCODING_xxx identifiers defined in oniguruma.h, by means of
- omitting the ONIG_ENCODING\_ part. For example, ONIG_ENCODING_UTF8 becomes
- ``"UTF8"`` on the Lua side. The default value, used when the parameter is not
- supplied or ``nil``, is ``"ASCII"``.
-
- If the caller-supplied value of this parameter is not one of the predefined
- "encoding" string set, an error is raised.
-
-.. _syntax:
-
-9. **GNU:** parameter *syntax* (*syn*) must be one of the predefined strings
- that are formed from the RE_SYNTAX_xxx identifiers defined in regex.h, by
- means of omitting the RE_SYNTAX\_ part. For example, RE_SYNTAX_GREP becomes
- ``"GREP"`` on the Lua side. The default value, used when the parameter is
- not supplied or ``nil``, is either ``"POSIX_EXTENDED"`` (at start-up),
- or the value set by the last setsyntax_ call.
-
- **Oniguruma:** parameter *syntax* (*syn*) must be one of the predefined
- strings that are formed from the ONIG_SYNTAX_xxx identifiers defined in
- oniguruma.h, by means of omitting the ONIG_SYNTAX\_ part. For example,
- ONIG_SYNTAX_JAVA becomes ``"JAVA"`` on the Lua side. The default value, used
- when the parameter is not supplied or ``nil``, is either ``"RUBY"`` (at
- start-up), or the value set by the last setdefaultsyntax_ call.
-
- If the caller-supplied value of `syntax` parameter is not one of the
- predefined "syntax" string set, an error is raised.
+8. The notation *larg...* is used to indicate optional library-specific
+ arguments, which are documented in the ``new`` method of each library.
------------------------------------------------------------
@@ -134,7 +105,7 @@ Functions and methods common to all bindings
match
-----
-:funcdef:`rex.match (subj, patt, [init], [cf], [ef], [lo], [tr], [syn])`
+:funcdef:`rex.match (subj, patt, [init], [cf], [ef], [larg...])`
or
@@ -146,7 +117,7 @@ The function searches for the first match of the regexp *patt* in the string
+---------+-------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===============================+========+=============+
- | r | regex object produced by new_ |userdata| n/a |
+ | r |regex object produced by new_ |userdata| n/a |
+---------+-------------------------------+--------+-------------+
| subj | subject | string | n/a |
+---------+-------------------------------+--------+-------------+
@@ -159,15 +130,9 @@ The function searches for the first match of the regexp *patt* in the string
+---------+-------------------------------+--------+-------------+
| [cf] |compilation flags (bitwise OR) | number | cf_ |
+---------+-------------------------------+--------+-------------+
- | [ef] | execution flags (bitwise OR) | number | ef_ |
+ | [ef] |execution flags (bitwise OR) | number | ef_ |
+---------+-------------------------------+--------+-------------+
- | [lo] |[PCRE, Oniguruma] locale |string |locale_ |
- | | |or | |
- | | |userdata| |
- +---------+-------------------------------+--------+-------------+
- | [tr] |[GNU] translation table | table | n/a |
- +---------+-------------------------------+--------+-------------+
- | [syn] |[GNU, Oniguruma] syntax | string |syntax_ |
+ |[larg...]|library-specific arguments | | |
+---------+-------------------------------+--------+-------------+
**Returns on success:**
@@ -184,7 +149,7 @@ The function searches for the first match of the regexp *patt* in the string
find
----
-:funcdef:`rex.find (subj, patt, [init], [cf], [ef], [lo], [tr], [syn])`
+:funcdef:`rex.find (subj, patt, [init], [cf], [ef], [larg...])`
or
@@ -196,9 +161,9 @@ The function searches for the first match of the regexp *patt* in the string
+---------+-------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===============================+========+=============+
- | r | regex object produced by new_ |userdata| n/a |
+ | r |regex object produced by new_ |userdata| n/a |
+---------+-------------------------------+--------+-------------+
- | subj | subject | string | n/a |
+ | subj |subject | string | n/a |
+---------+-------------------------------+--------+-------------+
| patt |regular expression pattern |string | n/a |
| | |or | |
@@ -209,15 +174,9 @@ The function searches for the first match of the regexp *patt* in the string
+---------+-------------------------------+--------+-------------+
| [cf] |compilation flags (bitwise OR) | number | cf_ |
+---------+-------------------------------+--------+-------------+
- | [ef] | execution flags (bitwise OR) | number | ef_ |
- +---------+-------------------------------+--------+-------------+
- | [lo] |[PCRE, Oniguruma] locale |string |locale_ |
- | | |or | |
- | | |userdata| |
- +---------+-------------------------------+--------+-------------+
- | [tr] |[GNU] translation table | table | n/a |
+ | [ef] |execution flags (bitwise OR) | number | ef_ |
+---------+-------------------------------+--------+-------------+
- | [syn] |[Oniguruma] syntax | string |syntax_ |
+ |[larg...]|library-specific arguments | | |
+---------+-------------------------------+--------+-------------+
**Returns on success:**
@@ -235,7 +194,7 @@ The function searches for the first match of the regexp *patt* in the string
gmatch
------
-:funcdef:`rex.gmatch (subj, patt, [cf], [ef], [lo], [tr], [syn])`
+:funcdef:`rex.gmatch (subj, patt, [cf], [ef], [larg...])`
The function is intended for use in the *generic for* Lua construct.
It returns an iterator for repeated matching of the pattern *patt* in
@@ -244,7 +203,7 @@ the string *subj*, subject to flags *cf* and *ef*.
+---------+-------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===============================+========+=============+
- | subj | subject |string | n/a |
+ | subj |subject |string | n/a |
+---------+-------------------------------+--------+-------------+
| patt |regular expression pattern |string | n/a |
| | |or | |
@@ -254,13 +213,7 @@ the string *subj*, subject to flags *cf* and *ef*.
+---------+-------------------------------+--------+-------------+
| [ef] |execution flags (bitwise OR) |number | ef_ |
+---------+-------------------------------+--------+-------------+
- | [lo] |[PCRE, Oniguruma] locale |string |locale_ |
- | | |or | |
- | | |userdata| |
- +---------+-------------------------------+--------+-------------+
- | [tr] |[GNU] translation table | table | n/a |
- +---------+-------------------------------+--------+-------------+
- | [syn] |[GNU, Oniguruma] syntax | string |syntax_ |
+ |[larg...]|library-specific arguments | | |
+---------+-------------------------------+--------+-------------+
The iterator function is called by Lua. On every iteration (that is, on every
@@ -273,7 +226,7 @@ till the subject fails to match.
gsub
----
-:funcdef:`rex.gsub (subj, patt, repl, [n], [cf], [ef], [lo], [tr], [syn])`
+:funcdef:`rex.gsub (subj, patt, repl, [n], [cf], [ef], [larg...])`
This function searches for all matches of the pattern *patt* in the string
*subj* and replaces them according to the parameters *repl* and *n* (see details
@@ -295,12 +248,7 @@ below).
+---------+-----------------------------------+-------------------------+-------------+
| [ef] |execution flags (bitwise OR) | number | ef_ |
+---------+-----------------------------------+-------------------------+-------------+
- | [lo] |[PCRE, Oniguruma] locale | string or userdata |locale_ |
- | | | | |
- +---------+-----------------------------------+-------------------------+-------------+
- | [tr] |[GNU] translation table | table | n/a |
- +---------+-----------------------------------+-------------------------+-------------+
- | [syn] |[GNU, Oniguruma] syntax | string |syntax_ |
+ |[larg...]|library-specific arguments | | |
+---------+-----------------------------------+-------------------------+-------------+
**Returns:**
@@ -393,7 +341,7 @@ below).
split
-----
-:funcdef:`rex.split (subj, sep, [cf], [ef], [lo], [tr], [syn])`
+:funcdef:`rex.split (subj, sep, [cf], [ef], [larg...])`
The function is intended for use in the *generic for* Lua construct.
It is used for splitting a subject string *subj* into parts (*sections*).
@@ -406,7 +354,7 @@ the string *subj*, subject to flags *cf* and *ef*.
+---------+-------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===============================+========+=============+
- | subj | subject |string | n/a |
+ | subj |subject |string | n/a |
+---------+-------------------------------+--------+-------------+
| sep |separator (regular expression |string | n/a |
| |pattern) |or | |
@@ -416,13 +364,7 @@ the string *subj*, subject to flags *cf* and *ef*.
+---------+-------------------------------+--------+-------------+
| [ef] |execution flags (bitwise OR) |number | ef_ |
+---------+-------------------------------+--------+-------------+
- | [lo] |[PCRE, Oniguruma] locale |string |locale_ |
- | | |or | |
- | | |userdata| |
- +---------+-------------------------------+--------+-------------+
- | [tr] |[GNU] translation table | table | n/a |
- +---------+-------------------------------+--------+-------------+
- | [syn] |[GNU, Oniguruma] syntax | string |syntax_ |
+ |[larg...]|library-specific arguments | | |
+---------+-------------------------------+--------+-------------+
**On every iteration pass, the iterator returns:**
@@ -475,18 +417,21 @@ constants in the used library. They are formed as follows:
but for ONIG_OPTION_xxx constants, alias strings are created additionally,
e.g., the value of ONIG_OPTION_IGNORECASE constant becomes accessible via
either of two keys: ``"ONIG_OPTION_IGNORECASE"`` and ``"IGNORECASE"``.
+* **GNU**: the GNU library does not define any flags.
------------------------------------------------------------
new
---
-:funcdef:`rex.new (patt, [cf], [lo], [tr], [syn])`
+:funcdef:`rex.new (patt, [cf], [larg...])`
-The functions compiles regular expression *patt* into a regular expression
-object whose internal representation is corresponding to the library used.
-The returned result then can be used by the methods, e.g. `tfind`_, `exec`_,
-etc. Regular expression objects are automatically garbage collected.
+The function compiles regular expression *patt* into a regular expression object
+whose internal representation is corresponding to the library used. The returned
+result then can be used by the methods, e.g. `tfind`_, `exec`_, etc. Regular
+expression objects are automatically garbage collected. See the library-specific
+documentation below for details of the library-specific arguments *larg...*, if
+any.
+---------+-------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
@@ -495,13 +440,7 @@ etc. Regular expression objects are automatically garbage collected.
+---------+-------------------------------+--------+-------------+
| [cf] |compilation flags (bitwise OR) | number | cf_ |
+---------+-------------------------------+--------+-------------+
- | [lo] |[PCRE, Oniguruma] locale |string |locale_ |
- | | |or | |
- | | |userdata| |
- +---------+-------------------------------+--------+-------------+
- | [tr] |[GNU] translation table | table | n/a |
- +---------+-------------------------------+--------+-------------+
- | [syn] |[Oniguruma] syntax | string |syntax_ |
+ |[larg...]|library-specific arguments | | |
+---------+-------------------------------+--------+-------------+
**Returns:**
@@ -520,14 +459,14 @@ string *subj*, starting from offset *init*, subject to execution flags *ef*.
+---------+-----------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===================================+========+=============+
- | r | regex object produced by new_ |userdata| n/a |
+ | r |regex object produced by new_ |userdata| n/a |
+---------+-----------------------------------+--------+-------------+
- | subj | subject | string | n/a |
+ | subj |subject | string | n/a |
+---------+-----------------------------------+--------+-------------+
| [init] |start offset in the subject | number | 1 |
| |(can be negative) | | |
+---------+-----------------------------------+--------+-------------+
- | [ef] | execution flags (bitwise OR) | number | ef_ |
+ | [ef] |execution flags (bitwise OR) | number | ef_ |
+---------+-----------------------------------+--------+-------------+
**Returns on success:**
@@ -562,14 +501,14 @@ string *subj*, starting from offset *init*, subject to execution flags *ef*.
+---------+-----------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===================================+========+=============+
- | r | regex object produced by new_ |userdata| n/a |
+ | r |regex object produced by new_ |userdata| n/a |
+---------+-----------------------------------+--------+-------------+
- | subj | subject | string | n/a |
+ | subj |subject | string | n/a |
+---------+-----------------------------------+--------+-------------+
| [init] |start offset in the subject | number | 1 |
| |(can be negative) | | |
+---------+-----------------------------------+--------+-------------+
- | [ef] | execution flags (bitwise OR) | number | ef_ |
+ | [ef] |execution flags (bitwise OR) | number | ef_ |
+---------+-----------------------------------+--------+-------------+
**Returns on success:**
@@ -597,6 +536,16 @@ string *subj*, starting from offset *init*, subject to execution flags *ef*.
PCRE-only functions and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+new
+---
+
+:funcdef:`rex.new (patt, [cf], [lo])`
+
+The locale (*lo*) can be either a string (e.g., "French_France.1252"), or a
+userdata obtained from a call to maketables_. The default value, used when the
+parameter is not supplied or ``nil``, is the built-in PCRE set of character
+tables.
+
dfa_exec
--------
@@ -610,14 +559,14 @@ string *subj*, using a DFA matching algorithm.
+----------+-------------------------------------+--------+-------------+
|Parameter | Description | Type |Default Value|
+==========+=====================================+========+=============+
- | r | regex object produced by new_ |userdata| n/a |
+ | r |regex object produced by new_ |userdata| n/a |
+----------+-------------------------------------+--------+-------------+
- | subj | subject | string | n/a |
+ | subj |subject | string | n/a |
+----------+-------------------------------------+--------+-------------+
| [init] |start offset in the subject | number | 1 |
| |(can be negative) | | |
+----------+-------------------------------------+--------+-------------+
- | [ef] | execution flags (bitwise OR) | number | ef_ |
+ | [ef] |execution flags (bitwise OR) | number | ef_ |
+----------+-------------------------------------+--------+-------------+
|[ovecsize]|size of the array for result offsets | number | 100 |
+----------+-------------------------------------+--------+-------------+
@@ -684,6 +633,18 @@ and its release date.
GNU-only functions and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+new
+---
+
+:funcdef:`rex.new (patt, [cf], [tr])`
+
+Parameter *syntax* (*syn*) must be one of the predefined strings that are formed
+from the RE_SYNTAX_xxx identifiers defined in regex.h, by means of omitting the
+RE_SYNTAX\_ part. For example, RE_SYNTAX_GREP becomes ``"GREP"`` on the Lua
+side. The default value, used when the parameter is not supplied or ``nil``, is
+either ``"POSIX_EXTENDED"`` (at start-up), or the value set by the last
+setsyntax_ call.
+
setsyntax
---------
@@ -705,6 +666,30 @@ argument is passed to those functions explicitly.
Oniguruma-only functions and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+new
+---
+
+:funcdef:`rex.new (patt, [cf], [enc], [syn])`
+
+The *encoding* parameter (*enc*) must be one of the predefined strings that are
+formed from the ONIG_ENCODING_xxx identifiers defined in oniguruma.h, by means
+of omitting the ONIG_ENCODING\_ part. For example, ONIG_ENCODING_UTF8 becomes
+``"UTF8"`` on the Lua side. The default value, used when the parameter is not
+supplied or ``nil``, is ``"ASCII"``.
+
+If the caller-supplied value of this parameter is not one of the predefined
+"encoding" string set, an error is raised.
+
+The parameter *syntax* (*syn*) must be one of the predefined strings that are
+formed from the ONIG_SYNTAX_xxx identifiers defined in oniguruma.h, by means of
+omitting the ONIG_SYNTAX\_ part. For example, ONIG_SYNTAX_JAVA becomes
+``"JAVA"`` on the Lua side. The default value, used when the parameter is not
+supplied or ``nil``, is either ``"RUBY"`` (at start-up), or the value set by the
+last setdefaultsyntax_ call.
+
+If the caller-supplied value of `syntax` parameter is not one of the predefined
+"syntax" string set, an error is raised.
+
setdefaultsyntax
----------------
@@ -739,6 +724,11 @@ library.
TRE-only functions and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+new
+---
+
+:funcdef:`rex.new (patt, [cf])`
+
atfind
-------
@@ -750,22 +740,22 @@ string *subj*, starting from offset *init*, subject to execution flags *ef*.
+---------+-----------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===================================+========+=============+
- | r | regex object produced by new_ |userdata| n/a |
+ | r |regex object produced by new_ |userdata| n/a |
+---------+-----------------------------------+--------+-------------+
- | subj | subject | string | n/a |
+ | subj |subject | string | n/a |
+---------+-----------------------------------+--------+-------------+
- | params | Approximate matching parameters. | table |n/a |
- | | The values are integers. | | |
- | | The valid string key values are: | |(Default |
- | | ``cost_ins``, ``cost_del``, | |value for |
- | | ``cost_subst``, ``max_cost``, | |a missing |
- | | ``max_ins``, ``max_del``, | |field is 0) |
- | | ``max_subst``, ``max_err`` | | |
+ | params |Approximate matching parameters. | table |n/a |
+ | |The values are integers. | | |
+ | |The valid string key values are: | |(Default |
+ | |``cost_ins``, ``cost_del``, | |value for |
+ | |``cost_subst``, ``max_cost``, | |a missing |
+ | |``max_ins``, ``max_del``, | |field is 0) |
+ | |``max_subst``, ``max_err`` | | |
+---------+-----------------------------------+--------+-------------+
| [init] |start offset in the subject | number | 1 |
| |(can be negative) | | |
+---------+-----------------------------------+--------+-------------+
- | [ef] | execution flags (bitwise OR) | number | ef_ |
+ | [ef] |execution flags (bitwise OR) | number | ef_ |
+---------+-----------------------------------+--------+-------------+
**Returns on success:**
@@ -793,22 +783,22 @@ string *subj*, starting from offset *init*, subject to execution flags *ef*.
+---------+-----------------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===================================+========+=============+
- | r | regex object produced by new_ |userdata| n/a |
+ | r |regex object produced by new_ |userdata| n/a |
+---------+-----------------------------------+--------+-------------+
- | subj | subject | string | n/a |
+ | subj |subject | string | n/a |
+---------+-----------------------------------+--------+-------------+
- | params | Approximate matching parameters. | table |n/a |
- | | The values are integers. | | |
- | | The valid string key values are: | |(Default |
- | | ``cost_ins``, ``cost_del``, | |value for |
- | | ``cost_subst``, ``max_cost``, | |a missing |
- | | ``max_ins``, ``max_del``, | |field is 0) |
- | | ``max_subst``, ``max_err`` | | |
+ | params |Approximate matching parameters. | table |n/a |
+ | |The values are integers. | | |
+ | |The valid string key values are: | |(Default |
+ | |``cost_ins``, ``cost_del``, | |value for |
+ | |``cost_subst``, ``max_cost``, | |a missing |
+ | |``max_ins``, ``max_del``, | |field is 0) |
+ | |``max_subst``, ``max_err`` | | |
+---------+-----------------------------------+--------+-------------+
| [init] |start offset in the subject | number | 1 |
| |(can be negative) | | |
+---------+-----------------------------------+--------+-------------+
- | [ef] | execution flags (bitwise OR) | number | ef_ |
+ | [ef] |execution flags (bitwise OR) | number | ef_ |
+---------+-----------------------------------+--------+-------------+
**Returns on success:**
@@ -891,9 +881,9 @@ The function searches for the first match of the string *patt* in the subject
+---------+---------------------------+--------+-------------+
|Parameter| Description | Type |Default Value|
+=========+===========================+========+=============+
- | subj | subject | string | n/a |
+ | subj |subject | string | n/a |
+---------+---------------------------+--------+-------------+
- | patt | text to find | string | n/a |
+ | patt |text to find | string | n/a |
+---------+---------------------------+--------+-------------+
| [init] |start offset in the subject| number | 1 |
| |(can be negative) | | |
@@ -938,4 +928,3 @@ Incompatibilities with previous versions
#. Methods tfind_ and exec_: 2 values are returned on failure
#. (PCRE) exec_: the returned table may additionally contain *named
subpatterns*
-