From e874d29e595fc2c60c30c14f3e7e9ab3ff0fe60a Mon Sep 17 00:00:00 2001
From: Shmuel Zeigerman
Table of Contents
-Lrexlib builds into shared libraries called by default rex_posix.so,
-rex_pcre.so, rex_gnu.so, rex_tre.so and rex_onig.so, which can be used with
-require.
Most functions and methods in Lrexlib have mandatory and optional arguments. There are no dependencies between arguments in Lrexlib's functions and
@@ -82,8 +92,9 @@ MyFunc (arg1, arg2, [arg3], [arg4])
Throughout this document (unless it causes ambiguity), the identifier rex
-is used in place of either rex_posix, rex_pcre, rex_gnu, rex_onig or
-rex_tre, which are the default namespaces for the corresponding libraries.
+is used in place of either rex_posix, rex_pcre, rex_pcre2, rex_gnu,
+rex_onig or rex_tre, which are the default namespaces for the corresponding
+libraries.
All functions that take a regular expression pattern as an argument will generate an error if that pattern is found invalid by the regex library.
@@ -108,73 +119,60 @@ a length that excludes the NUL.
the parameter is not supplied or nil is:
-- REG_EXTENDED for POSIX and TRE
-- 0 for PCRE
-- ONIG_OPTION_NONE for Oniguruma
-- SYNTAX_POSIX_EXTENDED for GNU
+- REG_EXTENDED for POSIX and TRE
+- 0 for PCRE and PCRE2
+- ONIG_OPTION_NONE for Oniguruma
+- SYNTAX_POSIX_EXTENDED for GNU
-PCRE, Oniguruma: cf may also be supplied as a string, whose
-characters stand for compilation flags. Combinations of the following
+PCRE, PCRE2, Oniguruma: cf may also be supplied as a string,
+whose characters stand for compilation flags. Combinations of the following
characters (case sensitive) are supported:
-Character  PCRE flag       Oniguruma flag
-i          PCRE_CASELESS   ONIG_OPTION_IGNORECASE
-m          PCRE_MULTILINE  ONIG_OPTION_NEGATE_SINGLELINE
-s          PCRE_DOTALL     ONIG_OPTION_MULTILINE
-x          PCRE_EXTENDED   ONIG_OPTION_EXTEND
-U          PCRE_UNGREEDY   n/a
-X          PCRE_EXTRA      n/a
+Character  PCRE flag       PCRE2 flag       Oniguruma flag
+i          PCRE_CASELESS   PCRE2_CASELESS   ONIG_OPTION_IGNORECASE
+m          PCRE_MULTILINE  PCRE2_MULTILINE  ONIG_OPTION_NEGATE_SINGLELINE
+s          PCRE_DOTALL     PCRE2_DOTALL     ONIG_OPTION_MULTILINE
+x          PCRE_EXTENDED   PCRE2_EXTENDED   ONIG_OPTION_EXTEND
+U          PCRE_UNGREEDY   PCRE2_UNGREEDY   n/a
+X          PCRE_EXTRA      n/a              n/a
@@ -186,10 +184,9 @@ characters (case sensitive) are supported:
the parameter is not supplied or nil, is:
-- 0 for standard POSIX regex library
-- REG_STARTEND for those POSIX regex libraries that support it,
-e.g. Spencer's.
-- 0 for PCRE, Oniguruma and TRE
+- 0 for standard POSIX regex library
+- REG_STARTEND for those POSIX regex libraries that support it, e.g. Spencer's
+- 0 for PCRE, PCRE2, Oniguruma and TRE
rex.match (subj, patt, [init], [cf], [ef], [larg...])
or
r:match (subj, [init], [ef])
@@ -287,7 +284,7 @@ substring is returned.
rex.find (subj, patt, [init], [cf], [ef], [larg...])
or
r:find (subj, [init], [ef])
@@ -369,7 +366,7 @@ the match.
rex.gmatch (subj, patt, [cf], [ef], [larg...])
The function is intended for use in the generic for Lua construct. It returns an iterator for repeated matching of the pattern patt in
@@ -427,7 +424,7 @@ till the subject fails to match.
rex.gsub (subj, patt, repl, [n], [cf], [ef], [larg...])
This function searches for all matches of the pattern patt in the string subj and replaces them according to the parameters repl and n (see details
@@ -593,7 +590,7 @@ next match; n will not be called again;
rex.split (subj, sep, [cf], [ef], [larg...])
The function is intended for use in the generic for Lua construct. It is used for splitting a subject string subj into parts (sections).
@@ -664,7 +661,7 @@ subject.
rex.count (subj, patt, [cf], [ef], [larg...])
This function counts matches of the pattern patt in the string subj.
@@ -721,7 +718,7 @@ subject.
rex.flags ([tb])
This function returns a table containing the numeric values of the constants defined by the used regex library, with the keys being the (string) names of the
@@ -768,8 +765,8 @@ The keys in the tb table are formed from the names of the correspon
constants in the used library. They are formed as follows:
rex.new (patt, [cf], [larg...])
The function compiles regular expression patt into a regular expression object whose internal representation is corresponding to the library used. The returned
@@ -838,7 +835,7 @@ any.
r:tfind (subj, [init], [ef])
The method searches for the first match of the compiled regexp r in the string subj, starting from offset init, subject to execution flags ef.
@@ -890,9 +887,9 @@ string subj, starting from offset init, subject to execution f
r:exec (subj, [init], [ef])
The method searches for the first match of the compiled regexp r in the string subj, starting from offset init, subject to execution flags ef.
@@ -959,9 +956,9 @@ string subj, starting from offset init, subject to execution f
returned as a third result, in a table. This table contains false in the positions where the corresponding sub-pattern did not participate in the match.
rex.new (patt, [cf], [lo])
The locale (lo) can be either a string (e.g., "French_France.1252"), or a
-userdata obtained from a call to maketables. The default value, used when the
-parameter is not supplied or nil, is the built-in PCRE set of character
+userdata obtained from a call to maketables. The default value, used when
+the parameter is not supplied or nil, is the built-in PCRE set of character
tables.
[See pcre_fullinfo in the PCRE docs.]
r:fullinfo ()
This function returns a table containing information about the compiled pattern. The keys are strings formed in the following way: PCRE_INFO_CAPTURECOUNT -> "CAPTURECOUNT". The values are numbers.
-[PCRE 6.0 and later. See pcre_dfa_exec in the PCRE docs.]
r:dfa_exec (subj, [init], [ef], [ovecsize], [wscount])
The method matches a compiled regular expression r against a given subject
@@ -1074,10 +1071,10 @@ first.
[See pcre_maketables in the PCRE docs.]
rex_pcre.maketables ()
Creates a set of character tables corresponding to the current locale and
@@ -1086,7 +1083,7 @@ function accepting the locale parameter.
[PCRE 4.0 and later. See pcre_config in the PCRE docs.]
rex_pcre.config ([tb])
This function returns a table containing the values of the configuration
@@ -1095,8 +1092,8 @@ keyed by their names (strings). If the table argument tb is supplied th
is used as the output table, else a new table is created.
[See pcre_version in the PCRE docs.]
rex_pcre.version ()
This function returns a string containing the version of the used PCRE library
@@ -1104,10 +1101,144 @@ and its release date.
rex.new (patt, [cf], [lo])
+The locale (lo) can be either a string (e.g., "French_France.1252"), or a
+userdata obtained from a call to maketables. The default value, used when
+the parameter is not supplied or nil, is the built-in PCRE2 set of character
+tables.
+[See pcre2_patterninfo in the PCRE2 docs.]
+r:patterninfo ()
+This function returns a table containing information about the compiled pattern.
+The keys are strings formed in the following way:
+PCRE2_INFO_CAPTURECOUNT -> "CAPTURECOUNT". The values are numbers.
+[See pcre2_dfa_exec in the PCRE2 docs.]
+r:dfa_exec (subj, [init], [ef], [ovecsize], [wscount])
+The method matches a compiled regular expression r against a given subject
+string subj, using a DFA matching algorithm.
+Parameter   Description                                     Type      Default Value
+r           regex object produced by new                    userdata  n/a
+subj        subject                                         string    n/a
+[init]      start offset in the subject (can be negative)   number    1
+[ef]        execution flags (bitwise OR)                    number    ef
+[ovecsize]  size of the array for result offsets            number    100
+[wscount]   number of elements in the working space array   number    50
+[See pcre2_jit_compile in the PCRE2 docs.]
+r:jit_compile ([options])
+Parameter options is a number (a bitwise OR of separate options;
+it defaults to PCRE2_JIT_COMPLETE).
+The method returns true on success or false + error message string on failure.
+[See pcre2_maketables in the PCRE2 docs.]
+rex_pcre2.maketables ()
+Creates a set of character tables corresponding to the current locale and
+returns it as a userdata. The returned value can be passed to any Lrexlib
+function accepting the locale parameter.
+[See pcre2_config in the PCRE2 docs.]
+rex_pcre2.config ([tb])
+This function returns a table containing the values of the configuration
+parameters used at PCRE2 library build-time. Those parameters (numbers) are
+keyed by their names (strings). If the table argument tb is supplied then it
+is used as the output table, else a new table is created.
+[See pcre2_config(PCRE2_CONFIG_VERSION) in the PCRE2 docs.]
+rex_pcre2.version ()
+This function returns a string containing the version of the used PCRE2 library
+and its release date.
+rex.new (patt, [cf], [tr])
If the compilation flags (cf) are not supplied or nil, the default syntax is SYNTAX_POSIX_EXTENDED. Note that this is not the same as passing a value
@@ -1119,9 +1250,9 @@ translated when it is being matched.
rex.new (patt, [cf], [enc], [syn])
The encoding parameter (enc) must be one of the predefined strings that are formed from the ONIG_ENCODING_xxx identifiers defined in oniguruma.h, by means
@@ -1140,7 +1271,7 @@ last setdefaultsyntax "syntax" string set, an error is raised.
rex_onig.setdefaultsyntax (syntax)
This function sets the default syntax for the Oniguruma library, according to the value of the string syntax. The specified syntax will be further used for
@@ -1156,8 +1287,8 @@ argument is passed to those functions explicitly.
[See onig_version in the Oniguruma docs.]
rex_onig.version ()
This function returns a string containing the version of the used Oniguruma
@@ -1165,7 +1296,7 @@ library.
[See onig_number_of_captures in the Oniguruma docs.]
r:capturecount ()
Returns the number of captures in the pattern.
@@ -1173,13 +1304,13 @@ library.
rex.new (patt, [cf])
r:atfind (subj, params, [init], [ef])
The method searches for the first match of the compiled regexp r in the string subj, starting from offset init, subject to execution flags ef.
@@ -1260,7 +1391,7 @@ in the following fields: cost,
r:aexec (subj, params, [init], [ef])
The method searches for the first match of the compiled regexp r in the string subj, starting from offset init, subject to execution flags ef.
@@ -1342,21 +1473,21 @@ the match, in the following fields: cost,
r:have_approx ()
The method returns true if the compiled pattern uses approximate matching, and false if not.
r:have_backrefs ()
The method returns true if the compiled pattern has back references, and false if not.
[See tre_config in the TRE docs.]
rex_tre.config ([tb])
This function returns a table containing the values of the configuration
@@ -1366,7 +1497,7 @@ is used as the output table, else a new table is created.
[See tre_version in the TRE docs.]
rex_tre.version ()
This function returns a string containing the version of the used TRE library.
@@ -1374,7 +1505,7 @@ is used as the output table, else a new table is created.
Incompatibilities between versions 2.8 and 2.7:
@@ -1403,7 +1534,7 @@ position.
Incompatibilities between versions 2.1 and 2.0:
@@ -1427,7 +1558,7 @@ subpatterns