-rw-r--r-- doc/html/pcre2grep.html | 77
-rw-r--r-- doc/pcre2grep.1 | 4
-rw-r--r-- doc/pcre2grep.txt | 81
3 files changed, 132 insertions, 30 deletions
diff --git a/doc/html/pcre2grep.html b/doc/html/pcre2grep.html
index dcfb96f..8f21034 100644
--- a/doc/html/pcre2grep.html
+++ b/doc/html/pcre2grep.html
@@ -22,11 +22,12 @@ please consult the man page, in case the conversion went wrong.
<li><a name="TOC7" href="#SEC7">NEWLINES</a>
<li><a name="TOC8" href="#SEC8">OPTIONS COMPATIBILITY</a>
<li><a name="TOC9" href="#SEC9">OPTIONS WITH DATA</a>
-<li><a name="TOC10" href="#SEC10">MATCHING ERRORS</a>
-<li><a name="TOC11" href="#SEC11">DIAGNOSTICS</a>
-<li><a name="TOC12" href="#SEC12">SEE ALSO</a>
-<li><a name="TOC13" href="#SEC13">AUTHOR</a>
-<li><a name="TOC14" href="#SEC14">REVISION</a>
+<li><a name="TOC10" href="#SEC10">CALLING EXTERNAL SCRIPTS</a>
+<li><a name="TOC11" href="#SEC11">MATCHING ERRORS</a>
+<li><a name="TOC12" href="#SEC12">DIAGNOSTICS</a>
+<li><a name="TOC13" href="#SEC13">SEE ALSO</a>
+<li><a name="TOC14" href="#SEC14">AUTHOR</a>
+<li><a name="TOC15" href="#SEC15">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>
<P>
@@ -735,7 +736,57 @@ The exceptions to the above are the <b>--colour</b> (or <b>--color</b>) and
options does have data, it must be given in the first form, using an equals
character. Otherwise <b>pcre2grep</b> will assume that it has no data.
</P>
-<br><a name="SEC10" href="#TOC1">MATCHING ERRORS</a><br>
+<br><a name="SEC10" href="#TOC1">CALLING EXTERNAL SCRIPTS</a><br>
+<P>
+On non-Windows systems, <b>pcre2grep</b> has, by default, support for calling
+external programs or scripts during matching by making use of PCRE2's callout
+facility. However, this support can be disabled when <b>pcre2grep</b> is built.
+You can find out whether your binary has support for callouts by running it
+with the <b>--help</b> option. If the support is not enabled, all callouts in
+patterns are ignored by <b>pcre2grep</b>.
+</P>
+<P>
+A callout in a PCRE2 pattern is of the form (?C&#60;arg&#62;) where the argument is
+either a number or a quoted string (see the
+<a href="pcre2callout.html"><b>pcre2callout</b></a>
+documentation for details). Numbered callouts are ignored by <b>pcre2grep</b>.
+String arguments are parsed as a list of substrings separated by pipe (vertical
+bar) characters. The first substring must be an executable name, with the
+following substrings specifying arguments:
+<pre>
+ executable_name|arg1|arg2|...
+</pre>
+Any substring (including the executable name) may contain escape sequences
+started by a dollar character: $&#60;digits&#62; or ${&#60;digits&#62;} is replaced by the
+captured substring of the given decimal number, which must be greater than
+zero. If the number is greater than the number of capturing substrings, or if
+the capture is unset, the replacement is empty.
+</P>
+<P>
+Any other character is substituted by itself. In particular, $$ is replaced by
+a single dollar and $| is replaced by a pipe character. Here is an example:
+<pre>
+ echo -e "abcde\n12345" | pcre2grep \
+ '(?x)(.)(..(.))
+ (?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)")()' -
+
+ Output:
+
+ Arg1: [a] [bcd] [d] Arg2: |a| ()
+ abcde
+ Arg1: [1] [234] [4] Arg2: |1| ()
+ 12345
+</pre>
+The parameters for the <b>execv()</b> system call that is used to run the
+program or script are zero-terminated strings. This means that binary zero
+characters in the callout argument will cause premature termination of their
+substrings, and therefore should not be present. Any syntax errors in the
+string (for example, a dollar not followed by another character) cause the
+callout to be ignored. If running the program fails for any reason (including
+the non-existence of the executable), a local matching failure occurs and the
+matcher backtracks in the normal way.
+</P>
+<br><a name="SEC11" href="#TOC1">MATCHING ERRORS</a><br>
<P>
It is possible to supply a regular expression that takes a very long time to
fail to match certain lines. Such patterns normally involve nested indefinite
@@ -751,7 +802,7 @@ overall resource limit; there is a second option called <b>--recursion-limit</b>
that sets a limit on the amount of memory (usually stack) that is used (see the
discussion of these options above).
</P>
-<br><a name="SEC11" href="#TOC1">DIAGNOSTICS</a><br>
+<br><a name="SEC12" href="#TOC1">DIAGNOSTICS</a><br>
<P>
Exit status is 0 if any matches were found, 1 if no matches were found, and 2
for syntax errors, overlong lines, non-existent or inaccessible files (even if
@@ -759,11 +810,11 @@ matches were found in other files) or too many matching errors. Using the
<b>-s</b> option to suppress error messages about inaccessible files does not
affect the return code.
</P>
-<br><a name="SEC12" href="#TOC1">SEE ALSO</a><br>
+<br><a name="SEC13" href="#TOC1">SEE ALSO</a><br>
<P>
-<b>pcre2pattern</b>(3), <b>pcre2syntax</b>(3).
+<b>pcre2pattern</b>(3), <b>pcre2syntax</b>(3), <b>pcre2callout</b>(3).
</P>
-<br><a name="SEC13" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC14" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
@@ -772,11 +823,11 @@ University Computing Service
Cambridge, England.
<br>
</P>
-<br><a name="SEC14" href="#TOC1">REVISION</a><br>
+<br><a name="SEC15" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 03 January 2015
+Last updated: 06 April 2016
<br>
-Copyright &copy; 1997-2015 University of Cambridge.
+Copyright &copy; 1997-2016 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
diff --git a/doc/pcre2grep.1 b/doc/pcre2grep.1
index 75e5685..b95ae55 100644
--- a/doc/pcre2grep.1
+++ b/doc/pcre2grep.1
@@ -687,9 +687,9 @@ a single dollar and $| is replaced by a pipe character. Here is an example:
echo -e "abcde\en12345" | pcre2grep \e
'(?x)(.)(..(.))
(?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)")()' -
-
+.sp
Output:
-
+.sp
Arg1: [a] [bcd] [d] Arg2: |a| ()
abcde
Arg1: [1] [234] [4] Arg2: |1| ()
diff --git a/doc/pcre2grep.txt b/doc/pcre2grep.txt
index 29cd75c..9ee7165 100644
--- a/doc/pcre2grep.txt
+++ b/doc/pcre2grep.txt
@@ -725,35 +725,86 @@ OPTIONS WITH DATA
equals character. Otherwise pcre2grep will assume that it has no data.
+CALLING EXTERNAL SCRIPTS
+
+ On non-Windows systems, pcre2grep has, by default, support for calling
+ external programs or scripts during matching by making use of PCRE2's
+ callout facility. However, this support can be disabled when pcre2grep
+ is built. You can find out whether your binary has support for call-
+ outs by running it with the --help option. If the support is not
+ enabled, all callouts in patterns are ignored by pcre2grep.
+
+ A callout in a PCRE2 pattern is of the form (?C<arg>) where the argu-
+ ment is either a number or a quoted string (see the pcre2callout docu-
+ mentation for details). Numbered callouts are ignored by pcre2grep.
+ String arguments are parsed as a list of substrings separated by pipe
+ (vertical bar) characters. The first substring must be an executable
+ name, with the following substrings specifying arguments:
+
+ executable_name|arg1|arg2|...
+
+ Any substring (including the executable name) may contain escape
+ sequences started by a dollar character: $<digits> or ${<digits>} is
+ replaced by the captured substring of the given decimal number, which
+ must be greater than zero. If the number is greater than the number of
+ capturing substrings, or if the capture is unset, the replacement is
+ empty.
+
+ Any other character is substituted by itself. In particular, $$ is
+ replaced by a single dollar and $| is replaced by a pipe character.
+ Here is an example:
+
+ echo -e "abcde\n12345" | pcre2grep \
+ '(?x)(.)(..(.))
+ (?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)")()' -
+
+ Output:
+
+ Arg1: [a] [bcd] [d] Arg2: |a| ()
+ abcde
+ Arg1: [1] [234] [4] Arg2: |1| ()
+ 12345
+
+ The parameters for the execv() system call that is used to run the pro-
+ gram or script are zero-terminated strings. This means that binary zero
+ characters in the callout argument will cause premature termination of
+ their substrings, and therefore should not be present. Any syntax
+ errors in the string (for example, a dollar not followed by another
+ character) cause the callout to be ignored. If running the program
+ fails for any reason (including the non-existence of the executable), a
+ local matching failure occurs and the matcher backtracks in the normal
+ way.
+
+
MATCHING ERRORS
- It is possible to supply a regular expression that takes a very long
- time to fail to match certain lines. Such patterns normally involve
- nested indefinite repeats, for example: (a+)*\d when matched against a
- line of a's with no final digit. The PCRE2 matching function has a
- resource limit that causes it to abort in these circumstances. If this
- happens, pcre2grep outputs an error message and the line that caused
- the problem to the standard error stream. If there are more than 20
+ It is possible to supply a regular expression that takes a very long
+ time to fail to match certain lines. Such patterns normally involve
+ nested indefinite repeats, for example: (a+)*\d when matched against a
+ line of a's with no final digit. The PCRE2 matching function has a
+ resource limit that causes it to abort in these circumstances. If this
+ happens, pcre2grep outputs an error message and the line that caused
+ the problem to the standard error stream. If there are more than 20
such errors, pcre2grep gives up.
- The --match-limit option of pcre2grep can be used to set the overall
- resource limit; there is a second option called --recursion-limit that
- sets a limit on the amount of memory (usually stack) that is used (see
+ The --match-limit option of pcre2grep can be used to set the overall
+ resource limit; there is a second option called --recursion-limit that
+ sets a limit on the amount of memory (usually stack) that is used (see
the discussion of these options above).
DIAGNOSTICS
Exit status is 0 if any matches were found, 1 if no matches were found,
- and 2 for syntax errors, overlong lines, non-existent or inaccessible
- files (even if matches were found in other files) or too many matching
+ and 2 for syntax errors, overlong lines, non-existent or inaccessible
+ files (even if matches were found in other files) or too many matching
errors. Using the -s option to suppress error messages about inaccessi-
ble files does not affect the return code.
SEE ALSO
- pcre2pattern(3), pcre2syntax(3).
+ pcre2pattern(3), pcre2syntax(3), pcre2callout(3).
AUTHOR
@@ -765,5 +816,5 @@ AUTHOR
REVISION
- Last updated: 03 January 2015
- Copyright (c) 1997-2015 University of Cambridge.
+ Last updated: 06 April 2016
+ Copyright (c) 1997-2016 University of Cambridge.
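
Note on the callout string rules documented in this change: the substitution and splitting behaviour described under CALLING EXTERNAL SCRIPTS can be illustrated with a small Python model. This is only a sketch of the documented rules, not the pcre2grep implementation (which is written in C). The function name expand_callout and the captures list are hypothetical. Treating group number zero and a malformed ${...} as syntax errors is an assumption, since the text only says the number "must be greater than zero".

```python
DIGITS = "0123456789"


def expand_callout(callout, captures):
    """Model the documented expansion of a pcre2grep callout string.

    callout  -- the text inside (?C"..."), e.g. "prog|arg1|arg2"
    captures -- captures[0] is group 1; None marks an unset group
    Returns the argument list (program name first), or None when the
    string has a syntax error and the callout would be ignored.
    """
    args = []
    current = []
    i = 0
    n = len(callout)
    while i < n:
        ch = callout[i]
        if ch == "|":
            # An unescaped pipe ends the current substring.
            args.append("".join(current))
            current = []
            i += 1
        elif ch == "$":
            i += 1
            if i >= n:
                return None  # dollar not followed by another character
            if callout[i] == "{":
                end = callout.find("}", i)
                digits = callout[i + 1:end] if end != -1 else ""
                if not digits or any(c not in DIGITS for c in digits):
                    return None  # assumption: malformed ${...} is an error
                i = end + 1
            elif callout[i] in DIGITS:
                j = i
                while j < n and callout[j] in DIGITS:
                    j += 1
                digits = callout[i:j]
                i = j
            else:
                # "$$" gives "$", "$|" gives "|"; any other character
                # after a dollar is substituted by itself.
                current.append(callout[i])
                i += 1
                continue
            number = int(digits)
            if number == 0:
                return None  # assumption: group zero is rejected
            # Out-of-range or unset groups expand to the empty string.
            value = captures[number - 1] if number <= len(captures) else None
            current.append(value or "")
        else:
            current.append(ch)
            i += 1
    args.append("".join(current))
    return args


if __name__ == "__main__":
    # The callout from the documentation example, applied to "abcde".
    # Group 4 follows the callout in the pattern, so it is not yet set.
    text = "/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)"
    print(expand_callout(text, ["a", "bcd", "d"]))
```

For the first input line of the documented example the model produces the program name /bin/echo followed by the two arguments "Arg1: [a] [bcd] [d]" and "Arg2: |a| ()", which matches the output shown in the added documentation.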