Diffstat (limited to 'gnulib/doc/regexprops-generic.texi')
m--------- | gnulib | 0
-rw-r--r-- | gnulib/doc/regexprops-generic.texi | 715
2 files changed, 715 insertions, 0 deletions
diff --git a/gnulib b/gnulib deleted file mode 160000 -Subproject 443bc5ffcf7429e557f4a371b0661abe98ddbc1 diff --git a/gnulib/doc/regexprops-generic.texi b/gnulib/doc/regexprops-generic.texi new file mode 100644 index 0000000..e70d954 --- /dev/null +++ b/gnulib/doc/regexprops-generic.texi @@ -0,0 +1,715 @@ +@c Copyright (C) 1994, 1996, 1998, 2000-2001, 2003-2007, 2009-2011 Free +@c Software Foundation, Inc. +@c +@c Permission is granted to copy, distribute and/or modify this document +@c under the terms of the GNU Free Documentation License, Version 1.3 or +@c any later version published by the Free Software Foundation; with no +@c Invariant Sections, with no Front-Cover Texts, and with no Back-Cover +@c Texts. A copy of the license is included in the ``GNU Free +@c Documentation License'' file as part of this distribution. + +@c this regular expression description is for: generic + +@menu +* awk regular expression syntax:: +* egrep regular expression syntax:: +* ed regular expression syntax:: +* emacs regular expression syntax:: +* gnu-awk regular expression syntax:: +* grep regular expression syntax:: +* posix-awk regular expression syntax:: +* posix-basic regular expression syntax:: +* posix-egrep regular expression syntax:: +* posix-extended regular expression syntax:: +* posix-minimal-basic regular expression syntax:: +* sed regular expression syntax:: +@end menu + +@node awk regular expression syntax +@subsection @samp{awk} regular expression syntax + + +The character @samp{.} matches any single character except the null character. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? +matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. 
Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}. + +GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively. + +Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit. + +The alternation operator is @samp{|}. + +The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. + +@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except: +@enumerate + +@item At the beginning of a regular expression + +@item After an open-group, signified by +@samp{(} +@item After the alternation operator @samp{|} + +@end enumerate + + + + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + +@node egrep regular expression syntax +@subsection @samp{egrep} regular expression syntax + + +The character @samp{.} matches any single character except newline. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? +matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. 
Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline. + +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}. + +The alternation operator is @samp{|}. + +The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. + +The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression. + + + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + +@node ed regular expression syntax +@subsection @samp{ed} regular expression syntax + + +The character @samp{.} matches any single character except the null character. 
+ + +@table @samp + +@item \+ +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item \? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item + and ? +match themselves. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. + +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}. + +The alternation operator is @samp{\|}. 
+ +The character @samp{^} only represents the beginning of a string when it appears: +@enumerate + +@item +At the beginning of a regular expression + +@item After an open-group, signified by +@samp{\(} + +@item After the alternation operator @samp{\|} + +@end enumerate + + +The character @samp{$} only represents the end of a string when it appears: +@enumerate + +@item At the end of a regular expression + +@item Before a close-group, signified by +@samp{\)} +@item Before the alternation operator @samp{\|} + +@end enumerate + + +@samp{\*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except: +@enumerate + +@item At the beginning of a regular expression + +@item After an open-group, signified by +@samp{\(} +@item After the alternation operator @samp{\|} + +@end enumerate + + +Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted. + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + +@node emacs regular expression syntax +@subsection @samp{emacs} regular expression syntax + + +The character @samp{.} matches any single character except newline. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? +matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}. 
+ +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}. + +The alternation operator is @samp{\|}. + +The character @samp{^} only represents the beginning of a string when it appears: +@enumerate + +@item +At the beginning of a regular expression + +@item After an open-group, signified by +@samp{\(} + +@item After the alternation operator @samp{\|} + +@end enumerate + + +The character @samp{$} only represents the end of a string when it appears: +@enumerate + +@item At the end of a regular expression + +@item Before a close-group, signified by +@samp{\)} +@item Before the alternation operator @samp{\|} + +@end enumerate + + +@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except: +@enumerate + +@item At the beginning of a regular expression + +@item After an open-group, signified by +@samp{\(} +@item After the alternation operator @samp{\|} + +@end enumerate + + + + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. 
+ + +@node gnu-awk regular expression syntax +@subsection @samp{gnu-awk} regular expression syntax + + +The character @samp{.} matches any single character. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? +matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. + +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}. + +The alternation operator is @samp{|}. + +The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. 
+ +@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except: +@enumerate + +@item At the beginning of a regular expression + +@item After an open-group, signified by +@samp{(} +@item After the alternation operator @samp{|} + +@end enumerate + + + + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + +@node grep regular expression syntax +@subsection @samp{grep} regular expression syntax + + +The character @samp{.} matches any single character except newline. + + +@table @samp + +@item \+ +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item \? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item + and ? +match themselves. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline. + +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. 
A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}. + +The alternation operator is @samp{\|}. + +The character @samp{^} only represents the beginning of a string when it appears: +@enumerate + +@item +At the beginning of a regular expression + +@item After an open-group, signified by +@samp{\(} + +@item After a newline + +@item After the alternation operator @samp{\|} + +@end enumerate + + +The character @samp{$} only represents the end of a string when it appears: +@enumerate + +@item At the end of a regular expression + +@item Before a close-group, signified by +@samp{\)} +@item Before a newline + +@item Before the alternation operator @samp{\|} + +@end enumerate + + +@samp{\*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except: +@enumerate + +@item At the beginning of a regular expression + +@item After an open-group, signified by +@samp{\(} +@item After a newline + +@item After the alternation operator @samp{\|} + +@end enumerate + + +Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted. + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + +@node posix-awk regular expression syntax +@subsection @samp{posix-awk} regular expression syntax + + +The character @samp{.} matches any single character except the null character. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? 
+matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. + +GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively. + +Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}. + +The alternation operator is @samp{|}. + +The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. + +@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed: +@enumerate + +@item At the beginning of a regular expression + +@item After an open-group, signified by +@samp{(} +@item After the alternation operator @samp{|} + +@end enumerate + + +Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted. + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. 
+ + +@node posix-basic regular expression syntax +@subsection @samp{posix-basic} regular expression syntax +This is a synonym for ed. +@node posix-egrep regular expression syntax +@subsection @samp{posix-egrep} regular expression syntax + + +The character @samp{.} matches any single character except newline. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? +matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline. + +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}. + +The alternation operator is @samp{|}. 
+ +The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. + +The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression. + +Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals, for example @samp{a@{1} is treated as @samp{a\@{1} + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + +@node posix-extended regular expression syntax +@subsection @samp{posix-extended} regular expression syntax + + +The character @samp{.} matches any single character except the null character. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? +matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. 
+ +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}. + +The alternation operator is @samp{|}. + +The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. + +@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed: +@enumerate + +@item At the beginning of a regular expression + +@item After an open-group, signified by +@samp{(} +@item After the alternation operator @samp{|} + +@end enumerate + + +Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted. + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + +@node posix-minimal-basic regular expression syntax +@subsection @samp{posix-minimal-basic} regular expression syntax + + +The character @samp{.} matches any single character except the null character. 
+ + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. + +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}. + + + +The character @samp{^} only represents the beginning of a string when it appears: +@enumerate + +@item +At the beginning of a regular expression + +@item After an open-group, signified by +@samp{\(} + +@end enumerate + + +The character @samp{$} only represents the end of a string when it appears: +@enumerate + +@item At the end of a regular expression + +@item Before a close-group, signified by +@samp{\)} +@end enumerate + + + + +Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted. + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. 
+ + +@node sed regular expression syntax +@subsection @samp{sed} regular expression syntax +This is a synonym for ed.
\ No newline at end of file