author Paul Eggert <eggert@cs.ucla.edu> 2019-12-30 00:22:05 -0800
committer Paul Eggert <eggert@cs.ucla.edu> 2019-12-30 00:22:32 -0800
commit 35cddb5464e1aec6fdd3c89a7e43d4374aab31e6
tree a00a11180bc79fc40a049746a9c9f2dfc7b49938 /doc/regex.texi
parent ff6ee46a283ef023146b4e69fc9cf776ac546899
doc: document trouble with back-references
* doc/regex.texi (Back-reference Operator): Mention bugs etc.
Diffstat (limited to 'doc/regex.texi')
-rw-r--r-- doc/regex.texi 12
1 files changed, 12 insertions, 0 deletions
diff --git a/doc/regex.texi b/doc/regex.texi
index 7b83cdd8e1..4e0da9b397 100644
--- a/doc/regex.texi
+++ b/doc/regex.texi
@@ -1144,6 +1144,18 @@ example, @samp{(a(b))\2*} matches @samp{a} followed by two or more
If there is no preceding @w{@var{digit}-th} subexpression, the regular
expression is invalid.
+Back-references can greatly slow down matching, as they can generate
+exponentially many matching possibilities that can consume both time
+and memory to explore. Also, the POSIX specification for
+back-references is at times unclear. Furthermore, many regular
+expression implementations have back-reference bugs that can cause
+programs to return incorrect answers or even crash, and fixing these
+bugs has often been low-priority---for example, as of 2019 the GNU C
+library bug database contained back-reference bugs 52, 10844, 11053,
+and 23522, with little sign of forthcoming fixes. Luckily,
+back-references are rarely useful and it should be little trouble to
+avoid them in practical applications.
+
@node Anchoring Operators
@section Anchoring Operators
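
The back-reference behavior described in the hunk above can be tried out with a small C sketch. It is only an illustration, not part of the commit: the helper name bre_matches is hypothetical, and the sketch assumes the POSIX <regex.h> interface (regcomp, regexec, regfree) with basic regular expression syntax, where groups are written \( \) and a back-reference is \1. Given the implementation bugs the patch mentions, results for more complicated patterns may vary between C libraries.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper for illustration only.
   Returns 1 if the POSIX basic regular expression PATTERN matches
   somewhere in TEXT, 0 if it does not match, and -1 if PATTERN
   cannot be compiled (for example, a back-reference to a group
   that does not precede it).  */
static int
bre_matches (const char *pattern, const char *text)
{
  regex_t re;

  /* REG_NOSUB: only a yes/no answer is needed, not match offsets.  */
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;

  int status = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return status == 0 ? 1 : 0;
}
```

With this helper, the pattern "^\\(ab\\)\\1$" should match "abab" but not "abac", because \1 must repeat exactly the text matched by the first group. A pattern consisting only of "\\1" has no preceding group, so compilation is expected to fail and the helper returns -1.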