summaryrefslogtreecommitdiff
path: root/src/regexp
Commit message (Collapse)AuthorAgeFilesLines
* regexp: fix copy-paste typo on Regexp.UnmarshalText docDaniel Martí2023-04-141-4/+4
| | | | | | | | | | | | | | | | I noticed that https://go.dev/cl/479401 called both methods MarshalText in the godoc, so fix that. While here, add more godoc links for better usability. Change-Id: I8f10bafeca6a1ca1c1ed9be7a7dd9fdecfe991a0 Reviewed-on: https://go-review.googlesource.com/c/go/+/484335 Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: David Chase <drchase@google.com>
* regexp: add Regexp.TextMarshaler/TextUnmarshalerJonathan Hall2023-04-122-0/+47
| | | | | | | | | | | | | Fixes #46159 Change-Id: I51dc4e9e8915ab5a73f053690fb2395edbeb1151 Reviewed-on: https://go-review.googlesource.com/c/go/+/479401 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com>
* all: re-run stringerIan Lance Taylor2023-04-111-0/+26
| | | | | | | | | | | | | | | Re-run all go:generate stringer commands. This mostly adds checks that the constant values did not change, but does add new strings for the debug/dwarf and internal/pkgbits packages. Change-Id: I5fc41f20da47338152c183d45d5ae65074e2fccf Reviewed-on: https://go-review.googlesource.com/c/go/+/483717 Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>
* regexp/syntax: test for lowercase letters first in IsWordCharLudi Rehak2023-03-142-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | Lowercase letters occur more frequently than uppercase letters in English text. In IsWordChar, evaluate the most common case (lowercase letters) first to minimize the expected value of its execution time. Code clarity does not suffer by rearranging the order of the checks. Add a benchmark on a sentence demonstrating the performance improvement. name old time/op new time/op delta IsWordChar-10 122ns ± 0% 114ns ± 1% -6.68% (p=0.000 n=8+10) Change-Id: Ieee8126a4bd8ee8703905b4f75724623029f6fa2 Reviewed-on: https://go-review.googlesource.com/c/go/+/404100 Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: thepudds <thepudds1460@gmail.com>
* all: add missing periods in commentscui fliter2022-11-182-3/+3
| | | | | | | | | | | | | Change-Id: I69065f8adf101fdb28682c55997f503013a50e29 Reviewed-on: https://go-review.googlesource.com/c/go/+/449757 Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Joedian Reid <joedian@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Joedian Reid <joedian@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
* regexp: add ErrLarge errorcuiweixie2022-11-022-4/+6
| | | | | | | | | | | | | | For #56041 Change-Id: I6c98458b5c0d3b3636a53ee04fc97221f3fd8bbc Reviewed-on: https://go-review.googlesource.com/c/go/+/444817 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: xie cui <523516579@qq.com>
* regexp: limit size of parsed regexpsRuss Cox2022-10-052-10/+150
| | | | | | | | | | | | | | | | | | | | Set a 128 MB limit on the amount of space used by []syntax.Inst in the compiled form corresponding to a given regexp. Also set a 128 MB limit on the rune storage in the *syntax.Regexp tree itself. Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue. Fixes CVE-2022-41715. Fixes #55949. Change-Id: Ia656baed81564436368cf950e1c5409752f28e1b Reviewed-on: https://go-review.googlesource.com/c/go/+/439356 Auto-Submit: Roland Shoemaker <roland@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Roland Shoemaker <roland@golang.org> Reviewed-by: Damien Neil <dneil@google.com>
* regexp: fix a few function names on commentscui fliter2022-10-022-4/+4
| | | | | | | | | | | | Change-Id: I192dd34c677e52e16f0ef78e1dae58a78f6d1aac GitHub-Last-Rev: 1638a7468951df72f13fea34855b6a4fcbb08226 GitHub-Pull-Request: golang/go#55967 Reviewed-on: https://go-review.googlesource.com/c/go/+/436885 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com>
* regexp: avoid copying each instruction executedBryan Boreham2022-06-042-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inst is a 40-byte struct, so avoiding the copy gives a decent speedup: name old time/op new time/op delta Find-8 160ns ± 4% 109ns ± 4% -32.22% (p=0.008 n=5+5) FindAllNoMatches-8 70.4ns ± 4% 53.8ns ± 0% -23.58% (p=0.016 n=5+4) FindString-8 154ns ± 6% 107ns ± 0% -30.37% (p=0.016 n=5+4) FindSubmatch-8 194ns ± 1% 135ns ± 1% -30.56% (p=0.008 n=5+5) FindStringSubmatch-8 193ns ± 8% 131ns ± 0% -31.82% (p=0.008 n=5+5) Literal-8 42.8ns ± 2% 34.8ns ± 0% -18.67% (p=0.008 n=5+5) NotLiteral-8 917ns ± 2% 636ns ± 0% -30.68% (p=0.008 n=5+5) MatchClass-8 1.18µs ± 3% 0.91µs ± 1% -22.27% (p=0.016 n=5+4) MatchClass_InRange-8 1.11µs ± 1% 0.87µs ± 2% -21.38% (p=0.008 n=5+5) ReplaceAll-8 659ns ± 6% 596ns ± 3% -9.60% (p=0.008 n=5+5) AnchoredLiteralShortNonMatch-8 34.2ns ± 0% 30.4ns ± 1% -11.20% (p=0.016 n=4+5) AnchoredLiteralLongNonMatch-8 38.7ns ± 0% 38.7ns ± 0% ~ (p=0.579 n=5+5) AnchoredShortMatch-8 67.0ns ± 1% 52.7ns ± 0% -21.31% (p=0.016 n=5+4) AnchoredLongMatch-8 121ns ± 0% 124ns ±10% ~ (p=0.730 n=5+5) OnePassShortA-8 392ns ± 0% 231ns ± 3% -41.10% (p=0.008 n=5+5) NotOnePassShortA-8 370ns ± 0% 282ns ± 1% -23.81% (p=0.008 n=5+5) OnePassShortB-8 280ns ± 0% 179ns ± 1% -36.05% (p=0.008 n=5+5) NotOnePassShortB-8 226ns ± 0% 185ns ± 3% -18.26% (p=0.008 n=5+5) OnePassLongPrefix-8 51.7ns ± 0% 39.1ns ± 1% -24.28% (p=0.016 n=4+5) OnePassLongNotPrefix-8 213ns ± 2% 132ns ± 1% -37.86% (p=0.008 n=5+5) MatchParallelShared-8 25.3ns ± 3% 23.4ns ± 7% -7.50% (p=0.016 n=5+5) MatchParallelCopied-8 26.5ns ± 7% 22.3ns ± 7% -16.06% (p=0.008 n=5+5) QuoteMetaAll-8 45.8ns ± 1% 45.8ns ± 1% ~ (p=1.000 n=5+5) QuoteMetaNone-8 24.3ns ± 0% 24.3ns ± 0% ~ (p=0.325 n=5+5) Compile/Onepass-8 1.98µs ± 0% 1.97µs ± 0% -0.22% (p=0.016 n=5+4) Compile/Medium-8 4.56µs ± 0% 4.55µs ± 1% ~ (p=0.595 n=5+5) Compile/Hard-8 35.7µs ± 0% 35.3µs ± 3% ~ (p=0.151 n=5+5) Match/Easy0/16-8 2.18ns ± 2% 2.19ns ± 5% ~ (p=0.690 n=5+5) Match/Easy0/32-8 27.4ns ± 2% 27.6ns ± 4% ~ (p=1.000 n=5+5) Match/Easy0/1K-8 246ns ± 0% 252ns ± 7% ~ (p=0.238 n=5+5) Match/Easy0/32K-8 4.58µs ± 7% 4.64µs ± 5% ~ (p=1.000 n=5+5) Match/Easy0/1M-8 235µs ± 0% 235µs ± 0% ~ (p=0.886 n=4+4) Match/Easy0/32M-8 7.86ms ± 0% 7.86ms ± 1% ~ (p=0.730 n=4+5) Match/Easy0i/16-8 2.15ns ± 0% 2.15ns ± 0% ~ (p=0.246 n=5+5) Match/Easy0i/32-8 507ns ± 2% 466ns ± 4% -8.03% (p=0.008 n=5+5) Match/Easy0i/1K-8 14.7µs ± 0% 13.6µs ± 2% -7.63% (p=0.008 n=5+5) Match/Easy0i/32K-8 571µs ± 1% 570µs ± 1% ~ (p=0.556 n=4+5) Match/Easy0i/1M-8 18.2ms ± 0% 18.8ms ±11% ~ (p=0.548 n=5+5) Match/Easy0i/32M-8 581ms ± 0% 590ms ± 1% +1.52% (p=0.016 n=4+5) Match/Easy1/16-8 2.17ns ± 0% 2.15ns ± 0% -0.90% (p=0.000 n=5+4) Match/Easy1/32-8 25.1ns ± 0% 25.4ns ± 4% ~ (p=0.651 n=5+5) Match/Easy1/1K-8 462ns ± 1% 431ns ± 4% -6.56% (p=0.008 n=5+5) Match/Easy1/32K-8 18.8µs ± 0% 18.8µs ± 1% ~ (p=1.000 n=5+5) Match/Easy1/1M-8 658µs ± 0% 658µs ± 1% ~ (p=0.841 n=5+5) Match/Easy1/32M-8 21.0ms ± 1% 21.0ms ± 2% ~ (p=0.841 n=5+5) Match/Medium/16-8 2.15ns ± 0% 2.16ns ± 0% ~ (p=0.714 n=4+5) Match/Medium/32-8 561ns ± 1% 512ns ± 5% -8.69% (p=0.008 n=5+5) Match/Medium/1K-8 16.9µs ± 0% 15.2µs ± 1% -10.40% (p=0.008 n=5+5) Match/Medium/32K-8 632µs ± 0% 631µs ± 1% ~ (p=0.421 n=5+5) Match/Medium/1M-8 20.3ms ± 1% 20.1ms ± 0% ~ (p=0.190 n=5+4) Match/Medium/32M-8 650ms ± 1% 646ms ± 0% -0.58% (p=0.032 n=5+4) Match/Hard/16-8 2.15ns ± 0% 2.15ns ± 1% ~ (p=0.111 n=5+5) Match/Hard/32-8 870ns ± 2% 667ns ± 1% -23.28% (p=0.008 n=5+5) Match/Hard/1K-8 26.9µs ± 0% 21.0µs ± 2% -21.83% (p=0.008 n=5+5) Match/Hard/32K-8 833µs ± 0% 833µs ± 1% ~ (p=0.548 n=5+5) Match/Hard/1M-8 26.6ms ± 0% 26.8ms ± 1% ~ (p=0.905 n=4+5) Match/Hard/32M-8 856ms ± 0% 851ms ± 0% -0.65% (p=0.016 n=5+4) Match/Hard1/16-8 2.96µs ±12% 1.81µs ± 3% -38.68% (p=0.008 n=5+5) Match/Hard1/32-8 5.62µs ± 3% 3.48µs ± 0% -38.07% (p=0.016 n=5+4) Match/Hard1/1K-8 175µs ± 5% 108µs ± 0% -37.85% (p=0.016 n=5+4) Match/Hard1/32K-8 4.09ms ± 2% 4.05ms ± 0% -0.85% (p=0.016 n=5+4) Match/Hard1/1M-8 131ms ± 0% 131ms ± 3% ~ (p=0.151 n=5+5) Match/Hard1/32M-8 4.19s ± 0% 4.20s ± 1% ~ (p=1.000 n=5+5) Match_onepass_regex/16-8 262ns ± 2% 170ns ± 2% -35.13% (p=0.008 n=5+5) Match_onepass_regex/32-8 463ns ± 0% 306ns ± 0% -33.90% (p=0.008 n=5+5) Match_onepass_regex/1K-8 13.3µs ± 2% 8.8µs ± 0% -33.84% (p=0.008 n=5+5) Match_onepass_regex/32K-8 424µs ± 3% 280µs ± 1% -33.93% (p=0.008 n=5+5) Match_onepass_regex/1M-8 13.4ms ± 0% 9.0ms ± 1% -32.80% (p=0.016 n=4+5) Match_onepass_regex/32M-8 427ms ± 0% 288ms ± 1% -32.60% (p=0.008 n=5+5) Change-Id: I02c54176ed5c9f5b5fc99524a2d5eb1c490f0ebf Reviewed-on: https://go-review.googlesource.com/c/go/+/355789 Reviewed-by: Peter Weinberger <pjw@google.com> Reviewed-by: Russ Cox <rsc@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Auto-Submit: Russ Cox <rsc@golang.org>
* regexp/syntax: fix typo in commentLudi Rehak2022-04-291-1/+1
| | | | | | | | | | | | | | | Fix typo in comment describing IsWordChar. Change-Id: Ia283813cf5662e218ee6d0411fb0c1b1ad1021f3 Reviewed-on: https://go-review.googlesource.com/c/go/+/393435 Auto-Submit: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
* regexp/syntax: rename ErrInvalidDepth to ErrNestingDepthIan Lance Taylor2022-04-221-4/+4
| | | | | | | | | | | | | | | | The proposal accepted the name ErrNestingDepth. For #51684 Change-Id: I702365f19e5e1889dbcc3c971eecff68e0b08727 Reviewed-on: https://go-review.googlesource.com/c/go/+/401854 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
* regexp: change ErrInvalidDepth message to match proposalIan Lance Taylor2022-04-221-1/+1
| | | | | | | | | | | | | | | Also update the file in $GOROOT/api/next to use proposal number. For #51684 Change-Id: I28bfa6bc1cee98a17b13da196d41cda34d968bb0 Reviewed-on: https://go-review.googlesource.com/c/go/+/401076 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
* all: gofmt main repoRuss Cox2022-04-113-98/+116
| | | | | | | | | | | | | | | [This CL is part of a sequence implementing the proposal #51082. The design doc is at https://go.dev/s/godocfmt-design.] Run the updated gofmt, which reformats doc comments, on the main repository. Vendored files are excluded. For #51082. Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407 Reviewed-on: https://go-review.googlesource.com/c/go/+/384268 Reviewed-by: Jonathan Amsterdam <jba@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* all: replace `` and '' with “ (U+201C) and ” (U+201D) in doc commentsRuss Cox2022-04-052-3/+3
| | | | | | | | | | | | | | | | go/doc in all its forms applies this replacement when rendering the comments. We are considering formatting doc comments, including doing this replacement as part of the formatting. Apply it to our source files ahead of time. For #51082. Change-Id: Ifcc1f5861abb57c5d14e7d8c2102dfb31b7a3a19 Reviewed-on: https://go-review.googlesource.com/c/go/+/384262 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp/syntax: add and use ErrInvalidDepthRuss Cox2022-04-041-3/+4
| | | | | | | | | | | | | | | | | | | | The fix for #51112 introduced a depth check but used ErrInternalError to avoid introduce new API in a CL that would be backported to earlier releases. New API accepted in proposal #51684. This CL adds a distinct error for this case. For #51112. Fixes #51684. Change-Id: I068fc70aafe4218386a06103d9b7c847fb7ffa65 Reviewed-on: https://go-review.googlesource.com/c/go/+/384617 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
* all: remove trailing blank doc comment linesRuss Cox2022-04-012-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | A future change to gofmt will rewrite // Doc comment. // func f() to // Doc comment. func f() Apply that change preemptively to all doc comments. For #51082. Change-Id: I4023e16cfb0729b64a8590f071cd92f17343081d Reviewed-on: https://go-review.googlesource.com/c/go/+/384259 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
* regexp: use input.step() to advance one rune in Regexp.allMatches()Andy Pan2022-03-271-3/+4
| | | | | | | | | Change-Id: I32944f4ed519419e168e62f9ed6df63961839259 Reviewed-on: https://go-review.googlesource.com/c/go/+/359197 Reviewed-by: Ian Lance Taylor <iant@golang.org> Trust: Emmanuel Odeke <emmanuel@orijtech.com> Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com> TryBot-Result: Gopher Robot <gobot@golang.org>
* regexp: fix typo in the overviewJinwook Jeong2022-03-021-1/+1
| | | | | | | | | | Correct the slice expression in the description of Index functions. Change-Id: I97a1b670c4c7e600d858f6550b647f677ef90b41 Reviewed-on: https://go-review.googlesource.com/c/go/+/360058 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> Trust: Daniel Martí <mvdan@mvdan.cc>
* regexp/syntax: reject very deeply nested regexps in ParseRuss Cox2022-02-102-2/+77
| | | | | | | | | | | | | | | | | | | | | | The regexp code assumes it can recurse over the structure of a regexp safely. Go's growable stacks make that reasonable for all plausible regexps, but implausible ones can reach the “infinite recursion?” stack limit. This CL limits the depth of any parsed regexp to 1000. That is, the depth of the parse tree is required to be ≤ 1000. Regexps that require deeper parse trees will return ErrInternalError. A future CL will change the error to ErrInvalidDepth, but using ErrInternalError for now avoids introducing new API in point releases when this is backported. Fixes #51112. Change-Id: I97d2cd82195946eb43a4ea8561f5b95f91fb14c5 Reviewed-on: https://go-review.googlesource.com/c/go/+/384616 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: add the missing isluochuanhang2022-01-191-1/+1
| | | | | | | | | Change-Id: I23264972329aa3414067cd0e0986b69bb39bbeb5 GitHub-Last-Rev: d1d668a3cbe852d9a06f03369e7e635232d85139 GitHub-Pull-Request: golang/go#50650 Reviewed-on: https://go-review.googlesource.com/c/go/+/378935 Reviewed-by: Ian Lance Taylor <iant@golang.org> Trust: Daniel Martí <mvdan@mvdan.cc>
* all: go fix -fix=buildtag std cmd (except for bootstrap deps, vendor)Russ Cox2021-10-281-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | When these packages are released as part of Go 1.18, Go 1.16 will no longer be supported, so we can remove the +build tags in these files. Ran go fix -fix=buildtag std cmd and then reverted the bootstrapDirs as defined in src/cmd/dist/buildtool.go, which need to continue to build with Go 1.4 for now. Also reverted src/vendor and src/cmd/vendor, which will need to be updated in their own repos first. Manual changes in runtime/pprof/mprof_test.go to adjust line numbers. For #41184. Change-Id: Ic0f93f7091295b6abc76ed5cd6e6746e1280861e Reviewed-on: https://go-review.googlesource.com/c/go/+/344955 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
* regexp: document and implement that invalid UTF-8 bytes are the same as U+FFFDRuss Cox2021-10-115-3/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | What should it mean to run a regexp match on invalid UTF-8 bytes? The coherent behavior options are: 1. Invalid UTF-8 does not match any character classes, nor a U+FFFD literal (nor \x{fffd}). 2. Each byte of invalid UTF-8 is treated identically to a U+FFFD in the input, as a utf8.DecodeRune loop might. RE2 uses Rule 1. Because it works byte at a time, it can also provide \C to match any single byte of input, which matches invalid UTF-8 as well. This provides the nice property that a match for a regexp without \C is guaranteed to be valid UTF-8. Unfortunately, today Go has an incoherent mix of these two, although mostly Rule 2. This is a deviation from RE2, and it gives up the nice property, but we probably can't correct that at this point. In particular .* already matches entire inputs today, valid UTF-8 or not, and I doubt we can break that. This CL adopts Rule 2 officially, fixing the few places that deviate from it. Fixes #48749. Change-Id: I96402527c5dfb1146212f568ffa09dde91d71244 Reviewed-on: https://go-review.googlesource.com/c/go/+/354569 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Rob Pike <r@golang.org>
* all: use bytes.Cut, strings.CutRuss Cox2021-10-063-28/+16
| | | | | | | | | | | | | | | Many uses of Index/IndexByte/IndexRune/Split/SplitN can be written more clearly using the new Cut functions. Do that. Also rewrite to other functions if that's clearer. For #46336. Change-Id: I68d024716ace41a57a8bf74455c62279bde0f448 Reviewed-on: https://go-review.googlesource.com/c/go/+/351711 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: fix repeat of preferred empty matchRuss Cox2021-05-138-46/+176
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In Perl mode, (|a)* should match an empty string at the start of the input. Instead it matches as many a's as possible. Because (|a)+ is handled correctly, matching only an empty string, this leads to the paradox that e* can match more text than e+ (for e = (|a)) and that e+ is sometimes different from ee*. This is a very old bug that ultimately derives from the picture I drew for e* in https://swtch.com/~rsc/regexp/regexp1.html. The picture is correct for longest-match (POSIX) regexps but subtly wrong for preferred-match (Perl) regexps in the case where e has a preferred empty match. Pointed out by Andrew Gallant in private mail. The current code treats e* and e+ as the same structure, with different entry points. In the case of e* the preference list ends up not quite in the right order, in part because the “before e” and “after e” states are the same state. Splitting them apart fixes the preference list, and that can be done by compiling e* as if it were (e+)?. Like with any bug fix, there is a very low chance of breaking a program that accidentally depends on the buggy behavior. RE2, Go, and Rust all have this bug, and we've all agreed to fix it, to keep the implementations in sync. Fixes #46123. Change-Id: I70e742e71e0a23b626593b16ddef3c1e73b413b0 Reviewed-on: https://go-review.googlesource.com/c/go/+/318750 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org> TryBot-Result: Go Bot <gobot@golang.org>
* all: go fmt std cmd (but revert vendor)Russ Cox2021-02-201-0/+1
| | | | | | | | | | | | | | | | Make all our package sources use Go 1.17 gofmt format (adding //go:build lines). Part of //go:build change (#41184). See https://golang.org/design/draft-gobuild Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4 Reviewed-on: https://go-review.googlesource.com/c/go/+/294430 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp/syntax: add note about Unicode character classesTom Payne2020-11-251-1/+2
| | | | | | | | | | | | | As proposed on golang-nuts: https://groups.google.com/g/golang-nuts/c/M3lmSUptExQ/m/hRySV9GsCAAJ Includes the latest updates from re2's mksyntaxgo: https://code.googlesource.com/re2/+/refs/heads/master/doc/mksyntaxgo Change-Id: Ib7b79aa6531f473feabd0a7f1d263cd65c4388e4 Reviewed-on: https://go-review.googlesource.com/c/go/+/264678 Reviewed-by: Russ Cox <rsc@golang.org> Trust: Emmanuel Odeke <emmanuel@orijtech.com>
* regexp/syntax: append patchLists in constant timeAndy Balholm2020-06-181-39/+29
| | | | | | | | | | | | | | | | | | By keeping a tail pointer, we can append to a patchList in constant time, rather than in time proportional to the length of the list. This gets rid of the quadratic compile times we were seeing for long series of alternations. This is basically the same change as https://github.com/google/re2/commit/e9d517989f66f2e0a24cde42f4d2424dd3e4a9b9. Fixes #39542. Change-Id: Ib4ca0ca9c55abd1594df1984653c7d311ccf7572 Reviewed-on: https://go-review.googlesource.com/c/go/+/238079 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp/syntax: fix comment on p.literal and simplifyRuss Cox2020-04-171-11/+5
| | | | | | | | | | | | p.literal's doc comment said it returned a value but it doesn't. While we're here, p.newLiteral is only called from p.literal, so simplify the code by merging the two. Change-Id: Ia357937a99f4e7473f0f1ec837113a39eaeb83d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/222659 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: add (*Regexp).SubexpIndexSylvain Zimmer2020-04-103-14/+59
| | | | | | | | | | | | | SubexpIndex returns the index of the first subexpression with the given name, or -1 if there is no subexpression with that name. Fixes #32420 Change-Id: Ie1f9d22d50fb84e18added80a9d9a9f6dca8ffc4 Reviewed-on: https://go-review.googlesource.com/c/go/+/187919 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
* regexp: skip long-running benchmarks if -short is specifiedKeith Randall2019-10-171-2/+2
| | | | | | | | | | | | | This CL helps race.bash finish in a reasonable amount of time. Otherwise the Match/Hard1/32M benchmark takes over 1200 seconds to finish on arm64, triggering a timeout. With this change the regexp benchmarks as a whole take only about a minute. Change-Id: Ie2260ef9f5709e32a74bd76f135bc384b2d9853f Reviewed-on: https://go-review.googlesource.com/c/go/+/201742 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: add examples for FindSubmatchIndex and LongestPantelis Sampaziotis2019-09-281-0/+28
| | | | | | | | | | | | updates #21450 Change-Id: Ibffe0dadc1e1523c55cd5f5b8a69bc1c399a255d GitHub-Last-Rev: 507f55508121a525de4d210e7ada1396ccaaf367 GitHub-Pull-Request: golang/go#33497 Reviewed-on: https://go-review.googlesource.com/c/go/+/189177 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: add more examples for Regexp methodsLiz Rice2019-09-181-0/+17
| | | | | | | | | | | | | | | | | | | | | | Since I first started on this CL, most of the methods have had examples added by other folks, so this is now one new example, and additions to two existing examples for extra clarity. The issue has a comment about not necessarily having examples for all methods, but I recall finding this package pretty confusing when I first used it, and having concrete examples would have really helped me navigate all the different options. There are more String methods with examples now, but I think seeing how the byte-slice methods work could also be helpful to explain the differences. Updates #21450 Change-Id: I27b4eeb634fb8ab59f791c0961cce79a67889826 Reviewed-on: https://go-review.googlesource.com/c/go/+/120145 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: add example for NumSubexpPantelis Sampaziotis2019-09-101-0/+7
| | | | | | | | | | | | Updates #21450 Change-Id: Idf276e97f816933cc0f752cdcd5e713b5c975833 GitHub-Last-Rev: 198e585f92db6e7ac126b49cd751b333e9a44b93 GitHub-Pull-Request: golang/go#33490 Reviewed-on: https://go-review.googlesource.com/c/go/+/189138 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: add example for ReplaceAllpsampaz2019-09-051-0/+13
| | | | | | | | | | | | Updates #21450 Change-Id: Ia31c20b52bae5daeb33d918234c2f0944a8aeb07 GitHub-Last-Rev: cc8554477024277c3c1b4122344e9d14427680b3 GitHub-Pull-Request: golang/go#33489 Reviewed-on: https://go-review.googlesource.com/c/go/+/189137 Run-TryBot: Sylvain Zimmer <sylvinus@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* std: remove unused bits of code all over the placeDaniel Martí2019-09-021-1/+1
| | | | | | | | | | | | | | | | Some were never used, and some haven't been used for years. One exception is net/http's readerAndCloser, which was only used in a test. Move it to a test file. While at it, remove a check in regexp that could never fire; the field is an uint32, so it can never be negative. Change-Id: Ia2200f6afa106bae4034045ea8233b452f38747b Reviewed-on: https://go-review.googlesource.com/c/go/+/192621 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp/syntax: exclude full range from String negation caseKeegan Carruthers-Smith2019-05-222-1/+2
| | | | | | | | | | | If the char class is 0x0-0x10ffff we mistakenly would String that to `[^]`, which is not a valid regex. Fixes #31807 Change-Id: I9ceeaddc28b67b8e1de12b6703bcb124cc784556 Reviewed-on: https://go-review.googlesource.com/c/go/+/175679 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: optimize for provably too short inputsSylvain Zimmer2019-05-155-10/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For many patterns we can compute the minimum input length at compile time. If the input is shorter, we can return early and get a huge speedup. As pointed out by Damian Gryski, Perl's regex engine contains a number of these kinds of fail-fast optimizations: https://perldoc.perl.org/perlreguts.html#Peep-hole-Optimisation-and-Analysis Benchmarks: (including new ones for compile time) name old time/op new time/op delta Compile/Onepass-8 4.39µs ± 1% 4.40µs ± 0% +0.34% (p=0.029 n=9+8) Compile/Medium-8 9.80µs ± 0% 9.91µs ± 0% +1.17% (p=0.000 n=10+10) Compile/Hard-8 72.7µs ± 0% 73.5µs ± 0% +1.10% (p=0.000 n=9+10) name old time/op new time/op delta Match/Easy0/16-8 52.6ns ± 5% 4.9ns ± 0% -90.68% (p=0.000 n=10+9) Match/Easy0/32-8 64.1ns ±10% 61.4ns ± 1% ~ (p=0.188 n=10+9) Match/Easy0/1K-8 280ns ± 1% 277ns ± 2% -0.97% (p=0.004 n=10+10) Match/Easy0/32K-8 4.61µs ± 1% 4.55µs ± 1% -1.49% (p=0.000 n=9+10) Match/Easy0/1M-8 229µs ± 0% 226µs ± 1% -1.29% (p=0.000 n=8+10) Match/Easy0/32M-8 7.50ms ± 1% 7.47ms ± 1% ~ (p=0.165 n=10+10) Match/Easy0i/16-8 533ns ± 1% 5ns ± 2% -99.07% (p=0.000 n=10+10) Match/Easy0i/32-8 950ns ± 0% 950ns ± 1% ~ (p=0.920 n=10+9) Match/Easy0i/1K-8 27.5µs ± 1% 27.5µs ± 0% ~ (p=0.739 n=10+10) Match/Easy0i/32K-8 1.13ms ± 0% 1.13ms ± 1% ~ (p=0.079 n=9+10) Match/Easy0i/1M-8 36.7ms ± 2% 36.1ms ± 0% -1.64% (p=0.000 n=10+9) Match/Easy0i/32M-8 1.17s ± 0% 1.16s ± 1% -0.80% (p=0.004 n=8+9) Match/Easy1/16-8 55.5ns ± 6% 4.9ns ± 1% -91.19% (p=0.000 n=10+9) Match/Easy1/32-8 58.3ns ± 8% 56.6ns ± 1% ~ (p=0.449 n=10+8) Match/Easy1/1K-8 750ns ± 0% 748ns ± 1% ~ (p=0.072 n=8+10) Match/Easy1/32K-8 31.8µs ± 0% 31.6µs ± 1% -0.50% (p=0.035 n=10+9) Match/Easy1/1M-8 1.10ms ± 1% 1.09ms ± 0% -0.95% (p=0.000 n=10+9) Match/Easy1/32M-8 35.5ms ± 0% 35.2ms ± 1% -1.05% (p=0.000 n=9+10) Match/Medium/16-8 442ns ± 2% 5ns ± 1% -98.89% (p=0.000 n=10+10) Match/Medium/32-8 875ns ± 0% 878ns ± 1% ~ (p=0.071 n=9+10) Match/Medium/1K-8 26.1µs ± 0% 25.9µs ± 0% -0.64% (p=0.000 n=10+10) Match/Medium/32K-8 1.09ms ± 1% 1.08ms ± 0% -0.84% (p=0.000 n=10+9) Match/Medium/1M-8 34.9ms ± 0% 34.6ms ± 1% -0.98% (p=0.000 n=9+10) Match/Medium/32M-8 1.12s ± 1% 1.11s ± 1% -0.98% (p=0.000 n=10+9) Match/Hard/16-8 721ns ± 1% 5ns ± 0% -99.32% (p=0.000 n=10+9) Match/Hard/32-8 1.32µs ± 1% 1.31µs ± 0% -0.71% (p=0.000 n=9+9) Match/Hard/1K-8 39.8µs ± 1% 39.7µs ± 1% ~ (p=0.165 n=10+10) Match/Hard/32K-8 1.57ms ± 0% 1.56ms ± 0% -0.70% (p=0.000 n=10+9) Match/Hard/1M-8 50.4ms ± 1% 50.1ms ± 1% -0.57% (p=0.007 n=10+10) Match/Hard/32M-8 1.62s ± 1% 1.60s ± 0% -0.98% (p=0.000 n=10+10) Match/Hard1/16-8 3.88µs ± 1% 3.86µs ± 0% ~ (p=0.118 n=10+10) Match/Hard1/32-8 7.44µs ± 1% 7.46µs ± 1% ~ (p=0.109 n=10+10) Match/Hard1/1K-8 232µs ± 1% 229µs ± 1% -1.31% (p=0.000 n=10+9) Match/Hard1/32K-8 7.41ms ± 2% 7.41ms ± 0% ~ (p=0.237 n=10+8) Match/Hard1/1M-8 238ms ± 1% 238ms ± 0% ~ (p=0.481 n=10+10) Match/Hard1/32M-8 7.69s ± 1% 7.61s ± 0% -1.00% (p=0.000 n=10+10) Fixes #31329 Change-Id: I04640e8c59178ec8b3106e13ace9b109b6bdbc25 Reviewed-on: https://go-review.googlesource.com/c/go/+/171023 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Rob Pike <r@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: clarify docs re Submatch resultDmitry Vyukov2019-05-081-3/+4
| | | | | | | | | | Currently we say that a negative index means no match, but we don't say how "no match" is expressed when 'Index' is not present. Say how it is expressed. Change-Id: I82b6c9038557ac49852ac03642afc0bc545bb4a2 Reviewed-on: https://go-review.googlesource.com/c/go/+/175677 Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: add ReplaceAllStringFunc exampleValentin Vidic2019-02-271-0/+8
| | | | | | | | | | Change-Id: I016312f3ecf3dfcbf0eaf24e31b6842d80abb029 GitHub-Last-Rev: 360047c9006dba643429c006f89d813d927999b3 GitHub-Pull-Request: golang/go#30445 Reviewed-on: https://go-review.googlesource.com/c/164257 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: limit the capacity of slices of bytes returned by FindXFrancesc Campoy2019-02-262-8/+20
| | | | | | | | | | | | | | | | | | This change limits the capacity of the slices of bytes returned by: - Find - FindAll - FindAllSubmatch to be the same as their length. Fixes #30169 Change-Id: I07b632757d2bfeab42fce0d42364e2a16c597360 Reviewed-on: https://go-review.googlesource.com/c/161877 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: use backquotes for all regular expression examplesVladimir Kovpak2018-11-201-21/+21
| | | | | | | | | | | | | This commit performs replace double quote to backquote, so now all examples looks consistent. Change-Id: I8cf760ce1bdeff9619a88e531161b9516385241b GitHub-Last-Rev: e3e636cebbf41528b8a73f9a3fe5afa10876f964 GitHub-Pull-Request: golang/go#28879 Reviewed-on: https://go-review.googlesource.com/c/150397 Reviewed-by: Rob Pike <r@golang.org> Run-TryBot: Rob Pike <r@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: add matching and finding examplesVladimir Kovpak2018-11-191-0/+54
| | | | | | | | | | | | | This commit adds examples for Match, Find, FindAllSubmatch, FindSubmatch and Match functions. Change-Id: I2bdf8c3cee6e89d618109397378c1fc91aaf1dfb GitHub-Last-Rev: 33f34b7adca2911a4fff9638c93e846fb0021465 GitHub-Pull-Request: golang/go#28837 Reviewed-on: https://go-review.googlesource.com/c/150020 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
* all: use "reports whether" consistently in the few places that didn'tBrad Fitzpatrick2018-11-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Go documentation style for boolean funcs is to say: // Foo reports whether ... func Foo() bool (rather than "returns true if") This CL also replaces 4 uses of "iff" with the same "reports whether" wording, which doesn't lose any meaning, and will prevent people from sending typo fixes when they don't realize it's "if and only if". In the past I think we've had the typo CLs updated to just say "reports whether". So do them all at once. (Inspired by the addition of another "returns true if" in CL 146938 in fd_plan9.go) Created with: $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor) $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor) Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f Reviewed-on: https://go-review.googlesource.com/c/147037 Reviewed-by: Ian Lance Taylor <iant@golang.org>
* regexp: add partial Deprecation comment to CopyRuss Cox2018-10-121-2/+6
| | | | | | | | Change-Id: I21b7817e604a48330f1ee250f7b1b2adc1f16067 Reviewed-on: https://go-review.googlesource.com/c/139784 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: add DeepEqual testRuss Cox2018-10-121-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This locks in behavior we accidentally broke and then restored during the Go 1.11 cycle. See #26219. It also locks in new behavior that DeepEqual always works, instead of only usually working. This CL is the final piece of a series of CLs to make DeepEqual always work, by eliminating the machine cache and making other related optimizations. Overall, this whole sequence of CLs achieves: name old time/op new time/op delta Find-12 264ns ± 3% 260ns ± 0% -1.59% (p=0.000 n=10+9) FindAllNoMatches-12 140ns ± 2% 133ns ± 0% -5.34% (p=0.000 n=10+7) FindString-12 256ns ± 0% 249ns ± 0% -2.73% (p=0.000 n=8+8) FindSubmatch-12 339ns ± 1% 333ns ± 1% -1.73% (p=0.000 n=9+10) FindStringSubmatch-12 322ns ± 0% 322ns ± 1% ~ (p=0.450 n=8+10) Literal-12 100ns ± 2% 92ns ± 0% -8.13% (p=0.000 n=10+10) NotLiteral-12 1.50µs ± 0% 1.47µs ± 0% -1.65% (p=0.000 n=8+8) MatchClass-12 2.18µs ± 0% 2.15µs ± 0% -1.05% (p=0.000 n=10+9) MatchClass_InRange-12 2.12µs ± 0% 2.11µs ± 0% -0.65% (p=0.000 n=10+9) ReplaceAll-12 1.41µs ± 0% 1.41µs ± 0% ~ (p=0.254 n=7+10) AnchoredLiteralShortNonMatch-12 89.8ns ± 0% 81.5ns ± 0% -9.22% (p=0.000 n=8+9) AnchoredLiteralLongNonMatch-12 105ns ± 3% 97ns ± 0% -7.21% (p=0.000 n=10+10) AnchoredShortMatch-12 141ns ± 0% 128ns ± 0% -9.22% (p=0.000 n=9+9) AnchoredLongMatch-12 276ns ± 4% 253ns ± 2% -8.23% (p=0.000 n=10+10) OnePassShortA-12 620ns ± 0% 587ns ± 0% -5.26% (p=0.000 n=10+6) NotOnePassShortA-12 575ns ± 3% 547ns ± 1% -4.77% (p=0.000 n=10+10) OnePassShortB-12 493ns ± 0% 455ns ± 0% -7.62% (p=0.000 n=8+9) NotOnePassShortB-12 423ns ± 0% 406ns ± 1% -3.95% (p=0.000 n=8+10) OnePassLongPrefix-12 112ns ± 0% 109ns ± 1% -2.77% (p=0.000 n=9+10) OnePassLongNotPrefix-12 405ns ± 0% 349ns ± 0% -13.74% (p=0.000 n=8+9) MatchParallelShared-12 501ns ± 1% 38ns ± 2% -92.42% (p=0.000 n=10+10) MatchParallelCopied-12 39.1ns ± 0% 38.6ns ± 1% -1.38% (p=0.002 n=6+10) QuoteMetaAll-12 94.6ns ± 0% 94.8ns ± 0% +0.26% (p=0.001 n=10+9) QuoteMetaNone-12 52.7ns ± 0% 52.7ns ± 0% ~ (all equal) Match/Easy0/32-12 79.1ns ± 0% 72.0ns ± 0% -8.95% (p=0.000 n=9+9) Match/Easy0/1K-12 307ns ± 1% 297ns ± 0% -3.32% (p=0.000 n=10+7) Match/Easy0/32K-12 4.65µs ± 2% 4.67µs ± 1% ~ (p=0.633 n=10+8) Match/Easy0/1M-12 234µs ± 0% 234µs ± 0% ~ (p=0.684 n=10+10) Match/Easy0/32M-12 7.98ms ± 1% 7.96ms ± 0% -0.31% (p=0.014 n=9+9) Match/Easy0i/32-12 1.13µs ± 1% 1.10µs ± 0% -3.18% (p=0.000 n=9+10) Match/Easy0i/1K-12 32.5µs ± 0% 31.7µs ± 0% -2.61% (p=0.000 n=9+9) Match/Easy0i/32K-12 1.59ms ± 0% 1.26ms ± 0% -20.71% (p=0.000 n=9+7) Match/Easy0i/1M-12 51.0ms ± 0% 40.4ms ± 0% -20.68% (p=0.000 n=10+7) Match/Easy0i/32M-12 1.63s ± 0% 1.30s ± 0% -20.62% (p=0.001 n=7+7) Match/Easy1/32-12 75.1ns ± 1% 67.4ns ± 0% -10.24% (p=0.000 n=8+10) Match/Easy1/1K-12 861ns ± 0% 879ns ± 0% +2.18% (p=0.000 n=8+8) Match/Easy1/32K-12 39.2µs ± 1% 34.1µs ± 0% -13.01% (p=0.000 n=10+8) Match/Easy1/1M-12 1.38ms ± 0% 1.17ms ± 0% -15.06% (p=0.000 n=10+8) Match/Easy1/32M-12 44.2ms ± 1% 37.5ms ± 0% -15.15% (p=0.000 n=10+9) Match/Medium/32-12 1.04µs ± 1% 1.03µs ± 0% -0.64% (p=0.002 n=9+8) Match/Medium/1K-12 31.3µs ± 0% 31.2µs ± 0% -0.36% (p=0.000 n=9+9) Match/Medium/32K-12 1.44ms ± 0% 1.20ms ± 0% -17.02% (p=0.000 n=8+7) Match/Medium/1M-12 46.1ms ± 0% 38.2ms ± 0% -17.14% (p=0.001 n=6+8) Match/Medium/32M-12 1.48s ± 0% 1.23s ± 0% -17.10% (p=0.000 n=9+7) Match/Hard/32-12 1.54µs ± 1% 1.47µs ± 0% -4.64% (p=0.000 n=9+10) Match/Hard/1K-12 46.4µs ± 1% 44.4µs ± 0% -4.35% (p=0.000 n=9+8) Match/Hard/32K-12 2.19ms ± 0% 1.78ms ± 7% -18.74% (p=0.000 n=8+10) Match/Hard/1M-12 70.1ms ± 0% 57.7ms ± 7% -17.62% (p=0.000 n=8+10) Match/Hard/32M-12 2.24s ± 0% 1.84s ± 8% -17.92% (p=0.000 n=8+10) Match/Hard1/32-12 8.17µs ± 1% 7.95µs ± 0% -2.72% (p=0.000 n=8+10) Match/Hard1/1K-12 254µs ± 2% 245µs ± 0% -3.62% (p=0.000 n=9+10) Match/Hard1/32K-12 9.58ms ± 1% 8.54ms ± 7% -10.87% (p=0.000 n=10+10) Match/Hard1/1M-12 306ms ± 1% 271ms ± 8% -11.42% (p=0.000 n=9+10) Match/Hard1/32M-12 9.79s ± 1% 8.58s ± 9% -12.37% (p=0.000 n=9+10) Match_onepass_regex/32-12 808ns ± 0% 716ns ± 1% -11.39% (p=0.000 n=8+9) Match_onepass_regex/1K-12 27.8µs ± 0% 19.9µs ± 2% -28.51% (p=0.000 n=8+9) Match_onepass_regex/32K-12 925µs ± 0% 631µs ± 2% -31.71% (p=0.000 n=9+9) Match_onepass_regex/1M-12 29.5ms ± 0% 20.2ms ± 2% -31.53% (p=0.000 n=10+9) Match_onepass_regex/32M-12 945ms ± 0% 648ms ± 2% -31.39% (p=0.000 n=9+9) CompileOnepass-12 4.67µs ± 0% 4.60µs ± 0% -1.48% (p=0.000 n=10+10) [Geo mean] 24.5µs 21.4µs -12.94% https://perf.golang.org/search?q=upload:20181004.5 Change-Id: Icb17b306830dc5489efbb55900937b94ce0eb047 Reviewed-on: https://go-review.googlesource.com/c/139783 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: evaluate context flags lazilyRuss Cox2018-10-123-21/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no point in computing whether we're at the beginning of the line if the NFA isn't going to ask. Wait to compute that until asked. Whatever minor slowdowns were introduced by the conversion to pools that were not repaid by other optimizations are taken care of by this one. name old time/op new time/op delta Find-12 252ns ± 0% 260ns ± 0% +3.34% (p=0.000 n=10+8) FindAllNoMatches-12 136ns ± 4% 134ns ± 4% -0.96% (p=0.033 n=10+10) FindString-12 246ns ± 0% 250ns ± 0% +1.46% (p=0.000 n=8+10) FindSubmatch-12 332ns ± 1% 332ns ± 0% ~ (p=0.101 n=9+10) FindStringSubmatch-12 321ns ± 1% 322ns ± 1% ~ (p=0.717 n=9+10) Literal-12 91.6ns ± 0% 92.3ns ± 0% +0.74% (p=0.000 n=9+9) NotLiteral-12 1.47µs ± 0% 1.47µs ± 0% +0.38% (p=0.000 n=9+8) MatchClass-12 2.15µs ± 0% 2.15µs ± 0% +0.39% (p=0.000 n=10+10) MatchClass_InRange-12 2.09µs ± 0% 2.11µs ± 0% +0.75% (p=0.000 n=9+9) ReplaceAll-12 1.40µs ± 0% 1.40µs ± 0% ~ (p=0.525 n=10+10) AnchoredLiteralShortNonMatch-12 83.5ns ± 0% 81.6ns ± 0% -2.28% (p=0.000 n=9+10) AnchoredLiteralLongNonMatch-12 101ns ± 0% 97ns ± 1% -3.54% (p=0.000 n=10+10) AnchoredShortMatch-12 131ns ± 0% 128ns ± 0% -2.29% (p=0.000 n=10+9) AnchoredLongMatch-12 268ns ± 1% 252ns ± 1% -6.04% (p=0.000 n=10+10) OnePassShortA-12 614ns ± 0% 587ns ± 1% -4.33% (p=0.000 n=6+10) NotOnePassShortA-12 552ns ± 0% 547ns ± 1% -0.89% (p=0.000 n=10+10) OnePassShortB-12 494ns ± 0% 455ns ± 0% -7.96% (p=0.000 n=9+9) NotOnePassShortB-12 411ns ± 0% 406ns ± 0% -1.30% (p=0.000 n=9+9) OnePassLongPrefix-12 109ns ± 0% 108ns ± 1% ~ (p=0.064 n=8+9) OnePassLongNotPrefix-12 403ns ± 0% 349ns ± 0% -13.30% (p=0.000 n=9+8) MatchParallelShared-12 38.9ns ± 1% 37.9ns ± 1% -2.65% (p=0.000 n=10+8) MatchParallelCopied-12 39.2ns ± 1% 38.3ns ± 2% -2.20% (p=0.001 n=10+10) QuoteMetaAll-12 94.5ns ± 0% 94.7ns ± 0% +0.18% (p=0.043 n=10+9) QuoteMetaNone-12 52.7ns ± 0% 52.7ns ± 0% ~ (all equal) Match/Easy0/32-12 72.2ns ± 0% 71.9ns ± 0% -0.38% (p=0.009 n=8+10) Match/Easy0/1K-12 296ns ± 1% 297ns ± 0% +0.51% (p=0.001 n=10+9) Match/Easy0/32K-12 4.57µs ± 3% 4.61µs ± 2% ~ (p=0.280 n=10+10) Match/Easy0/1M-12 234µs ± 0% 234µs ± 0% ~ (p=0.986 n=10+10) Match/Easy0/32M-12 7.96ms ± 0% 7.98ms ± 0% +0.22% (p=0.010 n=10+9) Match/Easy0i/32-12 1.09µs ± 0% 1.10µs ± 0% +0.23% (p=0.000 n=8+9) Match/Easy0i/1K-12 31.7µs ± 0% 31.7µs ± 0% +0.09% (p=0.003 n=9+8) Match/Easy0i/32K-12 1.61ms ± 0% 1.27ms ± 1% -21.03% (p=0.000 n=8+10) Match/Easy0i/1M-12 51.4ms ± 0% 40.4ms ± 0% -21.29% (p=0.000 n=8+8) Match/Easy0i/32M-12 1.65s ± 0% 1.30s ± 1% -21.22% (p=0.000 n=9+9) Match/Easy1/32-12 67.6ns ± 1% 67.2ns ± 0% ~ (p=0.085 n=10+9) Match/Easy1/1K-12 873ns ± 2% 880ns ± 0% +0.78% (p=0.006 n=9+7) Match/Easy1/32K-12 39.7µs ± 1% 34.3µs ± 3% -13.53% (p=0.000 n=10+10) Match/Easy1/1M-12 1.41ms ± 1% 1.19ms ± 3% -15.48% (p=0.000 n=10+10) Match/Easy1/32M-12 44.9ms ± 1% 38.0ms ± 2% -15.21% (p=0.000 n=10+10) Match/Medium/32-12 1.04µs ± 0% 1.03µs ± 0% -0.57% (p=0.000 n=9+9) Match/Medium/1K-12 31.2µs ± 0% 31.4µs ± 1% +0.61% (p=0.000 n=8+10) Match/Medium/32K-12 1.45ms ± 1% 1.20ms ± 0% -17.70% (p=0.000 n=10+8) Match/Medium/1M-12 46.4ms ± 0% 38.4ms ± 2% -17.32% (p=0.000 n=6+9) Match/Medium/32M-12 1.49s ± 1% 1.24s ± 1% -16.81% (p=0.000 n=10+10) Match/Hard/32-12 1.47µs ± 0% 1.47µs ± 0% -0.31% (p=0.000 n=9+10) Match/Hard/1K-12 44.5µs ± 1% 44.4µs ± 0% ~ (p=0.075 n=10+10) Match/Hard/32K-12 2.09ms ± 0% 1.78ms ± 7% -14.88% (p=0.000 n=8+10) Match/Hard/1M-12 67.8ms ± 5% 56.9ms ± 7% -16.05% (p=0.000 n=10+10) Match/Hard/32M-12 2.17s ± 5% 1.84s ± 6% -15.21% (p=0.000 n=10+10) Match/Hard1/32-12 7.89µs ± 0% 7.94µs ± 0% +0.61% (p=0.000 n=9+9) Match/Hard1/1K-12 246µs ± 0% 245µs ± 0% -0.30% (p=0.010 n=9+10) Match/Hard1/32K-12 8.93ms ± 0% 8.17ms ± 0% -8.44% (p=0.000 n=9+8) Match/Hard1/1M-12 286ms ± 0% 269ms ± 9% -5.66% (p=0.028 n=9+10) Match/Hard1/32M-12 9.16s ± 0% 8.61s ± 8% -5.98% (p=0.028 n=9+10) Match_onepass_regex/32-12 825ns ± 0% 712ns ± 0% -13.75% (p=0.000 n=8+8) Match_onepass_regex/1K-12 28.7µs ± 1% 19.8µs ± 0% -30.99% (p=0.000 n=9+8) Match_onepass_regex/32K-12 950µs ± 1% 628µs ± 0% -33.83% (p=0.000 n=9+8) Match_onepass_regex/1M-12 30.4ms ± 0% 20.1ms ± 0% -33.74% (p=0.000 n=9+8) Match_onepass_regex/32M-12 974ms ± 1% 646ms ± 0% -33.73% (p=0.000 n=9+8) CompileOnepass-12 4.60µs ± 0% 4.59µs ± 0% ~ (p=0.063 n=8+9) [Geo mean] 23.1µs 21.3µs -7.44% https://perf.golang.org/search?q=upload:20181004.4 Change-Id: I47cdd09f6dcde1d7c317080e9b4df42c7d0a8d24 Reviewed-on: https://go-review.googlesource.com/c/139782 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: use pools for NFA machinesRuss Cox2018-10-122-66/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now the machine struct is only used for NFA execution. Use global pools to cache machines instead of per-Regexp lists. Also eliminate some tail calls in NFA execution, to pay for the added overhead of sync.Pool. name old time/op new time/op delta Find-12 252ns ± 0% 252ns ± 0% ~ (p=1.000 n=10+10) FindAllNoMatches-12 134ns ± 1% 136ns ± 4% ~ (p=0.443 n=9+10) FindString-12 246ns ± 0% 246ns ± 0% -0.16% (p=0.046 n=10+8) FindSubmatch-12 333ns ± 2% 332ns ± 1% ~ (p=0.489 n=10+9) FindStringSubmatch-12 320ns ± 0% 321ns ± 1% +0.55% (p=0.005 n=10+9) Literal-12 91.1ns ± 0% 91.6ns ± 0% +0.55% (p=0.000 n=10+9) NotLiteral-12 1.45µs ± 0% 1.47µs ± 0% +0.82% (p=0.000 n=10+9) MatchClass-12 2.19µs ± 0% 2.15µs ± 0% -2.01% (p=0.000 n=9+10) MatchClass_InRange-12 2.09µs ± 0% 2.09µs ± 0% ~ (p=0.082 n=10+9) ReplaceAll-12 1.39µs ± 0% 1.40µs ± 0% +0.50% (p=0.000 n=10+10) AnchoredLiteralShortNonMatch-12 82.4ns ± 0% 83.5ns ± 0% +1.36% (p=0.000 n=8+9) AnchoredLiteralLongNonMatch-12 106ns ± 1% 101ns ± 0% -4.36% (p=0.000 n=10+10) AnchoredShortMatch-12 130ns ± 0% 131ns ± 0% +0.77% (p=0.000 n=9+10) AnchoredLongMatch-12 272ns ± 0% 268ns ± 1% -1.46% (p=0.000 n=8+10) OnePassShortA-12 615ns ± 0% 614ns ± 0% ~ (p=0.094 n=10+6) NotOnePassShortA-12 549ns ± 0% 552ns ± 0% +0.52% (p=0.000 n=9+10) OnePassShortB-12 494ns ± 0% 494ns ± 0% ~ (p=0.247 n=8+9) NotOnePassShortB-12 412ns ± 1% 411ns ± 0% ~ (p=0.625 n=10+9) OnePassLongPrefix-12 108ns ± 0% 109ns ± 0% +0.93% (p=0.000 n=10+8) OnePassLongNotPrefix-12 402ns ± 0% 403ns ± 0% +0.14% (p=0.041 n=8+9) MatchParallelShared-12 38.6ns ± 2% 38.9ns ± 1% ~ (p=0.172 n=9+10) MatchParallelCopied-12 39.4ns ± 7% 39.2ns ± 1% ~ (p=0.423 n=10+10) QuoteMetaAll-12 94.9ns ± 0% 94.5ns ± 0% -0.42% (p=0.000 n=9+10) QuoteMetaNone-12 52.7ns ± 0% 52.7ns ± 0% ~ (all equal) Match/Easy0/32-12 72.1ns ± 0% 72.2ns ± 0% ~ (p=0.435 n=9+8) Match/Easy0/1K-12 298ns ± 0% 296ns ± 1% -1.01% (p=0.000 n=8+10) Match/Easy0/32K-12 4.64µs ± 1% 4.57µs ± 3% -1.39% (p=0.030 n=10+10) Match/Easy0/1M-12 234µs ± 0% 234µs ± 0% ~ (p=0.971 n=10+10) Match/Easy0/32M-12 7.95ms ± 0% 7.96ms ± 0% ~ (p=0.278 n=9+10) Match/Easy0i/32-12 1.10µs ± 0% 1.09µs ± 0% -0.29% (p=0.000 n=9+8) Match/Easy0i/1K-12 31.8µs ± 1% 31.7µs ± 0% ~ (p=0.704 n=10+9) Match/Easy0i/32K-12 1.62ms ± 1% 1.61ms ± 0% -1.12% (p=0.000 n=10+8) Match/Easy0i/1M-12 51.8ms ± 0% 51.4ms ± 0% -0.84% (p=0.000 n=8+8) Match/Easy0i/32M-12 1.65s ± 0% 1.65s ± 0% -0.46% (p=0.000 n=9+9) Match/Easy1/32-12 67.7ns ± 1% 67.6ns ± 1% ~ (p=0.723 n=10+10) Match/Easy1/1K-12 873ns ± 0% 873ns ± 2% ~ (p=0.345 n=10+9) Match/Easy1/32K-12 39.4µs ± 0% 39.7µs ± 1% +0.66% (p=0.000 n=10+10) Match/Easy1/1M-12 1.39ms ± 0% 1.41ms ± 1% +1.10% (p=0.000 n=10+10) Match/Easy1/32M-12 44.3ms ± 0% 44.9ms ± 1% +1.18% (p=0.000 n=10+10) Match/Medium/32-12 1.04µs ± 0% 1.04µs ± 0% -0.58% (p=0.000 n=9+9) Match/Medium/1K-12 31.4µs ± 0% 31.2µs ± 0% -0.62% (p=0.000 n=8+8) Match/Medium/32K-12 1.45ms ± 0% 1.45ms ± 1% ~ (p=0.356 n=9+10) Match/Medium/1M-12 46.4ms ± 0% 46.4ms ± 0% ~ (p=0.142 n=8+6) Match/Medium/32M-12 1.49s ± 1% 1.49s ± 1% ~ (p=0.739 n=10+10) Match/Hard/32-12 1.48µs ± 0% 1.47µs ± 0% -0.53% (p=0.000 n=9+9) Match/Hard/1K-12 45.0µs ± 1% 44.5µs ± 1% -1.06% (p=0.000 n=10+10) Match/Hard/32K-12 2.24ms ± 0% 2.09ms ± 0% -6.56% (p=0.000 n=8+8) Match/Hard/1M-12 71.6ms ± 0% 67.8ms ± 5% -5.36% (p=0.000 n=7+10) Match/Hard/32M-12 2.29s ± 0% 2.17s ± 5% -5.40% (p=0.000 n=9+10) Match/Hard1/32-12 7.89µs ± 0% 7.89µs ± 0% ~ (p=0.053 n=9+9) Match/Hard1/1K-12 244µs ± 0% 246µs ± 0% +0.71% (p=0.000 n=10+9) Match/Hard1/32K-12 10.3ms ± 0% 8.9ms ± 0% -13.76% (p=0.000 n=10+9) Match/Hard1/1M-12 331ms ± 0% 286ms ± 0% -13.72% (p=0.000 n=9+9) Match/Hard1/32M-12 10.6s ± 0% 9.2s ± 0% -13.72% (p=0.000 n=10+9) Match_onepass_regex/32-12 830ns ± 0% 825ns ± 0% -0.57% (p=0.000 n=9+8) Match_onepass_regex/1K-12 28.7µs ± 1% 28.7µs ± 1% -0.22% (p=0.040 n=9+9) Match_onepass_regex/32K-12 949µs ± 0% 950µs ± 1% ~ (p=0.236 n=8+9) Match_onepass_regex/1M-12 30.4ms ± 0% 30.4ms ± 0% ~ (p=0.059 n=8+9) Match_onepass_regex/32M-12 973ms ± 0% 974ms ± 1% ~ (p=0.258 n=9+9) CompileOnepass-12 4.64µs ± 0% 4.60µs ± 0% -0.90% (p=0.000 n=10+8) [Geo mean] 23.3µs 23.1µs -1.16% https://perf.golang.org/search?q=upload:20181004.3 Change-Id: I46f3d52ce89c8cd992cf554473c27af81fd81bfd Reviewed-on: https://go-review.googlesource.com/c/139781 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: split one-pass execution out of machine structRuss Cox2018-10-126-102/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows the one-pass executions to have their own pool of (much smaller) allocated structures. A step toward eliminating the per-Regexp machine cache. Not much effect on benchmarks, since there are no optimizations here, and pools are a tiny bit slower than a locked data structure for single-threaded code. name old time/op new time/op delta Find-12 254ns ± 0% 252ns ± 0% -0.94% (p=0.000 n=9+10) FindAllNoMatches-12 135ns ± 0% 134ns ± 1% -0.49% (p=0.002 n=9+9) FindString-12 247ns ± 0% 246ns ± 0% -0.24% (p=0.003 n=8+10) FindSubmatch-12 334ns ± 0% 333ns ± 2% ~ (p=0.283 n=10+10) FindStringSubmatch-12 321ns ± 0% 320ns ± 0% -0.51% (p=0.000 n=9+10) Literal-12 92.2ns ± 0% 91.1ns ± 0% -1.25% (p=0.000 n=9+10) NotLiteral-12 1.47µs ± 0% 1.45µs ± 0% -0.99% (p=0.000 n=9+10) MatchClass-12 2.17µs ± 0% 2.19µs ± 0% +0.84% (p=0.000 n=7+9) MatchClass_InRange-12 2.13µs ± 0% 2.09µs ± 0% -1.70% (p=0.000 n=10+10) ReplaceAll-12 1.39µs ± 0% 1.39µs ± 0% +0.51% (p=0.000 n=10+10) AnchoredLiteralShortNonMatch-12 83.2ns ± 0% 82.4ns ± 0% -0.96% (p=0.000 n=8+8) AnchoredLiteralLongNonMatch-12 105ns ± 0% 106ns ± 1% ~ (p=0.087 n=10+10) AnchoredShortMatch-12 131ns ± 0% 130ns ± 0% -0.76% (p=0.000 n=10+9) AnchoredLongMatch-12 267ns ± 0% 272ns ± 0% +2.01% (p=0.000 n=10+8) OnePassShortA-12 611ns ± 0% 615ns ± 0% +0.61% (p=0.000 n=9+10) NotOnePassShortA-12 552ns ± 0% 549ns ± 0% -0.46% (p=0.000 n=8+9) OnePassShortB-12 491ns ± 0% 494ns ± 0% +0.61% (p=0.000 n=8+8) NotOnePassShortB-12 412ns ± 0% 412ns ± 1% ~ (p=0.151 n=9+10) OnePassLongPrefix-12 112ns ± 0% 108ns ± 0% -3.57% (p=0.000 n=10+10) OnePassLongNotPrefix-12 410ns ± 0% 402ns ± 0% -1.95% (p=0.000 n=9+8) MatchParallelShared-12 38.8ns ± 1% 38.6ns ± 2% ~ (p=0.536 n=10+9) MatchParallelCopied-12 39.2ns ± 3% 39.4ns ± 7% ~ (p=0.986 n=10+10) QuoteMetaAll-12 94.6ns ± 0% 94.9ns ± 0% +0.29% (p=0.001 n=8+9) QuoteMetaNone-12 52.7ns ± 0% 52.7ns ± 0% ~ (all equal) Match/Easy0/32-12 72.9ns ± 0% 72.1ns ± 0% -1.07% (p=0.000 n=9+9) Match/Easy0/1K-12 298ns ± 0% 298ns ± 0% ~ (p=0.140 n=6+8) Match/Easy0/32K-12 4.60µs ± 2% 4.64µs ± 1% ~ (p=0.171 n=10+10) Match/Easy0/1M-12 235µs ± 0% 234µs ± 0% -0.14% (p=0.004 n=10+10) Match/Easy0/32M-12 7.96ms ± 0% 7.95ms ± 0% -0.12% (p=0.043 n=10+9) Match/Easy0i/32-12 1.09µs ± 0% 1.10µs ± 0% +0.15% (p=0.000 n=8+9) Match/Easy0i/1K-12 31.7µs ± 0% 31.8µs ± 1% ~ (p=0.905 n=9+10) Match/Easy0i/32K-12 1.61ms ± 0% 1.62ms ± 1% +1.12% (p=0.000 n=9+10) Match/Easy0i/1M-12 51.4ms ± 0% 51.8ms ± 0% +0.85% (p=0.000 n=8+8) Match/Easy0i/32M-12 1.65s ± 1% 1.65s ± 0% ~ (p=0.113 n=9+9) Match/Easy1/32-12 67.9ns ± 0% 67.7ns ± 1% ~ (p=0.232 n=8+10) Match/Easy1/1K-12 884ns ± 0% 873ns ± 0% -1.29% (p=0.000 n=9+10) Match/Easy1/32K-12 39.2µs ± 0% 39.4µs ± 0% +0.50% (p=0.000 n=9+10) Match/Easy1/1M-12 1.39ms ± 0% 1.39ms ± 0% +0.29% (p=0.000 n=9+10) Match/Easy1/32M-12 44.2ms ± 1% 44.3ms ± 0% +0.21% (p=0.029 n=10+10) Match/Medium/32-12 1.05µs ± 0% 1.04µs ± 0% -0.27% (p=0.001 n=8+9) Match/Medium/1K-12 31.3µs ± 0% 31.4µs ± 0% +0.39% (p=0.000 n=9+8) Match/Medium/32K-12 1.45ms ± 0% 1.45ms ± 0% +0.33% (p=0.000 n=8+9) Match/Medium/1M-12 46.2ms ± 0% 46.4ms ± 0% +0.35% (p=0.000 n=9+8) Match/Medium/32M-12 1.48s ± 0% 1.49s ± 1% +0.70% (p=0.000 n=8+10) Match/Hard/32-12 1.49µs ± 0% 1.48µs ± 0% -0.43% (p=0.000 n=10+9) Match/Hard/1K-12 45.1µs ± 1% 45.0µs ± 1% ~ (p=0.393 n=10+10) Match/Hard/32K-12 2.18ms ± 1% 2.24ms ± 0% +2.71% (p=0.000 n=9+8) Match/Hard/1M-12 69.7ms ± 1% 71.6ms ± 0% +2.76% (p=0.000 n=9+7) Match/Hard/32M-12 2.23s ± 1% 2.29s ± 0% +2.65% (p=0.000 n=9+9) Match/Hard1/32-12 7.89µs ± 0% 7.89µs ± 0% ~ (p=0.286 n=9+9) Match/Hard1/1K-12 244µs ± 0% 244µs ± 0% ~ (p=0.905 n=9+10) Match/Hard1/32K-12 10.3ms ± 0% 10.3ms ± 0% ~ (p=0.796 n=10+10) Match/Hard1/1M-12 331ms ± 0% 331ms ± 0% ~ (p=0.167 n=8+9) Match/Hard1/32M-12 10.6s ± 0% 10.6s ± 0% ~ (p=0.315 n=8+10) Match_onepass_regex/32-12 812ns ± 0% 830ns ± 0% +2.19% (p=0.000 n=10+9) Match_onepass_regex/1K-12 28.5µs ± 0% 28.7µs ± 1% +0.97% (p=0.000 n=10+9) Match_onepass_regex/32K-12 936µs ± 0% 949µs ± 0% +1.43% (p=0.000 n=10+8) Match_onepass_regex/1M-12 30.2ms ± 0% 30.4ms ± 0% +0.62% (p=0.000 n=10+8) Match_onepass_regex/32M-12 970ms ± 0% 973ms ± 0% +0.35% (p=0.000 n=10+9) CompileOnepass-12 4.63µs ± 1% 4.64µs ± 0% ~ (p=0.060 n=10+10) [Geo mean] 23.3µs 23.3µs +0.12% https://perf.golang.org/search?q=upload:20181004.2 Change-Id: Iff9e9f9d4a4698162126a2f300e8ed1b1a39361e Reviewed-on: https://go-review.googlesource.com/c/139780 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* regexp: split bit-state execution out of machine structRuss Cox2018-10-123-139/+162
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows the bit-state executions to have their own pool of allocated structures. A step toward eliminating the per-Regexp machine cache. Note especially the -92% on MatchParallelShared. This is real but not a complete story: the other execution engines still need to be de-shared, but the benchmark was only using bit-state. The tiny slowdowns in unrelated code are noise. name old time/op new time/op delta Find-12 264ns ± 3% 254ns ± 0% -3.86% (p=0.000 n=10+9) FindAllNoMatches-12 140ns ± 2% 135ns ± 0% -3.91% (p=0.000 n=10+9) FindString-12 256ns ± 0% 247ns ± 0% -3.52% (p=0.000 n=8+8) FindSubmatch-12 339ns ± 1% 334ns ± 0% -1.41% (p=0.000 n=9+10) FindStringSubmatch-12 322ns ± 0% 321ns ± 0% -0.21% (p=0.005 n=8+9) Literal-12 100ns ± 2% 92ns ± 0% -8.10% (p=0.000 n=10+9) NotLiteral-12 1.50µs ± 0% 1.47µs ± 0% -1.91% (p=0.000 n=8+9) MatchClass-12 2.18µs ± 0% 2.17µs ± 0% -0.20% (p=0.001 n=10+7) MatchClass_InRange-12 2.12µs ± 0% 2.13µs ± 0% +0.23% (p=0.000 n=10+10) ReplaceAll-12 1.41µs ± 0% 1.39µs ± 0% -1.30% (p=0.000 n=7+10) AnchoredLiteralShortNonMatch-12 89.8ns ± 0% 83.2ns ± 0% -7.35% (p=0.000 n=8+8) AnchoredLiteralLongNonMatch-12 105ns ± 3% 105ns ± 0% ~ (p=0.186 n=10+10) AnchoredShortMatch-12 141ns ± 0% 131ns ± 0% -7.09% (p=0.000 n=9+10) AnchoredLongMatch-12 276ns ± 4% 267ns ± 0% -3.23% (p=0.000 n=10+10) OnePassShortA-12 620ns ± 0% 611ns ± 0% -1.39% (p=0.000 n=10+9) NotOnePassShortA-12 575ns ± 3% 552ns ± 0% -3.97% (p=0.000 n=10+8) OnePassShortB-12 493ns ± 0% 491ns ± 0% -0.33% (p=0.000 n=8+8) NotOnePassShortB-12 423ns ± 0% 412ns ± 0% -2.60% (p=0.000 n=8+9) OnePassLongPrefix-12 112ns ± 0% 112ns ± 0% ~ (all equal) OnePassLongNotPrefix-12 405ns ± 0% 410ns ± 0% +1.23% (p=0.000 n=8+9) MatchParallelShared-12 501ns ± 1% 39ns ± 1% -92.27% (p=0.000 n=10+10) MatchParallelCopied-12 39.1ns ± 0% 39.2ns ± 3% ~ (p=0.785 n=6+10) QuoteMetaAll-12 94.6ns ± 0% 94.6ns ± 0% ~ (p=0.439 n=10+8) QuoteMetaNone-12 52.7ns ± 0% 52.7ns ± 0% ~ (all equal) Match/Easy0/32-12 79.1ns ± 0% 72.9ns ± 0% -7.85% (p=0.000 n=9+9) Match/Easy0/1K-12 307ns ± 1% 298ns ± 0% -2.99% (p=0.000 n=10+6) Match/Easy0/32K-12 4.65µs ± 2% 4.60µs ± 2% ~ (p=0.159 n=10+10) Match/Easy0/1M-12 234µs ± 0% 235µs ± 0% +0.17% (p=0.003 n=10+10) Match/Easy0/32M-12 7.98ms ± 1% 7.96ms ± 0% ~ (p=0.278 n=9+10) Match/Easy0i/32-12 1.13µs ± 1% 1.09µs ± 0% -3.24% (p=0.000 n=9+8) Match/Easy0i/1K-12 32.5µs ± 0% 31.7µs ± 0% -2.66% (p=0.000 n=9+9) Match/Easy0i/32K-12 1.59ms ± 0% 1.61ms ± 0% +0.75% (p=0.000 n=9+9) Match/Easy0i/1M-12 51.0ms ± 0% 51.4ms ± 0% +0.77% (p=0.000 n=10+8) Match/Easy0i/32M-12 1.63s ± 0% 1.65s ± 1% +1.24% (p=0.000 n=7+9) Match/Easy1/32-12 75.1ns ± 1% 67.9ns ± 0% -9.54% (p=0.000 n=8+8) Match/Easy1/1K-12 861ns ± 0% 884ns ± 0% +2.71% (p=0.000 n=8+9) Match/Easy1/32K-12 39.2µs ± 1% 39.2µs ± 0% ~ (p=0.090 n=10+9) Match/Easy1/1M-12 1.38ms ± 0% 1.39ms ± 0% ~ (p=0.095 n=10+9) Match/Easy1/32M-12 44.2ms ± 1% 44.2ms ± 1% ~ (p=0.218 n=10+10) Match/Medium/32-12 1.04µs ± 1% 1.05µs ± 0% +1.05% (p=0.000 n=9+8) Match/Medium/1K-12 31.3µs ± 0% 31.3µs ± 0% -0.14% (p=0.004 n=9+9) Match/Medium/32K-12 1.44ms ± 0% 1.45ms ± 0% +0.18% (p=0.001 n=8+8) Match/Medium/1M-12 46.1ms ± 0% 46.2ms ± 0% +0.13% (p=0.003 n=6+9) Match/Medium/32M-12 1.48s ± 0% 1.48s ± 0% +0.20% (p=0.002 n=9+8) Match/Hard/32-12 1.54µs ± 1% 1.49µs ± 0% -3.60% (p=0.000 n=9+10) Match/Hard/1K-12 46.4µs ± 1% 45.1µs ± 1% -2.78% (p=0.000 n=9+10) Match/Hard/32K-12 2.19ms ± 0% 2.18ms ± 1% -0.51% (p=0.006 n=8+9) Match/Hard/1M-12 70.1ms ± 0% 69.7ms ± 1% -0.52% (p=0.006 n=8+9) Match/Hard/32M-12 2.24s ± 0% 2.23s ± 1% -0.42% (p=0.046 n=8+9) Match/Hard1/32-12 8.17µs ± 1% 7.89µs ± 0% -3.42% (p=0.000 n=8+9) Match/Hard1/1K-12 254µs ± 2% 244µs ± 0% -3.91% (p=0.000 n=9+9) Match/Hard1/32K-12 9.58ms ± 1% 10.35ms ± 0% +8.00% (p=0.000 n=10+10) Match/Hard1/1M-12 306ms ± 1% 331ms ± 0% +8.27% (p=0.000 n=9+8) Match/Hard1/32M-12 9.79s ± 1% 10.60s ± 0% +8.29% (p=0.000 n=9+8) Match_onepass_regex/32-12 808ns ± 0% 812ns ± 0% +0.47% (p=0.000 n=8+10) Match_onepass_regex/1K-12 27.8µs ± 0% 28.5µs ± 0% +2.32% (p=0.000 n=8+10) Match_onepass_regex/32K-12 925µs ± 0% 936µs ± 0% +1.24% (p=0.000 n=9+10) Match_onepass_regex/1M-12 29.5ms ± 0% 30.2ms ± 0% +2.38% (p=0.000 n=10+10) Match_onepass_regex/32M-12 945ms ± 0% 970ms ± 0% +2.60% (p=0.000 n=9+10) CompileOnepass-12 4.67µs ± 0% 4.63µs ± 1% -0.84% (p=0.000 n=10+10) [Geo mean] 24.5µs 23.3µs -5.04% https://perf.golang.org/search?q=upload:20181004.1 Change-Id: Idbc2b76223718265657819ff38be2d9aba1c54b4 Reviewed-on: https://go-review.googlesource.com/c/139779 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
* regexp: fix BenchmarkMatch_onepass_regexRuss Cox2018-10-111-8/+2
| | | | | | | | | | | | | | | This benchmark - in contrast to all other benchmarks - was running the regexp match on 1-byte substrings of the input instead of the entire input. Worse, it was doing so by preallocating a slice of slices of every 1-byte substring. Needless to say, this does not accurately reflect what happens when the regexp matcher is given a large input. Change-Id: Icd5b95f0e43f554a6b93164916745941366e03d6 Reviewed-on: https://go-review.googlesource.com/c/139778 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>