delta/ruby.git - github.com: ruby/ruby.git

	Commit message	Author	Age	Files	Lines
*	Use UTF-8 encoding for literal extended regexps with UTF-8 characters in comments	Jeremy Evans	2023-04-23	1	-1/+8
\|	Fixes [Bug #19455]
*	MatchData#named_captures: add optional symbolize_names keyword (#6952)	Vladimir Dementyev	2023-04-19	1	-4/+30
\|
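A minimal sketch of what the `symbolize_names` keyword changes, assuming a Ruby build that includes this commit. The arity-based feature check and the `transform_keys` fallback are illustrative additions of this sketch, not part of the commit.

```ruby
m = /(?<year>\d+)-(?<month>\d+)/.match("2023-04")

# Default behaviour: capture names come back as String keys.
string_keys = m.named_captures

# Feature check (an assumption of this sketch): on builds with this commit the
# method accepts optional arguments, so its arity is no longer 0.
if MatchData.instance_method(:named_captures).arity != 0
  symbol_keys = m.named_captures(symbolize_names: true)
else
  # Fallback for older builds; builds the same shape by hand.
  symbol_keys = m.named_captures.transform_keys(&:to_sym)
end

p string_keys
p symbol_keys
```

Symbol keys are convenient when the result is passed on as keyword arguments or used in hash patterns.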
*	[Feature #19474] Refactor NEWOBJ macros	Matt Valentine-House	2023-04-06	1	-2/+2
\| \| \| \|	NEWOBJ_OF is now our canonical newobj macro. It takes an optional ec
*	Stop exporting symbols for MJIT	Takashi Kokubun	2023-03-06	1	-1/+1
\|
*	[DOC] Fix options of `Regexp#initialize`	Nobuyoshi Nakada	2023-03-06	1	-1/+1
\| \| \| \|	`Integer#\|` is bit-wise OR operator, not logical OR.
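The distinction the doc fix points at can be sketched as follows: option flags for `Regexp.new` are combined with bit-wise OR, while logical OR would silently drop a flag. The pattern and test string are made up for illustration.

```ruby
# Combine two option flags into a single Integer bit mask with bit-wise OR.
options = Regexp::IGNORECASE | Regexp::MULTILINE
re = Regexp.new("a.b", options)

p options              # Integer holding both flags
p re.match?("A\nB")    # IGNORECASE matches "A"/"B"; MULTILINE lets "." match "\n"

# Logical OR returns its first truthy operand, so the second flag is lost.
logical = Regexp::IGNORECASE || Regexp::MULTILINE
p logical == Regexp::IGNORECASE
```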
*	`rb_scan_args` never fills optional arguments with `Qundef`	Nobuyoshi Nakada	2023-03-06	1	-2/+2
\|
*	[Bug #19471] `Regexp.compile` should handle keyword arguments	Nobuyoshi Nakada	2023-03-03	1	-1/+1
\| \| \| \| \|	As well as `Regexp.new`, it should pass keyword arguments to the `Regexp#initialize` method.
*	Remove support for the Regexp.new 3rd argument	Jeremy Evans	2023-03-01	1	-13/+2
\| \| \| \| \| \|	This was deprecated in Ruby 3.2. Fixes [Bug #18797]
*	Adjust `else` style to be consistent in each file [ci skip]	Nobuyoshi Nakada	2023-02-26	1	-4/+8
\|
*	Remove (newly unneeded) remarks about aliases	BurdetteLamar	2023-02-19	1	-11/+0
\|
*	Implement Write Barrier for RMatch objects	Jean Boussier	2023-02-10	1	-13/+12
\| \| \| \|	They only have two references.
*	[DOC] Fix typo in document of regexp [ci skip]	OKURA Masafumi	2023-02-10	1	-2/+2
\|
*	Remove `REG_LITERAL` flag	Nobuyoshi Nakada	2023-02-09	1	-4/+0
\| \| \| \|	All `Regexp` literals are frozen now.
*	Fix parsing of regexps that toggle extended mode on/off inside regexp	Jeremy Evans	2023-01-30	1	-33/+120
\|	This was broken in ec3542229b29ec93062e9d90e877ea29d3c19472. That commit didn't handle cases where extended mode was turned on/off inside the regexp. There are two ways to turn extended mode on/off:

	```
	/(?-x:#y)#z
	/x =~ '#y'

	/(?-x)#y(?x)#z
	/x =~ '#y'
	```

	These can be nested inside the same regexp:

	```
	/(?-x:(?x)#x
	(?-x)#y)#z
	/x =~ '#y'
	```

	As you can probably imagine, this makes handling these regexps somewhat complex. Due to the nesting inside portions of regexps, the unassign_nonascii function needs to be recursive. In recursive mode, it needs to track both opening and closing parentheses, similar to how it already tracked opening and closing brackets for character classes.

	When scanning the regexp and coming to `(?` not followed by `#`, scan for options, and use `x` and `i` to determine whether to turn on or off extended mode. For `:`, indicating only the current regexp section should have the extended mode switched, recurse with the extended mode set or unset. For `)`, indicating the remainder of the regexp (or current regexp portion if already recursing) should turn extended mode on or off, just change the extended mode flag and keep scanning.

	While testing this, I noticed that `a`, `d`, and `u` are accepted as options, in addition to `i`, `m`, and `x`, but I can't see where those options are documented. I'm not sure whether or not handling `a`, `d`, and `u` as options is a bug.

	Fixes [Bug #19379]
*	[DOC] Correction to RDoc for Regexp.new (#7130)	Burdette Lamar	2023-01-16	1	-0/+2
\| \| \|	Correction to RDoc for Regexp.new
*	Always issue deprecation warning when calling Regexp.new with 3rd positional argument	Jeremy Evans	2022-12-22	1	-14/+10
\|	Previously, only certain values of the 3rd argument triggered a deprecation warning.

	First step for fix for bug #18797. Support for the 3rd argument will be removed after the release of Ruby 3.2.

	Fix minor fallout discovered by the tests.

	Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
*	Refactor `reg_extract_args` to return regexp if given	Nobuyoshi Nakada	2022-12-22	1	-12/+9
\|
*	Share argument parsing in `Regexp#initialize` and `Regexp.linear_time?`	Nobuyoshi Nakada	2022-12-22	1	-20/+41
\|
*	typo in doc [ci skip]	卜部昌平	2022-12-19	1	-1/+1
\|
*	Note about Regexp.linear_time? [ci skip]	卜部昌平	2022-12-19	1	-0/+10
\|
*	Add `Regexp.linear_time?` (#6901)	TSUYUSATO Kitsune	2022-12-14	1	-0/+34
\|
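A short sketch of how `Regexp.linear_time?` might be queried, assuming a Ruby build that has the method. The two patterns and the `respond_to?` guard are illustrative choices; the back-reference case is used because back-references are a common reason a pattern cannot be matched in linear time.

```ruby
# Guard (an assumption of this sketch): the method only exists on builds
# that include this commit.
if Regexp.respond_to?(:linear_time?)
  simple  = Regexp.linear_time?(/a+b/)     # plain repetition
  backref = Regexp.linear_time?(/(a)\1/)   # uses a back-reference
end

p simple
p backref
```

A check like this can be useful before accepting a pattern from an untrusted source.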
*	Introduce encoding check macro	S-H-GAMELINKS	2022-12-02	1	-1/+2
\|
*	Prevent segfault in String#scan with ObjectSpace.each_object	Yusuke Endoh	2022-12-01	1	-0/+7
\| \| \| \| \| \| \| \| \| \|	Calling `String#scan` without a block creates an incomplete MatchData object whose `RMATCH(match)->str` is Qfalse. Usually this object is not leaked, but it was possible to pull it by using ObjectSpace.each_object. This change hides the internal MatchData object by using rb_obj_hide. Fixes [Bug #19159]
*	Using UNDEF_P macro	S-H-GAMELINKS	2022-11-16	1	-2/+2
\|
*	Suppress false warning by a bug of gcc	Nobuyoshi Nakada	2022-11-08	1	-4/+5
\|	GCC [Bug 99578] seems triggered by calling `rb_reg_last_match` before `match_check(match)`, probably by `NIL_P(match)` in `rb_reg_nth_match`.

	[Bug 99578]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578
*	Refactor timeout-setting code to a function	Yusuke Endoh	2022-10-24	1	-13/+12
\|
*	Refactor timeout-related code in re.c a little	Yusuke Endoh	2022-10-24	1	-9/+9
\|
*	Fix per-instance Regexp timeout (#6621)	Yusuke Endoh	2022-10-24	1	-2/+8
\|	This makes it follow what was decided in [Bug #19055]:

	* `Regexp.new(str, timeout: nil)` should respect the global timeout
	* `Regexp.new(str, timeout: huge_val)` should use the maximum value that can be represented in the internal representation
	* `Regexp.new(str, timeout: 0 or negative value)` should raise an error
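The rules listed for this commit can be sketched roughly as below, assuming a Ruby build with the per-instance timeout feature. The `method_defined?` guard, the pattern, and the variable names are assumptions of this sketch; the huge-value case is left out because its result depends on the internal representation.

```ruby
# Guard (an assumption of this sketch): Regexp#timeout only exists on builds
# that have the timeout feature.
if Regexp.method_defined?(:timeout)
  with_timeout = Regexp.new("a+", timeout: 1.5)
  without      = Regexp.new("a+", timeout: nil)

  per_instance = with_timeout.timeout   # the value passed to Regexp.new
  falls_back   = without.timeout        # nil here, so the global Regexp.timeout applies

  # Zero (or a negative value) is rejected instead of meaning "no timeout".
  rejected = begin
    Regexp.new("a+", timeout: 0)
    false
  rescue ArgumentError
    true
  end

  p per_instance
  p falls_back
  p rejected
end
```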
*	Fix argument & Remove enum	S-H-GAMELINKS	2022-10-23	1	-9/+3
\|
*	Introduce rb_memsearch_with_char_size function	S-H-GAMELINKS	2022-10-23	1	-10/+14
\|
*	* expand tabs. [ci skip]	git	2022-10-10	1	-2/+2
\| \| \| \| \|	Tabs were expanded because the file did not have any tab indentation in unedited lines. Please update your editor config, and use misc/expand_tabs.rb in the pre-commit hook.
*	Should use dedicated function `Check_Type`	Nobuyoshi Nakada	2022-10-10	1	-12/+4
\|
*	Add MatchData#deconstruct/deconstruct_keys	Vladimir Dementyev	2022-10-10	1	-0/+85
\|
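A hedged sketch of how these two methods are typically used through pattern matching, assuming a Ruby build that includes this commit. The time-of-day pattern and the `method_defined?` guard are illustrative additions.

```ruby
m = /(?<hours>\d{2}):(?<minutes>\d{2})/.match("18:37")

# Guard (an assumption of this sketch): older builds lack these methods.
if MatchData.method_defined?(:deconstruct_keys)
  # Hash pattern: goes through MatchData#deconstruct_keys (Symbol keys).
  by_name =
    case m
    in {hours:, minutes:}
      [hours, minutes]
    end

  # Array pattern: goes through MatchData#deconstruct (captures in order).
  by_position =
    case m
    in [h, mi]
      [h, mi]
    end

  p by_name
  p by_position
end
```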
*	[DOC] `offset` argument of Regexp#match	Nobuyoshi Nakada	2022-08-18	1	-1/+6
\|
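As a small illustration of the `offset` argument documented here: the second argument to `Regexp#match` sets the index where the search starts, and reported positions still refer to the whole string. The sample string is made up.

```ruby
str = "a1b2"

first  = /\d/.match(str)      # search starts at index 0
second = /\d/.match(str, 2)   # search starts at index 2, so "1" is skipped

p [first[0], first.begin(0)]    # ["1", 1]
p [second[0], second.begin(0)]  # ["2", 3]
```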
*	Speed up setting the backref match object	Aaron Patterson	2022-08-02	1	-3/+1
\|	This patch speeds up setting the backref match object by avoiding some memcopies. Take the following code for example:

	```ruby
	"hello world" =~ /hello/
	p $~
	```

	When the RE matches the string, we have to set the Match object in the backref global. So we would allocate a match object[^1] and use `rb_reg_region_copy`[^2] to make a deep copy of the stack allocated `re_registers` struct[^3] in to the newly created Ruby object. This could possibly trigger GC[^4], and would allocate new memory. This patch makes a shallow copy of the `re_registers` struct on to the Match object allowing the match object to manage the `re_registers` pointer and also avoiding some calls to `xmalloc` and some manual memcopy.

	Benchmark looks like this:

	```ruby
	require "benchmark/ips"

	def test_re thing
	  thing =~ /hello/
	end

	Benchmark.ips do |x|
	  x.report("re hit") do
	    test_re "hello world"
	  end

	  x.report("re miss") do
	    test_re "world"
	  end
	end
	```

	Before this patch:

	```
	$ ruby -v test.rb
	ruby 3.2.0dev (2022-07-27T22:29:00Z master 4ad69899b7) [arm64-darwin21]
	Ignoring bcrypt-3.1.16 because its extensions are not built. Try: gem pristine bcrypt --version 3.1.16
	Warming up --------------------------------------
	              re hit   345.401k i/100ms
	             re miss   673.584k i/100ms
	Calculating -------------------------------------
	              re hit      3.452M (± 0.5%) i/s -     17.270M in   5.002535s
	             re miss      6.736M (± 0.4%) i/s -     34.353M in   5.099593s
	```

	After this patch:

	```
	$ ./ruby -v test.rb
	ruby 3.2.0dev (2022-08-01T21:24:12Z less-memcpy 0ff2a56606) [arm64-darwin21]
	Warming up --------------------------------------
	              re hit   419.578k i/100ms
	             re miss   673.251k i/100ms
	Calculating -------------------------------------
	              re hit      4.201M (± 0.7%) i/s -     21.398M in   5.093593s
	             re miss      6.716M (± 0.4%) i/s -     33.663M in   5.012756s
	```

	Matches get faster and misses maintain the same speed

	[^1]: https://github.com/ruby/ruby/blob/24204d54ab730791bfbd0cd66b8e12f0bd62ca5d/re.c#L1737
	[^2]: https://github.com/ruby/ruby/blob/24204d54ab730791bfbd0cd66b8e12f0bd62ca5d/re.c#L1738
	[^3]: https://github.com/ruby/ruby/blob/24204d54ab730791bfbd0cd66b8e12f0bd62ca5d/re.c#L1686
	[^4]: https://github.com/ruby/ruby/blob/24204d54ab730791bfbd0cd66b8e12f0bd62ca5d/re.c#L981
*	Expand tabs [ci skip]	Takashi Kokubun	2022-07-21	1	-636/+636
\| \| \| \|	[Misc #18891]
*	[DOC] Fix a typo [ci skip]	Kazuhiro NISHIYAMA	2022-06-26	1	-1/+1
\|
*	Document that Regexp#source does not retain lexer escapes	Jeremy Evans	2022-06-20	1	-1/+5
\| \| \| \|	Related to [Feature #18838]
*	[Feature #18788] [DOC] String options to `Regexp.new`	Nobuyoshi Nakada	2022-06-20	1	-0/+5
\| \| \| \|	Co-Authored-By: Janosch Müller <janosch.mueller@betterplace.org>
*	[Feature #18788] Support options as `String` to `Regexp.new`	Nobuyoshi Nakada	2022-06-20	1	-0/+21
\| \| \| \| \|	`Regexp.new` now supports passing the regexp flags not only as an `Integer`, but also as a `String`. Unknown flags raise errors.
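A minimal sketch of the String form of the flags, assuming a Ruby build that includes this feature. The feature probe on the first line is an assumption of this sketch: on older builds a String second argument was simply treated as truthy, so the probe's options would not equal `Regexp::MULTILINE`.

```ruby
# Feature probe (an assumption of this sketch, not part of the commit).
string_flags = Regexp.new("a", "m").options == Regexp::MULTILINE

if string_flags
  # "ix" is read as the IGNORECASE and EXTENDED flags.
  re = Regexp.new("a b", "ix")
  combined = re.options == (Regexp::IGNORECASE | Regexp::EXTENDED)

  # An unknown flag character is rejected rather than ignored.
  unknown_rejected = begin
    Regexp.new("a", "z")
    false
  rescue ArgumentError
    true
  end

  p combined
  p unknown_rejected
end
```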
*	Warn suspicious flag to `Regexp.new`	Nobuyoshi Nakada	2022-06-20	1	-1/+3
\| \| \| \| \|	Now the second argument should be `true`, `false`, `nil` or an Integer. This flag is sometimes confused with the third argument.
*	[DOC] Refine Regexp.new argument descriptions	Nobuyoshi Nakada	2022-06-20	1	-6/+19
\|
*	[DOC] Regexp timeout is float or nil	Nobuyoshi Nakada	2022-06-20	1	-3/+3
\|
*	[DOC] Fixed omissions in Regexp.new arguments	Nobuyoshi Nakada	2022-06-20	1	-2/+6
\|
*	Ignore invalid escapes in regexp comments	Jeremy Evans	2022-06-06	1	-8/+63
\|	Invalid escapes are handled at multiple levels. The first level is in parse.y, so skip invalid unicode escape checks for regexps in parse.y.

	Make rb_reg_preprocess and unescape_nonascii accept the regexp options. In unescape_nonascii, if the regexp is an extended regexp, when "#" is encountered, ignore all characters until the end of line or end of regexp.

	Unfortunately, in extended regexps, you can use "#" as a non-comment character inside a character class, so also parse "[" and "]" specially for extended regexps, and only skip comments if "#" is not inside a character class. Handle nested character classes as well.

	This issue doesn't just affect extended regexps, it also affects "(?#" comments inside all regexps. So for those comments, scan until trailing ")" and ignore content inside.

	I'm not sure if there are other corner cases not handled. A better fix would be to redesign the regexp parser so that it unescaped during parsing instead of before parsing, so you already know the current parsing state.

	Fixes [Bug #18294]

	Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
*	[DOC] Enhanced RDoc for MatchData (#5822)	Burdette Lamar	2022-04-18	1	-50/+69
\| \| \| \| \| \| \| \| \| \|	Treats: #to_s #named_captures #string #inspect #hash #==
*	Enhanced RDoc for MatchData (#5821)	Burdette Lamar	2022-04-18	1	-32/+41
\| \| \| \| \| \|	Treats: #[] #values_at
*	Enhanced RDoc for MatchData (#5820)	Burdette Lamar	2022-04-18	1	-33/+41
\| \| \| \| \| \| \| \|	Treats: #pre_match #post_match #to_a #captures
*	[DOC] Enhanced RDoc for MatchData (#5819)	Burdette Lamar	2022-04-18	1	-45/+47
\| \| \| \| \| \| \| \|	Treats: #begin #end #match #match_length
*	[DOC] Enhanced RDoc for MatchData (#5818)	Burdette Lamar	2022-04-18	1	-31/+32
\| \| \| \| \| \| \| \|	Treats: #regexp #names #size #offset