| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
If the iseq only contains `opt_invokebuiltin_delegate_leave` insn and
the builtin-function (bf) is inline-able, the caller doesn't need to
build a method frame.
`vm_call_single_noarg_inline_builtin` is fast path for such cases.
|
|
|
|
|
|
|
|
|
|
| |
st tables will maintain insertion order so we can marshal dump / load
objects with instance variables in the same order they were set on that
particular instance
[ruby-core:112926] [Bug #19535]
Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
|
|
|
|
|
|
|
| |
* Break up jit_exec from vm_sendish
* YJIT: Implement throw instruction
* YJIT: Explain what rb_vm_throw does [ci skip]
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
This is a variation of the `defined` instruction, for use when we
are checking for an instance variable. Splitting this out as a
separate instruction lets us skip some checks, and it also allows
us to use an instance variable cache, letting shape analysis
speed up the operation further.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`f(*a, **kw)` is compiled to `f([*a, kw])` but it makes an dummy
array, so change it to pass two arguments `a` and `kw` with calling
flags.
```
ruby 3.2.0 (2022-12-29 revision a7d467a792) [x86_64-linux]
Calculating -------------------------------------
foo() 15.354M (± 4.2%) i/s - 77.295M in 5.043650s
dele() 13.439M (± 3.9%) i/s - 67.109M in 5.001974s
dele(*) 6.265M (± 4.5%) i/s - 31.730M in 5.075649s
dele(*a) 6.286M (± 3.3%) i/s - 31.719M in 5.051516s
dele(*a, **kw) 1.926M (± 4.5%) i/s - 9.753M in 5.076487s
dele(*, **) 1.927M (± 4.2%) i/s - 9.710M in 5.048224s
dele(...) 5.871M (± 3.9%) i/s - 29.471M in 5.028023s
forwardable 4.969M (± 4.1%) i/s - 25.233M in 5.087498s
ruby 3.3.0dev (2023-01-13T01:28:00Z master 7e8802fa5b) [x86_64-linux]
Calculating -------------------------------------
foo() 16.354M (± 4.7%) i/s - 81.799M in 5.014561s
dele() 14.256M (± 3.5%) i/s - 71.656M in 5.032883s
dele(*) 6.701M (± 3.8%) i/s - 33.948M in 5.074938s
dele(*a) 6.681M (± 3.3%) i/s - 33.578M in 5.031720s
dele(*a, **kw) 4.200M (± 4.4%) i/s - 21.258M in 5.072583s
dele(*, **) 4.197M (± 5.3%) i/s - 21.322M in 5.096684s
dele(...) 6.039M (± 6.8%) i/s - 30.355M in 5.052662s
forwardable 4.788M (± 3.2%) i/s - 24.033M in 5.024875s
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following script can sometimes trigger a crash:
```ruby
GC.stress = true
class Array
def foo(bool)
if bool
@a = 1
@b = 2
@c = 1
else
@c = 1
end
end
end
obj = []
obj.foo(true)
obj2 = []
obj2.foo(false)
obj3 = []
obj3.foo(true)
```
This is because vm_setivar_default calls rb_ensure_generic_iv_list_size
to resize the iv list. However, the call to gen_ivtbl_resize reallocs
the iv list, and then inserts into the generic iv table. If the
st_insert triggers a GC then the old iv list will be read during
marking, causing a use-after-free bug.
Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
|
| |
|
|
|
|
|
| |
This commit is refactoring and documenting the debug counters related to
instance variables.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The interrupt check will unintentionally release the VM lock when loading an iseq.
And this will cause issues with the `debug` gem's
[`ObjectSpace.each_iseq` method](https://github.com/ruby/debug/blob/0fcfc28acae33ec1c08068fb7c33703cfa681fa7/ext/debug/iseq_collector.c#L61-L67),
which wraps iseqs with a wrapper and exposes their internal states when they're actually not ready to be used.
And when that happens, errors like this would occur and kill the `debug` gem's thread:
```
DEBUGGER: ReaderThreadError: uninitialized InstructionSequence
┃ DEBUGGER: Disconnected.
┃ ["/opt/rubies/ruby-3.2.0/lib/ruby/gems/3.2.0/gems/debug-1.7.1/lib/debug/breakpoint.rb:247:in `absolute_path'",
┃ "/opt/rubies/ruby-3.2.0/lib/ruby/gems/3.2.0/gems/debug-1.7.1/lib/debug/breakpoint.rb:247:in `block in iterate_iseq'",
┃ "/opt/rubies/ruby-3.2.0/lib/ruby/gems/3.2.0/gems/debug-1.7.1/lib/debug/breakpoint.rb:246:in `each_iseq'",
...
```
A way to reproduce the issue is to satisfy these conditions at the same time:
1. `debug` gem calling `ObjectSpace.each_iseq` (e.g. [activating a `LineBreakpoint`](https://github.com/ruby/debug/blob/0fcfc28acae33ec1c08068fb7c33703cfa681fa7/lib/debug/breakpoint.rb#L246)).
2. A large amount of iseq being loaded from another thread (possibly through the `bootsnap` gem).
3. 1 and 2 iterating through the same iseq(s) at the same time.
Because this issue requires external dependencies and a rather complicated timing setup to reproduce, I wasn't able to write a test case for it.
But here's some pseudo code to help reproduce it:
```rb
require "debug/session"
Thread.new do
100.times do
ObjectSpace.each_iseq do |iseq|
iseq.absolute_path
end
end
end
sleep 0.1
load_a_bunch_of_iseq
possibly_through_bootsnap
```
[Bug #19348]
Co-authored-by: Peter Zhu <peter@peterzhu.ca>
|
|
|
|
| |
[Bug #19339]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On the cfunc methods, if a splat argument is given, all array elements
are expanded on the VM stack and it can cause SystemStackError.
The idea to avoid it is making a hidden array to contain all parameters
and use this array as an argv.
This patch is reviesed version of https://github.com/ruby/ruby/pull/6816
The main change is all changes are closed around calling cfunc logic.
Fixes [Bug #4040]
Co-authored-by: Jeremy Evans <code@jeremyevans.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the following crashes with
`CFLAGS=-DRGENGC_CHECK_MODE=2 -DRUBY_DEBUG=1 -fno-inline`:
$ ./miniruby -e 'GC.stress = true; Marshal.dump({})'
It crashes with a write barrier (WB) miss assertion on an edge from the
`Hash` class object to a newly allocated negative method entry.
This is due to usages of vm_ccs_create() running the WB too early,
before the method entry is inserted into the cc table, so before the
reference edge is established. The insertion can trigger GC and promote
the class object, so running the WB after the insertion is necessary.
Move the insertion into vm_ccs_create() and run the WB after the
insertion.
Discovered on CI:
http://ci.rvm.jp/results/trunk-asserts@ruby-sp2-docker/4391770
|
|
|
|
| |
We don't use this value, so there's no need to set it.
|
|
|
|
|
|
|
|
|
|
| |
I noticed this while running test_yjit with --mjit-call-threshold=1,
which redefines `Integer#<`. When Ruby is monkey-patched,
MJIT itself could be broken.
Similarly, Ruby scripts could break MJIT in many different ways. I
prepared the same set of hooks as YJIT so that we could possibly
override it and disable it on those moments. Every constant under
RubyVM::MJIT is private and thus it's an unsupported behavior though.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an object becomes "too complex" (in other words it has too many
variations in the shape tree), we transition it to use a "too complex"
shape and use a hash for storing instance variables.
Without this patch, there were rare cases where shape tree growth could
"explode" and cause performance degradation on what would otherwise have
been cached fast paths.
This patch puts a limit on shape tree growth, and gracefully degrades in
the rare case where there could be a factorial growth in the shape tree.
For example:
```ruby
class NG; end
HUGE_NUMBER.times do
NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
end
```
We consider objects to be "too complex" when the object's class has more
than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and
the object introduces a new variation (a new leaf node) associated with
that class.
For example, new variations on instances of the following class would be
considered "too complex" because those instances create more than 8
leaves in the shape tree:
```ruby
class Foo; end
9.times { Foo.new.instance_variable_set(":@uniq_#{_1}", 1) }
```
However, the following class is *not* too complex because it only has
one leaf in the shape tree:
```ruby
class Foo
def initialize
@a = @b = @c = @d = @e = @f = @g = @h = @i = nil
end
end
9.times { Foo.new }
``
This case is rare, so we don't expect this change to impact performance
of most applications, but it needs to be handled.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
|
| |
|
|
|
|
|
|
| |
Since edc7af48acd12666a2945f30901d16b62a39f474, we now no longer have
undef ivar transitions. Instead, we rebuild the shapes table. When we do
this, we need to ensure that we retain our capacities on shapes.
|
|
|
|
|
|
|
|
|
| |
* YJIT: implement getconstant YARV instruction
* Constant id is not a pointer
* Stack operands must be read after jit_prepare_routine_call
Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cases like this:
```ruby
obj = Object.new
loop do
obj.instance_variable_set(:@foo, 1)
obj.remove_instance_variable(:@foo)
end
```
can cause us to use many more shapes than we want (and even run out).
This commit changes the code such that when an instance variable is
removed, we'll walk up the shape tree, find the shape, then rebuild any
child nodes that happened to be below the "targetted for removal" IV.
This also requires moving any instance variables so that indexes derived
from the shape tree will work correctly.
Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
Co-authored-by: John Hawthorn <jhawthorn@github.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this commit the `OPTIMIZED_CMP` macro relied on a method lookup
to determine whether `<=>` was overridden. The result of the lookup was
cached, but only for the duration of the specific method that
initialized the cmp_opt_data cache structure.
With this method lookup, `[x,y].max` is slower than doing `x > y ?
x : y` even though there's an optimized instruction for "new array max".
(John noticed somebody a proposed micro-optimization based on this fact
in https://github.com/mastodon/mastodon/pull/19903.)
```rb
a, b = 1, 2
Benchmark.ips do |bm|
bm.report('conditional') { a > b ? a : b }
bm.report('method') { [a, b].max }
bm.compare!
end
```
Before:
```
Comparison:
conditional: 22603733.2 i/s
method: 19820412.7 i/s - 1.14x (± 0.00) slower
```
This commit replaces the method lookup with a new CMP basic op, which
gives the examples above equivalent performance.
After:
```
Comparison:
method: 24022466.5 i/s
conditional: 23851094.2 i/s - same-ish: difference falls within
error
```
Relevant benchmarks show an improvement to Array#max and Array#min when
not using the optimized newarray_max instruction as well. They are
noticeably faster for small arrays with the relevant types, and the same
or maybe a touch faster on larger arrays.
```
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_min
$ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_max
```
The benchmarks added in this commit also look generally improved.
Co-authored-by: John Hawthorn <jhawthorn@github.com>
|
|
|
|
|
|
| |
We can loosely predict the number of ivar sets on a class based on the
number of iv set instructions in the initialize method. This should give
us a more accurate estimate to use for initial size pool allocation,
which should in turn give us more cache hits.
|
|
|
|
|
|
|
| |
obj_ivar_set and vm_setivar_slowpath is essentially doing the same thing,
but the code is duplicated and not quite implemented in the same way,
which could cause bugs. This commit refactors vm_setivar_slowpath to use
obj_ivar_set.
|
| |
|
|
|
|
|
|
|
| |
Since object shapes store the capacity of an object, we no longer
need the numiv field on RObjects. This gives us one extra slot which
we can use to give embedded objects one more instance variable (for a
total of 3 ivs). This commit removes the concept of numiv from RObject.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.
This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
|
|
|
|
|
|
|
|
| |
* Avoid RCLASS_IV_TBL in marshal.c
* Avoid RCLASS_IV_TBL for class names
* Avoid RCLASS_IV_TBL for autoload
* Avoid RCLASS_IV_TBL for class variables
* Avoid copying RCLASS_IV_TBL onto ICLASSes
* Use object shapes for Class and Module IVs
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch pushes dummy frames when loading code for the
profiling purpose.
The following methods push a dummy frame:
* `Kernel#require`
* `Kernel#load`
* `RubyVM::InstructionSequence.compile_file`
* `RubyVM::InstructionSequence.load_from_binary`
https://bugs.ruby-lang.org/issues/18559
|
|
|
|
|
|
| |
Shapes provides us with an (almost) exact count of instance variables.
We only need to check for Qundef when an IV has been "undefined"
Prefer to use ROBJECT_IV_COUNT when iterating IVs
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The inline cache is initialized by vm_cc_attr_index_set only when
vm_cc_markable(cc). However, vm_getivar attempted to read the cache
even if the cc is not vm_cc_markable.
This caused a condition that depends on uninitialized value.
Here is an output of valgrind:
```
==10483== Conditional jump or move depends on uninitialised value(s)
==10483== at 0x4C1D60: vm_getivar (vm_insnhelper.c:1171)
==10483== by 0x4C1D60: vm_call_ivar (vm_insnhelper.c:3257)
==10483== by 0x4E8E48: vm_call_symbol (vm_insnhelper.c:3481)
==10483== by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035)
==10483== by 0x4C62B2: vm_exec_core (insns.def:820)
==10483== by 0x4DD519: rb_vm_exec (vm.c:0)
==10483== by 0x4F00B3: invoke_block (vm.c:1417)
==10483== by 0x4F00B3: invoke_iseq_block_from_c (vm.c:1473)
==10483== by 0x4F00B3: invoke_block_from_c_bh (vm.c:1491)
==10483== by 0x4D42B6: rb_yield (vm_eval.c:0)
==10483== by 0x259128: rb_ary_each (array.c:2733)
==10483== by 0x4E8730: vm_call_cfunc_with_frame (vm_insnhelper.c:3227)
==10483== by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035)
==10483== by 0x4C6254: vm_exec_core (insns.def:801)
==10483== by 0x4DD519: rb_vm_exec (vm.c:0)
==10483==
```
In fact, the CI on FreeBSD 12 started failing since ad63b668e22e21c352b852f3119ae98a7acf99f1.
```
gmake[1]: Entering directory '/usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby'
/usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:924:in `complete': undefined method `complete' for nil:NilClass (NoMethodError)
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1816:in `block in visit'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `reverse_each'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `visit'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1847:in `block in complete'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `catch'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `complete'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1640:in `block in parse_in_order'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `catch'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `parse_in_order'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1626:in `order!'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1732:in `permute!'
from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1757:in `parse!'
from ./ext/extmk.rb:359:in `parse_args'
from ./ext/extmk.rb:396:in `<main>'
```
This change adds a guard to read the cache only when vm_cc_markable(cc).
It might be better to initialize the cache as INVALID_SHAPE_ID when the
cc is not vm_cc_markable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this commit, we were reading and writing ivar index and
shape ID in inline caches in two separate instructions when
getting and setting ivars. This meant there was a race condition
with ractors and these caches where one ractor could change
a value in the cache while another was still reading from it.
This commit instead reads and writes shape ID and ivar index to
inline caches atomically so there is no longer a race condition.
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
|
|
|
|
| |
This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
|
| |
|
| |
|
|
|
|
| |
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
|
|
|
|
|
|
|
|
| |
Before d594a5a8bd0756f65c078fcf5ce0098250cba141, we were only
asserting that the value on an ivar_get was ractor_sharable if the
object was a T_OBJECT and also ractor shareable. We should still
be doing this check only if the object is a T_OBJECT and ractor
shareable
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects. Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness"). Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree. Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.
For example:
```ruby
class Foo
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
class Bar
def initialize
# Starts with shape id 0
@a = 1 # transitions to shape id 1
@b = 1 # transitions to shape id 2
end
end
foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```
Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.
This commit also adds some methods for debugging shapes on objects. See
`RubyVM::Shape` for more details.
For more context on Object Shapes, see [Feature: #18776]
Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
|