path: root/vm_insnhelper.c
Commit message | Author | Age | Files | Lines
* Transition complex objects to "too complex" shape | Jemma Issroff | 2022-12-15 | 1 | -14/+29

  When an object becomes "too complex" (in other words it has too many variations in the shape tree), we transition it to use a "too complex" shape and use a hash for storing instance variables.

  Without this patch, there were rare cases where shape tree growth could "explode" and cause performance degradation on what would otherwise have been cached fast paths.

  This patch puts a limit on shape tree growth, and gracefully degrades in the rare case where there could be a factorial growth in the shape tree.

  For example:

  ```ruby
  class NG; end

  HUGE_NUMBER.times do
    NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1)
  end
  ```

  We consider objects to be "too complex" when the object's class has more than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and the object introduces a new variation (a new leaf node) associated with that class.

  For example, new variations on instances of the following class would be considered "too complex" because those instances create more than 8 leaves in the shape tree:

  ```ruby
  class Foo; end
  9.times { Foo.new.instance_variable_set(:"@uniq_#{_1}", 1) }
  ```

  However, the following class is *not* too complex because it only has one leaf in the shape tree:

  ```ruby
  class Foo
    def initialize
      @a = @b = @c = @d = @e = @f = @g = @h = @i = nil
    end
  end
  9.times { Foo.new }
  ```

  This case is rare, so we don't expect this change to impact performance of most applications, but it needs to be handled.

  Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
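The leaf-counting rule in the message above can be illustrated with a toy model. This is only a sketch in plain Ruby, not CRuby's implementation: `ToyShape`, `ToyShapeTree`, and the `:too_complex` marker are hypothetical names, and only the limit of 8 is taken from the commit message.

```ruby
# Toy model only: these names are hypothetical and do not exist in CRuby.
SHAPE_MAX_VARIATIONS = 8 # the limit quoted in the commit message

class ToyShape
  attr_reader :children

  def initialize
    @children = {} # ivar name => child ToyShape
  end
end

class ToyShapeTree
  attr_reader :root, :leaf_count

  def initialize
    @root = ToyShape.new
    @leaf_count = 0 # leaves below the root
  end

  # Follow (or create) the edge for ivar_name below shape.
  # Returns the child shape, or :too_complex when creating it
  # would add a leaf beyond SHAPE_MAX_VARIATIONS.
  def transition(shape, ivar_name)
    existing = shape.children[ivar_name]
    return existing if existing

    # Extending an existing leaf keeps the leaf count unchanged;
    # branching from the root or from an inner node adds a leaf.
    adds_leaf = shape.equal?(@root) || !shape.children.empty?
    return :too_complex if adds_leaf && @leaf_count >= SHAPE_MAX_VARIATIONS

    @leaf_count += 1 if adds_leaf
    shape.children[ivar_name] = ToyShape.new
  end
end

# Nine unique ivars set from the root: the ninth would create a ninth leaf.
wide = ToyShapeTree.new
wide_results = 9.times.map { |i| wide.transition(wide.root, :"@uniq_#{i}") }

# Nine ivars always set in the same order: a single chain with one leaf.
deep = ToyShapeTree.new
%i[@a @b @c @d @e @f @g @h @i].inject(deep.root) do |shape, name|
  deep.transition(shape, name)
end
```

In this model the first eight results in `wide_results` are shapes and the last is `:too_complex`, while `deep` never leaves the tree because its chain contributes only one leaf.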
* YJIT: Implement opt_newarray_max instruction (#6893) | Takashi Kokubun | 2022-12-12 | 1 | -0/+6
* Update shape capacity when removing ivar and rewriting shape transitions | Jemma Issroff | 2022-12-10 | 1 | -2/+0

  Since edc7af48acd12666a2945f30901d16b62a39f474, we no longer have undef ivar transitions. Instead, we rebuild the shapes table. When we do this, we need to ensure that we retain our capacities on shapes.
* YJIT: implement `getconstant` YARV instruction (#6884) | Maxime Chevalier-Boisvert | 2022-12-09 | 1 | -0/+6

  * YJIT: implement getconstant YARV instruction
  * Constant id is not a pointer
  * Stack operands must be read after jit_prepare_routine_call

  Co-authored-by: Takashi Kokubun <takashikkbn@gmail.com>
* YJIT: implement opt_newarray_min YARV instruction (#6888) | Maxime Chevalier-Boisvert | 2022-12-08 | 1 | -0/+6
* Stop transitioning to UNDEF when undefining an instance variable | Aaron Patterson | 2022-12-07 | 1 | -1/+1

  Cases like this:

  ```ruby
  obj = Object.new
  loop do
    obj.instance_variable_set(:@foo, 1)
    obj.remove_instance_variable(:@foo)
  end
  ```

  can cause us to use many more shapes than we want (and even run out). This commit changes the code such that when an instance variable is removed, we'll walk up the shape tree, find the shape, then rebuild any child nodes that happened to be below the "targeted for removal" IV. This also requires moving any instance variables so that indexes derived from the shape tree will work correctly.

  Co-Authored-By: Jemma Issroff <jemmaissroff@gmail.com>
  Co-authored-by: John Hawthorn <jhawthorn@github.com>
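The behaviour this commit has to preserve can be checked from plain Ruby. The sketch below looks only at observable results (which ivars remain and what they hold) and does not inspect shapes; the helper names are made up for illustration.

```ruby
# Observable behaviour only; nothing here inspects shapes directly.
# Both helper names are hypothetical.
def set_and_remove_repeatedly(obj, times)
  times.times do
    obj.instance_variable_set(:@foo, 1)
    obj.remove_instance_variable(:@foo)
  end
  obj
end

def remove_middle_ivar
  obj = Object.new
  obj.instance_variable_set(:@a, 1)
  obj.instance_variable_set(:@b, 2)
  obj.instance_variable_set(:@c, 3)
  removed = obj.remove_instance_variable(:@b) # returns the removed value
  [obj, removed]
end

obj, removed = remove_middle_ivar
remaining = obj.instance_variables             # @a and @c survive, in order
still_three = obj.instance_variable_get(:@c)   # value is unaffected by the removal
churned = set_and_remove_repeatedly(Object.new, 1_000)
```

Removing `@b` leaves `@a` and `@c` readable with their original values, which is the property that "moving any instance variables" in the message is there to maintain.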
* Introduce BOP_CMP for optimized comparison | Daniel Colson | 2022-12-06 | 1 | -4/+2

  Prior to this commit the `OPTIMIZED_CMP` macro relied on a method lookup to determine whether `<=>` was overridden. The result of the lookup was cached, but only for the duration of the specific method that initialized the cmp_opt_data cache structure.

  With this method lookup, `[x,y].max` is slower than doing `x > y ? x : y` even though there's an optimized instruction for "new array max". (John noticed somebody had proposed a micro-optimization based on this fact in https://github.com/mastodon/mastodon/pull/19903.)

  ```rb
  require "benchmark/ips" # from the benchmark-ips gem

  a, b = 1, 2
  Benchmark.ips do |bm|
    bm.report('conditional') { a > b ? a : b }
    bm.report('method') { [a, b].max }
    bm.compare!
  end
  ```

  Before:

  ```
  Comparison:
    conditional: 22603733.2 i/s
    method: 19820412.7 i/s - 1.14x (± 0.00) slower
  ```

  This commit replaces the method lookup with a new CMP basic op, which gives the examples above equivalent performance.

  After:

  ```
  Comparison:
    method: 24022466.5 i/s
    conditional: 23851094.2 i/s - same-ish: difference falls within error
  ```

  Relevant benchmarks show an improvement to Array#max and Array#min when not using the optimized newarray_max instruction as well. They are noticeably faster for small arrays with the relevant types, and the same or maybe a touch faster on larger arrays.

  ```
  $ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_min
  $ make benchmark COMPARE_RUBY=<master@5958c305> ITEM=array_max
  ```

  The benchmarks added in this commit also look generally improved.

  Co-authored-by: John Hawthorn <jhawthorn@github.com>
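The reason an "is `<=>` overridden?" check exists at all is that `[x, y].max` must give the same answer as the conditional for any comparable values, including user-defined ones. A minimal sketch of that equivalence follows; `Version` and the two helper methods are hypothetical names used only for illustration.

```ruby
# Hypothetical value type with its own <=>.
Version = Struct.new(:major, :minor) do
  include Comparable

  def <=>(other)
    [major, minor] <=> [other.major, other.minor]
  end
end

def larger_by_conditional(a, b)
  a > b ? a : b
end

def larger_by_array_max(a, b)
  [a, b].max
end

old_release = Version.new(1, 9)
new_release = Version.new(2, 0)

from_conditional = larger_by_conditional(old_release, new_release)
from_array_max   = larger_by_array_max(old_release, new_release)
small_ints       = larger_by_array_max(1, 2)
```

Both helpers pick `new_release` here, and for plain integers they agree as well; the commit concerns only how quickly the built-in cases are recognised, not the result.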
* Increment max_iv_count on class based on number of set_iv in initialize (#6788) | Jemma Issroff | 2022-11-22 | 1 | -0/+5

  We can loosely predict the number of ivar sets on a class based on the number of iv set instructions in the initialize method. This should give us a more accurate estimate to use for initial size pool allocation, which should in turn give us more cache hits.
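The idea of counting ivar-set instructions in `initialize` can be approximated from Ruby itself by scanning the disassembly. This is a rough sketch for CRuby only, not how the VM computes `max_iv_count`; `Point` and `count_ivar_sets_in_initialize` are hypothetical, and the helper returns `nil` when no bytecode is available.

```ruby
class Point
  def initialize(x, y)
    @x = x
    @y = y
    @label = nil
  end
end

# Hypothetical helper: count setinstancevariable instructions in a
# class's initialize method by scanning its disassembly text.
def count_ivar_sets_in_initialize(klass)
  return nil unless defined?(RubyVM::InstructionSequence)

  iseq = RubyVM::InstructionSequence.of(klass.instance_method(:initialize))
  return nil if iseq.nil? # no Ruby-level bytecode for this method

  iseq.disasm.scan(/\bsetinstancevariable\b/).size
end

estimate = count_ivar_sets_in_initialize(Point)
```

The estimate is loose in the same way the commit message says: ivars assigned conditionally, or in methods other than `initialize`, are not reflected in it.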
* Refactor obj_ivar_set and vm_setivar | Peter Zhu | 2022-11-21 | 1 | -27/+1

  obj_ivar_set and vm_setivar_slowpath are essentially doing the same thing, but the code is duplicated and not quite implemented in the same way, which could cause bugs. This commit refactors vm_setivar_slowpath to use obj_ivar_set.
* Using UNDEF_P macro | S-H-GAMELINKS | 2022-11-16 | 1 | -15/+15
* Remove numiv from RObject | Jemma Issroff | 2022-11-10 | 1 | -4/+2

  Since object shapes store the capacity of an object, we no longer need the numiv field on RObjects. This gives us one extra slot which we can use to give embedded objects one more instance variable (for a total of 3 ivs).

  This commit removes the concept of numiv from RObject.
* Transition shape when object's capacity changes | Jemma Issroff | 2022-11-10 | 1 | -30/+22

  This commit adds a `capacity` field to shapes, and adds shape transitions whenever an object's capacity changes. Objects which are allocated out of a bigger size pool will also make a transition from the root shape to the shape with the correct capacity for their size pool when they are allocated.

  This commit will allow us to remove numiv from objects completely, and will also mean we can guarantee that if two objects share shapes, their IVs are in the same positions (an embedded and extended object cannot share shapes). This will enable us to implement ivar sets in YJIT using object shapes.

  Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
* Implement object shapes for T_CLASS and T_MODULE (#6637) | John Hawthorn | 2022-10-31 | 1 | -7/+21

  * Avoid RCLASS_IV_TBL in marshal.c
  * Avoid RCLASS_IV_TBL for class names
  * Avoid RCLASS_IV_TBL for autoload
  * Avoid RCLASS_IV_TBL for class variables
  * Avoid copying RCLASS_IV_TBL onto ICLASSes
  * Use object shapes for Class and Module IVs
* push dummy frame for loading process | Koichi Sasada | 2022-10-20 | 1 | -1/+31

  This patch pushes dummy frames when loading code, for profiling purposes.

  The following methods push a dummy frame:

  * `Kernel#require`
  * `Kernel#load`
  * `RubyVM::InstructionSequence.compile_file`
  * `RubyVM::InstructionSequence.load_from_binary`

  https://bugs.ruby-lang.org/issues/18559
* More precisely iterate over Object instance variables | Aaron Patterson | 2022-10-15 | 1 | -2/+4

  Shapes provides us with an (almost) exact count of instance variables. We only need to check for Qundef when an IV has been "undefined".

  Prefer to use ROBJECT_IV_COUNT when iterating IVs.
* Initialize shape attr index also in non-markable CC | Nobuyoshi Nakada | 2022-10-12 | 1 | -19/+5
* Adjust indents [ci skip] | Nobuyoshi Nakada | 2022-10-12 | 1 | -117/+122
* Do not read cached_id from callcache on stack | Yusuke Endoh | 2022-10-12 | 1 | -1/+7

  The inline cache is initialized by vm_cc_attr_index_set only when vm_cc_markable(cc). However, vm_getivar attempted to read the cache even if the cc is not vm_cc_markable. This caused a condition that depends on uninitialized value.

  Here is an output of valgrind:

  ```
  ==10483== Conditional jump or move depends on uninitialised value(s)
  ==10483==    at 0x4C1D60: vm_getivar (vm_insnhelper.c:1171)
  ==10483==    by 0x4C1D60: vm_call_ivar (vm_insnhelper.c:3257)
  ==10483==    by 0x4E8E48: vm_call_symbol (vm_insnhelper.c:3481)
  ==10483==    by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035)
  ==10483==    by 0x4C62B2: vm_exec_core (insns.def:820)
  ==10483==    by 0x4DD519: rb_vm_exec (vm.c:0)
  ==10483==    by 0x4F00B3: invoke_block (vm.c:1417)
  ==10483==    by 0x4F00B3: invoke_iseq_block_from_c (vm.c:1473)
  ==10483==    by 0x4F00B3: invoke_block_from_c_bh (vm.c:1491)
  ==10483==    by 0x4D42B6: rb_yield (vm_eval.c:0)
  ==10483==    by 0x259128: rb_ary_each (array.c:2733)
  ==10483==    by 0x4E8730: vm_call_cfunc_with_frame (vm_insnhelper.c:3227)
  ==10483==    by 0x4EAD8C: vm_sendish (vm_insnhelper.c:5035)
  ==10483==    by 0x4C6254: vm_exec_core (insns.def:801)
  ==10483==    by 0x4DD519: rb_vm_exec (vm.c:0)
  ==10483==
  ```

  In fact, the CI on FreeBSD 12 started failing since ad63b668e22e21c352b852f3119ae98a7acf99f1.

  ```
  gmake[1]: Entering directory '/usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby'
  /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:924:in `complete': undefined method `complete' for nil:NilClass (NoMethodError)
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1816:in `block in visit'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `reverse_each'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1815:in `visit'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1847:in `block in complete'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `catch'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1846:in `complete'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1640:in `block in parse_in_order'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `catch'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1632:in `parse_in_order'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1626:in `order!'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1732:in `permute!'
  	from /usr/home/chkbuild/chkbuild/tmp/build/20221011T163003Z/ruby/lib/optparse.rb:1757:in `parse!'
  	from ./ext/extmk.rb:359:in `parse_args'
  	from ./ext/extmk.rb:396:in `<main>'
  ```

  This change adds a guard to read the cache only when vm_cc_markable(cc). It might be better to initialize the cache as INVALID_SHAPE_ID when the cc is not vm_cc_markable.
* Make inline cache reads / writes atomic with object shapes | Jemma Issroff | 2022-10-11 | 1 | -71/+77

  Prior to this commit, we were reading and writing ivar index and shape ID in inline caches in two separate instructions when getting and setting ivars. This meant there was a race condition with ractors and these caches where one ractor could change a value in the cache while another was still reading from it.

  This commit instead reads and writes shape ID and ivar index to inline caches atomically so there is no longer a race condition.

  Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
  Co-Authored-By: John Hawthorn <john@hawthorn.email>
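One common way to make a two-field cache entry readable and writable in a single step is to pack both fields into one word. The sketch below shows that technique in Ruby as an assumption about the general approach, not a description of the actual C code; the 32-bit split and all names are hypothetical.

```ruby
# Hypothetical layout: high bits hold the shape id, low 32 bits the ivar index.
INDEX_BITS = 32
INDEX_MASK = (1 << INDEX_BITS) - 1

def pack_cache_entry(shape_id, ivar_index)
  raise ArgumentError, "ivar index out of range" unless (0..INDEX_MASK).cover?(ivar_index)

  (shape_id << INDEX_BITS) | ivar_index
end

def unpack_cache_entry(word)
  [word >> INDEX_BITS, word & INDEX_MASK]
end

entry = pack_cache_entry(7, 3)
shape_id, ivar_index = unpack_cache_entry(entry)
```

Because the pair travels as one value, a reader can never observe the shape id of one write combined with the index of another, which is the inconsistency the commit message describes.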
* Revert "Revert "This commit implements the Object Shapes technique in CRuby."" | Jemma Issroff | 2022-10-11 | 1 | -146/+339

  This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
* Use the dedicated function to check arity | Nobuyoshi Nakada | 2022-10-01 | 1 | -4/+5
* Add macros for assertions | Nobuyoshi Nakada | 2022-10-01 | 1 | -3/+8
* Revert "This commit implements the Object Shapes technique in CRuby." | Aaron Patterson | 2022-09-30 | 1 | -339/+146

  This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
* Only assert ractor_shareable is consistent on ivar_set for T_OBJECT | Jemma Issroff | 2022-09-30 | 1 | -1/+1

  Before d594a5a8bd0756f65c078fcf5ce0098250cba141, we were only asserting that the value on an ivar_get was ractor_shareable if the object was a T_OBJECT and also ractor shareable. We should still be doing this check only if the object is a T_OBJECT and ractor shareable.
* This commit implements the Object Shapes technique in CRuby. | Jemma Issroff | 2022-09-28 | 1 | -146/+339

  Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape.

  For example:

  ```ruby
  class Foo
    def initialize
      # Starts with shape id 0
      @a = 1 # transitions to shape id 1
      @b = 1 # transitions to shape id 2
    end
  end

  class Bar
    def initialize
      # Starts with shape id 0
      @a = 1 # transitions to shape id 1
      @b = 1 # transitions to shape id 2
    end
  end

  foo = Foo.new # `foo` has shape id 2
  bar = Bar.new # `bar` has shape id 2
  ```

  Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order.

  This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers.

  This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details.

  For more context on Object Shapes, see [Feature: #18776]

  Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
  Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
  Co-Authored-By: John Hawthorn <john@hawthorn.email>
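The id numbering in the `Foo`/`Bar` example can be reproduced with a small tree keyed by ivar name. This is a toy model for illustration, assuming ids are handed out in creation order starting at 0; `ToyShapeIds` and its `Node` class are hypothetical and unrelated to the real `RubyVM::Shape`.

```ruby
# Toy model only: hypothetical names, not CRuby internals.
class ToyShapeIds
  class Node
    attr_reader :id, :edges

    def initialize(id)
      @id = id
      @edges = {} # ivar name => child Node
    end
  end

  attr_reader :root

  def initialize
    @next_id = 0
    @root = new_node # the root shape gets id 0
  end

  # Reuse the existing child for this ivar name, or create one.
  def transition(node, ivar_name)
    node.edges[ivar_name] ||= new_node
  end

  # Shape id reached after setting the given ivars in order.
  def shape_id_after(ivar_names)
    ivar_names.inject(@root) { |node, name| transition(node, name) }.id
  end

  private

  def new_node
    node = Node.new(@next_id)
    @next_id += 1
    node
  end
end

shapes = ToyShapeIds.new
foo_shape = shapes.shape_id_after(%i[@a @b]) # like Foo#initialize
bar_shape = shapes.shape_id_after(%i[@a @b]) # like Bar#initialize
swapped   = shapes.shape_id_after(%i[@b @a]) # same names, different order
```

`foo_shape` and `bar_shape` come out equal because the tree knows nothing about classes, while `swapped` lands on a different node; an inline cache keyed on the shape id therefore hits for both `Foo` and `Bar` instances.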
* Revert this until we can figure out WB issues or remove shapes from GC | Aaron Patterson | 2022-09-26 | 1 | -344/+146

  Revert "* expand tabs. [ci skip]"
  This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275.

  Revert "This commit implements the Object Shapes technique in CRuby."
  This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
* This commit implements the Object Shapes technique in CRuby. | Jemma Issroff | 2022-09-26 | 1 | -146/+344

  Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape.

  For example:

  ```ruby
  class Foo
    def initialize
      # Starts with shape id 0
      @a = 1 # transitions to shape id 1
      @b = 1 # transitions to shape id 2
    end
  end

  class Bar
    def initialize
      # Starts with shape id 0
      @a = 1 # transitions to shape id 1
      @b = 1 # transitions to shape id 2
    end
  end

  foo = Foo.new # `foo` has shape id 2
  bar = Bar.new # `bar` has shape id 2
  ```

  Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order.

  This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers.

  This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details.

  For more context on Object Shapes, see [Feature: #18776]

  Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
  Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
  Co-Authored-By: John Hawthorn <john@hawthorn.email>
* vm_method_cfunc_is: get rid of ANYARGS | 卜部昌平 | 2022-09-21 | 1 | -2/+31

  ANYARGS-ed function prototypes are basically prohibited in C23. Use __attribute__((__transparent_union__)) instead.
* cref_replace_with_duplicated_cref_each_frame: returns a pointer | 卜部昌平 | 2022-09-21 | 1 | -1/+1

  Why use FALSE here?
* vm_insnhelper.c: add casts | 卜部昌平 | 2022-09-21 | 1 | -4/+8

  Why were they not there in the first place? Siblings have proper casts.
* vm_objtostring: skip method lookup for T_STRING receivers | Jean Boussier | 2022-09-08 | 1 | -3/+6

  We don't need it, and in string interpolation context that's the common case.
* New constant caching insn: opt_getconstant_path | John Hawthorn | 2022-09-01 | 1 | -39/+43

  Previously YARV bytecode implemented constant caching by having a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments).

  This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss.

  This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference.

      $ ./miniruby --dump=insns -e '::Foo::Bar::Baz'
      == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
      0000 opt_getconstant_path <ic:0 ::Foo::Bar::Baz> ( 1)[Li]
      0002 leave

  The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata.

  This disassembly was done:

  * In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation.
  * In rb_iseq_free, to unregister the IC from the invalidation table.
  * In YJIT to find the position of an opt_getinlinecache instruction to invalidate it when the cache is populated.
  * In YJIT to register the constant names being used for invalidation.

  With this change we no longer need disassembly for these (in fact rb_iseq_each is now unused), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future.

  This may also reduce the size of iseqs, as previously each segment required 32 bytes (on 64-bit platforms) for each constant segment. This implementation only stores one ID per-segment.

  There should be no significant performance change between this and the previous implementation. Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf (it may raise/autoload/call const_missing) but it does not jump. These seem to even out.
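The "path stored as an array of IDs, with a marker for absolute references" idea can be sketched in Ruby. This is an illustration under simplifying assumptions: a leading `nil` stands in for idNULL, lookups use `Module#const_get` rather than the VM's lexical-scope rules, and `resolve_constant_path` and `ToyConstantCache` are hypothetical names.

```ruby
# Hypothetical helper: resolve a constant path given as an array of symbols.
# A leading nil plays the role of idNULL (an absolute ::Foo reference).
def resolve_constant_path(segments, scope = Object)
  names = segments.dup
  if names.first.nil?
    names.shift
    scope = Object
  end
  names.inject(scope) { |mod, name| mod.const_get(name) }
end

# Toy cache: one object both holds the path and caches the resolved value,
# so nothing has to be recovered by disassembling surrounding instructions.
class ToyConstantCache
  attr_reader :segments

  def initialize(segments)
    @segments = segments
    @filled = false
    @value = nil
  end

  def fetch
    unless @filled
      @value = resolve_constant_path(@segments)
      @filled = true
    end
    @value
  end

  def invalidate!
    @filled = false
  end
end

cache = ToyConstantCache.new([nil, :File, :Constants])
first_lookup  = cache.fetch # resolves ::File::Constants and fills the cache
second_lookup = cache.fetch # served from the cache
```

Keeping `segments` on the cache object mirrors the point made in the message: the names needed for invalidation are available directly from the cache entry.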
* YJIT: Implement concatarray in yjit (https://github.com/Shopify/ruby/pull/405) | Maple Ong | 2022-08-29 | 1 | -0/+10

  * Create code generation func
  * Make rb_vm_concat_array available to use in Rust
  * Map opcode to code gen func
  * Implement code gen for concatarray
  * Add test for concatarray
  * Use new asm backend
  * Add comment to C func wrapper
* Fix private methods reported as protected when called via Symbol#to_proc | Jean Boussier | 2022-08-25 | 1 | -0/+1

  Ref: bfa6a8ddc84fffe0aef5a0f91b417167e124dbbf
  Ref: [Bug #18826]
* Rename mjit_exec to jit_exec (#6262) | Takashi Kokubun | 2022-08-19 | 1 | -6/+5

  * Rename mjit_exec to jit_exec
  * Rename mjit_exec_slowpath to mjit_check_iseq
  * Remove mjit_exec references from comments
* Repalce to NIL_P macro | S-H-GAMELINKS | 2022-08-19 | 1 | -1/+1
* Only allow procs created by Symbol#to_proc to call public methods | Jeremy Evans | 2022-08-10 | 1 | -7/+29

  Fixes [Bug #18826]

  Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
* Fix inconsistency with opt_aref_with | John Hawthorn | 2022-08-04 | 1 | -1/+2

  opt_aref_with is an optimized instruction for accessing a Hash using a non-frozen string key (i.e. from a file without frozen_string_literal). It attempts to avoid allocating the string, and instead silently uses a frozen string (hash string keys are always fstrings).

  Because this is just an optimization, it should be invisible to the user. However, previously this optimization could be seen via hashes with default procs.

  For example, previously:

      h = Hash.new { |h, k| k.frozen? }
      str = "foo"
      h[str]   # false
      h["foo"] # true when optimizations enabled

  This commit checks that the Hash doesn't have a default proc when using opt_aref_with.
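Two facts from the message can be checked from Ruby without depending on which optimizations are active: a default proc receives the very key object used for the lookup, and an unfrozen String key is stored as a frozen copy. The helper names below are hypothetical, and `String.new` is used so the key is unfrozen regardless of any `frozen_string_literal` setting.

```ruby
# Hypothetical helper: a Hash whose default proc reports whether the
# looked-up key object was frozen.
def frozen_probe_hash
  Hash.new { |_hash, key| key.frozen? }
end

# Hypothetical helper: store a key and report whether the stored key is frozen.
def stored_key_frozen?(key)
  table = {}
  table[key] = :value
  table.keys.first.frozen?
end

str = String.new("foo")                    # explicitly unfrozen
seen_as_frozen = frozen_probe_hash[str]    # the default proc sees `str` itself
stored_frozen  = stored_key_frozen?(str)   # the Hash keeps a frozen copy
caller_frozen  = str.frozen?               # the caller's string is left alone
```

The literal-key case from the commit message (`h["foo"]`) is deliberately left out, since its result is exactly what differed between Ruby versions before and after this fix.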
* Adjust styles [ci skip] | Nobuyoshi Nakada | 2022-07-27 | 1 | -1/+2
* Expand tabs [ci skip] | Takashi Kokubun | 2022-07-21 | 1 | -945/+945

  [Misc #18891]
* Do not have class/module keywords look up ancestors of Object | Jeremy Evans | 2022-07-21 | 1 | -20/+4

  Fixes case where Object includes a module that defines a constant, then using class/module keyword to define the same constant on Object itself.

  Implements [Feature #18832]
* Extract vm_ic_entry API to mimic vm_cc behavior | Jemma Issroff | 2022-07-18 | 1 | -9/+6
* vm_opt_ltlt: call rb_str_buf_append directly if RHS is a String | Jean Boussier | 2022-07-06 | 1 | -1/+5

  `rb_str_concat` does a lot of type checking we can easily bypass.

  ```
  |               |compare-ruby|built-ruby|
  |:--------------|-----------:|---------:|
  |string_concat  |    362.007k|  398.965k|
  |               |           -|     1.10x|
  ```
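At the Ruby level, `String#<<` accepts more than one kind of right-hand side, which is presumably part of the type checking the message refers to. A small sketch of the two cases follows; `append_all` is a hypothetical helper.

```ruby
# Hypothetical helper: append each part to buffer with <<.
# A String part is appended as-is; an Integer part is treated as a codepoint.
def append_all(buffer, parts)
  parts.each { |part| buffer << part }
  buffer
end

result = append_all(String.new("a"), ["b", 99]) # 99 is the codepoint of "c"
```

The String case is the common one in practice, which is why giving it a direct path is worthwhile.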
* Fix empty call cache check for debug counter | Nobuyoshi Nakada | 2022-07-03 | 1 | -1/+1
* YJIT: Refactor gen_opt_mod (#6078) | Dave Schwantes | 2022-06-30 | 1 | -6/+0

  Refactor gen_opt_mod in YJIT
* Allow method caching of protected FCALLs | John Hawthorn | 2022-06-21 | 1 | -4/+4
* Don't check protected method ancestry on fcall | John Hawthorn | 2022-06-21 | 1 | -1/+1

  If we are making an FCALL, we know we are calling a method on self. This is the same check made for private method visibility, so it should also guarantee we can call a protected method.
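The visibility rule behind this change can be seen in ordinary Ruby: a protected method is callable with an implicit receiver (the FCALL case) and with an explicit receiver of the same class, but not from unrelated code. `Account` and `protected_call_rejected?` are hypothetical names for this sketch.

```ruby
class Account
  def initialize(balance)
    @balance = balance
  end

  def own_balance
    balance # implicit receiver: this is the FCALL case
  end

  def richer_than?(other)
    balance > other.balance # explicit receiver of the same class is allowed
  end

  protected

  def balance
    @balance
  end
end

# Hypothetical helper: true when calling the protected method from
# outside the class hierarchy raises NoMethodError.
def protected_call_rejected?(account)
  account.balance
  false
rescue NoMethodError
  true
end

alice = Account.new(10)
bob = Account.new(5)
comparison = alice.richer_than?(bob)
rejected = protected_call_rejected?(alice)
```

Since an implicit-receiver call always targets `self`, the ancestry check that protected calls normally need is trivially satisfied, which is the observation the commit relies on.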
* Allow calling protected methods from refinements | John Hawthorn | 2022-06-16 | 1 | -6/+16

  Previously protected methods on refinements could never be called because they were seen as being "defined" on the hidden refinement ICLASS.

  This commit updates calling refined protected methods so that they are considered to be defined on the original class (the one being refined).

  This ended up using the same behaviour that was used to check whether a call to super was allowed, so I extracted that into a method.

  [Bug #18806]
* Fix use-after-free with interacting TracePoints | Alan Wu | 2022-05-30 | 1 | -5/+18

  `vm_trace_hook()` runs global hooks before running local hooks. Previously, we read the local hook list before running the global hooks which led to use-after-free when a global hook frees the local hook list. A global hook can do this by disabling a local TracePoint, for example.

  Delay local hook list loading until after running the global hooks.

  Issue discovered by Jeremy Evans in GH-5862.

  [Bug #18730]
* Remove unnecessary module flag, add module assertions to other module flags | Jemma Issroff | 2022-05-23 | 1 | -1/+1