summaryrefslogtreecommitdiff
path: root/vm_backtrace.c
Commit message (Collapse)AuthorAgeFilesLines
* Fix compiler warning for uninitialized variablePeter Zhu2022-02-221-1/+1
| | | | | | | Fixes this compiler warning: warning: 'loc' may be used uninitialized in this function [-Wmaybe-uninitialized] bt_yield_loc(loc - cfunc_counter, cfunc_counter, btobj);
* Add Thread.each_caller_locationJeremy Evans2022-02-171-5/+37
| | | | | | | | This method takes a block and yields Thread::Backtrace::Location objects to the block. It does not take arguments, and always starts at the default frame that caller_locations would start at. Implements [Feature #16663]
* [DOC] Document Thread::Backtrace.limitzverok2021-12-201-1/+55
|
* Make RubyVM::AbstractSyntaxTree.of raise for backtrace location in evalYusuke Endoh2021-12-191-7/+12
| | | | | | | | | This check is needed to fix a bug of error_highlight when NameError occurred in eval'ed code. https://github.com/ruby/error_highlight/pull/16 The same check for proc/method has been already introduced since 64ac984129a7a4645efe5ac57c168ef880b479b2.
* ast.c: Use kept script_lines data instead of re-opening the source file (#5019)Yusuke Endoh2021-10-261-11/+11
| | | ast.c: Use kept script_lines data instead of re-open the source file
* Using NIL_P macro instead of `== Qnil`S.H2021-10-031-3/+3
|
* Make backtrace generation work outward from current frameJeremy Evans2021-08-061-295/+181
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes multiple bugs found in the partial backtrace optimization added in 3b24b7914c16930bfadc89d6aff6326a51c54295. These bugs occurs when passing a start argument to caller where the start argument lands on a iseq frame without a pc. Before this commit, the following code results in the same line being printed twice, both for the #each method. ```ruby def a; [1].group_by { b } end def b; puts(caller(2, 1).first, caller(3, 1).first) end a ``` After this commit and in Ruby 2.7, the lines are different, with the first line being for each and the second for group_by. Before this commit, the following code can either segfault or result in an infinite loop: ```ruby def foo caller_locations(2, 1).inspect # segfault caller_locations(2, 1)[0].path # infinite loop end 1.times.map { 1.times.map { foo } } ``` After this commit, this code works correctly. This commit completely refactors the backtrace handling. Instead of processing the backtrace from the outermost frame working in, process it from the innermost frame working out. This is much faster for partial backtraces, since you only access the control frames you need to in order to construct the backtrace. To handle cfunc frames in the new design, they start out with no location information. We increment a counter for each cfunc frame added. When an iseq frame with pc is accessed, after adding the iseq backtrace location, we use the location for the iseq backtrace location for all of the directly preceding cfunc backtrace locations. If the last backtrace line is a cfunc frame, we continue scanning for iseq frames until the end control frame, and use the location information from the first one for the trailing cfunc frames in the backtrace. As only rb_ec_partial_backtrace_object uses the new backtrace implementation, remove all of the function pointers and inline the functions. This makes the process easier to understand. Restore the Ruby 2.7 implementation of backtrace_each and use it for all the other functions that called backtrace_each other than rb_ec_partial_backtrace_object. All other cases requested the entire backtrace, so there is no advantage of using the new algorithm for those. Additionally, there are implicit assumptions in the other code that the backtrace processing works inward instead of outward. Remove the cfunc/iseq union in rb_backtrace_location_t, and remove the prev_loc member for cfunc. Both cfunc and iseq types can now have iseq and pc entries, so the location information can be accessed the same way for each. This avoids the need for a extra backtrace location entry to store an iseq backtrace location if the final entry in the backtrace is a cfunc. This is also what fixes the segfault and infinite loop issues in the above bugs. Here's Ruby pseudocode for the new algorithm, where start and length are the arguments to caller or caller_locations: ```ruby end_cf = VM.end_control_frame.next cf = VM.start_control_frame size = VM.num_control_frames - 2 bt = [] cfunc_counter = 0 if length.nil? || length > size length = size end while cf != end_cf && bt.size != length if cf.iseq? if cf.instruction_pointer? if start > 0 start -= 1 else bt << cf.iseq_backtrace_entry cf_counter.times do |i| bt[-1 - i].loc = cf.loc end cfunc_counter = 0 end end elsif cf.cfunc? if start > 0 start -= 1 else bt << cf.cfunc_backtrace_entry cfunc_counter += 1 end end cf = cf.prev end if cfunc_counter > 0 while cf != end_cf if (cf.iseq? && cf.instruction_pointer?) cf_counter.times do |i| bt[-i].loc = cf.loc end end cf = cf.prev end end ``` With the following benchmark, which uses a call depth of around 100 (common in many Ruby applications): ```ruby class T def test(depth, &block) if depth == 0 yield self else test(depth - 1, &block) end end def array Array.new end def first caller_locations(1, 1) end def full caller_locations end end t = T.new t.test((ARGV.first || 100).to_i) do Benchmark.ips do |x| x.report ('caller_loc(1, 1)') {t.first} x.report ('caller_loc') {t.full} x.report ('Array.new') {t.array} x.compare! end end ``` Results before commit: ``` Calculating ------------------------------------- caller_loc(1, 1) 281.159k (_ 0.7%) i/s - 1.426M in 5.073055s caller_loc 15.836k (_ 2.1%) i/s - 79.450k in 5.019426s Array.new 1.852M (_ 2.5%) i/s - 9.296M in 5.022511s Comparison: Array.new: 1852297.5 i/s caller_loc(1, 1): 281159.1 i/s - 6.59x (_ 0.00) slower caller_loc: 15835.9 i/s - 116.97x (_ 0.00) slower ``` Results after commit: ``` Calculating ------------------------------------- caller_loc(1, 1) 562.286k (_ 0.8%) i/s - 2.858M in 5.083249s caller_loc 16.402k (_ 1.0%) i/s - 83.200k in 5.072963s Array.new 1.853M (_ 0.1%) i/s - 9.278M in 5.007523s Comparison: Array.new: 1852776.5 i/s caller_loc(1, 1): 562285.6 i/s - 3.30x (_ 0.00) slower caller_loc: 16402.3 i/s - 112.96x (_ 0.00) slower ``` This shows that the speed of caller_locations(1, 1) has roughly doubled, and the speed of caller_locations with no arguments has improved slightly. So this new algorithm is significant faster, much simpler, and fixes bugs in the previous algorithm. Fixes [Bug #18053]
* Using RBOOL macroS.H2021-08-021-6/+1
|
* Enable USE_ISEQ_NODE_ID by defaultYusuke Endoh2021-06-181-5/+5
| | | | | | | | ... which is formally called EXPERIMENTAL_ISEQ_NODE_ID. See also ff69ef27b06eed1ba750e7d9cab8322f351ed245. https://bugs.ruby-lang.org/issues/17930
* Make it possible to get AST::Node from Thread::Backtrace::LocationYusuke Endoh2021-06-181-3/+66
| | | | | | | | | RubyVM::AST.of(Thread::Backtrace::Location) returns a node that corresponds to the location. Typically, the node is a method call, but not always. This change also includes iseq's dump/load support of node_ids for each instructions.
* Remove LOCATION_TYPE_ISEQ_CALCED state from Backtrace::LocationYusuke Endoh2021-06-181-35/+14
| | | | | | | | | | | | | | | | | Previously Backtrace::Location had two possible states: LOCATION_TYPE_ISEQ and LOCATION_TYPE_ISEQ_CALCED. The former had the location information as PC, and the latter had it as lineno. Once lineno was caluculated, the state was changed to LOCATION_TYPE_ISEQ_CALCED and the caluculated result was kept. This change removes LOCATION_TYPE_ISEQ_CALCED, so lineno is calculated whenever it is needed. It will be slow a little, but lineno is typically needed only when its backtrace is shown, so I believe that it does not matter. This is a preparation to add column information to Backtrace::Location because PC is needed to caluculate node_id for AST::Node even after lineno is calculated. This change is approved by ko1.
* Adjust styles [ci skip]Nobuyoshi Nakada2021-06-171-6/+11
| | | | | | | | | * --braces-after-func-def-line * --dont-cuddle-else * --procnames-start-lines * --space-after-for * --space-after-if * --space-after-while
* Ensure that caller respects the start argumentJeremy Evans2021-03-241-2/+50
| | | | | | | | | | | | | | | | | | | | | | | Previously, if there were ignored frames (iseq without pc), we could go beyond the requested start frame. This has two changes: 1) Ensure that we don't look beyond the start frame by using last_cfp = RUBY_VM_PREVIOUS_CONTROL_FRAME(last_cfp) until the desired start frame is reached. 2) To fix the failures caused by change 1), which occur when a limited number of frames is requested, scan the VM stack before allocating backtrace frames, looking for ignored frames. This is complicated if there are ignored frames before and after the start, in which case we need to scan until the start frame, and then scan backwards, decrementing the start value until we get to the point where start will result in the number of requested frames. This fixes a Rails test failure. Jean Boussier was able to to produce a failing test case outside of Rails. Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
* Fix backtrace to not skip frames with iseq without pcJeremy Evans2021-02-191-7/+9
| | | | | | | | | | | | | | | | | | Previously, frames with iseq but no pc were skipped (even before the refactoring in 3b24b7914c16930bfadc89d6aff6326a51c54295). Because the entire backtrace was procesed before the refactoring, this was handled by using later frames instead. However, after the refactoring, we need to handle those frames or they get lost. Keep two iteration counters when iterating, one for the desired backtrace size (so we generate the desired number of frames), and one for the actual backtrace size (so we don't process off the end of the stack). When skipping over an iseq frame with no pc, decrement the counter for the desired backtrace, so it will continue to process the expected number of backtrace frames. Fixes [Bug #17581]
* Added Thread::Backtrace.limit [Feature #17479]Nobuyoshi Nakada2021-02-151-0/+8
|
* Ignore <internal: entries from core library methods for Kernel#warn(message, ↵Benoit Daloze2020-10-261-9/+49
| | | | | | uplevel: n) * Fixes [Bug #17259]
* Document difference between Thread::Backtrace::Location#{,absolute_}pathJeremy Evans2020-09-231-2/+5
| | | | They are usually the same, except for locations in the main script.
* Improve performance of partial backtracesJeremy Evans2020-08-271-81/+155
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, backtrace_each fully populated the rb_backtrace_t with all backtrace frames, even if caller only requested a partial backtrace (e.g. Kernel#caller_locations(1, 1)). This changes backtrace_each to only add the requested frames to the rb_backtrace_t. To do this, backtrace_each needs to be passed the starting frame and number of frames values passed to Kernel#caller or #caller_locations. backtrace_each works from the top of the stack to the bottom, where the bottom is the current frame. Due to how the location for cfuncs is tracked using the location of the previous iseq, we need to store an extra frame for the previous iseq if we are limiting the backtrace and final backtrace frame (the first one stored) would be a cfunc and not an iseq. To limit the amount of work in this case, while scanning until the start of the requested backtrace, for each iseq, store the cfp. If the first backtrace frame we care about is a cfunc, use the stored cfp to find the related iseq. Use a function pointer to handle the storage of the cfp in the iteration arg, and also store the location of the extra frame in the iteration arg. backtrace_each needs to return int instead of void in order to signal when a starting frame larger than backtrace size is given, as caller and caller_locations needs to return nil and not the empty array in these cases. To handle cases where a range is provided with a negative end, and the backtrace size is needed to calculate the result to pass to rb_range_beg_len, add a backtrace_size static function to calculate the size, which copies the logic from backtrace_each. As backtrace_each only adds the backtrace lines requested, backtrace_to_*_ary can be simplified to always operate on the entire backtrace. Previously, caller_locations(1,1) was about 6.2 times slower for an 800 deep callstack compared to an empty callstack. With this new approach, it is only 1.3 times slower. It will always be somewhat slower as it still needs to scan the cfps from the top of the stack until it finds the first requested backtrace frame. This initializes the backtrace memory to zero. I do not think this is necessary, as from my analysis, nothing during the setting of the backtrace entries can cause a garbage collection, but it seems the safest approach, and it's unlikely the performance decrease is significant. This removes the rb_backtrace_t backtrace_base member. backtrace and backtrace_base were initialized to the same value, and neither is modified, so it doesn't make sense to have two pointers. This also removes LOCATION_TYPE_IFUNC from vm_backtrace.c, as the value is never set. Fixes [Bug #17031]
* Expose ec -> backtrace (internal) and use it to implement fiber backtrace.Samuel Williams2020-08-181-0/+10
| | | | See <https://bugs.ruby-lang.org/issues/16815> for more details.
* Revert "Improve performance of partial backtraces"Jeremy Evans2020-08-121-159/+71
| | | | | | | This reverts commit f2d7461e85053cb084e10999b0b8019b0c29e66e. Some CI machines are reporting issues with backtrace_mark, so I'm going to revert this for now.
* Improve performance of partial backtracesJeremy Evans2020-08-121-71/+159
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, backtrace_each fully populated the rb_backtrace_t with all backtrace frames, even if caller only requested a partial backtrace (e.g. Kernel#caller_locations(1, 1)). This changes backtrace_each to only add the requested frames to the rb_backtrace_t. To do this, backtrace_each needs to be passed the starting frame and number of frames values passed to Kernel#caller or #caller_locations. backtrace_each works from the top of the stack to the bottom, where the bottom is the current frame. Due to how the location for cfuncs is tracked using the location of the previous iseq, we need to store an extra frame for the previous iseq if we are limiting the backtrace and final backtrace frame (the first one stored) would be a cfunc and not an iseq. To limit the amount of work in this case, while scanning until the start of the requested backtrace, for each iseq, store the cfp. If the first backtrace frame we care about is a cfunc, use the stored cfp to find the related iseq. Use a function pointer to handle the storage of the cfp in the iteration arg, and also store the location of the extra frame in the iteration arg. backtrace_each needs to return int instead of void in order to signal when a starting frame larger than backtrace size is given, as caller and caller_locations needs to return nil and not the empty array in these cases. To handle cases where a range is provided with a negative end, and the backtrace size is needed to calculate the result to pass to rb_range_beg_len, add a backtrace_size static function to calculate the size, which copies the logic from backtrace_each. As backtrace_each only adds the backtrace lines requested, backtrace_to_*_ary can be simplified to always operate on the entire backtrace. Previously, caller_locations(1,1) was about 6.2 times slower for an 800 deep callstack compared to an empty callstack. With this new approach, it is only 1.3 times slower. It will always be somewhat slower as it still needs to scan the cfps from the top of the stack until it finds the first requested backtrace frame. Fixes [Bug #17031]
* vm_backtrace.c: let rb_profile_frames show cfunc framesYusuke Endoh2020-07-281-4/+63
| | | | | ... in addition to normal iseq frames. It is sometimes useful to point the bottleneck more precisely.
* Add compaction support for backtrace objectsAaron Patterson2020-05-071-4/+32
| | | | This just introduces compaction support for backtrace objects.
* Fix rb_profile_frame_classpath to handle module singletonsJean Boussier2020-05-071-1/+1
| | | | | | Right now `SomeClass.method` is properly named, but `SomeModule.method` is displayed as `#<Module:0x000055eb5d95adc8>.method` which makes profiling annoying.
* decouple internal.h headers卜部昌平2019-12-261-5/+5
| | | | | | | | | | | | | | | | | | Saves comitters' daily life by avoid #include-ing everything from internal.h to make each file do so instead. This would significantly speed up incremental builds. We take the following inclusion order in this changeset: 1. "ruby/config.h", where _GNU_SOURCE is defined (must be the very first thing among everything). 2. RUBY_EXTCONF_H if any. 3. Standard C headers, sorted alphabetically. 4. Other system headers, maybe guarded by #ifdef 5. Everything else, sorted alphabetically. Exceptions are those win32-related headers, which tend not be self- containing (headers have inclusion order dependencies).
* Create backtrace location array directlyNobuyoshi Nakada2019-12-131-3/+3
|
* Let the backtrace array constructed in backtrace_collect be initialized with ↵Lourens Naudé2019-10-291-1/+1
| | | | the size already given
* avoid overflow in integer multiplication卜部昌平2019-10-091-1/+1
| | | | | | | This changeset basically replaces `ruby_xmalloc(x * y)` into `ruby_xmalloc2(x, y)`. Some convenient functions are also provided for instance `rb_xmalloc_mul_add(x, y, z)` which allocates x * y + z byes.
* Revert https://github.com/ruby/ruby/pull/2486卜部昌平2019-10-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commits: 10d6a3aca7 8ba48c1b85 fba8627dc1 dd883de5ba 6c6a25feca 167e6b48f1 7cb96d41a5 3207979278 595b3c4fdd 1521f7cf89 c11c5e69ac cf33608203 3632a812c0 f56506be0d 86427a3219 . The reason for the revert is that we observe ABA problem around inline method cache. When a cache misshits, we search for a method entry. And if the entry is identical to what was cached before, we reuse the cache. But the commits we are reverting here introduced situations where a method entry is freed, then the identical memory region is used for another method entry. An inline method cache cannot detect that ABA. Here is a code that reproduce such situation: ```ruby require 'prime' class << Integer alias org_sqrt sqrt def sqrt(n) raise end GC.stress = true Prime.each(7*37){} rescue nil # <- Here we populate CC class << Object.new; end # These adjacent remove-then-alias maneuver # frees a method entry, then immediately # reuses it for another. remove_method :sqrt alias sqrt org_sqrt end Prime.each(7*37).to_a # <- SEGV ```
* refactor constify most of rb_method_entry_t卜部昌平2019-09-301-2/+2
| | | | | | | | | | | Now that we have eliminated most destructive operations over the rb_method_entry_t / rb_callable_method_entry_t, let's make them mostly immutabe and mark them const. One exception is rb_export_method(), which destructively modifies visibilities of method entries. I have left that operation as is because I suspect that destructiveness is the nature of that function.
* drop-in type check for rb_define_global_function卜部昌平2019-08-291-2/+2
| | | | | | We can check the function pointer passed to rb_define_global_function like we do so in rb_define_method. It turns out that almost anybody is misunderstanding the API.
* * expand tabs.git2019-08-011-1/+1
|
* calc_lineno(): add assertions卜部昌平2019-08-011-11/+28
| | | | This function has a lot of assumptions. Should make them sure.
* Handle (empty) backtrace when thread is not born yet.Samuel Williams2019-06-191-0/+6
|
* * expand tabs.svn2019-03-211-1/+1
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67327 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Fix a wrong lineno in backtrace for cfuncmame2019-03-211-1/+1
| | | | | | | | lineno is an int, and INT2FIX(0) was assigned. [Bug #15719] [ruby-core:91911] git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67326 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Make some internal functions staticnobu2018-11-161-1/+1
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65764 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* vm_backtrace.c: pos can be zeroshyouhei2018-11-071-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (lldb) target create "./miniruby" Current executable set to './miniruby' (x86_64). (lldb) settings set -- target.run-args "-e0" (lldb) run Process 97005 launched: './miniruby' (x86_64) ./miniruby(rb_print_backtrace+0x15) [0x10024f7d5] vm_dump.c:715 ./miniruby(rb_vm_get_sourceline+0x85) [0x10024c4f5] vm_backtrace.c:43 ./miniruby(rb_vm_make_binding+0x146) [0x100236976] vm.c:941 ./miniruby(Init_VM+0x592) [0x100249f02] vm.c:3091 ./miniruby(rb_call_inits+0xc2) [0x1000c5a72] inits.c:58 ./miniruby(ruby_setup+0xcb) [0x100098c6b] eval.c:74 ./miniruby(ruby_init+0x9) [0x100098c99] eval.c:91 ./miniruby(main+0x4d) [0x10025ddbd] addr2line.c:246 Process 97005 stopped * thread #1: tid = 0x639bb, 0x000000010024c4f5 miniruby`rb_vm_get_sourceline(cfp=<unavailable>) + 133 at vm_backtrace.c:44, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0) frame #0: 0x000000010024c4f5 miniruby`rb_vm_get_sourceline(cfp=<unavailable>) + 133 at vm_backtrace.c:44 41 else { 42 /* SDR() is not possible; that causes infinite loop. */ 43 rb_print_backtrace(); -> 44 __builtin_trap(); 45 } 46 #endif 47 return rb_iseq_line_no(iseq, pos); (lldb) git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65598 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* escape all env properly.ko12018-09-211-0/+3
| | | | | | | | | | | | | | | | | | * vm_backtrace.c (rb_debug_inspector_open): escape all env using `rb_vm_stack_to_heap()` before making bindings. [Bug #15105] There is a complicated story of this issue: Without this patch, IFUNC frame does not escaped. A IFUNC frame points to CFUNC ep as previous ep. However, CFUNC ep can be escaped because of making bindings of Ruby level frames. IFUNC's ep can points to invalidated ep and `rb_iter_break()` will fail. This is why `any?` fails. * test/-ext-/debug/test_debug.rb: add a test. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64800 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Revert "Fix use of `rb_profile_frames` start parameter"tenderlove2018-04-271-1/+2
| | | | | | | | This reverts commit r63265. ko1 said I should not have committed this! I'm sorry! git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63267 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Fix use of `rb_profile_frames` start parametertenderlove2018-04-261-2/+1
| | | | | | | | | | | | | | | rb_profile_frames was always behaving as if the value given for the start parameter was 0. The reason for this was that it would check if (start > 0) { then continue without updating the control frame pointer or anything other than decrementing start. [ruby-core:86147] [Bug #14607] Co-authored-by: Dylan Thacker-Smith <Dylan.Smith@shopify.com> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63265 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* mjit_compile.c: merge initial JIT compilerk0kubun2018-02-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | which has been developed by Takashi Kokubun <takashikkbn@gmail> as YARV-MJIT. Many of its bugs are fixed by wanabe <s.wanabe@gmail.com>. This JIT compiler is designed to be a safe migration path to introduce JIT compiler to MRI. So this commit does not include any bytecode changes or dynamic instruction modifications, which are done in original MJIT. This commit even strips off some aggressive optimizations from YARV-MJIT, and thus it's slower than YARV-MJIT too. But it's still fairly faster than Ruby 2.5 in some benchmarks (attached below). Note that this JIT compiler passes `make test`, `make test-all`, `make test-spec` without JIT, and even with JIT. Not only it's perfectly safe with JIT disabled because it does not replace VM instructions unlike MJIT, but also with JIT enabled it stably runs Ruby applications including Rails applications. I'm expecting this version as just "initial" JIT compiler. I have many optimization ideas which are skipped for initial merging, and you may easily replace this JIT compiler with a faster one by just replacing mjit_compile.c. `mjit_compile` interface is designed for the purpose. common.mk: update dependencies for mjit_compile.c. internal.h: declare `rb_vm_insn_addr2insn` for MJIT. vm.c: exclude some definitions if `-DMJIT_HEADER` is provided to compiler. This avoids to include some functions which take a long time to compile, e.g. vm_exec_core. Some of the purpose is achieved in transform_mjit_header.rb (see `IGNORED_FUNCTIONS`) but others are manually resolved for now. Load mjit_helper.h for MJIT header. mjit_helper.h: New. This is a file used only by JIT-ed code. I'll refactor `mjit_call_cfunc` later. vm_eval.c: add some #ifdef switches to skip compiling some functions like Init_vm_eval. win32/mkexports.rb: export thread/ec functions, which are used by MJIT. include/ruby/defines.h: add MJIT_FUNC_EXPORTED macro alis to clarify that a function is exported only for MJIT. array.c: export a function used by MJIT. bignum.c: ditto. class.c: ditto. compile.c: ditto. error.c: ditto. gc.c: ditto. hash.c: ditto. iseq.c: ditto. numeric.c: ditto. object.c: ditto. proc.c: ditto. re.c: ditto. st.c: ditto. string.c: ditto. thread.c: ditto. variable.c: ditto. vm_backtrace.c: ditto. vm_insnhelper.c: ditto. vm_method.c: ditto. I would like to improve maintainability of function exports, but I believe this way is acceptable as initial merging if we clarify the new exports are for MJIT (so that we can use them as TODO list to fix) and add unit tests to detect unresolved symbols. I'll add unit tests of JIT compilations in succeeding commits. Author: Takashi Kokubun <takashikkbn@gmail.com> Contributor: wanabe <s.wanabe@gmail.com> Part of [Feature #14235] --- * Known issues * Code generated by gcc is faster than clang. The benchmark may be worse in macOS. Following benchmark result is provided by gcc w/ Linux. * Performance is decreased when Google Chrome is running * JIT can work on MinGW, but it doesn't improve performance at least in short running benchmark. * Currently it doesn't perform well with Rails. We'll try to fix this before release. --- * Benchmark reslts Benchmarked with: Intel 4.0GHz i7-4790K with 16GB memory under x86-64 Ubuntu 8 Cores - 2.0.0-p0: Ruby 2.0.0-p0 - r62186: Ruby trunk (early 2.6.0), before MJIT changes - JIT off: On this commit, but without `--jit` option - JIT on: On this commit, and with `--jit` option ** Optcarrot fps Benchmark: https://github.com/mame/optcarrot | |2.0.0-p0 |r62186 |JIT off |JIT on | |:--------|:--------|:--------|:--------|:--------| |fps |37.32 |51.46 |51.31 |58.88 | |vs 2.0.0 |1.00x |1.38x |1.37x |1.58x | ** MJIT benchmarks Benchmark: https://github.com/benchmark-driver/mjit-benchmarks (Original: https://github.com/vnmakarov/ruby/tree/rtl_mjit_branch/MJIT-benchmarks) | |2.0.0-p0 |r62186 |JIT off |JIT on | |:----------|:--------|:--------|:--------|:--------| |aread |1.00 |1.09 |1.07 |2.19 | |aref |1.00 |1.13 |1.11 |2.22 | |aset |1.00 |1.50 |1.45 |2.64 | |awrite |1.00 |1.17 |1.13 |2.20 | |call |1.00 |1.29 |1.26 |2.02 | |const2 |1.00 |1.10 |1.10 |2.19 | |const |1.00 |1.11 |1.10 |2.19 | |fannk |1.00 |1.04 |1.02 |1.00 | |fib |1.00 |1.32 |1.31 |1.84 | |ivread |1.00 |1.13 |1.12 |2.43 | |ivwrite |1.00 |1.23 |1.21 |2.40 | |mandelbrot |1.00 |1.13 |1.16 |1.28 | |meteor |1.00 |2.97 |2.92 |3.17 | |nbody |1.00 |1.17 |1.15 |1.49 | |nest-ntimes|1.00 |1.22 |1.20 |1.39 | |nest-while |1.00 |1.10 |1.10 |1.37 | |norm |1.00 |1.18 |1.16 |1.24 | |nsvb |1.00 |1.16 |1.16 |1.17 | |red-black |1.00 |1.02 |0.99 |1.12 | |sieve |1.00 |1.30 |1.28 |1.62 | |trees |1.00 |1.14 |1.13 |1.19 | |while |1.00 |1.12 |1.11 |2.41 | ** Discourse's script/bench.rb Benchmark: https://github.com/discourse/discourse/blob/v1.8.7/script/bench.rb NOTE: Rails performance was somehow a little degraded with JIT for now. We should fix this. (At least I know opt_aref is performing badly in JIT and I have an idea to fix it. Please wait for the fix.) *** JIT off Your Results: (note for timings- percentile is first, duration is second in millisecs) categories_admin: 50: 17 75: 18 90: 22 99: 29 home_admin: 50: 21 75: 21 90: 27 99: 40 topic_admin: 50: 17 75: 18 90: 22 99: 32 categories: 50: 35 75: 41 90: 43 99: 77 home: 50: 39 75: 46 90: 49 99: 95 topic: 50: 46 75: 52 90: 56 99: 101 *** JIT on Your Results: (note for timings- percentile is first, duration is second in millisecs) categories_admin: 50: 19 75: 21 90: 25 99: 33 home_admin: 50: 24 75: 26 90: 30 99: 35 topic_admin: 50: 19 75: 20 90: 25 99: 30 categories: 50: 40 75: 44 90: 48 99: 76 home: 50: 42 75: 48 90: 51 99: 89 topic: 50: 49 75: 55 90: 58 99: 99 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* internal.h: remove dependecy on ruby/encoding.hnobu2018-01-091-1/+2
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61713 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* remove `PUSH_TAG`/`EXEC_AG`/`POP_TAG`/`JUMO_TAG`.ko12017-12-061-1/+1
| | | | | | | | * eval_intern.h: remove non-`EC_` prefix *_TAG() macros. Use `EC_` prefix macros explicitly. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61040 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* make a func static.ko12017-11-161-2/+2
| | | | | | | | * vm_backtrace.c (rb_ec_backtrace_location_ary): make it static and remove `rb_` prefix. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60809 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* remove `trace` instruction. [Feature #14104]ko12017-11-141-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | * tool/instruction.rb: create `trace_` prefix instructions. * compile.c (ADD_TRACE): do not add `trace` instructions but add TRACE link elements. TRACE elements will be unified with a next instruction as instruction information. * vm_trace.c (update_global_event_hook): modify all ISeqs when hooks are enabled. * iseq.c (rb_iseq_trace_set): added to toggle `trace_` instructions. * vm_insnhelper.c (vm_trace): added. This function is a body of `trace_` prefix instructions. * vm_insnhelper.h (JUMP): save PC to a control frame. * insns.def (trace): removed. * vm_exec.h (INSN_ENTRY_SIG): add debug output (disabled). git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* Fix a typo [ci skip]kazu2017-11-101-1/+1
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60738 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* refactoring about source line.ko12017-11-101-1/+3
| | | | | | | | | * iseq.c (find_line_no): renamed to rb_iseq_line_no(). * vm_backtrace.c (calc_lineno): add a comment why we need to use "pos-1". git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60733 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* fix backtrace on argment error.ko12017-11-091-0/+19
| | | | | | | | | | | * vm_backtrace.c (rb_backtrace_use_iseq_first_lineno_for_last_location): added. It modifies last location's line as corresponding iseq's first line number. * vm_args.c (raise_argument_error): use added function. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60725 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
* * vm_backtrace.c (rb_debug_inspector_t): `th` -> `ec`.ko12017-11-071-11/+11
| | | | git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60680 b2dd03c8-39d4-4d8f-98ff-823fe69b080e