delta/ruby.git - github.com: ruby/ruby.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Functions defined in headers should be static inline	Nobuyoshi Nakada	2020-11-15	1	-1/+1
\|
*	ruby_vm_global_method_state is no longer needed.	Koichi Sasada	2020-10-14	1	-3/+0
\| \| \| \|	Now ruby_vm_global_method_state is not used so let's remove it.
*	Introduce Ractor mechanism for parallel execution	Koichi Sasada	2020-09-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit introduces Ractor mechanism to run Ruby program in parallel. See doc/ractor.md for more details about Ractor. See ticket [Feature #17100] to see the implementation details and discussions. [Feature #17100] This commit does not complete the implementation. You can find many bugs on using Ractor. Also the specification will be changed so that this feature is experimental. You will see a warning when you make the first Ractor with `Ractor.new`. I hope this feature can help programmers from thread-safety issues.
*	Extracted METHOD_ENTRY_CACHEABLE macro	Nobuyoshi Nakada	2020-06-30	1	-1/+1
\|
*	Use canary cond also if not VM_CHECK_MODE to suppress warnings	Nobuyoshi Nakada	2020-06-22	1	-2/+2
\|
*	Verify builtin inline annotation with VM_CHECK_MODE (#3244)	Takashi Kokubun	2020-06-21	1	-7/+7
\| \| \| \| \|	* Verify builtin inline annotation with VM_CHECK_MODE * Remove static to fix the link issue on MJIT
*	Suppress warnings no inline ruby debug (#3107)	Kenta Murata	2020-05-22	1	-1/+1
\| \| \| \| \|	* Suppress unused warnings occurred due to -fno-inline * Suppress warning occurred due to RUBY_DEBUG=1
*	add #include guard hack	卜部昌平	2020-04-13	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to MSVC manual (1), cl.exe can skip including a header file when that: - contains #pragma once, or - starts with #ifndef, or - starts with #if ! defined. GCC has a similar trick (2), but it acts more stricter (e. g. there must be _no tokens_ outside of #ifndef...#endif). Sun C lacked #pragma once for a looong time. Oracle Developer Studio 12.5 finally implemented it, but we cannot assume such recent version. This changeset modifies header files so that each of them include strictly one #ifndef...#endif. I believe this is the most portable way to trigger compiler optimizations. [Bug #16770] 1: https://docs.microsoft.com/en-us/cpp/preprocessor/once 2: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html
*	Reduce allocations for keyword argument hashes	Jeremy Evans	2020-03-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, passing a keyword splat to a method always allocated a hash on the caller side, and accepting arbitrary keywords in a method allocated a separate hash on the callee side. Passing explicit keywords to a method that accepted a keyword splat did not allocate a hash on the caller side, but resulted in two hashes allocated on the callee side. This commit makes passing a single keyword splat to a method not allocate a hash on the caller side. Passing multiple keyword splats or a mix of explicit keywords and a keyword splat still generates a hash on the caller side. On the callee side, if arbitrary keywords are not accepted, it does not allocate a hash. If arbitrary keywords are accepted, it will allocate a hash, but this commit uses a callinfo flag to indicate whether the caller already allocated a hash, and if so, the callee can use the passed hash without duplicating it. So this commit should make it so that a maximum of a single hash is allocated during method calls. To set the callinfo flag appropriately, method call argument compilation checks if only a single keyword splat is given. If only one keyword splat is given, the VM_CALL_KW_SPLAT_MUT callinfo flag is not set, since in that case the keyword splat is passed directly and not mutable. If more than one splat is used, a new hash needs to be generated on the caller side, and in that case the callinfo flag is set, indicating the keyword splat is mutable by the callee. In compile_hash, used for both hash and keyword argument compilation, if compiling keyword arguments and only a single keyword splat is used, pass the argument directly. On the caller side, in vm_args.c, the callinfo flag needs to be recognized and handled. Because the keyword splat argument may not be a hash, it needs to be converted to a hash first if not. Then, unless the callinfo flag is set, the hash needs to be duplicated. The temporary copy of the callinfo flag, kw_flag, is updated if a hash was duplicated, to prevent the need to duplicate it again. If we are converting to a hash or duplicating a hash, we need to update the argument array, which can including duplicating the positional splat array if one was passed. CALLER_SETUP_ARG and a couple other places needs to be modified to handle similar issues for other types of calls. This includes fairly comprehensive tests for different ways keywords are handled internally, checking that you get equal results but that keyword splats on the caller side result in distinct objects for keyword rest parameters. Included are benchmarks for keyword argument calls. Brief results when compiled without optimization: def kw(a: 1) a end def kws(kw) kw end h = {a: 1} kw(a: 1) # about same kw(h) # 2.37x faster kws(a: 1) # 1.30x faster kws(h) # 2.19x faster kw(a: 1, h) # 1.03x slower kw(h, h) # about same kws(a: 1, h) # 1.16x faster kws(h, **h) # 1.14x faster
*	Introduce disposable call-cache.	Koichi Sasada	2020-02-22	1	-11/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch contains several ideas: (1) Disposable inline method cache (IMC) for race-free inline method cache * Making call-cache (CC) as a RVALUE (GC target object) and allocate new CC on cache miss. * This technique allows race-free access from parallel processing elements like RCU. (2) Introduce per-Class method cache (pCMC) * Instead of fixed-size global method cache (GMC), pCMC allows flexible cache size. * Caching CCs reduces CC allocation and allow sharing CC's fast-path between same call-info (CI) call-sites. (3) Invalidate an inline method cache by invalidating corresponding method entries (MEs) * Instead of using class serials, we set "invalidated" flag for method entry itself to represent cache invalidation. * Compare with using class serials, the impact of method modification (add/overwrite/delete) is small. * Updating class serials invalidate all method caches of the class and sub-classes. * Proposed approach only invalidate the method cache of only one ME. See [Feature #16614] for more details.
*	VALUE size packed callinfo (ci).	Koichi Sasada	2020-02-22	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now, rb_call_info contains how to call the method with tuple of (mid, orig_argc, flags, kwarg). Most of cases, kwarg == NULL and mid+argc+flags only requires 64bits. So this patch packed rb_call_info to VALUE (1 word) on such cases. If we can not represent it in VALUE, then use imemo_callinfo which contains conventional callinfo (rb_callinfo, renamed from rb_call_info). iseq->body->ci_kw_size is removed because all of callinfo is VALUE size (packed ci or a pointer to imemo_callinfo). To access ci information, we need to use these functions: vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci). struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg. rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc() is temporary removed because cd->ci should be marked.
*	per-method serial number	卜部昌平	2019-12-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Methods and their definitions can be allocated/deallocated on-the-fly. One pathological situation is when a method is deallocated then another one is allocated immediately after that. Address of those old/new method entries/definitions can be the same then, depending on underlying malloc/free implementation. So pointer comparison is insufficient. We have to check the contents. To do so we introduce def->method_serial, which is an integer unique to that specific method definition. PS: Note that method_serial being uintptr_t rather than rb_serial_t is intentional. This is because rb_serial_t can be bigger than a pointer on a 32bit system (rb_serial_t is at least 64bit). In order to preserve old packing of struct rb_call_cache, rb_serial_t is inappropriate.
*	Define PREV_CLASS_SERIAL	John Hawthorn	2019-12-17	1	-0/+1
\| \| \| \| \| \| \| \|	Avoids genereating a "throwaway" sentinel class serial. There wasn't any read harm in doing so (we're at no risk of exhaustion and there'd be no measurable performance impact), but if feels cleaner that all class serials actually end up assigned and used (especially now that we won't overwrite them in a single method definition).
*	convert macros into inline functions	卜部昌平	2019-12-17	1	-10/+13
\| \| \| \|	For better readability.
*	ensure cc->def == cc->me->def	卜部昌平	2019-12-16	1	-0/+8
\| \| \| \| \| \| \|	The equation shall hold for every call cache. However prior to this changeset cc->me could be updated without also updating cc->def. Let's make it sure by introducing new macro named CC_SET_ME which sets cc->me and cc->def at once.
*	Combine call info and cache to speed up method invocation	Alan Wu	2019-10-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To perform a regular method call, the VM needs two structs, `rb_call_info` and `rb_call_cache`. At the moment, we allocate these two structures in separate buffers. In the worst case, the CPU needs to read 4 cache lines to complete a method call. Putting the two structures together reduces the maximum number of cache line reads to 2. Combining the structures also saves 8 bytes per call site as the current layout uses separate two pointers for the call info and the call cache. This saves about 2 MiB on Discourse. This change improves the Optcarrot benchmark at least 3%. For more details, see attached bugs.ruby-lang.org ticket. Complications: - A new instruction attribute `comptime_sp_inc` is introduced to calculate SP increase at compile time without using call caches. At compile time, a `TS_CALLDATA` operand points to a call info struct, but at runtime, the same operand points to a call data struct. Instruction that explicitly define `sp_inc` also need to define `comptime_sp_inc`. - MJIT code for copying call cache becomes slightly more complicated. - This changes the bytecode format, which might break existing tools. [Misc #16258]
*	Merge pull request #2422 from jeremyevans/rb_keyword_given_p	Jeremy Evans	2019-09-03	1	-0/+1
\| \| \|	Add rb_keyword_given_p to the C-API
*	Don't pass an empty keyword hash when double splatting empty hash	Jeremy Evans	2019-08-30	1	-0/+1
\|
*	Initialize vm_throw_data::throw_state as int	Nobuyoshi Nakada	2019-07-25	1	-2/+4
\| \| \| \| \| \| \| \|	As `struct vm_throw_data::throw_state` is initialized as `VALUE` by rb_imemo_new, extended MSW part is assigned to it on LP64 big-endian platforms. Fix up 1feda1c2b091b950efcaa481a11fd660efa9e717
*	constify again.	Koichi Sasada	2019-07-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Same as last commit, make some fields `const`. include/ruby/ruby.h: * Rasic::klass * RArray::heap::aux::shared_root * RRegexp::src internal.h: * rb_classext_struct::origin_, redefined_class * vm_svar::cref_or_me, lastline, backref, others * vm_throw_data::throw_obj * vm_ifunc::data * MEMO::v1, v2, u3::value While modifying this patch, I found write-barrier miss on rb_classext_struct::redefined_class. Also vm_throw_data::throw_state is only `int` so change the type.
*	Make export declaration place more consistent	Takashi Kokubun	2019-07-14	1	-2/+0
\|
*	MJIT Support for getblockparamproxy	Takashi Kokubun	2019-07-14	1	-0/+2
\|
*	Share vm_call_iseq_optimizable_p to reduce copy-paste	k0kubun	2019-03-21	1	-0/+9
\| \| \| \|	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67329 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	on-smash canary detection	shyouhei	2019-02-01	1	-3/+11
\| \| \| \| \| \| \| \| \|	In addition to detect dead canary, we try to detect the very moment when we smash the stack top. Requested by k0kubun: https://twitter.com/k0kubun/status/1085180749899194368 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66981 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	vm.inc now in C99	shyouhei	2019-01-25	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This changeset modifies the VM generator so that vm.inc is written in C99. Also added some comments in _insn_entry.erb so that the intention of each parts to be made more clear. I think this improves overall readability of the generated VM. Confirmed that the exact same binary is generated before/after this changeset. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66923 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	vm_insnhelper.c: delete unused macros	shyouhei	2018-12-28	1	-48/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- FIXNUM_2_P: moved to vm_insnhelper.c because that is the only place this macro is used. - FLONUM_2_P: ditto. - FLOAT_HEAP_P: not used anywhere. - FLOAT_INSTANCE_P: ditto. - GET_TOS: ditto. - USE_IC_FOR_SPECIALIZED_METHOD: ditto. - rb_obj_hidden_p: ditto. - REG_A: ditto. - REG_B: ditto. - GET_CONST_INLINE_CACHE: ditto. - vm_regan_regtype: moved inside of VM_COLLECT_USAGE_DETAILS because that os the only place this enum is used. - vm_regan_acttype: ditto. - GET_GLOBAL: used only once. Removed with replacing that usage. - SET_GLOBAL: ditto. - rb_method_definition_create: declaration moved to vm_insnhelper.c because that is the only place this declaration makes sense. - rb_method_definition_set: ditto. - rb_method_definition_eq: ditto. - rb_make_no_method_exception: ditto. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66597 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	insns.def: refactor to avoid CALL_METHOD macro	shyouhei	2018-12-26	1	-36/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These send and its variant instructions are the most frequently called paths in the entire process. Reducing macro expansions to make them dedicated function called vm_sendish() is the main goal of this changeset. It reduces the size of vm_exec_coref from 25,552 bytes to 23,728 bytes on my machine. I see no significant slowdown. Fix: [GH-2056] vanilla: ruby 2.6.0dev (2018-12-19 trunk 66449) [x86_64-darwin15] ours: ruby 2.6.0dev (2018-12-19 refactor-send 66449) [x86_64-darwin15] last_commit=insns.def: refactor to avoid CALL_METHOD macro Calculating ------------------------------------- vanilla ours vm2_defined_method 2.645M 2.823M i/s - 6.000M times in 5.109888s 4.783254s vm2_method 8.553M 8.873M i/s - 6.000M times in 1.579892s 1.524026s vm2_method_missing 3.772M 3.858M i/s - 6.000M times in 3.579482s 3.499220s vm2_method_with_block 8.494M 8.944M i/s - 6.000M times in 1.589774s 1.509463s vm2_poly_method 0.571 0.607 i/s - 1.000 times in 3.947570s 3.733528s vm2_poly_method_ov 5.514 5.168 i/s - 1.000 times in 0.408156s 0.436169s vm3_clearmethodcache 2.875 2.837 i/s - 1.000 times in 0.783018s 0.793493s Comparison: vm2_defined_method ours: 2822555.4 i/s vanilla: 2644878.1 i/s - 1.07x slower vm2_method ours: 8872947.8 i/s vanilla: 8553433.1 i/s - 1.04x slower vm2_method_missing ours: 3858192.3 i/s vanilla: 3772296.3 i/s - 1.02x slower vm2_method_with_block ours: 8943825.1 i/s vanilla: 8493955.0 i/s - 1.05x slower vm2_poly_method ours: 0.6 i/s vanilla: 0.6 i/s - 1.06x slower vm2_poly_method_ov vanilla: 5.5 i/s ours: 5.2 i/s - 1.07x slower vm3_clearmethodcache vanilla: 2.9 i/s ours: 2.8 i/s - 1.01x slower git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66565 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	vm_insnhelper.h: rename CI_SET_FASTPATH to CC_SET_FASTPATH	k0kubun	2018-09-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	because it's actually setting fastpath to cc instead of ci since r51903. vm_insnhelper.c: ditto mjit_compile.c: ditto tool/ruby_vm/views/_mjit_compile_send.erb: ditto git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64772 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	move ADD_PC around (take 2)	shyouhei	2018-09-14	1	-7/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we can say for sure if an instruction calls a method or not internally, it is now possible to reroute the bugs that forced us to revert the "move PC around" optimization. First try: r62051 Reverted: r63763 See also: r63999 ---- trunk: ruby 2.6.0dev (2018-09-13 trunk 64736) [x86_64-darwin15] ours: ruby 2.6.0dev (2018-09-13 trunk 64736) [x86_64-darwin15] last_commit=move ADD_PC around (take 2) Calculating ------------------------------------- trunk ours so_ackermann 1.884 2.278 i/s - 1.000 times in 0.530926s 0.438935s so_array 1.178 1.157 i/s - 1.000 times in 0.848786s 0.864467s so_binary_trees 0.176 0.177 i/s - 1.000 times in 5.683895s 5.657707s so_concatenate 0.220 0.221 i/s - 1.000 times in 4.546896s 4.518949s so_count_words 6.729 6.470 i/s - 1.000 times in 0.148602s 0.154561s so_exception 3.324 3.688 i/s - 1.000 times in 0.300872s 0.271147s so_fannkuch 0.546 0.968 i/s - 1.000 times in 1.831328s 1.033376s so_fasta 0.541 0.547 i/s - 1.000 times in 1.849923s 1.827091s so_k_nucleotide 0.800 0.777 i/s - 1.000 times in 1.250635s 1.286295s so_lists 2.101 1.848 i/s - 1.000 times in 0.475954s 0.541095s so_mandelbrot 0.435 0.408 i/s - 1.000 times in 2.299328s 2.450535s so_matrix 1.946 1.912 i/s - 1.000 times in 0.513872s 0.523076s so_meteor_contest 0.311 0.317 i/s - 1.000 times in 3.219297s 3.152052s so_nbody 0.746 0.703 i/s - 1.000 times in 1.339815s 1.423441s so_nested_loop 0.899 0.901 i/s - 1.000 times in 1.111767s 1.109555s so_nsieve 0.559 0.579 i/s - 1.000 times in 1.787763s 1.726552s so_nsieve_bits 0.435 0.428 i/s - 1.000 times in 2.296282s 2.333852s so_object 1.368 1.442 i/s - 1.000 times in 0.731237s 0.693684s so_partial_sums 0.616 0.546 i/s - 1.000 times in 1.623592s 1.833097s so_pidigits 0.831 0.832 i/s - 1.000 times in 1.203117s 1.202334s so_random 2.934 2.724 i/s - 1.000 times in 0.340791s 0.367150s so_reverse_complement 0.583 0.866 i/s - 1.000 times in 1.714144s 1.154615s so_sieve 1.829 2.081 i/s - 1.000 times in 0.546607s 0.480562s so_spectralnorm 0.524 0.558 i/s - 1.000 times in 1.908716s 1.792382s git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64737 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	vm_insnhelper.h: drop OPT_CALL_FASTPATH macro support	k0kubun	2018-09-13	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \|	because cc->call is NULL by default and it is not overridden by vm_search_super_method if OPT_CALL_FASTPATH is 0. So this macro is not just a switch for optimization but now it's mandatory. vm_insnhelper.c: cosmetic change. Use boolean-ish `TRUE` instead of 1 to specify `enabled` flag. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64735 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	Revert "vm_insnhelper.h: simplify EXEC_EC_CFP implementation"	k0kubun	2018-09-13	1	-2/+6
\| \| \| \| \| \| \|	This reverts commit r64711, because EXEC_EC_CFP on JIT-ed code does not call jit_func with the patch when catch_except_p is true. It wasn't intentional. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64730 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	vm_insnhelper.h: simplify EXEC_EC_CFP implementation	k0kubun	2018-09-13	1	-6/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and possibly memory access for iseq->body may be reduced. No significant impact for performance on Optcarrot. * before fps: 55.03865935187656 fps: 57.16854675983188 fps: 57.672458407661765 fps: 58.28989837869383 fps: 58.80503815099268 fps: 59.068054176528534 fps: 59.55736806358244 fps: 61.01018920533034 fps: 63.34167049232186 fps: 65.20575018321766 fps: 65.46758316561318 * after fps: 55.21860411005677 fps: 55.34840351179166 fps: 58.23666596747484 fps: 59.71987124578901 fps: 61.131485120234935 fps: 61.279905164649485 fps: 61.66060774175459 fps: 64.11215576508765 fps: 64.63699742853154 fps: 65.28260058920769 fps: 65.85447796482678 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64711 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	move canary-related statements into macros	shyouhei	2018-09-13	1	-0/+21
\| \| \| \| \| \| \| \|	This is mostly cosmetic. Should generate a slightly readable vm.inc output. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64709 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	mjit.c: initial support for mswin MJIT	k0kubun	2018-08-07	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	By this commit's changes in other files, now MJIT started to work on VC++. Unfortunately some features are still broken and they'll be fixed later. This also suppresses cl.exe's default output to stdout because there seems to be no option to do it. Tweaking some log messages as well. vm_core.h: declare `__declspec(dllimport)` to export them correctly on mswin. vm_insnhelper.h: ditto mjit.h: ditto test_jit.rb: skipped some pending tests. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64221 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	mjit_compile.c: reduce sp motion on JIT	k0kubun	2018-07-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This retries r62655, which was reverted at r63863 for r63763. tool/ruby_vm/views/_mjit_compile_insn.erb: revert the revert. tool/ruby_vm/views/_mjit_compile_insn_body.erb: ditto. tool/ruby_vm/views/_mjit_compile_pc_and_sp.erb: ditto. tool/ruby_vm/views/_mjit_compile_send.erb: ditto. tool/ruby_vm/views/mjit_compile.inc.erb: ditto. tool/ruby_vm/views/_insn_entry.erb: revert half of r63763. The commit was originally reverted since changing pc motion was bad for tracing, but changing sp motion was totally fine. For JIT, I wanna resurrect the sp motion change in r62051. tool/ruby_vm/models/bare_instructions.rb: ditto. insns.def: ditto. vm_insnhelper.c: ditto. vm_insnhelper.h: ditto. * benchmark $ benchmark-driver benchmark.yml --rbenv 'before;after;before --jit;after --jit' --repeat-count 12 -v before: ruby 2.6.0dev (2018-07-19 trunk 63998) [x86_64-linux] after: ruby 2.6.0dev (2018-07-19 add-sp 63998) [x86_64-linux] last_commit=mjit_compile.c: reduce sp motion on JIT before --jit: ruby 2.6.0dev (2018-07-19 trunk 63998) +JIT [x86_64-linux] after --jit: ruby 2.6.0dev (2018-07-19 add-sp 63998) +JIT [x86_64-linux] last_commit=mjit_compile.c: reduce sp motion on JIT Calculating ------------------------------------- before after before --jit after --jit Optcarrot Lan_Master.nes 51.354 50.238 70.010 72.139 fps Comparison: Optcarrot Lan_Master.nes after --jit: 72.1 fps before --jit: 70.0 fps - 1.03x slower before: 51.4 fps - 1.40x slower after: 50.2 fps - 1.44x slower git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63999 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	remove assertion for hidden objects.	ko1	2018-06-27	1	-1/+1
\| \| \| \| \| \| \| \| \|	* vm_insnhelper.h (PUSH): r63763 increase "PUSH()" op and it reveals that there are hidden objects (at least T_IMEMO/iseq objects) are located on VM stack. Remove this check and I'll revisit it later. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63765 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	give up insn attr handles_frame	shyouhei	2018-06-27	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	I introduced this mechanism in r62051 to speed things up. Later it was reported that the change causes problems. I searched for workarounds but nothing seemed appropriate. I hereby officially give it up. The idea to move ADD_PC around was a mistake. Fixes [Bug #14809] and [Bug #14834]. Signed-off-by: Urabe, Shyouhei <shyouhei@ruby-lang.org> git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63763 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	NULL class Data_Wrap_Struct is not allowed.	ko1	2018-06-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	* spec/ruby/optional/capi/typed_data_spec.rb: same as r63692. * spec/ruby/optional/capi/ext/typed_data_spec.c: ditto. * vm_insnhelper.h (PUSH): re-enable assertion to check hidden objects. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63694 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	remvoe assertion because rubyspec hit this assert	ko1	2018-06-18	1	-1/+1
\| \| \| \|	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63688 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	hidden objects should not be pushed.	ko1	2018-06-18	1	-1/+12
\| \| \| \| \| \| \| \|	* vm_insnhelper.h (PUSH): hidden objects (klass == 0) should not be pushed to a VM value stack. Add assertion for it. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63687 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	mjit_compile.c: use local variables for stack	k0kubun	2018-03-04	1	-2/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	if catch_except_p is FALSE. If catch_except_p is TRUE, stack values should be on VM's stack when exception is thrown and the JIT-ed frame is re-executed by VM's exception handler. If it's FALSE, the JIT-ed frame won't be re-executed and don't need to keep values on VM's stack. Using local variables allows us to reduce cfp->sp motion. Moving cfp->sp is needed only for insns whose handles_frame? is false. So it improves performance. _mjit_compile_insn.erb: Prepare `stack_size` variable for GET_SP, STACK_ADDR_FROM_TOP, TOPN macros. Share pc and sp motion partial view. Use cancel handler created in mjit_compile.c. _mjit_compile_send.erb: ditto. Also, when iseq->body->catch_except_p is TRUE, this stops to call mjit_exec directly. I described the reason in vm_insnhelper.h's comment for EXEC_EC_CFP. _mjit_compile_pc_and_sp.erb: Shared logic for moving sp and pc. As you can see from thsi file, when status->local_stack_p is TRUE and insn.handles_frame? is false, moving sp is skipped. But if insn.handles_frame? is true, values should be rolled back to VM's stack. common.mk: add dependency for the file _mjit_compile_insn_body.erb: Set sp value before canceling JIT on DISPATCH_ORIGINAL_INSN. Replace GET_SP, STACK_ADDR_FROM_TOP, TOPN macros for the case ocal_stack_p is TRUE and insn.handles_frame? is false. In that case, values are not available on VM's stack and those macros should be replaced. mjit_compile.inc.erb: updated comments of macros which are supported by JIT compiler. All references to `cfp->sp` should be replaced and thus INC_SP, SET_SV, PUSH are no longer supported for now, because they are not used now. vm_exec.h: moved EXEC_EC_CFP definition to vm_insnhelper.h because it's tighly coupled to CALL_METHOD. vm_insnhelper.h: Have revised EXEC_EC_CFP definition moved from vm_exec.h. Now it triggers mjit_exec for VM, and has the guard for catch_except_p on JIT-ed code. See comments for details. CALL_METHOD delegates triggering mjit_exec to EXEC_EC_CFP. insns.def: Stopped using EXEC_EC_CFP for the case we don't want to trigger mjit_exec. Those insns (defineclass, opt_call_c_function) are not supported by JIT and it's safe to use RESTORE_REGS(), NEXT_INSN(). expandarray is changed to pass GET_SP() to replace the macro in _mjit_compile_insn_body.erb. vm_insnhelper.c: change to take sp for the above reason. [close https://github.com/ruby/ruby/pull/1828] This patch resurrects the performance which was attached in [Feature #14235]. * Benchmark Optcarrot (with configuration for benchmark_driver.gem) https://github.com/benchmark-driver/optcarrot $ benchmark-driver benchmark.yml --verbose 1 --rbenv 'before;before+JIT::before,--jit;after;after+JIT::after,--jit' --repeat-count 10 before: ruby 2.6.0dev (2018-03-04 trunk 62652) [x86_64-linux] before+JIT: ruby 2.6.0dev (2018-03-04 trunk 62652) +JIT [x86_64-linux] after: ruby 2.6.0dev (2018-03-04 local-variable.. 62652) [x86_64-linux] last_commit=mjit_compile.c: use local variables for stack after+JIT: ruby 2.6.0dev (2018-03-04 local-variable.. 62652) +JIT [x86_64-linux] last_commit=mjit_compile.c: use local variables for stack Calculating ------------------------------------- before before+JIT after after+JIT optcarrot 53.552 59.680 53.697 63.358 fps Comparison: optcarrot after+JIT: 63.4 fps before+JIT: 59.7 fps - 1.06x slower after: 53.7 fps - 1.18x slower before: 53.6 fps - 1.18x slower git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62655 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	vm.c: add mjit_enable_p flag	k0kubun	2018-03-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to count up total calls properly. Some places (especially CALL_METHOD) invoke mjit_exec twice for one method call. It would be problematic when debugging, or possibly it would result in a wrong profiling result. This commit doesn't have impact for performance: * Optcarrot benchmark before fps: 59.37757770848619 fps: 56.49998488958699 fps: 59.07900362739362 fps: 58.924749807695996 fps: 57.667905665594894 fps: 57.540021018385254 fps: 59.5518055679647 fps: 55.93831555148311 fps: 57.82685112863262 fps: 59.22391754481736 checksum: 59662 after fps: 58.461881158098194 fps: 59.32685183081354 fps: 54.11334310279802 fps: 59.2281560439788 fps: 58.60495705318312 fps: 55.696478648491045 fps: 58.49003452654724 fps: 58.387771929393224 fps: 59.24156772816439 fps: 56.68804731968107 checksum: 59662 * Discourse Your Results: (note for timings- percentile is first, duration is second in millisecs) before (without JIT) categories_admin: 50: 16 75: 17 90: 24 99: 37 home_admin: 50: 20 75: 20 90: 24 99: 42 topic_admin: 50: 16 75: 16 90: 18 99: 28 categories: 50: 36 75: 37 90: 45 99: 68 home: 50: 38 75: 40 90: 53 99: 92 topic: 50: 14 75: 15 90: 17 99: 26 after (without JIT) categories_admin: 50: 16 75: 16 90: 24 99: 36 home_admin: 50: 19 75: 20 90: 23 99: 41 topic_admin: 50: 16 75: 16 90: 19 99: 33 categories: 50: 35 75: 36 90: 44 99: 61 home: 50: 38 75: 40 90: 52 99: 101 topic: 50: 14 75: 15 90: 15 99: 24 before (with JIT) categories_admin: 50: 19 75: 23 90: 29 99: 44 home_admin: 50: 24 75: 26 90: 32 99: 46 topic_admin: 50: 20 75: 22 90: 27 99: 44 categories: 50: 41 75: 43 90: 51 99: 66 home: 50: 46 75: 49 90: 56 99: 68 topic: 50: 18 75: 19 90: 22 99: 31 after (with JIT) categories_admin: 50: 18 75: 21 90: 28 99: 42 home_admin: 50: 23 75: 25 90: 31 99: 51 topic_admin: 50: 19 75: 20 90: 24 99: 31 categories: 50: 41 75: 44 90: 52 99: 69 home: 50: 45 75: 48 90: 61 99: 88 topic: 50: 19 75: 20 90: 24 99: 33 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62641 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	prefixed functions exported for mjit	nobu	2018-02-17	1	-2/+6
\| \| \| \|	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62445 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	mjit_compile.c: share the definition of macros	k0kubun	2018-02-06	1	-0/+8
\| \| \| \| \| \| \| \| \|	(IS_ARGS_SPLAT, IS_ARGS_KEYWORD) with vm_args.c. vm_args.c: share them with mjit_compile.c. vm_insnhelper.h: get those definitions, with CALLER_SETUP_ARG too git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62254 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	mjit_compile.c: merge initial JIT compiler	k0kubun	2018-02-04	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	which has been developed by Takashi Kokubun <takashikkbn@gmail> as YARV-MJIT. Many of its bugs are fixed by wanabe <s.wanabe@gmail.com>. This JIT compiler is designed to be a safe migration path to introduce JIT compiler to MRI. So this commit does not include any bytecode changes or dynamic instruction modifications, which are done in original MJIT. This commit even strips off some aggressive optimizations from YARV-MJIT, and thus it's slower than YARV-MJIT too. But it's still fairly faster than Ruby 2.5 in some benchmarks (attached below). Note that this JIT compiler passes `make test`, `make test-all`, `make test-spec` without JIT, and even with JIT. Not only it's perfectly safe with JIT disabled because it does not replace VM instructions unlike MJIT, but also with JIT enabled it stably runs Ruby applications including Rails applications. I'm expecting this version as just "initial" JIT compiler. I have many optimization ideas which are skipped for initial merging, and you may easily replace this JIT compiler with a faster one by just replacing mjit_compile.c. `mjit_compile` interface is designed for the purpose. common.mk: update dependencies for mjit_compile.c. internal.h: declare `rb_vm_insn_addr2insn` for MJIT. vm.c: exclude some definitions if `-DMJIT_HEADER` is provided to compiler. This avoids to include some functions which take a long time to compile, e.g. vm_exec_core. Some of the purpose is achieved in transform_mjit_header.rb (see `IGNORED_FUNCTIONS`) but others are manually resolved for now. Load mjit_helper.h for MJIT header. mjit_helper.h: New. This is a file used only by JIT-ed code. I'll refactor `mjit_call_cfunc` later. vm_eval.c: add some #ifdef switches to skip compiling some functions like Init_vm_eval. win32/mkexports.rb: export thread/ec functions, which are used by MJIT. include/ruby/defines.h: add MJIT_FUNC_EXPORTED macro alis to clarify that a function is exported only for MJIT. array.c: export a function used by MJIT. bignum.c: ditto. class.c: ditto. compile.c: ditto. error.c: ditto. gc.c: ditto. hash.c: ditto. iseq.c: ditto. numeric.c: ditto. object.c: ditto. proc.c: ditto. re.c: ditto. st.c: ditto. string.c: ditto. thread.c: ditto. variable.c: ditto. vm_backtrace.c: ditto. vm_insnhelper.c: ditto. vm_method.c: ditto. I would like to improve maintainability of function exports, but I believe this way is acceptable as initial merging if we clarify the new exports are for MJIT (so that we can use them as TODO list to fix) and add unit tests to detect unresolved symbols. I'll add unit tests of JIT compilations in succeeding commits. Author: Takashi Kokubun <takashikkbn@gmail.com> Contributor: wanabe <s.wanabe@gmail.com> Part of [Feature #14235] --- * Known issues * Code generated by gcc is faster than clang. The benchmark may be worse in macOS. Following benchmark result is provided by gcc w/ Linux. * Performance is decreased when Google Chrome is running * JIT can work on MinGW, but it doesn't improve performance at least in short running benchmark. * Currently it doesn't perform well with Rails. We'll try to fix this before release. --- * Benchmark reslts Benchmarked with: Intel 4.0GHz i7-4790K with 16GB memory under x86-64 Ubuntu 8 Cores - 2.0.0-p0: Ruby 2.0.0-p0 - r62186: Ruby trunk (early 2.6.0), before MJIT changes - JIT off: On this commit, but without `--jit` option - JIT on: On this commit, and with `--jit` option Optcarrot fps Benchmark: https://github.com/mame/optcarrot \| \|2.0.0-p0 \|r62186 \|JIT off \|JIT on \| \|:--------\|:--------\|:--------\|:--------\|:--------\| \|fps \|37.32 \|51.46 \|51.31 \|58.88 \| \|vs 2.0.0 \|1.00x \|1.38x \|1.37x \|1.58x \| MJIT benchmarks Benchmark: https://github.com/benchmark-driver/mjit-benchmarks (Original: https://github.com/vnmakarov/ruby/tree/rtl_mjit_branch/MJIT-benchmarks) \| \|2.0.0-p0 \|r62186 \|JIT off \|JIT on \| \|:----------\|:--------\|:--------\|:--------\|:--------\| \|aread \|1.00 \|1.09 \|1.07 \|2.19 \| \|aref \|1.00 \|1.13 \|1.11 \|2.22 \| \|aset \|1.00 \|1.50 \|1.45 \|2.64 \| \|awrite \|1.00 \|1.17 \|1.13 \|2.20 \| \|call \|1.00 \|1.29 \|1.26 \|2.02 \| \|const2 \|1.00 \|1.10 \|1.10 \|2.19 \| \|const \|1.00 \|1.11 \|1.10 \|2.19 \| \|fannk \|1.00 \|1.04 \|1.02 \|1.00 \| \|fib \|1.00 \|1.32 \|1.31 \|1.84 \| \|ivread \|1.00 \|1.13 \|1.12 \|2.43 \| \|ivwrite \|1.00 \|1.23 \|1.21 \|2.40 \| \|mandelbrot \|1.00 \|1.13 \|1.16 \|1.28 \| \|meteor \|1.00 \|2.97 \|2.92 \|3.17 \| \|nbody \|1.00 \|1.17 \|1.15 \|1.49 \| \|nest-ntimes\|1.00 \|1.22 \|1.20 \|1.39 \| \|nest-while \|1.00 \|1.10 \|1.10 \|1.37 \| \|norm \|1.00 \|1.18 \|1.16 \|1.24 \| \|nsvb \|1.00 \|1.16 \|1.16 \|1.17 \| \|red-black \|1.00 \|1.02 \|0.99 \|1.12 \| \|sieve \|1.00 \|1.30 \|1.28 \|1.62 \| \|trees \|1.00 \|1.14 \|1.13 \|1.19 \| \|while \|1.00 \|1.12 \|1.11 \|2.41 \| Discourse's script/bench.rb Benchmark: https://github.com/discourse/discourse/blob/v1.8.7/script/bench.rb NOTE: Rails performance was somehow a little degraded with JIT for now. We should fix this. (At least I know opt_aref is performing badly in JIT and I have an idea to fix it. Please wait for the fix.) * JIT off Your Results: (note for timings- percentile is first, duration is second in millisecs) categories_admin: 50: 17 75: 18 90: 22 99: 29 home_admin: 50: 21 75: 21 90: 27 99: 40 topic_admin: 50: 17 75: 18 90: 22 99: 32 categories: 50: 35 75: 41 90: 43 99: 77 home: 50: 39 75: 46 90: 49 99: 95 topic: 50: 46 75: 52 90: 56 99: 101 *** JIT on Your Results: (note for timings- percentile is first, duration is second in millisecs) categories_admin: 50: 19 75: 21 90: 25 99: 33 home_admin: 50: 24 75: 26 90: 30 99: 35 topic_admin: 50: 19 75: 20 90: 25 99: 30 categories: 50: 40 75: 44 90: 48 99: 76 home: 50: 42 75: 48 90: 51 99: 89 topic: 50: 49 75: 55 90: 58 99: 99 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62197 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	mjit.c: merge MJIT infrastructure	k0kubun	2018-02-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	that allows to JIT-compile Ruby methods by generating C code and using C compiler. See the first comment of mjit.c to know what this file does. mjit.c is authored by Vladimir Makarov <vmakarov@redhat.com>. After he invented great method JIT infrastructure for MRI as MJIT, Lars Kanis <lars@greiz-reinsdorf.de> sent the patch to support MinGW in MJIT. In addition to merging it, I ported pthread to Windows native threads. Now this MJIT infrastructure can be compiled on Visual Studio. This commit simplifies mjit.c to decrease code at initial merge. For example, this commit does not provide multiple JIT threads support. We can resurrect them later if we really want them, but I wanted to minimize diff to make it easier to review this patch. `/tmp/_mjitXXX` file is renamed to `/tmp/_ruby_mjitXXX` because non-Ruby developers may not know the name "mjit" and the file name should make sure it's from Ruby and not from some harmful programs. TODO: it may be better to store this to some temporary directory which Ruby is already using by Tempfile, if it's not bad for performance. mjit.h: New. It has `mjit_exec` interface similar to `vm_exec`, which is for triggering MJIT. This drops interface for AOT compared to the original MJIT. Makefile.in: define macros to let MJIT know the path of MJIT header. Probably we can refactor this to reduce the number of macros (TODO). win32/Makefile.sub: ditto. common.mk: compile mjit.o and mjit_compile.o. Unlike original MJIT, this commit separates MJIT infrastructure and JIT compiler code as independent object files. As initial patch is NOT going to have ultra-fast JIT compiler, it's likely to replace JIT compiler, e.g. original MJIT's compiler or some future JIT impelementations which are not public now. inits.c: define MJIT module. This is added because `MJIT.enabled?` was necessary for testing. test/lib/zombie_hunter.rb: skip if `MJIT.enabled?`. Obviously this wouldn't work with current code when JIT is enabled. test/ruby/test_io.rb: skip this too. This would make no sense with MJIT. ruby.c: define MJIT CLI options. As major difference from original MJIT, "-j:l"/"--jit:llvm" are renamed to "--jit-cc" because I want to support not only gcc/clang but also cl.exe (Visual Studio) in the future. But it takes only "--jit-cc=gcc", "--jit-cc=clang" for now. And only long "--jit" options are allowed since some Ruby committers preferred it at Ruby developers Meeting on January, and some of options are renamed. This file also triggers to initialize MJIT thread and variables. eval.c: finalize MJIT worker thread and variables. test/ruby/test_rubyoptions.rb: fix number of CLI options for --jit. thread_pthread.c: change for pthread abstraction in MJIT. Prefix rb_ for functions which are used by other files. thread_win32.c: ditto, for Windows. Those pthread porting is one of major works that YARV-MJIT created, which is my fork of MJIT, in Feature 14235. thread.c: follow rb_ prefix changes vm.c: trigger MJIT call on VM invocation. Also trigger `mjit_mark` to avoid SEGV by race between JIT and GC of ISeq. The improvement was provided by wanabe <s.wanabe@gmail.com>. In JIT compiler I created and am going to add in my next commit, I found that having `mjit_exec` after `vm_loop_start:` is harmful because the JIT-ed function doesn't proceed other ISeqs on RESTORE_REGS of leave insn. Executing non-FINISH frame is unexpected for my JIT compiler and `exception_handler` triggers executions of such ISeqs. So `mjit_exec` here should be executed only when it directly comes from `vm_exec` call. `RubyVM::MJIT` module and `.enabled?` method is added so that we can skip some tests which don't expect JIT threads or compiler file descriptors. vm_insnhelper.h: trigger MJIT on method calls during VM execution. vm_core.h: add fields required for mjit.c. `bp` must be `cfp[6]` because rb_control_frame_struct is likely to be casted to another struct. The last position is the safest place to add the new field. vm_insnhelper.c: save initial value of cfp->ep as cfp->bp. This is an optimization which are done in both MJIT and YARV-MJIT. So this change is added in this commit. Calculating bp from ep is a little heavy work, so bp is kind of cache for it. iseq.c: notify ISeq GC to MJIT. We should know which iseq in MJIT queue is GCed to avoid SEGV. TODO: unload some GCed units in some safe way. gc.c: add hooks so that MJIT can wait GC, and vice versa. Simultaneous JIT and GC executions may cause SEGV and so we should synchronize them. cont.c: save continuation information in MJIT worker. As MJIT shouldn't unload JIT-ed code which is being used, MJIT wants to know full list of saved execution contexts for continuation and detect ISeqs in use. mjit_compile.c: added empty JIT compiler so that you can reuse this commit to build your own JIT compiler. This commit tries to compile ISeqs but all of them are considered as not supported in this commit. So you can't use JIT compiler in this commit yet while we added --jit option now. Patch author: Vladimir Makarov <vmakarov@redhat.com>. Contributors: Takashi Kokubun <takashikkbn@gmail.com>. wanabe <s.wanabe@gmail.com>. Lars Kanis <lars@greiz-reinsdorf.de>. Part of Feature 12589 and 14235. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62189 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	bare_instructions.rb: sp_inc is signed	nobu	2018-01-30	1	-15/+0
\| \| \| \| \| \| \| \| \| \| \|	* tool/ruby_vm/models/bare_instructions.rb (predefine_attributes): `sp_inc` attribute which may return negative values must be signed `rb_snum_t`, to be signed-expanded at type promotion. * vm_insnhelper.h (ADJ_SP): removed the workaround for platforms where rb_num_t is wider than int. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62103 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	eliminate CALL_SIMPLE_METHOD	shyouhei	2018-01-29	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Arrange operands of several opt_something insns so that jumps to opt_send_without_block can be applied to them. This makes it possible to eliminate CALL_SIMPLE_METHOD macro at all. Results in binary size of vm_exec_core to change from 27,008 bytes to 26,016 bytes on my machine. [close GH-1779] Note however that PC can point somewhere non-instruction now. ----------------------------------------------------------- benchmark results: minimum results in each 3 measurements. Execution time (sec) name before after so_ackermann 0.450 0.426 so_array 0.789 0.824 so_binary_trees 5.760 5.635 so_concatenate 3.594 3.508 so_count_words 0.211 0.196 so_exception 0.256 0.244 so_fannkuch 1.049 1.044 so_fasta 1.485 1.472 so_k_nucleotide 1.195 1.216 so_lists 0.517 0.513 so_mandelbrot 2.264 2.394 so_matrix 0.501 0.468 so_meteor_contest 2.987 2.912 so_nbody 1.307 1.289 so_nested_loop 0.908 0.925 so_nsieve 1.679 1.614 so_nsieve_bits 2.131 2.092 so_object 0.620 0.625 so_partial_sums 1.623 1.675 so_pidigits 1.135 1.190 so_random 0.357 0.321 so_reverse_complement 0.619 0.583 so_sieve 0.493 0.496 so_spectralnorm 1.749 1.737 Speedup ratio: compare with the result of `before' (greater is better) name after so_ackermann 1.057 so_array 0.958 so_binary_trees 1.022 so_concatenate 1.024 so_count_words 1.077 so_exception 1.049 so_fannkuch 1.004 so_fasta 1.009 so_k_nucleotide 0.983 so_lists 1.007 so_mandelbrot 0.946 so_matrix 1.072 so_meteor_contest 1.026 so_nbody 1.013 so_nested_loop 0.982 so_nsieve 1.040 so_nsieve_bits 1.018 so_object 0.992 so_partial_sums 0.969 so_pidigits 0.954 so_random 1.111 so_reverse_complement 1.062 so_sieve 0.994 so_spectralnorm 1.007 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62089 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	also use sp_inc in vm core	shyouhei	2018-01-29	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that sp_inc attributes are officially provided as inline functions. Why not use them directly from the vm core, not just by the compiler. By doing so, it is now possible for us to optimize stack manipulations. We can now know exactly how many words of stack space an instruction consumes before it actually does. This changeset deletes some lines from insns.def because they are no longer needed. As a result it reduces the size of vm_exec_core function from 32,400 bytes to 32,352 bytes on my machine. It seems it does not affect performance: ----------------------------------------------------------- benchmark results: minimum results in each 3 measurements. Execution time (sec) name before after loop_for 1.093 1.061 loop_generator 1.156 1.152 loop_times 0.982 0.974 loop_whileloop 0.549 0.587 loop_whileloop2 0.115 0.121 Speedup ratio: compare with the result of `before' (greater is better) name after loop_for 1.030 loop_generator 1.003 loop_times 1.008 loop_whileloop 0.935 loop_whileloop2 0.949 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62087 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
*	comma at the end of enum is a C99ism	shyouhei	2018-01-02	1	-2/+2
\| \| \| \|	git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61549 b2dd03c8-39d4-4d8f-98ff-823fe69b080e