mirror of
https://github.com/ruby/ruby.git
synced 2022-11-09 12:17:21 -05:00
637d1cc0c0
This PR improves the performance of `super` calls. While working on some Rails optimizations jhawthorn discovered that `super` calls were slower than expected.

The changes here do the following:

1) Adds a check for whether the call frame is not equal to the method entry iseq. This avoids the `rb_obj_is_kind_of` check on the next line, which is quite slow. If the current call frame is equal to the method entry we know we can't have an instance eval, etc.
2) Changes `FL_TEST` to `FL_TEST_RAW`. This is safe because we've already done the check for `T_ICLASS` above.
3) Adds a benchmark for `T_ICLASS` super calls.
4) Note: makes a change for `method_entry_cref` to use `const`.

On master the benchmarks showed that `super` is 1.76x slower. Our changes improved the performance so that it is now only 1.36x slower.

Benchmark IPS:

```
Warming up --------------------------------------
               super   244.918k i/100ms
         method call   383.007k i/100ms
Calculating -------------------------------------
               super      2.280M (± 6.7%) i/s -     11.511M in   5.071758s
         method call      3.834M (± 4.9%) i/s -     19.150M in   5.008444s

Comparison:
         method call:  3833648.3 i/s
               super:  2279837.9 i/s - 1.68x  (± 0.00) slower
```

With changes:

```
Warming up --------------------------------------
               super   308.777k i/100ms
         method call   375.051k i/100ms
Calculating -------------------------------------
               super      2.951M (± 5.4%) i/s -     14.821M in   5.039592s
         method call      3.551M (± 4.9%) i/s -     18.002M in   5.081695s

Comparison:
         method call:  3551372.7 i/s
               super:  2950557.9 i/s - 1.20x  (± 0.00) slower
```

Ruby VM benchmarks also showed an improvement:

Existing `vm_super` benchmark:

```
$ make benchmark ITEM=vm_super

|          |compare-ruby|built-ruby|
|:---------|-----------:|---------:|
|vm_super  |     21.555M|   37.819M|
|          |           -|     1.75x|
```

New `vm_iclass_super` benchmark:

```
$ make benchmark ITEM=vm_iclass_super

|                 |compare-ruby|built-ruby|
|:----------------|-----------:|---------:|
|vm_iclass_super  |      1.669M|    3.683M|
|                 |           -|     2.21x|
```

This is the benchmark script used for the benchmark-ips benchmarks:

```ruby
require "benchmark/ips"

class Foo
  def zuper; end
  def top; end

  last_method = "top"

  ("A".."M").each do |module_name|
    eval <<-EOM
    module #{module_name}
      def zuper; super; end
      def #{module_name.downcase}
        #{last_method}
      end
    end

    prepend #{module_name}
    EOM

    last_method = module_name.downcase
  end
end

foo = Foo.new

Benchmark.ips do |x|
  x.report "super" do
    foo.zuper
  end

  x.report "method call" do
    foo.m
  end

  x.compare!
end
```

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
Co-authored-by: John Hawthorn <john@hawthorn.email>
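For readers unfamiliar with the kind of call the commit message discusses, the following is a minimal sketch of `super` being called from methods defined in prepended modules, which is the situation where Ruby places an internal include class (`T_ICLASS`) in the ancestor chain. The names `Base`, `Loud` and `Polite` are hypothetical and chosen only for illustration; this is not the contents of `vm_iclass_super.yml`.

```ruby
# Hypothetical classes and modules, for illustration only.
class Base
  def greet
    "hello"
  end
end

module Loud
  # `super` continues the method lookup past Loud, reaching Base#greet.
  def greet
    super.upcase
  end
end

module Polite
  # `super` here reaches Loud#greet, the next entry in the ancestor chain.
  def greet
    super + ", please"
  end
end

class Base
  prepend Loud    # ancestors: Loud, Base, ...
  prepend Polite  # ancestors: Polite, Loud, Base, ...
end

GREETING  = Base.new.greet
ANCESTORS = Base.ancestors.take(3)
```

Each `greet` call walks one step further along `Base.ancestors`, so a chain of prepended modules, like the thirteen in the benchmark script, produces a chain of `super` dispatches.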
||
---|---|---|
.. | ||
gc | ||
lib | ||
other-lang | ||
app_answer.rb | ||
app_aobench.rb | ||
app_erb.yml | ||
app_factorial.rb | ||
app_fib.rb | ||
app_lc_fizzbuzz.rb | ||
app_mandelbrot.rb | ||
app_pentomino.rb | ||
app_raise.rb | ||
app_strconcat.rb | ||
app_tak.rb | ||
app_tarai.rb | ||
app_uri.rb | ||
array_flatten.yml | ||
array_intersection.yml | ||
array_max_float.yml | ||
array_max_int.yml | ||
array_max_str.yml | ||
array_min.yml | ||
array_sample_100k_10.rb | ||
array_sample_100k_11.rb | ||
array_sample_100k__1k.rb | ||
array_sample_100k__6k.rb | ||
array_sample_100k__100.rb | ||
array_sample_100k___10k.rb | ||
array_sample_100k___50k.rb | ||
array_shift.rb | ||
array_small_and.rb | ||
array_small_diff.rb | ||
array_small_or.rb | ||
array_sort_block.rb | ||
array_sort_float.rb | ||
array_values_at_int.rb | ||
array_values_at_range.rb | ||
bighash.rb | ||
cgi_escape_html.yml | ||
complex_float_add.yml | ||
complex_float_div.yml | ||
complex_float_mul.yml | ||
complex_float_new.yml | ||
complex_float_power.yml | ||
complex_float_sub.yml | ||
dir_empty_p.rb | ||
enum_lazy_flat_map.yml | ||
enum_lazy_grep_v_20.rb | ||
enum_lazy_grep_v_50.rb | ||
enum_lazy_grep_v_100.rb | ||
enum_lazy_uniq_20.rb | ||
enum_lazy_uniq_50.rb | ||
enum_lazy_uniq_100.rb | ||
enum_lazy_zip.yml | ||
erb_render.yml | ||
fiber_chain.yml | ||
fiber_locals.yml | ||
file_chmod.rb | ||
file_rename.rb | ||
hash_aref_dsym.rb | ||
hash_aref_dsym_long.rb | ||
hash_aref_fix.rb | ||
hash_aref_flo.rb | ||
hash_aref_miss.rb | ||
hash_aref_str.rb | ||
hash_aref_sym.rb | ||
hash_aref_sym_long.rb | ||
hash_defaults.yml | ||
hash_dup.yml | ||
hash_flatten.rb | ||
hash_ident_flo.rb | ||
hash_ident_num.rb | ||
hash_ident_obj.rb | ||
hash_ident_str.rb | ||
hash_ident_sym.rb | ||
hash_keys.rb | ||
hash_literal_small2.rb | ||
hash_literal_small4.rb | ||
hash_literal_small8.rb | ||
hash_long.rb | ||
hash_shift.rb | ||
hash_shift_u16.rb | ||
hash_shift_u24.rb | ||
hash_shift_u32.rb | ||
hash_small2.rb | ||
hash_small4.rb | ||
hash_small8.rb | ||
hash_to_proc.rb | ||
hash_values.rb | ||
int_quo.rb | ||
io_copy_stream_write.rb | ||
io_copy_stream_write_socket.rb | ||
io_file_create.rb | ||
io_file_read.rb | ||
io_file_write.rb | ||
io_nonblock_noex.rb | ||
io_nonblock_noex2.rb | ||
io_pipe_rw.rb | ||
io_select.rb | ||
io_select2.rb | ||
io_select3.rb | ||
irb_color.yml | ||
irb_exec.yml | ||
kernel_clone.yml | ||
kernel_float.yml | ||
kernel_tap.yml | ||
kernel_then.yml | ||
keyword_arguments.yml | ||
loop_for.rb | ||
loop_generator.rb | ||
loop_times.rb | ||
loop_whileloop.rb | ||
loop_whileloop2.rb | ||
marshal_dump_flo.rb | ||
marshal_dump_load_geniv.rb | ||
marshal_dump_load_time.rb | ||
match_gt4.rb | ||
match_small.rb | ||
mjit_exec_jt2jt.yml | ||
mjit_exec_vm2jt.yml | ||
mjit_exec_vm2vm.yml | ||
mjit_exivar.yml | ||
mjit_integer.yml | ||
mjit_kernel.yml | ||
mjit_leave.yml | ||
mjit_opt_cc_insns.yml | ||
mjit_struct_aref.yml | ||
nil_p.yml | ||
num_zero_p.yml | ||
objspace_dump_all.yml | ||
pm_array.yml | ||
range_last.yml | ||
README.md | ||
realpath.yml | ||
require.yml | ||
require_thread.yml | ||
securerandom.rb | ||
so_ackermann.rb | ||
so_array.rb | ||
so_binary_trees.rb | ||
so_concatenate.rb | ||
so_count_words.yml | ||
so_exception.rb | ||
so_fannkuch.rb | ||
so_fasta.rb | ||
so_k_nucleotide.yml | ||
so_lists.rb | ||
so_mandelbrot.rb | ||
so_matrix.rb | ||
so_meteor_contest.rb | ||
so_nbody.rb | ||
so_nested_loop.rb | ||
so_nsieve.rb | ||
so_nsieve_bits.rb | ||
so_object.rb | ||
so_partial_sums.rb | ||
so_pidigits.rb | ||
so_random.rb | ||
so_reverse_complement.yml | ||
so_sieve.rb | ||
so_spectralnorm.rb | ||
string_capitalize.yml | ||
string_casecmp.yml | ||
string_casecmp_p.yml | ||
string_downcase.yml | ||
string_index.rb | ||
string_scan_re.rb | ||
string_scan_str.rb | ||
string_slice.yml | ||
string_split.yml | ||
string_swapcase.yml | ||
string_upcase.yml | ||
time_strptime.yml | ||
time_subsec.rb | ||
vm_array.yml | ||
vm_attr_ivar.yml | ||
vm_attr_ivar_set.yml | ||
vm_backtrace.rb | ||
vm_bigarray.yml | ||
vm_bighash.yml | ||
vm_block.yml | ||
vm_block_handler.yml | ||
vm_blockparam.yml | ||
vm_blockparam_call.yml | ||
vm_blockparam_pass.yml | ||
vm_blockparam_yield.yml | ||
vm_case.yml | ||
vm_case_lit.yml | ||
vm_clearmethodcache.rb | ||
vm_const.yml | ||
vm_defined_method.yml | ||
vm_dstr.yml | ||
vm_ensure.yml | ||
vm_eval.yml | ||
vm_fiber_allocate.yml | ||
vm_fiber_count.yml | ||
vm_fiber_reuse.yml | ||
vm_fiber_reuse_gc.yml | ||
vm_fiber_switch.yml | ||
vm_float_simple.yml | ||
vm_freezestring.yml | ||
vm_gc.rb | ||
vm_gc_old_full.rb | ||
vm_gc_old_immediate.rb | ||
vm_gc_old_lazy.rb | ||
vm_gc_short_lived.yml | ||
vm_gc_short_with_complex_long.yml | ||
vm_gc_short_with_long.yml | ||
vm_gc_short_with_symbol.yml | ||
vm_gc_wb_ary.yml | ||
vm_gc_wb_ary_promoted.yml | ||
vm_gc_wb_obj.yml | ||
vm_gc_wb_obj_promoted.yml | ||
vm_iclass_super.yml | ||
vm_ivar.yml | ||
vm_ivar_set.yml | ||
vm_length.yml | ||
vm_lvar_init.yml | ||
vm_lvar_set.yml | ||
vm_method.yml | ||
vm_method_missing.yml | ||
vm_method_with_block.yml | ||
vm_module_ann_const_set.yml | ||
vm_module_const_set.yml | ||
vm_mutex.yml | ||
vm_neq.yml | ||
vm_newlambda.yml | ||
vm_not.yml | ||
vm_poly_method.yml | ||
vm_poly_method_ov.yml | ||
vm_poly_same_method.yml | ||
vm_poly_singleton.yml | ||
vm_proc.yml | ||
vm_raise1.yml | ||
vm_raise2.yml | ||
vm_regexp.yml | ||
vm_rescue.yml | ||
vm_send.yml | ||
vm_send_cfunc.yml | ||
vm_simplereturn.yml | ||
vm_string_literal.yml | ||
vm_struct_big_aref_hi.yml | ||
vm_struct_big_aref_lo.yml | ||
vm_struct_big_aset.yml | ||
vm_struct_big_href_hi.yml | ||
vm_struct_big_href_lo.yml | ||
vm_struct_big_hset.yml | ||
vm_struct_small_aref.yml | ||
vm_struct_small_aset.yml | ||
vm_struct_small_href.yml | ||
vm_struct_small_hset.yml | ||
vm_super.yml | ||
vm_swap.yml | ||
vm_symbol_block_pass.rb | ||
vm_thread_alive_check.yml | ||
vm_thread_close.rb | ||
vm_thread_condvar1.rb | ||
vm_thread_condvar2.rb | ||
vm_thread_create_join.rb | ||
vm_thread_mutex1.rb | ||
vm_thread_mutex2.rb | ||
vm_thread_mutex3.rb | ||
vm_thread_pass.rb | ||
vm_thread_pass_flood.rb | ||
vm_thread_pipe.rb | ||
vm_thread_queue.rb | ||
vm_thread_sized_queue.rb | ||
vm_thread_sized_queue2.rb | ||
vm_thread_sized_queue3.rb | ||
vm_thread_sized_queue4.rb | ||
vm_thread_sleep.yml | ||
vm_unif1.yml | ||
vm_yield.yml | ||
vm_zsuper.yml |
ruby/benchmark
This directory has benchmark definitions to be run with benchmark_driver.gem.
Normal usage
Execute `gem install benchmark_driver` and run a command like:
# Run a benchmark script with the ruby in the $PATH
benchmark-driver benchmark/app_fib.rb
# Run benchmark scripts with multiple Ruby executables or options
benchmark-driver benchmark/*.rb -e /path/to/ruby -e '/path/to/ruby --jit'
# Or compare Ruby versions managed by rbenv
benchmark-driver benchmark/*.rb --rbenv '2.5.1;2.6.0-preview2 --jit'
# You can collect many metrics in many ways
benchmark-driver benchmark/*.rb --runner memory --output markdown
# Some are defined with YAML for complex setup or accurate measurement
benchmark-driver benchmark/*.yml
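As an illustration of what a plain `.rb` benchmark passed to the commands above can look like, here is a minimal sketch of an ordinary Ruby script that does a fixed amount of work. It is written in the spirit of files such as `app_fib.rb` but is not a copy of any file in this directory; the method name, the argument and the `FIB_RESULT` constant are assumptions made for the example.

```ruby
# Hypothetical benchmark script: the whole file is the workload.
def fib(n)
  # Naive recursion on purpose, so the script spends its time in method calls.
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# A small argument keeps this sketch quick; a real benchmark would
# presumably pick a value large enough to run for a measurable time.
FIB_RESULT = fib(20)
```

Saved under a hypothetical name such as `benchmark/my_fib.rb`, a script like this could be passed to `benchmark-driver` in the same way as `benchmark/app_fib.rb` above.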
See also:
Usage: benchmark-driver [options] RUBY|YAML...
-r, --runner TYPE Specify runner type: ips, time, memory, once (default: ips)
-o, --output TYPE Specify output type: compare, simple, markdown, record (default: compare)
-e, --executables EXECS Ruby executables (e1::path1 arg1; e2::path2 arg2;...)
--rbenv VERSIONS Ruby executables in rbenv (x.x.x arg1;y.y.y arg2;...)
--repeat-count NUM Try benchmark NUM times and use the fastest result or the worst memory usage
--repeat-result TYPE Yield "best", "average" or "worst" result with --repeat-count (default: best)
--bundler Install and use gems specified in Gemfile
--filter REGEXP Filter out benchmarks with given regexp
--run-duration SECONDS Warmup estimates loop_count to run for this duration (default: 3)
-v, --verbose Verbose mode. Multiple -v options increase visibility (max: 2)
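The interaction between `--repeat-count` and `--repeat-result` can be pictured with a small sketch. This is not benchmark_driver's implementation: `pick_result` is a hypothetical helper, and it assumes a metric where higher is better, such as iterations per second.

```ruby
# Sketch only: a hypothetical helper, not part of benchmark_driver.
# `values` holds one measurement per repetition (higher is better).
def pick_result(values, type)
  case type
  when "best"    then values.max
  when "worst"   then values.min
  when "average" then values.sum / values.size.to_f
  else raise ArgumentError, "unknown result type: #{type}"
  end
end

# Three made-up measurements, as if the benchmark had been repeated 3 times.
RUNS = [100.0, 120.0, 110.0]
BEST = pick_result(RUNS, "best")
```

For a metric where lower is better, such as memory usage, the meaning of "best" and "worst" would be reversed, which this sketch does not model.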
make benchmark
Using `make benchmark`, `make update-benchmark-driver` automatically downloads the supported version of benchmark_driver, and it runs benchmarks with the downloaded benchmark_driver.
# Run all benchmarks with the ruby in the $PATH and the built ruby
make benchmark
# Or compare with specific ruby binary
make benchmark COMPARE_RUBY="/path/to/ruby --jit"
# Run vm benchmarks
make benchmark ITEM=vm
# Run some limited benchmarks in ITEM-matched files
make benchmark ITEM=vm OPTS=--filter=block
# You can specify the benchmark by an exact filename instead of using the default argument:
# ARGS = $$(find $(srcdir)/benchmark -maxdepth 1 -name '*$(ITEM)*.yml' -o -name '*$(ITEM)*.rb')
make benchmark ARGS=benchmark/erb_render.yml
# You can specify any option via $OPTS
make benchmark OPTS="--help"
# With `make benchmark`, some special runner plugins are available:
# -r peak, -r size, -r total, -r utime, -r stime, -r cutime, -r cstime
make benchmark ITEM=vm_bigarray OPTS="-r peak"
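The default `ARGS` shown above selects files whose names match `*$(ITEM)*.yml` or `*$(ITEM)*.rb`. The sketch below mimics that name matching in Ruby with `File.fnmatch` so the effect of a given `ITEM` is easier to predict. `match_item` and the `FILES` list are hypothetical, and the sketch only models the two `-name` patterns, not the rest of the `find` invocation.

```ruby
# Hypothetical helper: returns the names that the two glob patterns
# from the default ARGS would select for a given ITEM.
def match_item(filenames, item)
  patterns = ["*#{item}*.yml", "*#{item}*.rb"]
  filenames.select do |name|
    patterns.any? { |pattern| File.fnmatch(pattern, name) }
  end
end

# A few names taken from this directory's listing.
FILES = %w[
  vm_super.yml vm_zsuper.yml vm_iclass_super.yml
  vm_backtrace.rb app_fib.rb README.md
]

SUPER_MATCHES = match_item(FILES, "super")
```

Because the pattern has a wildcard on both sides of `ITEM`, a short value such as `vm` selects every file with `vm` in its name, while a longer value such as `vm_super` narrows the run to a single file.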