mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
Takashi Kokubun b9d3ceee8f
Unwrap vm_call_cfunc indirection on JIT
for VM_METHOD_TYPE_CFUNC.

This has been known to decrease optcarrot fps:

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark.yml --repeat-count=24 --output=all
before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master fb40495cd9) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux]
Calculating -------------------------------------
                                 before --jit           after --jit
Optcarrot Lan_Master.nes    66.38132676191719     67.41369177299630 fps
                            69.42728743772243     68.90327567263054
                            72.16028300263211     69.62605130880686
                            72.46631319102777     70.48818243767207
                            73.37078877002490     70.79522887347566
                            73.69422431217367     70.99021920193194
                            74.01471487018695     74.69931965402584
                            75.48685183295630     74.86714575949016
                            75.54445264507932     75.97864419721677
                            77.28089738169756     76.48908637569581
                            78.04183397891302     76.54320932488021
                            78.36807984096562     76.59407262898067
                            78.92898762543574     77.31316743361343
                            78.93576483233765     77.97153484180480
                            79.13754917503078     77.98478782102325
                            79.62648945850653     78.02263322726446
                            79.86334213878064     78.26333724045934
                            80.05100635898518     78.60056756355614
                            80.26186843769584     78.91082645644468
                            80.34205717020330     79.01226659142263
                            80.62286066044338     79.32733939423721
                            80.95883033058557     79.63793060542024
                            80.97376819251613     79.73108936622778
                            81.23050939202896     80.18280109433088
```

and I deleted this capability in an early stage of YARV-MJIT development:
0ab130feee

I suspect either of the following things could be the cause:

* Directly calling vm_call_cfunc requires more optimization effort in GCC,
  resulting in a 30ms-ish compilation time increase for such methods and
  decreasing the number of methods compiled in a benchmarked period.

* Code size increase => more icache misses

These hypotheses could be verified experimentally. However, I'd like to
introduce this change regardless of the result, because the indirection
blocks inlining of C method definitions.

I may revert this commit if I give up on implementing inlining of C method
definitions, which requires this change.

Microbenchmark-wise, this gives a slight performance improvement:

```
$ benchmark-driver -v --rbenv 'before --jit;after --jit' benchmark/mjit_send_cfunc.yml --repeat-count=4
before --jit: ruby 2.8.0dev (2020-04-13T16:25:13Z master fb40495cd9) +JIT [x86_64-linux]
after --jit: ruby 2.8.0dev (2020-04-13T23:23:11Z mjit-inline-c bdcd06d159) +JIT [x86_64-linux]
Calculating -------------------------------------
                     before --jit  after --jit
     mjit_send_cfunc      41.961M      56.489M i/s -    100.000M times in 2.383143s 1.770244s

Comparison:
                  mjit_send_cfunc
         after --jit:  56489372.5 i/s
        before --jit:  41961388.1 i/s - 1.35x  slower
```
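
The figures in this output can be cross-checked by hand: iterations per second is the loop count divided by the elapsed time, and the comparison ratio is the quotient of the two rates. A minimal Ruby sketch using only the numbers printed above (the helper name `ips` is made up for illustration and is not part of benchmark_driver):

```ruby
# Recompute the rates reported above from the loop count and elapsed times.
# `ips` is a hypothetical helper, not part of benchmark_driver.
def ips(iterations, seconds)
  iterations / seconds
end

LOOP_COUNT = 100_000_000.0          # "100.000M times" in the output above
before = ips(LOOP_COUNT, 2.383143)  # elapsed seconds for "before --jit"
after  = ips(LOOP_COUNT, 1.770244)  # elapsed seconds for "after --jit"

puts format("before: %.3fM i/s", before / 1e6)
puts format("after:  %.3fM i/s", after / 1e6)
puts format("ratio:  %.2fx", after / before)
```

The printed ratio is the factor by which "before --jit" is slower than "after --jit" in the comparison section.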
2020-04-13 16:45:05 -07:00
gc
lib Remove unneeded exec bits from some files 2019-11-09 21:36:30 +09:00
other-lang
app_answer.rb
app_aobench.rb
app_erb.yml
app_factorial.rb
app_fib.rb
app_lc_fizzbuzz.rb
app_mandelbrot.rb
app_pentomino.rb
app_raise.rb
app_strconcat.rb
app_tak.rb
app_tarai.rb
app_uri.rb
array_flatten.yml
array_intersection.yml
array_sample_100k_10.rb
array_sample_100k_11.rb
array_sample_100k__1k.rb
array_sample_100k__6k.rb
array_sample_100k__100.rb
array_sample_100k___10k.rb
array_sample_100k___50k.rb
array_shift.rb
array_small_and.rb
array_small_diff.rb
array_small_or.rb
array_sort_block.rb
array_sort_float.rb
array_values_at_int.rb
array_values_at_range.rb
bighash.rb
cgi_escape_html.yml
complex_float_add.yml
complex_float_div.yml
complex_float_mul.yml
complex_float_new.yml
complex_float_power.yml
complex_float_sub.yml
dir_empty_p.rb
enum_lazy_grep_v_20.rb
enum_lazy_grep_v_50.rb
enum_lazy_grep_v_100.rb
enum_lazy_uniq_20.rb
enum_lazy_uniq_50.rb
enum_lazy_uniq_100.rb
erb_render.yml
fiber_chain.yml Drop executable bit of *.{yml,h,mk.tmpl} 2020-01-22 16:04:38 +09:00
fiber_locals.yml Let execution context local storage be an ID table 2020-01-11 14:40:36 +13:00
file_chmod.rb
file_rename.rb
hash_aref_dsym.rb
hash_aref_dsym_long.rb
hash_aref_fix.rb
hash_aref_flo.rb
hash_aref_miss.rb
hash_aref_str.rb
hash_aref_sym.rb
hash_aref_sym_long.rb
hash_defaults.yml Speeds up fallback to Hash#default_proc in rb_hash_aref by removing a method call 2020-01-08 18:09:52 +09:00
hash_dup.yml
hash_flatten.rb
hash_ident_flo.rb
hash_ident_num.rb
hash_ident_obj.rb
hash_ident_str.rb
hash_ident_sym.rb
hash_keys.rb
hash_literal_small2.rb
hash_literal_small4.rb
hash_literal_small8.rb
hash_long.rb
hash_shift.rb
hash_shift_u16.rb
hash_shift_u24.rb
hash_shift_u32.rb
hash_small2.rb
hash_small4.rb
hash_small8.rb
hash_to_proc.rb
hash_values.rb
int_quo.rb
io_copy_stream_write.rb
io_copy_stream_write_socket.rb
io_file_create.rb
io_file_read.rb
io_file_write.rb
io_nonblock_noex.rb
io_nonblock_noex2.rb
io_pipe_rw.rb
io_select.rb
io_select2.rb
io_select3.rb
irb_color.yml
irb_exec.yml
kernel_clone.yml support builtin for Kernel#clone 2020-03-17 19:37:07 +09:00
keyword_arguments.yml Reduce allocations for keyword argument hashes 2020-03-17 12:09:43 -07:00
loop_for.rb
loop_generator.rb
loop_times.rb
loop_whileloop.rb
loop_whileloop2.rb
marshal_dump_flo.rb
marshal_dump_load_geniv.rb
marshal_dump_load_time.rb
match_gt4.rb
match_small.rb
mjit_exec_jt2jt.yml
mjit_exec_vm2jt.yml
mjit_exec_vm2vm.yml
mjit_exivar.yml Remove an unused pragma 2020-03-30 23:30:08 -07:00
mjit_leave.yml Make JIT-ed leave insn leaf 2020-03-31 22:10:16 -07:00
mjit_send_cfunc.yml Unwrap vm_call_cfunc indirection on JIT 2020-04-13 16:45:05 -07:00
nil_p.yml
range_last.yml
README.md
realpath.yml
require.yml
require_thread.yml
securerandom.rb
so_ackermann.rb
so_array.rb
so_binary_trees.rb
so_concatenate.rb
so_count_words.yml
so_exception.rb
so_fannkuch.rb
so_fasta.rb
so_k_nucleotide.yml
so_lists.rb
so_mandelbrot.rb
so_matrix.rb
so_meteor_contest.rb
so_nbody.rb
so_nested_loop.rb
so_nsieve.rb
so_nsieve_bits.rb
so_object.rb
so_partial_sums.rb
so_pidigits.rb
so_random.rb
so_reverse_complement.yml
so_sieve.rb
so_spectralnorm.rb
string_capitalize.yml
string_casecmp.yml Added more benchmarks for String 2020-02-29 15:42:24 +09:00
string_casecmp_p.yml Added more benchmarks for String 2020-02-29 15:42:24 +09:00
string_downcase.yml Added more benchmarks for String 2020-02-29 15:42:24 +09:00
string_index.rb
string_scan_re.rb
string_scan_str.rb
string_slice.yml Improve String#slice! performance 2020-01-31 17:12:05 +09:00
string_split.yml
string_swapcase.yml Added more benchmarks for String 2020-02-29 15:42:24 +09:00
string_upcase.yml Added more benchmarks for String 2020-02-29 15:42:24 +09:00
time_strptime.yml
time_subsec.rb
vm1_attr_ivar.yml
vm1_attr_ivar_set.yml
vm1_block.yml
vm1_blockparam.yml
vm1_blockparam_call.yml
vm1_blockparam_pass.yml
vm1_blockparam_yield.yml
vm1_const.yml
vm1_ensure.yml
vm1_float_simple.yml
vm1_gc_short_lived.yml
vm1_gc_short_with_complex_long.yml
vm1_gc_short_with_long.yml
vm1_gc_short_with_symbol.yml
vm1_gc_wb_ary.yml
vm1_gc_wb_ary_promoted.yml
vm1_gc_wb_obj.yml
vm1_gc_wb_obj_promoted.yml
vm1_ivar.yml
vm1_ivar_set.yml
vm1_length.yml
vm1_lvar_init.yml
vm1_lvar_set.yml
vm1_neq.yml
vm1_not.yml
vm1_rescue.yml
vm1_simplereturn.yml
vm1_swap.yml
vm1_yield.yml
vm2_array.yml
vm2_bigarray.yml
vm2_bighash.yml
vm2_case.yml
vm2_case_lit.yml
vm2_defined_method.yml
vm2_dstr.yml
vm2_eval.yml
vm2_fiber_allocate.yml
vm2_fiber_count.yml
vm2_fiber_reuse.yml
vm2_fiber_reuse_gc.yml
vm2_fiber_switch.yml
vm2_freezestring.yml
vm2_method.yml
vm2_method_missing.yml
vm2_method_with_block.yml
vm2_module_ann_const_set.yml
vm2_module_const_set.yml
vm2_mutex.yml
vm2_newlambda.yml
vm2_poly_method.yml
vm2_poly_method_ov.yml
vm2_poly_same_method.yml
vm2_poly_singleton.yml
vm2_proc.yml
vm2_raise1.yml
vm2_raise2.yml
vm2_regexp.yml
vm2_send.yml
vm2_string_literal.yml
vm2_struct_big_aref_hi.yml
vm2_struct_big_aref_lo.yml
vm2_struct_big_aset.yml
vm2_struct_big_href_hi.yml
vm2_struct_big_href_lo.yml
vm2_struct_big_hset.yml
vm2_struct_small_aref.yml
vm2_struct_small_aset.yml
vm2_struct_small_href.yml
vm2_struct_small_hset.yml
vm2_super.yml
vm2_unif1.yml
vm2_zsuper.yml
vm3_backtrace.rb
vm3_clearmethodcache.rb
vm3_gc.rb
vm3_gc_old_full.rb
vm3_gc_old_immediate.rb
vm3_gc_old_lazy.rb
vm_symbol_block_pass.rb
vm_thread_alive_check.yml
vm_thread_close.rb
vm_thread_condvar1.rb
vm_thread_condvar2.rb
vm_thread_create_join.rb
vm_thread_mutex1.rb
vm_thread_mutex2.rb
vm_thread_mutex3.rb
vm_thread_pass.rb
vm_thread_pass_flood.rb
vm_thread_pipe.rb
vm_thread_queue.rb
vm_thread_sized_queue.rb
vm_thread_sized_queue2.rb
vm_thread_sized_queue3.rb
vm_thread_sized_queue4.rb
vm_thread_sleep.yml

ruby/benchmark

This directory has benchmark definitions to be run with benchmark_driver.gem.

Normal usage

Execute gem install benchmark_driver and run a command like:

# Run a benchmark script with the ruby in the $PATH
benchmark-driver benchmark/app_fib.rb

# Run benchmark scripts with multiple Ruby executables or options
benchmark-driver benchmark/*.rb -e /path/to/ruby -e '/path/to/ruby --jit'

# Or compare Ruby versions managed by rbenv
benchmark-driver benchmark/*.rb --rbenv '2.5.1;2.6.0-preview2 --jit'

# You can collect many metrics in many ways
benchmark-driver benchmark/*.rb --runner memory --output markdown

# Some are defined with YAML for complex setup or accurate measurement
benchmark-driver benchmark/*.yml
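
To give a rough idea of what a YAML definition contains, the sketch below builds one from Ruby and prints it. It assumes the `prelude`, `benchmark` and `loop_count` keys of benchmark_driver's YAML format; the benchmark name `hypothetical_add`, the `add` method and the loop count are invented for illustration and do not correspond to any file in this directory.

```ruby
require "yaml"

# Hypothetical benchmark definition: the benchmark name, the method and the
# loop count are invented for illustration.
definition = {
  # Setup code that defines what the benchmark needs; it is not the
  # measured snippet itself.
  "prelude" => "def add(a, b)\n  a + b\nend\n",
  # Named snippets to measure.
  "benchmark" => { "hypothetical_add" => "add(1, 2)" },
  # Fixed number of iterations instead of an estimated one.
  "loop_count" => 1_000_000,
}

yaml_text = YAML.dump(definition)
puts yaml_text
```

Saving the printed text to a `.yml` file would give something that could be passed to `benchmark-driver` in the same way as the files above.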

See also:

Usage: benchmark-driver [options] RUBY|YAML...
    -r, --runner TYPE                Specify runner type: ips, time, memory, once (default: ips)
    -o, --output TYPE                Specify output type: compare, simple, markdown, record (default: compare)
    -e, --executables EXECS          Ruby executables (e1::path1 arg1; e2::path2 arg2;...)
        --rbenv VERSIONS             Ruby executables in rbenv (x.x.x arg1;y.y.y arg2;...)
        --repeat-count NUM           Try benchmark NUM times and use the fastest result or the worst memory usage
        --repeat-result TYPE         Yield "best", "average" or "worst" result with --repeat-count (default: best)
        --bundler                    Install and use gems specified in Gemfile
        --filter REGEXP              Filter out benchmarks with given regexp
        --run-duration SECONDS       Warmup estimates loop_count to run for this duration (default: 3)
    -v, --verbose                    Verbose mode. Multiple -v options increase visibility (max: 2)
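
The interaction between `--repeat-count` and `--repeat-result` can be pictured as running the benchmark several times and then reducing the list of results to one value. The sketch below is illustrative only and is not benchmark_driver's implementation; it assumes a metric where higher is better (as with the default `ips` runner), and `pick_result` is a made-up name.

```ruby
# Illustrative only: NOT benchmark_driver's implementation.
# Assumes a higher value is better, as with the default ips runner.
def pick_result(values, type)
  case type
  when "best"    then values.max
  when "worst"   then values.min
  when "average" then values.sum / values.size
  else raise ArgumentError, "unknown repeat-result type: #{type}"
  end
end

runs = [66.4, 78.9, 81.2] # made-up results from three repeats
%w[best average worst].each do |type|
  puts "#{type}: #{pick_result(runs, type)}"
end
```

For a metric where lower is better, such as memory usage, the roles of `max` and `min` would be swapped.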

make benchmark

When you run make benchmark, the make update-benchmark-driver target automatically downloads the supported version of benchmark_driver, and the benchmarks are run with that downloaded copy.

# Run all benchmarks with the ruby in the $PATH and the built ruby
make benchmark

# Or compare with specific ruby binary
make benchmark COMPARE_RUBY="/path/to/ruby --jit"

# Run vm1 benchmarks
make benchmark ITEM=vm1

# Run some limited benchmarks in ITEM-matched files
make benchmark ITEM=vm1 OPTS=--filter=block

# You can specify the benchmark by an exact filename instead of using the default argument:
# ARGS = $$(find $(srcdir)/benchmark -maxdepth 1 -name '*$(ITEM)*.yml' -o -name '*$(ITEM)*.rb')
make benchmark ARGS=../benchmark/erb_render.yml

# You can specify any option via $OPTS
make benchmark OPTS="--help"

# With `make benchmark`, some special runner plugins are available:
#   -r peak, -r size, -r total, -r utime, -r stime, -r cutime, -r cstime
make benchmark ITEM=vm2_bigarray OPTS="-r peak"
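
The default ARGS expression quoted above selects top-level files whose names contain ITEM and end in .yml or .rb. A rough Ruby equivalent, run against a temporary directory with a few file names borrowed from this listing, is sketched below; `select_benchmarks` is a hypothetical helper and not part of the Makefile.

```ruby
require "tmpdir"
require "fileutils"

# Rough Ruby equivalent of the default ARGS lookup quoted above:
# top-level files whose names contain ITEM and end in .yml or .rb.
# `select_benchmarks` is a hypothetical helper, not part of the Makefile.
def select_benchmarks(dir, item)
  Dir.glob("*#{item}*.{yml,rb}", base: dir).sort
end

Dir.mktmpdir do |dir|
  # Create empty stand-ins for a few benchmark files.
  %w[vm1_block.yml vm1_yield.yml vm2_array.yml app_fib.rb].each do |name|
    FileUtils.touch(File.join(dir, name))
  end

  p select_benchmarks(dir, "vm1") # files that ITEM=vm1 would pick up
  p select_benchmarks(dir, "fib") # files that ITEM=fib would pick up
end
```

Because ITEM is matched as a substring, a short value such as `vm` would select every vm1, vm2, vm3 and vm_thread file at once.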