mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
ruby--ruby/lib
watson1978 dce4a3f58c Improve CSV performance
When the code does not use the special variables (such as $1, $&, $`) after a match,
using Regexp#match? or String#match? instead of Regexp#=~ or String#=~ improves performance.
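
As an illustration of the difference (a minimal sketch, not code taken from csv.rb; the helper names `first_score`, `quoted?` and `last_match_after_match_p` are hypothetical and exist only for this example), `=~` populates the match variables while `match?` only answers true or false. `match?` is available from Ruby 2.4.

```ruby
# Hypothetical helper: the captured text is needed afterwards,
# so =~ (which sets $~, $1, $& ...) is the right tool here.
def first_score(line)
  line =~ /(\d+\.\d+)/ ? $1 : nil
end

# Hypothetical helper: only a yes/no answer is needed,
# so match? avoids building the MatchData behind $~.
def quoted?(field)
  field.match?(/\A".*"\z/)
end

# Hypothetical helper: shows that match? leaves $~ untouched.
# $~ is local to the method, so it starts out as nil here.
def last_match_after_match_p
  "Alice".match?(/li/)
  $~
end

puts first_score("Alice,84.0,79.5")
puts quoted?('"Alice"')
puts last_match_after_match_p.inspect
```

Any call site that reads `$1` or `$~` after the test has to keep `=~` (or `match`); only pure boolean checks are candidates for `match?`.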

This patch applies the same idea as https://github.com/ruby/ruby/pull/1836

[Fix GH-1842]

## Environment
* OS : Ubuntu 17.10
* Compiler : gcc version 7.2.0
* CPU : Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
* Memory : 16 GB

## TL;DR
Methods     | Before (i/s) | After (i/s) | Speed up
----------- | ------------ | ----------- | --------
CSV.foreach | 44.825       | 48.201      | 7.5%
CSV#shift   | 45.200       | 49.584      | 9.7%
CSV.read    | 42.968       | 46.853      | 9.0%
CSV.table   | 10.933       | 11.277      | 3.1%
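
The "Speed up" column appears to be the relative gain in iterations per second, (after / before - 1) * 100, rounded to one decimal place. A small sketch that reproduces the figures under that assumption (`speed_up_percent` is a hypothetical helper, not part of the patch):

```ruby
# Hypothetical helper: relative gain in iterations per second, in percent.
def speed_up_percent(before, after)
  ((after / before - 1) * 100).round(1)
end

# Before/after i/s values copied from the table above.
results = {
  "CSV.foreach" => [44.825, 48.201],
  "CSV#shift"   => [45.200, 49.584],
  "CSV.read"    => [42.968, 46.853],
  "CSV.table"   => [10.933, 11.277],
}

results.each do |name, (before, after)|
  puts format("%-11s %4.1f%%", name, speed_up_percent(before, after))
end
```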

## Before
```
Calculating -------------------------------------
         CSV.foreach     44.825  (± 0.0%) i/s -    228.000  in   5.086576s
           CSV#shift     45.200  (± 0.0%) i/s -    228.000  in   5.044297s
            CSV.read     42.968  (± 0.0%) i/s -    216.000  in   5.027504s
           CSV.table     10.933  (± 0.0%) i/s -     55.000  in   5.031098s
```

## After
```
Calculating -------------------------------------
         CSV.foreach     48.201  (± 0.0%) i/s -    244.000  in   5.062256s
           CSV#shift     49.584  (± 0.0%) i/s -    248.000  in   5.001652s
            CSV.read     46.853  (± 0.0%) i/s -    236.000  in   5.037044s
           CSV.table     11.277  (± 0.0%) i/s -     57.000  in   5.054694s
```

## Benchmark code
```ruby
require 'csv'
require 'benchmark/ips'

CSV.open("/tmp/file.csv", "w") do |csv|
  csv << ["player", "gameA", "gameB"]
  1000.times do
    csv << ['"Alice"', "84.0", "79.5"]
    csv << ['"Bob"', "20.0", "56.5"]
  end
end

Benchmark.ips do |x|
  x.report "CSV.foreach" do
    CSV.foreach("/tmp/file.csv") do |row|
    end
  end

  x.report "CSV#shift" do
    CSV.open("/tmp/file.csv") do |csv|
      while line = csv.shift
      end
    end
  end

  x.report "CSV.read" do
    CSV.read("/tmp/file.csv")
  end

  x.report "CSV.table" do
    CSV.table("/tmp/file.csv")
  end
end
```

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62806 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-03-18 10:28:58 +00:00
cgi
drb If host of URI is omitted, make it with IP address. 2018-03-17 15:13:39 +00:00
forwardable
irb Hash instead of Set 2018-03-13 01:28:28 +00:00
matrix
net Raise ArgumentError if host component is nil 2018-03-08 16:07:54 +00:00
optparse
racc
rdoc erb.rb: deprecate safe_level of ERB.new 2018-02-22 13:28:25 +00:00
rexml
rinda
rss
rubygems Merge RubyGems 2.7.6 from upstream. 2018-02-16 08:08:06 +00:00
shell
unicode_normalize
uri Introduce URI::File to handle file URI scheme 2018-03-15 16:51:31 +00:00
webrick webrick 1.4.2 2017-12-24 08:38:43 +00:00
yaml
.document
abbrev.rb
base64.rb
benchmark.rb
cgi.rb
cmath.gemspec
cmath.rb
csv.gemspec
csv.rb Improve CSV performance 2018-03-18 10:28:58 +00:00
debug.rb
delegate.rb
drb.rb
e2mmap.rb
English.rb
erb.rb erb.rb: relax warn level of ERB.new 2018-02-28 12:12:20 +00:00
fileutils.gemspec Bump up fileutils-1.0.2 2017-12-22 08:00:10 +00:00
fileutils.rb Fix typos [ci skip] 2018-03-13 15:10:59 +00:00
find.rb
forwardable.rb
getoptlong.rb
ipaddr.gemspec
ipaddr.rb
irb.rb irb.rb: fix highlight 2017-12-25 07:55:25 +00:00
logger.rb logger: use safe navigation operator 2018-01-18 00:52:01 +00:00
matrix.rb lib/matrix.rb: Document deprecated methods [#12032] [doc] [ci-skip] 2018-02-06 23:47:49 +00:00
mkmf.rb mkmf.rb: werror on mswin 2018-01-24 08:25:36 +00:00
monitor.rb
mutex_m.rb
observer.rb
open-uri.rb open-uri: clear string after buffering 2018-01-08 01:11:33 +00:00
open3.rb
optionparser.rb
optparse.rb optparse.rb: froze string literals 2018-01-26 03:41:04 +00:00
ostruct.rb lib/ostruct.rb: Use FrozenError instead of RuntimeError. 2018-02-06 23:52:30 +00:00
pp.rb Requiring pp is not required now [ci skip] 2017-12-18 01:51:53 +00:00
prettyprint.rb
prime.rb
profile.rb
profiler.rb
pstore.rb
rdoc.rb Merge rdoc-6.0.1. 2017-12-23 23:33:09 +00:00
resolv-replace.rb
resolv.rb resolv.rb: remove rangerand 2018-03-06 03:31:46 +00:00
rss.rb
rubygems.rb Merge RubyGems 2.7.6 from upstream. 2018-02-16 08:08:06 +00:00
scanf.gemspec
scanf.rb lib/scanf.rb: [DOC] fix typos 2018-01-07 17:49:46 +00:00
securerandom.rb
set.rb Add a new #filter alias for #select 2018-02-25 13:52:07 +00:00
shell.rb
shellwords.rb
singleton.rb
sync.rb
tempfile.rb
thwait.rb
time.rb
timeout.rb
tmpdir.rb
tracer.rb
tsort.rb
un.rb
uri.rb Introduce URI::File to handle file URI scheme 2018-03-15 16:51:31 +00:00
weakref.rb
webrick.rb
yaml.rb Clarify the documentation of the YAML module [Misc #14567] 2018-03-02 12:56:37 +00:00