mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00

[ruby/csv] RDoc for converters (#157)

* More on RDoc for converters

* More on RDoc for converters

* Fix indent

Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
6044976160
Burdette Lamar 2020-07-15 15:37:17 -05:00 committed by Nobuyoshi Nakada
parent d7c42df0b1
commit d9749b4715
No known key found for this signature in database
GPG key ID: 7CD2805BFA3770C6
Notes: git 2020-07-20 03:35:32 +09:00
4 changed files with 477 additions and 217 deletions

@ -1,7 +1,7 @@
====== Option +write_converters+ ====== Option +write_converters+
Specifies the \Proc or \Array of Procs that are to be called Specifies converters to be used in generating fields.
for converting each output field. See {Write Converters}[#class-CSV-label-Write+Converters]
Default value: Default value:
CSV::DEFAULT_OPTIONS.fetch(:write_converters) # => nil CSV::DEFAULT_OPTIONS.fetch(:write_converters) # => nil
@ -11,21 +11,23 @@ With no write converter:
str # => "\"\na\n\",\tb\t, c \n" str # => "\"\na\n\",\tb\t, c \n"
With a write converter: With a write converter:
strip_converter = lambda {|field| field.strip } strip_converter = proc {|field| field.strip }
str = CSV.generate_line(["\na\n", "\tb\t", " c "], write_converters: strip_converter) str = CSV.generate_line(["\na\n", "\tb\t", " c "], write_converters: strip_converter)
str # => "a,b,c\n" str # => "a,b,c\n"
With two write converters (called in order): With two write converters (called in order):
upcase_converter = lambda {|field| field.upcase } upcase_converter = proc {|field| field.upcase }
downcase_converter = lambda {|field| field.downcase } downcase_converter = proc {|field| field.downcase }
write_converters = [upcase_converter, downcase_converter] write_converters = [upcase_converter, downcase_converter]
str = CSV.generate_line(['a', 'b', 'c'], write_converters: write_converters) str = CSV.generate_line(['a', 'b', 'c'], write_converters: write_converters)
str # => "a,b,c\n" str # => "a,b,c\n"
See also {Write Converters}[#class-CSV-label-Write+Converters]
--- ---
Raises an exception if the converter returns a value that is neither +nil+ Raises an exception if the converter returns a value that is neither +nil+
nor \String-convertible: nor \String-convertible:
bad_converter = lambda {|field| BasicObject.new } bad_converter = proc {|field| BasicObject.new }
# Raises NoMethodError (undefined method `is_a?' for #<BasicObject:>) # Raises NoMethodError (undefined method `is_a?' for #<BasicObject:>)
CSV.generate_line(['a', 'b', 'c'], write_converters: bad_converter) CSV.generate_line(['a', 'b', 'c'], write_converters: bad_converter)
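The strip example in this file assumes every field is a \String. A minimal sketch of a write converter that tolerates other field types follows; the name +safe_strip+ is hypothetical and not part of the library:

```ruby
require 'csv'

# Hypothetical converter name: strips only fields that respond to #strip,
# and passes every other field through unchanged.
safe_strip = proc {|field| field.respond_to?(:strip) ? field.strip : field }

line = CSV.generate_line([" a ", 1, "\tb\t"], write_converters: safe_strip)
line # => "a,1,b\n"
```

The same guard appears in the +write_converters+ example added to csv.rb in this commit.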

@ -1,41 +1,42 @@
====== Option +converters+ ====== Option +converters+
Specifies a single field converter name or \Proc, Specifies converters to be used in parsing fields.
or an \Array of field converter names and Procs.
See {Field Converters}[#class-CSV-label-Field+Converters] See {Field Converters}[#class-CSV-label-Field+Converters]
Default value: Default value:
CSV::DEFAULT_OPTIONS.fetch(:converters) # => nil CSV::DEFAULT_OPTIONS.fetch(:converters) # => nil
The value may be a single field converter name: The value may be a field converter name
(see {Stored Converters}[#class-CSV-label-Stored+Converters]):
str = '1,2,3' str = '1,2,3'
# Without a converter # Without a converter
ary = CSV.parse_line(str) array = CSV.parse_line(str)
ary # => ["1", "2", "3"] array # => ["1", "2", "3"]
# With built-in converter :integer # With built-in converter :integer
ary = CSV.parse_line(str, converters: :integer) array = CSV.parse_line(str, converters: :integer)
ary # => [1, 2, 3] array # => [1, 2, 3]
The value may be an \Array of field converter names: The value may be a converter list
(see {Converter Lists}[#class-CSV-label-Converter+Lists]):
str = '1,3.14159' str = '1,3.14159'
# Without converters # Without converters
ary = CSV.parse_line(str) array = CSV.parse_line(str)
ary # => ["1", "3.14159"] array # => ["1", "3.14159"]
# With built-in converters # With built-in converters
ary = CSV.parse_line(str, converters: [:integer, :float]) array = CSV.parse_line(str, converters: [:integer, :float])
ary # => [1, 3.14159] array # => [1, 3.14159]
The value may be a \Proc custom converter: The value may be a \Proc custom converter:
(see {Custom Field Converters}[#class-CSV-label-Custom+Field+Converters]):
str = ' foo , bar , baz ' str = ' foo , bar , baz '
# Without a converter # Without a converter
ary = CSV.parse_line(str) array = CSV.parse_line(str)
ary # => [" foo ", " bar ", " baz "] array # => [" foo ", " bar ", " baz "]
# With a custom converter # With a custom converter
ary = CSV.parse_line(str, converters: proc {|field| field.strip }) array = CSV.parse_line(str, converters: proc {|field| field.strip })
ary # => ["foo", "bar", "baz"] array # => ["foo", "bar", "baz"]
See also {Custom Converters}[#class-CSV-label-Custom+Converters] See also {Custom Field Converters}[#class-CSV-label-Custom+Field+Converters]
--- ---
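A custom field converter may also take a second argument, a CSV::FieldInfo. As a sketch under that assumption, the converter below (the name +price_converter+ is made up for illustration) converts only the field at index 1:

```ruby
require 'csv'

# Hypothetical converter: converts only the second column (index 1) to Float;
# all other fields are returned unchanged.
price_converter = proc do |field, field_info|
  field_info.index == 1 ? Float(field) : field
end

row = CSV.parse_line('widget,9.50,3', converters: price_converter)
row # => ["widget", 9.5, "3"]
```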

@ -1,6 +1,7 @@
====== Option +header_converters+ ====== Option +header_converters+
Specifies a \String converter name or an \Array of converter names. Specifies converters to be used in parsing headers.
See {Header Converters}[#class-CSV-label-Header+Converters]
Default value: Default value:
CSV::DEFAULT_OPTIONS.fetch(:header_converters) # => nil CSV::DEFAULT_OPTIONS.fetch(:header_converters) # => nil
@ -10,22 +11,33 @@ except that:
- The converters apply only to the header row. - The converters apply only to the header row.
- The built-in header converters are +:downcase+ and +:symbol+. - The built-in header converters are +:downcase+ and +:symbol+.
Examples: This section assumes prior execution of:
str = <<-EOT str = <<-EOT
Name,Value
foo,0 foo,0
bar,1 bar,1
baz,2 baz,2
EOT EOT
headers = ['Name', 'Value']
# With no header converter # With no header converter
csv = CSV.parse(str, headers: headers) table = CSV.parse(str, headers: true)
csv.headers # => ["Name", "Value"] table.headers # => ["Name", "Value"]
# With header converter :downcase
csv = CSV.parse(str, headers: headers, header_converters: :downcase) The value may be a header converter name
csv.headers # => ["name", "value"] (see {Stored Converters}[#class-CSV-label-Stored+Converters]):
# With header converter :symbol table = CSV.parse(str, headers: true, header_converters: :downcase)
csv = CSV.parse(str, headers: headers, header_converters: :symbol) table.headers # => ["name", "value"]
csv.headers # => [:name, :value]
# With both The value may be a converter list
csv = CSV.parse(str, headers: headers, header_converters: [:downcase, :symbol]) (see {Converter Lists}[#class-CSV-label-Converter+Lists]):
csv.headers # => [:name, :value] header_converters = [:downcase, :symbol]
table = CSV.parse(str, headers: true, header_converters: header_converters)
table.headers # => [:name, :value]
The value may be a \Proc custom converter
(see {Custom Header Converters}[#class-CSV-label-Custom+Header+Converters]):
upcase_converter = proc {|field| field.upcase }
table = CSV.parse(str, headers: true, header_converters: upcase_converter)
table.headers # => ["NAME", "VALUE"]
See also {Custom Header Converters}[#class-CSV-label-Custom+Header+Converters]
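As a further sketch, a custom header converter can normalize multi-word headers; +snake_case_converter+ is a hypothetical name, and the converted headers are then the keys used for row lookup:

```ruby
require 'csv'

# Hypothetical converter: trims, downcases, and replaces runs of
# whitespace with underscores.
snake_case_converter = proc {|header| header.strip.downcase.gsub(/\s+/, '_') }

data = "First Name,Last Name\nAda,Lovelace\n"
table = CSV.parse(data, headers: true, header_converters: snake_case_converter)
table.headers           # => ["first_name", "last_name"]
table[0]['first_name']  # => "Ada"
```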

@ -34,7 +34,7 @@
# I'm sure I'll miss something, but I'll try to mention most of the major # I'm sure I'll miss something, but I'll try to mention most of the major
# differences I am aware of, to help others quickly get up to speed: # differences I am aware of, to help others quickly get up to speed:
# #
# === CSV Parsing # === \CSV Parsing
# #
# * This parser is m17n aware. See CSV for full details. # * This parser is m17n aware. See CSV for full details.
# * This library has a stricter parser and will throw MalformedCSVErrors on # * This library has a stricter parser and will throw MalformedCSVErrors on
@ -440,54 +440,188 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
# data = CSV.parse('Bob,Engineering,1000', headers: %i[name department salary]) # data = CSV.parse('Bob,Engineering,1000', headers: %i[name department salary])
# data.first #=> #<CSV::Row name:"Bob" department:"Engineering" salary:"1000"> # data.first #=> #<CSV::Row name:"Bob" department:"Engineering" salary:"1000">
# #
# === \CSV \Converters # === \Converters
# #
# By default, each field parsed by \CSV is formed into a \String. # By default, each value (field or header) parsed by \CSV is formed into a \String.
# You can use a _converter_ to convert certain fields into other Ruby objects. # You can use a _field_ _converter_ or _header_ _converter_
# to intercept and modify the parsed values:
# - See {Field Converters}[#class-CSV-label-Field+Converters].
# - See {Header Converters}[#class-CSV-label-Header+Converters].
# #
# When you specify a converter for parsing, # Also by default, each value to be written during generation is written 'as-is'.
# each parsed field is passed to the converter; # You can use a _write_ _converter_ to modify values before writing.
# its return value becomes the new value for the field. # - See {Write Converters}[#class-CSV-label-Write+Converters].
#
# ==== Specifying \Converters
#
# You can specify converters for parsing or generating in the +options+
# argument to various \CSV methods:
# - Option +converters+ for converting parsed field values.
# - Option +header_converters+ for converting parsed header values.
# - Option +write_converters+ for converting values to be written (generated).
#
# There are three forms for specifying converters:
# - A converter proc: executable code to be used for conversion.
# - A converter name: the name of a stored converter.
# - A converter list: an array of converter procs, converter names, and converter lists.
#
# ===== Converter Procs
#
# This converter proc, +strip_converter+, accepts a value +field+
# and returns <tt>field.strip</tt>:
# strip_converter = proc {|field| field.strip }
# In this call to <tt>CSV.parse</tt>,
# the keyword argument <tt>converters: strip_converter</tt>
# specifies that:
# - \Proc +strip_converter+ is to be called for each parsed field.
# - The converter's return value is to replace the +field+ value.
# Example:
# string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# array = CSV.parse(string, converters: strip_converter)
# array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# A converter proc can receive a second argument, +field_info+,
# that contains details about the field.
# This modified +strip_converter+ displays its arguments:
# strip_converter = proc do |field, field_info|
# p [field, field_info]
# field.strip
# end
# string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# array = CSV.parse(string, converters: strip_converter)
# array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
# Output:
# [" foo ", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
# [" 0 ", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
# [" bar ", #<struct CSV::FieldInfo index=0, line=2, header=nil>]
# [" 1 ", #<struct CSV::FieldInfo index=1, line=2, header=nil>]
# [" baz ", #<struct CSV::FieldInfo index=0, line=3, header=nil>]
# [" 2 ", #<struct CSV::FieldInfo index=1, line=3, header=nil>]
# Each CSV::FieldInfo object shows:
# - The 0-based field index.
# - The 1-based line index.
# - The field header, if any.
#
# ===== Stored \Converters
#
# A converter may be given a name and stored in a structure where
# the parsing methods can find it by name.
#
# The storage structure for field converters is the \Hash CSV::Converters.
# It has several built-in converter procs:
# - <tt>:integer</tt>: converts each \String-embedded integer into a true \Integer.
# - <tt>:float</tt>: converts each \String-embedded float into a true \Float.
# - <tt>:date</tt>: converts each \String-embedded date into a true \Date.
# - <tt>:date_time</tt>: converts each \String-embedded date-time into a true \DateTime.
# This example creates a converter proc, then stores it:
# strip_converter = proc {|field| field.strip }
# CSV::Converters[:strip] = strip_converter
# Then the parsing method call can refer to the converter
# by its name, <tt>:strip</tt>:
# string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# array = CSV.parse(string, converters: :strip)
# array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# The storage structure for header converters is the \Hash CSV::HeaderConverters,
# which works in the same way.
# It also has built-in converter procs:
# - <tt>:downcase</tt>: Downcases each header.
# - <tt>:symbol</tt>: Converts each header to a \Symbol.
#
# There is no such storage structure for write converters.
#
# ===== Converter Lists
#
# A _converter_ _list_ is an \Array that may include any assortment of:
# - Converter procs.
# - Names of stored converters.
# - Nested converter lists.
#
# Examples:
# numeric_converters = [:integer, :float]
# date_converters = [:date, :date_time]
# [numeric_converters, strip_converter]
# [strip_converter, date_converters, :float]
#
# Like a converter proc, a converter list may be named and stored in either
# \CSV::Converters or CSV::HeaderConverters:
# CSV::Converters[:custom] = [strip_converter, date_converters, :float]
# CSV::HeaderConverters[:custom] = [:downcase, :symbol]
#
# There are two built-in converter lists:
# CSV::Converters[:numeric] # => [:integer, :float]
# CSV::Converters[:all] # => [:date_time, :numeric]
#
# ==== Field \Converters
#
# With no conversion, all parsed fields in all rows become Strings:
# string = "foo,0\nbar,1\nbaz,2\n"
# ary = CSV.parse(string)
# ary # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
#
# When you specify a field converter, each parsed field is passed to the converter;
# its return value becomes the stored value for the field.
# A converter might, for example, convert an integer embedded in a \String # A converter might, for example, convert an integer embedded in a \String
# into a true \Integer. # into a true \Integer.
# (In fact, that's what built-in field converter +:integer+ does.) # (In fact, that's what built-in field converter +:integer+ does.)
# #
# There are additional built-in \converters, and custom \converters are also supported. # There are three ways to use field \converters.
# #
# All \converters try to transcode fields to UTF-8 before converting. # - Using option {converters}[#class-CSV-label-Option+converters] with a parsing method:
# The conversion will fail if the data cannot be transcoded, leaving the field unchanged. # ary = CSV.parse(string, converters: :integer)
# ary # => [["foo", 0], ["bar", 1], ["baz", 2]]
# - Using option {converters}[#class-CSV-label-Option+converters] with a new \CSV instance:
# csv = CSV.new(string, converters: :integer)
# # Field converters in effect:
# csv.converters # => [:integer]
# csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
# - Using method #convert to add a field converter to a \CSV instance:
# csv = CSV.new(string)
# # Add a converter.
# csv.convert(:integer)
# csv.converters # => [:integer]
# csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
# #
# ==== Field \Converters # Installing a field converter does not affect already-read rows:
# # csv = CSV.new(string)
# There are three ways to use field \converters; # csv.shift # => ["foo", "0"]
# these examples use built-in field converter +:integer+,
# which converts each parsed integer string to a true \Integer.
#
# Option +converters+ with a singleton parsing method:
# ary = CSV.parse_line('0,1,2', converters: :integer)
# ary # => [0, 1, 2]
#
# Option +converters+ with a new \CSV instance:
# csv = CSV.new('0,1,2', converters: :integer)
# # Field converters in effect:
# csv.converters # => [:integer]
# csv.shift # => [0, 1, 2]
#
# Method #convert adds a field converter to a \CSV instance:
# csv = CSV.new('0,1,2')
# # Add a converter. # # Add a converter.
# csv.convert(:integer) # csv.convert(:integer)
# csv.converters # => [:integer] # csv.converters # => [:integer]
# csv.shift # => [0, 1, 2] # csv.read # => [["bar", 1], ["baz", 2]]
# #
# --- # There are additional built-in \converters, and custom \converters are also supported.
# #
# The built-in field \converters are in \Hash CSV::Converters. # ===== Built-In Field \Converters
# The \Symbol keys there are the names of the \converters:
# #
# CSV::Converters.keys # => [:integer, :float, :numeric, :date, :date_time, :all] # The built-in field converters are in \Hash CSV::Converters:
# - Each key is a field converter name.
# - Each value is one of:
# - A \Proc field converter.
# - An \Array of field converter names.
# #
# Converter +:integer+ converts each field that +Integer()+ accepts: # Display:
# CSV::Converters.each_pair do |name, value|
# if value.kind_of?(Proc)
# p [name, value.class]
# else
# p [name, value]
# end
# end
# Output:
# [:integer, Proc]
# [:float, Proc]
# [:numeric, [:integer, :float]]
# [:date, Proc]
# [:date_time, Proc]
# [:all, [:date_time, :numeric]]
#
# Each of these converters transcodes values to UTF-8 before attempting conversion.
# If a value cannot be transcoded to UTF-8 the conversion will
# fail and the value will remain unconverted.
#
# Converter +:integer+ converts each field that Integer() accepts:
# data = '0,1,2,x' # data = '0,1,2,x'
# # Without the converter # # Without the converter
# csv = CSV.parse_line(data) # csv = CSV.parse_line(data)
@ -496,7 +630,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
# csv = CSV.parse_line(data, converters: :integer) # csv = CSV.parse_line(data, converters: :integer)
# csv # => [0, 1, 2, "x"] # csv # => [0, 1, 2, "x"]
# #
# Converter +:float+ converts each field that +Float()+ accepts: # Converter +:float+ converts each field that Float() accepts:
# data = '1.0,3.14159,x' # data = '1.0,3.14159,x'
# # Without the converter # # Without the converter
# csv = CSV.parse_line(data) # csv = CSV.parse_line(data)
@ -507,7 +641,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
# #
# Converter +:numeric+ converts with both +:integer+ and +:float+. # Converter +:numeric+ converts with both +:integer+ and +:float+.
# #
# Converter +:date+ converts each field that +Date::parse()+ accepts: # Converter +:date+ converts each field that Date::parse accepts:
# data = '2001-02-03,x' # data = '2001-02-03,x'
# # Without the converter # # Without the converter
# csv = CSV.parse_line(data) # csv = CSV.parse_line(data)
@ -516,7 +650,7 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
# csv = CSV.parse_line(data, converters: :date) # csv = CSV.parse_line(data, converters: :date)
# csv # => [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, "x"] # csv # => [#<Date: 2001-02-03 ((2451944j,0s,0n),+0s,2299161j)>, "x"]
# #
# Converter +:date_time+ converts each field that +DateTime::parse() accepts: # Converter +:date_time+ converts each field that DateTime::parse accepts:
# data = '2020-05-07T14:59:00-05:00,x' # data = '2020-05-07T14:59:00-05:00,x'
# # Without the converter # # Without the converter
# csv = CSV.parse_line(data) # csv = CSV.parse_line(data)
@ -536,17 +670,16 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
# csv.convert(:date) # csv.convert(:date)
# csv.converters # => [:integer, :date] # csv.converters # => [:integer, :date]
# #
# You can add a custom field converter to \Hash CSV::Converters: # ===== Custom Field \Converters
# strip_converter = proc {|field| field.strip} #
# You can define a custom field converter:
# strip_converter = proc {|field| field.strip }
# Add it to the \Converters \Hash:
# CSV::Converters[:strip] = strip_converter # CSV::Converters[:strip] = strip_converter
# CSV::Converters.keys # => [:integer, :float, :numeric, :date, :date_time, :all, :strip] # Use it by name:
# # string = " foo , 0 \n bar , 1 \n baz , 2 \n"
# Then use it to convert fields: # array = CSV.parse(string, converters: :strip)
# str = ' foo , 0 ' # array # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
# ary = CSV.parse_line(str, converters: :strip)
# ary # => ["foo", "0"]
#
# See {Custom Converters}[#class-CSV-label-Custom+Converters].
# #
# ==== Header \Converters # ==== Header \Converters
# #
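The stored-converter and converter-list ideas in this hunk can be combined. A minimal sketch, assuming the names +:strip+ and +:strip_numeric+ (both hypothetical) are free in CSV::Converters:

```ruby
require 'csv'

# Store a custom converter proc under a name (hypothetical name :strip).
CSV::Converters[:strip] = proc {|field| field.strip }
# Store a converter list that mixes the custom name with built-in :numeric.
CSV::Converters[:strip_numeric] = [:strip, :numeric]

string = " foo , 0 \n bar , 1.5 \n"
rows = CSV.parse(string, converters: :strip_numeric)
rows # => [["foo", 0], ["bar", 1.5]]
```

Because CSV::Converters is shared by all \CSV objects, a stored name stays available for the rest of the process.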
@ -556,43 +689,42 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
# these examples use built-in header converter +:downcase+, # these examples use built-in header converter +:downcase+,
# which downcases each parsed header. # which downcases each parsed header.
# #
# Option +header_converters+ with a singleton parsing method: # - Option +header_converters+ with a singleton parsing method:
# str = "Name,Count\nFoo,0\n,Bar,1\nBaz,2" # string = "Name,Count\nFoo,0\n,Bar,1\nBaz,2"
# tbl = CSV.parse(str, headers: true, header_converters: :downcase) # tbl = CSV.parse(string, headers: true, header_converters: :downcase)
# tbl.class # => CSV::Table # tbl.class # => CSV::Table
# tbl.headers # => ["name", "count"] # tbl.headers # => ["name", "count"]
# #
# Option +header_converters+ with a new \CSV instance: # - Option +header_converters+ with a new \CSV instance:
# csv = CSV.new(str, header_converters: :downcase) # csv = CSV.new(string, header_converters: :downcase)
# # Header converters in effect: # # Header converters in effect:
# csv.header_converters # => [:downcase] # csv.header_converters # => [:downcase]
# tbl = CSV.parse(str, headers: true) # tbl = CSV.parse(string, headers: true)
# tbl.headers # => ["Name", "Count"] # tbl.headers # => ["Name", "Count"]
# #
# Method #header_convert adds a header converter to a \CSV instance: # - Method #header_convert adds a header converter to a \CSV instance:
# csv = CSV.new(str) # csv = CSV.new(string)
# # Add a header converter. # # Add a header converter.
# csv.header_convert(:downcase) # csv.header_convert(:downcase)
# csv.header_converters # => [:downcase] # csv.header_converters # => [:downcase]
# tbl = CSV.parse(str, headers: true) # tbl = CSV.parse(string, headers: true)
# tbl.headers # => ["Name", "Count"] # tbl.headers # => ["Name", "Count"]
# #
# --- # ===== Built-In Header \Converters
#
# The built-in header \converters are in \Hash CSV::Converters.
# The \Symbol keys there are the names of the \converters:
# #
# The built-in header \converters are in \Hash CSV::HeaderConverters.
# The keys there are the names of the \converters:
# CSV::HeaderConverters.keys # => [:downcase, :symbol] # CSV::HeaderConverters.keys # => [:downcase, :symbol]
# #
# Converter +:downcase+ converts each header by downcasing it: # Converter +:downcase+ converts each header by downcasing it:
# str = "Name,Count\nFoo,0\n,Bar,1\nBaz,2" # string = "Name,Count\nFoo,0\n,Bar,1\nBaz,2"
# tbl = CSV.parse(str, headers: true, header_converters: :downcase) # tbl = CSV.parse(string, headers: true, header_converters: :downcase)
# tbl.class # => CSV::Table # tbl.class # => CSV::Table
# tbl.headers # => ["name", "count"] # tbl.headers # => ["name", "count"]
# #
# Converter +:symbol+ by making it into a \Symbol: # Converter +:symbol+ converts each header by making it into a \Symbol:
# str = "Name,Count\nFoo,0\n,Bar,1\nBaz,2" # string = "Name,Count\nFoo,0\n,Bar,1\nBaz,2"
# tbl = CSV.parse(str, headers: true, header_converters: :symbol) # tbl = CSV.parse(string, headers: true, header_converters: :symbol)
# tbl.headers # => [:name, :count] # tbl.headers # => [:name, :count]
# Details: # Details:
# - Strips leading and trailing whitespace. # - Strips leading and trailing whitespace.
@ -601,46 +733,44 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
# - Removes non-word characters. # - Removes non-word characters.
# - Makes the string into a \Symbol. # - Makes the string into a \Symbol.
# #
# You can add a custom header converter to \Hash CSV::HeaderConverters: # ===== Custom Header \Converters
# strip_converter = proc {|field| field.strip}
# CSV::HeaderConverters[:strip] = strip_converter
# CSV::HeaderConverters.keys # => [:downcase, :symbol, :strip]
# #
# Then use it to convert headers: # You can define a custom header converter:
# str = " Name , Value \nfoo,0\nbar,1\nbaz,2" # upcase_converter = proc {|header| header.upcase }
# tbl = CSV.parse(str, headers: true, header_converters: :strip) # Add it to the \HeaderConverters \Hash:
# tbl.headers # => ["Name", "Value"] # CSV::HeaderConverters[:upcase] = upcase_converter
# Use it by name:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(string, headers: true, header_converters: :upcase)
# table # => #<CSV::Table mode:col_or_row row_count:4>
# table.headers # => ["NAME", "VALUE"]
# #
# See {Custom Converters}[#class-CSV-label-Custom+Converters]. # ===== Write \Converters
# #
# ==== Custom \Converters # When you specify a write converter for generating \CSV,
# each field to be written is passed to the converter;
# its return value becomes the new value for the field.
# A converter might, for example, strip whitespace from a field.
# #
# You can define custom \converters. # - Using no write converter (all fields unmodified):
# output_string = CSV.generate do |csv|
# csv << [' foo ', 0]
# csv << [' bar ', 1]
# csv << [' baz ', 2]
# end
# output_string # => " foo ,0\n bar ,1\n baz ,2\n"
# - Using option +write_converters+:
# strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field }
# upcase_converter = proc {|field| field.respond_to?(:upcase) ? field.upcase : field }
# converters = [strip_converter, upcase_converter]
# output_string = CSV.generate(write_converters: converters) do |csv|
# csv << [' foo ', 0]
# csv << [' bar ', 1]
# csv << [' baz ', 2]
# end
# output_string # => "FOO,0\nBAR,1\nBAZ,2\n"
# #
# The \converter is a \Proc that is called with two arguments, # === Character Encodings (M17n or Multilingualization)
# \String +field+ and CSV::FieldInfo +field_info+;
# it returns a \String that will become the field value:
# converter = proc {|field, field_info| <some_string> }
#
# To illustrate:
# converter = proc {|field, field_info| p [field, field_info]; field}
# ary = CSV.parse_line('foo,0', converters: converter)
#
# Produces:
# ["foo", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
# ["0", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
#
# In each of the output lines:
# - The first \Array element is the passed \String field.
# - The second is a \FieldInfo structure containing information about the field:
# - The 0-based column index.
# - The 1-based line number.
# - The header for the column, if available.
#
# If the \converter does not need +field_info+, it can be omitted:
# converter = proc {|field| ... }
#
# === CSV and Character Encodings (M17n or Multilingualization)
# #
# This new CSV parser is m17n savvy. The parser works in the Encoding of the IO # This new CSV parser is m17n savvy. The parser works in the Encoding of the IO
# or String object being read from or written to. Your data is never transcoded # or String object being read from or written to. Your data is never transcoded
@ -721,30 +851,12 @@ class CSV
# The encoding used by all converters. # The encoding used by all converters.
ConverterEncoding = Encoding.find("UTF-8") ConverterEncoding = Encoding.find("UTF-8")
# A \Hash containing the names and \Procs for the built-in field converters.
# See {Built-In Field Converters}[#class-CSV-label-Built-In+Field+Converters].
# #
# This Hash holds the built-in converters of CSV that can be accessed by name. # This \Hash is intentionally left unfrozen, and may be extended with
# You can select Converters with CSV.convert() or through the +options+ Hash # custom field converters.
# passed to CSV::new(). # See {Custom Field Converters}[#class-CSV-label-Custom+Field+Converters].
#
# <b><tt>:integer</tt></b>:: Converts any field Integer() accepts.
# <b><tt>:float</tt></b>:: Converts any field Float() accepts.
# <b><tt>:numeric</tt></b>:: A combination of <tt>:integer</tt>
# and <tt>:float</tt>.
# <b><tt>:date</tt></b>:: Converts any field Date::parse() accepts.
# <b><tt>:date_time</tt></b>:: Converts any field DateTime::parse() accepts.
# <b><tt>:all</tt></b>:: All built-in converters. A combination of
# <tt>:date_time</tt> and <tt>:numeric</tt>.
#
# All built-in converters transcode field data to UTF-8 before attempting a
# conversion. If your data cannot be transcoded to UTF-8 the conversion will
# fail and the field will remain unchanged.
#
# This Hash is intentionally left unfrozen and users should feel free to add
# values to it that can be accessed by all CSV objects.
#
# To add a combo field, the value should be an Array of names. Combo fields
# can be nested with other combo fields.
#
Converters = { Converters = {
integer: lambda { |f| integer: lambda { |f|
Integer(f.encode(ConverterEncoding)) rescue f Integer(f.encode(ConverterEncoding)) rescue f
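The built-in converters follow a "try the conversion, otherwise return the field unchanged" pattern. A custom converter can follow the same pattern; this sketch uses a hypothetical +rational_converter+ and, unlike the built-ins, omits the UTF-8 transcoding step:

```ruby
require 'csv'

# Hypothetical converter: returns a Rational when Kernel#Rational accepts
# the field, and the original String otherwise.
rational_converter = proc do |field|
  begin
    Rational(field)
  rescue ArgumentError
    field
  end
end

row = CSV.parse_line('1/3,x', converters: rational_converter)
row # => [(1/3), "x"]
```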
@ -772,27 +884,12 @@ class CSV
all: [:date_time, :numeric], all: [:date_time, :numeric],
} }
# A \Hash containing the names and \Procs for the built-in header converters.
# See {Built-In Header Converters}[#class-CSV-label-Built-In+Header+Converters].
# #
# This Hash holds the built-in header converters of CSV that can be accessed # This \Hash is intentionally left unfrozen, and may be extended with
# by name. You can select HeaderConverters with CSV.header_convert() or # custom header converters.
# through the +options+ Hash passed to CSV::new(). # See {Custom Header Converters}[#class-CSV-label-Custom+Header+Converters].
#
# <b><tt>:downcase</tt></b>:: Calls downcase() on the header String.
# <b><tt>:symbol</tt></b>:: Leading/trailing spaces are dropped, string is
# downcased, remaining spaces are replaced with
# underscores, non-word characters are dropped,
# and finally to_sym() is called.
#
# All built-in header converters transcode header data to UTF-8 before
# attempting a conversion. If your data cannot be transcoded to UTF-8 the
# conversion will fail and the header will remain unchanged.
#
# This Hash is intentionally left unfrozen and users should feel free to add
# values to it that can be accessed by all CSV objects.
#
# To add a combo field, the value should be an Array of names. Combo fields
# can be nested with other combo fields.
#
HeaderConverters = { HeaderConverters = {
downcase: lambda { |h| h.encode(ConverterEncoding).downcase }, downcase: lambda { |h| h.encode(ConverterEncoding).downcase },
symbol: lambda { |h| symbol: lambda { |h|
@ -1726,9 +1823,14 @@ class CSV
# :call-seq: # :call-seq:
# csv.converters -> array # csv.converters -> array
# #
# Returns an \Array containing field converters; used for parsing; # Returns an \Array containing field converters;
# see {Option +converters+}[#class-CSV-label-Option+converters]: # see {Field Converters}[#class-CSV-label-Field+Converters]:
# CSV.new('').converters # => [] # csv = CSV.new('')
# csv.converters # => []
# csv.convert(:integer)
# csv.converters # => [:integer]
# csv.convert(proc {|x| x.to_s })
# csv.converters
def converters def converters
parser_fields_converter.map do |converter| parser_fields_converter.map do |converter|
name = Converters.rassoc(converter) name = Converters.rassoc(converter)
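The lookup with <tt>Converters.rassoc</tt> means that a stored converter is reported by its name, while any other converter is reported as the \Proc itself. A short sketch of that behavior, using the block form of #convert:

```ruby
require 'csv'

csv = CSV.new("1,x\n")
csv.convert(:integer)              # stored converter: listed by name
csv.convert {|field| field.to_s }  # block converter: listed as a Proc

listed = csv.converters
listed.first       # => :integer
listed.last.class  # => Proc
```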
@ -1789,7 +1891,7 @@ class CSV
# csv.header_converters -> array # csv.header_converters -> array
# #
# Returns an \Array containing header converters; used for parsing; # Returns an \Array containing header converters; used for parsing;
# see {Option +header_converters+}[#class-CSV-label-Option+header_converters]: # see {Header Converters}[#class-CSV-label-Header+Converters]:
# CSV.new('').header_converters # => [] # CSV.new('').header_converters # => []
def header_converters def header_converters
header_fields_converter.map do |converter| header_fields_converter.map do |converter|
@ -1833,7 +1935,7 @@ class CSV
# csv.encoding -> encoding # csv.encoding -> encoding
# #
# Returns the encoding used for parsing and generating; # Returns the encoding used for parsing and generating;
# see {CSV and Character Encodings (M17n or Multilingualization)}[#class-CSV-label-CSV+and+Character+Encodings+-28M17n+or+Multilingualization-29]: # see {Character Encodings (M17n or Multilingualization)}[#class-CSV-label-Character+Encodings+-28M17n+or+Multilingualization-29]:
# CSV.new('').encoding # => #<Encoding:UTF-8> # CSV.new('').encoding # => #<Encoding:UTF-8>
attr_reader :encoding attr_reader :encoding
@ -1965,13 +2067,56 @@ class CSV
### End Delegation ### ### End Delegation ###
# :call-seq:
# csv.<< row
# #
# The primary write method for wrapped Strings and IOs, +row+ (an Array or # Appends a row to +self+.
# CSV::Row) is converted to CSV and appended to the data source. When a
# CSV::Row is passed, only the row's fields() are appended to the output.
# #
# The data source must be open for writing. # - Argument +row+ must be an \Array object or a CSV::Row object.
# - The output stream must be open for writing.
# #
# ---
#
# Append Arrays:
# CSV.generate do |csv|
# csv << ['foo', 0]
# csv << ['bar', 1]
# csv << ['baz', 2]
# end # => "foo,0\nbar,1\nbaz,2\n"
#
# Append CSV::Rows:
# headers = []
# CSV.generate do |csv|
# csv << CSV::Row.new(headers, ['foo', 0])
# csv << CSV::Row.new(headers, ['bar', 1])
# csv << CSV::Row.new(headers, ['baz', 2])
# end # => "foo,0\nbar,1\nbaz,2\n"
#
# Headers in CSV::Row objects are not appended:
# headers = ['Name', 'Count']
# CSV.generate do |csv|
# csv << CSV::Row.new(headers, ['foo', 0])
# csv << CSV::Row.new(headers, ['bar', 1])
# csv << CSV::Row.new(headers, ['baz', 2])
# end # => "foo,0\nbar,1\nbaz,2\n"
#
# ---
#
# Raises an exception if +row+ is not an \Array or \CSV::Row:
# CSV.generate do |csv|
# # Raises NoMethodError (undefined method `collect' for :foo:Symbol)
# csv << :foo
# end
#
# Raises an exception if the output stream is not open for writing:
# path = 't.csv'
# File.write(path, '')
# File.open(path) do |file|
# CSV.open(file) do |csv|
# # Raises IOError (not opened for writing)
# csv << ['foo', 0]
# end
# end
def <<(row) def <<(row)
writer << row writer << row
self self
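The new RDoc for `<<` can be checked with a short script. This is a minimal sketch, assuming the `csv` library is loadable with `require 'csv'`; the variable names `from_arrays` and `from_rows` are illustrative and do not come from the commit. It relies on `<<` returning `self` (as the method body in this hunk shows), which allows chaining.

```ruby
require 'csv'

# Append plain Arrays. Because << returns the CSV object itself,
# several rows can be chained in one expression.
from_arrays = CSV.generate do |csv|
  csv << ['foo', 0] << ['bar', 1]
end

# Append CSV::Row objects. Only the row's fields are written;
# the headers carried by the row are not emitted.
headers = ['Name', 'Count']
from_rows = CSV.generate do |csv|
  csv << CSV::Row.new(headers, ['foo', 0])
  csv << CSV::Row.new(headers, ['bar', 1])
end

puts from_arrays
puts from_rows
```

Both strings should be identical, since the headers of a `CSV::Row` play no part in the generated output.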
@ -1979,36 +2124,136 @@ class CSV
alias_method :add_row, :<< alias_method :add_row, :<<
alias_method :puts, :<< alias_method :puts, :<<
#
# :call-seq: # :call-seq:
# convert( name ) # convert(converter_name) -> array_of_procs
# convert { |field| ... } # convert {|field, field_info| ... } -> array_of_procs
# convert { |field, field_info| ... }
# #
# You can use this method to install a CSV::Converters built-in, or provide a # - With no block, installs a field converter (a \Proc).
# block that handles a custom conversion. # - With a block, defines and installs a custom field converter.
# - Returns the \Array of installed field converters.
# #
# If you provide a block that takes one argument, it will be passed the field # - Argument +converter_name+, if given, should be the name
# and is expected to return the converted value or the field itself. If your # of an existing field converter.
# block takes two arguments, it will also be passed a CSV::FieldInfo Struct,
# containing details about the field. Again, the block should return a
# converted field or the field itself.
# #
# See {Field Converters}[#class-CSV-label-Field+Converters].
# ---
#
# With no block, installs a field converter:
# csv = CSV.new('')
# csv.convert(:integer)
# csv.convert(:float)
# csv.convert(:date)
# csv.converters # => [:integer, :float, :date]
#
# ---
#
# The block, if given, is called for each field:
# - Argument +field+ is the field value.
# - Argument +field_info+ is a CSV::FieldInfo object
# containing details about the field.
#
# The examples here assume the prior execution of:
# string = "foo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# Example giving a block:
# csv = CSV.open(path)
# csv.convert {|field, field_info| p [field, field_info]; field.upcase }
# csv.read # => [["FOO", "0"], ["BAR", "1"], ["BAZ", "2"]]
#
# Output:
# ["foo", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
# ["0", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
# ["bar", #<struct CSV::FieldInfo index=0, line=2, header=nil>]
# ["1", #<struct CSV::FieldInfo index=1, line=2, header=nil>]
# ["baz", #<struct CSV::FieldInfo index=0, line=3, header=nil>]
# ["2", #<struct CSV::FieldInfo index=1, line=3, header=nil>]
#
# The block need not return a \String object:
# csv = CSV.open(path)
# csv.convert {|field, field_info| field.to_sym }
# csv.read # => [[:foo, :"0"], [:bar, :"1"], [:baz, :"2"]]
#
# If +converter_name+ is given, the block is not called:
# csv = CSV.open(path)
# csv.convert(:integer) {|field, field_info| fail 'Cannot happen' }
# csv.read # => [["foo", 0], ["bar", 1], ["baz", 2]]
#
# ---
#
# Raises a parse-time exception if +converter_name+ is not the name of a built-in
# field converter:
# csv = CSV.open(path)
# csv.convert(:nosuch) # => [nil]
# # Raises NoMethodError (undefined method `arity' for nil:NilClass)
# csv.read
def convert(name = nil, &converter) def convert(name = nil, &converter)
parser_fields_converter.add_converter(name, &converter) parser_fields_converter.add_converter(name, &converter)
end end
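The two calling styles of `convert` described in the RDoc above can be compared side by side. This is a hedged sketch, not part of the commit: it parses from a String with `CSV.new` instead of the `t.csv` file used in the RDoc examples, and the names `by_name` and `by_block` are illustrative.

```ruby
require 'csv'

string = "foo,0\nbar,1\nbaz,2\n"

# Install a built-in converter by name; fields that parse as
# integers are converted, all other fields are left as Strings.
csv = CSV.new(string)
csv.convert(:integer)
by_name = csv.read

# Install a custom converter as a block. With two block parameters,
# the second is a CSV::FieldInfo describing the field's position.
csv = CSV.new(string)
csv.convert do |field, field_info|
  field_info.index == 0 ? field.upcase : field
end
by_block = csv.read

p by_name   # [["foo", 0], ["bar", 1], ["baz", 2]]
p by_block  # [["FOO", "0"], ["BAR", "1"], ["BAZ", "2"]]
```

The block form upcases only the first column here, which shows one use of `field_info.index`.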
#
# :call-seq: # :call-seq:
# header_convert( name ) # header_convert(converter_name) -> array_of_procs
# header_convert { |field| ... } # header_convert {|header, field_info| ... } -> array_of_procs
# header_convert { |field, field_info| ... }
# #
# Identical to CSV#convert(), but for header rows. # - With no block, installs a header converter (a \Proc).
# - With a block, defines and installs a custom header converter.
# - Returns the \Array of installed header converters.
# #
# Note that this method must be called before header rows are read to have any # - Argument +converter_name+, if given, should be the name
# effect. # of an existing header converter.
# #
# See {Header Converters}[#class-CSV-label-Header+Converters].
# ---
#
# With no block, installs a header converter:
# csv = CSV.new('')
# csv.header_convert(:symbol)
# csv.header_convert(:downcase)
# csv.header_converters # => [:symbol, :downcase]
#
# ---
#
# The block, if given, is called for each header:
# - Argument +header+ is the header value.
# - Argument +field_info+ is a CSV::FieldInfo object
# containing details about the header.
#
# The examples here assume the prior execution of:
# string = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# path = 't.csv'
# File.write(path, string)
#
# Example giving a block:
# csv = CSV.open(path, headers: true)
# csv.header_convert {|header, field_info| p [header, field_info]; header.upcase }
# table = csv.read
# table # => #<CSV::Table mode:col_or_row row_count:4>
# table.headers # => ["NAME", "VALUE"]
#
# Output:
# ["Name", #<struct CSV::FieldInfo index=0, line=1, header=nil>]
# ["Value", #<struct CSV::FieldInfo index=1, line=1, header=nil>]
# The block need not return a \String object:
# csv = CSV.open(path, headers: true)
# csv.header_convert {|header, field_info| header.to_sym }
# table = csv.read
# table.headers # => [:Name, :Value]
#
# If +converter_name+ is given, the block is not called:
# csv = CSV.open(path, headers: true)
# csv.header_convert(:downcase) {|header, field_info| fail 'Cannot happen' }
# table = csv.read
# table.headers # => ["name", "value"]
# ---
#
# Raises a parse-time exception if +converter_name+ is not the name of a built-in
# header converter:
# csv = CSV.open(path, headers: true)
# csv.header_convert(:nosuch)
# # Raises NoMethodError (undefined method `arity' for nil:NilClass)
# csv.read
def header_convert(name = nil, &converter) def header_convert(name = nil, &converter)
header_fields_converter.add_converter(name, &converter) header_fields_converter.add_converter(name, &converter)
end end
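Header converters are applied in the order they were installed, so a named converter and a block can be combined by making two separate `header_convert` calls. The following is a minimal sketch under the same assumptions as the RDoc examples (`headers: true`), except that it parses from a String with `CSV.new` and not from a file; the sample data is illustrative.

```ruby
require 'csv'

string = "Name,Value\nfoo,0\nbar,1\n"

csv = CSV.new(string, headers: true)
# First converter: the built-in :downcase header converter.
csv.header_convert(:downcase)
# Second converter: a custom block that receives the already
# downcased header and turns it into a Symbol.
csv.header_convert { |header| header.to_sym }

table = csv.read
p table.headers  # [:name, :value]
```

As the RDoc notes, passing a name and a block in the same call ignores the block, which is why the sketch uses two calls.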