mirror of
https://github.com/ruby/ruby.git
synced 2022-11-09 12:17:21 -05:00
* lib/scanf.rb: Improve documentation. Patch by Gabe McArthur.
[Ruby 1.9 - Bug #4735]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@31646 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
parent
e2283b873d
commit
7e1e46b99d
2 changed files with 380 additions and 325 deletions
|
@ -1,3 +1,8 @@
|
||||||
|
Fri May 20 04:23:42 2011 Eric Hodel <drbrain@segment7.net>
|
||||||
|
|
||||||
|
* lib/scanf.rb: Improve documentation. Patch by Gabe McArthur.
|
||||||
|
[Ruby 1.9 - Bug #4735]
|
||||||
|
|
||||||
Fri May 20 00:58:01 2011 Nobuyoshi Nakada <nobu@ruby-lang.org>
|
Fri May 20 00:58:01 2011 Nobuyoshi Nakada <nobu@ruby-lang.org>
|
||||||
|
|
||||||
* enc/trans/ibm737-tbl.rb: greek code page. fixes #4738
|
* enc/trans/ibm737-tbl.rb: greek code page. fixes #4738
|
||||||
|
|
700
lib/scanf.rb
|
@ -1,305 +1,288 @@
|
||||||
# scanf for Ruby
|
# scanf for Ruby
|
||||||
#
|
#
|
||||||
|
#--
|
||||||
# $Release Version: 1.1.2 $
|
# $Release Version: 1.1.2 $
|
||||||
# $Revision$
|
# $Revision$
|
||||||
# $Id$
|
# $Id$
|
||||||
# $Author$
|
# $Author$
|
||||||
|
#++
|
||||||
#
|
#
|
||||||
# A product of the Austin Ruby Codefest (Austin, Texas, August 2002)
|
# == Description
|
||||||
|
#
|
||||||
=begin
|
# scanf is an implementation of the C function scanf(3), modified as necessary
|
||||||
|
# for Ruby compatibility.
|
||||||
=scanf for Ruby
|
#
|
||||||
|
# The methods provided are String#scanf, IO#scanf, and
|
||||||
==Description
|
# Kernel#scanf. Kernel#scanf is a wrapper around STDIN.scanf. IO#scanf
|
||||||
|
# can be used on any IO stream, including file handles and sockets.
|
||||||
scanf for Ruby is an implementation of the C function scanf(3),
|
# scanf can be called either with or without a block.
|
||||||
modified as necessary for Ruby compatibility.
|
#
|
||||||
|
# Scanf scans an input string or stream according to a <b>format</b>, as
|
||||||
The methods provided are String#scanf, IO#scanf, and
|
# described below in Conversions, and returns an array of matches between
|
||||||
Kernel#scanf. Kernel#scanf is a wrapper around STDIN.scanf. IO#scanf
|
# the format and the input. The format is defined in a string, and is
|
||||||
can be used on any IO stream, including file handles and sockets.
|
# similar (though not identical) to the formats used in Kernel#printf and
|
||||||
scanf can be called either with or without a block.
|
# Kernel#sprintf.
|
||||||
|
#
|
||||||
scanf for Ruby scans an input string or stream according to a
|
# The format may contain <b>conversion specifiers</b>, which tell scanf
|
||||||
<b>format</b>, as described below ("Conversions"), and returns an
|
# what form (type) each particular matched substring should be converted
|
||||||
array of matches between the format and the input. The format is
|
# to (e.g., decimal integer, floating point number, literal string,
|
||||||
defined in a string, and is similar (though not identical) to the
|
# etc.) The matches and conversions take place from left to right, and
|
||||||
formats used in Kernel#printf and Kernel#sprintf.
|
# the conversions themselves are returned as an array.
|
||||||
|
#
|
||||||
The format may contain <b>conversion specifiers</b>, which tell scanf
|
# The format string may also contain characters other than those in the
|
||||||
what form (type) each particular matched substring should be converted
|
# conversion specifiers. White space (blanks, tabs, or newlines) in the
|
||||||
to (e.g., decimal integer, floating point number, literal string,
|
# format string matches any amount of white space, including none, in
|
||||||
etc.) The matches and conversions take place from left to right, and
|
# the input. Everything else matches only itself.
|
||||||
the conversions themselves are returned as an array.
|
#
|
||||||
|
# Scanning stops, and scanf returns, when any input character fails to
|
||||||
The format string may also contain characters other than those in the
|
# match the specifications in the format string, or when input is
|
||||||
conversion specifiers. White space (blanks, tabs, or newlines) in the
|
# exhausted, or when everything in the format string has been
|
||||||
format string matches any amount of white space, including none, in
|
# matched. All matches found up to the stopping point are returned in
|
||||||
the input. Everything else matches only itself.
|
# the return array (or yielded to the block, if a block was given).
|
||||||
|
#
|
||||||
Scanning stops, and scanf returns, when any input character fails to
|
#
|
||||||
match the specifications in the format string, or when input is
|
# == Basic usage
|
||||||
exhausted, or when everything in the format string has been
|
#
|
||||||
matched. All matches found up to the stopping point are returned in
|
# require 'scanf'
|
||||||
the return array (or yielded to the block, if a block was given).
|
#
|
||||||
|
# # String#scanf and IO#scanf take a single argument, the format string
|
||||||
|
# array = a_string.scanf("%d%s")
|
||||||
==Basic usage
|
# array = an_io.scanf("%d%s")
|
||||||
|
#
|
||||||
require 'scanf.rb'
|
# # Kernel#scanf reads from STDIN
|
||||||
|
# array = scanf("%d%s")
|
||||||
# String#scanf and IO#scanf take a single argument (a format string)
|
#
|
||||||
array = aString.scanf("%d%s")
|
# == Block usage
|
||||||
array = anIO.scanf("%d%s")
|
#
|
||||||
|
# When called with a block, scanf keeps scanning the input, cycling back
|
||||||
# Kernel#scanf reads from STDIN
|
# to the beginning of the format string, and yields a new array of
|
||||||
array = scanf("%d%s")
|
# conversions to the block every time the format string is matched
|
||||||
|
# (including partial matches, but not including complete failures). The
|
||||||
==Block usage
|
# actual return value of scanf when called with a block is an array
|
||||||
|
# containing the results of all the executions of the block.
|
||||||
When called with a block, scanf keeps scanning the input, cycling back
|
#
|
||||||
to the beginning of the format string, and yields a new array of
|
# str = "123 abc 456 def 789 ghi"
|
||||||
conversions to the block every time the format string is matched
|
# str.scanf("%d%s") { |num,str| [ num * 2, str.upcase ] }
|
||||||
(including partial matches, but not including complete failures). The
|
# # => [[246, "ABC"], [912, "DEF"], [1578, "GHI"]]
|
||||||
actual return value of scanf when called with a block is an array
|
#
|
||||||
containing the results of all the executions of the block.
|
# == Conversions
|
||||||
|
#
|
||||||
str = "123 abc 456 def 789 ghi"
|
# The single argument to scanf is a format string, which generally
|
||||||
str.scanf("%d%s") { |num,str| [ num * 2, str.upcase ] }
|
# includes one or more conversion specifiers. Conversion specifiers
|
||||||
# => [[246, "ABC"], [912, "DEF"], [1578, "GHI"]]
|
# begin with the percent character ('%') and include information about
|
||||||
|
# what scanf should next scan for (string, decimal number, single
|
||||||
==Conversions
|
# character, etc.).
|
||||||
|
#
|
||||||
The single argument to scanf is a format string, which generally
|
# There may be an optional maximum field width, expressed as a decimal
|
||||||
includes one or more conversion specifiers. Conversion specifiers
|
# integer, between the % and the conversion. If no width is given, a
|
||||||
begin with the percent character ('%') and include information about
|
# default of `infinity' is used (with the exception of the %c specifier;
|
||||||
what scanf should next scan for (string, decimal number, single
|
# see below). Otherwise, given a field width of <em>n</em> for a given
|
||||||
character, etc.).
|
# conversion, at most <em>n</em> characters are scanned in processing
|
||||||
|
# that conversion. Before conversion begins, most conversions skip
|
||||||
There may be an optional maximum field width, expressed as a decimal
|
# white space in the input string; this white space is not counted
|
||||||
integer, between the % and the conversion. If no width is given, a
|
# against the field width.
|
||||||
default of `infinity' is used (with the exception of the %c specifier;
|
#
|
||||||
see below). Otherwise, given a field width of <em>n</em> for a given
|
# The following conversions are available.
|
||||||
conversion, at most <em>n</em> characters are scanned in processing
|
#
|
||||||
that conversion. Before conversion begins, most conversions skip
|
# [%]
|
||||||
white space in the input string; this white space is not counted
|
# Matches a literal `%'. That is, `%%' in the format string matches a
|
||||||
against the field width.
|
# single input `%' character. No conversion is done, and the resulting
|
||||||
|
# '%' is not included in the return array.
|
||||||
The following conversions are available. (See the files EXAMPLES
|
#
|
||||||
and <tt>tests/scanftests.rb</tt> for examples.)
|
# [d]
|
||||||
|
# Matches an optionally signed decimal integer.
|
||||||
[%]
|
#
|
||||||
Matches a literal `%'. That is, `%%' in the format string matches a
|
# [u]
|
||||||
single input `%' character. No conversion is done, and the resulting
|
# Same as d.
|
||||||
'%' is not included in the return array.
|
#
|
||||||
|
# [i]
|
||||||
[d]
|
# Matches an optionally signed integer. The integer is read in base
|
||||||
Matches an optionally signed decimal integer.
|
# 16 if it begins with `0x' or `0X', in base 8 if it begins with `0',
|
||||||
|
# and in base 10 otherwise. Only characters that correspond to the
|
||||||
[u]
|
# base are recognized.
|
||||||
Same as d.
|
#
|
||||||
|
# [o]
|
||||||
[i]
|
# Matches an optionally signed octal integer.
|
||||||
Matches an optionally signed integer. The integer is read in base
|
#
|
||||||
16 if it begins with `0x' or `0X', in base 8 if it begins with `0',
|
# [x, X]
|
||||||
and in base 10 other- wise. Only characters that correspond to the
|
# Matches an optionally signed hexadecimal integer.
|
||||||
base are recognized.
|
#
|
||||||
|
# [a, e, f, g, A, E, F, G]
|
||||||
[o]
|
# Matches an optionally signed floating-point number.
|
||||||
Matches an optionally signed octal integer.
|
#
|
||||||
|
# [s]
|
||||||
[x,X]
|
# Matches a sequence of non-white-space characters. The input string stops at
|
||||||
Matches an optionally signed hexadecimal integer,
|
# white space or at the maximum field width, whichever occurs first.
|
||||||
|
#
|
||||||
[a,e,f,g,A,E,F,G]
|
# [c]
|
||||||
Matches an optionally signed floating-point number.
|
# Matches a single character, or a sequence of <em>n</em> characters if a
|
||||||
|
# field width of <em>n</em> is specified. The usual skip of leading white
|
||||||
[s]
|
# space is suppressed. To skip white space first, use an explicit space in
|
||||||
Matches a sequence of non-white-space character. The input string stops at
|
# the format.
|
||||||
white space or at the maximum field width, whichever occurs first.
|
#
|
||||||
|
# [[]
|
||||||
[c]
|
# Matches a nonempty sequence of characters from the specified set
|
||||||
Matches a single character, or a sequence of <em>n</em> characters if a
|
# of accepted characters. The usual skip of leading white space is
|
||||||
field width of <em>n</em> is specified. The usual skip of leading white
|
# suppressed. This bracketed sub-expression is interpreted exactly like a
|
||||||
space is suppressed. To skip white space first, use an explicit space in
|
# character class in a Ruby regular expression. (In fact, it is placed as-is
|
||||||
the format.
|
# in a regular expression.) The matching against the input string ends with
|
||||||
|
# the appearance of a character not in (or, with a circumflex, in) the set,
|
||||||
[<tt>[</tt>]
|
# or when the field width runs out, whichever comes first.
|
||||||
Matches a nonempty sequence of characters from the specified set
|
#
|
||||||
of accepted characters. The usual skip of leading white space is
|
# === Assignment suppression
|
||||||
suppressed. This bracketed sub-expression is interpreted exactly like a
|
#
|
||||||
character class in a Ruby regular expression. (In fact, it is placed as-is
|
# To require that a particular match occur, but without including the result
|
||||||
in a regular expression.) The matching against the input string ends with
|
# in the return array, place the <b>assignment suppression flag</b>, which is
|
||||||
the appearance of a character not in (or, with a circumflex, in) the set,
|
# the star character ('*'), immediately after the leading '%' of a format
|
||||||
or when the field width runs out, whichever comes first.
|
# specifier (just before the field width, if any).
|
||||||
|
#
|
||||||
===Assignment suppression
|
# == scanf for Ruby compared with scanf in C
|
||||||
|
#
|
||||||
To require that a particular match occur, but without including the result
|
# scanf for Ruby is based on the C function scanf(3), but with modifications,
|
||||||
in the return array, place the <b>assignment suppression flag</b>, which is
|
# dictated mainly by the underlying differences between the languages.
|
||||||
the star character ('*'), immediately after the leading '%' of a format
|
#
|
||||||
specifier (just before the field width, if any).
|
# === Unimplemented flags and specifiers
|
||||||
|
#
|
||||||
==Examples
|
# * The only flag implemented in scanf for Ruby is '<tt>*</tt>' (ignore
|
||||||
|
# upcoming conversion). Many of the flags available in C versions of
|
||||||
See the files <tt>EXAMPLES</tt> and <tt>tests/scanftests.rb</tt>.
|
# scanf(3) have to do with the type of upcoming pointer arguments, and are
|
||||||
|
# meaningless in Ruby.
|
||||||
==scanf for Ruby compared with scanf in C
|
#
|
||||||
|
# * The <tt>n</tt> specifier (store number of characters consumed so far in
|
||||||
scanf for Ruby is based on the C function scanf(3), but with modifications,
|
# next pointer) is not implemented.
|
||||||
dictated mainly by the underlying differences between the languages.
|
#
|
||||||
|
# * The <tt>p</tt> specifier (match a pointer value) is not implemented.
|
||||||
===Unimplemented flags and specifiers
|
#
|
||||||
|
# === Altered specifiers
|
||||||
* The only flag implemented in scanf for Ruby is '<tt>*</tt>' (ignore
|
#
|
||||||
upcoming conversion). Many of the flags available in C versions of scanf(4)
|
# [o, u, x, X]
|
||||||
have to do with the type of upcoming pointer arguments, and are literally
|
# In scanf for Ruby, all of these specifiers scan for an optionally signed
|
||||||
meaningless in Ruby.
|
# integer, rather than for an unsigned integer like their C counterparts.
|
||||||
|
#
|
||||||
* The <tt>n</tt> specifier (store number of characters consumed so far in
|
# === Return values
|
||||||
next pointer) is not implemented.
|
#
|
||||||
|
# scanf for Ruby returns an array of successful conversions, whereas
|
||||||
* The <tt>p</tt> specifier (match a pointer value) is not implemented.
|
# scanf(3) returns the number of conversions successfully
|
||||||
|
# completed. (See below for more details on scanf for Ruby's return
|
||||||
===Altered specifiers
|
# values.)
|
||||||
|
#
|
||||||
[o,u,x,X]
|
# == Return values
|
||||||
In scanf for Ruby, all of these specifiers scan for an optionally signed
|
#
|
||||||
integer, rather than for an unsigned integer like their C counterparts.
|
# Without a block, scanf returns an array containing all the conversions
|
||||||
|
# it has found. If none are found, scanf will return an empty array. An
|
||||||
===Return values
|
# unsuccessful match is never ignored, but rather always signals the end
|
||||||
|
# of the scanning operation. If the first unsuccessful match takes place
|
||||||
scanf for Ruby returns an array of successful conversions, whereas
|
# after one or more successful matches have already taken place, the
|
||||||
scanf(3) returns the number of conversions successfully
|
# returned array will contain the results of those successful matches.
|
||||||
completed. (See below for more details on scanf for Ruby's return
|
#
|
||||||
values.)
|
# With a block, scanf returns a 'map'-like array of transformations from
|
||||||
|
# the block -- that is, an array reflecting what the block did with each
|
||||||
==Return values
|
# yielded result from the iterative scanf operation. (See "Block
|
||||||
|
# usage", above.)
|
||||||
Without a block, scanf returns an array containing all the conversions
|
#
|
||||||
it has found. If none are found, scanf will return an empty array. An
|
# == Current limitations and bugs
|
||||||
unsuccesful match is never ignored, but rather always signals the end
|
#
|
||||||
of the scanning operation. If the first unsuccessful match takes place
|
# When using IO#scanf under Windows, make sure you open your files in
|
||||||
after one or more successful matches have already taken place, the
|
# binary mode:
|
||||||
returned array will contain the results of those successful matches.
|
#
|
||||||
|
# File.open("filename", "rb")
|
||||||
With a block scanf returns a 'map'-like array of transformations from
|
#
|
||||||
the block -- that is, an array reflecting what the block did with each
|
# so that scanf can keep track of characters correctly.
|
||||||
yielded result from the iterative scanf operation. (See "Block
|
#
|
||||||
usage", above.)
|
# Support for character classes is reasonably complete (since it
|
||||||
|
# essentially piggy-backs on Ruby's regular expression handling of
|
||||||
==Test suite
|
# character classes), but users are advised that character class testing
|
||||||
|
# has not been exhaustive, and that they should exercise some caution
|
||||||
scanf for Ruby includes a suite of unit tests (requiring the
|
# in using any of the more complex and/or arcane character class
|
||||||
<tt>TestUnit</tt> package), which can be run with the command <tt>ruby
|
# idioms.
|
||||||
tests/scanftests.rb</tt> or the command <tt>make test</tt>.
|
#
|
||||||
|
# == License and copyright
|
||||||
==Current limitations and bugs
|
#
|
||||||
|
# Copyright:: (c) 2002-2003 David Alan Black
|
||||||
When using IO#scanf under Windows, make sure you open your files in
|
# License:: Distributed on the same licensing terms as Ruby itself
|
||||||
binary mode:
|
#
|
||||||
|
# == Warranty disclaimer
|
||||||
File.open("filename", "rb")
|
#
|
||||||
|
# This software is provided "as is" and without any express or implied
|
||||||
so that scanf can keep track of characters correctly.
|
# warranties, including, without limitation, the implied warranties of
|
||||||
|
# merchantability and fitness for a particular purpose.
|
||||||
Support for character classes is reasonably complete (since it
|
#
|
||||||
essentially piggy-backs on Ruby's regular expression handling of
|
# == Credits and acknowledgements
|
||||||
character classes), but users are advised that character class testing
|
#
|
||||||
has not been exhaustive, and that they should exercise some caution
|
# scanf was developed as the major activity of the Austin Ruby Codefest
|
||||||
in using any of the more complex and/or arcane character class
|
# (Austin, Texas, August 2002).
|
||||||
idioms.
|
#
|
||||||
|
# Principal author:: David Alan Black (mailto:dblack@superlink.net)
|
||||||
|
# Co-author:: Hal Fulton (mailto:hal9000@hypermetrics.com)
|
||||||
==Technical notes
|
# Project contributors:: Nolan Darilek, Jason Johnston
|
||||||
|
#
|
||||||
===Rationale behind scanf for Ruby
|
# Thanks to Hal Fulton for hosting the Codefest.
|
||||||
|
#
|
||||||
The impetus for a scanf implementation in Ruby comes chiefly from the fact
|
# Thanks to Matz for suggestions about the class design.
|
||||||
that existing pattern matching operations, such as Regexp#match and
|
#
|
||||||
String#scan, return all results as strings, which have to be converted to
|
# Thanks to Gavin Sinclair for some feedback on the documentation.
|
||||||
integers or floats explicitly in cases where what's ultimately wanted are
|
#
|
||||||
integer or float values.
|
# The text for parts of this document, especially the Description and
|
||||||
|
# Conversions sections, above, was adapted from the Linux Programmer's
|
||||||
===Design of scanf for Ruby
|
# Manual manpage for scanf(3), dated 1995-11-01.
|
||||||
|
#
|
||||||
scanf for Ruby is essentially a <format string>-to-<regular
|
# == Bugs and bug reports
|
||||||
expression> converter.
|
#
|
||||||
|
# scanf for Ruby is based on something of an amalgam of C scanf
|
||||||
When scanf is called, a FormatString object is generated from the
|
# implementations and documentation, rather than on a single canonical
|
||||||
format string ("%d%s...") argument. The FormatString object breaks the
|
# description. Suggestions for features and behaviors which appear in
|
||||||
format string down into atoms ("%d", "%5f", "blah", etc.), and from
|
# other scanfs, and would be meaningful in Ruby, are welcome, as are
|
||||||
each atom it creates a FormatSpecifier object, which it
|
# reports of suspicious behaviors and/or bugs. (Please see "Credits and
|
||||||
saves.
|
# acknowledgements", above, for email addresses.)
|
||||||
|
|
||||||
Each FormatSpecifier has a regular expression fragment and a "handler"
|
|
||||||
associated with it. For example, the regular expression fragment
|
|
||||||
associated with the format "%d" is "([-+]?\d+)", and the handler
|
|
||||||
associated with it is a wrapper around String#to_i. scanf itself calls
|
|
||||||
FormatString#match, passing in the input string. FormatString#match
|
|
||||||
iterates through its FormatSpecifiers; for each one, it matches the
|
|
||||||
corresponding regular expression fragment against the string. If
|
|
||||||
there's a match, it sends the matched string to the handler associated
|
|
||||||
with the FormatSpecifier.
|
|
||||||
|
|
||||||
Thus, to follow up the "%d" example: if "123" occurs in the input
|
|
||||||
string when a FormatSpecifier consisting of "%d" is reached, the "123"
|
|
||||||
will be matched against "([-+]?\d+)", and the matched string will be
|
|
||||||
rendered into an integer by a call to to_i.
|
|
||||||
|
|
||||||
The rendered match is then saved to an accumulator array, and the
|
|
||||||
input string is reduced to the post-match substring. Thus the string
|
|
||||||
is "eaten" from the left as the FormatSpecifiers are applied in
|
|
||||||
sequence. (This is done to a duplicate string; the original string is
|
|
||||||
not altered.)
|
|
||||||
|
|
||||||
As soon as a regular expression fragment fails to match the string, or
|
|
||||||
when the FormatString object runs out of FormatSpecifiers, scanning
|
|
||||||
stops and results accumulated so far are returned in an array.
|
|
||||||
|
|
||||||
==License and copyright
|
|
||||||
|
|
||||||
Copyright:: (c) 2002-2003 David Alan Black
|
|
||||||
License:: Distributed on the same licensing terms as Ruby itself
|
|
||||||
|
|
||||||
==Warranty disclaimer
|
|
||||||
|
|
||||||
This software is provided "as is" and without any express or implied
|
|
||||||
warranties, including, without limitation, the implied warranties of
|
|
||||||
merchantibility and fitness for a particular purpose.
|
|
||||||
|
|
||||||
==Credits and acknowledgements
|
|
||||||
|
|
||||||
scanf for Ruby was developed as the major activity of the Austin
|
|
||||||
Ruby Codefest (Austin, Texas, August 2002).
|
|
||||||
|
|
||||||
Principal author:: David Alan Black (mailto:dblack@superlink.net)
|
|
||||||
Co-author:: Hal Fulton (mailto:hal9000@hypermetrics.com)
|
|
||||||
Project contributors:: Nolan Darilek, Jason Johnston
|
|
||||||
|
|
||||||
Thanks to Hal Fulton for hosting the Codefest.
|
|
||||||
|
|
||||||
Thanks to Matz for suggestions about the class design.
|
|
||||||
|
|
||||||
Thanks to Gavin Sinclair for some feedback on the documentation.
|
|
||||||
|
|
||||||
The text for parts of this document, especially the Description and
|
|
||||||
Conversions sections, above, were adapted from the Linux Programmer's
|
|
||||||
Manual manpage for scanf(3), dated 1995-11-01.
|
|
||||||
|
|
||||||
==Bugs and bug reports
|
|
||||||
|
|
||||||
scanf for Ruby is based on something of an amalgam of C scanf
|
|
||||||
implementations and documentation, rather than on a single canonical
|
|
||||||
description. Suggestions for features and behaviors which appear in
|
|
||||||
other scanfs, and would be meaningful in Ruby, are welcome, as are
|
|
||||||
reports of suspicious behaviors and/or bugs. (Please see "Credits and
|
|
||||||
acknowledgements", above, for email addresses.)
|
|
||||||
|
|
||||||
=end
|
|
||||||
|
|
||||||
module Scanf
|
module Scanf
|
||||||
|
# :stopdoc:
|
||||||
|
|
||||||
|
# ==Technical notes
|
||||||
|
#
|
||||||
|
# ===Rationale behind scanf for Ruby
|
||||||
|
#
|
||||||
|
# The impetus for a scanf implementation in Ruby comes chiefly from the fact
|
||||||
|
# that existing pattern matching operations, such as Regexp#match and
|
||||||
|
# String#scan, return all results as strings, which have to be converted to
|
||||||
|
# integers or floats explicitly in cases where what's ultimately wanted are
|
||||||
|
# integer or float values.
|
||||||
|
#
|
||||||
|
# ===Design of scanf for Ruby
|
||||||
|
#
|
||||||
|
# scanf for Ruby is essentially a <format string>-to-<regular
|
||||||
|
# expression> converter.
|
||||||
|
#
|
||||||
|
# When scanf is called, a FormatString object is generated from the
|
||||||
|
# format string ("%d%s...") argument. The FormatString object breaks the
|
||||||
|
# format string down into atoms ("%d", "%5f", "blah", etc.), and from
|
||||||
|
# each atom it creates a FormatSpecifier object, which it
|
||||||
|
# saves.
|
||||||
|
#
|
||||||
|
# Each FormatSpecifier has a regular expression fragment and a "handler"
|
||||||
|
# associated with it. For example, the regular expression fragment
|
||||||
|
# associated with the format "%d" is "([-+]?\d+)", and the handler
|
||||||
|
# associated with it is a wrapper around String#to_i. scanf itself calls
|
||||||
|
# FormatString#match, passing in the input string. FormatString#match
|
||||||
|
# iterates through its FormatSpecifiers; for each one, it matches the
|
||||||
|
# corresponding regular expression fragment against the string. If
|
||||||
|
# there's a match, it sends the matched string to the handler associated
|
||||||
|
# with the FormatSpecifier.
|
||||||
|
#
|
||||||
|
# Thus, to follow up the "%d" example: if "123" occurs in the input
|
||||||
|
# string when a FormatSpecifier consisting of "%d" is reached, the "123"
|
||||||
|
# will be matched against "([-+]?\d+)", and the matched string will be
|
||||||
|
# rendered into an integer by a call to to_i.
|
||||||
|
#
|
||||||
|
# The rendered match is then saved to an accumulator array, and the
|
||||||
|
# input string is reduced to the post-match substring. Thus the string
|
||||||
|
# is "eaten" from the left as the FormatSpecifiers are applied in
|
||||||
|
# sequence. (This is done to a duplicate string; the original string is
|
||||||
|
# not altered.)
|
||||||
|
#
|
||||||
|
# As soon as a regular expression fragment fails to match the string, or
|
||||||
|
# when the FormatString object runs out of FormatSpecifiers, scanning
|
||||||
|
# stops and results accumulated so far are returned in an array.
|
||||||
|
|
||||||
class FormatSpecifier
|
class FormatSpecifier
|
||||||
|
|
||||||
|
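The "Design of scanf for Ruby" notes in the hunk above describe a format-string-to-regular-expression converter: each specifier carries a regular expression fragment and a handler, and the input is consumed from the left until a fragment fails to match. The following is a minimal sketch of that idea only, not the code in lib/scanf.rb. The names `MiniScanf`, `SPECS` and `mini_scanf` are hypothetical, only `%d`, `%f` and `%s` are handled, and field widths, literals and the `*` flag are left out.

```ruby
# Illustrative sketch only: MiniScanf, SPECS and mini_scanf are hypothetical
# names and are not part of lib/scanf.rb.
module MiniScanf
  # Each conversion maps to a regex fragment and a handler that converts
  # the matched text to a Ruby value.
  SPECS = {
    "d" => [/[-+]?\d+/,           ->(s) { s.to_i }],
    "f" => [/[-+]?\d+(?:\.\d+)?/, ->(s) { s.to_f }],
    "s" => [/\S+/,                ->(s) { s }]
  }.freeze

  def self.mini_scanf(input, format)
    rest = input.dup   # work on a copy; the caller's string is not altered
    accum = []
    format.scan(/%([dfs])/).flatten.each do |conv|
      fragment, handler = SPECS.fetch(conv)
      match = /\A\s*(#{fragment})/.match(rest)
      break unless match                # the first failed match ends the scan
      accum << handler.call(match[1])   # convert and save the match
      rest = match.post_match           # "eat" the input from the left
    end
    accum
  end
end

MiniScanf.mini_scanf("123 4.5 abc", "%d%f%s")  # => [123, 4.5, "abc"]
MiniScanf.mini_scanf("12 abc", "%d%d")         # => [12]
```

The second call shows the partial-match behaviour the notes describe: the results accumulated before the first failure are still returned.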
@ -574,39 +557,61 @@ module Scanf
|
||||||
return accum.compact
|
return accum.compact
|
||||||
end
|
end
|
||||||
end
|
end
|
||||||
|
# :startdoc:
|
||||||
end
|
end
|
||||||
|
|
||||||
class IO
|
class IO
|
||||||
|
|
||||||
# The trick here is doing a match where you grab one *line*
|
#:stopdoc:
|
||||||
# of input at a time. The linebreak may or may not occur
|
# The trick here is doing a match where you grab one *line*
|
||||||
# at the boundary where the string matches a format specifier.
|
# of input at a time. The linebreak may or may not occur
|
||||||
# And if it does, some rule about whitespace may or may not
|
# at the boundary where the string matches a format specifier.
|
||||||
# be in effect...
|
# And if it does, some rule about whitespace may or may not
|
||||||
#
|
# be in effect...
|
||||||
# That's why this is much more elaborate than the string
|
#
|
||||||
# version.
|
# That's why this is much more elaborate than the string
|
||||||
#
|
# version.
|
||||||
# For each line:
|
#
|
||||||
# Match succeeds (non-emptily)
|
# For each line:
|
||||||
# and the last attempted spec/string sub-match succeeded:
|
#
|
||||||
#
|
# Match succeeds (non-emptily)
|
||||||
# could the last spec keep matching?
|
# and the last attempted spec/string sub-match succeeded:
|
||||||
# yes: save interim results and continue (next line)
|
#
|
||||||
#
|
# could the last spec keep matching?
|
||||||
# The last attempted spec/string did not match:
|
# yes: save interim results and continue (next line)
|
||||||
#
|
#
|
||||||
# are we on the next-to-last spec in the string?
|
# The last attempted spec/string did not match:
|
||||||
# yes:
|
#
|
||||||
# is fmt_string.string_left all spaces?
|
# are we on the next-to-last spec in the string?
|
||||||
# yes: does current spec care about input space?
|
# yes:
|
||||||
# yes: fatal failure
|
# is fmt_string.string_left all spaces?
|
||||||
# no: save interim results and continue
|
# yes: does current spec care about input space?
|
||||||
# no: continue [this state could be analyzed further]
|
# yes: fatal failure
|
||||||
#
|
# no: save interim results and continue
|
||||||
#
|
# no: continue [this state could be analyzed further]
|
||||||
|
#
|
||||||
|
#:startdoc:
|
||||||
|
|
||||||
def scanf(str,&b)
|
# Scans the current string until the match is exhausted,
|
||||||
|
# yielding each match as it is encountered in the string.
|
||||||
|
# A block is not necessary though, as the results will simply
|
||||||
|
# be aggregated into the final array.
|
||||||
|
#
|
||||||
|
# "123 456".block_scanf("%d")
|
||||||
|
# # => [123, 456]
|
||||||
|
#
|
||||||
|
# If a block is given, the value returned from each
|
||||||
|
# yield is added to an output array.
|
||||||
|
#
|
||||||
|
# "123 456".block_scanf("%d") do |digit,| # the ',' unpacks the Array
|
||||||
|
# digit + 100
|
||||||
|
# end
|
||||||
|
# # => [223, 556]
|
||||||
|
#
|
||||||
|
# See Scanf for details on creating a format string.
|
||||||
|
#
|
||||||
|
# You will need to require 'scanf' to use IO#scanf.
|
||||||
|
def scanf(str,&b) #:yield: current_match
|
||||||
return block_scanf(str,&b) if b
|
return block_scanf(str,&b) if b
|
||||||
return [] unless str.size > 0
|
return [] unless str.size > 0
|
||||||
|
|
||||||
|
@ -686,7 +691,28 @@ end
|
||||||
|
|
||||||
class String
|
class String
|
||||||
|
|
||||||
def scanf(fstr,&b)
|
# :section: scanf
|
||||||
|
#
|
||||||
|
# You will need to require 'scanf' to use these methods
|
||||||
|
|
||||||
|
# Scans the current string. If a block is given, it
|
||||||
|
# functions exactly like block_scanf.
|
||||||
|
#
|
||||||
|
# arr = "123 456".scanf("%d%d")
|
||||||
|
# # => [123, 456]
|
||||||
|
#
|
||||||
|
# require 'pp'
|
||||||
|
#
|
||||||
|
# "this 123 read that 456 other".scanf("%s%d%s") {|m| pp m}
|
||||||
|
#
|
||||||
|
# # ["this", 123, "read"]
|
||||||
|
# # ["that", 456, "other"]
|
||||||
|
# # => [["this", 123, "read"], ["that", 456, "other"]]
|
||||||
|
#
|
||||||
|
# See Scanf for details on creating a format string.
|
||||||
|
#
|
||||||
|
# You will need to require 'scanf' to use String#scanf.
|
||||||
|
def scanf(fstr,&b) #:yield: current_match
|
||||||
if b
|
if b
|
||||||
block_scanf(fstr,&b)
|
block_scanf(fstr,&b)
|
||||||
else
|
else
|
||||||
|
@ -700,7 +726,26 @@ class String
|
||||||
end
|
end
|
||||||
end
|
end
|
||||||
|
|
||||||
def block_scanf(fstr,&b)
|
# Scans the current string until the match is exhausted,
|
||||||
|
# yielding each match as it is encountered in the string.
|
||||||
|
# A block is not necessary as the results will simply
|
||||||
|
# be aggregated into the final array.
|
||||||
|
#
|
||||||
|
# "123 456".block_scanf("%d")
|
||||||
|
# # => [123, 456]
|
||||||
|
#
|
||||||
|
# If a block is given, the value returned from each
|
||||||
|
# yield is added to an output array.
|
||||||
|
#
|
||||||
|
# "123 456".block_scanf("%d") do |digit,| # the ',' unpacks the Array
|
||||||
|
# digit + 100
|
||||||
|
# end
|
||||||
|
# # => [223, 556]
|
||||||
|
#
|
||||||
|
# See Scanf for details on creating a format string.
|
||||||
|
#
|
||||||
|
# You will need to require 'scanf' to use String#block_scanf
|
||||||
|
def block_scanf(fstr,&b) #:yield: current_match
|
||||||
fs = Scanf::FormatString.new(fstr)
|
fs = Scanf::FormatString.new(fstr)
|
||||||
str = self.dup
|
str = self.dup
|
||||||
final = []
|
final = []
|
||||||
|
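The block form documented above cycles back to the start of the format string after every successful pass and collects whatever the block returns. As a rough, self-contained illustration of that loop (not the library's implementation), the sketch below hard-codes the `"%d%s"` format as a single pattern and drives it with `StringScanner` from the standard library. The helper name `cycle_d_s` is hypothetical.

```ruby
require "strscan"

# Hypothetical helper, not part of lib/scanf.rb: emulates the block form of
# "%d%s" by applying one fixed pattern to the input over and over.
def cycle_d_s(input)
  scanner = StringScanner.new(input)
  results = []
  # The \s* parts mirror the white-space skip that %d and %s perform.
  while scanner.scan(/\s*([-+]?\d+)\s*(\S+)/)
    # Yield the converted pair and keep whatever the block returns.
    results << yield(scanner[1].to_i, scanner[2])
  end
  results
end

out = cycle_d_s("123 abc 456 def 789 ghi") { |num, str| [num * 2, str.upcase] }
# out => [[246, "ABC"], [912, "DEF"], [1578, "GHI"]]
```

The return value is the array of block results, which matches the 'map'-like behaviour described under "Return values".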
@ -715,7 +760,12 @@ end
|
||||||
|
|
||||||
module Kernel
|
module Kernel
|
||||||
private
|
private
|
||||||
def scanf(fs,&b)
|
# Scans STDIN for data matching +format+. See IO#scanf for details.
|
||||||
STDIN.scanf(fs,&b)
|
#
|
||||||
|
# See Scanf for details on creating a format string.
|
||||||
|
#
|
||||||
|
# You will need to require 'scanf' to use Kernel#scanf.
|
||||||
|
def scanf(format, &b) #:doc:
|
||||||
|
STDIN.scanf(format, &b)
|
||||||
end
|
end
|
||||||
end
|
end
|
||||||
|
|
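The `%i` conversion documented in this commit picks its base from the prefix: `0x` or `0X` means base 16, a leading `0` means base 8, and anything else is base 10. The sketch below restates that rule in plain Ruby so it can be tried without the library. `parse_i` is a hypothetical helper written for this illustration, and it is an assumption that it agrees with lib/scanf.rb on every edge case.

```ruby
# Hypothetical helper illustrating the documented %i rule; not taken from
# lib/scanf.rb.
def parse_i(text)
  # Optional sign, then hex digits after 0x/0X, octal digits after a leading
  # 0, or plain decimal digits. Only digits valid for the base are consumed.
  m = /\A\s*([-+]?)(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9]\d*)/.match(text)
  return nil unless m
  sign, digits = m[1], m[2]
  value =
    if digits =~ /\A0[xX]/
      digits.hex          # base 16
    elsif digits.start_with?("0")
      digits.oct          # base 8
    else
      digits.to_i         # base 10
    end
  sign == "-" ? -value : value
end

parse_i("0x1A")   # => 26
parse_i("017")    # => 15
parse_i("42")     # => 42
parse_i("-0x10")  # => -16
```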