mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
ruby--ruby/doc/io_streams.rdoc
Burdette Lamar 3ddab3a84e
[DOC] Housekeeping in iostreams doc (#6420)
Write some method names in linkable form; make some capitalization consistent.
2022-09-23 09:41:21 -05:00


== \IO Streams
Ruby supports processing data as \IO streams;
that is, as data that may be read, re-read, written, re-written,
and traversed via iteration.
Core classes with such support include:
- IO, and its derived class File.
- {StringIO}[rdoc-ref:StringIO]: for processing a string.
- {ARGF}[rdoc-ref:ARGF]: for processing files cited on the command line.
Pre-existing stream objects that are referenced by global variables or constants include:
- $stdin: read-only instance of \IO.
- $stdout: write-only instance of \IO.
- $stderr: write-only instance of \IO.
- \ARGF: read-only instance of \ARGF.
You can create stream objects:
- \File:
- File.new: returns a new \File object.
- File.open: passes a new \File object to the given block.
- \IO:
- IO.new: returns a new \IO object for the given integer file descriptor.
- IO.open: passes a new \IO object to the given block.
- IO.popen: returns a new \IO object that is connected to the $stdin
and $stdout of a newly-launched subprocess.
- Kernel#open: returns a new \IO object connected to a given source:
stream, file, or subprocess.
- \StringIO:
- StringIO.new: returns a new \StringIO object.
- StringIO.open: passes a new \StringIO object to the given block.
(You cannot create an \ARGF object, but one already exists.)
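As a minimal sketch of two of the creation methods listed above,
the following opens a \File stream and a \StringIO stream and reads each in full.
The file name <tt>t_demo.txt</tt>, the temporary directory, and the variable names
are assumptions chosen only for illustration:

```ruby
require 'stringio'
require 'tmpdir'

file_content = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 't_demo.txt') # Hypothetical file name.
  File.write(path, "First line\n")
  # File.open passes the new File object to the block
  # and closes the stream when the block exits.
  file_content = File.open(path) {|f| f.read }
end

# StringIO.new returns a stream over an in-memory string.
strio = StringIO.new("First line\n")
string_content = strio.read
strio.close
```

Both streams answer the same reader methods, so code written against one
can often be exercised with the other.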
=== About the Examples
Many examples here use these variables:
# English text with newlines.
text = <<~EOT
First line
Second line

Fourth line
Fifth line
EOT
# Russian text.
russian = "\u{442 435 441 442}" # => "тест"
# Binary data.
data = "\u9990\u9991\u9992\u9993\u9994"
# Text file.
File.write('t.txt', text)
# File with Russian text.
File.write('t.rus', russian)
# File with binary data.
f = File.new('t.dat', 'wb:UTF-16')
f.write(data)
f.close
=== Position
An \IO stream has a nonnegative integer _position_,
which is the byte offset at which the next read or write is to occur;
the relevant methods:
- IO#tell (aliased as +#pos+):
Returns the current position (in bytes) in the stream:
f = File.new('t.txt')
f.tell # => 0
f.gets # => "First line\n"
f.tell # => 11
f.close
- IO#pos=: Sets the position of the stream (in bytes):
f = File.new('t.txt')
f.tell # => 0
f.pos = 20 # => 20
f.tell # => 20
f.close
- IO#seek: Sets the position of the stream to a given integer +offset+
(in bytes), with respect to a given constant +whence+, which is one of:
- +:CUR+ or <tt>IO::SEEK_CUR</tt>:
Repositions the stream to its current position plus the given +offset+:
f = File.new('t.txt')
f.tell # => 0
f.seek(20, :CUR) # => 0
f.tell # => 20
f.seek(-10, :CUR) # => 0
f.tell # => 10
f.close
- +:END+ or <tt>IO::SEEK_END</tt>:
Repositions the stream to its end plus the given +offset+:
f = File.new('t.txt')
f.tell # => 0
f.seek(0, :END) # => 0 # Repositions to stream end.
f.tell # => 47
f.seek(-20, :END) # => 0
f.tell # => 27
f.seek(-40, :END) # => 0
f.tell # => 7
f.close
- +:SET+ or <tt>IO::SEEK_SET</tt>:
Repositions the stream to the given +offset+:
f = File.new('t.txt')
f.tell # => 0
f.seek(20, :SET) # => 0
f.tell # => 20
f.seek(40, :SET) # => 0
f.tell # => 40
f.close
- IO#rewind: Positions the stream to the beginning:
f = File.new('t.txt')
f.tell # => 0
f.gets # => "First line\n"
f.tell # => 11
f.rewind # => 0
f.tell # => 0
f.close
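Because positions are byte offsets, the same methods can be tried on an in-memory stream.
The following is a small sketch using \StringIO, with the whence constants
rather than the symbol forms; the variable names are illustrative assumptions:

```ruby
require 'stringio'

# 11 bytes + 12 bytes = 23 bytes in all.
strio = StringIO.new("First line\nSecond line\n")
positions = []
positions << strio.tell       # Initial position.
strio.gets                    # Reads "First line\n" (11 bytes).
positions << strio.tell       # Position after the first line.
strio.seek(-4, IO::SEEK_END)  # 4 bytes before the end.
positions << strio.tell
strio.seek(3, IO::SEEK_CUR)   # 3 bytes forward from the current position.
positions << strio.tell
strio.rewind                  # Back to the beginning.
positions << strio.tell
positions # => [0, 11, 19, 22, 0]
```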
=== Lines
Some reader methods in \IO streams are line-oriented;
such a method reads one or more lines,
which are separated by an implicit or explicit line separator.
These methods are included (except as noted) in classes Kernel, IO, File,
and {ARGF}[rdoc-ref:ARGF]:
- IO#each_line: Passes each line to the block; not in Kernel:
f = File.new('t.txt')
f.each_line {|line| p line }
Output:
"First line\n"
"Second line\n"
"\n"
"Fourth line\n"
"Fifth line\n"
The reading may begin mid-line:
f = File.new('t.txt')
f.pos = 27
f.each_line {|line| p line }
Output:
"rth line\n"
"Fifth line\n"
- IO#gets: Returns the next line (which may begin mid-line):
f = File.new('t.txt')
f.gets # => "First line\n"
f.gets # => "Second line\n"
f.pos = 27
f.gets # => "rth line\n"
f.readlines # => ["Fifth line\n"]
f.gets # => nil
- IO#readline: Like #gets, but raises an exception at end-of-file;
not in StringIO.
- IO#readlines: Returns all remaining lines in an array;
may begin mid-line:
f = File.new('t.txt')
f.pos = 19
f.readlines # => ["ine\n", "\n", "Fourth line\n", "Fifth line\n"]
f.readlines # => []
Each of these methods may be called with:
- An optional line separator, +sep+.
- An optional line-size limit, +limit+.
- Both +sep+ and +limit+.
==== Line Separator
The default line separator is given by the global variable <tt>$/</tt>,
whose value is by default <tt>"\n"</tt>.
The line to be read next is all data from the current position
to the next line separator:
f = File.new('t.txt')
f.gets # => "First line\n"
f.gets # => "Second line\n"
f.gets # => "\n"
f.gets # => "Fourth line\n"
f.gets # => "Fifth line\n"
f.close
You can specify a different line separator:
f = File.new('t.txt')
f.gets('l') # => "First l"
f.gets('li') # => "ine\nSecond li"
f.gets('lin') # => "ne\n\nFourth lin"
f.gets # => "e\n"
f.close
There are two special line separators:
- +nil+: The entire stream is read into a single string:
f = File.new('t.txt')
f.gets(nil) # => "First line\nSecond line\n\nFourth line\nFifth line\n"
f.close
- <tt>''</tt> (the empty string): The next "paragraph" is read
(paragraphs being separated by two consecutive line separators):
f = File.new('t.txt')
f.gets('') # => "First line\nSecond line\n\n"
f.gets('') # => "Fourth line\nFifth line\n"
f.close
==== Line Limit
The line to be read may be further defined by an optional integer argument +limit+,
which specifies that the number of bytes returned may not be (much) longer
than the given +limit+;
a multi-byte character will not be split, and so a line may be slightly longer
than the given limit.
If +limit+ is not given, the line is determined only by +sep+.
# Text with 1-byte characters.
File.open('t.txt') {|f| f.gets(1) } # => "F"
File.open('t.txt') {|f| f.gets(2) } # => "Fi"
File.open('t.txt') {|f| f.gets(3) } # => "Fir"
File.open('t.txt') {|f| f.gets(4) } # => "Firs"
# No more than one line.
File.open('t.txt') {|f| f.gets(10) } # => "First line"
File.open('t.txt') {|f| f.gets(11) } # => "First line\n"
File.open('t.txt') {|f| f.gets(12) } # => "First line\n"
# Text with 2-byte characters, which will not be split.
File.open('t.rus') {|f| f.gets(1).size } # => 1
File.open('t.rus') {|f| f.gets(2).size } # => 1
File.open('t.rus') {|f| f.gets(3).size } # => 2
File.open('t.rus') {|f| f.gets(4).size } # => 2
==== Line Separator and Line Limit
With arguments +sep+ and +limit+ given,
combines the two behaviors:
- Returns the next line as determined by line separator +sep+.
- But returns no more bytes than are allowed by the limit.
Example:
File.open('t.txt') {|f| f.gets('li', 20) } # => "First li"
File.open('t.txt') {|f| f.gets('li', 2) } # => "Fi"
==== Line Number
A readable \IO stream has a _line_ _number_,
which is the non-negative integer line number
in the stream where the next read will occur.
A new stream initially has line number +0+.
\Method IO#lineno returns the line number.
Reading lines from a stream usually changes its line number:
f = File.new('t.txt', 'r')
f.lineno # => 0
f.readline # => "First line\n"
f.lineno # => 1
f.readline # => "Second line\n"
f.lineno # => 2
f.readline # => "\n"
f.lineno # => 3
f.eof? # => false
f.close
Iterating over lines in a stream usually changes its line number:
f = File.new('t.txt')
f.each_line do |line|
p "position=#{f.pos} eof?=#{f.eof?} lineno=#{f.lineno}"
end
f.close
Output:
"position=11 eof?=false lineno=1"
"position=23 eof?=false lineno=2"
"position=24 eof?=false lineno=3"
"position=36 eof?=false lineno=4"
"position=47 eof?=true lineno=5"
==== Line Options
A number of \IO methods accept optional keyword arguments
that determine how lines in a stream are to be treated:
- +:chomp+: If +true+, line separators are omitted; default is +false+.
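A brief sketch of the +:chomp+ option, using a \StringIO stream so that it does not
depend on the example files; the variable names are assumptions for illustration:

```ruby
require 'stringio'

strio = StringIO.new("First line\nSecond line\n")
with_separator = strio.gets                 # Separator kept (the default).
without_separator = strio.gets(chomp: true) # Separator omitted.
strio.rewind
# Without a block, each_line returns an enumerator.
all_chomped = strio.each_line(chomp: true).to_a
with_separator    # => "First line\n"
without_separator # => "Second line"
all_chomped       # => ["First line", "Second line"]
```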
=== Open and Closed \IO Streams
A new \IO stream may be open for reading, open for writing, or both.
You can close a stream using these methods:
- IO#close: Closes the stream for both reading and writing.
- IO#close_read (not available in \ARGF): Closes the stream for reading.
- IO#close_write (not available in \ARGF): Closes the stream for writing.
You can query whether a stream is closed using these methods:
- IO#closed?: Returns whether the stream is closed.
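The following sketch closes the two directions of a stream separately.
It uses a \StringIO stream over a mutable string (an assumption made here so that
the stream is open for both reading and writing); for such a stream,
+closed?+ returns +true+ only after both directions are closed:

```ruby
require 'stringio'

# A mutable (unfrozen) string gives a stream open for reading and writing.
strio = StringIO.new(+"First line\n")
states = []
states << strio.closed?  # Open in both directions.
strio.close_write
states << strio.closed?  # Still open for reading.
line = strio.gets        # Reading still works.
strio.close_read
states << strio.closed?  # Now closed in both directions.
states # => [false, false, true]
line   # => "First line\n"
```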
=== Stream End-of-File
You can query whether a stream is at end-of-file using this method:
- IO#eof? (also aliased as +#eof+):
Returns whether the stream is at end-of-file.
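As a small sketch, +eof?+ can drive a read loop; the example uses a \StringIO stream,
and the variable names are illustrative assumptions:

```ruby
require 'stringio'

strio = StringIO.new("First line\nSecond line\n")
lines = []
# Read lines until the stream reports end-of-file.
lines << strio.gets until strio.eof?
at_end = strio.eof?    # true once all data has been read.
after_end = strio.gets # gets returns nil at end-of-file.
lines     # => ["First line\n", "Second line\n"]
at_end    # => true
after_end # => nil
```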