mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
ruby--ruby/doc/io_streams.rdoc
Burdette Lamar 56d773dc6f
New page IO Streams (#6383)
This page provides an overview of IO streams. It's meant to be linked to from many other doc spots. In particular it will be linked to from many places in ARGF, File, IO, and StringIO.
2022-09-21 16:34:55 -05:00


== \IO Streams
Ruby supports processing data as \IO streams;
that is, as data that may be read, re-read, written, re-written,
and traversed via iteration.
Core classes with such support include:
- IO, and its derived class File.
- {StringIO}[rdoc-ref:StringIO]: for processing a string.
- {ARGF}[rdoc-ref:ARGF]: for processing files cited on the command line.
Pre-existing stream objects, referenced by global variables or constants, include:
- $stdin: read-only instance of \IO.
- $stdout: write-only instance of \IO.
- $stderr: write-only instance of \IO.
- \ARGF: read-only instance of \ARGF.
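Because <tt>$stdout</tt> is a global variable, it can be temporarily pointed at another stream. The following is a minimal sketch of that idea; the helper name +capture_stdout+ is hypothetical and not part of Ruby:

```ruby
require 'stringio'

# Hypothetical helper: runs the block with $stdout redirected to a
# StringIO, returns whatever was written, and restores $stdout afterward.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

capture_stdout { puts 'First line' } # => "First line\n"
```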
You can create stream objects:
- \File:
- File.new: returns a new \File object.
- File.open: passes a new \File object to the given block.
- \IO:
- IO.new: returns a new \IO object for the given integer file descriptor.
- IO.open: passes a new \IO object to the given block.
- IO.popen: returns a new \IO object that is connected to the $stdin
and $stdout of a newly-launched subprocess.
- Kernel#open: returns a new \IO object connected to a given source:
stream, file, or subprocess.
- \StringIO:
- StringIO.new: returns a new \StringIO object.
- StringIO.open: passes a new \StringIO object to the given block.
(You cannot create an \ARGF object, but one already exists.)
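As a rough illustration of the block forms listed above, this sketch reads the first line of the same content through a \File stream and through a \StringIO stream. The helper name +first_lines+ and the file name <tt>sample.txt</tt> are assumptions made for the example:

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical helper: returns the first line of +content+ as read from
# a File stream and from a StringIO stream.
def first_lines(content)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'sample.txt') # hypothetical file name
    File.binwrite(path, content)
    # The block forms close the stream when the block exits.
    from_file = File.open(path) {|f| f.gets }
    from_string = StringIO.open(content) {|s| s.gets }
    [from_file, from_string]
  end
end

first_lines("First line\nSecond line\n") # => ["First line\n", "First line\n"]
```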
=== About the Examples
Many examples here use these variables:
# English text with newlines.
text = <<~EOT
First line
Second line

Fourth line
Fifth line
EOT
# Russian text.
russian = "\u{442 435 441 442}" # => "тест"
# Binary data.
data = "\u9990\u9991\u9992\u9993\u9994"
# Text file.
File.write('t.txt', text)
# File with Russian text.
File.write('t.rus', russian)
# File with binary data.
f = File.new('t.dat', 'wb:UTF-16')
f.write(data)
f.close
=== Position
An \IO stream has a nonnegative integer _position_,
which is the byte offset at which the next read or write is to occur;
the relevant methods:
- +#tell+ (aliased as #pos): Returns the current position (in bytes) in the stream:
f = File.new('t.txt')
f.tell # => 0
f.gets # => "First line\n"
f.tell # => 11
f.close
- +#pos=+: Sets the position of the stream (in bytes):
f = File.new('t.txt')
f.tell # => 0
f.pos = 20 # => 20
f.tell # => 20
f.close
- +#seek+: Sets the position of the stream to a given integer +offset+
(in bytes), with respect to a given constant +whence+, which is one of:
- +:CUR+ or <tt>IO::SEEK_CUR</tt>:
Repositions the stream to its current position plus the given +offset+:
f = File.new('t.txt')
f.tell # => 0
f.seek(20, :CUR) # => 0
f.tell # => 20
f.seek(-10, :CUR) # => 0
f.tell # => 10
f.close
- +:END+ or <tt>IO::SEEK_END</tt>:
Repositions the stream to its end plus the given +offset+:
f = File.new('t.txt')
f.tell # => 0
f.seek(0, :END) # => 0 # Repositions to stream end.
f.tell # => 47
f.seek(-20, :END) # => 0
f.tell # => 27
f.seek(-40, :END) # => 0
f.tell # => 7
f.close
- +:SET+ or <tt>IO::SEEK_SET</tt>:
Repositions the stream to the given +offset+:
f = File.new('t.txt')
f.tell # => 0
f.seek(20, :SET) # => 0
f.tell # => 20
f.seek(40, :SET) # => 0
f.tell # => 40
f.close
- +#rewind+: Positions the stream to the beginning:
f = File.new('t.txt')
f.tell # => 0
f.gets # => "First line\n"
f.tell # => 11
f.rewind # => 0
f.tell # => 0
f.close
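The position methods can be combined to peek at another part of a stream without disturbing the caller's place in it. The sketch below is one way to do that; the helper name +tail_bytes+ is hypothetical, and a \StringIO is used only to keep the example self-contained:

```ruby
require 'stringio'

# Hypothetical helper: returns the last +n+ bytes of a seekable stream,
# then restores the position the stream had on entry.
def tail_bytes(io, n)
  saved = io.tell
  io.seek(-n, IO::SEEK_END) # Position n bytes before the end.
  io.read
ensure
  io.pos = saved if saved
end

s = StringIO.new("First line\nSecond line\n")
s.gets           # => "First line\n"
s.tell           # => 11
tail_bytes(s, 5) # => "line\n"
s.tell           # => 11
```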
=== Lines
Some reader methods in \IO streams are line-oriented;
such a method reads one or more lines,
which are separated by an implicit or explicit line separator.
These methods are included (except as noted) in classes Kernel, IO, File,
and {ARGF}[rdoc-ref:ARGF]:
- +#each_line+ - passes each line to the block; not in Kernel:
f = File.new('t.txt')
f.each_line {|line| p line }
Output:
"First line\n"
"Second line\n"
"\n"
"Fourth line\n"
"Fifth line\n"
The reading may begin mid-line:
f = File.new('t.txt')
f.pos = 27
f.each_line {|line| p line }
Output:
"rth line\n"
"Fifth line\n"
- +#gets+ - returns the next line (which may begin mid-line):
f = File.new('t.txt')
f.gets # => "First line\n"
f.gets # => "Second line\n"
f.pos = 27
f.gets # => "rth line\n"
f.readlines # => ["Fifth line\n"]
f.gets # => nil
f.close
- +#readline+ - like #gets, but raises an exception at end-of-file;
not in StringIO.
- +#readlines+ - returns all remaining lines in an array;
may begin mid-line:
f = File.new('t.txt')
f.pos = 19
f.readlines # => ["ine\n", "\n", "Fourth line\n", "Fifth line\n"]
f.readlines # => []
Each of these methods may be called with:
- An optional line separator, +sep+.
- An optional line-size limit, +limit+.
- Both +sep+ and +limit+.
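The difference between +#gets+ and +#readline+ at end-of-file can be seen in a short sketch. The helper name +read_past_end+ and the temporary file are assumptions made for the example:

```ruby
require 'tmpdir'

# Hypothetical helper: consumes all lines, then tries to read once more
# with #gets (which returns nil) and with #readline (which raises).
def read_past_end(path)
  File.open(path) do |f|
    f.readlines          # Consume every remaining line.
    from_gets = f.gets   # nil at end-of-file.
    begin
      f.readline
    rescue EOFError => e
      [from_gets, e.class]
    end
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 't.txt')
  File.binwrite(path, "First line\nSecond line\n")
  read_past_end(path) # => [nil, EOFError]
end
```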
==== Line Separator
The default line separator is given by the global variable <tt>$/</tt>,
whose value is by default <tt>"\n"</tt>.
The line to be read next is all data from the current position
to the next line separator:
f = File.new('t.txt')
f.gets # => "First line\n"
f.gets # => "Second line\n"
f.gets # => "\n"
f.gets # => "Fourth line\n"
f.gets # => "Fifth line\n"
f.close
You can specify a different line separator:
f = File.new('t.txt')
f.gets('l') # => "First l"
f.gets('li') # => "ine\nSecond li"
f.gets('lin') # => "ne\n\nFourth lin"
f.gets # => "e\n"
f.close
There are two special line separators:
- +nil+: The entire stream is read into a single string:
f = File.new('t.txt')
f.gets(nil) # => "First line\nSecond line\n\nFourth line\nFifth line\n"
f.close
- <tt>''</tt> (the empty string): The next "paragraph" is read
(paragraphs being separated by two consecutive line separators):
f = File.new('t.txt')
f.gets('') # => "First line\nSecond line\n\n"
f.gets('') # => "Fourth line\nFifth line\n"
f.close
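The paragraph separator also works with iteration. This sketch collects every paragraph of a file by passing <tt>''</tt> to +#each_line+; the helper name +paragraphs+ is hypothetical:

```ruby
require 'tmpdir'

# Hypothetical helper: returns the paragraphs of the file at +path+.
def paragraphs(path)
  File.open(path) {|f| f.each_line('').to_a }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 't.txt')
  File.binwrite(path, "First line\nSecond line\n\nFourth line\nFifth line\n")
  paragraphs(path)
  # => ["First line\nSecond line\n\n", "Fourth line\nFifth line\n"]
end
```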
==== Line Limit
The line to be read may be further restricted by an optional integer argument +limit+,
which caps the number of bytes returned at +limit+,
with one exception:
a multi-byte character will not be split, and so the result may exceed the limit
by the few bytes needed to complete that character.
If +limit+ is not given, the line is determined only by +sep+.
# Text with 1-byte characters.
File.open('t.txt') {|f| f.gets(1) } # => "F"
File.open('t.txt') {|f| f.gets(2) } # => "Fi"
File.open('t.txt') {|f| f.gets(3) } # => "Fir"
File.open('t.txt') {|f| f.gets(4) } # => "Firs"
# No more than one line.
File.open('t.txt') {|f| f.gets(10) } # => "First line"
File.open('t.txt') {|f| f.gets(11) } # => "First line\n"
File.open('t.txt') {|f| f.gets(12) } # => "First line\n"
# Text with 2-byte characters, which will not be split.
File.open('t.rus') {|f| f.gets(1).size } # => 1
File.open('t.rus') {|f| f.gets(2).size } # => 1
File.open('t.rus') {|f| f.gets(3).size } # => 2
File.open('t.rus') {|f| f.gets(4).size } # => 2
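Because a multi-byte character is never split, reading with a limit can be used to consume text in small pieces that are each validly encoded. A minimal sketch, assuming UTF-8 content; the helper name +read_pieces+ is hypothetical:

```ruby
require 'tmpdir'

# Hypothetical helper: reads the file at +path+ in pieces of about
# +limit+ bytes; a piece grows only as needed to complete a character.
def read_pieces(path, limit)
  File.open(path, 'r:UTF-8') do |f|
    pieces = []
    while (piece = f.gets(limit))
      pieces << piece
    end
    pieces
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 't.rus')
  File.binwrite(path, "\u{442 435 441 442}") # "тест": four 2-byte characters.
  read_pieces(path, 3) # => ["те", "ст"]
end
```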
==== Line Separator and Line Limit
With arguments +sep+ and +limit+ both given,
a line-oriented method combines the two behaviors:
- Returns the next line as determined by line separator +sep+.
- But returns no more bytes than are allowed by the limit.
Example:
File.open('t.txt') {|f| f.gets('li', 20) } # => "First li"
File.open('t.txt') {|f| f.gets('li', 2) } # => "Fi"
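One use for giving both arguments is to bound the size of each read while still stopping at line ends. The following sketch shows the idea on a \StringIO; the helper name +bounded_lines+ is hypothetical:

```ruby
require 'stringio'

# Hypothetical helper: reads newline-terminated pieces of at most
# +max_bytes+ bytes each, so that a long line arrives in several pieces.
def bounded_lines(io, max_bytes)
  pieces = []
  while (piece = io.gets("\n", max_bytes))
    pieces << piece
  end
  pieces
end

s = StringIO.new("First line\nSecond line\n")
bounded_lines(s, 8) # => ["First li", "ne\n", "Second l", "ine\n"]
```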
==== Line Number
A readable \IO stream has a _line_ _number_,
which is the non-negative integer line number
in the stream where the next read will occur.
A new stream initially has line number +0+.
\Method IO#lineno returns the line number.
Reading lines from a stream usually changes its line number:
f = File.new('t.txt', 'r')
f.lineno # => 0
f.readline # => "First line\n"
f.lineno # => 1
f.readline # => "Second line\n"
f.lineno # => 2
f.readline # => "\n"
f.lineno # => 3
f.eof? # => false
f.close
Iterating over lines in a stream usually changes its line number:
f = File.new('t.txt')
f.each_line do |line|
p "position=#{f.pos} eof?=#{f.eof?} lineno=#{f.lineno}"
end
f.close
Output:
"position=11 eof?=false lineno=1"
"position=23 eof?=false lineno=2"
"position=24 eof?=false lineno=3"
"position=36 eof?=false lineno=4"
"position=47 eof?=true lineno=5"
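Since the line number advances as each line is read, it can be used during iteration to report where something was found. A small sketch, with the hypothetical helper name +line_number_of+:

```ruby
require 'stringio'

# Hypothetical helper: returns the line number of the first line that
# contains +fragment+, or nil if no line does.
def line_number_of(io, fragment)
  io.each_line do |line|
    return io.lineno if line.include?(fragment)
  end
  nil
end

text = "First line\nSecond line\n\nFourth line\nFifth line\n"
line_number_of(StringIO.new(text), 'Fourth') # => 4
line_number_of(StringIO.new(text), 'Sixth')  # => nil
```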
==== Line Options
A number of \IO methods accept optional keyword arguments
that determine how lines in a stream are to be treated:
- +:chomp+: If +true+, line separators are omitted; default is +false+.
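For instance, passing <tt>chomp: true</tt> to +#readlines+ yields the lines without their separators. This is a minimal sketch; the helper name +clean_lines+ is hypothetical:

```ruby
require 'stringio'

# Hypothetical helper: returns all remaining lines with the trailing
# line separator removed from each.
def clean_lines(io)
  io.readlines(chomp: true)
end

text = "First line\nSecond line\n\nFourth line\nFifth line\n"
clean_lines(StringIO.new(text))
# => ["First line", "Second line", "", "Fourth line", "Fifth line"]
```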
=== Open and Closed \IO Streams
A new \IO stream may be open for reading, open for writing, or both.
You can close a stream using these methods:
- +#close+ - closes the stream for both reading and writing.
- +#close_read+ (not available in \ARGF) - closes the stream for reading.
- +#close_write+ (not available in \ARGF) - closes the stream for writing.
You can query whether a stream is closed using this method:
- +#closed?+ - returns whether the stream is closed.
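The effect of closing each direction separately can be sketched with a read-write \StringIO, where +#closed?+ becomes +true+ only after both directions are closed. The helper name +closed_states+ is hypothetical:

```ruby
require 'stringio'

# Hypothetical helper: records #closed? before and after closing each
# direction of a read-write stream.
def closed_states
  io = StringIO.new(+'data') # Unfrozen string, so open for reading and writing.
  states = [io.closed?]
  io.close_read
  states << io.closed?       # Still open for writing.
  io.close_write
  states << io.closed?       # Closed in both directions.
  states
end

closed_states # => [false, false, true]
```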
=== Stream End-of-File
You can query whether a stream is at end-of-file using this method:
- +#eof?+ (also aliased as +#eof+) -
returns whether the stream is at end-of-file.
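A common use of +#eof?+ is as the loop condition when reading fixed-size chunks. The sketch below assumes a hypothetical helper name, +each_chunk+:

```ruby
require 'stringio'

# Hypothetical helper: reads the stream in chunks of up to +size+ bytes
# until end-of-file, and returns the chunks.
def each_chunk(io, size)
  chunks = []
  chunks << io.read(size) until io.eof?
  chunks
end

each_chunk(StringIO.new('abcdefg'), 3) # => ["abc", "def", "g"]
```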