mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00

[DOC] Enhanced RDoc for encoding (#5603)

Additions and corrections for external/internal encodings.
Burdette Lamar 2022-02-27 15:43:23 -06:00 committed by GitHub
parent 7f4345639b
commit 28ee1ca748
GPG key ID: 4AEE18F83AFDEB23
Notes: git 2022-02-28 06:43:54 +09:00
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>


@@ -205,57 +205,74 @@ other than from the filesystem:
   Encoding.find('locale') # => #<Encoding:IBM437>
-=== \IO Encodings
+=== Stream Encodings
-An IO object (an input/output stream), and by inheritance a File object,
-has at least one, and sometimes two, encodings:
-- Its _external_ _encoding_ identifies the encoding of the stream.
-- Its _internal_ _encoding_, if not +nil+, specifies the encoding
+Certain stream objects can have two encodings; these objects include instances of:
+- IO.
+- File.
+- ARGF.
+- StringIO.
+The two encodings are:
+- An _external_ _encoding_, which identifies the encoding of the stream.
+- An _internal_ _encoding_, which (if not +nil+) specifies the encoding
   to be used for the string constructed from the stream.
 ==== External \Encoding
-Bytes read from the stream are decoded into characters via the external encoding;
-by default (that is, if the internal encoding is +nil),
-those characters become a string whose encoding is set to the external encoding.
+The external encoding, which is an \Encoding object, specifies how bytes read
+from the stream are to be interpreted as characters.
 The default external encoding is:
 - UTF-8 for a text stream.
 - ASCII-8BIT for a binary stream.
-  f = File.open('t.rus', 'rb')
-  f.external_encoding # => #<Encoding:ASCII-8BIT>
-The external encoding may be set by the open option +external_encoding+:
-  f = File.open('t.txt', external_encoding: 'ASCII-8BIT')
-  f.external_encoding # => #<Encoding:ASCII-8BIT>
-The external encoding may also set by method #set_encoding:
-  f = File.open('t.txt')
-  f.set_encoding('ASCII-8BIT')
-  f.external_encoding # => #<Encoding:ASCII-8BIT>
+The default external encoding is returned by method Encoding.default_external,
+and may be set by:
+- Ruby command-line options <tt>--external_encoding</tt> or <tt>-E</tt>.
+You can also set the default external encoding using method Encoding.default_external=,
+but doing so may cause problems; strings created before and after the change
+may have a different encodings.
+For an \IO or \File object, the external encoding may be set by:
+- Open options +external_encoding+ or +encoding+, when the object is created;
+  see {Open Options}[rdoc-ref:IO@Open+Options].
+For an \IO, \File, \ARGF, or \StringIO object, the external encoding may be set by:
+- \Methods +set_encoding+ or (except for \ARGF) +set_encoding_by_bom+.
 ==== Internal \Encoding
-If not +nil+, the internal encoding specifies that the characters read
-from the stream are to be converted to characters in the internal encoding;
+The internal encoding, which is an \Encoding object or +nil+,
+specifies how characters read from the stream
+are to be converted to characters in the internal encoding;
 those characters become a string whose encoding is set to the internal encoding.
 The default internal encoding is +nil+ (no conversion).
-The internal encoding may set by the open option +internal_encoding+:
-  f = File.open('t.txt', internal_encoding: 'ASCII-8BIT')
-  f.internal_encoding # => #<Encoding:ASCII-8BIT>
-The internal encoding may also set by method #set_encoding:
-  f = File.open('t.txt')
-  f.set_encoding('UTF-8', 'ASCII-8BIT')
-  f.internal_encoding # => #<Encoding:ASCII-8BIT>
+It is returned by method Encoding.default_internal,
+and may be set by:
+- Ruby command-line options <tt>--internal_encoding</tt> or <tt>-E</tt>.
+You can also set the default internal encoding using method Encoding.default_internal=,
+but doing so may cause problems; strings created before and after the change
+may have a different encodings.
+For an \IO or \File object, the internal encoding may be set by:
+- Open options +internal_encoding+ or +encoding+, when the object is created;
+  see {Open Options}[rdoc-ref:IO@Open+Options].
+For an \IO, \File, \ARGF, or \StringIO object, the internal encoding may be set by:
+- \Method +set_encoding+.
 === Script \Encoding
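The revised External and Internal \Encoding text replaces the inline examples with lists of ways to set each encoding. As a rough illustration of what those lists describe, here is a minimal sketch that is not part of the commit. The helper name +stream_encoding_demo+, the temporary file, and the choice of ISO-8859-1 as the on-disk encoding are all assumptions made for the example; it uses only the open options +external_encoding+ and +internal_encoding+ and the method +set_encoding+ named in the diff.

```ruby
require 'tempfile'

# Hypothetical helper for illustration only; not part of the commit.
# Writes the four bytes "caf\xE9" (ISO-8859-1 for "café") to a temporary
# file, then reads them back under different stream-encoding settings.
def stream_encoding_demo
  Tempfile.create('enc_demo') do |tmp|
    tmp.binmode
    tmp.write("caf\xE9".b)
    tmp.close

    # External encoding only (open option): the bytes are interpreted as
    # ISO-8859-1 characters; with no internal encoding, nothing is converted.
    tagged = File.open(tmp.path, 'r', external_encoding: 'ISO-8859-1') do |f|
      [f.external_encoding, f.internal_encoding, f.read]
    end

    # External plus internal encoding (open options): characters are
    # converted from ISO-8859-1 to UTF-8 as they are read.
    transcoded = File.open(tmp.path, 'r',
                           external_encoding: 'ISO-8859-1',
                           internal_encoding: 'UTF-8') do |f|
      [f.external_encoding, f.internal_encoding, f.read]
    end

    # The same pair, set after the stream is open via set_encoding(ext, int).
    via_method = File.open(tmp.path, 'r') do |f|
      f.set_encoding('ISO-8859-1', 'UTF-8')
      [f.external_encoding, f.internal_encoding, f.read]
    end

    { tagged: tagged, transcoded: transcoded, via_method: via_method }
  end
end
```

Calling <tt>stream_encoding_demo[:tagged].last.bytesize</tt> should give 4 (the raw ISO-8859-1 bytes), while <tt>stream_encoding_demo[:transcoded].last.bytesize</tt> should give 5, because the converted string holds the two-byte UTF-8 form of the final character.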