[DOC] Main doc for encodings moved from encoding.c to doc/encodings.rdoc (#5748)

Main doc for encodings moved from encoding.c to doc/encodings.rdoc
Merged-By: BurdetteLamar <BurdetteLamar@Yahoo.com>
2022-11-09 12:17:21 -05:00 · 2022-04-01 20:41:04 -05:00 · 2022-04-01 20:41:04 -05:00 · 81741690a0 · 2022-04-02 10:41:30 +09:00
commit 81741690a0
parent 6068da8937
3 changed files with 22 additions and 192 deletions
--- a/doc/encodings.rdoc
+++ b/doc/encodings.rdoc
@@ -1,4 +1,4 @@
-== \Encoding
+== Encodings
 === The Basics
--- a/doc/string/new.rdoc
+++ b/doc/string/new.rdoc
@@ -12,7 +12,7 @@ with the same encoding as +string+:
  s.encoding # => #<Encoding:UTF-8>
 Literal strings like <tt>""</tt> or here-documents always use
-Encoding@Script+encoding, unlike String.new.
+{script encoding}[rdoc-ref:encodings.rdoc@Script+Encoding], unlike String.new.
 With keyword +encoding+, returns a copy of +str+
 with the specified encoding:
--- a/encoding.c
+++ b/encoding.c
@@ -1936,203 +1936,33 @@ rb_enc_aliases(VALUE klass)
 }
 /*
- * An Encoding instance represents a character encoding usable in Ruby. It is
+ * An \Encoding instance represents a character encoding usable in Ruby.
- * defined as a constant under the Encoding namespace. It has a name and
+ * It is defined as a constant under the \Encoding namespace.
- * optionally, aliases:
+ * It has a name and, optionally, aliases:
 *
- *   Encoding::ISO_8859_1.name
+ *   Encoding::US_ASCII.name  # => "US-ASCII"
- *   #=> "ISO-8859-1"
+ *   Encoding::US_ASCII.names # => ["US-ASCII", "ASCII", "ANSI_X3.4-1968", "646"]
 *
- *   Encoding::ISO_8859_1.names
+ * A Ruby method that accepts an encoding as an argument will accept:
- *   #=> ["ISO-8859-1", "ISO8859-1"]
 *
- * Ruby methods dealing with encodings return or accept Encoding instances as
+ * - An \Encoding object.
- * arguments (when a method accepts an Encoding instance as an argument, it
+ * - The name of an encoding.
- * can be passed an Encoding name or alias instead).
+ * - An alias for an encoding name.
 *
- *   "some string".encoding
+ * These are equivalent:
- *   #=> #<Encoding:UTF-8>
 *
- *   string = "some string".encode(Encoding::ISO_8859_1)
+ *   'foo'.encode(Encoding::US_ASCII) # Encoding object.
- *   #=> "some string"
+ *   'foo'.encode('US-ASCII')         # Encoding name.
- *   string.encoding
+ *   'foo'.encode('ASCII')            # Encoding alias.
- *   #=> #<Encoding:ISO-8859-1>
 *
- *   "some string".encode "ISO-8859-1"
+ * For a full discussion of encodings and their uses,
- *   #=> "some string"
+ * see {the Encodings document}[rdoc-ref:encodings.rdoc].
 *
- * Encoding::ASCII_8BIT is a special encoding that is usually used for
+ * Encoding::ASCII_8BIT is a special-purpose encoding that is usually used for
- * a byte string, not a character string. But as the name insists, its
+ * a string of bytes, not a string of characters.
- * characters in the range of ASCII are considered as ASCII
+ * But as the name indicates, its characters in the ASCII range
- * characters.  This is useful when you use ASCII-8BIT characters with
+ * are considered as ASCII characters.
- * other ASCII compatible characters.
+ * This is useful when you use other ASCII-compatible encodings.
 *
- * == Changing an encoding
- *
- * The associated Encoding of a String can be changed in two different ways.
- *
- * First, it is possible to set the Encoding of a string to a new Encoding
- * without changing the internal byte representation of the string, with
- * String#force_encoding. This is how you can tell Ruby the correct encoding
- * of a string.
- *
- *   string
- *   #=> "R\xC3\xA9sum\xC3\xA9"
- *   string.encoding
- *   #=> #<Encoding:ISO-8859-1>
- *   string.force_encoding(Encoding::UTF_8)
- *   #=> "R\u00E9sum\u00E9"
- *
- * Second, it is possible to transcode a string, i.e. translate its internal
- * byte representation to another encoding. Its associated encoding is also
- * set to the other encoding. See String#encode for the various forms of
- * transcoding, and the Encoding::Converter class for additional control over
- * the transcoding process.
- *
- *   string
- *   #=> "R\u00E9sum\u00E9"
- *   string.encoding
- *   #=> #<Encoding:UTF-8>
- *   string = string.encode!(Encoding::ISO_8859_1)
- *   #=> "R\xE9sum\xE9"
- *   string.encoding
- *   #=> #<Encoding::ISO-8859-1>
- *
- * == Script encoding
- *
- * All Ruby script code has an associated Encoding which any String literal
- * created in the source code will be associated to.
- *
- * The default script encoding is Encoding::UTF_8 after v2.0, but it
- * can be changed by a magic comment on the first line of the source
- * code file (or second line, if there is a shebang line on the
- * first). The comment must contain the word <code>coding</code> or
- * <code>encoding</code>, followed by a colon, space and the Encoding
- * name or alias:
- *
- *   # encoding: UTF-8
- *
- *   "some string".encoding
- *   #=> #<Encoding:UTF-8>
- *
- * The <code>__ENCODING__</code> keyword returns the script encoding of the file
- * which the keyword is written:
- *
- *   # encoding: ISO-8859-1
- *
- *   __ENCODING__
- *   #=> #<Encoding:ISO-8859-1>
- *
- * <code>ruby -K</code> will change the default locale encoding, but this is
- * not recommended. Ruby source files should declare its script encoding by a
- * magic comment even when they only depend on US-ASCII strings or regular
- * expressions.
- *
- * == Locale encoding
- *
- * The default encoding of the environment. Usually derived from locale.
- *
- * see Encoding.locale_charmap, Encoding.find('locale')
- *
- * == Filesystem encoding
- *
- * The default encoding of strings from the filesystem of the environment.
- * This is used for strings of file names or paths.
- *
- * see Encoding.find('filesystem')
- *
- * == External encoding
- *
- * Each IO object has an external encoding which indicates the encoding that
- * Ruby will use to read its data. By default Ruby sets the external encoding
- * of an IO object to the default external encoding. The default external
- * encoding is set by locale encoding or the interpreter <code>-E</code> option.
- * Encoding.default_external returns the current value of the external
- * encoding.
- *
- *   ENV["LANG"]
- *   #=> "UTF-8"
- *   Encoding.default_external
- *   #=> #<Encoding:UTF-8>
- *
- *   $ ruby -E ISO-8859-1 -e "p Encoding.default_external"
- *   #<Encoding:ISO-8859-1>
- *
- *   $ LANG=C ruby -e 'p Encoding.default_external'
- *   #<Encoding:US-ASCII>
- *
- * The default external encoding may also be set through
- * Encoding.default_external=, but you should not do this as strings created
- * before and after the change will have inconsistent encodings.  Instead use
- * <code>ruby -E</code> to invoke ruby with the correct external encoding.
- *
- * When you know that the actual encoding of the data of an IO object is not
- * the default external encoding, you can reset its external encoding with
- * IO#set_encoding or set it at IO object creation (see IO.new options).
- *
- * == Internal encoding
- *
- * To process the data of an IO object which has an encoding different
- * from its external encoding, you can set its internal encoding. Ruby will use
- * this internal encoding to transcode the data when it is read from the IO
- * object.
- *
- * Conversely, when data is written to the IO object it is transcoded from the
- * internal encoding to the external encoding of the IO object.
- *
- * The internal encoding of an IO object can be set with
- * IO#set_encoding or at IO object creation (see IO.new options).
- *
- * The internal encoding is optional and when not set, the Ruby default
- * internal encoding is used. If not explicitly set this default internal
- * encoding is +nil+ meaning that by default, no transcoding occurs.
- *
- * The default internal encoding can be set with the interpreter option
- * <code>-E</code>. Encoding.default_internal returns the current internal
- * encoding.
- *
- *    $ ruby -e 'p Encoding.default_internal'
- *    nil
- *
- *    $ ruby -E ISO-8859-1:UTF-8 -e "p [Encoding.default_external, \
- *      Encoding.default_internal]"
- *    [#<Encoding:ISO-8859-1>, #<Encoding:UTF-8>]
- *
- * The default internal encoding may also be set through
- * Encoding.default_internal=, but you should not do this as strings created
- * before and after the change will have inconsistent encodings.  Instead use
- * <code>ruby -E</code> to invoke ruby with the correct internal encoding.
- *
- * == IO encoding example
- *
- * In the following example a UTF-8 encoded string "R\u00E9sum\u00E9" is transcoded for
- * output to ISO-8859-1 encoding, then read back in and transcoded to UTF-8:
- *
- *   string = "R\u00E9sum\u00E9"
- *
- *   open("transcoded.txt", "w:ISO-8859-1") do |io|
- *     io.write(string)
- *   end
- *
- *   puts "raw text:"
- *   p File.binread("transcoded.txt")
- *   puts
- *
- *   open("transcoded.txt", "r:ISO-8859-1:UTF-8") do |io|
- *     puts "transcoded text:"
- *     p io.read
- *   end
- *
- * While writing the file, the internal encoding is not specified as it is
- * only necessary for reading.  While reading the file both the internal and
- * external encoding must be specified to obtain the correct result.
- *
- *   $ ruby t.rb
- *   raw text:
- *   "R\xE9sum\xE9"
- *
- *   transcoded text:
- *   "R\u00E9sum\u00E9"
- *
 */
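
Reader's note, not part of the commit: the shortened comment in encoding.c says an encoding argument may be an Encoding object, a name, or an alias, and the sections that moved to doc/encodings.rdoc distinguish String#force_encoding (relabel the bytes) from String#encode (rewrite the bytes) and describe external and internal IO encodings. The following is a minimal Ruby sketch of those three points. It is an adaptation rather than text from either file: the variable names are illustrative, and Tempfile from the standard library is assumed in place of the fixed file name transcoded.txt used by the removed example.

```ruby
require 'tempfile'

# The three argument forms name the same encoding.
forms = [Encoding::US_ASCII, 'US-ASCII', 'ASCII']
encodings = forms.map { |enc| 'foo'.encode(enc).encoding }.uniq

# force_encoding relabels the existing bytes; encode rewrites them.
bytes      = "R\xC3\xA9sum\xC3\xA9".b                   # 8 raw bytes, tagged ASCII-8BIT
relabeled  = bytes.dup.force_encoding(Encoding::UTF_8)  # same 8 bytes, now read as UTF-8
transcoded = relabeled.encode(Encoding::ISO_8859_1)     # 6 bytes after transcoding

# IO round trip: external encoding on write, external:internal on read.
raw = nil
read_back = nil
Tempfile.create('transcoded') do |tmp|
  path = tmp.path
  tmp.close
  File.open(path, 'w:ISO-8859-1') { |io| io.write(relabeled) }
  raw = File.binread(path)                               # bytes exactly as stored on disk
  File.open(path, 'r:ISO-8859-1:UTF-8') { |io| read_back = io.read }
end

p encodings, relabeled.bytesize, transcoded.bytesize, raw, read_back
```

The byte counts are the quickest way to tell the two String operations apart: relabeling leaves the count unchanged, while transcoding to ISO-8859-1 shrinks each two-byte UTF-8 sequence to a single byte.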