mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
ruby--ruby/lib/rexml/encoding.rb
ser f114b85d89 * Cross-patch from Ruby CVS; mostly Nabu edits.
* Fixes ticket:68.

  ***** Note that this is an API change!!! *****

  Entity declarations in the doctype now generate events that carry two
  arguments instead of one.

* Implements ticket:15, using gwrite's suggestion.  This allows Element to be
  subclassed.

* Fixes namespace handling in XPath and Element.

  ***** Note that this is an API change!!! *****

  Element.namespaces() now returns a hash of namespace mappings which are
  relevant for that node.

* Fixes a bug in multiple decodings

* Changeset 1230:1231 was bad.  The default behavior is *not* to use the
  native REXML encodings, but rather to use ICONV.  I'll have to think of a
  better way of managing translations, but the REXML codecs are (a) less
  reliable than ICONV and, more importantly, (b) slower.  The real solution
  is to use ICONV by default, but allow users to specify that they want to
  use the pure Ruby codecs.

* Fixes ticket:61 (xpath_parser)

* Fixes ticket:63 (UTF-16; UNILE decoding was bad)

* Improves parsing error messages a little

* Adds the ability to override the encoding detection in Source construction

* Fixes an edge case in Functions::string, where document nodes weren't
  correctly converted

  * Fixes Functions::string() for Element and Document nodes

  * Fixes some problems in entity handling

* Addresses ticket:66

* Fixes ticket:71

* Addresses ticket:78

    NOTE: this also fixes what is technically another bug in REXML.  REXML's
    XPath parser used to allow exponential notation in numbers.  The XPath spec
    defines exactly what a number is, and scientific notation is not part of
    that definition, so the parser no longer accepts it.
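
The number rule mentioned in the note can be illustrated with a small check.
This is only a sketch of the XPath 1.0 Number production
(Number ::= Digits ('.' Digits?)? | '.' Digits), not REXML's parser code;
XPATH_NUMBER and xpath_number? are hypothetical names introduced here.

```ruby
# Sketch of the XPath 1.0 Number production:
#   Number ::= Digits ('.' Digits?)? | '.' Digits
#   Digits ::= [0-9]+
# XPATH_NUMBER and xpath_number? are illustrative names, not REXML API.
XPATH_NUMBER = /\A(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)\z/

# True when the whole token is a number literal under that production.
def xpath_number?(token)
  XPATH_NUMBER.match?(token)
end

# Plain decimals pass; exponential notation does not.
%w[42 3.14 5. .5 1e3 2.5E-4].each do |token|
  puts "#{token.inspect}: #{xpath_number?(token)}"
end
```

Under this production, tokens such as 1e3 are not number literals, which is
the behavior the note describes.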


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_8@11315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-12-01 02:20:08 +00:00
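
The "ICONV by default, pure Ruby codecs as a fallback" policy from the commit
message can be sketched with a small registry. This is a minimal illustration
under stated assumptions, not the REXML implementation: CodecRegistry and
apply_first are hypothetical names, the "ICONV" codec is deliberately left
unregistered to stand in for a failed require, and the ISO-8859-1 codec uses
Ruby's built-in String#encode.

```ruby
# Hypothetical stand-in for a codec registry; all names are illustrative.
module CodecRegistry
  @codecs = {}

  # Store a block that installs decoding behavior on an object.
  def self.register(name, &block)
    @codecs[name] = block
  end

  # Try each codec name in order and apply the first one that is registered.
  # Returns the name of the codec that was used.
  def self.apply_first(obj, *names)
    names.each do |name|
      codec = @codecs[name]
      next unless codec
      codec.call(obj)
      return name
    end
    raise ArgumentError, "No decoder found for #{names.join(', ')}"
  end
end

# A pure-Ruby codec: reinterpret the bytes as ISO-8859-1, transcode to UTF-8.
CodecRegistry.register("ISO-8859-1") do |obj|
  obj.define_singleton_method(:decode) do |bytes|
    bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
  end
end

source = Object.new
# "ICONV" is not registered here, so the named codec is used instead.
used = CodecRegistry.apply_first(source, "ICONV", "ISO-8859-1")
decoded = source.decode("caf\xE9".b)
puts "#{used}: #{decoded}"
```

Listing the preferred codec first keeps the preference order in one place, so
a user who wants the pure Ruby codecs could simply pass a different order.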


# -*- mode: ruby; ruby-indent-level: 2; indent-tabs-mode: t; tab-width: 2 -*- vim: sw=2 ts=2
module REXML
	module Encoding
		# Registry of encoding name => block that installs the codec on an object.
		@encoding_methods = {}
		def self.register(enc, &block)
			@encoding_methods[enc] = block
		end
		def self.apply(obj, enc)
			@encoding_methods[enc][obj]
		end
		def self.encoding_method(enc)
			@encoding_methods[enc]
		end

		# Native, default format is UTF-8, so it is declared here rather than in
		# an encodings/ definition.
		UTF_8 = 'UTF-8'
		UTF_16 = 'UTF-16'
		UNILE = 'UNILE'

		# ID ---> Encoding name
		attr_reader :encoding
		def encoding=( enc )
			old_verbosity = $VERBOSE
			begin
				$VERBOSE = false
				enc = enc.nil? ? nil : enc.upcase
				return false if defined? @encoding and enc == @encoding
				if enc and enc != UTF_8
					@encoding = enc
					# \A and \z anchor the whole string; ^ and $ would only anchor a
					# line, and the name is later interpolated into a require path.
					raise ArgumentError, "Bad encoding name #@encoding" unless @encoding =~ /\A[\w-]+\z/
					@encoding.untaint
					begin
						# Prefer ICONV; on any failure fall back to the pure Ruby codec.
						require 'rexml/encodings/ICONV.rb'
						Encoding.apply(self, "ICONV")
					rescue LoadError, Exception
						begin
							enc_file = File.join( "rexml", "encodings", "#@encoding.rb" )
							require enc_file
							Encoding.apply(self, @encoding)
						rescue LoadError => err
							puts err.message
							raise ArgumentError, "No decoder found for encoding #@encoding. Please install iconv."
						end
					end
				else
					@encoding = UTF_8
					require 'rexml/encodings/UTF-8.rb'
					Encoding.apply(self, @encoding)
				end
			ensure
				$VERBOSE = old_verbosity
			end
			true
		end

		def check_encoding str
			# We have to recognize UTF-16, LSB UTF-16, and UTF-8
			return UTF_16 if /\A\xfe\xff/n =~ str
			return UNILE if /\A\xff\xfe/n =~ str
			# Group 1 is the quote around the version, group 2 the quote around
			# the encoding, and group 3 the encoding name itself.
			str =~ /^\s*<\?xml\s*version\s*=\s*(['"]).*?\1\s*encoding\s*=\s*(["'])(.*?)\2/um
			return $3.upcase if $3
			return UTF_8
		end
	end
end
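
The detection order used by check_encoding (byte order mark first, then the
encoding attribute of the XML declaration, then a UTF-8 default) can be tried
in isolation. The following is a minimal standalone sketch for current Ruby
versions, not the REXML method itself: detect_xml_encoding is a hypothetical
name, and requiring whitespace (\s+) between the attributes is an assumption
of this sketch.

```ruby
# Standalone sketch of BOM / XML-declaration sniffing.
# detect_xml_encoding is a hypothetical helper, not part of REXML.
def detect_xml_encoding(str)
  bytes = str.b # binary copy, so byte-level matching is safe
  return "UTF-16" if bytes.start_with?("\xFE\xFF".b) # big-endian BOM
  return "UNILE"  if bytes.start_with?("\xFF\xFE".b) # little-endian BOM

  # Group 1: quote around the version; group 2: quote around the encoding;
  # group 3: the encoding name.
  if bytes =~ /\A\s*<\?xml\s+version\s*=\s*(['"]).*?\1\s+encoding\s*=\s*(["'])(.*?)\2/mn
    return $3.upcase
  end

  "UTF-8" # no BOM and no declared encoding
end

puts detect_xml_encoding(%q{<?xml version="1.0" encoding='iso-8859-1'?><a/>})
puts detect_xml_encoding("<a/>")
```

Checking the BOM before the declaration matters because a UTF-16 document
cannot be matched by a byte-oriented pattern written for ASCII-compatible text.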