mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00

* Cross-patch from Ruby CVS; mostly Nabu edits.

* Fixes ticket:68.

  ***** Note that this is an API change!!! *****

  Entity declarations in the doctype now generate events that carry two
  arguments instead of one.

* Implements ticket:15, using gwrite's suggestion.  This allows Element to be
  subclassed.

* Fixes namespace handling in XPath and Element.

  ***** Note that this is an API change!!! *****

  Element.namespaces() now returns a hash of namespace mappings which are
  relevant for that node.

* Fixes a bug in multiple decodings

* The changeset 1230:1231 was bad.  The default behavior is *not* to use the
  native REXML encodings, but rather to use ICONV.  I'll have to think of a
  better way of managing translations, but the REXML codecs are (a) less
  reliable than ICONV and, more importantly, (b) slower.  The real solution is
  to use ICONV by default, but allow users to specify that they want to use
  the pure Ruby codecs.

* Fixes ticket:61 (xpath_parser)

* Fixes ticket:63 (UTF-16; UNILE decoding was bad)

* Improves parsing error messages a little

* Adds the ability to override the encoding detection in Source construction

* Fixes an edge case in Functions::string, where document nodes weren't
  correctly converted

  * Fixes Functions::string() for Element and Document nodes

  * Fixes some problems in entity handling

* Addresses ticket:66

* Fixes ticket:71

* Addresses ticket:78

    NOTE: this also fixes what is technically another bug in REXML.  REXML's
    XPath parser used to allow exponential notation in numbers.  The XPath spec
    is specific about what a number is, and scientific notation is not included.
    Therefore, the parser now rejects it.
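
The number rule mentioned in the note above can be illustrated with a small
check.  This is only a sketch of the XPath 1.0 `Number` production
(`Digits ('.' Digits?)? | '.' Digits`), not REXML's actual parser code; the
names `XPATH_NUMBER` and `xpath_number?` are hypothetical.

```ruby
# Sketch only: mirrors the XPath 1.0 Number production,
#   Number ::= Digits ('.' Digits?)? | '.' Digits
# XPATH_NUMBER and xpath_number? are illustrative names, not part of REXML.
XPATH_NUMBER = /\A(?:\d+(?:\.\d*)?|\.\d+)\z/

# Returns true when +text+ is a valid XPath 1.0 number literal.
def xpath_number?(text)
  XPATH_NUMBER.match?(text)
end

xpath_number?("12")    # plain digits are accepted
xpath_number?("1.5")   # digits with a fraction are accepted
xpath_number?(".5")    # a leading dot followed by digits is accepted
xpath_number?("1e3")   # exponent notation is not part of the grammar
```

Under this grammar a literal such as `1e3` is not a number at all, which is
why accepting it counted as a bug.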

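As an illustration of the Element.namespaces() change listed above, the sketch
below shows the hash form as returned by the REXML releases bundled with
current Ruby; it is an assumption that the 2006 code behaved identically.  The
sample document and its `urn:` URIs are made up for the example.

```ruby
require 'rexml/document'

# Made-up sample document: the root declares a default namespace and
# prefix "a"; the child additionally declares prefix "b".
xml = <<~XML
  <root xmlns="urn:default" xmlns:a="urn:a">
    <a:child xmlns:b="urn:b"/>
  </root>
XML

doc   = REXML::Document.new(xml)
root  = doc.root
child = root.elements[1]   # REXML element indexes are 1-based

# Each call returns a Hash of prefix => URI for the mappings visible at
# that node; the default namespace is stored under the key "xmlns".
root_ns  = root.namespaces
child_ns = child.namespaces

puts root_ns.inspect
puts child_ns.inspect
```

The child sees the mappings inherited from the root plus its own `b` prefix,
while the root's hash does not contain `b`.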

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_8@11315 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
ser 2006-12-01 02:20:08 +00:00
parent d2205c869e
commit f114b85d89
14 changed files with 136 additions and 81 deletions

@ -6,7 +6,7 @@ module REXML
     # Generates a Source object
     # @param arg Either a String, or an IO
     # @return a Source, or nil if a bad argument was given
-    def SourceFactory::create_from arg#, slurp=true
+    def SourceFactory::create_from(arg)
       if arg.kind_of? String
         Source.new(arg)
       elsif arg.respond_to? :read and
@ -35,16 +35,23 @@ module REXML
     # Constructor
     # @param arg must be a String, and should be a valid XML document
-    def initialize(arg)
+    # @param encoding if non-null, sets the encoding of the source to this
+    # value, overriding all encoding detection
+    def initialize(arg, encoding=nil)
       @orig = @buffer = arg
-      self.encoding = check_encoding( @buffer )
+      if encoding
+        self.encoding = encoding
+      else
+        self.encoding = check_encoding( @buffer )
+      end
       @line = 0
     end
     # Inherited from Encoding
     # Overridden to support optimized en/decoding
     def encoding=(enc)
-      super
+      return unless super
       @line_break = encode( '>' )
       if enc != UTF_8
         @buffer = decode(@buffer)
@ -124,7 +131,7 @@ module REXML
     #attr_reader :block_size
     # block_size has been deprecated
-    def initialize(arg, block_size=500)
+    def initialize(arg, block_size=500, encoding=nil)
       @er_source = @source = arg
       @to_utf = false
       # Determining the encoding is a deceptively difficult issue to resolve.
@ -134,10 +141,12 @@ module REXML
       # if there is one. If there isn't one, the file MUST be UTF-8, as per
       # the XML spec. If there is one, we can determine the encoding from
       # it.
+      @buffer = ""
       str = @source.read( 2 )
-      if /\A(?:\xfe\xff|\xff\xfe)/n =~ str
+      if encoding
+        self.encoding = encoding
+      elsif /\A(?:\xfe\xff|\xff\xfe)/n =~ str
         self.encoding = check_encoding( str )
-        @line_break = encode( '>' )
       else
         @line_break = '>'
       end
@ -159,6 +168,8 @@ module REXML
         str = @source.readline(@line_break)
         str = decode(str) if @to_utf and str
         @buffer << str
+      rescue Iconv::IllegalSequence
+        raise
       rescue
         @source = nil
       end