mirror of https://github.com/ruby/ruby.git, synced 2022-11-09 12:17:21 -05:00
Merged from REXML main repository:
Fixes ticket:68.
NOTE that this involves an API change! Entity declarations in the doctype now generate events that carry two, not one, arguments.

Implements ticket:15, using gwrite's suggestion. This allows Element to be subclassed.

Two unrelated changes, because Subversion doesn't do block-level commits:
1) Fixed a typo bug in previous change for ticket:15
2) Fixed namespaces handling in XPath and element.
***** Note that this is an API change!!! *****
Element.namespaces() now returns a hash of namespace mappings which are relevant for that node.

Fixes a bug in multiple decodings

The changeset 1230:1231 was bad. The default behavior is *not* to use the native REXML encodings by default, but rather to use ICONV by default. I know that this will piss some people off, but defaulting to the pure Ruby version isn't the correct solution, and it breaks other encodings, so I've reverted it.

* Fixes ticket:61 (xpath_parser)
* Fixes ticket:63 (UTF-16; UNILE decoding was bad)
* Cleans up some tests, removing opportunities for test corruption
* Improves parsing error messages a little
* Adds the ability to override the encoding detection in Source construction
* Fixes an edge case in Functions::string, where document nodes weren't correctly converted
* Fixes Functions::string() for Element and Document nodes
* Fixes some problems in entity handling

Addresses ticket:66
Fixes ticket:71
Addresses ticket:78
NOTE that this also fixes what is technically another bug in REXML. REXML's XPath parser used to allow exponential notation in numbers. The XPath spec is specific about what a number is, and scientific notation is not included. Therefore, this has been fixed.

Cross-ported a fix for ticket:88 from CVS.
Fixes ticket:80
Documentation cleanup.
Ticket:84
Applied Kou's fix for an un-trac'ed bug.

------------------------------------------------------------------------
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@11548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
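The XPath number note in the message can be illustrated with a minimal sketch. It assumes the XPath 1.0 grammar, where `Number ::= Digits ('.' Digits?)? | '.' Digits`; the constant `XPATH_NUMBER` and the helper `xpath_number?` are hypothetical names for illustration and are not part of REXML.

```ruby
# Hypothetical helper, not REXML code: checks a literal against the
# XPath 1.0 Number production, which has no exponent part.
#   Number ::= Digits ('.' Digits?)? | '.' Digits
XPATH_NUMBER = /\A(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)\z/

def xpath_number?(text)
  XPATH_NUMBER.match?(text)
end

puts xpath_number?("42")   # plain integer
puts xpath_number?("3.")   # a trailing dot is allowed by the grammar
puts xpath_number?(".5")   # a leading dot is allowed by the grammar
puts xpath_number?("1e3")  # exponent notation is not part of the grammar
```

A parser that follows this grammar would treat `1e3` as something other than a single number token, which is the stricter behavior the message describes.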
parent f700c1354f
commit fa4bfa6af5
13 changed files with 142 additions and 83 deletions
@@ -6,7 +6,7 @@ module REXML
     # Generates a Source object
     # @param arg Either a String, or an IO
     # @return a Source, or nil if a bad argument was given
-    def SourceFactory::create_from arg#, slurp=true
+    def SourceFactory::create_from(arg)
       if arg.kind_of? String
         Source.new(arg)
       elsif arg.respond_to? :read and
@@ -35,12 +35,19 @@ module REXML

     # Constructor
     # @param arg must be a String, and should be a valid XML document
-    def initialize(arg)
+    # @param encoding if non-null, sets the encoding of the source to this
+    # value, overriding all encoding detection
+    def initialize(arg, encoding=nil)
       @orig = @buffer = arg
-      self.encoding = check_encoding( @buffer )
+      if encoding
+        self.encoding = encoding
+      else
+        self.encoding = check_encoding( @buffer )
+      end
       @line = 0
     end


     # Inherited from Encoding
     # Overridden to support optimized en/decoding
     def encoding=(enc)
@@ -124,7 +131,7 @@ module REXML
     #attr_reader :block_size

     # block_size has been deprecated
-    def initialize(arg, block_size=500)
+    def initialize(arg, block_size=500, encoding=nil)
       @er_source = @source = arg
       @to_utf = false
       # Determining the encoding is a deceptively difficult issue to resolve.
@@ -134,10 +141,12 @@ module REXML
       # if there is one. If there isn't one, the file MUST be UTF-8, as per
       # the XML spec. If there is one, we can determine the encoding from
       # it.
       @buffer = ""
       str = @source.read( 2 )
-      if /\A(?:\xfe\xff|\xff\xfe)/n =~ str
+      if encoding
+        self.encoding = encoding
+      elsif /\A(?:\xfe\xff|\xff\xfe)/n =~ str
         self.encoding = check_encoding( str )
         @line_break = encode( '>' )
       else
         @line_break = '>'
       end
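The constructor changes above share one pattern: an explicit encoding argument, when given, takes priority over detection from the leading bytes. A minimal self-contained sketch of that pattern follows. `SketchSource` and its `detect_encoding` method are hypothetical stand-ins, not REXML's classes, and the detection here only looks for a UTF-16 byte order mark, which is much simpler than what the library does.

```ruby
# Hypothetical stand-in for a source object; not REXML's actual class.
class SketchSource
  attr_reader :encoding

  # The two possible UTF-16 byte order marks, as binary strings.
  UTF16_BOMS = ["\xFE\xFF".b, "\xFF\xFE".b].freeze

  # text:     the raw document
  # encoding: optional override; when given, detection is skipped entirely
  def initialize(text, encoding = nil)
    @buffer = text
    @encoding = encoding ? encoding.upcase : detect_encoding(text)
  end

  private

  # Assumption for this sketch: anything without a UTF-16 BOM is UTF-8.
  def detect_encoding(text)
    head = text.byteslice(0, 2).to_s.b
    UTF16_BOMS.include?(head) ? "UTF-16" : "UTF-8"
  end
end

xml = "<?xml version='1.0'?><a/>"
puts SketchSource.new(xml).encoding                # detection path
puts SketchSource.new(xml, "iso-8859-1").encoding  # override path
```

Keeping the override as an optional trailing parameter with a `nil` default, as the diff does, means existing callers that pass only the document keep their current behavior.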
@@ -159,6 +168,8 @@ module REXML
           str = @source.readline(@line_break)
           str = decode(str) if @to_utf and str
           @buffer << str
+        rescue Iconv::IllegalSequence
+          raise
         rescue
           @source = nil
         end