mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00

Short summary:

This is a version bump to REXML 3.1.4.  It includes numerous bug fixes and is
  a pretty big patch, but is nonetheless a minor revision bump, since the API
  hasn't changed.

  For more information, see:

    http://www.germane-software.com/projects/rexml/milestone/3.1.4

  For all tickets, see:

    http://www.germane-software.com/projects/rexml/ticket/#

  Where '#' is replaced with the ticket number.

Changelog:

* Fixed the documentation WRT the raw mode of text nodes (ticket #4)
* Fixes roundup ticket #43: substring-after bug.
* Fixed ticket #44, Element#xpath
* Patch submitted by an anonymous donor to allow parsing of Tempfiles.  I was
  hoping that, by now, that whole Source thing would have been changed to use
  duck typing and avoid this sort of ticket... but in the meantime, the patch
  has been applied.
* Fixes ticket:30, XPath default namespace bug.  The fix was provided
  by Lucas Nussbaum.
* Aliases #size to #length, as per zdennis's request.
* Fixes typo from previous commit
* Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset
* Merges a user-contributed patch for ticket #40
* Adds a forgotten-to-commit unit test for ticket #32
* Changes Date, Version, and Copyright to upper case, to avoid conflicts with
  the Date class.  All of the other changes in the altered files are because
  Subversion doesn't allow block-level commits, like it should.  English cased
  Version and Copyright are aliased to the upper case versions, for partial
  backward compatibility.
* Minor, yet incomplete, documentation changes.  Again, these are in this patch
  because of Subversion's glaring lack of block-level commits.
* Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds.
* Moves parser.source.position() to parser.position()
* Fixes ticket:48, repeated writes munging text content
* Fixes ticket:46, adding methods for accessing notation DTD information.
* Encodes some characters and removes a broken link in the documentation
* Deals with carriage returns after XML declarations
* Improved doctype handling
* Whitespace handling changes
* Applies a patch by David Tardon, which (incidentally) fixes ticket:50
* Closes #26, allowing anything that walks like an IO to be a source.
* Ticket #31 - One unescape too many
  This wasn't really a bug, per se... "value" always returns
  a normalized string, and "value" is the method used to get
  the text() of an element.  However, entities have no meaning
  in CDATA sections, so there's no justification for value
  to be normalizing the content of CData objects.  This behavior
  has therefore been changed.
* Ticket #45 -- Now parses notation declarations in DTDs properly.
* Resolves ticket #49, Document.parse_stream returns ArgumentError
* Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports.
* Addresses ticket #10, fixing the StreamParser API for DTDs.
* Fixes ticket #42, XPath node-set function 'name' fails with relative node
  set parameter
* Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags
  at the end of a document.
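
The ticket #31 entry above can be illustrated with a minimal sketch. It assumes two hypothetical stand-in classes (SketchText and SketchCData, which are not the REXML classes) and only the five predefined XML entities; the point is just that a text value is unnormalized while CDATA content is returned untouched.

```ruby
# Hypothetical stand-ins for illustration only; these are not the REXML classes.
ENTITIES = {
  "&lt;" => "<", "&gt;" => ">", "&amp;" => "&", "&quot;" => '"', "&apos;" => "'"
}.freeze

class SketchText
  def initialize(raw)
    @raw = raw
  end

  # Replaces the five predefined entity references with their characters.
  def value
    @raw.gsub(/&(?:lt|gt|amp|quot|apos);/) { |ref| ENTITIES[ref] }
  end
end

class SketchCData < SketchText
  # Entity references have no meaning inside CDATA, so return the content as-is.
  def value
    @raw
  end
end

puts SketchText.new("a &lt; b").value   # a < b
puts SketchCData.new("a &lt; b").value  # a &lt; b
```

This mirrors the small `value` method that the first hunk of the diff adds, which simply returns the stored string.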


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_8@10090 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
ser 2006-04-14 02:56:44 +00:00
parent bec759abcc
commit 5f4bf32929
17 changed files with 589 additions and 462 deletions

View file

@ -35,6 +35,10 @@ module REXML
@string @string
end end
def value
@string
end
# Generates XML output of this object # Generates XML output of this object
# #
# output:: # output::

View file

@ -6,55 +6,55 @@ require 'rexml/attlistdecl'
require 'rexml/xmltokens' require 'rexml/xmltokens'
module REXML module REXML
# Represents an XML DOCTYPE declaration; that is, the contents of <!DOCTYPE # Represents an XML DOCTYPE declaration; that is, the contents of <!DOCTYPE
# ... >. DOCTYPES can be used to declare the DTD of a document, as well as # ... >. DOCTYPES can be used to declare the DTD of a document, as well as
# being used to declare entities used in the document. # being used to declare entities used in the document.
class DocType < Parent class DocType < Parent
include XMLTokens include XMLTokens
START = "<!DOCTYPE" START = "<!DOCTYPE"
STOP = ">" STOP = ">"
SYSTEM = "SYSTEM" SYSTEM = "SYSTEM"
PUBLIC = "PUBLIC" PUBLIC = "PUBLIC"
DEFAULT_ENTITIES = { DEFAULT_ENTITIES = {
'gt'=>EntityConst::GT, 'gt'=>EntityConst::GT,
'lt'=>EntityConst::LT, 'lt'=>EntityConst::LT,
'quot'=>EntityConst::QUOT, 'quot'=>EntityConst::QUOT,
"apos"=>EntityConst::APOS "apos"=>EntityConst::APOS
} }
# name is the name of the doctype # name is the name of the doctype
# external_id is the referenced DTD, if given # external_id is the referenced DTD, if given
attr_reader :name, :external_id, :entities, :namespaces attr_reader :name, :external_id, :entities, :namespaces
# Constructor # Constructor
# #
# dt = DocType.new( 'foo', '-//I/Hate/External/IDs' ) # dt = DocType.new( 'foo', '-//I/Hate/External/IDs' )
# # <!DOCTYPE foo '-//I/Hate/External/IDs'> # # <!DOCTYPE foo '-//I/Hate/External/IDs'>
# dt = DocType.new( doctype_to_clone ) # dt = DocType.new( doctype_to_clone )
# # Incomplete. Shallow clone of doctype # # Incomplete. Shallow clone of doctype
# #
# +Note+ that the constructor: # +Note+ that the constructor:
# #
# Doctype.new( Source.new( "<!DOCTYPE foo 'bar'>" ) ) # Doctype.new( Source.new( "<!DOCTYPE foo 'bar'>" ) )
# #
# is _deprecated_. Do not use it. It will probably disappear. # is _deprecated_. Do not use it. It will probably disappear.
def initialize( first, parent=nil ) def initialize( first, parent=nil )
@entities = DEFAULT_ENTITIES @entities = DEFAULT_ENTITIES
@long_name = @uri = nil @long_name = @uri = nil
if first.kind_of? String if first.kind_of? String
super() super()
@name = first @name = first
@external_id = parent @external_id = parent
elsif first.kind_of? DocType elsif first.kind_of? DocType
super( parent ) super( parent )
@name = first.name @name = first.name
@external_id = first.external_id @external_id = first.external_id
elsif first.kind_of? Array elsif first.kind_of? Array
super( parent ) super( parent )
@name = first[0] @name = first[0]
@external_id = first[1] @external_id = first[1]
@long_name = first[2] @long_name = first[2]
@uri = first[3] @uri = first[3]
elsif first.kind_of? Source elsif first.kind_of? Source
super( parent ) super( parent )
parser = Parsers::BaseParser.new( first ) parser = Parsers::BaseParser.new( first )
@ -64,150 +64,215 @@ module REXML
end end
else else
super() super()
end end
end end
def node_type def node_type
:doctype :doctype
end end
def attributes_of element def attributes_of element
rv = [] rv = []
each do |child| each do |child|
child.each do |key,val| child.each do |key,val|
rv << Attribute.new(key,val) rv << Attribute.new(key,val)
end if child.kind_of? AttlistDecl and child.element_name == element end if child.kind_of? AttlistDecl and child.element_name == element
end end
rv rv
end end
def attribute_of element, attribute def attribute_of element, attribute
att_decl = find do |child| att_decl = find do |child|
child.kind_of? AttlistDecl and child.kind_of? AttlistDecl and
child.element_name == element and child.element_name == element and
child.include? attribute child.include? attribute
end end
return nil unless att_decl return nil unless att_decl
att_decl[attribute] att_decl[attribute]
end end
def clone def clone
DocType.new self DocType.new self
end end
# output:: # output::
# Where to write the string # Where to write the string
# indent:: # indent::
# An integer. If -1, no indenting will be used; otherwise, the # An integer. If -1, no indenting will be used; otherwise, the
# indentation will be this number of spaces, and children will be # indentation will be this number of spaces, and children will be
# indented an additional amount. # indented an additional amount.
# transitive:: # transitive::
# If transitive is true and indent is >= 0, then the output will be # If transitive is true and indent is >= 0, then the output will be
# pretty-printed in such a way that the added whitespace does not affect # pretty-printed in such a way that the added whitespace does not affect
# the absolute *value* of the document -- that is, it leaves the value # the absolute *value* of the document -- that is, it leaves the value
# and number of Text nodes in the document unchanged. # and number of Text nodes in the document unchanged.
# ie_hack:: # ie_hack::
# Internet Explorer is the worst piece of crap to have ever been # Internet Explorer is the worst piece of crap to have ever been
# written, with the possible exception of Windows itself. Since IE is # written, with the possible exception of Windows itself. Since IE is
# unable to parse proper XML, we have to provide a hack to generate XML # unable to parse proper XML, we have to provide a hack to generate XML
# that IE's limited abilities can handle. This hack inserts a space # that IE's limited abilities can handle. This hack inserts a space
# before the /> on empty tags. # before the /> on empty tags.
# #
def write( output, indent=0, transitive=false, ie_hack=false ) def write( output, indent=0, transitive=false, ie_hack=false )
indent( output, indent ) indent( output, indent )
output << START output << START
output << ' ' output << ' '
output << @name output << @name
output << " #@external_id" if @external_id output << " #@external_id" if @external_id
output << " #@long_name" if @long_name output << " #@long_name" if @long_name
output << " #@uri" if @uri output << " #@uri" if @uri
unless @children.empty? unless @children.empty?
next_indent = indent + 1 next_indent = indent + 1
output << ' [' output << ' ['
child = nil # speed child = nil # speed
@children.each { |child| @children.each { |child|
output << "\n" output << "\n"
child.write( output, next_indent ) child.write( output, next_indent )
} }
output << "\n" #output << ' '*next_indent
#output << ' '*next_indent output << "\n]"
output << "]" end
end output << STOP
output << STOP end
end
def context def context
@parent.context @parent.context
end end
def entity( name ) def entity( name )
@entities[name].unnormalized if @entities[name] @entities[name].unnormalized if @entities[name]
end end
def add child def add child
super(child) super(child)
@entities = DEFAULT_ENTITIES.clone if @entities == DEFAULT_ENTITIES @entities = DEFAULT_ENTITIES.clone if @entities == DEFAULT_ENTITIES
@entities[ child.name ] = child if child.kind_of? Entity @entities[ child.name ] = child if child.kind_of? Entity
end end
end
# This method retrieves the public identifier identifying the document's
# DTD.
#
# Method contributed by Henrik Martensson
def public
case @external_id
when "SYSTEM"
nil
when "PUBLIC"
strip_quotes(@long_name)
end
end
# This method retrieves the system identifier identifying the document's DTD
#
# Method contributed by Henrik Martensson
def system
case @external_id
when "SYSTEM"
strip_quotes(@long_name)
when "PUBLIC"
@uri.kind_of?(String) ? strip_quotes(@uri) : nil
end
end
# This method returns a list of notations that have been declared in the
# _internal_ DTD subset. Notations in the external DTD subset are not
# listed.
#
# Method contributed by Henrik Martensson
def notations
children().select {|node| node.kind_of?(REXML::NotationDecl)}
end
# Retrieves a named notation. Only notations declared in the internal
# DTD subset can be retrieved.
#
# Method contributed by Henrik Martensson
def notation(name)
notations.find { |notation_decl|
notation_decl.name == name
}
end
private
# Method contributed by Henrik Martensson
def strip_quotes(quoted_string)
quoted_string =~ /^[\'\"].*[\'\"]$/ ?
quoted_string[1, quoted_string.length-2] :
quoted_string
end
end
# We don't really handle any of these since we're not a validating # We don't really handle any of these since we're not a validating
# parser, so we can be pretty dumb about them. All we need to be able # parser, so we can be pretty dumb about them. All we need to be able
# to do is spew them back out on a write() # to do is spew them back out on a write()
# This is an abstract class. You never use this directly; it serves as a # This is an abstract class. You never use this directly; it serves as a
# parent class for the specific declarations. # parent class for the specific declarations.
class Declaration < Child class Declaration < Child
def initialize src def initialize src
super() super()
@string = src @string = src
end end
def to_s def to_s
@string+'>' @string+'>'
end end
def write( output, indent ) def write( output, indent )
output << (' '*indent) if indent > 0 output << (' '*indent) if indent > 0
output << to_s output << to_s
end end
end end
public public
class ElementDecl < Declaration class ElementDecl < Declaration
def initialize( src ) def initialize( src )
super super
end end
end end
class ExternalEntity < Child class ExternalEntity < Child
def initialize( src ) def initialize( src )
super() super()
@entity = src @entity = src
end end
def to_s def to_s
@entity @entity
end end
def write( output, indent ) def write( output, indent )
output << @entity output << @entity
output << "\n" end
end end
end
class NotationDecl < Child class NotationDecl < Child
def initialize name, middle, rest attr_accessor :public, :system
@name = name def initialize name, middle, pub, sys
@middle = middle super(nil)
@rest = rest @name = name
end @middle = middle
@public = pub
@system = sys
end
def to_s def to_s
"<!NOTATION #@name '#@middle #@rest'>" "<!NOTATION #@name #@middle#{
end @public ? ' ' + public.inspect : ''
}#{
@system ? ' ' +@system.inspect : ''
}>"
end
def write( output, indent=-1 ) def write( output, indent=-1 )
output << (' '*indent) if indent > 0 output << (' '*indent) if indent > 0
output << to_s output << to_s
end end
end
# This method retrieves the name of the notation.
#
# Method contributed by Henrik Martensson
def name
@name
end
end
end end
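
A standalone sketch of how the public and system identifiers added in this file could be derived from the parts of a DOCTYPE declaration. The helper names (public_id, system_id) are hypothetical, and the quote check assumes plain ASCII quote characters; this is an illustration under those assumptions, not the REXML implementation.

```ruby
# Hypothetical helper names; a standalone sketch, not the REXML implementation.

# Removes one leading and one trailing quote character, if both are present.
def strip_quotes(quoted)
  quoted =~ /\A['"].*['"]\z/m ? quoted[1, quoted.length - 2] : quoted
end

# external_id is "SYSTEM" or "PUBLIC"; long_name and uri are the quoted
# literals that follow it in the DOCTYPE declaration.
def public_id(external_id, long_name, _uri)
  external_id == "PUBLIC" ? strip_quotes(long_name) : nil
end

def system_id(external_id, long_name, uri)
  case external_id
  when "SYSTEM" then strip_quotes(long_name)
  when "PUBLIC" then uri.is_a?(String) ? strip_quotes(uri) : nil
  end
end

puts public_id("PUBLIC", '"-//EXAMPLE//DTD Sample//EN"', '"sample.dtd"')  # -//EXAMPLE//DTD Sample//EN
puts system_id("PUBLIC", '"-//EXAMPLE//DTD Sample//EN"', '"sample.dtd"')  # sample.dtd
puts system_id("SYSTEM", '"sample.dtd"', nil)                             # sample.dtd
```

With a SYSTEM identifier there is no public identifier, so public_id returns nil in that case.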

View file

@ -70,11 +70,23 @@ module REXML
if child.kind_of? XMLDecl if child.kind_of? XMLDecl
@children.unshift child @children.unshift child
elsif child.kind_of? DocType elsif child.kind_of? DocType
if @children[0].kind_of? XMLDecl # Find first Element or DocType node and insert the decl right
@children[1,0] = child # before it. If there is no such node, just insert the child at the
else # end. If there is a child and it is a DocType, then replace it.
@children.unshift child insert_before_index = 0
end @children.find { |x|
insert_before_index += 1
x.kind_of?(Element) || x.kind_of?(DocType)
}
if @children[ insert_before_index ] # Not null = not end of list
if @children[ insert_before_index ].kind_of? DocType
@children[ insert_before_index ] = child
else
@children[ insert_before_index-1, 0 ] = child
end
else # Insert at end of list
@children[insert_before_index] = child
end
child.parent = self child.parent = self
else else
rv = super rv = super

View file

@ -1224,5 +1224,20 @@ module REXML
rv.each{ |attr| attr.remove } rv.each{ |attr| attr.remove }
return rv return rv
end end
# The +get_attribute_ns+ method retrieves an attribute by its namespace
# and name. Thus it is possible to reliably identify an attribute
# even if an XML processor has changed the prefix.
#
# Method contributed by Henrik Martensson
def get_attribute_ns(namespace, name)
each_attribute() { |attribute|
if name == attribute.name &&
namespace == attribute.namespace()
return attribute
end
}
nil
end
end end
end end
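
The namespace-based lookup described in the comment above can be sketched without REXML. SketchAttribute and the free-standing get_attribute_ns below are hypothetical stand-ins; the example only shows why matching on namespace URI plus local name keeps working when the prefix differs.

```ruby
# Hypothetical attribute record for illustration; not the REXML Attribute class.
SketchAttribute = Struct.new(:prefix, :name, :namespace, :value)

# Returns the first attribute whose local name and namespace URI both match,
# or nil when there is none. The prefix is deliberately ignored.
def get_attribute_ns(attributes, namespace, name)
  attributes.find { |a| a.name == name && a.namespace == namespace }
end

attrs = [
  SketchAttribute.new("a", "href", "http://example.com/ns1", "one"),
  SketchAttribute.new("b", "href", "http://example.com/ns2", "two")
]

found = get_attribute_ns(attrs, "http://example.com/ns2", "href")
puts found.value  # two
```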

View file

@ -1,58 +1,64 @@
# -*- mode: ruby; ruby-indent-level: 2; indent-tabs-mode: t; tab-width: 2 -*- vim: sw=2 ts=2 # -*- mode: ruby; ruby-indent-level: 2; indent-tabs-mode: t; tab-width: 2 -*- vim: sw=2 ts=2
module REXML module REXML
module Encoding module Encoding
@encoding_methods = {} @encoding_methods = {}
def self.register(enc, &block) def self.register(enc, &block)
@encoding_methods[enc] = block @encoding_methods[enc] = block
end end
def self.apply(obj, enc) def self.apply(obj, enc)
@encoding_methods[enc][obj] @encoding_methods[enc][obj]
end end
def self.encoding_method(enc) def self.encoding_method(enc)
@encoding_methods[enc] @encoding_methods[enc]
end end
# Native, default format is UTF-8, so it is declared here rather than in # Native, default format is UTF-8, so it is declared here rather than in
# an encodings/ definition. # an encodings/ definition.
UTF_8 = 'UTF-8' UTF_8 = 'UTF-8'
UTF_16 = 'UTF-16' UTF_16 = 'UTF-16'
UNILE = 'UNILE' UNILE = 'UNILE'
# ID ---> Encoding name # ID ---> Encoding name
attr_reader :encoding attr_reader :encoding
def encoding=( enc ) def encoding=( enc )
old_verbosity = $VERBOSE old_verbosity = $VERBOSE
begin begin
$VERBOSE = false $VERBOSE = false
return if defined? @encoding and enc == @encoding return if defined? @encoding and enc == @encoding
if enc if enc and enc != UTF_8
raise ArgumentError, "Bad encoding name #{enc}" unless /\A[\w-]+\z/n =~ enc @encoding = enc.upcase
@encoding = enc.upcase.untaint begin
else require 'rexml/encodings/ICONV.rb'
@encoding = UTF_8 Encoding.apply(self, "ICONV")
end rescue LoadError, Exception => err
err = nil raise ArgumentError, "Bad encoding name #@encoding" unless @encoding =~ /^[\w-]+$/
[@encoding, "ICONV"].each do |enc| @encoding.untaint
begin enc_file = File.join( "rexml", "encodings", "#@encoding.rb" )
require File.join("rexml", "encodings", "#{enc}.rb") begin
return Encoding.apply(self, enc) require enc_file
rescue LoadError, Exception => err Encoding.apply(self, @encoding)
end rescue LoadError
end puts $!.message
puts err.message raise ArgumentError, "No decoder found for encoding #@encoding. Please install iconv."
raise ArgumentError, "No decoder found for encoding #@encoding. Please install iconv." end
ensure end
$VERBOSE = old_verbosity else
end @encoding = UTF_8
end require 'rexml/encodings/UTF-8.rb'
Encoding.apply(self, @encoding)
end
ensure
$VERBOSE = old_verbosity
end
end
def check_encoding str def check_encoding str
# We have to recognize UTF-16, LSB UTF-16, and UTF-8 # We have to recognize UTF-16, LSB UTF-16, and UTF-8
return UTF_16 if str[0] == 254 && str[1] == 255 return UTF_16 if str[0] == 254 && str[1] == 255
return UNILE if str[0] == 255 && str[1] == 254 return UNILE if str[0] == 255 && str[1] == 254
str =~ /^\s*<?xml\s*version=(['"]).*?\2\s*encoding=(["'])(.*?)\2/um str =~ /^\s*<?xml\s*version=(['"]).*?\2\s*encoding=(["'])(.*?)\2/um
return $1.upcase if $1 return $1.upcase if $1
return UTF_8 return UTF_8
end end
end end
end end
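
For readers on a current Ruby, the byte-order-mark check in check_encoding can be sketched as below. This is an assumption-laden rewrite, not the code in this commit: sniff_encoding is a hypothetical name, the byte comparison uses String#bytes because indexing a String no longer yields integers, and the XML declaration pattern is a simplified one written for this sketch.

```ruby
# Hypothetical sketch for modern Ruby; not the check_encoding shown in the diff.
def sniff_encoding(str)
  bom = str.byteslice(0, 2).to_s.bytes
  return "UTF-16" if bom == [0xFE, 0xFF]  # bytes 254, 255
  return "UNILE"  if bom == [0xFF, 0xFE]  # bytes 255, 254 (little-endian)

  # Work on raw bytes so that invalid UTF-8 input cannot raise during matching.
  head = str.dup.force_encoding(Encoding::BINARY)
  md = head.match(/\A\s*<\?xml\s+version=(['"]).*?\1\s+encoding=(['"])(.*?)\2/m)
  md ? md[3].upcase : "UTF-8"
end

puts sniff_encoding(%q(<?xml version="1.0" encoding="iso-8859-1"?><a/>))  # ISO-8859-1
puts sniff_encoding("<a/>")                                               # UTF-8
```

The fallback to UTF-8 matches the comment in the diff that UTF-8 is the native, default format.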

View file

@ -67,11 +67,10 @@ module REXML
if node_set == nil if node_set == nil
yield @@context[:node] if defined? @@context[:node].namespace yield @@context[:node] if defined? @@context[:node].namespace
else else
if node_set.namespace if node_set.respond_to? :each
yield node_set
else
return unless node_set.kind_of? Enumerable
node_set.each { |node| yield node if defined? node.namespace } node_set.each { |node| yield node if defined? node.namespace }
elsif node_set.respond_to? :namespace
yield node_set
end end
end end
end end

View file

@ -1,168 +1,166 @@
require "rexml/child" require "rexml/child"
module REXML module REXML
# A parent has children, and has methods for accessing them. The Parent # A parent has children, and has methods for accessing them. The Parent
# class is never encountered except as the superclass for some other # class is never encountered except as the superclass for some other
# object. # object.
class Parent < Child class Parent < Child
include Enumerable include Enumerable
# Constructor # Constructor
# @param parent if supplied, will be set as the parent of this object # @param parent if supplied, will be set as the parent of this object
def initialize parent=nil def initialize parent=nil
super(parent) super(parent)
@children = [] @children = []
end end
def add( object ) def add( object )
#puts "PARENT GOTS #{size} CHILDREN" #puts "PARENT GOTS #{size} CHILDREN"
object.parent = self object.parent = self
@children << object @children << object
#puts "PARENT NOW GOTS #{size} CHILDREN" #puts "PARENT NOW GOTS #{size} CHILDREN"
object object
end end
alias :push :add alias :push :add
alias :<< :push alias :<< :push
def unshift( object ) def unshift( object )
object.parent = self object.parent = self
@children.unshift object @children.unshift object
end end
def delete( object ) def delete( object )
found = false found = false
@children.delete_if {|c| @children.delete_if {|c| c.equal?(object) and found = true }
c.equal?(object) and found = true object.parent = nil if found
} end
object.parent = nil if found
end def each(&block)
@children.each(&block)
def each(&block) end
@children.each(&block)
end def delete_if( &block )
@children.delete_if(&block)
def delete_if( &block ) end
@children.delete_if(&block)
end def delete_at( index )
@children.delete_at index
def delete_at( index ) end
@children.delete_at index
end def each_index( &block )
@children.each_index(&block)
def each_index( &block ) end
@children.each_index(&block)
end # Fetches a child at a given index
# @param index the Integer index of the child to fetch
# Fetches a child at a given index def []( index )
# @param index the Integer index of the child to fetch @children[index]
def []( index ) end
@children[index]
end alias :each_child :each
alias :each_child :each
# Set an index entry. See Array.[]=
# @param index the index of the element to set
# Set an index entry. See Array.[]= # @param opt either the object to set, or an Integer length
# @param index the index of the element to set # @param child if opt is an Integer, this is the child to set
# @param opt either the object to set, or an Integer length # @return the parent (self)
# @param child if opt is an Integer, this is the child to set def []=( *args )
# @return the parent (self) args[-1].parent = self
def []=( *args ) @children[*args[0..-2]] = args[-1]
args[-1].parent = self end
@children[*args[0..-2]] = args[-1]
end # Inserts a child before another child
# @param child1 this is either an xpath or an Element. If an Element,
# Inserts a child before another child # child2 will be inserted before child1 in the child list of the parent.
# @param child1 this is either an xpath or an Element. If an Element, # If an xpath, child2 will be inserted before the first child to match
# child2 will be inserted before child1 in the child list of the parent. # the xpath.
# If an xpath, child2 will be inserted before the first child to match # @param child2 the child to insert
# the xpath. # @return the parent (self)
# @param child2 the child to insert def insert_before( child1, child2 )
# @return the parent (self) if child1.kind_of? String
def insert_before( child1, child2 ) child1 = XPath.first( self, child1 )
if child1.kind_of? String child1.parent.insert_before child1, child2
child1 = XPath.first( self, child1 ) else
child1.parent.insert_before child1, child2 ind = index(child1)
else child2.parent.delete(child2) if child2.parent
ind = index(child1) @children[ind,0] = child2
child2.parent.delete(child2) if child2.parent child2.parent = self
@children[ind,0] = child2 end
child2.parent = self self
end end
self
end # Inserts a child after another child
# @param child1 this is either an xpath or an Element. If an Element,
# Inserts a child after another child # child2 will be inserted after child1 in the child list of the parent.
# @param child1 this is either an xpath or an Element. If an Element, # If an xpath, child2 will be inserted after the first child to match
# child2 will be inserted after child1 in the child list of the parent. # the xpath.
# If an xpath, child2 will be inserted after the first child to match # @param child2 the child to insert
# the xpath. # @return the parent (self)
# @param child2 the child to insert def insert_after( child1, child2 )
# @return the parent (self) if child1.kind_of? String
def insert_after( child1, child2 ) child1 = XPath.first( self, child1 )
if child1.kind_of? String child1.parent.insert_after child1, child2
child1 = XPath.first( self, child1 ) else
child1.parent.insert_after child1, child2 ind = index(child1)+1
else child2.parent.delete(child2) if child2.parent
ind = index(child1)+1 @children[ind,0] = child2
child2.parent.delete(child2) if child2.parent child2.parent = self
@children[ind,0] = child2 end
child2.parent = self self
end end
self
end def to_a
@children.dup
def to_a end
@children.dup
end # Fetches the index of a given child
# @param child the child to get the index of
# Fetches the index of a given child # @return the index of the child, or nil if the object is not a child
# @param child the child to get the index of # of this parent.
# @return the index of the child, or nil if the object is not a child def index( child )
# of this parent. count = -1
def index( child ) @children.find { |i| count += 1 ; i.hash == child.hash }
count = -1 count
@children.find { |i| count += 1 ; i.hash == child.hash } end
count
end # @return the number of children of this parent
def size
# @return the number of children of this parent @children.size
def size end
@children.size
end
alias :length :size alias :length :size
# Replaces one child with another, making sure the nodelist is correct # Replaces one child with another, making sure the nodelist is correct
# @param to_replace the child to replace (must be a Child) # @param to_replace the child to replace (must be a Child)
# @param replacement the child to insert into the nodelist (must be a # @param replacement the child to insert into the nodelist (must be a
# Child) # Child)
def replace_child( to_replace, replacement ) def replace_child( to_replace, replacement )
@children.map! {|c| c.equal?( to_replace ) ? replacement : c } @children.map! {|c| c.equal?( to_replace ) ? replacement : c }
to_replace.parent = nil to_replace.parent = nil
replacement.parent = self replacement.parent = self
end end
# Deeply clones this object. This creates a complete duplicate of this # Deeply clones this object. This creates a complete duplicate of this
# Parent, including all descendants. # Parent, including all descendants.
def deep_clone def deep_clone
cl = clone() cl = clone()
each do |child| each do |child|
if child.kind_of? Parent if child.kind_of? Parent
cl << child.deep_clone cl << child.deep_clone
else else
cl << child.clone cl << child.clone
end end
end end
cl cl
end end
alias :children :to_a alias :children :to_a
def parent? def parent?
true true
end end
end end
end end

View file

@ -42,7 +42,7 @@ module REXML
CDATA_END = /^\s*\]\s*>/um CDATA_END = /^\s*\]\s*>/um
CDATA_PATTERN = /<!\[CDATA\[(.*?)\]\]>/um CDATA_PATTERN = /<!\[CDATA\[(.*?)\]\]>/um
XMLDECL_START = /\A<\?xml\s/u; XMLDECL_START = /\A<\?xml\s/u;
XMLDECL_PATTERN = /<\?xml\s+(.*?)\?>*/um XMLDECL_PATTERN = /<\?xml\s+(.*?)\?>/um
INSTRUCTION_START = /\A<\?/u INSTRUCTION_START = /\A<\?/u
INSTRUCTION_PATTERN = /<\?(.*?)(\s+.*?)?\?>/um INSTRUCTION_PATTERN = /<\?(.*?)(\s+.*?)?\?>/um
TAG_MATCH = /^<((?>#{NAME_STR}))\s*((?>\s+#{NAME_STR}\s*=\s*(["']).*?\3)*)\s*(\/)?>/um TAG_MATCH = /^<((?>#{NAME_STR}))\s*((?>\s+#{NAME_STR}\s*=\s*(["']).*?\3)*)\s*(\/)?>/um
@ -68,8 +68,8 @@ module REXML
ATTLISTDECL_START = /^\s*<!ATTLIST/um ATTLISTDECL_START = /^\s*<!ATTLIST/um
ATTLISTDECL_PATTERN = /^\s*<!ATTLIST\s+#{NAME}(?:#{ATTDEF})*\s*>/um ATTLISTDECL_PATTERN = /^\s*<!ATTLIST\s+#{NAME}(?:#{ATTDEF})*\s*>/um
NOTATIONDECL_START = /^\s*<!NOTATION/um NOTATIONDECL_START = /^\s*<!NOTATION/um
PUBLIC = /^\s*<!NOTATION\s+(\w[\-\w]*)\s+(PUBLIC)\s+((["']).*?\4)\s*>/um PUBLIC = /^\s*<!NOTATION\s+(\w[\-\w]*)\s+(PUBLIC)\s+(["'])(.*?)\3(?:\s+(["'])(.*?)\5)?\s*>/um
SYSTEM = /^\s*<!NOTATION\s+(\w[\-\w]*)\s+(SYSTEM)\s+((["']).*?\4)\s*>/um SYSTEM = /^\s*<!NOTATION\s+(\w[\-\w]*)\s+(SYSTEM)\s+(["'])(.*?)\3\s*>/um
TEXT_PATTERN = /\A([^<]*)/um TEXT_PATTERN = /\A([^<]*)/um
@ -120,20 +120,7 @@ module REXML
attr_reader :source attr_reader :source
def stream=( source ) def stream=( source )
if source.kind_of? String @source = SourceFactory.create_from( source )
@source = Source.new(source)
elsif source.kind_of? IO
@source = IOSource.new(source)
elsif source.kind_of? Source
@source = source
elsif defined? StringIO and source.kind_of? StringIO
@source = IOSource.new(source)
elsif defined? Tempfile and source.kind_of? Tempfile
@source = IOSource.new(source)
else
raise "#{source.class} is not a valid input stream. It must be \n"+
"either a String, IO, StringIO or Source."
end
@closed = nil @closed = nil
@document_status = nil @document_status = nil
@tags = [] @tags = []
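
The hunk above replaces a chain of kind_of? checks with a single factory call. The body of SourceFactory.create_from is not shown in this diff, so the following is only a hypothetical sketch of the duck-typing idea from the commit message ("anything that walks like an IO"); source_kind and the methods it probes for are assumptions.

```ruby
require "stringio"

# Hypothetical sketch of duck-typed source selection; the name and the probed
# methods are illustrative and are not taken from REXML's SourceFactory.
def source_kind(arg)
  if arg.respond_to?(:to_str)
    :string                      # String-like: parse the text directly
  elsif arg.respond_to?(:read) && arg.respond_to?(:eof?)
    :io                          # IO-like: File, StringIO, Tempfile, sockets
  else
    raise ArgumentError, "#{arg.class} is not a valid input stream"
  end
end

puts source_kind("<a/>")                # string
puts source_kind(StringIO.new("<a/>"))  # io
```

Probing for behavior instead of class means new IO-like types need no further changes to the parser.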
@ -152,8 +139,8 @@ module REXML
# Returns true if there are no more events # Returns true if there are no more events
def empty? def empty?
#puts "@source.empty? = #{@source.empty?}" #STDERR.puts "@source.empty? = #{@source.empty?}"
#puts "@stack.empty? = #{@stack.empty?}" #STDERR.puts "@stack.empty? = #{@stack.empty?}"
return (@source.empty? and @stack.empty?) return (@source.empty? and @stack.empty?)
end end
@ -197,14 +184,17 @@ module REXML
return [ :end_document ] if empty? return [ :end_document ] if empty?
return @stack.shift if @stack.size > 0 return @stack.shift if @stack.size > 0
@source.read if @source.buffer.size<2 @source.read if @source.buffer.size<2
#STDERR.puts "BUFFER = #{@source.buffer.inspect}"
if @document_status == nil if @document_status == nil
@source.consume( /^\s*/um ) #@source.consume( /^\s*/um )
word = @source.match( /(<[^>]*)>/um ) word = @source.match( /^((?:\s+)|(?:<[^>]*>))/um )
word = word[1] unless word.nil? word = word[1] unless word.nil?
#STDERR.puts "WORD = #{word.inspect}"
case word case word
when COMMENT_START when COMMENT_START
return [ :comment, @source.match( COMMENT_PATTERN, true )[1] ] return [ :comment, @source.match( COMMENT_PATTERN, true )[1] ]
when XMLDECL_START when XMLDECL_START
#STDERR.puts "XMLDECL"
results = @source.match( XMLDECL_PATTERN, true )[1] results = @source.match( XMLDECL_PATTERN, true )[1]
version = VERSION.match( results ) version = VERSION.match( results )
version = version[1] unless version.nil? version = version[1] unless version.nil?
@ -213,7 +203,7 @@ module REXML
@source.encoding = encoding @source.encoding = encoding
standalone = STANDALONE.match(results) standalone = STANDALONE.match(results)
standalone = standalone[1] unless standalone.nil? standalone = standalone[1] unless standalone.nil?
return [ :xmldecl, version, encoding, standalone] return [ :xmldecl, version, encoding, standalone ]
when INSTRUCTION_START when INSTRUCTION_START
return [ :processing_instruction, *@source.match(INSTRUCTION_PATTERN, true)[1,2] ] return [ :processing_instruction, *@source.match(INSTRUCTION_PATTERN, true)[1,2] ]
when DOCTYPE_START when DOCTYPE_START
@ -236,6 +226,7 @@ module REXML
@document_status = :in_doctype @document_status = :in_doctype
end end
return args return args
when /^\s+/
else else
@document_status = :after_doctype @document_status = :after_doctype
@source.read if @source.buffer.size<2 @source.read if @source.buffer.size<2
@ -299,12 +290,14 @@ module REXML
md = nil md = nil
if @source.match( PUBLIC ) if @source.match( PUBLIC )
md = @source.match( PUBLIC, true ) md = @source.match( PUBLIC, true )
vals = [md[1],md[2],md[4],md[6]]
elsif @source.match( SYSTEM ) elsif @source.match( SYSTEM )
md = @source.match( SYSTEM, true ) md = @source.match( SYSTEM, true )
vals = [md[1],md[2],nil,md[4]]
else else
raise REXML::ParseException.new( "error parsing notation: no matching pattern", @source ) raise REXML::ParseException.new( "error parsing notation: no matching pattern", @source )
end end
return [ :notationdecl, md[1], md[2], md[3] ] return [ :notationdecl, *vals ]
when CDATA_END when CDATA_END
@document_status = :after_doctype @document_status = :after_doctype
@source.match( CDATA_END, true ) @source.match( CDATA_END, true )
@ -323,7 +316,7 @@ module REXML
return [ :end_element, last_tag ] return [ :end_element, last_tag ]
elsif @source.buffer[1] == ?! elsif @source.buffer[1] == ?!
md = @source.match(/\A(\s*[^>]*>)/um) md = @source.match(/\A(\s*[^>]*>)/um)
#puts "SOURCE BUFFER = #{source.buffer}, #{source.buffer.size}" #STDERR.puts "SOURCE BUFFER = #{source.buffer}, #{source.buffer.size}"
raise REXML::ParseException.new("Malformed node", @source) unless md raise REXML::ParseException.new("Malformed node", @source) unless md
if md[0][2] == ?- if md[0][2] == ?-
md = @source.match( COMMENT_PATTERN, true ) md = @source.match( COMMENT_PATTERN, true )
@ -361,10 +354,11 @@ module REXML
else else
md = @source.match( TEXT_PATTERN, true ) md = @source.match( TEXT_PATTERN, true )
if md[0].length == 0 if md[0].length == 0
#puts "EMPTY = #{empty?}" puts "EMPTY = #{empty?}"
#puts "BUFFER = \"#{@source.buffer}\"" puts "BUFFER = \"#{@source.buffer}\""
@source.match( /(\s+)/, true ) @source.match( /(\s+)/, true )
end end
#STDERR.puts "GOT #{md[1].inspect}" unless md[0].length == 0
#return [ :text, "" ] if md[0].length == 0 #return [ :text, "" ] if md[0].length == 0
# unnormalized = Text::unnormalize( md[1], self ) # unnormalized = Text::unnormalize( md[1], self )
# return PullEvent.new( :text, md[1], unnormalized ) # return PullEvent.new( :text, md[1], unnormalized )

@ -1,42 +1,46 @@
module REXML module REXML
module Parsers module Parsers
class StreamParser class StreamParser
def initialize source, listener def initialize source, listener
@listener = listener @listener = listener
@parser = BaseParser.new( source ) @parser = BaseParser.new( source )
end end
def add_listener( listener ) def add_listener( listener )
@parser.add_listener( listener ) @parser.add_listener( listener )
end end
def parse def parse
# entity string # entity string
while true while true
event = @parser.pull event = @parser.pull
case event[0] case event[0]
when :end_document when :end_document
return return
when :start_element when :start_element
attrs = event[2].each do |n, v| attrs = event[2].each do |n, v|
event[2][n] = @parser.unnormalize( v ) event[2][n] = @parser.unnormalize( v )
end end
@listener.tag_start( event[1], attrs ) @listener.tag_start( event[1], attrs )
when :end_element when :end_element
@listener.tag_end( event[1] ) @listener.tag_end( event[1] )
when :text when :text
normalized = @parser.unnormalize( event[1] ) normalized = @parser.unnormalize( event[1] )
@listener.text( normalized ) @listener.text( normalized )
when :processing_instruction when :processing_instruction
@listener.instruction( *event[1,2] ) @listener.instruction( *event[1,2] )
when :start_doctype when :start_doctype
@listener.doctype( *event[1..-1] ) @listener.doctype( *event[1..-1] )
when :comment, :attlistdecl, :notationdecl, :elementdecl, when :end_doctype
:entitydecl, :cdata, :xmldecl, :attlistdecl # FIXME: remove this condition for milestone:3.2
@listener.send( event[0].to_s, *event[1..-1] ) @listener.doctype_end if @listener.respond_to? :doctype_end
end when :comment, :attlistdecl, :cdata, :xmldecl, :elementdecl
end @listener.send( event[0].to_s, *event[1..-1] )
end when :entitydecl, :notationdecl
end @listener.send( event[0].to_s, event[1..-1] )
end end
end
end
end
end
end end
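
The StreamParser hunk above makes two behavioural changes: `doctype_end` is only called on listeners that define it, and `:entitydecl` / `:notationdecl` events are passed to the listener as a single array rather than splatted. A minimal sketch of that dispatch follows. The `dispatch` method and both listener classes are hypothetical stand-ins written for illustration, not REXML code.

```ruby
# Hypothetical stand-in for the case statement in StreamParser#parse.
def dispatch(listener, event)
  name, *args = event
  case name
  when :end_doctype
    # Older listeners may not define doctype_end, so check before calling.
    listener.doctype_end if listener.respond_to?(:doctype_end)
  when :comment, :cdata
    # Splatted: each event field becomes its own argument.
    listener.send(name.to_s, *args)
  when :entitydecl, :notationdecl
    # Not splatted: the listener receives one Array holding all fields.
    listener.send(name.to_s, args)
  end
end

# Hypothetical listener written before doctype_end existed.
class OldListener
  attr_reader :calls

  def initialize
    @calls = []
  end

  def comment(text)
    @calls << [:comment, text]
  end

  def notationdecl(content)
    @calls << [:notationdecl, content]
  end
end

# Hypothetical listener that opts in to the new callback.
class NewListener < OldListener
  def doctype_end
    @calls << [:doctype_end]
  end
end
```

With this shape, a listener that predates the change keeps working unmodified, while `notationdecl` has to accept one array argument instead of several positional ones.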

@ -19,8 +19,12 @@ module REXML
begin begin
while true while true
event = @parser.pull event = @parser.pull
#STDERR.puts "TREEPARSER GOT #{event.inspect}"
case event[0] case event[0]
when :end_document when :end_document
unless tag_stack.empty?
raise ParseException.new("No close tag for #{tag_stack.inspect}")
end
return return
when :start_element when :start_element
tag_stack.push(event[1]) tag_stack.push(event[1])
@ -35,10 +39,10 @@ module REXML
@build_context[-1] << event[1] @build_context[-1] << event[1]
else else
@build_context.add( @build_context.add(
Text.new( event[1], @build_context.whitespace, nil, true ) Text.new(event[1], @build_context.whitespace, nil, true)
) unless ( ) unless (
event[1].strip.size==0 and @build_context.ignore_whitespace_nodes and
@build_context.ignore_whitespace_nodes event[1].strip.size==0
) )
end end
end end

@ -10,8 +10,8 @@
# #
# Main page:: http://www.germane-software.com/software/rexml # Main page:: http://www.germane-software.com/software/rexml
# Author:: Sean Russell <serATgermaneHYPHENsoftwareDOTcom> # Author:: Sean Russell <serATgermaneHYPHENsoftwareDOTcom>
# Version:: 3.1.3.1 # Version:: 3.1.4
# Date:: 2005/364 # Date:: 2006/104
# #
# This API documentation can be downloaded from the REXML home page, or can # This API documentation can be downloaded from the REXML home page, or can
# be accessed online[http://www.germane-software.com/software/rexml_doc] # be accessed online[http://www.germane-software.com/software/rexml_doc]
@ -20,7 +20,10 @@
# or can be accessed # or can be accessed
# online[http://www.germane-software.com/software/rexml/docs/tutorial.html] # online[http://www.germane-software.com/software/rexml/docs/tutorial.html]
module REXML module REXML
Copyright = "Copyright © 2001-2005 Sean Russell <ser@germane-software.com>" COPYRIGHT = "Copyright © 2001-2006 Sean Russell <ser@germane-software.com>"
Date = "2005/364" DATE = "2006/104"
Version = "3.1.3.1" VERSION = "3.1.4"
Copyright = COPYRIGHT
Version = VERSION
end end

@ -84,6 +84,7 @@ module REXML
# @p version the version attribute value. EG, "1.0" # @p version the version attribute value. EG, "1.0"
# @p encoding the encoding attribute value, or nil. EG, "utf" # @p encoding the encoding attribute value, or nil. EG, "utf"
# @p standalone the standalone attribute value, or nil. EG, nil # @p standalone the standalone attribute value, or nil. EG, nil
# @p spaced the declaration is followed by a line break
def xmldecl version, encoding, standalone def xmldecl version, encoding, standalone
end end
# Called when a comment is encountered. # Called when a comment is encountered.

@ -7,13 +7,19 @@ module REXML
# @param arg Either a String, or an IO # @param arg Either a String, or an IO
# @return a Source, or nil if a bad argument was given # @return a Source, or nil if a bad argument was given
def SourceFactory::create_from arg#, slurp=true def SourceFactory::create_from arg#, slurp=true
if arg.kind_of? String if arg.kind_of? String
Source.new(arg) Source.new(arg)
elsif arg.kind_of? IO elsif arg.respond_to? :read and
arg.respond_to? :readline and
arg.respond_to? :nil? and
arg.respond_to? :eof?
IOSource.new(arg) IOSource.new(arg)
elsif arg.kind_of? Source elsif arg.kind_of? Source
arg arg
end else
raise "#{source.class} is not a valid input stream. It must walk \n"+
"like either a String, IO, or Source."
end
end end
end end
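
The SourceFactory hunk above replaces the `kind_of? IO` test with a set of `respond_to?` checks, which is what lets Tempfile objects through (the Tempfile patch mentioned in the changelog). The sketch below shows the idea in isolation. `io_like?` is a hypothetical helper, not part of REXML, and it leaves out the `:nil?` check from the diff because every Ruby object responds to `nil?`.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper mirroring the respond_to? test in the diff.
# An object counts as IO-like if it offers the reading methods the
# parser needs, whatever its class.
def io_like?(arg)
  [:read, :readline, :eof?].all? { |m| arg.respond_to?(m) }
end
```

A Tempfile delegates to a File rather than inheriting from IO, so it fails `kind_of? IO` but passes a check like this one. StringIO passes for the same reason, and a plain String does not.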

@ -39,6 +39,9 @@ module REXML
# @p uri the uri of the doctype, or nil. EG, "bar" # @p uri the uri of the doctype, or nil. EG, "bar"
def doctype name, pub_sys, long_name, uri def doctype name, pub_sys, long_name, uri
end end
# Called when the doctype is done
def doctype_end
end
# If a doctype includes an ATTLIST declaration, it will cause this # If a doctype includes an ATTLIST declaration, it will cause this
# method to be called. The content is the declaration itself, unparsed. # method to be called. The content is the declaration itself, unparsed.
# EG, <!ATTLIST el attr CDATA #REQUIRED> will come to this method as "el # EG, <!ATTLIST el attr CDATA #REQUIRED> will come to this method as "el

@ -284,9 +284,10 @@ module REXML
EREFERENCE = /&(?!#{Entity::NAME};)/ EREFERENCE = /&(?!#{Entity::NAME};)/
# Escapes all possible entities # Escapes all possible entities
def Text::normalize( input, doctype=nil, entity_filter=nil ) def Text::normalize( input, doctype=nil, entity_filter=nil )
copy = input.clone copy = input
# Doing it like this rather than in a loop improves the speed # Doing it like this rather than in a loop improves the speed
if doctype if doctype
# Replace all ampersands that aren't part of an entity
copy = copy.gsub( EREFERENCE, '&amp;' ) copy = copy.gsub( EREFERENCE, '&amp;' )
doctype.entities.each_value do |entity| doctype.entities.each_value do |entity|
copy = copy.gsub( entity.value, copy = copy.gsub( entity.value,
@ -294,6 +295,7 @@ module REXML
not( entity_filter and entity_filter.include?(entity) ) not( entity_filter and entity_filter.include?(entity) )
end end
else else
# Replace all ampersands that aren't part of an entity
copy = copy.gsub( EREFERENCE, '&amp;' ) copy = copy.gsub( EREFERENCE, '&amp;' )
DocType::DEFAULT_ENTITIES.each_value do |entity| DocType::DEFAULT_ENTITIES.each_value do |entity|
copy = copy.gsub(entity.value, "&#{entity.name};" ) copy = copy.gsub(entity.value, "&#{entity.name};" )

@ -80,6 +80,11 @@ module REXML
self.dowrite self.dowrite
end end
# Only use this if you do not want the XML declaration to be written;
# this object is ignored by the XML writer. Otherwise, instantiate your
# own XMLDecl and add it to the document.
#
# Note that XML 1.1 documents *must* include an XML declaration
def XMLDecl.default def XMLDecl.default
rv = XMLDecl.new( "1.0" ) rv = XMLDecl.new( "1.0" )
rv.nowrite rv.nowrite

@ -380,10 +380,13 @@ module REXML
return @variables[ var_name ] return @variables[ var_name ]
# :and, :or, :eq, :neq, :lt, :lteq, :gt, :gteq # :and, :or, :eq, :neq, :lt, :lteq, :gt, :gteq
# TODO: Special case for :or and :and -- not evaluate the right
# operand if the left alone determines result (i.e. is true for
# :or and false for :and).
when :eq, :neq, :lt, :lteq, :gt, :gteq, :and, :or when :eq, :neq, :lt, :lteq, :gt, :gteq, :and, :or
left = expr( path_stack.shift, nodeset, context ) left = expr( path_stack.shift, nodeset.dup, context )
#puts "LEFT => #{left.inspect} (#{left.class.name})" #puts "LEFT => #{left.inspect} (#{left.class.name})"
right = expr( path_stack.shift, nodeset, context ) right = expr( path_stack.shift, nodeset.dup, context )
#puts "RIGHT => #{right.inspect} (#{right.class.name})" #puts "RIGHT => #{right.inspect} (#{right.class.name})"
res = equality_relational_compare( left, op, right ) res = equality_relational_compare( left, op, right )
#puts "RES => #{res.inspect}" #puts "RES => #{res.inspect}"
@ -467,8 +470,11 @@ module REXML
def descendant_or_self( path_stack, nodeset ) def descendant_or_self( path_stack, nodeset )
rs = [] rs = []
#puts "#"*80
#puts "PATH_STACK = #{path_stack.inspect}"
#puts "NODESET = #{nodeset.collect{|n|n.inspect}.inspect}"
d_o_s( path_stack, nodeset, rs ) d_o_s( path_stack, nodeset, rs )
#puts "RS = #{rs.collect{|n|n.to_s}.inspect}" #puts "RS = #{rs.collect{|n|n.inspect}.inspect}"
document_order(rs.flatten.compact) document_order(rs.flatten.compact)
#rs.flatten.compact #rs.flatten.compact
end end
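
The TODO added in the XPath hunk above describes short-circuit evaluation for `:and` and `:or`, which this commit does not implement: both operands are still evaluated before comparison. A rough sketch of what the TODO asks for is below. `eval_boolean` is a hypothetical function, and the operands are plain lambdas assumed to return Ruby truthy or falsy values, which skips the XPath rules for converting nodesets, strings, and numbers to booleans.

```ruby
# Hypothetical evaluator, not REXML code. The operands are callables so
# that the right-hand side is only evaluated when it can affect the result.
def eval_boolean(op, left, right)
  l = left.call
  case op
  when :or
    # A true left operand decides :or; the right operand is never called.
    l ? true : !!right.call
  when :and
    # A false left operand decides :and; the right operand is never called.
    l ? !!right.call : false
  else
    raise ArgumentError, "unsupported operator: #{op.inspect}"
  end
end
```

Wrapping each operand in a callable is only one way to defer evaluation. In the real parser the equivalent would be to delay the second `expr(...)` call until the left result is known.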