ruby--ruby/lib/rexml/parsers/treeparser.rb

require 'rexml/validation/validationexception'

module REXML
  module Parsers
    # Builds a node tree (a Document by default) from the events
    # pulled from a Parsers::BaseParser.
    class TreeParser
      def initialize( source, build_context = Document.new )
        @build_context = build_context
        @parser = Parsers::BaseParser.new( source )
      end

      def add_listener( listener )
        @parser.add_listener( listener )
      end

      # Pulls events until :end_document, adding a node to the current
      # build context for each one.
      def parse
        tag_stack = []
        in_doctype = false
        entities = nil
        begin
          while true
            event = @parser.pull
            case event[0]
            when :end_document
              return
            when :start_element
              tag_stack.push(event[1])
              # find the observers for namespaces
              @build_context = @build_context.add_element( event[1], event[2] )
            when :end_element
              tag_stack.pop
              @build_context = @build_context.parent
            when :text
              if not in_doctype
                if @build_context[-1].instance_of? Text
                  # Append to the preceding Text node instead of adding a new one.
                  @build_context[-1] << event[1]
                else
                  # Skip whitespace-only text when the context ignores such nodes.
                  @build_context.add(
                    Text.new( event[1], @build_context.whitespace, nil, true )
                  ) unless (
                    event[1].strip.size==0 and
                    @build_context.ignore_whitespace_nodes
                  )
                end
              end
            when :comment
              c = Comment.new( event[1] )
              @build_context.add( c )
            when :cdata
              c = CData.new( event[1] )
              @build_context.add( c )
            when :processing_instruction
              @build_context.add( Instruction.new( event[1], event[2] ) )
            when :end_doctype
              in_doctype = false
              entities.each { |k,v| entities[k] = @build_context.entities[k].value }
              @build_context = @build_context.parent
            when :start_doctype
              doctype = DocType.new( event[1..-1], @build_context )
              @build_context = doctype
              entities = {}
              in_doctype = true
            when :attlistdecl
              n = AttlistDecl.new( event[1..-1] )
              @build_context.add( n )
            when :externalentity
              n = ExternalEntity.new( event[1] )
              @build_context.add( n )
            when :elementdecl
              n = ElementDecl.new( event[1] )
              @build_context.add(n)
            when :entitydecl
              entities[ event[1] ] = event[2] unless event[2] =~ /PUBLIC|SYSTEM/
              @build_context.add(Entity.new(event))
            when :notationdecl
              n = NotationDecl.new( *event[1..-1] )
              @build_context.add( n )
            when :xmldecl
              x = XMLDecl.new( event[1], event[2], event[3] )
              @build_context.add( x )
            end
          end
        rescue REXML::Validation::ValidationException
          # Validation errors propagate unchanged.
          raise
        rescue
          # Any other error is wrapped in a ParseException.
          raise ParseException.new( $!.message, @parser.source, @parser, $! )
        end
      end
    end
  end
end