1
0
Fork 0
mirror of https://github.com/ruby/ruby.git synced 2022-11-09 12:17:21 -05:00
ruby--ruby/lib/rexml/text.rb

344 lines
12 KiB
Ruby
Raw Normal View History

require 'rexml/entity'
require 'rexml/doctype'
require 'rexml/child'
require 'rexml/doctype'
require 'rexml/parseexception'
module REXML
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Represents text nodes in an XML document
class Text < Child
include Comparable
# The order in which the substitutions occur
SPECIALS = [ /&(?!#?[\w-]+;)/u, /</u, />/u, /"/u, /'/u, /\r/u ]
SUBSTITUTES = ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;', '&#13;']
# Characters which are substituted in written strings
SLAICEPS = [ '<', '>', '"', "'", '&' ]
SETUTITSBUS = [ /&lt;/u, /&gt;/u, /&quot;/u, /&apos;/u, /&amp;/u ]
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# If +raw+ is true, then REXML leaves the value alone
attr_accessor :raw
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
ILLEGAL = /(<|&(?!(#{Entity::NAME})|(#0*((?:\d+)|(?:x[a-fA-F0-9]+)));))/um
NUMERICENTITY = /&#0*((?:\d+)|(?:x[a-fA-F0-9]+));/
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Constructor
# +arg+ if a String, the content is set to the String. If a Text,
# the object is shallowly cloned.
#
# +respect_whitespace+ (boolean, false) if true, whitespace is
# respected
#
# +parent+ (nil) if this is a Parent object, the parent
# will be set to this.
#
# +raw+ (nil) This argument can be given three values.
# If true, then the value of used to construct this object is expected to
# contain no unescaped XML markup, and REXML will not change the text. If
# this value is false, the string may contain any characters, and REXML will
# escape any and all defined entities whose values are contained in the
# text. If this value is nil (the default), then the raw value of the
# parent will be used as the raw value for this node. If there is no raw
# value for the parent, and no value is supplied, the default is false.
Short summary: This is a version bump to REXML 3.1.4 for Ruby HEAD. This change log is identical to the log for the 1.8 branch. It includes numerous bug fixes and is a pretty big patch, but is nonetheless a minor revision bump, since the API hasn't changed. For more information, see: http:/www.germane-software.com/projects/rexml/milestone/3.1.4 For all tickets, see: http://www.germane-software.com/projects/rexml/ticket/# Where '#' is replaced with the ticket number. Changelog: * Fixed the documentation WRT the raw mode of text nodes (ticket #4) * Fixes roundup ticket #43: substring-after bug. * Fixed ticket #44, Element#xpath * Patch submitted by an anonymous doner to allow parsing of Tempfiles. I was hoping that, by now, that whole Source thing would have been changed to use duck typing and avoid this sort of ticket... but in the meantime, the patch has been applied. * Fixes ticket:30, XPath default namespace bug. The fix was provided by Lucas Nussbaum. * Aliases #size to #length, as per zdennis's request. * Fixes typo from previous commit * Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset * Merges a user-contributed patch for ticket #40 * Adds a forgotten-to-commit unit test for ticket #32 * Changes Date, Version, and Copyright to upper case, to avoid conflicts with the Date class. All of the other changes in the altered files are because Subversion doesn't allow block-level commits, like it should. English cased Version and Copyright are aliased to the upper case versions, for partial backward compatability. * Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds. * Moves parser.source.position() to parser.position() * Fixes ticket:48, repeated writes munging text content * Fixes ticket:46, adding methods for accessing notation DTD information. * Encodes some characters and removes a brokes link in the documentation * Deals with carriage returns after XML declarations * Improved doctype handling * Whitespace handling changes * Applies a patch by David Tardon, which (incidentally) fixes ticket:50 * Closes #26, allowing anything that walks like an IO to be a source. * Ticket #31 - One unescape too many This wasn't really a bug, per se... "value" always returns a normalized string, and "value" is the method used to get the text() of an element. However, entities have no meaning in CDATA sections, so there's no justification for value to be normalizing the content of CData objects. This behavior has therefore been changed. * Ticket #45 -- Now parses notation declarations in DTDs properly. * Resolves ticket #49, Document.parse_stream returns ArgumentError * Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports. * Addresses ticket #10, fixing the StreamParser API for DTDs. * Fixes ticket #42, XPath node-set function 'name' fails with relative node set parameter * Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags at the end of a document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-04-15 00:11:04 -04:00
# Use this field if you have entities defined for some text, and you don't
# want REXML to escape that text in output.
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Text.new( "<&", false, nil, false ) #-> "&lt;&amp;"
Merged from REXML main repository: Fixes ticket:68. NOTE that this involves an API change! Entity declarations in the doctype now generate events that carry two, not one, arguments. Implements ticket:15, using gwrite's suggestion. This allows Element to be subclassed. Two unrelated changes, because subversion is retarded and doesn't do block-level commits: 1) Fixed a typo bug in previous change for ticket:15 2) Fixed namespaces handling in XPath and element. ***** Note that this is an API change!!! ***** Element.namespaces() now returns a hash of namespace mappings which are relevant for that node. Fixes a bug in multiple decodings The changeset 1230:1231 was bad. The default behavior is *not* to use the native REXML encodings by default, but rather to use ICONV by default. I know that this will piss some people off, but defaulting to the pure Ruby version isn't the correct solution, and it breaks other encodings, so I've reverted it. * Fixes ticket:61 (xpath_parser) * Fixes ticket:63 (UTF-16; UNILE decoding was bad) * Cleans up some tests, removing opportunities for test corruption * Improves parsing error messages a little * Adds the ability to override the encoding detection in Source construction * Fixes an edge case in Functions::string, where document nodes weren't correctly converted * Fixes Functions::string() for Element and Document nodes * Fixes some problems in entity handling Addresses ticket:66 Fixes ticket:71 Addresses ticket:78 NOTE: that this also fixes what is technically another bug in REXML. REXML's XPath parser used to allow exponential notation in numbers. The XPath spec is specific about what a number is, and scientific notation is not included. Therefore, this has been fixed. Cross-ported a fix for ticket:88 from CVS. Fixes ticket:80 Documentation cleanup. Ticket:84 Applied Kou's fix for an un-trac'ed bug. ------------------------------------------------------------------------ git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@11548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-01-19 22:56:02 -05:00
# Text.new( "&lt;&amp;", false, nil, false ) #-> "&amp;lt;&amp;amp;"
Short summary: This is a version bump to REXML 3.1.4 for Ruby HEAD. This change log is identical to the log for the 1.8 branch. It includes numerous bug fixes and is a pretty big patch, but is nonetheless a minor revision bump, since the API hasn't changed. For more information, see: http:/www.germane-software.com/projects/rexml/milestone/3.1.4 For all tickets, see: http://www.germane-software.com/projects/rexml/ticket/# Where '#' is replaced with the ticket number. Changelog: * Fixed the documentation WRT the raw mode of text nodes (ticket #4) * Fixes roundup ticket #43: substring-after bug. * Fixed ticket #44, Element#xpath * Patch submitted by an anonymous doner to allow parsing of Tempfiles. I was hoping that, by now, that whole Source thing would have been changed to use duck typing and avoid this sort of ticket... but in the meantime, the patch has been applied. * Fixes ticket:30, XPath default namespace bug. The fix was provided by Lucas Nussbaum. * Aliases #size to #length, as per zdennis's request. * Fixes typo from previous commit * Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset * Merges a user-contributed patch for ticket #40 * Adds a forgotten-to-commit unit test for ticket #32 * Changes Date, Version, and Copyright to upper case, to avoid conflicts with the Date class. All of the other changes in the altered files are because Subversion doesn't allow block-level commits, like it should. English cased Version and Copyright are aliased to the upper case versions, for partial backward compatability. * Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds. * Moves parser.source.position() to parser.position() * Fixes ticket:48, repeated writes munging text content * Fixes ticket:46, adding methods for accessing notation DTD information. * Encodes some characters and removes a brokes link in the documentation * Deals with carriage returns after XML declarations * Improved doctype handling * Whitespace handling changes * Applies a patch by David Tardon, which (incidentally) fixes ticket:50 * Closes #26, allowing anything that walks like an IO to be a source. * Ticket #31 - One unescape too many This wasn't really a bug, per se... "value" always returns a normalized string, and "value" is the method used to get the text() of an element. However, entities have no meaning in CDATA sections, so there's no justification for value to be normalizing the content of CData objects. This behavior has therefore been changed. * Ticket #45 -- Now parses notation declarations in DTDs properly. * Resolves ticket #49, Document.parse_stream returns ArgumentError * Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports. * Addresses ticket #10, fixing the StreamParser API for DTDs. * Fixes ticket #42, XPath node-set function 'name' fails with relative node set parameter * Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags at the end of a document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-04-15 00:11:04 -04:00
# Text.new( "<&", false, nil, true ) #-> Parse exception
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Text.new( "&lt;&amp;", false, nil, true ) #-> "&lt;&amp;"
# # Assume that the entity "s" is defined to be "sean"
# # and that the entity "r" is defined to be "russell"
# Text.new( "sean russell" ) #-> "&s; &r;"
# Text.new( "sean russell", false, nil, true ) #-> "sean russell"
#
# +entity_filter+ (nil) This can be an array of entities to match in the
# supplied text. This argument is only useful if +raw+ is set to false.
# Text.new( "sean russell", false, nil, false, ["s"] ) #-> "&s; russell"
# Text.new( "sean russell", false, nil, true, ["s"] ) #-> "sean russell"
# In the last example, the +entity_filter+ argument is ignored.
#
# +pattern+ INTERNAL USE ONLY
def initialize(arg, respect_whitespace=false, parent=nil, raw=nil,
entity_filter=nil, illegal=ILLEGAL )
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
@raw = false
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
if parent
super( parent )
@raw = parent.raw
else
@parent = nil
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
@raw = raw unless raw.nil?
@entity_filter = entity_filter
@normalized = @unnormalized = nil
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
if arg.kind_of? String
@string = arg.clone
@string.squeeze!(" \n\t") unless respect_whitespace
elsif arg.kind_of? Text
@string = arg.to_s
@raw = arg.raw
elsif
raise "Illegal argument of type #{arg.type} for Text constructor (#{arg})"
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
@string.gsub!( /\r\n?/, "\n" )
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# check for illegal characters
if @raw
if @string =~ illegal
raise "Illegal character '#{$1}' in raw string \"#{@string}\""
end
end
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
def node_type
:text
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
def empty?
@string.size==0
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
def clone
return Text.new(self)
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Appends text to this text node. The text is appended in the +raw+ mode
# of this text node.
def <<( to_append )
@string << to_append.gsub( /\r\n?/, "\n" )
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# +other+ a String or a Text
# +returns+ the result of (to_s <=> arg.to_s)
def <=>( other )
to_s() <=> other.to_s
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
REFERENCE = /#{Entity::REFERENCE}/
# Returns the string value of this text node. This string is always
# escaped, meaning that it is a valid XML text node string, and all
# entities that can be escaped, have been inserted. This method respects
# the entity filter set in the constructor.
#
# # Assume that the entity "s" is defined to be "sean", and that the
# # entity "r" is defined to be "russell"
# t = Text.new( "< & sean russell", false, nil, false, ['s'] )
# t.to_s #-> "&lt; &amp; &s; russell"
# t = Text.new( "< & &s; russell", false, nil, false )
# t.to_s #-> "&lt; &amp; &s; russell"
# u = Text.new( "sean russell", false, nil, true )
# u.to_s #-> "sean russell"
def to_s
return @string if @raw
return @normalized if @normalized
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
doctype = nil
if @parent
doc = @parent.document
doctype = doc.doctype if doc
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
@normalized = Text::normalize( @string, doctype, @entity_filter )
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
def inspect
@string.inspect
end
# Returns the string value of this text. This is the text without
# entities, as it might be used programmatically, or printed to the
# console. This ignores the 'raw' attribute setting, and any
# entity_filter.
#
# # Assume that the entity "s" is defined to be "sean", and that the
# # entity "r" is defined to be "russell"
# t = Text.new( "< & sean russell", false, nil, false, ['s'] )
Short summary: This is a version bump to REXML 3.1.4 for Ruby HEAD. This change log is identical to the log for the 1.8 branch. It includes numerous bug fixes and is a pretty big patch, but is nonetheless a minor revision bump, since the API hasn't changed. For more information, see: http:/www.germane-software.com/projects/rexml/milestone/3.1.4 For all tickets, see: http://www.germane-software.com/projects/rexml/ticket/# Where '#' is replaced with the ticket number. Changelog: * Fixed the documentation WRT the raw mode of text nodes (ticket #4) * Fixes roundup ticket #43: substring-after bug. * Fixed ticket #44, Element#xpath * Patch submitted by an anonymous doner to allow parsing of Tempfiles. I was hoping that, by now, that whole Source thing would have been changed to use duck typing and avoid this sort of ticket... but in the meantime, the patch has been applied. * Fixes ticket:30, XPath default namespace bug. The fix was provided by Lucas Nussbaum. * Aliases #size to #length, as per zdennis's request. * Fixes typo from previous commit * Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset * Merges a user-contributed patch for ticket #40 * Adds a forgotten-to-commit unit test for ticket #32 * Changes Date, Version, and Copyright to upper case, to avoid conflicts with the Date class. All of the other changes in the altered files are because Subversion doesn't allow block-level commits, like it should. English cased Version and Copyright are aliased to the upper case versions, for partial backward compatability. * Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds. * Moves parser.source.position() to parser.position() * Fixes ticket:48, repeated writes munging text content * Fixes ticket:46, adding methods for accessing notation DTD information. * Encodes some characters and removes a brokes link in the documentation * Deals with carriage returns after XML declarations * Improved doctype handling * Whitespace handling changes * Applies a patch by David Tardon, which (incidentally) fixes ticket:50 * Closes #26, allowing anything that walks like an IO to be a source. * Ticket #31 - One unescape too many This wasn't really a bug, per se... "value" always returns a normalized string, and "value" is the method used to get the text() of an element. However, entities have no meaning in CDATA sections, so there's no justification for value to be normalizing the content of CData objects. This behavior has therefore been changed. * Ticket #45 -- Now parses notation declarations in DTDs properly. * Resolves ticket #49, Document.parse_stream returns ArgumentError * Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports. * Addresses ticket #10, fixing the StreamParser API for DTDs. * Fixes ticket #42, XPath node-set function 'name' fails with relative node set parameter * Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags at the end of a document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-04-15 00:11:04 -04:00
# t.value #-> "< & sean russell"
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# t = Text.new( "< & &s; russell", false, nil, false )
Short summary: This is a version bump to REXML 3.1.4 for Ruby HEAD. This change log is identical to the log for the 1.8 branch. It includes numerous bug fixes and is a pretty big patch, but is nonetheless a minor revision bump, since the API hasn't changed. For more information, see: http:/www.germane-software.com/projects/rexml/milestone/3.1.4 For all tickets, see: http://www.germane-software.com/projects/rexml/ticket/# Where '#' is replaced with the ticket number. Changelog: * Fixed the documentation WRT the raw mode of text nodes (ticket #4) * Fixes roundup ticket #43: substring-after bug. * Fixed ticket #44, Element#xpath * Patch submitted by an anonymous doner to allow parsing of Tempfiles. I was hoping that, by now, that whole Source thing would have been changed to use duck typing and avoid this sort of ticket... but in the meantime, the patch has been applied. * Fixes ticket:30, XPath default namespace bug. The fix was provided by Lucas Nussbaum. * Aliases #size to #length, as per zdennis's request. * Fixes typo from previous commit * Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset * Merges a user-contributed patch for ticket #40 * Adds a forgotten-to-commit unit test for ticket #32 * Changes Date, Version, and Copyright to upper case, to avoid conflicts with the Date class. All of the other changes in the altered files are because Subversion doesn't allow block-level commits, like it should. English cased Version and Copyright are aliased to the upper case versions, for partial backward compatability. * Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds. * Moves parser.source.position() to parser.position() * Fixes ticket:48, repeated writes munging text content * Fixes ticket:46, adding methods for accessing notation DTD information. * Encodes some characters and removes a brokes link in the documentation * Deals with carriage returns after XML declarations * Improved doctype handling * Whitespace handling changes * Applies a patch by David Tardon, which (incidentally) fixes ticket:50 * Closes #26, allowing anything that walks like an IO to be a source. * Ticket #31 - One unescape too many This wasn't really a bug, per se... "value" always returns a normalized string, and "value" is the method used to get the text() of an element. However, entities have no meaning in CDATA sections, so there's no justification for value to be normalizing the content of CData objects. This behavior has therefore been changed. * Ticket #45 -- Now parses notation declarations in DTDs properly. * Resolves ticket #49, Document.parse_stream returns ArgumentError * Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports. * Addresses ticket #10, fixing the StreamParser API for DTDs. * Fixes ticket #42, XPath node-set function 'name' fails with relative node set parameter * Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags at the end of a document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-04-15 00:11:04 -04:00
# t.value #-> "< & sean russell"
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# u = Text.new( "sean russell", false, nil, true )
Short summary: This is a version bump to REXML 3.1.4 for Ruby HEAD. This change log is identical to the log for the 1.8 branch. It includes numerous bug fixes and is a pretty big patch, but is nonetheless a minor revision bump, since the API hasn't changed. For more information, see: http:/www.germane-software.com/projects/rexml/milestone/3.1.4 For all tickets, see: http://www.germane-software.com/projects/rexml/ticket/# Where '#' is replaced with the ticket number. Changelog: * Fixed the documentation WRT the raw mode of text nodes (ticket #4) * Fixes roundup ticket #43: substring-after bug. * Fixed ticket #44, Element#xpath * Patch submitted by an anonymous doner to allow parsing of Tempfiles. I was hoping that, by now, that whole Source thing would have been changed to use duck typing and avoid this sort of ticket... but in the meantime, the patch has been applied. * Fixes ticket:30, XPath default namespace bug. The fix was provided by Lucas Nussbaum. * Aliases #size to #length, as per zdennis's request. * Fixes typo from previous commit * Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset * Merges a user-contributed patch for ticket #40 * Adds a forgotten-to-commit unit test for ticket #32 * Changes Date, Version, and Copyright to upper case, to avoid conflicts with the Date class. All of the other changes in the altered files are because Subversion doesn't allow block-level commits, like it should. English cased Version and Copyright are aliased to the upper case versions, for partial backward compatability. * Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds. * Moves parser.source.position() to parser.position() * Fixes ticket:48, repeated writes munging text content * Fixes ticket:46, adding methods for accessing notation DTD information. * Encodes some characters and removes a brokes link in the documentation * Deals with carriage returns after XML declarations * Improved doctype handling * Whitespace handling changes * Applies a patch by David Tardon, which (incidentally) fixes ticket:50 * Closes #26, allowing anything that walks like an IO to be a source. * Ticket #31 - One unescape too many This wasn't really a bug, per se... "value" always returns a normalized string, and "value" is the method used to get the text() of an element. However, entities have no meaning in CDATA sections, so there's no justification for value to be normalizing the content of CData objects. This behavior has therefore been changed. * Ticket #45 -- Now parses notation declarations in DTDs properly. * Resolves ticket #49, Document.parse_stream returns ArgumentError * Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports. * Addresses ticket #10, fixing the StreamParser API for DTDs. * Fixes ticket #42, XPath node-set function 'name' fails with relative node set parameter * Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags at the end of a document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-04-15 00:11:04 -04:00
# u.value #-> "sean russell"
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
def value
@unnormalized if @unnormalized
doctype = nil
if @parent
doc = @parent.document
doctype = doc.doctype if doc
end
@unnormalized = Text::unnormalize( @string, doctype )
end
# Sets the contents of this text node. This expects the text to be
# unnormalized. It returns self.
#
# e = Element.new( "a" )
# e.add_text( "foo" ) # <a>foo</a>
# e[0].value = "bar" # <a>bar</a>
# e[0].value = "<a>" # <a>&lt;a&gt;</a>
def value=( val )
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
@string = val.gsub( /\r\n?/, "\n" )
@unnormalized = nil
@normalized = nil
@raw = false
end
Merged from REXML main repository: Fixes ticket:68. NOTE that this involves an API change! Entity declarations in the doctype now generate events that carry two, not one, arguments. Implements ticket:15, using gwrite's suggestion. This allows Element to be subclassed. Two unrelated changes, because subversion is retarded and doesn't do block-level commits: 1) Fixed a typo bug in previous change for ticket:15 2) Fixed namespaces handling in XPath and element. ***** Note that this is an API change!!! ***** Element.namespaces() now returns a hash of namespace mappings which are relevant for that node. Fixes a bug in multiple decodings The changeset 1230:1231 was bad. The default behavior is *not* to use the native REXML encodings by default, but rather to use ICONV by default. I know that this will piss some people off, but defaulting to the pure Ruby version isn't the correct solution, and it breaks other encodings, so I've reverted it. * Fixes ticket:61 (xpath_parser) * Fixes ticket:63 (UTF-16; UNILE decoding was bad) * Cleans up some tests, removing opportunities for test corruption * Improves parsing error messages a little * Adds the ability to override the encoding detection in Source construction * Fixes an edge case in Functions::string, where document nodes weren't correctly converted * Fixes Functions::string() for Element and Document nodes * Fixes some problems in entity handling Addresses ticket:66 Fixes ticket:71 Addresses ticket:78 NOTE: that this also fixes what is technically another bug in REXML. REXML's XPath parser used to allow exponential notation in numbers. The XPath spec is specific about what a number is, and scientific notation is not included. Therefore, this has been fixed. Cross-ported a fix for ticket:88 from CVS. Fixes ticket:80 Documentation cleanup. Ticket:84 Applied Kou's fix for an un-trac'ed bug. ------------------------------------------------------------------------ git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@11548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-01-19 22:56:02 -05:00
def wrap(string, width, addnewline=false)
# Recursivly wrap string at width.
return string if string.length <= width
place = string.rindex(' ', width) # Position in string with last ' ' before cutoff
if addnewline then
return "\n" + string[0,place] + "\n" + wrap(string[place+1..-1], width)
else
return string[0,place] + "\n" + wrap(string[place+1..-1], width)
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
end
Merged from REXML main repository: Fixes ticket:68. NOTE that this involves an API change! Entity declarations in the doctype now generate events that carry two, not one, arguments. Implements ticket:15, using gwrite's suggestion. This allows Element to be subclassed. Two unrelated changes, because subversion is retarded and doesn't do block-level commits: 1) Fixed a typo bug in previous change for ticket:15 2) Fixed namespaces handling in XPath and element. ***** Note that this is an API change!!! ***** Element.namespaces() now returns a hash of namespace mappings which are relevant for that node. Fixes a bug in multiple decodings The changeset 1230:1231 was bad. The default behavior is *not* to use the native REXML encodings by default, but rather to use ICONV by default. I know that this will piss some people off, but defaulting to the pure Ruby version isn't the correct solution, and it breaks other encodings, so I've reverted it. * Fixes ticket:61 (xpath_parser) * Fixes ticket:63 (UTF-16; UNILE decoding was bad) * Cleans up some tests, removing opportunities for test corruption * Improves parsing error messages a little * Adds the ability to override the encoding detection in Source construction * Fixes an edge case in Functions::string, where document nodes weren't correctly converted * Fixes Functions::string() for Element and Document nodes * Fixes some problems in entity handling Addresses ticket:66 Fixes ticket:71 Addresses ticket:78 NOTE: that this also fixes what is technically another bug in REXML. REXML's XPath parser used to allow exponential notation in numbers. The XPath spec is specific about what a number is, and scientific notation is not included. Therefore, this has been fixed. Cross-ported a fix for ticket:88 from CVS. Fixes ticket:80 Documentation cleanup. Ticket:84 Applied Kou's fix for an un-trac'ed bug. ------------------------------------------------------------------------ git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@11548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-01-19 22:56:02 -05:00
def indent_text(string, level=1, style="\t", indentfirstline=true)
return string if level < 0
new_string = ''
string.each { |line|
indent_string = style * level
new_line = (indent_string + line).sub(/[\s]+$/,'')
new_string << new_line
}
new_string.strip! unless indentfirstline
return new_string
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
def write( writer, indent=-1, transitive=false, ie_hack=false )
s = to_s()
if not (@parent and @parent.whitespace) then
s = wrap(s, 60, false) if @parent and @parent.context[:wordwrap] == :all
if @parent and not @parent.context[:indentstyle].nil? and indent > 0 and s.count("\n") > 0
s = indent_text(s, indent, @parent.context[:indentstyle], false)
end
s.squeeze!(" \n\t") if @parent and !@parent.whitespace
end
writer << s
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
end
# FIXME
# This probably won't work properly
def xpath
path = @parent.xpath
path += "/text()"
return path
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Writes out text, substituting special characters beforehand.
# +out+ A String, IO, or any other object supporting <<( String )
# +input+ the text to substitute and the write out
#
# z=utf8.unpack("U*")
# ascOut=""
# z.each{|r|
# if r < 0x100
# ascOut.concat(r.chr)
# else
# ascOut.concat(sprintf("&#x%x;", r))
# end
# }
# puts ascOut
def write_with_substitution out, input
copy = input.clone
# Doing it like this rather than in a loop improves the speed
copy.gsub!( SPECIALS[0], SUBSTITUTES[0] )
copy.gsub!( SPECIALS[1], SUBSTITUTES[1] )
copy.gsub!( SPECIALS[2], SUBSTITUTES[2] )
copy.gsub!( SPECIALS[3], SUBSTITUTES[3] )
copy.gsub!( SPECIALS[4], SUBSTITUTES[4] )
copy.gsub!( SPECIALS[5], SUBSTITUTES[5] )
out << copy
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Reads text, substituting entities
def Text::read_with_substitution( input, illegal=nil )
copy = input.clone
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
if copy =~ illegal
raise ParseException.new( "malformed text: Illegal character #$& in \"#{copy}\"" )
end if illegal
copy.gsub!( /\r\n?/, "\n" )
if copy.include? ?&
copy.gsub!( SETUTITSBUS[0], SLAICEPS[0] )
copy.gsub!( SETUTITSBUS[1], SLAICEPS[1] )
copy.gsub!( SETUTITSBUS[2], SLAICEPS[2] )
copy.gsub!( SETUTITSBUS[3], SLAICEPS[3] )
copy.gsub!( SETUTITSBUS[4], SLAICEPS[4] )
copy.gsub!( /&#0*((?:\d+)|(?:x[a-f0-9]+));/ ) {|m|
m=$1
#m='0' if m==''
m = "0#{m}" if m[0] == ?x
[Integer(m)].pack('U*')
}
end
copy
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
EREFERENCE = /&(?!#{Entity::NAME};)/
# Escapes all possible entities
def Text::normalize( input, doctype=nil, entity_filter=nil )
Short summary: This is a version bump to REXML 3.1.4 for Ruby HEAD. This change log is identical to the log for the 1.8 branch. It includes numerous bug fixes and is a pretty big patch, but is nonetheless a minor revision bump, since the API hasn't changed. For more information, see: http:/www.germane-software.com/projects/rexml/milestone/3.1.4 For all tickets, see: http://www.germane-software.com/projects/rexml/ticket/# Where '#' is replaced with the ticket number. Changelog: * Fixed the documentation WRT the raw mode of text nodes (ticket #4) * Fixes roundup ticket #43: substring-after bug. * Fixed ticket #44, Element#xpath * Patch submitted by an anonymous doner to allow parsing of Tempfiles. I was hoping that, by now, that whole Source thing would have been changed to use duck typing and avoid this sort of ticket... but in the meantime, the patch has been applied. * Fixes ticket:30, XPath default namespace bug. The fix was provided by Lucas Nussbaum. * Aliases #size to #length, as per zdennis's request. * Fixes typo from previous commit * Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset * Merges a user-contributed patch for ticket #40 * Adds a forgotten-to-commit unit test for ticket #32 * Changes Date, Version, and Copyright to upper case, to avoid conflicts with the Date class. All of the other changes in the altered files are because Subversion doesn't allow block-level commits, like it should. English cased Version and Copyright are aliased to the upper case versions, for partial backward compatability. * Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds. * Moves parser.source.position() to parser.position() * Fixes ticket:48, repeated writes munging text content * Fixes ticket:46, adding methods for accessing notation DTD information. * Encodes some characters and removes a brokes link in the documentation * Deals with carriage returns after XML declarations * Improved doctype handling * Whitespace handling changes * Applies a patch by David Tardon, which (incidentally) fixes ticket:50 * Closes #26, allowing anything that walks like an IO to be a source. * Ticket #31 - One unescape too many This wasn't really a bug, per se... "value" always returns a normalized string, and "value" is the method used to get the text() of an element. However, entities have no meaning in CDATA sections, so there's no justification for value to be normalizing the content of CData objects. This behavior has therefore been changed. * Ticket #45 -- Now parses notation declarations in DTDs properly. * Resolves ticket #49, Document.parse_stream returns ArgumentError * Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports. * Addresses ticket #10, fixing the StreamParser API for DTDs. * Fixes ticket #42, XPath node-set function 'name' fails with relative node set parameter * Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags at the end of a document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-04-15 00:11:04 -04:00
copy = input
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Doing it like this rather than in a loop improves the speed
Merged from REXML main repository: Fixes ticket:68. NOTE that this involves an API change! Entity declarations in the doctype now generate events that carry two, not one, arguments. Implements ticket:15, using gwrite's suggestion. This allows Element to be subclassed. Two unrelated changes, because subversion is retarded and doesn't do block-level commits: 1) Fixed a typo bug in previous change for ticket:15 2) Fixed namespaces handling in XPath and element. ***** Note that this is an API change!!! ***** Element.namespaces() now returns a hash of namespace mappings which are relevant for that node. Fixes a bug in multiple decodings The changeset 1230:1231 was bad. The default behavior is *not* to use the native REXML encodings by default, but rather to use ICONV by default. I know that this will piss some people off, but defaulting to the pure Ruby version isn't the correct solution, and it breaks other encodings, so I've reverted it. * Fixes ticket:61 (xpath_parser) * Fixes ticket:63 (UTF-16; UNILE decoding was bad) * Cleans up some tests, removing opportunities for test corruption * Improves parsing error messages a little * Adds the ability to override the encoding detection in Source construction * Fixes an edge case in Functions::string, where document nodes weren't correctly converted * Fixes Functions::string() for Element and Document nodes * Fixes some problems in entity handling Addresses ticket:66 Fixes ticket:71 Addresses ticket:78 NOTE: that this also fixes what is technically another bug in REXML. REXML's XPath parser used to allow exponential notation in numbers. The XPath spec is specific about what a number is, and scientific notation is not included. Therefore, this has been fixed. Cross-ported a fix for ticket:88 from CVS. Fixes ticket:80 Documentation cleanup. Ticket:84 Applied Kou's fix for an un-trac'ed bug. ------------------------------------------------------------------------ git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@11548 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2007-01-19 22:56:02 -05:00
#copy = copy.gsub( EREFERENCE, '&amp;' )
copy = copy.gsub( "&", "&amp;" )
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
if doctype
Short summary: This is a version bump to REXML 3.1.4 for Ruby HEAD. This change log is identical to the log for the 1.8 branch. It includes numerous bug fixes and is a pretty big patch, but is nonetheless a minor revision bump, since the API hasn't changed. For more information, see: http:/www.germane-software.com/projects/rexml/milestone/3.1.4 For all tickets, see: http://www.germane-software.com/projects/rexml/ticket/# Where '#' is replaced with the ticket number. Changelog: * Fixed the documentation WRT the raw mode of text nodes (ticket #4) * Fixes roundup ticket #43: substring-after bug. * Fixed ticket #44, Element#xpath * Patch submitted by an anonymous doner to allow parsing of Tempfiles. I was hoping that, by now, that whole Source thing would have been changed to use duck typing and avoid this sort of ticket... but in the meantime, the patch has been applied. * Fixes ticket:30, XPath default namespace bug. The fix was provided by Lucas Nussbaum. * Aliases #size to #length, as per zdennis's request. * Fixes typo from previous commit * Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset * Merges a user-contributed patch for ticket #40 * Adds a forgotten-to-commit unit test for ticket #32 * Changes Date, Version, and Copyright to upper case, to avoid conflicts with the Date class. All of the other changes in the altered files are because Subversion doesn't allow block-level commits, like it should. English cased Version and Copyright are aliased to the upper case versions, for partial backward compatability. * Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds. * Moves parser.source.position() to parser.position() * Fixes ticket:48, repeated writes munging text content * Fixes ticket:46, adding methods for accessing notation DTD information. * Encodes some characters and removes a brokes link in the documentation * Deals with carriage returns after XML declarations * Improved doctype handling * Whitespace handling changes * Applies a patch by David Tardon, which (incidentally) fixes ticket:50 * Closes #26, allowing anything that walks like an IO to be a source. * Ticket #31 - One unescape too many This wasn't really a bug, per se... "value" always returns a normalized string, and "value" is the method used to get the text() of an element. However, entities have no meaning in CDATA sections, so there's no justification for value to be normalizing the content of CData objects. This behavior has therefore been changed. * Ticket #45 -- Now parses notation declarations in DTDs properly. * Resolves ticket #49, Document.parse_stream returns ArgumentError * Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports. * Addresses ticket #10, fixing the StreamParser API for DTDs. * Fixes ticket #42, XPath node-set function 'name' fails with relative node set parameter * Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags at the end of a document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-04-15 00:11:04 -04:00
# Replace all ampersands that aren't part of an entity
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
doctype.entities.each_value do |entity|
copy = copy.gsub( entity.value,
"&#{entity.name};" ) if entity.value and
not( entity_filter and entity_filter.include?(entity) )
end
else
Short summary: This is a version bump to REXML 3.1.4 for Ruby HEAD. This change log is identical to the log for the 1.8 branch. It includes numerous bug fixes and is a pretty big patch, but is nonetheless a minor revision bump, since the API hasn't changed. For more information, see: http:/www.germane-software.com/projects/rexml/milestone/3.1.4 For all tickets, see: http://www.germane-software.com/projects/rexml/ticket/# Where '#' is replaced with the ticket number. Changelog: * Fixed the documentation WRT the raw mode of text nodes (ticket #4) * Fixes roundup ticket #43: substring-after bug. * Fixed ticket #44, Element#xpath * Patch submitted by an anonymous doner to allow parsing of Tempfiles. I was hoping that, by now, that whole Source thing would have been changed to use duck typing and avoid this sort of ticket... but in the meantime, the patch has been applied. * Fixes ticket:30, XPath default namespace bug. The fix was provided by Lucas Nussbaum. * Aliases #size to #length, as per zdennis's request. * Fixes typo from previous commit * Fixes ticket #32, preceding-sibling fails attempting delete_if on nil nodeset * Merges a user-contributed patch for ticket #40 * Adds a forgotten-to-commit unit test for ticket #32 * Changes Date, Version, and Copyright to upper case, to avoid conflicts with the Date class. All of the other changes in the altered files are because Subversion doesn't allow block-level commits, like it should. English cased Version and Copyright are aliased to the upper case versions, for partial backward compatability. * Resolves ticket #34, SAX parser change makes it impossible to parse IO feeds. * Moves parser.source.position() to parser.position() * Fixes ticket:48, repeated writes munging text content * Fixes ticket:46, adding methods for accessing notation DTD information. * Encodes some characters and removes a brokes link in the documentation * Deals with carriage returns after XML declarations * Improved doctype handling * Whitespace handling changes * Applies a patch by David Tardon, which (incidentally) fixes ticket:50 * Closes #26, allowing anything that walks like an IO to be a source. * Ticket #31 - One unescape too many This wasn't really a bug, per se... "value" always returns a normalized string, and "value" is the method used to get the text() of an element. However, entities have no meaning in CDATA sections, so there's no justification for value to be normalizing the content of CData objects. This behavior has therefore been changed. * Ticket #45 -- Now parses notation declarations in DTDs properly. * Resolves ticket #49, Document.parse_stream returns ArgumentError * Adds documentation to clarify how XMLDecl works, to avoid invalid bug reports. * Addresses ticket #10, fixing the StreamParser API for DTDs. * Fixes ticket #42, XPath node-set function 'name' fails with relative node set parameter * Good patch by Aaron to fix ticket #53: REXML ignoring unbalanced tags at the end of a document. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@10092 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2006-04-15 00:11:04 -04:00
# Replace all ampersands that aren't part of an entity
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
DocType::DEFAULT_ENTITIES.each_value do |entity|
copy = copy.gsub(entity.value, "&#{entity.name};" )
end
end
copy
end
Merged in development from the main REXML repository. * Fixed bug #34, typo in xpath_parser. * Previous fix, (include? -> includes?) was incorrect. * Added another test for encoding * Started AnyName support in RelaxNG * Added Element#Attributes#to_a, so that it does something intelligent. This was needed by XPath, for '@*' * Fixed XPath so that @* works. * Added xmlgrep to the bin/ directory. A little tool allowing you to grep for XPaths in an XML document. * Fixed a CDATA pretty-printing bug. (#39) * Fixed a buffering bug in Source.rb that affected the SAX parser This bug was related to how REXML determines the encoding of a file, and evinced itself by hanging on input when using the SAX parser. * The unit test for the previous patch. Forgot to commit it. * Minor pretty printing fix. * Applied Curt Sampson's optimization improvements * Issue #9; 3.1.3: The SAX parser was not denormalizing entity references in incoming text. All declared internal entities, as well as numeric entities, should now be denormalized. There was a related bug in that the SAX parser was actually double-encoding entities; this is also fixed. * bin/* programs should now be executable. Setting bin apps to executable * Issue 14; 3.1.3: DTD events are now all being passed by StreamParser Some of the DTD events were not being passed through by the stream parser. * #26: Element#add_element(nil) now raises an error Changed XPath searches so that if a non-Hash is passed, an error is raised Fixed a spurrious undefined method error in encoding. #29: XPath ordering bug fixed by Mark Williams. Incidentally, Mark supplied a superlative bug report, including a full unit test. Then he went ahead and fixed the bug. It doesn't get any better than this, folks. * Fixed a broken link. Thanks to Dick Davies for pointing it out. Added functions courtesy of Michael Neumann <mneumann@xxxx.de>. Example code to follow. * Added Michael's sample code. Merged the changes in from branches/xpath_V * Fixed preceding:: and following:: axis Fixed the ordering bug that Martin Fowler reported. * Uncommented some code commented for testing Applied Nobu's changes to the Encoding infrastructure, which should fix potential threading issues. * Added more tests, and the missing syncenumerator class. Fixed the inheritance bug in the pull parser that James Britt found. Indentation changes, and changed some exceptions to runtime exceptions. * Changes by Matz, mostly of indent -> indent_level, to avoid function/variable naming conflicts * Tabs -> spaces (whitespace) Note the addition of syncenumerator.rb. This is a stopgap, until I can work on the class enough to get it accepted as a replacement for the SyncEnumerator that comes with the Generator class. My version is orders of magnitude faster than the Generator SyncEnumerator, but is currently missing a couple of features of the original. Eventually, I expect this class to migrate to another part of the source tree. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2005-05-18 22:58:11 -04:00
# Unescapes all possible entities
def Text::unnormalize( string, doctype=nil, filter=nil, illegal=nil )
rv = string.clone
rv.gsub!( /\r\n?/, "\n" )
matches = rv.scan( REFERENCE )
return rv if matches.size == 0
rv.gsub!( NUMERICENTITY ) {|m|
m=$1
m = "0#{m}" if m[0] == ?x
[Integer(m)].pack('U*')
}
matches.collect!{|x|x[0]}.compact!
if matches.size > 0
if doctype
matches.each do |entity_reference|
unless filter and filter.include?(entity_reference)
entity_value = doctype.entity( entity_reference )
re = /&#{entity_reference};/
rv.gsub!( re, entity_value ) if entity_value
end
end
else
matches.each do |entity_reference|
unless filter and filter.include?(entity_reference)
entity_value = DocType::DEFAULT_ENTITIES[ entity_reference ]
re = /&#{entity_reference};/
rv.gsub!( re, entity_value.value ) if entity_value
end
end
end
rv.gsub!( /&amp;/, '&' )
end
rv
end
end
end