mirror of
https://github.com/ruby/ruby.git
synced 2022-11-09 12:17:21 -05:00
* doc/re.rdoc: Document difference between match and =~, options with
Regexp.new and global variables. Patch by Sylvain Daubert. [Ruby 1.9 - Bug #5709]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@33977 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
parent
52654367f6
commit
3e204989c1
2 changed files with 75 additions and 2 deletions
|
@ -1,3 +1,9 @@
|
||||||
|
Thu Dec 8 07:20:15 2011 Eric Hodel <drbrain@segment7.net>
|
||||||
|
|
||||||
|
* doc/re.rdoc: Document difference between match and =~, options with
|
||||||
|
Regexp.new and global variables. Patch by Sylvain Daubert.
|
||||||
|
[Ruby 1.9 - Bug #5709]
|
||||||
|
|
||||||
Thu Dec 8 06:53:10 2011 Eric Hodel <drbrain@segment7.net>
|
Thu Dec 8 06:53:10 2011 Eric Hodel <drbrain@segment7.net>
|
||||||
|
|
||||||
* doc/re.rdoc: Fix example code to match documentation. Patch by
|
* doc/re.rdoc: Fix example code to match documentation. Patch by
|
||||||
|
|
71
doc/re.rdoc
|
@ -24,6 +24,32 @@ string matches itself.
|
||||||
Specifically, <tt>/st/</tt> requires that the string contains the letter
|
Specifically, <tt>/st/</tt> requires that the string contains the letter
|
||||||
_s_ followed by the letter _t_, so it matches _haystack_, also.
|
_s_ followed by the letter _t_, so it matches _haystack_, also.
|
||||||
|
|
||||||
|
== <tt>=~</tt> and Regexp#match
|
||||||
|
|
||||||
|
Pattern matching may be achieved by using the <tt>=~</tt> operator or the
Regexp#match method.
|
||||||
|
|
||||||
|
=== <tt>=~</tt> operator
|
||||||
|
|
||||||
|
<tt>=~</tt> is Ruby's basic pattern-matching operator. When one operand is a
regular expression and the other is a string, the regular expression is used as
a pattern to match against the string (this operator is equivalently defined by
Regexp and String). If a match is found, the operator returns the index of the
first match in the string; otherwise it returns +nil+.
|
||||||
|
|
||||||
|
/hay/ =~ 'haystack' #=> 0
|
||||||
|
/a/ =~ 'haystack' #=> 1
|
||||||
|
/u/ =~ 'haystack' #=> nil
|
||||||
|
|
||||||
|
When the <tt>=~</tt> operator is used with a String and a Regexp, the
<tt>$~</tt> global variable is set after a successful match. <tt>$~</tt> holds
a MatchData object. Regexp.last_match is equivalent to <tt>$~</tt>.
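As a minimal sketch of the behavior described above (the sample strings are only illustrative), the value returned by <tt>=~</tt> and the MatchData left in <tt>$~</tt> can be inspected side by side:

```ruby
# Successful match: =~ returns the index of the match and sets $~.
index = /st/ =~ 'haystack'
last  = $~

puts index.inspect            # position where "st" starts
puts last.class               # MatchData
puts last[0]                  # the matched text
puts Regexp.last_match == $~  # both refer to the same match

# Failed match: =~ returns nil, and $~ is reset to nil as well.
failed        = /u/ =~ 'haystack'
after_failure = $~
puts failed.inspect
puts after_failure.inspect
```

Copying <tt>$~</tt> into a local variable right after the match, as done here, avoids surprises when a later match overwrites it.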
|
||||||
|
|
||||||
|
=== Regexp#match method
|
||||||
|
|
||||||
|
The #match method returns a MatchData object:
|
||||||
|
|
||||||
|
/st/.match('haystack') #=> #<MatchData "st">
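The practical difference between the two forms is the return value. The following sketch (pattern and strings chosen only for illustration) compares them on the same input:

```ruby
pattern = /s(t)/

index = pattern =~ 'haystack'      # Integer position, or nil
match = pattern.match('haystack')  # MatchData, or nil

puts index.inspect
puts match.inspect
puts match.begin(0)    # start of the whole match, the same value =~ returned
puts match[1]          # text of the first capture group
puts match.pre_match   # text before the match
puts match.post_match  # text after the match

# Both forms return nil when nothing matches, so either works in a condition.
no_match = pattern.match('needle')
puts no_match.inspect
```

<tt>=~</tt> is convenient when only the position or a yes/no answer is needed; Regexp#match is handier when the captures are used directly.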
|
||||||
|
|
||||||
== Metacharacters and Escapes
|
== Metacharacters and Escapes
|
||||||
|
|
||||||
The following are <i>metacharacters</i> <tt>(</tt>, <tt>)</tt>,
|
The following are <i>metacharacters</i> <tt>(</tt>, <tt>)</tt>,
|
||||||
|
@ -111,7 +137,7 @@ matches any character in the Unicode _Nd_ category.
|
||||||
* <tt>/[[:print:]]/</tt> - Like [:graph:], but includes the space character
|
* <tt>/[[:print:]]/</tt> - Like [:graph:], but includes the space character
|
||||||
* <tt>/[[:punct:]]/</tt> - Punctuation character
|
* <tt>/[[:punct:]]/</tt> - Punctuation character
|
||||||
* <tt>/[[:space:]]/</tt> - Whitespace character (<tt>[:blank:]</tt>, newline,
|
* <tt>/[[:space:]]/</tt> - Whitespace character (<tt>[:blank:]</tt>, newline,
|
||||||
carriage return, etc.)
|
carriage return, etc.)
|
||||||
* <tt>/[[:upper:]]/</tt> - Uppercase alphabetical
|
* <tt>/[[:upper:]]/</tt> - Uppercase alphabetical
|
||||||
* <tt>/[[:xdigit:]]/</tt> - Digit allowed in a hexadecimal number (i.e.,
|
* <tt>/[[:xdigit:]]/</tt> - Digit allowed in a hexadecimal number (i.e.,
|
||||||
0-9a-fA-F)
|
0-9a-fA-F)
|
||||||
|
@ -169,7 +195,7 @@ jeopardises the overall match.
|
||||||
Parentheses can be used for <i>capturing</i>. The text enclosed by the
|
Parentheses can be used for <i>capturing</i>. The text enclosed by the
|
||||||
<i>n</i><sup>th</sup> group of parentheses can be subsequently referred to
|
<i>n</i><sup>th</sup> group of parentheses can be subsequently referred to
|
||||||
with <i>n</i>. Within a pattern use the <i>backreference</i>
|
with <i>n</i>. Within a pattern use the <i>backreference</i>
|
||||||
<tt>\</tt><i>n</i>; outside of the pattern use
|
<tt>\n</tt>; outside of the pattern use
|
||||||
<tt>MatchData[</tt><i>n</i><tt>]</tt>.
|
<tt>MatchData[</tt><i>n</i><tt>]</tt>.
|
||||||
|
|
||||||
# 'at' is captured by the first group of parentheses, then referred to
|
# 'at' is captured by the first group of parentheses, then referred to
|
||||||
|
@ -473,6 +499,13 @@ expression enclosed by the parentheses.
|
||||||
/a(?i:b)c/.match('aBc') #=> #<MatchData "aBc">
|
/a(?i:b)c/.match('aBc') #=> #<MatchData "aBc">
|
||||||
/a(?i:b)c/.match('abc') #=> #<MatchData "abc">
|
/a(?i:b)c/.match('abc') #=> #<MatchData "abc">
|
||||||
|
|
||||||
|
Options may also be used with <tt>Regexp.new</tt>:
|
||||||
|
|
||||||
|
Regexp.new("abc", Regexp::IGNORECASE) #=> /abc/i
|
||||||
|
Regexp.new("abc", Regexp::MULTILINE) #=> /abc/m
|
||||||
|
Regexp.new("abc # Comment", Regexp::EXTENDED) #=> /abc # Comment/x
|
||||||
|
Regexp.new("abc", Regexp::IGNORECASE | Regexp::MULTILINE) #=> /abc/mi
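Because the option constants are integer flags, a combined value can be taken apart again with Regexp#options. A small sketch, using an illustrative pattern:

```ruby
# Combine flags with bitwise OR.
re = Regexp.new('abc', Regexp::IGNORECASE | Regexp::EXTENDED)

# Test each flag individually with bitwise AND.
ignore_case = (re.options & Regexp::IGNORECASE) != 0
extended    = (re.options & Regexp::EXTENDED) != 0
multiline   = (re.options & Regexp::MULTILINE) != 0

puts re.inspect
puts "ignore case: #{ignore_case}, extended: #{extended}, multiline: #{multiline}"

# The IGNORECASE flag makes the match case-insensitive.
position = re =~ 'xxABCxx'
puts position.inspect
```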
|
||||||
|
|
||||||
== Free-Spacing Mode and Comments
|
== Free-Spacing Mode and Comments
|
||||||
|
|
||||||
As mentioned above, the <tt>x</tt> option enables <i>free-spacing</i>
|
As mentioned above, the <tt>x</tt> option enables <i>free-spacing</i>
|
||||||
|
@ -525,6 +558,40 @@ regexp's encoding can be explicitly fixed by supplying
|
||||||
#=> Encoding::CompatibilityError: incompatible encoding regexp match
|
#=> Encoding::CompatibilityError: incompatible encoding regexp match
|
||||||
(ISO-8859-1 regexp with UTF-8 string)
|
(ISO-8859-1 regexp with UTF-8 string)
|
||||||
|
|
||||||
|
== Special global variables
|
||||||
|
|
||||||
|
Pattern matching sets some global variables:
|
||||||
|
* <tt>$~</tt> is equivalent to Regexp.last_match;
|
||||||
|
* <tt>$&</tt> contains the complete matched text;
|
||||||
|
* <tt>$`</tt> contains the string before the match;
* <tt>$'</tt> contains the string after the match;
* <tt>$1</tt>, <tt>$2</tt> and so on contain the text matching the first,
  second, etc. capture group;
* <tt>$+</tt> contains the last capture group.
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
m = /s(\w{2}).*(c)/.match('haystack') #=> #<MatchData "stac" 1:"ta" 2:"c">
|
||||||
|
$~ #=> #<MatchData "stac" 1:"ta" 2:"c">
|
||||||
|
Regexp.last_match #=> #<MatchData "stac" 1:"ta" 2:"c">
|
||||||
|
|
||||||
|
$& #=> "stac"
|
||||||
|
# same as m[0]
|
||||||
|
$` #=> "hay"
|
||||||
|
# same as m.pre_match
|
||||||
|
$' #=> "k"
|
||||||
|
# same as m.post_match
|
||||||
|
$1 #=> "ta"
|
||||||
|
# same as m[1]
|
||||||
|
$2 #=> "c"
|
||||||
|
# same as m[2]
|
||||||
|
$3 #=> nil
|
||||||
|
# no third group in pattern
|
||||||
|
$+ #=> "c"
|
||||||
|
# same as m[-1]
|
||||||
|
|
||||||
|
These global variables are thread-local and method-local variables.
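The method-local part can be illustrated with a short sketch. The helper name +inner_match+ is hypothetical and exists only for this example; thread-locality is not demonstrated here.

```ruby
# Hypothetical helper: performs its own match and returns the matched text.
def inner_match
  /needle/ =~ 'a needle'  # sets $~ inside this method only
  $~[0]
end

/hay/ =~ 'haystack'
outer_before = $~[0]

inner = inner_match       # the match inside the method does not leak out

outer_after = $~[0]       # still the match made in this scope

puts outer_before
puts inner
puts outer_after
```

Despite the <tt>$</tt> prefix, a match performed in a called method therefore does not overwrite the caller's <tt>$~</tt>.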
|
||||||
|
|
||||||
== Performance
|
== Performance
|
||||||
|
|
||||||
Certain pathological combinations of constructs can lead to abysmally bad
|
Certain pathological combinations of constructs can lead to abysmally bad
|
||||||
|
|