diff --git a/doc/regexp.rdoc b/doc/regexp.rdoc
index 5ec64907f5..23fe7113b9 100644
--- a/doc/regexp.rdoc
+++ b/doc/regexp.rdoc
@@ -222,13 +222,13 @@ jeopardises the overall match.
 == Capturing
 
 Parentheses can be used for capturing. The text enclosed by the
-nth group of parentheses can be subsequently referred to
+nth group of parentheses can be subsequently referred to
 with n. Within a pattern use the backreference
-\n; outside of the pattern use
-MatchData[n].
+\n (e.g. \1); outside of the pattern use
+MatchData[n] (e.g. MatchData[1]).
 
-'at' is captured by the first group of parentheses, then referred to later
-with \1:
+In this example, 'at' is captured by the first group of
+parentheses, then referred to later with \1:
 
     /[csh](..) [csh]\1 in/.match("The cat sat in the hat")
         #=> #<MatchData "cat sat in" 1:"at">
@@ -238,6 +238,21 @@ available with its #[] method:
     /[csh](..) [csh]\1 in/.match("The cat sat in the hat")[1]
         #=> 'at'
 
+While Ruby supports an arbitrary number of numbered captured groups,
+only groups 1-9 are supported using the \n backreference
+syntax.
+
+Ruby also supports \0 as a special backreference, which
+references the entire matched string. This is also available at
+MatchData[0]. Note that the \0 backreference cannot
+be used inside the regexp, as backreferences can only be used after the
+end of the capture group, and the \0 backreference uses the
+implicit capture group of the entire match. However, you can use
+this backreference when doing substitution:
+
+    "The cat sat in the hat".gsub(/[csh]at/, '\0s')
+      # => "The cats sats in the hats"
+
 === Named captures
 
 Capture groups can be referred to by name when defined with the
@@ -524,6 +539,17 @@ characters, anchoring the match to a specific position.
 * (?<!pat) - Negative lookbehind assertion: ensures that the preceding
   characters do not match pat, but doesn't include those characters in the
   matched text
+* \K - Uses a positive lookbehind of the content preceding
+  \K in the regexp. For example, the following two regexps are
+  almost equivalent:
+
+    /ab\Kc/
+    /(?<=ab)c/
+
+  As are the following two regexps:
+
+    /(a)\K(b)\Kc/
+    /(?<=(?<=(a))(b))c/
 
 If a pattern isn't anchored it can begin at any point in the string:
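
The behaviour documented by this patch can be checked with a short script. This is only a sketch for trying the examples locally: the variable names are illustrative and are not part of the patch, and the inputs are the same strings used in the documentation text.

```ruby
# Sketch: exercises the backreference and \K behaviour described in the patch.
# All variable names here are hypothetical and chosen for illustration only.

text = "The cat sat in the hat"

# \1 inside the pattern refers back to the first capture group.
m = /[csh](..) [csh]\1 in/.match(text)
whole_match = m[0]   # the entire match, i.e. what \0 stands for in a substitution
first_group = m[1]   # the text captured by the first group of parentheses

# \0 in a replacement string stands for the entire match.
pluralised = text.gsub(/[csh]at/, '\0s')

# \K keeps the text matched before it out of the reported match,
# much like the positive lookbehind on the next line.
with_keep       = "abc"[/ab\Kc/]
with_lookbehind = "abc"[/(?<=ab)c/]

puts whole_match      # cat sat in
puts first_group      # at
puts pluralised       # The cats sats in the hats
puts with_keep        # c
puts with_lookbehind  # c
```

The single-quoted replacement string `'\0s'` matters: in a double-quoted string the backslash sequence would be interpreted by Ruby before `gsub` sees it.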