Grouping Constructs
Grouping constructs delineate subexpressions of a regular expression and typically capture substrings of an input string. The following table describes the regular expression grouping constructs.
Grouping construct | Description |
---|---|
(subexpression) | Captures the matched subexpression (or acts as a noncapturing group when the ExplicitCapture option is set; for more information, see Regular Expression Options). Captures using () are numbered automatically based on the order of the opening parenthesis, starting from one. The first capture, capture element number zero, is the text matched by the whole regular expression pattern. |
(?<name> subexpression) | Captures the matched subexpression into a group name or number name. The string used for name must not contain any punctuation and cannot begin with a number. You can use single quotes instead of angle brackets; for example, (?'name' subexpression). |
(?<name1-name2> subexpression) | (Balancing group definition.) Deletes the definition of the previously defined group name2 and stores in group name1 the interval between the previously defined name2 group and the current group. If no group name2 is defined, the match backtracks. Because deleting the last definition of name2 reveals the previous definition of name2, this construct allows the stack of captures for group name2 to be used as a counter for keeping track of nested constructs such as parentheses. In this construct, name1 is optional. You can use single quotes instead of angle brackets; for example, (?'name1-name2' subexpression). For more information, see the example in this topic. |
(?: subexpression) | (Noncapturing group.) Does not capture the substring matched by the subexpression. |
(?imnsx-imnsx: subexpression) | Applies or disables the specified options within the subexpression. |
(?= subexpression) | (Zero-width positive lookahead assertion.) Continues the match only if the subexpression matches at this position on the right. |
(?! subexpression) | (Zero-width negative lookahead assertion.) Continues the match only if the subexpression does not match at this position on the right. |
(?<= subexpression) | (Zero-width positive lookbehind assertion.) Continues the match only if the subexpression matches at this position on the left. |
(?<! subexpression) | (Zero-width negative lookbehind assertion.) Continues the match only if the subexpression does not match at this position on the left. |
(?> subexpression) | (Nonbacktracking subexpression, also known as a "greedy" subexpression.) The subexpression is fully matched once, and then does not participate piecemeal in backtracking. (That is, the subexpression matches only strings that would be matched by the subexpression alone.) By default, if a match does not succeed, backtracking searches for other possible matches. If you know backtracking cannot succeed, you can use a nonbacktracking subexpression to prevent unnecessary searching, which improves performance. |
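As a rough illustration of several constructs in the table above, the following C# sketch exercises named, noncapturing, inline-option, lookaround, and nonbacktracking groups. The class name GroupingSketch, its method names, the patterns, and the sample inputs are all hypothetical choices made for this illustration; only Regex, Match, and their members are actual library calls.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; method names, patterns, and inputs are illustrative only.
static class GroupingSketch
{
    // (?<name>...) stores each capture under an explicit name.
    public static string NamedGroups(string input)
    {
        Match m = Regex.Match(input, @"(?<year>\d{4})-(?<month>\d{2})");
        return m.Groups["month"].Value + "/" + m.Groups["year"].Value;
    }

    // (?:...) groups the alternation without capturing, so only groups 0 and 1 exist.
    public static int GroupCount()
    {
        return new Regex(@"(?:abc|xyz)(\d+)").GetGroupNumbers().Length;
    }

    // (?i:...) ignores case inside the group only; "def" stays case-sensitive.
    public static bool InlineOption(string input)
    {
        return Regex.IsMatch(input, @"^(?i:abc)def$");
    }

    // (?=...) requires a digit to the right of the letters without consuming it.
    public static string Lookahead(string input)
    {
        return Regex.Match(input, @"[a-z]+(?=\d)").Value;
    }

    // (?!...) rejects words that begin with "un".
    public static string NegativeLookahead(string input)
    {
        return Regex.Match(input, @"\b(?!un)[a-z]+\b").Value;
    }

    // (?<=...) requires "19" immediately to the left of the two digits.
    public static string Lookbehind(string input)
    {
        return Regex.Match(input, @"(?<=19)\d{2}").Value;
    }

    // (?<!...) skips a number that directly follows a dollar sign.
    public static string NegativeLookbehind(string input)
    {
        return Regex.Match(input, @"(?<!\$)\b\d+").Value;
    }

    // (?>...) keeps every 'a' it matched, so the trailing "ab" cannot be satisfied.
    public static bool Atomic(string input)
    {
        return Regex.IsMatch(input, @"^(?>a+)ab$");
    }

    static void Main()
    {
        Console.WriteLine(NamedGroups("2024-05"));         // 05/2024
        Console.WriteLine(GroupCount());                   // 2
        Console.WriteLine(InlineOption("ABCdef"));         // True
        Console.WriteLine(Lookahead("abc1 def"));          // abc
        Console.WriteLine(NegativeLookahead("undo redo")); // redo
        Console.WriteLine(Lookbehind("1985"));             // 85
        Console.WriteLine(NegativeLookbehind("$10 20"));   // 20
        Console.WriteLine(Atomic("aaab"));                 // False
    }
}
```

The last call is the one that shows the nonbacktracking behavior: the same pattern written with an ordinary group, ^(a+)ab$, would match "aaab" because the engine could give back one 'a'.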
Named captures are numbered sequentially, based on the left-to-right order of the opening parenthesis (like unnamed captures), but the numbering of named captures starts after all unnamed captures have been counted. For example, the pattern ((?<One>abc)\d+)?(?<Two>xyz)(.*) produces the following capturing groups by number and name. (The first capture, number 0, always refers to the entire pattern.)
Number | Name | Pattern |
---|---|---|
0 | 0 (default name) | ((?<One>abc)\d+)?(?<Two>xyz)(.*) |
1 | 1 (default name) | ((?<One>abc)\d+) |
2 | 2 (default name) | (.*) |
3 | One | (?<One>abc) |
4 | Two | (?<Two>xyz) |
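The numbering rule can be checked at run time. The sketch below assumes the intended pattern uses \d+ for the digit run and lists each group number with its name through Regex.GetGroupNumbers and Regex.GroupNameFromNumber; the class name GroupNumberingSketch and the Describe method are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; only Regex and its members are library calls.
static class GroupNumberingSketch
{
    // Returns one "number = name" line per capturing group in the pattern.
    public static string[] Describe(string pattern)
    {
        var regex = new Regex(pattern);
        int[] numbers = regex.GetGroupNumbers();
        var lines = new string[numbers.Length];
        for (int i = 0; i < numbers.Length; i++)
        {
            lines[i] = numbers[i] + " = " + regex.GroupNameFromNumber(numbers[i]);
        }
        return lines;
    }

    static void Main()
    {
        // Unnamed groups are numbered first (1 and 2), then the named ones (3 and 4).
        foreach (string line in Describe(@"((?<One>abc)\d+)?(?<Two>xyz)(.*)"))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // 0 = 0
        // 1 = 1
        // 2 = 2
        // 3 = One
        // 4 = Two
    }
}
```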
Balancing Group Definition Example
The following code example demonstrates using a balancing group definition to match left and right angle brackets (<>) in an input string. The capture collections of the Open and Close groups in the example are used like a stack to track matching pairs of angle brackets: each captured left angle bracket is pushed into the capture collection of the Open group; each captured right angle bracket is pushed into the capture collection of the Close group; and the balancing group definition ensures there is a matching right angle bracket for each left angle bracket.
' This code example demonstrates using the balancing group definition feature of
' regular expressions to match balanced left angle bracket (<) and right angle
' bracket (>) characters in a string.
Imports System
Imports System.Text.RegularExpressions

Class Sample
    Public Shared Sub Main()
        '
        ' The following expression matches all balanced left and right angle brackets (<>).
        ' The expression:
        ' 1) Matches and discards zero or more non-angle bracket characters.
        ' 2) Matches zero or more of:
        '    2a) One or more of:
        '        2a1) A group named "Open" that matches a left angle bracket, followed by zero
        '             or more non-angle bracket characters.
        '             "Open" essentially counts the number of left angle brackets.
        '    2b) One or more of:
        '        2b1) A balancing group named "Close" that matches a right angle bracket,
        '             followed by zero or more non-angle bracket characters.
        '             "Close" essentially counts the number of right angle brackets.
        ' 3) If the "Open" group contains an unaccounted for left angle bracket, the
        '    entire regular expression fails.
        '
        Dim pattern As String = "^[^<>]*" & _
                                "(" & "((?'Open'<)[^<>]*)+" & _
                                "((?'Close-Open'>)[^<>]*)+" & ")*" & _
                                "(?(Open)(?!))$"
        Dim input As String = "<abc><mno<xyz>>"
        '
        Dim m As Match = Regex.Match(input, pattern)
        If m.Success Then
            Console.WriteLine("Input: ""{0}"" " & vbCrLf & "Match: ""{1}""", _
                              input, m)
        Else
            Console.WriteLine("Match failed.")
        End If
    End Sub 'Main
End Class 'Sample
'This code example produces the following results:
'
'Input: "<abc><mno<xyz>>"
'Match: "<abc><mno<xyz>>"
'
// This code example demonstrates using the balancing group definition feature of
// regular expressions to match balanced left angle bracket (<) and right angle
// bracket (>) characters in a string.
using System;
using System.Text.RegularExpressions;

class Sample
{
    public static void Main()
    {
        /*
        The following expression matches all balanced left and right angle brackets (<>).
        The expression:
        1) Matches and discards zero or more non-angle bracket characters.
        2) Matches zero or more of:
           2a) One or more of:
               2a1) A group named "Open" that matches a left angle bracket, followed by zero
                    or more non-angle bracket characters.
                    "Open" essentially counts the number of left angle brackets.
           2b) One or more of:
               2b1) A balancing group named "Close" that matches a right angle bracket,
                    followed by zero or more non-angle bracket characters.
                    "Close" essentially counts the number of right angle brackets.
        3) If the "Open" group contains an unaccounted for left angle bracket, the
           entire regular expression fails.
        */
        string pattern = "^[^<>]*" +
                         "(" +
                           "((?'Open'<)[^<>]*)+" +
                           "((?'Close-Open'>)[^<>]*)+" +
                         ")*" +
                         "(?(Open)(?!))$";
        string input = "<abc><mno<xyz>>";
        //
        Match m = Regex.Match(input, pattern);
        if (m.Success)
            Console.WriteLine("Input: \"{0}\" \nMatch: \"{1}\"", input, m);
        else
            Console.WriteLine("Match failed.");
    }
}
/*
This code example produces the following results:
Input: "<abc><mno<xyz>>"
Match: "<abc><mno<xyz>>"
*/
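Because the balancing group stores in Close the interval between each popped Open capture and the current right angle bracket, the Close capture collection ends up holding the text enclosed by each matched pair. The following sketch reuses the pattern from the example above to list those captures; the class name BalancedCapturesSketch and the CloseCaptures method are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the pattern is the one used in the example above.
static class BalancedCapturesSketch
{
    const string Pattern =
        @"^[^<>]*(((?'Open'<)[^<>]*)+((?'Close-Open'>)[^<>]*)+)*(?(Open)(?!))$";

    // Returns the text enclosed by each balanced pair, in the order the pairs were closed.
    // An empty list means the input did not match (for example, unbalanced brackets).
    public static List<string> CloseCaptures(string input)
    {
        var result = new List<string>();
        Match m = Regex.Match(input, Pattern);
        if (!m.Success)
        {
            return result;
        }
        foreach (Capture c in m.Groups["Close"].Captures)
        {
            result.Add(c.Value);
        }
        return result;
    }

    static void Main()
    {
        foreach (string inner in CloseCaptures("<abc><mno<xyz>>"))
        {
            Console.WriteLine(inner);
        }
        // Prints:
        // abc
        // xyz
        // mno<xyz>
    }
}
```

The inner pair <xyz> is reported before the outer pair because its right angle bracket is reached first, which is the stack-like behavior the topic describes.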