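Regular Expression Options
By default, the comparison of an input string with any literal characters in a regular expression pattern is case sensitive, white space in a regular expression pattern is interpreted as literal white-space characters, and capturing groups in a regular expression are named implicitly as well as explicitly. You can modify these and several other aspects of default regular expression behavior by specifying regular expression options. These options, which are listed in the following table, can be included inline as part of the regular expression pattern, or they can be supplied to a System.Text.RegularExpressions.Regex class constructor or static pattern-matching method as a System.Text.RegularExpressions.RegexOptions enumeration value.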
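RegexOptions member | Inline character | Effect |
---|---|---|
None | Not available | Use default behavior. For more information, see Default Options. |
IgnoreCase | i | Use case-insensitive matching. For more information, see Case-Insensitive Matching. |
Multiline | m | Use multiline mode, where ^ and $ match the beginning and end of each line (instead of the beginning and end of the input string). For more information, see Multiline Mode. |
Singleline | s | Use single-line mode, where the period (.) matches every character (instead of every character except \n). For more information, see Singleline Mode. |
ExplicitCapture | n | Do not capture unnamed groups. The only valid captures are explicitly named or numbered groups of the form (?<name> subexpression). For more information, see Explicit Captures Only. |
Compiled | Not available | Compile the regular expression to an assembly. For more information, see Compiled Regular Expressions. |
IgnorePatternWhitespace | x | Exclude unescaped white space from the pattern, and enable comments after a number sign (#). For more information, see Ignore White Space. |
RightToLeft | Not available | Change the search direction. Search moves from right to left instead of from left to right. For more information, see Right-to-Left Mode. |
ECMAScript | Not available | Enable ECMAScript-compliant behavior for the expression. For more information, see ECMAScript Matching Behavior. |
CultureInvariant | Not available | Ignore cultural differences in language. For more information, see Comparison Using the Invariant Culture. |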
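Specifying the Options
You can specify options for regular expressions in one of three ways:
In the options parameter of a System.Text.RegularExpressions.Regex class constructor or static (Shared in Visual Basic) pattern-matching method, such as Regex.Regex(String, RegexOptions) or Regex.Match(String, String, RegexOptions). The options parameter is a bitwise OR combination of System.Text.RegularExpressions.RegexOptions enumeration values.
The following example provides an illustration. It uses the options parameter of the Regex.Matches(String, String, RegexOptions) method to enable case-insensitive matching and to ignore pattern white space when identifying words that begin with the letter "d".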
Dim pattern As String = "d \w+ \s"
Dim input As String = "Dogs are decidedly good pets."
Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.IgnorePatternWhitespace
For Each match As Match In Regex.Matches(input, pattern, options)
    Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index)
Next
' The example displays the following output:
'    'Dogs ' found at index 0.
'    'decidedly ' found at index 9.
string pattern = @"d \w+ \s";
string input = "Dogs are decidedly good pets.";
RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace;
foreach (Match match in Regex.Matches(input, pattern, options))
    Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index);
// The example displays the following output:
//    'Dogs ' found at index 0.
//    'decidedly ' found at index 9.
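By applying inline options in a regular expression pattern with the syntax (?imnsx-imnsx). The option applies to the pattern from the point at which the option is defined to either the end of the pattern or the point at which the option is undefined by another inline option. For more information, see the Miscellaneous Constructs topic.
The following example provides an illustration. It uses inline options to enable case-insensitive matching and to ignore pattern white space when identifying words that begin with the letter "d".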
Dim pattern As String = "\b(?ix) d \w+ \s"
Dim input As String = "Dogs are decidedly good pets."
For Each match As Match In Regex.Matches(input, pattern)
    Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index)
Next
' The example displays the following output:
'    'Dogs ' found at index 0.
'    'decidedly ' found at index 9.
string pattern = @"\b(?ix) d \w+ \s";
string input = "Dogs are decidedly good pets.";
foreach (Match match in Regex.Matches(input, pattern))
    Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index);
// The example displays the following output:
//    'Dogs ' found at index 0.
//    'decidedly ' found at index 9.
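By applying inline options in a particular grouping construct in a regular expression pattern with the syntax (?imnsx-imnsx:subexpression). No sign before a set of options turns the set on; a minus sign before a set of options turns the set off. (The question mark, ?, is a fixed part of the language construct's syntax and is required whether options are enabled or disabled.) The option applies only to that group. For more information, see Grouping Constructs.
The following example provides an illustration. It uses inline options in a grouping construct to enable case-insensitive matching and to ignore pattern white space when identifying words that begin with the letter "d".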
Dim pattern As String = "\b(?ix: d \w+)\s"
Dim input As String = "Dogs are decidedly good pets."
For Each match As Match In Regex.Matches(input, pattern)
    Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index)
Next
' The example displays the following output:
'    'Dogs ' found at index 0.
'    'decidedly ' found at index 9.
string pattern = @"\b(?ix: d \w+)\s";
string input = "Dogs are decidedly good pets.";
foreach (Match match in Regex.Matches(input, pattern))
    Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index);
// The example displays the following output:
//    'Dogs ' found at index 0.
//    'decidedly ' found at index 9.
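If options are specified inline, a minus sign (-) before an option or set of options turns those options off. For example, the inline construct (?ix-ms) turns on the RegexOptions.IgnoreCase and RegexOptions.IgnorePatternWhitespace options and turns off the RegexOptions.Multiline and RegexOptions.Singleline options. All regular expression options are turned off by default.
Note |
---|
If the regular expression options specified in the options parameter of a constructor or method call conflict with the options specified inline in a regular expression pattern, the inline options are used. |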
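The precedence rule in the preceding note can be sketched as follows. This is a minimal illustration rather than part of the original sample set: the class name InlineOverrideExample and the helper CountMatches are hypothetical, and the input string is borrowed from the case-insensitive matching examples in this topic.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineOverrideExample
{
    // Hypothetical helper: always passes RegexOptions.IgnoreCase in the
    // options parameter and returns the number of matches.
    public static int CountMatches(string pattern, string input)
    {
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        string input = "The man then told them about that event.";

        // No inline option: the options parameter applies, so "The" matches too.
        Console.WriteLine(CountMatches(@"\bthe\w*\b", input));

        // (?-i) conflicts with the options parameter; the inline option wins,
        // so the comparison is case sensitive and "The" is skipped.
        Console.WriteLine(CountMatches(@"(?-i)\bthe\w*\b", input));
    }
}
// The example displays the following output:
//    3
//    2
```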
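The following five regular expression options can be set both with the options parameter and inline:
RegexOptions.IgnoreCase
RegexOptions.Multiline
RegexOptions.Singleline
RegexOptions.ExplicitCapture
RegexOptions.IgnorePatternWhitespace
The following five regular expression options can be set with the options parameter but cannot be set inline:
RegexOptions.None
RegexOptions.Compiled
RegexOptions.RightToLeft
RegexOptions.CultureInvariant
RegexOptions.ECMAScript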
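Determining the Options
You can determine which options were provided to a Regex object when it was instantiated by retrieving the value of the read-only Regex.Options property. This property is particularly useful for determining the options that are defined for a compiled regular expression created by the Regex.CompileToAssembly method.
To test for the presence of any option except RegexOptions.None, perform an AND operation with the value of the Regex.Options property and the RegexOptions value in which you are interested. Then test whether the result equals that RegexOptions value. The following example tests whether the RegexOptions.IgnoreCase option has been set.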
If (rgx.Options And RegexOptions.IgnoreCase) = RegexOptions.IgnoreCase Then
Console.WriteLine("Case-insensitive pattern comparison.")
Else
Console.WriteLine("Case-sensitive pattern comparison.")
End If
if ((rgx.Options & RegexOptions.IgnoreCase) == RegexOptions.IgnoreCase)
Console.WriteLine("Case-insensitive pattern comparison.");
else
Console.WriteLine("Case-sensitive pattern comparison.");
To test for RegexOptions.None, determine whether the value of the Regex.Options property is equal to RegexOptions.None, as the following example illustrates.
If rgx.Options = RegexOptions.None Then
Console.WriteLine("No options have been set.")
End If
if (rgx.Options == RegexOptions.None)
Console.WriteLine("No options have been set.");
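The same AND test extends to a combination of options: combine the flags of interest with a bitwise OR first, and then compare the masked value with that combination. The following is a small sketch of that idea; the class OptionsTestExample and the helper HasAll are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsTestExample
{
    // Hypothetical helper: true only if every flag in 'required' is set
    // in the Options property of the Regex object.
    public static bool HasAll(Regex rgx, RegexOptions required)
    {
        return (rgx.Options & required) == required;
    }

    public static void Main()
    {
        Regex rgx = new Regex(@"\bthe\w*\b",
                              RegexOptions.IgnoreCase | RegexOptions.Multiline);

        // Both flags were supplied to the constructor.
        Console.WriteLine(HasAll(rgx, RegexOptions.IgnoreCase | RegexOptions.Multiline));

        // Singleline was not supplied, so the combined test fails.
        Console.WriteLine(HasAll(rgx, RegexOptions.IgnoreCase | RegexOptions.Singleline));
    }
}
// The example displays the following output:
//    True
//    False
```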
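The following sections list the options supported by regular expressions in the .NET Framework.
Default Options
The RegexOptions.None option indicates that no options have been specified and that the regular expression engine uses its default behavior. This includes the following:
The pattern is interpreted as a canonical rather than an ECMAScript regular expression.
The regular expression pattern is matched in the input string from left to right.
Comparisons are case sensitive.
The ^ and $ language elements match the beginning and end of the input string.
The . language element matches every character except \n.
Any white space in a regular expression pattern is interpreted as a literal space character.
The conventions of the current culture are used when comparing the pattern to the input string.
Capturing groups in the regular expression pattern are implicit as well as explicit.
Note |
---|
The RegexOptions.None option has no inline equivalent. When regular expression options are applied inline, the default behavior is restored on an option-by-option basis by turning a particular option off. For example, (?i) turns on case-insensitive comparison, and (?-i) restores the default case-sensitive comparison. |
Because the RegexOptions.None option represents the default behavior of the regular expression engine, it is rarely specified explicitly in a method call. A constructor or static pattern-matching method without an options parameter is called instead.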
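Case-Insensitive Matching
The IgnoreCase option, or the i inline option, provides case-insensitive matching. By default, the casing conventions of the current culture are used.
The following example defines a regular expression pattern, \bthe\w*\b, that matches all words starting with "the". Because the first call to the Matches method uses the default case-sensitive comparison, the output indicates that the string "The" that begins the sentence is not matched. It is matched when the Matches method is called with options set to IgnoreCase.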
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern As String = "\bthe\w*\b"
Dim input As String = "The man then told them about that event."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("Found {0} at index {1}.", match.Value, match.Index)
Next
Console.WriteLine()
For Each match As Match In Regex.Matches(input, pattern, _
RegexOptions.IgnoreCase)
Console.WriteLine("Found {0} at index {1}.", match.Value, match.Index)
Next
End Sub
End Module
' The example displays the following output:
' Found then at index 8.
' Found them at index 18.
'
' Found The at index 0.
' Found then at index 8.
' Found them at index 18.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"\bthe\w*\b";
string input = "The man then told them about that event.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("Found {0} at index {1}.", match.Value, match.Index);
Console.WriteLine();
foreach (Match match in Regex.Matches(input, pattern,
RegexOptions.IgnoreCase))
Console.WriteLine("Found {0} at index {1}.", match.Value, match.Index);
}
}
// The example displays the following output:
// Found then at index 8.
// Found them at index 18.
//
// Found The at index 0.
// Found then at index 8.
// Found them at index 18.
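The following example modifies the regular expression pattern from the previous example to use inline options instead of the options parameter to provide case-insensitive comparison. The first pattern defines the case-insensitive option in a grouping construct that applies only to the letter "t" in the string "the". Because the option construct occurs at the beginning of the pattern, the second pattern applies the case-insensitive option to the entire regular expression.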
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern As String = "\b(?i:t)he\w*\b"
Dim input As String = "The man then told them about that event."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("Found {0} at index {1}.", match.Value, match.Index)
Next
Console.WriteLine()
pattern = "(?i)\bthe\w*\b"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("Found {0} at index {1}.", match.Value, match.Index)
Next
End Sub
End Module
' The example displays the following output:
' Found The at index 0.
' Found then at index 8.
' Found them at index 18.
'
' Found The at index 0.
' Found then at index 8.
' Found them at index 18.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"\b(?i:t)he\w*\b";
string input = "The man then told them about that event.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("Found {0} at index {1}.", match.Value, match.Index);
Console.WriteLine();
pattern = @"(?i)\bthe\w*\b";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("Found {0} at index {1}.", match.Value, match.Index);
}
}
// The example displays the following output:
// Found The at index 0.
// Found then at index 8.
// Found them at index 18.
//
// Found The at index 0.
// Found then at index 8.
// Found them at index 18.
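Multiline Mode
The RegexOptions.Multiline option, or the m inline option, enables the regular expression engine to handle an input string that consists of multiple lines. It changes the interpretation of the ^ and $ language elements so that they match the beginning and end of a line, instead of the beginning and end of the input string.
By default, $ matches only the end of the input string. If you specify the RegexOptions.Multiline option, it matches either the newline character (\n) or the end of the input string. It does not, however, match the carriage return/line feed character combination. To match it successfully, use the subexpression \r?$ instead of just $.
The following example extracts bowlers' names and scores and adds them to a SortedList<TKey, TValue> collection that sorts them in descending order. The Matches method is called twice. In the first method call, the regular expression is ^(\w+)\s(\d+)$ and no options are set. As the output shows, no matches are found because the regular expression engine cannot match the pattern against both the beginning and the end of the input string. In the second method call, the regular expression is changed to ^(\w+)\s(\d+)\r*$ and the options are set to RegexOptions.Multiline. As the output shows, the names and scores are successfully matched, and the scores are displayed in descending order.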
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim scores As New SortedList(Of Integer, String)(New DescendingComparer(Of Integer)())
Dim input As String = "Joe 164" + vbCrLf + _
"Sam 208" + vbCrLf + _
"Allison 211" + vbCrLf + _
"Gwen 171" + vbCrLf
Dim pattern As String = "^(\w+)\s(\d+)$"
Dim matched As Boolean = False
Console.WriteLine("Without Multiline option:")
For Each match As Match In Regex.Matches(input, pattern)
scores.Add(CInt(match.Groups(2).Value), match.Groups(1).Value)
matched = True
Next
If Not matched Then Console.WriteLine(" No matches.")
Console.WriteLine()
' Redefine pattern to handle multiple lines.
pattern = "^(\w+)\s(\d+)\r*$"
Console.WriteLine("With multiline option:")
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.Multiline)
scores.Add(CInt(match.Groups(2).Value), match.Groups(1).Value)
Next
' List scores in descending order.
For Each score As KeyValuePair(Of Integer, String) In scores
Console.WriteLine("{0}: {1}", score.Value, score.Key)
Next
End Sub
End Module
Public Class DescendingComparer(Of T) : Implements IComparer(Of T)
Public Function Compare(x As T, y As T) As Integer _
Implements IComparer(Of T).Compare
Return Comparer(Of T).Default.Compare(x, y) * -1
End Function
End Class
' The example displays the following output:
' Without Multiline option:
' No matches.
'
' With multiline option:
' Allison: 211
' Sam: 208
' Gwen: 171
' Joe: 164
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
SortedList<int, string> scores = new SortedList<int, string>(new DescendingComparer<int>());
string input = "Joe 164\n" +
"Sam 208\n" +
"Allison 211\n" +
"Gwen 171\n";
string pattern = @"^(\w+)\s(\d+)$";
bool matched = false;
Console.WriteLine("Without Multiline option:");
foreach (Match match in Regex.Matches(input, pattern))
{
scores.Add(Int32.Parse(match.Groups[2].Value), (string) match.Groups[1].Value);
matched = true;
}
if (! matched)
Console.WriteLine(" No matches.");
Console.WriteLine();
// Redefine pattern to handle multiple lines.
pattern = @"^(\w+)\s(\d+)\r*$";
Console.WriteLine("With multiline option:");
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.Multiline))
scores.Add(Int32.Parse(match.Groups[2].Value), (string) match.Groups[1].Value);
// List scores in descending order.
foreach (KeyValuePair<int, string> score in scores)
Console.WriteLine("{0}: {1}", score.Value, score.Key);
}
}
public class DescendingComparer<T> : IComparer<T>
{
public int Compare(T x, T y)
{
return Comparer<T>.Default.Compare(x, y) * -1;
}
}
// The example displays the following output:
// Without Multiline option:
// No matches.
//
// With multiline option:
// Allison: 211
// Sam: 208
// Gwen: 171
// Joe: 164
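The regular expression pattern ^(\w+)\s(\d+)\r*$ is defined as shown in the following table.
Pattern | Description |
---|---|
^ | Begin at the start of the line. |
(\w+) | Match one or more word characters. This is the first capturing group. |
\s | Match a white-space character. |
(\d+) | Match one or more decimal digits. This is the second capturing group. |
\r* | Match zero or more carriage return characters. |
$ | End at the end of the line. |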
The following example is equivalent to the previous one, except that it uses the inline option (?m) to set the multiline option.
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim scores As New SortedList(Of Integer, String)(New DescendingComparer(Of Integer)())
Dim input As String = "Joe 164" + vbCrLf + _
"Sam 208" + vbCrLf + _
"Allison 211" + vbCrLf + _
"Gwen 171" + vbCrLf
Dim pattern As String = "(?m)^(\w+)\s(\d+)\r*$"
For Each match As Match In Regex.Matches(input, pattern)
scores.Add(CInt(match.Groups(2).Value), match.Groups(1).Value)
Next
' List scores in descending order.
For Each score As KeyValuePair(Of Integer, String) In scores
Console.WriteLine("{0}: {1}", score.Value, score.Key)
Next
End Sub
End Module
Public Class DescendingComparer(Of T) : Implements IComparer(Of T)
Public Function Compare(x As T, y As T) As Integer _
Implements IComparer(Of T).Compare
Return Comparer(Of T).Default.Compare(x, y) * -1
End Function
End Class
' The example displays the following output:
' Allison: 211
' Sam: 208
' Gwen: 171
' Joe: 164
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
SortedList<int, string> scores = new SortedList<int, string>(new DescendingComparer<int>());
string input = "Joe 164\n" +
"Sam 208\n" +
"Allison 211\n" +
"Gwen 171\n";
string pattern = @"(?m)^(\w+)\s(\d+)\r*$";
foreach (Match match in Regex.Matches(input, pattern))
scores.Add(Convert.ToInt32(match.Groups[2].Value), match.Groups[1].Value);
// List scores in descending order.
foreach (KeyValuePair<int, string> score in scores)
Console.WriteLine("{0}: {1}", score.Value, score.Key);
}
}
public class DescendingComparer<T> : IComparer<T>
{
public int Compare(T x, T y)
{
return Comparer<T>.Default.Compare(x, y) * -1;
}
}
// The example displays the following output:
// Allison: 211
// Sam: 208
// Gwen: 171
// Joe: 164
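Singleline Mode
The RegexOptions.Singleline option, or the s inline option, causes the regular expression engine to treat the input string as if it consists of a single line. It does this by changing the behavior of the period (.) language element so that it matches every character, instead of matching every character except the newline character, \n or \u000A.
The following example illustrates how the behavior of the . language element changes when you use the RegexOptions.Singleline option. The regular expression ^.+ starts at the beginning of the string and matches every character. By default, the match ends at the end of the first line; the regular expression pattern matches the carriage return character, \r or \u000D, but it does not match \n. Because the RegexOptions.Singleline option interprets the entire input string as a single line, it matches every character in the input string, including \n.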
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern As String = "^.+"
Dim input As String = "This is one line and" + vbCrLf + "this is the second."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine(Regex.Escape(match.Value))
Next
Console.WriteLine()
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.Singleline)
Console.WriteLine(Regex.Escape(match.Value))
Next
End Sub
End Module
' The example displays the following output:
' This\ is\ one\ line\ and\r
'
' This\ is\ one\ line\ and\r\nthis\ is\ the\ second\.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = "^.+";
string input = "This is one line and" + Environment.NewLine + "this is the second.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine(Regex.Escape(match.Value));
Console.WriteLine();
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.Singleline))
Console.WriteLine(Regex.Escape(match.Value));
}
}
// The example displays the following output:
// This\ is\ one\ line\ and\r
//
// This\ is\ one\ line\ and\r\nthis\ is\ the\ second\.
The following example is equivalent to the previous one, except that it uses the inline option (?s) to enable single-line mode.
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern As String = "(?s)^.+"
Dim input As String = "This is one line and" + vbCrLf + "this is the second."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine(Regex.Escape(match.Value))
Next
End Sub
End Module
' The example displays the following output:
' This\ is\ one\ line\ and\r\nthis\ is\ the\ second\.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = "(?s)^.+";
string input = "This is one line and" + Environment.NewLine + "this is the second.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine(Regex.Escape(match.Value));
}
}
// The example displays the following output:
// This\ is\ one\ line\ and\r\nthis\ is\ the\ second\.
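Explicit Captures Only
By default, capturing groups are defined by the use of parentheses in the regular expression pattern. Named groups are assigned a name or number by the (?<name> subexpression) language option, whereas unnamed groups are accessible by index. In the GroupCollection object, unnamed groups precede named groups.
Grouping constructs are often used only to apply quantifiers to multiple language elements, and the captured substrings are of no interest. For example, if the following regular expression: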
\b\(?((\w+),?\s?)+[\.!?]\)?
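is intended only to extract sentences that end with a period, exclamation point, or question mark from a document, only the resulting sentence (which is represented by the Match object) is of interest. The individual words in the collection are not.
Capturing groups that are not subsequently used can be expensive, because the regular expression engine must populate both the GroupCollection and CaptureCollection collection objects. As an alternative, you can use either the RegexOptions.ExplicitCapture option or the n inline option to specify that the only valid captures are explicitly named or numbered groups that are designated by the (?<name> subexpression) construct.
The following example displays information about the matches returned by the \b\(?((\w+),?\s?)+[\.!?]\)? regular expression pattern when the Matches method is called with and without the RegexOptions.ExplicitCapture option. As the output from the first method call shows, the regular expression engine fully populates the GroupCollection and CaptureCollection collection objects with information about captured substrings. Because the second method is called with options set to RegexOptions.ExplicitCapture, it does not capture information on groups.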
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "This is the first sentence. Is it the beginning " + _
"of a literary masterpiece? I think not. Instead, " + _
"it is a nonsensical paragraph."
Dim pattern As String = "\b\(?((\w+),?\s?)+[\.!?]\)?"
Console.WriteLine("With implicit captures:")
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("The match: {0}", match.Value)
Dim groupCtr As Integer = 0
For Each group As Group In match.Groups
Console.WriteLine(" Group {0}: {1}", groupCtr, group.Value)
groupCtr += 1
Dim captureCtr As Integer = 0
For Each capture As Capture In group.Captures
Console.WriteLine(" Capture {0}: {1}", captureCtr, capture.Value)
captureCtr += 1
Next
Next
Next
Console.WriteLine()
Console.WriteLine("With explicit captures only:")
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.ExplicitCapture)
Console.WriteLine("The match: {0}", match.Value)
Dim groupCtr As Integer = 0
For Each group As Group In match.Groups
Console.WriteLine(" Group {0}: {1}", groupCtr, group.Value)
groupCtr += 1
Dim captureCtr As Integer = 0
For Each capture As Capture In group.Captures
Console.WriteLine(" Capture {0}: {1}", captureCtr, capture.Value)
captureCtr += 1
Next
Next
Next
End Sub
End Module
' The example displays the following output:
' With implicit captures:
' The match: This is the first sentence.
' Group 0: This is the first sentence.
' Capture 0: This is the first sentence.
' Group 1: sentence
' Capture 0: This
' Capture 1: is
' Capture 2: the
' Capture 3: first
' Capture 4: sentence
' Group 2: sentence
' Capture 0: This
' Capture 1: is
' Capture 2: the
' Capture 3: first
' Capture 4: sentence
' The match: Is it the beginning of a literary masterpiece?
' Group 0: Is it the beginning of a literary masterpiece?
' Capture 0: Is it the beginning of a literary masterpiece?
' Group 1: masterpiece
' Capture 0: Is
' Capture 1: it
' Capture 2: the
' Capture 3: beginning
' Capture 4: of
' Capture 5: a
' Capture 6: literary
' Capture 7: masterpiece
' Group 2: masterpiece
' Capture 0: Is
' Capture 1: it
' Capture 2: the
' Capture 3: beginning
' Capture 4: of
' Capture 5: a
' Capture 6: literary
' Capture 7: masterpiece
' The match: I think not.
' Group 0: I think not.
' Capture 0: I think not.
' Group 1: not
' Capture 0: I
' Capture 1: think
' Capture 2: not
' Group 2: not
' Capture 0: I
' Capture 1: think
' Capture 2: not
' The match: Instead, it is a nonsensical paragraph.
' Group 0: Instead, it is a nonsensical paragraph.
' Capture 0: Instead, it is a nonsensical paragraph.
' Group 1: paragraph
' Capture 0: Instead,
' Capture 1: it
' Capture 2: is
' Capture 3: a
' Capture 4: nonsensical
' Capture 5: paragraph
' Group 2: paragraph
' Capture 0: Instead
' Capture 1: it
' Capture 2: is
' Capture 3: a
' Capture 4: nonsensical
' Capture 5: paragraph
'
' With explicit captures only:
' The match: This is the first sentence.
' Group 0: This is the first sentence.
' Capture 0: This is the first sentence.
' The match: Is it the beginning of a literary masterpiece?
' Group 0: Is it the beginning of a literary masterpiece?
' Capture 0: Is it the beginning of a literary masterpiece?
' The match: I think not.
' Group 0: I think not.
' Capture 0: I think not.
' The match: Instead, it is a nonsensical paragraph.
' Group 0: Instead, it is a nonsensical paragraph.
' Capture 0: Instead, it is a nonsensical paragraph.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string input = "This is the first sentence. Is it the beginning " +
"of a literary masterpiece? I think not. Instead, " +
"it is a nonsensical paragraph.";
string pattern = @"\b\(?((\w+),?\s?)+[\.!?]\)?";
Console.WriteLine("With implicit captures:");
foreach (Match match in Regex.Matches(input, pattern))
{
Console.WriteLine("The match: {0}", match.Value);
int groupCtr = 0;
foreach (Group group in match.Groups)
{
Console.WriteLine(" Group {0}: {1}", groupCtr, group.Value);
groupCtr++;
int captureCtr = 0;
foreach (Capture capture in group.Captures)
{
Console.WriteLine(" Capture {0}: {1}", captureCtr, capture.Value);
captureCtr++;
}
}
}
Console.WriteLine();
Console.WriteLine("With explicit captures only:");
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.ExplicitCapture))
{
Console.WriteLine("The match: {0}", match.Value);
int groupCtr = 0;
foreach (Group group in match.Groups)
{
Console.WriteLine(" Group {0}: {1}", groupCtr, group.Value);
groupCtr++;
int captureCtr = 0;
foreach (Capture capture in group.Captures)
{
Console.WriteLine(" Capture {0}: {1}", captureCtr, capture.Value);
captureCtr++;
}
}
}
}
}
// The example displays the following output:
// With implicit captures:
// The match: This is the first sentence.
// Group 0: This is the first sentence.
// Capture 0: This is the first sentence.
// Group 1: sentence
// Capture 0: This
// Capture 1: is
// Capture 2: the
// Capture 3: first
// Capture 4: sentence
// Group 2: sentence
// Capture 0: This
// Capture 1: is
// Capture 2: the
// Capture 3: first
// Capture 4: sentence
// The match: Is it the beginning of a literary masterpiece?
// Group 0: Is it the beginning of a literary masterpiece?
// Capture 0: Is it the beginning of a literary masterpiece?
// Group 1: masterpiece
// Capture 0: Is
// Capture 1: it
// Capture 2: the
// Capture 3: beginning
// Capture 4: of
// Capture 5: a
// Capture 6: literary
// Capture 7: masterpiece
// Group 2: masterpiece
// Capture 0: Is
// Capture 1: it
// Capture 2: the
// Capture 3: beginning
// Capture 4: of
// Capture 5: a
// Capture 6: literary
// Capture 7: masterpiece
// The match: I think not.
// Group 0: I think not.
// Capture 0: I think not.
// Group 1: not
// Capture 0: I
// Capture 1: think
// Capture 2: not
// Group 2: not
// Capture 0: I
// Capture 1: think
// Capture 2: not
// The match: Instead, it is a nonsensical paragraph.
// Group 0: Instead, it is a nonsensical paragraph.
// Capture 0: Instead, it is a nonsensical paragraph.
// Group 1: paragraph
// Capture 0: Instead,
// Capture 1: it
// Capture 2: is
// Capture 3: a
// Capture 4: nonsensical
// Capture 5: paragraph
// Group 2: paragraph
// Capture 0: Instead
// Capture 1: it
// Capture 2: is
// Capture 3: a
// Capture 4: nonsensical
// Capture 5: paragraph
//
// With explicit captures only:
// The match: This is the first sentence.
// Group 0: This is the first sentence.
// Capture 0: This is the first sentence.
// The match: Is it the beginning of a literary masterpiece?
// Group 0: Is it the beginning of a literary masterpiece?
// Capture 0: Is it the beginning of a literary masterpiece?
// The match: I think not.
// Group 0: I think not.
// Capture 0: I think not.
// The match: Instead, it is a nonsensical paragraph.
// Group 0: Instead, it is a nonsensical paragraph.
// Capture 0: Instead, it is a nonsensical paragraph.
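The regular expression pattern \b\(?((?>\w+),?\s?)+[\.!?]\)? is defined as shown in the following table. It differs from \b\(?((\w+),?\s?)+[\.!?]\)? only in that the inner group (?>\w+) is an atomic (nonbacktracking) group, which is not captured.
Pattern | Description |
---|---|
\b | Begin at a word boundary. |
\(? | Match zero or one occurrence of the opening parenthesis ("("). |
(?>\w+),? | Match one or more word characters, followed by zero or one comma. Do not backtrack when matching word characters. |
\s? | Match zero or one white-space character. |
((?>\w+),?\s?)+ | Match the combination of one or more word characters, zero or one comma, and zero or one white-space character one or more times. |
[\.!?]\)? | Match any of the three punctuation symbols, followed by zero or one closing parenthesis (")"). |
You can also use the (?n) inline element to suppress automatic captures. The following example modifies the previous regular expression pattern to use the (?n) inline element instead of the RegexOptions.ExplicitCapture option.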
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "This is the first sentence. Is it the beginning " + _
"of a literary masterpiece? I think not. Instead, " + _
"it is a nonsensical paragraph."
Dim pattern As String = "(?n)\b\(?((?>\w+),?\s?)+[\.!?]\)?"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("The match: {0}", match.Value)
Dim groupCtr As Integer = 0
For Each group As Group In match.Groups
Console.WriteLine(" Group {0}: {1}", groupCtr, group.Value)
groupCtr += 1
Dim captureCtr As Integer = 0
For Each capture As Capture In group.Captures
Console.WriteLine(" Capture {0}: {1}", captureCtr, capture.Value)
captureCtr += 1
Next
Next
Next
End Sub
End Module
' The example displays the following output:
' The match: This is the first sentence.
' Group 0: This is the first sentence.
' Capture 0: This is the first sentence.
' The match: Is it the beginning of a literary masterpiece?
' Group 0: Is it the beginning of a literary masterpiece?
' Capture 0: Is it the beginning of a literary masterpiece?
' The match: I think not.
' Group 0: I think not.
' Capture 0: I think not.
' The match: Instead, it is a nonsensical paragraph.
' Group 0: Instead, it is a nonsensical paragraph.
' Capture 0: Instead, it is a nonsensical paragraph.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string input = "This is the first sentence. Is it the beginning " +
"of a literary masterpiece? I think not. Instead, " +
"it is a nonsensical paragraph.";
string pattern = @"(?n)\b\(?((?>\w+),?\s?)+[\.!?]\)?";
foreach (Match match in Regex.Matches(input, pattern))
{
Console.WriteLine("The match: {0}", match.Value);
int groupCtr = 0;
foreach (Group group in match.Groups)
{
Console.WriteLine(" Group {0}: {1}", groupCtr, group.Value);
groupCtr++;
int captureCtr = 0;
foreach (Capture capture in group.Captures)
{
Console.WriteLine(" Capture {0}: {1}", captureCtr, capture.Value);
captureCtr++;
}
}
}
}
}
// The example displays the following output:
// The match: This is the first sentence.
// Group 0: This is the first sentence.
// Capture 0: This is the first sentence.
// The match: Is it the beginning of a literary masterpiece?
// Group 0: Is it the beginning of a literary masterpiece?
// Capture 0: Is it the beginning of a literary masterpiece?
// The match: I think not.
// Group 0: I think not.
// Capture 0: I think not.
// The match: Instead, it is a nonsensical paragraph.
// Group 0: Instead, it is a nonsensical paragraph.
// Capture 0: Instead, it is a nonsensical paragraph.
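Finally, you can use the inline group element (?n:) to suppress automatic captures on a group-by-group basis. The following example modifies the previous pattern to suppress unnamed captures in the outer group, ((?>\w+),?\s?). Note that this also suppresses unnamed captures in any group nested inside it.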
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "This is the first sentence. Is it the beginning " + _
"of a literary masterpiece? I think not. Instead, " + _
"it is a nonsensical paragraph."
Dim pattern As String = "\b\(?(?n:(?>\w+),?\s?)+[\.!?]\)?"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("The match: {0}", match.Value)
Dim groupCtr As Integer = 0
For Each group As Group In match.Groups
Console.WriteLine(" Group {0}: {1}", groupCtr, group.Value)
groupCtr += 1
Dim captureCtr As Integer = 0
For Each capture As Capture In group.Captures
Console.WriteLine(" Capture {0}: {1}", captureCtr, capture.Value)
captureCtr += 1
Next
Next
Next
End Sub
End Module
' The example displays the following output:
' The match: This is the first sentence.
' Group 0: This is the first sentence.
' Capture 0: This is the first sentence.
' The match: Is it the beginning of a literary masterpiece?
' Group 0: Is it the beginning of a literary masterpiece?
' Capture 0: Is it the beginning of a literary masterpiece?
' The match: I think not.
' Group 0: I think not.
' Capture 0: I think not.
' The match: Instead, it is a nonsensical paragraph.
' Group 0: Instead, it is a nonsensical paragraph.
' Capture 0: Instead, it is a nonsensical paragraph.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string input = "This is the first sentence. Is it the beginning " +
"of a literary masterpiece? I think not. Instead, " +
"it is a nonsensical paragraph.";
string pattern = @"\b\(?(?n:(?>\w+),?\s?)+[\.!?]\)?";
foreach (Match match in Regex.Matches(input, pattern))
{
Console.WriteLine("The match: {0}", match.Value);
int groupCtr = 0;
foreach (Group group in match.Groups)
{
Console.WriteLine(" Group {0}: {1}", groupCtr, group.Value);
groupCtr++;
int captureCtr = 0;
foreach (Capture capture in group.Captures)
{
Console.WriteLine(" Capture {0}: {1}", captureCtr, capture.Value);
captureCtr++;
}
}
}
}
}
// The example displays the following output:
// The match: This is the first sentence.
// Group 0: This is the first sentence.
// Capture 0: This is the first sentence.
// The match: Is it the beginning of a literary masterpiece?
// Group 0: Is it the beginning of a literary masterpiece?
// Capture 0: Is it the beginning of a literary masterpiece?
// The match: I think not.
// Group 0: I think not.
// Capture 0: I think not.
// The match: Instead, it is a nonsensical paragraph.
// Group 0: Instead, it is a nonsensical paragraph.
// Capture 0: Instead, it is a nonsensical paragraph.
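The examples in this section show unnamed groups being dropped. The complementary case, in which a named group is still captured under RegexOptions.ExplicitCapture, can be sketched as follows. The class NamedCaptureExample, the helper LastWords, and the group name "word" are hypothetical and were chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedCaptureExample
{
    // Hypothetical helper: returns the last word of each sentence by reading
    // the named group "word". The outer unnamed group is not captured.
    public static List<string> LastWords(string input)
    {
        string pattern = @"\b\(?((?<word>\w+),?\s?)+[\.!?]\)?";
        List<string> words = new List<string>();
        foreach (Match match in Regex.Matches(input, pattern, RegexOptions.ExplicitCapture))
        {
            // Group 0 (the whole match) plus the named group: two groups in total.
            Console.WriteLine("{0} groups, last word: {1}",
                              match.Groups.Count, match.Groups["word"].Value);
            words.Add(match.Groups["word"].Value);
        }
        return words;
    }

    public static void Main()
    {
        LastWords("I think not. Instead, it is a nonsensical paragraph.");
    }
}
// The example displays the following output:
//    2 groups, last word: not
//    2 groups, last word: paragraph
```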
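Compiled Regular Expressions
By default, regular expressions in the .NET Framework are interpreted. When a Regex object is instantiated or a static Regex method is called, the regular expression pattern is parsed into a set of custom opcodes, and an interpreter uses these opcodes to run the regular expression. This involves a tradeoff: the cost of initializing the regular expression engine is minimized at the expense of run-time performance.
You can use compiled instead of interpreted regular expressions by using the RegexOptions.Compiled option. In this case, when a pattern is passed to the regular expression engine, it is parsed into a set of opcodes and then converted to Microsoft intermediate language (MSIL), which can be passed directly to the common language runtime. Compiled regular expressions maximize run-time performance at the expense of initialization time.
Note |
---|
A regular expression can be compiled only by supplying the RegexOptions.Compiled value to the options parameter of a Regex class constructor or a static pattern-matching method. It is not available as an inline option. |
You can use compiled regular expressions in calls to both static and instance regular expressions. In static regular expressions, the RegexOptions.Compiled option is passed to the options parameter of the regular expression pattern-matching method. In instance regular expressions, it is passed to the options parameter of the Regex class constructor. In both cases, it results in enhanced performance.
However, this improvement in performance occurs only under the following conditions:
A Regex object that represents a particular regular expression is used in multiple calls to regular expression pattern-matching methods.
The Regex object is not allowed to go out of scope, so it can be reused.
A static regular expression is used in multiple calls to regular expression pattern-matching methods. (The performance improvement is possible because regular expressions used in static method calls are cached by the regular expression engine.)
Note |
---|
The RegexOptions.Compiled option is unrelated to the Regex.CompileToAssembly method, which creates a special-purpose assembly that contains predefined compiled regular expressions. |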
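One way to meet the conditions listed above is to keep a single compiled Regex instance in a static field and reuse it for every call. The following sketch assumes that arrangement; the class CompiledRegexExample, the field WordsStartingWithThe, and the helper CountMatches are hypothetical names, and no performance figures are claimed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CompiledRegexExample
{
    // The compiled instance lives in a static field, so it never goes out of
    // scope and the one-time compilation cost is paid only once.
    private static readonly Regex WordsStartingWithThe =
        new Regex(@"\bthe\w*\b", RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Hypothetical helper: reuses the same compiled instance on every call.
    public static int CountMatches(string input)
    {
        return WordsStartingWithThe.Matches(input).Count;
    }

    public static void Main()
    {
        string[] lines = { "The man then told them about that event.",
                           "There is no theory here." };
        foreach (string line in lines)
            Console.WriteLine("{0} match(es): {1}", CountMatches(line), line);
    }
}
```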
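Ignore White Space
By default, white space in a regular expression pattern is significant; it forces the regular expression engine to match a white-space character in the input string. Because of this, the regular expressions "\b\w+\s" and "\b\w+ " are roughly equivalent. In addition, when the number sign (#) is encountered in a regular expression pattern, it is interpreted as a literal character to be matched.
The RegexOptions.IgnorePatternWhitespace option, or the x inline option, changes this default behavior as follows:
Unescaped white space in the regular expression pattern is ignored. To be part of a regular expression pattern, white-space characters must be escaped (for example, as \s or "\ ").
Important: White space within a character class is interpreted literally regardless of the use of the RegexOptions.IgnorePatternWhitespace option. For example, the regular expression pattern [ .,;:] matches a space character, period, comma, semicolon, or colon.
The number sign (#) is interpreted as the beginning of a comment, rather than as a literal character. All text in the regular expression pattern from the # character to either the next \n character or the end of the string is interpreted as a comment.
Enabling this option can help simplify regular expressions that are often difficult to parse and to understand. It improves readability and makes it possible to document a regular expression.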
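To show how this option helps document a pattern, the following sketch spreads a sentence-matching pattern over several lines of a C# verbatim string and gives each line its own comment. It assumes that a # comment ends at the next line break, so that each comment covers only its own line. The class CommentedPatternExample and the helper CountSentences are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentedPatternExample
{
    // Unescaped spaces and line breaks are ignored in x mode, and each
    // # comment is assumed to run only to the end of its own line.
    private const string SentencePattern = @"
        \b \(?                 # optional opening parenthesis at a word boundary
        ( (?>\w+) ,? \s? )+    # one or more words, each with an optional comma and space
        [\.!?] \)?             # closing punctuation, optional closing parenthesis
        ";

    // Hypothetical helper: counts the sentences matched by the pattern.
    public static int CountSentences(string input)
    {
        return Regex.Matches(input, SentencePattern,
                             RegexOptions.IgnorePatternWhitespace).Count;
    }

    public static void Main()
    {
        string input = "This is the first sentence. Is it the beginning " +
                       "of a literary masterpiece? I think not. Instead, " +
                       "it is a nonsensical paragraph.";
        Console.WriteLine(CountSentences(input));
    }
}
// The example displays the following output:
//    4
```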
The following example defines the following regular expression pattern:
\b \(? ( (?>\w+) ,? \s? )+ [\.!?] \)? # Matches an entire sentence.
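This pattern is similar to the pattern defined in the Explicit Captures Only section, except that it uses the RegexOptions.IgnorePatternWhitespace option to ignore pattern white space.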
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "This is the first sentence. Is it the beginning " + _
"of a literary masterpiece? I think not. Instead, " + _
"it is a nonsensical paragraph."
Dim pattern As String = "\b \(? ( (?>\w+) ,?\s? )+ [\.!?] \)? # Matches an entire sentence."
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnorePatternWhitespace)
Console.WriteLine(match.Value)
Next
End Sub
End Module
' The example displays the following output:
' This is the first sentence.
' Is it the beginning of a literary masterpiece?
' I think not.
' Instead, it is a nonsensical paragraph.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string input = "This is the first sentence. Is it the beginning " +
"of a literary masterpiece? I think not. Instead, " +
"it is a nonsensical paragraph.";
string pattern = @"\b \(? ( (?>\w+) ,?\s? )+ [\.!?] \)? # Matches an entire sentence.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnorePatternWhitespace))
Console.WriteLine(match.Value);
}
}
// The example displays the following output:
// This is the first sentence.
// Is it the beginning of a literary masterpiece?
// I think not.
// Instead, it is a nonsensical paragraph.
The following example uses the inline option (?x) to ignore pattern white space.
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "This is the first sentence. Is it the beginning " + _
"of a literary masterpiece? I think not. Instead, " + _
"it is a nonsensical paragraph."
Dim pattern As String = "(?x)\b \(? ( (?>\w+) ,?\s? )+ [\.!?] \)? # Matches an entire sentence."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine(match.Value)
Next
End Sub
End Module
' The example displays the following output:
' This is the first sentence.
' Is it the beginning of a literary masterpiece?
' I think not.
' Instead, it is a nonsensical paragraph.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string input = "This is the first sentence. Is it the beginning " +
"of a literary masterpiece? I think not. Instead, " +
"it is a nonsensical paragraph.";
string pattern = @"(?x)\b \(? ( (?>\w+) ,?\s? )+ [\.!?] \)? # Matches an entire sentence.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine(match.Value);
}
}
// The example displays the following output:
// This is the first sentence.
// Is it the beginning of a literary masterpiece?
// I think not.
// Instead, it is a nonsensical paragraph.
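Right-to-Left Mode
By default, the regular expression engine searches from left to right. You can reverse the search direction by using the RegexOptions.RightToLeft option. The search automatically begins at the last character position of the string. For pattern-matching methods that include a starting position parameter, such as Regex.Match(String, Int32), the starting position is the index of the rightmost character position at which the search is to begin.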
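Note |
---|
Right-to-left mode is available only by supplying the RegexOptions.RightToLeft value to the options parameter of a Regex class constructor or a static pattern-matching method. It is not available as an inline option. |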
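The RegexOptions.RightToLeft option changes the search direction only; it does not interpret the regular expression pattern from right to left. For example, the regular expression \bb\w+\s matches words that begin with the letter "b" and are followed by a white-space character. In the following example, the input string consists of three words that include one or more "b" characters. The first word begins with "b", the second ends with "b", and the third includes two "b" characters in the middle of the word. As the output from the example shows, only the first word matches the regular expression pattern.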
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim pattern As String = "\bb\w+\s"
Dim input As String = "builder rob rabble"
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.RightToLeft)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
End Sub
End Module
' The example displays the following output:
' 'builder ' found at position 0.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"\bb\w+\s";
string input = "builder rob rabble";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.RightToLeft))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
}
}
// The example displays the following output:
// 'builder ' found at position 0.
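To make the change in search direction more visible, the following sketch lists matches in the order the engine finds them and shows how a starting position is treated in right-to-left mode. The class name RightToLeftOrderExample and its helpers are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RightToLeftOrderExample
{
    // Returns match values in the order the engine reports them.
    public static string[] MatchValues(string input, string pattern, RegexOptions options)
    {
        MatchCollection matches = Regex.Matches(input, pattern, options);
        string[] values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            values[i] = matches[i].Value;
        return values;
    }

    // Searches right to left, beginning at the given character position.
    public static Match MatchFrom(string input, string pattern, int startAt)
    {
        Regex regex = new Regex(pattern, RegexOptions.RightToLeft);
        return regex.Match(input, startAt);
    }

    public static void Main()
    {
        string input = "one two three";
        // Left to right: one, two, three
        Console.WriteLine(string.Join(", ", MatchValues(input, @"\w+", RegexOptions.None)));
        // Right to left: three, two, one
        Console.WriteLine(string.Join(", ", MatchValues(input, @"\w+", RegexOptions.RightToLeft)));

        // Starting at index 5 and moving left, the first run of digits is "34".
        Match match = MatchFrom("12 34 56", @"\d+", 5);
        Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
    }
}
```

The pattern itself is unchanged in both calls; only the order in which positions are tried differs.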
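Also note that the lookahead assertion (the (?=subexpression) language element) and the lookbehind assertion (the (?<=subexpression) language element) do not change direction. Lookahead assertions look to the right; lookbehind assertions look to the left. For example, the regular expression (?<=\d{1,2}\s)\w+,?\s\d{4} uses the lookbehind assertion to test for a day number that precedes a month name. The regular expression then matches the month and the year. For more information on lookahead and lookbehind assertions, see Grouping Constructs.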
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim inputs() As String = { "1 May 1917", "June 16, 2003" }
Dim pattern As String = "(?<=\d{1,2}\s)\w+,?\s\d{4}"
For Each input As String In inputs
Dim match As Match = Regex.Match(input, pattern, RegexOptions.RightToLeft)
If match.Success Then
Console.WriteLine("The date occurs in {0}.", match.Value)
Else
Console.WriteLine("{0} does not match.", input)
End If
Next
End Sub
End Module
' The example displays the following output:
' The date occurs in May 1917.
' June 16, 2003 does not match.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string[] inputs = { "1 May 1917", "June 16, 2003" };
string pattern = @"(?<=\d{1,2}\s)\w+,?\s\d{4}";
foreach (string input in inputs)
{
Match match = Regex.Match(input, pattern, RegexOptions.RightToLeft);
if (match.Success)
Console.WriteLine("The date occurs in {0}.", match.Value);
else
Console.WriteLine("{0} does not match.", input);
}
}
}
// The example displays the following output:
// The date occurs in May 1917.
// June 16, 2003 does not match.
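The regular expression pattern is defined as shown in the following table.
Pattern | Description |
---|---|
(?<=\d{1,2}\s) | The beginning of the match must be preceded by one or two decimal digits followed by a white-space character. |
\w+ | Match one or more word characters. |
,? | Match zero or one comma character. |
\s | Match a white-space character. |
\d{4} | Match four decimal digits. |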
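ECMAScript Matching Behavior
By default, the regular expression engine uses canonical behavior when matching a regular expression pattern to input text. However, you can instruct the regular expression engine to use ECMAScript matching behavior by specifying the RegexOptions.ECMAScript option.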
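Note |
---|
ECMAScript-compliant behavior is available only by supplying the RegexOptions.ECMAScript value to the options parameter of a Regex class constructor or a static pattern-matching method. It is not available as an inline option. |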
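The RegexOptions.ECMAScript option can be combined only with the RegexOptions.IgnoreCase, RegexOptions.Multiline, and RegexOptions.Compiled options. Using any other option with it in a regular expression results in an ArgumentOutOfRangeException.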
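The restriction on option combinations can be observed directly. The sketch below assumes the Regex constructor validates the options argument when the object is created; the class name EcmaScriptOptionsExample and the helper IsValidCombination are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EcmaScriptOptionsExample
{
    // Returns false if constructing a Regex with these options throws
    // ArgumentOutOfRangeException, true otherwise.
    public static bool IsValidCombination(RegexOptions options)
    {
        try
        {
            new Regex("a", options);
            return true;
        }
        catch (ArgumentOutOfRangeException)
        {
            return false;
        }
    }

    public static void Main()
    {
        RegexOptions allowed = RegexOptions.ECMAScript | RegexOptions.IgnoreCase | RegexOptions.Multiline;
        RegexOptions rejected = RegexOptions.ECMAScript | RegexOptions.RightToLeft;

        Console.WriteLine("{0}: {1}", allowed, IsValidCombination(allowed));    // valid
        Console.WriteLine("{0}: {1}", rejected, IsValidCombination(rejected));  // not valid
    }
}
```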
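The behavior of ECMAScript and canonical regular expressions differs in three areas: character class syntax, self-referencing capturing groups, and octal versus backreference interpretation.
Character class syntax. Because canonical regular expressions support Unicode whereas ECMAScript does not, character classes in ECMAScript have a more limited syntax, and some character class language elements have a different meaning. For example, ECMAScript does not support language elements such as the Unicode category or block elements \p and \P. Similarly, the \w element, which matches a word character, is equivalent to [a-zA-Z_0-9] when using ECMAScript and to [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}\p{Lm}] when using canonical behavior. For more information, see Character Classes.
The following example illustrates the difference between canonical and ECMAScript pattern matching. It defines the regular expression \b(\w+\s*)+, which matches words followed by white-space characters. The input consists of two strings, one that uses the Latin character set and the other that uses the Cyrillic character set. As the output shows, the call to the Regex.IsMatch(String, String, RegexOptions) method that uses ECMAScript matching fails to match the Cyrillic words, whereas the method call that uses canonical matching does match these words.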
Imports System.Text.RegularExpressions
Module Example
   Public Sub Main()
      Dim values() As String = { "целый мир", "the whole world" }
      Dim pattern As String = "\b(\w+\s*)+"
      For Each value In values
         Console.Write("Canonical matching: ")
         If Regex.IsMatch(value, pattern) Then
            Console.WriteLine("'{0}' matches the pattern.", value)
         Else
            Console.WriteLine("{0} does not match the pattern.", value)
         End If
         Console.Write("ECMAScript matching: ")
         If Regex.IsMatch(value, pattern, RegexOptions.ECMAScript) Then
            Console.WriteLine("'{0}' matches the pattern.", value)
         Else
            Console.WriteLine("{0} does not match the pattern.", value)
         End If
         Console.WriteLine()
      Next
   End Sub
End Module
' The example displays the following output:
'    Canonical matching: 'целый мир' matches the pattern.
'    ECMAScript matching: целый мир does not match the pattern.
'
'    Canonical matching: 'the whole world' matches the pattern.
'    ECMAScript matching: 'the whole world' matches the pattern.
using System;
using System.Text.RegularExpressions;
public class Example
{
   public static void Main()
   {
      string[] values = { "целый мир", "the whole world" };
      string pattern = @"\b(\w+\s*)+";
      foreach (var value in values)
      {
         Console.Write("Canonical matching: ");
         if (Regex.IsMatch(value, pattern))
            Console.WriteLine("'{0}' matches the pattern.", value);
         else
            Console.WriteLine("{0} does not match the pattern.", value);
         Console.Write("ECMAScript matching: ");
         if (Regex.IsMatch(value, pattern, RegexOptions.ECMAScript))
            Console.WriteLine("'{0}' matches the pattern.", value);
         else
            Console.WriteLine("{0} does not match the pattern.", value);
         Console.WriteLine();
      }
   }
}
// The example displays the following output:
//    Canonical matching: 'целый мир' matches the pattern.
//    ECMAScript matching: целый мир does not match the pattern.
//
//    Canonical matching: 'the whole world' matches the pattern.
//    ECMAScript matching: 'the whole world' matches the pattern.
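Self-referencing capturing groups. A regular expression capturing group that contains a backreference to itself must be updated with each capture iteration. This feature enables the regular expression ((a+)(\1) ?)+ to match the input string "aa aaaa aaaaaa" when using ECMAScript, but not when using canonical matching.
The regular expression is defined as shown in the following table.
Pattern | Description |
---|---|
(a+) | Match the letter "a" one or more times. This is the second capturing group. |
(\1) | Match the substring captured by the first capturing group. This is the third capturing group. |
 ? | Match zero or one space character. |
((a+)(\1) ?)+ | Match the pattern of one or more "a" characters, followed by a string that matches the first capturing group, followed by zero or one space character, one or more times. This is the first capturing group. |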
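The difference for self-referencing groups can be tried with a short program. This is a sketch under the assumption that only the RegexOptions.ECMAScript flag varies between the two calls; the class name SelfReferenceExample and the helper Matches are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SelfReferenceExample
{
    // The pattern from the table above: group 1 contains a backreference to itself.
    public const string Pattern = @"((a+)(\1) ?)+";

    // Reports whether the pattern finds any match under the given options.
    public static bool Matches(string input, RegexOptions options)
    {
        return Regex.Match(input, Pattern, options).Success;
    }

    public static void Main()
    {
        string input = "aa aaaa aaaaaa";

        // Canonical matching: \1 refers to a group that has not captured yet,
        // so the first iteration cannot succeed.
        Console.WriteLine("Canonical:  {0}", Matches(input, RegexOptions.None));

        // ECMAScript matching: the unset backreference is treated as empty on
        // the first iteration and is updated on each later iteration.
        Console.WriteLine("ECMAScript: {0}", Matches(input, RegexOptions.ECMAScript));
    }
}
```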
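Resolution of ambiguities between octal escapes and backreferences. The following table summarizes the differences in octal versus backreference interpretation by canonical and ECMAScript regular expressions.
Regular expression | Canonical behavior | ECMAScript behavior |
---|---|---|
\0 followed by 0 to 2 octal digits | Interpret as an octal. For example, \044 is always interpreted as an octal value and means "$". | Same behavior. |
\ followed by a digit from 1 to 9, followed by no additional decimal digits | Interpret as a backreference. For example, \9 always means backreference 9, even if a ninth capturing group does not exist. If the capturing group does not exist, the regular expression parser throws an ArgumentException. | If a single decimal digit capturing group exists, backreference to that digit. Otherwise, interpret the value as a literal. |
\ followed by a digit from 1 to 9, followed by additional decimal digits | Interpret the digits as a decimal value. If that capturing group exists, interpret the expression as a backreference. Otherwise, interpret the leading octal digits up to octal 377; that is, consider only the low 8 bits of the value. Interpret the remaining digits as literals. For example, in the expression \3000, if capturing group 300 exists, interpret as backreference 300; if capturing group 300 does not exist, interpret as octal 300 followed by 0. | Interpret as a backreference by converting as many digits as possible to a decimal value that can refer to a capture. If no digits can be converted, interpret as an octal by using the leading octal digits up to octal 377; interpret the remaining digits as literals. |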
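The first two rows of the table can be exercised with a small sketch. The class name OctalBackreferenceExample and the helper Describe are hypothetical; the helper simply reports whether a pattern matched, did not match, or was rejected by the parser with an ArgumentException (or a type derived from it).

```csharp
using System;
using System.Text.RegularExpressions;

public static class OctalBackreferenceExample
{
    // Returns "match", "no match", or "ArgumentException" for a pattern/input pair.
    public static string Describe(string pattern, string input, RegexOptions options)
    {
        try
        {
            return Regex.IsMatch(input, pattern, options) ? "match" : "no match";
        }
        catch (ArgumentException)
        {
            return "ArgumentException";
        }
    }

    public static void Main()
    {
        // \044 is octal for "$" under both behaviors.
        Console.WriteLine(Describe(@"\044", "$", RegexOptions.None));
        Console.WriteLine(Describe(@"\044", "$", RegexOptions.ECMAScript));

        // \9 with no ninth group: a parse error under canonical behavior,
        // a literal "9" under ECMAScript behavior.
        Console.WriteLine(Describe(@"\9", "9", RegexOptions.None));
        Console.WriteLine(Describe(@"\9", "9", RegexOptions.ECMAScript));

        // \1 with an existing first group is a backreference under both behaviors.
        Console.WriteLine(Describe(@"(a)\1", "aa", RegexOptions.None));
        Console.WriteLine(Describe(@"(a)\1", "aa", RegexOptions.ECMAScript));
    }
}
```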
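Comparison Using the Invariant Culture
By default, when the regular expression engine performs case-insensitive comparisons, it uses the casing conventions of the current culture to determine equivalent uppercase and lowercase characters.
However, this behavior is undesirable for some types of comparisons, particularly when comparing user input to the names of system resources, such as passwords, files, or URLs. The following example illustrates such a scenario. The code is intended to block access to any resource whose URL begins with FILE://. It attempts a case-insensitive match against the input string by using the regular expression FILE://. However, when the current system culture is tr-TR (Turkish-Turkey), "I" is not the uppercase equivalent of "i". As a result, the call to the Regex.IsMatch method returns false, and access to the file is allowed.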
Dim defaultCulture As CultureInfo = Thread.CurrentThread.CurrentCulture
Thread.CurrentThread.CurrentCulture = New CultureInfo("tr-TR")
Dim input As String = "file://c:/Documents.MyReport.doc"
Dim pattern As String = "FILE://"
Console.WriteLine("Culture-sensitive matching ({0} culture)...", _
Thread.CurrentThread.CurrentCulture.Name)
If Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase) Then
Console.WriteLine("URLs that access files are not allowed.")
Else
Console.WriteLine("Access to {0} is allowed.", input)
End If
Thread.CurrentThread.CurrentCulture = defaultCulture
' The example displays the following output:
' Culture-sensitive matching (tr-TR culture)...
' Access to file://c:/Documents.MyReport.doc is allowed.
CultureInfo defaultCulture = Thread.CurrentThread.CurrentCulture;
Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
string input = "file://c:/Documents.MyReport.doc";
string pattern = "FILE://";
Console.WriteLine("Culture-sensitive matching ({0} culture)...",
Thread.CurrentThread.CurrentCulture.Name);
if (Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("URLs that access files are not allowed.");
else
Console.WriteLine("Access to {0} is allowed.", input);
Thread.CurrentThread.CurrentCulture = defaultCulture;
// The example displays the following output:
// Culture-sensitive matching (tr-TR culture)...
// Access to file://c:/Documents.MyReport.doc is allowed.
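Note |
---|
For more information about string comparisons that are case-sensitive and that use the invariant culture, see Best Practices for Using Strings in the .NET Framework. |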
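Instead of using the case-insensitive comparisons of the current culture, you can specify the RegexOptions.CultureInvariant option to ignore cultural differences in language and to use the conventions of the invariant culture.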
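Note |
---|
Comparison using the invariant culture is available only by supplying the RegexOptions.CultureInvariant value to the options parameter of a Regex class constructor or a static pattern-matching method. It is not available as an inline option. |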
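The following example is identical to the previous example, except that the static Regex.IsMatch(String, String, RegexOptions) method is called with options that include RegexOptions.CultureInvariant. Even when the current culture is set to Turkish (Turkey), the regular expression engine successfully matches "FILE" and "file" and blocks access to the file resource.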
Dim defaultCulture As CultureInfo = Thread.CurrentThread.CurrentCulture
Thread.CurrentThread.CurrentCulture = New CultureInfo("tr-TR")
Dim input As String = "file://c:/Documents.MyReport.doc"
Dim pattern As String = "FILE://"
Console.WriteLine("Culture-insensitive matching...")
If Regex.IsMatch(input, pattern, _
RegexOptions.IgnoreCase Or RegexOptions.CultureInvariant) Then
Console.WriteLine("URLs that access files are not allowed.")
Else
Console.WriteLine("Access to {0} is allowed.", input)
End If
Thread.CurrentThread.CurrentCulture = defaultCulture
' The example displays the following output:
' Culture-insensitive matching...
' URLs that access files are not allowed.
CultureInfo defaultCulture = Thread.CurrentThread.CurrentCulture;
Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
string input = "file://c:/Documents.MyReport.doc";
string pattern = "FILE://";
Console.WriteLine("Culture-insensitive matching...");
if (Regex.IsMatch(input, pattern,
RegexOptions.IgnoreCase | RegexOptions.CultureInvariant))
Console.WriteLine("URLs that access files are not allowed.");
else
Console.WriteLine("Access to {0} is allowed.", input);
Thread.CurrentThread.CurrentCulture = defaultCulture;
// The example displays the following output:
// Culture-insensitive matching...
// URLs that access files are not allowed.