C# - Regex nested parentheses

Question

I have the following string:

a,b,c,d.e(f,g,h,i(j,k)),l,m,n

How can I build a regular expression that returns only the "first level" of parentheses, like this?

[0] = a,b,c,
[1] = d.e(f,g,h,i(j,k))
[2] = l,m,n

The goal is to keep each parenthesized section, including anything nested inside it, together under a single index so that I can process it later.

Thanks.

Edit

Trying to improve the example...

Imagine I have this string:

username,TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2)),password

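My goal is to turn the string into a dynamic query. Fields that do not start with "TB_" are fields of the main table; otherwise, the fields listed inside the parentheses belong to a related table. I am having trouble retrieving all the "first level" fields. Once I can separate them from the related tables, I can recover the remaining fields recursively.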

In the end, I would have something like this:

[0] = username,password
[1] = TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2))

I hope this explains it a little better, sorry.

score 12 · Accepted Answer

You can use this:

(?>\w+\.)?\w+\((?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*\)(?(DEPTH)(?!))|\w+

With your example, you get:

0 => username
1 => TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2))
2 => password

Explanation:

(?>\w+\.)? \w+ \(    # the opening parenthesis (with the function name)
(?>                  # open an atomic group
    \(  (?<DEPTH>)   # when an opening parenthesis is encountered,
                     #  then increment the stack named DEPTH
  |                  # OR
    \) (?<-DEPTH>)   # when a closing parenthesis is encountered,
                     #  then decrement the stack named DEPTH
  |                  # OR
    [^()]+           # content that is not a parenthesis
)*                   # close the atomic group, repeat zero or more times
\)                   # the closing parenthesis
(?(DEPTH)(?!))       # conditional: if the stack named DEPTH is not empty
                     #  then fail (i.e. the parentheses are not balanced)
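
To see the DEPTH stack in isolation, the balancing part of the pattern can be anchored to the whole string and used as a balance check. This is only a sketch: the class name `BalanceCheck` and the method `IsBalanced` are hypothetical names chosen for illustration, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalanceCheck
{
    // Only the balancing part of the pattern, anchored with ^ and $.
    // "(" pushes onto DEPTH, ")" pops from it, and the final conditional
    // rejects the match if anything is still left on the stack.
    private const string Pattern =
        @"^(?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*(?(DEPTH)(?!))$";

    // Hypothetical helper: true when every "(" has a matching ")".
    public static bool IsBalanced(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }
}
```

Under these assumptions, `BalanceCheck.IsBalanced("d.e(f,g,h,i(j,k))")` should report a balanced string, while an input such as `"d.e(f,g"` should be rejected because DEPTH is not empty at the end.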

You can try it with this code:

using System;
using System.Text.RegularExpressions;

string input = "username,TB_PEOPLE.fields(FirstName,LastName,TB_PHONE.fields(num_phone1, num_phone2)),password";
string pattern = @"(?>\w+\.)?\w+\((?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*\)(?(DEPTH)(?!))|\w+";
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
    // match.Value is the whole match, the same as match.Groups[0].Value
    Console.WriteLine(match.Value);
}
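
The question ultimately wants the main-table fields separated from the related-table expressions. One possible way to get there from the matches above is sketched below. The class `FieldPartitioner` and the method `Partition` are hypothetical names, and the "TB_" prefix test simply follows the convention described in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldPartitioner
{
    // Same pattern as in the answer above.
    private const string Pattern =
        @"(?>\w+\.)?\w+\((?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*\)(?(DEPTH)(?!))|\w+";

    // Hypothetical helper: sorts the top-level matches into plain fields
    // and related-table expressions (those starting with "TB_").
    public static (List<string> MainFields, List<string> RelatedTables) Partition(string input)
    {
        var mainFields = new List<string>();
        var relatedTables = new List<string>();

        foreach (Match match in Regex.Matches(input, Pattern))
        {
            if (match.Value.StartsWith("TB_", StringComparison.Ordinal))
                relatedTables.Add(match.Value);
            else
                mainFields.Add(match.Value);
        }

        return (mainFields, relatedTables);
    }
}
```

For the example string, `MainFields` should contain `username` and `password`, and `RelatedTables` should contain the single `TB_PEOPLE.fields(...)` expression. The text inside its outer parentheses could then be passed to `Partition` again to handle the nested table.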

score 0 · Accepted Answer

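I suggest a new strategy, R2 - do it algorithmically. You could build a regex that eventually comes close to what you are asking for, but it would be very hard to maintain and hard to extend when you find new edge cases. I don't speak C#, but this pseudocode should get you on the right track: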

function parenthetical_depth(some_string):
    open = count '(' in some_string
    close = count ')' in some_string
    return open - close

function smart_split(some_string):
    bits = split some_string on ','
    new_bits = empty list
    bit = empty string
    while bits has next:
        bit = fetch next from bits
        while parenthetical_depth(bit) != 0:
            bit = bit + ',' + fetch next from bits
        place bit into new_bits
    return new_bits

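That is the easiest way to understand it. As written, the algorithm is O(n^2), because the inner loop rescans the whole accumulated string on every pass. The inner loop can be optimized to make it O(n) (apart from the string copying, which is the worst part of this):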

depth = parenthetical_depth(bit)
while depth != 0:
    nbit = fetch next from bits
    depth = depth + parenthetical_depth(nbit)
    bit = bit + ',' + nbit

The string copying can be made more efficient through clever use of buffers and buffer sizes, at the expense of space efficiency, but I don't think C# gives you that level of control natively.
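
For reference, here is a rough C# sketch of the same idea. It is not a direct translation of the pseudocode: instead of splitting on commas and re-joining the pieces, it scans the string once, character by character, and tracks the depth as it goes. It uses .NET's `StringBuilder` as the buffer. The names `TopLevelSplitter` and `SmartSplit` are hypothetical, and throwing `FormatException` on unbalanced input is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TopLevelSplitter
{
    // Splits on commas that sit at parenthesis depth 0, in a single pass.
    public static List<string> SmartSplit(string input)
    {
        var parts = new List<string>();
        var current = new StringBuilder();
        int depth = 0;

        foreach (char c in input)
        {
            if (c == ',' && depth == 0)
            {
                // A top-level comma ends the current piece.
                parts.Add(current.ToString());
                current.Clear();
                continue;
            }

            if (c == '(') depth++;
            else if (c == ')') depth--;

            // Assumption: unbalanced input is treated as an error.
            if (depth < 0) throw new FormatException("Unbalanced ')' in input.");

            current.Append(c);
        }

        if (depth != 0) throw new FormatException("Unbalanced '(' in input.");

        parts.Add(current.ToString());
        return parts;
    }
}
```

Called on `a,b,c,d.e(f,g,h,i(j,k)),l,m,n`, this should return seven pieces, with `d.e(f,g,h,i(j,k))` kept intact as the fourth one.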
