java - Regular expression to extract strings within delimiters

Question

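I am trying to extract every occurrence of a string inside delimiters (parentheses in this case), but not the ones that sit inside quotes (single or double). Here is what I have tried. This regex fetches every occurrence inside parentheses, including the ones inside quotes (which I do not want):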

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMain {
    static final String PATTERN = "\\(([^)]+)\\)";
    static final Pattern CONTENT = Pattern.compile(PATTERN);
    /**
     * @param args
     */
    public static void main(String[] args) {
        String testString = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request.";
        Matcher match = CONTENT.matcher(testString);
        while(match.find()) {
            System.out.println(match.group()); // prints (Jack), (Jill) and (Peter's), parentheses included
        }
    }
}

score 1 · Accepted Answer

You can try:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMain {
    static final String PATTERN = "\\(([^)]+)\\)|\"[^\"]*\"";
    static final Pattern CONTENT = Pattern.compile(PATTERN);
    /**
     * @param args
     */
    public static void main(String[] args) {
        String testString = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request.";
        Matcher match = CONTENT.matcher(testString);
        while(match.find()) {
            if(match.group(1) != null) {
                System.out.println(match.group(1)); // prints Jack, Jill
            }
        }
    }
}

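This pattern matches quoted strings as well as parenthesized ones, but only the parenthesized ones put anything into group(1). The matcher scans left to right, so it reaches the opening quote first, and the greedy [^"]* then consumes everything up to the closing quote. As a result it matches "(Peter's)" as a whole rather than (Peter's), and group(1) stays null for that match.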
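To make the two branches of the alternation visible, here is a minimal sketch that labels each match with the branch that produced it. The class name AlternationTrace and the helper trace are hypothetical and only serve the illustration; the pattern is the one from the answer, so only double quotes are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationTrace {
    // Same pattern as in the answer: only the parenthesis branch fills group 1.
    static final Pattern CONTENT = Pattern.compile("\\(([^)]+)\\)|\"[^\"]*\"");

    // Hypothetical helper: one entry per match, tagged with the branch that matched.
    static List<String> trace(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = CONTENT.matcher(input);
        while (m.find()) {
            // group(1) is null when the quoted-string branch matched
            String branch = (m.group(1) != null) ? "paren" : "quote";
            out.add(branch + ": " + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request.";
        trace(s).forEach(System.out::println);
    }
}
```

For the test string this should list two "paren" entries followed by a single "quote" entry covering the whole quoted segment, which is why filtering on group(1) != null drops Peter's.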

score 1 · Accepted Answer

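This is a case where look-behind and look-ahead operators let you achieve the goal elegantly. Here is a solution in Python (which I always use for quickly trying things out on the command line), but the regular expression should be the same in Java code.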

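This regex matches content that is preceded by an opening parenthesis (checked with a positive look-behind) and followed by a closing parenthesis (checked with a positive look-ahead). However, a match is rejected when the opening parenthesis is itself preceded by a single or double quote (negative look-behind) or when the closing parenthesis is followed by a single or double quote (negative look-ahead).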

In [1]: import re

In [2]: s = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request."

In [3]: re.findall(r"""
   ...:     (?<=               # start of positive look-behind
   ...:         (?<!           # start of negative look-behind
   ...:             [\"\']     # avoids matching opening parenthesis preceded by single or double quote
   ...:         )              # end of negative look-behind
   ...:         \(             # matches opening parenthesis
   ...:     )                  # end of positive look-behind
   ...:     \w+ (?: \'\w* )?   # matches whatever your content looks like (configure this yourself)             
   ...:     (?=                # start of positive look-ahead
   ...:         \)             # matches closing parenthesis 
   ...:         (?!            # start of negative look-ahead
   ...:             [\"\']     # avoids matching closing parenthesis succeeded by single or double quote
   ...:         )              # end of negative look-ahead  
   ...:     )                  # end of positive look-ahead
   ...:     """, 
   ...:     s, 
   ...:     flags=re.X)
Out[3]: ['Jack', 'Jill']
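Since the question is about Java, the following is a sketch of how the same look-around pattern might be carried over to java.util.regex. It is an illustration rather than part of the original answer: the class name LookaroundPort and the helper extract are hypothetical, and the pattern is written on one line instead of in verbose mode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundPort {
    // Port of the Python pattern above; the look-arounds consume no characters.
    static final Pattern CONTENT = Pattern.compile(
            "(?<=(?<![\"'])\\()"    // "(" right before, itself not preceded by a quote
          + "\\w+(?:'\\w*)?"        // content: word characters plus an optional apostrophe part
          + "(?=\\)(?![\"']))");    // ")" right after, itself not followed by a quote

    // Hypothetical helper: collects every match of CONTENT in the input.
    static List<String> extract(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = CONTENT.matcher(input);
        while (m.find()) {
            out.add(m.group()); // whole match is already the bare content
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request.";
        System.out.println(extract(s));
    }
}
```

Because the parentheses are only asserted and never consumed, no capture group is needed here. Note that this approach only inspects the characters directly adjacent to the parentheses, so a parenthesized string deeper inside a quoted segment would still be returned.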
