java - Count the number of commas in a string, excluding commas inside double quotes

Question

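I have the following function that counts the number of commas (or any other character) in a string, without counting those inside double quotes. I'd like to know whether there is a better way to achieve this, or whether you can find a case where this function could crash.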

public int countCharOfString(char c, String s) {
    int numberOfC = 0;
    boolean doubleQuotesFound = false;
    for(int i = 0; i < s.length(); i++){
        if(s.charAt(i) == c && !doubleQuotesFound){
            numberOfC++;
        }else if(s.charAt(i) == c && doubleQuotesFound){
            continue;
        }else if(s.charAt(i) == '\"'){
            doubleQuotesFound = !doubleQuotesFound;
        }
    }
    return numberOfC;
}

Thanks for any advice.

score 4 · Accepted Answer

This implementation differs in two ways:

It uses CharSequence instead of String.
It does not need a boolean value to track whether we are inside a quoted subsequence.

The function:

public static int countCharOfString(char quote, CharSequence sequence) {

    int total = 0, length = sequence.length();

    for(int i = 0; i < length; i++){
        char c = sequence.charAt(i);
        if (c == '"') {
            // Skip quoted sequence
            for (i++; i < length && sequence.charAt(i)!='"'; i++) {}
        } else if (c == quote) {
            total++;
        }
    }

    return total;
}
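As a usage sketch: because the parameter is a CharSequence, a StringBuilder can be passed without converting it to a String first. The class name CharSequenceDemo and the parameter name target are made up for this example; the loop body is copied from the function above.

```java
final class CharSequenceDemo {
    // Same logic as the answer's function; only the first parameter is renamed
    // to `target` here (hypothetical name) to make its role clearer.
    static int countCharOfString(char target, CharSequence sequence) {
        int total = 0, length = sequence.length();
        for (int i = 0; i < length; i++) {
            char c = sequence.charAt(i);
            if (c == '"') {
                // Skip ahead to the closing quote, or to the end of the input.
                for (i++; i < length && sequence.charAt(i) != '"'; i++) {}
            } else if (c == target) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // A StringBuilder is a CharSequence, so no toString() call is needed.
        StringBuilder sb = new StringBuilder("a,b,\"c,d\",e");
        System.out.println(countCharOfString(',', sb));         // prints 3

        // With an unterminated quote, everything after the quote is skipped.
        System.out.println(countCharOfString(',', "x,\"y,z"));  // prints 1
    }
}
```

Note that an unterminated quote swallows the rest of the input, which matches the behaviour of the boolean-flag version in the question.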

score 2 · Accepted Answer

public static int countCharOfString(char c, String s)
{
    int numberOfC = 0;
    int innerC = 0;
    boolean holdDoubleQuotes = false;
    for(int i = 0; i < s.length(); i++)
    {
        char r = s.charAt(i);
        if(i == s.length() - 1 && r != '\"')
        {
            numberOfC += innerC;
            if(r == c) numberOfC++;
        }
        else if(r == c && !holdDoubleQuotes) numberOfC++;
        else if(r == c && holdDoubleQuotes) innerC++;
        else if(r == '\"' && holdDoubleQuotes)
        {
            holdDoubleQuotes = false;
            innerC = 0;
        }
        else if(r == '\"' && !holdDoubleQuotes) holdDoubleQuotes = true;
    }
    return numberOfC;
}

System.out.println(countCharOfString(',', "Hello, BRabbit27, how\",,,\" are, you?"));

Output:

Another approach is to use a regular expression.

public static int countCharOfString(char c, String s)
{
   s = " " + s + " "; // To make the first and last commas to be counted
   return s.split("[^\"" + c + "*\"][" + c + "]").length - 1;
}

score 1 · Accepted Answer

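Don't call charAt() several times inside the loop; store the result in a char variable.
Don't call length() on every iteration; store it in an int before the loop.
Avoid the duplicate comparison against c; use nested if/else instead.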
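Applied to the function from the question, the three suggestions might look roughly like this. This is only a sketch: the class name CountOutsideQuotes and the variable names are made up here, and the order of the checks is kept the same as in the question so the behaviour should be unchanged.

```java
final class CountOutsideQuotes {
    static int countCharOfString(char c, String s) {
        int count = 0;
        boolean inQuotes = false;
        final int length = s.length();        // length() is read once, before the loop
        for (int i = 0; i < length; i++) {
            final char ch = s.charAt(i);      // charAt() is called once per iteration
            if (ch == c) {                    // c is compared only once...
                if (!inQuotes) {              // ...and the quote state is checked inside
                    count++;
                }
            } else if (ch == '"') {
                inQuotes = !inQuotes;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countCharOfString(',', "Hello, BRabbit27, how\",,,\" are, you?")); // prints 3
    }
}
```

Like the original, this throws a NullPointerException if s is null; whether that needs guarding depends on the callers.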

score 1 · Accepted Answer

Probably not the fastest...

public int countCharOfString(char c, String s) {
    final String removedQuoted = s.replaceAll("\".*?\"", "");
    int total = 0;
    for(int i = 0; i < removedQuoted.length(); ++i)
        if(removedQuoted.charAt(i) == c)
            ++total;
    return total;
}
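To see what the replaceAll step does on its own, here is a small sketch. The helper name stripQuoted and the class name are made up for illustration; the pattern is the one used above.

```java
final class StripQuotedDemo {
    // Removes every "..." run; the reluctant quantifier .*? stops at the nearest closing quote.
    static String stripQuoted(String s) {
        return s.replaceAll("\".*?\"", "");
    }

    public static void main(String[] args) {
        // Balanced quotes: the quoted run, including its commas, disappears.
        System.out.println(stripQuoted("Hello, BRabbit27, how\",,,\" are, you?"));
        // prints: Hello, BRabbit27, how are, you?

        // Unterminated quote: nothing matches, so the string comes back unchanged
        // and the comma after the quote would still be counted.
        System.out.println(stripQuoted("a,\"b,c"));
        // prints: a,"b,c
    }
}
```

Two differences from the loop-based versions are worth keeping in mind: a lone opening quote does not hide the rest of the string, and by default `.` does not match line terminators, so a quoted run that spans a newline is not removed.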

score 1 · Accepted Answer

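You need a large string for this to make much of a difference.

The reason this code is faster is that it performs 1.5 checks per loop iteration on average, instead of 3 checks per iteration. It does this by using two loops: one for the quoted state and one for the unquoted state.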

public static void main(String... args) {
    String s = generateString(20 * 1024 * 1024);
    for (int i = 0; i < 15; i++) {
        long start = System.nanoTime();
        countCharOfString(',', s);
        long mid = System.nanoTime();
        countCharOfString2(',', s);
        long end = System.nanoTime();
        System.out.printf("countCharOfString() took %.3f ms, countCharOfString2() took %.3f ms%n",
                (mid - start) / 1e6, (end - mid) / 1e6);
    }
}

private static String generateString(int length) {
    StringBuilder sb = new StringBuilder(length);
    Random rand = new Random(1);
    while (sb.length() < length)
        sb.append((char) (rand.nextInt(96) + 32)); // includes , and "
    return sb.toString();
}

public static int countCharOfString2(char c, String s) {
    int numberOfC = 0, i = 0;
    while (i < s.length()) {
        // not quoted
        while (i < s.length()) {
            char ch = s.charAt(i++);
            if (ch == c)
                numberOfC++;
            else if (ch == '"')
                break;
        }
        // quoted
        while (i < s.length()) {
            char ch = s.charAt(i++);
            if (ch == '"')
                break;
        }
    }
    return numberOfC;
}


public static int countCharOfString(char c, String s) {
    int numberOfC = 0;
    boolean doubleQuotesFound = false;
    for (int i = 0; i < s.length(); i++) {
        if (s.charAt(i) == c && !doubleQuotesFound) {
            numberOfC++;
        } else if (s.charAt(i) == c && doubleQuotesFound) {
            continue;
        } else if (s.charAt(i) == '\"') {
            doubleQuotesFound = !doubleQuotesFound;
        }
    }
    return numberOfC;
}

prints

countCharOfString() took 33.348 ms, countCharOfString2() took 31.381 ms
countCharOfString() took 28.265 ms, countCharOfString2() took 25.801 ms
countCharOfString() took 28.142 ms, countCharOfString2() took 14.576 ms
countCharOfString() took 28.372 ms, countCharOfString2() took 14.540 ms
countCharOfString() took 28.191 ms, countCharOfString2() took 14.616 ms
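
The timings are only comparable if both versions return the same count. A minimal sketch of such a check follows, assuming the searched character is not the quote character itself. The class and method names are hypothetical; the two counting methods are condensed copies of the single-loop and two-loop logic above.

```java
import java.util.Random;

final class AgreementCheck {
    // Single loop with a boolean flag, as in the question.
    static int countWithFlag(char c, String s) {
        int n = 0;
        boolean inQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == c) {
                if (!inQuotes) n++;
            } else if (ch == '"') {
                inQuotes = !inQuotes;
            }
        }
        return n;
    }

    // One inner loop per state, as in countCharOfString2.
    static int countWithTwoLoops(char c, String s) {
        int n = 0, i = 0;
        while (i < s.length()) {
            while (i < s.length()) {              // not quoted
                char ch = s.charAt(i++);
                if (ch == c) n++;
                else if (ch == '"') break;
            }
            while (i < s.length()) {              // quoted
                if (s.charAt(i++) == '"') break;
            }
        }
        return n;
    }

    // Same character range as generateString above: code points 32..127.
    static String randomPrintable(int length, long seed) {
        Random rand = new Random(seed);
        StringBuilder sb = new StringBuilder(length);
        while (sb.length() < length) sb.append((char) (rand.nextInt(96) + 32));
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = randomPrintable(100_000, 1);
        System.out.println(countWithFlag(',', s) == countWithTwoLoops(',', s)); // prints true
    }
}
```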

score 1 · Accepted Answer

Simpler and less bug-prone (and yes, less performant than walking the string character by character and keeping track of everything by hand):

public static int countCharOfString(char c, String s) {
  s = s.replaceAll("\".*?\"", "");
  int cnt = 0;
  for (int foundAt = s.indexOf(c); foundAt > -1; foundAt = s.indexOf(c, foundAt+1)) 
    cnt++;
  return cnt;
}

score 0 · Accepted Answer

You could also use a regular expression and String.split().

It would look something like this:

public int countNonQuotedOccurrences(String inputString, char searchChar)
{
  // Matches searchChar only when neither neighbouring character is a double quote
  String regexPattern = "[^\"]" + searchChar + "[^\"]";
  return inputString.split(regexPattern).length - 1;
}

Disclaimer:

This only shows the basic approach.

The code above does not check for searchChar at the beginning or end of the string.

You could check for that manually or add it to regexPattern.
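
One way around the start/end limitation is to drop split() and count matches directly. The following is a different regex-based sketch using Matcher, not the answer's method: the pattern matches either a complete quoted run or the search character, and only the latter is counted. The class name is made up, and Pattern.quote is used so that characters such as `.` or `*` are treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuoteAwareRegexCount {
    static int countNonQuotedOccurrences(String inputString, char searchChar) {
        // Alternative 1: a complete "..." run (consumed, not counted).
        // Alternative 2 (group 1): the search character, escaped so it is taken literally.
        Pattern pattern = Pattern.compile(
                "\"[^\"]*\"|(" + Pattern.quote(String.valueOf(searchChar)) + ")");
        Matcher matcher = pattern.matcher(inputString);
        int count = 0;
        while (matcher.find()) {
            if (matcher.group(1) != null) {   // group 1 is set only for alternative 2
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Leading, trailing and adjacent commas are all counted; the quoted one is not.
        System.out.println(countNonQuotedOccurrences(",a,,\"b,c\",", ','));  // prints 4
    }
}
```

With this pattern an unterminated quote is treated as ordinary text, so commas after it are still counted. That differs from the loop-based versions, which skip everything after a lone opening quote.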
