c - Parsing a file and reading chars in C

Question

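Suppose I have a file filled with random characters, including whitespace, with \n scattered randomly through it as well.

I'm looking for groups of characters such as UU, II, NJ, and KU. The goal is to read the file, look for these groups, and report how many of each appear in the file.

My problem is the whitespace and \n. Whenever I hit one of them, I need to skip it and resume searching for groups. I found something that might help, the function strtok_r.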

http://www.codecogs.com/reference/computing/c/string.h/strtok.php?alias=strtok_r

As I understand it, this will split the input into complete strings so that I can read them one at a time.
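For reference, here is a minimal sketch of what that approach might look like. The helper name `count_token` and the delimiter set are assumptions made for illustration, and the buffer is assumed to already hold the file contents. Note that this only finds a group when it stands alone as a whole whitespace-separated token; a group embedded in a longer run of characters is not counted.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r is POSIX, not ISO C */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: count the whitespace-separated tokens in `buf`
   that are exactly equal to `group`. strtok_r overwrites delimiters
   with '\0', so `buf` must be writable and is modified in place. */
int count_token(char *buf, const char *group)
{
    const char *delims = " \t\r\n";
    char *save = NULL;  /* strtok_r keeps its position here */
    int n = 0;

    assert(buf != NULL && group != NULL);
    for (char *tok = strtok_r(buf, delims, &save);
         tok != NULL;
         tok = strtok_r(NULL, delims, &save)) {
        if (strcmp(tok, group) == 0)
            n++;
    }
    return n;
}
```

With a buffer such as "UU II\nNJ  UU xUUx", the tokens are UU, II, NJ, UU and xUUx, so only the two standalone UU tokens match.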

Is this a good solution, or should I take another approach?

score 4 · Accepted Answer

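A simple solution is probably to read one character at a time and, when it is 'U', 'I', 'N', or 'K', read another character and check whether it is the next character of the group. If it is, increment the counter for that group. All other characters are simply discarded.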

Edit: an example function:

#include <stdio.h>

int count_uu = 0;
int count_ii = 0;
int count_nj = 0;
int count_ku = 0;

void check_next_char(int expected, FILE *input, int *counter);

void count(FILE *input)
{
    int ch;  /* Character we read into */

    while ((ch = fgetc(input)) != EOF)
    {
        switch (ch)
        {
        case 'U':
            check_next_char('U', input, &count_uu);
            break;
        case 'I':
            check_next_char('I', input, &count_ii);
            break;
        case 'N':
            check_next_char('J', input, &count_nj);
            break;
        case 'K':
            check_next_char('U', input, &count_ku);
            break;

        default:
            /* Not a character we're interested in */
            break;
        }
    }
}

/* This function gets the next character from a file and checks against
   an `expected` character. If it is same as the expected character then
   increase a counter, else put the character back into the stream buffer */
void check_next_char(int expected, FILE *input, int *counter)
{
    int ch = fgetc(input);
    if (ch == expected)
        (*counter)++;
    else
        ungetc(ch, input);
}
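
The function above treats whitespace like any other uninteresting character, so "U U" or a "U" and "U" split by a newline is not counted. If the intent is that whitespace inside a group should be skipped (the question can be read that way, but this is an assumption), a variant could look roughly like this. The names `pair_counts`, `next_non_space`, `check_next` and `count_pairs` are made up for this sketch, and a struct replaces the global counters:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical container for the four counters. */
struct pair_counts {
    int uu, ii, nj, ku;
};

/* Return the next character that is not whitespace, or EOF. */
static int next_non_space(FILE *input)
{
    int ch;
    while ((ch = fgetc(input)) != EOF && isspace(ch))
        ;  /* skip blanks, tabs and newlines */
    return ch;
}

/* If the next non-whitespace character equals `expected`, bump the
   counter; otherwise push it back so it can start a new group. */
static void check_next(int expected, FILE *input, int *counter)
{
    int ch = next_non_space(input);
    if (ch == expected)
        (*counter)++;
    else if (ch != EOF)
        ungetc(ch, input);
}

/* Count UU, II, NJ and KU in `input`, ignoring whitespace everywhere. */
struct pair_counts count_pairs(FILE *input)
{
    struct pair_counts c = {0, 0, 0, 0};
    int ch;

    assert(input != NULL);
    while ((ch = fgetc(input)) != EOF) {
        switch (ch) {
        case 'U': check_next('U', input, &c.uu); break;
        case 'I': check_next('I', input, &c.ii); break;
        case 'N': check_next('J', input, &c.nj); break;
        case 'K': check_next('U', input, &c.ku); break;
        default:  break;  /* whitespace or a character we ignore */
        }
    }
    return c;
}
```

As in the original function, matches do not overlap: in "UUU" the first two characters form one UU and the third starts a new, incomplete group.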

score 0 · Answer

You could also use

https://github.com/leblancmeneses/NPEG/tree/master/Languages/npeg_c

if your search patterns get more complicated.

Here is a visual tool that can export a C version:

Rule grammar documentation: http://www.robusthaven.com/blog/parsing-expression-grammar/npeg-dsl-documentation

Rules

(?<UU>): 'UU'\i;
(?<II>): 'II'\i;
(?<NJ>): 'NJ'\i;
(?<KU>): 'KU'; // does not use \i so is case sensitive

Find: UU / II / NJ / KU;
(?<RootExpression>): (Find / .)+;

Input 1:

 UU, II, NJ, KU  uu, ii, nJ, kU

Input 2:

jsdlfj023#uu, ii, nJ, kU $^%900oi)()*()  UU, II, NJ, KU
