c++ - C++ file I/O and splitting by separator

Question

I have a file with data listed as follows:

0,       2,    10
10,       8,    10
10,       10,   10
10,       16,   10
15,       10,   16
17,       10,   16

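I would like to read the file and split it into three arrays, stripping all the extra spaces and converting each element to an integer.

For some reason, I can't find an easy way to do this in C++. The only success I've had is reading each line into an array, stripping all the spaces with a regex, and then splitting. The whole process took a good 20 to 30 lines of code, and it was a pain to modify for, say, a different separator (such as a space).

This is the Python equivalent of what I would like to have in C++: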

f = open('input_hard.dat')
lines =  f.readlines()
f.close()

#declarations
inint, inbase, outbase = [], [], []

#input parsing
for line in lines:
    bits = line.split(',')
    inint.append(int(bits[0].strip()))
    inbase.append(int(bits[1].strip()))
    outbase.append(int(bits[2].strip()))

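The ease of doing this in Python is one of the reasons I moved to it in the first place. However, I now need to do this in C++, and I would hate to have to use my ugly 20 to 30 lines of code.

Any help would be appreciated, thanks!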

score 7 · Accepted Answer

There is no real need to use Boost in this example, as streams do the job nicely.

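#include <fstream>
#include <vector>
using namespace std;

int main(int argc, char* argv[])
{
    if (argc < 2)
        return 1;               // no input file given

    ifstream file(argv[1]);

    const unsigned maxIgnore = 10;
    const int delim = ',';
    int x, y, z;

    vector<int> vecx, vecy, vecz;

    // Stop as soon as any extraction fails, so a failed read at end of file
    // does not push a stale copy of the previous row.
    while (file >> x)
    {
        file.ignore(maxIgnore, delim);   // skip up to and including the comma
        if (!(file >> y)) break;
        file.ignore(maxIgnore, delim);
        if (!(file >> z)) break;

        vecx.push_back(x);
        vecy.push_back(y);
        vecz.push_back(z);
    }
}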

If you do want to use Boost, you would probably prefer the simplicity of the tokenizer to regex... :)

score 5 · Accepted Answer

There is really nothing wrong with fscanf, which is probably the fastest solution in this case. And it is as short and readable as the Python code:

FILE *fp = fopen("file.dat", "r");
int x, y, z;
std::vector<int> vx, vy, vz;

while (fscanf(fp, "%d, %d, %d", &x, &y, &z) == 3) {
  vx.push_back(x);
  vy.push_back(y);
  vz.push_back(z);
}
fclose(fp);

score 3 · Accepted Answer

Something like:

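// Assumes fh is a FILE* already opened for reading.
char buf[256];
vector<int> inint;
vector<int> inbase;
vector<int> outbase;
while (fgets(buf, sizeof buf, fh)) {     // fgets takes the buffer size as well
   char *tok = strtok(buf, ", ");        // split on commas and spaces
   inint.push_back(atoi(tok));
   tok = strtok(NULL, ", ");
   inbase.push_back(atoi(tok));
   tok = strtok(NULL, ", ");
   outbase.push_back(atoi(tok));
}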

Except with error checking added.

score 2 · Accepted Answer

Why not the same code as in Python :) ?

std::ifstream file("input_hard.dat");
std::vector<int> inint, inbase, outbase;

int val1, val2, val3;
char delim;

// Test the extraction itself, so a failed read at end of file adds nothing.
while (file >> val1 >> delim >> val2 >> delim >> val3) {
    inint.push_back(val1);
    inbase.push_back(val2);
    outbase.push_back(val3);
}

score 1 · Accepted Answer

std::getline lets you read a line of text, and you can use a string stream to parse each individual line:

string buf;
getline(cin, buf); 
stringstream par(buf);

char buf2[512];
par.getline(buf2, 512, ','); /* Reads until the first token. */

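Once you have the line of text in a string, you can use whatever parsing function you actually need, for example calling atoi on the substring that holds the integer, or some other method.

If you know the input stream contains unwanted characters, you can also ignore them: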
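As a rough sketch of how the getline and string stream pieces above might fit together for the three-column file in the question: the `Columns` struct and the `parseColumns` function are hypothetical names introduced here for illustration, and the separator is assumed to be a single character.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type (not from the original answers): one vector per column.
struct Columns {
    std::vector<int> inint, inbase, outbase;
};

// Reads "a, b, c" lines from `in`, splitting each line on the single character `delim`.
// Lines that do not yield three integers (blank lines, empty fields) are skipped.
Columns parseColumns(std::istream& in, char delim = ',') {
    Columns cols;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string field;
        int values[3];
        int count = 0;
        // Pull up to three delimiter-separated fields out of this line.
        while (count < 3 && std::getline(fields, field, delim)) {
            std::istringstream conv(field);
            // operator>> skips leading whitespace, so "    10" parses as 10.
            if (!(conv >> values[count])) break;
            ++count;
        }
        if (count == 3) {
            cols.inint.push_back(values[0]);
            cols.inbase.push_back(values[1]);
            cols.outbase.push_back(values[2]);
        }
    }
    return cols;
}
```

With a `std::ifstream file("input_hard.dat");`, calling `parseColumns(file)` would fill the three vectors, and passing a different character as the second argument changes the separator. Note that a run of repeated separators produces empty fields, which this sketch treats as a malformed line.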

if (cin.peek() == ',')
    cin.ignore(1, ',');
cin >> nextInt;

score 1 · Accepted Answer

If you don't mind using the Boost libraries...

#include <istream>
#include <string>
#include <vector>
#include <boost/lexical_cast.hpp>
#include <boost/regex.hpp>

std::vector<int> ParseFile(std::istream& in) {
    // Optional leading spaces, one captured run of digits, optional trailing comma.
    const boost::regex cItemPattern(" *([0-9]+),?");
    std::vector<int> return_value;

    std::string line;
    while (std::getline(in, line)) {
        std::string::const_iterator b=line.begin(), e=line.end();
        boost::smatch match;
        while (b!=e && boost::regex_search(b, e, match, cItemPattern)) {
            return_value.push_back(boost::lexical_cast<int>(match[1].str()));
            b=match[0].second;   // continue searching after this match
        }
    }

    return return_value;
}

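This takes the lines from the stream and uses the Boost::RegEx library (with capture groups) to extract each number from the lines. You can change it as needed, but anything that is not a valid number is automatically ignored.

It is still about twenty lines with the #includes, but you can use it to extract essentially anything from the file's lines. This is a trivial example; I use almost identical code to extract tags and optional values from a database field, and the only major difference is the regular expression.

EDIT: Oops, you wanted three separate vectors. Try this slight modification instead: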

// One match per row: three captured runs of digits separated by commas.
const boost::regex cItemPattern(" *([0-9]+), *([0-9]+), *([0-9]+)");
std::vector<int> vector1, vector2, vector3;

std::string line;
while (std::getline(in, line)) {
    std::string::const_iterator b=line.begin(), e=line.end();
    boost::smatch match;
    while (b!=e && boost::regex_search(b, e, match, cItemPattern)) {
        vector1.push_back(boost::lexical_cast<int>(match[1].str()));
        vector2.push_back(boost::lexical_cast<int>(match[2].str()));
        vector3.push_back(boost::lexical_cast<int>(match[3].str()));
        b=match[0].second;
    }
}
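If Boost is not available, roughly the same approach can be sketched with `std::regex` from the C++11 standard library. This swaps in a different library from the one the answer uses; the `Triples` struct and `parseTriples` function are hypothetical names, and the pattern is an assumption modelled on the three-group Boost pattern above (it also tolerates whitespace before the commas).

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper type for illustration: one vector per column.
struct Triples {
    std::vector<int> vector1, vector2, vector3;
};

Triples parseTriples(std::istream& in) {
    // Three captured runs of digits separated by commas, whitespace allowed around them.
    // Like the Boost pattern, this does not accept negative numbers.
    static const std::regex itemPattern(
        R"(\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+))");

    Triples out;
    std::string line;
    while (std::getline(in, line)) {
        std::smatch match;
        // Lines that do not contain a full triple are skipped.
        if (std::regex_search(line, match, itemPattern)) {
            // std::stoi throws std::out_of_range if the digits do not fit in an int.
            out.vector1.push_back(std::stoi(match[1].str()));
            out.vector2.push_back(std::stoi(match[2].str()));
            out.vector3.push_back(std::stoi(match[3].str()));
        }
    }
    return out;
}
```

As with the Boost version, switching to another separator would mean editing the pattern rather than the loop.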

score 0 · Accepted Answer

If you want to be able to scale to harder input formats, you should consider Spirit, the Boost parser combinator library.

This page has an example that does almost what you need (but with reals and a single vector).
