java - Ignore extra columns in a CSV - SuperCSV

Question

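I am using SuperCSV to parse CSV records into objects. My CSV file has extra columns at the end, and I only want to process the first X columns. So I define a String[] mapping and a CellProcessor[] that are both the same size as the first X columns. But that does not seem to work: it throws an exception saying that the number of cell processors must be exactly the same as the number of columns.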

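Can someone tell me whether I am missing something? Do I have to define the mapping array with exactly as many entries as the file has columns, even for columns I don't need?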

  public CsvToBeanParser(Reader reader, Class<T> type, CsvPreference preference, CellProcessor[] cellProcessors, String[] mapping, boolean skipHeader)
        throws IOException {
    this.beanReader = new CsvBeanReader(reader, preference);
    this.mapping = mapping;
    if (skipHeader) {
        beanReader.getHeader(true);
    }
    this.cellProcessors = cellProcessors;
    this.type = type;

}

/**
 * Parse and return record.
 * 
 * @return
 * @throws Exception
 *             if there is any parsing error
 */
public T getItem() throws Exception {
    try {
        return (T) beanReader.read(type, mapping, cellProcessors);
    } catch (Exception e) {
        LOG.error("Error parsing record", e);
        throw e;
    }
}

Here are my mapping and cell processors:

String[] mapping = {"column1", "column2"};
CellProcessor[] cellProcessors = {null, null};

This works for the file:

column1, column2
1,2

But it fails for this one (where I want to ignore column3):

column1, column2, column3
1,2,3

score 0 · Accepted Answer

If you have no access to the header, or you are parsing very large files, you can do the following.

Simply extend CsvBeanReader:

import java.io.Reader;
import java.util.Arrays;
import java.util.List;

import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.prefs.CsvPreference;

public class FlexibleCsvBeanReader extends CsvBeanReader {

    public FlexibleCsvBeanReader(final Reader reader, final CsvPreference preferences) {
        super(reader, preferences);
    }

    @Override
    protected List<Object> executeProcessors(final List<Object> processedColumns, final CellProcessor[] processors) {
        // processors.length must equal columnSize (the real column count of the current CSV row)
        final int columnSize = getColumns().size();
        // pad the array with null processors for the extra trailing columns;
        // a longer array is passed through unchanged so the default size check still applies
        final CellProcessor[] newProcessors =
                processors.length < columnSize ? Arrays.copyOf(processors, columnSize) : processors;
        // do the default processing
        return super.executeProcessors(processedColumns, newProcessors);
    }
}

As long as the new columns are at the end, you can leave the name mapping as it is. If you think that is bad practice, you have to override the read methods too.
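The resizing step can also be tried on its own, without SuperCSV on the classpath. The following is only a minimal sketch: `ArrayPadding` and `padToLength` are hypothetical names, not part of SuperCSV, and plain strings stand in for the name mapping. If you do override the read methods, the same padding could be applied to the name mapping, assuming your SuperCSV version skips null entries in it.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of SuperCSV): pads an array with trailing nulls. */
final class ArrayPadding {

    private ArrayPadding() {
        // static helper, no instances
    }

    /**
     * Returns a copy of source extended with nulls up to the given length.
     * If source is already at least that long, it is returned unchanged.
     */
    static <T> T[] padToLength(final T[] source, final int length) {
        if (source.length >= length) {
            return source;
        }
        // Arrays.copyOf fills the new trailing slots with null
        return Arrays.copyOf(source, length);
    }
}
```

For the files in the question, `ArrayPadding.padToLength(mapping, 3)` would give a three-element array whose last entry is null, while the original two-element array stays untouched.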

Since this work is repeated for every row and can add up, you could also cache the resized array. However, that only makes sense if you apply the same CellProcessor array to every row.
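One way the caching could look is sketched below. `PaddedArrayCache` is a hypothetical class, not something SuperCSV provides. It assumes that the caller passes the same array instance for every row and does not modify it in between. It is also not thread-safe.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of SuperCSV): remembers the last padded copy of an array. */
final class PaddedArrayCache<T> {

    private T[] lastSource; // array instance most recently padded
    private T[] lastPadded; // padded copy of lastSource

    /** Returns source padded with nulls to the given length, reusing the previous copy when possible. */
    T[] get(final T[] source, final int length) {
        if (source.length >= length) {
            return source; // nothing to pad
        }
        // rebuild only when a different array instance or a different target length is requested
        if (lastSource != source || lastPadded.length != length) {
            lastSource = source;
            lastPadded = Arrays.copyOf(source, length);
        }
        return lastPadded;
    }
}
```

Inside an overridden executeProcessors, a field of type `PaddedArrayCache<CellProcessor>` could then replace the per-row array copy. The cache compares by identity, so creating a fresh processor array for each row would defeat it.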
