performance - Bash read and parse file - loop performance

Question

I'm trying to read a file and parse it in bash. I need to use dd to convert it from EBCDIC to ASCII, then loop, reading X bytes at a time and writing each X-byte chunk as a line in a new file.

#!/bin/bash

# $1 = input file in EBCDIC
# $2 = row length
# $3 = output file

# convert to ASCII and replace NUL (^@) with ' '
dd conv=ascii if="$1" | sed 's/\x0/ /g' > "$3.tmp"
file=$(cat "$3.tmp")
sIndex=0    # substring offsets are 0-based
fIndex=$2

# remove any previous output file
rm -f "$3"
echo "filesize: ${#file}"

# loop, retrieving each fixed-size record and appending to a file
while true; do
    # append record to file
    echo "${file:sIndex:fIndex}" >> "$3"

    # increment index
    sIndex=$((sIndex+fIndex))

    # break once the next record would start at or past end of file
    if [ "$sIndex" -ge "${#file}" ]
    then
        break
    fi
done

# remove tmp
rm "$3.tmp"

Is there a way to speed up this whole process?

score 1 · Accepted Answer

Answering my own question: it turns out to be very simple with fold!

# $1 = EBCDIC input file
# $2 = file record length (e.g. 100)
# $3 = output file (non-delimited, row-separated file)

# dd : convert from EBCDIC to ASCII
# sed : replace NUL (^@) with space ' '
# fold : wrap input to specified width (record length)

dd conv=ascii if="$1" | sed 's/\x0/ /g' | fold -w "$2" > "$3"
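
To see the record-splitting step in isolation, here is a minimal sketch that leaves out the dd conversion. The function name split_records is hypothetical, and it swaps in tr for the sed substitution and adds fold's -b flag so that the width is counted in bytes rather than display columns (a tab or carriage return would otherwise change where the line wraps). Treat both as assumptions, not as part of the answer above.

```shell
#!/bin/bash

# split_records (hypothetical helper): read a byte stream on stdin and
# emit it as fixed-width records, one record per line.
# $1 = record length in bytes
split_records() {
    # tr : replace NUL bytes with spaces
    # fold -b -w N : wrap after every N bytes (not columns)
    tr '\0' ' ' | fold -b -w "$1"
}

# Example: split a 12-byte stream into 4-byte records
printf 'AAAABBBBCCCC' | split_records 4
```

In the full pipeline this would probably be used as dd conv=ascii if="$1" | split_records "$2" > "$3". Because the data streams through the pipe once, nothing has to be held in a shell variable or re-sliced on each iteration.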
