bash - Printing columns from a string in bash

Question

Updated question: OK, I have a file containing lines like these:

44:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02
45:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02
1:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05
2:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05

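The number in the first column runs from 1 to x (45 in this case) and then starts over at 1. I want to move some of the columns to separate files. The indices of the columns to move are stored in the variable/array $selected_columns (2, 5 and 8 in this case), and the number of columns to move is stored in $number_of_columns (3 in this case).

I then want to create 45 files: one with the selected columns from every 1:) line, another with the selected columns from every 2:) line, and so on. Both the number of columns and the range 1 to x will vary, so I would like this to be as general as possible. The value of x is always known, and the columns to extract are chosen by the user.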

Original question:

I have a string fetched by egrep, and I want to print some of its columns (words). The positions (column indices) are known in a list in my bash script. At the moment it looks like this:

line=$(egrep " ${i}:\)" $1)

for ((j=1; j<=$number_of_columns; j++))
do
    awk $line -v current_column=${selected_columns[$j]} '{printf $(current_column)}' > "history_files/history${i}"
done

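number_of_columns is the number of columns to print, and selected_columns holds the corresponding indices of those columns. For example, with number_of_columns = 3 and selected_columns = [2 5 8], I want to print words 2, 5 and 8 of the string line to the file history${i}.

I'm not sure what is wrong; I arrived at this by trial and error. The current error is awk: cannot open 0.000E+00 (No such file or directory).

Any help is appreciated!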

score 3 · Accepted Answer

I think you need to change the awk line to:

echo $line | awk -v current_column=${selected_columns[$j]} ...
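A fuller sketch of that change, under a few assumptions: extract_fields is a hypothetical helper name, the fields are written tab-separated on a single line, and the column numbers are passed as ordinary arguments. Passing them this way also sidesteps the fact that bash arrays are zero-based, so a loop that starts at j=1 and reads ${selected_columns[$j]} would skip the first entry.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the chosen whitespace-separated fields of one line.
# Usage: extract_fields "<line>" 2 5 8
extract_fields() {
    local line=$1
    shift
    # Feed the line to awk on stdin and pass the field numbers
    # as one space-separated string.
    printf '%s\n' "$line" | awk -v cols="$*" '
        BEGIN { n = split(cols, c, " ") }
        {
            for (k = 1; k <= n; k++)
                printf "%s%s", (k > 1 ? "\t" : ""), $(c[k])
            print ""
        }'
}

selected_columns=(2 5 8)
line=' 1:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05'
extract_fields "$line" "${selected_columns[@]}"
```

Because the line arrives on standard input, awk no longer tries to interpret its words as a program or as file names.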

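For the updated question, assuming the columns are in the array $selected_columns: in the sample file the columns are separated by several adjacent spaces. If that is not the case in the original file, you can omit the sed before the grep.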

columns=`echo ${selected_columns[*]} | sed 's/ /,/g'`
for i in `seq 45`; do
    sed -e 's/  */ /g' file | grep "^$i:)" | cut -d' ' -f $columns >file-$i
done
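The same idea can be sketched with the comma-separated list built by the shell itself instead of sed. The helper names join_by_comma and pick_columns are made up for this illustration, and tr -s is swapped in for the sed call to squeeze the runs of spaces; it assumes the lines do not start with a space.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join all arguments with commas, e.g. "2,5,8".
join_by_comma() {
    local IFS=,
    printf '%s\n' "$*"
}

# Hypothetical helper: squeeze runs of spaces on stdin, then cut the
# comma-separated list of fields given as the first argument.
pick_columns() {
    tr -s ' ' | cut -d' ' -f "$1"
}

selected_columns=(2 5 8)
columns=$(join_by_comma "${selected_columns[@]}")

printf '%s\n' '1:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05' |
    pick_columns "$columns"
```

Setting IFS locally inside the function keeps the change from leaking into the rest of the script.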

score 1 · Accepted Answer

In:

awk $line -v ...

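$line holds the output of grep, which is probably not what awk expects to see on its command line: awk takes those words as its program and input file names, hence the attempt to open 0.000E+00. Also, this: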

for ((j=1; j<=$number_of_columns; j++))
do
    anything > "history_files/history${i}"
done

overwrites the history file on every pass through the loop. I can't tell what you really wanted to do there.
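If the intent was to collect the output of every iteration in one file, one option is to redirect the loop as a whole rather than each command inside it. This is only a sketch: out_file is a hypothetical temporary path, and the loop body is a stand-in for the real command.

```shell
#!/usr/bin/env bash
# out_file is a hypothetical path used only for this illustration.
out_file=$(mktemp)

for word in alpha beta gamma; do
    printf '%s\n' "$word"
done > "$out_file"      # the file is opened once, so nothing is overwritten

cat "$out_file"
```

Using >> inside the loop would also keep earlier output, but it would keep output from previous runs of the script as well.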

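There are several other problems with the script, though. You said: "For example, with number_of_columns = 3 and selected_columns = [2 5 8], I want to print words 2, 5 and 8 of the string line to the file history${i}."

That is trivial to do entirely in awk, and there is no need to run grep outside of awk either, so you can do the whole thing like this: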

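# "${selected_columns[*]}" joins the bash array into one space-separated string;
# [)] matches a literal ")" without relying on how -v handles backslashes.
awk -v pat=" ${i}:[)]" -v selected_columns="${selected_columns[*]}" '

BEGIN { number_of_columns = split(selected_columns,selected_columnsA) }

$0 ~ pat {
    sep=""
    for (j=1;j<=number_of_columns;j++) {
        current_column = selected_columnsA[j]
        # $current_column is the field whose number is held in current_column
        printf "%s%s",sep,$current_column
        sep = "\t"
    }
    print ""
}
' "$1" > "history_files/history${i}"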

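If that doesn't work for you, let's fix THAT rather than trying to repair the original script. It sounds like you have a loop wrapped around the above, and that could be part of the awk script too.

Edit based on the updated OP:

I've added a lot of comments, but let me know if you have any questions.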

$ cat file
44:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02
45:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02
1:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05
2:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05
$
$ cat tst.sh
selected_columns=(2 5 8)

selCols="${selected_columns[@]}"

awk -v selCols="$selCols" '

BEGIN { # Executed before the first line of the input file is read

    # Split the string of selected column numbers, selCols, into
    # an array selColsA where selColsA[1] has the value of the
    # first space-separated sub-string of selCols (i.e. the number
    # of the first column to print). Note that we don't need the
    # number of columns passed into the script, since the count of
    # elements put into the array is returned by the split()
    # builtin function.
    numCols = split(selCols,selColsA)
}

{ # Executed once for every line of the input file

    # Create a numeric suffix like "45" from the first column
    # in the current line of the input file, e.g. "45:)" by
    # just getting rid of all non-digit characters.
    sfx = $1
    gsub(/[^[:digit:]]/,"",sfx)

    # Create the name of the output file by attaching that
    # numeric suffix to the base value for all output files.
    #histfile = "history_files/history" sfx
    histfile = "tmp" sfx


    # Loop through every column we want printed. selColsA[<index>]
    # gives us a column number which we can then use to access the
    # columns of the current line. Awk uses the builtin variable $0
    # to hold the current line, and it automatically splits it so
    # that $1 holds the first column, $2 is the second, etc. So
    # if selColsA[1] has the value 3, then $(selColsA[1]) would be
    # the value of the 3rd column of the current input line.
    sep=""
    for (i=1;i<=numCols;i++) {
        curCol = selColsA[i]

        # Print the current column, prefixed by a tab for all but
        # the first column, and without a terminating newline so the
        # next column gets appended to the end of the current output line.
        # Note that in awk "> file" has different semantics from shell
        # and opens the file for writing the first time the line is hit
        # like "> file" in shell, but then appends to it every time it's
        # hit afterwards, like ">> file" in shell.
        printf "%s%s",sep,$curCol > histfile
        sep = "\t"
    }
    # Add a newline to the end of the current output line
    print "" > histfile
}

' "$1"
$
$ ./tst.sh file
$
$ cat tmp1
3.593E-02       2.780E+02       1.000E+05
$ cat tmp2
3.593E-02       2.780E+02       1.000E+05
$ cat tmp44
2.884E-02       2.780E+02       9.990E+02
$ cat tmp45
2.884E-02       2.780E+02       9.990E+02

By the way, I used the words "column" and "line" above only to make this easier to follow; for reference, the awk terms are actually "field" and "record".
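To make the terminology concrete, here is a small sketch that prints awk's record number (NR), the number of fields in the record (NF) and the first field ($1) for two of the sample lines. The function name describe_records is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report NR, NF and $1 for every record on stdin.
describe_records() {
    awk '{ printf "record %d has %d fields, first field: %s\n", NR, NF, $1 }'
}

printf '%s\n' \
    '44:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02' \
    '1:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05' |
    describe_records
```

With the default settings a record is one input line and fields are separated by runs of whitespace, which is why the repeated spaces in the sample need no special handling.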

score 0 · Accepted Answer

I think you can use cut to do what you're trying to do, i.e.

echo "$line" | cut -d" " -f2,5,8 > "history_files/history${i}"

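-d sets the delimiter; I used a space in my test, hence " ". Note that cut treats every single space as a separator, so runs of spaces have to be squeezed first (for example with tr -s " ").

Hope this helps.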
