macos - Calling Gnuplot from GAWK and Bash scripts only plots the first plot

Question

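All right, here's the deal. I am starting work on my undergraduate thesis in computational materials science, and I am trying to put together some scripts to help me prepare for data analysis.

I am preparing a GAWK script that takes some data (arranged in 4 columns), picks two of the columns, and plots them in GNUPLOT. To do this, I read in a data file that contains multiple timesteps and their associated data, and split it into a separate .dat file for each timestep.

From there, I generate a basic GNUPLOT input script and plot each timestep that occurs in the data file.

The problem is that, for some reason, all of the generated plots are exactly the same plot (in this case always the first timestep), yet each one is saved under the correct timestep name.

I have already traced every variable and filename through the script, and concluded that the problem lies in how GNUPLOT is being called from the script. I took out the system command and wrote a short bash script that calls gnuplot from a for loop: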

#!/bin/bash
for file in ./*gnu
do
   gnuplot $file
done

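I still get the same problem: all the plots are identical. I then ran the command gnuplot *gnu from the command line in the directory containing the .gnu files, and it worked.

I guess I am just wondering whether there is a buffer that needs to be flushed, or whether I am simply missing something?

The GAWK script is given below. I am still new to this, so if you want to comment on the script with constructive criticism, I would greatly appreciate that as well.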

#!/opt/local/bin/gawk -v inputf=$1 -f                                                   

# Write gnuplot files and plot RDF data                                                 
function plot_rdf(timestep, Load_RDF_dat)
{
# Set number of digits in filenames to 6 so data is organized                           
    if (timestep < 10){
        pad_timestep="00000"timestep;
    }
    else if (timestep < 100){
        pad_timestep="0000"timestep;
    }
    else if (timestep < 1000){
        pad_timestep="000"timestep;
    }
    else if (timestep < 10000){
        pad_timestep="00"timestep;
    }
    else if (timestep < 100000){
        pad_timestep="0"timestep;
    }
    else{
        pad_timestep=timestep;
    }

# Give output filenames                                                                 
       gnu_file="plot_RDF_"pad_timestep".gnu";
       png_file="RDF_"pad_timestep".png";

# Create input files for gnuplot                                                        
       print "set output \""png_file"\"" >> gnu_file;
       print "set terminal png" >> gnu_file;
       print "plot './"Load_RDF_dat"' u 1:2" >> gnu_file;
       close(gnu_file);
       system("gnuplot "gnu_file);
}


# Main part of script                                                                   
{
# Parse the RDF data and save it to GNUPLOT readable files                              
    while(getline < inputf){
       if ($1 == "#"){
           # skips the three commented header lines                                     
           next;
       }
       else if (NF == 2){
           timestep=$1;
           bin_num=$2;
           print "Reading timestep "timestep;
           RDF_dat="RDF_"timestep".dat";
           next;
       }
       else if (NF == 4){
           print $2" "$3 >> RDF_dat;
           if ($1 == bin_num){
               plot_rdf(timestep, RDF_dat);
               close(RDF_dat);
           }
           next;
       }
    }
    close(inputf);
    close(RDF_dat);
 }

Here is a snippet of the data file I am reading:

# Time-averaged data for fix rdf
# TimeStep Number-of-rows
# Row c_allrdf[1] c_allrdf[2] c_allrdf[3]
500 100
1 0.005 0 0
2 0.015 0 0
3 0.025 0 0
4 0.035 0 0
5 0.045 0 0
6 0.055 1.16597 0.00133333
7 0.065 2.08865 0.00466667
8 0.075 1.56958 0.008
9 0.085 0.733433 0.01
10 0.095 0.587288 0.012
600 100
1 0.005 0 0
2 0.015 0 0
3 0.025 2.79219 0.000666667
4 0.035 2.86766 0.002
5 0.045 0 0.002
6 0.055 0.582985 0.00266667
7 0.065 2.08865 0.006
8 0.075 0.62783 0.00733333
9 0.085 0.488955 0.00866667
10 0.095 1.17458 0.0126667

There are normally 100 sets of data in each timestep section, but I shortened it here to give you the idea.

score 0 · Accepted Answer

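As mgilson points out, you may be failing to call plot_rdf because no row satisfies $1 == bin_num. Note that if you invoke awk with the data file name on the command line, you can simply use awk's built-in file-reading loop, as illustrated in the following rewrite of your awk program. Also note the use of > instead of >> to initialize each file rather than append to an earlier version of it, and of

pad_timestep = sprintf("%06d", timestep);

instead of the cascaded if statements.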
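The zero-padding done by sprintf can be tried in isolation. This is a minimal sketch, assuming three made-up timestep values (5, 500 and 123456) that do not come from the question's data; the variable name padded is likewise only illustrative.

```shell
# Zero-pad a few sample timesteps to six digits with awk's sprintf.
# The input values are illustrative, not taken from the real data file.
padded=$(printf '%s\n' 5 500 123456 | awk '{ print sprintf("%06d", $1) }')
printf '%s\n' "$padded"
```

A single format string covers every width up to six digits, which is what lets it replace the chain of if/else if comparisons.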

In the following, I put the program into the file so-gnuplot-awk and the data, as is, into the file data-so-gnuplot, and invoked the program with:

awk -f so-gnuplot-awk data-so-gnuplot

The program:

# Parse the RDF data and save it to GNUPLOT readable files
BEGIN { dopen=0 }

NF==2 {
    if (dopen) plot_rdf(timestep, RDF_dat);
    timestep = $1;
    print "Reading timestep "timestep;
    RDF_dat="RDF_"timestep".dat";
    printf "" > RDF_dat     # Init empty file
    dopen = 1;
}

NF == 4 {  if (dopen) print $2" "$3 >> RDF_dat; }

# Write gnuplot files and plot RDF data
function plot_rdf(timestep, Load_RDF_dat) {
# Set output filenames & create gnuplot command file
    pad_timestep = sprintf("%06d", timestep);
    gnu_file="plot_RDF_"pad_timestep".gnu";
    png_file="RDF_"pad_timestep".png";
    print "set output \""png_file"\"" > gnu_file; # Use > first
    print "set terminal png" >> gnu_file;
    print "plot './"Load_RDF_dat"' u 1:2" >> gnu_file;
    close(gnu_file);
    close(RDF_dat);
    print "Plotting with "RDF_dat" into "png_file
    system("gnuplot "gnu_file);
    dopen=0
}

END { if (dopen) plot_rdf(timestep, RDF_dat); }
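The splitting step of the program above can be checked without gnuplot installed. The following is only a sketch under stated assumptions: the sample data is made up (two timesteps of two rows each), the names workdir, sample.dat, dir and out are hypothetical, and the plotting call is left out entirely.

```shell
# Sketch only: split a tiny made-up sample into one file per timestep.
workdir=$(mktemp -d)

# Hypothetical sample data: one comment line, then two timesteps of two rows each.
printf '%s\n' \
    '# Time-averaged data for fix rdf' \
    '500 2' '1 0.005 0 0' '2 0.015 1.2 0' \
    '600 2' '1 0.005 0 0' '2 0.015 2.8 0' > "$workdir/sample.dat"

# awk reads sample.dat line by line on its own; no getline loop is needed.
awk -v dir="$workdir" '
    NF == 2 {                          # header line: timestep and row count
        if (out != "") close(out)      # finish the previous timestep file
        out = dir "/RDF_" $1 ".dat"
        printf "" > out                # create or truncate the new file
    }
    NF == 4 && out != "" { print $2, $3 > out }   # ">" keeps appending while the file stays open
' "$workdir/sample.dat"

rows_500=$(awk 'END { print NR }' "$workdir/RDF_500.dat")
first_600=$(head -n 1 "$workdir/RDF_600.dat")
echo "RDF_500.dat rows: $rows_500; first row of RDF_600.dat: $first_600"
rm -rf "$workdir"
```

In awk, > truncates a file only when it is first opened, so rerunning the sketch starts each .dat file from scratch instead of appending to output from an earlier run.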

score 0 · Accepted Answer

I'm not sure whether this will answer your question, but I modified your data file a little and it seemed to work fine.

Here is my modified version of your data file:

# Time-averaged data for fix rdf
# TimeStep Number-of-rows
# Row c_allrdf[1] c_allrdf[2] c_allrdf[3]
500 100
1 0.005 0 0
2 0.015 0 0
3 0.025 0 0
4 0.035 0 0
5 0.045 0 0
6 0.055 1.16597 0.00133333
7 0.065 2.08865 0.00466667
8 0.075 1.56958 0.008
9 0.085 0.733433 0.01
10 0.095 0.587288 0.012
100 0.095 0.56 0.014     #<-added this line
600 100
1 0.005 0 0
2 0.015 0 0
3 0.025 2.79219 0.000666667
4 0.035 2.86766 0.002
5 0.045 0 0.002
6 0.055 0.582985 0.00266667
7 0.065 2.08865 0.006
8 0.075 0.62783 0.00733333
9 0.085 0.488955 0.00866667
10 0.095 1.17458 0.0126667
100 0.095 1.179 0.12      #<-added this line

These lines were needed to "trigger" the gnuplot plotting function, because of the following lines:

   if ($1 == bin_num){
       plot_rdf(timestep, RDF_dat);
       close(RDF_dat);
   }

since bin_num is taken from the second field of the "header" line (e.g. 600 100).
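To see why a shortened sample never reaches the plotting call, the trigger condition can be counted on its own. This is a minimal sketch, assuming a three-row excerpt of the question's data; the function name count_triggers is hypothetical, and the extra row is the one added to the first timestep in the modified data file.

```shell
# Truncated sample: the header claims 100 rows, but only 3 follow.
sample='500 100
1 0.005 0 0
2 0.015 0 0
3 0.025 0 0'

# Count how often the condition $1 == bin_num from the question would fire.
count_triggers() {
    awk 'NF == 2 { bin_num = $2; next }
         NF == 4 && $1 == bin_num { n++ }
         END { print n + 0 }'
}

truncated=$(printf '%s\n' "$sample" | count_triggers)
complete=$(printf '%s\n100 0.095 0.56 0.014\n' "$sample" | count_triggers)
echo "truncated: $truncated, with row 100: $complete"
```

With the truncated excerpt the count stays at zero, so plot_rdf would never be called; once a row numbered 100 is present, the condition fires once.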

I am not sure whether that is set up correctly in your full data file. Also, I called the script as:

gawk -f test.awk -v inputf=test.dat test.dat

For the most part I ignored the shebang entirely, since I have read that many systems have trouble splitting shebang lines properly.
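One way to sidestep shebang argument handling is a small shell wrapper that passes the file name explicitly. A shebang line is processed without shell expansion, so a $1 written there is not replaced by the script's argument. The following is a sketch rather than the asker's setup: show_input.awk and run_with_input are hypothetical names, plain awk stands in for gawk, and the awk program only echoes the variable to show that it arrived.

```shell
# Sketch only: pass the input file name to awk from a shell function
# instead of relying on "-v inputf=$1" inside a shebang line.
tmpdir=$(mktemp -d)

# Hypothetical stand-in for the real awk program; it just reports inputf.
cat > "$tmpdir/show_input.awk" <<'EOF'
BEGIN { print "inputf is " inputf }
EOF

# Here the shell expands "$1" before awk ever sees it.
run_with_input() {
    awk -v inputf="$1" -f "$tmpdir/show_input.awk"
}

result=$(run_with_input test.dat)
echo "$result"
rm -rf "$tmpdir"
```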

Finally, which version of gnuplot are you using? If you are using gnuplot 4.6, you could skip the gawk script altogether and replace it with a much simpler script, sparing yourself much of this pain.
