perl - Open the latest log file and print lines after a certain timestamp

Question

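I am writing a Perl script and need to capture some lines from a garbage collection log and write them to a file.

The log is on a remote host, and I am connecting to it with the Net::OpenSSH module.

I need to read the latest log file available.

In the shell, I can find the latest log with the following commands: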

cd builds/5.7.1/5.7.1.126WRF_B/jboss-4.2.3/bin
ls -lat | grep '.log$' | tail -1

This returns the latest log:

-rw-r--r--   1 load     other    2406173 Jul 11 11:53 18156.stdout.log

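So in Perl I would like to write something that finds that log, opens it, and reads it.

Once I have that log file, I want to print all lines with a timestamp greater than a specified time. The specified time is the timestamp of the latest log message minus the variable $Runtime.

The last messages in the garbage collection log look like this: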

                                      ...

73868.629: [GC [PSYoungGen: 941984K->14720K(985216K)] 2118109K->1191269K(3065984K), 0.2593295 secs] [Times: user=0.62 sys=0.00, real=0.26 secs]
73873.053: [GC [PSYoungGen: 945582K->12162K(989248K)] 2122231K->1189934K(3070016K), 0.2329005 secs] [Times: user=0.60 sys=0.01, real=0.23 secs]

So if $Runtime has a value of 120 seconds, I need to print all lines from timestamp (73873.053 - 120) seconds onward.

In the end, my script will look something like this...

open GARB, ">", "./report/archive/test-$now/GC.txt" or die "Unable to create file: $!";

my $ssh2 = Net::OpenSSH->new(
  $pathHost,
  user => $pathUser,
  password => $pathPassword
);
$ssh2->error and die "Couldn't establish SSH connection: ". $ssh2->error;

# Something to find and open the log file.
print GARB #Something to return certain lines.
close GARB;

I think this is somewhat similar to this question, but I can't figure out how to adapt it to what I'm looking for. Any help is greatly appreciated!

score 2 · Accepted Answer

Find the latest file and feed it to perl:

 LOGFILE=`ls -t1 "$DIR" | grep '\.log$' | head -1`
 if [ -z "$LOGFILE" ]; then
   echo "$0: No log file found - exiting"
   exit 1
 fi

 # ls prints bare file names, so prepend the directory
 perl myscript.pl "$DIR/$LOGFILE"

The pipeline on the first line lists the files in the directory by name only, one per line, newest first. It then filters for log files and returns only the first one.
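If you would rather not shell out, the same selection can be done in Perl on the machine where the logs live. This is only a sketch: the helper name `newest_log` is made up for illustration, and it assumes the directory path contains no whitespace or glob metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper: return the most recently modified *.log file in $dir,
# or undef if there is none.
sub newest_log {
    my ($dir) = @_;
    my @logs = grep { -f } glob("$dir/*.log");
    return undef unless @logs;

    # Sort by modification time (element 9 of stat), newest first.
    my %mtime = map { $_ => (stat $_)[9] } @logs;
    my ($newest) = sort { $mtime{$b} <=> $mtime{$a} } @logs;
    return $newest;
}

my $log = newest_log('.');
print defined $log ? "$log\n" : "No log file found\n";
```

Sorting on the modification time mirrors what `ls -t` does, without having to parse its output.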

I don't know how to convert your timestamps into something I can understand, do math on, and compare. But in general:

my $threshold_ts = $time_specified - $offset;
while (<>) {
  # The timestamp is the first whitespace-delimited field on the line
  my ($line_ts) = split(/\s/, $_, 2);
  print if compare_time_stamps($line_ts, $threshold_ts);
}

Writing the threshold arithmetic and the comparison is left as an exercise for the reader.
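For the GC log shown in the question, the timestamps look like plain seconds with a fractional part, so the comparison can probably be numeric. A minimal sketch under that assumption; the helper names `line_timestamp` and `lines_after` are hypothetical, and the sample lines are abbreviated stand-ins for real log lines.

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading "seconds.millis" timestamp of a
# GC log line, or undef if the line does not start with one.
sub line_timestamp {
    my ($line) = @_;
    return $line =~ /^(\d+(?:\.\d+)?):/ ? $1 : undef;
}

# Hypothetical helper: keep the lines whose timestamp is at or after
# (latest timestamp - $offset seconds).
sub lines_after {
    my ($offset, @lines) = @_;

    # The latest timestamp is taken from the last line that has one.
    my ($latest) = grep { defined } map { line_timestamp($_) } reverse @lines;
    return () unless defined $latest;

    my $threshold = $latest - $offset;
    return grep {
        my $ts = line_timestamp($_);
        defined $ts && $ts >= $threshold;
    } @lines;
}

# Abbreviated sample data, not real log content.
my @log = (
    "73700.100: [GC sample]\n",
    "73868.629: [GC sample]\n",
    "73873.053: [GC sample]\n",
);
my @recent = lines_after(120, @log);
print @recent;   # the two lines within 120 seconds of the last one
```

This holds the whole log in memory, which is fine for a sketch but may not suit very large files.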

score 2 · Accepted Answer

I think the Net::OpenSSH page gives a pretty good baseline for this:

my ($rout, $pid) = $ssh->pipe_out("cat /tmp/foo") or
  die "pipe_out method failed: " . $ssh->error;

while (<$rout>) { print }
close $rout;

But instead of printing everything, you want to discard the lines outside your window:

my ($rout, $pid) = $ssh->pipe_out("cat /tmp/foo") or
  die "pipe_out method failed: " . $ssh->error;

while ( defined( my $line = <$rout> ) ) {
    # The timestamp is everything before the first ':'
    my $ts = substr( $line, 0, index( $line, ':' ));
    next if $ts < $start;               # discard lines before the window
    last if $ts > $start + $duration;   # stop once past the window
    print $line;
}
close $rout;

score 1 · Accepted Answer

Here is an untested approach. I have never used Net::OpenSSH, so there may be a better way. I'm not even sure it works. What does work is the parsing part, which I tested.

use strict; use warnings;
use Net::OpenSSH;

# $pathHost, $pathUser and $pathPassword come from the rest of your script
my $Runtime = 120;
my $now = time;
my $dir = 'builds/5.7.1/5.7.1.126WRF_B/jboss-4.2.3/bin';
open my $garb, '>',
  "./report/archive/test-$now/GC.txt" or die "Unable to create file: $!";
my $ssh2 = Net::OpenSSH->new(
  $pathHost,
  user => $pathUser,
  password => $pathPassword
);
$ssh2->error and die "Couldn't establish SSH connection: ". $ssh2->error;

# Find the newest log file: ls -t sorts newest first, so take the first match
my $fileCapture = $ssh2->capture(
  "ls -lat $dir | " . q~grep '\.log$' | head -1~
);
$ssh2->error and die "Couldn't list log files: ". $ssh2->error;
my ($filename) = $fileCapture =~ m/(\S+)\s*$/ # The file name is the last field
  or die "No log file found";

# Find the time of the last log line
my $latestTimeCapture = $ssh2->capture("tail -n 1 $dir/$filename");
my ($latest) = $latestTimeCapture =~ m/^([\d.]+):/
  or die "No timestamp on the last log line";
my $logTime = $latest - $Runtime;

# pipe_out returns a filehandle that reads the remote command's output
my ($in, $pid) = $ssh2->pipe_out("cat $dir/$filename")
  or die "pipe_out method failed: " . $ssh2->error;
while (<$in>) {
  # Keep only lines whose timestamp is later than the threshold
  if (m/^([\d.]+):/ && $1 > $logTime) {
    print $garb $_; # Assume the \n is still in there
  }
}
close $in;

waitpid($pid, 0);

close $garb;

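It uses an ls pipeline like yours with the capture method to find the file. Then it opens a pipe over the SSH connection to read that file. $in is a filehandle for that pipe that you can read from.

Because we process the file line by line, we first need to grab the last line to get the latest timestamp. That is done with tail, using the capture method.

Once we have that, we read from the pipe line by line. This is now a simple regex (the same one used above). We grab the timestamp and compare it with the time we set earlier (the latest time minus the 120 seconds). If it is higher, we print the line to the output filehandle.

The documentation says you need to call waitpid on the $pid returned along with the pipe to reap the subprocess, so we do that before closing the output file.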
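The file-name step can be checked without a remote host. A small sketch that uses the `ls -l` line from the question as sample input; the helper name `filename_from_ls` is hypothetical, and it assumes file names contain no whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the file name (the last whitespace-free field)
# out of one line of `ls -l` output. Assumes names without spaces.
sub filename_from_ls {
    my ($ls_line) = @_;
    return undef unless defined $ls_line;
    return $ls_line =~ /(\S+)\s*$/ ? $1 : undef;
}

my $sample = "-rw-r--r--   1 load     other    2406173 Jul 11 11:53 18156.stdout.log\n";
my $name = filename_from_ls($sample);
print "$name\n";   # prints 18156.stdout.log
```

Anchoring the match on the last field avoids capturing the size and date columns along with the name.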

score 1 · Accepted Answer

You need to either keep an accumulator containing all the lines (more memory) or iterate through the log more than once (more time).

With an accumulator:

my @accumulated_lines;
while (<$log_fh>) {
    push @accumulated_lines, $_;

    # Your processing to get $Runtime goes here...

    if ($Runtime > $TOO_BIG) {
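        my ($current_timestamp) = /^(\d+(?:\.\d*)?)/;
        my $start_timestamp = $current_timestamp - $Runtime;

        for my $previous_line (@accumulated_lines) {
            # Match against the stored line, not the current $_
            my ($previous_timestamp) = $previous_line =~ /^(\d+(?:\.\d*)?)/;
            next unless defined $previous_timestamp;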
            next unless $previous_timestamp <= $current_timestamp;
            next unless $previous_timestamp >= $start_timestamp;
            print $previous_line;
        }
    }
}

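Alternatively, you can iterate through the log twice. This is similar, but without the nested loop over accumulated lines. I assumed there could be more than one of these spans in the log.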

my @report_spans;
while (<$log_fh>) {
    # Your processing to get $Runtime goes here...

    if ($Runtime > $TOO_BIG) {
        my ($current_timestamp) = /^(\d+(?:\.\d*)?)/;
        my $start_timestamp = $current_timestamp - $Runtime;

        push @report_spans, [ $start_timestamp, $current_timestamp ];
    }
}

# Don't bother continuing if there's nothing to report
exit 0 unless @report_spans;

# Start over
seek $log_fh, 0, 0;

while (<$log_fh>) {
    my ($previous_timestamp) = /^(\d+(?:\.\d*)?)/;
    next unless defined $previous_timestamp; # skip lines without a timestamp
    SPAN: for my $span (@report_spans) {
        my ($start_timestamp, $current_timestamp) = @$span;

        next unless $previous_timestamp <= $current_timestamp;
        next unless $previous_timestamp >= $start_timestamp;
        print; # same as print $_;

        last SPAN; # don't print out the line more than once, if that's even possible
    }
}

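If spans can overlap, the latter has the advantage of not showing the same log line twice. If there are no overlapping spans, you can optimize the first version by resetting the accumulator each time you output:

@accumulated_lines = ();

This saves memory.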

score 0 · Accepted Answer

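Use SFTP to access the remote file system. You can use Net::SFTP::Foreign (standalone or via Net::OpenSSH).

This lets you list the contents of the remote file system, pick the file you want to process, open it, and work with it like a local file.

The only tricky thing you need to do is read the lines backwards, for example by reading chunks of the file from the end and splitting them into lines.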
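The chunked backwards read could look roughly like the sketch below. It is shown against a local in-memory filehandle; whether seek and read behave the same way on a remote SFTP handle is an assumption you would need to verify. The helper name `tail_while` and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect lines from the end of a seekable filehandle,
# reading fixed-size chunks backwards, until $keep->($line) returns false.
# Returns the kept lines in file order.
sub tail_while {
    my ($fh, $keep, $chunk_size) = @_;
    $chunk_size ||= 4096;

    seek $fh, 0, 2 or die "seek failed: $!";   # 2 = SEEK_END
    my $pos  = tell $fh;
    my $tail = '';   # possibly partial first line of the text read so far
    my @kept;

    while ($pos > 0) {
        my $len = $pos < $chunk_size ? $pos : $chunk_size;
        $pos -= $len;
        seek $fh, $pos, 0 or die "seek failed: $!";
        defined read($fh, my $chunk, $len) or die "read failed: $!";

        # Prepend the chunk to the leftover text. Unless we have reached the
        # start of the file, the first piece may be a partial line, so it is
        # held back for the next round.
        my @lines = split /^/, $chunk . $tail;
        $tail = $pos > 0 ? shift @lines : '';

        for my $line (reverse @lines) {
            return @kept unless $keep->($line);
            unshift @kept, $line;
        }
    }
    return @kept;
}

# Made-up sample data standing in for a log file.
my $data = "100.000: a\n200.000: b\n300.000: c\n400.000: d\n";
open my $fh, '<', \$data or die "open failed: $!";
my @recent = tail_while($fh, sub { $_[0] =~ /^([\d.]+):/ && $1 >= 250 }, 16);
print @recent;   # the lines stamped 300.000 and 400.000
```

Reading from the end means only the tail of a large log has to be transferred, at the cost of a little bookkeeping for lines that straddle chunk boundaries.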
