linux - Issue with reducing duplicate output from a log file search

Question

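This website has been very helpful as I get back into programming. I am trying to write a simple Perl script that analyzes Apache log files from a directory (multiple domains) and pulls the last 1000 lines of each log file. It extracts the IP addresses from the log files and compares them against a known block list of bot spammers.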

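So far the script works, except for one problem. Say the IP address 10.128.45.5 appears in two log files. The script analyzes each log file in turn and reduces the IPs to one PER log file, but what I am trying to do is narrow that down further: one entry per run of the script, regardless of whether the same IP appears in multiple log files.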

Here is the code I have so far. Sorry if it is a bit messy.

#!/usr/bin/perl

# Extract IP's from apache access logs for the last hour and matches with forum spam bot list.
# The fun work of Daniel Pearson

use strict;
use warnings;
use Socket;

# Declarations
my ($file,$list,@files,%ips,$match,$path,$sort);
my $timestamp = localtime(time);

# Check to see if matching file exists
$list ='list';

if (-e $list) {
# Delete the file so we can download a new one if it exists
print "File Exists!";
print "Deleting File $list\n";
unlink($list);
}
sleep(5);

system ("wget http://www.domain.com/list");
sleep(5);

my $dir = $ARGV[0] or die "Need to specify the log file directory\n";

opendir(DIR, "$dir");
@files = grep(/\.*$/,readdir(DIR));
closedir(DIR);

foreach my $file(@files) {
my $sum = 0;
if (-d $file) {
print "Skipping Directory $file\n";
}
else {

$path = "$dir$file";
open my $path, "-|", "/usr/bin/tail", "-1000", "$path" or die "could not start tail on $path: $!";

my %ips;


while (my $line = <$path>) {
chomp $line;
if ($line =~  m/(?!0+\.0+\.0+\.0+$)(([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5]))/g) {

my $ip = $1;

$ips{$ip} = $ip;
        }
}
}

foreach my $key (sort keys %ips) {
open ("files","$list");
while (my $sort = <files>) {
chomp $sort;
if ($key =~ $sort) {
open my $fh, '>>', 'banned.out';
print "Match Found we need to block it $key\n";
print $fh "$key:$timestamp\n";
close $fh;
        }
    }
}
}

Any advice would be appreciated.

score 0 · Accepted Answer

To accomplish the task:

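Move my %ips outside (above) the foreach my $file (@files) loop.
Move the foreach my $key ( sort keys %ips ) loop outside (below) the foreach my $file (@files) loop.

With the hash declared once, it accumulates IPs from every log file, and the block list check then runs once per unique IP instead of once per file.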
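The two moves described above can be sketched roughly as follows. This is a minimal illustration, not the asker's full script: the helper name collect_ips is hypothetical, the files are read directly rather than through tail, and the IPv4 pattern is a simplified stand-in for the stricter regex in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: gather the unique IPs seen across ALL given log files.
sub collect_ips {
    my @files = @_;
    my %ips;    # declared once, above the per-file loop

    foreach my $file (@files) {
        open my $fh, '<', $file or die "could not open $file: $!";
        while (my $line = <$fh>) {
            # Simplified IPv4 pattern (assumption); hash keys deduplicate for us.
            $ips{$1} = 1 while $line =~ /\b(\d{1,3}(?:\.\d{1,3}){3})\b/g;
        }
        close $fh;
    }

    my @unique = sort keys %ips;
    return @unique;
}

# The reporting loop sits below the per-file loop, so each IP is handled once.
foreach my $ip (collect_ips(@ARGV)) {
    print "$ip\n";
}
```

Because hash keys are unique, an address such as 10.128.45.5 that appears in several log files ends up as a single key, and the block list comparison can then be done inside that final loop.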
