arrays - Skipping lines in an array, Perl

Question

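I'm relatively new to Perl, and I've come across this project that I'm having a bit of a hard time with. The goal of the project is to compare two csv files: one contains $name, $model, $version and the other contains $name2, $disk, $storage. In the end, the RESULT file should contain the matched lines with the information combined, like so: $name, $model, $version, $disk, $storage.

I've managed to do this, but my problem is that the program breaks when one of the elements is missing. When it encounters a line in the file that is missing an element, it stops at that line. How can I fix this? Are there any suggestions or ways to make it skip that line and continue?

Here's my code: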

open( TESTING, '>testing.csv' ); # Names will be printed to this during testing. only .net       ending names should appear
open( MISSING, '>Missing.csv' ); # Lines with missing name feilds will appear here.

#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#my (@array) =<FILE>;
my @hostname;    #stores names

#close FILE;
#***** TESTING TO SEE IF ANY OF THE LISTED ITEMS BEGIN WITH A COMMA AND DO NOT HAVE A   NAME.
#***** THESE OBJECTS ARE PLACED INTO THE MISSING ARRAY AND THEN PRINTED OUT IN A SEPERATE
#***** FILE.
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#test
if ( open( FILE, "file.txt" ) ) {

}
else {
  die " Cannot open file 1!\n:$!";

}

$count = 0;
$x     = 0;
while (<FILE>) {

  ( $name, $model, $version ) = split(",");    #parsing

  #print $name;
  chomp( $name, $model, $version );

  if ( ( $name =~ /^\s*$/ )
      && ( $model   =~ /^\s*$/ )
      && ( $version =~ /^\s*$/ ) )    #if all of the fields  are blank ( just a blank space)
  {

    #do nothing at all
  }
  elsif ( $name =~ /^\s*$/ ) {   #if name is a blank
    $name =~ s/^\s*/missing/g;
    print MISSING "$name,$model,$version\n";

    #$hostname[$count]=$name;
    #$count++;
  }
  elsif ( $model =~ /^\s*$/ ) {   #if model is blank
    $model =~ s/^\s*/missing/g;
    print MISSING"$name,$model,$version\n";
  }
  elsif ( $version =~ /^\s*$/ ) {   #if version is blank
    $version =~ s/^\s*/missing/g;
    print MISSING "$name,$model,$version\n";
  }

  # Searches for .net to appear in field "$name" if match, it places it into hostname array.
  if ( $name =~ /.net/ ) {

    $hostname[$count] = $name;
    $count++;
  }

#searches for a comma in the name feild, puts that into an array and prints the line into the missing file.
#probably won't have to use this, as I've found a better method to test all of the    feilds ( $name,$model,$version)
#and put those into the missing file. Hopefully it works.
#foreach $line (@array)
#{
#if($line =~ /^\,+/)
#{
#$line =~s/^\,*/missing,/g;
#$missing[$x]=$line;
#$x++;
#}
#}

}
close FILE;

for my $hostname (@hostname) {
  print TESTING $hostname . "\n";
}

#for my $missing(@missing)
#{
# print MISSING $missing;
#}
if ( open( FILE2, "file2.txt" ) ) {    #Run this if the open succeeds

  #open outfile and print starting header
  open( RESULT, '>resultfile.csv' );
  print RESULT ("name,Model,version,Disk, storage\n");
}
else {
  die " Cannot open file 2!\n:$!";
}
$count = 0;
while ( $hostname[$count] ne "" ) {
  while (<FILE>) {
    ( $name, $model, $version ) = split(",");    #parsing

    #print $name,"\n";

    if ( $name eq $hostname[$count] )    # I think this is the problem area.
    {
      print $name, "\n", $hostname[$count], "\n";

      #print RESULT"$name,$model,$version,";
      #open (FILE2,'C:\Users\hp-laptop\Desktop\file2.txt');
      #test
      if ( open( FILE2, "file2.txt" ) ) {

      }
      else {
        die " Cannot open file 2!\n:$!";

      }

      while (<FILE2>) {
        chomp;
        ( $name2, $Dcount, $vname ) = split(",");    #parsing

        if ( $name eq $name2 ) {
          chomp($version);
          print RESULT"$name,$model,$version,$Dcount,$vname\n";

        }

      }

    }

    $count++;
  }

  #open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
  #test
  if ( open( FILE, "file.txt" ) ) {

  }
  else {
    die " Cannot open file 1!\n:$!";

  }

}

close FILE;
close RESULT;
close FILE2;

score 2 · Accepted Answer

I think you want next, which lets you end the current iteration immediately and start the next one.

while (<FILE>) {
  chomp;    # remove the newline so the last field can really be empty
  ( $name, $model, $version ) = split(",");
  next unless( $name && $model && $version );
  ...;
}

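The condition you use depends on the values you will accept. In my example, I assume that all values need to be true. If they only need to be something other than the empty string, check the length instead: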

while (<FILE>) {
  chomp;    # remove the newline so the last field can really be empty
  ( $name, $model, $version ) = split(",");
  next unless( length($name) && length($model) && length($version) );
  ...;
}
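To make the difference between the two conditions concrete, here is a minimal sketch. The sample lines and the helper names keep_if_true and keep_if_nonempty are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical sample lines; the second one has a literal 0 as its version.
my @lines = ( "host1.net,T400,1.2\n", "host2.net,T500,0\n", ",T600,3.1\n" );

# Truth test: rejects undef, the empty string, and also the string "0".
sub keep_if_true {
    my ( $name, $model, $version ) = @_;
    return $name && $model && $version ? 1 : 0;
}

# Length test: rejects only undef and the empty string.
sub keep_if_nonempty {
    my ( $name, $model, $version ) = @_;
    my $filled = grep { defined($_) && length($_) } $name, $model, $version;
    return $filled == 3 ? 1 : 0;
}

for my $line (@lines) {
    chomp( my $copy = $line );
    my @fields = split /,/, $copy, -1;    # -1 keeps trailing empty fields
    printf "%-20s true=%d nonempty=%d\n",
        $copy, keep_if_true(@fields), keep_if_nonempty(@fields);
}
```

The line with version 0 is skipped by the truth test but kept by the length test, while the line with the empty name is skipped by both.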

If you know how to validate each field, you might have subroutines for those:

while (<FILE>) {
  chomp;    # remove the newline so the last field can really be empty
  ( $name, $model, $version ) = split(",");
  next unless( length($name) && is_valid_model($model) && length($version) );
  ...;
}

sub is_valid_model { ... }

Now you just need to decide how to integrate that into what you are already doing.
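One possible way to integrate it is sketched below. The validation rule inside is_valid_model (letters, digits, hyphen and underscore only) is purely an assumption, since the real rule for model names is not given, and partition_lines is a hypothetical helper that collects skipped lines so they could later be written to a file such as Missing.csv.

```perl
use strict;
use warnings;

# ASSUMPTION: a valid model is a non-empty run of letters, digits, '-' or '_'.
# Replace this with whatever rule actually applies to your data.
sub is_valid_model {
    my ($model) = @_;
    return defined $model && $model =~ /\A[A-Za-z0-9_-]+\z/ ? 1 : 0;
}

# Hypothetical helper: split the input into usable records and skipped lines.
sub partition_lines {
    my @lines = @_;
    my ( @kept, @skipped );
    for my $line (@lines) {
        chomp $line;
        my ( $name, $model, $version ) = split /,/, $line, -1;
        unless ( defined($name) && length($name)
            && is_valid_model($model)
            && defined($version) && length($version) )
        {
            push @skipped, $line;    # remember the bad line, then move on
            next;
        }
        push @kept, [ $name, $model, $version ];
    }
    return ( \@kept, \@skipped );
}

my ( $kept, $skipped ) =
    partition_lines( "a.net,T400,1\n", "b.net,,2\n", ",T500,3\n" );
print scalar(@$kept), " kept, ", scalar(@$skipped), " skipped\n";
```

Collecting the skipped lines rather than discarding them keeps the behaviour of the original script, which reports incomplete lines separately.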

score 2 · Accepted Answer

Start by adding use strict and use warnings to the top of your program, and declare all variables with my at their first point of use. This will reveal many simple mistakes that are otherwise hard to find.
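As a small illustration of the kind of mistake this catches, the sketch below misspells a variable name inside a string eval so that the error can be captured and printed instead of stopping the script. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $count = 0;

# The string eval is compiled under the same "use strict" that is in effect
# here, so the misspelled variable is reported as an error instead of
# silently creating a new global variable.
my $ok    = eval q{ $cuont++; 1 };
my $error = $ok ? '' : $@;

print "strict reported: $error" if $error;
print "count is still $count\n";
```

Without use strict, the typo would quietly increment a different variable and $count would never change, which is exactly the sort of bug that is hard to spot by reading the code.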

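You should also use the three-parameter form of open and lexical filehandles. The Perl idiom for checking for errors when opening a file is to add or die to the open call. An if statement with an empty block for the success path wastes space and hurts readability. An open call should look like this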

open my $fh, '>', 'myfile' or die "Unable to open file: $!";

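Finally, it is much safer to use a Perl module when you are handling CSV files, as there are many pitfalls in using a simple split /,/. The Text::CSV module has done all the work for you and is available on CPAN.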
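Two of those pitfalls can be shown with core Perl alone. The sample strings below are invented for the illustration.

```perl
use strict;
use warnings;

# Pitfall 1: a quoted field that contains a comma is cut in two.
my @quoted = split /,/, '"Acme, Inc.",T400,1.2';

# Pitfall 2: trailing empty fields are silently dropped ...
my @short = split /,/, 'host1.net,,';

# ... unless a negative limit is passed as the third argument.
my @full = split /,/, 'host1.net,,', -1;

printf "quoted: %d fields, short: %d fields, full: %d fields\n",
    scalar @quoted, scalar @short, scalar @full;
```

The second pitfall matters directly for this question: a line whose last fields are missing yields fewer elements than expected, so the later variables end up undefined rather than empty.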

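Your problem is that, having read to the end of the first file, you don't rewind or reopen it before reading from the same handle again in the second, nested loop. That means no more data will be read from that file, and the program behaves as if the file were empty.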
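The effect is easy to reproduce. This sketch uses an in-memory filehandle as a stand-in for file.txt, so the data and variable names are assumptions made for the demonstration.

```perl
use strict;
use warnings;

my $data = "a.net,T400,1\nb.net,T500,2\n";    # stand-in for file.txt
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @first  = <$fh>;    # reads everything up to the end of the file
my @second = <$fh>;    # the handle is already at EOF, so nothing is read

seek $fh, 0, 0 or die "Cannot rewind: $!";    # move back to the start
my @third = <$fh>;     # the lines are available again

printf "first=%d second=%d third=%d\n",
    scalar @first, scalar @second, scalar @third;
close $fh;
```

Rewinding with seek fixes the symptom, but it still means rereading the file for every name.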

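It is a bad strategy to read through the same file hundreds of times just to pair up corresponding records. If the file is of a reasonable size, you should build a data structure in memory to hold the information. A Perl hash is ideal, as it lets you look up the data corresponding to a given name instantly.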
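Stripped of all file handling, the idea looks roughly like this. The rows and the names @file1, @file2 and %by_name are hypothetical stand-ins for the parsed contents of the two input files.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for the two input files.
my @file1 = ( [ 'a.net', 'T400', '1.2' ], [ 'b.net', 'T500', '2.0' ] );
my @file2 = ( [ 'b.net', '4', '500GB' ], [ 'c.net', '2', '250GB' ] );

# Index the first file by name in a single pass.
my %by_name;
$by_name{ $_->[0] } = [@$_] for @file1;

# One pass over the second file, appending disk and storage to the
# record with the same name if there is one.
for my $row (@file2) {
    my ( $name, @rest ) = @$row;
    push @{ $by_name{$name} }, @rest if exists $by_name{$name};
}

# Only names present in both inputs end up with all five fields.
my @joined = grep { @$_ == 5 } map { $by_name{$_} } sort keys %by_name;
print join( ',', @$_ ), "\n" for @joined;
```

Each file is read exactly once, and every match is a single hash lookup instead of a scan of the whole file.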

I have written a revision of your code that demonstrates these points. Testing it is awkward because I have no sample data, but please let me know if you still have problems.

use strict;
use warnings;

use Text::CSV;

# One object for parsing the input, and one that appends a newline to
# every record it prints
my $csv = Text::CSV->new( { binary => 1 } );
my $out = Text::CSV->new( { binary => 1, eol => "\n" } );

my %data;

# Return the number of fields in a record that are defined and non-empty
sub filled {
  my ($line) = @_;
  return scalar grep { defined($_) && length($_) } @$line;
}

# Read the name, model and version from the first file. Write any records
# that don't have the full three fields to the "MISSING" file
#
open my $f1, '<', 'file.txt' or die qq(Cannot open file 1: $!);

open my $missing, '>', 'Missing.csv'
    or die qq(Unable to open "MISSING" file for output: $!);
    # Lines with missing name fields will appear here.

while ( my $line = $csv->getline($f1) ) {

  my $name = $line->[0];

  if ( filled($line) < 3 ) {
    $out->print($missing, $line);
  }
  else {
    $data{$name} = $line if $name =~ /\.net$/i;
  }
}

close $f1;
close $missing;

# Put a list of .net names found into the testing file
#
open my $testing, '>', 'testing.csv'
    or die qq(Unable to open "TESTING" file for output: $!);
    # Names will be printed to this during testing. Only ".net" ending names should appear

print $testing "$_\n" for sort keys %data;

close $testing;

# Read the name, disk and storage from the second file and check that the line
# contains all three fields. Remove the name field from the start and append
# to the data record with the matching name if it exists.
#
open my $f2, '<', 'file2.txt' or die qq(Cannot open file 2: $!);

while ( my $line = $csv->getline($f2) ) {

  next unless filled($line) >= 3;

  my $name = shift @$line;
  next unless $name =~ /\.net$/i;

  my $record = $data{$name};
  push @$record, @$line if $record;
}

close $f2;

# Print the completed hash. Send each record to the result output if it
# has the required five fields
#
open my $result, '>', 'resultfile.csv' or die qq(Cannot open results file: $!);

# print expects a reference to an array of fields
$out->print($result, [ qw( name Model version Disk storage ) ]);

for my $name (sort keys %data) {

  my $line = $data{$name};

  if ( filled($line) >= 5 ) {
    $out->print($result, $line);
  }
}

close $result or die qq(Cannot close results file: $!);
