perl - Compare two text files and return the corresponding value

Question

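I have two files. The first is a .txt file containing IDs.

The second file contains text in the first column and IDs in the second column. Is there a way to compare the two sets of IDs, find the matches, and return the corresponding text from the first column?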

AUX    2398432
AUL    245406

So, after parsing the two files, the script should match 245406 and return the corresponding text, AUL.

This is what I have so far:

open FH_TF_IDS, "<$ARGV[0]" or die $!; 
while (<FH_TF_IDS>) {
    chomp; 
    @fields=split("\t",$_);
    $hash{$fields[1]}=$fields[0];
} 
close FH_TF_IDS;

open IDS, "<$ARGV[1]" or die $!;
@ids=<IDS>; 
close IDS; 

foreach $id (@ids){ 
    $hash_count{$hash{$id}}++;
} 

foreach $family (sort (keys %hash_count)) {
    print "$family\t$hash_count{$family}\n";
}

score 1 · Accepted Answer

user1364517,

I think you did a great job trying to solve your problem. There are two issues, however:

Add chomp @ids; after close IDS; to remove the trailing \n from each array element.
Change $hash_count{$hash{$id}}++; to $hash_count{$hash{$id}} = $id if $hash{$id};

With these small changes, your program will work.
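
Applied to the script in the question, the two changes could look roughly like the sketch below. To keep it self-contained, it reads sample data from in-memory strings instead of $ARGV[0] and $ARGV[1]. The sub name match_ids, the tab-separated layout, and the contents of the ID file are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a filehandle for the "text<TAB>ID" file and one
# for the ID file, and returns a hash of text => matched ID.
sub match_ids {
    my ( $tf_fh, $ids_fh ) = @_;

    my %hash;
    while ( my $line = <$tf_fh> ) {
        chomp $line;
        my @fields = split /\t/, $line;
        next unless @fields >= 2;    # skip malformed lines
        $hash{ $fields[1] } = $fields[0];
    }

    my @ids = <$ids_fh>;
    chomp @ids;                      # change 1: strip the trailing \n

    my %hash_count;
    foreach my $id (@ids) {
        # change 2: record the ID only when it has a corresponding text
        $hash_count{ $hash{$id} } = $id if $hash{$id};
    }
    return %hash_count;
}

# Assumed sample data; real code would open $ARGV[0] and $ARGV[1] instead.
my $tf_data = "AUX\t2398432\nAUL\t245406\n";
my $id_data = "245406\n999999\n";

open my $tf_fh,  '<', \$tf_data or die $!;
open my $ids_fh, '<', \$id_data or die $!;
my %hash_count = match_ids( $tf_fh, $ids_fh );
close $tf_fh;
close $ids_fh;

foreach my $family ( sort keys %hash_count ) {
    print "$family\t$hash_count{$family}\n";
}
```

With this sample data, only AUL and its ID 245406 are printed, because 999999 has no entry in the first hash.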

Here is a more "hackish" (certainly not idiomatic) solution:

use strict;
use warnings;

my %hash;

{open my $file, "<$ARGV[0]" or die $!;
$hash{$2} = $1 while <$file> =~ /(.*)\t(.*)/;}

{open my $file, "<$ARGV[1]" or die $!;
map{print "$hash{$_}\t$_\n"}sort{$hash{$a} cmp $hash{$b}}
grep{$hash{$_}}map{s/\n\z//r}<$file>;}

The blocks are used so that each file is closed when my $file goes out of scope.

Hope this helps!

score 0 · Accepted Answer

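A few suggestions:

Pick up a book on Modern Perl. Perl is a crufty old language. The way you program in Perl has changed over the years since it first appeared in the 1980s. Unfortunately, too many people learn Perl from websites written in the pre-Perl 5.0 days.
Use use strict; and use warnings; in your program. This will catch most programming errors.
Don't depend on $_. It's global and can cause problems. for (@people) { looks neat, but it's better to write for my $person ( @people ).
Use /../ with split and '...' with join.
Use variables for file handles. They're easier to pass to subroutines.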
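
The last three points can be illustrated with a small sketch. The sub name read_pairs and the sample data are made up for this example, and the in-memory file stands in for a real one.

```perl
use strict;
use warnings;

# Hypothetical helper: a lexical filehandle is an ordinary scalar,
# so it can be passed to a subroutine like any other argument.
sub read_pairs {
    my ($fh) = @_;
    my %id_to_text;
    while ( my $line = <$fh> ) {                  # named variable instead of $_
        chomp $line;
        my ( $text, $id ) = split /\t/, $line;    # regex delimiter for split
        next unless defined $id;                  # skip lines without an ID
        $id_to_text{$id} = $text;
    }
    return %id_to_text;
}

my $data = "AUX\t2398432\nAUL\t245406\n";         # assumed sample data
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my %id_to_text = read_pairs($fh);
close $fh;

for my $id ( sort keys %id_to_text ) {
    print join( ' => ', $id, $id_to_text{$id} ), "\n";    # plain string for join
}
```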

Here's your program, rewritten in a more modern style. It does a little error checking, but otherwise it works the same way:

use strict;
use warnings;
use feature qw(say);  # Nicer than print.
use autodie;          # Will automatically die on open and close errors

if ( @ARGV < 2 ) {
    die qq(Not enough arguments);
}

my $tf_id_file = shift;   # Use variable names and not `@ARGV` directly
my $id_file    = shift;   # Makes your program easier to understand 

open my $tf_ids_fh, "<", $tf_id_file;

my %hash;                # Not a good name for the variable, but that's what you had.
while ( my $line = <$tf_ids_fh> ) {
    chomp $line;         # Always chomp after a read
    my ( $text, $id ) = split /\s+/, $line;  # Use variable names, not @fields
    if ( not defined $id ) {           # Error checking
        die qq(Missing id field in line $. of tf_ids file);
    }
    $hash{$id} = $text;  # Key by ID so the IDs from the second file can be looked up
}
close $tf_ids_fh;

open my $ids_fh, "<", $id_file;
my @ids = <$ids_fh>;
chomp @ids;
close $ids_fh;

my %totals;
for my $id ( @ids ) {
    next if not exists $hash{$id};              # Skip IDs with no matching text
    if ( not exists $totals{ $hash{$id} } ) {   # Initialize hash before adding to it
        $totals{ $hash{$id} } = 0;
    }
    $totals{ $hash{$id} }++;
}

for my $family ( sort keys %totals ) {
    printf "%10.10s %4d\n", $family, $totals{$family};
}

I use printf, which formats the printed output a little more nicely than a plain print.
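
As a quick illustration of what that format string does, here is a minimal sketch with made-up counts (the %totals contents are assumptions, not real data):

```perl
use strict;
use warnings;

my %totals = ( AUL => 1, AUX => 12 );    # assumed sample counts

for my $family ( sort keys %totals ) {
    # %10.10s: right-justify the string in 10 columns, truncating it to
    #          at most 10 characters
    # %4d:     right-justify the integer in 4 columns
    printf "%10.10s %4d\n", $family, $totals{$family};
}
```

The fixed widths keep the columns aligned even when the names and counts differ in length.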

score 0 · Accepted Answer

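I understand that you are new to the language. Here is something that will help you debug your program.

At the top of your script, add "use Data::Dumper;"

Once you do this, you can insert statements such as print Dumper( \%hash ) or print Dumper( \%hash_count ). Dumper takes references, so pass \%hash rather than $hash. These two statements will let you see the bug in your program.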
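
As a rough sketch of what that looks like, the example below dumps a small hash and an ID list that still carries its newline. The sample data is assumed, and setting $Data::Dumper::Useqq is my own addition so that the \n shows up as an escape sequence.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq    = 1;    # print "\n" as an escape, not a line break
$Data::Dumper::Sortkeys = 1;    # stable key order for easier reading

# Assumed sample data standing in for the two input files.
my %hash = ( 2398432 => 'AUX', 245406 => 'AUL' );
my @ids  = ("245406\n");        # as read from the file, before chomp

print Dumper( \%hash );
print Dumper( \@ids );          # the trailing \n is visible here

chomp @ids;
print Dumper( \@ids );          # and gone here
```

Seeing "245406\n" in the dump makes it obvious why the lookup in %hash, whose keys have no newline, never matches.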

As a side note, running the script through perl -d is also an option, and something you should definitely learn if you continue with the language.

score 0 · Accepted Answer

Try this:

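    #!/usr/bin/perl
    use strict;
    use warnings;

    # First argument: file of IDs. Second argument: file of "text ID" pairs.
    open my $a1, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    while ( my $id_line = <$a1> ) {
        my ($id) = split ' ', $id_line;
        next unless defined $id;    # skip blank lines

        # Rescan the second file for every ID (simple, but slow for large files).
        open my $b1, '<', $ARGV[1] or die "Cannot open $ARGV[1]: $!";
        while ( my $pair_line = <$b1> ) {
            my ( $text, $pair_id ) = split ' ', $pair_line;
            next unless defined $pair_id;
            print "$text\n" if $pair_id eq $id;    # compare IDs as strings
        }
        close $b1;
    }
    close $a1;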

Run the following command in the terminal:

    perl test.pl a.txt b.txt
