perl - Perl code to merge multiple text files

Question

I have multiple text files. I have written code that takes two files as input from the shell and merges them, but how do I merge more than two files? Would a system command be useful for this purpose?

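use strict;
use warnings;
use File::Slurp;    # provides read_file

my ($file1, $file2) = @ARGV;    # assumed: the two input files are named on the command line

my @a = read_file($file1)
    or die "couldn't read $file1 - $!";
my @b = read_file($file2)
    or die "couldn't read $file2 - $!";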

my $combined = {}; # hashref

my $i=0;
foreach (@a) {
    chomp;
    $combined->{$i}{b} = '' unless defined $combined->{$i}{b};
    $combined->{$i++}{a} = $_;
}

$i=0;
foreach (@b) {
    chomp;
    $combined->{$i}{a} = '' unless defined $combined->{$i}{a};
    $combined->{$i++}{b} = $_;
}

foreach my $i (sort {$a<=>$b} keys %$combined) {
    print $combined->{$i}{a}, ("\t" x 2), $combined->{$i}{b}, "\n";
}

score 4 · Accepted Answer

As I understand it, you can read one line from each of the two files at the same time and print each pair of lines separated by tabs, like this:

use warnings;
use strict;

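die "usage: $0 file1 file2\n" unless @ARGV == 2;

open my $fha, q|<|, $ARGV[0] or die "couldn't open $ARGV[0] - $!";
open my $fhb, q|<|, $ARGV[1] or die "couldn't open $ARGV[1] - $!";

# Read one line from each file per pass; stop as soon as either file runs out.
# (Lexical names other than $a/$b avoid clashing with sort's special variables.)
while ( defined( my $line_a = <$fha> ) and defined( my $line_b = <$fhb> ) ) {
    chomp( $line_a, $line_b );
    printf qq|%s\t\t%s\n|, $line_a, $line_b;
}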

This script does not handle files with different numbers of lines. In that situation a different approach is needed.
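One possible "different approach" for files of unequal length is sketched below. It is only an illustration, not the answerer's code: the helper name `merge_two_files` is hypothetical, and padding the shorter file with empty fields is an assumption about the desired output.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): merge two files line by
# line, padding the shorter file with empty strings.
sub merge_two_files {
    my ($path_a, $path_b) = @_;
    open my $fha, '<', $path_a or die "couldn't open $path_a - $!";
    open my $fhb, '<', $path_b or die "couldn't open $path_b - $!";

    my @rows;
    # Keep going until BOTH files are exhausted, not just one of them.
    until ( eof($fha) and eof($fhb) ) {
        my $line_a = <$fha>;
        my $line_b = <$fhb>;
        # At end of file readline returns undef; use an empty field instead.
        $line_a = '' unless defined $line_a;
        $line_b = '' unless defined $line_b;
        chomp( $line_a, $line_b );
        push @rows, "$line_a\t\t$line_b";
    }
    close $fha;
    close $fhb;
    return @rows;
}

# Print the merged rows when run as a script with two file arguments.
if ( @ARGV == 2 ) {
    print "$_\n" for merge_two_files(@ARGV);
}
```

Returning the rows as a list instead of printing inside the loop keeps the helper easy to test. For very large files, printing each row as it is built would use less memory.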

score 2 · Accepted Answer

You can do this easily in the shell: cat file1.txt file2.txt file3.txt > selected.txt

Or in Perl:

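use strict;
use warnings;

@ARGV = ('file1.txt', 'file2.txt', 'file3.txt');

open my $out, '>', 'selected.txt'
    or die "couldn't open selected.txt - $!";

# <> reads every file named in @ARGV in turn, one line at a time
while (<>) {
    print $out $_;
}

close $out or die "couldn't close selected.txt - $!";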

score 2 · Accepted Answer

How about:

#!/usr/bin/perl
use strict;
use warnings;

my @files = qw(file1 file2 file3 file4);
my %content;
my $max_rec = 0;

foreach (@files) {
    open my $fh, '<', $_ or die $!;
    @{$content{$_}} = <$fh>;
    chomp @{$content{$_}};
    close $fh;
    $max_rec = @{$content{$_}} if scalar(@{$content{$_}}) > $max_rec;
}

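open my $fh, '>', 'outfile' or die $!;
# Row indices run from 0 to $max_rec - 1; going up to $max_rec would add an empty row.
for my $i (0 .. $max_rec - 1) {
    my $out = '';
    foreach (@files) {
        # A file that has run out of lines contributes an empty field.
        $out .= defined($content{$_}[$i]) ? $content{$_}[$i] : '';
        $out .= "\t\t" unless $_ eq $files[-1];
    }
    print $fh $out,"\n";
}
close $fh or die $!;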

Input files:

$ cat file1
1.1
$ cat file2
2.1
2.2
$ cat file3
3.1
3.2
3.3
$ cat file4
4.1
4.2
4.3
4.4

Output file:

$ cat outfile 
1.1     2.1     3.1     4.1
        2.2     3.2     4.2
                3.3     4.3
                        4.4

score 0 · Accepted Answer

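This script focuses on high performance using IO::File. It works only when every row has at least some non-blank text in one of the files, because the loop stops at the first row that is empty in all of them.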

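#!/usr/bin/perl
use strict;
use warnings;
use IO::File;
my @f = map { IO::File->new($_) or die "couldn't open $_ - $!" } @ARGV;
my $q;
# Join one line from each file with tabs; stop at the first row that is empty in every file.
print $q, qq(\n) until ($q = join qq(\t), map { m{(.*)} ? $1 : '' } map { $_->getline // '' } @f) =~ m{^\t*$};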
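If blank rows in the middle of the input must be preserved, the same lockstep idea can be written as an explicit loop that stops only when every handle is at end-of-file. The following is a rough sketch under that assumption. The helper name `merge_rows` is hypothetical, and it trades the brevity of the one-liner for readability.

```perl
use strict;
use warnings;
use IO::File;

# Hypothetical helper: read any number of files in lockstep and return
# tab-joined rows. It stops only when every file is at end-of-file, so a
# row that is blank in all files does not end the merge early.
sub merge_rows {
    my @paths = @_;
    my @fhs = map {
        IO::File->new($_, 'r') or die "couldn't open $_ - $!"
    } @paths;

    my @rows;
    # grep in boolean context: true while at least one handle has data left.
    while ( grep { !$_->eof } @fhs ) {
        my @fields = map {
            my $line = $_->getline;
            $line = '' unless defined $line;    # exhausted file => empty field
            chomp $line;
            $line;
        } @fhs;
        push @rows, join "\t", @fields;
    }
    $_->close for @fhs;
    return @rows;
}

# Print the merged rows when run as a script with file arguments.
if (@ARGV) {
    print "$_\n" for merge_rows(@ARGV);
}
```

It could be invoked, for example, as `perl merge.pl file1 file2 file3 > outfile`, where `merge.pl` is just an assumed script name.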
