perl - How to insert only new or updated lines into another file

Question

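My first day working with Perl and I'm already stuck :)

The situation is as follows: a file gets updated in folder A, but it also exists in folders B, C and D. To make things "simple", the file may differ in every folder, so running a diff is not an option. New lines that are meant to be copied to the other files are identified by a flag at the end of the line (for example, #I).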

Before the update, the file looks like this:

    First line
    Second line
    Fifth line

After the update, it looks like this:

    First line
    Second line
    Third line #I
    Fourth line #I
    Fifth line
    Sixth line #I

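What I need to do is search for "Second line" in the other files and insert the lines tagged with #I in the order in which they were added, then search for "Fifth line" and insert "Sixth line #I".

In this example the lines are all consecutive, but in the files that need updating there may be several lines between the first update block and the second (and third, and so on).

The files to be updated can be sh scripts, awk scripts, plain text files and so on, so the script is supposed to be generic. It takes two input parameters: the file that was updated and the file to be updated.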
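The first half of the task described above (finding each run of #I lines together with the unflagged line that precedes it) can be sketched as follows. This is only an illustration under stated assumptions: the helper name `collect_blocks` is hypothetical, blank lines are skipped as in the code further down, and a flagged block at the very top of the file gets an undefined anchor.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): group the lines that
# end in "#I" under the unflagged line that precedes them. Returns a list of
# [ $anchor, \@new_lines ] pairs in file order; $anchor is undef when a
# flagged block opens the file.
sub collect_blocks {
    my @lines = @_;          # work on a copy so the caller's data is untouched
    my @blocks;
    my $previous;            # last unflagged line seen so far
    my $in_block = 0;        # true while inside a run of flagged lines

    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\S/;          # ignore blank lines
        if ($line =~ /#I\s*$/) {
            push @blocks, [ $previous, [] ] unless $in_block;
            push @{ $blocks[-1][1] }, $line;
            $in_block = 1;
        }
        else {
            $previous = $line;
            $in_block = 0;
        }
    }
    return @blocks;
}

my @updated = (
    "First line\n",     "Second line\n", "Third line #I\n",
    "Fourth line #I\n", "Fifth line\n",  "Sixth line #I\n",
);

for my $block (collect_blocks(@updated)) {
    my ($anchor, $new) = @$block;
    printf "after '%s': %s\n", $anchor // '(start of file)', join(' | ', @$new);
}
```

For the sample file this reports one block anchored at "Second line" and one anchored at "Fifth line".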

Any hints on how to do this are very welcome. If needed, I can provide the code I have so far. It is close, but it does not work yet.

Thanks,

João

PS: Here is what I have so far

# Pass the content of the file $FileUpdate to the updateFile array
@updateFile = <UPD>;

# Pass the content of the file $FileOriginal to the originalFile array
@originalFile = <ORG>;

# Remove empty lines from the array contained on the updated file
@updateFile = grep(/\S/, @updateFile);

# Create an array that will contain the modifications and the line
# prior to the first modification.
@modifications = ();

# Counter initialization
$i = 0;


# Loop the array to find out which lines are flagged as new and
# which lines immediately precede those
foreach $linha (@updateFile) {

# Remove \n characters
chomp($linha);

# Find the new lines flagged with #I
if ($linha =~ m/#I$/) {

    # Verify that the previous line is not flagged as updated.
    # If it is not, it means that the update starts here.
    unless ($updateFile[$i-1] =~ m/#I$/) {
        print "Line where the update starts $updateFile[$i-1]\n";

        # Add that line to the array modifications
        push(@modifications, $updateFile[$i-1]);

    } # END OF unless 

print "$updateFile[$i]\n";

# Add the lines tagged for insertion into the array
push(@modifications, $updateFile[$i]);

} # END OF if ($linha =~ m/#I$/)

# Increment the counter
$i = $i + 1;

} # END OF foreach $linha (@updateFile) 


foreach $modif (@modifications) {
    unless ($modif =~ m/#I$/) {
        foreach $original (@originalFile) {
            chomp($original);
            if ($original ne $modif) {
                push (@newOriginal, $originalFile[$n]);
            }
            elsif ($original eq $modif) { #&& $modif[$n+1] =~ m/#I$/) {
                push (@newOriginal, $originalFile[$n]);
                last;
            }
            $n = $n + 1;
        }
    }
    if ($modif =~ m/#I$/) {
        push (@newOriginal, $modifications[$m]);
    }
    $m = $m + 1;
}

The result I get is close to what I want, but not quite there yet.

score 1 · Accepted Answer

I finally managed to get back to this problem, and it looks like I have solved it. It is probably not the best or the "prettiest" solution, but it does what I need :).

# Open the file

# First parameter is the file containing the update
my ($FileUpdate) = $ARGV[0];

# Second parameter is the file to be updated
my ($FileOriginal) = $ARGV[1];


# \s whitespace characters

# Open both files and give them handles to be referred to further ahead
open(UPD, '<', $FileUpdate) || die("Could not open file $FileUpdate: $!");
open(ORG, '<', $FileOriginal) || die("Could not open file $FileOriginal: $!");

# ------------------------------------------------ #
# ---------------- ARRAY CREATION ---------------- #
# ------------------------------------------------ #

# Pass the content of the file $FileUpdate to the updateFile array
@updateFile = <UPD>;

# Pass the content of the file $FileOriginal to the originalFile array
@originalFile = <ORG>;

# Remove empty lines from the array contained on the updated file
@updateFile = grep(/\S/, @updateFile);

# Create an array that will contain the modifications and the line
# prior to the first modification.
@modifications = ();

# Counter initialization
$i = 0;


# ------------------------------------------------ #
# ----- LOOP TO IDENTIFY LINES FOR INSERTION ----- #
# ------------------------------------------------ #

# Loop the array to find out which lines are flagged as new and
# which lines immediately precede those
foreach $linha (@updateFile) {

# Remove \n characters
chomp($linha);

# Find the new lines flagged with #I
if ($linha =~ m/#I$/) {

    # Verify that the previous line is not flagged as updated.
    # If it is not, it means that the update starts here.
    unless ($updateFile[$i-1] =~ m/#I$/) {

        # Add that line to the array modifications
        push(@modifications, $updateFile[$i-1]);

    } # END OF unless 

# Add the lines tagged for insertion into the array
push(@modifications, $updateFile[$i]);

} # END OF if ($linha =~ m/#I$/)

# Increment the counter
$i = $i + 1;

} # END OF foreach $linha (@updateFile) 


# ------------------------------------------------ #
# --------- ADD VALUES TO MODIFICATIONS  --------- #
# ------------------------------------------------ #
foreach $valor (@modifications) {   
print "$valor\n";
}

# ------------------------------------------------ #
# -------------------- BACKUP -------------------- #
# ------------------------------------------------ #

# Make a backup copy from the original file   
# in case something goes wrong when updating it

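# Build a timestamp suffix such as 20240131-0930 (no trailing newline)
use POSIX qw(strftime);
$tt = strftime "%Y%m%d-%H%M", localtime;

# The list form of system avoids shell quoting problems with odd file names
system("cp", $FileOriginal, "$FileOriginal.$tt") == 0
    or die("Could not back up $FileOriginal!");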

# ------------------------------------------------ #
# ------------- INSERT THE NEW LINES ------------- #
# ------------------------------------------------ #

# Counter initialization
$m = 0;

# New file array
@newOriginal = ();

# Goes through the original file and for each line not present in modifs, writes it .

foreach $original (@originalFile) {
# Initialize counter
$n = 0;

# Remove the trailing newline
chomp ($original);

# Check if the value already exists on the array
# If it doesnt, adds it
if (grep {$_ eq $original} @newOriginal) {
}
else {
    push (@newOriginal, $originalFile[$m]); 
}

# Iterate over the array containing the modifications
# These new lines shall be added to the final file.
foreach $modif (@modifications) {
    # Remove the trailing newline
    chomp ($modif);

    #print "Original: $original, Modif: $modif\n";

    # Initialize counter
    $k = 0;

    # Compare the current value from the original file with
    # the elements that exist on the modifications array.
    # If they are equal push that line in order to be added
    # to the results file.
    if ($original eq $modif) {

        # Increment the counter
        $k = $n+1;

        # Iterate the array with the modifications
        # in order to insert all lines that end with #I
        # immediately after the common line between files.
        foreach my $igual ($k..$#modifications) {

            # If the line ends with #I add it to the final file.
            if ($modifications[$igual] =~ m/#I$/) {

                foreach $newO (@newOriginal) {
                    # Remove the trailing newline
                    chomp($newO);
                    if ($newO ne $modifications[$igual]) {
                        push (@newOriginal, $modifications[$igual]);
                        last;
                    }
                }
            }
            else {
                last;
            }
        }
    }

    # Increment the counter
    $n = $n + 1;
}
# Increment the counter
$m = $m + 1;
}

# ------------------------------------------------ #
# ------------- RESULTS PRESENTATION ------------- #
# ------------------------------------------------ #
$v = 0;
print "--------------------\n";
foreach $vl (@newOriginal) {
print "newOriginal: $newOriginal[$v]\n";
$v = $v + 1;
}
print "--------------------\n";

# ------------------------------------------------ #
# ------------- CREATE UPDATED FILE -------------- #
# ------------------------------------------------ #
$v = 0;

# Create the new name for the file - only for testing purposes now, it will be the original name afterwards
$NewFileToWriteTo = $FileOriginal;
# Retrieve the extension of the file to be updated (empty if there is none)
my ($ext) = $FileOriginal =~ /(\.[^.\/]+)$/;
$ext = '' unless defined $ext;
# Remove the extension - just for testing purposes because I want to change the file name now
$NewFileToWriteTo =~ s/\Q$ext\E$//;
# Create the new file name by adding the suffix _tst and the correct extension to it.
$NewFileToWriteTo = $NewFileToWriteTo . '_tst' . $ext;


# Create the new file or die in case it is not possible to open it
open(DAT, '>', $NewFileToWriteTo) or die("Could not open file $NewFileToWriteTo: $!");


# Write to the new file. This will be the UPDATED version of the ORIGINAL file.
foreach $vl (@newOriginal) {
print DAT "$newOriginal[$v]\n";
$v = $v + 1;
}

# Close all files
close(DAT);
close(UPD);
close(ORG);
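
For comparison, the insertion step can probably be written more compactly with a hash that maps each anchor line to the lines to place after it. This is a minimal sketch, not a drop-in replacement for the script above: the name `insert_after_anchors` and the sample data are hypothetical, and it assumes that an anchor line which occurs more than once in the target should receive the new lines after every occurrence.

```perl
use strict;
use warnings;

# Hypothetical helper: %$insert_after maps an anchor line (without its line
# ending) to an array ref of lines to place directly after that anchor.
sub insert_after_anchors {
    my ($target, $insert_after) = @_;
    my @result;

    for my $line (@$target) {
        (my $text = $line) =~ s/\r?\n\z//;    # compare without the line ending
        push @result, $text;
        push @result, @{ $insert_after->{$text} }
            if exists $insert_after->{$text};
    }
    return @result;
}

# Sample data: a target file that has an extra line of its own
my @original = (
    "First line\n", "Some other line\n", "Second line\n", "Fifth line\n",
);
my %insert_after = (
    'Second line' => [ 'Third line #I', 'Fourth line #I' ],
    'Fifth line'  => [ 'Sixth line #I' ],
);

print "$_\n" for insert_after_anchors(\@original, \%insert_after);
```

Each target line is visited once and the anchor lookup is a hash access, so no counters or nested loops over the modifications are needed.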

score 0 · Accepted Answer

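OK, I think I understand what you need, and the program below implements a solution.

I'm not entirely sure what the source (B, C, D) files look like, but I assume they are the same as the target (A) file in the after-update state shown in your question.

Another edge case I ran into: what should happen if the first line of the source (B, C, D) file is tagged with #I? I assumed it should be inserted at the very start of the output.

I also chose to die if the line preceding a flagged line in the source file cannot be found in the target.

Please let me know if this is along the right lines.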

use strict;
use warnings;

open my $fa, '<', 'A.txt' or die $!;

open my $fb, '<', 'B.txt' or die $!;

my $keyline;
my $inserting;

while (<$fb>) {

  if (/#I$/) {

    if ($keyline) {             # We have to search for a match

      while () {

        my $source = <$fa>;     # read from the target

        if (defined $source) {  # copy to output. stop reading if key is found
          print $source;
          last if $source eq $keyline;
        }
        else {                  # die if key nowhere in target
          chomp $keyline;
          die qq(Key Line "$keyline" not found);
        }
      }

      undef $keyline;           # don't have to search next time
    }

    print;                      # insert the new line
  }
  else {
    $keyline = $_;              # remember the line to search for
  }
}

# Copy whatever remains of the target after the last insertion point
print while <$fa>;
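
To try the two edge cases mentioned above (a flagged first line, and a key line that is missing from the target) without creating files, the same streaming logic can be wrapped in a function and fed from in-memory filehandles. This is a sketch under assumptions: the names `merge_flagged` and `merge_strings` are hypothetical, the first argument is the file carrying the #I flags, and every line, including the last, is assumed to end with a newline.

```perl
use strict;
use warnings;

# Hypothetical wrapper (names are illustrative): $updated_fh carries the "#I"
# flags, $target_fh is the file being brought up to date, and the merged
# result is written to $out_fh.
sub merge_flagged {
    my ($updated_fh, $target_fh, $out_fh) = @_;
    my $keyline;    # last unflagged line read from the updated file

    while (my $line = <$updated_fh>) {
        if ($line !~ /#I$/) {
            $keyline = $line;    # remember the line to search for
            next;
        }
        if (defined $keyline) {
            # Copy target lines up to and including the key line
            my $found = 0;
            while (my $source = <$target_fh>) {
                print {$out_fh} $source;
                if ($source eq $keyline) { $found = 1; last; }
            }
            chomp $keyline;
            die qq(Key line "$keyline" not found\n) unless $found;
            undef $keyline;      # consecutive flagged lines need no new search
        }
        print {$out_fh} $line;   # insert the flagged line
    }

    # Copy whatever is left of the target after the last insertion
    while (my $rest = <$target_fh>) {
        print {$out_fh} $rest;
    }
    return;
}

# Convenience for experiments: run the merge on strings instead of files
sub merge_strings {
    my ($updated, $target) = @_;
    my $merged = '';
    open my $ufh, '<', \$updated or die $!;
    open my $tfh, '<', \$target  or die $!;
    open my $ofh, '>', \$merged  or die $!;
    merge_flagged($ufh, $tfh, $ofh);
    close $ofh;
    return $merged;
}

# Edge case 1: the first line of the updated file is flagged
print merge_strings("Header #I\nFirst line\n", "First line\nSecond line\n");

# Edge case 2: the key line does not exist in the target
my $ok = eval { merge_strings("Missing\nNew #I\n", "First line\n"); 1 };
print "died: $@" unless $ok;
```

In the first case the flagged line ends up ahead of everything copied from the target; in the second the function dies after the target has been read to the end without a match.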
