regex - perl regular expression to match between a fixed keyword and another two variable keywords

Question

I need to write a regex in Perl to do the following.

The starting line is keyword1 (like "this is keyword1"), and the ending line is either keyword2 (like "end1 here") or keyword3 (like "end2 here"). For example, the text file may look like:

*********** this is keyword1***********
*****
..
*******apple***********
******
..
*********** this is keyword1***********
*****
..
*******orange***********
******
..
*********** this is keyword1***********
*****
..
*******orange***********
******
..

My task is to match blocks like these:

*********** this is keyword1***********
*****
..(comment: no "this is keyword1" here)
*******apple***********

or

*********** this is keyword1***********
*****
.. (comment: no "this is keyword1" here)
*******orange***********

Appreciate your help!

score 1 · Accepted Answer

Original recommended solution

Note that "apple" was originally spelled "end1 here" in the question, and "orange" was originally spelled "end2 here".

#!/usr/bin/env perl
use strict;
use warnings;

my $printing = 0;

while (<>)
{
    $printing = 1 if m/this is keyword1/;
    print if $printing;
    $printing = 0 if m/end[12] here/;
}

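If you want to exclude the end line from the output, move its test above the print. If you want to exclude the start line from the output, move its test below the print. And if the two end patterns cannot be combined as easily as in the example, you can simply use two lines: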

    $printing = 0 if m/the first end pattern/;
    $printing = 0 if m/a radically different end marker/;
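
The reordering described above can be sketched as follows. This is only an illustration, not part of the original answer: the helper name inner_lines and the embedded sample lines are made up for the example, and it works on a list of lines rather than on <> so that it runs without any input file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the lines strictly between a start marker
# and an end marker, excluding both marker lines.
sub inner_lines {
    my @lines    = @_;
    my $printing = 0;
    my @kept;
    for my $line (@lines) {
        # End test comes first, so the end line itself is not kept.
        $printing = 0 if $line =~ m/end[12] here/;
        push @kept, $line if $printing;
        # Start test comes last, so the start line itself is not kept.
        $printing = 1 if $line =~ m/this is keyword1/;
    }
    return @kept;
}

# Made-up sample input for the demonstration.
my @sample = (
    "*** this is keyword1 ***\n",
    "*****\n",
    "..\n",
    "*** end1 here ***\n",
    "outside\n",
);
print inner_lines(@sample);    # prints only the "*****" and ".." lines
```

Moving just one of the two tests gives the half-open variants (start line kept but end line dropped, or the reverse).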

For the sample data, the original script (which prints both the start and end lines) produces:

*********** this is keyword1***********
*****
..
*******end1 here***********
*********** this is keyword1***********
*****
..
*******end1 here***********
*********** this is keyword1***********
*****
..
*******end2 here***********
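
As an alternative to the explicit $printing flag, Perl's range (flip-flop) operator can express the same start-to-end logic. The original answer does not use it, so treat this as a sketch of a swapped-in technique; the helper name blocks_with_range and the sample lines are assumptions made for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: keeps every line from a start marker through the
# next end marker, inclusive, using the range operator in scalar context.
sub blocks_with_range {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # ".." turns true when the left pattern matches and stays true up to
        # and including the line on which the right pattern matches.
        # Note: each ".." keeps its own state, even across calls to this sub,
        # so input that ends inside an open block affects the next call.
        push @kept, $_ if /this is keyword1/ .. /end[12] here/;
    }
    return @kept;
}

# Made-up sample input for the demonstration.
my @sample = (
    "before\n",
    "*** this is keyword1 ***\n",
    "..\n",
    "*** end1 here ***\n",
    "between\n",
    "*** this is keyword1 ***\n",
    "..\n",
    "*** end2 here ***\n",
    "after\n",
);
print blocks_with_range(@sample);
```

For line-by-line filtering of a file, the same condition fits in a one-line loop body: print if /this is keyword1/ .. /end[12] here/;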

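Revised requirements: revised program

One simple way to meet the revised output requirement is to accumulate the lines in a string while the flag is set (the flag is called $saving here rather than $printing):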

my $saving = 0;
my $result;

while (<>)
{
    $saving  = 1  if m/this is keyword1/;
    $result .= $_ if $saving;
    $saving  = 0  if m/end[12] here/;
}

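However, this neither slurps the whole file into memory nor uses m//g, so it does not use the mechanism specified in the revised requirements.

Given the revised requirements, I think this code does more or less what you want: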

#!/usr/bin/env perl
use strict;
use warnings;

my $file;
{
    local $/;
    $file = <>;
}

my $result = '';    # start empty so the final print does not warn if nothing matches
while ($file =~ m/(^[^\n]*this is keyword1.*?end[12] here[^\n]*$)/gms)
{
    print "Found: $1\n";
    $result .= "$1\n";
}

print "Overall set of matched material:\n";
print $result;

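Clearly, if you don't need to see each block as it is found, you can omit the print in the loop. Note the non-greedy .*?, which stops the scan at the first end marker; the /s modifier, which lets . match newlines; and ^ and $ with the /m (multi-line) modifier, which capture whole lines.

The output for the sample data is: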

Found: *********** this is keyword1***********
*****
..
*******end1 here***********
Found: *********** this is keyword1***********
*****
..
*******end1 here***********
Found: *********** this is keyword1***********
*****
..
*******end2 here***********
Overall set of matched material:
*********** this is keyword1***********
*****
..
*******end1 here***********
*********** this is keyword1***********
*****
..
*******end1 here***********
*********** this is keyword1***********
*****
..
*******end2 here***********
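
To see what the non-greedy quantifier and the /s modifier contribute to the pattern above, it can help to run variants of it side by side. The following is a small sketch under stated assumptions: the helper find_blocks and the sample text are invented for the demonstration and are not part of the original answer.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns every capture of $pattern found in $text.
# The pattern is expected to contain exactly one capturing group.
sub find_blocks {
    my ($text, $pattern) = @_;
    my @blocks = $text =~ /$pattern/g;
    return @blocks;
}

# Made-up sample text with two complete blocks and a filler line between.
my $text = join '',
    "*** this is keyword1 ***\n",
    "..\n",
    "*** end1 here ***\n",
    "filler\n",
    "*** this is keyword1 ***\n",
    "..\n",
    "*** end2 here ***\n";

# Non-greedy .*? stops at the first end marker after each start marker.
my $lazy   = qr/(^[^\n]*this is keyword1.*?end[12] here[^\n]*$)/ms;
# Greedy .* runs on to the last end marker in the text.
my $greedy = qr/(^[^\n]*this is keyword1.*end[12] here[^\n]*$)/ms;
# Without /s, "." cannot cross a newline, so multi-line blocks never match.
my $no_s   = qr/(^[^\n]*this is keyword1.*?end[12] here[^\n]*$)/m;

printf "non-greedy: %d block(s)\n", scalar find_blocks($text, $lazy);
printf "greedy:     %d block(s)\n", scalar find_blocks($text, $greedy);
printf "without /s: %d block(s)\n", scalar find_blocks($text, $no_s);
```

On this sample text the three counts are 2, 1 and 0: the greedy version swallows both blocks and the filler line as a single match, and the version without /s finds nothing because the start and end markers sit on different lines.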

Re-revised requirements: re-revised solution

#!/usr/bin/env perl
use strict;
use warnings;

my $file;
{
    local $/;
    $file = <>;
}

my $result = '';    # start empty so the final print does not warn if nothing matches
while ($file =~ m/(^[^\n]*this is keyword1.*?(apple|orange)[^\n]*$)/gms)
{
    print "Found: $1\n";
    $result .= "$1\n";
}

print "Overall set of matched material:\n";
print $result;

Sample data

*********** this is keyword1***********
*****
..
*******orange***********
******
..
*********** this is keyword1***********
*****
..
*******orange***********
******
..
*********** this is keyword1***********
*****
..
*******apple***********
******

Sample output

Found: *********** this is keyword1***********
*****
..
*******orange***********
Found: *********** this is keyword1***********
*****
..
*******orange***********
Found: *********** this is keyword1***********
*****
..
*******apple***********
Overall set of matched material:
*********** this is keyword1***********
*****
..
*******orange***********
*********** this is keyword1***********
*****
..
*******orange***********
*********** this is keyword1***********
*****
..
*******apple***********
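
The question also asks that no second "this is keyword1" appear inside a matched block. With the pattern above, a start marker that has no end marker before the next start marker would produce one block spanning both. One possible way to rule that out is a negative lookahead on each character consumed between the markers. This is a sketch of that swapped-in technique, not the original answer's method; the helper name strict_blocks and the sample text are assumptions, and the alternation is made non-capturing so that a list-context match returns only whole blocks.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns blocks that start at a "this is keyword1" line,
# end at the first "apple" or "orange" line, and contain no further start
# marker in between.
sub strict_blocks {
    my ($text) = @_;
    my @blocks = $text =~ m/
        (                               # capture the whole block
            ^[^\n]*this\ is\ keyword1   # start line up to the marker
            (?:(?!this\ is\ keyword1).)*?   # any char, but never a new start marker
            (?:apple|orange)            # first end marker
            [^\n]*$                     # rest of the end line
        )
    /gmsx;
    return @blocks;
}

# Made-up sample: the first start marker is never closed.
my $text = join '',
    "*** this is keyword1 ***\n",
    "stray\n",
    "*** this is keyword1 ***\n",
    "..\n",
    "*** apple ***\n",
    "tail\n";

print "Found: $_\n" for strict_blocks($text);
```

On this sample, only the block beginning at the second start marker is reported; the unterminated first marker and the "stray" line are skipped.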
