python - Delete two consecutive lines

Question

I have data in the following format:

#@ <id_wxyz_1>
A line written after this.

#@ <id_123>
A line written after this one also.

#@ <id_wxyz_2>
One more line.

#@ <id_yex_9>
Another line.

I want to delete two lines at a time: every #@ <...> line that contains "wxyz", plus the line that follows it. My desired output is:

#@ <id_123>
A line written after this one also.

#@ <id_yex_9>
Another line.

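Is there a Linux command that can do this, or an efficient way to do it in Python? I know that grep, sed, and similar tools can selectively delete a single line, but is it possible to selectively delete two consecutive lines with Linux commands?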

Edit: The answers given are excellent, but they do not work for input of the following form:

#@ <id_wxyz_1>
A line written after this.

#@ <id_wxyz_2>
A line written after this.

#@ <id_wxyz_3>
A line written after this.

#@ <id_wxyz_4>
A line written after this.

#@ <id_wxyzadded5>
A line written after this.

For the input above, no lines should be output.

Edit again: Another set of input I have is:

#@ <id_wxyz0>
Line 1.
#@ <id_wxyz1>
line 2.
#@ <id_wxyz2> 
line 3.
#@ <id_wxyz3> 
line 4.
#@ <id_6>
line 5.

Desired output:

#@ <id_6>
line 5.

score 4 · Accepted Answer

You can do this with sed, for example:

/^#@ <.*wxyz.*>/ {
   N        #Append the next line to the pattern space
   s/.*//   #Clear the pattern space (header and the line after it)
   N        #Read another line
   /^\n$/ d #If that line was blank, delete and start the next cycle (reading again)
   D        #Otherwise, delete up to the newline and start the next cycle with the rest
}

Note: for the second case, this actually outputs a single blank line.
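
Since the question also asks about Python, here is a minimal sketch of the same idea as the sed script: drop the matching header and the line after it, then drop one following blank line if there is one. The function name drop_marked_pairs and the choice to pass the lines in as a list are assumptions made for illustration, not part of the original answer.

```python
import re

# Same header pattern as the sed script above.
HEADER = re.compile(r"^#@ <.*wxyz.*>")


def drop_marked_pairs(lines):
    """Return lines without each matching header, the line after it,
    and one blank separator line directly following the pair (if any)."""
    out = []
    i = 0
    while i < len(lines):
        if HEADER.match(lines[i]):
            i += 2  # skip the header and the line after it
            if i < len(lines) and lines[i].strip() == "":
                i += 1  # also skip a blank separator
            continue  # re-check the current line: it may be another header
        out.append(lines[i])
        i += 1
    return out


if __name__ == "__main__":
    import sys

    # Read newline-stripped lines from stdin and print what survives.
    for kept in drop_marked_pairs(sys.stdin.read().splitlines()):
        print(kept)
```

Unlike the sed script, this sketch returns nothing at all when every block matches, so the stray blank line mentioned in the note does not appear.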

score 1 · Accepted Answer

With awk, you could say:

awk '/^#@ <.*wxyz.*>/{getline;getline}1' filename

Edit: For the revised question, you could say:

sed '/^#@ <id_wxyz.*/,/^$/d' filename
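
For comparison, a rough Python sketch of what that sed range does: deletion starts at a line matching the header pattern and continues through the next empty line, or to the end of the input if no empty line follows. The name delete_ranges is hypothetical, and the input is assumed to be a list of lines without trailing newlines.

```python
import re

# Same start pattern as the sed range above.
START = re.compile(r"^#@ <id_wxyz")


def delete_ranges(lines):
    """Drop every range that begins at a matching header and ends at
    the next empty line (inclusive), like sed '/start/,/^$/d'."""
    out = []
    in_range = False
    for line in lines:
        if in_range:
            if line == "":
                in_range = False  # the empty line closes the range and is dropped too
            continue
        if START.match(line):
            in_range = True  # the header opens a range and is dropped
            continue
        out.append(line)
    return out
```

Because the range only closes on an empty line, input without blank separators would be deleted from the first matching header to the end.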

score 1 · Accepted Answer

You can also use awk. When a line matches, call getline twice to consume the next two lines, then use next to avoid printing.

awk '/^#@[[:blank:]]+<.*wxyz.*>/ { getline; getline; next } { print }' infile

This yields:

#@ <id_123>
A line written after this one also.

#@ <id_yex_9>
Another line.
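
The getline/next pattern maps fairly directly onto a Python iterator. The sketch below is an approximation under the same assumption the awk one-liner makes, namely that each header is followed by exactly one content line and one blank line. The generator name skip_after_match is made up for this example.

```python
import re

# Header pattern from the awk command, with [[:blank:]] written as [ \t].
PATTERN = re.compile(r"^#@[ \t]+<.*wxyz.*>")


def skip_after_match(lines):
    """Yield lines, dropping each matching header plus the two lines after it."""
    it = iter(lines)
    for line in it:
        if PATTERN.match(line):
            next(it, None)  # first getline: the line after the header
            next(it, None)  # second getline: the blank separator in the sample
            continue        # like awk's next: do not print anything
        yield line
```

As with the awk version, input that has no blank line between blocks would lose the header of the following block.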

UPDATE, providing a solution for the OP's new edit:

awk  '
    BEGIN { RS = "#@" } 
    $1 ~ /[^[:space:]]/ && $1 !~ /<.*wxyz.*>/ { 
        sub(/\n[[:blank:]]*$/, "")
        print RS, $0 
    }
' infile

For the last example, this gives:

#@  <id_6>
line 5.
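
A record-based approach like this can be sketched in Python as well: split the whole text on the "#@" marker, keep only the records whose first field does not contain "wxyz", and put the marker back. The function name keep_unmarked_records is an assumption for illustration, and it differs slightly from the awk version in its whitespace handling.

```python
import re


def keep_unmarked_records(text, sep="#@"):
    """Split text into records on sep and keep those whose first
    whitespace-separated field does not match <.*wxyz.*>."""
    kept = []
    for record in text.split(sep):
        fields = record.split()
        if not fields:
            continue  # empty chunk before the first separator
        if re.search(r"<.*wxyz.*>", fields[0]):
            continue  # marked record: drop the header and its body
        # Re-attach the separator; surrounding whitespace is normalized.
        kept.append(sep + " " + record.strip())
    return "\n".join(kept)
```

This works whether or not blank lines separate the blocks. Note that it strips all surrounding whitespace from each record, so blank separator lines are not preserved and the header is printed with a single space after "#@".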
