regex - Using sed to replace text with a replacement of the same length

Question

Is there a way to use sed to replace a pattern with something else (dots, zeros, etc.) of the same length? Like this:

maci:/ san$ echo "She sells sea shells by the sea shore" | sed 's/\(sh[a-z]*\)/../gI'
.. sells sea .. by the sea ..

(The "I" flag, which makes the match case-insensitive, requires a newer version of sed.)
That was easy: words starting with "sh" are replaced with two dots (..). But how do I get this instead: ... sells sea ...... by the sea .....

Any ideas? Cheers!

score 7 · Accepted Answer

My suspicion is that you can't do it in standard sed, but you can do it with Perl or something else with more powerful regex handling.

$ echo "She sells sea shells by the sea shore" |
> perl -pe 's/(sh[a-z]*)/"." x length($1)/gei'
... sells sea ...... by the sea .....
$

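The e modifier means that the replacement pattern is executable Perl script; in this case, it repeats the character . as many times as there are characters in the matched pattern. The g modifier repeats the substitution across the line; the i modifier is for case-insensitive matching. The -p option to Perl prints each line after it has been processed by the script given with the -e option (the substitute command).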

score 5 · Accepted Answer

Does this awk one-liner do the job for you?

awk '{for(i=1;i<=NF;i++)if($i~/^[Ss]h/)gsub(/./,".",$i)}1' file

Tested with your data:

kent$  echo "She sells sea shells by the sea shore"|awk '{for(i=1;i<=NF;i++)if($i~/^[Ss]h/)gsub(/./,".",$i)}1'
... sells sea ...... by the sea .....

score 5 · Accepted Answer

An old question, but I found a nice and relatively short one-line sed solution:

sed ':a;s/\([Ss]h\.*\)[^\. ]/\1./;ta;s/[Ss]h/../g'

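It works by replacing one character at a time in a loop.

:a; defines the label a, which marks the start of the loop.

s/\([Ss]h\.*\)[^\. ] searches for sh followed by any number of .s (the work completed so far), followed by a character that is not a dot or a space (the one we are about to replace).

/\1./; replaces it with the work completed so far plus another .

ta; loops back if a substitution was made, otherwise...

s/[Ss]h/../g replaces each remaining sh with two .s and calls it a day.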
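To make the loop easier to follow, the steps above can be replayed one substitution at a time. This is only a sketch: the function name trace_mask is hypothetical, the shell while loop stands in for the :a / ta pair, and the bracket expression is written as [^. ] (a dot needs no backslash inside brackets).

```shell
# trace_mask LINE: hypothetical helper that replays the sed loop step by step,
# printing the line after each single-character substitution.
trace_mask() {
  line=$1
  while :; do
    # One pass of the loop body: extend the run of dots after "sh" by one.
    next=$(printf '%s\n' "$line" | sed 's/\([Ss]h\.*\)[^. ]/\1./')
    # No change means the substitution failed, which is where "ta" stops looping.
    [ "$next" = "$line" ] && break
    line=$next
    printf '%s\n' "$line"
  done
  # Final step: turn each remaining "sh" into two dots.
  printf '%s\n' "$line" | sed 's/[Ss]h/../g'
}

trace_mask "She sells sea shells by the sea shore"
```

The first two lines printed are Sh. sells sea shells by the sea shore and Sh. sells sea sh.lls by the sea shore: once She has become Sh. and is followed by a space, the leftmost word that still matches is shells. The last line printed is the fully masked result.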

score 4 · Accepted Answer

$ echo "She sells sea shells by the sea shore" |
awk '{
   head = ""
   tail = $0
   while ( match(tolower(tail),/sh[a-z]*/) ) {
      dots = sprintf("%*s",RLENGTH,"")
      gsub(/ /,".",dots)
      head = head substr(tail,1,RSTART-1) dots
      tail = substr(tail,RSTART+RLENGTH)
   }
   print head tail
}'
... sells sea ...... by the sea .....
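The script above comes without commentary, so here is a small sketch of the mechanism it relies on: awk's match() sets RSTART to the 1-based position of the first match and RLENGTH to its length. The script matches against a lowercased copy of the text but cuts pieces out of the original, which works as long as tolower() does not change the string length (true for plain ASCII input like this). The function name show_match is hypothetical.

```shell
# show_match: hypothetical helper that reports where the first match starts,
# how long it is, and the matched text taken from the original line.
show_match() {
  awk '{
    # Match on a lowercased copy so that "Sh" and "sh" are both found.
    if (match(tolower($0), /sh[a-z]*/))
      print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
  }'
}

echo "sea Shells" | show_match
```

This prints 5 6 Shells. The full script repeats this step on the remaining tail of the line, appending RLENGTH dots to head each time, until match() returns 0.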

score 3 · Accepted Answer

As others have pointed out, sed is not well suited to this task. It is possible, of course. Here is an example that works on a single line of space-separated words:

echo "She sells sea shells by the sea shore" |
sed 's/ /\n/g' | sed '/^[Ss]h/ s/[^[:punct:]]/./g' | sed ':a;N;$!ba;s/\n/ /g'

Output:

... sells sea ...... by the sea .....

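The first sed replaces spaces with newlines, the second does the dotting, and the third removes the newlines, as shown in this answer.

This approach quickly gets out of hand if you have unpredictable word separators or paragraphs.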
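The same three stages (split, dot, join) can be sketched with tr and paste doing the splitting and joining. This is a swapped-in variant, not the pipeline from the answer: it avoids relying on \n in the replacement part of s, which POSIX does not guarantee, and on the :a;N;$!ba loop. The function name mask_words is hypothetical.

```shell
# mask_words: hypothetical helper; same three stages as the sed pipeline,
# with tr and paste swapped in for the first and third sed.
mask_words() {
  tr ' ' '\n' |                           # one word per line
    sed '/^[Ss]h/ s/[^[:punct:]]/./g' |   # dot out words starting with sh/Sh
    paste -s -d ' ' -                     # join the lines back with spaces
}

echo "She sells sea shells by the sea shore" | mask_words
```

It shares the limitation described above: it only handles single spaces as word separators and joins all input into one line.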

Edit - multi-line alternative

Here is one way to handle multi-line input, inspired by Kent's comments (GNU sed):

echo "
She sells sea shells by the sea shore She sells sea shells by the sea shore,
She sells sea shells by the sea shore She sells sea shells by the sea shore
 She sells sea shells by the sea shore She sells sea shells by the sea shore
" |

# Add a \0 to the end of the line and surround punctuations and whitespace by \n 
sed 's/$/\x00/; s/[[:punct:][:space:]]/\n&\n/g' |

# Replace the matched word by dots
sed '/^[Ss]h.*/ s/[^\x00]/./g' | 

# Join lines that were separated by the first sed
sed ':a;/\x00/!{N;ba}; s/\n//g'

Output:

... sells sea ...... by the sea ..... ... sells sea ...... by the sea .....,
... sells sea ...... by the sea ..... ... sells sea ...... by the sea .....
 ... sells sea ...... by the sea ..... ... sells sea ...... by the sea .....

score 3 · Accepted Answer

This might work for you (GNU sed):

sed -r ':a;/\b[Ss]h\S+/!b;s//\n&\n/;h;s/.*\n(.*)\n.*/\1/;s/././g;G;s/(.*)\n(.*)\n.*\n/\2\1/;ta' file

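In essence: copy a word beginning with sh or Sh, replace each of its characters with ., then re-insert the new string into the original line. When no occurrences of the search pattern remain, the line is printed.

An alternative: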

sed -E 's/\S+/\n&/g;s#.*#echo "&"|sed "/^sh/Is/\\S/./g"#e;s/\n//g' file
