python - Python equivalent of sed

Question

Is there a way, without a double loop, to accomplish what the following sed command does?

Input:

Time
Banana
spinach
turkey

sed -i "/Banana/ s/$/Toothpaste/" file

Output:

Time
BananaToothpaste
spinach
turkey

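So far I have two lists, and going through both with a double loop takes a long time.

List A has a bunch of numbers. List B has the same bunch of numbers, but in a different order.

For each entry in A, I want to find the line in B with that same number and append value C to the end of it.

I hope this makes sense, even if my example doesn't.

I was doing the following in Bash and it worked, but it was very slow...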

for line in $(cat DATSRCLN.txt.utf8); do
        srch=$(echo $line | awk -F'^' '{print $1}');
        rep=$(echo $line | awk -F'^' '{print $2}');
        sed -i "/$(echo $srch)/ s/$/^$(echo $rep)/" tmp.1;
done

Thanks!

score 17 · Accepted Answer

For those arriving late to the race, here is an implementation of sed in Python:

import re
import shutil
from tempfile import mkstemp


def sed(pattern, replace, source, dest=None, count=0):
    """Reads a source file and writes the destination file.

    In each line, replaces pattern with replace.

    Args:
        pattern (str): pattern to match (can be re.pattern)
        replace (str): replacement str
        source  (str): input filename
        count (int): maximum number of occurrences to replace (0 means all)
        dest (str):   destination filename, if not given, source will be overwritten.
    """

    num_replaced = 0

    if dest:
        fout = open(dest, 'w')
    else:
        # No destination given: write to a temp file, then move it over source.
        fd, name = mkstemp()
        fout = open(fd, 'w')

    with open(source, 'r') as fin, fout:
        for line in fin:
            if count and num_replaced >= count:
                # Limit reached: copy the remaining lines unchanged.
                fout.write(line)
                continue
            # A limit of 0 tells re.subn to replace every occurrence.
            limit = count - num_replaced if count else 0
            out, n = re.subn(pattern, replace, line, count=limit)
            fout.write(out)
            num_replaced += n

    if not dest:
        shutil.move(name, source)

Examples:

sed('foo', 'bar', "foo.txt")

This replaces every 'foo' with 'bar' in foo.txt.

sed('foo', 'bar', "foo.txt", "foo.updated.txt")

This replaces every 'foo' with 'bar' in 'foo.txt' and saves the result in 'foo.updated.txt'.

sed('foo', 'bar', "foo.txt", count=1)

This replaces only the first occurrence of 'foo' with 'bar' and saves the result in the original file 'foo.txt'.
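The command in the question also restricts the substitution to lines that match an address (`/Banana/`), which the function above does not model. As a rough sketch of that addressed form, the generator below appends a suffix only to matching lines. The name `append_on_match` and its parameters are made up for illustration, and it works on any iterable of lines rather than on a file path.

```python
import re


def append_on_match(address, suffix, lines):
    """Yield each line, appending suffix to the lines where address matches.

    Roughly what sed '/address/ s/$/suffix/' does (hypothetical helper).
    """
    matcher = re.compile(address)
    for line in lines:
        # Split off the line ending so the suffix lands before it.
        body = line.rstrip('\n')
        ending = line[len(body):]
        if matcher.search(body):
            body += suffix
        yield body + ending


sample = ["Time\n", "Banana\n", "spinach\n", "turkey\n"]
result = list(append_on_match("Banana", "Toothpaste", sample))
print("".join(result), end="")
```

Because it is a generator, it can be fed an open file object directly and the result written out with `writelines`, so the file is still read only once.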

score 5 · Accepted Answer

You can actually call sed from Python. There are many ways to do this, but I like to use the sh module. (yum -y install python-sh)

Here is the output of my example program:

[me@localhost sh]$ cat input 
Time
Banana
spinich
turkey
[me@localhost sh]$ python test_sh.py 
[me@localhost sh]$ cat input 
Time
Toothpaste
spinich
turkey
[me@localhost sh]$

Here is test_sh.py:

import sh

sh.sed('-i', 's/Banana/Toothpaste/', 'input')

This will probably only work on Linux.
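Another of those ways is the standard library's subprocess module, which avoids the extra dependency. This is only a sketch and assumes a `sed` binary is available on the PATH; `run_sed` is a hypothetical helper name. It pipes text through sed on stdin instead of editing a file in place, and it uses the script from the question, so the text is appended rather than replaced.

```python
import subprocess


def run_sed(script, text):
    """Pipe text through the system sed and return what it prints."""
    completed = subprocess.run(
        ["sed", "-e", script],   # passed as a list, so no shell quoting is needed
        input=text,
        capture_output=True,
        text=True,
        check=True,              # raise CalledProcessError if sed fails
    )
    return completed.stdout


output = run_sed("/Banana/ s/$/Toothpaste/", "Time\nBanana\nspinach\nturkey\n")
print(output, end="")
```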

score 5 · Accepted Answer

If you are using Python3, the following module may help: https://github.com/mahmoudadel2/pysed

wget https://raw.githubusercontent.com/mahmoudadel2/pysed/master/pysed.py

Place the module file in your Python3 module path, then:

import pysed
pysed.replace(<Old string>, <Replacement String>, <Text File>)
pysed.rmlinematch(<Unwanted string>, <Text File>)
pysed.rmlinenumber(<Unwanted Line Number>, <Text File>)

score 2 · Accepted Answer

This can be done with a tmp file. It has low system requirements and iterates over the file only once, without copying the whole file into memory:

#!/usr/bin/python
import tempfile
import shutil
import os

# mkstemp() returns an open descriptor and the path of a new temporary file
# (mkdtemp() would create a directory, which cannot be opened for writing).
fd, newfile = tempfile.mkstemp()
oldfile = 'stack.txt'

with open(oldfile) as f, os.fdopen(fd, 'w') as n:
    for i in f:
        if i.find('Banana') == -1:
            n.write(i)
            continue

        # The last row may not end with a newline
        if i.find('\n') == -1:
            i += 'Toothpaste'
        else:
            i = i.rstrip('\n')
            i += 'Toothpaste\n'

        n.write(i)

os.remove(oldfile)
shutil.move(newfile, oldfile)
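The same single-pass idea may extend to the question's real task, where many search/append pairs are applied to one file. Here is a hedged sketch: load the pairs into a dict once, then stream the target lines and look each key up directly instead of rescanning the file for every pair. It assumes each target line starts with the number followed by `^`, which the question does not actually state; `append_by_key` and the sample data are invented for illustration.

```python
def append_by_key(pairs, lines, sep="^"):
    """Append the mapped value to every line whose first field is a known key.

    pairs: iterable of "key^value" strings (list A with its values C)
    lines: iterable of target lines (list B)
    Assumption: the key is the first sep-separated field of each target line.
    """
    lookup = {}
    for pair in pairs:
        key, _, value = pair.rstrip("\n").partition(sep)
        lookup[key] = value

    for line in lines:
        body = line.rstrip("\n")
        ending = line[len(body):]
        key = body.partition(sep)[0]
        if key in lookup:
            # Same effect as the Bash loop's s/$/^rep/ on this line
            body += sep + lookup[key]
        yield body + ending


pairs = ["101^apple\n", "102^pear\n"]
target = ["102^x\n", "999^y\n", "101^z\n"]
updated = list(append_by_key(pairs, target))
print("".join(updated), end="")
```

Each dict lookup takes roughly constant time, so the work grows with the size of A plus the size of B rather than with their product.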

score 0 · Accepted Answer

Thanks to Oz123 above, here is a sed that is not line-by-line, so your replacement can span newlines. Large files could be a problem.

import re
import shutil
from tempfile import mkstemp

def sed(pattern, replace, source, dest=None):
    """Reads a source file and writes the destination file.

    Replaces pattern with replace globally through the file.
    This is not line-by-line so the pattern can span newlines.

    Args:
        pattern (str): pattern to match (can be re.pattern)
        replace (str): replacement str
        source  (str): input filename
        dest (str):   destination filename, if not given, source will be overwritten.
    """

    if dest:
        fout = open(dest, 'w')
    else:
        # Wrap the descriptor from mkstemp() so it is closed along with fout.
        fd, name = mkstemp()
        fout = open(fd, 'w')

    with open(source, 'r') as file:
        # The whole file is read into memory at once.
        data = file.read()

        p = re.compile(pattern)
        new_data = p.sub(replace, data)
        fout.write(new_data)

    fout.close()

    if not dest:
        shutil.move(name, source)
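The difference from a line-by-line substitution only shows up when the pattern itself contains a newline. As a small illustration on an in-memory string (the sample text and pattern are made up for this sketch), the per-line pass leaves the text untouched while the whole-text pass joins the two lines:

```python
import re

text = "Time\nBanana\nspinach\nturkey\n"
pattern = r"Banana\nspinach"

# Line by line: no single line contains both words, so nothing matches.
by_line = "".join(
    re.sub(pattern, "Banana spinach", line)
    for line in text.splitlines(keepends=True)
)

# Whole text: the pattern can match across the line boundary.
whole = re.sub(pattern, "Banana spinach", text)

print(by_line == text)  # the per-line pass changed nothing
print(whole, end="")
```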
