python - How to modify a file to replace strings that match this pattern

Question

I have a JSON file like the following:

{
    "title": "Pilot",
    "image": [
        {
            "resource": "http://images2.nokk.nocookie.net/__cb20110227141960/notr/images/8/8b/pilot.jpg",
            "description": "not yet implemented"
        }
    ],
    "content": "<p>The pilot ...</p>"
},
{
    "title": "Special Christmas (Part 1)",
    "image": [
        {
            "resource": "http://images1.nat.nocookie.net/__cb20090519172121/obli/images/e/ed/SpecialChristmas.jpg",
            "description": "not yet implemented"
        }
    ],
    "content": "<p>Last comment...</p>"
}

I need to replace the content of every resource value in the file, so that if the string has the form:

"http://images1.nat.nocookie.net/__cb20090519172121/obli/images/e/ed/SpecialChristmas.jpg"

the result is:

"../img/SpecialChristmas.jpg"

Could someone tell me how to match that pattern so that I can modify the file?

I tried something like this recommendation:

https://stackoverflow.com/a/4128192/521728

but I don't know how to adapt it to my situation.

Thanks in advance!

score 1 · Accepted Answer

Use a regular expression with a group:

from io import StringIO
import re

reader = StringIO("""{
    "title": "Pilot",
    "image": [
        {
            "resource": "http://images2.nokk.nocookie.net/__cb20110227141960/notr/images/8/8b/pilot.jpg",
            "description": "not yet implemented"
        }
    ],
    "content": "<p>The pilot ...</p>"
},
{
    "title": "Special Christmas (Part 1)",
    "image": [
        {
            "resource": "http://images1.nat.nocookie.net/__cb20090519172121/obli/images/e/ed/SpecialChristmas.jpg",
            "description": "not yet implemented"
        }
    ],
    "content": "<p>Last comment...</p>"
}""")

# to open a file just use reader = open(filename)

text = reader.read()
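# the dot before "jpg" is escaped so it only matches a literal "."
pattern = r'"resource": ".+/(.+)\.jpg"'
# raw string, so that \g<1> reaches re.sub as a group reference
replacement = r'"resource": "../img/\g<1>.jpg"'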
text = re.sub(pattern, replacement, text)

print(text)

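Let me explain the pattern. It looks for text that starts with "resource": ", followed by one or more characters before a slash, then one or more characters before .jpg". The parentheses () mean that I want whatever is inside them as a group. Since there is only one set of parentheses, I can access it in the replacement with '\g<1>'. (Note that '\g<0>' refers to the whole matched string: "resource": etc.)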
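To see the group references in isolation, here is a minimal sketch that applies the same idea to a single line. The sample line is taken from the question. The variable names are made up for illustration, and the dot before jpg is escaped here so that it only matches a literal dot.

```python
import re

line = '"resource": "http://images2.nokk.nocookie.net/__cb20110227141960/notr/images/8/8b/pilot.jpg",'

# Escaping the dot makes the pattern match a literal ".jpg".
pattern = r'"resource": ".+/(.+)\.jpg"'

# \g<1> is the text captured by the first (and only) pair of parentheses.
with_group_1 = re.sub(pattern, r'"resource": "../img/\g<1>.jpg"', line)

# \g<0> is the entire match, so this substitution leaves the line unchanged.
with_group_0 = re.sub(pattern, r'\g<0>', line)

print(with_group_1)
print(with_group_0 == line)
```

Because the first .+ is greedy, it consumes everything up to the last slash, which is why the group ends up holding only the file name.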

score 1 · Accepted Answer

Here is my answer. It is not as concise, but you can adjust the regular expression passed to re.search to any regular expression you like.

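import re

# Copy test.json to new.json, rewriting every line that mentions a .jpg file.
with open("test.json") as src, open("new.json", "wt") as out:
    for line in src:
        match = re.search(r"\.jpg", line)
        if match:
            # The file name (plus the closing quote and comma) is
            # everything after the last slash.
            sp_str = line.split("/")
            new_line = '\t"resource": ' + '"../img/' + sp_str[-1]
            out.write(new_line)
        else:
            out.write(line)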
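The same line-by-line idea can be combined with a capturing group so that each line keeps its original indentation and trailing comma. The following is only a sketch: the helper name rewrite_resources, its parameters, and the RESOURCE_RE constant are hypothetical and not part of the answer above. It assumes that every "resource" value sits on its own line and ends in .jpg, as in the question's file.

```python
import re

# Matches the URL part of a "resource" line and captures the file name.
# Assumption: the value ends in ".jpg", as in the question's sample.
RESOURCE_RE = re.compile(r'("resource":\s*")[^"]*/([^"/]+\.jpg)(")')


def rewrite_resources(src_path, dst_path, new_dir="../img/"):
    """Copy src_path to dst_path, pointing every resource at new_dir.

    Hypothetical helper for illustration; returns the number of values changed.
    """
    changed = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as out:
        for line in src:
            # Keep the key and quotes (groups 1 and 3), swap the directory.
            new_line, count = RESOURCE_RE.subn(
                lambda m: m.group(1) + new_dir + m.group(2) + m.group(3),
                line,
            )
            changed += count
            out.write(new_line)
    return changed
```

With the file names used in the answer, calling rewrite_resources("test.json", "new.json") would write the converted copy to new.json and leave test.json untouched. A function is used for the replacement instead of a template string so that backslashes in new_dir are never interpreted as group references.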
