I have a string like the following:

u'\'\'\'Joseph Michael "Joe" Acaba\'\'\' (born May 17, 1967) is an [[Teacher|educator]], [[Hydrogeology|hydrogeologist]], and [[NASA]] [[astronaut]].<ref name="bio">{{Cite web|url=http://www.jsc.nasa.gov/Bios/htmlbios/acaba-jm.html|title=Astronaut Bio: Joseph Acaba|month=February | year=2006|publisher=[[NASA|National Aeronautics and Space Administration]]|author=NASA|accessdate=November 26, 2006}}</ref><ref name="bio2">{{Cite web|url=http://oeop.larc.nasa.gov/hep/hep-astronauts.html|title=NASA Hispanic Astronauts\n|publisher=National Aeronautics and Space Administration|author=NASA|accessdate=November 26, 2006}}</ref> In May 2004 he became the first person'

I would like to remove all the text between the markers <ref and ref>, including the markers themselves. I am new to Python and am not sure of the best way to do this.

1 Answer

A regular expression works fine in this case:

import re
ref = re.compile(u'<ref.*?ref>', re.DOTALL)

ref.sub(u'', yourtext)

Note the re.DOTALL flag. Without it, . does not match newlines; there are newlines inside your <ref> sections, and you want those removed as well.

Demo:

>>> import re
>>> tst=u'\'\'\'Joseph Michael "Joe" Acaba\'\'\' (born May 17, 1967) is an [[Teacher|educator]], [[Hydrogeology|hydrogeologist]], and [[NASA]] [[astronaut]].<ref name="bio">{{Cite web|url=http://www.jsc.nasa.gov/Bios/htmlbios/acaba-jm.html|title=Astronaut Bio: Joseph Acaba|month=February | year=2006|publisher=[[NASA|National Aeronautics and Space Administration]]|author=NASA|accessdate=November 26, 2006}}</ref><ref name="bio2">{{Cite web|url=http://oeop.larc.nasa.gov/hep/hep-astronauts.html|title=NASA Hispanic Astronauts\n|publisher=National Aeronautics and Space Administration|author=NASA|accessdate=November 26, 2006}}</ref> In May 2004 he became the first person'
>>> ref = re.compile(u'<ref.*?ref>', re.DOTALL)
>>> ref.sub(u'', tst)
u'\'\'\'Joseph Michael "Joe" Acaba\'\'\' (born May 17, 1967) is an [[Teacher|educator]], [[Hydrogeology|hydrogeologist]], and [[NASA]] [[astronaut]]. In May 2004 he became the first person'
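
One caveat, offered as a sketch rather than as part of the answer above: the pattern <ref.*?ref> assumes every <ref is eventually closed by </ref>. Wikitext can also contain self-closing references such as <ref name="bio" />, and in that case the lazy match would run on to the next ref> and swallow the text in between. Assuming your input may contain such tags, a slightly stricter pattern could look like this (the names REF_PATTERN and strip_refs are hypothetical, chosen only for illustration):

```python
import re

# Hypothetical names, not from the original answer.
# First alternative: self-closing tag, e.g. <ref name="bio" />
# Second alternative: paired tag <ref ...> ... </ref>, body may span lines.
# \b keeps the pattern from matching other tags such as <references />.
REF_PATTERN = re.compile(
    r'<ref\b[^>]*/>'
    r'|<ref\b[^>]*>.*?</ref>',
    re.DOTALL | re.IGNORECASE,
)


def strip_refs(text):
    """Return text with all <ref> elements removed."""
    return REF_PATTERN.sub('', text)


if __name__ == '__main__':
    sample = 'A.<ref name="bio">{{Cite\n web}}</ref><ref name="bio" /> B'
    print(repr(strip_refs(sample)))
```

For the string in the question, which contains only paired tags, this should behave the same as the simpler pattern; it only differs when self-closing or similarly named tags appear.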
answered 2012-10-10T17:29:10.843