python - Python etree: controlling the empty tag format

Question

When creating an XML file with Python's etree, writing an empty tag to the file using SubElement produces the following:

<MyTag />

Unfortunately, the XML parser library used by our Fortran code does not handle this, even though it is a correct tag. It needs to see:

<MyTag></MyTag>

Is there a way to change the formatting rules, or something else in etree, to make this work?

score 18 · Accepted Answer

As of Python 3.4, you can use the short_empty_elements argument for both the tostring() function and the ElementTree.write() method:

>>> from xml.etree import ElementTree as ET
>>> ET.tostring(ET.fromstring('<mytag/>'), short_empty_elements=False)
b'<mytag></mytag>'
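The same keyword works when writing to a file. Here is a minimal sketch, assuming Python 3.4 or later; the in-memory buffer just stands in for a real file opened in binary mode, and the element names are made up for illustration.

```python
import io
from xml.etree import ElementTree as ET

# Build a tiny tree with one empty child (no text, no children).
root = ET.Element("root")
ET.SubElement(root, "MyTag")

# io.BytesIO stands in for open("out.xml", "wb") in this sketch.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, short_empty_elements=False)

xml_bytes = buffer.getvalue()
print(xml_bytes)
```

With short_empty_elements left at its default of True, the child would be written as <MyTag /> instead.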

In older Python versions (2.7 through 3.3), as a workaround you can use the html method to write out the document:

>>> from xml.etree import ElementTree as ET
>>> ET.tostring(ET.fromstring('<mytag/>'), method='html')
'<mytag></mytag>'

Both the ElementTree.write() method and the tostring() function support the method keyword argument.

On even earlier versions of Python (2.6 and before) you can install the external ElementTree library; version 1.3 supports that keyword.

Yes, it sounds a little weird, but the html output mostly outputs empty elements as a start and end tag. Some elements still end up as empty tag elements; specifically <link/>, <input/>, <br/> and such. Still, it's that or upgrade your Fortran XML parser to actually parse standards-compliant XML!
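To see which elements the html method treats specially, it may help to serialize a small mixed document and look at the result. This is only a sketch; the element names are arbitrary. On the CPython versions I have checked, elements such as br are written as a bare start tag with no end tag, which is not well-formed XML, so it is worth checking the output against your own tag names before relying on this workaround.

```python
from xml.etree import ElementTree as ET

# One ordinary empty element and one element HTML treats as "void".
doc = ET.fromstring("<root><mytag/><br/></root>")

html_out = ET.tostring(doc, method="html")
print(html_out)

# The ordinary element gets an explicit start and end tag.
has_long_form = b"<mytag></mytag>" in html_out
# The void element does not get an end tag at all.
br_has_end_tag = b"</br>" in html_out
print(has_long_form, br_has_end_tag)
```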

score 3 · Answer

Adding an empty text is another option:

etree.SubElement(parent, 'child_tag_name').text=''

Note, however, that this changes the structure of the document, not just its representation: child_el.text will be '' instead of None.
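A quick way to check both the structural change and the serialized result on your own setup is sketched below. One caveat, stated as an assumption: the bare etree name in this answer suggests lxml.etree. With the standard library's xml.etree.ElementTree on Python 3, an empty string is falsy, so the serializer still writes the short form unless short_empty_elements=False is passed as well.

```python
from xml.etree import ElementTree as ET

parent = ET.Element("parent")
child = ET.SubElement(parent, "child_tag_name")

text_before = child.text      # None: the element has no text node
child.text = ""
text_after = child.text       # '': the element now has an empty text node

# Standard-library serializer: '' is falsy, so the short form is kept.
default_out = ET.tostring(parent)
# Asking for long empty elements explicitly.
long_out = ET.tostring(parent, short_empty_elements=False)

print(repr(text_before), repr(text_after))
print(default_out)
print(long_out)
```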

Oh, and as Martijn said, try to use a better library.

score 2 · Answer

If sed is available, you can pipe the output of your Python script through

sed -e "s/<\([^>]*\) \/>/<\1><\/\1>/g"

which finds each occurrence of <Tag /> and replaces it with <Tag></Tag>.
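If sed is not available, roughly the same substitution can be done inside the Python script with the re module. This is a sketch under some assumptions: expand_empty_tags is a hypothetical helper name, the pattern is a text-level rewrite rather than a real XML parse, and it assumes the input came from a serializer that escapes < and > inside text and attribute values. Unlike the sed pattern, which captures everything before " />" and so repeats any attributes in the closing tag, this version keeps attributes in the start tag only.

```python
import re

# Group 1: the tag name. Group 2: any attributes (may be empty).
_SELF_CLOSING = re.compile(r"<([^\s<>/]+)([^<>]*?)\s*/>")

def expand_empty_tags(xml_text):
    """Rewrite <Tag ... /> as <Tag ...></Tag> in serialized XML text."""
    return _SELF_CLOSING.sub(r"<\1\2></\1>", xml_text)

sample = '<root><MyTag /><Other id="1" /><Full>x</Full></root>'
expanded = expand_empty_tags(sample)
print(expanded)
```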

score 0 · Answer

Paraphrasing the code, the version of ElementTree.py that I use contains the following in its _write method:

write('<' + tagname)
...
if node.text or len(node): # this line is literal
    write('>')
    ...
    write('</%s>' % tagname)
else:
    write(' />')

To manipulate the program counter, I created the following:

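class AlwaysTrueString(str):
    # An empty string that nevertheless passes a truth test.
    def __nonzero__(self): return True  # Python 2 truth hook
    __bool__ = __nonzero__  # Python 3 looks for __bool__ instead
true_empty_string = AlwaysTrueString()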

Then I set node.text = true_empty_string on each ElementTree node where I want an open/close tag pair rather than a self-closing tag.

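By "manipulating the program counter" I mean constructing a set of inputs to a library method (in this case, an object with a somewhat odd truth test) so that the call to the library method traverses its control flow graph the way I want. This is ridiculously fragile: a new version of the library could break my hack. In general, don't break abstraction barriers. It just happened to work here.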
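For anyone trying this on Python 3, here is a sketch of the same trick. Note that Python 3 uses __bool__ rather than __nonzero__ for truth tests. It leans on the current pure-Python serializer testing the truthiness of node.text, which is an implementation detail and could change, so treat it as a demonstration of the idea rather than something to depend on.

```python
from xml.etree import ElementTree as ET

class AlwaysTrueString(str):
    """An empty string whose truth test nevertheless succeeds."""
    def __bool__(self):       # Python 3 truth hook
        return True
    __nonzero__ = __bool__    # Python 2 spelling of the same hook

true_empty_string = AlwaysTrueString()

node = ET.Element("mytag")
node.text = true_empty_string  # compares equal to '' but tests as true

hacked = ET.tostring(node)
print(hacked)
```

On Python 3.4 and later, short_empty_elements=False gets the same output without the hack.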
