python - How to read a tagged string from an XML file with Python 3

Question

What I have: a line in an XML file that contains <xliff:g> tags, like this:

<string name="AAAAAAA" msgid="XXXXXXX">"Activity <xliff:g id="BBBBBBB">%1$s</xliff:g> isn\'t responding."\n\n"Do you want to close it?"</string>

What I need: to read the whole string, like this:

Activity %1$s isn't responding.\n\nDo you want to close it?

Can you help?

I tried using xml.dom.minidom:

import xml.dom.minidom

dom = xml.dom.minidom.parse(xmlfile)
strings = dom.getElementsByTagName('string')
for string in strings:
    rText = string.childNodes[0].nodeValue
    print(rText)

The result is just "Activity (everything from the <xliff:g> tag onward is missing).

score 0 · Accepted Answer

I assume the element is part of a larger file, for example:

<strings xmlns:xliff="some-name-space">
  <string name="AAAAAAA" msgid="XXXXXXX">"Activity <xliff:g id="BBBBBBB">%1$s</xliff:g> isn\'t responding."\n\n"Do you want to close it?"</string>
  <string name="AAAAAAA" msgid="XXXXXXX">"Another <xliff:g id="BBBBBBB">%1$s</xliff:g>message</string>
</strings>

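minidom is as good a choice as any other framework. Open the file and iterate over all the string elements, calling get_text on each one. get_text, defined below, recursively collects the content (nodeValue) of every text node under the element, including text inside nested elements such as <xliff:g>.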

import xml.dom.minidom as md

def get_text(el):
    """get_text
    For text nodes, return the text. For element nodes, recursively call the
    function to aggregate all the text nodes into a string."""
    msg = ''
    for n in el.childNodes:
        if n.nodeType == n.TEXT_NODE:
            msg += n.nodeValue
        elif n.nodeType == n.ELEMENT_NODE:
            msg += get_text(n)
    return msg

# get_text must be defined before the loop below calls it
dom = md.parse('wu.xml')
strings = dom.getElementsByTagName('string')
for string in strings:
    print(get_text(string))
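
To try the recursive approach without a file on disk, here is a minimal sketch that parses the sample from the question with parseString. Collecting the text nodes alone still leaves the double quotes and the \' escape in place, so the sketch adds a clean-up step. The helper name clean and its two replacement rules are my own assumptions about what the question wants (they are not part of minidom), and the namespace URI is the same placeholder used above.

```python
import xml.dom.minidom as md

# Sample taken from the question; the namespace URI is a placeholder.
# A raw string keeps \' and \n as literal backslash sequences, as in the file.
SAMPLE = r'''<strings xmlns:xliff="some-name-space">
  <string name="AAAAAAA" msgid="XXXXXXX">"Activity <xliff:g id="BBBBBBB">%1$s</xliff:g> isn\'t responding."\n\n"Do you want to close it?"</string>
</strings>'''


def get_text(el):
    """Concatenate every text node below el, in document order."""
    parts = []
    for n in el.childNodes:
        if n.nodeType == n.TEXT_NODE:
            parts.append(n.nodeValue)
        elif n.nodeType == n.ELEMENT_NODE:
            parts.append(get_text(n))
    return ''.join(parts)


def clean(text):
    """Hypothetical helper: drop double quotes, turn backslash-apostrophe into an apostrophe."""
    return text.replace('"', '').replace("\\'", "'")


dom = md.parseString(SAMPLE)
results = [clean(get_text(s)) for s in dom.getElementsByTagName('string')]
for line in results:
    print(line)
```

The literal \n sequences are left untouched because the desired output in the question keeps them. If your strings can contain quotes that should survive, the clean-up rules would need to be stricter than this.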

There are many other ways to do this.
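
As one example of another way, the standard library's xml.etree.ElementTree has an itertext() method that walks an element and yields all of its inner text in document order, so no hand-written recursion is needed. This is a sketch that assumes the same sample and placeholder namespace as above; it returns the raw text, with the quotes and backslash escapes still in place.

```python
import xml.etree.ElementTree as ET

# Sample taken from the question; the namespace URI is a placeholder.
SAMPLE = r'''<strings xmlns:xliff="some-name-space">
  <string name="AAAAAAA" msgid="XXXXXXX">"Activity <xliff:g id="BBBBBBB">%1$s</xliff:g> isn\'t responding."\n\n"Do you want to close it?"</string>
</strings>'''

root = ET.fromstring(SAMPLE)

# itertext() yields the element's own text, the text of nested elements,
# and the text that follows each nested element, in document order.
texts = [''.join(el.itertext()) for el in root.iter('string')]
for t in texts:
    print(t)
```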

score 0 · Accepted Answer

You can use an XML parser such as BeautifulSoup, which (in my opinion) is very easy to use.

>>> myxml = "thexmlyouposted"
>>> from bs4 import BeautifulSoup as BS
>>> soup = BS(myxml, 'xml')
>>> print(soup.find('string').text)
"Activity %1$s isn't responding."

"Do you want to close it?"
