python - Python feedparser

Question

How do I parse XML data like the following using Python feedparser?

<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>

score 4 · Accepted Answer

This is nothing like an RSS/Atom feed, so I wouldn't use feedparser for it at all; I'd use lxml. In fact, feedparser can't make sense of it and drops the "Jason" contributor from your example.

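from lxml import etree

# etree.parse() takes a filename, URL, or file-like object.
data = "book_api.xml"  # placeholder: fetch the data however you like
tree = etree.parse(data)
root = tree.getroot()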

Now you have a tree of XML objects. How to do anything more specific with lxml is impossible to say until you provide actually valid XML data. ;)
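
As a rough illustration of what "a tree of XML objects" looks like in practice, here is a minimal sketch. It assumes the data arrives as bytes in a file-like object, and it uses the standard library's xml.etree.ElementTree (whose parse()/getroot() calls lxml.etree mirrors) so it runs without third-party packages. The io.BytesIO wrapper and the variable names are illustrative assumptions, not part of the original answer.

```python
import io
import xml.etree.ElementTree as etree  # lxml.etree offers the same parse()/getroot() calls

# Stand-in for "fetch the data somehow": the question's XML in a file-like object.
data = io.BytesIO(
    b"<Book_API>"
    b"<Contributor_List><Display_Name>Jason</Display_Name></Contributor_List>"
    b"<Contributor_List><Display_Name>John Smith</Display_Name></Contributor_List>"
    b"</Book_API>"
)

tree = etree.parse(data)  # parse() returns an ElementTree wrapper
root = tree.getroot()     # the <Book_API> element itself

# Direct children of the root, in document order.
child_tags = [child.tag for child in root]
print(root.tag, child_tags)
```

Iterating over an element yields its direct children, so both Contributor_List entries show up here, including the one feedparser would lose.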

score 2 · Accepted Answer

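As Lennart Regebro mentioned, this doesn't look like an RSS/Atom feed, just a plain XML document. The Python standard library has several XML parsing facilities (both SAX and DOM). I recommend ElementTree. Among third-party libraries, lxml is the best choice (it is a drop-in replacement for ElementTree).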

try:
    from lxml import etree
except ImportError:
    try:
        # C-accelerated ElementTree; this module was removed in Python 3.9
        import xml.etree.cElementTree as etree
    except ImportError:
        import xml.etree.ElementTree as etree

doc = """<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>"""
xml_doc = etree.fromstring(doc)
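
Once the document is parsed, pulling out the contributor names could look something like the sketch below. It sticks to the standard library's ElementTree so it runs anywhere; the same findall() call should also work on an lxml element. The variable name `names` is just for illustration.

```python
import xml.etree.ElementTree as etree

doc = """<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>"""
xml_doc = etree.fromstring(doc)  # fromstring() returns the root element directly

# The path is relative to the root: every Display_Name under a Contributor_List.
names = [el.text for el in xml_doc.findall("Contributor_List/Display_Name")]
print(names)  # ['Jason', 'John Smith']
```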
