python - Replace HTML entities with the corresponding UTF-8 characters in Python 2.6

Question

I have HTML text like this:

&lt;xml ... &gt;

I want to convert it into something readable:

<xml ...>

Is there an easy (and fast) way to do this in Python?

score 25 · Accepted Answer

Python >= 3.4

Official documentation for html.unescape(): Python 3

>>> from html import unescape
>>> unescape('&copy; &euro;')
'© €'

Python 3.0 to 3.8

Official documentation for HTMLParser: Python 3

>>> from html.parser import HTMLParser
>>> pars = HTMLParser()
>>> pars.unescape('&copy; &euro;')
'© €'

Note: HTMLParser.unescape() was deprecated in Python 3.4 in favor of html.unescape() and removed in Python 3.9.

Python 2.7

Official documentation for HTMLParser: Python 2.7

>>> import HTMLParser
>>> pars = HTMLParser.HTMLParser()
>>> pars.unescape('&copy; &euro;')
u'\xa9 \u20ac'
>>> print _
© €