python - Can't match a tag using XPath with lxml in Python

Question

Here is my code:

def extractContent(self,html):
    parser = etree.XMLParser(ns_clean=True, recover=True)
    print html.find('id="detail"')
    tree = etree.fromstring(html,parser)
    if tree is not None:
      for c in self.contents:
        m = tree.xpath(c['xpath'])
        print m,c['xpath']
        if len(m) >= 1:
          print c['name'] + ' : ' + m[0].text

I'm trying to match //*[@id="i-detail"]/li[1] in the HTML source, but nothing is returned.

The output of the code above is:

25803
[] //*[@id="i-detail"]/li[1]

This is the HTML:

<div class="mc fore tabcon">
                    <ul id="i-detail">
                        <li title="XXXXXXXXX">**AAAAAAAAAAA**(what i want to match)</li>
                        <li>BBBBBBBBB</li>
.......

I also tried the XPath in the interactive interpreter:

>>> root.xpath('//*[@id="i-detail"]/li')
>>> []
>>> root.xpath('//*[@id="i-detail"]/*')
>>> [<Element {http://www.w3.org/1999/xhtml}li at 0x1007b7910>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b79b0>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7a50>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7aa0>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7af0>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7b40>, <Element {http://www.w3.org/1999/xhtml}li at 0x1007b7b90>]
>>> root.xpath('//*[@id="i-detail"]/*')[0] <----- this line could get the target !

score 0 · Accepted Answer

It seems to work on my side:

>>> s = """<div class="mc fore tabcon">
                    <ul id="i-detail">
                        <li title="XXXXXXXXX">**AAAAAAAAAAA**(what i want to match)</li>
                        <li>BBBBBBBBB</li>
                    </ul>
</div>"""
>>> parser = etree.XMLParser(ns_clean=True, recover=True)
>>> root = etree.fromstring(s, parser)
>>> for node in root.xpath('//*[@id="i-detail"]/li[1]'):
    print node, node.text


<Element li at 0x12534b8> **AAAAAAAAAAA**(what i want to match)
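
The fragment above has no namespace, which is probably why it matches here. The interpreter output in the question shows tags such as {http://www.w3.org/1999/xhtml}li, which suggests the real page declares the XHTML default namespace. In XPath an unprefixed name like li only matches elements in no namespace, so li finds nothing while * still matches. Below is a minimal sketch of three possible workarounds. The sample document and the prefix "x" are made up for illustration, and it assumes the real page's root element carries the same xmlns declaration:

```python
from lxml import etree

# Hypothetical sample: the fragment from the question, wrapped in a root
# element that declares the XHTML default namespace (an assumption about
# what the real page looks like).
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<div class="mc fore tabcon">
  <ul id="i-detail">
    <li title="XXXXXXXXX">AAAAAAAAAAA</li>
    <li>BBBBBBBBB</li>
  </ul>
</div>
</body></html>"""

parser = etree.XMLParser(ns_clean=True, recover=True)
root = etree.fromstring(XHTML, parser)

# Unprefixed "li" only matches elements in no namespace, so this is empty.
plain = root.xpath('//*[@id="i-detail"]/li[1]')

# Option 1: bind a prefix (the name "x" is arbitrary) to the XHTML namespace.
ns = {"x": "http://www.w3.org/1999/xhtml"}
prefixed = root.xpath('//*[@id="i-detail"]/x:li[1]', namespaces=ns)

# Option 2: compare local names so the namespace is ignored.
by_local = root.xpath('//*[@id="i-detail"]/*[local-name()="li"][1]')

# Option 3: parse with the HTML parser, which does not put elements
# into a namespace, so the original expression works unchanged.
html_root = etree.fromstring(XHTML, etree.HTMLParser())
html_match = html_root.xpath('//*[@id="i-detail"]/li[1]')

print(prefixed[0].text)
```

If the input is an ordinary web page rather than strict XML, option 3 may be the simplest change, since the code in the question would only need a different parser.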
