python - How do I extract text with lxml in this scraper program?

Question

I'm trying to scrape text data from a specific element on this page (using scraperwiki):

import requests
from lxml import html

response = requests.get('http://portlandmaps.com/detail.cfm?action=Assessor&propertyid=R246274')

tree = html.fromstring(response.content)
owner = tree.xpath('/html/body/div[2]/table[1]/tbody/tr[11]/td[2]')

print(owner.text)

The scraperwiki console then returns:

AttributeError: 'list' object has no attribute 'text'

I found the XPath using Google Chrome, and I assume requests follows the same standards as Chrome.

score 0 · Accepted Answer

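That's because what you're looking for doesn't exist at that path. Note also that xpath() returns a list of matches (empty when nothing matches), not a single element, which is why calling .text on it raises the AttributeError. Try the parent element first.

Once that works, try this: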

owner[0].text
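
To illustrate the point about lists, here is a minimal sketch that runs without network access. The `SAMPLE` markup and the name "JANE DOE" are made up for the demo and are not taken from the real page; only the `xpath()` behavior is the point. It shows that `xpath()` returns a list for element-selecting expressions, and that indexing `[0]` is only safe when the list is non-empty.

```python
from lxml import html

# Hypothetical stand-in markup; the real page's layout is not shown in the question.
SAMPLE = """
<html><body>
<table>
  <tr><td>Owner</td><td>JANE DOE</td></tr>
</table>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# An element-selecting XPath always yields a list, even for a single match.
matches = tree.xpath('//table/tr/td[2]')
# A path that matches nothing yields an empty list, not None.
missing = tree.xpath('//table/tr[11]/td[2]')

# Guard before indexing so an empty result does not raise IndexError.
owner_text = matches[0].text if matches else None
missing_text = missing[0].text if missing else None

print(owner_text)
print(missing_text)
```

If the guarded value comes back as `None` on the real page, that suggests the path itself matches nothing, which is the situation the answer describes.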

If you can't find or remember which tr you need, grab the second td of every row (XPath indexes start at 1, so td[2] is the second cell):

tree = html.fromstring(response.content)
owner = tree.xpath('/html/body/div[2]/table[1]/tbody/tr/td[2]')

texts = [o.text for o in owner]
print(texts)

Then pick the one you need and adjust your code accordingly. Hope this helps.
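
One possible reason the path matches nothing, which the answer does not state and which is only an assumption about this page: Chrome's inspector shows the DOM after the browser has inserted `tbody` elements into tables, while lxml parses the raw HTML and does not add them. The sketch below uses made-up `SAMPLE` markup (not the real page) to show the same XPath with and without `tbody`, and uses `text_content()` so that cells containing nested tags still yield their full text.

```python
from lxml import html

# Hypothetical markup written without <tbody>, as raw HTML often is.
SAMPLE = """
<html><body>
<div>header</div>
<div>
  <table>
    <tr><td>Owner</td><td><b>JANE</b> DOE</td></tr>
    <tr><td>Address</td><td>123 MAIN ST</td></tr>
  </table>
</div>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# Path as copied from a browser inspector: includes tbody, so nothing matches here.
with_tbody = tree.xpath('/html/body/div[2]/table[1]/tbody/tr/td[2]')
# Same path with tbody dropped: matches the second cell of each row.
without_tbody = tree.xpath('/html/body/div[2]/table[1]/tr/td[2]')

# text_content() collects text from nested tags too, unlike .text.
texts = [td.text_content().strip() for td in without_tbody]

print(with_tbody)
print(texts)
```

If dropping `tbody` makes the list non-empty on the real page, the original `tr[11]` index can be restored in the shortened path.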
