python - How do I convert output to a list and count the items?

Question

I wrote a script that parses a web page and gets the number of links ('a' tags) on that page.

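# Python 2 code, as posted (urllib.urlopen and the print statement are Python 2 only)
import urllib
import lxml.html

# Download the page and parse it into an lxml DOM
connection = urllib.urlopen('http://test.com')
dom = lxml.html.fromstring(connection.read())

# Print the href attribute of every <a> tag
for link in dom.xpath('//a/@href'):
    print link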

Output of the script:

./01.html
./52.html
./801.html
http://www.blablabla.com/1.html
#top

How can I convert this into a list so that I can count the links? I tried link.split(), but that gave me:

['./01.html']
['./52.html']
['./801.html']
['http://www.blablabla.com/1.html']
['#top']

But I want to get:

[./01.html, ./52.html, ./801.html, http://www.blablabla.com/1.html, #top]

Thanks!

score 7 · Accepted Answer

link.split() tries to split the individual link itself. What you need instead is the object that represents all of the links, which in your case is dom.xpath('//a/@href').

So this should help:

links = list(dom.xpath('//a/@href'))

Then get the length with the built-in len function:

print len(links)
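To make the difference concrete, here is a minimal sketch. The hrefs list is a hypothetical stand-in for whatever dom.xpath('//a/@href') yields on the page from the question, so no network access or lxml is needed to run it, and it uses Python 3's print() rather than the Python 2 print statement above.

```python
# Hypothetical stand-in for the strings yielded by dom.xpath('//a/@href')
hrefs = [
    "./01.html",
    "./52.html",
    "./801.html",
    "http://www.blablabla.com/1.html",
    "#top",
]

# Calling split() on each link splits that one string on whitespace,
# so every link turns into its own one-element list.
per_link = [href.split() for href in hrefs]
print(per_link[0])

# Collecting all the links into a single list is what makes counting possible.
links = list(hrefs)
print(links)
print(len(links))
```

Calling split() inside the loop only ever sees one link at a time, which is why it cannot produce a single combined list.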

score 3 · Accepted Answer

list(dom.xpath('//a/@href'))

This takes the iterable that dom.xpath returns, which yields all of the items, and puts them into a list.
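As a rough illustration of what list() does with an iterable, the sketch below uses a hypothetical generator, iter_hrefs, in place of the XPath result. As far as I know, lxml's xpath() already returns a list for a node-set query like this one, so len() can probably be called on it directly; wrapping it in list() is harmless either way.

```python
def iter_hrefs():
    # Hypothetical generator standing in for any iterable of href strings
    for href in ["./01.html", "./52.html", "#top"]:
        yield href

# A generator has no len(), so materialize it into a list first
links = list(iter_hrefs())
count = len(links)

print(links)
print(count)
```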
