python - Selenium: list index out of range error

Question

I am getting

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
IndexError: list index out of range

when I try to run this code. I am trying to print the URL of every tour linked from this page. Can someone tell me what I am doing wrong?

from selenium import webdriver
browser = webdriver.Firefox()
browser.get("http://www.tour-india.net/best-of-india.htm")
cities=browser.find_elements_by_css_selector(".posts1>a>h2")
for i in range(0,len(cities)):
    cities1=browser.find_elements_by_css_selector(".posts1>a>h2")[i]
    cities1.click()
    title=browser.find_elements_by_xpath("//title")
    content=browser.find_elements_by_css_selector(".tours_text_innerpage.content_margin_top")
    currentUrl=browser.current_url
    print currentUrl
    browser.back()

Edit: While making some changes to the code, I added cities = browser.find_elements_by_css_selector(".posts1>a>h2") right after the for statement, and suddenly the index error stopped occurring. Now I am confused about why that happened.

from selenium import webdriver
browser = webdriver.Firefox()
browser.get("http://www.tour-india.net/best-of-india.htm")
cities=browser.find_elements_by_css_selector(".posts1>a>h2")
for i in range(0,len(cities)):
    cities=browser.find_elements_by_css_selector(".posts1>a>h2")
    cities1=browser.find_elements_by_css_selector(".posts1>a>h2")[i]
    cities1.click()
    title=browser.find_elements_by_xpath("//title")
    content=browser.find_elements_by_css_selector(".tours_text_innerpage.content_margin_top")
    currentUrl=browser.current_url
    print currentUrl
    browser.back()

Edit: my full traceback

>>> import traceback
>>> from selenium import webdriver
>>> browser = webdriver.Firefox()
>>> browser.get("http://www.tour-india.net/best-of-india.htm")
>>> cities=browser.find_elements_by_css_selector(".posts1>a>h2")
>>> for i in range(0,len(cities)):      
...     try:
...             #cities=browser.find_elements_by_css_selector(".posts1>a>h2")
...             cities1=browser.find_elements_by_css_selector(".posts1>a>h2")[i]
...             cities1.click()
...             title=browser.find_elements_by_xpath("//title")
...             content=browser.find_elements_by_css_selector(".tours_text_innerpage.content_margin_top")
...             currentUrl=browser.current_url
...             print currentUrl
...             browser.back()
...     except:
...             print traceback.format_exc()
... 
http://www.tour-india.net/golden-triangle.htm
http://www.tour-india.net/golden-triangle-varanasi.htm
http://www.tour-india.net/magnificent-rajasthan.htm
http://www.tour-india.net/northindia-rajasthan-tour.htm
http://www.tour-india.net/north_india_himalaya_tour.htm
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
IndexError: list index out of range

http://www.tour-india.net/southindia-panorma.htm
http://www.tour-india.net/classical-rajasthan-tours.htm
http://www.tour-india.net/rajasthan-forts.htm
http://www.tour-india.net/india-nepal-tour.htm
http://www.tour-india.net/southindia-glimpses.htm
http://www.tour-india.net/enchanting-southindia.htm
http://www.tour-india.net/shekhawati-tours.htm
http://www.tour-india.net/delhi-tour.htm
http://www.tour-india.net/bombay-goa.htm
http://www.tour-india.net/royal-rajasthan.htm
http://www.tour-india.net/grand-mughal.htm
http://www.tour-india.net/north_india_himalaya_tour.htm
http://www.tour-india.net/northindia-images.htm
http://www.tour-india.net/karnataka-heritage.htm
http://www.tour-india.net/leh-ladakh.htm
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
IndexError: list index out of range

http://www.tour-india.net/darjeeling-sikkim.htm
http://www.tour-india.net/himalayan-heritage.htm
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
IndexError: list index out of range

http://www.tour-india.net/rajasthan-goa.htm
http://www.tour-india.net/rajasthan-forts-palaces.htm
http://www.tour-india.net/rajasthan-mp.htm
http://www.tour-india.net/rajasthan-nepal.htm
http://www.tour-india.net/splendid-gujarat.htm

score 1 · Accepted Answer

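Fetching the cities variable again right after the for statement solved the problem. I still don't know why, but it works without errors. Since nobody has posted an answer, I am accepting my own.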

from selenium import webdriver
browser = webdriver.Firefox()
browser.get("http://www.tour-india.net/best-of-india.htm")
cities=browser.find_elements_by_css_selector(".posts1>a>h2")
for i in range(0,len(cities)):
    cities=browser.find_elements_by_css_selector(".posts1>a>h2")
    cities1=browser.find_elements_by_css_selector(".posts1>a>h2")[i]
    cities1.click()
    title=browser.find_elements_by_xpath("//title")
    content=browser.find_elements_by_css_selector(".tours_text_innerpage.content_margin_top")
    currentUrl=browser.current_url
    print currentUrl
    browser.back()
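
One possible explanation, offered as an assumption rather than a confirmed diagnosis: after browser.back() the listing page may not have finished rendering when the elements are looked up again, so the freshly fetched list is occasionally shorter than i + 1. The sketch below polls until the requested index exists. wait_for_index and its timeout and poll parameters are hypothetical names, not part of Selenium.

```python
import time


def wait_for_index(fetch, index, timeout=10.0, poll=0.2):
    """Call fetch() repeatedly until the returned list has an item at `index`.

    fetch -- zero-argument callable returning a list (hypothetical helper;
             in the question it would wrap find_elements_by_css_selector)
    Raises IndexError if the item never appears within `timeout` seconds.
    """
    deadline = time.time() + timeout
    while True:
        items = fetch()
        if len(items) > index:
            return items[index]
        if time.time() >= deadline:
            raise IndexError("no element at index %d after %.1f s" % (index, timeout))
        time.sleep(poll)


# Usage with the question's browser object (assumed to exist):
# city = wait_for_index(
#     lambda: browser.find_elements_by_css_selector(".posts1>a>h2"), i)
# city.click()
```

Selenium also ships WebDriverWait in selenium.webdriver.support.ui for this kind of polling; the helper above only illustrates the idea without depending on it.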

score 1 · Accepted Answer

So you are clicking every link, printing its URL, and then going back? That is very inefficient. With the .get_attribute method you can collect the URLs of all links on the page very quickly.

links = [i.get_attribute('href') for i in driver.find_elements_by_xpath('.//a')]
for i in links:
    print i

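This prints every link on the page. To restrict it to a smaller region of the page, find the "frame" element that contains the links you want and use

frame.find_elements_by_xpath('.//a')

instead. The leading dot matters: without it, //a searches the whole document even when called on an element.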
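
A sketch of that approach applied to the question's page, assuming each h2 matched by .posts1>a>h2 sits inside the a element whose href is wanted (which is what the selector implies). collect_hrefs is a hypothetical helper; it relies only on get_attribute.

```python
def collect_hrefs(link_elements):
    """Return the href of every element that has one, in the given order."""
    hrefs = []
    for element in link_elements:
        href = element.get_attribute("href")
        if href:  # skip anchors without an href attribute
            hrefs.append(href)
    return hrefs


# Usage with the question's browser object (assumed to exist):
# anchors = browser.find_elements_by_css_selector(".posts1>a")
# for url in collect_hrefs(anchors):
#     print(url)
```

Because nothing is clicked, the page never changes and the element list is fetched only once.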

score 0 · Accepted Answer

Use len(cities)-1; len returns a value one greater than the last index Python recognizes for the list.
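
For reference, a quick check of how len relates to the valid indexes, using a placeholder list rather than real page elements. Note that range(0, n) already stops at n - 1.

```python
cities = ["a", "b", "c"]  # placeholder data, not real page elements

last_valid_index = len(cities) - 1             # 2
indexes_visited = list(range(0, len(cities)))  # range excludes its stop value

print(last_valid_index)
print(indexes_visited)
```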

score -3 · Accepted Answer

for i in range(len(cities)):

range only needs one argument here :)

You can also change the loop:

for city in cities:
    city.click()
    # rest is the same

That is more "pythonic".
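
One caveat, stated as general Selenium behavior rather than something tested on this page: element references obtained before a navigation usually become stale once the page changes, so clicking and going back while iterating over the elements themselves can raise StaleElementReferenceException. A sketch that keeps the simpler loop shape by iterating over URL strings instead; visit_each is a hypothetical helper, and driver is any object with get() and current_url.

```python
def visit_each(driver, urls):
    """Load every URL in turn and return the URLs the driver ended up on."""
    visited = []
    for url in urls:  # plain strings stay valid across navigations
        driver.get(url)
        visited.append(driver.current_url)
    return visited


# Usage with the question's browser object (assumed to exist):
# anchors = browser.find_elements_by_css_selector(".posts1>a")
# urls = [a.get_attribute("href") for a in anchors]
# for current in visit_each(browser, urls):
#     print(current)
```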
