python - Pythonでセレンを使用してWebサイトをスクレイピングしながら2つの連続したボタンをクリックする

Question

以下のウェブサイトから国情報をスクレイピングしようとして 'Country'いますExposure。 . セクションの下部にある矢印を使用してページ。私が試したことは何もないようです。Pythonでセレンを使用してそれを行う方法はありますか?

どうもありがとう！

使用したコードは次のとおりです。

    urlpage   = 'https://www.morningstar.com/etfs/xnas/vnqi/portfolio'
    driver = webdriver.Chrome(options=options, executable_path='D:\Python\Python38\chromedriver.exe')
    driver.get(urlpage)
    elements=WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[text()='Country']")))
    for elem in elements:
        elem.click()

これはエラーメッセージです：

TimeoutException                          

Traceback (most recent call last)  
<ipython-input-3-bf16ea3f65c0> in <module>  
    23 driver = webdriver.Chrome(options=options, executable_path='D:\Python\Python38\chromedriver.exe')  
     24 driver.get(urlpage)  
---> 25 elements=WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//a[text()='Country']")))  
     26 for elem in elements:  
     27      elem.click()  
D:\Anaconda\lib\site-packages\selenium\webdriver\support\wait.py in until(self, method, message)  
     78             if time.time() > end_time:  
     79                 break  
---> 80         raise TimeoutException(message, screen, stacktrace)  
     81   
     82     def until_not(self, method, message=''):  
TimeoutException: Message:

申し訳ありませんが、エラーメッセージを適切にフォーマットする方法がわかりません。再度、感謝します。

score 0 · Accepted Answer

あなたが実際に持っているものを確認していないようですHTML。だからあなたは最も重要なことをしなかった。

このページ<a>のテキストには NO があります。Country

あり<input>ますvalue="Country"

このコードは私のために働く

import time
from selenium import webdriver

url = 'https://www.morningstar.com/etfs/xnas/vnqi/portfolio'

driver = webdriver.Chrome()
driver.get(url)

time.sleep(2)

country = driver.find_element_by_xpath('//input[@value="Country"]')
country.click()

time.sleep(1)
next_page = driver.find_element_by_xpath('//a[@aria-label="Go to Next Page"]')
    
while True:
    
    # get data
    table_rows = driver.find_elements_by_xpath('//table[@class="sal-country-exposure__country-table"]//tr')
    for row in table_rows[1:]:  # skip header 
        elements = row.find_elements_by_xpath('.//span')  # relative xpath with `.//`
        print(elements[0].text, elements[1].text, elements[2].text)

    # check if there is next page
    disabled = next_page.get_attribute('aria-disabled')
    #print('disabled:', disabled)
    if disabled:
        break

    # go to next page        
    next_page.click()
    
    time.sleep(1)

結果

Japan 22.08 13.47
China 10.76 1.45
Australia 9.75 6.05
Hong Kong 9.52 6.04
Germany 8.84 5.77
Singapore 6.46 4.33
United Kingdom 6.22 5.77
Sweden 3.48 2.00
France 3.18 2.58
Canada 2.28 2.92
Switzerland 1.78 0.69
Belgium 1.63 1.31
Philippines 1.53 0.15
Israel 1.47 0.16
Thailand 0.98 0.09
India 0.87 0.11
South Africa 0.87 0.21
Taiwan 0.83 0.08
Mexico 0.80 0.33
Spain 0.62 0.84
Malaysia 0.54 0.08
Brazil 0.52 0.06
Austria 0.51 0.16
New Zealand 0.41 0.21
Indonesia 0.37 0.02
Norway 0.37 0.29
United States 0.29 44.09
Netherlands 0.24 0.19
Chile 0.21 0.01
Ireland 0.16 0.19
South Korea 0.15 0.00
Turkey 0.08 0.02
Russia 0.08 0.00
Finland 0.06 0.16
Poland 0.05 0.00
Greece 0.05 0.00
Italy 0.02 0.05
Argentina 0.00 0.00
Colombia 0.00 0.00
Czech Republic 0.00 0.00
Denmark 0.00 0.00
Estonia 0.00 0.00
Hungary 0.00 0.00
Latvia 0.00 0.00
Lithuania 0.00 0.00
Pakistan 0.00 0.00
Peru 0.00 0.00
Portugal 0.00 0.00
Slovakia 0.00 0.00
Venezuela 0.00 0.00

python - Pythonでセレンを使用してWebサイトをスクレイピングしながら2つの連続したボタンをクリックする

1 に答える 1

Related

Reference