python - Scraping JavaScript-rendered pages in Python with Selenium and BeautifulSoup

Question

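I'm trying to scrape a JavaScript-enabled page using BeautifulSoup (BS) and Selenium. This is the code I have so far. For some reason it still doesn't pick up the JavaScript-generated content (and returns a null value). In this case I'm trying to scrape the Facebook comments at the bottom of the page. (Inspect Element shows their class as postText.)
Thanks for the help!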

from selenium import webdriver  
from selenium.common.exceptions import NoSuchElementException  
from selenium.webdriver.common.keys import Keys  
import BeautifulSoup

browser = webdriver.Firefox()  
browser.get('http://techcrunch.com/2012/05/15/facebook-lightbox/')  
html_source = browser.page_source  
browser.quit()

soup = BeautifulSoup.BeautifulSoup(html_source)  
comments = soup("div", {"class":"postText"})  
print comments

score 9 · Accepted Answer

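Your code has a few mistakes, which are fixed below. However, the class "postText" is not defined in the page's original source code, so it must exist somewhere else. My revised version of your code has been tested and works on multiple websites.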

from selenium import webdriver  
from selenium.common.exceptions import NoSuchElementException  
from selenium.webdriver.common.keys import Keys  
from bs4 import BeautifulSoup

browser = webdriver.Firefox()  
browser.get('http://techcrunch.com/2012/05/15/facebook-lightbox/')  
html_source = browser.page_source  
browser.quit()

soup = BeautifulSoup(html_source, 'html.parser')
# The class "postText" is not defined in the page source
comments = soup.find_all('div', {'class': 'postText'})
print(comments)
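
If the class really is missing from the top-level source, two possibilities are that the comments are injected by JavaScript some time after the page loads, or that they are rendered inside an iframe. Neither is confirmed for this page, so the following is only a minimal sketch under those assumptions. The helper names `extract_divs` and `collect_rendered_html` and the 10-second timeout are hypothetical choices made for illustration.

```python
from bs4 import BeautifulSoup


def extract_divs(html_source, css_class):
    """Return the stripped text of every <div> that carries css_class."""
    soup = BeautifulSoup(html_source, "html.parser")
    return [div.get_text(strip=True)
            for div in soup.find_all("div", class_=css_class)]


def collect_rendered_html(url, css_class, timeout=10):
    """Load url in Firefox and return the HTML of the main document
    followed by the HTML of each top-level iframe.

    Selenium is imported inside the function so that extract_divs
    can be used on its own without Selenium installed.
    """
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    browser = webdriver.Firefox()
    try:
        browser.get(url)
        try:
            # Give JavaScript up to `timeout` seconds to add the element.
            WebDriverWait(browser, timeout).until(
                EC.presence_of_element_located((By.CLASS_NAME, css_class))
            )
        except TimeoutException:
            # Not found in the main document; the iframes are checked next.
            pass

        sources = [browser.page_source]
        # Assumption: the content may live in an iframe. Nested iframes
        # are not followed in this sketch.
        for frame in browser.find_elements(By.TAG_NAME, "iframe"):
            browser.switch_to.frame(frame)
            sources.append(browser.page_source)
            browser.switch_to.default_content()
        return sources
    finally:
        browser.quit()
```

Calling `collect_rendered_html(url, "postText")` with the page URL and passing each returned source to `extract_divs(source, "postText")` would show whether the class appears in the main document or in one of its frames. The explicit wait matters because `page_source` read immediately after `get()` may not yet contain elements that scripts add later.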
