python - Exception handling when an input link does not have the expected format

Question

For example, I have a list of links like this:

linklists = ['www.right1.com', 'www.right2.com', 'www.wrong.com', 'www.right3.com']

The html of each of right1, right2, and right3 has this form:

<html>
<p>
hi
</p>
<strong>
hello
</strong>
</html>

The html of www.wrong.com has this form (the actual html is more complicated):

<html>
<p>
hi
</p>
</html>

And I am using code like this:

import re
import urllib2

from BeautifulSoup import BeautifulSoup

stronglist = []
for httplink in linklists:
    url = httplink
    page = urllib2.urlopen(url)
    html = page.read()
    soup = BeautifulSoup(html)
    findstrong = soup.findAll("strong")
    findstrong = str(findstrong)
    findstrong = re.sub(r'\[|\]|\s*<[^>]*>\s*', '', findstrong)  # remove tags
    stronglist.append(findstrong)

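What I want to do is:

Get the html links from the list 'linklists'
Find the data between the <strong> tags
Append it to the list 'stronglist'

But the problem is that there is a wrong link (www.wrong.com) that has no <strong> tag, and then the code reports an error...

What I want is exception handling (or something else) so that, if a link has no 'strong' field (that is, if there is an error), the code appends the string 'null' to stronglist, because it cannot get the data from that link.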

I have been trying to solve this with 'if', but it is a bit difficult for me.

Any suggestions?

score 1 · Accepted Answer

You don't need exception handling. Just detect when the findAll method returns an empty list and handle that case.

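import urllib2

from BeautifulSoup import BeautifulSoup

# link_list is the list of URLs (called linklists in the question)
strong_list = []
for url in link_list:
    soup = BeautifulSoup(urllib2.urlopen(url).read())
    strong_tags = soup.findAll("strong")
    if not strong_tags:
        # No <strong> tag on this page: record a placeholder and move on
        strong_list.append('null')
        continue
    for strong_tag in strong_tags:
        strong_list.append(strong_tag.text)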
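
For anyone who wants to try the same idea on Python 3 without installing anything, here is a minimal sketch. It is not the code above: it swaps BeautifulSoup for the standard library's html.parser, and the names StrongTextParser, strong_texts, collect_strong, and fake_fetch are all hypothetical, made up for this illustration. The fetch step is passed in as a function so the example runs without network access. It also catches OSError around the fetch, on the assumption that you may want 'null' for links that cannot be downloaded at all (urllib.error.URLError is a subclass of OSError).

```python
from html.parser import HTMLParser


class StrongTextParser(HTMLParser):
    """Collects the text inside each <strong>...</strong> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # greater than 0 while inside a <strong> element
        self.buffer = []    # text pieces of the element being read
        self.texts = []     # finished strings, one per <strong> element

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "strong" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.texts.append("".join(self.buffer).strip())
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def strong_texts(html):
    """Return the stripped text of every <strong> element in html."""
    parser = StrongTextParser()
    parser.feed(html)
    parser.close()
    return parser.texts


def collect_strong(urls, fetch):
    """fetch is a hypothetical callable mapping a URL to an HTML string."""
    results = []
    for url in urls:
        try:
            html = fetch(url)
        except OSError:
            # The page could not be fetched at all: record the placeholder
            results.append("null")
            continue
        texts = strong_texts(html)
        # An empty list means the page has no <strong> tag
        results.extend(texts if texts else ["null"])
    return results


# In-memory stand-ins for the pages described in the question
pages = {
    "www.right1.com": "<html>\n<p>\nhi\n</p>\n<strong>\nhello\n</strong>\n</html>",
    "www.wrong.com": "<html>\n<p>\nhi\n</p>\n</html>",
}


def fake_fetch(url):
    if url not in pages:
        raise OSError("unreachable: " + url)
    return pages[url]


print(collect_strong(["www.right1.com", "www.wrong.com", "www.missing.com"], fake_fetch))
# prints ['hello', 'null', 'null']
```

With real pages you would pass a function that downloads the URL instead of fake_fetch. The empty-list check stays the same either way, which is the point of the answer: a missing tag is an ordinary empty result, not an exception.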
