python - Simple Python / Beautiful Soup type question

Question

I'm trying to do some simple string manipulation on the href attribute of a hyperlink extracted with Beautiful Soup.

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup('<a href="http://www.some-site.com/">Some Hyperlink</a>')
href = soup.find("a")["href"]
print href
print href[href.indexOf('/'):]

What I get is:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    print href[href.indexOf('/'):]
AttributeError: 'unicode' object has no attribute 'indexOf'

How can I convert whatever href is into a normal string?

score 10 · Accepted Answer

Python strings do not have an indexOf method.

Use href.index('/') instead.

href.find('/') is similar. However, find returns -1 if the substring is not found, whereas index raises ValueError.

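So index is the right choice here: because '...'[-1] is the last character of the string, a -1 from find would be used as a valid slice position and the failure would go unnoticed.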
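A minimal sketch of the difference, using a plain string literal as a stand-in for the value taken from the question's `<a>` tag (no Beautiful Soup needed to see the behavior). The variable names are illustrative only.

```python
# Stand-in for soup.find("a")["href"] from the question.
href = "http://www.some-site.com/"

# When the substring is present, index() and find() return the same position.
slash_pos = href.index('/')
tail = href[slash_pos:]          # everything from the first '/' onward

# When the substring is absent, find() returns -1 ...
missing = href.find('#')
# ... and slicing from -1 quietly yields just the last character.
bad_slice = href[missing:]

# index() raises ValueError instead, so the problem is visible immediately.
try:
    href.index('#')
    index_raised = False
except ValueError:
    index_raised = True
```

If a missing substring is an expected case rather than an error, find() with an explicit check for -1 works just as well.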

score 0 · Accepted Answer

href is a unicode string. If you need a regular string, use

regular_string = str(href)
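Note that the conversion is not required for the slicing in the question: unicode strings support the same index and find methods. A small sketch, assuming a unicode literal as a stand-in for the value Beautiful Soup returns; the explicit encode call is shown only as one way to get bytes when they are really needed, since str() on a unicode value in Python 2 fails with UnicodeEncodeError if the text contains non-ASCII characters.

```python
# Stand-in for the unicode value returned by soup.find("a")["href"].
href = u"http://www.some-site.com/"

# String methods work directly on unicode strings; no conversion needed.
tail = href[href.index('/'):]

# If a byte string is required, encoding explicitly avoids relying on
# the implicit ASCII conversion.
as_bytes = href.encode('utf-8')
```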

score 0 · Accepted Answer

You mean find(), not indexOf().

See the Python documentation on strings.
