python - Python 2.7, issue with decode('utf-8')

Question

I have:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from urllib2 import urlopen

page2 = urlopen('http://pogoda.yandex.ru/moscow/').read().decode('utf-8')

page = urlopen('http://yasko.by/').read().decode('utf-8')

On the "page = ..." line I get the error "UnicodeDecodeError: 'utf8' codec can't decode byte 0xc3 in position 32: invalid continuation byte", but the "page2 = ..." line raises no error. Why?

The Cyrillic characters on yasko.by start at position 32. How do I decode them correctly?

Thanks!

score 2 · Accepted Answer

The content of http://yasko.by/ is encoded in windows-1251, while the content of http://pogoda.yandex.ru/moscow/ is encoded in utf-8.
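
The error message itself points to this mismatch. As a small illustration (the two bytes below are made up for the example, not taken from the actual page): in windows-1251 the byte 0xc3 is the Cyrillic letter Г, whereas in utf-8 it is the lead byte of a two-byte sequence and must be followed by a continuation byte.

```python
# -*- coding: utf-8 -*-
# Illustrative bytes only: "Гл" as it would be stored in windows-1251.
cp1251_bytes = b'\xc3\xeb'

# Decoding with the matching codec gives the Cyrillic text.
as_cp1251 = cp1251_bytes.decode('windows-1251')

# Decoding the same bytes as utf-8 fails: 0xc3 announces a two-byte
# sequence, but 0xeb is not a valid continuation byte.
try:
    cp1251_bytes.decode('utf-8')
    reason = None
except UnicodeDecodeError as exc:
    reason = exc.reason
```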

The page = ... line should become:

page = urlopen('http://yasko.by/').read().decode('windows-1251')
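
If you don't know the encoding in advance, one option is to try a few candidates in order. This is only a sketch: decode_page, its declared parameter and the fallback list are hypothetical names for this example, not part of urllib2. utf-8 is tried first because it rejects bytes that are not valid utf-8, while windows-1251 accepts nearly any byte sequence and would silently produce garbage.

```python
# -*- coding: utf-8 -*-

def decode_page(raw, declared=None, fallbacks=('utf-8', 'windows-1251')):
    """Decode raw bytes, trying the declared charset first, then fallbacks.

    `declared` is whatever charset the server or page announced, if any
    (hypothetical parameter; obtaining it is left to the caller).
    """
    candidates = ([declared] if declared else []) + list(fallbacks)
    for name in candidates:
        try:
            return raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next candidate.
            continue
    raise ValueError('none of the candidate encodings fit: %r' % (candidates,))


# Stand-in for the bytes returned by urlopen(...).read().
sample = u'Погода'.encode('windows-1251')
text = decode_page(sample)
```

With the code from the question this would be called as decode_page(urlopen('http://yasko.by/').read()).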
