python - Cannot get text data into Python

Question

I am trying to use this code to retrieve key financial data for a specific company (stock in the code below).

        netIncomeAr = []

        endLink = 'order=asc'   # order=asc&
        try:

            netIncome = urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read()

            splitNI = netIncome.split('\n')
            print('Net Income:')
            for eachNI in splitNI[1:-1]:
                print(eachNI)
                netIncomeAr.append(eachNI)


            incomeDate, income = np.loadtxt(netIncomeAr, delimiter=',',unpack=True,
                                            converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

        except Exception as e:
            print('failed in the Quandl grab')
            print(str(e))
            time.sleep(555)

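However, I get the error message 'failed in the Quandl grab'. I know the error must be in the first line, the one that runs urllib.request against Quandl.

Does anyone know why this code does not work?

OK - thanks, Roland.

I changed the code to this limited proof-of-concept snippet: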

import urllib.request, urllib.error, urllib.parse
import time
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.dates as mdates

evenBetter = ['GOOG','AAPL']


def graphData(stock, MA1, MA2):
    #######################################
    #######################################
    '''
        Use this to dynamically pull a stock from Quandl:
    '''
    print('Currently Pulling',stock)

    netIncomeAr = []
#    revAr = []
#    ROCAr = []

    endLink = 'order=asc'

    netIncome = str(urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read())[2:-1]
    # convert to string, remove leading "b'" and trailing "'" characters.
    # netIncome = 'head\\ndata\\ndata\\n...'


    splitNI = netIncome.split('\\')[1:-1]
    # data segments still have leading 'n' character.
    # the [1:-1] is more pythonic and releases memory.
    for i in range (len(splitNI)):
        splitNI[i] = splitNI[i][1:]
    # data segments are now converted.

    print('Net Income:')
    for eachNI in splitNI:
        print(eachNI)
        netIncomeAr.append(eachNI)


    incomeDate, income = np.loadtxt(netIncomeAr, delimiter=',',unpack=True,
                                    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

for stock in evenBetter:
    graphData(stock,25,50)

Now I am past the urllib.request problem and on to another one. The error is below:

Currently Pulling GOOG
Net Income:
2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0
Traceback (most recent call last):

  File "<ipython-input-3-5ce0b8405254>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 57, in <module>
    graphData(stock,25,50)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 54, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 860, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 860, in <listcomp>
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\matplotlib\dates.py", line 261, in __call__
    return date2num(datetime.datetime(*time.strptime(s, self.fmt)[:6]))

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\_strptime.py", line 494, in _strptime_time
    tt = _strptime(data_string, format)[0]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\_strptime.py", line 306, in _strptime
    raise TypeError(msg.format(index, type(arg)))

TypeError: strptime() argument 0 must be str, not <class 'bytes'>

With Davse Bamse's suggestion I get the following traceback (this is a tough one):

Currently Pulling GOOG
Net Income:
Traceback (most recent call last):

  File "<ipython-input-3-c3f1db0f3995>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py", line 59, in <module>
    graphData(stock)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py", line 56, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 845, in loadtxt
    converters[i] = conv

IndexError: list assignment index out of range

With Davse Bamse's new suggestion, using a list in the converter line like this:

[incomeDate, income] = np.loadtxt(netIncomeAr, delimiter=',',unpack=True,
                                converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

I get the following error:

Currently Pulling GOOG
Net Income:
C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py:823: UserWarning: loadtxt: Empty input file: "[]"
  warnings.warn('loadtxt: Empty input file: "%s"' % fname)
Traceback (most recent call last):

  File "<ipython-input-1-c3f1db0f3995>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py", line 60, in <module>
    graphData(stock)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/sentdex_Test_comp_screener_own_webscraper2.py", line 56, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 845, in loadtxt
    converters[i] = conv

IndexError: list assignment index out of range

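Thanks for your input from 12 Oct. 2015, Davse Bamse.

However, I do not see where to insert the .join that you mention...

Could you copy this snippet and post your (edited) suggestion? I need to see the light! This is what I have now, after all edits up to and including 12 Oct.: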

import urllib.request, urllib.error, urllib.parse
import numpy as np
import matplotlib.dates as mdates

stocklist = ['GOOG']


def graphData(stock, MA1, MA2):
    #######################################
    #######################################
    '''
        Use this to dynamically pull a stock from Quandl:
    '''
    print('Currently Pulling',stock)

    netIncomeAr = []

    endLink = 'order=asc'   # order=asc&

    netIncome = str(urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read())[2:-1]
    # convert to string, remove leading "b'" and trailing "'" characters.
    # netIncome = 'head\\ndata\\ndata\\n...'


    splitNI = netIncome.split('\\')[1:-1]
    # data segments still have leading 'n' character.
    # the [1:-1] is more pythonic and releases memory.
    for i in range (len(splitNI)):
        splitNI[i] = splitNI[i][1:]
    # data segments are now converted.

    print('Net Income:')
    for eachNI in splitNI:
        print(eachNI)
        netIncomeAr.append(eachNI)


    incomeDate, income = np.loadtxt(netIncomeAr, delimiter=',',unpack=True,
                                    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

for stock in stocklist:
    graphData(stock,25,50)

With today's (13 Oct. 2015) input from Davse Bamse, I get the following error:

Currently Pulling GOOG
Net Income:
2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0
Traceback (most recent call last):

  File "<ipython-input-13-5ce0b8405254>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 54, in <module>
    graphData(stock,25,50)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 51, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 740, in loadtxt
    fh = iter(open(fname))

OSError: [Errno 22] Invalid argument: '2009-12-31,6520448000.0\n2010-12-31,8505000000.0\n2011-12-31,9737000000.0\n2012-12-31,10737000000.0\n2013-12-31,12920000000.0'

Davse Bamse suggested using io.StringIO like this:

incomeDate, income = StringIO(np.loadtxt('\n'.join(netIncomeAr), delimiter=',',unpack=True,
                                converters={ 0: mdates.strpdate2num('%Y-%m-%d')}))

But this gives the same error as before... any ideas?
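One likely reason the error is unchanged: in that call the joined string is still handed to np.loadtxt directly, and loadtxt treats a plain string as a file name. The StringIO wrapper has to go around the text, inside the loadtxt call. The following is a minimal sketch, assuming sample rows copied from the output above. It reads only the value column and leaves the date converter out, since the date column needs separate handling.

```python
from io import StringIO

import numpy as np

# Sample rows copied from the printed output above (stand-in for netIncomeAr).
net_income_rows = [
    "2009-12-31,6520448000.0",
    "2010-12-31,8505000000.0",
    "2011-12-31,9737000000.0",
]

# Wrap the joined text in a file-like object. A bare string would be
# interpreted by loadtxt as a path to open.
buffer = StringIO("\n".join(net_income_rows))

# Read only column 1 (the income values) as floats.
income = np.loadtxt(buffer, delimiter=",", usecols=(1,))
```

The same file-like wrapper applies whenever text that is already in memory has to be passed to a reader that expects a file.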

Changing the converter line to:

incomeDate, income = np.loadtxt(StringIO('\n'.join(netIncomeAr)), delimiter=',',unpack=True,
                                converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

gives the following stack trace:

Currently Pulling GOOG
Net Income:
2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0
Traceback (most recent call last):

  File "<ipython-input-26-5ce0b8405254>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 60, in <module>
    graphData(stock,25,50)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 57, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 860, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 860, in <listcomp>
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\matplotlib\dates.py", line 261, in __call__
    return date2num(datetime.datetime(*time.strptime(s, self.fmt)[:6]))

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\_strptime.py", line 494, in _strptime_time
    tt = _strptime(data_string, format)[0]

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\_strptime.py", line 306, in _strptime
    raise TypeError(msg.format(index, type(arg)))

TypeError: strptime() argument 0 must be str, not <class 'bytes'>
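The TypeError above suggests that this NumPy version hands each field to the converter as bytes, while strptime only accepts str. Below is a minimal sketch of a converter that tolerates both. Note that parse_date is a hypothetical helper, not part of NumPy or matplotlib, and that it returns a plain ordinal day number as a stand-in rather than matplotlib's date number.

```python
from datetime import datetime


def parse_date(value, fmt="%Y-%m-%d"):
    """Hypothetical converter: accept bytes or str, return an ordinal day number."""
    if isinstance(value, bytes):
        # Decode first, because strptime rejects bytes.
        value = value.decode("utf-8")
    return datetime.strptime(value, fmt).toordinal()


from_bytes = parse_date(b"2009-12-31")
from_text = parse_date("2009-12-31")
```

A function like this could presumably be passed as converters={0: parse_date} in place of mdates.strpdate2num('%Y-%m-%d'), though that has not been verified against NumPy 1.9.2.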

Instead of NumPy's loadtxt (I am on np 1.9.2), I found another method, np.genfromtxt, which apparently can do this, as described in the solution to "numpy.loadtxt does not read file with complex numbers".

So I use this converter line instead:

incomeDate, income = np.genfromtxt('\n'.join(netIncomeAr), delimiter=',',unpack=True,
                                converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

Output:

Currently Pulling GOOG
Net Income:
2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0
Traceback (most recent call last):

  File "<ipython-input-10-5ce0b8405254>", line 1, in <module>
    runfile('C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py', wdir='C:/Users/Morten/Google Drev/SpyderProject/test')

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
    execfile(filename, namespace)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 85, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 50, in <module>
    graphData(stock,25,50)

  File "C:/Users/Morten/Google Drev/SpyderProject/test/Test_sentdex_comp_screener_own_webscraper2.py", line 47, in graphData
    converters={ 0: mdates.strpdate2num('%Y-%m-%d')})

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\npyio.py", line 1366, in genfromtxt
    fhd = iter(np.lib._datasource.open(fname, 'rb'))

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\_datasource.py", line 151, in open
    return ds.open(path, mode)

  File "C:\Program Files\WinPython-64bit-3.3.5.7\python-3.3.5.amd64\lib\site-packages\numpy\lib\_datasource.py", line 501, in open
    raise IOError("%s not found." % path)

OSError: 2009-12-31,6520448000.0
2010-12-31,8505000000.0
2011-12-31,9737000000.0
2012-12-31,10737000000.0
2013-12-31,12920000000.0 not found.

I do not know whether this is better or worse...
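The "not found" message indicates that genfromtxt, like loadtxt, took the joined string for a file path. If NumPy is not strictly required for the parsing step, the rows can also be split with the standard library alone. This is a rough sketch under that assumption. The names income_dates and incomes are illustrative, and the sample text is copied from the output above.

```python
import csv
from datetime import datetime
from io import StringIO

# Sample text copied from the printed output above.
text = "2009-12-31,6520448000.0\n2010-12-31,8505000000.0"

income_dates = []
incomes = []
# csv.reader needs an iterable of lines, so the text is wrapped in StringIO.
for day, value in csv.reader(StringIO(text)):
    income_dates.append(datetime.strptime(day, "%Y-%m-%d").date())
    incomes.append(float(value))
```

The two lists can then be converted to arrays or passed to a plotting call as needed.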

score 0 · Accepted Answer

In Python 3.x, a successful call to urllib.request.urlopen(...).read() returns a bytes object, not a string (str) object.
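This is probably why the original netIncome.split('\n') ends up in the except branch: splitting bytes with a str separator raises a TypeError. Here is a small sketch using a hard-coded bytes value as a stand-in for the downloaded data (the sample content is an assumption based on the output shown in the question):

```python
# Stand-in for urllib.request.urlopen(...).read(); the real call returns bytes.
raw = b"Date,Value\n2009-12-31,6520448000.0\n"

error_message = None
try:
    raw.split("\n")  # str separator on a bytes object
except TypeError as exc:
    error_message = str(exc)

# A bytes separator works, but every piece is still bytes, not str.
pieces = raw.split(b"\n")
```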

Here is a solution that converts the bytes to a string:

...
netIncome = str(urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read())[2:-1]
# convert to string, remove leading "b'" and trailing "'" characters.
# netIncome = 'head\\ndata\\ndata\\n...'
...

splitNI = netIncome.split('\\')[1:-1]
# each data segment still has a leading 'n' character (left over from the literal "\n").
# the [1:-1] slice drops the header row and the trailing segment after the final "\n".
for i in range (len(splitNI)):
    splitNI[i] = splitNI[i][1:]
# data segments are now converted.

print('Net Income:')
for eachNI in splitNI:
    print(eachNI)
    netIncomeAr.append(eachNI)

score 0 · Accepted Answer

As Roland points out, the problem is that what is returned is bytes, not a string.

However, the code should look like this:

netIncomeBytes = urllib.request.urlopen('https://www.quandl.com/api/v3/datasets/RAYMOND/'+stock.upper()+'_NET_INCOME_A.csv?'+endLink).read()
netIncome = netIncomeBytes.decode("utf-8")

This decodes the bytes into a string, using UTF-8.
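To illustrate the difference between decoding and calling str() on the bytes, here is a short sketch with a hard-coded sample response (an assumption modeled on the output in the question, not actual Quandl data):

```python
# Stand-in for the bytes returned by urlopen(...).read().
raw = b"Date,Value\n2009-12-31,6520448000.0\n2010-12-31,8505000000.0\n"

# str() gives the printable representation: it starts with b' and contains
# a literal backslash followed by n instead of real newlines.
as_repr = str(raw)

# decode() gives the actual text, so the usual string methods work.
text = raw.decode("utf-8")
rows = text.splitlines()[1:]  # skip the header row
```

With the decoded text there is no need to strip the b' prefix or the leftover 'n' characters by hand.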
