python - Working with a CSV file in Python

Question

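I'm trying to do the following with a long CSV file that has three columns:

For each row, get the maximum and minimum of the entries in the previous 250 rows. The data looks like this: column 1 is an index (1-5300), column 2 holds the data, and column 3 is another column that is not used here. This is the code I have so far. Note that 'i' is the row index, which refers to column 1. Column 2 is where the data is stored (that is, the data whose maximum and minimum I need).

The problem I'm having is that csv.reader always seems to start from the end of the file, so the whole algorithm goes out the window. I don't know what I'm doing wrong. Please help.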

max1 = 0
min1 = 1000000    

i = 3476
f1=  open('PUT/PUT_SELLING.csv')
file_reader = csv.reader(f1)
for col in file_reader:
    serial          = int(col[0])
    if serial <i-250:
        spyy = float(col[1])
        print spyy

    for j in range(0,250):
        spyy = float(col[1])          
        max1 = max(max1,spyy)
        min1 = min(min1,spyy)
        file_reader.next()
        #print spyy

f1.close()

print 'max =' +str(max1) + 'min = ' + str(min1)

score 1 · Accepted Answer

In your code, this line

for col in file_reader:

is actually iterating over the rows (lines) of the file, not the columns.

For each col, you then advance the reader 250 lines with this code:

for j in range(0,250):
    spyy = float(col[1]) # here you're grabbing the same second item 250 times
    max1 = max(max1,spyy) # setting the new max to the same value 250 times
    min1 = min(min1,spyy) # setting the new min to the same value 250 times
    file_reader.next() # now you advance, but col is the same so ...
    # it's like you're skipping 250 lines

This means that each row stored in col is actually 250 lines after the previous row that was stored in col. It's like skipping through the file in steps of 250.
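
To see the skipping in isolation, here is a minimal sketch in Python 3 syntax (where the call is spelled next(reader) rather than reader.next()). The in-memory data and the step of 3 instead of 250 are assumptions made for illustration.

```python
import csv
import io

# Hypothetical in-memory stand-in for the CSV file: 10 rows of "index,value".
data = io.StringIO("".join("%d,%d\n" % (n, n * 10) for n in range(1, 11)))
reader = csv.reader(data)

seen = []
for row in reader:
    seen.append(int(row[0]))
    # Advancing the reader by hand inside the loop consumes rows that the
    # for loop will therefore never see (3 here, 250 in the question).
    for _ in range(3):
        next(reader, None)  # the default avoids StopIteration at end of file

print(seen)
```

Only every fourth index ends up in seen, because the three rows in between are consumed by the manual calls.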

Based on what you said you wanted to do, I rewrote it. See if this makes more sense:

import csv

f1 = open('PUT/PUT_SELLING.csv')
file_reader = csv.reader(f1)

spyy_values = []
mins = []
maxes = []

# just saying 'for x in file_reader' is all you need to iterate through the rows
# you don't need to use file_reader.next()
# here I'm also using the enumerate() function
# which automatically returns an index for each row
for row_index, row in enumerate(file_reader):
    # get the value
    spyy_values.append( float(row[1]) )

    if row_index >= 249:
        # get the min of the last 250 values,
        # including this line
        this_min = min(spyy_values[-250:])
        mins.append(this_min)
        # get the max of the last 250 values,
        # including this line
        this_max = max(spyy_values[-250:])
        maxes.append(this_max)

print "total max:", max(maxes)
print "total min:", min(mins)
print "you have %s max values" % len(maxes)
print "you have %s min values" % len(mins)
print "here are the maxes", maxes
print "here are the mins", mins

csv.reader is an iterator, so the for loop automatically advances through each row. Check out the examples in the documentation.
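
As a variation on the same idea, a collections.deque with maxlen keeps only the most recent values, so the full list of values does not have to be stored or sliced. This is a sketch in Python 3 syntax, not the answerer's code; the function name rolling_min_max, the window parameter, and the sample data are hypothetical.

```python
import csv
import io
from collections import deque


def rolling_min_max(rows, window=250):
    """Yield (index, min, max) over the last `window` values, current row included."""
    recent = deque(maxlen=window)  # old values fall off the left automatically
    for row in rows:
        recent.append(float(row[1]))
        if len(recent) == window:
            yield int(row[0]), min(recent), max(recent)


# Hypothetical sample data with a window of 3 instead of 250.
sample = io.StringIO("1,5.0,x\n2,3.0,x\n3,8.0,x\n4,1.0,x\n5,9.0,x\n")
results = list(rolling_min_max(csv.reader(sample), window=3))
print(results)
```

Nothing is yielded until the window is full, which mirrors the row_index >= 249 check above.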

score 0 · Accepted Answer

import csv

f1 = open('PUT/PUT_SELLING.csv')
file_reader = csv.reader(f1)
which_str = raw_input('Comma separated list of indices to show: ')
which_to_show = [int(i) for i in which_str.split(',')]
vals = []
for cols in file_reader:  # This will iterate over the rows
    vals.append(float(cols[1]))  # Accumulate the values
    index = int(cols[0])
    if index > 249:      # enough values to show min, max
        mini = min(vals)  # min of the values currently in the window
        maxi = max(vals)  # max of the values currently in the window
        del vals[0]  # remove the oldest entry
        # mini and maxi only exist once the window is full
        if index in which_to_show:
            print 'index %d min=%f max=%f' % (index, mini, maxi)

f1.close()

score 0 · Accepted Answer

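It looks like you are calling file_reader.next() in the wrong place. According to the code you posted, file_reader.next() runs inside the inner for loop. That may be why it ends up at EOF after processing only the very first row.

The code should look something like this: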

max1 = 0
min1 = 1000000    

i = 3476
f1=  open('PUT/PUT_SELLING.csv')
file_reader = csv.reader(f1)
for col in file_reader:
    serial          = int(col[0])
    if serial <i-250:
        spyy = float(col[1])
        print spyy

    for j in range(0,250):
        spyy = float(col[1])          
        max1 = max(max1,spyy)
        min1 = min(min1,spyy)
    # the for loop moves to the next row by itself after processing the
    # current row, so no explicit file_reader.next() call is needed
    #print spyy

f1.close()

print 'max =' +str(max1) + 'min = ' + str(min1)

Let me know if this works.
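
If the goal is only the minimum and maximum over the 250 rows preceding one particular index i, as the variable i in the question suggests, a single pass with a range check may be enough. This is a sketch in Python 3 syntax under that assumption; the function name min_max_before, its parameters, and the sample data are hypothetical.

```python
import csv
import io


def min_max_before(rows, i, window=250):
    """Return (min, max) of column 2 for rows whose index is in [i - window, i)."""
    values = [float(row[1]) for row in rows if i - window <= int(row[0]) < i]
    if not values:
        raise ValueError("no rows fall inside the requested window")
    return min(values), max(values)


# Hypothetical sample data: indices 1..6, window of 3 rows before index 5.
sample = io.StringIO("1,4.0,x\n2,7.0,x\n3,2.0,x\n4,9.0,x\n5,1.0,x\n6,6.0,x\n")
print(min_max_before(csv.reader(sample), i=5, window=3))
```

No manual next() call is involved: the comprehension lets the reader advance one row at a time and simply ignores rows outside the window.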

score 0 · Accepted Answer

Since the first two columns are numeric, this may help. You can also read the lines yourself and split them on "," (just a workaround).

Use

file_reader=  open('PUT/PUT_SELLING.csv').readlines()
for line in file_reader:
    col = line.split(",")
    serial          = int(col[0])

instead of

f1=  open('PUT/PUT_SELLING.csv')
file_reader = csv.reader(f1)
for col in file_reader:
   serial          = int(col[0])
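
One caveat with the manual approach: line.split(",") keeps the trailing newline on the last field and does not understand quoted fields, whereas csv.reader handles both. For purely numeric columns like the first two here, the difference does not affect int() and float(), which ignore surrounding whitespace. The sample line below is hypothetical, and the sketch uses Python 3 syntax.

```python
import csv
import io

# Hypothetical line whose third field contains a quoted comma.
line = '7,12.5,"a,b"\n'

manual = line.split(",")
parsed = next(csv.reader(io.StringIO(line)))

print(manual)  # the quoted field is cut in two and the newline is kept
print(parsed)  # the quoted field stays whole

# The numeric columns convert the same way with either approach.
print(int(manual[0]) == int(parsed[0]), float(manual[1]) == float(parsed[1]))
```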
