python - Plotting a really big file (5 GB) in Python with an x-axis offset

Question

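I am trying to plot a very large file (~5 GB) using python and matplotlib. I can load the whole file into memory (the machine has 16 GB in total), but when I plot it with a simple imshow I get a segmentation fault. This is most likely due to the ulimit, which I have set to 15000 and cannot raise any further. I have concluded that I need to plot the array in batches, so I wrote some simple code to do that. My main issue is that when I plot a batch of the big array, the x coordinates always start from 0, and I have no way to overlay the images to build up the final big one. If you have any suggestions, please let me know. Also, I cannot install new packages such as "Image" on this machine because I lack administrative rights. Here is a sample of the code, which reads the first 12 rows of my array and makes 3 plots.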

import os
import sys
import scipy
import numpy as np
import pylab as pl
import matplotlib as mpl
import matplotlib.cm as cm
from optparse import OptionParser
from scipy import fftpack
from scipy.fftpack import *
from cmath import *
from pylab import *
import pp
import fileinput
import matplotlib.pylab as plt
import pickle

def readalllines(file1,rows,freqs):
    file = open(file1,'r')
    sizer = int(rows*freqs)
    i = 0
    q = np.zeros(sizer,'float')
    for i in range(rows*freqs):
        s =file.readline()
        s = s.split()
        #print s[4],q[i]
        q[i] = float(s[4])
        if i%262144 == 0:
            print '\r ',int(i*100.0/(337*262144)),'  percent complete',
        i += 1
    file.close()
    return q

parser = OptionParser()
parser.add_option('-f',dest="filename",help="Read dynamic spectrum from FILE",metavar="FILE")
parser.add_option('-t',dest="dtime",help="The time integration used in seconds, default 10",default=10)
parser.add_option('-n',dest="dfreq",help="The bandwidth of each frequency channel in Hz",default=11.92092896)
parser.add_option('-w',dest="reduce",help="The chuncker divider in frequency channels, integer default 16",default=16)
(opts,args) = parser.parse_args()
rows=12
freqs = 262144

file1 = opts.filename

s = readalllines(file1,rows,freqs)
s = np.reshape(s,(rows,freqs))
s = s.T
print s.shape
#raw_input()

#s_shift = scipy.fftpack.fftshift(s)


#fig = plt.figure()

#fig.patch.set_alpha(0.0)
#axes = plt.axes()
#axes.patch.set_alpha(0.0)
###plt.ylim(0,8)

plt.ion()

i = 0
for o in range(0,rows,4):

    fig = plt.figure()
    #plt.clf()

    plt.imshow(s[:,o:o+4],interpolation='nearest',aspect='auto', cmap=cm.gray_r, origin='lower')
    if o == 0:
        axis([0,rows,0,freqs])
    fdf, fdff = xticks()
    print fdf
    xticks(fdf+o)
    print xticks()
    #axis([o,o+4,0,freqs])
    plt.draw()

    #w, h = fig.canvas.get_width_height()
    #buf = np.fromstring(fig.canvas.tostring_argb(), dtype=np.uint8)
    #buf.shape = (w,h,4)

    #buf = np.rol(buf, 3, axis=2)
    #w,h,_ = buf.shape
    #img = Image.fromstring("RGBA", (w,h),buf.tostring())

    #if prev:
    #    prev.paste(img)
    #    del prev
    #prev = img
    i += 1
pl.colorbar()
pl.show()

score 4 · Accepted Answer

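If you plot an array that is more than roughly 2,000 pixels across, something in your graphics chain will down-sample the image in one way or another to display it on your monitor. I would recommend down-sampling in a controlled way instead, something like: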

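import numpy as np

# convert_raw_data_to_fft stands in for your own loading code
data = convert_raw_data_to_fft(args)  # make sure data is row major

def ds_decimate(row, step=100):
    # keep every step-th sample
    return row[::step]
def ds_sum(row, step=100):
    # sum each block of `step` samples; the incomplete tail is dropped
    return np.sum(row[:step * (len(row) // step)].reshape(-1, step), 1)
# as per suggestion from tom10 in comments
def ds_max(row, step=100):
    # keep the maximum of each block, so narrow peaks survive
    return np.max(row[:step * (len(row) // step)].reshape(-1, step), 1)

data_plotable = [ds_sum(d) for d in data]  # plug in whichever function you want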

or interpolate.
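The row-wise helpers in this answer shrink one axis only. If both axes are too large for the screen, the same reshape trick can be applied in two dimensions. The following is a minimal sketch, not part of the original answer: the name `block_reduce_2d` and the small demo array are made up for illustration.

```python
import numpy as np

def block_reduce_2d(a, row_step, col_step, func=np.max):
    """Reduce each (row_step x col_step) block of `a` to one value.

    Hypothetical helper: incomplete blocks at the bottom and right
    edges are dropped, as in the row-wise functions of the answer.
    """
    n_rows = (a.shape[0] // row_step) * row_step
    n_cols = (a.shape[1] // col_step) * col_step
    trimmed = a[:n_rows, :n_cols]
    # Split each axis into (number of blocks, block size)
    blocks = trimmed.reshape(n_rows // row_step, row_step,
                             n_cols // col_step, col_step)
    # Collapse the two block-size axes
    return func(blocks, axis=(1, 3))

# Small demo array so the result is easy to check by hand
demo = np.arange(24, dtype=float).reshape(4, 6)
small = block_reduce_2d(demo, 2, 3)          # block maxima
summed = block_reduce_2d(demo, 2, 3, np.sum)  # block sums
```

Using `np.max` keeps narrow spectral peaks visible after reduction, while `np.sum` (or `np.mean`) smooths them out, so the choice depends on what the plot is meant to show.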

score 2 · Accepted Answer

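Matplotlib is pretty memory-inefficient when plotting images. It creates several full-resolution intermediate arrays, which is probably why your program is crashing.

One solution is to downsample the image before feeding it into matplotlib, as @tcaswell suggests.

I also wrote some wrapper code that does this downsampling automatically, based on your screen resolution. It is at https://github.com/ChrisBeaumont/mpl-modest-image, if it is useful. It has the further advantage that the image is resampled on the fly, so you can still pan and zoom without sacrificing resolution where you need it.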
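If you would rather pick the downsampling factors by hand, one rough approach is to divide the array size by the number of pixels available for it. This is only a sketch of that arithmetic and not how mpl-modest-image works internally; `pick_steps` and the 640 x 480 pixel size are assumptions for illustration. For a matplotlib figure, `fig.get_size_inches() * fig.dpi` gives the figure size in pixels, which is an upper bound on the area the image can occupy.

```python
import math

def pick_steps(data_shape, width_px, height_px):
    """Return (row_step, col_step) so the reduced array fits the pixels.

    Hypothetical helper: rows map to the vertical axis, columns to the
    horizontal axis. A step of 1 means that axis needs no reduction.
    """
    n_rows, n_cols = data_shape
    row_step = max(1, math.ceil(n_rows / height_px))
    col_step = max(1, math.ceil(n_cols / width_px))
    return row_step, col_step

# Array shape taken from the question's code (262144 channels x 337 rows);
# the pixel size is an assumed 640 x 480 drawing area.
row_step, col_step = pick_steps((262144, 337), width_px=640, height_px=480)
```

With these numbers only the frequency axis needs reducing; the time axis already fits on screen.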

score 0 · Accepted Answer

I think you are just missing the extent=(left, right, bottom, top) keyword argument to plt.imshow.

import numpy as np
import matplotlib.pyplot as plt

x = np.random.randn(2, 10)
y = np.ones((4, 10))
x[0] = 0  # To make it clear which side is up, etc
y[0] = -1

# extent places each image in data coordinates: x fills y in [0, 2], y fills [2, 6]
plt.imshow(x, extent=(0, 10, 0, 2))
plt.imshow(y, extent=(0, 10, 2, 6))
# This is necessary, else the plot gets scaled and only shows the last array
plt.ylim(0, 6)
plt.colorbar()
plt.show()
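Applied to the batch loop in the question, the same idea would offset each batch along the x axis instead. Here is a minimal sketch under some assumptions: a small random array stands in for the real data, the Agg backend is selected so it runs without a display, and `vmin`/`vmax` are fixed so every batch shares one colour scale.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt

rows, freqs, batch = 12, 1024, 4   # freqs shrunk from 262144 for the demo
s = np.random.rand(freqs, rows)    # stand-in for the transposed array `s`

fig, ax = plt.subplots()
vmin, vmax = s.min(), s.max()      # one colour scale shared by all batches
for o in range(0, rows, batch):
    chunk = s[:, o:o + batch]
    # extent=(left, right, bottom, top) places this batch at x = o .. o+width
    im = ax.imshow(chunk, extent=(o, o + chunk.shape[1], 0, freqs),
                   interpolation='nearest', aspect='auto', cmap='gray_r',
                   origin='lower', vmin=vmin, vmax=vmax)

# Each imshow call rescales the axes to its own image, so reset the limits
ax.set_xlim(0, rows)
ax.set_ylim(0, freqs)
fig.colorbar(im, ax=ax)
fig.canvas.draw()                  # render; use plt.show() interactively
```

All batches are drawn into a single axes, so there is no need to paste images together afterwards. For the full-size file, each `chunk` would still need downsampling before it is passed to `imshow`.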
