python - Load just part of an image in Python

Question

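This might be a silly question, but...

I have a couple of thousand images that I want to load into Python and then convert into numpy arrays. Obviously this goes a little slowly. But I am actually only interested in a small portion of each image (the same portion every time: just the 100x100 pixels in the center of the image).

Is there any way to load just part of an image to speed things up?

Here is some sample code that generates a few sample images, saves them, and loads them back in.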

import time

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Generate sample images
num_images = 5

for i in range(num_images):
    Z = np.random.rand(2000, 2000)
    print('saving %i' % i)
    plt.imsave('%03i.png' % i, Z)

# Load the images
for i in range(num_images):
    t = time.time()

    im = Image.open('%03i.png' % i)
    w, h = im.size
    # Crop the central 100x100 pixels
    imc = im.crop((w // 2 - 50, h // 2 - 50, w // 2 + 50, h // 2 + 50))

    print('Time to open: %.4f seconds' % (time.time() - t))

    # Convert them to numpy arrays
    data = np.array(imc)

score 9 · Accepted Answer

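You won't get much faster than PIL's crop in a single thread, but you can use multiple cores to speed everything up! :)

I ran the code below on an 8-core i7 machine as well as on my 7-year-old, two-core, barely 2 GHz laptop. Both saw significant improvements in run time. As you would expect, the improvement depended on the number of cores available.

The core of your code is the same. I just separated the looping from the actual computation so that the function could be applied to a list of values in parallel.

So, this: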

for i in range(num_images):
    t = time.time()

    im = Image.open('%03i.png' % i)
    w, h = im.size
    # Crop the central 100x100 pixels
    imc = im.crop((w // 2 - 50, h // 2 - 50, w // 2 + 50, h // 2 + 50))

    print('Time to open: %.4f seconds' % (time.time() - t))

    # Convert them to numpy arrays
    data = np.array(imc)

became:

def convert(filename):
    im = Image.open(filename)
    w, h = im.size
    # Crop the central 100x100 pixels
    imc = im.crop((w // 2 - 50, h // 2 - 50, w // 2 + 50, h // 2 + 50))
    return numpy.array(imc)

The key to the speed-up is the Pool feature of the multiprocessing library. It makes it trivial to run things across multiple processors.

Full code:

import os
import time
import numpy
from PIL import Image
from multiprocessing import Pool

# Path to where my test images are stored
img_folder = os.path.join(os.getcwd(), 'test_images')

# Collects all of the filenames for the images
# I want to process
images = [os.path.join(img_folder, f)
          for f in os.listdir(img_folder)
          if f.endswith('.jpeg')]

# Your code, but wrapped up in a function
def convert(filename):
    im = Image.open(filename)
    w, h = im.size
    # Crop the central 100x100 pixels
    imc = im.crop((w // 2 - 50, h // 2 - 50, w // 2 + 50, h // 2 + 50))
    return numpy.array(imc)

def main():
    # This is the hero of the code. It creates a pool of
    # worker processes across which you can "map" a function.
    with Pool() as pool:
        t = time.time()
        # We run it normally (single core) first.
        # list() forces evaluation, because map() is lazy in Python 3.
        np_arrays = list(map(convert, images))
        print('Time to open %i images in single thread: %.4f seconds' % (len(images), time.time() - t))

        t = time.time()
        # Now we run the same thing, but this time leveraging the worker pool.
        np_arrays = pool.map(convert, images)
        print('Time to open %i images with multiple threads: %.4f seconds' % (len(images), time.time() - t))

if __name__ == '__main__':
    main()

Pretty basic: only a few extra lines of code, plus a little refactoring to move the conversion bit into its own function. The results speak for themselves.

Results:

8-core i7

Time to open 858 images in single thread: 6.0040 seconds
Time to open 858 images with multiple threads: 1.4800 seconds

2-core Intel Duo

Time to open 858 images in single thread: 8.7640 seconds
Time to open 858 images with multiple threads: 4.6440 seconds

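There you go! Even on a very old 2-core machine you can halve the time you spend opening and processing your images.

Caveats

Memory. If you are processing thousands of images, you may run out of memory at some point. To get around this, just process the data in chunks. You still get all of the multiprocessing goodness, just in smaller bites. Something like: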

for i in range(0, len(images), chunk_size): 
    results = pool.map(convert, images[i : i+chunk_size]) 
    # rest of code.
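
The chunking idea above can be sketched without the pool as a small generator. The helper name `chunks` and the file names here are made up for illustration; in the real loop each chunk would be handed to `pool.map(convert, chunk)` and its results reduced or written out before the next chunk is loaded.

```python
def chunks(items, chunk_size):
    """Yield successive slices of `items`, each at most `chunk_size` long."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


# Hypothetical file names, only to show how the list gets split.
filenames = ['%03i.png' % i for i in range(7)]
batches = list(chunks(filenames, 3))
print([len(b) for b in batches])
```

Only one chunk of results has to be alive at a time, provided each chunk is processed or saved before moving on to the next one.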

score 8 · Accepted Answer

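Save your files as uncompressed 24-bit BMPs. These store pixel data in a very regular way. Check out the "Image data" portion of the BMP file format diagram on Wikipedia. Note that most of the complexity in the diagram comes from the headers.

(Figure: BMP file format)

For example, say you are storing a 2x2-pixel image whose top row is blue, green and whose bottom row is red, white.

This is what the pixel data section looks like if it is stored as a 24-bit uncompressed BMP. Note that the data is stored bottom-up, for some reason, and in BGR order instead of RGB. So the first line in the file is the bottom-most row of the image, the second line is the second-from-bottom row, and so on: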

00 00 FF    FF FF FF    00 00
FF 00 00    00 FF 00    00 00

That data breaks down as follows:

           |  First column  |  Second Column  |  Padding
-----------+----------------+-----------------+-----------
Second Row |  00 00 FF      |  FF FF FF       |  00 00
-----------+----------------+-----------------+-----------
First Row  |  FF 00 00      |  00 FF 00       |  00 00
-----------+----------------+-----------------+-----------

Or:

           |  First column  |  Second Column  |  Padding
-----------+----------------+-----------------+-----------
Second Row |  red           |  white          |  00 00
-----------+----------------+-----------------+-----------
First Row  |  blue          |  green          |  00 00
-----------+----------------+-----------------+-----------

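The padding is there to round the row size up to a multiple of 4 bytes.

So all you have to do is implement a reader for this particular file format, and then calculate the byte offsets at which to start and stop reading each row: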

def calc_bytes_per_row(width, bytes_per_pixel):
    res = width * bytes_per_pixel
    if res % 4 != 0:
        res += 4 - res % 4
    return res

def calc_row_offsets(pixel_array_offset, bmp_width, bmp_height, x, y, row_width):
    if x + row_width > bmp_width:
        raise ValueError("This is only for calculating offsets within a row")

    bytes_per_row = calc_bytes_per_row(bmp_width, 3)
    whole_row_offset = pixel_array_offset + bytes_per_row * (bmp_height - y - 1)
    start_row_offset = whole_row_offset + x * 3
    end_row_offset = start_row_offset + row_width * 3
    return (start_row_offset, end_row_offset)
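
As a quick check of the offset arithmetic, here is the 2x2 example from above worked through. The two functions are repeated (slightly condensed) so the snippet runs on its own. The pixel array offset of 54 is an assumption: it is the usual value for a 24-bit BMP with a 14-byte file header and a 40-byte info header, but a real reader should take it from the file itself.

```python
def calc_bytes_per_row(width, bytes_per_pixel):
    res = width * bytes_per_pixel
    if res % 4 != 0:
        res += 4 - res % 4
    return res


def calc_row_offsets(pixel_array_offset, bmp_width, bmp_height, x, y, row_width):
    if x + row_width > bmp_width:
        raise ValueError("This is only for calculating offsets within a row")
    bytes_per_row = calc_bytes_per_row(bmp_width, 3)
    whole_row_offset = pixel_array_offset + bytes_per_row * (bmp_height - y - 1)
    start = whole_row_offset + x * 3
    return (start, start + row_width * 3)


PIXEL_ARRAY_OFFSET = 54  # assumed, see the note above

# 2 pixels * 3 bytes = 6 bytes, padded up to the next multiple of 4.
row_size = calc_bytes_per_row(2, 3)

# y counts from the top of the image, but rows are stored bottom-up,
# so the top row (y=0) comes second in the file.
top_row = calc_row_offsets(PIXEL_ARRAY_OFFSET, 2, 2, 0, 0, 2)
bottom_row = calc_row_offsets(PIXEL_ARRAY_OFFSET, 2, 2, 0, 1, 2)
green_pixel = calc_row_offsets(PIXEL_ARRAY_OFFSET, 2, 2, 1, 0, 1)

print(row_size, top_row, bottom_row, green_pixel)
```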

Then you just have to process the right byte ranges. For example, say you want to read the 400x400 chunk starting at position 500x500 in a 10000x10000 bitmap:

def process_row_bytes(row_bytes):
    ...  # some efficient way to process the bytes

bmpf = open(..., "rb")
pixel_array_offset = ...  # extract from the BMP header
bmp_width = 10000
bmp_height = 10000
start_x = 500
start_y = 500
end_x = 500 + 400
end_y = 500 + 400

for cur_y in range(start_y, end_y):
    # Byte range of the wanted pixels within this row
    start, end = calc_row_offsets(pixel_array_offset,
                                  bmp_width, bmp_height,
                                  start_x, cur_y,
                                  end_x - start_x)
    bmpf.seek(start)
    cur_row_bytes = bmpf.read(end - start)
    process_row_bytes(cur_row_bytes)
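
The `pixel_array_offset` placeholder above has to come from the file header. Below is a minimal sketch of reading it with `struct`, assuming the common layout of a 14-byte file header followed by a 40-byte BITMAPINFOHEADER. The function name `read_bmp_header` is made up here, and other header variants, palettes, top-down files, and compressed files are deliberately rejected or ignored.

```python
import io
import struct


def read_bmp_header(f):
    """Return (pixel_array_offset, width, height) for an uncompressed 24-bit BMP.

    Assumes a 14-byte file header followed by a BITMAPINFOHEADER.
    """
    f.seek(0)
    header = f.read(54)
    if len(header) < 54 or header[:2] != b'BM':
        raise ValueError("not a BMP file with a BITMAPINFOHEADER")
    # Byte 10: uint32 offset of the pixel array from the start of the file.
    (pixel_array_offset,) = struct.unpack_from('<I', header, 10)
    # Byte 18 onwards: int32 width, int32 height, uint16 planes,
    # uint16 bits per pixel, uint32 compression method.
    width, height, _planes, bpp, compression = struct.unpack_from('<iiHHI', header, 18)
    if bpp != 24 or compression != 0:
        raise ValueError("only uncompressed 24-bit BMPs are supported")
    if height < 0:
        raise ValueError("top-down BMPs (negative height) are not supported")
    return pixel_array_offset, width, height


# Build the 2x2 example image in memory to try the function out.
pixels = bytes.fromhex('0000FF FFFFFF 0000' 'FF0000 00FF00 0000')
file_header = struct.pack('<2sIHHI', b'BM', 54 + len(pixels), 0, 0, 54)
info_header = struct.pack('<IiiHHIIiiII', 40, 2, 2, 1, 24, 0, len(pixels), 2835, 2835, 0, 0)
demo = io.BytesIO(file_header + info_header + pixels)

result = read_bmp_header(demo)
print(result)
```

The same function should work on a file opened with `open(path, "rb")`, since it only relies on `seek` and `read`.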

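Note that how you process the bytes matters. You can probably do something clever with PIL by just dumping the pixel data into it, but I'm not entirely sure. If you do it inefficiently, it might not be worth it. If speed is a major concern, consider writing it in pyrex, or implementing the above in C and calling it from Python.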
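
One reasonably efficient way to fill in `process_row_bytes` is to let numpy reinterpret the raw bytes instead of looping over pixels in Python. This is only a sketch, not necessarily what the answer had in mind, and it assumes 24-bit BGR rows exactly as read by the loop above (no padding bytes, since the loop reads only `row_width * 3` bytes per row).

```python
import numpy as np


def process_row_bytes(row_bytes):
    """Turn one row of raw 24-bit BGR bytes into an (n_pixels, 3) RGB array."""
    bgr = np.frombuffer(row_bytes, dtype=np.uint8).reshape(-1, 3)
    # Reverse the channel axis (BGR -> RGB) and copy, so the result is a
    # writable array that does not depend on the original buffer.
    return bgr[:, ::-1].copy()


# Top row of the 2x2 example above: blue, green (stored as BGR).
row = process_row_bytes(bytes.fromhex('FF0000 00FF00'))
print(row.tolist())
```

Collecting the per-row arrays in a list and calling `np.stack` on it would give an array of shape (height, width, 3), comparable to what `np.array` returns for a cropped RGB image.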

score 4 · Accepted Answer

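Ah, I just realized there may be a far simpler way than what I wrote above about BMP files.

If you are generating the image files anyway, and you always know which portion you want to read, simply save that portion as a separate image file while you are generating it: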

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Generate sample images
num_images = 5

for i in range(num_images):
    Z = np.random.rand(2000, 2000)
    plt.imsave('%03i.png' % i, Z)
    # Save the region of interest as its own, much smaller file
    snipZ = Z[200:300, 200:300]
    plt.imsave('%03i.snip.png' % i, snipZ)

# Load the images
for i in range(num_images):
    im = Image.open('%03i.snip.png' % i)

    # Convert them to numpy arrays
    data = np.array(im)

score 1 · Accepted Answer

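I ran some timing tests, and I'm sorry to say I don't think you can get much faster than the PIL crop command. Even with manual seeking and low-level reading, you still have to read the bytes. Here are the timing results: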

%timeit im.crop((1000-50,1000-50,1000+50,1000+50))
fid = open('003.png','rb')
%timeit fid.seek(1000000)
%timeit fid.read(1)
print('333*100*100/10**(9)*1000=%.2f ms'%(333*100*100/10**(9)*1000))


100000 loops, best of 3: 3.71 us per loop
1000000 loops, best of 3: 562 ns per loop
1000000 loops, best of 3: 330 ns per loop
333*100*100/10**(9)*1000=3.33 ms

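As the calculation at the bottom shows, reading the 10000 bytes of a 100x100 sub-image one byte at a time, at roughly 333 ns per read, takes about 3.33 ms, so manual byte-level reading is no faster than the crop command above (timed at 3.71 µs per call).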
