python - numpy template matching using matrix multiplications

Question

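I am trying to match a template against a binary image (black and white only) by shifting the template along the image. I want to return the minimum distance between the template and the image, together with the position at which that minimum occurs. For example: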

Image:

0 1 0
0 0 1
0 1 1

Template:

0 1
1 1

This template matches the image best at position (1,1), where the distance is 0. So far nothing too difficult, and I already have some code that does the trick.

import numpy as np

def match_template(img, template):
    mindist = float('inf')
    idx = (-1, -1)
    for y in range(img.shape[1] - template.shape[1] + 1):
        for x in range(img.shape[0] - template.shape[0] + 1):
            # Euclidean distance between the template and the window at (x, y)
            window = img[x:x + template.shape[0], y:y + template.shape[1]]
            dist = np.sqrt(np.sum(np.square(template - window)))
            if dist < mindist:
                mindist = dist
                idx = (x, y)
    return [mindist, idx]

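But for images of the size I need (a 500 x 200 pixel image and a 250 x 100 template), this already takes about 4.5 seconds, which is too slow. I know the same thing can be done much faster using matrix multiplications (in MATLAB I believe this can be done with im2col and repmat). Can anyone explain how to do it in python/numpy?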
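For readers wondering what the im2col idea might look like in NumPy, here is a minimal sketch, not a definitive implementation. It assumes NumPy 1.20 or later for `sliding_window_view`; the function name `match_template_windows` is hypothetical. It expands the squared distance as sum(w²) - 2·sum(w·t) + sum(t²), so that the cross term becomes the product of the "matrix of windows" with the flattened template, written here as an `einsum` contraction over a strided view so the windows are not copied.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def match_template_windows(img, template):
    """Hypothetical helper: (min distance, (row, col)) of the best match."""
    img = np.asarray(img, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)

    # windows[i, j] is the template-sized patch whose top-left corner is (i, j).
    # This is a strided view on img, so no patch is copied here.
    windows = sliding_window_view(img, template.shape)

    # Cross term sum(w * t) for every position: the im2col matrix times the
    # flattened template, expressed as a contraction over the last two axes.
    cross = np.einsum('ijkl,kl->ij', windows, template)

    # Sum of squared image values under each window position.
    win_sq = np.einsum('ijkl,ijkl->ij', windows, windows)

    # ||w - t||^2 = sum(w^2) - 2 * sum(w * t) + sum(t^2)
    dist_sq = win_sq - 2.0 * cross + np.sum(template * template)
    dist_sq = np.maximum(dist_sq, 0.0)  # guard against tiny negative rounding

    idx = np.unravel_index(np.argmin(dist_sq), dist_sq.shape)
    return float(np.sqrt(dist_sq[idx])), tuple(int(i) for i in idx)
```

On the 3 x 3 image and 2 x 2 template from the example above, this returns a distance of 0.0 at position (1, 1). Note that it still performs the same amount of arithmetic as the double loop; it only moves that work inside NumPy.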

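By the way, I know there is an opencv matchTemplate function that does exactly what I need, but since I might need to alter the code slightly later on, I would prefer a solution that I fully understand and can modify.

Thanks!

Edit: if anyone can explain to me how opencv does this in less than 0.2 seconds, that would also be great. I have had a quick look at the source code, but those things always look quite complex to me.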

Edit 2: Cython code

import numpy as np
cimport numpy as np

DTYPE = np.int_  # the np.int alias was removed in NumPy 1.24
ctypedef np.int_t DTYPE_t

def match_template(np.ndarray img, np.ndarray template):
    cdef float mindist = float('inf')
    cdef int x_coord = -1
    cdef int y_coord = -1
    cdef float dist
    cdef unsigned int x, y
    cdef int img_width = img.shape[0]
    cdef int img_height = img.shape[1]
    cdef int template_width = template.shape[0]
    cdef int template_height = template.shape[1]
    cdef int range_x = img_width-template_width+1
    cdef int range_y = img_height-template_height+1
    for y from 0 <= y < range_y:
        for x from 0 <= x < range_x:
            dist = np.sqrt(np.sum(np.square(template - img[ x:<unsigned int>(x+template_width), y:<unsigned int>(y+template_height) ]))) #calculate euclidean distance
            if dist < mindist:
                mindist = dist
                x_coord = x
                y_coord = y
    return [mindist, (x_coord,y_coord)]

img = np.asarray(img, dtype=DTYPE)
template = np.asarray(template, dtype=DTYPE)
match_template(img, template)

score 3 · Accepted Answer

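One possible way of doing what you want is via convolution (which can be brute force or FFT). Matrix multiplication, AFAIK, won't work. You need to convolve your data with the template and then find the maximum (you will also need to do some scaling to make it work properly).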

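import numpy as np
import scipy.ndimage
import scipy.signal

xs = np.array([[0, 1, 0], [0, 0, 1], [0, 1, 1]]) * 1.
ys = np.array([[0, 1], [1, 1]]) * 1.

# Direct convolution; with cval=np.inf, positions that touch the padded border show up as inf
print(scipy.ndimage.convolve(xs, ys, mode='constant', cval=np.inf))
# output given in the original answer:
# array([[  1.,   1.,  inf],
#        [  0.,   2.,  inf],
#        [ inf,  inf,  inf]])

# FFT-based convolution, restricted to positions where the template fits entirely
print(scipy.signal.fftconvolve(xs, ys, mode='valid'))
# output given in the original answer (up to floating-point rounding):
# array([[ 1.,  1.],
#        [ 0.,  2.]])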
score 1 · Accepted Answer

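There may be a fancy way to get this done using pure numpy/scipy magic. But it might be easier (and more understandable when you look at the code in the future) to just drop into Cython to get this done. There's a good tutorial for integrating Cython with numpy at http://docs.cython.org/src/tutorial/numpy.html.

Edit: I did a quick test with your Cython code and it ran in about 15 seconds for a 500x400 image with a 100x200 template. After some tweaks (eliminating the numpy method calls and the numpy bounds checking), I got it down to under 3 seconds. That may not be enough for you, but it does show the possibility.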

import numpy as np
cimport numpy as np
cimport cython
from libc.math cimport sqrt

DTYPE = np.int_  # the np.int alias was removed in NumPy 1.24
ctypedef np.int_t DTYPE_t

@cython.boundscheck(False)
def match_template(np.ndarray[DTYPE_t, ndim=2] img, np.ndarray[DTYPE_t, ndim=2] template):
    cdef float mindist = float('inf')
    cdef int x_coord = -1
    cdef int y_coord = -1
    cdef float dist
    cdef unsigned int x, y
    cdef int img_width = img.shape[0]
    cdef int img_height = img.shape[1]
    cdef int template_width = template.shape[0]
    cdef int template_height = template.shape[1]
    cdef int range_x = img_width-template_width+1
    cdef int range_y = img_height-template_height+1
    cdef DTYPE_t total
    cdef int delta
    cdef unsigned int j, k, j_plus, k_plus
    for y from 0 <= y < range_y:
        for x from 0 <= x < range_x:
            #dist = np.sqrt(np.sum(np.square(template - img[ x:<unsigned int>(x+template_width), y:<unsigned int>(y+template_height) ]))) #calculate euclidean distance
            # Do the same operations, but in plain C
            total = 0
            for j from 0 <= j < template_width:
                j_plus = <unsigned int>x + j
                for k from 0 <= k < template_height:
                    k_plus = <unsigned int>y + k
                    delta = template[j, k] - img[j_plus, k_plus]
                    total += delta*delta
            dist = sqrt(total)
            if dist < mindist:
                mindist = dist
                x_coord = x
                y_coord = y
    return [mindist, (x_coord,y_coord)]
