python - Peak detection in a noisy 2D array

Question

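I'm trying to get python to return, as closely as possible, the center of the most obvious clustering in an image like the one below.

In a previous question I asked how to get the global maximum and the local maxima of a 2D array, and the answers given worked perfectly. The problem is that the center estimate I get by averaging the global maxima obtained with different bin sizes is always slightly off from the one I would set by eye, because I'm only taking the single largest bin into account instead of a group of the largest bins (as one does by eye).

I tried adapting the answer to this question to my problem, but it turns out my image is too noisy for that algorithm to work. Here is my code implementing that answer: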

import numpy as np
from scipy.ndimage import maximum_filter, generate_binary_structure, binary_erosion
import matplotlib.pyplot as pp

from os import getcwd
from os.path import join, realpath, dirname

# Save path to dir where this code exists.
mypath = realpath(join(getcwd(), dirname(__file__)))
myfile = 'data_file.dat'

x, y = np.loadtxt(join(mypath,myfile), usecols=(1, 2), unpack=True)
xmin, xmax = min(x), max(x)
ymin, ymax = min(y), max(y)

rang = [[xmin, xmax], [ymin, ymax]]
paws = []

for d_b in range(25, 110, 25):
    # Number of bins in x,y given the bin width 'd_b'
    binsxy = [int((xmax - xmin) / d_b), int((ymax - ymin) / d_b)]

    H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)
    paws.append(H)


def detect_peaks(image):
    """
    Takes an image and detects the peaks using the local maximum filter.
    Returns a boolean mask of the peaks (i.e. 1 when
    the pixel's value is the neighborhood maximum, 0 otherwise)
    """

    # define an 8-connected neighborhood
    neighborhood = generate_binary_structure(2,2)

    #apply the local maximum filter; all pixel of maximal value 
    #in their neighborhood are set to 1
    local_max = maximum_filter(image, footprint=neighborhood)==image
    #local_max is a mask that contains the peaks we are 
    #looking for, but also the background.
    #In order to isolate the peaks we must remove the background from the mask.

    #we create the mask of the background
    background = (image==0)

    #a little technicality: we must erode the background in order to 
    #successfully subtract it from local_max, otherwise a line will 
    #appear along the background border (artifact of the local maximum filter)
    eroded_background = binary_erosion(background, structure=neighborhood, border_value=1)

    #we obtain the final mask, containing only peaks, 
    #by removing the background from the local_max mask
    #(XOR is used because NumPy does not support `-` on boolean arrays)
    detected_peaks = local_max ^ eroded_background

    return detected_peaks


#applying the detection and plotting results
for i, paw in enumerate(paws):
    detected_peaks = detect_peaks(paw)
    pp.subplot(4,2,(2*i+1))
    pp.imshow(paw)
    pp.subplot(4,2,(2*i+2) )
    pp.imshow(detected_peaks)

pp.show()

And here is the result (varying the bin size):


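Clearly my background is too noisy for that algorithm to work, so the question is: how can I make that algorithm less sensitive? If an alternative solution exists, please let me know.

EDIT

Following Bi Rico's advice, I tried smoothing my 2D array before passing it to the local maximum finder, like so: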

# Requires: from scipy.ndimage import gaussian_filter
H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)
H1 = gaussian_filter(H, 2, mode='nearest')
paws.append(H1)

These are the results for a sigma of 2, 4 and 8:


EDIT 2

A mode='constant' seems to work much better than 'nearest'. It converges to the right center with sigma=2 for the largest bin size:


So, how do I get the coordinates of the maximum shown in the last image?

score 4 · Accepted Answer

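Answering the last part of your question: you always have points in an image, and you can find their coordinates by searching, in some order, for the local maxima of the image. If your data is not a point source, you can apply a mask around each peak so that the peak's neighborhood is not picked up as a maximum again in later searches. I propose the following code: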

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import copy

def get_std(image):
    return np.std(image)

def get_max(image,sigma,alpha=20,size=10):
    i_out = []
    j_out = []
    image_temp = copy.deepcopy(image)
    while True:
        k = np.argmax(image_temp)
        j,i = np.unravel_index(k, image_temp.shape)
        if(image_temp[j,i] >= alpha*sigma):
            i_out.append(i)
            j_out.append(j)
            x = np.arange(i-size, i+size)
            y = np.arange(j-size, j+size)
            xv,yv = np.meshgrid(x,y)
            image_temp[yv.clip(0,image_temp.shape[0]-1),
                                   xv.clip(0,image_temp.shape[1]-1) ] = 0
            print(xv)  # debug output: x indices of the masked window
        else:
            break
    return i_out,j_out

#reading the image   
image = mpimg.imread('ggd4.jpg')
#computing the standard deviation of the image
sigma = get_std(image)
#getting the peaks
i,j = get_max(image[:,:,0],sigma, alpha=10, size=10)

#let's see the results
plt.imshow(image, origin='lower')
plt.plot(i,j,'ro', markersize=10, alpha=0.5)
plt.show()

The image ggd4 used for the test can be downloaded from:

http://www.ipac.caltech.edu/2mass/gallery/spr99/ggd4.jpg

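The first step is to get some information about the noise in the image. I did that by computing the standard deviation of the full image (it is actually better to select a small rectangle without signal). This tells us how much noise is present in the image. The idea for getting the peaks is to ask for successive maxima that lie above a certain threshold (say 3, 4, 5, 10, or 20 times the noise). This is what the function get_max actually does: it keeps searching for maxima until one of them falls below the threshold imposed by the noise. To avoid finding the same maximum over and over, the peaks have to be removed from the image. In general, the shape of the mask used for that depends strongly on the problem you want to solve. For stars, it should be good to remove the star with a Gaussian function or something similar. For simplicity I chose a square function, whose size (in pixels) is the variable "size". I think that, starting from this example, anybody can improve the code by adding more general things.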
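As a rough illustration of the two ideas in that paragraph (noise measured in a signal-free rectangle, then successive maxima above a multiple of that noise), here is a minimal sketch. It is not the answer's code: the names estimate_noise and find_peaks, the patch argument, and the synthetic test image are all assumptions made for this example, and the square mask is applied with slices instead of meshgrid.

```python
import numpy as np

def estimate_noise(image, patch):
    """Standard deviation inside a rectangle assumed to contain no signal.

    patch = (row_start, row_stop, col_start, col_stop); picking a suitable
    rectangle is left to the user (hypothetical helper, not from the answer).
    """
    r0, r1, c0, c1 = patch
    return np.std(image[r0:r1, c0:c1])

def find_peaks(image, noise, alpha=5, size=10):
    """Take the global maximum repeatedly, blanking a square around each hit."""
    work = np.array(image, dtype=float)  # work on a copy of the input
    peaks = []
    while True:
        j, i = np.unravel_index(np.argmax(work), work.shape)
        if work[j, i] < alpha * noise:
            break  # the brightest remaining pixel is below the threshold
        peaks.append((j, i))
        # Slices stop at the array edge on the upper side; max() guards the lower side
        work[max(j - size, 0):j + size, max(i - size, 0):i + size] = 0
    return peaks

# Synthetic test image (an assumption): unit Gaussian noise plus two bright pixels
rng = np.random.default_rng(0)
image = rng.normal(0.0, 1.0, size=(100, 100))
image[20, 30] += 50.0
image[70, 80] += 40.0

noise = estimate_noise(image, (0, 10, 0, 10))  # top-left corner holds no signal
peaks = find_peaks(image, noise, alpha=10, size=5)
print(peaks)
```

The peaks come back as (row, column) pairs, brightest first. For extended sources, a Gaussian-shaped mask as suggested above would be a natural replacement for the square.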

EDIT:

The original image looks like this:

The image after identifying the luminous points looks like this:

score 4 · Accepted Answer

I'm too much of a n00b on Stack Overflow to comment on Alejandro's answer elsewhere here. I would refine his code a bit to use a preallocated numpy array for the output:

def get_max(image,sigma,alpha=3,size=10):
    from copy import deepcopy
    import numpy as np
    # preallocate a lot of peak storage
    k_arr = np.zeros((10000,2))
    image_temp = deepcopy(image)
    peak_ct=0
    while True:
        k = np.argmax(image_temp)
        j,i = np.unravel_index(k, image_temp.shape)
        if(image_temp[j,i] >= alpha*sigma):
            k_arr[peak_ct]=[j,i]
            # this is the part that masks already-found peaks.
            x = np.arange(i-size, i+size)
            y = np.arange(j-size, j+size)
            xv,yv = np.meshgrid(x,y)
            # the clip here handles edge cases where the peak is near the 
            #    image edge
            image_temp[yv.clip(0,image_temp.shape[0]-1),
                               xv.clip(0,image_temp.shape[1]-1) ] = 0
            peak_ct+=1
        else:
            break
    # trim the output for only what we've actually found
    return k_arr[:peak_ct]

Profiling this and Alejandro's code on his example image, this code is about 33% faster (0.03 seconds for Alejandro's code, 0.02 seconds for mine). Appending to a list gets slower as the number of peaks grows.

score 1 · Accepted Answer

The way I would do it:

1) Normalize H between 0 and 1.

2) Pick a threshold value, as tcaswell suggests. It could be between .9 and .99, for example.

3) Use masked arrays to keep only the x, y coordinates where H is above the threshold:

import numpy.ma as ma
x_masked=ma.masked_array(x, mask= H < threshold)
y_masked=ma.masked_array(y, mask= H < threshold)

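4) Now you can compute a weighted average over the masked coordinates, with weights such as (H-threshold)^2, or any other power greater than or equal to one, depending on your taste/tests.

Comments: 1) This is not robust with respect to the type of peaks you have, since you may have to adapt the threshold. This is the minor problem. 2) As it stands, this does NOT work with two peaks, and it will give wrong results if the second peak is above the threshold.

Nonetheless, it will always give you an answer without crashing (with the pros and cons of that...).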
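The four steps above can be sketched roughly as follows. This is one possible reading, not the answer's exact code: it assumes that the coordinates being masked are grids of bin centers with the same shape as H (the names xc, yc, X, Y are introduced here), and it uses synthetic data with a single cluster near (600, 400) in place of the question's file.

```python
import numpy as np

# Synthetic stand-in for the question's x, y columns (an assumption)
rng = np.random.default_rng(1)
x = np.concatenate([rng.uniform(0, 1000, 5000), rng.normal(600, 30, 2000)])
y = np.concatenate([rng.uniform(0, 1000, 5000), rng.normal(400, 30, 2000)])
H, xedges, yedges = np.histogram2d(x, y, bins=40, range=[[0, 1000], [0, 1000]])

# 1) Normalize H to the range [0, 1]
Hn = (H - H.min()) / (H.max() - H.min())

# 2) Pick a threshold
threshold = 0.9

# 3) Bin-center grids with the same shape as H, masked below the threshold
xc = 0.5 * (xedges[:-1] + xedges[1:])
yc = 0.5 * (yedges[:-1] + yedges[1:])
X, Y = np.meshgrid(xc, yc, indexing="ij")
x_masked = np.ma.masked_where(Hn < threshold, X)
y_masked = np.ma.masked_where(Hn < threshold, Y)

# 4) Weighted average over the unmasked bins, weights (H - threshold)^2
weights = (Hn - threshold) ** 2
x_center = np.ma.average(x_masked, weights=weights)
y_center = np.ma.average(y_masked, weights=weights)
print(x_center, y_center)
```

Because the normalized maximum is exactly 1, at least one bin always survives the threshold, which is why this approach always returns an answer. With a second cluster above the threshold, the average would land somewhere between the two, as the comments above warn.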

score 1 · Accepted Answer

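I'm adding this answer because it is the solution I ended up using. It is a combination of Bi Rico's comment here (May 30 at 18:54) and the answer given in this question.

As it turns out, using the peak detection algorithm from this question only complicates the problem. After applying the Gaussian filter to the image, all that needs to be done is to ask for the maximum bin (as Bi Rico pointed out) and then obtain that maximum in coordinates.

So instead of using the detect_peaks function as I did above, I simply add the following code after the Gaussian 2D histogram is obtained: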

# Get 2D histogram.
H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)
# Get Gaussian filtered 2D histogram.
H1 = gaussian_filter(H, 2, mode='nearest')
# Get center of maximum in bin coordinates.
x_cent_bin, y_cent_bin = np.unravel_index(H1.argmax(), H1.shape)
# Get center in x,y coordinates.
x_cent_coord = np.average(xedges[x_cent_bin:x_cent_bin + 2])
y_cent_coord = np.average(yedges[y_cent_bin:y_cent_bin + 2])
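For anyone who wants to try this end to end, here is a self-contained sketch of the same recipe. The data is synthetic (a uniform background plus one cluster centered at (600, 400)), which is an assumption made for this example, and mode='constant' is used because the question's second edit reports that it worked better than 'nearest'.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Synthetic stand-in for the data file (an assumption of this sketch)
rng = np.random.default_rng(0)
x = np.concatenate([rng.uniform(0, 1000, 5000), rng.normal(600, 30, 2000)])
y = np.concatenate([rng.uniform(0, 1000, 5000), rng.normal(400, 30, 2000)])

rang = [[0, 1000], [0, 1000]]
d_b = 25  # bin width
binsxy = [int(1000 / d_b), int(1000 / d_b)]

# 2D histogram, then Gaussian smoothing with sigma = 2 bins
H, xedges, yedges = np.histogram2d(x, y, range=rang, bins=binsxy)
H1 = gaussian_filter(H, 2, mode='constant')

# Index of the largest smoothed bin, converted to the coordinates of its center
x_cent_bin, y_cent_bin = np.unravel_index(H1.argmax(), H1.shape)
x_cent_coord = 0.5 * (xedges[x_cent_bin] + xedges[x_cent_bin + 1])
y_cent_coord = 0.5 * (yedges[y_cent_bin] + yedges[y_cent_bin + 1])
print(x_cent_coord, y_cent_coord)
```

The result is the center of a bin, so its resolution is limited by the bin width d_b.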
