python - Autoencoder - cost decreases but output is wrong for more than one data example

Question

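I recently implemented an autoencoder in numpy. I have checked all the gradients numerically and they seem correct. The cost function also seems to decrease at every iteration, provided the learning rate is small enough.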

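The problem:

As you know, an autoencoder takes an input x and tries to return something as close to x as possible.

Whenever my x is a row vector, it works very well. The cost function decreases to 0 and the results are very good. For example, with x = [[ 0.95023264 1. ]], the output after 10000 iterations was xhat = [[ 0.94972973 0.99932479]] and the cost function was about 10^-7.

However, when my x is not a row vector, even if it is only a small 2 by 2 matrix, the output is not close to the original x, and the cost function does not decrease to 0 but plateaus instead.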

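Example:

When the input is x = [[ 0.37853141 1. ][ 0.59747807 1. ]], the output is xhat = [[ 0.48882265 0.9985147 ][ 0.48921648 0.99927143]]. The first column of xhat is not close to the first column of x; instead it is close to the mean of the first column of x. This happens in every test I have run. Also, the cost function plateaus at around 0.006 and never reaches 0.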
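Both observations can be checked directly from the numbers reported above. The following is a minimal sketch that uses only the quoted x and xhat values and the same cost formula as the code below; the names col_mean and plateau_cost are introduced here for illustration only.

```python
import numpy as np

# Values quoted in the example above.
x = np.array([[0.37853141, 1.0],
              [0.59747807, 1.0]])
xhat = np.array([[0.48882265, 0.9985147],
                 [0.48921648, 0.99927143]])

# Per-column mean of the input; the first entry is about 0.488,
# which is what both rows of xhat's first column sit close to.
col_mean = x.mean(axis=0)

# Same cost as in the question: sum of squared errors / (2 * m).
m = x.shape[0]
plateau_cost = ((x - xhat) ** 2).sum() / (2 * m)

print(col_mean)
print(plateau_cost)  # roughly 0.006, matching the reported plateau
```

So the reported plateau value is consistent with a network that outputs nearly the same row for every example, namely something close to the column means.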

Why does this happen and how can I fix it? Again, the derivatives appear to be correct. I am not sure how to fix this.

My code

import numpy as np
import matplotlib.pyplot as plt

def g(x): #sigmoid activation functions
    return 1/(1+np.exp(-x)) #same shape as x!

def gGradient(x): #gradient of sigmoid
    rows,cols = x.shape
    grad = np.zeros((cols, cols))
    for i in range(0, cols):
        grad[i, i] = g(x[0, i])*(1-g(x[0, i]))
    return grad

def cost(x, xhat): #mean squared error between x the data and xhat the output of the machine
    return ((x - xhat)**2).sum()/(2 * m)

m, n = 2, 1
trXNoBias = np.random.rand(m, n)
trX = np.ones((m, n+1))
trX[:, :n] = trXNoBias #add the bias, column of ones
n = n+1

k = 1 #num of neurons in the hidden layer of the autoencoder, shouldn't matter too much
numIter = 10000
learnRate = 0.001
x = trX
w1 = np.random.rand(n, k) #weights from input layer to hidden layer, shape (n, k)
w2 = np.random.rand(k, n) #weights from hidden layer to output layer of the autoencoder, shape (k, n)
w3 = np.random.rand(n, n) #weights from output layer of autoencoder to entire output of the machine, shape (n, n)

costArray = np.zeros((numIter, ))
for i in range(0, numIter):
    #Feed-Forward
    z1 = np.dot(x,w1) #output of the input layer, shape (m, k)
    h1 = g(z1) #input of hidden layer, shape (m, k)

    z2 = np.dot(h1, w2) #output of the hidden layer, shape (m, n)
    h2 = g(z2) #Output of the entire autoencoder. The output layer of the autoencoder. shape (m, n)

    xhat = np.dot(h2, w3) #the output of the machine, which hopefully resembles the original data x, shape (m, n)

    print(cost(x, xhat))
    costArray[i] = cost(x, xhat)

    #Backprop
    dSdxhat = (1/float(m)) * (xhat-x)
    dSdw3 = np.dot(h2.T, dSdxhat)
    dSdh2 = np.dot(dSdxhat, w3.T)
    dSdz2 = np.dot(dSdh2, gGradient(z2))
    dSdw2 = np.dot(h1.T,dSdz2)
    dSdh1 = np.dot(dSdz2, w2.T)
    dSdz1 = np.dot(dSdh1, gGradient(z1))
    dSdw1 = np.dot(x.T,dSdz1)

    w3 = w3 - learnRate * dSdw3
    w2 = w2 - learnRate * dSdw2
    w1 = w1 - learnRate * dSdw1

plt.plot(costArray)
plt.show()

print(x)
print(xhat)
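One thing worth checking in the code above: gGradient builds its diagonal matrix from x[0, i] only, so it uses the first row's sigmoid derivatives for every example. For a single-row input this coincides with the elementwise derivative, which would explain why a numerical check on a row vector passes. The sketch below compares it with an elementwise version on a 2 by 2 input. sigmoid_prime is a hypothetical helper that is not part of the original code, and the random test arrays are made up for illustration. Whether this mismatch fully accounts for the plateau is an assumption that has not been verified here.

```python
import numpy as np

def g(x):
    # Sigmoid, as in the question.
    return 1 / (1 + np.exp(-x))

def gGradient(x):
    # Copied from the question: only row 0 of x is ever read.
    rows, cols = x.shape
    grad = np.zeros((cols, cols))
    for i in range(0, cols):
        grad[i, i] = g(x[0, i]) * (1 - g(x[0, i]))
    return grad

def sigmoid_prime(z):
    # Hypothetical helper: elementwise sigmoid derivative, same shape as z.
    s = g(z)
    return s * (1 - s)

rng = np.random.default_rng(0)
upstream = rng.standard_normal((2, 2))  # stands in for dSdh2
z = rng.standard_normal((2, 2))         # stands in for z2

via_matrix = np.dot(upstream, gGradient(z))    # what the question computes
via_elementwise = upstream * sigmoid_prime(z)  # per-example derivative

row0_matches = np.allclose(via_matrix[0], via_elementwise[0])
row1_matches = np.allclose(via_matrix[1], via_elementwise[1])
print(row0_matches, row1_matches)
```

If the two disagree on the second row, replacing np.dot(dSdh2, gGradient(z2)) with an elementwise product such as dSdh2 * sigmoid_prime(z2), and likewise for z1, would be one change to try.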
