だから、私は Matlab で EM アルゴリズムを実装していますが、私の行列はすぐに値によって汚染されてNaN
しまいInf
ます。行列の反転が原因である可能性があると思いますが、それが唯一の理由かどうかはわかりません。
コードは次のとおりです。
function [F, Q, R, x_T, P_T] = em_algo(y, G)
% y_t = G_t'*x_t + v_t 1*1 = 1*p p*1
% x_t = F*x_t-1 + w_t p*1 = p*p p*1
% G is T*p
p = size(G,2); % p = nb assets ; G = T*p
q = size(y,2); % q = nb observations ; y = T*q
T = size(y,1); % y is T*1
F = eye(p); % = Transition matrix p*p
Q = eye(p); % innovation (v) covariance matrix p*p
R = eye(q); % noise (w) covariance matrix q x q
x_T_old = zeros(p,T);
mu0 = zeros(p,1);
Sigma = eye(p); % Initial state covariance matrix p*p
converged = 0;
i = 0;
max_iter = 60; % only for testing purposes
while ~converged
if i > max_iter
break;
end
% E step = smoothing
fprintf('Iteration %d\n',i);
[x_T,P_T,P_Tm2] = smoother(G,F,Q,R,mu0,Sigma,y);
%x_T
% M step
A = zeros(p,p);
B = zeros(p,p);
C = zeros(p,p);
R = eye(q);
for t = 2:T % eq (9) in EM paper
A = A + (P_T(:,:,t-1) + (x_T(:,t-1)*x_T(:,t-1)'));
end
for t = 2:T % eq (10)
%B = B + (P_Tm2(:,:,t-1) + (x_T(:,t)*x_T(:,t-1)'));
B = B + (P_Tm2(:,:,t) + (x_T(:,t)*x_T(:,t-1)'));
end
for t = 1:T %eq (11)
C = C + (P_T(:,:,t) + (x_T(:,t)*x_T(:,t)'));
end
F = B*inv(A); %eq (12)
Q = (1/T)*(C - (B*inv(A)*B')); % eq (13) pxp
for t = 1:T
bias = y(t) - (G(t,:)*x_T(:,t));
R = R + ((bias*bias') + (G(t,:)*P_T(:,:,t)*G(t,:)'));
end
R = (1/T)*R;
if i>1
err = norm(x_T-x_T_old)/norm(x_T_old);
if err < 1e-4
converged = 1;
end
end
x_T_old = x_T;
i = i+1;
end
fprintf('EM algorithm iterated %d times\n',i);
end
これは収束するまで反復し(私の問題のために決して起こりません)、smoother.m
各反復で呼び出します:
function [x_T, P_T, P_Tm2] = smoother(G,F,Q,R,mu0,Sigma,y)
% G is T*p
p = size(mu0,1); % mu0 is p*1
T = size(y,1); % y is T*1
J = zeros(p,p,T);
K = zeros(p,T); % gain matrix
x = zeros(p,T);
x(:,1) = mu0;
x_m1 = zeros(p,T);
x_T = zeros(p,T); % x values when we know all the data
% Notation : x = xt given t ; x_m1 = xt given t-1 (m1 stands for minus
% one)
P = zeros(p,p,T);% array of cov(xt|y1...yt), eq (6) in Shumway & Stoffer 1982
P(:,:,1) = Sigma;
P_m1 = zeros(p,p,T); % Same notation ; = cov(xt, xt-1|y1...yt) , eq (7)
P_T = zeros(p,p,T);
P_Tm2 = zeros(p,p,T); % cov(xT, xT-1|y1...yT)
for t = 2:T %starts at t = 2 because at each time t we need info about t-1
x_m1(:,t) = F*x(:,t-1); % eq A3 ; pxp * px1 = px1
P_m1(:,:,t) = (F*P(:,:,t-1)*F') + Q; % A4 ; pxp * pxp = pxp
if nnz(isnan(P_m1(:,:,t)))
error('NaNs in P_m1 at time t = %d',t);
end
if nnz(isinf(P_m1(:,:,t)))
error('Infs in P_m1 at time t = %d',t);
end
K(:,t) = P_m1(:,:,t)*G(t,:)'*pinv((G(t,:)*P_m1(:,:,t)*G(t,:)') + R); %A5 ; pxp * px1 * 1*1 = p*1
%K(:,t) = P_m1(:,:,t)*G(t,:)'/((G(t,:)*P_m1(:,:,t)*G(t,:)') + R); %A5 ; pxp * px1 * 1*1 = p*1
% The matrix inversion seems to generate NaN values which quickly
% contaminate all the other matrices. There is no warning about
% (close to) singular matrices or whatever. The use of pinv()
% instead of inv() seems to solve the problem... but I don't think
% it's the appropriate way to deal with it, there must be something
% wrong elsewhere
if nnz(isnan(K(:,t)))
error('NaNs in K at time t = %d',t);
end
x(:,t) = x_m1(:,t) + (K(:,t)*(y(t)-(G(t,:)*x_m1(:,t)))); %A6
P(:,:,t) = P_m1(:,:,t) - (K(:,t)*G(t,:)*P_m1(:,:,t)); %A7
end
x_T(:,T) = x(:,T);
P_T(:,:,T) = P(:,:,T);
for t = T:-1:2 % we stop at 2 since we need to use t-1
%P_m1 seem to get really huge (x10^22...), might lead to "Inf"
%values which in turn might screw pinv()
%% inv() caused NaN value to appear, pinv seems to solve the issue
J(:,:,t-1) = P(:,:,t-1)*F'*pinv(P_m1(:,:,t)); % A8 pxp * pxp * pxp
%J(:,:,t-1) = P(:,:,t-1)*F'/(P_m1(:,:,t)); % A8 pxp * pxp * pxp
x_T(:,t-1) = x(:,t-1) + J(:,:,t-1)*(x_T(:,t)-(F*x(:,t-1))); %A9 % Becomes NaN during 8th iteration!
P_T(:,:,t-1) = P(:,:,t-1) + J(:,:,t-1)*(P_T(:,:,t)-P_m1(:,:,t))*J(:,:,t-1)'; %A10
nans = [nnz(isnan(J)) nnz(isnan(P_m1)) nnz(isnan(F)) nnz(isnan(x_T)) nnz(isnan(x_m1))];
if nnz(nans)
error('NaN invasion at time t = %d',t);
end
end
P_Tm2(:,:,T) = (eye(p) - K(:,T)*G(T,:))*F*P(:,:,T-1); % %A12
for t = T:-1:3 % stop at 3 because use of t-2
P_Tm2(:,:,t-1) = P_m1(:,:,t-1)*J(:,:,t-2)' + J(:,:,t-1)*(P_Tm2(:,:,t)-F*P(:,:,t-1))*J(:,:,t-2)'; % A11
end
end
NaN
s とsは、〜 8回目の反復Inf
でポップし始めます。
私はどこかでマトリックスで何か不浄なことをしていると思いますが、何が悪いのか本当にわかりません。私はあなたの専門知識を信頼しています。
助けてくれてありがとう。
Rody : データを生成する方法は次のとおりです (これはまだ「現実世界」のデータではなく、何も問題がないことを確認するために生成されたテスト データです)。
T = 500;
nbassets = 3;
G = .1 + randn(T,nbassets); % random walk trajectories
y = (1:T).';
y = 1.01.^y; % 1 * T % Exponential 1% returns curve
ダン : その通りです。数式がどのように導き出されるかを本当に理解するための数学のバックグラウンドが実際にありません。役に立たないことはわかっていますが、当分の間、それを修正できるかどうかはわかりません。:/
Rody : そうです、私も同じ結論に達しました。しかし、何がそのようにうまくいかないのか、私にはまったくわかりません。
ここに論文へのリンクがあります: http://www.stat.pitt.edu/stoffer/em.pdf
スムーザーの公式はすべて、付録の一番最後にあります。これまでお時間をいただきありがとうございます。