
In a typical clustering problem, the probability of a data point x is p(x) = sum_k p(k) p(x|k), where k is a latent variable specifying the cluster that x belongs to. We can use the EM algorithm to maximize the log likelihood of the training data: sum_n log (sum_k p(k) p(x_n|k)).
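To make the single-latent-variable setup concrete, here is a minimal EM sketch (my own illustration, not part of the original question), assuming p(x|k) is a 1-D Gaussian, so the model is a standard Gaussian mixture:

```python
import numpy as np

def em_gmm_1d(x, K, n_iter=100, seed=0):
    """Minimal EM for a 1-D Gaussian mixture: p(x) = sum_k p(k) N(x | mu_k, var_k)."""
    rng = np.random.default_rng(seed)
    N = len(x)
    # Initialise mixing weights p(k), means, and variances.
    pi = np.full(K, 1.0 / K)
    mu = rng.choice(x, K, replace=False)
    var = np.full(K, np.var(x))
    for _ in range(n_iter):
        # E-step: responsibilities r[n, k] = p(k | x_n) up to normalisation.
        log_px_k = -0.5 * (np.log(2 * np.pi * var) + (x[:, None] - mu) ** 2 / var)
        log_joint = np.log(pi) + log_px_k              # log p(k) p(x_n | k)
        log_joint -= log_joint.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_joint)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate pi, mu, var from the responsibilities.
        Nk = r.sum(axis=0)
        pi = Nk / N
        mu = (r * x[:, None]).sum(axis=0) / Nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    return pi, mu, var

# Toy data: two well-separated clusters.
x = np.concatenate([np.random.normal(-3, 1, 200), np.random.normal(3, 1, 200)])
print(em_gmm_1d(x, K=2))
```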

I wonder whether the EM algorithm can also solve a problem with two sets of latent variables, i.e. p(x) = sum_k sum_l p(x|k, l) p(k) p(l). If so, how can we do that?
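For reference, this is the model structure I have in mind. The sketch below (my own, with Gaussian p(x|k, l) assumed purely for illustration) just evaluates the marginal likelihood of a single point under the two-latent-variable factorization:

```python
import numpy as np

# p(x) = sum_k sum_l p(k) p(l) p(x | k, l), with Gaussian p(x | k, l) as an assumption.
K, L = 3, 2
rng = np.random.default_rng(0)
p_k = np.full(K, 1.0 / K)            # p(k)
p_l = np.full(L, 1.0 / L)            # p(l)
mu = rng.normal(size=(K, L))         # mean of p(x | k, l)
var = np.ones((K, L))                # variance of p(x | k, l)

def p_x(x):
    """Marginal likelihood p(x) = sum_k sum_l p(k) p(l) N(x | mu[k, l], var[k, l])."""
    comp = np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return np.sum(p_k[:, None] * p_l[None, :] * comp)

print(p_x(0.5))
```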

What if all of the probability distributions are sigmoid functions?

