Answered By : Nicholas Mancuso
- Construct the model
- Compute Conditional Expectation under the model (E-Step)
- Maximize our likelihood by updating our current estimate of $theta$ (M-Step)
Construct the Model Before we go further with EM we need to figure out what exactly it is we are computing. In the E-step we are computing exactly the expected value for $log Pr[X,Z|theta]$. So what is this value, really? Observe that $$begin{align*} log Pr[X,Z|theta] &= sum_{i=1}^5 logsum_{Cin {A,B}}Pr[X_i, Z_i=C| theta] &=sum_{i=1}^5 logsum_{Cin {A,B}} Pr[Z_i=C | X_i, theta] cdot frac{Pr[X_i, Z_i=C| theta]}{Pr[Z_i=C | X_i, theta]} &geq sum_{i=1}^5 sum_{Cin {A,B}} Pr[Z_i=C | X_i, theta] cdot logfrac{Pr[X_i, Z_i=C| theta]}{Pr[Z_i=C | X_i, theta]}. end{align*}$$ The reason is that we have 5 experiments to account for, and we don’t know what coin was used in each. The inequality is due to $log$ being concave and applying Jensen’s inequality. The reason we need that lower bound is that we cannot directly compute the arg max to the original equation. However we can compute it for the final lower bound. Now what is $Pr[Z_i=C|X_i,theta]$? It is the probability that we see coin $C$ given experiment $X_i$ and $theta$. Using conditional probabilities we have, $$Pr[Z_i=C| X_i, theta] = frac{Pr[X_i, Z_i = C|theta]}{Pr[X_i|theta]}.$$ While we’ve made some progress, we’re not done with the model just yet. What is the probability that a given coin flipped the sequence $X_i$? Letting $h_i = #text{heads in } X_i$ $$Pr[X_i, Z_i = C| theta] = frac{1}{2} cdot theta_C^{h_i} (1 – theta_C)^{10 – h_i}, text{ for } C in {A, B}.$$ Now $Pr[X_i|theta]$ is clearly just the probability under both possibilities of $Z_i=A$ or $Z_i=B$. Since $Pr[Z_i = A] = Pr[Z_i = B] = 1/2$ we have, $$Pr[X_i|theta] = 1/2 cdot (Pr[X_i |Z_i = A, theta] + Pr[X_i |Z_i = B, theta]).$$ E-Step Okay… that wasn’t so fun but we can start doing some EM work now. The EM algorithm begins by making some random guess for $theta$. In this example we have $theta^0 = (0.6,0.5)$. We compute $$Pr[Z_1=A|X_1,theta] = frac{1/2 cdot (0.6^5 cdot 0.4^5)}{1/2 cdot ((0.6^5 cdot 0.4^5) + (0.5^5 cdot 0.5^5))} approx 0.45.$$ This value lines up with what is in the paper. Now we can compute the expected number of heads in $X_1 = (H,T,T,T,H,H,T,H,T,H)$ from coin $A$, $$E[# text{heads by coin }A | X_1, theta] = h_1 cdot Pr[Z_1=A|X_1,theta] = 5 cdot 0.45 approx 2.2.$$ Doing the same thing for coin $B$ we get, $$E[# text{heads by coin }B | X_1, theta] = h_1 cdot Pr[Z_1=B|X_1,theta] = 5 cdot 0.55 approx 2.8.$$ We can compute the same for the number of tails by substituting $h_1$ for $10 – h_1$. This continues for all other values of $X_i$ and $h_i$ $1 leq i leq 5$. Thanks to linearity of expectation we can figure out $$E[#text{heads by coin } A|X ,theta] = sum_{i=1}^5 E[# text{heads by coin }A | X_i, theta]$$ M-Step With our expected values in hand, now comes the M step where we want to maximize $theta$ given our expected values. This is done by simple normalization! $$theta_A^1 = frac{E[#text{heads over } X text{ by coin } A|X ,theta]}{E[#text{heads and tails over } X text{ by coin } A|X ,theta]} = frac{21.3}{21.3 + 9.6} approx 0.71.$$ Likewise for $B$. This process begins again with the E-Step and $theta^1$ and continues until the values for $theta$ converge (or to some alloweable threshold). In this example we have 10 iterations and $hat{theta} = theta^{10} = (0.8, 0.52)$. In each iteration the value of $Pr[X,Z|theta]$ increases, due to the better estimate of $theta$. Now in this case the model was fairly simplistic. Things can get much more complicated pretty quickly, however the EM algorithm will always converge, and will always produce a maxmimum likelihood estimator $hat{theta}$. It may be a local estimator, but to get around this we can just restart the EM process with a different initialization. We can do this a constant amount of times and retain the best results (i.e., those with the highest final likelihood).
Asked By : IcySnow
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/10637