EM Algorithm for estimating a Gaussian Mixture Model

Here is an applet demonstrating the EM algorithm for estimating the parameters of a Gaussian Mixture.
(For more information on what that means, see a gentle introduction to EM.)

Since EM is only guaranteed to reach a local maximum, the initial guess of the parameters can have a big impact on where EM ends up. This applet helps you observe that effect.

The applet randomly selects five Gaussian distributions (i.e. five pairs of means and variances). Then it generates 200 points from each of the five Gaussian distributions with equal probability. For simplicity, all variances are fixed at 1.0 (and EM is told about this).
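For readers who want to reproduce this setup outside the applet, here is a minimal sketch in Python with NumPy. It is not the applet's code. The function name `make_mixture`, the 2-D data, the square range for the means, and the reading that 200 points are drawn in total (each from a component picked with equal probability) are all assumptions made for illustration.

```python
import numpy as np

def make_mixture(n_components=5, n_points=200, low=0.0, high=10.0, seed=0):
    """Pick random component means, then sample points around them.

    Assumptions (not taken from the applet): the data is 2-D, the means
    are uniform in the square [low, high]^2, and n_points is the total
    number of points, each drawn from a component chosen uniformly.
    """
    rng = np.random.default_rng(seed)
    # One mean per component; variances are fixed at 1.0, so nothing else to pick.
    means = rng.uniform(low, high, size=(n_components, 2))
    # Equal probability: every component is equally likely for each point.
    labels = rng.integers(0, n_components, size=n_points)
    # Standard normal noise gives variance 1.0 along each axis.
    points = means[labels] + rng.standard_normal((n_points, 2))
    return means, points, labels
```

Calling `make_mixture(seed=...)` with different seeds plays the role of the Shuffle button in this sketch.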
Then the applet allows you to select the initial 'guesses' (marked by green spots) for the five means. You can place these five guesses close to the real means (marked by red spots) and hence make it easy for EM, or you can place them in a way that makes it hard for EM to move them to their correct positions. For instance, placing the initial guesses very close to each other will (usually) cause them to 'compete' for the same cluster. Sometimes, you will even see one green spot trying to 'own' two clusters and position itself right between them.
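The 'competition' between guesses can be seen in a single EM update. The sketch below is one possible version of that update, not the applet's own code. It assumes 2-D points, unit variances, and equal, fixed mixing weights, so only the means are re-estimated. The name `em_step` is hypothetical.

```python
import numpy as np

def em_step(points, means):
    """One EM update of the means (unit variance and equal weights assumed)."""
    points = np.asarray(points, dtype=float)
    means = np.asarray(means, dtype=float)
    # Squared distance from every point to every current guess: shape (n, k).
    sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    # E-step: responsibility of each guess for each point.
    # Subtracting the row maximum keeps exp() from underflowing on every entry.
    log_r = -0.5 * sq_dist
    log_r -= log_r.max(axis=1, keepdims=True)
    resp = np.exp(log_r)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: each guess moves to the responsibility-weighted mean of the points.
    weight = resp.sum(axis=0)
    updated = (resp.T @ points) / np.maximum(weight, 1e-12)[:, None]
    # A guess that owns (numerically) no points stays where it is.
    new_means = np.where(weight[:, None] > 1e-12, updated, means)
    return new_means, resp
```

Under these assumptions, two guesses placed at exactly the same spot receive identical responsibilities and therefore move to the same new position. That is the extreme case of guesses competing for one cluster, and it also shows how a single guess can settle between two clusters.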

Usage:
To create a random mixture of Gaussians, press the Shuffle button (until satisfied). Then use the left mouse button to place five initial guesses for the means. Once you have done so, press the 'Begin EM' button. Use Abort if EM tries to test your patience.

The algorithm stops when the log-likelihood increases by less than 0.0001 from one iteration to the next.
This does not necessarily mean that a local maximum has been reached, or that further increases will be less than 0.0001! To force EM to try a little more, press the 'Begin EM' button again without shuffling.
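The stopping rule can be sketched as a loop around the EM update. As before, this is an illustration under assumptions (2-D points, unit variances, equal fixed mixing weights), not the applet's implementation. The names `log_likelihood` and `run_em` and the `max_iter` safety cap are hypothetical.

```python
import numpy as np

def log_likelihood(points, means):
    """Log-likelihood of the points under an equal-weight, unit-variance mixture."""
    k, d = means.shape
    sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_p = -0.5 * sq_dist - 0.5 * d * np.log(2 * np.pi) - np.log(k)
    # Log-sum-exp over components, done per point for numerical stability.
    m = log_p.max(axis=1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))).sum())

def run_em(points, means, tol=1e-4, max_iter=1000):
    """Iterate EM until the log-likelihood gain drops below tol."""
    points = np.asarray(points, dtype=float)
    means = np.asarray(means, dtype=float)
    prev = log_likelihood(points, means)
    cur, iteration = prev, 0
    for iteration in range(1, max_iter + 1):
        sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        # E-step: responsibilities, shifted by the row minimum for stability.
        resp = np.exp(-0.5 * (sq_dist - sq_dist.min(axis=1, keepdims=True)))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted means (the guard avoids division by zero).
        means = (resp.T @ points) / np.maximum(resp.sum(axis=0), 1e-12)[:, None]
        cur = log_likelihood(points, means)
        if cur - prev < tol:  # gain below the threshold: stop
            break
        prev = cur
    return means, cur, iteration
```

Calling `run_em` a second time with the means it returned corresponds to pressing 'Begin EM' again without shuffling: the loop always performs at least one more update before it re-checks the threshold.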

Try placing all your green dots (initial guesses) close to each other in one corner of the window.

