\( \def\v{\mathbf{v}} \def\w{\mathbf{w}} \def\x{\mathbf{x}} \def\D{\mathbf{D}} \def\V{\mathbf{V}} \def\S{\mathbf{S}} \def\F{\mathcal F} \def\bold#1{\bf #1} \)
In this problem we will consider shift-invariant mixtures of multi-variate multinomial distributions.
Consider data that have multiple discrete attributes. "Discrete" attributes are attributes that can take only one of a countable set of values. We will consider discrete attributes of a particular kind -- integers that have not only a natural rank ordering, but also a definite notion of distance.
Let $(X,Y)$ be the pair of discrete attributes defining any data instance. Since both $X$ and $Y$ are discrete, the probability distribution of $(X,Y)$ is a bi-variate multinomial.
We describe $(X,Y)$ as the outcome of generation by the following process:
The process has at its disposal several urns. Each urn has two sub-urns inside it. The first sub-urn represents a bi-variate multinomial: it contains balls, such that each ball has an $(X_1,Y_1)$ value marked on it. The second sub-urn represents a uni-variate multinomial -- it contains balls, such that each ball has an $X_2$ value marked on it.
In the following explanation we will use the notation $P_x(X)$ to indicate the probability that the random variable $x$ takes the value $X$.
We represent the random variable generated by the first (bi-variate) sub-urn within each urn as $(x_1, y_1)$. The second (uni-variate) sub-urn generates the random variable $x_2$.
Drawing procedure: At each draw the drawing process performs the following operations.
Thus, the final observation is:
Representing the output random variable as $(x,y)$, the probability that it takes a value $(X,Y)$ is given by $P_{x,y}(X,Y)$.
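The drawing procedure can be simulated directly. The sketch below is a minimal illustration, assuming the usual shift-invariant construction: an urn $Z$ is picked according to $P_z(Z)$, a ball $(X_1,Y_1)$ and a ball $X_2$ are drawn from its two sub-urns, and the observation is $(X_1 + X_2, Y_1)$. The function name, argument names and array layout are hypothetical and not part of the assignment.

```python
import numpy as np

def sample_observations(P_z, P_xy1, P_x2, n_draws, seed=0):
    """Simulate draws and return a histogram of counts H[y, x].

    Assumed (hypothetical) array layout:
      P_z   : shape (K,)            urn probabilities
      P_xy1 : shape (K, height, W1) P_xy1[z, y1, x1]
      P_x2  : shape (K, S)          P_x2[z, x2]
    Assumption: the observation is (X1 + X2, Y1).
    """
    rng = np.random.default_rng(seed)
    K, height, w1 = P_xy1.shape
    n_shifts = P_x2.shape[1]
    H = np.zeros((height, w1 + n_shifts - 1), dtype=int)
    for _ in range(n_draws):
        z = rng.choice(K, p=P_z)                        # pick an urn
        flat = rng.choice(height * w1, p=P_xy1[z].ravel())
        y1, x1 = divmod(flat, w1)                       # ball from first sub-urn
        x2 = rng.choice(n_shifts, p=P_x2[z])            # ball from second sub-urn
        H[y1, x1 + x2] += 1                             # shifted observation
    return H
```

With a large number of draws, the normalized histogram `H / H.sum()` approaches the distribution of the output under this assumed construction.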
Give the expression for $P_{x,y}(X,Y)$ in terms of $P_z(Z)$, $P_{x_1,y_1}(X_1,Y_1|Z)$ and $P_{x_2}(X_2|Z)$.
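As a hedged sketch of the form the answer is expected to take, assume the observation is $(X,Y) = (X_1 + X_2, Y_1)$, i.e. the shift is applied along $X$ only. Marginalizing over the urn $Z$ and the shift $X_2$ then gives a mixture of convolutions along $X$:

$$P_{x,y}(X,Y) = \sum_{Z} P_z(Z) \sum_{X_2} P_{x_1,y_1}(X - X_2, Y | Z)\, P_{x_2}(X_2 | Z)$$

Terms for which $X - X_2$ falls outside the support of $x_1$ contribute zero.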
You are given a histogram of counts $H(X,Y)$ obtained from a large number of observations. $H(X,Y)$ represents the number of times $(X,Y)$ was observed. Give the EM update rules to estimate $P_z(Z)$, $P_{x_1,y_1}(X_1,Y_1|Z)$ and $P_{x_2}(X_2|Z)$.
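One possible form of the updates, again assuming the observation is $(X_1 + X_2, Y_1)$, treats the urn $Z$ and the shift $X_2$ as the hidden variables. This is a sketch under that assumption, not a reference solution. The E-step computes the posterior

$$P(Z, X_2 | X, Y) = \frac{P_z(Z)\, P_{x_1,y_1}(X - X_2, Y | Z)\, P_{x_2}(X_2 | Z)}{\sum_{Z'} P_z(Z') \sum_{X_2'} P_{x_1,y_1}(X - X_2', Y | Z')\, P_{x_2}(X_2' | Z')}$$

and the M-step re-estimates each distribution from posterior-weighted counts:

$$P_z(Z) = \frac{\sum_{X,Y} H(X,Y) \sum_{X_2} P(Z, X_2 | X, Y)}{\sum_{X,Y} H(X,Y)}$$

$$P_{x_2}(X_2 | Z) = \frac{\sum_{X,Y} H(X,Y)\, P(Z, X_2 | X, Y)}{\sum_{X_2'} \sum_{X,Y} H(X,Y)\, P(Z, X_2' | X, Y)}$$

$$P_{x_1,y_1}(X_1, Y_1 | Z) = \frac{\sum_{X_2} H(X_1 + X_2, Y_1)\, P(Z, X_2 | X_1 + X_2, Y_1)}{\sum_{X_1', Y_1'} \sum_{X_2} H(X_1' + X_2, Y_1')\, P(Z, X_2 | X_1' + X_2, Y_1')}$$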
In this problem we will try to deblur a picture that has become blurry due to a slight left-to-right shake of the camera. You can download the actual picture from this link:
We model the picture as a histogram: the value of any pixel at a position $(X,Y)$, which ranges from 0 to 255, is viewed as the count of "light elements" at that position. We model this distribution as a shift-invariant mixture of one component (i.e. one large urn).
Assuming a very slight 20-pixel strictly-horizontal shake, we model that within the $X_2$ sub-urn $X_2$ can take integer values 0 to 19 (i.e. 20 wide). The $X_1$ value in the $(X_1,Y_1)$ sub-urn can range from 0 to (width-of-picture - 20). $Y_1$ can take values in the range 0 to (height-of-picture - 1).
Estimate and plot $P_{x_2}(X_2)$ and $P_{x_1,y_1}(X_1, Y_1)$. You will need the solution to problem 2 for this problem. If the solution to problem 2 is incorrect, the solution of problem 3 will not be considered or given any points.
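For the single-urn case used here, the estimation can be sketched in a few lines of NumPy. The block below is a minimal illustration, not a reference solution: it assumes the observed pixel is $(X_1 + X_2, Y_1)$, stores the picture as `H[y, x]`, and uses hypothetical names (`em_shift_invariant`, `shift_width`, `n_iter`). With a shift width of 20 it gives $X_1$ the range 0 to (width - 20), matching the ranges stated for the sub-urns.

```python
import numpy as np

def em_shift_invariant(H, shift_width, n_iter=50, seed=0):
    """EM for a one-component shift-invariant model (horizontal shifts only).

    H           : array (height, width) of non-negative counts, H[y, x]
    shift_width : number of values X2 can take (0 .. shift_width - 1)
    Returns (P1, P2): P1[y1, x1] estimates P(X1, Y1), P2[x2] estimates P(X2).
    Assumption: the observation is (X1 + X2, Y1).
    """
    rng = np.random.default_rng(seed)
    height, width = H.shape
    w1 = width - shift_width + 1          # number of values X1 can take
    P1 = rng.random((height, w1))
    P1 /= P1.sum()
    P2 = rng.random(shift_width)
    P2 /= P2.sum()
    eps = 1e-12                           # guards against division by zero

    for _ in range(n_iter):
        # Model distribution: P(X, Y) = sum_s P2[s] * P1[Y, X - s]
        P = np.zeros((height, width))
        for s in range(shift_width):
            P[:, s:s + w1] += P2[s] * P1

        # Ratio of observed counts to the model; the posterior over the
        # shift is folded into the accumulations below.
        R = H / (P + eps)

        new_P1 = np.zeros_like(P1)
        new_P2 = np.zeros_like(P2)
        for s in range(shift_width):
            patch = R[:, s:s + w1]        # R evaluated at (X1 + s, Y1)
            new_P1 += P2[s] * patch
            new_P2[s] = P2[s] * np.sum(P1 * patch)
        new_P1 *= P1

        # M-step: normalize the posterior-weighted counts
        P1 = new_P1 / new_P1.sum()
        P2 = new_P2 / new_P2.sum()
    return P1, P2
```

Under these assumptions `P2` plays the role of the blur kernel and `P1` of the sharpened picture; both can be plotted directly, for example as a stem plot and as an image.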
In this problem we will try to track a number of opinion polls and try to estimate the true support for the candidates in a recent election.
The election is between four candidates. Public sentiment about the candidates fluctuates all the time. A number of opinion polls try to gauge public sentiment. However, since opinion polls are fundamentally noisy procedures (affected by factors such as the specific subset of people they poll, or the number of samples in their poll), each of them can be viewed as a noisy measurement of the true public sentiment. We will try to obtain a better estimate of the true sentiment, as well as the uncertainty of the estimate (which a pollster could use to establish a margin of error).
We will model the polls as the output of a linear Gaussian process as follows:
Our objective is to use the measurements $O_{0:t}$ (i.e. all measurements from time 0 to $t$) to estimate the true sentiment $S_t$ at time $t$.
Some entries in the data are NaN. These represent opinion poll numbers that were not obtained, i.e. in that particular week the actual number of opinion polls obtained was less than 17, and so the NaN entries represent polls that were not taken. Note that since each poll gives you three numbers (one per candidate), the NaNs will occur in groups of three. You will have to keep track of which data entries are missing in any week, because you will have to remove those entries from the observation to get the true observation vector, and also remove the corresponding rows from $A$, and the corresponding columns and rows of $\Theta_\gamma$, to get the actual covariance matrix for the observation noise that week.
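The bookkeeping for missing polls can be sketched as follows. This is a minimal illustration with hypothetical function and variable names. The `kalman_update` step additionally assumes the standard observation model $O_t = A S_t + \gamma_t$ with zero-mean Gaussian noise $\gamma_t$ of covariance $\Theta_\gamma$; if the assignment's model differs (for example, a non-zero noise mean), the update must be adapted.

```python
import numpy as np

def drop_missing(o, A, Theta_gamma):
    """Remove NaN entries of o, with the matching rows of A and the
    matching rows and columns of Theta_gamma."""
    keep = ~np.isnan(o)
    return o[keep], A[keep, :], Theta_gamma[np.ix_(keep, keep)]

def kalman_update(s_pred, R_pred, o, A, Theta_gamma):
    """Standard Kalman measurement update using only the observed entries.

    s_pred, R_pred : predicted state mean and covariance (hypothetical names)
    Assumption: o = A s + gamma, gamma ~ N(0, Theta_gamma).
    """
    o_obs, A_obs, Th_obs = drop_missing(o, A, Theta_gamma)
    if o_obs.size == 0:
        # No polls this week: the prediction is the best estimate.
        return s_pred, R_pred
    S = A_obs @ R_pred @ A_obs.T + Th_obs      # innovation covariance
    # Kalman gain K = R_pred A^T S^{-1}, via a solve (S and R_pred symmetric)
    K = np.linalg.solve(S, A_obs @ R_pred).T
    s_new = s_pred + K @ (o_obs - A_obs @ s_pred)
    R_new = (np.eye(len(s_pred)) - K @ A_obs) @ R_pred
    return s_new, R_new
```

Building the reduced matrices per week, rather than imputing values for the NaN entries, keeps the update consistent with the polls that were actually taken.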
N.B: Please remember that this is only a homework problem and may not in any way be indicative of reality. Our model is unrealistic -- it's unlikely that either the noise or the innovation is Gaussian. We're also not explicitly handling other factors that affect the polling, or the constraint that the samples are strictly non-negative (you can't have a negative percent of the population voting for anyone). Various other factors are being ignored (although, in principle, all of these could be included in the model). Nonetheless, we believe the computational exercise itself is interesting and should tell you something of the power of MLSP techniques.
The assignment is due on 30 Nov 2016. The solutions must be emailed to Bhiksha, Chiyu and Anurag. Please use the format given here for your submissions.