This problem too has two parts.
All data for this problem are available from the links in the Downloads section below.
You are given a corpus of facial images. You must learn a "typical" (i.e. Eigen) face from these images.
You are also given four group photographs with many faces. You must use the Eigen face you have learnt to detect all faces in the photos.
The faces in the group photo may have different sizes. You must account for these variations.
Use MATLAB if you can. Other similar tools, such as "Octave" or "Python" (which comes with some very nice scientific and visualization libraries), are also reasonable alternatives. The masochistic among you may want to do it all in "C".
You are given a collection of images of faces (see the downloads section) from image database 1. These images are all of dimension 64x64. From these you must compute the "best" Eigen faces.
Some hints on how to read files into MATLAB can be found here.
You must compute the first Eigen face from this data. To do so, you will have to read all images into a matrix. Here are instructions for building a matrix of images in MATLAB. You must then compute the first Eigen vector for this matrix. Information on computing Eigen faces from an image matrix can be found here.
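For concreteness, here is a minimal MATLAB sketch of this step; the folder name, file extension, and use of svd are assumptions, so adapt them to the actual dataset and to the instructions linked above.

    % Read the 64x64 training faces into a matrix and take the first Eigen vector.
    files = dir(fullfile('database1', '*.pgm'));      % assumed folder and extension
    X = zeros(64*64, numel(files));
    for i = 1:numel(files)
        im = double(imread(fullfile('database1', files(i).name)));
        if size(im, 3) > 1, im = mean(im, 3); end     % collapse color images to grayscale
        X(:, i) = im(:);                              % unroll each 64x64 image into a column
    end
    [U, S, V] = svd(X, 'econ');                       % left singular vectors = Eigen vectors of X*X'
    eigenface = reshape(U(:, 1), 64, 64);             % first Eigen face, back in image form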
To detect faces in the image, you must scan the group photo and identify all regions in it that best "match" the pattern of the Eigen face. To "scan" the image for matches against an N x M Eigen face, you must match every N x M region of the photo against the Eigen face.
The “match” between any N x M region of an image and an Eigen face is given by the normalized dot product between the Eigen face and the region of the image being evaluated. The normalized dot product between an N x M Eigen face and a corresponding N x M segment of the image is given by E.P / norm(P), where E is the vector (unrolled) representation of the Eigen face, and P is the unrolled vector form of the N x M patch.
A simple MATLAB loop that scans an image for an Eigen vector is given here.
The locations of faces are likely to be where the match score peaks.
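The following is a minimal MATLAB sketch of such a scan, assuming E holds the 64x64 Eigen face and img holds a grayscale group photo as a double matrix; the peak-picking threshold at the end is just an illustrative heuristic.

    [N, M] = size(E);
    e = E(:);                                        % unrolled Eigen face
    [H, W] = size(img);
    scores = zeros(H - N + 1, W - M + 1);
    for r = 1:(H - N + 1)
        for c = 1:(W - M + 1)
            P = img(r:r+N-1, c:c+M-1);               % N x M patch with top-left corner at (r, c)
            scores(r, c) = (e' * P(:)) / norm(P(:)); % normalized dot product E.P / norm(P)
        end
    end
    % Faces are likely where 'scores' peaks; one simple (illustrative) way to pick candidates:
    thresh = 0.9 * max(scores(:));                   % assumed heuristic -- tune it
    [rows, cols] = find(scores >= thresh);           % top-left corners of candidate face regions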
Some tricks may be useful to get better results.
The Eigen face is fixed in size and can only be used to detect faces of approximately the same size as the Eigen face itself. On the other hand, faces in the group photos are of different sizes -- they get smaller as the subject gets farther away from the camera.
The solution to this is to make several scaled copies of the Eigen face and match with all of them.
In order to make your detection system robust, resize the Eigen face from 64x64 pixels to 32x32, 48x48, 96x96, and 128x128 pixels. You can use the scaling techniques we discussed in the linear algebra lecture. MATLAB also provides some easy tools for scaling images. You can find information on scaling images in MATLAB here. Once you've scaled your Eigen face, you will have a total of five "typical" faces, one at each level of scaling. You must scan the group pictures with all five of the Eigen faces. Each of them will give you a "match" score for each position on the image. If you simply locate the peaks in each of them, you may find all the faces. Sometimes multiple peaks will occur at the same position, or within a few pixels of one another. In these cases you can merge all of these; they probably all represent the same face.
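A minimal sketch of the rescaling step, assuming the Image Processing Toolbox's imresize is available and 'eigenface' is the 64x64 Eigen face computed earlier:

    sizes = [32 48 64 96 128];                       % desired Eigen face sizes
    E_scaled = cell(1, numel(sizes));
    for k = 1:numel(sizes)
        E_scaled{k} = imresize(eigenface, [sizes(k) sizes(k)]);   % scaled copy of the Eigen face
    end
    % Scan the group photo with each E_scaled{k} using the loop above, collect the peak
    % locations from every scale, and merge detections whose positions lie within a few
    % pixels of one another, keeping the strongest match.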
Additional heuristics may also be required (appropriate setting of thresholds, comparison of peak values from different scaling factors, additional scaling, etc.). These are for you to investigate.
You are given a training corpus of facial images. You must learn the first K Eigen faces from the corpus. Set K = 10 initially. Mean and variance normalize the images before computing Eigenfaces.
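A minimal MATLAB sketch of the normalization and Eigen face computation, assuming X is the 4096 x (number of images) matrix of unrolled training faces built as before:

    % Mean and variance normalize each image (each column of X).
    X = bsxfun(@minus,   X, mean(X, 1));             % zero mean per image
    X = bsxfun(@rdivide, X, std(X, 0, 1));           % unit variance per image
    K = 10;
    [U, S, V] = svd(X, 'econ');
    Efaces = U(:, 1:K);                              % first K Eigen faces, one per column (4096 x K)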
You are given a second training set of facial images. Express each image as a linear combination of the Eigen faces. i.e., express each face F as
F ≈ wF,1E1 + wF,2E2 + wF,3E3 + ... + wF,KEK
where Ei is the ith Eigen face and wF,i is the weight of the ith Eigen face when composing face F. wF,i can, of course, be computed as the dot product of F and Ei.
It will generally not be possible to represent a face exactly using a limited number of typical faces; as a result there will be an error between the face F and the approximation in terms of the K Eigenfaces. You can also compute the normalized total error in representation as:
errF = ||F - Σi wF,i Ei||^2 / N
where ||.||^2 represents the sum of the squares of the per-pixel errors, and N represents the number of pixels in the image.
Represent each face by the set of weights for the Eigen faces and the error, i.e. F → {wF,1, wF,2, wF,3, ..., wF,K, errF}
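A minimal sketch of this representation, assuming Efaces is the 4096 x K matrix of Eigen faces from the sketch above and f is one unrolled, normalized 64x64 face (a 4096 x 1 vector):

    w = Efaces' * f;                                 % w(i) = dot product of f with the i-th Eigen face
    recon = Efaces * w;                              % approximation of f by the K Eigen faces
    errF = sum((f - recon).^2) / numel(f);           % normalized total error
    features = [w; errF];                            % {w_F,1, ..., w_F,K, err_F}

The same few lines, applied to the unrolled non-face images below, give the non-face feature vectors.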
You are also given a collection of non-face images in the dataset. Represent each of these images too as linear combinations of the Eigen faces, i.e. express each non-face image NF as
NF ≈ wNF,1E1 + wNF,2E2 + wNF,3E3 + ... + wNF,KEK
As before, the weights wNF,i can be computed as dot products. As in the case of faces, the approximation of the non-face images in terms of Eigenfaces will not be exact and will result in error. You can compute the normalized total error as you did for the face images to obtain errNF. In general, this error will be greater for non-face images than for faces, since the Eigenfaces were learned from face images. Represent each of the non-face images by the set of weights and the error, i.e. NF → {wNF,1, wNF,2, ... , wNF,K, errNF}.
The set of weights and the normalized error for the Eigen faces are the features representing all the face and non-face images.
From the set of face and non-face images represented by the Eigenface weights and errors, learn an Adaboost classifier to classify faces vs. non-faces.
You are given a fourth set, which is a collection of face and non-face images. Use the Adaboost classifier to classify these images.
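A minimal sketch of the training and classification step, assuming Xtrain and Xtest are matrices whose rows are the (K+1)-dimensional feature vectors above and ytrain holds labels +1 (face) and -1 (non-face); fitensemble is from the Statistics Toolbox, and any other Adaboost implementation (including your own, built on decision stumps) can be substituted:

    ens = fitensemble(Xtrain, ytrain, 'AdaBoostM1', 100, 'Tree');   % 100 rounds of boosting over tree learners
    [labels, scores] = predict(ens, Xtest);                         % predicted classes and scores for the fourth set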
The classifier you have learned will be for the same size of images that were used in the training data (64 x 64). Scale the classifier by scaling the Eigenfaces to other sizes (32 x 32, 48 x 48, 96 x 96, 128 x 128).
Scan the group photographs from the class to detect faces using your Adaboost classifier.
You can adjust the tradeoff between missing faces and false alarms by comparing the margin (H(x)) of the Adaboost classifier to a threshold other than 0.
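A minimal sketch of this threshold adjustment, assuming 'scores' comes from predict above with numeric labels [-1 1], so that the second score column plays the role of H(x):

    tau = 0;                          % tau = 0 reproduces the default decision; raise it to cut false alarms,
                                      % lower it to cut missed faces (tune on held-out data)
    isFace = scores(:, 2) > tau;      % column 2 = score for the +1 (face) class when labels are [-1 1]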
The HW is due on Oct 20, 2011. What to submit:
1. a brief write-up of what you did
2. the segments that your detector found to be faces. You may either copy those segments into individual files in a folder, or mark them on the group photograph. Make sure we can understand which part(s) of the image were detected as faces.
3. You do not need to submit code. However, we might ask you for the code at a later point, so try to document the code well.
How to submit:
Put the above in a zipfile and name it "MLSP-HW2-FirstnameLastname.zip".
Send the zip file by email to me, cc-ing Anoop and Manuel with "MLSP HW 2 Problem 1 Submission" as the subject line.
You may include any other information if necessary in a "Readme.txt" in the zip file that you submit. Do not put important information in the body of the email.