Final Project: Feature Matching for Morphing
For Computational Photography, 15-463
For this project, I chose to combine two previous projects for interesting and artistic effects. I adapted the automatic feature matching originally used to construct panoramas from semi-overlapping images so that it drives my face morpher, detecting similar points in otherwise unrelated images. This allows for a smooth and aesthetically pleasing (if sometimes illogical) morph between them.
The largest problem in combining these two projects is triangle flipping. The morphing algorithm works by triangulating the correspondence points and morphing each of these triangles in turn. Since the points move as the morph progresses, it is possible for one point to cross the line between two others. If this happens, the triangles connected to that point are ‘flipped,’ resulting in very strange behavior.
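A flip of this kind can be detected with a signed-area test: if a triangle's vertices change winding order between the source and target point sets, that triangle has flipped. The following is a minimal sketch of the idea in Python (the function names are illustrative, not taken from my code):

    import numpy as np

    def signed_area(p0, p1, p2):
        # Twice the signed area of a triangle; the sign encodes winding order.
        return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])

    def flipped_triangles(src_pts, dst_pts, triangles):
        # Return indices of triangles whose winding order differs between the
        # source and destination point sets.
        flipped = []
        for i, (a, b, c) in enumerate(triangles):
            before = signed_area(src_pts[a], src_pts[b], src_pts[c])
            after = signed_area(dst_pts[a], dst_pts[b], dst_pts[c])
            if before * after < 0:  # sign change means the triangle flipped
                flipped.append(i)
        return flipped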
Point Regions
To prevent this from happening, I use region-based points. The user enters how many regions they would like in the x and y directions. After corners are found with the Harris detector and evenly distributed with adaptive non-maximal suppression (ANMS), each point is only matched against points in the corresponding region of the second image. Since each pair of points comes from the same area of the image, triangle flipping is much less likely. It occurred in only one or two of my many trials, and was easily removed by adjusting a few parameters, including the number of regions.
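In sketch form, the region idea amounts to binning each detected corner by its grid cell and only comparing descriptors within the same cell. The snippet below is illustrative; the SSD distance and the one-match-per-cell rule are assumptions about details not spelled out above:

    import numpy as np

    def region_index(pt, img_shape, nx, ny):
        # Map an (x, y) corner to the index of its grid cell.
        h, w = img_shape[:2]
        col = min(int(pt[0] * nx / w), nx - 1)
        row = min(int(pt[1] * ny / h), ny - 1)
        return row * nx + col

    def match_by_region(pts1, desc1, pts2, desc2, img_shape, nx, ny):
        # For each grid cell, keep the single best descriptor match whose
        # endpoints both lie in that cell. desc1/desc2 are (N, D) arrays.
        matches = []
        for cell in range(nx * ny):
            idx1 = [i for i, p in enumerate(pts1) if region_index(p, img_shape, nx, ny) == cell]
            idx2 = [j for j, p in enumerate(pts2) if region_index(p, img_shape, nx, ny) == cell]
            if not idx1 or not idx2:
                continue  # one of the images has no corners in this cell
            best = min((np.sum((desc1[i] - desc2[j]) ** 2), i, j)
                       for i in idx1 for j in idx2)
            matches.append((best[1], best[2]))
        return matches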
Deciding how many regions to use is a challenge. With fewer regions, each correspondence pair can travel a greater distance, and thus lends more ‘motion’ to the image. However, the resulting triangles are larger, which makes the morphing rougher: the transition seems artificial when edges that should match up become misaligned because there are not enough points to define them. I found that between 10 and 20 total points was the best compromise, depending on the amount of detail in the image.
Num Regions: 3x3, Shrink Factor: 1, Num Points: 200
Num Regions: 4x4, Shrink Factor: 1, Num Points: 200
Num Regions: 10x10, Shrink Factor: 1, Num Points: 200
Limiting To Strong Corners
Another challenge was making the program match the sorts of features the human eye would expect. The Harris corner detector is quite sensitive, and will often pick up on variations in what we would expect to be continuous surfaces, like walls and skies. I made several changes to the feature-matching portion of the program to resolve this. First, I limited the features the matcher can choose from by decreasing the final number of points kept by the ANMS function. When only the strongest corners are chosen, most of the points are ‘corners’ to the human eye as well.
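ANMS keeps the corners that are locally dominant: each corner is assigned a suppression radius (the distance to the nearest corner that is sufficiently stronger), and only the requested number of corners with the largest radii survive. A rough sketch follows; the 0.9 robustness constant is the commonly quoted value, not necessarily the one my code uses:

    import numpy as np

    def anms(points, strengths, num_points, robust=0.9):
        # Adaptive non-maximal suppression: keep the num_points corners with
        # the largest suppression radius (distance to the nearest corner that
        # is sufficiently stronger).
        points = np.asarray(points, dtype=float)
        strengths = np.asarray(strengths, dtype=float)
        radii = np.full(len(points), np.inf)
        for i in range(len(points)):
            stronger = strengths > strengths[i] / robust  # corners that dominate corner i
            if np.any(stronger):
                d2 = np.sum((points[stronger] - points[i]) ** 2, axis=1)
                radii[i] = np.min(d2)  # squared distance is fine for ranking
        return np.argsort(-radii)[:num_points]  # indices of the surviving corners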
Num Regions: 4x4, Shrink Factor: 2, Num Points: 500
Num Regions: 4x4, Shrink Factor: 2, Num Points: 50
Descriptor-Matching Threshold
I next put a threshold on the matching step: if no pair of descriptors in a region is similar to within a set value, no points are selected for that region. Since we are not looking for features that correspond to the same object in both images, the usual ratio of the best match to the second-best match is not useful; we merely want the features that match best. But if a region doesn't have any matches that are good enough, I prefer to have no match there at all, which frequently results in a more natural morph.
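A sketch of what that per-region check could look like, assuming a sum-of-squared-differences error between descriptors (the threshold values quoted in the captions are in whatever units the actual error measure uses):

    import numpy as np

    def best_match_with_threshold(desc1, desc2, threshold):
        # Return the best (i, j) descriptor pair for this region, or None if
        # even the best pair differs by more than the threshold.
        best_err, best_pair = np.inf, None
        for i, d1 in enumerate(desc1):
            errs = np.sum((desc2 - d1) ** 2, axis=1)  # SSD against every descriptor in image 2
            j = int(np.argmin(errs))
            if errs[j] < best_err:
                best_err, best_pair = errs[j], (i, j)
        return best_pair if best_err < threshold else None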
Num Regions: 4x4, Shrink Factor: 2, Num Points: 200,
Descriptor Matching Threshold: 5 (all others are 75)
Feature Window Size
The best matches are often between the largest distinguishing features. For that reason, it is often best for the feature descriptor to encompass a larger area. The Harris corner detector looks at a Gaussian-weighted 11x11 window, and the feature descriptor built from each corner then encompasses 40x40 pixels. Instead of changing these parameters directly, I allow the user to enter a shrink factor. The point matching is then performed on an image scaled down by this factor, which detects larger corners without changing the sampling sizes.
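Roughly, the shrink factor works as in the sketch below, where detect_and_match stands in for the entire Harris/ANMS/matching pipeline and the downscaling call is just one way to do it:

    import numpy as np
    from scipy.ndimage import zoom

    def match_at_scale(img1, img2, shrink, detect_and_match):
        # Run corner detection and matching on downscaled copies of the images,
        # then map the matched coordinates back to full resolution.
        factors = (1.0 / shrink, 1.0 / shrink, 1)  # assumes H x W x 3 color images
        small1, small2 = zoom(img1, factors), zoom(img2, factors)
        pts1, pts2 = detect_and_match(small1, small2)  # (x, y) pairs in the small images
        return np.asarray(pts1) * shrink, np.asarray(pts2) * shrink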
Num Regions: 4x4, Shrink Factor: 2, Num Points: 200
Num Regions: 4x4, Shrink Factor: 4, Num Points: 200
Num Regions: 4x4, Shrink Factor: 1/2, Num Points: 200
By starting with acceptable defaults and then tweaking these parameters through trial and error, I can produce a good set of matches between two images.
Sine Wave Blending
I felt that the blending of pixel values was frequently distracting from the shape distortion. To counter this, I blend the pixels using a weight based on a sine wave instead of the linear weight used in the face-morphing project. The effect is that the current image remains visible longer as it distorts before switching to the next image, which then continues to morph from the previous shape. The transition is not too abrupt, but it is quick enough that the second image is not faintly visible from the very start, as it was previously.
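One sine-based curve with this behavior (illustrative; the exact formula in my code may differ) stays near 0 and 1 at the ends of the morph and moves through most of the cross-dissolve around the middle. A sketch, assuming the morph parameter t runs from 0 to 1:

    import numpy as np

    def blend_weight(t):
        # Sine-based cross-dissolve weight: near 0 at the start, near 1 at the
        # end, with most of the dissolve happening around the middle.
        return 0.5 * (1.0 + np.sin(np.pi * (t - 0.5)))

    def blend(frame1, frame2, t):
        # Cross-dissolve two already-warped frames with the sine-based weight.
        w = blend_weight(t)
        return (1.0 - w) * frame1 + w * frame2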
Final Result
As a presentation of my algorithm, I assembled a number of images and created a morph video through all of them. These are the original images I used (many gathered from boners.com, a collection of sometimes-funny, sometimes-disturbing images):
And the final video can be found here!