In the initial part of my thesis research, I developed complete systems to recognize 3D polyhedral models in both range images and intensity images. I introduced the use of non-parametric statistical distributions to represent the constraints on feature appearance used for selecting matches. These distributions can represent a wide range of phenomena and provide strong cues about the utility of each feature for recognition. The distributions were generated by randomly sampling synthetic views of the object. Because the synthetic images were processed in the same way as a real input image, the resulting constraints improved on those obtained with previous analytic techniques. The statistical constraints led to a natural formulation of the match selection problem as maximum a posteriori estimation, which resulted in a dramatic pruning of the search space.
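The idea of sampled appearance distributions and maximum a posteriori match selection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the thesis implementation: the only feature attribute is the projected length of a model edge, views are orthographic and drawn uniformly from the view sphere, the distribution is a smoothed histogram, and the names (`appearance_histogram`, `map_match`, `short_edge`, `long_edge`) are hypothetical. In particular, the sketch projects edges geometrically and skips the rendering and feature extraction steps described above.

```python
import math
import random


def random_view_direction(rng):
    """Draw a viewing direction uniformly from the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def projected_length(p, q, view):
    """Length of segment pq under orthographic projection along `view`."""
    d = [b - a for a, b in zip(p, q)]
    along = sum(c * v for c, v in zip(d, view))
    return math.sqrt(max(sum(c * c for c in d) - along * along, 0.0))


def appearance_histogram(p, q, rng, n_views=5000, n_bins=12, max_len=3.0):
    """Non-parametric estimate of P(projected length | edge) from random views."""
    width = max_len / n_bins
    counts = [0] * n_bins
    for _ in range(n_views):
        length = projected_length(p, q, random_view_direction(rng))
        counts[min(int(length / width), n_bins - 1)] += 1
    # Add-one smoothing keeps every bin's probability strictly positive.
    probs = [(c + 1) / (n_views + n_bins) for c in counts]
    return probs, width


def likelihood(hist, observed):
    """Look up the histogram probability of an observed length."""
    probs, width = hist
    return probs[min(int(observed / width), len(probs) - 1)]


def map_match(observed, hists, priors):
    """Pick the model edge maximising prior * likelihood for the observation."""
    return max(hists, key=lambda name: priors[name] * likelihood(hists[name], observed))


# Hypothetical model with two edges of length 1 and 3 (arbitrary units).
rng = random.Random(0)
hists = {
    "short_edge": appearance_histogram((0, 0, 0), (1, 0, 0), rng),
    "long_edge": appearance_histogram((0, 0, 0), (3, 0, 0), rng),
}
priors = {"short_edge": 0.5, "long_edge": 0.5}

print(map_match(0.9, hists, priors))  # observation consistent with either edge
print(map_match(2.5, hists, priors))  # longer than the short edge can ever appear
```

In this toy setting an observed length of 2.5 is essentially ruled out for the short edge, since its histogram holds only smoothing mass there. That is the sense in which such distributions can prune candidate matches.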
After much work, it became apparent that reliable verification is possible only with accurate position estimation. Because the position estimates provided by alignment have limited accuracy, further refinement of the position via a local search (known as localization) was required. I therefore developed a localization algorithm using techniques from robust statistics. In experiments, this algorithm demonstrated significantly better convergence than previous techniques. The localization and verification algorithms operate directly on the range data points and edge points and thus do not depend on any high-level features. The restriction to polyhedral models is therefore isolated in the matching procedure. Because a variety of more general features (such as distinguished points) are available to augment line segments and planar surfaces for matching, the polyhedral restriction can be removed. In fact, the current work handles occluding contours and arbitrary curved surfaces.
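To show why a robust formulation helps localization, the following sketch compares a least-squares estimate with a robust one on a deliberately simplified problem. The text above does not specify which robust technique was used, so the choices here are assumptions for illustration only: a Lorentzian (Cauchy) M-estimator solved by iteratively reweighted least squares, a 2D translation in place of a full 3D pose, and known point correspondences. All function and variable names are hypothetical.

```python
import math


def least_squares_translation(model_pts, data_pts):
    """Ordinary least squares: the mean of the point-wise displacements."""
    n = len(model_pts)
    tx = sum(d[0] - m[0] for m, d in zip(model_pts, data_pts)) / n
    ty = sum(d[1] - m[1] for m, d in zip(model_pts, data_pts)) / n
    return tx, ty


def lorentzian_weight(residual, scale):
    """Reweighting factor for the Lorentzian loss: large residuals get small weight."""
    return 1.0 / (1.0 + (residual / scale) ** 2)


def robust_translation(model_pts, data_pts, t0=(0.0, 0.0), scale=1.0, iters=20):
    """Refine translation t so that model + t fits the data, down-weighting outliers."""
    tx, ty = t0
    for _ in range(iters):
        wsum = sx = sy = 0.0
        for (mx, my), (dx, dy) in zip(model_pts, data_pts):
            # Residual of this correspondence under the current estimate.
            r = math.hypot(dx - (mx + tx), dy - (my + ty))
            w = lorentzian_weight(r, scale)
            wsum += w
            sx += w * (dx - mx)
            sy += w * (dy - my)
        # Weighted least-squares update of the translation.
        tx, ty = sx / wsum, sy / wsum
    return tx, ty


# Hypothetical data: true translation (2, -1), with one grossly wrong data point.
model_pts = [(float(i), float(i * i % 5)) for i in range(10)]
data_pts = [(x + 2.0, y - 1.0) for x, y in model_pts]
data_pts[-1] = (data_pts[-1][0] + 50.0, data_pts[-1][1] + 50.0)

ls_t = least_squares_translation(model_pts, data_pts)
robust_t = robust_translation(model_pts, data_pts)
print("least squares:", ls_t)
print("robust:       ", robust_t)
```

The single outlier pulls the least-squares estimate far from the true translation, while the reweighted estimate stays close to it. A practical localization algorithm would also have to re-establish correspondences at each iteration and estimate rotation, which this sketch omits.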
A common criticism of feature-based recognition is that data-driven feature extraction is not reliable enough for robust recognition; however, my results show that even imperfect feature extraction can be used effectively for recognition. The real problem is the use of unrealistic assumptions about feature extraction performance. In my system, features are used only as cues to generate initial hypotheses, which are then fed to the data-driven localization scheme. My only assumption is that a feature's appearance in the input image follows a distribution similar to its appearance distribution over the images used to build the model.