CMU Advanced Perception Seminar, Spring 1999
Table of Contents
- Class Format
- What Should be in a Critique?
- Grading Policy
- Computer Vision Resources
- Overview of topics, by week
- Week 1. Introduction and Explanation
- Week 2. Edge Extraction
- Week 3. Region/Volume Segmentation
- Week 4. Active Contours
- Week 5. Object Recognition
- Week 6. Volumetric Registration
- Week 7. Projective Geometry
- Week 8. Symmetry and Perception
- Week 9. Stabilization And Mosaicing
- Week 10. Egomotion and Structure from Motion
- Week 11. New View Synthesis
- Week 12. Range Imaging
- Week 13. Auditory Sensing
Class Format
The Advanced Perception course is a graduate reading seminar, meeting once a week to discuss a
set of papers covering a specific topic in computer vision and perception. We will look at historically important papers in the field, as well as current papers from recent conferences and journals. By
reading a mixture of both types of papers, we will be able to trace the development of some of the
fundamental ideas that make up current-day research.
Each week, two papers on a particular topic will be assigned. After reading them, you must find a
third paper on your own that is relevant to the topic (for example, in Week 2 you will find a paper
on edge extraction, published in a conference proceedings or archival journal). Finally, you will
write a short critique/essay (3-4 pages) on the topic area based on the three papers you have read.
This essay will be handed in for grading. During class, each of the two assigned papers will be
presented by one of the students (one student per paper, assigned the week before). This is
expected to be a formal 20-minute presentation in front of the class, using transparencies. The
presentation will then evolve into a class discussion on the topic covered in the paper. The instructors are responsible for keeping the discussion in a fruitful vein and making sure all students get a
chance to participate. The instructors are also responsible for making sure that the important
points are touched upon during the discussion, which will sometimes mean asking questions of
the class, and for making sure that each paper is covered (which sometimes means cutting off discussion and moving on).
At the end of the class, we will go around the room asking each of you to cite the third paper you
have personally chosen for that week, very briefly describe it (1 minute), tell us why you picked
it (i.e. how does it relate to the topic area and the two assigned papers), and finally whether or not
you would recommend that paper for others to read.
What Should be in a Critique?
The critiques you write will provide a short summary and analysis of the technical papers you
have read each week. Critique writing is an important component of the class, and serves several
goals: to give you practice in technical writing, to concretely organize your ideas in preparation
for class discussion, and to develop the skills necessary to become a good conference/journal
paper referee. Furthermore, getting in the habit of writing critiques of the papers you have read
will help you do better research - a good critique provides a concise summary that you can refer to
later without having to dig out and read the original work, and can provide a written starting point
for the obligatory literature review section of your own papers/thesis. To help provide you with a
sense for what goes into a critique, see the handout `The Task of the Referee,' by Alan Jay Smith,
particularly the section entitled `Evaluating a Research Paper.'
We have found that it is helpful to us, when grading critiques, to have them all follow a consistent
format. We ask you to hand in critiques with roughly the following sections (in this order):
1. Reviewer: your name and the date
2. Citation: the title, author, year, and publication citation of the three papers you are reviewing
3. A one paragraph summary (abstract) of the topic area. Why is it important?
4. A short overview of each paper including a) key ideas, b) technical approaches and c) results.
5. Comparison of the papers, including strong points and weak points of each. How would you
rank each paper relative to the others?
6. Questions and issues
We will grade critiques on a three-level scale: check-minus, check, check-plus. Above average
resourcefulness, initiative, creativity and depth of analysis will get a check-plus. Missing any
required sections (1-6) or obvious lack of effort on any of them results in a check-minus.
Pay attention to your speling and grammar of English. :-)
Grading Policy
You will be graded on the following items:
1. Written Critiques (40%)
2. Oral Presentations (20%)
3. Class Participation (20%)
4. Take-Home Final (20%)
5. Extra Credit (10%)
Total: 110%
Written critiques form the highest-weighted category, as they represent the bulk of the work that
you will be performing (aside from reading the papers themselves). Each critique will be graded
based on your demonstration that you know what that week's papers are about and have carefully
considered their technical approaches and reported results. We are particularly interested in how
well you compare and contrast the three papers that you read that week.
Oral presentation refers to the formal presentation of a paper in front of the class. Depending on
class size, you will be giving roughly two or three oral paper presentations during the semester. To
make it more like a real conference presentation, your talk will be strictly timed to be 20 minutes
long. We suggest you carefully organize and prioritize what you want to say, and maybe even
practice it once with a watch.
Class participation is rather hard to judge objectively (but we are going to try). We highly encourage you to participate in class discussion, and indeed, this type of class will be a complete failure
if people don't speak up with their opinions. On the other hand, we don't wish to penalize folks
who aren't naturally talkative. We will try to ensure that even soft-spoken people get a chance to
air their opinions, and will attempt to grade based on the insightfulness of your comments, rather
than the frequency or volume.
There will be a take-home final exam. It will involve writing!
The extra credit category will reflect both objective evidence and subjective impressions we
receive that indicate you are genuinely putting in a lot of effort. Anything you do (of a professional nature, related to this class) that makes us like you better will increase your extra credit
score.
Computer Vision Resources
There are many places to go to look for computer vision papers, ranging from archival journals to
on-line web sites. Here is a list of our favorite sources of material:
Archival Journals
- International Journal of Computer Vision (IJCV)
- Computer Vision and Image Understanding (CVIU), formerly Computer Vision, Graphics and
Image Processing (CVGIP)
- IEEE Trans on Pattern Analysis and Machine Intelligence (PAMI)
- Image and Vision Computing (IVC)
- Pattern Recognition (PR)
Conference Proceedings
- International Conference on Computer Vision (ICCV)
- Computer Vision and Pattern Recognition (CVPR)
- European Conference on Computer Vision (ECCV)
- DARPA Image Understanding Workshop (IUW)
WWW Resources
Overview of Topics by Week (Selections subject to change)
Week 1. Introduction and Explanation
Introduction; explanation of class format and logistics. Instructors talk about computer vision
resources, and why particular papers were selected for this course. Discussion of how to write a
critique, give a presentation, and find relevant research papers.
Week 2. Feature Extraction I: Edge Extraction
(Reminder: read these two and also find a third related paper on your own.)
- E.C.Hildreth, `The Detection of Intensity Changes by Computer and Biological Vision Systems,'
Computer Vision, Graphics and Image Processing, Vol. 27, 1983, pp.1-27.
- J.F.Canny, `A Computational Approach to Edge Detection,' IEEE Trans. on Pattern Analysis
and Machine Intelligence, Vol.8(6), November 1986, pp.679-698.
Week 2 Third Papers (selected by the students):
- C. Harris and B. Buxton, `Low-level Edge Detection Using Genetic Programming: Performance,
Specificity, and Application to Real-World Signals,' University College London
Tech Report RN/97/34, June 1997. -- This paper describes how genetic programming can be used
to evolve a set of edge detectors specific to a training dataset. These detectors are shown
to outperform both theoretically optimal detectors and other evolved detectors.
- P. Perona and J. Malik, `Scale-Space and Edge Detection Using Anisotropic Diffusion,'
IEEE Trans. on PAMI, Vol.12(7), July 1990. -- This paper presents a global approach to edge
detection which formulates edge detection as a diffusion process and attempts to find
edges via global deformation of the image rather than local sliding-window operations.
Second summary: Instead of detecting edges locally, this paper approaches the problem
globally. It views convolution with a Gaussian as equivalent to solving the heat conduction/diffusion equation. The approach fixes many of the shortcomings of convolution-based and
Canny edge detectors; however, the computational cost is higher on sequential machines.
- Asada et al., `Edge and Depth from Focus,' IJCV, 26(2), 1998, pp. 153-163. -- Edges are
extracted by observing the blurring in an image when a series of de-focussing operations
is deliberately introduced.
- Y. Lu and R. C. Jain, `Reasoning about Edges in Scale Space,'
IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol.14(4), April 1992. -- RESS is a method of integrating
edges from multiple scales of the LoG edge operator using a knowledge base of the behavior of edges at different scales.
- D. Demigny and T. Kamle, `A Discrete Expression of Canny's Criteria for Step Edge Detector
Performances Evaluation,' IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol.19(11), November 1997, pp. 1199-1211. -- Since all filters are implemented in
the discrete domain, this paper proposes three criteria (similar to Canny's three criteria) to
directly optimize filters in the discrete domain; the paper also shows that optimizing the
three discrete-domain criteria yields better results than sampling the
optimized Canny filter.
- MIT AI lab memo: AI memo 773, April 1984.
- J.B. Burns, A.R.Hanson and E.M.Riseman, `Extracting Straight Lines,' IEEE Trans. on Pattern
Analysis and Machine Intelligence, Vol.8(4), July 1986, pp.425-455. -- This paper presents an approach for the extraction of straight lines in intensity images. It starts at the
level of lines directly, without going through the intermediate stage of first detecting local
edges. The authors argue that this overcomes the difficulties encountered in aggregation when
using local operators.
- A. A. Farag and E. J. Delp, `Edge Linking By Sequential Search,' Pattern Recognition, Vol.
28(5), 1995, pp. 611-633. -- Treating edge detection as a two-stage process (edge
enhancement followed by edge linking), the paper argues that more focus should be given to the edge linking
stage than Canny gave it in his detector. Farag and Delp use the Laplacian
of Gaussian operator for edge enhancement and A* (or Stack) search with a mathematically sound heuristic for edge linking.
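As a rough illustration of the diffusion view of edge detection summarized for the Perona and Malik paper above, here is a minimal sketch of edge-preserving smoothing on a 2-D grayscale array. It is not taken from the paper: the function name `perona_malik`, the parameters `n_iter`, `kappa` and `step`, the exponential conduction function, and the 4-neighbour discretization are all assumptions chosen for brevity.

```python
import numpy as np

def perona_malik(image, n_iter=20, kappa=0.1, step=0.2):
    """Edge-preserving smoothing by anisotropic diffusion (4-neighbour scheme).

    kappa sets the contrast above which a difference is treated as an edge;
    step should stay at or below 0.25 for this explicit scheme to be stable.
    """
    u = np.asarray(image, dtype=float).copy()
    for _ in range(n_iter):
        # Differences to the four neighbours; replicated borders give zero flux.
        padded = np.pad(u, 1, mode="edge")
        diffs = (
            padded[:-2, 1:-1] - u,   # north
            padded[2:, 1:-1] - u,    # south
            padded[1:-1, :-2] - u,   # west
            padded[1:-1, 2:] - u,    # east
        )
        # Conduction g = exp(-(d / kappa)^2): close to 1 in flat areas,
        # close to 0 across strong intensity changes, so edges are kept.
        flux = sum(np.exp(-(d / kappa) ** 2) * d for d in diffs)
        u += step * flux
    return u
```

With `kappa` well below the contrast of a true edge and above the noise level, flat regions are smoothed while the edge itself is left almost untouched; edges can then be read off the smoothed image with any simple gradient threshold.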
Week 3. Feature Extraction II: Region/Volume Segmentation
- T.Kapur, W.E.Grimson, W.Wells and R.Kikinis, `Segmentation of Brain Tissue from Magnetic Resonance Images,'
Medical Image Analysis, Vol.1(2), 1996, pp. 109-127.
- B.Maxwell and S.Shafer, `Physics-Based Segmentation of Complex Objects using Multiple
Hypotheses of Image Formation,' Computer Vision and Image Understanding, Vol.65(2),
Feb 1997, pp.269-295.
Week 3 Third Papers (selected by the students):
- V. Rehrmann and L. Priese, `Fast and Robust Segmentation of Natural Color Scenes,'
Proceedings of the Third Asian Conference on Computer Vision, Hong Kong, Jan 1998. -- This
paper describes the CSC (Color Structure Code) algorithm for performing real-time segmentation of color images. Images are represented with hexagonal connectivity using a
hierarchical tree structure. Regions are created by color similarity comparisons of local
elements, with provision for later splitting regions that prove to be dissimilar at a global
level of analysis.
- M.A. Gonzalez Ballester, A. Zisserman, and J.M. Brady, `Measurement of Brain Structures
based on Statistical and Geometrical 3D Segmentation,' MICCAI'98. To appear. -- This
paper presents a method for three-dimensional segmentation and measurement of volumetric data based on the combination of statistical and geometrical information. The shape
of complex three-dimensional structures, such as the cortex, is represented by combining a
discrete 3D simplex mesh with the construction of a smooth surface using triangular Gregory-Bezier patches. Confidence bounds are produced for all the measurements, thus
obtaining bounds on the position of the surface segmenting the image.
- T. Uchiyama and M. A. Arbib, `Color Image Segmentation Using Competitive Learning,'
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.16(12), Dec.1994,
pp.1197-1206. -- This paper deals with the problem of colour image segmentation; clusters of the same colour are identified using competitive learning, thereby producing the
least sum of squares solution.
- T. Leung and J. Malik, `Contour Continuity in Region-Based Image Segmentation,'
Fifth Euro. Conf. on Computer Vision, Freiburg, Germany, June 1998. -- The paper takes into
account contour continuity, in addition to intensity, color and texture, to determine the partitioning of an image. `Soft' contours are first detected using elongated filters
and the Hilbert transform, which yields an `orientation energy' measure. The orientation
energy is used as a basis for propagating contours. Afterward, the regions are segmented
using the normalized cut approach.
- B. Leroy, I.L. Herlin, L.D. Cohen, `Multi-Resolution Algorithms for Active Contour Models,'
Proceedings of the 12th International Conference on Analysis and Optimization of
Systems, Images, Wavelets and PDE'S, Rocquencourt (France), 1996. --
The paper attempts to speed up active contour models (balloons) by going multi-resolution in two separate ways. The first uses multi-resolution data; the second
incorporates multi-resolution into the model itself (by using elliptic Fourier harmonics).
Week 4. Feature Extraction III: Active Contours
- M.Kass, A.Witkin, D.Terzopoulos, `Snakes: Active Contour Models,' International Journal
of Computer Vision, Vol.1(4), January 1988, pp. 321-331.
- A.Pentland and S.Sclaroff, `Closed-Form Solutions for Physically Based Shape Modeling
and Recognition,' IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 13, no.
7, July 1991, pp. 715-729.
Week 4 Third Papers (selected by the students):
- F. Leymarie and M. Levine, `Tracking Deformable Objects in the Plane Using an Active Contour Model,'
IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, no. 6,
June 1993, pp. 617-634. -- This paper suggests improvements on the original snake active
contour model (Kass, Witkin, Terzopoulos): 1) a different terminating criterion to improve
convergence, 2) selection of bounds on parameters to prevent oscillation, and 3) initialization
using a sequence of hierarchical discrete correlations (Burt and Adelson Laplacian pyramid). Active contours, along with the proposed modifications, are used to track the movement of cells on microscope slides.
- S.C. Zhu, T.S. Lee, A.L. Yuille, `Region Competition: Unifying Snakes, Region Growing, Energy/Bayes/MDL for Multi-band Image Segmentation,'
Proceedings of the Fifth ICCV, pp. 416-425,
1995. -- The new region competition algorithm combines the best features of
snakes/balloons, region growing, and Bayes/MDL. It allows neighboring regions to compete
for the pixels along their boundaries. The likelihood of membership in a region is
determined using statistical properties.
- Michael Isard and Andrew Blake, `Contour Tracking by Stochastic Propagation of Conditional
Density,' Proc. European Conference on Computer Vision, vol. 1, pp. 343-356, Cambridge UK, 1996. -- The paper proposes a stochastic algorithm (the Condensation algorithm) for tracking curves in dense visual clutter. It uses `factored sampling', a
method previously applied to interpretation of static images, in which the distribution of
possible interpretations is represented by a randomly generated set of representatives. The
algorithm combines factored sampling with learned dynamical models to propagate an
entire probability distribution for object position and shape over time. The result is highly
robust real-time tracking of agile motion in clutter. A clearly written paper with a good
explanation of the proposed technique; it contains experimental results and a complexity
analysis. Not surprisingly, it won the best paper award.
- A. Hoover, D. Goldgof, K. W. Bowyer, `Extracting a Valid Boundary Representation from a
Segmented Range Image,' IEEE Trans. on Pattern Analysis and Machine Intelligence,
vol.17 no.9, September 1995, pp. 920-925. -- This paper addresses the problem of creating boundary representations (b-reps) of polyhedral shapes by using topological and geometric information, including a hypothetical representation of the non-visible
portion of the object.
Week 5. Object Recognition
- D.P.Huttenlocher and S.Ullman, `Recognizing Solid Objects by Alignment with an Image,'
Int'l Journal of Computer Vision, vol. 5(2), 1990, pp. 195-212.
- H.Murase and S.K.Nayar, `Visual Learning and Recognition of 3-D Objects from Appearance,'
Int'l Journal of Computer Vision, vol. 14, 1995, pp. 5-24.
Week 5 Third Papers (selected by the students):
- C.S. Chua and R. Jarvis, `Point Signatures: A New Representation for 3D Object Recognition,'
International Journal of Computer Vision, 25(1), 1997, pp. 63-85. -- A point signature is a 1D feature curve that describes the undulation of the 3D object surface local to a
point of interest; a collection of point signatures facilitates the recognition of 3D free-form objects.
- P. Viola, `Complex Feature Recognition: A Bayesian Approach for Learning to Recognize
Objects,' MIT AI Labs Tech Report 1591. -- This paper describes a Bayesian approach
for extracting complex object features that are less affected by illumination and pose
changes. Since each feature captures a greater area of a scene, the correspondence problem between model and image is reduced as well.
- C.F. Olson and D.P.Huttenlocher, `Automatic Target Recognition by Matching Oriented Edge
Pixels,' IEEE Trans. on Image Processing, 6(1):103-113, January 1997. -- The paper
defines oriented edge pixels by x, y, and delta (which is either the direction of the
gradient, the edge normal, or the tangent). A modified Hausdorff measure, which measures the
maximum distance and orientation difference of nearest points, is used to provide a closeness
measure. Only K of the pixels (not all) are matched, to account for occlusion and noise.
The 3-D models (and multiple models) are organized in a hierarchical way based on similarity (so if you have two similar models, you create a parent holding the intersection of
the models). Recognition is done by computing the Hausdorff distance between the
image and the models. Additionally, the probability of a false alarm is computed using a Markov process, both for the predicted false alarm rate and the observed false alarm rate.
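To make the idea of matching only K of the pixels concrete, here is a minimal sketch of a partial directed Hausdorff distance between two 2-D point sets. It is a simplification and not the measure from the Olson and Huttenlocher paper: orientation (delta) is ignored, distances are plain Euclidean, and the function name `partial_hausdorff` and its arguments are assumptions made for illustration.

```python
import numpy as np

def partial_hausdorff(model, image, k):
    """K-th smallest nearest-neighbour distance from model points to image points.

    model, image: arrays of shape (n, 2) and (m, 2) holding (x, y) positions.
    k: number of model points that must be matched (1 <= k <= n).
    """
    model = np.asarray(model, dtype=float)
    image = np.asarray(image, dtype=float)
    # Pairwise Euclidean distances, shape (n, m).
    dists = np.linalg.norm(model[:, None, :] - image[None, :, :], axis=2)
    # Distance from each model point to its closest image point.
    nearest = dists.min(axis=1)
    # Ranking instead of taking the maximum lets the n - k worst points
    # (occluded or noisy) be ignored.
    return np.sort(nearest)[k - 1]
```

With k equal to the number of model points this reduces to the ordinary directed Hausdorff distance; choosing a smaller k tolerates a fraction of model points that have no counterpart in the image.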
Week 6. Volumetric Registration
- R.Bajcsy and S.Kovacic, `Multiresolution Elastic Matching,' Computer Vision, Graphics and
Image Processing, Vol.46, 1989, pp.1-21.
- P.A.Viola and W.Wells, `Alignment by Maximization of Mutual Information,' International
Journal of Computer Vision, Vol.24(2), September 1997, pp. 137-154.
Week 7. Projective Geometry
- J.B.Burns, R.S.Weiss and E.M.Riseman, `The Non-Existence of General-Case View-Invariants,'
Geometrical Invariance in Computer Vision, ed. J. Mundy and A.Zisserman, MIT
Press, Cambridge, 1992, pp.120-131.
The following two papers will be treated as one, for the purposes of critiquing/presenting:
- H.C.Longuet-Higgins, `A Computer Algorithm for Reconstructing a Scene from Two Projections,'
Nature, vol 293, 1981, pp. 133-135.
- R.Hartley, `In Defense of the 8-point Algorithm,' IEEE Trans on Pattern Analysis and
Machine Intelligence, 19(6), June 1997, pp. 580-593.
Week 8. Symmetry and Perception
- F.Ulupinar and R.Nevatia, `Constraints for Interpretation of Line Drawings under Perspective
Projection,' CVGIP: Image Understanding, Vol. 53(1), 1991, pp.88-96.
The following two papers will be treated as one, for the purposes of critiquing/presenting:
- H.Zabrodsky, S.Peleg and D.Avnir, `Symmetry as a Continuous Feature,' IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol 17(12), 1995, pp.1154-1165.
- K.Kanatani, `Comments on "Symmetry as a Continuous Feature",'
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol 19(3), 1997, pp. 246-247.
Week 9. Stabilization And Mosaicing
- J.Bergen et al., `Hierarchical Model-Based Motion Estimation,' in Proceedings of European
Conference on Computer Vision, 1992, pp. 237-252.
- H.Shum and R.Szeliski, `Construction and Refinement of Panoramic Mosaics with Global
and Local Alignment,' International Conference on Computer Vision, Bombay, India,
Jan.1998, pp. 953-958.
Week 10. Egomotion and Structure from Motion
- C.Tomasi and T.Kanade, `Shape and Motion from Image Streams under Orthography: a Factorization Method,'
Int'l Journal of Computer Vision, Vol. 9(2), 1992, pp. 137-154.
- J.L.Barron, D.J.Fleet, and S.S.Beauchemin, `Performance of Optical Flow Techniques,' Int'l
Journal of Computer Vision, vol. 12, no. 1, Jan. 1994, pp. 43-77.
Week 11. New View Synthesis
- L.McMillan and G.Bishop, `Plenoptic Modeling: An Image-Based Rendering System,'
Proc. SIGGRAPH, 1995, pp.39-46.
- S.Gortler, R.Grzeszczuk, R.Szeliski and M.Cohen, `The Lumigraph,' Proc. SIGGRAPH,
1996, pp.43-54.
Week 12. Range Imaging
- A.Johnson and M.Hebert, `Surface Matching for Object Recognition in Complex Three-Dimensional Scenes,'
Image and Vision Computing, Vol.16, 1998, pp.635-651.
- P.Besl and N.McKay, `A Method for Registration of 3-D Shapes,' IEEE Trans on Pattern
Analysis and Machine Intelligence (PAMI), Vol. 14(2), 1992, pp.239-256.
Week 13. Auditory Sensing