Bill Freeman, Photographing Events over Time |
Abstract:
Artists and scientists have used photography to portray events
occurring over a range of timescales. I'll review some of the
hardware and computational techniques used to tell these stories over
time, focusing on the computational challenges involved in analyzing
events over longer timescales.
Bio:
Bill Freeman is a Professor of Computer Science at the Massachusetts
Institute of Technology, where he joined the faculty in 2001. His current
research interests include machine learning applied to computer vision
and graphics, and computational photography.
Dr. Freeman is active in the program and organizing committees of the
major computer vision, graphics, and machine learning conferences. He
was the program co-chair for the International Conference on Computer
Vision (ICCV) in 2005, and will be the program co-chair for Computer
Vision and Pattern Recognition (CVPR) in 2013.
From 1981 - 1987, he worked at Polaroid, developing image processing
algorithms for electronic cameras and printers. In 1987-88, he was a
Foreign Expert at the Taiyuan University of Technology, P. R. of
China. From 1992 - 2001 he worked at Mitsubishi Electric Research
Labs (MERL), in Cambridge, MA, most recently as Sr. Research Scientist
and Associate Director. He holds 30 patents and is an IEEE Fellow. A
hobby is flying cameras on kites.
|
Peter Belhumeur, Lessons from Photographing and Identifying the World's Plant Species |
Abstract:
Columbia University, the University of Maryland, and the Smithsonian Institution are working on visual recognition software
to help identify species from photographs. I will discuss our work on developing Leafsnap -- the first in a series of
electronic field guides. As part of this work, we have completed photographing close to one third of the world's plant
species and have begun capturing beautiful high-resolution images of live specimens. Our work has led us in many new
research directions in different domains such as human faces and, most recently, architecture, including the adoption of
centuries-old techniques from taxonomy for the process of labeling images with visual attributes. In particular, I will show
that it is possible to automatically recognize a wide range of visual attributes in images and use them in numerous
applications of computational photography.
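Shape cues along the leaf outline are central to identification systems of this kind. As a rough illustration only (this is not the actual Leafsnap descriptor, and every name below is hypothetical), a histogram of turning angles along a closed contour can serve as a toy shape feature:

```python
import numpy as np

def curvature_histogram(contour, bins=6):
    """Toy shape descriptor for a closed 2D contour (N x 2 array):
    a histogram of turning angles at each vertex. Loosely in the
    spirit of curvature-based leaf features, but purely illustrative."""
    # Edge vectors of the closed polygon (wrap back to the start).
    d = np.diff(np.vstack([contour, contour[:1]]), axis=0)
    angles = np.arctan2(d[:, 1], d[:, 0])
    # Turning angle at each vertex, wrapped into (-pi, pi].
    turn = np.diff(np.append(angles, angles[0]))
    turn = (turn + np.pi) % (2 * np.pi) - np.pi
    hist, _ = np.histogram(turn, bins=bins, range=(-np.pi, np.pi),
                           density=True)
    return hist

# A square turns by +pi/2 at every corner, so one bin carries all mass.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
h = curvature_histogram(square, bins=6)
```

A real pipeline would compute such statistics at multiple scales and feed them to a classifier; the point here is only that contour geometry alone already separates many shapes.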
Bio:
Peter N. Belhumeur is currently a Professor in the Department of Computer Science at Columbia University and the Director of the Laboratory for the Study of Visual
Appearance (VAP LAB). He received a Sc.B. in Information Sciences from Brown University in 1985. He received his Ph.D. in Engineering Sciences from Harvard
University under the direction of David Mumford in 1993. He was a postdoctoral fellow at the University of Cambridge's Isaac Newton Institute for Mathematical
Sciences in 1994. He was appointed Assistant, Associate, and Full Professor of Electrical Engineering at Yale University in 1994, 1998, and 2001, respectively. He joined
Columbia University as a Professor of Computer Science in 2002. His research focus lies somewhere in the mix of computer vision, computer graphics, and computational
photography. He is a recipient of the Presidential Early Career Award for Scientists and Engineers (PECASE) and the National Science Foundation Career Award. He won
both the Siemens Best Paper Award at the IEEE Conference on Computer Vision and Pattern Recognition and the Olympus Prize at the European Conference on Computer
Vision.
|
Luc Vincent, Google Street View: Challenges of Photography at Large Scale
|
Abstract:
This presentation will cover some of the unique challenges posed by systematic, large-scale photography. I will discuss the evolution of the
Google Street View project from its early days to today, focusing on the unique cameras and imaging platforms the team developed and
fine-tuned over the years.
Bio:
Luc Vincent joined Google in 2004. He is an Engineering Director responsible for projects generally focused on computer vision and large-scale data
collection. They include Google Street View (a project he started in his "20% time"), the recently unveiled Google Art Project, Google's Optical
Character Recognition (OCR) technology, and several Geo-related efforts around oblique imagery and 3D buildings.
Before Google, Luc was Chief Scientist, then VP of Document Imaging at LizardTech, a developer of advanced image compression software. Prior to
this, he led an R&D team at the prestigious Xerox Palo Alto Research Center (PARC). He was also Director of Software Development at Scansoft (now
Nuance) and held various technical management and individual contributor positions at Xerox Corporation.
Luc has over 70 publications in the area of vision, image analysis and document understanding. He recently served as an Associate Editor for the
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) and for the Journal of Electronic Imaging. He has also served as chair for
SPIE's conferences on Document Recognition, the International Symposium on Mathematical Morphology (ISMM), and in the program committee of
numerous conferences and workshops.
Luc earned his B.S. from Ecole Polytechnique, M.S. in Computer Science from University of Paris XI, and Ph.D. in Mathematical Morphology from the
Ecole des Mines de Paris in 1990.
|
Ramesh Raskar,
Looking Around Corners: New Opportunities in Femto-Photography |
Abstract:
Can we look around corners beyond the line of sight? Our goal is to exploit the finite speed of light to improve image
capture and scene understanding. New theoretical analysis coupled with emerging ultra-high-speed imaging techniques can
lead to a new source of computational visual perception. We are developing the theoretical foundation for sensing and
reasoning using Femto-photography and transient light transport, and experimenting with scenarios in which transient
reasoning exposes scene properties that are beyond the reach of traditional computer vision. (Joint work with a large team,
see http://raskar.info/femto)
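To give a sense of the scales that make "femto" timing matter (the numbers below are illustrative, not taken from the talk): light covers roughly 0.3 mm per picosecond, so a detector's timing resolution directly bounds how finely hidden geometry can be resolved.

```python
# Back-of-envelope sketch: path-length resolution achievable at a
# given detector timing resolution. Illustrative numbers only.
C = 299_792_458.0  # speed of light in vacuum, m/s

def path_length_resolution(dt_seconds):
    """Smallest round-trip path-length difference distinguishable
    at timing resolution dt_seconds."""
    return C * dt_seconds

# A 2 ps detector resolves ~0.6 mm of round-trip path,
# i.e. ~0.3 mm of one-way range.
res = path_length_resolution(2e-12)
```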
Bio:
Ramesh Raskar joined the Media Lab from Mitsubishi Electric Research Laboratories in 2008 as head of the Lab's Camera Culture research group. His research interests
span the fields of computational photography, inverse problems in imaging and human-computer interaction. Recent inventions include transient imaging to look around
corners (Femto-photography), a next-generation CAT-scan machine, imperceptible markers for motion capture (Prakash), long-distance barcodes (Bokode), touch+hover 3D
interaction displays (BiDi screen), low-cost eye care devices (Netra) and new theoretical models to augment light fields (ALF) to represent wave phenomena.
He is a recipient of the TR100 award from Technology Review (2004), the Global Indus Technovator Award as one of the top 20 Indian technology innovators worldwide (2003), an Alfred P. Sloan
Research Fellowship (2009), and a DARPA Young Faculty Award (2010). He holds over 40 US patents and has received four Mitsubishi Electric Invention Awards. He is
currently co-authoring a book on Computational Photography. http://raskar.info
|
Markus Testorf,
Phase-Space Tools for Computational Imaging and Photography |
Abstract:
Virtually all recent developments in imaging science employ concepts closely related to the properties of optical phase
spaces. This includes among other topics superresolution imaging, multi-aperture and synthetic aperture imaging, feature
specific imaging, compressive sensing, wavefront sensing and computational photography. In this context "phase space"
refers to a configuration space, where optical signals are specified in terms of generalized ray coordinates or Fourier
reciprocal variables. For optical rays this corresponds to certain lightfield representations, while for optical waves the
corresponding phase space is defined by joint space-spatial frequency transforms such as the Wigner distribution function.
The term "phase space" should be reserved for physically meaningful signal representations, which not only encode the
signal, but also information about its dynamics, i.e., the physics of light propagation. Using primarily schematic
depictions of phase space, the tutorial will introduce phase-space optics by focusing on optical sensing, imaging and
photography. Starting with the phase space of geometrical optics, the discussion will develop a phase-space picture of
paraxial optics, plenoptic imaging and auto-stereoscopic 3D display technology. The phase space of coherent optical waves
will be introduced as a generalization of geometrical optics for interfering signals. This will lead the discussion to
wavefront sensing, imaging systems with an extended depth of field, as well as superresolution imaging methods. The phase space
of partially coherent signals will highlight limitations of geometrical optics and the need for a phase space of physical
optics. A brief discussion of discrete representations of phase space will connect the analysis of optical hardware with
signal processing methods.
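As a concrete illustration of the joint space-spatial-frequency picture above, here is a minimal sketch of a discrete Wigner distribution for a 1D signal. The cyclic, full-sample lag convention is chosen purely for brevity; actual phase-space analyses are more careful about sampling.

```python
import numpy as np

def wigner(f):
    """Toy discrete (cyclic) Wigner distribution of a 1D signal f:
    form the product f[n+m] * conj(f[n-m]) and Fourier transform it
    over the lag index m, yielding a joint position-frequency map."""
    N = len(f)
    prod = np.empty((N, N), dtype=complex)
    for n in range(N):
        for m in range(N):
            prod[n, m] = f[(n + m) % N] * np.conj(f[(n - m) % N])
    return np.real(np.fft.fft(prod, axis=1))

# A plane wave occupies a single spatial frequency, so its phase space
# concentrates on one line. (With this full-sample lag convention the
# peak lands at bin 2*4 = 8, twice the signal frequency; half-sample
# lags would place it at 4.)
N = 64
x = np.arange(N)
W = wigner(np.exp(2j * np.pi * 4 * x / N))
```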
Bio:
Markus Testorf holds a Ph.D. in physics from the University of Erlangen-Nuremberg, in Germany. Since his graduation he
has worked at the National Institute for Astrophysics, Optics, and Electronics, Mexico, the University of Hagen, Germany,
the University of Massachusetts-Lowell, and Dartmouth College, U.S.A. At Dartmouth, he has been engaged in optics-related
research and teaching since 2003. His research interests cover a wide variety of topics related to information optics,
including the design of diffractive and nano optics, as well as computational imaging methods. Throughout his professional
career he has employed phase-space optics as a tool for research and education. To his own amazement, phase-space optics,
long considered merely an esoteric way of thinking about optics, is currently becoming an integral part of exciting new
imaging science. Markus Testorf is a Fellow of the Optical Society of America and a member of the European Optical Society
(EOS) and the German Optical Society (DGaO).
|
Paolo Favaro,
Portable Light Field Imaging: Extended Depth of Field, Aliasing and Superresolution |
Abstract:
Portable light field cameras have demonstrated capabilities beyond
conventional cameras. In a single snapshot, they enable digital image
refocusing, i.e., the ability to change the camera focus after taking
the snapshot, and 3D reconstruction. We show that they also achieve
a larger depth of field while maintaining the ability to reconstruct
detail at high resolution. More interestingly, we show that their depth
of field is essentially inverted compared to regular cameras.
Crucial to the success of the light field camera is the way it samples
the light field, trading off spatial vs. angular resolution, and how
aliasing affects the light field. We present a novel algorithm that
estimates a full resolution sharp image and a full resolution depth map
from a single input light field image. The algorithm is formulated in a
variational framework and it is based on novel image priors designed
for light field images. We demonstrate the algorithm on synthetic and
real images captured with our own light field camera, and show that it
can outperform other computational camera systems.
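Digital refocusing of the kind described above is often explained as shift-and-add over the angular views. The following numpy sketch (integer shifts only, a toy stand-in for the variational method of the talk) makes that idea concrete:

```python
import numpy as np

def refocus(lf, shift):
    """Shift-and-add refocusing of a 4D light field lf[u, v, y, x]:
    translate each angular view in proportion to its offset from the
    central view, then average. Real pipelines use subpixel
    interpolation; np.roll keeps this sketch short."""
    U, V, H, W = lf.shape
    out = np.zeros((H, W))
    for u in range(U):
        for v in range(V):
            dy = int(round(shift * (u - U // 2)))
            dx = int(round(shift * (v - V // 2)))
            out += np.roll(lf[u, v], (dy, dx), axis=(0, 1))
    return out / (U * V)

# Toy light field of a single point with one pixel of disparity per
# view: refocusing at the matching shift realigns all nine views.
lf = np.zeros((3, 3, 9, 9))
for u in range(3):
    for v in range(3):
        lf[u, v, 4 + (u - 1), 4 + (v - 1)] = 1.0
sharp = refocus(lf, -1)  # all views align: sharp[4, 4] == 1.0
```

Choosing the shift selects the plane in focus; points at other depths stay misaligned across views and average into blur, which is the refocusing effect the abstract refers to.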
Bio:
Paolo Favaro received the D.Ing. degree from the Università di Padova,
Italy, in 1999, and the M.Sc. and Ph.D. degrees in electrical
engineering from Washington University in St. Louis in 2002 and 2003,
respectively. He was a postdoctoral researcher in the computer science
department of the University of California, Los Angeles and
subsequently at the University of Cambridge, UK. Dr. Favaro is now a lecturer
(assistant professor) at Heriot-Watt University and an Honorary Fellow at
the University of Edinburgh, UK. His research interests are in
computer vision, computational photography, machine learning, signal
and image processing, estimation theory, inverse problems and variational
techniques. He is also a member of the IEEE.
|
Kari Pulli,
FCam -- An architecture and API for computational cameras
|
Abstract:
FCam (short for Frankencamera) is an architecture for computational cameras and
an API that implements the architecture. Our motivation is flexible programming
of cameras, especially camera phones. We give an overview of the API and
discuss two implementations, one on the N900, a Linux-based smartphone from Nokia,
and "F2", a research camera built at Stanford University that allows
experimentation with different sensors and optics. We also describe several
applications developed on top of FCam, and FCam use at universities in research
and teaching, so far in North America, South America, Europe, and Asia.
Bio:
Kari joined NVIDIA Research in April 2011 to work on imaging and other
mobile applications. Previously he was at Nokia from 1999 (1999-2004 in
Oulu, Finland; 2004-06 a visiting scientist at MIT CSAIL; 2006-11 at Nokia
Research Center Palo Alto). He was Nokia's 6th Nokia Fellow and a Member of
CEO's Technology Council. Kari worked extensively on standardizing mobile graphics
APIs at Khronos (OpenGL ES, OpenVG) and JCP (M3G), and co-authored a book on
Mobile 3D Graphics.
In Palo Alto he started a research group working on mobile augmented reality
and computational photography (including the FCam architecture for
computational cameras).
Kari has a B.Sc. in Comp. Sci. from Univ. of Minnesota, M.Sc. and Lic. Tech.
in Comp. Eng. from Univ. of Oulu (Finland), PhD from Univ. of Washington
(Seattle), MBA from Univ. of Oulu, and worked as a research associate at
Stanford as the technical lead of the Digital Michelangelo
Project.
|