I am currently a Postdoctoral Fellow in the NavLab group at the Robotics Institute of Carnegie Mellon University.
I am interested in computer vision, robotics, robot navigation, object tracking, detection, segmentation, and machine learning.
Vision-based Counting of Pedestrians and Cyclists
The goal of this project is to count pedestrians and cyclists using data from stationary cameras installed near pedestrian and bike lanes in the city. For this purpose, I developed an object counting system that first detects and distinguishes between bikes and pedestrians in a cascaded way, and then tracks the classified subjects. It combines color, geometric shape priors, moving direction, and background models to improve the output of conventional object detectors.
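To illustrate the detect-then-track-then-count idea, here is a minimal sketch (not the deployed system) of a nearest-centroid tracker that counts distinct tracked subjects; the detection coordinates and distance threshold below are hypothetical.

```python
import numpy as np

class CentroidTracker:
    """Minimal nearest-centroid tracker: each detection is assigned to the
    closest existing track (within max_dist) or starts a new track."""
    def __init__(self, max_dist=50.0):
        self.max_dist = max_dist
        self.tracks = {}   # track_id -> last known centroid
        self.next_id = 0   # also doubles as the running object count

    def update(self, centroids):
        assigned = {}
        for c in centroids:
            c = np.asarray(c, dtype=float)
            best_id, best_d = None, self.max_dist
            for tid, prev in self.tracks.items():
                d = np.linalg.norm(c - prev)
                if d < best_d and tid not in assigned:
                    best_id, best_d = tid, d
            if best_id is None:           # no nearby track: new subject
                best_id = self.next_id
                self.next_id += 1
            assigned[best_id] = c
        self.tracks = assigned
        return assigned

# Four frames of (hypothetical) detector output: one subject moving slowly,
# a second appearing in frame 3.
tracker = CentroidTracker()
frames = [[(10, 10)], [(14, 12)], [(18, 14), (200, 50)], [(204, 52)]]
for dets in frames:
    tracker.update(dets)
print(tracker.next_id)  # -> 2 distinct subjects counted
```

A production counter would of course add track termination, re-identification, and the cascaded bike/pedestrian classifier described above.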
Obstacle and Road Detection on Moving Platforms
In this project, I have been investigating methods for fusing a Geographical Information System (GIS) database with camera data to detect obstacles and the road. The system must run robustly and in real time on moving platforms such as cars and smartphones. External GIS knowledge is retrieved from OpenStreetMap as 2D bird's-eye-view geometry. This low-level map information defines semantic labels of the scene structure, such as roads and buildings. Projecting this knowledge into the image space reduces the search space and also increases detection accuracy. Preliminary results of the algorithm are promising on several datasets.
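The projection step can be sketched as follows, assuming the ground plane maps to the image through a planar homography; the matrix `H` below is a made-up placeholder for what camera calibration and vehicle pose would actually provide.

```python
import numpy as np

# Hypothetical ground-to-image homography (in a real system this comes
# from camera intrinsics/extrinsics and the vehicle's estimated pose).
H = np.array([[1.0, 0.2,   320.0],
              [0.0, 0.8,   240.0],
              [0.0, 0.001,   1.0]])

def project_map_points(H, pts_ground):
    """Project 2D bird's-eye map points (metric ground-plane coordinates)
    into image pixel coordinates via a planar homography."""
    pts = np.hstack([pts_ground, np.ones((len(pts_ground), 1))])
    proj = pts @ H.T                      # homogeneous image coordinates
    return proj[:, :2] / proj[:, 2:3]     # perspective divide

# e.g. two vertices of a road polygon fetched from OpenStreetMap
road_polygon = np.array([[0.0, 0.0], [10.0, 5.0]])
pixels = project_map_points(H, road_polygon)
print(pixels.shape)  # -> (2, 2)
```

The projected polygon can then be rasterized into a prior mask that restricts where the road/obstacle detectors search.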
Autonomous Driving and International Intelligent Ground Vehicle Competition
Our unmanned ground vehicle, Warthog, won 1st place in 2009 and 2011 and 3rd place in 2010 in the Autonomous Challenge of the International Intelligent Ground Vehicle Competition, competing against roughly 50 robots. I worked on the robot's line detection, line tracking, and obstacle detection modules.
Tracking Complex Objects Integrating Visual and Structural Features
A point-wise tracking method is developed that fuses structural and visual cues obtained from depth and color images. The point-wise features of the object are learned by a Random Forest (RF) classifier. The displacement of the object over time is estimated by SIFT keypoint matching. The confidence score of a point being inside the object is obtained from the RF classifier, and these confidence scores are fed into a graph cut framework to determine the final borders of the object.
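The displacement-estimation step can be sketched as follows: given keypoint correspondences (e.g. from SIFT matching), a per-axis median yields a displacement estimate that is robust to outlier matches. The coordinates below are made up for illustration.

```python
import numpy as np

def estimate_displacement(pts_prev, pts_curr):
    """Robust frame-to-frame object displacement from matched keypoint
    pairs: the per-axis median suppresses outlier matches that landed
    on the background."""
    d = np.asarray(pts_curr, float) - np.asarray(pts_prev, float)
    return np.median(d, axis=0)

prev = [(10, 10), (20, 15), (30, 40), (5, 5)]
curr = [(13, 12), (23, 17), (33, 42), (90, 80)]  # last match is an outlier
dx, dy = estimate_displacement(prev, curr)
print(dx, dy)  # -> 3.0 2.0
```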
Automatic Low Dimensional Human Shape Refinement
A method is developed that refines a given low-dimensional human shape, such as a bounding box, to obtain a pixel-wise mask of the human. It fuses features obtained from the color and depth images of a Kinect sensor. A pixel-wise human descriptor is developed that incorporates the shape of the human, geodesic distance, and local surface normal information. The descriptors are trained in a random forest classifier framework. Low-order observations, such as color discontinuity and the shape confidence retrieved from the random forest classifier, and high-order observations, such as a pre-determined ground plane, are jointly combined in a multi-layer graph cut framework. The final detailed pixel-wise human mask is obtained by performing the graph cut on this multi-layer graph.
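A toy sketch of the confidence-scoring idea, reducing the per-pixel descriptor to two made-up features (intensity and depth) rather than the full shape/geodesic/normal descriptor; the training data is synthetic:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic per-pixel samples: "person" pixels are bright and near,
# "background" pixels are dark and far (a stand-in for the real descriptor).
rng = np.random.default_rng(0)
n = 400
person = np.column_stack([rng.normal(200, 10, n), rng.normal(2.0, 0.1, n)])
backgr = np.column_stack([rng.normal(60, 10, n), rng.normal(4.0, 0.2, n)])
X = np.vstack([person, backgr])
y = np.array([1] * n + [0] * n)

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Per-pixel foreground confidence; in the full method these scores feed
# the unary terms of the multi-layer graph cut.
conf = rf.predict_proba([[195.0, 2.1], [55.0, 3.9]])[:, 1]
print(conf.round(2))
```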
Multimodal Human Detection
An accurate and computationally fast multi-modal human detector is developed. This 1D+2D detector fuses 1D range scan and 2D image information via an effective geometric descriptor and a silhouette-based visual representation within a radial basis function (RBF) kernel support vector machine learning framework. Unlike existing approaches, the proposed 1D+2D detector does not make restrictive assumptions about the range scan positions, so it is applicable to a wide range of real-life detection tasks. Extensive experiments demonstrate that the 1D+2D detector works robustly under challenging imaging conditions and achieves several orders of magnitude performance improvement while drastically reducing the computational load.
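A minimal sketch of the fusion-and-classification idea: concatenate a range-scan descriptor and an image descriptor into one feature vector and classify with an RBF-kernel SVM. The feature dimensions and data below are synthetic stand-ins for the real descriptors.

```python
import numpy as np
from sklearn.svm import SVC

# Toy fusion: a 3-D "geometric" descriptor from the 1D scan concatenated
# with a 5-D "silhouette" descriptor from the 2D image.
rng = np.random.default_rng(1)
n = 200
humans = np.hstack([rng.normal(1.0, 0.2, (n, 3)), rng.normal(0.8, 0.2, (n, 5))])
clutter = np.hstack([rng.normal(-1.0, 0.2, (n, 3)), rng.normal(-0.8, 0.2, (n, 5))])
X = np.vstack([humans, clutter])
y = np.array([1] * n + [0] * n)

svm = SVC(kernel="rbf", gamma="scale").fit(X, y)
pred = svm.predict(np.vstack([humans[:1] + 0.05, clutter[:1] - 0.05]))
print(pred)  # -> [1 0]
```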
Deformable Object Segmentation and Tracking via Graph Cuts
The purpose of this work is to obtain a refined segmentation of an object given a coarse initial segmentation. One line of investigation modifies the standard graph cut method by incorporating color and shape distance terms, adaptively weighted at run time to favor the most informative cue under the current visual conditions. Furthermore, the single-frame refinement method is extended to serve as the basis of a tracker that works for a variety of object types with complex, deformable shapes.
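One plausible way to compute such run-time weights, using foreground/background histogram separability as the informativeness measure (an illustrative choice, not necessarily the paper's exact rule):

```python
import numpy as np

def adaptive_cue_weights(fg_hist, bg_hist):
    """Weight the color term by how separable the foreground and background
    color histograms are; give the remainder to the shape-distance term."""
    fg = fg_hist / fg_hist.sum()
    bg = bg_hist / bg_hist.sum()
    # Bhattacharyya coefficient: 1 = identical distributions, 0 = disjoint
    overlap = np.sum(np.sqrt(fg * bg))
    w_color = 1.0 - overlap
    return w_color, 1.0 - w_color

# Very distinct object/background colors -> lean heavily on the color cue
w_c, w_s = adaptive_cue_weights(np.array([9.0, 1.0, 0.0]),
                                np.array([0.0, 1.0, 9.0]))
print(round(w_c, 2))  # -> 0.9
```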
Dense Stereo for Omnidirectional Cameras
The goal of this work is to detect obstacles, both positive and negative, through 3D reconstruction, which can be used as a cue for robot navigation. We mounted stereo fisheye cameras on a ground vehicle (robot). Stereo is used to recover depth and thus to perform scene reconstruction for obstacle detection. Fisheye cameras provide much wider views than regular lenses: the robot can not only "see" what is in front of it but also what is behind it, which is extremely useful when the robot is backing up.
The stereo cameras are calibrated with the OCamCalib Toolbox, and the extrinsic parameters are refined using Levenberg-Marquardt nonlinear least squares minimization. We rectify the relevant portion of each omnidirectional image into a virtual perspective image such that epipolar lines coincide with image rows. Semiglobal block matching (SGBM) is then applied to the stereo pairs to generate disparity images, from which we reproject pixels to 3D for scene reconstruction.
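The core matching idea can be sketched as a brute-force SAD block matcher for a single pixel on a synthetic rectified pair (the real pipeline uses OpenCV's SGBM, which adds smoothness constraints across the whole image):

```python
import numpy as np

def block_match_disparity(left, right, x, y, max_disp=16, half=3):
    """Brute-force SAD block matching for one pixel of a rectified pair:
    search along the same row of the right image, since rectification
    makes epipolar lines coincide with image rows."""
    patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp):
        cand = right[y - half:y + half + 1,
                     x - d - half:x - d + half + 1].astype(float)
        cost = np.abs(patch - cand).sum()   # sum of absolute differences
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Synthetic textured pair: the right image is the left shifted by 8 px,
# so the true disparity is 8.
rng = np.random.default_rng(0)
left = rng.integers(0, 255, (60, 80)).astype(np.uint8)
right = np.roll(left, -8, axis=1)
print(block_match_disparity(left, right, x=40, y=30))  # -> 8
```

With disparity d, focal length f, and baseline b of the virtual perspective rig, depth follows as z = f·b/d for the 3D reprojection step.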
Trail Segmentation and Tracking
This project develops a framework for segmenting continuous trails in single images and tracking them over image sequences for autonomous robot navigation. Proceeding from the shape assumption that the trail region is approximately triangular under perspective, an objective function is formulated in terms of trail appearance, which drives an efficient multi-scale particle filter. A hypothesized trail triangle's appearance likelihood is based on a robust measure of color and brightness contrast with, and symmetry between, the flanking triangular regions. The absolute trail likelihood correlates well with the confidence that a trail is visible at all; the system uses this to switch between appearance cue sets in order to maximize accuracy under changing visual conditions.
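A toy version of the contrast component of the appearance likelihood, scoring a hypothesized trail triangle against its flanking regions on a synthetic image (the full system also uses symmetry terms and multiple cue sets):

```python
import numpy as np

def triangle_score(img, apex_x, base_left, base_right):
    """Brightness contrast between a hypothesized trail triangle (apex on
    the top row, base on the bottom row) and the flanking regions."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    t = ys / (h - 1)                       # 0 at apex row, 1 at base row
    left_edge = apex_x + t * (base_left - apex_x)
    right_edge = apex_x + t * (base_right - apex_x)
    inside = (xs >= left_edge) & (xs <= right_edge)
    return abs(img[inside].mean() - img[~inside].mean())

# Paint a bright triangular "trail" on dark ground
img = np.full((40, 60), 40.0)
ys, xs = np.mgrid[0:40, 0:60]
t = ys / 39.0
img[(xs >= 30 - 25 * t) & (xs <= 30 + 25 * t)] = 200.0

good = triangle_score(img, apex_x=30, base_left=5, base_right=55)
bad = triangle_score(img, apex_x=10, base_left=0, base_right=20)
print(good > bad)  # -> True: the matching hypothesis scores highest
```

A particle filter would perturb (apex_x, base_left, base_right) hypotheses and resample in proportion to this score.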
Medical Image Segmentation
An algorithm for ultrasound image segmentation is proposed. Our approach is based on a semi-automatic standard region growing segmentation algorithm: given a user-provided seed point, the object is segmented from the image automatically and quickly. A mechanism is developed to add groups of pixels to the segmented object region. Our growing mechanism uses distances between distributions as the growth criterion, which addresses the speckle noise problem in ultrasound images; minimizing the distances between the distributions allows us to find the borders. The performance of the proposed method is evaluated on different sets of ultrasound images. [unpublished paper]
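The distribution-based growing criterion can be sketched as follows, comparing patch intensity histograms with the Bhattacharyya distance (an illustrative choice of distribution distance; the image, thresholds, and parameters below are made up):

```python
import numpy as np
from collections import deque

def bhattacharyya(p, q):
    p = p / p.sum()
    q = q / q.sum()
    return -np.log(np.sum(np.sqrt(p * q)) + 1e-12)

def region_grow(img, seed, thresh=0.25, half=2, bins=8):
    """Region growing that compares local intensity *distributions*
    (patch histograms) to the seed's distribution, so individual
    speckle-noise pixels cannot stop or derail the growth."""
    h, w = img.shape
    def hist(y, x):
        patch = img[max(0, y - half):y + half + 1, max(0, x - half):x + half + 1]
        # +1 smoothing keeps empty bins from zeroing the overlap
        return np.histogram(patch, bins=bins, range=(0, 256))[0].astype(float) + 1
    seed_hist = hist(*seed)
    mask = np.zeros((h, w), bool)
    mask[seed] = True
    q = deque([seed])
    while q:
        y, x = q.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                if bhattacharyya(seed_hist, hist(ny, nx)) < thresh:
                    mask[ny, nx] = True
                    q.append((ny, nx))
    return mask

# Speckled dark "lesion" (mean 60) inside a speckled bright background (mean 180)
rng = np.random.default_rng(0)
img = rng.normal(180, 20, (40, 40))
img[10:30, 10:30] = rng.normal(60, 20, (20, 20))
img = img.clip(0, 255)
mask = region_grow(img, seed=(20, 20))
print(mask.sum())  # roughly the 20x20 lesion area
```

Because patches near the boundary mix the two distributions, their distance to the seed histogram rises sharply, which is where the growth stops and the border is found.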
Virtual Daguerreotype
A system to simulate daguerreotype pictures is presented. Daguerreotypes are one of the earliest forms of photography; they are created using a direct-positive process that produces a very detailed image on a sheet of copper plated with a thin coat of silver, without the use of a negative. The system we offer is built on a handheld mirror simulation system and adds image composition modules to reproduce the feeling of looking into a mirror and seeing a daguerreotype image superimposed on your reflection. The software components of the system are developed using the Modular Flow Scheduling Middleware (MFSM), an open source implementation of IMSC's Software Architecture for Immersipresence (SAI).
Measuring Pedestrian Wait-Time at Intersections, CO-PI, Budget: 25K, Center of Technologies for Safe and Efficient Transportation - CMU, October 2015 - January 2016
Pedestrian Detection for the Surtac Adaptive Traffic System, CO-PI, Budget: 100K, Center of Technologies for Safe and Efficient Transportation - CMU, January 2016 - December 2017
C. Rasmussen, Y. Lu, and M.K. Kocamaz, “A Trail-Following Robot which Uses Appearance and Structural Cues”, FSR, Springer Tracts in Advanced Robotics, Volume 92, Page 265-279. Springer, 2012 [paper]
M.K. Kocamaz and C. Rasmussen, "Approaches for Automatic Low Dimensional Human Shape Refinement with Priors or Generic Cues using RGB-D Data", Journal of Image and Vision Computing, Volume 40, Pages 16-27, 2015 [paper]
M.K. Kocamaz and C. Rasmussen, "Multimodal Point-wise Object Tracker with RGB-D Data", submitted to Journal of Computer Vision and Image Understanding, under review [paper] [Video]
M.K. Kocamaz and F. Porikli, "Unconstrained 1D Range and 2D Image Based Human Detection and Segmentation", in submission to Journal of Machine Vision and Application [paper]
A. Laddha, M.K. Kocamaz, and M. Hebert, "Self-Supervised Road Detection in Monocular Images using Maps", submitted to IEEE Intelligent Vehicles Symposium, 2016
M.K. Kocamaz, B. Pires, and J. Gong, "Vision based Counting of Pedestrians and Cyclists", IEEE Winter Conference on Applications of Computer Vision (WACV), 2016 [paper] [Video]
M.K. Kocamaz, F. Porikli, “Unconstrained 1D Range and 2D Image Based Human Detection”, IEEE International Conference on Intelligent Robots and Systems (IROS), 2013, Oral Presentation [paper] [Video]
C. Rasmussen, Y. Lu, and M.K. Kocamaz, "Integrating Stereo Structure for Omnidirectional Trail Following", IEEE International Conference on Intelligent Robots and Systems (IROS), 2011, Oral Presentation [paper]
M.K. Kocamaz, Y. Lu, and C. Rasmussen, "Deformable Object Shape Refinement and Tracking Using Graph Cuts and Support Vector Machines", International Symposium on Visual Computing (ISVC), 2011 [paper] [Video]
M.K. Kocamaz and C. Rasmussen, "Automatic Refinement of Foreground Regions for Robot Trail Following", IEEE International Conference on Pattern Recognition (ICPR), 2010, Oral Presentation [paper]
C. Rasmussen, Y. Lu, and M.K. Kocamaz, "Trail Following with Omnidirectional Vision", IEEE International Conference on Intelligent Robots and Systems (IROS), 2010, Oral Presentation [paper]
C. Rasmussen, Y. Lu, and M.K. Kocamaz, "Appearance Contrast for Fast, Robust Trail-Following", IEEE International Conference on Intelligent Robots and Systems (IROS), 2009, Oral Presentation [paper]
M.K. Kocamaz, A. Kaya, Y.E. Kang, A. Francois, "The Virtual Daguerreotype", IMSC Technical Report, University of Southern California, May 2003 [report]