Wentao Yuan (袁文韬)

I am a second-year M.S. Robotics student advised by Prof. Martial Hebert at the Robotics Institute, Carnegie Mellon University. I am interested in developing algorithms that enable robots to perceive and interact with diverse 3D environments. Currently, I am working on deep learning architectures for 3D point cloud understanding.

I obtained my B.A. in Computer Science and Mathematics from Pomona College, where I worked with Prof. Vin de Silva on 3D shape matching using functional maps. I've spent two summers at Google and one summer working on underwater shipwreck detection with Prof. Chris Clark's group from Harvey Mudd College.

Email  /  CV  /  GitHub  /  Google Scholar  /  LinkedIn


Preprints

Iterative Transformer Network for 3D Point Cloud
Wentao Yuan, David Held, Christoph Mertz, Martial Hebert
arXiv Preprint, 2018
paper  /  abstract  /  bibtex

3D point cloud is an efficient and flexible representation of 3D structures. Recently, neural networks operating on point clouds have shown superior performance on tasks such as shape classification and part segmentation. However, performance on these tasks is evaluated using complete, aligned shapes, while real world 3D data are partial and unaligned. A key challenge in learning from unaligned point cloud data is how to attain invariance or equivariance with respect to geometric transformations. To address this challenge, we propose a novel transformer network that operates on 3D point clouds, named Iterative Transformer Network (IT-Net). Different from existing transformer networks, IT-Net predicts a 3D rigid transformation using an iterative refinement scheme inspired by classical image and point cloud alignment algorithms. We demonstrate that models using IT-Net achieve superior performance over baselines on the classification and segmentation of partial, unaligned 3D shapes. Further, we provide an analysis of the efficacy of the iterative refinement scheme in estimating accurate object poses from partial observations.

                  @article{yuan2018iterative,
                    title={Iterative Transformer Network for 3D Point Cloud},
                    author={Yuan, Wentao and Held, David and Mertz, Christoph and Hebert, Martial},
                    journal={arXiv preprint arXiv:1811.11209},
                    year={2018}
                  }
                

Publications

PCN: Point Completion Network
Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, Martial Hebert
International Conference on 3D Vision (3DV), 2018 (Oral)
[Best Paper Honorable Mention]
paper  /  abstract  /  code  /  project page  /  presentation  /  poster  /  bibtex

Shape completion, the problem of estimating the complete geometry of objects from partial observations, lies at the core of many vision and robotics applications. In this work, we propose Point Completion Network (PCN), a novel learning-based approach for shape completion. Unlike existing shape completion methods, PCN directly operates on raw point clouds without any structural assumption (e.g. symmetry) or annotation (e.g. semantic class) about the underlying shape. It features a decoder design that enables the generation of fine-grained completions while maintaining a small number of parameters. Our experiments show that PCN produces dense, complete point clouds with realistic structures in the missing regions on inputs with various levels of incompleteness and noise, including cars from LiDAR scans in the KITTI dataset.

                  @inproceedings{yuan2018pcn,
                    title={PCN: Point Completion Network},
                    author={Yuan, Wentao and Khot, Tejas and Held, David and Mertz, Christoph and Hebert, Martial},
                    booktitle={2018 International Conference on 3D Vision (3DV)},
                    pages={728--737},
                    year={2018},
                    organization={IEEE}
                  }
                

Intelligent Shipwreck Search Using Autonomous Underwater Vehicles
Jeffrey Rutledge*, Wentao Yuan*, Jane Wu, Sam Freed, Amy Lewis, Zoë Wood, Timmy Gambin, Christopher Clark
International Conference on Robotics and Automation (ICRA), 2018
paper  /  abstract  /  bibtex

This paper presents a robot system designed to autonomously search for and geo-localize potential underwater archaeological sites. The system, based on Autonomous Underwater Vehicles (AUVs), invokes a multi-step pipeline. First, the AUV conducts a high altitude scan over a large area to collect low-resolution side scan sonar data. Second, image processing software is employed to automatically detect and identify potential sites of interest. Third, a ranking algorithm assigns an importance score to each site. Fourth, an AUV path planner is used to plan a time-limited path that visits sites of high importance at a low altitude to acquire high-resolution sonar data. Last, the AUV is deployed to follow this path. This system was implemented and evaluated during an archaeological survey located along the coast of Malta. These experiments demonstrated that the system is able to identify valuable archaeological sites accurately and efficiently in a large previously unsurveyed area. Also, the planned missions led to the discovery of a historical plane wreck whose location was previously unknown.

                  @inproceedings{rutledge2018intelligent,
                    title={Intelligent Shipwreck Search Using Autonomous Underwater Vehicles},
                    author={Rutledge, Jeffrey and Yuan, Wentao and Wu, Jane and Freed, Sam and Lewis, Amy and Wood, Zo{\"e} and Gambin, Timmy and Clark, Christopher},
                    booktitle={2018 IEEE International Conference on Robotics and Automation (ICRA)},
                    pages={1--8},
                    year={2018},
                    organization={IEEE}
                  }
                

Projects

Point Cloud Semantic Segmentation using Graph Convolutional Network

pdf  /  abstract  /  code

We explore a new way of converting point clouds to a representation suitable for deep learning, without destroying any geometric information. Specifically, we connect neighbouring points in a point cloud to form an undirected graph. Although graphs, like point clouds, lack a translation-invariant structure, there has been a line of work that extends CNNs to graphs by defining convolution in the spectral domain. The aim of this project is to investigate the effectiveness of these spectral CNNs on the task of point cloud semantic segmentation.
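
As a rough illustration of the graph construction step, the sketch below connects each point to its nearest neighbours and stores every edge once so that the graph is undirected. The function name `knn_graph`, the use of k-nearest neighbours (rather than, for example, a distance threshold), and the toy coordinates are assumptions made for this example, not details taken from the project.

```python
import math


def knn_graph(points, k):
    """Build an undirected k-nearest-neighbour graph over 3D points.

    Returns a set of edges (i, j) with i < j. Using k-NN as the notion of
    "neighbouring" is an assumption made for illustration only.
    """
    edges = set()
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            # Store each edge once, so direction carries no meaning.
            edges.add((min(i, j), max(i, j)))
    return edges


# Hypothetical toy point cloud with four points.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (6.0, 5.0, 5.0)]
edges = knn_graph(points, k=1)
```

This brute-force version compares every pair of points, which is fine for a toy example; a spatial index such as a k-d tree would be the usual choice for real scans.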

Active Neural Localization in Noisy Environments

pdf  /  abstract  /  code

Localization, the problem of estimating the location of the robot given a map and a sequence of observations, is a fundamental problem in mobile robotics. Most traditional localization methods are passive, i.e. the robot does not have the ability to adjust its actions based on its observations. The recently proposed active neural localization algorithm combines deep neural networks with a Bayes filter to perform efficient active localization. In this project, we seek to extend active neural localization to noisy environments, where Gaussian noise is added to both the position and the observation of the agent. A series of experiments shows that our active localization method outperforms passive localization methods in both noiseless and noisy environments.
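
For readers unfamiliar with the Bayes filter mentioned above, here is a minimal sketch of one predict-and-update step on a small discrete world. The cyclic 1D grid, the deterministic motion model, the function name `bayes_filter_step`, and the likelihood values are all assumptions chosen for illustration; they do not reproduce the active neural localization algorithm, where learned components take part in the process.

```python
def bayes_filter_step(belief, likelihood, shift):
    """One predict-and-update step of a discrete Bayes filter.

    belief:     probability of the robot being in each cell (sums to 1)
    likelihood: probability of the current observation given each cell
    shift:      number of cells the robot moved (cyclic world, no motion noise)
    """
    n = len(belief)
    # Predict: deterministic motion moves probability mass by `shift` cells.
    predicted = [belief[(i - shift) % n] for i in range(n)]
    # Update: weight by the observation likelihood, then renormalise.
    unnormalised = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(unnormalised)  # assumed non-zero in this sketch
    return [u / total for u in unnormalised]


# Hypothetical example: uniform prior, observation favouring cell 2.
belief = [0.25, 0.25, 0.25, 0.25]
likelihood = [0.1, 0.1, 0.7, 0.1]
posterior = bayes_filter_step(belief, likelihood, shift=1)
```

Starting from a uniform prior, the posterior concentrates on the cell whose likelihood is highest. An active method would additionally choose the next motion based on the current belief rather than following a fixed policy.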


Experiences
  • 2017.9 - present: Graduate Research Assistant, Carnegie Mellon University
  • 2017.5 - 2017.8: Undergraduate Research Assistant, Harvey Mudd College
  • 2016.5 - 2016.8: Software Engineering Intern, Google New York
  • 2015.5 - 2015.8: Engineering Practicum Intern, Google Kirkland

Personal

I enjoy singing, playing guitar and soccer. I am also a PADI AOW Diver.