Hua Yu
Language Technologies Institute
Carnegie Mellon University
Pittsburgh, PA 15213
(412) 268-5479
<hyu@cs.cmu.edu>
http://www.cs.cmu.edu/~hyu
RESEARCH INTERESTS
- Speech Recognition, especially large vocabulary continuous speech recognition and modeling of sloppy speech
- Statistical Natural Language Processing
- Machine Learning
QUALIFICATIONS
- Seasoned in speech recognition: developed several
state-of-the-art LVCSR systems (Broadcast News, Switchboard, Meeting
Room) from scratch; worked on acoustic modeling, language modeling,
pre-processing (front-end), letter-to-phoneme mapping, speech
synthesis, etc.
- Strong background in computer science: experienced programmer with
years of work in applications development. Graduate courses include
algorithms, security & cryptography
- Strong practical and academic background in digital signal processing,
especially in speech. Graduate courses include modern signal processing,
statistical modeling, information theory, speech recognition
- Strong research and academic background in artificial intelligence.
Graduate courses include artificial intelligence, machine learning, neural
networks
EXPERIENCE
- Interactive Systems Labs, Human Computer Interaction Institute, School of Computer Science, Carnegie Mellon University
Research Assistant, Aug. 1996 - present
- Developing the Switchboard LVCSR System (achieved 23.4% word error rate on the
RT-03 spring evaluation)
- Face Recognition using a direct LDA algorithm
- Developing the Broadcast News Transcription System using the Janus Recognition Toolkit
- Automatic Meeting Transcription
- Automatic Classification, Segmentation, Clustering of broadcast news / meeting data
- Reviewer: Pattern Recognition, ICMI
- Spoken Language Technology group, Sony Electronics Inc.
Consultant on training Large Vocabulary Speech Recognition Systems, May 2000
- Language Technologies Institute, School of Computer Science, Carnegie Mellon University
Research Assistant, Feb. 1998 - Dec. 1998
- Developed a voice-driven web browser using Sphinx2 recognition engine
- Research on automatic document clustering on TDT/SWB corpus
- Developed a text-to-phoneme mapping server
- LTI Admission Committee member, 1998
- Speech Lab, Computer Science Department, Tsinghua University, China
Research Assistant, 1993-1996
- Research on language identification (M.S. thesis).
- Developed recognition systems for 1994 National 863 Spoken Language System
Evaluation, which ranked No.1 in speaker-independent syllable recognition
as well as phrase recognition (B.S. thesis)
- System/network administrator
- Designed an intelligent controller for brushless DC motor with a single-chip
controller, 1993-1994
- Chengda Chemical Engineering Co. of China
Consultant, 1995
Tracked down a new virus and developed an anti-virus program
- Beijing Postal & Communication Research Institute
Consultant, 1995
Designed and developed the Postal Service Window System
- Beijing Intelligent Monitoring & Control System Co., China
Consultant, 1994-1996
Developed the Underground Coal Mine Monitoring & Control System and the Automatic
Campus Monitoring System
EDUCATION
- Carnegie Mellon University, Pittsburgh,
Pennsylvania
Ph.D. candidate in Language Technologies Institute, School of Computer
Science
May 1998, M.S. in Language Technologies Institute, School of Computer
Science
Full research fellowship for all years.
- Tsinghua University, Beijing,
P.R. China
June 1996, M.S.E. in Computer Science Department
June 1994, B.S.E. in Computer Science Department
First-class scholarship for all years, Motorola Scholarship, 3rd prize
in National College Physics Contest
AFFILIATIONS
- IEEE student member
- ACM student member
TECHNICAL SKILLS
- Programming: Seasoned in C/C++, Perl, Tcl/Tk; experienced with Matlab, Visual Basic, 80x86 Assembly;
familiar with HTML, Pascal, Lisp, etc.
- Systems & Networks:
Seasoned in Linux/Unix, System Administration, TCP/IP; experienced with Windows; familiar with HTTP/CGI
SELECTED PUBLICATIONS
- Hagen Soltau, Hua Yu, Florian Metze, Christian Fuegen, Qin Jin and Szu-Chen Jou.
The ISL Transcription System for Conversational Telephony Speech.
ICASSP, Montreal, 2004
- Hua Yu and Alex Waibel.
Integrating Thumbnail Features for Speech Recognition Using Conditional Exponential Models.
ICASSP, Montreal, 2004
- Hua Yu and Tanja Schultz.
Enhanced Tree Clustering with Single Pronunciation Dictionary for Conversational Speech Recognition.
Eurospeech, Geneva, 2003
- Hua Yu and Tanja Schultz.
Implicit Trajectory Modeling through Gaussian Transition Models for Speech Recognition.
HLT-NAACL, Edmonton, 2003
- Hua Yu and Alex Waibel.
Flexible Parameter Tying for Conversational Speech Recognition.
ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition, Tokyo, 2003
- Susanne Burger, Victoria MacLaren and Hua Yu.
The ISL Meeting Corpus: The Impact of Meeting Type on Speech Style.
ICSLP, Denver, 2002
- Hagen Soltau, Hua Yu, Florian Metze, Christian Fuegen, Qin Jin and Szu-Chen Jou.
The ISL RT-03 Conversational Telephone Speech Recognition System.
Rich Transcription Workshop, Boston, MA, 2003
- Hagen Soltau, Hua Yu, Florian Metze, Christian Fuegen, Yue Pan and Szu-Chen Jou.
ISL Meeting Recognition.
Rich Transcription Workshop, Vienna, VA, 2002
- Chiori Hori, Sadaoki Furui, Rob Malkin, Hua Yu and Alex Waibel.
Automatic Speech Summarization Applied to English Broadcast News Speech.
ICASSP02, Orlando, 2002
- Alex Waibel, Michael Bett, Florian Metze, Klaus Ries, Thomas Schaaf, Tanja Schultz, Hagen Soltau, Hua Yu and Klaus Zechner.
Advances in Automatic Meeting Record Creation and Access.
ICASSP01, Salt Lake City, 2001
- Alex Waibel, Hua Yu, Martin Westphal, Hagen Soltau, Tanja Schultz, Thomas Schaaf, Yue Pan, Florian Metze and Michael Bett.
Advances in Meeting Recognition. HLT2001, San Diego, 2001
- Hua Yu, Jie Yang.
A Direct LDA Algorithm for High-Dimensional Data -- with Application to Face Recognition. Pattern Recognition 34(10), 2001, pp. 2067-2070
- Jie Yang, Hua Yu, William Kunz.
An Efficient LDA Algorithm for Face Recognition.
ICARCV2000, Singapore, 2000
- Hua Yu, Alex Waibel.
Streamlining the Front-End of a Speech Recognizer.
ICSLP00, Beijing, 2000
- Hua Yu, Takashi Tomokiyo, Zhirong Wang, Alex Waibel.
New Developments in Automatic Meeting Transcription.
ICSLP00, Beijing, 2000
- Ralph Gross, Michael Bett, Hua Yu, Xiaojin Zhu, Yue Pan, Jie Yang, and Alex Waibel.
Towards a Multimodal Meeting Record.
ICME2000, New York
- Hua Yu, Michael Finke, Alex Waibel.
Progress in Automatic Meeting Transcription.
Eurospeech99, 1999
- Hua Yu.
Automatically Determining Number of Clusters.
Information Retrieval Course (CMU CS11-741) Final Report, Apr. 1998
- Hua Yu, Cortis Clark, Rob Malkin and Alex Waibel.
Experiments in Automatic Meeting Transcription using JRTk.
Proc. ICASSP '98, Seattle, USA, May 1998
- Hua Yu, Zhongtao Wang.
A Survey on Anonymous Digital Cash Systems.
Security and Cryptography (CMU CS15-827) final report, 1997
- Ditang Fang, Hua Yu, Shuqing Li.
Speech Recognition based on Normal Distribution Hypothesis.
Intl. Conf. on Chinese Computing '94, Singapore
- Hua Yu, et al.
Speaker-independent Isolated Word/Phrase Recognition -- a Statistical Approach.
National Conf. on Human-Machine Communication '94, Chongqing, Oct. 1994