After six years, I have finally graduated with a PhD from the LTI at CMU. You can download my thesis document from here, and also the slides from my defense (with a few audio recordings of dialogs with the Let's Go system) from here (note that the ppt version might have some glitches since this was originally a PowerPoint 2007 presentation, which you can find here). Following my PhD, I am continuing my research on human-machine spoken dialog interaction at the Honda Research Institute in Mountain View, CA.
My research interests are in human-computer interaction through
natural language, with a focus on speech.
Under the supervision of my advisor Maxine
Eskenazi, I am currently part of the Let's Go!! Project,
which aims at building a spoken dialogue system that is both available to the general public
(it provides a natural language interface to the bus schedules of the Port Authority of Allegheny
County), and a research platform (see publications).
I have also worked with Alan Black
on modeling prosody for speech synthesis.
Prior to coming to CMU, I earned a Master's degree in Intelligence Science and
Technology at Kyoto University,
Kyoto, Japan. There, I worked on using speech recognition to teach pronunciation
in a Computer Assisted Language Learning application, under the supervision
of Tatsuya Kawahara and Hiroshi Okuno.
Spoken Dialogue Systems
- Let's Go!! Project: a project aiming at designing and implementing
a spoken dialogue system for bus schedule information that can be used effectively
by non-native speakers and elderly people. My work focused on speech recognition,
natural language understanding, and dialogue management.
- RavenClaw: the dialogue management architecture used in Let's Go!!
It was written by Dan Bohus
at CMU and is still partly under development. I am involved in the development
of this general framework along with its application to the Let's Go!! domain.
- I am also part of Dialogs on Dialogs,
an international group of students that holds tele-meetings every other week
on topics related to spoken dialogue systems.
Information Retrieval
Speech Synthesis
- I am working with Alan Black
on unit selection methods for prosody generation for speech synthesis. As
with segmental synthesis, the hope is that using actual F0 contours from recorded
utterances will yield a more natural prosody than purely artificially generated
contours.
Computer Assisted Language Learning
- I am working on Computer Assisted Reading Comprehension
instruction. My main focus is now on automatically generating
relevant, natural, and useful questions about an arbitrary text.
- The research I did for my Master's degree at Kyoto University concerned
the use of speech recognition to teach pronunciation in a CALL program. In
particular, my thesis described a Bayesian model to estimate the overall intelligibility
of non-native speech given a set of automatically detected pronunciation (segmental
and suprasegmental) errors. I also implemented my research and that of the
other members of the CALL group in a pronunciation tutor called HUGO, which
is currently used in English classes at Kyoto University.
For the Fall 2006 semester, I was the teaching assistant for 11-711 Algorithms for NLP, taught by Alon Lavie and Bob Frederking.
For the Spring 2004 semester, I was the teaching assistant for 11-752 Speech II: Phonetics, Prosody, Perception, and Synthesis, taught by Maxine Eskenazi and Alan Black.
Spoken Dialogue Systems
- A. Raux and M. Eskenazi,
Optimizing Endpointing Thresholds using Dialogue Features in a Spoken Dialogue System,
SIGdial 2008, Columbus, OH, USA.
- A. Raux and M. Eskenazi,
A Multi-Layer Architecture for Semi-Synchronous Event-Driven Dialogue Management,
ASRU 2007, Kyoto, Japan.
- H. Ai, A. Raux, D. Bohus, M. Eskenazi, and D. Litman,
Comparing Spoken Dialog Corpora Collected with Recruited Subjects versus Real Users,
8th SIGDial Workshop on Discourse and Dialogue, Antwerp, Belgium.
- D. Bohus, A. Raux, T. Harris, M. Eskenazi, and A. Rudnicky,
Olympus: an open-source framework for conversational spoken language interface research,
HLT-NAACL 2007 workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technology, Rochester, NY, USA.
- D. Bohus, S. Grau, D. Huggins-Daines, V. Keri, G. Krishna, R. Kumar, A. Raux, and S. Tomko,
Conquest - an Open-Source Dialog System for Conferences, HLT-NAACL 2007, Rochester, NY, USA.
- A. Raux, D. Bohus, B. Langner, A. W Black and M. Eskenazi,
Doing Research on a Deployed Spoken Dialogue System: One Year of Let's Go! Experience,
Interspeech 2006, Pittsburgh, PA, USA.
- D. Bohus, B. Langner, A. Raux, A. W Black, M. Eskenazi, and A. Rudnicky,
Online Supervised Learning of Non-understanding Recovery Policies, SLT 2006, Palm Beach, Aruba.
- A. Raux, B. Langner, D. Bohus, A. W Black and M. Eskenazi,
Let's Go Public! Taking a Spoken Dialog System to the Real World,
Interspeech 2005, Lisbon, Portugal.
- A. Raux and M. Eskenazi, Non-Native Users in the
Let's Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch, HLT/NAACL 2004,
Boston, MA, USA. [slides]
- A. Raux, B. Langner, A. Black, M. Eskenazi, LET'S
GO: Improving Spoken Dialog Systems for the Elderly and Non-natives, Eurospeech
2003, Geneva, Switzerland.
Speech Recognition
Speech Synthesis
Computer Assisted Language Learning
- M. Eskenazi, A. Raux, E. Harris, Using Speech Recognition for Just-in-Time Language Learning.
Journal of the Acoustical Society of America, vol. 120, no. 5, pt. 2, p. 3138.
- A. Raux and M. Eskenazi, Using Task-Oriented
Spoken Dialogue Systems for Language Learning: Potential, Practical Applications
and Challenges, InSTIL 2004, Venice, Italy.
- A. Raux and T. Kawahara, Automatic intelligibility
assessment and diagnosis of critical pronunciation errors for computer-assisted
pronunciation learning, ICSLP 2002, pp. 737--740, Denver, CO.
- K. Imoto, Y. Tsubota, A. Raux, T. Kawahara, and M. Dantsuji, Modeling
and automatic detection of English sentence stress for computer-assisted English
prosody learning system, ICSLP 2002, pp. 749--752, Denver, CO.
- A. Raux and T. Kawahara, Optimizing computer-assisted
pronunciation instruction by selecting relevant training topics, InSTIL
2002 Advanced Workshop, Davis, CA.
Favorite Sites
- WWWJDIC:
Jim Breen's very complete and easy-to-use Japanese-English dictionary.
- YourDictionary.com: the ultimate
online linguistic resource. Dictionaries for more than 250 languages. The
place to go if you're looking for a Samoan, Gilbertese, or Miskitu online
dictionary.
- Vivisimo: a web search engine that
groups the results into (automatically generated) categories. If you get tired
of Google...
Friends