Chiori Hori, Ph.D.
InterACT
Language Technologies Institute
Carnegie Mellon University
407 South Craig Street
Pittsburgh, PA 15213, TEL/FAX: +1-412-268-9177
E-mail: chiori_at_cs.cmu.edu
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/chiori/www
Research interests
Spoken language processing
- Speech translation
- Spoken interactive open domain question answering
- Speech summarization
- Language modeling for speech recognition
Projects
- STR-DUST: Speech Translation for Domain-Unlimited Spontaneous
Communication Tasks
Research Objectives:
This project attempts, for the first time, speech translation for
domain-unlimited conversational tasks such as telephone conversations,
lectures, and meetings. Both syntax and semantics are open, and the
input may be ill-formed.
Approach:
We treat speech recognition, message extraction (from an errorful and
ill-formed transcript), and machine translation as a cascade of
statistical transformations. Results are passed from one stage to the
next as hypothesis lattices.
- STEEM: Summarization, Translation and Entity Extraction of
Multimedia Documents
Research Objectives:
The proposed research aims to improve the reliability and
usefulness of machine translation (MT) by shifting emphasis from
word-by-word translation to a synopsis of key information that is to be
understood and processed by an English-speaking human analyst. To
achieve this goal, information has to become more readable (reduced to
its key information), and the reliability of and confidence in this key
information have to be improved.
Approach:
We propose to explore solutions to these problems in two separate ways:
first, by reliably detecting and searching for named entities and
their relations, even if they are buried in the source by
out-of-vocabulary problems; second, by summarizing the input in a
manner that is optimized for translation and for retention of useful
content. Such dramatic reduction requires summarization to go beyond
selecting phrases from the source, to rewording (translating!) the
input into a new, more concise paraphrase of the content.
Research Collaboration
Organizing the following workshops:
- International Workshop on Spoken Language Translation (IWSLT2005)
The Consortium for Speech Translation Advanced Research (C-STAR), an
international partnership of research laboratories engaged in automatic
translation of spoken language, organizes the International Workshop on
Spoken Language Translation (IWSLT) from Oct. 24th to 25th. Speech
translation systems combine automatic speech recognition (ASR) systems
with machine translation (MT) systems. This year, IWSLT will focus on
the integration of speech and on how to achieve more robust
translations in the face of recognition errors. An evaluation campaign
will be held using the multilingual BTEC corpus from the IWSLT-2004
evaluation, extended with ASR output. In addition, multiple language
tracks and integration modalities will be compared.
- International Workshop for Speech Summarization and Translation
(IWSpS2004)
An international workshop on summarization of spoken language is
planned by a group of four laboratories around the world. The technical
goal of the workshop is to investigate algorithms and approaches for
summarizing conversational spoken language and to evaluate
summarization quality in terms of abstract generation quality, question
answering performance, and/or translation performance of a text-based
MT system.
The workshop will be held from Aug. 26th to Oct. 13th in three phases:
a kick-off meeting in Pittsburgh on Aug. 26th to 28th, a research phase
at the respective laboratories, and a wrap-up meeting in Tokyo, Japan,
on Oct. 11th to 13th. The participating institutions are Carnegie
Mellon University (CMU), University of Karlsruhe, Tokyo Institute of
Technology (TITEC) and Information Sciences Institute (ISI).
Scientific program committee:
- Interspeech2005
- ACL-2005 Workshop: Intrinsic and Extrinsic Evaluation Measures
for MT and/or Summarization
- ACL-2004 Workshop: Text Summarization Branches Out
Journal and Transactions
- Chiori Hori, Sadaoki Furui, Rob Malkin, Hua Yu and Alex Waibel,
"A Statistical Approach for Automatic Speech Summarization,"
"Special Issue on Unstructured Information Management,"
EURASIP Journal on Applied Signal Processing 2003:2, 128-139.
ABSTRACT: This paper proposes a statistical approach to
automatic speech summarization. In our method, a set of words
maximizing a summarization score indicating the appropriateness of
summarization is extracted from automatically transcribed speech and
then concatenated to create a summary. The extraction process is
performed using a dynamic programming (DP) technique based on a target
compression ratio. In this paper, we demonstrate how an English news
broadcast transcribed by a speech recognizer is automatically
summarized. We adapted our method, which was originally proposed for
Japanese, to English by modifying the model for estimating word
concatenation probabilities based on a dependency structure in the
original speech given by a stochastic dependency context free grammar
(SDCFG). We also propose a method of summarizing multiple utterances
using a two-level DP technique. The automatically summarized sentences
are evaluated by summarization accuracy based on a comparison with a
manual summary of speech that has been correctly transcribed by human
subjects. Our experimental results indicate that the method we propose
can effectively extract relatively important information and remove
redundant and irrelevant information from English news broadcasts.
- Chiori Hori and Sadaoki Furui,
"A New Approach to Automatic Speech Summarization"
IEEE Transactions on Multimedia, Vol. 5, No. 3, September 2003,
pp. 368-378.
ABSTRACT: This paper proposes a new automatic speech
summarization method. In this method, a set of words maximizing a
summarization score is extracted from automatically transcribed speech.
This extraction is performed according to a target compression ratio
using a dynamic programming (DP) technique. The extracted set of words
is then connected to build a summarization sentence. This summarization
score consists of a word significance measure, a confidence measure,
linguistic likelihood, and a word concatenation probability. The word
concatenation score is determined by a dependency structure in the
original speech given by a stochastic dependency context free grammar
(SDCFG). Broadcast news speech transcribed using a large-vocabulary
continuous-speech recognition (LVCSR) system is summarized using our
proposed method and compared with manual summarization by human
subjects. The manual summarization results are
combined to build a word network. This word network is used to
calculate the word accuracy of each automatic summarization result
using the most similar word string in the network. Experimental results
show that the proposed method effectively extracts relatively important
information by removing redundant and irrelevant information.
- Chiori Hori and Sadaoki Furui,
"Summarized Speech Sentence Generation Based on Word Extraction
and Its Evaluation,"
(in Japanese)
Transactions of the Institute of Electronics, Information and
Communication Engineers (IEICE) D-II, Vol. J85-D-II, No. 2,
pp. 200-209 (2002-2).
(*) This paper received the 59th Paper Award from the IEICE and
has been translated into English and published in the IEICE
Transactions.
"Summarization: An
Approach through Word Extraction and a Method for Evaluation,"
(in English) Vol.E87-D No.1 pp.15-25
ABSTRACT: In this paper, we propose a new method of
automatic speech summarization for each utterance, where a set of words
that maximizes a summarization score is extracted from automatic speech
transcriptions. The summarization score indicates the appropriateness
of summarized sentences. This extraction is achieved by using a dynamic
programming technique according to a target summarization ratio. This
ratio is the number of characters/words in the summarized sentence
divided by the number of characters/words in the original sentence. The
extracted set of words is then connected to build a summarized
sentence. The summarization score consists of a word significance
measure, linguistic likelihood, and a confidence measure. This paper
also proposes a new method of measuring summarization accuracy based on
a word network expressing manual summarization results. The
summarization accuracy of each automatic summarization is calculated by
comparing it with the most similar word string in the network. Japanese
broadcast-news speech, transcribed using a large-vocabulary
continuous-speech recognition (LVCSR) system, is summarized and
evaluated using our proposed method with 20, 40, 60, 70 and 80%
summarization ratios. Experimental results reveal that the proposed
method can effectively extract relatively important information by
removing redundant or irrelevant information.
- Chiori Hori, Masaharu Katoh, Akinori Ito and Masaki Kohda,
"Construction and Evaluation of Language Models based on
Stochastic Context Free Grammar
for Speech Recognition," (in Japanese)
Transactions of the Institute of Electronics, Information and
Communication Engineers (IEICE) D-II, Vol. J83-D-II, No. 11,
pp. 2407-2417 (2000-10).
(*) This paper will be translated into English and published in
"Electronics and Communications in Japan" by John Wiley & Sons, Inc.
"Construction and evaluation of language models based on
stochastic context-free grammar for speech recognition," (in English)
Systems and Computers in Japan, Volume 33, Issue 13,
Copyright © 2002 Wiley Periodicals, Inc., A Wiley Company,
Published Online: 23 Oct 2002
ABSTRACT: This paper reports the use of a stochastic
context free grammar (SCFG) for large vocabulary continuous speech
recognition (LVCSR). Unlike n-gram models, the SCFG can describe not
only local constraints but also global constraints pertaining to the
sentence as a whole, thus allowing the language model to contribute to
speech recognition. However, the inside-outside algorithm must be used
for estimation of the SCFG parameters, which involves a great amount of
calculation, proportional to the third power of the number of
non-terminal symbols and of the input string length. Hence, due to
problems in dealing with extensive text corpora, the SCFG has hardly
been applied as a language model for LVCSR. The proposed phrase-level
dependency SCFG allows a significant reduction of the computational
load. In experiments with the Mainichi newspaper text corpus, a
large-scale phrase-level dependency SCFG was built for an LVCSR system.
Speech recognition tests with a vocabulary of about 5000 words showed
that the proposed method was comparable with the trigram model in
performance; however, when it was used in combination with a trigram
model, the error rate was reduced by 14% compared to the trigram model
alone.
International Conferences
Chiori Hori and Alex Waibel,
"Spontaneous Speech Consolidation for Spoken Language Applications,"
Proc. Interspeech2005.
Takaaki Hori, Chiori Hori and Yasuhiro Minami,
"Fast on-the-fly composition for weighted finite-state transducers
in 1.8 million-word vocabulary continuous speech recognition,"
Proc. ICSLP2004.
Chiori Hori, Tsutomu Hirao and Hideki Isozaki,
"Evaluation Measures Considering Sentence Concatenation for
Automatic Summarization by Sentence or Word Extraction,"
Proc. ICSLP2004, Vol. 1, pp. 289-292 (2004-10).
Chiori Hori, Takaaki Hori and Sadaoki Furui,
"Evaluation Methods for Automatic Speech Summarization,"
Proc. Eurospeech2003, Geneva, (2003-9)
Takaaki Hori, Chiori Hori, and Yasuhiro Minami,
"Speech summarization using weighted finite-state transducers,"
Proc. Eurospeech2003, pp. 2817-2820 (2003-9).
Chiori Hori, Takaaki Hori, Hajime Tsukada, Hideki Isozaki,
Yutaka Sasaki and Eisaku Maeda,
"Spoken Interactive ODQA System: SPIQA,"
ACL-2003 Interactive Poster and Demonstration Session, (2003).
Chiori Hori, Takaaki Hori, Hideki Isozaki, Eisaku Maeda,
Shigeru Katagiri and Sadaoki Furui,
"Study on Spoken Interactive Open Domain Question Answering,"
Spontaneous Speech Processing and Recognition (SSPR), pp. 111-114
(2003-4).
Tomonori Kikuchi, Sadaoki Furui and Chiori Hori,
"Two-stage Automatic Speech Summarization by Sentence Extraction and
Compaction,"
Spontaneous Speech Processing and Recognition (SSPR), pp. 207-210
(2003-4).
Chiori Hori, Takaaki Hori, Hideki Isozaki, Eisaku Maeda,
Shigeru Katagiri and Sadaoki Furui,
"Disambiguous Queries in a Spoken Interactive ODQA System,"
Proc. ICASSP2003, Hong Kong, Vol. I, pp. 624-627.
Tomonori Kikuchi, Sadaoki Furui and Chiori Hori,
"Automatic Speech Summarization Based on Sentence Extraction and
Compaction,"
Proc. ICASSP2003, Hong Kong, Vol. I, pp. 384-387.
Chiori Hori, Sadaoki Furui, Rob Malkin, Hua Yu and Alex Waibel,
"Automatic Speech Summarization Applied to English Broadcast News
Speech,"
Proc. ICASSP2002, Orlando, U.S.A., Vol. 1, pp. 9-12 (2002-5).
Chiori Hori and Sadaoki Furui,
"Automatic Summarization of English Broadcast News speech,"
Notebook of HLT2002, San Diego, U.S.A., pp. 228-233 (2002-3).
Chiori Hori and Sadaoki Furui,
"Advances in Automatic Speech Summarization,"
Proc. Eurospeech2001, 7th European Conference on Speech Communication
and Technology, Aalborg, Denmark, Vol.III, pp.1771-1774(2001-9).
Chiori Hori and Sadaoki Furui,
"Improvements in Automatic Speech Summarization and Evaluation
Methods,"
Proc. ICSLP2000 6th International Conference on Spoken Language
Processing,
Beijing, Vol.4, pp.326-329(2000-10).
Akinori Ito, Chiori Hori, Masaharu Katoh and Masaki Kohda,
"Language Modeling by Stochastic Dependency Grammar for Japanese
Speech Recognition,"
Proc. ICSLP2000 6th International Conference on Spoken Language
Processing,
Beijing, Vol.1, pp.246-249(2000-10).
Chiori Hori and Sadaoki Furui,
"Automatic Speech Summarization based on Word Significance and
Linguistic Likelihood,"
Proc. 2000 IEEE International Conference on Acoustics, Speech, and
Signal Processing, Istanbul, Vol.3, pp.1579-1582(2000-6).
Chiori Hori and Sadaoki Furui,
"Toward Automatic Summarization of Broadcast News Speech,"
Proc. Japan-China Symposium on Advanced Information Technology, Tokyo,
pp.75-82(1999-9).
Doctoral Thesis
"A Study on Statistical Methods for Automatic Speech Summarization"
Thesis submitted for the degree of Doctor of Philosophy
in the Department of Computer Science, Graduate School of Information
Science and Engineering,
Tokyo Institute of Technology
March 2002
Award