EMOTION RECOGNITION AND SYNTHESIS SYSTEM
ON SPEECH
ABSTRACT
In this study, a system capable of both recognizing and
synthesizing emotional content in speech is developed.
First, the relation information that maps the physical features
of emotional speech to the emotional content perceived by listeners is
estimated through linear statistical methods, and this mapping is
built into the system.
The system then realizes emotion recognition and synthesis through
simple linear operations on this relation information.
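The kind of linear recognition and synthesis described above can be sketched as follows. This is a minimal, hypothetical illustration assuming the "relation information" is a least-squares matrix W mapping acoustic feature vectors to emotion ratings; all data, dimensions, and variable names here are invented for the sketch and are not taken from the paper.

```python
import numpy as np

# Hypothetical setup: rows of X are acoustic parameter vectors
# (e.g. 7 pitch-model + 11 power-envelope parameters = 18 features),
# rows of E are listener emotion ratings. All values are synthetic.
rng = np.random.default_rng(0)
n_utterances, n_features, n_emotions = 50, 18, 4

X = rng.normal(size=(n_utterances, n_features))   # acoustic features
W_true = rng.normal(size=(n_features, n_emotions))
E = X @ W_true                                    # simulated ratings

# Estimate the linear relation W by least squares
W, *_ = np.linalg.lstsq(X, E, rcond=None)

# Recognition: predict emotion scores from one utterance's features
e_hat = X[0] @ W

# Synthesis direction: map a target emotion vector back to feature
# space via the pseudo-inverse of W
x_target = np.array([1.0, 0.0, 0.0, 0.0]) @ np.linalg.pinv(W)
```

Because both directions reduce to a matrix product, recognition and synthesis stay cheap once W has been estimated, which matches the abstract's claim of "simple linear operations".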
In the system, the pitch contour is expressed by the model proposed by
Fujisaki (7 parameters), the power envelope is approximated by 5
line segments (11 parameters), and PSOLA is applied to synthesize the
speech.
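The Fujisaki pitch model mentioned above superposes phrase and accent components on a log-F0 baseline. Below is a minimal sketch with one phrase command and one accent command; the conventional constants (alpha, beta, gamma), the parameter names, and this particular 7-parameter choice (Fb, Ap, T0, Aa, T1, T2 plus the fixed time constants) are illustrative assumptions, not the paper's exact parameterization.

```python
import math

# Conventional Fujisaki-model constants (assumed values)
ALPHA, BETA, GAMMA = 3.0, 20.0, 0.9

def phrase(t):
    # Impulse response of the phrase control mechanism
    return ALPHA**2 * t * math.exp(-ALPHA * t) if t >= 0 else 0.0

def accent(t):
    # Step response of the accent control mechanism, clipped at GAMMA
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + BETA * t) * math.exp(-BETA * t), GAMMA)

def log_f0(t, Fb, Ap, T0, Aa, T1, T2):
    """ln F0(t) = baseline + phrase component + accent component."""
    return (math.log(Fb)
            + Ap * phrase(t - T0)
            + Aa * (accent(t - T1) - accent(t - T2)))

# F0 in Hz at t = 0.5 s: phrase command at t = 0, accent on [0.2, 0.4]
f0 = math.exp(log_f0(0.5, Fb=120.0, Ap=0.5, T0=0.0, Aa=0.4, T1=0.2, T2=0.4))
```

Varying the small set of command amplitudes and timings reshapes the whole contour, which is what makes such a compact parameterization attractive for manipulating emotional prosody.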
Emotion words with very little mutual correlation were selected
through preliminary statistical experiments.
The relation information was verified to be significant, and the
experimental results show that the system recognized and synthesized
emotional content in speech comparably to human subjects.
Moreover, the emotion recognition system is applied as the emotion
measurement module in a cyber shopping system.
REFERENCES
- Tsuyoshi Moriyama, Shinji Ozawa,
``Emotion recognition and synthesis system on speech'',
IEEE ICMCS, Jun 1999.
E-mail:
moriyama@sak.iis.u-tokyo.ac.jp