EMOTION RECOGNITION AND SYNTHESIS SYSTEM
ON SPEECH
ABSTRACT
In this study, a system capable of both recognizing and
synthesizing emotional content in speech is developed.
First, the relation information that maps the physical features
of emotional speech to the emotional content perceived by listeners is
estimated through linear statistical methods, and this mapping is
built into the system.
The system then realizes emotion recognition and synthesis through
simple linear operations on this relation information.
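The kind of linear recognition and synthesis described above can be sketched as follows. This is a minimal, hypothetical illustration assuming the "relation information" is a least-squares matrix W mapping acoustic feature vectors to emotion ratings; all data, dimensions, and variable names here are invented for the sketch and are not taken from the paper.

```python
import numpy as np

# Hypothetical setup: rows of X are acoustic parameter vectors
# (e.g. 7 pitch-model + 11 power-envelope parameters = 18 features),
# rows of E are listener emotion ratings. All values are synthetic.
rng = np.random.default_rng(0)
n_utterances, n_features, n_emotions = 50, 18, 4

X = rng.normal(size=(n_utterances, n_features))   # acoustic features
W_true = rng.normal(size=(n_features, n_emotions))
E = X @ W_true                                    # simulated ratings

# Estimate the linear relation W by least squares
W, *_ = np.linalg.lstsq(X, E, rcond=None)

# Recognition: predict emotion scores from one utterance's features
e_hat = X[0] @ W

# Synthesis direction: map a target emotion vector back to feature
# space via the pseudo-inverse of W
x_target = np.array([1.0, 0.0, 0.0, 0.0]) @ np.linalg.pinv(W)
```

Because both directions reduce to a matrix product, recognition and synthesis stay cheap once W has been estimated, which matches the abstract's claim of "simple linear operations".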
In the system, the pitch contour is expressed by the model proposed by
Fujisaki (7 parameters), the power envelope is approximated by 5
line segments (11 parameters), and PSOLA is applied to synthesize the
speech.
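The Fujisaki pitch model mentioned above superposes phrase and accent components on a log-F0 baseline. Below is a minimal sketch with one phrase command and one accent command; the conventional constants (alpha, beta, gamma), the parameter names, and this particular 7-parameter choice (Fb, Ap, T0, Aa, T1, T2 plus the fixed time constants) are illustrative assumptions, not the paper's exact parameterization.

```python
import math

# Conventional Fujisaki-model constants (assumed values)
ALPHA, BETA, GAMMA = 3.0, 20.0, 0.9

def phrase(t):
    # Impulse response of the phrase control mechanism
    return ALPHA**2 * t * math.exp(-ALPHA * t) if t >= 0 else 0.0

def accent(t):
    # Step response of the accent control mechanism, clipped at GAMMA
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + BETA * t) * math.exp(-BETA * t), GAMMA)

def log_f0(t, Fb, Ap, T0, Aa, T1, T2):
    """ln F0(t) = baseline + phrase component + accent component."""
    return (math.log(Fb)
            + Ap * phrase(t - T0)
            + Aa * (accent(t - T1) - accent(t - T2)))

# F0 in Hz at t = 0.5 s: phrase command at t = 0, accent on [0.2, 0.4]
f0 = math.exp(log_f0(0.5, Fb=120.0, Ap=0.5, T0=0.0, Aa=0.4, T1=0.2, T2=0.4))
```

Varying the small set of command amplitudes and timings reshapes the whole contour, which is what makes such a compact parameterization attractive for manipulating emotional prosody.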
Emotion words with very little mutual correlation were selected
through preliminary statistical experiments.
The relation information was verified to be significant, and the
experimental results show that the system recognized and synthesized
emotional content in speech comparably to human subjects.
Moreover, the emotion recognition system is applied as the emotion
measurement module in a cyber shopping system.
REFERENCES
- Tsuyoshi Moriyama, Shinji Ozawa,
``Emotion recognition and synthesis system on speech'',
IEEE ICMCS, Jun 1999.
E-mail:
moriyama@sak.iis.u-tokyo.ac.jp