Kornel Laskowski
Former Student (c/o R Stern)
Language Technologies Institute
School of Computer Science
Carnegie Mellon University
kornel AT cs DOT cmu DOT edu
Carnegie Mellon University
407 S Craig St, SCR 218
Pittsburgh, PA 15213
USA
Phone: +1 412 268 2518
Fax: +1 412 268 5578
KTH Speech, Music and Hearing
Lindstedtsvägen 24
SE-100 44 Stockholm
Sweden
Phone: +46 8 790 97 51
Fax: +46 8 790 78 54
Fundamental Frequency Variation (FFV): A Normative Implementation in C
The FFV representation is an instantaneous, per-frame representation of variation in fundamental
frequency. It is intended to indirectly model intonation trajectories at the sub-unit level, in the same
way that standard MFCC features indirectly model formant trajectories at the sub-unit level. The representation
was developed with Jens Edlund and Mattias Heldner at the Department of Speech, Music and Hearing at KTH.
Code which wraps ffv-1.x.x for use in existing signal processing environments includes:
References:
The FFV representation was introduced in
and an overview is available in
Several computational refinements are described in
Kornel Laskowski, Matthias Wölfel, Mattias Heldner, and Jens Edlund (2008),
Computing the Fundamental Frequency Variation Spectrum in Conversational Spoken Dialogue Systems.
In proceedings of the 155th Meeting of the Acoustical Society of America, 5th EAA Forum Acusticum, and 9th
SFA Congrès Français d'Acoustique (Acoustics2008),
Paris, France, 29 June - 04 July, pp. 3305-3310.
Kornel Laskowski, Mattias Heldner and Jens Edlund (2009),
A General-Purpose 32 ms Prosodic Vector for Hidden Markov Modeling.
In proceedings of the 10th Annual Conference of the International Speech Communication Association (INTERSPEECH2009),
Brighton, UK, 6-10 September, pp. 724-727.
Demonstrations of inferred model structure over the representation are available in
Kornel Laskowski, Jens Edlund, and Mattias Heldner (2008),
Learning Prosodic Sequences Using the Fundamental Frequency Variation Spectrum.
In proceedings of the 4th ISCA International Conference on Speech Prosody (SP2008),
Campinas, Brazil, 06-09 May.
Mattias Heldner, Jens Edlund, Kornel Laskowski, and Antoine Pelcé (2008),
Prosodic Features in the Vicinity of Silences and Overlaps.
To appear in proceedings of the 10th Nordic Conference on Prosody,
Helsinki, Finland, 04-06 August.
Kornel Laskowski, Mattias Heldner and Jens Edlund (2009),
Exploring the Prosody of Floor Mechanisms in English Using the Fundamental Frequency Variation Spectrum.
In proceedings of the 17th European Signal Processing Conference (EUSIPCO2009),
Glasgow, UK, 24-28 August, pp. 2539-2543.
Finally, application of the FFV representation to speaker recognition is described in
Last modified: Sun 20 Feb 2011 2315hrs GMT