Kornel Laskowski
Former Student (c/o R Stern)
Language Technologies Institute
School of Computer Science
Carnegie Mellon University
kornel AT cs DOT cmu DOT edu
Carnegie Mellon University
407 S Craig St, SCR 218
Pittsburgh PA, 15213
USA
Phone: +1 412 268 2518
Fax: +1 412 268 5578
KTH Speech, Music and Hearing
Lindstedtsvägen 24
SE-100 44 Stockholm
Sweden
Phone: +46 8 790 97 51
Fax: +46 8 790 78 54
Mel Spectral Flux (MSF): A Normative Implementation in C
MSF, the negative logit of the cosine distance between consecutive Mel spectra, is an easy-to-compute instantaneous-frame correlate of speaking rate. The representation was developed with Anna Hjalmarsson, at the Department of Speech, Music and Hearing at KTH, to detect final lengthening. It is currently being explored to detect final creak, to characterize voice quality, to aid in second-language learning, and to quantify rate of speech (ROS) in general conversational settings.
To reproduce the results from (Hjalmarsson & Laskowski, 2011):
- obtain the file dealsnippets.tar.gz from Anna Hjalmarsson and place it in SOMEPATH
- obtain Makefile.INTERSPEECH2011 and place it in SOMEPATH
- change directory to SOMEPATH and run "make -f Makefile.INTERSPEECH2011 all"
- the Makefile downloads, builds and executes all required code to produce encapsulated PostScript files for Figures 1, 3 and 4, as well as the text for Table 2 and other miscellaneous numerical results found in the paper
References:
The MSF representation was introduced in
Last modified: Sat 17 Mar 2012 0020hrs GMT