11-756 / 18799D Design and Implementation of ASR Systems
11-756/18799D ASR: Assignment 2, Feature Computation
Problem
Implement a routine for computing mel-frequency cepstral (MFC) features from audio.
- Record multiple instances of the digits zero, one, two, etc., preferably using the code from assignment 1. Recordings must be sampled at 16 kHz in 16-bit PCM format.
- Compute log spectra and cepstra for the recordings. The cepstra must contain 13 cepstral values for each analysis frame. For the first test, the log spectra from which the cepstra are computed must be obtained using 40 mel filters spanning the frequencies 100 Hz to 7000 Hz.
- Visualize the MFCCs and log spectra as spectrograms, e.g. using MATLAB.
- Visually compare spectrographic representations of mel log spectra of different instances of the same word.
- Repeat the above with the number of mel filters set to 30 and 25.
Reducing the number of mel filters should result in somewhat blurrier mel log spectra.
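The steps above can be sketched in Python with NumPy. This is only one possible layout, not a reference solution. The assignment fixes the 16 kHz rate, the 13 cepstra, the filter counts (40, 30, 25) and the 100 Hz to 7000 Hz span; everything else is an assumption made for illustration: the function names, the 25 ms frame and 10 ms hop, the 512-point FFT, the 0.97 pre-emphasis, the Hamming window, the 2595*log10(1 + f/700) mel formula, and the unnormalised DCT-II.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate, f_low=100.0, f_high=7000.0):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    # n_filters + 2 edges: filter i rises from edge i to i+1, falls to i+2.
    edges_hz = mel_to_hz(np.linspace(hz_to_mel(f_low), hz_to_mel(f_high),
                                     n_filters + 2))
    bin_hz = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bank = np.zeros((n_filters, bin_hz.size))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i:i + 3]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mfcc(signal, sample_rate=16000, n_filters=40, n_ceps=13,
         frame_ms=25.0, hop_ms=10.0, n_fft=512, preemph=0.97):
    """Return (log_mel, cepstra), each with one row per analysis frame."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis: first-order high-pass filter.
    signal = np.append(signal[0], signal[1:] - preemph * signal[:-1])

    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one analysis frame")
    n_frames = 1 + (len(signal) - frame_len) // hop

    # Slice overlapping frames and apply a Hamming window to each.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)

    # Power spectrum (frames are zero-padded to n_fft), then mel log spectrum.
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_mel = np.log(power @ bank.T + 1e-10)  # small floor avoids log(0)

    # Unnormalised DCT-II of the log spectrum; keep the first n_ceps terms.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (n + 0.5) / n_filters)
    cepstra = log_mel @ dct.T
    return log_mel, cepstra
```

Calling `mfcc(samples, n_filters=30)` or `mfcc(samples, n_filters=25)` repeats the computation with fewer filters. Both returned arrays have frames along the first axis, so they need to be transposed if time should run along the horizontal axis of a spectrogram plot.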
You may use open-source packages or modules for the actual code. The "wave2feat" tool in the CMU Sphinx open-source package is one option for computing mel cepstral features from audio.
Dan Ellis also has some nice MATLAB code on his web page that you may use.
A TEST FILE: test.wav
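Before computing features, it may help to confirm that a recording or the test file really matches the required format. The sketch below uses only Python's standard `wave` and `array` modules. The function name `read_pcm16` is hypothetical, and keeping only the first channel of a multi-channel file is an assumption, since the assignment does not say whether recordings are mono.

```python
import array
import sys
import wave

def read_pcm16(path, expected_rate=16000):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()      # bytes per sample
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())

    if rate != expected_rate:
        raise ValueError(f"expected {expected_rate} Hz, found {rate} Hz")
    if width != 2:
        raise ValueError(f"expected 16-bit samples, found {8 * width}-bit")

    samples = array.array("h")          # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()              # WAV sample data is little-endian
    # Assumption: for multi-channel files, keep only the first channel.
    return samples[::channels], rate
```

For example, `samples, rate = read_pcm16("test.wav")` returns the samples as signed 16-bit integers, or raises `ValueError` if the file is not 16 kHz, 16-bit PCM.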