11-755 MLSP Homework 1

11-755 MLSP Homework: Linear Algebra Refresher

Problem

Below are links to a piece of music and recordings of several notes. You are required to "transcribe" the music.

For transcription, you will have to determine the note or set of notes being played at each time.

Downloads

This is a recording of "Polyshka Polye", played on the harmonica. It was downloaded from YouTube (with permission from the artist).

Below is a set of notes from a harmonica.
Note    Wav File
E       e.wav
F       f.wav
G       g.wav
A       a.wav
B       b.wav
C       c.wav
D       d.wav
E2      e2.wav
F2      f2.wav
G2      g2.wav
A2      a2.wav

Download the following Matlab file: stft.m

Matlab instructions

You can read a wav file into Matlab as follows (in newer Matlab releases, where wavread is no longer available, use audioread instead):
[s,fs] = wavread('filename');
s = resample(s,16000,fs);

The recording of each note can be converted to a spectrum as follows:
spectrum = mean(abs(stft(s',2048,256,0,hann(2048))),2);
"spectrum" will be a 1025 x 1 vector.

The recording of the complete music can be read just as you read the notes. To convert it to a spectrogram, do the following:
sft = stft(s',2048,256,0,hann(2048));
sphase = sft./abs(sft);
smag = abs(sft);
"smag" will be a 1025 x K matrix (K is the number of spectral vectors in the matrix). We will also need "sphase" to reconstruct the signal later.
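The magnitude/phase split can be checked on a small synthetic matrix. This is only a minimal sketch: the 3 x 2 matrix below stands in for the output of stft, and the eps guard is an added assumption that is not part of the provided code. It avoids NaN phase values (0/0) wherever a spectral bin is exactly zero.

```matlab
% Synthetic stand-in for the complex STFT matrix (not real audio).
sft = [3+4i, 0; 1i, -2; 1-1i, 5];

smag = abs(sft);                  % magnitudes
sphase = sft ./ max(smag, eps);   % unit-magnitude phase; eps avoids 0/0 = NaN

% Recombining magnitude and phase recovers the original matrix.
sft_rebuilt = smag .* sphase;
recon_error = max(abs(sft_rebuilt(:) - sft(:)));
```

Without the guard, any all-zero bin (for example, digital silence) would propagate NaN into the reconstructed signal.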

Additional Info

Compute the spectrum for each of the notes. Compute the spectrogram matrix "smag" for the music signal. This matrix is composed of K spectral vectors. Successive vectors are 16 milliseconds apart (256 samples at 16 kHz), so each vector represents one 16 millisecond step of the signal.
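The 16 millisecond figure follows from the stft arguments, assuming the third argument (256) is the hop between frames and the second (2048) is the window length. That reading of the arguments is an assumption about stft.m, and the start-time formula below is approximate because it ignores any padding stft.m may apply.

```matlab
fs  = 16000;   % sampling rate after resampling (from the instructions)
hop = 256;     % assumed hop size (third argument to stft)
win = 2048;    % assumed analysis window length (second argument to stft)

hop_ms = 1000 * hop / fs;       % time step between spectral vectors, in ms
win_ms = 1000 * win / fs;       % duration covered by one analysis window, in ms

k = 25;                         % example column index
t_start = (k - 1) * hop / fs;   % approximate start time of column k, in seconds
```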

You may find projections, pseudo-inverses, and dot products useful. If you know of any other techniques, you can use those too. Tricks like thresholding (setting all values of some variable that fall below a threshold to 0) might also help.
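One way the pseudo-inverse and thresholding hints could fit together is sketched below on synthetic data. It is not the required solution: the names N, W, T and threshold are hypothetical, the "note spectra" are made up, and the cutoff of 0.5 is arbitrary. With real recordings the note spectra overlap, so the weights will be noisier and the threshold needs tuning.

```matlab
% Synthetic example: 3 hypothetical notes, 6 frequency bins, 4 frames.
N = [1 0 0; 0.5 0 0; 0 1 0; 0 0.5 0; 0 0 1; 0 0 0.5];  % 6 x 3 note spectra (columns)
T_true = [1 0 1 0; 0 1 1 0; 0 0 0 1];                  % 3 x 4 true note activity
smag = N * T_true;                                     % 6 x 4 synthetic "spectrogram"

% Least-squares weight of each note in each frame via the pseudo-inverse.
W = pinv(N) * smag;

% Thresholding: weights at or below the cutoff are treated as "note absent".
threshold = 0.5;              % hypothetical cutoff
T = double(W > threshold);    % binary note-by-frame matrix
```

Here the columns of N do not overlap, so T matches T_true exactly.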

The output should be a matrix of the form:
11000001...
00011011...
01110111...
...........

Each row of the matrix represents one note. Hence there will be as many rows as there are notes in Table 1 (the table of notes above).

Each column represents one of the columns in the spectrogram for the music. So if there are K vectors in the spectrogram, there will be K columns in your output.

Each entry denotes whether a note was found in that vector. For instance, if matrix entry (4,25) = 0, then the fourth note (A) was not found in the 25th spectral vector of the signal.
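To illustrate how the matrix is read, the following sketch looks up the notes marked as present in one column. The 3 x 4 matrix and the three note names are toy values chosen for the example, not results from the recording.

```matlab
% Toy transcription: rows are notes, columns are spectral vectors.
T = [1 1 0 0; 0 0 0 1; 0 1 1 1];
note_names = {'E', 'F', 'G'};       % row labels, in table order

k = 2;                              % column (spectral vector) to inspect
active = note_names(T(:, k) == 1);  % names of the notes found in column k
f_present = T(2, k);                % entry (2,k): 1 if F was found, 0 otherwise
```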

Evaluation

The results will be evaluated perceptually -- we'll compose the music using your transcription and notes from a different instrument. How good that sounds will determine the quality of the transcription.

Additional points

If you figure out how to do the composition yourself (using the harmonica notes) and return an example, you will get bonus points. The bonus points will be accounted for separately and added to your final score at the end of the semester (if it is less than 100).
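One possible starting point for the composition, shown on synthetic data, is to rebuild a magnitude spectrogram from the note spectra and the transcription and then reattach the phase. This is a sketch under assumptions: N, T, smag_hat and sft_hat are hypothetical names, the phases are made up, and the final step of inverting the complex spectrogram back to a waveform is not shown because it depends on the inverse STFT routine you have available.

```matlab
% Synthetic example: 2 hypothetical notes, 3 frequency bins, 3 frames.
N = [1 0; 0 1; 0.5 0.5];    % 3 x 2 note spectra (columns)
T = [1 0 1; 0 1 1];         % 2 x 3 binary transcription
sphase = exp(1i * [0 pi/2 pi; 0 0 0; pi 0 pi/2]);  % made-up unit-magnitude phases

smag_hat = N * T;               % magnitude spectrogram implied by the transcription
sft_hat = smag_hat .* sphase;   % complex spectrogram, ready for an inverse STFT
```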

Points

15 points, with 3 bonus points for synthesis.

Due date

The assignment is due in 2 weeks (September 15th). Use the week of the 6th (no MLSP class) wisely. Each day of delay thereafter will cost 0.5 points from your score.

Solutions may be emailed to me at "bhiksha@cs.cmu.edu". The message must have the subject line "MLSP assignment 1". It should include a 1 page report of what you did (can be longer), and the resulting matrix. You may also send me synthesized music (for the bonus points).