My proposal for a Thesis Proposal..

So far, all I know is that it is going to be on recognizing speech in the presence of music...

What I know of the problem:

Music does not have as catastrophic an effect on recognition performance as noise does.
Regular algorithms meant to handle speech in noise fail on speech in music.
The problems with music seem to be, variously, the harmonicity, the presence of transients and the colour (not necessarily in that order).
The `nice' properties of music that could potentially be exploited are its systematic nature (I don't call Heavy Metal music) and its stationarity in parts.
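
A minimal sketch of the harmonicity point above, under stated assumptions: the peak of the normalized autocorrelation of a frame is used as a crude harmonicity score. The 8 kHz sample rate, the lag range, the synthetic test signals and all function names are illustrative choices, not anything taken from real speech or music data.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; an assumed rate, chosen only for this illustration


def harmonicity(frame, min_lag, max_lag):
    """Return the largest normalized autocorrelation over the given lag range.

    Values near 1 mean the frame repeats itself (strongly periodic);
    values near 0 mean no repetition was found at any tested lag.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        acf = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, acf / energy)
    return best


def synthetic_note(f0, n_samples):
    """A steady synthetic 'note': three harmonics of f0 with decaying amplitudes."""
    return [
        sum(math.sin(2 * math.pi * f0 * k * n / SAMPLE_RATE) / k for k in (1, 2, 3))
        for n in range(n_samples)
    ]


def synthetic_noise(n_samples, seed=0):
    """Seeded white Gaussian noise, standing in for a non-harmonic interferer."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


# Search pitch periods between 400 Hz (20 samples) and 80 Hz (100 samples).
note_score = harmonicity(synthetic_note(200.0, 400), 20, 100)
noise_score = harmonicity(synthetic_noise(400), 20, 100)
```

On these synthetic signals the steady note scores far higher than the noise, which is one way to see why an interferer with pitch structure is a different problem from broadband noise. Real music with several instruments and transients would be much messier than this single sustained note.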

Possible approaches:

Characterize music with complex distributions; attempt regular ML/MAP/MMI algorithms on the model.
Attempt to use the obvious spectral differences between music and speech to isolate part of the effects of music.
Different acoustic parameters for modelling speech in music?
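
One heavily reduced reading of the first approach, as a toy sketch only: fit a distribution to a feature of each signal class, then decide by maximum likelihood (ML), or by maximum a posteriori (MAP) once class priors are added. A single 1-D Gaussian stands in here for the "complex distributions", MMI is not covered, and the feature values are seeded random placeholders with no empirical meaning. All names are hypothetical.

```python
import math
import random


def fit_gaussian(values):
    """Maximum-likelihood mean and variance of a 1-D Gaussian."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-6)  # floor the variance to avoid division by zero


def log_likelihood(x, model):
    """Log density of x under a Gaussian given as (mean, variance)."""
    mean, var = model
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def classify(x, models, log_priors=None):
    """ML decision when log_priors is None; MAP decision otherwise."""
    def score(label):
        prior = 0.0 if log_priors is None else log_priors[label]
        return log_likelihood(x, models[label]) + prior
    return max(models, key=score)


# Placeholder training features (for instance, some per-frame score in [0, 1]).
# These are synthetic draws, NOT measurements of real music or speech.
rng = random.Random(1)
music_feats = [rng.gauss(0.9, 0.05) for _ in range(200)]
speech_feats = [rng.gauss(0.5, 0.15) for _ in range(200)]

models = {
    "music": fit_gaussian(music_feats),
    "speech": fit_gaussian(speech_feats),
}
```

Calling `classify(x, models)` gives the ML decision for a new feature value `x`; passing a dictionary of log priors as the third argument turns it into a MAP decision. A serious version would need multivariate features and richer models such as mixtures, which this sketch deliberately leaves out.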

Ideas and suggestions are welcome. If you are within a 100-mile radius of Pittsburgh, I'll even treat you to a beer for a good suggestion.

I'm also in the process of picking out a committee who will be sympathetic to my approach to research (I'm not very enthusiastic about random experiments :-/). I have 5 weeks to get all of this AND the proposal done. Wish me luck..


Updated March 1996: bhiksha@cs.cmu.edu