My proposal for a Thesis Proposal..
So far, all I know is that it is going to be on recognizing speech in the presence of music...
What I know of the problem:
- Music does not have as catastrophic an effect on recognition performance as
noise does.
- Regular algorithms meant to handle speech in noise fail on speech in
music.
- The problems with music seem to be, variously, the harmonicity, the
presence of transients and the colour (not necessarily in that order).
- The `nice' properties of music that could potentially be taken advantage
of are its systematic nature (I don't call Heavy Metal music) and its
stationarity in parts.
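To make "harmonicity" a little more concrete, here is a minimal sketch of one crude way it might be measured: the peak of the normalized autocorrelation of a frame over a range of candidate pitch lags. This is only an illustration under stated assumptions, not a method this page commits to. The function name `harmonicity`, the 8 kHz sampling rate, the 400-sample frame and the lag range are all hypothetical choices made for the example.

```python
import math
import random


def harmonicity(frame, min_lag, max_lag):
    """Peak normalized autocorrelation of `frame` over lags min_lag..max_lag.

    Values near 1 suggest a strongly periodic (harmonic) frame;
    values near 0 suggest a noise-like frame. Hypothetical helper.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]  # remove DC offset
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        a = x[:n - lag]   # frame without its last `lag` samples
        b = x[lag:]       # frame shifted by `lag` samples
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        if den > 0.0:
            best = max(best, num / den)
    return best


# Assumed test signals: a 200 Hz tone and white noise, both at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * t / rate) for t in range(400)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]

tone_score = harmonicity(tone, 20, 100)
noise_score = harmonicity(noise, 20, 100)
print(tone_score > noise_score)
```

A pure tone is the extreme case; real music mixes several harmonic sources with transients, so a single score like this would at best be one feature among several.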
Possible approaches:
- Characterize music with complex distributions; attempt regular ML/MAP/MMI
algorithms on the model.
- Attempt to use the obvious spectral differences between music and speech
to isolate part of the effects of music.
- Different acoustic parameters for modelling speech in music??
Ideas and suggestions welcome. If you are within a 100-mile radius of Pittsburgh,
I'll even treat you to a beer if you have a good suggestion for me.
I'm also in the process of picking out a committee that will be sympathetic to
my approach to research (I'm not very enthusiastic about random experiments :-/).
I have 5 weeks to get all of this AND the proposal done. Wish me luck.
Updated March 1996: bhiksha@cs.cmu.edu