Conference

Language Modeling:

I am working on context-aware neural network language modeling for robust speech recognition, especially for languages with limited data, as part of the IARPA-sponsored BABEL project. Adding context information beyond the previous words (such as location or speaker ID) to the language model helps improve the accuracy of a speech recognition system.

I am also working on optimizing neural network language models for keyword search. Language models are typically optimized for word error rate (WER), but an improvement in WER often does not translate into an improvement in keyword search. Using the word distribution of the language together with the distribution of words in the keywords, we can adjust the language model probabilities to favor keyword search.

Using the Web:

Along with my colleagues at CMU, I am working on automatically crawling, cleaning, and filtering web text to build language models for low-resource languages, with the goal of improving word error rate and keyword search.

Machine Translation:

I worked on improving the machine translation system at IBM, with a focus on Indian languages. We worked on the specific problems of handling word reordering and morphology in Hindi-English machine translation and obtained substantial improvements over the original system. I also worked on implementing an Italian-to-English translation assistance tool as a plug-in to Microsoft Word for the IBM GDLT group. The tool included a translation memory that stored recently translated, human-corrected sentences.