Michael Finke

 

Interactive Systems Inc.

1900 Murray Ave

Pittsburgh, PA 15217, USA

Phone: (412) 421 2533
FAX: (412) 421 2533
Email: finkem@interactivesys.com
http://www.is.cs.cmu.edu/~finkem


 

Objective

Research position in computer science. Research interests in large vocabulary conversational speech recognition and the integration of speech recognition technologies into multimodal dialog systems.

Education

University of Karlsruhe, Karlsruhe, Germany
(1989-1993)
Diploma thesis: Theory and Parallel Implementation of Stochastic Neural Networks
Supervisors: Prof. W. Menzel and Dr. K. R. Müller
Grade: Very good (ranked #1)

Experience

Interactive Systems Inc., Pittsburgh, PA, USA

(1998-present)

Chief Technology Officer, Board Member

 

  • Topics: Real-time large vocabulary conversational speech recognition technology. To capture the large variability of conversational speech, a new framework based on an attribute-based instead of a phone-based representation of speech was introduced. The idea is to enable a much tighter coupling of the pronunciation, duration, and acoustic models. To allow for real-time recognition of highly spontaneous speech within this new modeling paradigm, a finite state single-prefix-tree one-pass time-synchronous decoder was developed.
  • Applications: English and Japanese speech recognition engine for large vocabulary dictation, conversational speech, and broadcast news on broadband and telephone channels. Design and implementation of a conversational dialog system.
  • Software: Designed and implemented the Interactive Systems speech recognition toolkit in collaboration with Jürgen Fritsch and Detlef Koll. Main focus was on attribute-based model induction, decision networks, hidden Markov models, and development of a finite state single-prefix-tree one-pass time-synchronous decoder.
  • Management: Chief Technology Officer heading a team of 12 research scientists and engineers at Interactive Systems.

 

Carnegie Mellon University, Pittsburgh, PA, USA
Computer Science Department
Interactive Systems Laboratories
(1997-1998)
Project Scientist 
(1995-1997)
Visiting Research Scientist 

  • Topics: Research in automated speech recognition systems. The main focus is on conversational speech recognition, speaker normalization and adaptation, speaking mode dependent pronunciation modelling, decoding, and confidence measures. Developed algorithms to alleviate the problem of training a speech recognizer on unreliable transcripts (flexible transcription alignment; FTA), to allow for speaker dependent/adapted training techniques for speaker independent recognizers (Label Boosting), and to support adaptation and confidence measure based adaptation (LMjitter confidence measure). Implemented a framework for speaking mode dependent pronunciation modelling (dynamic pronunciation modelling).
  • Applications: Evaluated ideas on the Switchboard and Callhome corpora in international large vocabulary conversational speech recognition competitions run by NIST; tied first in 1997 (other participants were BBN, Boston University, Cambridge University HTK, Dragon Systems, SRI).
  • Software: Designed and implemented the large vocabulary speech recognizer Janus-3, which proved to be a state-of-the-art recognizer (1st in Verbmobil, 1st on Switchboard in 1997, and tied 1st in Callhome English 1997).

University of Karlsruhe, Karlsruhe, Germany
Institut für Logik Komplexität und Deduktionssysteme
Interactive Systems Laboratories
Prof. Alexander Waibel
(1993-1997)
Graduate Research Assistant

  • Topics: Theoretical foundations of the modelling assumptions inherent in the design of neural network error and transfer functions. Information geometry and approximation theory of neural networks. Preprocessing techniques for online handwriting recognition systems and efficient decoding algorithms for the Janus-2 speech recognizer and the NPen handwriting recognizer.
  • Applications: Mostly online cursive handwriting recognition and large vocabulary speech recognition on the German Verbmobil corpus.
  • Software: Developed a library of efficient matrix-based neural network routines in C++ for the ISL handwriting recognizer NPen++ (Stefan Manke). Implemented a new decoder for the Janus-2 speech recognition engine that allowed for cross-word dependent acoustics and long-range language models.

Heidelberg Scientific Center of IBM; Heidelberg, Germany
Research Consultant (1989-1994)
Supervisor: Dr. Eric Keppel.

  • Topics: Large vocabulary dictation system in German (Tangora project)
  • Software: Language modelling tools and a neural network toolkit for acoustic modelling.

Finke Consulting; Cologne/Heidelberg, Germany
Consultant (1984-1995)
Focus: Computer science consulting services in areas related to distributed database systems and human language processing, offering both technical and management/development services.

Invitation

  • Invited attendee, 1993 Connectionist Models summer school, University of Colorado, Boulder.
  • Invited lecture, "Probabilistic Modelling of Neural Networks", Beckman Institute, University of Illinois at Urbana-Champaign, USA, 1993.
  • Invited lecture, "Stochastic Interpretation of Neural Networks", Gesellschaft für Mathematik und Datenverarbeitung, Berlin, Germany, January 1996.
  • Invited attendee, "CLSP Summer Workshop 1996 - Modelling Systematic Variations in Pronunciation Modelling via a Language Dependent Hidden Speaking Mode", Johns Hopkins University, Baltimore, USA
  • Invited attendee, "CLSP Summer Workshop 1997 - Pronunciation Modelling", Johns Hopkins University, Baltimore, USA

Personal

Native speaker of German. Citizenship: Germany. English fluent and literate.

Publications

Large Vocabulary Conversational Speech Recognition

  • Modeling and Efficient Decoding of Large Vocabulary Conversational Speech
    M.Finke, J.Fritsch, D.Koll, and A.Waibel
    Proceedings of Eurospeech’99
    Budapest, Hungary, September 1999
  • Modeling and Efficient Decoding of Large Vocabulary Conversational Speech
    M.Finke, J.Fritsch, D.Koll, and A.Waibel
    Proceedings of Hub5-E (Switchboard) LVCSR Workshop
    Linthicum Heights, Maryland, June 1999
  • Effective Structural Adaptation of LVCSR Systems to Unseen Domains Using Hierarchical Connectionist Acoustic Models
    J.Fritsch, M.Finke, A.Waibel
    Proceedings of International Conference on Spoken Language Processing (ICSLP'98)
    Sydney, Australia, December 1998
  • Phonetic-Distance-Based Hypothesis Driven Lexical Adaptation for Transcribing Multilingual Broadcast News
    P.Geutner, M.Finke, A.Waibel
    Proceedings of International Conference on Spoken Language Processing (ICSLP'98)
    Sydney, Australia, December 1998
  • Applying Divide and Conquer to Large Scale Pattern Recognition Tasks
    J.Fritsch, M.Finke
    in "Neural Networks: Tricks of the Trade"
    G.B. Orr, K.R. Müller (eds), Springer 1998
  • Structural Adaptation of Hierarchical Connectionist Acoustic Models to Unseen Domains
    J.Fritsch, M.Finke, A.Waibel
    Slides of talk presented at Hub5-E (Switchboard) LVCSR Workshop
    Linthicum Heights, Maryland, September 1998
  • Adaptive Vocabularies for Transcribing Multilingual Broadcast News
    P. Geutner, M. Finke, P. Scheytt
    Proceedings of International Conference on Acoustics Speech and Signal Processing  (ICASSP'98)
    Seattle, WA, May 1998
  • Pronunciation Modelling using a Hand-Labelled Corpus for Conversational Speech Recognition
    B. Byrne, M. Finke, S. Khudanpur, J. McDonough, H. Nock, M. Riley, M. Saraclar, C. Wooters, G. Zavaliagkos
    Proceedings of International Conference on Acoustics Speech and Signal Processing  (ICASSP'98)
    Seattle, WA, May 1998
  • ACID/HNN: Clustering Hierarchies of Neural Networks for Context Dependent Connectionist Acoustic Modeling
    J. Fritsch, M. Finke
    Proceedings of International Conference on Acoustics Speech and Signal Processing  (ICASSP'98)
    Seattle, WA, May 1998
  • Stochastic Pronunciation Modelling from Hand-Labelled Phonetic Corpora
    M. Riley, B. Byrne, M. Finke, S. Khudanpur, A.Ljolje, J. McDonough, H. Nock, M. Saraclar, C. Wooters, G. Zavaliagkos
    ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition
    May 3-7 1998, Kerkrade, Netherlands
  • Clarity: Inferring Discourse Structure from Speech
    M. Finke, M. Lapara, A. Lavie, L. Levin, L. Mayfield Tomokiyo, T. Polzin, K. Ries, A. Waibel, K. Zechner
    Proceedings of the AAAI 98 Spring Symposium.
  • Transcribing Multilingual Broadcast News using Hypothesis Driven Lexical Adaptation
    P. Geutner, M. Finke, P. Scheytt, A. Waibel, H. Wactlar
    Proceedings of 1998 DARPA Hub4 Workshop, Lansdowne, W.Virginia
  • Meeting Browser: Tracking and Summarizing Meetings
    A. Waibel, M. Finke, M. Bett
    Proceedings of 1998 DARPA Hub4 Workshop, Lansdowne, W.Virginia
  • Flexible Transcription Alignment
    Michael Finke and Alex Waibel
    1997 IEEE Workshop on Speech Recognition and Understanding, 
    Dec 14-17, Santa Barbara, California
  • Pronunciation Modelling for Conversational Speech Recognition: A Status Report from WS97
    B.Byrne, M.Finke, S.Khudanpur, J.McDonough, H.Nock, M.Riley, M.Saraclar, C.Wooters, G.Zavaliagkos
    1997 IEEE Workshop on Speech Recognition and Understanding, 
    Dec 14-17, Santa Barbara, California
  • Speaking Mode Dependent Pronunciation Modeling in Large Vocabulary Conversational Speech Recognition
    Michael Finke and Alex Waibel
    Eurospeech 97, Rhodos, Greece.
  • Improving Performance on Switchboard by Combining Hybrid HME/HMM and Mixture of Gaussian Acoustic Models
    Juergen Fritsch and Michael Finke
    Eurospeech 97, Rhodos, Greece.
  • Speaker Normalization and Speaker Adaptation - A Combination for Conversational Speech Recognition
    Puming Zhan, Martin Westphal, Michael Finke and Alex Waibel
    Eurospeech 97, Rhodos, Greece.
  • The JanusRTk Switchboard/Callhome 1997 Evaluation System
    Michael Finke, Juergen Fritsch, Petra Geutner, Klaus Ries and Torsten Zeppenfeld
    Proceedings of LVCSR Hub5-e Workshop, May 13-15, Baltimore, Maryland.
  • The Karlsruhe-Verbmobil Speech Recognition Engine
    Michael Finke, Petra Geutner, Hermann Hild, Thomas Kemp, Klaus Ries and Martin Westphal
    Proceedings of ICASSP 97, Muenchen, Germany
  • JANUS-III: Speech-To-Speech Translation in Multiple Languages
    Alon Lavie, Alex Waibel, Lori Levin, Michael Finke, Donna Gates, Marsal Gavalda, Torsten Zeppenfeld and Puming Zhan
    Proceedings of ICASSP 97, Muenchen, Germany
  • Wide Context Acoustic Modeling in Read vs. Spontaneous Speech
    Michael Finke and Ivica Rogina
    Proceedings of ICASSP 97, Muenchen, Germany
  • Recognition of Conversational Telephone Speech using the Janus Speech Engine
    Torsten Zeppenfeld, Michael Finke, Klaus Ries, Martin Westphal and Alex Waibel
    Proceedings of ICASSP 97, Muenchen, Germany
  • Context-Dependent Hybrid HME/HMM Speech Recognition using Polyphone Clustering Decision Trees
    Jürgen Fritsch, Michael Finke, Alex Waibel
    Proceedings of ICASSP 97, Muenchen, Germany
  • Adaptively Growing Hierarchical Mixtures of Experts
    Jürgen Fritsch, Michael Finke, Alex Waibel 
    Neural Information Processing Systems NIPS 96, Denver, USA
  • Mode Dependent Pronunciation Modeling in LVCSR
    Michael Finke
    Proceedings of the CLSP Workshop 1996, Johns Hopkins University, Baltimore.
  • Modeling Systematic Variations in Pronunciation via a Language-Dependent Hidden Speaking Mode
    M.Ostendorf, B.Byrne, M.Bacchiani, M.Finke, A.Gunawardana, K.Ross, S.Roweis, E.Shriberg, D.Talkin, A.Waibel, B.Wheatley and T.Zeppenfeld
    Proceedings of the ICSLP 1996
  • LVCSR Switchboard April 1996 Evaluation Report
    M.Finke and T.Zeppenfeld
    Proceedings of the LVCSR Hub 5 Workshop, April 29 - May 1, 1996
    Maritime Institute of Technology, Linthicum Heights, Maryland
  • Minimizing Search Errors Due to Delayed Bigrams in Real-Time Speech Recognition Systems
    M.Woszczyna, M.Finke
    Proceedings of the ICASSP 1996
  • JANUS-II: Translation of Spontaneous Conversational Speech
    A.Waibel, M.Finke, D.Gates, M.Gavaldà, T.Kemp, A.Lavie, L.Levin, M.Maier, L.Mayfield, A.McNair, I.Rogina, K.Shima, T.Sloboda, M.Woszczyna, T.Zeppenfeld, P.Zhan
    Proceedings of the ICASSP 1996

Handwriting Recognition

  • A Fast Search Technique for Large Vocabulary On-Line Handwriting Recognition
    S. Manke, M. Finke, and A. Waibel
    Proceedings of the International Workshop on Frontiers in Handwriting Recognition, Colchester, England, 1996. 
  • The Use of Dynamic Writing Information in a Connectionist On-Line Cursive Handwriting Recognition System
    S. Manke, M. Finke, and A. Waibel
    In Advances in Neural Information Processing Systems 7, 1995. 
  • NPen++: A Writer Independent, Large Vocabulary On-Line Cursive Handwriting Recognition System
    S. Manke, M. Finke, and A. Waibel
    Proceedings of the International Conference on Document Analysis and Recognition, Montreal, 1995. 
  • Combining Bitmaps with Dynamic Writing Information for On-Line Handwriting Recognition
    S. Manke, M. Finke, and A. Waibel
    Proceedings of the International Conference on Pattern Recognition, Jerusalem, 1994. 

Theory of Neural Networks

  • Statistical Theory of Overtraining - Is Cross-Validation Effective?
    Amari, S., Murata, N., Müller, K.-R., Finke, M., Yang, H.
    NIPS'95: Advances in Neural Information Processing Systems 8, D.S. Touretzky, M.C. Mozer and M.E. Hasselmo (eds.), MIT Press: Cambridge, MA., p. 176-182. 
  • Asymptotic Statistical Theory of Overtraining and Cross-Validation
    Amari, S., Murata, N., Müller, K.-R., Finke, M., Yang, H.
    University of Tokyo Technical Report METR 95-06; accepted at IEEE Transactions on Neural Networks
  • A Numerical Study on Learning Curves in Stochastic Multi-Layer Feed-Forward Networks
    Müller, K.-R., Finke, M., Schulten, K., Murata, N., Amari, S.
    University of Tokyo Technical Report METR 95-03, Neural Computation, 8, 1085-1106
  • On Large Scale Simulations for Learning Curves
    Müller, K.-R., Finke, M., Murata, N., Schulten, K., Amari, S.
    Proceedings of the CTP-PBSRI Workshop on Theoretical Physics: Neural Networks, The Statistical Mechanics Perspective, World Scientific, Singapore, 73-84 
  • On Large Scale Simulations for Learning Curves
    Müller, K.-R., Finke, M., Schulten, K., Murata, N., Amari, S.
    in Proceedings of the 6th Australian Conference on Neural Networks (ACNN'95), eds. Margaret Charles & Cyril Latimer, School of Electrical Engineering, University of Sydney, 45-48
  • Learning curves of faithful versus unfaithful neural network models
    Müller, K.-R., Bös, S., Finke, M., Murata, N.
    in NOLTA 95: Las Vegas Symposium on Nonlinear Theory and its Applications, 127-132
  • Estimating a-posteriori probabilities using stochastic network models
    Finke, M., Müller, K.-R.
    In Proc. of the 1993 Connectionist Models Summer School, Mozer, M., Smolensky, P., Touretzky, D.S., Elman, J.L. and Weigend, A.S. (Eds.), Hillsdale, NJ: Erlbaum Associates, 324-331
  • Constructing Neural Network Models
    Müller, K.-R., Finke, M.
    Extended Abstract in Proc. of the JNNS'94: 5th Annual Conference of the Japanese Neural Network Society, 94-95