Natural Language Processing / Computational Linguistics
Natural Language Processing (NLP) researchers study fundamental problems in automating
textual and linguistic analysis, generation, representation, and
acquisition. While NLP is applied in a huge variety of language technologies, NLP
researchers focus on cross-cutting techniques. The closely related field of
Computational Linguistics (CL) aims
to model aspects of the human language faculty using formal computational
models (of both symbolic and statistical varieties), with the goal of understanding
the nature of language as a phenomenon.
Carnegie Mellon boasts one of the world's largest and most diverse collections of NLP/CL researchers, courses, and research projects.
Faculty and Interests
- Jaime Carbonell, LTI (Director) & CSD (Ph.D., Yale, 1979)
- Justine Cassell, HCII (Director) & LTI (Ph.D., Chicago, 1991)
- William Cohen, LTI & MLD (Ph.D., Rutgers, 1990): information extraction, text classification, analysis of semi-structured documents, applications of machine learning to language problems
- Chris Dyer, LTI & MLD (Ph.D., U. Maryland, 2010): statistical machine translation, Bayesian methods in NLP, big data
- Scott Fahlman, LTI & CSD (Ph.D., MIT, 1977): knowledge representation and reasoning, knowledge-based NLP
- Robert Frederking, LTI (Ph.D., CMU, 1986): email understanding, translation for endangered languages
- Eduard Hovy, LTI (Ph.D., Yale, 1987)
- Alon Lavie, LTI (Ph.D., CMU, 1996): parsing algorithms of all flavors and varieties, grammar formalisms, synchronous context-free grammars and related formalisms, syntax-driven machine translation
- Lori Levin, LTI (Ph.D., MIT, 1986)
- Brian MacWhinney, Psychology (Ph.D., U.C. Berkeley, 1974): child language learning, second language learning, corpus analysis, corpus data mining, conversation analysis, bilingual corpus analysis, morphosyntactic analysis and tagging, self-organizing feature maps, neural networks, web-based collaborative commentary
- Teruko Mitamura, LTI (Ph.D., U. Pittsburgh, 1989): multilingual question answering, corpus annotation, Japanese NLP, language tutoring
- Tom Mitchell, MLD & LTI (Ph.D., Stanford, 1979): machine learning, artificial intelligence, cognitive neuroscience
- Eric Nyberg, LTI (Ph.D., CMU, 1992)
- Kemal Oflazer, LTI & CMU-Qatar (Ph.D., CMU, 1987)
- Carolyn Penstein Rosé, LTI & HCII (Ph.D., CMU, 1997): linguistic structure of conversation, distinguishing features of different forms of conversation, adaptation of techniques from expository text analysis to conversational text
- Roni Rosenfeld, LTI, MLD & CSD (Ph.D., CMU, 1994)
- Noah Smith, LTI & MLD (Ph.D., Johns Hopkins, 2006): statistical NLP, machine learning for structured data, computational linguistics, computational social science (leaving for U.W. in 2015)
- Eric Xing, MLD, LTI & CSD (Ph.D., Rutgers, 1999; Ph.D., U.C. Berkeley, 2004)
Courses
Groups and Current Projects
- AVENUE - statistical syntax-driven machine translation, applied to resource-poor (including endangered) languages and resource-rich languages
- CHILDES - child language data exchange system (part of TalkBank)
- GRASP - parsing the CHILDES corpus
- JAVELIN - open domain, multilingual question answering (English, Japanese, Chinese)
- Minorthird - an open-source Java package of information extraction and text classification learning tools
- Noah's ARK - statistical NLP research group
- RADAR - building and evaluating a learning cognitive assistant
- Scone - knowledge representation for meaning-based understanding of text
- SIDE - a summarization integrated development environment
- TagHelper - a semi-automatic tool that facilitates reliable content analysis of corpus data
- TalkBank - communication datasets and software
See Also
Page maintained by Noah Smith, with much help.