Tutorial: Machine Learning for the Computational Humanities

Digital Humanities and Computer Science Colloquium/TEI Conference
Northwestern University, Orrington Hotel (Bonbright Room)
Friday, October 24, 2014
1pm-5pm

David Bamman
School of Computer Science
Carnegie Mellon University
dbamman@cs.cmu.edu

Free and open to the public, but register to reserve a place by emailing your full name and institution (if applicable) to teiconference2014@gmail.com with the subject "Register | Machine Learning for the Computational Humanities Tutorial."

Slides


Machine learning is a branch of computer science that drives much of the exciting work in the computational corners of the humanities and social sciences; its methods underlie topic models, classifiers, clustering algorithms, syntactic parsers and named entity recognizers (among many others). A variety of tools like MALLET and Weka have made the application of machine learning techniques widespread, but it's easy to see them as black boxes; the goal of this tutorial is to break open these boxes and have a look inside.

We'll survey a range of existing methods in machine learning, and answer the following questions for each one:

Machine learning techniques that we'll cover include:

By the end of the tutorial, participants will be able to explain how each of these methods works from a high-level perspective, understand when each one is (and is not) a good choice, and know where to go for more information. No prior computational background is required. This tutorial is free and open to the public.

Bio

David Bamman is a PhD student in Computer Science at Carnegie Mellon University. His research applies natural language processing and machine learning to empirical questions in the humanities and social sciences, including modeling linguistic variation (ACL 2014, Journal of Sociolinguistics 2014), inferring character types in movie plot summaries (ACL 2013) and novels (ACL 2014), inferring social rank in an Old Assyrian trade network (DH 2013) and detecting censorship in Chinese social media (First Monday 2012). David designed and co-taught an interdisciplinary (English/Computer Science) course at CMU on "Digital Literary and Cultural Studies," for which he received Carnegie Mellon's 2014 Alan J. Perlis Graduate Student Teaching Award. Prior to CMU, David was a senior researcher at the Perseus Project.