
Advanced Statistical Language Processing: Reading the Web (10-709)

Homework 2
Due October 1, 2009

Tom Mitchell
Machine Learning Department
School of Computer Science, Carnegie Mellon University

Fall 2009


It's time to propose your class project. Please turn in a 3-6 page final project proposal that includes
Please note: The fifth point here is extremely important -- it is 90% of the work you need to do for this homework: during the coming week, develop the strongest evidence you can that your proposed project is doable and valuable.

For example, suppose you propose semi-supervised learning of parse-tree-based extractors of relations between noun phrases. Then by next week you should at least have run a parser on a small collection of sentences and written some computer-interpretable patterns that are examples of what you believe your algorithm will be able to learn. Use this experience to gain insight into the most problematic issues that will arise (parser errors? difficulty in choosing a representation for learned patterns?), and then describe these insights in your proposal.

You should also do a literature search to identify the 3-6 most closely related publications in this area: who else has tried this, and how? Which of their ideas will you build on, and what is the most distinctive new thing you will try in your approach?
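To make the parser example above concrete, here is a minimal sketch of what one "computer-interpretable pattern" might look like. Everything in it is an assumption made for illustration: the dependency edges are written by hand as a stand-in for real parser output, the label names (nsubj, dobj, and so on) follow one common dependency convention, and the names Edge, Pattern, apply_pattern and extract_all are hypothetical. It is not a prescribed representation; your own parser and pattern language may look quite different.

```python
from typing import List, NamedTuple, Optional, Tuple


class Edge(NamedTuple):
    head: str   # governing word
    label: str  # dependency label, e.g. "nsubj"
    dep: str    # dependent word


# Hand-written stand-in for parser output (assumption: a real parser
# would produce these edges, possibly with errors).
PARSES = {
    "Google acquired YouTube": [
        Edge("acquired", "nsubj", "Google"),
        Edge("acquired", "dobj", "YouTube"),
    ],
    "YouTube was founded in 2005": [
        Edge("founded", "nsubjpass", "YouTube"),
        Edge("founded", "auxpass", "was"),
        Edge("founded", "prep_in", "2005"),
    ],
}


class Pattern(NamedTuple):
    relation: str    # name given to the extracted relation
    verb: str        # verb that must head both arguments
    arg1_label: str  # dependency label of the first argument
    arg2_label: str  # dependency label of the second argument


def apply_pattern(pattern: Pattern,
                  edges: List[Edge]) -> Optional[Tuple[str, str, str]]:
    """Return (relation, arg1, arg2) if the pattern matches, else None."""
    # Collect the dependents of the pattern's verb, keyed by label.
    args = {e.label: e.dep for e in edges if e.head == pattern.verb}
    if pattern.arg1_label in args and pattern.arg2_label in args:
        return (pattern.relation,
                args[pattern.arg1_label],
                args[pattern.arg2_label])
    return None


def extract_all(pattern: Pattern, parses: dict) -> List[Tuple[str, str, str]]:
    """Apply one pattern to every parsed sentence and keep the matches."""
    results = []
    for edges in parses.values():
        match = apply_pattern(pattern, edges)
        if match is not None:
            results.append(match)
    return results


# An example of the kind of pattern a learner might eventually induce.
ACQUIRED = Pattern("acquired", "acquired", "nsubj", "dobj")

if __name__ == "__main__":
    print(extract_all(ACQUIRED, PARSES))
```

Even a toy like this raises the kinds of questions the proposal should discuss: what happens when the parser attaches an argument to the wrong head, and whether a pattern tied to a single surface verb form is too specific to be worth learning.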

Feel free to speak with Tom, Justin, Andy, Burr or Estevam if that will help you.  This is a big assignment, so start early.

On working alone versus in pairs: It's fine to work either in pairs or alone on projects for this class.