Language Technologies Institute
11-712: Self-Paced Laboratory


Algorithms for NLP:
Chart Parsing Module

Objective: Because of its efficiency and relative simplicity of implementation, the chart parsing algorithm is popular for practical NLP applications. The goal of this module is to teach you the fundamental chart parsing algorithm and its useful variations. You will implement the chart parsing algorithm in Common Lisp, test it on a variety of cases, and analyze its efficiency in terms of the number of parse operations carried out on each sentence.
Background
Material:
Allen, James (1995). Natural Language Understanding (Redwood City, CA: Benjamin/Cummings), pp. 53-60 (Section 3.4).

Copies of this material will be available in the LTI Lab (CyH 277).

  1. Read the Allen material to familiarize yourself with the chart parsing algorithm. Work through the given examples on paper until you are sure you understand how the algorithm works. If you have any questions, visit the instructor's office hours or schedule an appointment.
  2. Read the data structure specification for the module. To keep things simple, we have provided you with the specification for the basic data structures to be used in the Common Lisp implementation, including the internal data structures for the chart, arcs, and agenda, and also the format for the grammar and lexicon.
  3. Read the functional specification for the main program, bottom-up-chart. Note that your function must implement two of the variations that are mentioned briefly by Allen: breadth-first vs. depth-first search, and exhaustive search vs. earliest answer.
  4. Examine the given code, which includes the grammar, lexicon and a few utility functions which aren't central to the implementation exercise. You will include this code into your own solution by adding this line at the beginning of your file:

    (load "/afs/cs/project/cmt-55/lti/Lab/Modules/NLP-712/chart/given-code.lisp")

  5. Keeping in mind the data structure specification and given code, write up a pseudo-code version of how you intend to implement the parser in Common Lisp. This step is optional for experienced Lisp programmers who are comfortable with prototyping directly in Lisp. For those with less Lisp experience, this can be a useful extra step. If you understand the algorithm fully, but you're not exactly sure how to implement the algorithm in Lisp, now is the time to seek advice.
  6. Code the parser. Comment your code, so that it is easy to see which lines of code implement which parts of the algorithm as described by Allen. Name your solution file chart.lisp, and be sure to include a statement which loads the given code (see above).
  7. Test your solution. Load the test code file and run (run-tests 'bottom-up-chart)to see whether your implementation behaves as desired. The test code file defines a series of tests and the testing function run-tests for you.
  8. In a file called chart-questions.txt, answer the following questions:
    1. Run your parser on these three inputs:
      (bottom-up-chart '(the man holds the water in the large can) 'dfs t)
      (bottom-up-chart '(the man holds the water in the large can) 'bfs t)
      (bottom-up-chart '(the man holds the water in the large can) 'dfs nil)
      You should get three different answers; what are the parse trees printed in each case?
    2. Why are the answers different in the first two cases? Explain the difference between DFS and BFS in terms of the chart parser's operation on the two sentences; it is acceptable to include a relevant sample of your trace output, with an explanation of what the crucially different operations are and why they're different.
    3. What is the difference between the first answer and the third answer in terms of cost (parser operations)?
    4. Assume that the goal of a particular application is to favor local attachment rather than long-distance attachment, and to return the first parse found. (By local attachment, we mean that in structures like NP V NP PP, the PP is a child of the object NP rather than the V.) How would you set the searchmode and quitmode variables? As n gets larger for sentences of the form (NP V NP PP1 PP2 ... PPn), how would you characterize the difference in the number of parse operations for (quitmode = NIL) vs. (quitmode = T)?
  9. Hand in your two files (chart.lisp and chart-questions.txt) by copying them to the directory corresponding to your userid under:

    /afs/cs/project/cmt-55/lti/Lab/Modules/NLP-712/handin/

  10. Notify the instructor that your module is ready for evaluation.

Updated: 8-Sep-08 by alavie@cs.cmu.edu
Author: 4-Oct-96 by ehn@cs.cmu.edu