Language Technologies Institute
11-712: Self-Paced Laboratory


Algorithms for NLP:
GenKit Module Instructions

  1. You'll hand in a code file called genfns.lisp and a grammar file called gen.gra.

  2. To get started, load GenKit at the top of your genfns.lisp file:

    /afs/cs/project/cmt-55/lti/Lab/Modules/NLP-712/genkit/genkit.lsp
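
    As a minimal sketch, the first form in genfns.lisp could be a call to the standard Common Lisp LOAD function on the path above, so that GenKit is defined before any of your own code is read:

```lisp
;;; Top of genfns.lisp: load GenKit before anything that depends on it.
(load "/afs/cs/project/cmt-55/lti/Lab/Modules/NLP-712/genkit/genkit.lsp")
```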

  3. To compile your grammar and load it, you'll need to use COMPGEN and LOAD; NOTE that you should give the file name only, not the extension:

    (load (compgen "gen"))

    You should start with the given grammar file, which has some but not all of the functionality you need:

    /afs/cs/project/cmt-55/lti/Lab/Modules/NLP-712/genkit/given.gra

  4. You should use the given code in givenfns.lisp. This file defines some tracing utilities for you and gives the input/output specs for the subfunctions you need to write.

  5. You will need to extend the given grammar as follows:

    You can test individual f-structures by calling GENERATOR on them; the variable *sentences*, defined when you load the test-code file, can be used to reference particular examples, e.g.,

    >(nth 15 *sentences*)
    ((CAT V) (ROOT "eat") (VALENCY (*OR* TRANS INTRANS))
     (SUBJ ((CAT PRO) (PERSON 2) (NUMBER SG)))
     (OBJ ((DET ((CAT DET) (ROOT "the"))) (CAT N) (PERSON 3) (NUMBER PL)
           (ROOT "fig")))
     (PPADJUNCT
         ((TOPIC +) (CAT P) (ROOT "with")
          (OBJ ((CAT N) (PERSON 3) (NUMBER SG) (ROOT "relish"))))))
    
    >(generator (nth 15 *sentences*))
    "with relish you eat the figs"
    
    Your ultimate goal is to get coverage on the set of sentences in the test file, shown here in a correct transcript of a test run.
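
    When debugging, it can help to look inside an f-structure before passing it to GENERATOR. The f-structures printed above appear to be ordinary lists of (FEATURE VALUE) pairs; assuming that representation, a small lookup helper can be sketched as follows. FS-VALUE and *EXAMPLE-FS* are hypothetical names introduced for illustration and are not part of GenKit, which may provide its own accessors.

```lisp
;; Hypothetical helper (not part of GenKit): return the value of FEATURE in
;; the f-structure FS, assuming FS is a list of (FEATURE VALUE) pairs.
;; Returns NIL when the feature is absent.
(defun fs-value (feature fs)
  (second (assoc feature fs)))

;; A cut-down copy of the example f-structure shown above.
(defparameter *example-fs*
  '((cat v) (root "eat") (valency (*or* trans intrans))
    (subj ((cat pro) (person 2) (number sg)))))

(fs-value 'root *example-fs*)                     ; => "eat"
(fs-value 'person (fs-value 'subj *example-fs*))  ; => 2
```

    The same call should work on the real test data, e.g. (fs-value 'subj (nth 15 *sentences*)), provided the symbols are read in the same package as the test file.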

    You can use the built-in rule tracing by typing (genkit-tracing 2) at the Lisp prompt (be aware that the tracing mechanism described in the User's Manual is obsolete):

    >(genkit-tracing 2)
    NIL
    
    >(generator (nth 1 *sentences*))
    
    GenKit> <START> called
    GenKit>   <NP> called
    GenKit>     Rule 1 for <NP> returns NIL
    GenKit>     Rule 2 for <NP> returns NIL
    GenKit>     <PRO> called
    GenKit>       Rule 1 for <PRO> returns "they"
    GenKit>     <PRO> returns "they"
    GenKit>     Rule 3 for <NP> returns "they"
    GenKit>   <NP> returns "they"
    GenKit>   Rule 1 for <START> returns "they"
    GenKit> <START> returns "they"
    "they"
    

    You can also trace individual Lisp functions. GenKit produces a function with a GG- prefix for each non-terminal, so tracing that function shows which f-structure is being passed to the corresponding rules; e.g.,

    >(trace gg-pro)
    (GG-PRO)
    
    >(generator (nth 1 *sentences*))
    
    GenKit> <START> called
    GenKit>   <NP> called
    GenKit>     Rule 1 for <NP> returns NIL
    GenKit>     Rule 2 for <NP> returns NIL
      1> (GG-PRO ((CAT PRO) (PERSON 3) (NUMBER PL)))
    
    GenKit>     <PRO> called
    GenKit>       Rule 1 for <PRO> returns "they"
    GenKit>     <PRO> returns "they"  <1 (GG-PRO "they")
    
    GenKit>     Rule 3 for <NP> returns "they"
    GenKit>   <NP> returns "they"
    GenKit>   Rule 1 for <START> returns "they"
    GenKit> <START> returns "they"
    "they"
    
    (This example shows both the rule tracing and the Lisp tracing combined in one.)

  6. Test your code by loading the test file:

    /afs/cs/project/cmt-55/lti/Lab/Modules/NLP-712/genkit/test-code.lisp

    Then call the function (run-tests). Once you have fixed any remaining bugs, you're ready to comment your code and grammar and hand them in! (Note: place the output of (run-tests) in a file called test-output.txt in your handin directory.)
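
    One way to produce test-output.txt is the standard Common Lisp DRIBBLE function, which copies the interaction at the Lisp prompt to a file. This is only a sketch: it assumes your genfns.lisp and compiled grammar are already loaded and that (run-tests) prints its results to the terminal. What DRIBBLE records varies between Lisp implementations, so check the file before handing it in.

```lisp
;; Load the test file; per the steps above it defines *sentences* and RUN-TESTS.
(load "/afs/cs/project/cmt-55/lti/Lab/Modules/NLP-712/genkit/test-code.lisp")

(dribble "test-output.txt")  ; start recording the session to this file
(run-tests)
(dribble)                    ; stop recording and close the file
```

    Type these forms at the Lisp prompt from your handin directory so that the relative file name lands in the right place.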


27-Nov-96 by ehn@cs.cmu.edu