Steps for score-sampling training
---------------------------------

Terminology
-----------

MAXMAP: maximum a-posteriori decoding in terms of hypothesis
  segmentations, i.e., finding the target segmentation from the n-best
  list with the highest score and outputting the corresponding target
  sentence

SUMMAP: maximum a-posteriori decoding in terms of sentences: convert
  the scores of the target segmentations in the n-best list into
  probability space and sum up the probabilities of all target
  segmentations that represent the same sentence; output the sentence
  whose sum of segmentation probabilities is maximal

MBR: Minimum Bayes Risk decoding - output the sentence with the highest
  expected BLEU score, where the expected value is approximated using
  the segmentations and probabilities from the n-best list

MER training: minimum error-rate training - MAXMAP training via greedy
  line search, optimizing development-set performance, where the
  hypotheses are chosen via MAXMAP from an n-best list that is enriched
  with hypotheses from successive decoding steps with increasingly
  optimal parameter settings

Score sampling: non-local, zero-order optimization procedure allowing
  for MAXMAP, SUMMAP, and MBR training

Directories with the param files we are using for EUROPARL:

MBR score-sampling training on the development set:
  /nfs/islpc1_1/informedia1/SMT/ACL-Workshop/All-FR-EN/data_a/mbrbetterpruning-dev

MBR decoding on the test set:
  /nfs/islpc1_1/informedia1/SMT/ACL-Workshop/All-FR-EN/data_a/mbrbetterpruning-test

Score-sampling training procedure
---------------------------------

1- Split off a development corpus DC (typically 1000 sentences) from
   the training corpus.

2- Create an n-best list, preferably using Ashish's MER training on DC.
   Alternatively, create the n-best list by decoding DC with an initial
   guess of the parameters.
   Use settings similar to the ones in
   /nfs/islpc1_1/informedia1/SMT/ACL-Workshop/All-FR-EN/data_a/mbrbetterpruning-dev/params.ts

   > /nfs/dish2/ashishv/Production/STTK2K4/bin/TranslateTS -f params.ts

   If the settings in params.ts are used, you should now have a
   featsfull.opt and a candsfull.opt file. These are the only two files
   needed by score sampling.

3- Link the score-sampling Matlab files into the current directory,
   i.e., the one containing the featsfull.opt and candsfull.opt files:

   > ln -s /afs/cs.cmu.edu/user/zollmann/projects/smt/*.m .

4- Start score sampling. For pure STTK, start score sampling as
   follows:

   > matlab -nodisplay -nojvm -r "maptraining=0,pharaoh=0,reordering=true,feats_file='featsfull.opt',cands_file='candsfull.opt',mbrtrain"

   The parameters are:

   reordering   % true / false
   maptraining  % 0 (MBR), 1 (SUM-MAP), or 2 (MAX-MAP)
   pharaoh      % 0 (pure STTK), 1 (pure pharaoh), or
                %   2 (STTK with pharaoh phrase table)
   feats_file   % features file name
   cands_file   % candidates file name

   After each iteration, the program outputs the current parameter
   vector and the resulting BLEU score achieved on the development set.
   Additionally, the best parameter vector so far and its resulting
   BLEU score are output each time. You can change the maximum number
   of iterations (default: 150) in the Matlab source code (variable
   n_o_iterations). You can also finish the program at any time by
   hitting Ctrl-C if you are already satisfied with the best parameter
   setting found so far. The program also outputs the variables
   best_commandstr and best_thetastr. They will be relevant in the
   following steps.

5- Possibly re-run TranslateTS on the development corpus and then run
   score sampling again (not really necessary if you started with an
   OK guess or with the n-best list from Ashish's MER training).
   To do that, insert the optimal parameter settings output by score
   sampling in the variable best_thetastr into your params.ts file,
   replacing the corresponding lines with the previous parameter
   settings, and rerun TranslateTS and score sampling.

6- Translate and evaluate the sentences in the test corpus:
   Insert the optimal parameter settings output by score sampling in
   the variable best_thetastr into the params.ts file of your directory
   with the test-corpus files (replacing the corresponding lines with
   the previous parameter settings) and create featsfull.opt and
   candsfull.opt using TranslateTS. Now run the command output by score
   sampling in the variable best_commandstr. This returns the (global)
   BLEU scores of MAXMAP, SUMMAP, and MBR on the test corpus. The most
   interesting score for you is likely to be the one you optimized the
   score-sampling training for via the parameter maptraining (see
   above).

Minor notes
-----------

Each time the n-best reevaluation program MBRFast is run, two temporary
files are created, read in, or modified to speed up the program. If you
change parameter settings and redecode to create a new n-best list, you
should delete these temporary files; otherwise MBRFast might not work
correctly, treating the stale contents of the temp files as current
information even though the n-best list has changed.

> rm *.lossfile *.global

I know this is annoying, and we'll fix it soon.

If you have problems starting up Matlab, make sure that your X DISPLAY
variable is set correctly. Although Matlab is started with -nodisplay,
it still causes trouble when DISPLAY is not set correctly.

Reference
---------

Ashish Venugopal, Andreas Zollmann and Alex Waibel. 2005. Training and
Evaluating Error Minimization Decision Rules for Statistical Machine
Translation. In Proceedings of the 43rd Annual Meeting of the
Association for Computational Linguistics, Workshop on Building and
Using Parallel Corpora, Ann Arbor.
http://www.cs.cmu.edu/~zollmann/publications/acl2005.pdf
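
Appendix: sketch of the three decision rules
--------------------------------------------

For readers unfamiliar with the decision rules defined under
Terminology, here is a minimal, self-contained sketch of MAXMAP,
SUMMAP, and MBR decoding over a toy n-best list. It is not the Matlab
code used above: the n-best entries and the gain function are made-up
illustrations, and the unigram-overlap `gain` is only a hypothetical
stand-in for the sentence-level BLEU used in practice.

```python
import math
from collections import defaultdict

# Toy n-best list: (segmentation id, target sentence, model log-score).
# Note seg1 and seg2 are different segmentations of the SAME sentence.
nbest = [
    ("seg1", "the house is small", -2.0),
    ("seg2", "the house is small", -2.5),
    ("seg3", "the house is little", -2.2),
]

def maxmap(nbest):
    """MAXMAP: output the sentence of the single highest-scoring segmentation."""
    return max(nbest, key=lambda h: h[2])[1]

def summap(nbest):
    """SUMMAP: convert scores to probabilities, sum them over all
    segmentations of the same sentence, and output the sentence
    whose summed probability is maximal."""
    z = sum(math.exp(s) for _, _, s in nbest)
    sent_prob = defaultdict(float)
    for _, sent, s in nbest:
        sent_prob[sent] += math.exp(s) / z
    return max(sent_prob, key=sent_prob.get)

def mbr(nbest, gain):
    """MBR: output the candidate sentence maximizing the expected gain,
    with the expectation approximated over the n-best posterior."""
    z = sum(math.exp(s) for _, _, s in nbest)
    posterior = [(sent, math.exp(s) / z) for _, sent, s in nbest]
    def expected_gain(cand):
        return sum(p * gain(cand, ref) for ref, p in posterior)
    return max({sent for _, sent, _ in nbest}, key=expected_gain)

def gain(hyp, ref):
    """Hypothetical stand-in for sentence-level BLEU: unigram precision."""
    h, r = hyp.split(), ref.split()
    return sum(min(h.count(w), r.count(w)) for w in set(h)) / len(h)

print(maxmap(nbest))      # "the house is small"
print(summap(nbest))      # "the house is small"
print(mbr(nbest, gain))
```

On this toy list all three rules happen to agree; SUMMAP and MBR can
differ from MAXMAP when a sentence with a lower top score appears under
many segmentations, which is exactly why the maptraining parameter
matters.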