Ashish Venugopal

MER versions

This page tracks versions of the minimum error rate (MER) training script, improved thanks in part to helpful suggestions from users, with some details on the relevant differences between versions.

optimizeV5IBMBLEU.m, optimizeV5NIST.m

Version 4 fixed a bug that limited the number of distinct error regions considered to 10.
Version 5 uses more sensible parameter values for NumberOfRandomTests and ConvergedLimit to avoid overfitting.
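
For readers unfamiliar with the term "error regions": in the usual MER line search, one lambda is varied while the others are held fixed, so each n-best candidate's model score is a straight line in that lambda. The top-scoring candidate only changes where lines cross, and the error count is constant between crossings. The sketch below is in Python rather than MATLAB and is not taken from the scripts above; the function name and the (slope, intercept, tag) input layout are assumptions made for illustration.

```python
import math

def upper_envelope(lines):
    """Return [(start, tag), ...]: the candidate that is best from `start` onward.

    lines: iterable of (slope, intercept, tag), one per n-best candidate
    (hypothetical layout). The first region always starts at -inf.
    """
    # Sort by slope; among equal slopes only the highest intercept can win.
    ordered = sorted(lines, key=lambda l: (l[0], l[1]))
    dedup = []
    for slope, icpt, tag in ordered:
        if dedup and dedup[-1][0] == slope:
            dedup[-1] = (slope, icpt, tag)
        else:
            dedup.append((slope, icpt, tag))

    hull = []  # entries: (region_start, slope, intercept, tag)
    for slope, icpt, tag in dedup:
        start = -math.inf
        while hull:
            s0, m0, b0, _ = hull[-1]
            # Point at which the new (steeper) line overtakes the last one.
            x = (b0 - icpt) / (slope - m0)
            if x <= s0:
                hull.pop()  # the last line is never on top; discard it
            else:
                start = x
                break
        hull.append((start, slope, icpt, tag))
    return [(start, tag) for start, _, _, tag in hull]
```

Each returned boundary starts one error region. A hard cap on the number of regions examined (such as the limit of 10 fixed in Version 4) would hide any better-scoring regions beyond the cap.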

optimizeV3IBMBLEU.m

Based on V2
Goes after each lambda based on how much gain is available on that parameter
Lambdas with most score gain are changed first
Once it converges (it is more greedy now), jump up to JUMP_PERC of the param range and try again; do this up to ConvergedLimit times
I have found that this script gives more stable param values over multiple iterations, and equal or higher final values
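
The V3 strategy described above can be sketched roughly as follows. This is a Python illustration, not the MATLAB script itself: `score` is a hypothetical callable (higher is better), a coarse grid stands in for the exact line search, and jumping from the converged point in a random direction is an assumption, since the notes do not say how the jump is drawn.

```python
import random

def greedy_search(score, init, ranges, jump_perc=0.1, converged_limit=3,
                  grid=21, seed=0):
    """Greedy per-lambda search with random jumps after convergence."""
    rng = random.Random(seed)
    current = list(init)
    best, best_score = list(init), score(init)

    for _ in range(converged_limit):
        cur_score = score(current)
        while True:
            # Find the single (lambda, value) change with the largest gain.
            top_gain, top_i, top_v = 0.0, None, None
            for i, (lo, hi) in enumerate(ranges):
                for k in range(grid):
                    v = lo + (hi - lo) * k / (grid - 1)
                    trial = current[:i] + [v] + current[i + 1:]
                    gain = score(trial) - cur_score
                    if gain > top_gain:
                        top_gain, top_i, top_v = gain, i, v
            if top_i is None:
                break  # no lambda offers any gain: converged
            current[top_i] = top_v  # lambda with most gain changes first
            cur_score = score(current)

        if cur_score > best_score:
            best, best_score = list(current), cur_score

        # Jump each lambda by up to jump_perc of its range, then retry.
        current = [
            min(hi, max(lo, c + rng.uniform(-1.0, 1.0) * jump_perc * (hi - lo)))
            for c, (lo, hi) in zip(current, ranges)
        ]
    return best, best_score
```

For example, `greedy_search(lambda p: -(p[0] - 1) ** 2 - (p[1] + 2) ** 2, [0.0, 0.0], [(-5, 5), (-5, 5)])` changes the second lambda first, because it offers the larger gain, and then the first.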

optimizeV3NIST.m

NIST version of above
Takes in info gains instead of correct counts. Make sure the length of the ref is set correctly.

optimizeV2IBMBLEU.m

Allows you to specify the best working parameters up to this point in the INIT param
NumRandomTests is the number of times random seeds are initialized for the search (I set this to 3 when using 1K-best lists)
PermutationsEpsilon indicates how close two scores can be before they are considered the same
ConvergedLimit: if the score diff is less than PermutationsEpsilon for ConvergedLimit steps through lambda, end the iteration for this random seed run
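
How these V2 parameters might interact is sketched below in Python (not the MATLAB script). Several points are assumptions: `score` is a hypothetical callable (higher is better), INIT is treated as one extra starting point alongside the random ones, a "step through lambda" is read as one single-lambda update, and a coarse grid replaces the exact line search.

```python
import random

def optimize_with_restarts(score, init, ranges, num_random_tests=3,
                           permutations_epsilon=1e-4, converged_limit=2,
                           grid=21, seed=0):
    """Coordinate search from INIT plus num_random_tests random starts."""
    rng = random.Random(seed)
    starts = [list(init)] + [
        [rng.uniform(lo, hi) for lo, hi in ranges]
        for _ in range(num_random_tests)
    ]
    best, best_score = list(init), score(init)

    for start in starts:
        lambdas, cur = list(start), score(start)
        flat_steps, i = 0, 0
        while flat_steps < converged_limit:
            lo, hi = ranges[i]
            # Best grid value for lambda i with the other lambdas fixed.
            trials = [
                lambdas[:i] + [lo + (hi - lo) * k / (grid - 1)] + lambdas[i + 1:]
                for k in range(grid)
            ]
            cand = max(trials, key=score)
            gain = score(cand) - cur
            if gain > 0:
                lambdas, cur = cand, cur + gain
            # Scores closer than epsilon count as "the same".
            flat_steps = flat_steps + 1 if gain < permutations_epsilon else 0
            i = (i + 1) % len(ranges)
        if cur > best_score:
            best, best_score = list(lambdas), cur
    return best, best_score
```

In this reading, a larger ConvergedLimit makes each run more patient before giving up, and more random starts make the search more thorough on the tuning set, which is consistent with the Version 5 note about choosing these values to avoid overfitting.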

optimizeV2NIST.m

NIST version of the V2 script. Instead of correct/suggest counts, pass in the info gains for the correct field