MER versions
This page tracks versions of the MER training script (improved thanks in part to helpful suggestions from users), with some details on the relevant differences.
optimizeV5IBMBLEU.m, optimizeV5NIST.m
- Version 4 fixed a bug that limited the number of distinct error regions considered to 10
- Version 5 uses more sensible values for NumberOfRandomTests and ConvergedLimit to avoid overfitting
optimizeV3IBMBLEU.m
- Based on V2
- Goes after each lambda based on how much gain is available on that parameter
- Lambdas with most score gain are changed first
- Once it converges (it's more greedy now), jump up to JUMP_PERC of the param range and try again; do this up to ConvergedLimit times
- I have found this script to give more stable param values over multiple iterations, and equal or higher final values
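The V3 search order described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a port of the .m script: the function name gain_ordered_search, the fixed grid used for each line search (real MER walks the error regions exactly), the toy score function, and the choice to jump from the last converged point are all hypothetical. Only JUMP_PERC and ConvergedLimit come from the notes above.

```python
import random

def gain_ordered_search(score, init, lo, hi, jump_perc=0.1,
                        converged_limit=3, grid=21, eps=1e-9, seed=0):
    """Greedy search that always moves the lambda with the largest gain."""
    rng = random.Random(seed)

    def best_move(lams, i):
        # Best value for lambda i with the others held fixed.
        # Assumption: a coarse grid stands in for the exact line search.
        best_v, best_s = lams[i], score(lams)
        for k in range(grid):
            v = lo + (hi - lo) * k / (grid - 1)
            s = score(lams[:i] + [v] + lams[i + 1:])
            if s > best_s:
                best_v, best_s = v, s
        return best_v, best_s

    lams = list(init)
    best, best_score = list(init), score(init)
    for attempt in range(converged_limit + 1):
        if attempt > 0:
            # Jump by up to jump_perc of the param range, then try again.
            span = jump_perc * (hi - lo)
            lams = [min(hi, max(lo, x + rng.uniform(-span, span)))
                    for x in lams]
        while True:
            cur = score(lams)
            gains = []
            for i in range(len(lams)):
                v, s = best_move(lams, i)
                gains.append((s - cur, i, v))
            gain, i, v = max(gains)   # lambda with the most gain goes first
            if gain <= eps:
                break                 # converged for this attempt
            lams[i] = v
        cur = score(lams)
        if cur > best_score:
            best, best_score = list(lams), cur
    return best, best_score

# Toy usage: a smooth score with its peak at (0.5, -0.25).
toy_score = lambda l: -((l[0] - 0.5) ** 2 + (l[1] + 0.25) ** 2)
best_lams, best_val = gain_ordered_search(toy_score, [0.0, 0.0], -1.0, 1.0)
```

Because the best point seen so far is kept across jumps, the returned score is never lower than the score of the initial lambdas.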
optimizeV3NIST.m
- NIST version of above
- Takes in info gains instead of correct counts. Make sure the length of the ref is set correctly
optimizeV2IBMBLEU.m
- Allows you to specify the best working parameters found so far in the INIT param
- NumRandomTests is the number of times random seeds are initialized for the search (I set this to 3 using 1K best lists)
- PermutationsEpsilon indicates how close two scores can be before they are considered the same
- ConvergedLimit: if the score diff is less than PermutationsEpsilon for ConvergedLimit steps through the lambdas, end the iteration for this random seed run
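How the three V2 parameters interact can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the script itself: the function name random_restart_search, the grid line search, and the decision to run one search from INIT before the random starts are hypothetical. The parameter names mirror NumRandomTests, PermutationsEpsilon and ConvergedLimit from the notes above.

```python
import random

def random_restart_search(score, init, lo, hi, num_random_tests=3,
                          permutations_epsilon=1e-4, converged_limit=2,
                          grid=21, seed=0):
    """Cycle through the lambdas from several starts; keep the best result."""
    rng = random.Random(seed)
    n = len(init)

    def line_search(lams, i):
        # Assumption: a coarse grid replaces the exact error-region search.
        best_v, best_s = lams[i], score(lams)
        for k in range(grid):
            v = lo + (hi - lo) * k / (grid - 1)
            s = score(lams[:i] + [v] + lams[i + 1:])
            if s > best_s:
                best_v, best_s = v, s
        return best_v, best_s

    best, best_score = list(init), score(init)
    # Assumption: start once from INIT, then from num_random_tests random points.
    starts = [list(init)] + [[rng.uniform(lo, hi) for _ in range(n)]
                             for _ in range(num_random_tests)]
    for lams in starts:
        cur, stalled, i = score(lams), 0, 0
        while stalled < converged_limit:
            v, s = line_search(lams, i)
            # Scores closer than permutations_epsilon count as the same.
            stalled = stalled + 1 if s - cur < permutations_epsilon else 0
            lams[i], cur = v, s
            i = (i + 1) % n           # step through the lambdas in turn
        if cur > best_score:
            best, best_score = list(lams), cur
    return best, best_score

# Toy usage: a smooth score with its peak at (0.5, -0.25).
toy_score = lambda l: -((l[0] - 0.5) ** 2 + (l[1] + 0.25) ** 2)
best_lams, best_val = random_restart_search(toy_score, [0.0, 0.0], -1.0, 1.0)
```

In this sketch, a ConvergedLimit smaller than the number of lambdas can end a run before every lambda has been tried, so it is usually set to at least that number.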
optimizeV2NIST.m
- NIST version of the V2 script. Instead of correct/suggest counts, pass in the info gains for the correct field