For fine-tuning, PDNN currently supports four types of learning rate schedules. During pre-training (e.g., of SdAs), we normally adopt a constant learning rate throughout the training process. Learning rates are passed to PDNN via the --lrate option. The first field of the string must be one of the capital tags C, D, MD, or FD, indicating the schedule type; the remaining fields are defined in the table below. "Validation error reduction" means the reduction of the error rate in percentage points. For example, a reduction of 0.05 means 49.00% --> 48.95%.
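To make the string format concrete, here is a minimal sketch of how an --lrate value could be split into its fields. It is illustrative only, not PDNN's actual parser; the field names (l, c, dh, ds, ml, eh, es, n) follow the table below.

```python
def parse_lrate(spec):
    """Split an --lrate string into its type tag and numeric fields (sketch only)."""
    fields = spec.split(":")
    tag = fields[0]  # "C", "D", "MD", or "FD"
    if tag == "C":                      # C:l:n
        return {"type": tag, "l": float(fields[1]), "n": int(fields[2])}
    if tag in ("D", "MD"):              # D:l:c:dh,ds:n  /  MD:l:c:dh,ml:n
        dh, stop = fields[3].split(",")
        key = "ds" if tag == "D" else "ml"
        return {"type": tag, "l": float(fields[1]), "c": float(fields[2]),
                "dh": float(dh), key: float(stop), "n": int(fields[4])}
    if tag == "FD":                     # FD:l:c:eh,es
        eh, es = fields[3].split(",")
        return {"type": tag, "l": float(fields[1]), "c": float(fields[2]),
                "eh": int(eh), "es": int(es)}
    raise ValueError("unknown learning rate type: " + tag)

print(parse_lrate("D:0.08:0.5:0.05,0.05:8"))
# {'type': 'D', 'l': 0.08, 'c': 0.5, 'dh': 0.05, 'ds': 0.05, 'n': 8}
```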
| Type | Meaning | Comments |
|------|---------|----------|
| constant | `--lrate="C:l:n"` e.g. `C:0.08:15` | Run `n` iterations with the learning rate `l` unchanged. |
| newbob | `--lrate="D:l:c:dh,ds:n"` e.g. `D:0.08:0.5:0.05,0.05:8` | Start with the learning rate `l`; if the validation error reduction between two consecutive epochs is less than `dh`, the learning rate is scaled by `c` during each of the remaining epochs. Training finally terminates when the validation error reduction between two consecutive epochs falls below `ds`. `n` is the minimum number of epochs after which scaling may begin. |
| min-rate newbob | `--lrate="MD:l:c:dh,ml:n"` e.g. `MD:0.08:0.5:0.05,0.0002:8` | The same as newbob, except that training terminates when the learning rate falls below `ml`. |
| fixed newbob | `--lrate="FD:l:c:eh,es"` e.g. `FD:0.08:0.5:10,6` | Start with the learning rate `l`; after `eh` epochs, the learning rate starts to be scaled by `c` each epoch. Training terminates after another `es` epochs once scaling has started. |
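Since newbob and its variants drive both decay and stopping off the validation error, the following sketch shows the control flow of the plain newbob (`D`) schedule. It is a simplified illustration, not PDNN's implementation; `run_epoch` is a hypothetical callback that trains one epoch at the given rate and returns the validation error in percent.

```python
def newbob_schedule(run_epoch, l, c, dh, ds, n):
    """Decay-on-plateau loop for D:l:c:dh,ds:n (sketch only)."""
    lrate = l
    epoch = 0
    prev_error = run_epoch(lrate)  # baseline validation error
    decaying = False
    while True:
        epoch += 1
        error = run_epoch(lrate)
        reduction = prev_error - error  # in percentage points, e.g. 49.00 -> 48.95 is 0.05
        prev_error = error
        if decaying:
            # Once scaling has begun, stop when improvement falls below ds;
            # otherwise scale the rate again for the next epoch.
            if reduction < ds:
                break
            lrate *= c
        elif epoch >= n and reduction < dh:
            # Scaling may only start after the minimum epoch count n.
            decaying = True
            lrate *= c
    return lrate

# Toy usage: errors improve quickly, then stall, triggering decay and stopping.
errs = iter([50.0, 48.0, 46.5, 46.47, 46.45, 46.44, 46.44])
final = newbob_schedule(lambda lr: next(errs), l=0.08, c=0.5, dh=0.05, ds=0.05, n=2)
print(final)  # 0.04: the rate was halved once before training stopped
```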