For fine-tuning, PDNN currently supports four types of learning rate schedules. During pre-training (e.g., of SdAs), we normally adopt a constant learning rate throughout the training process. Learning rates are passed to PDNN via the --lrate option. The first field of the string must be one of the capital tags C, D, MD, or FD, indicating the schedule type; the remaining fields are defined in the table below. "Validation error reduction" means the reduction of the error rate in percentage points. For example, a reduction of 0.05 means 49.00% --> 48.95%.
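To make the string format concrete, here is a minimal sketch of how an --lrate value could be split into its fields. It is illustrative only, not PDNN's actual parser; the field names (l, c, dh, ds, ml, eh, es, n) follow the table below.

```python
def parse_lrate(spec):
    """Split an --lrate string into its type tag and numeric fields (sketch only)."""
    fields = spec.split(":")
    tag = fields[0]  # "C", "D", "MD", or "FD"
    if tag == "C":                      # C:l:n
        return {"type": tag, "l": float(fields[1]), "n": int(fields[2])}
    if tag in ("D", "MD"):              # D:l:c:dh,ds:n  /  MD:l:c:dh,ml:n
        dh, stop = fields[3].split(",")
        key = "ds" if tag == "D" else "ml"
        return {"type": tag, "l": float(fields[1]), "c": float(fields[2]),
                "dh": float(dh), key: float(stop), "n": int(fields[4])}
    if tag == "FD":                     # FD:l:c:eh,es
        eh, es = fields[3].split(",")
        return {"type": tag, "l": float(fields[1]), "c": float(fields[2]),
                "eh": int(eh), "es": int(es)}
    raise ValueError("unknown learning rate type: " + tag)

print(parse_lrate("D:0.08:0.5:0.05,0.05:8"))
# {'type': 'D', 'l': 0.08, 'c': 0.5, 'dh': 0.05, 'ds': 0.05, 'n': 8}
```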
| Type | Meaning | Comments |
|------|---------|----------|
| constant | `--lrate="C:l:n"` e.g. `C:0.08:15` | Run `n` iterations with the learning rate `l` unchanged. |
| newbob | `--lrate="D:l:c:dh,ds:n"` e.g. `D:0.08:0.5:0.05,0.05:8` | Start with the learning rate `l`; if the validation error reduction between two consecutive epochs is less than `dh`, the learning rate is scaled by `c` during each of the remaining epochs. Training finally terminates when the validation error reduction between two consecutive epochs falls below `ds`. `n` is the minimum number of epochs after which scaling may begin. |
| min-rate newbob | `--lrate="MD:l:c:dh,ml:n"` e.g. `MD:0.08:0.5:0.05,0.0002:8` | The same as newbob, except that training terminates when the learning rate falls below `ml`. |
| fixed newbob | `--lrate="FD:l:c:eh,es"` e.g. `FD:0.08:0.5:10,6` | Start with the learning rate `l`; after `eh` epochs, the learning rate starts to be scaled by `c` each epoch. Training terminates after another `es` epochs once scaling has started. |
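Since newbob and its variants drive both decay and stopping off the validation error, the following sketch shows the control flow of the plain newbob (`D`) schedule. It is a simplified illustration, not PDNN's implementation; `run_epoch` is a hypothetical callback that trains one epoch at the given rate and returns the validation error in percent.

```python
def newbob_schedule(run_epoch, l, c, dh, ds, n):
    """Decay-on-plateau loop for D:l:c:dh,ds:n (sketch only)."""
    lrate = l
    epoch = 0
    prev_error = run_epoch(lrate)  # baseline validation error
    decaying = False
    while True:
        epoch += 1
        error = run_epoch(lrate)
        reduction = prev_error - error  # in percentage points, e.g. 49.00 -> 48.95 is 0.05
        prev_error = error
        if decaying:
            # Once scaling has begun, stop when improvement falls below ds;
            # otherwise scale the rate again for the next epoch.
            if reduction < ds:
                break
            lrate *= c
        elif epoch >= n and reduction < dh:
            # Scaling may only start after the minimum epoch count n.
            decaying = True
            lrate *= c
    return lrate

# Toy usage: errors improve quickly, then stall, triggering decay and stopping.
errs = iter([50.0, 48.0, 46.5, 46.47, 46.45, 46.44, 46.44])
final = newbob_schedule(lambda lr: next(errs), l=0.08, c=0.5, dh=0.05, ds=0.05, n=2)
print(final)  # 0.04: the rate was halved once before training stopped
```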