argument | meaning/value | default value / comments |
--train-data | training data specification | Required. Data paths for the different tasks are separated by "|". |
--valid-data | validation data specification | Required. Data paths for the different tasks are separated by "|". |
--task-number | number of tasks being trained (used for verification) | Required. Its value must equal the number of tasks indicated by --train-data and --valid-data. |
--shared-nnet-spec | --shared-nnet-spec="d:h(1):h(2):...:h(m)" E.g. 250:1024:1024:1024 | Required. Specifies the structure of the lower layers shared across tasks. d: input dimension; h(i): size of the i-th hidden layer. |
--indiv-nnet-spec | --indiv-nnet-spec="h(1)(n):s(1)|...|h(T)(n):s(T)" E.g. 1024:1920|1024:1887|1024:1790 | Required. Specifies the task-specific upper layers, with tasks separated by "|". Although only one hidden layer h(t)(n) is shown, each task can have an arbitrary upper-layer architecture. h(t)(n): size of the n-th hidden layer for task t; s(t): number of targets for task t. |
--wdir | working directory | Required. |
--param-output-file | (prefix) path to save the model parameters in the PDNN format | By default "": no PDNN-formatted model is output. The filename for each task is appended with the suffix ".task#". |
--cfg-output-file | (prefix) path to save the model config | By default "": no model config is output. The filename for each task is appended with the suffix ".task#". |
--kaldi-output-file | (prefix) path to save the Kaldi-formatted model | By default "": no Kaldi-formatted model is output. The filename for each task is appended with the suffix ".task#". |
--model-save-step | number of epochs between model saves | By default 1: the temporary model is saved after each epoch. |
--ptr-file | pre-trained model file | By default "": no pre-training. |
--ptr-layer-number | number of layers to initialize with the pre-trained model | Required if --ptr-file is provided. |
--lrate | learning rate | By default D:0.08:0.5:0.05,0.05:15 |
--batch-size | mini-batch size for SGD | By default 256 |
--momentum | the momentum | By default 0.5 |
--activation | the same as dnn | By default sigmoid |
--input-dropout-factor | the same as dnn | By default 0: no dropout is applied to the input features. |
--dropout-factor | the same as dnn | By default "": no dropout is applied. |
--l1-reg | L1-norm regularization weight: train_objective = cross_entropy + l1_reg * [l1 norm of all weight matrices] | By default 0 |
--l2-reg | L2-norm regularization weight: train_objective = cross_entropy + l2_reg * [l2 norm of all weight matrices] | By default 0 |
--max-col-norm | the maximum value of the norm of gradients; typically used with dropout and maxout | By default none: not applied. |
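
The spec strings in the table can be read mechanically. The following is a minimal Python sketch of how the --shared-nnet-spec and --indiv-nnet-spec formats break down, and of the consistency check described for --task-number. It is an illustration only, not the toolkit's own parser: the function names are hypothetical, and the data paths in the usage note are placeholders.

```python
def parse_shared_spec(spec):
    """Split "d:h(1):...:h(m)" into (input_dim, [hidden layer sizes])."""
    sizes = [int(s) for s in spec.split(":")]
    if len(sizes) < 2:
        raise ValueError("expected d:h(1):...:h(m), got %r" % spec)
    return sizes[0], sizes[1:]


def parse_indiv_spec(spec):
    """Split "h(1)(n):s(1)|...|h(T)(n):s(T)" into one entry per task.

    Each entry is (task-specific hidden sizes, number of targets).
    """
    tasks = []
    for part in spec.split("|"):
        sizes = [int(s) for s in part.split(":")]
        # The last number is s(t); everything before it is a hidden layer.
        tasks.append((sizes[:-1], sizes[-1]))
    return tasks


def check_task_number(task_number, train_data, valid_data):
    """Verify that --task-number matches the "|"-separated data paths."""
    n_train = len(train_data.split("|"))
    n_valid = len(valid_data.split("|"))
    if not (task_number == n_train == n_valid):
        raise ValueError(
            "task number %d does not match %d training and %d validation "
            "data paths" % (task_number, n_train, n_valid))
    return True
```

With the examples from the table, `parse_shared_spec("250:1024:1024:1024")` gives `(250, [1024, 1024, 1024])`, and `parse_indiv_spec("1024:1920|1024:1887|1024:1790")` gives three tasks, each with one hidden layer of 1024 units and 1920, 1887 and 1790 targets respectively. A call such as `check_task_number(3, "a|b|c", "x|y|z")` passes, where the single-letter paths stand in for real data specifications.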