| argument | meaning/value | comments |
| -------- | ------------- | -------- |
| --train-data | training data specification | required |
| --valid-data | validation data specification | required |
| --conv-nnet-spec | net specification for the convolutional layers<br/>--conv-nnet-spec="txnxm:a,bxc,pdxe,f"<br/>E.g., "1x29x29:64,4x4,p2x2:128,5x5,p3x3,f" stacks two convolutional layers | required<br/>"txnxm": the inputs are t feature maps, each with the dimension n x m<br/>"a,bxc,pdxe,f" describes one convolution layer: a -- number of feature maps; bxc -- size of the local filters (kernels); dxe -- pooling size; if "f" appears, the outputs are flattened<br/>you can continue to stack more convolution layers (see the shape walkthrough below the table) |
| --nnet-spec | net specification for the FC layers<br/>--nnet-spec="h(1):h(2):...:h(n):s"<br/>E.g., 1024:1024:1024:1920 | required<br/>h(i) -- size of the i-th FC hidden layer; s -- number of targets |
| --wdir | working directory | required |
| --param-output-file | path to save model parameters in the PDNN format | by default "": doesn't output the PDNN-formatted model |
| --cfg-output-file | path to save the model config | by default "": doesn't output the model config |
| --kaldi-output-file | path to save the Kaldi-formatted model | by default "": doesn't output the Kaldi-formatted model |
| --model-save-step | number of epochs between model saving | by default 1: save the tmp model after each epoch |
| --ptr-file | pre-trained model file | by default "": no pre-training |
| --ptr-layer-number | number of layers to initialize with the pre-trained model | required if --ptr-file is provided |
| --lrate | learning rate | by default D:0.08:0.5:0.05,0.05:15 |
| --batch-size | mini-batch size for SGD | by default 256 |
| --momentum | the momentum | by default 0.5 |
| --use-fast | whether to use the fast version of CNN | by default false; more details at the bottom of this page |
| --conv-activation | activation function for the convolutional layers; more details on the DNN webpage | by default sigmoid |
| --activation | activation function for the FC layers; more details on the DNN webpage | by default sigmoid |
| --input-dropout-factor | dropout factor for the input layer (features) | by default 0: no dropout is applied to the input features |
| --dropout-factor | comma-delimited dropout factors for *hidden layers*; the number of factors must match the network structure (nnet-spec)<br/>E.g., --dropout-factor 0.2,0.2,0.2,0.2 | by default "": no dropout is applied. This is equivalent to setting all dropout factors to 0, but the explicit setting runs slower; "--dropout-factor 0,0,0,0" is therefore NOT recommended (see the dropout sketch below the table) |
| --l1-reg | L1-norm regularization weight<br/>train_objective = cross_entropy + l1_reg * [L1 norm of all weight matrices] | by default 0 |
| --l2-reg | L2-norm regularization weight<br/>train_objective = cross_entropy + l2_reg * [L2 norm of all weight matrices] | by default 0 (see the regularization sketch below the table) |
| --max-col-norm | the max value of the norm of gradients; usually used with dropout and maxout (see the sketch below the table) | by default none: not applied |
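
To make the --conv-nnet-spec format concrete, here is a walkthrough of the example spec "1x29x29:64,4x4,p2x2:128,5x5,p3x3,f". It assumes non-padded ("valid") convolutions and non-overlapping pooling, which is the usual reading of such specs; the conv_shapes helper is purely illustrative and not part of PDNN.

```python
def conv_shapes(spec):
    """Trace feature-map shapes through a conv-nnet-spec string.

    Illustrative only (not PDNN code); assumes "valid" convolutions
    and non-overlapping pooling.
    """
    parts = spec.split(":")
    t, n, m = (int(x) for x in parts[0].split("x"))     # txnxm: input maps
    print(f"input: {t} feature maps of size {n}x{m}")
    for layer in parts[1:]:
        fields = layer.split(",")
        a = int(fields[0])                              # a: number of feature maps
        b, c = (int(x) for x in fields[1].split("x"))   # bxc: filter size
        n, m = n - b + 1, m - c + 1                     # valid convolution
        if len(fields) > 2 and fields[2].startswith("p"):
            d, e = (int(x) for x in fields[2][1:].split("x"))
            n, m = n // d, m // e                       # dxe: pooling size
        t = a
        print(f"after conv layer: {t} maps of size {n}x{m}")
        if fields[-1] == "f":                           # flatten the outputs
            print(f"flattened output: {t * n * m} units")

conv_shapes("1x29x29:64,4x4,p2x2:128,5x5,p3x3,f")
```

Under these assumptions, 29x29 inputs shrink to 13x13 maps after the first layer and 3x3 maps after the second, so the flattened output has 128 * 3 * 3 = 1152 units; that vector is what the first FC layer in --nnet-spec receives.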
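
--dropout-factor assigns one factor per hidden layer. The snippet below is a minimal sketch of standard inverted dropout with per-layer factors, written with NumPy; it illustrates the semantics only and is not PDNN's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, factor):
    """Inverted dropout: zero a `factor` fraction of units, rescale the rest."""
    if factor == 0.0:
        return h                      # matches the default "": no dropout
    mask = rng.random(h.shape) >= factor
    return h * mask / (1.0 - factor)

# One factor per hidden layer, as in --dropout-factor 0.2,0.2,0.2,0.2
factors = [float(f) for f in "0.2,0.2,0.2,0.2".split(",")]
h = np.ones((8, 1024))                # stand-in for a batch of hidden activations
for f in factors:
    h = dropout(h, f)                 # in a real net, a layer transform runs between these
```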
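
The --l1-reg and --l2-reg rows give the training objective directly. The sketch below mirrors those formulas with NumPy, treating the L2 term as the Frobenius norm of each weight matrix; whether PDNN penalizes the norm or its square is not stated here, so take the exact form as an assumption.

```python
import numpy as np

def train_objective(cross_entropy, weights, l1_reg=0.0, l2_reg=0.0):
    """train_objective = cross_entropy
                       + l1_reg * [L1 norm of all weight matrices]
                       + l2_reg * [L2 norm of all weight matrices]"""
    l1_norm = sum(np.abs(W).sum() for W in weights)
    l2_norm = sum(np.linalg.norm(W) for W in weights)  # Frobenius norm per matrix
    return cross_entropy + l1_reg * l1_norm + l2_reg * l2_norm

weights = [np.ones((1152, 1024)), np.ones((1024, 1024))]
print(train_objective(1.5, weights))  # defaults l1_reg = l2_reg = 0: plain cross-entropy
```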
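
The --max-col-norm row describes a cap on gradient norms; in several related toolkits, a flag of this name instead rescales the columns of each weight matrix after an update. The sketch below shows that column-rescaling variant as one common interpretation of the constraint used with dropout and maxout; it is an assumption, not PDNN's confirmed behavior.

```python
import numpy as np

def apply_max_col_norm(W, max_col_norm):
    """Rescale any column of W whose L2 norm exceeds max_col_norm
    (a common max-norm constraint; interpretation assumed, see above)."""
    norms = np.linalg.norm(W, axis=0)
    scale = np.minimum(1.0, max_col_norm / np.maximum(norms, 1e-12))
    return W * scale

W = np.random.default_rng(0).normal(size=(1024, 512)) * 5.0
W = apply_max_col_norm(W, max_col_norm=1.0)
print(np.linalg.norm(W, axis=0).max())  # every column norm is now <= 1.0
```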