| argument | meaning | default value / comment |
| --- | --- | --- |
| --train-data | training data specification | required |
| --nnet-spec | `--nnet-spec="d:h(1):h(2):...:h(n):s"`, e.g. `250:1024:1024:1024:1024:1920` | required. d: input dimension; h(i): size of the i-th hidden layer; s: number of targets |
| --output-file | path to save the resulting network | required |
| --wdir | working directory | required |
| --param-output-file | path to save model parameters in the PDNN format | by default "": doesn't output the PDNN-formatted model |
| --cfg-output-file | path to save the model config | by default "": doesn't output the model config |
| --kaldi-output-file | path to save the Kaldi-formatted model | by default "": doesn't output the Kaldi-formatted model |
| --corruption-level | corruption factor for binary random masking | by default 0.2 |
| --learning-rate | learning rate; kept constant throughout training | by default 0.01 |
| --epoch-number | number of training epochs | by default 10 |
| --batch-size | mini-batch size during training | by default 128 |
| --momentum | momentum factor | by default 0.5 |
| --ptr-layer-number | number of layers to be pre-trained | by default all the hidden layers are pre-trained |
| --sparsity | together with --sparsity-weight, turns each layer of the SdA into a sparse autoencoder. The implementation follows this paper: sparsity and sparsity-weight correspond to rho (page 14) and beta (page 15) in the paper; see the sketch below this table | by default both parameters are None: no sparsity is imposed |
| --sparsity-weight | weight of the sparsity penalty; see --sparsity | by default None |
| --hidden-activation | hidden activation function | by default sigmoid |
| --1stlayer-reconstruct-activation | reconstruction activation function for the first layer; sigmoid and tanh are currently supported | by default sigmoid. If your inputs are mean (and sometimes variance) normalized, you need to use tanh for feature reconstruction |
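For reference, an invocation combining the flags above might look like the following. This is a sketch only: the script name `run_SdA.py`, the paths, and the `--train-data` specification string are placeholders for your own setup (see the data-format documentation for the exact data-specification syntax).

```
# Sketch only: script name, paths, and the --train-data string are
# placeholders; adjust them to your own PDNN checkout and data.
python run_SdA.py --train-data "train.pfile.gz,partition=600m,random=true" \
                  --nnet-spec "250:1024:1024:1024:1024:1920" \
                  --wdir ./work --output-file ./work/sda.nnet \
                  --param-output-file ./work/sda.pdnn \
                  --corruption-level 0.2 --learning-rate 0.01 \
                  --epoch-number 10 --batch-size 128 --momentum 0.5 \
                  --ptr-layer-number 4 --hidden-activation sigmoid \
                  --1stlayer-reconstruct-activation tanh
```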
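On the two sparsity flags: assuming the referenced paper uses the standard sparse-autoencoder formulation (an assumption, since the link names only rho and beta), they add a KL-divergence penalty to each layer's reconstruction cost, with rho (`--sparsity`) the target mean activation and beta (`--sparsity-weight`) the penalty weight:

```latex
% Sketch under the assumption of the standard sparse-autoencoder penalty:
% \rho = --sparsity, \beta = --sparsity-weight, and \hat{\rho}_j is the
% mean activation of hidden unit j over the training cases.
J_{\mathrm{sparse}} = J_{\mathrm{recon}}
  + \beta \sum_{j=1}^{h} \mathrm{KL}\!\left(\rho \,\middle\|\, \hat{\rho}_j\right),
\qquad
\mathrm{KL}\!\left(\rho \,\middle\|\, \hat{\rho}_j\right)
  = \rho \log\frac{\rho}{\hat{\rho}_j}
  + (1 - \rho)\log\frac{1 - \rho}{1 - \hat{\rho}_j}
```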