Syntax:
<distrib set> cdcn <distrib1> <distrib2> <feature> [optional flags]
Example:
CDCNdss cdcn CDCN SIL(|) LOGSPEC -itcount 30
The CDCN algorithm is executed on <feature>. The <distrib set> contains the joint CDCN distribution <distrib1>. Although <distrib1> holds the information for both silence and speech, <distrib2> is needed to obtain the number of codebook vectors for silence. The channel-compensated version of <feature> is placed in the feature designated in the codebook-set description file.
Optional flags:
-itcount <number>
-n <feature FMatrix>
-q <feature FMatrix>
-f <feature FMatrix>
Strictly speaking, the implemented version should be called CDLSN (Codeword Dependent Log-Spectral Normalisation), but it differs very little from the original CDCN, so we kept the name CDCN. The basic difference is that we work in the log-spectral domain rather than in the cepstral domain. This can be regarded as an implementation detail because of the linear properties of the Fourier transform: the cepstrum is obtained from the log spectrum by a linear transform. For more information see:
[1] Acero, Alejandro. "Acoustical and Environmental Robustness in Automatic Speech Recognition", Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15231, 13.9.1990
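The linearity remark above can be made concrete with a short derivation. This is only a sketch under stated assumptions: the symbols \ell (log-spectral vector), c (cepstral vector), C (cosine transform matrix) and \delta (an additive correction vector) are names chosen here for illustration and do not come from the implementation.

```latex
% Assumed notation (illustrative only):
%   \ell   : log-spectral feature vector with N channels
%   C      : M x N cosine transform matrix (cepstrum = C * log spectrum)
%   \delta : additive correction applied in the log-spectral domain
\begin{align*}
c          &= C\,\ell, \\
\hat{\ell} &= \ell - \delta, \\
\hat{c}    &= C\,\hat{\ell} = C\,\ell - C\,\delta = c - C\,\delta .
\end{align*}
```

Subtracting \delta in the log-spectral domain therefore has the same effect as subtracting C\delta in the cepstral domain. The sketch covers only the additive correction step; it makes no claim that codebooks trained in the two domains are identical.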
First, a number of description files are needed. They are used to train the two codebooks.
Note: The files shown are only examples. The feature description file in particular differs from system to system.
Codebook-set description file:
SIL     CDCNFEA  50 30 DIAGONAL
SPEECH  CDCNFEA 200 30 DIAGONAL

Distribution-set description file:
SIL(|)     SIL
SPEECH(|)  SPEECH

Tree description file:
ROOT-b   {0=SIL} LSPEECH LSIL - -
ROOT-m   {0=SIL} LSPEECH LSIL - -
ROOT-e   {0=SIL} LSPEECH LSIL - -
LSPEECH  {}      -       -    - SPEECH(|)
LSIL     {}      -       -    - SIL(|)

Feature description file:
#--------------------------------------------------------------------------
#fes   command   name   source   parameter
#--------------------------------------------------------------------------
$fes readADC ADC $arg(ADCFILE) -h $arg(ADCHEADER) \
                 -v 0 -offset mean

#----------------- mel filter bank ----------------------------------------
set melN 30
$fes spectrum FFT ADC 16ms
if { [llength [objects FBMatrix matrixMEL]] != 1} {
    set points [$fes:FFT configure -coeffN]
    set rate   [expr 1000 * [$fes:FFT configure -samplingRate]]
    [FBMatrix matrixMEL] mel -N $melN -p $points -rate $rate
}
$fes filterbank MEL     FFT     matrixMEL
$fes log        CDCNFEA MEL     1.0 1.0
$fes meansub    CDCNFEA CDCNFEA -a 0

# -----------------------------------------------------------------------
#initialise CDCN -> basic Object = CDCNdss
# -----------------------------------------------------------------------
source ../cdcn_desc/cdcn.tcl
cdcnInit $SID -dssdesc ../cdcn_desc/cdcnDistribSet     -dssparam ../cdcn_create_melv/3i.dss.gz \
              -cbsdesc ../cdcn_desc/cdcnCodebookSetMel -cbsparam ../cdcn_create_melv/3i.cbs.gz
# -----------------------------------------------------------------------

Feature description file with CDCN compensation:
#--------------------------------------------------------------------------
#fes   command   name   source   parameter
#--------------------------------------------------------------------------
$fes readADC ADC $arg(ADCFILE) -h $arg(ADCHEADER) \
                 -v 0 -offset mean

#----------------- mel filter bank ----------------------------------------
set melN 30
$fes spectrum FFT ADC 16ms
if { [llength [objects FBMatrix matrixMEL]] != 1} {
    set points [$fes:FFT configure -coeffN]
    set rate   [expr 1000 * [$fes:FFT configure -samplingRate]]
    [FBMatrix matrixMEL] mel -N $melN -p $points -rate $rate
}
$fes filterbank CDCNFEA FFT     matrixMEL
$fes log        MCEP    CDCNFEA 1.0 1.0

#----------------- CDCN ----------------------------------------
# cdcn reads the log-spectral feature MCEP and places the compensated
# result in CDCNFEA, the feature named in the codebook-set description file.
global CDCNdss
CDCNdss cdcn CDCN SIL(|) MCEP -itcount 30

#----------------- cepstrum ----------------------------------------
set cepN 13
if { [llength [objects FMatrix matrixCOS]] != 1} {
    set n [$fes:CDCNFEA configure -coeffN]
    [FMatrix matrixCOS] cosine $cepN $n -type 1
}
$fes matmul MCEP CDCNFEA matrixCOS

#----------------- context -----------------------------------------------
In principle, other transforms such as the cube root could be used instead of the logarithm, but this requires internal changes to the "cdcn" method.
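One plausible reason why the transform cannot simply be swapped is sketched below. This is an assumption for illustration, not a statement from the source: it supposes that the channel acts as a multiplicative factor H_k on the clean spectrum X_k in channel k, and it ignores additive noise. The symbols Y_k, H_k and X_k are chosen here and are not taken from the implementation.

```latex
% Assumed model (illustrative only): purely multiplicative channel, no noise.
%   X_k : clean spectral value in channel k
%   H_k : channel factor in channel k
%   Y_k : observed spectral value in channel k
\begin{align*}
Y_k       &= H_k\,X_k, \\
\log Y_k  &= \log X_k + \log H_k, \\
Y_k^{1/3} &= H_k^{1/3}\,X_k^{1/3}.
\end{align*}
```

Under this assumption the logarithm turns the channel into an additive offset, which is the form an additive normalisation can remove. With the cube root the channel stays multiplicative, so the correction step itself would have to change.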
[2] Baumgärtner, Rainer. "Kanalkompensation in der Spracherkennung", Diplomarbeit (diploma thesis), Universität Karlsruhe, Institut für Logik, Komplexität und Deduktionssysteme, 1996