If we define the total scatter matrix Tm (or covariance matrix) and the within-class scatter matrix Wm in the m-dimensional space, we want to minimize Det(Wm) and maximize Det(Tm) simultaneously, or
    | Tm |
    ------ = | Wm^-1 Tm | = max
    | Wm |

The solution for the transformation matrix A is given by the eigenvectors of Wn^-1 Tn with the m largest eigenvalues:
    (e1 e2 .. en) = eigenvec( Wn^-1 Tn )
    A' = (e1 e2 .. em)

    new feature vector: y = A x

The scatter matrices Tm and Wm of the new data samples are then both diagonal, which means the coefficients are uncorrelated, and they can be calculated as
    Tm = A Tn A'
    Wm = A Wn A'
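The eigenproblem above can be sketched with numpy (illustrative only, not JANUS code; the matrices here are random stand-ins for real scatter statistics, and Wn is assumed positive definite):

```python
# Sketch: solve the LDA eigenproblem for Wn^-1 Tn and keep the m
# leading eigenvectors as the rows of the transformation matrix A.
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2

# Random symmetric positive definite stand-ins for the scatter matrices.
X = rng.standard_normal((50, n))
Wn = X.T @ X / 50                      # "within-class" scatter
Y = rng.standard_normal((50, n)) * 2.0
Tn = Wn + Y.T @ Y / 50                 # "total" scatter

# Eigenvectors of Wn^-1 Tn, sorted by descending eigenvalue.
vals, vecs = np.linalg.eig(np.linalg.inv(Wn) @ Tn)
order = np.argsort(vals.real)[::-1]
A = vecs.real[:, order[:m]].T          # rows = m leading eigenvectors

# The transformed scatters Tm = A Tn A' and Wm = A Wn A' come out
# diagonal (up to numerical noise), i.e. uncorrelated coefficients.
Tm = A @ Tn @ A.T
Wm = A @ Wn @ A.T
```

This works because the eigenvectors of Wn^-1 Tn are exactly the generalized eigenvectors of the symmetric-definite pair (Tn, Wn), which are mutually orthogonal with respect to both matrices.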
LDA
and a user-defined name. The user also has to give the name of a FeatureSet, the name of a Feature from this FeatureSet, and its dimension. The FeatureSet must already exist; the Feature need not. It is only locked to make sure it will have the correct dimension.
    % FeatureSet fs
    fs
    % LDA lda fs feat 16
    lda

Now classes can be added with the
add
method. The classes could
be triphones, phonemes or anything else. Using just the name of the LDA object
gives a list of the existing classes:
    % lda add c1
    % lda add c2
    % lda
    c1 c2

Later we will use a Path object to fill the mean vectors and scatter matrices of the LDA object. This path contains (senone) indices that we can map into our LDA classes. We can map more than one index into one LDA class, but not the other way round:
    % lda map 0 -class c1
    % lda map 1 -class c1
    % lda map 5 -class c1

The indices we haven't used will be ignored: the corresponding samples won't be taken into account for any class mean or for the within-class scatter matrix. Using the
map
method without the optional flag -class
will give
us the LDA class index (Don't mix it up with the (senone) index!). This class index
can be converted into the class name using the name
method:
    % lda map 0
    0
    % lda map 4
    -1
    % lda name [lda map 0]
    c1
    % lda name [lda map 4]
    (null)

After setting all means and scatter matrices to 0 (methods
clearMeans
and
clearScatters
), which is already the case when you have just defined an LDA object,
two passes over the whole speech database must be run. In the first pass the total mean and the mean of each LDA class are obtained, taking samples from the FeatureSet and indices and weighting factors from a Path object. These indices are mapped to the LDA classes as defined before. In the second pass the total scatter matrix (covariance matrix) and the within-class scatter matrix are calculated using the same input. The total scatter matrix requires the total mean; the within-class scatter requires all class means.
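The logic of the two passes can be sketched in numpy (illustrative only, not the JANUS implementation; `samples` and `labels` are hypothetical stand-ins for the FeatureSet frames and the Path indices after mapping to LDA class indices, with weights taken as 1 and unmapped frames skipped everywhere):

```python
# Sketch of the two accumulation passes for LDA statistics.
import numpy as np

samples = np.array([[1.0, 2.0], [3.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
labels  = np.array([0, 0, 1, -1])      # LDA class per frame, -1 = unmapped
n_classes, dim = 2, samples.shape[1]

# Pass 1: accumulate per-class sums and counts, then derive the
# class means and the total mean (cf. accuMeans).
counts = np.zeros(n_classes)
class_sum = np.zeros((n_classes, dim))
for x, c in zip(samples, labels):
    if c < 0:
        continue                       # unmapped index: ignored
    class_sum[c] += x
    counts[c] += 1
class_mean = class_sum / counts[:, None]
total_mean = class_sum.sum(0) / counts.sum()

# Pass 2: accumulate the total and within-class scatter matrices
# using the means from pass 1 (cf. accuScatters).
T = np.zeros((dim, dim))
W = np.zeros((dim, dim))
for x, c in zip(samples, labels):
    if c < 0:
        continue
    dt = x - total_mean
    dw = x - class_mean[c]
    T += np.outer(dt, dt)
    W += np.outer(dw, dw)
```

The second pass cannot be merged into the first because every within-class term needs the final class mean, which is only known after all samples have been seen.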
    % lda clearMeans
    % foreach utterance $utts {
    >     ...
    >     lda accuMeans path
    > }
    % lda clearScatters
    % foreach utterance $utts {
    >     ...
    >     lda accuScatters path
    > }

If you want to look at the means or the scatter matrices, here is how you can do that:
    % lda.mean                          ;# look at the total mean
    % lda:c1.mean                       ;# look at mean of class c1
    % lda:c1.mean configure -count      ;# look at the count
    % lda.matrixT                       ;# look at total scatter
    % lda.matrixW configure -count      ;# look at count of within-class scatter

The whole reason we did this was to find a linear transformation (each feature vector or speech vector multiplied by a matrix A) that minimizes the within-class scatter in the new feature space and maximizes the total scatter. One way to find this matrix A is simultaneous diagonalisation [1]. It is implemented in JANUS as a DMatrix method, since the LDA scatter matrices are of type DMatrix.
    % DMatrix K
    % [DMatrix A] simdiag K lda.matrixT lda.matrixW

Now we can transform the feature
feat
into a new feature
newfeat
with equally many or fewer coefficients.
    % [FMatrix B] DMatrix A             ;# convert to FMatrix
    % fs matmul newfeat feat B -cut 10

Here is a small test script you can run with JANUS.
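Independently of JANUS, the simultaneous diagonalisation step used above can be sketched with numpy (illustrative only; the matrices and the `keep` cut-off are hypothetical stand-ins): find a transform A with A W A' = I and A T A' diagonal, then keep only the leading rows, much like the -cut option does.

```python
# Sketch of simultaneous diagonalisation via whitening.
import numpy as np

rng = np.random.default_rng(1)
n, keep = 5, 3
M = rng.standard_normal((n, n))
W = M @ M.T + n * np.eye(n)            # within-class scatter (SPD stand-in)
N = rng.standard_normal((n, n))
T = W + N @ N.T                        # total scatter (stand-in)

# Whiten W: with W = U S U', the matrix P = S^-1/2 U' gives P W P' = I.
s, U = np.linalg.eigh(W)
P = (U / np.sqrt(s)).T                 # P = diag(s)^-1/2 U'

# Diagonalise the whitened total scatter; a rotation V keeps W white.
d, V = np.linalg.eigh(P @ T @ P.T)
order = np.argsort(d)[::-1]            # largest scatter ratios first
A = V[:, order].T @ P                  # A W A' = I, A T A' diagonal

A_cut = A[:keep]                       # keep only the leading rows
```

Because W is transformed to the identity, maximising the transformed total scatter and minimising the within-class scatter reduce to the same ordering of the diagonal entries.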
Extracting the means and scatter matrices with more than one machine is possible using LDA I/O methods.