Computing an LDA is not much different from any other kind of training. The main loop goes through all training utterances, gets an alignment from somewhere (labels or Viterbi), and accumulates the training data. Accumulating could mean distributions and codebooks or, as in our current case, the LDA data (scatter matrices).
Since we don't have any weights yet, we can't do Viterbi, and since we don't have any Janus-written labels, we'll have to use the available ASCII-labels that came with the data.
So we will address four issues in this development stage:

    initialize system, create objects
    create and initialize LDA object
    foreach sentence in trainingset {
        read the ASCII labels and build a path object from them
        accumulate the data for the scatter matrices
    }
    compute the LDA matrix
    write out matrix and side dishes

Here come the details for each of the outlined steps. We could use the most flexible, portable and universal script, hiding everything from the user, but for the beginning, let's use something simpler to get a better view of what is happening. Later, we'll introduce more universal scripts.

    # PhonesSet and Tags stay empty, but other objects expect them to exist
    PhonesSet phonesSet
    Tags tags
    [FeatureSet fs] setDesc @../prepare/featdesc
    fs setAccess @../prepare/feataccess
    [Phones phones] read ../prepare/phones
    [TmSet tms] read ../prepare/trans
    [Dictionary dict phones tags] read ../prepare/dict
    [CodebookSet cbs fs] read ../inienv/cbs-desc
    [DistribSet dss cbs] read ../inienv/dss-desc
    [Tree dst phones phonesSet tags dss] read ../inienv/tree-desc
    SenoneSet sns [DistribStream stream dss dst] -phones phones -tags tags
    [TopoSet tps sns tms] read ../prepare/topoSet
    [Tree ttree phones phonesSet tags tps] read ../prepare/topoTree
    [DBase dbase] open ../prepare/dbase.dat ../prepare/dbase.idx -mode r
    AModelSet amo ttree ROOT
    Path p
    HMM hmm dict amo

We won't explain the purpose of all these objects; have a look at the Janus users' manual for that. You can see that many objects depend on the existence of other objects, which restricts the order of object creation. Although we don't have any sets of phonemes defined, and although we don't want to use any tags, we still have to define the corresponding objects and just leave them empty, because other objects expect a Tags and a PhonesSet object to exist.
We've used a programming shortcut that is common in Tcl: the return value of an object creation is the name of the created object, so we can use this name immediately to call the object's read method and read the corresponding description file. After all the objects are created, we can read labels into the path object named p and tell the yet-to-be-created LDA object to accumulate its data.
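To see the shortcut in isolation, here is a minimal sketch in plain Tcl that runs without Janus. The "Box" command is a made-up stand-in for a Janus object type, not part of Janus: like an object creation, it defines a new command and returns that command's name.

```tcl
# Hypothetical stand-in for a Janus object type (not a Janus command):
# "Box name" creates a command called $name and returns that name.
proc Box {name} {
    # The new command takes a method name and one argument and simply
    # reports how it was called.
    proc $name {method arg} {
        return "[lindex [info level 0] 0] $method $arg"
    }
    return $name
}

# Long form: create the object first, then call its method.
Box a
puts [a read file1]        ;# prints: a read file1

# Shortcut: [Box b] is replaced by its return value "b",
# which then serves as the command name for "read".
puts [[Box b] read file2]  ;# prints: b read file2
```

Both forms do the same thing; the bracketed one just saves a line per object.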
To accumulate data, the LDA object takes the label-filled path and reads the senone indices from it. Since not every senone is necessarily a class of its own (classes usually correspond to codebooks), we have to tell the LDA object which senone belongs to which class. Senones are created dynamically, so we can't know in advance which senone index will be assigned to which model. Therefore we'll fill the path items' senone indices with the corresponding distribution indices. For the initialization of the LDA object this means that we have to tell it which distribution index belongs to which class. In our example we have the same number and names of distributions as codebooks, so we can simply use the distributions as classes. At a later stage, we'll do it differently. For now the following script will do:

    LDA lda fs MSC 16
    foreach ds [dss:] {
        lda add $ds
        lda map [dss index $ds] -class $ds
    }

In our example we simply use the 16 mel-scale coefficients, out of which we will compute a few LDA coefficients. For a real system you would probably use a much more sophisticated approach.
If we are not doing any parallel processing then the loop is rather simple:

    set fp [open ../prepare/trainIDs r]
    while { [gets $fp utt] != -1 } {
        readPath $utt dbase fs ../data/labels/$utt p sns MSC
        lda accu p
    }
    close $fp

The lines of the loop body match the lines of the outline further above. We will define the "readPath" procedure soon.
Your label format might differ from the one in the example. In that case, either filter your labels so that they look like those in this example, or modify the following procedure to suit your label format.

    proc readPath { utt db fs file path sns feature } {
        # load the features of this utterance and get its number of frames
        $fs eval [$db get $utt]
        set frameN [$fs:$feature configure -frameN]
        $path make $sns -from 0 -to [expr {$frameN - 1}]
        set str [open $file r]
        while { [gets $str item] != -1 } {
            set frX [lindex $item 0] ; # <--- MODIFY THESE LINES IF YOUR FORMAT
            set sn  [lindex $item 1] ; # <--- DIFFERS FROM THE EXAMPLE'S FORMAT
            # store the distribution index in place of the senone index
            $path.itemList($frX) add 1 -senoneX [$sns.stream(0).distribSet index $sn]
            $path.itemList($frX).item(0) configure -gamma 1.0
        }
        close $str
    }

The readPath procedure fills the given path object with the labels from the named file. The utt argument is the ID of the utterance, the db argument is the name of the database, and the fs argument is the name of the feature set, which is needed to load the feature and, together with the feature argument, to find out how many frames there are in the utterance. The file argument is the label file to read, the path argument is the path object to fill, and the sns argument is the senone set from which the path is made. Each label line is expected to hold a frame index followed by a distribution name.
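If your labels come as segments rather than one line per frame, a small filter can bring them into the layout that readPath reads (frame index first, name second). The sketch below is plain Tcl and assumes a hypothetical input format of "firstFrame lastFrame name" per line; the procedure name segmentsToFrames and the names SIL and AH are made up for illustration and do not come from the example data.

```tcl
# Assumed (hypothetical) input: one segment per line, "<firstFrame> <lastFrame> <name>".
# Output: one "<frame> <name>" line per frame.
proc segmentsToFrames {text} {
    set out {}
    foreach line [split [string trim $text] "\n"] {
        # skip lines that do not have exactly three fields
        if {[llength $line] != 3} { continue }
        foreach {from to name} $line break
        for {set frX $from} {$frX <= $to} {incr frX} {
            lappend out "$frX $name"
        }
    }
    return [join $out "\n"]
}

puts [segmentsToFrames "0 2 SIL\n3 4 AH"]
```

This prints five lines, "0 SIL" through "4 AH". Writing the result to a file under ../data/labels would be one way to reuse readPath unchanged; editing the two marked lines in readPath is the other.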
So far we have accumulated the so-called scatter matrices: one for the total scatter of all data, and one for the average within-class scatter of all classes. Before we can use these matrices, we have to let the LDA object do an 'update' to create the actual matrices from the accumulated data.
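To get a feeling for the two quantities, here is a toy sketch in plain Tcl without any Janus objects. It uses scalar "features", so each scatter matrix shrinks to a single number. The procedure names and the normalization by the number of samples are assumptions made for this illustration; the LDA object may normalize differently.

```tcl
# Arithmetic mean of a list of numbers.
proc mean {values} {
    set sum 0.0
    foreach v $values { set sum [expr {$sum + $v}] }
    return [expr {$sum / [llength $values]}]
}

# Sum of squared deviations of the values from a given center.
proc sqDev {values center} {
    set sum 0.0
    foreach v $values {
        set sum [expr {$sum + ($v - $center) * ($v - $center)}]
    }
    return $sum
}

# samples: flat list of alternating class name and value.
# Returns {total within}, both divided by the number of samples.
proc scatter {samples} {
    set all {}
    foreach {cls x} $samples {
        lappend all $x
        lappend byClass($cls) $x
    }
    set n [llength $all]
    # total scatter: deviations from the global mean
    set total [expr {[sqDev $all [mean $all]] / $n}]
    # within-class scatter: deviations from each class's own mean
    set within 0.0
    foreach cls [array names byClass] {
        set within [expr {$within + [sqDev $byClass($cls) [mean $byClass($cls)]]}]
    }
    return [list $total [expr {$within / $n}]]
}

puts [scatter {a 1 a 3 b 5 b 7}]   ;# prints: 5.0 1.0
```

The total scatter (5.0) is much larger than the within-class scatter (1.0) because the two classes lie far apart, which is exactly the situation an LDA looks for. Back in Janus, the update step and the computation of the LDA matrix look like this: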

    lda update
    DMatrix A
    DMatrix K
    A simdiag K lda.matrixT lda.matrixW
    [FMatrix B] DMatrix A
    B bsave lda.bmat
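
For orientation, here is the usual textbook reading of this step. It assumes that simdiag performs a standard simultaneous diagonalization of the two scatter matrices and that K receives the resulting eigenvalues; neither is spelled out above, so treat it as a sketch rather than a description of the Janus implementation. Writing T for lda.matrixT (total scatter), W for lda.matrixW (within-class scatter) and d for the feature dimension (16 in our example):

```latex
A^{\top} W A = I, \qquad A^{\top} T A = \operatorname{diag}(k_1, \dots, k_d)
```

Under that reading, the directions in A that belong to the largest values k_i are the ones along which the classes separate best, and keeping only a few of them yields the "few LDA coefficients" mentioned earlier. Whether these directions are stored as rows or columns of A is a convention this sketch does not settle.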