Training JANUS on a New Database

The following flow chart shows how a new JANUS recognizer can be built. There are different ways to do it, depending on what you have to start with and on what you want to end up with. This page does not explain the advantages and disadvantages of the different ways to train a recognizer. If you are not sure how to proceed, read a good book about speech recognition with HMMs or ask people who have gone through the process before.

If you know which path through the diagram you want to take, you can click on the available links to get details about each step. Keep in mind that the flow chart is not complete. There are, of course, many other ways to build a recognizer; what you find here are just some suggestions.

Whenever there is a link in a box of the flow chart, the link may refer not only to the underlined text but to the whole box. None of the links here point to a developer's manual page or to a Tcl script page. All links point to a detailed description of the flow chart box, and you can find links to Tcl scripts on those detailed description pages.

                           +------------------------------------+
                           | create a feature description file, |
                           | so you can read the recordings     |
                           +-----------------+------------------+
                                             |
                                       +-----+-----+
                                      / do you want \
                                     / to use prepro-\
                                     \ cessed featu- /
                                      \ re files ?  /
                                       +-----------+
                                       /           \
                                     yes            no
                                     /               \
    +-------------------------------+-------+         \
    | read all feature files, preprocess    |          |
    | them and write the preprocessed files |          |
    +-------------------------------+-------+          |
                                    |                  |
    +-------------------------------+-------+          |
    | modify the feature description file   |          |
    +-------------------------------+-------+          |
                                    |                  |
                         +----------+------------------+-----------+
                         | create a list of IDs for each utterance |
                         +-------------------+---------------------+
                                             |
                              +--------------+---------+
                              | create a task database |
                              +--------------+---------+
                                             |
                                       +-----+-----+
                                      / do you want \
               +------- yes ---------+ to train with +--- no -+
               |                      \   labels?   /         |
         +-----+----+                  +-----------+  +-------+-------+
        / do you have\                               / do you want to  \
   +---+  any kind of +----+                +-------+ continue with ex- +
   |    \   labels?  /     |                |        \ isting recogn.  / 
  yes    +----------+      no               |         +-----------+---+
   |                       |                |                     |   
+--+--------+ +------------+----+           no                    |   
| make them | | modify and pre- |           |                    yes   
| readable  | | pare existing   |           |                     |
| for JANUS | | recognizer for  |           |    +----------------+-------+
+--------+--+ | writing labels  |           |    | run regular training as|
         |    +----------+------+           |    | was done before with   |
         |               |                  |    | the existing recognizer|
         |    +----------+------------+     |    | until you are happy    |
         |    | write labels with the |     |    +----------------------+-+
         |    | existing recognizer   |     |                           |
         |    +--+--------------------+     |                           |
         |       |                          |                           |
        +------+------+           +---------+-----------------+         |
       /  do you want  \         / want to start with 1-vector \        |
    +-+ to use LDA for  +---+   +  codebooks and growing them   +       |
    |  \ preprocessing /    |    \or start full-sized codebooks/        |
    |   +-------------+     |     +--------+-----------------++         |
    no                     yes             |                 |          |
    |                       |          full-sized          grow         |
    |   +-------------------+---+          |                 |          |
    |   | make LDA matrix along | +--------+-------+ +-------+--------+ |
    |   | the labels using the  | | design initial | | design initial | |
    |   | initial architecture  | | full-sized     | | 1-vector/book  | |
    |   | or existing recognizer| | architecture   | | architecture   | |
    |   +-------+---------------+ +--------+-------+ +-------+--------+ |
    |           |                          |                 |          |
    |   +-------+----------+         +-----+-------+    +----+--------+ |
    |   | modify feature   |         | make random |    | make random | |
    |   | description file |         |   weights   |    |   weights   | |
    |   | to use LDA matrix|         +-----------+-+    +--+----------+ |
    |   +-------+----------+                     |         |            |
    |           |                                |         |    +-----+ |
    +-----------+-----------------------+        |         |    |     | |
   / do you want to start with 1-vector  \       | +-------+----+--+  | |
  + codebooks growing them each iteration +      | | run regular   |  | |
   \ or start with full-sized codebooks? /       | | training with |  | |
     ----+---------------------+--------         | | split accus   |  | |
         |                     |                 | +-------+-------+  | |
       grow                full-sized            |         |         no |
         |                     |                 |    +------------+  | |
+--------+-------+     +-------+--------+        |   / are you happy\ | |
| design initial |     | design initial |        |  + with the word- ++ |
| 1-vector/book  |     | full-sized     |        |   \ accuracy ?   /   |
| architecture   |     | architecture   |        |    +----+-------+    |
+----+-----------+     +----------+-----+        |         |            |
     |                            |              |        yes           |
+----+------------+         +--+---------------+ |         |            |
| make one-vector |         | extract  feature | |         |            |
|   codebooks     |  +---+  | samples to files | |         |            |
+----+------------+  |   |  +--+---------------+ |         |            |
     |               |   |     |                 |         |            |
+----+---------------+-+ |  +--+---------------+ |         |            |
| run regular training | |  | run k-means or   | |         |            |
| with split accumula- | |  | neural gas on    | |         |            |
| tors for all vectors | |  | all sample files | | +---+   |            |
+----+-----------------+ |  +--+---------------+ | |   |   |            |
     |                   |     |                 | |   |   |            |
+----+-----------------+ |  +--+-----------------+-+-+ |   |            |
| update (incl. splits)| |  | run regular training   | |   |            |
| codebooks and distrib| |  | (ML update, no splits) | |   |            |
+----+-----------------+ |  +------------------------+ |   |            |
     |                   |         |                   |   |            |
  +--+---------+         |      +--+---------+         |   |            |
 / are you happy\        |     / are you happy\        |   |            |
+ with the word- +---no--+    + with the word- +---no--+   |            |
 \ accuracy ?   /              \ accuracy ?   /            |            |
  +-----+------+                +-----+------+             |            |
        |                             |                    |            |
       yes                           yes                   |            |
        |                             |                    |            |
  +-----+-----------------------------+--------------------+------------+
 /   do you want to switch from a context independent system to a        \
+    context dependent semicontinuous or fully continuous system?         +
 \                                                                       /
  +-----+---------------------------------------------------------------+
        |                      |
       yes                     no
        |                      |
+-------+----------------+     |
| create a new context-  |     |
| dependent architecture |     |
+-------+----------------+     |
        |                      |
+-------+----------------+     |
| initialize the blown-up|     |
| system's weights with  |     |
| the weights from the   |     |
| original system        |     |
+-------------------+----+     |
                    |          |
+-------------------+----+     |
| do whatever kind of    |     |
| training is appropriate|     |
| restart with LDA, new  |     |
| codebooks, etc. if you |     |
| like, or simply        |     |
| continue with regular  |     |
| training               |     |
+-------------------+----+     |
                    |          |
                 +--+----------+--+
                /do you want to do \
         +-----+ some clustering of +-------+
         |      \contextual models /        |
         no      +----------------+        yes
         |                                  |
   +-----+---------------+       +----------+--------------+
   | well then you're    |       | run one of the available|
   | just about finished |       | clustering algorithms   |
   +---------------------+       | and make yourself a new |
                                 | clustered architecture  |
                                 +-------------------------+
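
The box "run k-means or neural gas on all sample files" in the chart above names standard clustering methods for initializing full-sized codebooks from extracted feature samples. As a rough illustration of the k-means variant only, here is a minimal Python sketch. It is not JANUS code and does not use the JANUS Tcl interface. The function names (kmeans, squared_distance), the in-memory list of samples, the random initialization and the fixed number of iterations are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(samples, k, iterations=10, seed=0):
    """Return k mean vectors (a toy 'codebook') for the given samples.

    samples: list of equal-length numeric sequences (hypothetical
             stand-in for feature vectors read from sample files).
    """
    rng = random.Random(seed)
    # Assumption: initialize the means with k distinct randomly chosen samples.
    centroids = [list(s) for s in rng.sample(samples, k)]
    for _ in range(iterations):
        # Assignment step: attach every sample to its nearest mean.
        clusters = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k),
                          key=lambda i: squared_distance(s, centroids[i]))
            clusters[nearest].append(s)
        # Update step: move each mean to the average of its samples.
        # A mean that attracted no samples is left unchanged.
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                centroids[i] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return centroids


if __name__ == "__main__":
    toy_samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for mean in sorted(kmeans(toy_samples, k=2)):
        print(mean)
```

In a real setup the samples would come from the feature sample files written in the preceding box, and one codebook would be computed per model. The sketch only shows the clustering step itself.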

Maintainer: rogina@ira.uka.de