Developers Manual
Description of all Modules
This is a complete, alphabetically sorted list of all Janus modules. Each entry points
to an overview page for the corresponding module, which in turn links to more detailed documentation.
A diagram of the module dependencies can be found in the Introduction of the Developers Manual.
- Allophone Model Indexing
This module keeps a list of all acoustically distinguishable allophones,
i.e. a sequence of senone and transition model indices for every allophone.
It ensures that the same allophonic acoustic model is always assigned the same index.
The object type AModel is defined here.
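The indexing idea can be illustrated with a minimal Python sketch. It assumes an allophone model is fully described by its two index sequences; the class `AllophoneModelTable` and its `add` method are hypothetical names for illustration, not the AModel API.
```python
from typing import Dict, List, Sequence, Tuple

ModelKey = Tuple[Tuple[int, ...], Tuple[int, ...]]

class AllophoneModelTable:
    """Hypothetical table that gives identical models identical indices."""

    def __init__(self) -> None:
        self._index: Dict[ModelKey, int] = {}
        self._models: List[ModelKey] = []

    def add(self, senones: Sequence[int], transitions: Sequence[int]) -> int:
        # The pair of index sequences is the identity of the model.
        key = (tuple(senones), tuple(transitions))
        idx = self._index.get(key)
        if idx is None:
            # First time this model is seen: assign the next free index.
            idx = len(self._models)
            self._index[key] = idx
            self._models.append(key)
        return idx

    def __len__(self) -> int:
        return len(self._models)

table = AllophoneModelTable()
a = table.add([12, 13, 14], [0, 0, 1])
b = table.add([12, 13, 15], [0, 0, 1])
c = table.add([12, 13, 14], [0, 0, 1])  # same sequences as the first call
```
Here `a` and `c` refer to the same entry, while `b` gets an index of its own because one senone index differs.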
- Codebooks
Codebooks are used to store the means and covariance matrices of the densities
of a mixture. The codebooks module defines the object CodebookSet,
which is a set of Codebook objects. The CodebookAccu object
is a training data accumulator. The Covar object is a covariance matrix.
- Data Base
The data base module defines the objects DBase and DBaseIdx. A
DBase object maintains a set of entries, each of which has a key and a
value. These data bases are stored in the file system and can be read from there.
A DBaseIdx object maintains the position of each entry.
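The split between stored entries and a position index can be sketched as follows. This is only an illustration: the one-record-per-line file layout and the function names `write_dbase` and `read_entry` are assumptions, not the actual DBase file format or API.
```python
import os
import tempfile
from typing import Dict

def write_dbase(path: str, entries: Dict[str, str]) -> Dict[str, int]:
    """Write one 'key<TAB>value' record per line; return key -> byte offset."""
    index: Dict[str, int] = {}
    with open(path, "wb") as f:
        for key, value in entries.items():
            index[key] = f.tell()  # position of this record in the file
            f.write(f"{key}\t{value}\n".encode("utf-8"))
    return index

def read_entry(path: str, index: Dict[str, int], key: str) -> str:
    """Seek directly to the stored position and read a single record."""
    with open(path, "rb") as f:
        f.seek(index[key])
        line = f.readline().decode("utf-8").rstrip("\n")
    stored_key, value = line.split("\t", 1)
    if stored_key != key:
        raise KeyError(f"index for {key!r} points at {stored_key!r}")
    return value

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "utterances.db")
    idx = write_dbase(db_path, {"utt001": "hello world", "utt002": "good morning"})
    value = read_entry(db_path, idx, "utt002")
```
With the index in hand, a lookup reads one record instead of scanning the whole file.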
- Dictionary
The dictionary module defines the objects Dictionary and Word.
It maintains a list of words (orhtographic spelling and a seuqence of monophone
and tag names. It does not (yet) store entire word-dependent HMMs.
- Distributions
The distributions module defines the objects DistribSet and
Distrib. It is used to store and maintain the mixture weights of
mixture densities.
- Duration Modeling
During testing every phone can be penalized by some value which depends on its
duration. The duration modeling module can learn and maintain duration
histograms. Duration models can be clustered in a tree just the same way as
topologies or distributions.
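One way to turn a duration histogram into a penalty is to use the negative log of the smoothed relative frequency. The sketch below assumes exactly that; the class name, the add-one smoothing, and the clamping of long durations into the last bin are illustrative choices, not the behaviour of the Janus module.
```python
import math
from collections import Counter
from typing import Iterable

class DurationHistogram:
    """Hypothetical histogram over phone durations measured in frames."""

    def __init__(self, max_frames: int) -> None:
        self.max_frames = max_frames  # durations above this share the last bin
        self.counts: Counter = Counter()
        self.total = 0

    def accumulate(self, durations: Iterable[int]) -> None:
        for d in durations:
            self.counts[min(d, self.max_frames)] += 1
            self.total += 1

    def penalty(self, duration: int) -> float:
        """Negative log of the add-one smoothed relative frequency."""
        d = min(duration, self.max_frames)
        prob = (self.counts[d] + 1) / (self.total + self.max_frames)
        return -math.log(prob)

hist = DurationHistogram(max_frames=10)
hist.accumulate([3, 3, 4, 3, 5])
common = hist.penalty(3)  # duration seen often in training
rare = hist.penalty(9)    # duration never seen in training
```
A duration that was frequent in training receives a smaller penalty than one that never occurred.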
- Error Handling
This module is rather small. It manages the flow of error messages, warnings,
and info messages. The actual handling of error messages can be done in Tcl.
- Features
The feature module defines the Feature object. Multiple features can
be stored in one FeatureSet object. The feature module offers many
operations that can be performed on features (like computing FFTs, matrix
multiplication, cutting and pasting around in the feature space etc.).
- Generic List
The generic list module offers functions for maintaining lists of anything. Many
modules define object types that make heavy use of generic lists.
- Generic Tree
The generic tree is only used as a basis for senone and topology trees. The
object type Tree is defined here.
- Hidden Markov Model
The HMM Module defines the object type HMM, which contains the entire
topology of an utterance (word-graph, phone-graph, state-graph).
- Hypotheses
When using JANUS, you will probably never need to create a HypoList object yourself.
However, the Search object contains a HypoList to store
its results. If you want to manipulate those, the HypoList methods
will be of interest to you.
- Interface C/Tcl
The C/Tcl interface module does not define any object itself. It offers
many C functions and tools that other modules use to declare and maintain object types
and to handle error messages. See this module's documentation if you want to know
how to implement new modules and object classes.
- Labels of Speech Segments
The labels module defines the Label and LabelDesc objects.
A LabelDesc object is a collection of many other subobjects (dictionary,
senone tree, etc.) that are needed to make sure that the meaning of labels can
be interpreted correctly. The Label object maintains a database of labels
(using a DBaseIdx).
- LDA
An LDA object can be used to calculate total and within-class scatter
matrices using a Path object and data from a FeatureSet. From these matrices, a
transformation matrix A can be obtained, which is then used to reduce the
dimension of the input feature.
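For orientation, a common textbook formulation of these quantities is given below. It is stated as an assumption, since this page does not give the formulas used by the module: x_i are the feature vectors, the class of each vector is taken from the alignment, \mu is the global mean, \mu_c the mean of class c, and d the target dimension.
```latex
\begin{aligned}
T &= \sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^{\top}, \\
W &= \sum_{c} \sum_{i \in c} (x_i - \mu_c)(x_i - \mu_c)^{\top}, \\
W^{-1} T \, v_k &= \lambda_k v_k, \qquad \lambda_1 \ge \lambda_2 \ge \dots, \qquad
A = [\, v_1 \; \cdots \; v_d \,]^{\top}.
\end{aligned}
```
Under this formulation, a reduced feature vector is obtained as y = A x, keeping the d directions with the largest eigenvalues.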
- Language Model
An Lm object contains a statistical language model. It is defined
only over a Vocab object.
Currently, only trigrams, bigrams and unigrams are implemented.
- Machine-Independent I/O
This module implements functions that can be used for writing and reading
binary data across different platforms, making sure that the data remain the same.
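The underlying idea is to fix the byte order and field sizes of the stored data instead of relying on the native layout of the machine. A minimal Python sketch, assuming a big-endian layout with a 32-bit count followed by 32-bit floats (the layout and the function names are illustrative, not the format used by Janus):
```python
import io
import struct
from typing import BinaryIO, List

# Assumed layout for this sketch: big-endian 32-bit signed count,
# followed by that many IEEE 754 32-bit floats.
def write_floats(stream: BinaryIO, values: List[float]) -> None:
    stream.write(struct.pack(">i", len(values)))
    stream.write(struct.pack(f">{len(values)}f", *values))

def read_floats(stream: BinaryIO) -> List[float]:
    (count,) = struct.unpack(">i", stream.read(4))
    return list(struct.unpack(f">{count}f", stream.read(4 * count)))

buf = io.BytesIO()
write_floats(buf, [0.5, -2.0, 1024.0])
raw = buf.getvalue()
buf.seek(0)
restored = read_floats(buf)
```
Because the `>` prefix forces big-endian encoding with standard sizes, the bytes in `raw` are the same on every platform.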
- Matrices
Matrices are used all over the system. This module provides different kinds of
matrices and vector objects and offers functions to work with them.
- Path (output of alignment)
A Path object can be filled by a forced alignment procedure or the
labels module. It contains alignment information (which only makes sense if the
corresponding HMM object is available).
- Phones
The phones module defines the objects Phones and PhonesSet,
used for assigning unique indices to monophone names and for defining questions
about an allophone context.
- Polyphone Trees
The polyphone trees (ptree) module defines the object Ptree,
used for managing polyphones with arbitrarily wide contexts.
- Rewriting of Names
The rewriting module keeps simple rewriting rules, each of which is a pair of
identifiers defining which identifier should be renamed to what. The codebooks
and distributions modules make use of this. The object types Rewrite
and RewriteSet are defined here.
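A rule set of this kind behaves like a lookup table from old names to new names. The sketch below assumes that names without a rule are left unchanged; the class `RewriteRules` and the example identifiers are hypothetical and do not reflect the Rewrite or RewriteSet API.
```python
from typing import Dict

class RewriteRules:
    """Hypothetical collection of (source, target) renaming rules."""

    def __init__(self) -> None:
        self._rules: Dict[str, str] = {}

    def add(self, source: str, target: str) -> None:
        self._rules[source] = target

    def apply(self, name: str) -> str:
        # Names without a matching rule pass through unchanged (assumption).
        return self._rules.get(name, name)

rules = RewriteRules()
rules.add("AH-b", "AH-m")  # made-up identifiers for illustration
renamed = [rules.apply(n) for n in ["AH-b", "AH-m", "SIL"]]
```
After applying the rules, both `AH-b` and `AH-m` resolve to `AH-m`, and `SIL` is untouched.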
- Sample Set
For extracting sample vectors from the training data and sorting them into
class-dependent files, you need some buffers. The accumulation and dumping
of such samples is done in this module. Other services include the computation
of initial codebooks with k-means or "neural gas".
- Search (Decoding)
The search module defines the Search object and other subobjects that
are used for decoding (i.e. recognition).
- Senones
The senones module defines the objects SenoneSet and Senone.
A senone set is just a collection of many senones. Each senone represents
an atomic acoustic model. The senones module offers
transparent functions for computing emission probabilities, accumulating training
data, and updating the acoustic parameters.
- Senone Tree
The senone tree module defines the STree object, which itself is defined
over a generic tree. A senone tree is used to cluster senones according to an
allophone's context.
- Tags
The tags module defines the object Tags which is used for indexing
phoneme-tags (like 'relative word position', 'stress level', 'semantic or
syntactic boundary').
- Topologies
The topology module defines the TopoSet object and the Topo
object. A topology is the definition of HMM states and transitions; it also
defines the node of a senone tree from which to start when looking for the senone that
should be used for modelling the state.
- Topology Tree
The topology tree module defines the TTree object, which itself is defined
over a generic tree. A topology tree is used to cluster HMM topologies according
to an allophone's context.
- Transition Models
The transition models module defines the objects Tm and TmSet.
It is used to store and maintain the topologies of single HMM states, i.e. keeping
a list of all transitions out of the state together with their probabilities.
- Viterbi Alignment
The Viterbi module does not define any objects. It only implements the
Viterbi alignment algorithm, i.e. traversing a given HMM and storing the
path in a Path object.
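The core of such an alignment can be sketched in a few lines. This is a simplified illustration, not the Janus implementation: it assumes a strictly left-to-right HMM in which each step either stays in a state or advances by one, works on costs (for example negative log probabilities), and ignores transition costs. The function name `viterbi_align` is hypothetical.
```python
from typing import List

INF = float("inf")

def viterbi_align(emission_cost: List[List[float]]) -> List[int]:
    """Align T frames to S states; emission_cost[t][s] is the cost of frame t in state s.

    The path starts in state 0, ends in state S - 1, and at each frame
    either stays in the current state or moves to the next one.
    """
    n_frames = len(emission_cost)
    n_states = len(emission_cost[0])
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")
    best = [[INF] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    best[0][0] = emission_cost[0][0]
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else INF
            if advance < stay:
                best[t][s], back[t][s] = advance, s - 1
            else:
                best[t][s], back[t][s] = stay, s
            best[t][s] += emission_cost[t][s]
    # Trace back from the final state to recover the state of every frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

costs = [
    [0.1, 2.0, 3.0],
    [0.2, 1.5, 3.0],
    [2.0, 0.1, 2.5],
    [2.5, 0.3, 1.0],
    [3.0, 2.0, 0.2],
]
alignment = viterbi_align(costs)
```
The returned list holds one state index per frame, which is the kind of information an alignment path has to carry.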
- Vocabulary
A Vocab object stores the recognition vocabulary, defining common indices
for the search and language model modules. If used for search, it must be defined over
a dictionary that supplies valid pronunciations.
- ...
Maintainer: monika@ira.uka.de, rogina@ira.uka.de, finkem@cs.cmu.edu, maier@ira.uka.de