Developer's Manual
Introduction to JANUS for Developers
If all you want is to train a recognizer, do some recognition, or write labels or have a look at them; if you want to perform
jobs that somebody else has probably done before and for which ready-to-use scripts exist; or if you want to do something a
little different from how it is usually done, then you will most likely be happy with the user's manual. Have a look in the
user's manual to see which tasks have been done and for which of them scripts are available. The developer's manual is of
interest only to people who want to write new scripts for performing new tasks, introduce new features to the JANUS recognizer,
make major modifications to existing scripts, write or modify C source code, or write and maintain this documentation.
JANUS was designed to be programmable. The programming language is Tcl/Tk, extended by some object classes and their
methods. Object classes are things like dictionaries and codebooks, but the decoder itself is also an object class. Every
object class has its methods (operations that can be performed on objects of that class). Objects can have subobjects and can
be organized hierarchically. The object-oriented programming paradigm allows you, at least in principle, to plug objects in
and out as you wish: simply change the dictionary by assigning a new one, copy codebooks as easily as "cb1 := cb2", add
distribution accumulators as easily as "ds1.accu += ds2.accu", etc.
Why Tcl/Tk
The JANUS recognizer is written in both C and Tcl/Tk. The predecessor version used only C code, and its user interface was
rather clumsy and not very powerful. Whenever a new feature had to be introduced, this meant C coding and debugging.
In this JANUS, we decided to use Tcl as the user interface. Tcl provides a powerful shell, has its own programming language
interpreter, is well documented, freely available, and allows easy cooperation with C programs. Using Tcl, it is now possible
for casual users (and developers) to write their own scripts in Tcl or to modify existing scripts and make JANUS do
what they need without having to do C coding and, hopefully, without having to ask somebody who is more experienced with the
code.
The C/Tcl Interface Module
Most of the services that Tcl offers to C programmers should be accessed by calling the appropriate functions from the C/Tcl
interface module. This module's most important job is to maintain a list of object types. The C source code modules implement
the data structures and methods (operations) for their object classes and call itfNewType(...) to make their object
classes available to the Tcl programmer. (Please excuse us if we sometimes refer to the same thing as object class or
object type, or sometimes even, incorrectly, as object.)
The interface module also offers some useful functions and preprocessor macros for frequently used operations, like creating
new instances of an object class (i.e. new objects) and destroying, accessing, and configuring objects.
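The idea of a central list of object types can be sketched in a few lines of C. This is only an illustration under our own
assumptions, not the JANUS interface: the signature of itfNewType(...) is not given in this manual, so the names ToyType,
toy_new_type and toy_get_type, as well as the fields of the type record, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_TYPES 32

/* Hypothetical record describing one object class. The real JANUS type
 * record is not shown in this manual and will differ. */
typedef struct {
    const char *name;                        /* class name seen by scripts */
    void *(*create)(const char *obj_name);   /* constructor, may be NULL here */
    void  (*destroy)(void *obj);             /* destructor, may be NULL here */
} ToyType;

/* The registry: a fixed-size list of pointers to registered types. */
static const ToyType *toy_types[TOY_MAX_TYPES];
static size_t toy_type_count = 0;

/* Look up a registered type by name; returns NULL if it is unknown. */
const ToyType *toy_get_type(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < toy_type_count; ++i) {
        if (strcmp(toy_types[i]->name, name) == 0)
            return toy_types[i];
    }
    return NULL;
}

/* Register a type. Returns 0 on success, -1 if the list is full or a
 * type with the same name has already been registered. */
int toy_new_type(const ToyType *type)
{
    assert(type != NULL && type->name != NULL);
    if (toy_type_count == TOY_MAX_TYPES || toy_get_type(type->name) != NULL)
        return -1;
    toy_types[toy_type_count++] = type;
    return 0;
}
```

In such a scheme each module would call the registration function once at startup with a statically allocated record, and a
script-level command for creating an object would look up the record by class name and call its constructor.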
There are many modules in JANUS. A module usually consists of one C source file (.c) and a corresponding header file
(.h); sometimes a module consists of more than one source file. Usually, a module implements one or more object
classes that logically belong together. For example, the dictionary module implements the object classes "Dictionary" and
"Word". Since some objects embed or refer to other objects, there is an object dependency hierarchy. These
dependencies are also reflected in the module architecture: some modules have to include the header files of other modules.
The logical dependencies of modules and objects and the header includes all represent an almost identical hierarchy. Keep in
mind, however, that it is not always possible to find a definition of what depends on what that everyone involved agrees on.
The following diagram tries to show all modules and their dependencies (upper modules depend on the connected lower modules):
search
/ | \
hypotheses | language model
\ | /
\ | / labels
vocabulary / | \
|\ / | \--- dictionary...
| \ alignment / | \--- allophone models...
| \ \ \ / | \--- senone tree...
| \ \ path | \--- topology tree...
| \ \ | | \--- topologies...
| \ \ | | \--- transitions
| \ Markov model \--- tags
| \ / | \--- phones
| \/ | \--- database
| /\ |
dictionary allophone models
| | | \
| | | \
| | | \
tags | senone tree topology tree
| / | \ \ / / |
| / | \ tags / |
phones | \ / |
| \ / |
| generic tree |
| |
| |
senones topologies
/ \ |
/ \ |
(score computer) |
/ \ |
/ \ |
distributions neural net transitions
| |
| |
codebooks | C/Tcl interface
lda | \ | generic lists
/ \ | rewriting sample sets
path features
Some of the modules appear more than once; this is only to avoid too many crossing dependency lines. The lines in the above
diagram mean the following:
Module A is connected to a lower module B if and only if some part of A's source code needs to know
something about B's source code.
This doesn't necessarily mean that you can't define an object of type A without having one of type B. It also doesn't mean
that an object of type A can have a subobject of type B.
There are three special cases in the diagram:
- There is no such thing as a score computer module; different modules can serve as a score computer,
e.g. a neural net module or a mixture densities module.
- The C/Tcl interface module is used by almost all other modules, and
- the generic list module is used by very many of the other modules, so their dependencies are not included.
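The meaning of a dependency line can be illustrated with a small C sketch. The two types A and B and the function
a_fits_into are hypothetical and stand for no particular JANUS module. The point is that A's code needs B's declaration
(in a real source tree, a.c would include b.h), although an A object neither contains a B nor requires one to exist.

```c
#include <assert.h>
#include <stddef.h>

/* What would live in b.h: the public type of the lower module B. */
typedef struct {
    int size;
} B;

/* What would live in a.h: the type of the upper module A.
 * Note that A has no member of type B, so it has no B subobject. */
typedef struct {
    int count;
} A;

/* What would live in a.c. This function reads a field of B, which is the
 * only reason A's source has to know about B's source. An A object can be
 * created and used without any B object existing. */
int a_fits_into(const A *a, const B *b)
{
    assert(a != NULL && b != NULL);
    return a->count <= b->size;
}
```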
The Makefile should work with the standard UNIX make command. The safest way to create a JANUS executable is to create an empty directory,
make a symbolic link from ./RCS to the JANUS source RCS directory, run co RCS/*, check the definitions in the Makefile
to make sure that all paths are correct, run make depend, and then run make. The following is a transcript of such a session:
(i13a6:/home/i13hp1/rogina) mkdir tmp
(i13a6:/home/i13hp1/rogina) cd tmp
(i13a6:/home/i13hp1/rogina/tmp) ln -s /home/i13d4/speech/janus3/RCS .
(i13a6:/home/i13hp1/rogina/tmp) co RCS/*
... many RCS messages ...
(i13a6:/home/i13hp1/rogina/tmp) make depend
grep include *.c | grep "\.tclc" | cut -f2 -d'"' | xargs touch -t 199601011200
makedepend -- -I/home/i13d4/speech/janus3/include -g -- *.c
(i13a6:/home/i13hp1/rogina/tmp) make janus
... many messages from make ...
(i13a6:/home/i13hp1/rogina/tmp) janusA
# ==================================================
# JANUS-SR Version 3.0 [Jan 10 1996 13:02:21]
# ---------------------------------------
# University of Karlsruhe, Germany
# Carnegie Mellon University, USA
#
# (c) 1993-95 - Interactive Systems Labs
# ==================================================
%
There are a few things not usually found in other Makefiles. Some JANUS C source files include a preprocessed Tcl script and assign it to a string
so that this string can be passed to Tcl_Eval(...). We found it nicer and easier to keep the Tcl scripts in separate human-readable files.
This way they are easy to read and edit, and they can also be used and debugged as standalone scripts. To do this, we first have to
convert a Tcl script into a single string by replacing all double quotes with backslash-double-quote, all backslashes with backslash-backslash, and
all newlines with backslash-n. This is done in a one-liner which is used as the rule for creating .tclc files from .tcl files. The
.tclc files are the ones that are included in the C source. Because makedepend complains when files to be included don't exist, we've also
added a line to the rule for make depend which creates dummy .tclc files with very old dates. (Trying to optimize the
one-liner .tcl to .tclc rule might be a waste of time.)
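The escaping step described above can also be written as a small C function. This is a sketch of the transformation only,
not the one-liner from the Makefile (which is not reproduced in this manual); the function name tcl_to_c_literal is
hypothetical. It handles all three replacements in a single pass over the input, so the backslashes that are inserted in front
of double quotes are never doubled again by the backslash rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of JANUS. Returns a newly allocated copy of
 * `script` that can be placed between double quotes in C source code:
 *   backslash     becomes  backslash backslash
 *   double quote  becomes  backslash double quote
 *   newline       becomes  backslash 'n'
 * The caller must free() the result. Returns NULL if allocation fails. */
char *tcl_to_c_literal(const char *script)
{
    assert(script != NULL);

    /* Worst case: every input character expands to two output characters. */
    size_t len = strlen(script);
    char *out = malloc(2 * len + 1);
    if (out == NULL)
        return NULL;

    char *dst = out;
    for (const char *src = script; *src != '\0'; ++src) {
        switch (*src) {
        case '\\': *dst++ = '\\'; *dst++ = '\\'; break;
        case '"':  *dst++ = '\\'; *dst++ = '"';  break;
        case '\n': *dst++ = '\\'; *dst++ = 'n';  break;
        default:   *dst++ = *src;                break;
        }
    }
    *dst = '\0';
    return out;
}
```

Applied to the contents of a .tcl file, the result is a single line of text without raw newlines or unescaped double quotes,
which is the property the .tclc files need in order to be used as a C string.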
Maintainer: monika@ira.uka.de, rogina@ira.uka.de, finkem@cs.cmu.edu, maier@ira.uka.de