Janus 3 Tutorial - Do-It-Yourself Thread

If you decide to reproduce some of the development steps in your own environment, create an empty directory, unpack the "data.tar.gz" archive in it, and follow the steps described in this tutorial. The archive contains everything you need to follow these steps: a tiny speech database (a subset of WSJ), a dictionary covering most of the words used, of the kind that is publicly available in many places, the transcriptions of the recordings, a generic weights file containing acoustic parameters that were trained previously with Janus, and a phoneme-name mapping definition that maps the phonemes in the given dictionary to the phonemes that we want to use in our do-it-yourself system.

On this page you can find links to the pages of the do-it-yourself thread, one link for each step of the development. Each page contains a detailed description of what has to be done to develop a recognizer, and all the scripts used are explained in detail. Step 1 starts right after unpacking the archive. Follow the steps in the order in which they are listed, and you will end up with a working recognizer.

Some of the step descriptions are a bit lengthy, especially those of the first three steps. Don't be scared by this. The intention was to explain many things carefully, step by step. Once you have completed the first few steps, you will find that the later steps refer to things you have already done, and so their pages are shorter.

Create a directory in which you will conduct all the do-it-yourself experiments, and unpack the archive in it. Also create subdirectories named step1, step2, etc., in which you will run the scripts and store the resulting files. Most of the scripts assume that you are actually using this kind of file organization.
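
If you would rather script this setup than do it by hand, the following is a minimal sketch in Python. It is not part of the tutorial's own scripts. The function name setup_workspace and its parameters are made up for illustration, and the number of step directories is left as a parameter because it depends on how many steps you intend to follow.

```python
import tarfile
from pathlib import Path


def setup_workspace(archive, workdir, n_steps):
    """Unpack the tutorial archive into workdir and create step1..stepN.

    Hypothetical helper: the name and parameters are assumptions,
    not something the Janus tutorial provides.
    """
    workdir = Path(workdir)
    # Create the experiment directory (no error if it already exists).
    workdir.mkdir(parents=True, exist_ok=True)

    # Unpack the gzip-compressed tar archive directly into that directory.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(workdir)

    # Create one subdirectory per development step: step1, step2, ...
    step_dirs = []
    for i in range(1, n_steps + 1):
        step_dir = workdir / f"step{i}"
        step_dir.mkdir(exist_ok=True)
        step_dirs.append(step_dir)
    return step_dirs
```

A call such as setup_workspace("data.tar.gz", "janus-diy", 3) would unpack the archive into a directory called janus-diy (a placeholder name) and create step1 to step3 inside it. Calling it again is harmless, since existing directories are left in place.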