Preparing the Database

The term "database" is used in two different senses that have little to do with each other. To avoid confusion, here are the two usages:

The task database is the collection of files (waveforms, labels, transcriptions, dictionaries, etc.) that together describe a task. This is what you will find in the file "data.tar.gz".
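To see what a task database contains, the archive can be listed without unpacking it. The following is a minimal Python sketch and not part of Janus. The function name list_task_files is hypothetical; the only thing taken from the text is the archive name "data.tar.gz".

```python
import tarfile


def list_task_files(archive_path):
    """Return the sorted names of all regular files in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        # Skip directories and links; keep only regular files.
        return sorted(member.name for member in archive.getmembers() if member.isfile())
```

Calling list_task_files("data.tar.gz") from the directory that holds the archive returns the file names, which gives a quick overview of the waveforms, labels, transcriptions and dictionaries that belong to the task.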

The other meaning refers to the Janus object DBase, which can store any kind of string-indexed information. It is generally used to store utterance descriptions such as speaker, gender, location, and transcription.
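Conceptually, a string-indexed store of utterance descriptions behaves like a map from an utterance key to a record of fields. The Python sketch below only illustrates that idea. It is not the DBase interface, and the key "utt001", the field names, and the values are made up for illustration.

```python
# Illustrative only: a plain dict standing in for a string-indexed store.
# This does not use or mimic the Janus DBase API.
utterances = {}


def add_utterance(store, key, **fields):
    """Store a record of descriptive fields under a string key."""
    if key in store:
        raise KeyError(f"duplicate utterance key: {key}")
    store[key] = dict(fields)


def get_field(store, key, field):
    """Look up one field of the record stored under the given key."""
    return store[key][field]


# Hypothetical utterance description with the kinds of fields named in the text.
add_utterance(
    utterances,
    "utt001",
    speaker="spk01",
    gender="f",
    location="office",
    transcription="hello world",
)
```

Every lookup goes through the string key, so an utterance's speaker or transcription can be retrieved without touching the waveform or label files themselves.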


The following is a list of things you might have to do with your data before you can use it with Janus.