Hypotheses File Formats

WARNING: valid only for hypos.c version 1.12 or newer
WARNING: requires itf.c version 1.28 or newer


As described in the TCL-Part, the file formats depend entirely on the user and can be specified at the Tcl level. However, here is what some programs EXPECT as output from a speech recognizer. Both formats are easily created using the examples in the TCL-Part.

align

String Alignment and Scoring Program
Written by Stan Janet, July 1987

This program is widely used in the speech community. It requires the recognizer's output sentences to be in exactly the same order as the list of correct sentences, one hypothesis per line. At the end of each hypothesis, a filename identifier can be given in parentheses; everything after the first open parenthesis in the hypothesis is ignored. A list of homophones can be specified, each exactly one word long. This format can be produced by the puts method of a hypothesis list object in JANUS-3:
"hl puts -style simple"

EXAMPLE:

ich w"urde gerne einen Termin ausmachen (any kind of nonsense)
wann h"atten sie denn Zeit (more nonsense)
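
The line format above can be sketched in a few lines of Python. This is only an illustration of the rules stated in this section (one hypothesis per line, everything from the first open parenthesis ignored). It is not part of align or JANUS-3, and the function names strip_identifier and format_simple are hypothetical.

```python
# Minimal sketch of the "simple" line format described above.
# Assumption: words are separated by whitespace. The helper names are
# hypothetical and belong to neither align nor JANUS-3.

def strip_identifier(line):
    """Return the words of one hypothesis line.

    Everything from the first open parenthesis onward is dropped,
    as described for the align program.
    """
    text = line.split("(", 1)[0]
    return text.split()


def format_simple(words, ident=None):
    """Build one hypothesis line, optionally followed by '(ident)'."""
    line = " ".join(words)
    if ident is not None:
        line += " (" + ident + ")"
    return line


lines = [
    'ich w"urde gerne einen Termin ausmachen (any kind of nonsense)',
    'wann h"atten sie denn Zeit (more nonsense)',
]

# One word list per hypothesis, in the same order as the input lines.
hyps = [strip_identifier(line) for line in lines]
```

Because align matches hypotheses to references purely by position, a reader of this format has to preserve line order; the sketch therefore keeps a list rather than a dictionary keyed by identifier.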

vmeval

String and Graph Scoring Tool
Written by Michael Lehning, TU Braunschweig, 1995

This program is used for grading Verbmobil evaluations. It accepts sentences in any order, as long as each sentence in the list of correct sentences and in the hypotheses is preceded by a sentence identifier. Hypotheses in one file are separated by an empty line, and a single hypothesis may contain line breaks. This format can be produced by the puts method of a hypothesis list object in JANUS-3:
"hl puts -style normal -id $uttid"
Note that vmeval can also grade word-graphs.

EXAMPLE:

%TURN GX1_001
ich w"urde gerne
einen Termin ausmachen

%TURN GX1_002
wann h"atten sie denn Zeit
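
The block format above can be read back with a short Python sketch. It assumes, based only on the example, that the sentence identifier is given on a line starting with "%TURN". It is an illustration of the rules stated in this section, not code from vmeval or JANUS-3, and the function name parse_turns is hypothetical.

```python
# Minimal sketch of a reader for the identifier-prefixed format above.
# Assumptions (taken from the example only): the identifier line starts
# with "%TURN", and words are separated by whitespace. parse_turns is a
# hypothetical helper, not part of vmeval or JANUS-3.

def parse_turns(text, marker="%TURN"):
    """Map each sentence identifier to the list of words of its hypothesis."""
    turns = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # An empty line separates two hypotheses.
            current = None
            continue
        if line.startswith(marker):
            current = line[len(marker):].strip()
            turns[current] = []
            continue
        if current is not None:
            # A hypothesis may span several lines; join them word by word.
            turns[current].extend(line.split())
        # Text that appears before any identifier line is ignored.
    return turns


sample = """%TURN GX1_001
ich w"urde gerne
einen Termin ausmachen

%TURN GX1_002
wann h"atten sie denn Zeit
"""

turns = parse_turns(sample)
```

Keying the result by identifier reflects the point made above: since every hypothesis carries its own identifier, the order of the hypotheses in the file does not matter.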



Maintainer: monika@ira.uka.de