, but words that start with
that character are possible.
Both word lists and dictionary files can be used as vocabulary files.
Because the vocabulary files are read with Tcl, avoid lines beginning with
quotes. If in doubt, use braces to ensure correct interpretation.
Example 1
;
; This is only the beginning of the dictionary that was used
; for the 1995 VERBMOBIL EVALUATION. If you write comments,
; remember the whitespace between the comment sign and the rest of
; the comment.
;
{"Ogai} {? OEH G AI}
{"Ogai-Jahrestagung} {? OEH G AI J AH R E2 S T AH G U NG}
{"Ogai-Tagung} {? OEH G AI T AH G U NG}
{"Ol} {? OEH L}
{"Uberlegen} {? UEH B ER2 L EH G E2 N}
{"Uberlegen(2)} {? UEH B ER2 L EH G N}
{"Ubernachtungskosten} {? UEH B ER2 N A X T U NG S K O S T E2 N}
{"ahnlich} {? AEH N L I CH}
{"ahnliche} {? AEH N L I CH E2}
{"argerlich} {? ER G ER2 L I CH}
Example 2
;
; This is also a valid vocabulary file.
;
{"Ogai}
{"Ogai-Jahrestagung}
{"Ogai-Tagung}
{"Ol}
{"Uberlegen}
{"Uberlegen(2)}
{"Ubernachtungskosten}
{"ahnlich}
{"ahnliche}
{"argerlich}
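The two examples differ only in what follows the first entry of each line. As a rough illustration of how such a file could be reduced to a word list outside the toolkit, here is a minimal Python sketch. It is not the toolkit's own reader (which uses Tcl), and it rests on assumptions: a comment is any line whose first whitespace-separated token is a lone `;`, braces are not nested, and the function names `first_entry` and `read_vocabulary` are hypothetical.

```python
def first_entry(line):
    """Return the first entry of a vocabulary line, or None for blank and comment lines.

    Assumptions (not taken from the toolkit): a comment line has ';' as its
    first whitespace-separated token, and braces are never nested.
    """
    stripped = line.strip()
    if not stripped:
        return None
    if stripped.split()[0] == ";":
        return None
    if stripped.startswith("{"):
        # A braced entry runs up to the first closing brace.
        end = stripped.find("}")
        if end == -1:
            raise ValueError(f"unbalanced brace in line: {line!r}")
        return stripped[1:end]
    # An unbraced entry is simply the first whitespace-separated token.
    return stripped.split()[0]


def read_vocabulary(lines):
    """Collect the first entry of every non-comment, non-blank line, in order."""
    words = []
    for line in lines:
        entry = first_entry(line)
        if entry is not None:
            words.append(entry)
    return words
```

Under these assumptions, the line `{"Ogai} {? OEH G AI}` from Example 1 and the line `{"Ogai}` from Example 2 both yield the same word, `"Ogai`.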
Note on Variants
Variants can be added to the vocabulary by appending a string in parentheses
to the main form. A variant is used in the search only if it is listed in both
the vocabulary and the dictionary. Very important: always list the main form
(without parentheses) first. The search requests language model penalties
from the lm module using only the trunk of the variant (everything up to the
first opening parenthesis), which ensures that all variants get the same
probability as the base form.
Example 3
;
; this is a vocabulary file with pronunciation variants
;
einen whatever I write here does not matter
einen(ohne_en) as currently only the first entry of each line
einen(ohne_ein) is used for the vocabulary
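The variant rule described in the note (the trunk is everything up to the first opening parenthesis, and the main form must be listed first) can be checked with a short script. The following Python sketch is only an illustration under stated assumptions: it works on a list of words already extracted from the file, and the helper names `base_form` and `variants_missing_base` are hypothetical, not part of the toolkit.

```python
def base_form(word):
    """Return everything up to the first opening parenthesis.

    A word without a parenthesis, or one that starts with a parenthesis,
    is returned unchanged.
    """
    idx = word.find("(")
    return word if idx <= 0 else word[:idx]


def variants_missing_base(words):
    """Return the variants whose main form does not appear earlier in the list."""
    seen = set()
    problems = []
    for word in words:
        base = base_form(word)
        if base != word and base not in seen:
            problems.append(word)
        seen.add(word)
    return problems
```

For the three words of Example 3 in their given order, the check reports no problems. If `einen(ohne_en)` were listed before `einen`, it would be reported.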