Explicit Duration Modeling - Overview

The duration module implements the object classes Duration and DurationSet. A DurationSet is an object that can hold many single Duration objects. Duration sets are ModelSet compatible, i.e. they can be used by Tree objects, can be clustered, can compute scores, can accumulate training data, and can update their parameters.

A single Duration can be considered a histogram. Since it is infeasible to store the frequencies of all possible durations, every DurationSet has a list of buckets. Each bucket counts the frequencies of an entire range of possible durations. When creating a DurationSet you must specify the bucket ranges. Durations greater than the upper end of the last bucket are ignored. All functions treat a bucket as a simplification in which all bucket elements are treated the same, so a bucket's probability is divided evenly among the durations it covers. For example, if a bucket ranges from 4 to 8 (i.e. it has 5 elements) and the probability for that bucket is 0.15, then the probability for a duration of 6 (as well as for any other value from 4 to 8) is 0.15 / 5 = 0.03.
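The bucket arithmetic can be illustrated with a small Python sketch. This is not the module's own code or interface: the class name BucketHistogram, its constructor arguments, and the choice to return None for durations that fall outside every bucket are assumptions made for illustration only.

```python
from bisect import bisect_right


class BucketHistogram:
    """Hypothetical stand-in for a single Duration: one probability per bucket."""

    def __init__(self, bounds, probs):
        # bounds: inclusive (low, high) ranges, sorted and non-overlapping.
        # probs:  one probability per bucket, in the same order.
        self.bounds = list(bounds)
        self.probs = list(probs)
        self.lows = [low for low, _ in self.bounds]

    def prob(self, duration):
        """Probability of one specific duration, or None if no bucket covers it."""
        i = bisect_right(self.lows, duration) - 1
        if i < 0:
            return None  # shorter than the first bucket
        low, high = self.bounds[i]
        if duration > high:
            return None  # beyond this bucket (e.g. past the last one): ignored
        # Every element of a bucket is treated the same, so the bucket's
        # probability is split evenly over its (high - low + 1) durations.
        return self.probs[i] / (high - low + 1)


# Illustrative bucket ranges and probabilities (made-up values).
hist = BucketHistogram([(1, 3), (4, 8), (9, 20)], [0.25, 0.15, 0.60])
print(hist.prob(6))   # about 0.03, the same as for 4, 5, 7 and 8
print(hist.prob(21))  # None: longer than the last bucket
```

Using a binary search over the lower bounds keeps the lookup cheap even with many buckets; a linear scan would work equally well for a handful of ranges.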

To activate duration modeling, you must request it when creating the AmodelSet object. In addition to the topology tree specification, you can then specify a duration tree and the root node of the duration tree. A duration tree is a regular tree (just like distribution or topology trees) defined over a DurationSet. Once a duration set is plugged into the amodel set correctly, the probability of a phone having some specific duration can be factored into the accumulated search score during testing. During training, training data can be accumulated, provided the duration set's accumulators have been created.
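The two uses described above, scoring during testing and accumulation during training, might look roughly like the following Python sketch. It does not reproduce the module's interface: the function names, the use of a negative log probability as the score contribution, the probability floor, and the relative-frequency update are all assumptions chosen for illustration.

```python
import math


def duration_penalty(bucket_prob, bucket_size, floor=1e-10):
    """Score contribution of one duration, assumed here to be -log(probability)."""
    # Spread the bucket probability evenly over the durations in the bucket;
    # the floor (an assumption) avoids log(0) for empty buckets.
    per_duration = max(bucket_prob / bucket_size, floor)
    return -math.log(per_duration)


def accumulate(counts, bounds, duration):
    """Add one observed duration to the per-bucket counts (the accumulators)."""
    for i, (low, high) in enumerate(bounds):
        if low <= duration <= high:
            counts[i] += 1
            return True
    return False  # outside all buckets: the observation is ignored


def update(counts):
    """Re-estimate bucket probabilities as relative frequencies of the counts."""
    total = sum(counts)
    if total == 0:
        return None  # nothing accumulated, keep the old parameters
    return [c / total for c in counts]


bounds = [(1, 3), (4, 8)]          # illustrative bucket ranges
counts = [0, 0]                    # accumulators must exist before training
for observed in (2, 5, 6, 30):     # 30 exceeds the last bucket and is dropped
    accumulate(counts, bounds, observed)
probs = update(counts)             # relative frequencies of the two buckets
penalty = duration_penalty(0.15, 5)  # cost of one duration in a 5-element bucket
```

In this sketch a search would add the penalty to the accumulated score of a phone hypothesis whenever that phone ends after a given number of frames.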


Further information about the module: