DIPHONE COLLECTION AND SYNTHESIS

$^{\rm 1}$ International Software Research Institute, $^{\rm 2}$ Language Technologies Institute,
Carnegie Mellon University,
{lenzo,awb}@cs.cmu.edu

In this paper, we describe the design and collection of corpora for diphone synthesis, the voice-building process, and our experience creating a new, publicly available database of ten diphone sets from one American English speaker for the Festival Speech Synthesis System [3], using the FestVox document and tools [1]. To support our goal of making the tools and techniques available so that anyone can build their own synthetic voices, we have generalized and streamlined the tasks involved, turning what were once arcane anecdotes, half-written one-off scripts, and partial descriptions into detailed, complete instructions that others have followed with good results.
