festvox: Building Voices in Festival

talk by Alan W Black including work by Kevin A Lenzo

Slides (postscript)

In pursuit of our continuing goal of making speech synthesis more accessible, I will describe our latest advancements in documenting and automating the process of building new voices in Edinburgh University's Festival Speech Synthesis System.

The intention is to allow relatively unskilled users to build new synthetic voices in currently supported and completely new languages. Although the task of producing perfect quality synthesis is still a research issue, we now have examples of how basic diphone synthesizers in new languages can be created in a few months of work (sometimes more, sometimes less). I will discuss the generic techniques we provide for building text analysers, lexicons, letter-to-sound rules, data-driven prosodic models, autolabelling techniques, schema generation and recording aids.

I will also discuss some limited domain synthesis techniques that allow near-automatic construction of high quality, natural synthesis for specific tasks, using one of our unit selection techniques.

Most of the documents, scripts, tools and techniques discussed in the talk are collected together at http://www.festvox.org (which is continually being updated).

Sound samples


This page is maintained by Alan W Black awb@cs.cmu.edu