I am working on improving Festival support in Emacs. I need finer control over Festival while it reads a sentence. Basically, I need two things:
- Show which word is currently being read.
- Change the speed (and maybe pitch) of what is being read.
Ideally, Festival would output some data structure that links each offset/length pair (usually the start and length of a word) to an output WAV file, or even to a location within a WAV file. I could then use something like mplayer to build a playlist, get notified when the next word starts playing, and look up where that word is in the buffer.
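To make the idea concrete, here is a rough sketch (in Python, just for illustration) of the kind of structure I have in mind. It assumes I can somehow get a list of (word, end time in seconds) pairs for the synthesized audio; I don't know yet whether Festival can give me that. The names `WordSpan`, `align`, and `word_at` are hypothetical, not part of Festival or Emacs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class WordSpan:
    word: str
    offset: int   # 0-based character offset of the word in the text
    length: int   # number of characters in the word
    start: float  # seconds into the audio where the word begins
    end: float    # seconds into the audio where the word ends


def align(text: str, timings: List[Tuple[str, float]]) -> List[WordSpan]:
    """Pair each (word, end_time) entry with its position in `text`.

    Assumption: the timed words occur in `text` verbatim and in order.
    Each word is taken to start where the previous one ended.
    """
    spans: List[WordSpan] = []
    cursor = 0       # search position in the text
    prev_end = 0.0   # end time of the previous word
    for word, end in timings:
        offset = text.find(word, cursor)
        if offset < 0:
            raise ValueError(f"word {word!r} not found after offset {cursor}")
        spans.append(WordSpan(word, offset, len(word), prev_end, end))
        cursor = offset + len(word)
        prev_end = end
    return spans


def word_at(spans: List[WordSpan], t: float) -> Optional[WordSpan]:
    """Return the word being spoken at playback time `t`, or None."""
    for span in spans:
        if span.start <= t < span.end:
            return span
    return None
```

With made-up timings such as `[("Hello", 0.42), ("brave", 0.80), ("new", 1.05), ("world", 1.60)]` for the text "Hello brave new world", `word_at(spans, 0.9)` would return the span for "new" at offset 12 with length 3. Since Emacs buffer positions are 1-based, the highlight would start at the sentence's start position plus that offset.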
I'm also hoping there's a simple command to change the speed of what is being read. However, mplayer can do that for me, so it's not a big deal as long as I can get the first item (showing the current word) working.