Back in the old days, text-to-speech, cutting edge as it was, was far from perfect. When you typed in a word, it would pretty much read it exactly as you spelled it... in monotone. Oftentimes the result was very funny. Nowadays, text-to-speech is too intelligent to goof in ways that can bring a laugh.
As a personal project, I'd like to build an application that brings back this old style of text-to-speech, if only as a toy. In .NET, I have available to me both System.Speech.dll and the SpeechLib COM objects (Microsoft Speech Object Library). Both seem to use the OS's built-in text-to-speech, which, again, is too dang smart. Is there any way to configure these to disable whatever it is that makes them intelligent?
I've tried a few different 'SayAs' options, I've tried setting the culture to invariant (which throws an exception!), and now I'm looking at SSML. It's beginning to look like I'll have to find the old technology itself, but I don't even know where to begin there.
As an example of the chaos I'm hoping to see, here's some Moonbase Alpha for you: http://www.youtube.com/watch?v=Hv6RbEOlqRo (Make sure you are wearing headphones!)
Con flab these newfangled text-to-phoneme converters, and normalizers, and cableless phones, and...