I am writing a spelling-word application for my son and would like recommendations for good text-to-speech APIs whose speech is easy to understand. I am programming in .NET, so something that interoperates with it would be handy. Thanks in advance.

+7  A: 

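The Microsoft Speech SDK. It is exposed to .NET through the System.Speech assembly (part of .NET Framework 3.0 and later; add a reference to System.Speech.dll). It is free and very easy to use, and my kids loved it.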

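using System.Speech.Synthesis;  // requires a reference to System.Speech.dll

public class SpeakHelloWorld
{
  public static void Main(string[] args)
  {
      // SpeechSynthesizer is IDisposable, so release it when done.
      using (SpeechSynthesizer synthesizer = new SpeechSynthesizer())
      {
          // Speak blocks until the text has been spoken on the default audio device.
          synthesizer.Speak("As for me and my house, ...");
      }
  }
}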

The speech generated by the code above lacks natural inflection, pauses, and so on, so a complete sentence does not sound human. Single words sound OK, just somewhat robotic.
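For a spelling app, one possible way to compensate is to add the pauses yourself and slow the voice down. The following is a minimal sketch, assuming the same System.Speech assembly; the class name SpellingPrompt, the 750 ms pause, and the rate of -2 are illustrative choices, not something from the answer above.

```csharp
using System;
using System.Speech.Synthesis;  // requires a reference to System.Speech.dll

// Hypothetical helper: says a word, pauses, then spells it letter by letter.
public static class SpellingPrompt
{
    // Builds the prompt without speaking it, so it can be inspected or reused.
    public static PromptBuilder Build(string word)
    {
        PromptBuilder prompt = new PromptBuilder();
        prompt.AppendText(word);                              // the word itself
        prompt.AppendBreak(TimeSpan.FromMilliseconds(750));   // explicit pause (assumed length)
        prompt.AppendTextWithHint(word, SayAs.SpellOut);      // hint: read letter by letter
        return prompt;
    }

    // Speaks the prompt on the default audio device.
    public static void Say(string word)
    {
        using (SpeechSynthesizer synthesizer = new SpeechSynthesizer())
        {
            synthesizer.Rate = -2;  // Rate ranges from -10 (slowest) to 10 (fastest)
            synthesizer.Speak(Build(word));
        }
    }
}
```

Calling SpellingPrompt.Say("tricky") should speak the word and then its letters. How natural it sounds still depends on the installed voice.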

For a little kid, with a small enough set of words, you might want to just record your own voice saying the words. I did that with a USA state puzzle, to pronounce the state names, rather than resort to the synthesizer.
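Playing back your own recordings can be sketched roughly as follows. This assumes one .wav file per word, named after the word in lowercase and stored in a single folder; that naming scheme and the class name RecordedWords are assumptions made for illustration.

```csharp
using System.IO;
using System.Media;

// Hypothetical helper: plays "<folder>/<word>.wav" if such a recording exists.
public static class RecordedWords
{
    // Maps a word to the expected recording path (assumed naming convention).
    public static string PathFor(string folder, string word)
    {
        return Path.Combine(folder, word.Trim().ToLowerInvariant() + ".wav");
    }

    // Returns false when there is no recording, so the caller can fall back
    // to the speech synthesizer instead.
    public static bool TryPlay(string folder, string word)
    {
        string path = PathFor(folder, word);
        if (!File.Exists(path))
        {
            return false;
        }
        using (SoundPlayer player = new SoundPlayer(path))
        {
            player.PlaySync();  // blocks until the recording has finished playing
        }
        return true;
    }
}
```

Returning a bool rather than throwing keeps the fallback simple: try the recording first, and use the synthesizer only for words that have not been recorded yet.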

Cheeso
Was it clear enough for a spelling-word application, or did it mispronounce the words?
Jeremy E
It's pretty good. It still sounds like a computer voice, but pretty good. It is very easy to try out, so you can see for yourself in just a few minutes.
Cheeso
+3  A: 

Don't know why I didn't think of this before -
I was doing a dictionary-lookup tool and wanted to add pronunciation to it. Rather than use text-to-speech, which gives a robotic sound, I took a different approach. M-W.com has human voices captured in .wav files for most words. So I screen-scrape the Merriam-Webster website to grab the .wav file for the word, and then just play that. If your app will be connected to the Internet, then maybe this would work for you, too.

This is the flow it goes through:

pronouncing Tricky...looking up 'Tricky'...
dictionary page: http://www.merriam-webster.com/dictionary/Tricky
got dictionary page markup, 35828 chars...
getting pronunciation uri...
got uri: 'http://www.merriam-webster.com//cgi-bin/audio.pl?tricky01.wav=tricky'...
getting page markup...
got pronunciation page markup, 3498 chars...
getting wav uri...
got wav uri: 'http://media.merriam-webster.com/soundc11/t/tricky01.wav'...
getting wav data...
got wav data, 6260 bytes...
playing wav data.
done.

Here's some prototype source code that does it.

This works on the .NET Framework 2.0 and also on the .NET Compact Framework 2.0. It's just an illustration. It is somewhat naive about selecting the proper .wav file when there are multiple word forms or multiple pronunciations; if you ask for a plural form, you may not get it. You may also want to add caching and additional exception handling to harden it.
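As a rough sketch of the flow logged above (this is not the prototype itself), the steps might look like the following on the desktop .NET Framework. The class and method names are hypothetical, and the two regular expressions are guesses inferred only from the URIs shown in the log; the site's markup may differ or may have changed, so treat the patterns as assumptions to adjust.

```csharp
using System;
using System.IO;
using System.Media;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical sketch of: dictionary page -> pronunciation page -> wav -> play.
public static class Pronouncer
{
    private const string Site = "http://www.merriam-webster.com";

    public static string DictionaryPageUri(string word)
    {
        return Site + "/dictionary/" + Uri.EscapeDataString(word);
    }

    // Assumption: the dictionary page refers to "audio.pl?<file>.wav=<word>".
    public static string FindPronunciationUri(string markup)
    {
        Match m = Regex.Match(markup, @"audio\.pl\?([\w-]+\.wav)=([\w-]+)");
        return m.Success ? Site + "/cgi-bin/" + m.Value : null;
    }

    // Assumption: the pronunciation page contains an absolute URI ending in ".wav".
    public static string FindWavUri(string markup)
    {
        Match m = Regex.Match(markup, @"http://[^""'\s<>]+\.wav");
        return m.Success ? m.Value : null;
    }

    // Downloads and plays the recording; throws if either link cannot be found.
    public static void Pronounce(string word)
    {
        using (WebClient client = new WebClient())
        {
            string page = client.DownloadString(DictionaryPageUri(word));
            string pronunciationUri = FindPronunciationUri(page);
            if (pronunciationUri == null)
                throw new InvalidOperationException("No pronunciation link found.");

            string wavUri = FindWavUri(client.DownloadString(pronunciationUri));
            if (wavUri == null)
                throw new InvalidOperationException("No wav link found.");

            byte[] wav = client.DownloadData(wavUri);
            using (MemoryStream stream = new MemoryStream(wav))
            using (SoundPlayer player = new SoundPlayer(stream))
            {
                player.PlaySync();  // blocks until playback finishes
            }
        }
    }
}
```

Keeping the URI extraction in small string-in, string-out methods makes it possible to check the patterns against saved page markup without touching the network, which helps when the site changes.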

Cheeso
I would love to see the code for that. That is a really cool idea!
Jeremy E
OK, I updated the post and included the code.
Cheeso