Folks, I am trying to put together a server-side system where I use Microsoft SAPI to:

  • Perform multiple simultaneous dictation-style recognitions on the server.
  • Use a different speaker profile for each recognition (my application would identify the user and tell SAPI which profile to load).
  • Train the various user profiles programmatically.

I already know that some of the above is not possible from managed code, i.e. the System.Speech namespace. Can anyone enlighten me as to whether what I am trying to do is possible, at least in theory, using SAPI 5.x?

Thanks for your help.

-Raj

A: 

You would need Microsoft Speech Server to do this. You can still use managed code, but you would need to use the Microsoft.Speech.Recognition library (which uses the Server engines) instead of System.Speech.Recognition (which uses the desktop engine).

Aside from that, everything that you want can be done in native SAPI, of course.
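As a rough illustration of the native route, here is a minimal sketch (assuming the Windows SDK's SAPI 5.x headers and ATL; the logic for matching a profile token to the identified user is elided): create an in-process recognizer per session, attach a speaker profile, and load a dictation grammar.

```cpp
// Minimal SAPI 5.x sketch (Windows-only; error handling abbreviated).
// Compile against sapi.lib with sapi.h / sphelper.h from the Windows SDK.
#include <atlbase.h>
#include <sapi.h>
#include <sphelper.h>

HRESULT StartDictationSession()
{
    CComPtr<ISpRecognizer>        recognizer;
    CComPtr<IEnumSpObjectTokens>  profiles;
    CComPtr<ISpObjectToken>       profile;
    CComPtr<ISpRecoContext>       context;
    CComPtr<ISpRecoGrammar>       grammar;

    // In-process recognizer: each session gets its own engine instance,
    // which is what allows multiple simultaneous recognitions.
    HRESULT hr = recognizer.CoCreateInstance(CLSID_SpInprocRecognizer);
    if (FAILED(hr)) return hr;

    // Enumerate speaker profiles and attach one to this recognizer.
    // (Selecting the token for the identified user is left as a placeholder.)
    hr = SpEnumTokens(SPCAT_RECOPROFILES, NULL, NULL, &profiles);
    if (FAILED(hr)) return hr;
    hr = profiles->Next(1, &profile, NULL);
    if (hr != S_OK) return E_FAIL;
    hr = recognizer->SetRecoProfile(profile);
    if (FAILED(hr)) return hr;

    // Load and activate a dictation grammar -- the capability that the
    // server-oriented managed API does not expose.
    hr = recognizer->CreateRecoContext(&context);
    if (FAILED(hr)) return hr;
    hr = context->CreateGrammar(0, &grammar);
    if (FAILED(hr)) return hr;
    hr = grammar->LoadDictation(NULL, SPLO_STATIC);
    if (FAILED(hr)) return hr;
    return grammar->SetDictationState(SPRS_ACTIVE);
}
```

For a server, you would also set the recognizer's input to your incoming audio stream (via ISpRecognizer::SetInput) rather than the default microphone.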

Eric Brown
Eric, thanks for your response. The Microsoft.Speech.Recognition library does not allow dictation grammars. In my research I have found that the current server-based speech products from Microsoft either do not do dictation or do not expose that ability. Additionally, the managed speech APIs are restricted to the most common use case: writing speech-aware, user-interactive applications (as in, not server-side). It looks like native SAPI is the way to go for what I need.
You'll still need the server engines to do what you want. The desktop engines don't work with telephone-quality audio.
Eric Brown
True. I am thinking of streaming audio to my SAPI-based server from a web-based (Flash/Silverlight) UI or a mobile app.