I'm looking for a way to implement trainable voice recognition in C++.

I've found the SAPI 5.3 SDK, which looks promising, but the only tutorials I can find are for TTS, which is the opposite of what I want.

Can anyone recommend a good tutorial that covers everything I would need to get SAPI up and running?

Alternatively, is there another API I could use instead of SAPI? The only requirement is that it has to be distributable in a way that it can be installed on other Windows computers.

Thanks.

+4  A: 

This may be helpful.

http://www.generation5.org/content/2001/sr00.asp?Print=1

(caveat: I've never used SAPI for recognition.)

Nick
+1, that tutorial did provide a bit of help for me.
sheepsimulator
+2  A: 

There's this very old (Microsoft Visual C++ 4.1) text MAPI, SAPI, and TAPI Developer's Guide. Chapter 19 is Creating SAPI Applications with C++.

The speech team's blog, and the "All the Cool Developers use Speech APIs" blog.

Itamar Even-Zohar’s Page on Speech Recognition is a must.

For an open and/or portable solution, look at this Wikipedia article and CMU Sphinx, though Sphinx-4 is now in Java.

And you have surely looked here already... MSDN: Microsoft Speech API (SAPI) 5.3

anno
+1  A: 

If I recall, Microsoft had some really good SAPI examples that came with the development kit (SAPI 5 SDK), especially for the new .NET framework stuff. Basically, to do voice recognition, they give you two different ways of doing it: natural and scripted. I played around with it a bit in VB.NET, and got it to recognize some words based upon the BNF grammar I passed to the scripted engine. I know it also comes with a trainer to aid in recognition of a given speaker. The SDK was pretty clear to me, at least.

Microsoft's SAPI 5.3 documentation

Microsoft's SAPI 5 SDK download - contains examples in many different languages.

NB: Especially check out the help file that comes with it; it describes what is necessary to get the engine working, and what files/apps you need installed. The writing in the SDK, IIRC, was rather clear and somewhat Petzold-esque in tone, so if you persist a bit and study the examples, I suspect you can get this working in a reasonable amount of time.

sheepsimulator
+2  A: 

I work with SAPI and it's possible to make an ASR (Automatic Speech Recognition, or Speech to Text if you want) system :)

For that, using the source code provided here, you simply have to add something like this to the given code:

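// Create a grammar on the recognition context and load the dictation topic.
hr = cpRecoCtx->CreateGrammar(1, &cpGram2);
hr = cpGram2->LoadDictation(NULL, SPLO_DYNAMIC);
// Activate dictation so that free-form speech is recognized.
hr = cpGram2->SetDictationState(SPRS_ACTIVE);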

This will create a grammar that loads dictation (it recognizes everything you say, but it's not very accurate). If you know which words you want to recognize, you can see in the video that they make a grammar file. This file contains all the words that they want to recognize (recognition is better with a grammar than with plain dictation).

For this, they have in the code something like this:

hr = cpRecoCtx->CreateGrammar(1, &cpGram);
hr = cpGram->LoadCmdFromFile(argv[1], SPLO_STATIC);
hr = cpGram->SetRuleState(0, 0, SPRS_ACTIVE);

You can use both if you want, good luck! ;d
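In case it helps, here is a rough sketch of how such a grammar file could be generated from a list of words. It assumes the SAPI 5 XML text grammar format (GRAMMAR, RULE, L and P elements) and LANGID "409" for US English. The helper names xmlEscape and buildWordListGrammar are made up for this example and are not part of SAPI, so check the output against the SDK's grammar documentation before relying on it.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Escape the characters that are special in XML text and attribute values.
std::string xmlEscape(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;
        }
    }
    return out;
}

// Build the text of a grammar with a single top-level rule that
// accepts exactly one of the given words or phrases.
// Assumption: LANGID "409" (US English); adjust for other languages.
std::string buildWordListGrammar(const std::string& ruleName,
                                 const std::vector<std::string>& words) {
    assert(!ruleName.empty());
    std::ostringstream xml;
    xml << "<GRAMMAR LANGID=\"409\">\n";
    // TOPLEVEL="ACTIVE" marks the rule as a top-level rule that is active by default.
    xml << "  <RULE NAME=\"" << xmlEscape(ruleName) << "\" TOPLEVEL=\"ACTIVE\">\n";
    xml << "    <L>\n";  // <L> is a list of alternatives
    for (const std::string& w : words) {
        xml << "      <P>" << xmlEscape(w) << "</P>\n";  // <P> is one phrase
    }
    xml << "    </L>\n";
    xml << "  </RULE>\n";
    xml << "</GRAMMAR>\n";
    return xml.str();
}
```

For example, buildWordListGrammar("Colors", {"red", "green", "blue"}) returns the text for a three-word grammar. Saving that text to a file and passing the path as argv[1] should fit the LoadCmdFromFile call above, assuming the file is accepted in uncompiled XML form.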

aF
A: 

Hi, I am wondering whether you have done WAV-file-to-text profile training for SAPI. I am trying to do that, but with no success. Any help would be much appreciated. I have read most of the postings and articles on this on the net, but with no luck. Many thanks.

Taneem

tkm