views:

494

answers:

4

Hello,

I am trying to do "streaming" speech recognition in C# from a TCP socket. The problem I am having is that SpeechRecognitionEngine.SetInputToAudioStream() seems to require a Stream of a defined length which can seek. Right now the only way I can think to make this work is to repeatedly run the recognizer on a MemoryStream as more input comes in.

Here's some code to illustrate:

            SpeechRecognitionEngine appRecognizer = new SpeechRecognitionEngine();

            System.Speech.AudioFormat.SpeechAudioFormatInfo formatInfo = new System.Speech.AudioFormat.SpeechAudioFormatInfo(8000, System.Speech.AudioFormat.AudioBitsPerSample.Sixteen, System.Speech.AudioFormat.AudioChannel.Mono);

            NetworkStream stream = new NetworkStream(socket,true);
            appRecognizer.SetInputToAudioStream(stream, formatInfo);
            // At the line above a "NotSupportedException" complaining that "This stream does not support seek operations."

Does anyone know how to get around this? It must support streaming input of some sort, since it works fine with the microphone using SetInputToDefaultAudioDevice().

Thanks, Sean

+1  A: 

Have you tried wrapping the network stream in a System.IO.BufferedStream?

NetworkStream netStream = new NetworkStream(socket,true);
BufferedStream buffStream = new BufferedStream(netStream, 8000*16*1); // buffers 1 second worth of data
appRecognizer.SetInputToAudioStream(buffStream, formatInfo);
Serguei
Just tried it, and I got the same error.
spurserh
Did you verify that the buffered stream supported seeking? I.e., in the above code, does buffStream.CanSeek() return true?
Eric Brown
A: 

Hi Sean,

I am facing the same problem too. I have to read the stream from a pipe.

So, finally did you get any workaround for this ? Do we need to make our own custom Stream for this ? Or, would the Custom Audio Object do ?

Thanks in advance, Mark

Mark Carter
A: 

Hey Mark,

I ended up buffering the input and then sending it to the speech recognition engine in successively larger chunks. For instance, I might send at first the first 0.25 seconds, then the first 0.5 seconds, then the first 0.75 seconds, and so on until I get a result. I am not sure if this is the most efficient way of going about this, but it yields satisfactory results for me.

Best of luck, Sean

spurserh
A: 

Hi

I am into the same problem. Could you please tell me if you had come up with a better work-around? Could you please share the code?

Thanks a lot.

Holly