Library for extracting words (speech) from an audio stream?

+1 Q:

I have an audio stream and I would like to extract the individual words (speech) from it. For example, given audio.wav, I would get 001.wav, 002.wav, 003.wav, etc., where each XXX.wav contains one word.

I am looking for a library or program to do this. The platform does not matter, but I would prefer an open-source solution.

Thank you in advance for your help.

+1 A:

Nuance, the company that makes Dragon NaturallySpeaking, offers a number of software development kits.

The AudioMining kit seems to match your requirements:

Dragon NaturallySpeaking SDK AudioMining is a speaker-independent speech recognition toolkit that enables the indexing of 100% of the speech information within audio files. The technology uses highly accurate speech recognition to turn audio files into XML text with timestamp information. This can be integrated with standard text-search products to enable rapid access to specific audio content.

Going from speech to text plus metadata (the timestamps) is far and away the hardest part to get right. Once you have the text and its timestamps, cutting the words out as individual audio files is much more straightforward.
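To illustrate that second, easier step, here is a minimal Python sketch using only the standard-library `wave` module. It assumes you have already obtained a list of (start, end) times in seconds for each word from some recognizer. The function name `split_wav_by_timestamps` and the shape of `word_spans` are made up for this example and are not part of any Nuance SDK; you would need to parse the recognizer's actual output into that form yourself.

```python
import wave
from pathlib import Path


def split_wav_by_timestamps(src_path, word_spans, out_dir):
    """Write one WAV file per (start_sec, end_sec) span and return the paths.

    word_spans is a hypothetical input format: an iterable of
    (start_sec, end_sec) pairs, one per word, in seconds.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for index, (start_sec, end_sec) in enumerate(word_spans, start=1):
            # Convert seconds to frame offsets, clamped to the file length.
            start = max(0, min(total, int(round(start_sec * rate))))
            end = max(start, min(total, int(round(end_sec * rate))))
            src.setpos(start)
            frames = src.readframes(end - start)
            # Name the outputs 001.wav, 002.wav, ... as in the question.
            out_path = out_dir / f"{index:03d}.wav"
            with wave.open(str(out_path), "wb") as dst:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected when the data is written.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
    return written
```

Calling `split_wav_by_timestamps("audio.wav", [(0.0, 0.4), (0.5, 1.1)], "words")` would write `words/001.wav` and `words/002.wav`. This only handles uncompressed PCM WAV input, and the quality of the cuts depends entirely on how accurate the timestamps are.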

mattbh 2010-07-06 12:08:40
