Hi,

I have audio files that need to be cut and split into multiple audio files based on a keyword. For example, let's say the keyword is "CUT".

If we had a WAV file called "original.wav" with the following audio,
"Hello , is this CUT the time is CUT My name is CUT The balance is"

and a token file, cut.wav, which contains the audio "CUT",

then original.wav and cut.wav are fed into a program

and the output is

file1.wav which contains audio "Hello, is this"
file2.wav which contains audio "the time is"
file3.wav which contains audio "My name is"
file4.wav which contains audio "The balance is"

I have no experience in audio programming at all. What libraries would I need, and how would I go about this?

Thanks

A: 

If you have no experience in this then you'd be better off cutting your audio files manually using Audacity.

It sounds like you're trying to save yourself the effort of cutting your recordings manually, but speech recognition is a very complex topic. You'll spend orders of magnitude more time implementing/integrating your speech recognition engine and training the models than you would spend redesigning your application with the recordings cut 'by hand'.

If you must, you can look at the Microsoft Speech API. The Open Directory also has several links.

iWerner
The whole point is that it needs to be automated; I am not the only one on my team cutting audio manually. With the token, it shouldn't be that hard for a good speech recognizer to cut. All you're doing is recognizing the token word, getting the exact time before and after the token, and cutting the audio based on that.
pp
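To make the "get the time before and after the token, then cut" step concrete, here is a minimal sketch that assumes the token positions are already known, however they were found (speech recognition, manual markers, or otherwise). It uses only Python's standard `wave` module; the function names `segments_to_keep` and `split_wav` and the `file{}.wav` naming pattern are hypothetical, chosen to mirror the example in the question.

```python
import wave

def segments_to_keep(total_s, cuts):
    """Return (start, end) spans, in seconds, left after removing each cut interval."""
    keep, pos = [], 0.0
    for cut_start, cut_end in sorted(cuts):
        if cut_start > pos:
            keep.append((pos, cut_start))
        pos = max(pos, cut_end)  # also handles overlapping cut intervals
    if pos < total_s:
        keep.append((pos, total_s))
    return keep

def split_wav(src_path, cuts, out_pattern="file{}.wav"):
    """Write each kept span of src_path to its own WAV file and return the paths."""
    paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total_s = src.getnframes() / rate
        for i, (start, end) in enumerate(segments_to_keep(total_s, cuts), 1):
            first = int(start * rate)
            src.setpos(first)
            frames = src.readframes(int(end * rate) - first)
            path = out_pattern.format(i)
            with wave.open(path, "wb") as dst:
                dst.setparams(params)    # same channels, sample width, and rate
                dst.writeframes(frames)  # frame count in the header is fixed on close
            paths.append(path)
    return paths
```

For example, `split_wav("original.wav", [(1.2, 1.7), (3.0, 3.5)])` would write three files containing the audio before, between, and after the two token intervals. The hard part, finding those intervals reliably, is not addressed here.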
And the token word will be almost exactly the same as the token words in the audio file to cut.
pp
@pp: No, it won't be almost exactly the same. Will the word have the same Pitch? Tempo? Volume? Inflection? Tone? Noise? Any changes to any of those result in vastly different bit patterns in the audio stream.
Joel Coehoorn
The audio is recorded by professional voice talent, so it will be very similar.
pp
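If the spoken token really were close to identical each time, one naive way to locate it would be template matching by normalized cross-correlation over the raw samples. The sketch below is only an illustration of that idea, not speech recognition: the function name `find_token` and the `threshold` value are assumptions, it runs in pure Python (so it is slow on real recordings), and for the reasons given in the comment above about pitch, tempo, and noise it may well fail on natural speech.

```python
import math

def find_token(signal, token, threshold=0.9):
    """Return start indices where token correlates with signal at or above threshold."""
    n = len(token)
    t_mean = sum(token) / n
    t = [x - t_mean for x in token]
    t_norm = math.sqrt(sum(x * x for x in t))
    hits, i = [], 0
    while i + n <= len(signal):
        window = signal[i:i + n]
        w_mean = sum(window) / n
        w = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w))
        score = 0.0
        if t_norm > 0 and w_norm > 0:
            # Normalized correlation is 1.0 for an exact match (up to gain and offset).
            score = sum(a * b for a, b in zip(w, t)) / (w_norm * t_norm)
        if score >= threshold:
            hits.append(i)  # accepts the first offset that clears the threshold
            i += n          # skip past this match before searching again
        else:
            i += 1
    return hits
```

Each returned index, divided by the sample rate, gives a token start time in seconds; adding the token's duration gives the end time of the interval to remove.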
OK, I've added some links to my answer. Are you developing an IVR? I still stand by my original answer though: If someone's going through the trouble of making the recordings, then someone might as well go through the trouble of recording them as separate files and labeling them properly.
iWerner
I am not developing an IVR. I am trying to speed up the implementation of our professional services by automating anything that can be automated.
pp
Our current process is cutting the files manually, and that is a big time waster for the team.
pp
Also, we can change the token to a word that a speech recognition library recognizes very easily.
pp
If you can instruct the voice talent to speak a special word to indicate a break in the recording, then it seems you should just as easily be able to instruct the voice talent (or the recording engineer) to press a button to indicate the same thing. At the time of the button press, have your software either start a new audio file or note the current time to cut the audio afterward.
Rob Kennedy
Good workaround, but it still doesn't solve the problem.
pp