ansaurus

Question

Answer 1

A:

If you have no experience in this then you'd be better off cutting your audio files manually using Audacity.

It sounds like you're trying to save yourself the effort of cutting your recordings manually, but speech recognition is a very complex topic. You'll spend orders of magnitude more time implementing/integrating your speech recognition engine and training the models than you would spend redesigning your application with the recordings cut 'by hand'.

If you must, you can look at the Microsoft Speech API. The Open Directory also has several links.

iWerner 2009-11-04 14:36:42

The whole point is that it needs to be automated; I am not the only one on my team cutting audio manually. With the token, it shouldn't be that hard for a good speech recognition engine to cut: all you're doing is recognizing the token word, getting the exact time before and after the token, and cutting the audio based on that.

pp 2009-11-04 14:42:53
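
The cutting step described in the comment above can be sketched roughly as follows. This is only an illustration, not anything posted in the thread: it assumes some recognizer has already produced the start and end time of each token word (that part is not shown and is the hard part), that the recordings are uncompressed WAV files, and the function names `segments_between_tokens` and `cut_wav` are made up for the example.

```python
import wave


def segments_between_tokens(token_spans, total_seconds):
    """Return the (start, end) spans of audio that lie between token words.

    token_spans is a list of (token_start, token_end) times in seconds,
    assumed to come from a speech recognizer (not shown here).
    """
    segments = []
    cursor = 0.0
    for token_start, token_end in sorted(token_spans):
        if token_start > cursor:
            segments.append((cursor, token_start))
        cursor = max(cursor, token_end)
    if cursor < total_seconds:
        segments.append((cursor, total_seconds))
    return segments


def cut_wav(src_path, dst_path, start_seconds, end_seconds):
    """Copy the frames between two timestamps into a new WAV file."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        first = min(int(start_seconds * rate), total)
        last = min(int(end_seconds * rate), total)
        src.setpos(first)
        frames = src.readframes(max(0, last - first))
    with wave.open(dst_path, "wb") as dst:
        # The frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
```

With token timestamps in hand, each span returned by `segments_between_tokens` would be passed to `cut_wav` to write one output file per prompt. Getting reliable timestamps is where the effort goes, as the answer points out.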

And the token word will be almost exactly the same as the token words in the audio file to cut.

pp 2009-11-04 14:44:16

@pp: No, it won't be almost exactly the same. Will the word have the same Pitch? Tempo? Volume? Inflection? Tone? Noise? Any changes to any of those result in vastly different bit patterns in the audio stream.

Joel Coehoorn 2009-11-04 15:17:05

The audio is recorded by professional voice talent, so it will be very similar.

pp 2009-11-04 15:21:52

OK, I've added some links to my answer. Are you developing an IVR? I still stand by my original answer though: If someone's going through the trouble of making the recordings, then someone might as well go through the trouble of recording them as separate files and labeling them properly.

iWerner 2009-11-04 15:27:23

I am not developing an IVR. I am trying to speed up the implementation of our professional services by automating anything that can be automated.

pp 2009-11-04 15:31:04

Our current process is cutting the files manually, and that is a big time waster for the team.

pp 2009-11-04 15:32:52

Also, we can change the token to a word that a speech recognition library recognizes very easily.

pp 2009-11-04 15:34:52

If you can instruct the voice talent to speak a special word to indicate a break in the recording, then it seems you should just as easily be able to instruct the voice talent (or the recording engineer) to press a button to indicate the same thing. At the time of the button press, have your software either start a new audio file or note the current time to cut the audio afterward.

Rob Kennedy 2009-11-04 20:30:40
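
A minimal sketch of the "note the current time" variant of the suggestion above, for illustration only. It assumes the recording software can call a function when the button is pressed and that the log is created at the moment recording starts; the class name `CutMarkLog` and its methods are hypothetical and not part of any real recording tool.

```python
import time


class CutMarkLog:
    """Record button presses as offsets (in seconds) from the start of recording."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the class can be tested without waiting.
        self._clock = clock
        self._started_at = clock()
        self.marks = []

    def press(self):
        """Call this when the talent or engineer presses the button."""
        offset = self._clock() - self._started_at
        self.marks.append(offset)
        return offset

    def segments(self, total_seconds):
        """Turn the recorded marks into (start, end) spans to cut afterward."""
        inside = sorted(m for m in self.marks if 0.0 < m < total_seconds)
        bounds = [0.0] + inside + [total_seconds]
        return list(zip(bounds, bounds[1:]))
```

After the session, `segments(total_seconds)` gives one span per prompt, and each span can be cut out of the long recording with whatever audio tool is already in use. No speech recognition is involved, which is the point of the suggestion.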

Good workaround, but it still doesn't solve the problem.

pp 2009-11-04 21:56:02


cutting audio files based on a keyword
