views:

709

answers:

4

I have always wondered how many different search techniques existed, for searching text, for searching images and even for videos.

However, I have never come across a solution that searched for content within audio files.

For example: Let us assume that I have about 200 podcasts downloaded to my PC in the form of mp3, wav and ogg files. They are all named generically, say podcast1.mp3, podcast2.mp3, etc., so it is not possible to know what the content is without actually listening to them. Let's say that I am interested in finding out which of the podcasts talk about 'game programming'. I want the results to be shown as:

  • Podcast1.mp3 - 3 result(s) at time index(es) - 0:16:21, 0:43:45, 1:12:31
  • Podcast21.ogg - 1 result(s) at time index(es) - 0:12:01

So my questions:

  • How could one approach this problem?
  • Are there any suitable algorithms developed to do something like this?

One idea that cropped up in my mind was that one could use 'speech-to-text' software to get transcripts along with time indexes for each of the audio files, then parse the transcripts to get the output.

I was considering this as one of my hobby projects. Thanks!

+1  A: 

Onlinemag.net has an article about Issues with Multimedia Searching, including audio. It also provides some answers, but mostly it says to use speech-to-text, which is, by the way, the only solution I would imagine being effective.

Espo
+3  A: 

If you want to search for text (i.e. what is being said) inside an audio stream you would have to process it with some kind of speech recognition algorithm and store the text as meta data associated with the files. For video you could also do text recognition for text inside the video. Evernote already does this for text inside image files, but has no support for audio as far as I know.
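For the audio case, the search over the stored text could look roughly like the sketch below. It assumes the speech recognition step has already run and produced, for each file, a list of (start time in seconds, recognized text) segments; that layout, the sample data and the function names are all made up for illustration and do not come from any particular speech-to-text tool.

```python
from collections import defaultdict


def format_time(seconds):
    """Format a number of seconds as H:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def search_transcripts(transcripts, phrase):
    """Return {filename: [start_seconds, ...]} for segments containing phrase.

    The match is a case-insensitive substring test on each segment.
    """
    needle = phrase.lower()
    hits = defaultdict(list)
    for filename, segments in transcripts.items():
        for start, text in segments:
            if needle in text.lower():
                hits[filename].append(start)
    return dict(hits)


def format_results(hits):
    """Render hits in the 'file - N result(s) at time index(es)' style."""
    lines = []
    for filename in sorted(hits):
        times = ", ".join(format_time(t) for t in hits[filename])
        lines.append(
            f"{filename} - {len(hits[filename])} result(s) "
            f"at time index(es) - {times}"
        )
    return lines


# Hypothetical transcript metadata, as a recognizer might have produced it.
transcripts = {
    "podcast1.mp3": [
        (981, "today we talk about game programming"),
        (1500, "a word from our sponsor"),
        (2625, "back to Game Programming basics"),
    ],
    "podcast2.mp3": [(30, "some cooking tips")],
}

hits = search_transcripts(transcripts, "game programming")
for line in format_results(hits):
    print(line)
```

A plain substring test like this misses a phrase that straddles two segments and does nothing about recognition errors, so a real project would probably want overlapping segments or fuzzy matching on top of it.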

Something similar is possible when using audio to search for audio. I don't know the details of these algorithms, but I'm guessing they involve some kind of frequency analysis. Shazam is using this kind of technology to identify songs based on audio clips.
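To give a feel for what "frequency analysis" can mean here, the following toy sketch cuts a signal into fixed-size frames and records the strongest frequency in each one using a naive discrete Fourier transform. This is not Shazam's algorithm, and it is far too crude and slow for real audio; the frame size, sample rate and test tones are arbitrary values chosen for the illustration.

```python
import cmath
import math


def dominant_frequency(frame, sample_rate):
    """Return the frequency (Hz) of the strongest DFT bin in one frame."""
    n = len(frame)
    best_bin, best_mag = 0, 0.0
    # Skip bin 0 (the DC offset) and stop below the Nyquist bin.
    for k in range(1, n // 2):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


def fingerprint(samples, sample_rate, frame_size=400):
    """Return the dominant frequency of each non-overlapping frame."""
    return [
        dominant_frequency(samples[i:i + frame_size], sample_rate)
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


# Synthetic test signal: 0.1 s of a 440 Hz tone, then 0.1 s of 1000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(800)]
tone += [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]

print(fingerprint(tone, sample_rate))
```

Two clips could then be compared by looking for matching runs in their frequency sequences, though matching reliably against noisy recordings takes a considerably more robust fingerprint than one peak per frame.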

Here are some Wikipedia articles that may be useful:

Anders Sandvig
A: 

@Anders:

Thanks for the link to Shazam. I was not aware of anyone who did this.

Pascal
+2  A: 

If you want to outsource this to a service, you might like PodScope. We use it with moderate success at IT Conversations. Google also has audio search, but so far it's limited to YouTube.

Doug Kaye