I have always wondered how many different search techniques exist for text, for images and even for video.
However, I have never come across a solution that searched for content within audio files.
For example: let us assume that I have about 200 podcasts downloaded to my PC as mp3, wav and ogg files. They are all named generically, say podcast1.mp3, podcast2.mp3, etc., so it is not possible to know what the content is without actually listening to them. Let's say I am interested in finding out which of the podcasts talk about 'game programming'. I want the results to be shown as:
- podcast1.mp3 - 3 result(s) at time index(es) - 0:16:21, 0:43:45, 1:12:31
- podcast21.ogg - 1 result(s) at time index(es) - 0:12:01
So my questions:
- How could one approach this problem?
- Are there any suitable algorithms developed to do something like this?
One idea that cropped up in my mind was to use 'speech-to-text' software to get a transcript with time indexes for each of the audio files, then search each transcript for the phrase to produce the output.
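To make the idea concrete, here is a rough sketch of just the "search the transcript" half in Python. It assumes the speech-to-text step has already produced, for each file, a list of (start time in seconds, recognized text) segments. That transcript format, the function names and the sample data are all made up for illustration; the speech recognition itself is not shown.

```python
from typing import Dict, List, Tuple

# Assumed (hypothetical) transcript format: (start time in seconds, recognized text).
Segment = Tuple[float, str]


def format_time(seconds: float) -> str:
    """Render seconds as H:MM:SS, like the time indexes in the desired output."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def find_phrase(segments: List[Segment], phrase: str) -> List[float]:
    """Return start times of segments containing the phrase (case-insensitive)."""
    # Collapse whitespace so "game   programming" still matches "game programming".
    needle = " ".join(phrase.lower().split())
    hits = []
    for start, text in segments:
        haystack = " ".join(text.lower().split())
        if needle in haystack:
            hits.append(start)
    return hits


def search_library(transcripts: Dict[str, List[Segment]], phrase: str) -> List[str]:
    """Build one result line per file that mentions the phrase at least once."""
    lines = []
    for filename, segments in transcripts.items():
        hits = find_phrase(segments, phrase)
        if hits:
            times = ", ".join(format_time(t) for t in hits)
            lines.append(
                f"{filename} - {len(hits)} result(s) at time index(es) - {times}"
            )
    return lines


# Made-up sample transcripts, standing in for real speech-to-text output.
sample = {
    "podcast1.mp3": [
        (981.0, "today we talk about game programming basics"),
        (1200.0, "a word from our sponsor"),
        (2625.0, "back to Game  Programming patterns"),
    ],
    "podcast2.mp3": [(10.0, "some cooking tips")],
}

for line in search_library(sample, "game programming"):
    print(line)
```

This is plain substring matching, so it would miss a phrase that is split across two segments and it depends entirely on the recognizer having transcribed the words correctly.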
I was considering this as one of my hobby projects. Thanks!