I'm not looking for speech-to-text software. What I need is the following:
- I'll record multiple (say 50+) audio streams (recordings of radio stations)
- From those recordings, I'll mark interesting audio clips. Their length ranges from 2 to 60 seconds, and there will be a few thousand of them
- The library should be able to find other instances of the same audio clips in the recorded streams
- A confidence factor should be reported to the user, and the user should be able to provide additional input so that recognition performs better next time
Do you know of such a software library? An LGPL license would be most valuable to me, but I can go for a commercial license as well.
Audio clips will contain music, speech, effects, or any combination thereof, so speech (text) recognition is out of the question.
Architecture: C++, with C# for glue, and CUDA if possible.
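
To make the requirement concrete, here is a rough sketch of the behaviour I mean: given a recorded stream and a marked clip, return where the clip occurs and a confidence value. This is not taken from any existing library. `Match` and `findClip` are hypothetical names, and the brute-force normalized cross-correlation on raw samples is only an illustration under that assumption. It would be far too slow for 50+ streams and thousands of clips, which is why I'm looking for a proper library.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: where the clip was found and how well it matched.
struct Match {
    std::size_t offset;  // sample index in the stream where the best match starts
    double confidence;   // normalized correlation, clamped to [0, 1]
};

// Brute-force search (illustration only): slide the clip over the stream and
// compute the normalized cross-correlation at every offset.
// Cost is O(stream.size() * clip.size()).
Match findClip(const std::vector<float>& stream, const std::vector<float>& clip) {
    Match best{0, 0.0};
    if (clip.empty() || stream.size() < clip.size()) return best;

    // Energy of the clip is constant, so compute it once.
    double clipEnergy = 0.0;
    for (float s : clip) clipEnergy += static_cast<double>(s) * s;
    if (clipEnergy == 0.0) return best;  // a silent clip cannot be matched

    for (std::size_t off = 0; off + clip.size() <= stream.size(); ++off) {
        double dot = 0.0;
        double winEnergy = 0.0;
        for (std::size_t i = 0; i < clip.size(); ++i) {
            const double a = stream[off + i];
            dot += a * clip[i];
            winEnergy += a * a;
        }
        if (winEnergy == 0.0) continue;  // skip silent windows

        double score = dot / std::sqrt(winEnergy * clipEnergy);
        if (score > 1.0) score = 1.0;    // guard against rounding error
        if (score > best.confidence) best = {off, score};
    }
    return best;
}
```

Calling `findClip(stream, clip)` on a stream that contains an exact copy of the clip should give the copy's starting sample and a confidence close to 1. The "additional input" part (user feedback improving later recognition) is not covered by this sketch.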