What is the current state of the art of sound matching / search in practical terms? I am currently remotely involved in planning a web application which, among other things, will contain and expose a database of short recorded audio clips (at most 3-5 seconds each, names of people). The question has been raised whether it would be possible to implement search based on user voice input. My gut tells me that this is an impossible task from both a computational and an algorithmic point of view, especially in a web application (and besides, it would not be a core feature of the application). I realize that there are probably a number of academic projects and that it would be a good research topic, but I doubt it is something that could be implemented in a medium-sized web application as an additional feature. To support my claims I spent half an hour searching so that I would not miss anything obvious, but I really could not find any good sources.
I know that it’s not very responsible to ask a question on SO without spending more time researching on my own, but I’ve noticed that firing off a question on SO is far more effective, precise and faster than just randomly Googling stuff.