I would like to write a program that can store a sound pattern, such as a train whistle or a horn (beep beep), listen for that sound via the microphone, and then take some action when the sound is heard. I know a little Python and programmed in VB a long time ago. Mainly I am an Oracle, PL/SQL guy. The program will require a modest UI.

What is the best solution combination (language, third party add-ons etc..) to tackle this problem?

+1  A: 

Sphinx is a speech recognition system. It might be possible to modify it, or even train it, to work the way you expect.

altCognito
+2  A: 

My guess is that the path of least resistance in this case is to use a third-party audio recognition library in combination with a high-level language (such as Java or one of the .NET family of languages, such as C# or VB.NET).

You can start by doing some research in the areas of Digital Sound Processing and Audio Recognition.

When you find a library or framework that has the capabilities you're interested in, and bindings in your language of choice, start implementing with it.

See MARF (a Java library) and maybe Microsoft's work in this area with the System.Speech.Recognition namespace (which, if I remember correctly, has been integrated with the newer Windows operating systems).

EDIT - Desktop vs. Run From Web

In the comments you asked about using Flash or Silverlight so that your solution could work both on the desktop and from the web.

First off, I would like to point out that both Flash and Silverlight actually run on the client computer. The difference is that they run in the context of a web browser, and that the user doesn't have to install the application. Otherwise they are not much different from a desktop application, and the user obviously has to have the Flash or Silverlight plugin installed for their browser.

If that's what you're after (i.e. the user not having to install your application), then you can look into Flash, Silverlight or Java Web Start. Actually, Java Web Start would probably be a good candidate because you could leverage the MARF framework.

However, if you do decide to go with Flash, Silverlight, or Java Web Start, there are some security issues that you might have to deal with, because accessing client system resources is bound to require some privileges that most "web-based apps" don't typically require.

Miky Dinescu
+1  A: 

If you're listening for a particular recording of a horn or a train whistle that the program knows about beforehand, and the sounds are sufficiently distinctive, you will likely be able to detect them and distinguish between them reliably.

Classifying a new sound that the program hasn't heard before (as sounding like a horn, or like a train whistle, etc.) is a much harder problem.

In either case, sound identification algorithms will generally look at the frequency spectrum of recorded sound (see Miky D's link on digital sound processing), and perform some pattern recognition on this data, rather than on the recorded waveform itself.
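To illustrate why the comparison is done on the spectrum rather than on the waveform, here is a minimal Python sketch using only the standard library. The names (`magnitude_spectrum`, `cosine_similarity`, `tone`), the 8 kHz sample rate, the 256-sample frame and the synthetic sine tones are all assumptions made for the example. A real program would read frames from the microphone, and would probably use an FFT routine such as `numpy.fft.rfft` instead of this slow direct DFT.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Return the magnitudes of DFT bins 0..N/2 for a list of samples."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def cosine_similarity(a, b):
    """Similarity in [-1, 1]; 1.0 means the two vectors have the same shape."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tone(freq_hz, sample_rate=8000, n=256, phase=0.0):
    """Hypothetical test signal: a pure sine tone (stand-in for a recording)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate + phase)
            for t in range(n)]


stored = tone(1000)                # the "known" sound
heard = tone(1000, phase=1.0)      # same sound, captured at a different moment
other = tone(2500)                 # a different sound

template = magnitude_spectrum(stored)
score_same = cosine_similarity(template, magnitude_spectrum(heard))
score_other = cosine_similarity(template, magnitude_spectrum(other))
score_waveform = cosine_similarity(stored, heard)  # raw samples, no spectrum

print(score_same, score_other, score_waveform)
```

In this toy setup the spectrum comparison scores the time-shifted copy close to 1 and the different tone close to 0, while comparing the raw samples of the same tone gives only about 0.54, simply because the two captures are out of phase.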

As for language and third-party libraries, go for something which allows you to get at the recorded audio data with a minimum of fuss. Java seems good in this respect (see also WEKA, a Java machine learning toolkit). While there are programs/libraries around for speech and music analysis, I don't know of one designed for arbitrary sounds, so you may end up having to write the analysis algorithm yourself.

Chris Johnson
A: 

Most algorithms I know of use the spectrogram (i.e. the spectrum vs. time) to distinguish sounds. You can estimate how hard this problem will be by how different your spectrograms look.
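As a rough illustration of "how different your spectrograms look", the following Python sketch builds a crude spectrogram and measures the distance between two of them. Everything here is an assumption for the example: the helper names, the 64-sample non-overlapping frames, the 8 kHz sample rate, and the synthetic "whistle" and "horn" signals. Real tools use overlapping, windowed frames and an FFT.

```python
import cmath
import math

FRAME = 64  # samples per frame (assumed value for this sketch)


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 of one frame (slow direct DFT)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectrogram(samples, frame_size=FRAME):
    """Spectrum vs. time: one magnitude spectrum per non-overlapping frame."""
    return [dft_magnitudes(samples[i:i + frame_size])
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


def spectrogram_distance(a, b):
    """Mean Euclidean distance between corresponding frames."""
    frames = min(len(a), len(b))
    if frames == 0:
        return 0.0
    return sum(math.dist(fa, fb) for fa, fb in zip(a, b)) / frames


def tone(freq_hz, n_samples, sample_rate=8000, amplitude=1.0):
    """Hypothetical test signal: a pure sine tone."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]


# A steady whistle, a slightly quieter whistle, and a "beep beep" horn.
whistle = tone(1000, 8 * FRAME)
quiet_whistle = tone(1000, 8 * FRAME, amplitude=0.9)
beep = tone(500, 2 * FRAME)
silence = [0.0] * (2 * FRAME)
horn = beep + silence + beep + silence

spec_whistle = spectrogram(whistle)
d_same = spectrogram_distance(spec_whistle, spec_whistle)
d_quiet = spectrogram_distance(spec_whistle, spectrogram(quiet_whistle))
d_horn = spectrogram_distance(spec_whistle, spectrogram(horn))

print(d_same, d_quiet, d_horn)
```

The quieter whistle ends up much closer to the stored whistle than the horn does, which is the kind of gap a detection threshold would sit in. With real recordings the gap will be smaller and noisier.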

An aspect of your sounds that might make them easier to distinguish from speech is that they will likely have a clear harmonic structure (i.e. look more like the violin than the voice in the Wikipedia link). This harmonic structure can be very useful for telling sounds apart. This brings to mind another place to look: there's a lot of work on distinguishing bird songs, which also have a clear harmonic structure, and there are many published algorithms, though I don't know of free software that can be extended to your needs. Still, it might be useful to use birdsong analysis software just to take a look at your sound files. See the Raven project, for example, though there are many other free spectrogram packages.
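One simple way to put a number on "clear harmonic structure" is to ask what fraction of a frame's energy sits at integer multiples of a fundamental frequency. The Python sketch below does this with the Goertzel algorithm, which measures the energy at individual frequencies without computing a full spectrum. This is just one possible measure, not a method taken from any of the packages mentioned; the function names, the assumption that the fundamental (`f0_hz`) is already known, and the synthetic test signals are all made up for the example.

```python
import math


def goertzel_power(samples, k):
    """Squared DFT magnitude |X[k]|^2 at bin k, via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * k / len(samples))
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def harmonic_ratio(samples, sample_rate, f0_hz, n_harmonics=5):
    """Fraction of signal energy at f0, 2*f0, ..., n_harmonics*f0.

    Assumes f0_hz is known in advance (e.g. from the stored template).
    """
    n = len(samples)
    total = sum(x * x for x in samples)  # time-domain energy
    if total == 0:
        return 0.0
    harmonic = 0.0
    for h in range(1, n_harmonics + 1):
        k = round(h * f0_hz * n / sample_rate)  # nearest DFT bin
        if 0 < k < n / 2:
            # Factor 2/n accounts for the mirrored negative-frequency bin.
            harmonic += 2.0 * goertzel_power(samples, k) / n
    return harmonic / total


def mix(partials, sample_rate=8000, n=256):
    """Hypothetical test signal: sum of sines from (frequency_hz, amplitude) pairs."""
    return [sum(a * math.sin(2 * math.pi * f * t / sample_rate)
                for f, a in partials)
            for t in range(n)]


# Partials at exact multiples of 250 Hz vs. partials at unrelated frequencies.
harmonic_sound = mix([(250, 1.0), (500, 0.5), (750, 0.25)])
inharmonic_sound = mix([(250, 1.0), (406.25, 0.8), (656.25, 0.6)])

ratio_harmonic = harmonic_ratio(harmonic_sound, 8000, 250)
ratio_inharmonic = harmonic_ratio(inharmonic_sound, 8000, 250)

print(ratio_harmonic, ratio_inharmonic)
```

For these idealized signals the harmonic sound scores close to 1 and the inharmonic one about 0.5. Real whistles and horns will not be this clean, and in practice the fundamental would have to be estimated first.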

tom10