I would like to write a little application in VB.NET that will detect a baby's cry. How would I get started with such an application?
My thought: If you can get access to the raw microphone data:
- Compute the mean and standard deviation of the samples and discard values that fall outside one standard deviation of the mean (this should get rid of most background noise)
- Normalize the data set
- Focus on the higher tones
- Configure your software to register an event on loud tones within a range of frequencies
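The steps above could be sketched roughly as follows. This is Python rather than VB.NET, purely for brevity, and the same logic should port over. Everything named here is an assumption of mine: the function names, the 400-sample frame size, and the 1000-4000 Hz band are placeholders to tune against real recordings. Note one swap: instead of discarding individual samples, the sketch estimates a noise floor from the mean and standard deviation of the per-frame band power and flags frames that rise above it.

```python
import math
import statistics

def normalize(samples):
    """Remove the DC offset and scale so the largest magnitude is 1.0."""
    mean = statistics.fmean(samples)
    centered = [s - mean for s in samples]
    peak = max(abs(s) for s in centered)
    return centered if peak == 0 else [s / peak for s in centered]

def goertzel_power(frame, sample_rate, freq):
    """Power of a single frequency in one frame (Goertzel algorithm)."""
    n = len(frame)
    k = round(n * freq / sample_rate)          # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def band_power(frame, sample_rate, low_hz, high_hz, step_hz=100):
    """Sum the power at evenly spaced frequencies inside the band."""
    return sum(goertzel_power(frame, sample_rate, f)
               for f in range(low_hz, high_hz + 1, step_hz))

def detect_loud_frames(samples, sample_rate,
                       low_hz=1000, high_hz=4000, frame_size=400):
    """Return indices of frames whose in-band power exceeds mean + 1 std.

    The band limits and frame size are placeholder assumptions.
    """
    data = normalize(samples)
    frames = [data[i:i + frame_size]
              for i in range(0, len(data) - frame_size + 1, frame_size)]
    if not frames:
        return []
    powers = [band_power(f, sample_rate, low_hz, high_hz) for f in frames]
    threshold = statistics.fmean(powers) + statistics.pstdev(powers)
    return [i for i, p in enumerate(powers) if p > threshold]
```

Each returned index is a frame where you would "register an event". A real program would feed this from the microphone buffer and probably require several consecutive loud frames before reacting.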
Depending on the amount of effort that you want to throw into this, you could use Bayesian or neural networks to determine whether the sound was the baby or not. It would make the program a bit more complicated; without it, however, the program might try to soothe said baby when the baby does not wish to be soothed.
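To give an idea of what the Bayesian route might look like, here is a minimal Gaussian naive Bayes sketch in plain Python. It assumes you have already reduced each sound clip to a short list of numeric features (say, an estimated pitch and a loudness value). The feature choice and the function names `train` and `classify` are hypothetical, and a real system would need many labelled recordings.

```python
import math
import statistics
from collections import defaultdict

def train(examples):
    """examples: list of (feature_list, label) pairs.

    Returns {label: (prior, [(mean, variance) per feature])}.
    """
    by_label = defaultdict(list)
    for features, label in examples:
        by_label[label].append(features)
    model = {}
    for label, rows in by_label.items():
        stats = []
        for column in zip(*rows):
            mean = statistics.fmean(column)
            # Small constant keeps the variance from being exactly zero.
            var = statistics.pvariance(column, mean) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(examples), stats)
    return model

def log_gaussian(x, mean, var):
    """Log of the normal density at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def classify(model, features):
    """Return the label with the highest log posterior score."""
    def score(label):
        prior, stats = model[label]
        return math.log(prior) + sum(
            log_gaussian(x, m, v) for x, (m, v) in zip(features, stats))
    return max(model, key=score)
```

You would call `train` once on labelled clips (for example with the labels "cry" and "other") and then `classify` on the features of each new clip.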
Audio processing systems tend to use a great deal of math to massage the data and infer information from raw streams. VB.NET might not be the best platform when it comes to math and input APIs that produce high-quality results and good performance.
Signal processing is significantly more complicated than just applying algorithms in the hope that the application works. You really need to plan what you want to do, how to proceed and, most importantly, how to test your results to verify the usefulness of the program.
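One way to approach the testing part, as a rough sketch: run the detector over a set of clips you have labelled by hand and count how often it agrees. The function name `evaluate` and the choice of precision and recall as the metrics are my assumptions, not something the program requires.

```python
def evaluate(predictions, truths):
    """Compare detector output with hand labels (both lists of booleans).

    Returns precision (how many alarms were real cries) and
    recall (how many real cries raised an alarm).
    """
    tp = sum(1 for p, t in zip(predictions, truths) if p and t)
    fp = sum(1 for p, t in zip(predictions, truths) if p and not t)
    fn = sum(1 for p, t in zip(predictions, truths) if not p and t)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": precision, "recall": recall}
```

Low precision means false alarms; low recall means missed cries. Which one matters more depends on what the program does when it fires.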
Getting input from a microphone is fairly simple. Analysing the raw WAV data can be made simple if you can identify key characteristics of a baby's cry. Record babies crying. What's common? Is it a change in pitch, or the duration? Once you know what is common, search for an algorithm that can identify that change in a series of changing values. There are a lot of algorithms that can find ranges of change in a series of numbers.
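As one illustration of "find a change in a series of numbers", the sketch below estimates a crude pitch per frame by counting zero crossings and then looks for stretches where that pitch stays inside a range for a minimum number of frames. It is Python for brevity. The function names are made up, zero-crossing counting is only a rough pitch estimate, and the range and duration limits must come from your own recordings.

```python
def zero_crossing_pitch(frame, sample_rate):
    """Rough pitch estimate: two zero crossings per cycle."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a < 0) != (b < 0))
    duration = len(frame) / sample_rate
    return crossings / (2.0 * duration)

def sustained_runs(values, low, high, min_frames):
    """Return (start, end) index pairs, end exclusive, where the values
    stay within [low, high] for at least min_frames consecutive entries."""
    runs = []
    start = None
    for i, v in enumerate(values):
        inside = low <= v <= high
        if inside and start is None:
            start = i
        elif not inside and start is not None:
            if i - start >= min_frames:
                runs.append((start, i))
            start = None
    # Close a run that reaches the end of the series.
    if start is not None and len(values) - start >= min_frames:
        runs.append((start, len(values)))
    return runs
```

You would call `zero_crossing_pitch` on each frame of the recording, collect the results in a list, and pass that list to `sustained_runs`. A run that lasts long enough is a candidate cry.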