tags:
views: 79
answers: 5

This one is probably for someone with some knowledge of music theory. Humans can identify certain characteristics of sounds, such as pitch, frequency, etc. Based on these properties, we can compare one sound to another and get a measure of similarity. For instance, it is fairly easy to distinguish the sound of a piano from that of a guitar, even if both are playing the same note.

If we were to go about the same process programmatically, starting with two audio samples, what properties of the sounds could we compute and use for our comparison? On a more technical note, are there any popular APIs for doing this kind of stuff?

P.S.: Please excuse me if I've made any elementary mistakes in my question or I sound like a complete music noob. It's because I am a complete music noob.

A: 

Any and all properties of sound can be represented / computed - you just need to know how. One of the more interesting is spectral analysis / spectrogramming (see http://en.wikipedia.org/wiki/Spectrogram).
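As a minimal sketch of what spectral analysis looks like in code (assuming NumPy and SciPy are available), this computes a spectrogram of a synthesized tone and locates its dominant frequency:

```python
import numpy as np
from scipy.signal import spectrogram

# Synthesize one second of a 440 Hz tone at an 8 kHz sample rate.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)

# Split the signal into short windows and compute the power in each
# frequency bin over time -- that 2-D array is the spectrogram.
freqs, times, power = spectrogram(tone, fs=fs)

# Summing power over time, the strongest bin should sit near 440 Hz.
peak_freq = freqs[power.sum(axis=1).argmax()]
```

Comparing the spectrograms of two recordings (rather than their raw samples) is one common starting point for similarity measures, since the spectrogram captures how the frequency content evolves over time.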

Will A
+1  A: 

There are two sets of properties.

The "Frequency Domain" -- the amplitude of each overtone (frequency component) present in a given sample.

The "Time Domain" -- the sequence of amplitude samples through time.

You can, using Fourier Transforms, convert between the two.

The time domain is what sound "is" -- a sequence of amplitudes. The frequency domain is what we "hear" -- a set of overtones and pitches that determine instruments, harmonies, and dissonance.

A mixture of the two -- frequencies varying through time -- is the perception of melody.
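The time/frequency relationship above can be demonstrated directly with NumPy's FFT (a sketch, assuming a signal built from a 100 Hz fundamental plus a quieter 200 Hz overtone):

```python
import numpy as np

fs = 1000
t = np.arange(fs) / fs
# Time domain: a fundamental at 100 Hz plus a half-amplitude overtone at 200 Hz.
signal = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)

# Frequency domain: the FFT gives each frequency bin's complex coefficient;
# its magnitude (scaled by N/2) is the amplitude of that overtone.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
amplitudes = np.abs(spectrum) / (len(signal) / 2)

# The inverse transform converts back, recovering the original samples.
recovered = np.fft.irfft(spectrum, n=len(signal))
```

With one second of samples the bins are 1 Hz apart, so `amplitudes[100]` and `amplitudes[200]` read off the two overtones directly, and `recovered` matches `signal` to within floating-point error.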

S.Lott
A: 

Ignore all the arbitrary human-created terms that you may be unfamiliar with, and consider a simpler description of reality.

Sound, like anything else we perceive, is simply a spatial-temporal pattern, in this case "of movement"... of atoms (air particles, piano strings, etc.). Movement of objects leads to movement of air that creates pressure waves in our ears, which we interpret as sound.

Computationally, this is easy to model; however, because this movement can be any pattern at all -- from a violent random shaking to a highly regular oscillation -- there often is no constant identifiable "frequency", because it's often not a perfectly regular oscillation. The shape of the moving object, waves reverberating through it, etc. all cause very complex patterns in the air... like the waves you'd see if you punched a pool of water.

The problem reduces to identifying common patterns and features of movement (at very high speeds). Because the patterns are arbitrary, you really need a system that learns to classify common patterns of movement (i.e. movement represented numerically in the computer) into conceptual buckets of some sort.

Triynko
A: 

Any properties you want can be measured or represented in code. What do you want?

Do you want to test if two samples came from the same instrument? That two samples of different instruments have the same pitch? That two samples have the same amplitude? The same decay? That two sounds have similar spectral centroids? That two samples are identical? That they're identical but maybe one has been reverberated or passed through a filter?
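One of the properties mentioned above, the spectral centroid, is simple to compute: it is the amplitude-weighted mean frequency of a signal, often used as a rough "brightness" measure. A sketch using NumPy (the 220 Hz / 1760 Hz test tones are invented for illustration):

```python
import numpy as np

def spectral_centroid(samples, fs):
    """Amplitude-weighted mean frequency of the signal's spectrum."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1 / fs)
    return (freqs * spectrum).sum() / spectrum.sum()

fs = 8000
t = np.arange(fs) / fs
# A pure 220 Hz tone versus the same tone with a strong high overtone:
# the second sounds "brighter", and its centroid is correspondingly higher.
dull = np.sin(2 * np.pi * 220 * t)
bright = dull + 0.8 * np.sin(2 * np.pi * 1760 * t)
```

Comparing two samples on a feature like this (or a vector of such features) is typically how "similar timbre?" questions get answered programmatically.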

endolith
+1  A: 

The Echo Nest has easy-to-use analysis apis to find out all you might want to know about a piece of music.

You might find the analyze documentation (warning, pdf link) helpful.

Jason Sundram