I have developed two different methods in MATLAB that analyse a pop song and then automatically create a 30-second audio thumbnail (a preview clip) containing part of the chorus section.

The two methods give different results:

  1. The first method can create a thumbnail for every track, and it found a chorus section in 40 of the 50 tested songs.
  2. The second method only worked on 30 of the 50 songs, and it found the chorus section in 21 of those 30.
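To make the comparison concrete, here is a minimal Python sketch that turns the counts above into percentages. The function name `rates` and the three measures it reports are my own choices for illustration; the only inputs are the numbers already given in the list.

```python
from fractions import Fraction

TRACKS = 50  # songs tested, from the list above

# (thumbnails produced, thumbnails containing part of a chorus)
results = {
    "method 1": (50, 40),
    "method 2": (30, 21),
}

def rates(produced, hits, total=TRACKS):
    """Return (coverage, hits per thumbnail, hits per track), all in percent."""
    coverage = 100 * Fraction(produced, total)       # tracks the method ran on
    per_thumbnail = 100 * Fraction(hits, produced)   # success when it did run
    per_track = 100 * Fraction(hits, total)          # success over all tracks
    return float(coverage), float(per_thumbnail), float(per_track)

for name, (produced, hits) in results.items():
    coverage, per_thumbnail, per_track = rates(produced, hits)
    print(f"{name}: coverage {coverage:.0f}%, "
          f"hits per thumbnail {per_thumbnail:.0f}%, "
          f"hits per track {per_track:.0f}%")
```

The second method looks quite different depending on the denominator: 21 of 30 thumbnails is 70%, but 21 of 50 tracks is 42%, so the report should state which one is being quoted.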

Obviously I know which method is superior, but I need to describe and explain the results in a report which requires the demonstration of proper statistical testing.

Other academic papers have previously used an f-test to do this, but because their methods are vastly superior, their aims usually involve detecting chorus onset times with 100% accuracy.

My aim is more relaxed as I am just looking for the generated thumbnails to contain any part of the chorus, regardless of onset.
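As a rough illustration of this relaxed criterion, the check below counts a thumbnail as a success if it overlaps any annotated chorus section at all. It is only a sketch: the function name `contains_chorus`, the (start, end) representation in seconds, and the example times are assumptions, not my actual MATLAB code or data.

```python
THUMBNAIL_LENGTH = 30.0  # seconds, as described above

def contains_chorus(thumb_start, chorus_sections, length=THUMBNAIL_LENGTH):
    """True if [thumb_start, thumb_start + length) overlaps any chorus section.

    chorus_sections is a list of (start, end) times in seconds.
    """
    thumb_end = thumb_start + length
    # Two intervals overlap when each one starts before the other ends.
    return any(start < thumb_end and end > thumb_start
               for start, end in chorus_sections)

# Hypothetical annotations for one track (not real data).
chorus_sections = [(45.0, 70.0), (120.0, 145.0)]
print(contains_chorus(30.0, chorus_sections))  # thumbnail 30-60 s overlaps 45-70 s
print(contains_chorus(75.0, chorus_sections))  # thumbnail 75-105 s misses both
```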

Can anyone suggest some objective tests that I could explore for my project? This is my first time conducting an investigation like this, so my experience and knowledge are very limited.

Thank you!

A: 

One possible approach is to annotate each song track with time cuts (markers) that record the type of sound (chorus, etc.). In a sound editor like CoolEdit, you can set time cuts and give them names like 'chorus', 'pause', 'music'... Then you need to extract the cut information and import it into MATLAB. On 32-bit Windows you can use the Wav2labs utility from http://www.pallier.org/ressources/wspot/sig2wav/toolswav.html; http://www.pallier.org/ressources/wspot/sig2wav/Wav2labs.exe This program extracts the cuts to a text file, which you can read with MATLAB's textscan function.

After that, you only need to measure segmentation accuracy, for example the percentage of time for which the signal type (chorus / not chorus) was recognized correctly.

Otherwise, please state your question more precisely.

Singlet
I'm not sure what you are suggesting. I have an Excel file with the timecodes of all the chorus sections for each of the 50 tracks. What do you mean by percent time?
Mark Spivey
If your algorithm decides between chorus and not chorus, why not just count, in time or in frames, how often the algorithm recognized the sound type correctly? If the algorithm assigns the correct type to 9 frames out of 10, the answer is 90% accuracy.
Singlet
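The frame-counting idea from the last comment can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `frame_accuracy`, the boolean frame labels, and the ten example frames are made up, and the same logic should carry over to MATLAB directly.

```python
def frame_accuracy(predicted, truth):
    """Fraction of frames whose predicted label matches the ground truth."""
    if len(predicted) != len(truth):
        raise ValueError("label sequences must have the same length")
    if not truth:
        raise ValueError("need at least one frame")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)

# Hypothetical labels for 10 frames: True = chorus, False = not chorus.
truth     = [False, False, True, True, True, True, False, False, False, False]
predicted = [False, False, True, True, True, False, False, False, False, False]
print(f"{frame_accuracy(predicted, truth):.0%}")  # 9 of 10 frames match
```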