Hi all,

I'm playing around with some onset/beat detection algorithms on my own. My input is a .wav file and my output is a .wav file; I have access to the entire waveform in chunks of float[] arrays.

I'm having trouble coming up with a good way to debug and evaluate my algorithms. Since my input and output are both auditory, I thought it would make the most sense if my debugging facility were also auditory, e.g. by adding audible "ticks" or "beeps" to the .wav file at onset points.

Does anyone have any ideas on how to do this? Ideally, it would be a simple for-loop that I'd run a couple hundred or couple thousand samples through.

+1  A: 

Poor man's answer: find a recording of a tick or beep, then mix it with the original waveform at each desired moment. You mix by simply averaging the values of the beep and the input waveform for the duration of the beep.
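A minimal sketch of the averaging mix described above, assuming both the waveform and the tick recording are already decoded to float samples at the same sample rate. The function name `mix_tick_by_averaging` and its parameters are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Mix a short "tick" recording into `signal`, starting at sample index `onset`.
// Each affected sample becomes the average of the signal and the tick.
// Samples outside the tick's duration are left untouched, and a tick that
// would run past the end of the signal is truncated.
void mix_tick_by_averaging(std::vector<float>& signal,
                           const std::vector<float>& tick,
                           std::size_t onset)
{
    if (onset >= signal.size()) return;
    const std::size_t count = std::min(tick.size(), signal.size() - onset);
    for (std::size_t i = 0; i < count; ++i) {
        signal[onset + i] = 0.5f * (signal[onset + i] + tick[i]);
    }
}
```

Averaging keeps the result in range if both inputs are in range, but it also halves the original signal for the duration of the tick, so expect a brief dip in level at each marker.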

Jonathan Feinberg
A: 

Figure out where in your sample you want to insert your tick (include the length of the tick, so this is a range, not a point). Take the FFT of that section of the waveform. Add to the frequency domain representation whatever frequency components you desire for your "tick" sound (simplest would be just a single frequency tone). Perform the inverse FFT on the result and voila, you have your tone mixed into the original signal. I think (it's been a while since I've done this).
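A rough sketch of that idea, using a naive O(N^2) DFT in place of a real FFT library so that it stays self-contained. The names `dft` and `add_tone_via_dft` are hypothetical. It assumes the tone sits exactly on a frequency bin, where bin k corresponds to k * sample_rate / N Hz for a section of N samples.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Spectrum = std::vector<std::complex<double>>;

// Naive discrete Fourier transform; `inverse` selects the inverse transform.
Spectrum dft(const Spectrum& in, bool inverse)
{
    const double pi = std::acos(-1.0);
    const std::size_t n = in.size();
    const double sign = inverse ? 1.0 : -1.0;
    Spectrum out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = sign * 2.0 * pi * double(k) * double(t) / double(n);
            sum += in[t] * std::polar(1.0, angle);
        }
        out[k] = inverse ? sum / double(n) : sum;
    }
    return out;
}

// Adds a cosine of the given amplitude at frequency bin `bin` to `section`.
// Only bins strictly between 0 and N/2 are accepted; anything else is ignored.
void add_tone_via_dft(std::vector<float>& section, std::size_t bin, double amplitude)
{
    const std::size_t n = section.size();
    if (bin == 0 || 2 * bin >= n) return;
    Spectrum spectrum(section.begin(), section.end());
    spectrum = dft(spectrum, false);
    // A real cosine of amplitude A contributes A*N/2 to bin k and to its mirror N-k.
    spectrum[bin] += amplitude * double(n) / 2.0;
    spectrum[n - bin] += amplitude * double(n) / 2.0;
    const Spectrum mixed = dft(spectrum, true);
    for (std::size_t t = 0; t < n; ++t) {
        section[t] = float(mixed[t].real());
    }
}
```

Because the transform is linear, the result equals the original section plus a cosine added directly in the time domain, so this route mainly pays off if you are already working in the frequency domain.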

rmeador
+1  A: 
float * sample = first sample where beep is to be mixed in
float const beep_duration = desired beep duration in seconds
float const sample_rate = sampling rate in samples per second
float const frequency = desired beep frequency, Hz
float const PI = 3.1415926..
float const volume = desired beep volume
for( int index = 0; index < (int)(beep_duration * sample_rate); index++ )
{
   // phase advances by 2*PI*frequency/sample_rate radians per sample
   sample[index] += 
      sin( float(index) * 2.f * PI * frequency / sample_rate ) * volume;
}
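The same loop could be wrapped into a helper that takes a list of onset positions and stops at the end of the buffer. This is only a sketch: `add_beeps`, its parameters, and the `centered` flag are made up for illustration. Note that the phase step per sample is 2*pi*frequency/sample_rate.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Adds a sine beep to `samples` at each onset (given as a sample index).
// If `centered` is true the beep's mid-point sits on the onset;
// otherwise the beep starts at the onset.
void add_beeps(std::vector<float>& samples,
               const std::vector<std::size_t>& onsets,
               float sample_rate, float frequency,
               float beep_duration, float volume, bool centered)
{
    const float two_pi = 2.f * std::acos(-1.f);
    const std::size_t beep_len =
        static_cast<std::size_t>(beep_duration * sample_rate);
    for (std::size_t onset : onsets) {
        const std::size_t shift = centered ? beep_len / 2 : 0;
        // Clamp to the start of the buffer if the beep would begin before it.
        const std::size_t start = onset > shift ? onset - shift : 0;
        for (std::size_t i = 0; i < beep_len && start + i < samples.size(); ++i) {
            samples[start + i] +=
                std::sin(float(i) * two_pi * frequency / sample_rate) * volume;
        }
    }
}
```

Since the beep is added rather than averaged, loud passages can exceed the [-1, 1] range, so a modest `volume` is probably the safer choice.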
moonshadow
Nice. 15 chars. 15 chars.
Jonathan Feinberg
Thanks, this was nice and simple and worked perfectly. Two follow-up questions, if you don't mind. 1) The 2.f notation: I have never seen this before. 2.0f wouldn't have surprised me, but 2.f is new to me. Is this part of the C++ syntax or C syntax? 2) In your opinion, which is "more correct": inserting the beep starting at the beginning of the beat, or inserting the beep such that its mid-point aligns with the beat?
psa
(1) A floating-point literal is an integer part and a period followed by an optional fractional part (http://msdn.microsoft.com/en-us/library/tfh6f0w2(VS.71).aspx), so 2.f means the same as 2.0f, and it is valid in both C and C++. 2. would also be valid syntax but would yield a double. (2) Since the purpose of this is debugging, whichever makes it easier for you to tell by ear when your algorithm is broken is the "more correct" ;)
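If you want the compiler to confirm point (1), something along these lines should do, assuming a C++11 or later compiler for `static_assert` and `decltype`:

```cpp
#include <type_traits>

// The suffix decides the type, not whether digits follow the period.
static_assert(std::is_same<decltype(2.f), float>::value, "2.f is a float");
static_assert(std::is_same<decltype(2.0f), float>::value, "2.0f is a float");
static_assert(std::is_same<decltype(2.), double>::value, "2. is a double");
static_assert(std::is_same<decltype(2.0), double>::value, "2.0 is a double");
static_assert(2.f == 2.0f, "both spellings denote the same value");
```

These checks run at compile time, so the snippet produces no output; it simply fails to build if any of the claims is wrong.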
moonshadow