Hi :D

I'm writing a sound editor for my graduation project. I'm using BASS to extract samples from MP3, WAV, OGG and similar files and to add DSP effects such as echo and flanger. Simply put, I built a framework that applies an effect from position1 to position2 and handles cut/paste.

Now my problem is that I want to create a control similar to the one in Cool Edit Pro that draws a waveform representation of the song and lets you zoom in/out, select portions of the waveform, and so on. After a selection I can do something like:

TInterval EditZone = WaveForm->GetSelection();

where TInterval has this form:

struct TInterval
{
    long Start;
    long End;
};

I'm a beginner when it comes to sophisticated drawing, so any hint on how to create a waveform representation of a song from the sample data returned by BASS, with the ability to zoom in/out, would be appreciated.

I'm writing my project in C++, but I can understand C# and Delphi code, so if you want you can post snippets in the last two languages as well :)

Thanx DrOptix

A: 

Wouldn't you just plot the sample points on a 2D canvas? You should know how many samples there are per second for a file (read it from the header), and then plot the value on the y axis. Since you want to be able to zoom in and out, you need to control the number of samples per pixel (the zoom level). Next you take the average of those sample points per pixel (for example, take the average of every 5 points if you have 5 samples per pixel). Then you can use a 2D drawing API to draw lines between the points.

Marius
OK, I got the part about "you need to control the number of samples per pixel (the zoom level)". But what do you suggest for drawing when I work with floating-point samples (1.40129846432482E-045, 0, 9224.40234375, 1.7402837732327E-039, 1.74042390307914E-039)? Is rounding down/up a solution, or do I really have to be exact? I mean, can 1.40129846432482E-045 be seen by the drawing function as 1 without a problem? And thanks for the quick reply.
Dr.Optix
Some drawing APIs support floating-point coordinates. If yours does not, you might have to add the support yourself. For example, if the y-value is 1, you draw a black dot (on a white background) at 1. If the value is 1.5, you draw a grey dot (50% hue) at 1 and at 2.
Marius
@Marius. It's not quite as easy as doing 50% hue. You aren't taking into account the fact that colour space is not linear. Half brightness is actually about 21% ... TBH you are best off just rounding to the nearest integer and treating it that way while accepting the aliasing. It's unlikely to be that bad anyway.
Goz
+4  A: 

By zoom, I presume you mean horizontal zoom rather than vertical. The way audio editors do this is to scan the waveform, breaking it up into time windows where each pixel in X represents some number of samples. It can be a fractional number, but you can get away with disallowing fractional zoom ratios without annoying the user too much. Once you zoom out a bit the max value is always a positive integer and the min value is always a negative integer.

For each pixel on the screen, you need to know the minimum sample value and the maximum sample value for that pixel. So you need a function that scans the waveform data in chunks and keeps track of the accumulated max and min for each chunk.
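The chunked min/max scan described above might look roughly like the following. This is only a sketch: it assumes the samples are already decoded into memory as mono floats, and `MinMax` and `BuildColumns` are made-up names, not part of BASS.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One min/max pair per pixel column (hypothetical type, not a BASS type).
struct MinMax {
    float min;
    float max;
};

// Reduce every block of `samplesPerPixel` samples to a single min/max pair.
// Assumes `samples` holds mono data; the last block may be shorter.
std::vector<MinMax> BuildColumns(const std::vector<float>& samples,
                                 std::size_t samplesPerPixel)
{
    std::vector<MinMax> columns;
    if (samplesPerPixel == 0) return columns;  // no valid zoom level

    for (std::size_t start = 0; start < samples.size(); start += samplesPerPixel) {
        const std::size_t end = std::min(start + samplesPerPixel, samples.size());
        MinMax mm{samples[start], samples[start]};
        for (std::size_t i = start + 1; i < end; ++i) {
            mm.min = std::min(mm.min, samples[i]);
            mm.max = std::max(mm.max, samples[i]);
        }
        columns.push_back(mm);
    }
    return columns;
}
```

Calling `BuildColumns(samples, 512)` would give one entry per pixel at a zoom of 512 samples per pixel. For multi-channel files the same loop can track one pair per channel in the same pass.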

This is a slow process, so professional audio editors keep a pre-calculated table of min and max values at some fixed zoom ratio. It might be at 512/1 or 1024/1. When you are drawing with a zoom ratio of > 1024 samples/pixel, you use the pre-calculated table. If you are below that ratio, you get the data directly from the file. If you don't do this, you will find that your drawing code gets too slow when you zoom out.
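As an illustration of drawing from the pre-calculated table, the sketch below merges table entries into the min/max for one pixel column. The names `Peak`, `kTableRatio`, `UseTable` and `ColumnFromTable` are hypothetical, the 1024 ratio is just one of the values mentioned above, and the code assumes the zoom is a whole multiple of the table ratio (no fractional zoom). It also treats a zoom of exactly 1024 samples/pixel as table-backed, which is a choice of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Peak {
    float min;
    float max;
};

// Number of samples summarized by each table entry (assumed value).
constexpr std::size_t kTableRatio = 1024;

// True when the zoom is coarse enough to draw from the table
// instead of re-reading sample data from the file.
bool UseTable(std::size_t samplesPerPixel)
{
    return samplesPerPixel >= kTableRatio;
}

// Min/max for pixel column `x`, built by merging table entries.
// Assumes samplesPerPixel is a multiple of kTableRatio.
Peak ColumnFromTable(const std::vector<Peak>& table,
                     std::size_t x, std::size_t samplesPerPixel)
{
    const std::size_t perPixel = samplesPerPixel / kTableRatio;
    const std::size_t first = x * perPixel;
    const std::size_t last = std::min(first + perPixel, table.size());

    Peak p{0.0f, 0.0f};            // silence for columns past the end of the data
    if (first >= last) return p;

    p = table[first];
    for (std::size_t i = first + 1; i < last; ++i) {
        p.min = std::min(p.min, table[i].min);
        p.max = std::max(p.max, table[i].max);
    }
    return p;
}
```

Because min and max combine exactly, merging table entries gives the same result as rescanning the underlying samples, just with far less data to touch.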

It's worthwhile to write code that handles all of the channels of the file in a single pass when doing this scanning; slowness here will make your whole program feel sluggish. It's the disk I/O that matters here. The CPU has no trouble keeping up, so straightforward C++ code is fine for building the min/max tables, but you don't want to go through the file more than once, and you want to do it sequentially.

Once you have the min/max tables, keep them around. You want to go back to the disk as little as possible, and many of the reasons for wanting to repaint your window will not require you to rescan your min/max tables. The memory cost of holding on to them is not that high compared to the disk I/O cost of building them in the first place.

Then you draw the waveform by drawing a series of 1-pixel-wide vertical lines between the max value and the min value for the time represented by that pixel. This should be quite fast if you are drawing from prebuilt min/max tables.
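To make the drawing step concrete, here is a sketch of how one column's min/max could be turned into the two endpoints of its vertical line. It assumes samples are normalized floats in [-1, 1] (scale accordingly for other formats), rounds to the nearest pixel row as suggested in the comments above, and leaves the actual line drawing to whatever 2D API you use. `Span`, `SampleToY` and `ColumnSpan` are invented names.

```cpp
#include <algorithm>
#include <cmath>

// Vertical extent of one column's line, in pixel rows (0 is the top row).
struct Span {
    int top;
    int bottom;
};

// Map a sample value to a pixel row: +1 lands on row 0, -1 on row height-1.
// Assumes values are normalized to [-1, 1]; out-of-range values are clamped.
int SampleToY(float value, int height)
{
    const float clamped = std::max(-1.0f, std::min(1.0f, value));
    return static_cast<int>(std::lround((1.0f - clamped) * 0.5f * (height - 1)));
}

// Endpoints of the 1-pixel-wide line for a column with the given min/max.
Span ColumnSpan(float minValue, float maxValue, int height)
{
    return Span{SampleToY(maxValue, height), SampleToY(minValue, height)};
}
```

For each pixel column x you would then draw a line from (x, top) to (x, bottom) with your drawing API of choice.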

John Knoeller
A pretty complete answer. Just one minor addition: the waveform-drawing programs I know even cache the precalculated min/max tables persistently in a file.
h0b0
Yep it is horizontal zoom. Also thanx for the answer I'll try to implement a control using this idea.
Dr.Optix
@h0b0: Yes, peak files. They can make re-opening a file cheaper the second time, but they don't help at all the first time you open a file, and users don't like the way they litter up the disk. It's a judgement call whether they make sense for your application.
John Knoeller
+1  A: 

I've recently done this myself. As Marius suggests you need to work out how many samples are at each column of pixels. You then work out the minimum and maximum and then plot a vertical line from the maximum to the minimum.

As a first pass this seemingly works fine. The problem you'll get is that as you zoom out it will start to take too long to retrieve the samples from disk. As a solution to this I built a "peak" file alongside the audio file. The peak file stores the minimum/maximum pairs for groups of n samples. Playing with n until you get the right amount is up to you. Personally I found 128 samples to be a good tradeoff between size and speed. It's also worth remembering that, unless you are drawing a control larger than 65536 pixels in size, you needn't store this peak information as anything more than 16-bit values, which saves a bit of space.

Goz