I have several GB of sample data captured in the field at 48 ksps using an NI data acquisition module. I would like to create a WAV file from this data.

I have done this previously using MATLAB to load the data, normalise it to the 16-bit PCM range, and then write it out as a WAV file. However, MATLAB baulks at the file size as it does everything in memory.

Ideally I would do this in C++ or C (C# is an option), or use an existing utility if there is one. Is there a simple way (i.e. an existing library) to take a raw PCM buffer, specify the sample rate and bit depth, and package it into a WAV file?

To handle the large data set, it would need to be able to append data in chunks as it would not necessarily be possible to read the whole set into memory.

I understand that I could do this from scratch using the format specification, but I do not want to re-invent the wheel, or spend time fixing bugs on this if I can help it.

+1  A: 

I think you can use libsox for this.

hlovdal
Looks like what I need. I am hoping that I can build it without having to pollute my PC with Cygwin. I'd rather use a Linux VM than that!
Clifford
The pre-compiled binaries rely on Cygwin; do you actually need a C library, or is it enough to call `sox` from the command line?
Christoph
At the moment, invoking sox directly is my favoured option.
Clifford
Thanks, I used sox.exe to achieve what I needed in the end.
Clifford
A: 

C# would be a good choice for this. FileStreams are easy to work with, and could be used for reading and writing the data in chunks. Also, reading WAV file headers is a relatively complicated task (you have to search for RIFF chunks and so on), but writing them is a piece of cake (you just fill out a header structure and write it at the beginning of the file).

There are a number of libraries that do conversions like this, but I'm not sure they can handle the huge data sizes you're talking about. Even if they do, you would probably still have to do some programming work to feed smaller chunks of raw data to these libraries.

For writing your own method, normalization isn't difficult, and even resampling from 48ksps to 44.1ksps is relatively simple (assuming you don't mind linear interpolation). You would also presumably have greater control over the output, so it would be easier to create a set of smaller WAV files, instead of one gigantic one.
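As a rough illustration of the chunked approach, here is a minimal sketch in C (which the question lists as an option). It assumes the raw input is a flat file of native-endian `double` samples, which the question does not actually state, and the names `find_peak`, `convert_chunks` and `CHUNK` are made up for this example. Pass 1 scans the file for the peak magnitude; pass 2 scales by 32767/peak and writes little-endian 16-bit samples. It only scales, it never subtracts a mean, so any DC offset is kept in proportion.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CHUNK 4096  /* samples per read; an arbitrary choice */

/* Pass 1: read the input chunk by chunk and return the largest |sample|. */
static double find_peak(FILE *in)
{
    double buf[CHUNK], peak = 0.0;
    size_t n, i;

    while ((n = fread(buf, sizeof buf[0], CHUNK, in)) > 0) {
        for (i = 0; i < n; i++) {
            double a = buf[i] < 0.0 ? -buf[i] : buf[i];
            if (a > peak)
                peak = a;
        }
    }
    return peak;
}

/* Pass 2: scale each sample so that `peak` maps to 32767 and append it to
 * `out` as little-endian 16-bit PCM. Returns samples written, or -1. */
static long convert_chunks(FILE *in, FILE *out, double peak)
{
    double buf[CHUNK];
    unsigned char pcm[2 * CHUNK];
    double scale;
    long total = 0;
    size_t n, i;

    assert(peak >= 0.0);
    scale = peak > 0.0 ? 32767.0 / peak : 0.0;

    while ((n = fread(buf, sizeof buf[0], CHUNK, in)) > 0) {
        for (i = 0; i < n; i++) {
            double v = buf[i] * scale;
            int r = (int)(v >= 0.0 ? v + 0.5 : v - 0.5);  /* round to nearest */
            uint16_t u;

            if (r > 32767)  r = 32767;    /* clamp in case peak was too small */
            if (r < -32768) r = -32768;
            u = (uint16_t)r;              /* two's-complement bit pattern */
            pcm[2 * i]     = (unsigned char)(u & 0xff);
            pcm[2 * i + 1] = (unsigned char)(u >> 8);
        }
        if (fwrite(pcm, 2, n, out) != n)
            return -1;
        total += (long)n;
    }
    return total;
}
```

Memory use is fixed by `CHUNK` regardless of input size; the price is reading the input twice, with a `rewind` between the passes.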

MusiGenesis
+1  A: 

I came across a function called WAVAPPEND on Mathworks' File Exchange site a while ago. I never got around to using it, so I'm not sure if it works or is appropriate for what you're trying to do, but perhaps it'll be useful to you.

mtrw
Thanks, that is going to be useful in future I think.
Clifford
A: 

The current Windows SDK audio capture samples capture data from the microphone and save the captured data to a .WAV file. The code is far from optimal but it should work.

Note that RIFF files (.WAV files are RIFF files) are limited to 4 GB in size, because chunk sizes are stored in 32-bit fields.
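To put a number on that limit, here is a small sketch of the arithmetic in C. It assumes the canonical 44-byte PCM header, where the RIFF size field covers 36 header bytes plus the sample data; the function name `max_wav_frames` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Largest number of sample frames that fits in one RIFF/WAV file, assuming
 * a canonical 44-byte PCM header. The RIFF size field is an unsigned 32-bit
 * value that counts 36 header bytes plus the data bytes. */
static uint64_t max_wav_frames(unsigned channels, unsigned bits)
{
    const uint64_t max_riff_size = 0xFFFFFFFFu;
    const uint64_t header_bytes_counted = 36;
    uint64_t frame_bytes;

    assert(channels > 0 && bits >= 8 && bits % 8 == 0);
    frame_bytes = (uint64_t)channels * (bits / 8);
    return (max_riff_size - header_bytes_counted) / frame_bytes;
}
```

For 16-bit mono this gives 2147483629 frames, which at 48 ksps is about 44739 seconds, or roughly 12.4 hours of signal per file.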

Larry Osterman
These are baseband signals from an RF receiver sampled with an analogue data acquisition module. DC offset needs to be preserved, which cannot be done with a microphone input or other AC-coupled audio input. The data already exists as floating-point voltage measurements. The question was about packaging the existing data into a form that could be replayed into an RF modulator for repeatable testing of a baseband decoder. The final files are far smaller than the original because the original data is double-precision floats, whereas the converted data is 16-bit PCM.
Clifford
I was just pointing out that the samples contained code that built a WAV file from raw PCM data, not suggesting that you capture from the device.
Larry Osterman
A: 

Interesting: I seem to have found a bug in how Stack Overflow parses code. It doesn't support the \ character at the end of a line, as you can see in the macros below. Sad.

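// Taken from the Ogg Vorbis PCM-to-WAV conversion routines, sorry
#include <errno.h>   /* errno */
#include <stdio.h>   /* FILE, fwrite, fseek, printf */
#include <string.h>  /* memcpy, strerror */

#define VERSIONSTRING "OggDec 1.0\n"

/* Only `bits` is used in this excerpt; the other settings are left over
   from the original tool. */
static int quiet = 0;
static int bits = 16;
static int endian = 0;
static int raw = 0;
static int sign = 1;
unsigned char headbuf[44];  /* The whole 44-byte WAV header */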

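/* Store x at buf in little-endian byte order, as WAV requires.
   Wrapped in do { } while (0) so each macro acts as one statement. */
#define WRITE_U32(buf, x) do { \
    *(buf)       = (unsigned char)((x) & 0xff); \
    *((buf) + 1) = (unsigned char)(((x) >> 8) & 0xff); \
    *((buf) + 2) = (unsigned char)(((x) >> 16) & 0xff); \
    *((buf) + 3) = (unsigned char)(((x) >> 24) & 0xff); \
  } while (0)

#define WRITE_U16(buf, x) do { \
    *(buf)       = (unsigned char)((x) & 0xff); \
    *((buf) + 1) = (unsigned char)(((x) >> 8) & 0xff); \
  } while (0)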

/*
 * Some of this based on ao/src/ao_wav.c
 */
static int
write_prelim_header (FILE * out, int channels, int samplerate)
{

  int knownlength = 0;             /* total size in bytes if known up front, else 0 */

  unsigned int size = 0x7fffffff;  /* placeholder until rewrite_header() patches it */
  /* Pass samplerate = 48000 for the data in the question. */
  int bytespersec = channels * samplerate * bits / 8;
  int align = channels * bits / 8;
  int samplesize = bits;

  if (knownlength)
    size = (unsigned int) knownlength;

  memcpy (headbuf, "RIFF", 4);
  WRITE_U32 (headbuf + 4, size - 8);
  memcpy (headbuf + 8, "WAVE", 4);
  memcpy (headbuf + 12, "fmt ", 4);
  WRITE_U32 (headbuf + 16, 16);
  WRITE_U16 (headbuf + 20, 1);  /* format */
  WRITE_U16 (headbuf + 22, channels);
  WRITE_U32 (headbuf + 24, samplerate);
  WRITE_U32 (headbuf + 28, bytespersec);
  WRITE_U16 (headbuf + 32, align);
  WRITE_U16 (headbuf + 34, samplesize);
  memcpy (headbuf + 36, "data", 4);
  WRITE_U32 (headbuf + 40, size - 44);

  if (fwrite (headbuf, 1, 44, out) != 44)
    {
      printf ("ERROR: Failed to write wav header: %s\n", strerror (errno));
      return 1;
    }

  return 0;
}

static int
rewrite_header (FILE * out, unsigned int written)
{
  unsigned int length = written;

  length += 44;

  WRITE_U32 (headbuf + 4, length - 8);
  WRITE_U32 (headbuf + 40, length - 44);
  if (fseek (out, 0, SEEK_SET) != 0)
    {
      printf ("ERROR: Failed to seek on seekable file: %s\n",
          strerror (errno));
      return 1;
    }

  if (fwrite (headbuf, 1, 44, out) != 44)
    {
      printf ("ERROR: Failed to write wav header: %s\n", strerror (errno));
      return 1;
    }
  return 0;
}
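
The two functions above are meant to bracket the sample data: write a header with a placeholder length, append the PCM in as many chunks as you like, then seek back and patch in the real length. A self-contained sketch of that flow follows. The names `put_le`, `build_header`, `wav_begin` and `wav_end` are made up for this example and are not part of the oggdec code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Store the low n bytes of v at p in little-endian order. */
static void put_le(unsigned char *p, uint32_t v, int n)
{
    int i;
    for (i = 0; i < n; i++)
        p[i] = (unsigned char)((v >> (8 * i)) & 0xff);
}

/* Fill h with a canonical 44-byte PCM WAV header for data_bytes of samples. */
static void build_header(unsigned char h[44], unsigned channels,
                         uint32_t rate, unsigned bits, uint32_t data_bytes)
{
    uint32_t align;

    assert(channels > 0 && bits >= 8 && bits % 8 == 0);
    align = channels * bits / 8;          /* bytes per sample frame */
    memcpy(h, "RIFF", 4);
    put_le(h + 4, 36 + data_bytes, 4);    /* file size minus 8 */
    memcpy(h + 8, "WAVEfmt ", 8);
    put_le(h + 16, 16, 4);                /* size of the fmt chunk body */
    put_le(h + 20, 1, 2);                 /* format tag 1 = PCM */
    put_le(h + 22, channels, 2);
    put_le(h + 24, rate, 4);
    put_le(h + 28, rate * align, 4);      /* bytes per second */
    put_le(h + 32, align, 2);
    put_le(h + 34, bits, 2);
    memcpy(h + 36, "data", 4);
    put_le(h + 40, data_bytes, 4);
}

/* Write a header with zero length; sample data is appended after it. */
static int wav_begin(FILE *out, unsigned channels, uint32_t rate, unsigned bits)
{
    unsigned char h[44];
    build_header(h, channels, rate, bits, 0);
    return fwrite(h, 1, 44, out) == 44 ? 0 : -1;
}

/* Seek back to the start and rewrite the header with the real data length. */
static int wav_end(FILE *out, unsigned channels, uint32_t rate, unsigned bits,
                   uint32_t data_bytes)
{
    unsigned char h[44];
    build_header(h, channels, rate, bits, data_bytes);
    if (fseek(out, 0, SEEK_SET) != 0)
        return -1;
    return fwrite(h, 1, 44, out) == 44 ? 0 : -1;
}
```

Call `wav_begin` once, `fwrite` each converted chunk while keeping a running byte count, then call `wav_end` with that count. The output file has to be seekable for the final patch to work.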
Arabcoder
Finally I can add comments, thanks. I will fix the bug in the code; I have lots of C code to post here.
Arabcoder
Bug fixed. Note that in a WAV file written this way the sample data starts at offset 44, but files produced by some Microsoft tools have a longer header, so the data may start at offset 60 instead. A reader should therefore search the file for the "data" chunk ID to find where the samples actually begin.
Arabcoder
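For anyone reading such files back, here is a minimal sketch in C of locating the "data" chunk instead of assuming offset 44. The function name `find_data_chunk` is hypothetical, and the sketch assumes a well-formed RIFF file whose chunk lengths fit in a `long`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode a little-endian 32-bit value. */
static uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk the RIFF chunks from the start of the file until the "data" chunk.
 * Returns the file offset of the first sample byte and stores the chunk
 * length in *size, or returns -1 if the file is not a WAV file or has no
 * "data" chunk. */
static long find_data_chunk(FILE *in, uint32_t *size)
{
    unsigned char hdr[12], ck[8];

    assert(in != NULL && size != NULL);
    if (fseek(in, 0, SEEK_SET) != 0 || fread(hdr, 1, 12, in) != 12)
        return -1;
    if (memcmp(hdr, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0)
        return -1;

    while (fread(ck, 1, 8, in) == 8) {
        uint32_t len = get_le32(ck + 4);

        if (memcmp(ck, "data", 4) == 0) {
            *size = len;
            return ftell(in);  /* positioned just past the chunk header */
        }
        /* Skip this chunk's body; RIFF pads odd lengths to an even size. */
        if (fseek(in, (long)(len + (len & 1u)), SEEK_CUR) != 0)
            return -1;
    }
    return -1;
}
```

For a canonical header this returns 44; for a file with an 18-byte fmt chunk followed by a fact chunk it returns a larger offset, which is why a hard-coded 44 is not safe.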