views:

8751

answers:

5

I have a need for a fixed-size (selectable at run-time when creating it, not compile-time) circular buffer which can hold objects of any type and it needs to be very high performance. I don't think there will be resource contention issues since, although it's in a multi-tasking embedded environment, it's a co-operative one so the tasks themselves can manage that.

My initial thought were to store a simple struct in the buffer which would contain the type (simple enum/define) and a void pointer to the payload but I want this to be as fast as possible so I'm open to suggestions that involve bypassing the heap.

Actually I'm happy to bypass any of the standard library for raw speed - from what I've seen of the code, it's not heavily optimized for the CPU : it looks like they just compiled C code for things like strcpy() and such, there's no hand-coded assembly.

Any code or ideas would be greatly appreciated. The operations required are:

  • create a buffer with specific size.
  • put at the tail.
  • get from the head.
  • return the count.
  • delete a buffer.
+1  A: 

The simplest solution would be to keep track of the item size and the number of items, and then create a buffer of the appropriate number of bytes:

typedef struct circular_buffer
{
    void *buffer;     // data buffer
    void *buffer_end; // end of data buffer
    size_t capacity;  // maximum number of items in the buffer
    size_t count;     // number of items in the buffer
    size_t sz;        // size of each item in the buffer
    void *head;       // pointer to head
    void *tail;       // pointer to tail
} circular_buffer;

void cb_init(circular_buffer *cb, size_t capacity, size_t sz)
{
    cb->buffer = malloc(capacity * sz);
    if(cb->buffer == NULL)
        // handle error
    cb->buffer_end = (char *)cb->buffer + capacity * sz;
    cb->capacity = capacity;
    cb->count = 0;
    cb->sz = sz;
    cb->head = cb->buffer;
    cb->tail = cb->buffer;
}

void cb_free(circular_buffer *cb)
{
    free(cb->buffer);
    // clear out other fields too, just to be safe
}

void cb_push_back(circular_buffer *cb, const void *item)
{
    if(cb->count == cb->capacity)
        // handle error
    memcpy(cb->head, item, cb->sz);
    cb->head = (char*)cb->head + cb->sz;
    if(cb->head == cb->buffer_end)
        cb->head = cb->buffer;
    cb->count++;
}

void cb_pop_front(circular_buffer *cb, void *item)
{
    if(cb->count == 0)
        // handle error
    memcpy(item, cb->tail, cb->sz);
    cb->tail = (char*)cb->tail + cb->sz;
    if(cb->tail == cb->buffer_end)
        cb->tail = cb->buffer;
    cb->count--;
}
Adam Rosenfield
Very standard solution - exactly to the spec. that the OP included as what he was trying to avoid. :P
Tony k
+2  A: 

Can you enumerate the types needed at the time you code up the buffer, or do you need to be able to add types at run time via dynamic calls? If the former, then I would create the buffer as a heap-allocated array of n structs, where each struct consists of two elements: an enum tag identifying the data type, and a union of all the data types. What you lose in terms of extra storage for small elements, you make up in terms of not having to deal with allocation/deallocation and the resulting memory fragmentation. Then you just need to keep track of the start and end indices that define the head and tail elements of the buffer, and make sure to compute mod n when incrementing/decrementing the indices.

dewtell
The types needed are enumerated, yes. There's only about six of them and that will never change. But the queues that hold the items must be able to hold all six types. I'm not sure about the storing of a whole item in the queue rather than a pointer - that means copying items (several hundred bytes) rather than a pointer. But I like the idea of a union - my primary concern here is speed rather than memory (we have enough of the latter, but the CPU is a shocker :-).
paxdiablo
Your answer's given me a good idea tho' - the memory for the items could be pre-allocated from malloc() and handed out from a mymalloc() purpose-built to handle just those memory blocks. And I could still just use pointers. +1 for that.
paxdiablo
You might or might not need to do extra copying, depending on the access pattern to the data. If you could build the items in place, and reference them while they are still in the buffer before popping, there might not be any extra copying. But it's certainly safer and more flexible to hand them out from your own allocator, and use a separate array of pointers (or indices) as the buffer.
dewtell
A: 

A simple implementation could consist of:

  • A buffer, implemented as an array of size n, of whatever type you need
  • A read pointer or index (whichever is more efficient for your processor)
  • A write pointer or index
  • A counter indicating how much data is in the buffer (derivable from the read and write pointers, but faster to track it separately)

Every time you write data, you advance the write pointer and increment the counter. When you read data, you increase the read pointer and decrement the counter. If either pointer reaches n, set it to zero.

You can't write if counter = n. You can't read if counter = 0.

Steve Melnikoff
+1  A: 
// Note power of two buffer size
#define kNumPointsInMyBuffer 1024 

typedef struct _ringBuffer {
    UInt32 currentIndex;
    UInt32 sizeOfBuffer;
    double data[kNumPointsInMyBuffer];
} ringBuffer;

// Initialize the ring buffer
ringBuffer *myRingBuffer = (ringBuffer *)calloc(1, sizeof(ringBuffer));
ringBuffer->sizeOfBuffer = kNumPointsInMyBuffer;
ringBuffer->currentIndex = 0;

// A little function to write into the buffer
void writeIntoBuffer(ringBuffer *myRingBuffer, double *myData, int numsamples) {
    // -1 for our binary modulo in a moment
    int buffLen = ringBuffer->sizeOfBuffer - 1; 
    int lastWrittenSample = ringBuffer->currentIndex;

    int idx;
    for (int i=0; i < numsamples; ++i) {
        // modulo will automagically wrap around our index
        idx = (i + lastWrittenSample) & buffLen; 
        ringBuffer->data[idx] = myData[i];
    }

    // Update the current index of our ring buffer.
    ringBuffer->currentIndex += numsamples;
    ringBuffer->currentIndex &= ringBuffer->sizeOfBuffer - 1;
}

As long as your ring buffer's length is a power of two, the incredibly fast binary "&" operation will wrap around your index for you. For my application, I'm displaying a segment of audio to the user from a ring buffer of audio acquired from a microphone.

I always make sure that the maximum amount of audio that can be displayed on screen is much less than the size of the ring buffer. Otherwise you might be reading and writing from the same chunk. This would likely give you weird display artifacts.

alexbw