I'm not exactly a C++ newbie, but I have had few serious dealings with it in the past, so my knowledge of its facilities is rather sketchy.

I'm writing a quick proof-of-concept program in C++ and I need a dynamically sizeable buffer of binary data. That is, I'm going to receive data from a network socket and I don't know how much there will be (although not more than a few MB). I could write such a buffer myself, but why bother if the standard library probably has something already? I'm using VS2008, so some Microsoft-specific extension is just fine by me. I only need four operations:

  • Create the buffer
  • Write data to the buffer (binary junk, not zero-terminated)
  • Get the written data as a char array (together with its length)
  • Free the buffer

What is the name of the class/function set/whatever that I need?

Added: Several votes go to std::vector. All nice and fine, but I don't want to push several MB of data byte-by-byte. The socket will give data to me in few-KB large chunks, so I'd like to write them all at once. Also, at the end I will need to get the data as a simple char*, because I will need to pass the whole blob along to some Win32 API functions unmodified.

+16  A: 

You want a std::vector:

typedef std::vector<char> buffer_type;
buffer_type myData;

vector will automatically allocate and deallocate its memory for you. Use push_back to add new data (vector will resize for you if required), and the indexing operator [] to retrieve data.

If at any point you can guess how much memory you'll need, I suggest calling reserve so that subsequent push_back's won't have to reallocate as much.
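As a rough illustration of the reserve advice, here is a minimal sketch. The helper names `prepare_buffer` and `append_bytes` are made up for this example and are not part of the standard library. The point is that reserve only changes capacity(), not size(), and later appends that fit in the reserved space do not reallocate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: pre-allocate room for the expected amount of data.
// Only capacity() grows; size() is left untouched.
void prepare_buffer(std::vector<char>& buf, std::size_t expectedBytes)
{
    buf.reserve(expectedBytes);
}

// Hypothetical helper: append `count` raw bytes (embedded zeros are fine).
void append_bytes(std::vector<char>& buf, const char* bytes, std::size_t count)
{
    assert(bytes != 0 || count == 0);
    if (count != 0)
        buf.insert(buf.end(), bytes, bytes + count);
}
```

If the guess passed to `prepare_buffer` turns out too small, the vector still grows on its own; the reservation is only an optimization.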

If you want to read in a chunk of memory and append it to your buffer, easiest would probably be something like:

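typedef std::vector<char> buffer_type;
buffer_type myData;

static const unsigned BufferSize = 1024;
char rawBuffer[BufferSize];
unsigned bytesRead = 0; // declared outside the loop so the while condition can see it

do
{
    bytesRead = get_network_data(rawBuffer, BufferSize); // pseudo

    // append the chunk that was just read to the end of the vector
    myData.insert(myData.end(), rawBuffer, rawBuffer + bytesRead);
} while (bytesRead > 0);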

myData now has all the read data.

If you need to treat your data as a raw-array, take the address of the first element:

some_c_function(&myData[0], myData.size());
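One caveat worth sketching: `&myData[0]` is only valid when the vector is not empty, since indexing element 0 of an empty vector is undefined behaviour. A small guard like the one below avoids that. `raw_data` and `count_zero_bytes` are hypothetical names; the latter merely stands in for a C-style function that takes a pointer and a length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a pointer to the vector's contiguous storage,
// or a null pointer when the vector is empty.
inline char* raw_data(std::vector<char>& v)
{
    return v.empty() ? 0 : &v[0];
}

// Hypothetical stand-in for a C API taking (pointer, length):
// counts how many of the bytes are zero.
std::size_t count_zero_bytes(const char* data, std::size_t length)
{
    assert(data != 0 || length == 0);
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < length; ++i)
        if (data[i] == '\0')
            ++zeros;
    return zeros;
}
```

A call then looks like `count_zero_bytes(raw_data(myData), myData.size())`, which is safe even if nothing was received.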

For raw data (what you're using), try something like this:

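typedef std::vector<char> buffer_type;
buffer_type myData;

static const unsigned BufferSize = 1024;
unsigned bytesRead = 0; // declared outside the loop so the while condition can see it

do
{
    // grow the vector by one chunk so there is room to read into
    const size_t oldSize = myData.size();
    myData.resize(oldSize + BufferSize);

    bytesRead = get_network_data(&myData[oldSize], BufferSize); // pseudo

    // trim the vector back to the bytes actually received
    myData.resize(oldSize + bytesRead);
} while (bytesRead > 0);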

Which reads directly into the buffer.

GMan
OK, but I don't see a member with which I could add a whole buffer of data. Or do I have to push several MB byte-by-byte? I will read it from the socket in nice few-KB large chunks.
Vilx-
Vilx -- use myData.insert(myData.end(), bytes_ptr, bytes_ptr + bytes_count)
atzz
Assuming that you have a buffer of known size, `vec.insert(vec.end(), buf, buf+length)`
KeithB
I don't see an `append()` member function on the vector.
Vilx-
Vector is required to be contiguous, so it is possible to take the address of an element and memcpy() a block of data into it. Feel free to shudder at the horror of this.
RobH
I shudder at the horror of this.
Vilx-
Taking the address of the first element is fairly common. Also, if you're reading network data and want it, you'll have to copy it *somewhere*, which involves every byte. Some CPUs can copy multiple bytes at once, and your compiler will take advantage of that for you.
GMan
sbk
Nice idea too! :)
Vilx-
Ha, good point sbk. I was just focused on using `memcpy` :P
GMan
+4  A: 
std::vector<unsigned char> buffer;

Every push_back will add a new char at the end (reallocating if needed). You can call reserve to minimize the number of allocations if you roughly know how much data you expect.

buffer.reserve(1000000);

If you already have the data in an array, you can construct the vector from it:

unsigned char buffer[1000];
std::vector<unsigned char> vec(buffer, buffer + 1000);
Nikola Smiljanić
+1  A: 

Use std::vector, a growing array that guarantees the storage is contiguous (your third point).

Xavier Nodet
+1  A: 

An alternative which is not from the STL but might be of use: Boost.Circular Buffer

skwllsp
A: 

std::string would work for this:

  • It supports embedded nulls.
  • You can append multi-byte chunks of data to it by calling append() on it with a pointer and a length.
  • You can get its contents as a char array by calling data() on it, and the current length by calling size() or length() on it.
  • Freeing the buffer is handled automatically by the destructor, but you can also call clear() on it to erase its contents without destroying it.
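
The points above can be sketched roughly as follows. `append_chunk` is a hypothetical helper name. The important detail is passing the length explicitly, so that append() does not stop at the first zero byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: append one received chunk to a std::string buffer.
// The explicit length keeps embedded '\0' bytes intact.
void append_chunk(std::string& buffer, const char* chunk, std::size_t length)
{
    assert(chunk != 0 || length == 0);
    if (length != 0)
        buffer.append(chunk, length);
}
```

Afterwards `buffer.data()` and `buffer.size()` give the contents and their length, and `buffer.clear()` empties the buffer without destroying it.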
Wyzard
A: 

Regarding your comment "I don't see an append()", inserting at the end is the same thing.

vec.insert(vec.end,

Brian D. Coryell
+2  A: 

I'd take a look at Boost basic_streambuf, which is designed for this kind of purpose. If you can't (or don't want to) use Boost, I'd consider std::basic_streambuf, which is quite similar, but a little more work to use. Either way, you basically derive from that base class and override underflow() to read data from the socket into the buffer. You'll normally attach an std::istream to the buffer, so other code reads from it about the same way as it would read user input from the keyboard (or whatever).
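
A minimal sketch of the std::streambuf route, under some loud assumptions: `FakeSocket` is a made-up in-memory stand-in for a real socket, and `SocketStreambuf` is an illustrative name, not a standard or Boost class. Only underflow() is overridden; a real implementation would call recv() where `receive` is called here and would need error handling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical data source standing in for a socket: hands out its data
// at most `chunkSize` bytes per call and returns 0 once it is exhausted.
struct FakeSocket
{
    std::string data;
    std::size_t pos;
    std::size_t chunkSize;

    FakeSocket(const std::string& d, std::size_t chunk)
        : data(d), pos(0), chunkSize(chunk) {}

    std::size_t receive(char* out, std::size_t max)
    {
        assert(out != 0);
        std::size_t n = data.size() - pos;
        if (n > chunkSize) n = chunkSize;
        if (n > max) n = max;
        std::memcpy(out, data.data() + pos, n);
        pos += n;
        return n;
    }
};

class SocketStreambuf : public std::streambuf
{
public:
    explicit SocketStreambuf(FakeSocket& s) : socket_(s), chunk_(1024)
    {
        // Start with an empty get area so the first read calls underflow().
        setg(&chunk_[0], &chunk_[0], &chunk_[0]);
    }

protected:
    // Called by the stream when the get area is exhausted: refill it
    // from the source, or report end-of-file when nothing is left.
    virtual int_type underflow()
    {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());

        const std::size_t n = socket_.receive(&chunk_[0], chunk_.size());
        if (n == 0)
            return traits_type::eof();

        setg(&chunk_[0], &chunk_[0], &chunk_[0] + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    FakeSocket& socket_;
    std::vector<char> chunk_;
};
```

An `std::istream in(&sb);` constructed over such a buffer can then be read with `in.get(c)` or `in.read(...)` until it reports end-of-file.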

Jerry Coffin
A: 

If you do use std::vector, you're just using it to manage the raw memory for you. You could just malloc the biggest buffer you think you'll need, and keep track of the write offset/total bytes read so far (they're the same thing). If you get to the end ... either realloc or choose a way to fail.

I know, it isn't very C++y, but this is a simple problem and the other proposals seem like heavyweight ways to introduce an unnecessary copy.
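
For comparison, a rough sketch of the malloc/realloc approach described above. The `RawBuffer` struct and the `raw_buffer_*` function names are made up for illustration; the doubling growth policy is one possible choice, and overflow of the capacity arithmetic is not checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Minimal {pointer, size, capacity} buffer.
struct RawBuffer
{
    char*       data;
    std::size_t size;      // bytes written so far (also the write offset)
    std::size_t capacity;  // bytes currently allocated
};

// Allocates the initial storage; returns false if malloc fails.
bool raw_buffer_init(RawBuffer& b, std::size_t capacity)
{
    assert(capacity > 0);
    b.data = static_cast<char*>(std::malloc(capacity));
    b.size = 0;
    b.capacity = b.data ? capacity : 0;
    return b.data != 0;
}

// Appends `count` bytes, doubling the allocation until they fit.
// Returns false and leaves the buffer unchanged if realloc fails.
bool raw_buffer_append(RawBuffer& b, const char* bytes, std::size_t count)
{
    assert(b.capacity > 0);
    if (count > b.capacity - b.size)
    {
        std::size_t newCapacity = b.capacity;
        while (count > newCapacity - b.size)
            newCapacity *= 2;
        void* p = std::realloc(b.data, newCapacity);
        if (p == 0)
            return false;
        b.data = static_cast<char*>(p);
        b.capacity = newCapacity;
    }
    std::memcpy(b.data + b.size, bytes, count);
    b.size += count;
    return true;
}

// Releases the storage and resets the bookkeeping.
void raw_buffer_free(RawBuffer& b)
{
    std::free(b.data);
    b.data = 0;
    b.size = 0;
    b.capacity = 0;
}
```

Unlike the vector, nothing here is released automatically, so every exit path has to reach `raw_buffer_free`.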

Useless
Well, that is basically what I want to do. I just wondered if there wasn't some built-in way for doing that already.
Vilx-
+2  A: 

One more vote for std::vector. Minimal code, and it skips the extra copy GMan's code does:

std::vector<char> buffer;
static const size_t MaxBytesPerRecv = 1024;
size_t bytesRead;
do
{
    const size_t oldSize = buffer.size();

    buffer.resize(oldSize + MaxBytesPerRecv);
    // pseudo: like the winsock recv() functions, it takes a buffer and the maximum number of bytes to write to it
    bytesRead = receive(&buffer[oldSize], MaxBytesPerRecv);

    // shrink the vector; this is practically a no-op: it only modifies the internal size, no data is moved/freed
    buffer.resize(oldSize + bytesRead);
} while (bytesRead > 0);

As for calling WinAPI functions - use &buffer[0] (yeah, it's a little bit clumsy, but that's the way it is) to pass to the char* arguments, buffer.size() as length.

And a final note: you can use std::string instead of std::vector; there shouldn't be any difference (except that you can write buffer.data() instead of &buffer[0] if your buffer is a string; note that data() returns a const char*).

sbk
+1: if you choose vector, this is the way to do it. I still claim the vector here is just used as a collection of {size, capacity, pointer} and you could just as easily call `realloc` yourself though ...
Useless
I claim C++ is just really some assembly instructions and you should use those. :P
GMan
Fair enough ;D I just don't think the vector is adding much abstraction or expressiveness here - although this may depend on the user/reader's level of comfort with C memory allocation.
Useless
@Useless: Ok then how about hassle free exception safe memory management?
SDX2000
OK, good point: I'm used to idiomatic C code for low-level socket programming (and the POSIX sockets API doesn't throw), but it isn't either good style in general or idiomatic C++.
Useless