I'm working with a function that yields some data as a std::vector<char> and another function (think of legacy APIs) that processes data and takes a const char * and a size_t len. Is there any way to detach the data from the vector, so that the vector can go out of scope before the processing function is called, without copying the data it contains? (That is what I mean by "detaching".)

Some code sketch to illustrate the scenario:

// Generates data
std::vector<char> generateSomeData();

// Legacy API function which consumes data
void processData( const char *buf, size_t len );

void f() {
  char *buf = 0;
  size_t len = 0;
  {
      std::vector<char> data = generateSomeData();
      buf = &data[0];
      len = data.size();
  }

  // How can I ensure that 'buf' points to valid data at this point, so that the following
  // line is okay, without copying the data?
  processData( buf, len );
}
+10  A: 
void f() { 
  char *buf = 0; 
  size_t len = 0; 
  std::vector<char> mybuffer; // must stay alive for as long as buf and len are in use
  { 
      std::vector<char> data = generateSomeData(); 
      mybuffer.swap(data);  // swap without copy
      buf = &mybuffer[0]; 
      len = mybuffer.size(); 
  } 

  // 'buf' still points to valid data here, because 'mybuffer' now owns the storage.
  processData( buf, len ); 
} 
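
A minimal sketch of why the swap avoids a copy, assuming the default `std::allocator`: swapping two vectors exchanges their internal buffers, so the address of the first element is the same before and after. The body of `generateSomeData` and the helper `swapKeepsBuffer` are hypothetical stand-ins, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's generateSomeData().
std::vector<char> generateSomeData() {
    return std::vector<char>(1024, 'x');
}

// Returns true if the long-lived vector ends up owning the very same
// buffer that the short-lived vector owned before the swap.
bool swapKeepsBuffer() {
    std::vector<char> longLived;
    const char *before = 0;
    {
        std::vector<char> data = generateSomeData();
        assert(!data.empty());   // &data[0] is only valid for a non-empty vector
        before = &data[0];       // address of the buffer while 'data' owns it
        longLived.swap(data);    // exchanges the buffers, no element is copied
    }                            // 'data' (now empty) is destroyed here
    return before == &longLived[0] && longLived.size() == 1024;
}
```

Calling `swapKeepsBuffer()` should return `true`, since `swap` leaves pointers to the elements valid and merely changes which vector owns them.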
Alexey Malistov
Works great! Thanks - I didn't realize there's a swap() member function on vectors.
Frerich Raabe
Yes, that's how it is done with `std::vector`.
sharptooth
@Frerich: STL containers generally feature a `swap` member, or a `swap` free function. It's not a requirement, it's just good design.
Matthieu M.
If you are going to provide a *global* (outer scope) vector, why not just generate the data into that container? `{ mybuffer = generateSomeData(); }`??
David Rodríguez - dribeas
@David Rodríguez - dribeas: The code which calls `generateSomeData` doesn't have direct access to the `mybuffer` object. If it was that easy, I wouldn't have any need to detach the data in the first place. I wrote that in my response to your comment to my question already.
Frerich Raabe
A: 

I wouldn't recommend it, but:

char* ptr = NULL;
int len = -1;
{
    vector<char> vec;
    /* ... fill vec with data ... */
    vec.push_back('\0'); // don't forget to null-terminate =)
    ptr = &vec.front();
    len = vec.size();
    // here goes...
    memset((void*)&vec, 0, sizeof(vec));
}
// vec is out of scope, but you can still access its old content via ptr.
char firstVal = ptr[0];
char lastVal = ptr[len-1];
delete [] ptr; // don't forget to free

Voila!

This code is actually pretty safe, as vec's destructor will call delete [] 0, which is a safe operation (unless you have some strange implementation of the STL).

Viktor Sehr
you have a strange definition of "safe".
jalf
I'm no language lawyer, but there's gotta be some undefined behavior there.
Steve Fallows
-1. All decent compilers should produce a warning on the `memset()` line.
Dummy00001
@Dummy00001: Well, they don't, it works, and it's probably the only way to detach the array from the vector.
Viktor Sehr
Oh god... kill it with a fire!
Alex B
`delete [] ptr;` is certainly wrong, since by default `vector` uses `::operator new` and placement new (via `std::allocator`) to separately allocate memory and create objects. And the `memset` hack is undefined behaviour, even if it does happen to have the right effect on your implementation.
Mike Seymour
@Dummy00001: Unfortunately, any complaint by a compiler will be silenced by the cast to `void*`, but that does not change the fact that the code invokes UB.
Bart van Ingen Schenau
@Mike Seymour: a char doesn't have a destructor, so it doesn't matter whether it uses placement new. I already pointed out that I don't recommend it, and that it depends on the STL implementation.
Viktor Sehr
The act of posting the code and calling it safe looks like a recommendation, even if you *say* it isn't.
Rob Kennedy
@Viktor: placement new isn't the issue. Allocating with new and deallocating with delete[] is. Even for POD, that is dangerous.
Dennis Zickefoose
@Dennis: the vector internally allocates and deallocates with [] as it doesn't know the size of the array at compile time.
Viktor Sehr
@Rob Kennedy: Or you could see it as a solution which answers the question, but isn't recommended to do. Exactly as I say.
Viktor Sehr
+1 for creative, practical thinking, and giving me a good laugh.
Tony
@Viktor: no, the vector internally allocates and deallocates with `::operator new()`, unless you give it a custom allocator that uses `new []`.
Mike Seymour
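
A small sketch of the pairing described above, assuming the default `std::allocator<char>`: storage obtained through the allocator has to be handed back through the same allocator, which is why `delete []` on a pointer taken from a vector is a mismatch. The function name `allocatorRoundTrip` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Obtains storage for n chars through std::allocator (the allocator a
// std::vector<char> uses by default), fills it, and releases it through
// the same allocator. Returns the last value stored.
char allocatorRoundTrip(std::size_t n) {
    assert(n > 0);
    std::allocator<char> alloc;
    char *p = alloc.allocate(n);           // raw storage, not a 'new char[n]' array
    std::uninitialized_fill_n(p, n, 'a');  // create the char objects in that storage
    char last = p[n - 1];
    alloc.deallocate(p, n);                // matching release; 'delete [] p' would not match
    return last;
}
```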
And the `memset` won't necessarily set the internal pointer(s) to null; null's representation isn't necessarily zero.
Mike Seymour
@Tony: Thank you =)
Viktor Sehr
@Mike Seymour: "And the memset won't necessarily set the internal pointer(s) to null; null's representation isn't necessarily zero". Yes, there are a lot of reasons I said "pretty safe" instead of "safe", and started by saying I don't recommend it.
Viktor Sehr
@Mike Seymour: "the vector internally allocates and deallocates with ::operator new()", ok, I thought the default allocator used new[] deep down somewhere.
Viktor Sehr
-1 for continuing to argue that invalid code, exhibiting at least two instances of undefined behaviour, is "pretty safe".
Mike Seymour
+1  A: 
Bart van Ingen Schenau
-1: This answers a question, but not the one I asked; I **intentionally** put a scope around the `data` vector so that it goes out of scope early to demonstrate my point.
Frerich Raabe
@Frerich: but perhaps it answers the one you should have asked.
Mike Seymour
@Mike: Feel free to answer 'Beneath the big apple tree in the garden', just in case I meant to ask 'Where can I find some shade during hot days?'. :-}
Frerich Raabe
@Frerich: Given that you have accepted a solution that *requires* adding a vector in the outer scope, this solution is just the simple, elegant version of it... why do you want to create a second vector and swap if you can place the vector in the outer scope? My take is that you did not ask what you wanted to know, and a detail in the accepted answer helped you to a solution to the question you did not post.
David Rodríguez - dribeas
@David Rodríguez - dribeas: I asked how to detach a vector from the data it contains. I have no problem with creating a *second* vector (with a possibly longer lifetime!) for that. That's precisely the question which Alexey answered.
Frerich Raabe
@Frerich: Sorry, but I still don't see your point. You have a vector that goes out of scope too early, so you transfer its contents to a longer-living vector. What, then, is the problem with storing the data in the longer-living vector in the first place?
Bart van Ingen Schenau
@Bart van Ingen Schenau: I cannot change the `generateSomeData` function, it's not under my control. It always generates a vector on the stack and passes it back to me. Now I need to pass the data in the vector to a different thread, and that thread calls a legacy C API (also not under my control). So I need to ensure that the data in the vector lives long enough, until the other thread (and the legacy C function) have finished their work. However, I want to avoid *copying* the data.
Frerich Raabe
@Bart van Ingen Schenau: My solution is now to do a `new std::vector<char>` and then use the `swap()` function to swap the data from the vector returned by `generateSomeData` with my allocated vector. This ensures that the data lives until the `new`'ed vector is deleted. The other thread deletes the vector after calling the legacy API function.
Frerich Raabe
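
A rough sketch of the approach described in the comment above, with the actual hand-off between threads left out. The bodies of `generateSomeData` and `processData`, the counter `g_processedBytes`, and the helpers `detachToHeap` and `consumeAndFree` are all hypothetical stand-ins; `consumeAndFree` represents whatever the worker thread would run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the data source that cannot be changed.
std::vector<char> generateSomeData() {
    return std::vector<char>(256, 'd');
}

// Hypothetical stand-in for the legacy C API; it only records how many
// bytes it was given so the hand-off can be observed.
static std::size_t g_processedBytes = 0;
void processData(const char *buf, std::size_t len) {
    (void)buf;
    g_processedBytes += len;
}

// Producer side: move the buffer into a heap-allocated vector by swapping,
// so the data outlives the local vector without being copied.
std::vector<char> *detachToHeap() {
    std::vector<char> data = generateSomeData();
    std::vector<char> *owner = new std::vector<char>();
    owner->swap(data);
    return owner;
}

// Consumer side: call the legacy function, then delete the vector,
// which releases the buffer.
void consumeAndFree(std::vector<char> *owner) {
    if (!owner->empty())
        processData(&(*owner)[0], owner->size());
    delete owner;
}
```

With these stand-ins, `consumeAndFree(detachToHeap());` passes all 256 bytes to `processData` and then frees them. In real code the pointer returned by `detachToHeap` would be passed to the worker thread by whatever mechanism the program already uses.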
@Frerich: The bit about communicating the data to another thread (instead of just to an outer scope) was missing from your question. My solution does indeed not work for the across-threads case.
Bart van Ingen Schenau
@Bart van Ingen Schenau: Yes, true - I suppose I should've provided a bit more context to avoid this confusion (I had basically the same argument with "David Rodríguez - dribeas").
Frerich Raabe