Is there any way to read a known number of bytes directly into a std::string, without creating a temporary buffer to do so?

For example, currently I can do it with:

boost::uint16_t len;
is.read((char*)&len, 2);
char *tmpStr = new char[len];
is.read(tmpStr, len);
std::string str(tmpStr, len);
delete[] tmpStr;
+3  A: 

std::string has a resize function you could use, or a constructor that'll do the same:

boost::uint16_t len;
is.read((char*)&len, 2);

std::string str(len, '\0');
is.read(&str[0], len);

This is untested, and I don't know if strings are mandated to have contiguous storage.

GMan
Strings are defined to be vectors. Same contiguity.
bmargulies
They are not defined to be vectors, but 21.3.4/1 does imply contiguous storage. However there's confusion and defect reports about that specific section, and I'm not sure what the current consensus is, nor how portable depending on that interpretation is.
Roger Pate
@Roger. I disagree that 21.3.4/1 implies contiguous storage. It is the presence of c_str() and data() that implies it, but only because an efficient implementation would require contiguous storage to implement them. I believe the next version of the standard also disambiguates the situation.
Martin York
It's a known defect in the standard that will be corrected in C++0x. I do not know of any implementations that do not use contiguous storage, though. You could always put in an assertion to check that it's contiguous.
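One possible shape for such an assertion, as a rough sketch only. The helper name storage_is_contiguous is made up for illustration; it compares the address of each element returned by operator[] with the corresponding offset into the array returned by data():

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical check: true if every element reachable through operator[]
// lives at the matching offset of the array returned by data().
bool storage_is_contiguous(std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (&s[i] != s.data() + i) {
            return false;
        }
    }
    return true;
}

// Hypothetical usage: call this before writing through &str[0].
void assert_contiguous(std::string& s) {
    assert(storage_is_contiguous(s));
}
```

Since assert compiles away when NDEBUG is defined, the check costs nothing in release builds.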
Joe Gauterin
A: 

Are you just optimizing code length or trying to save yourself a copy here? What's wrong with the temporary buffer?

I'd argue that you're actually circumventing the protections of the string by trying to write directly to it like that. If you're worried about the cost of the copy into a std::string because you've identified that it's affecting the performance of your application, I'd work directly with the char*.

EDIT: Doing more looking... http://stackoverflow.com/questions/361500/initializing-stdstring-from-char-without-copy

In the second answer, it's stated pretty flatly that you can't achieve what you're looking to achieve (i.e., populate a std::string without iterating over the char* to copy it).

Take a look at your load routine (post it here perhaps?) and minimize allocations: new and delete certainly aren't free, so you can at least save some time if you don't have to re-create the buffer constantly. I always find it helpful to erase the buffer by memset'ing it to 0 or null-terminating the first index of the array on each iteration, but you may quickly eliminate that code in the interests of performance once you're confident in your algorithm.

antik
The performance of std::string is fine; the problem is loading the data into them from a binary file, which is currently taking an unacceptably long time. Profiling showed that 70% of that load time is spent reading strings, with just 30% going to other binary data or small bits of processing, so speeding up the string reading seems the obvious way to speed the whole thing up by a major margin. I by no means want to replace std::string in the rest of the program, which would mean changing thousands of lines rather than just changing the string loading routine.
Fire Lancer
How big an issue is the alloc, dealloc of the char* every iteration? What if you simply kept a char* of sufficient size (checking for each iteration, obviously) around and just created new strings from that single char*?
antik
A: 

You could use something like getline:

#include <iostream>
#include <string>
using namespace std;

int main () {
  string str;
  getline (cin,str,' ');
}
rmn
This is a good suggestion for other problems, but not for this one: unformatted input of a specific number of bytes.
Roger Pate
+2  A: 

I would use a vector as the buffer.

boost::uint16_t len;
is.read((char*)&len, 2); // Note if this file was saved from a different architecture 
                         // then endianness of these two bytes may be reversed.

std::vector<char> buffer(len);  // elements are zero-initialized
is.read(&buffer[0], len);

std::string  str(buffer.begin(),buffer.end());

You will probably get away with using a string as the buffer (as described by GMan), though the standard does not guarantee that a string's characters are stored in consecutive locations (so check your current implementation and add a prominent comment that this needs rechecking when porting to another compiler/platform).

Martin York
A: 

How about:

std::string str;
str.reserve(NUMBER_OF_BYTES);
for(std::size_t idx = 0; idx != NUMBER_OF_BYTES; idx++)
{
    str.push_back(static_cast<char>(is.get()));
}

?

Billy ONeal