I'm trying to use istringstream to recreate an encoded wstring from some memory. The memory is laid out as follows:

  1. 1 byte to indicate the start of the wstring encoding. Arbitrarily this is '!'.
  2. n bytes to store the character length of the string in text format, e.g. 0x31, 0x32, 0x33 would be "123", i.e. a 123-character string
  3. 1 byte separator (the space character)
  4. n bytes which are the wchars which make up the string, where wchar_t's are 2-bytes each.

For example, the byte sequence:

21 36 20 66 00 6f 00 6f 00

is "!6 f.o.o." (using dots to represent char 0)

All I've got is a char* pointer (let's call it pData) to the start of the memory block with this encoded data in it. What's the 'best' way to consume the data to reconstruct the wstring ("foo"), and also move the pointer to the next byte past the end of the encoded data?

I was toying with using an istringstream to allow me to consume the prefix byte, the length of the string, and the separator. After that I can calculate how many bytes to read and use the stream's read() function to insert into a suitably-resized wstring. The problem is, how do I get this memory into the istringstream in the first place? I could try constructing a string first and then pass that into the istringstream, e.g.

std::string s((const char*)pData);

but that doesn't work because the string is truncated at the first null byte. Or, I could use the string's other constructor to explicitly state how many bytes to use:

std::string s((const char*)pData, len);

which works, but only if I know what len is beforehand. That's tricky given that the data is variable length.
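To make the difference between the two constructors concrete, here is a minimal sketch using the example bytes from above. The names kExample, length_via_cstring_ctor and length_via_counted_ctor are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// The example bytes from the question: "!6 f.o.o." with dots standing for 0x00.
const char kExample[] = {0x21, 0x36, 0x20, 0x66, 0x00, 0x6f, 0x00, 0x6f, 0x00};

// The C-string constructor stops at the first null byte, so only "!6 f" survives.
std::size_t length_via_cstring_ctor() {
    return std::string(kExample).size();
}

// The counted constructor copies exactly the requested number of bytes, nulls included.
std::size_t length_via_counted_ctor() {
    return std::string(kExample, sizeof kExample).size();
}
```

The first function sees 4 bytes and the second sees all 9, which is why the counted form is the only one of the two that keeps the payload intact.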

This seems like a really solvable problem. Does my rookie status with strings and streams mean I'm overlooking an easy solution? Or am I barking up the wrong tree with the whole string approach?

A: 

Try writing the data into your stringstream's buffer through rdbuf():

char* buffer = something;
std::stringbuf *pbuf;
std::stringstream ss;

pbuf = ss.rdbuf();
pbuf->sputn(buffer, bufferlength);
// use your ss

Edit: I see that this solution will have a similar problem to your string(char*, len) situation. Can you tell us more about your buffer object? If you don't know the length, and it isn't null terminated, it's going to be very hard to deal with.
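If an upper bound on the block size is available, a stream over a counted std::string can do the parsing the question describes. The following is only a sketch under that assumption: parse_with_stream is a hypothetical helper, and it treats the encoded count as the payload size in bytes, as in the "!6 f.o.o." example.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: parses one "!<count> <payload>" record from a block of len bytes.
// Returns the number of bytes consumed, or 0 if the record is malformed.
// Assumes <count> is the payload size in bytes, as in the question's example.
std::size_t parse_with_stream(const char* data, std::size_t len, std::string& payload) {
    const std::string block(data, len);  // explicit length keeps embedded nulls
    std::istringstream in(block);

    std::size_t bytes = 0;
    // Prefix byte, decimal count, then the single space separator.
    if (in.get() != '!' || !(in >> bytes) || in.get() != ' ') return 0;
    if (bytes > len) return 0;  // count cannot exceed the block itself

    // Read the raw payload bytes; read() does not stop at null bytes.
    std::string raw(bytes, '\0');
    if (bytes > 0 && !in.read(&raw[0], static_cast<std::streamsize>(bytes))) return 0;

    payload.swap(raw);
    return static_cast<std::size_t>(in.tellg());
}
```

The return value tells the caller how far to advance the original pointer. Note that this copies the block into the string first, so it costs one extra copy compared with parsing the memory in place.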

luke
There is no buffer 'object' I'm afraid, just a pointer to a blob of memory. I get handed a pointer to the start of that memory and I need to (re)create a wstring from it. I can't really null terminate anything since nulls are valid data (see my example). And I sort of know the size because it's encoded in the data, albeit as a text string. As a human I can parse this data easily, but I'm struggling to come up with an elegant way to do it in code. If there's anything specific you'd like to know, ask away.
WalderFrey
A: 

It seems like something on this order should work:

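#include <cstring>
#include <string>
std::wstring make_string(char const *input) {
    if (*input != '!')
        return std::wstring();
    // Assumes a single-digit byte count stored as text, e.g. '6'.
    std::size_t length = *++input - '0';
    input += 2; // skip the digit and the space separator
    std::wstring result(length / sizeof(wchar_t), L'\0');
    std::memcpy(&result[0], input, result.size() * sizeof(wchar_t)); // copy, since input may be unaligned
    return result;
}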

The difficult part is dealing with the variable length of the size. Without something to specify the length it's hard to guess when to stop treating the data as specifying the length of the string.

As for moving the pointer, if you're going to do it inside a function, you'll need to pass a reference to the pointer, but otherwise it's a simple matter of adding the size you found to the pointer you received.
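Putting those two points together, a minimal sketch might look like the following. It is not a definitive implementation: read_wstring is a hypothetical name, the count is assumed to be the payload size in bytes (as in the "!6 f.o.o." example), and each character is assumed to be a 2-byte little-endian unit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decodes one "!<count> <payload>" record starting at p.
// Assumes <count> is the payload size in bytes and that each character is a
// 2-byte little-endian unit. On success, stores the text in out, advances p
// to the first byte past the record, and returns true. On failure, p and out
// are left untouched.
bool read_wstring(const char*& p, std::wstring& out) {
    assert(p != nullptr);
    const char* cur = p;
    if (*cur++ != '!') return false;

    // Accumulate the decimal digits of the count; at least one is required.
    if (*cur < '0' || *cur > '9') return false;
    std::size_t bytes = 0;
    while (*cur >= '0' && *cur <= '9') {
        bytes = bytes * 10 + static_cast<std::size_t>(*cur - '0');
        ++cur;
    }
    if (*cur++ != ' ') return false;   // separator
    if (bytes % 2 != 0) return false;  // payload must be whole 2-byte units

    // Assemble each character from two bytes, so neither the alignment of the
    // input nor the platform's sizeof(wchar_t) matters.
    std::wstring result;
    result.reserve(bytes / 2);
    for (std::size_t i = 0; i < bytes; i += 2) {
        const unsigned lo = static_cast<unsigned char>(cur[i]);
        const unsigned hi = static_cast<unsigned char>(cur[i + 1]);
        result.push_back(static_cast<wchar_t>(lo | (hi << 8)));
    }

    out.swap(result);
    p = cur + bytes;  // move the caller's pointer past the encoded data
    return true;
}
```

Called with pData pointing at the example bytes, this should leave "foo" in out and pData nine bytes further on. It trusts the encoded count, so a caller that knows the end of the block would want to add a bounds check before the loop.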

Jerry Coffin
A: 

It's tempting to (ab)use the (deprecated but nevertheless standard) std::istrstream here:

// Maximum size to read is 
// 1 for the exclamation mark
// Digits for the character count (digits10 + 1)
// 1 for the space
const std::streamsize max_size = 3 + std::numeric_limits<std::size_t>::digits10;

std::istrstream s(buf, max_size);

if (std::istream::traits_type::to_char_type(s.get()) != '!'){
    throw "missing exclamation";
}

std::size_t size;
s >> size;

if (std::istream::traits_type::to_char_type(s.get()) != ' '){
    throw "missing space";
}

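// The payload starts at the stream's current read position, not at the start of the buffer.
std::wstring result(reinterpret_cast<const wchar_t*>(buf + static_cast<std::streamoff>(s.tellg())), size/sizeof(wchar_t));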
Éric Malenfant
Interesting... So you're saying that since I can't know the size of the data up front then I set the stream to its maximum size. Yeah, that would work I guess. Would this introduce any additional overheads though?
WalderFrey
@WalderFrey: istrstream does not own its buffer, and makes no copy.
Éric Malenfant
But thinking twice, passing max may not be a good idea, because the stream's implementation may end up using something like buf + size to compute the end of its buffer, which could overflow if buf is a large value. I'll update the answer to use a safer buffer size.
Éric Malenfant