Which is the better C++ container for holding and accessing binary data?
std::vector<unsigned char>
or
std::string
Is one more efficient than the other?
Is one a more 'correct' usage?
If you just want to store your binary data, you can use bitset, which optimizes for space allocation. Otherwise go for vector, as it's more appropriate for your usage.
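One practical difference between the two, shown as a rough sketch: the size of a std::bitset is a template argument fixed at compile time and it addresses individual bits, while a std::vector<unsigned char> is sized at run time and addresses whole bytes. The helper names make_flags and make_buffer are made up for this illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical helper: unpacks one byte into 8 individually addressable bits.
// The size (8) is a template argument, so it must be known at compile time.
std::bitset<8> make_flags(unsigned char byte) {
    return std::bitset<8>(byte);
}

// Hypothetical helper: a byte buffer whose length is chosen at run time.
std::vector<unsigned char> make_buffer(std::size_t n) {
    return std::vector<unsigned char>(n, 0);  // n zero-initialised bytes
}
```

For example, make_flags(0x05).test(0) is true and make_buffer(10).size() is 10.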
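For pure binary data, use std::vector or something else, but not std::string. If your data contains a 0 byte, any code path that treats the string as a C string (for example constructing it from a const char* without a length, or handing c_str() to a C function) regards that byte as a terminator, so the data after it is silently dropped.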
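A minimal sketch of where the truncation comes from, assuming the bytes arrive as a raw const char* buffer: whether the data after a 0 byte survives depends on whether the length is passed explicitly. All three helper names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: copies `len` raw bytes into a std::string.
// The (pointer, length) constructor does not stop at 0 bytes.
std::string bytes_to_string(const char* data, std::size_t len) {
    return std::string(data, len);
}

// Hypothetical helper: treats the buffer as a C string, so the copy
// ends at the first 0 byte and everything after it is lost.
std::string bytes_to_string_truncating(const char* data) {
    return std::string(data);
}

// Hypothetical helper: the std::vector equivalent. The range constructor
// always needs both ends, so there is no terminator-based overload to misuse.
std::vector<unsigned char> bytes_to_vector(const char* data, std::size_t len) {
    return std::vector<unsigned char>(data, data + len);
}
```

With the four bytes 'a', 0, 'b', 'c' as input, the first and third helpers keep all four bytes, while the second keeps only the leading 'a'.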
Both are correct and equally efficient. The only reason to use one of them instead of a plain array is to ease memory management and make the data easier to pass as an argument.
I use vector because the intention is clearer than with string.
Edit: The C++03 standard does not guarantee that std::basic_string stores its characters contiguously. However, from a practical viewpoint, there are no commercial non-contiguous implementations. C++0x is set to standardize that fact.
Is one more efficient than the other?
This is the wrong question.
Is one a more 'correct' usage?
This is the correct question.
It depends. How is the data being used? If you are going to use the data in a string-like fashion, then you should opt for std::string, as using a std::vector may confuse subsequent maintainers. If, on the other hand, most of the data manipulation looks like plain maths or is vector-like, then a std::vector is more appropriate.
Compare the two and decide for yourself which is the better fit. Both are very robust and work with the STL algorithms. Choose whichever is more effective for your task.
You should prefer std::vector over std::string. In common cases both solutions can be almost equivalent, but std::strings are designed specifically for strings and string manipulation, and that is not your intended use.
std::string offers methods that you will probably not want to use and that just make the interface cumbersome for plain data storage. Some of the methods will have 'strange' behavior. For example, std::string::compare will treat two strings that are not bitwise identical as equal if the differences happen in characters that are equivalent with respect to the character traits. This will make the free function operator== return true for strings that are not bitwise equal. Say that the default character traits determine that 'a' and 'á' are equivalent; then std::string("a") == std::string("á") would be true. While this may be sensible to do with strings, it sure is not with binary data.
While I cannot recall a real example where this happens, there is no reason to prefer a more complex solution to a problem that can be solved by another container and that could fail in some weird, hard to debug way.
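To make the mechanism concrete, here is a rough sketch with a deliberately non-default traits class. ci_traits, ci_string, ci_equal and plain_equal are hypothetical names invented for this illustration; the traits class folds ASCII case, so two strings with different bytes compare equal. With the default std::char_traits<char>, the same inputs compare unequal.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class for illustration only: treats upper and lower
// case ASCII letters as equivalent. Not part of the standard library.
struct ci_traits : std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }
};

typedef std::basic_string<char, ci_traits> ci_string;

// Equality as seen through the custom traits.
bool ci_equal(const char* a, const char* b) {
    return ci_string(a) == ci_string(b);
}

// Equality as seen through the default traits of std::string.
bool plain_equal(const char* a, const char* b) {
    return std::string(a) == std::string(b);
}
```

Here ci_equal("a", "A") is true although the bytes differ, while plain_equal("a", "A") is false.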
Personally I prefer std::string because string::data() is much more intuitive for me when I want my binary buffer back in C-compatible form. I know that vector elements are guaranteed to be stored contiguously, but exercising this in code feels a little bit unsettling.
This is a style decision that an individual developer or a team should make for themselves.
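For comparison, a small sketch of getting the buffer back in C-compatible form from either container. c_style_checksum is a hypothetical stand-in for a C function that takes a pointer and a length; the two checksum_of overloads are likewise made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a C API that consumes a raw byte buffer;
// here it simply sums the bytes.
unsigned long c_style_checksum(const unsigned char* buf, std::size_t len) {
    unsigned long sum = 0;
    for (std::size_t i = 0; i < len; ++i) sum += buf[i];
    return sum;
}

unsigned long checksum_of(const std::string& s) {
    // data() returns const char*, so a cast to unsigned char* is needed.
    return c_style_checksum(
        reinterpret_cast<const unsigned char*>(s.data()), s.size());
}

unsigned long checksum_of(const std::vector<unsigned char>& v) {
    // &v[0] is undefined for an empty vector, so guard that case.
    return v.empty() ? 0 : c_style_checksum(&v[0], v.size());
}
```

The vector version needs no cast because the element type already matches; from C++11 onward v.data() can replace &v[0] and also removes the need for the empty check.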
This is a comment on dribeas's answer. I am writing it as an answer to be able to format the code.
This is the char_traits compare function, and the behaviour is quite healthy:
    static bool
    lt(const char_type& __c1, const char_type& __c2)
    { return __c1 < __c2; }

    template<typename _CharT>
      int
      char_traits<_CharT>::
      compare(const char_type* __s1, const char_type* __s2, std::size_t __n)
      {
        for (std::size_t __i = 0; __i < __n; ++__i)
          if (lt(__s1[__i], __s2[__i]))
            return -1;
          else if (lt(__s2[__i], __s1[__i]))
            return 1;
        return 0;
      }
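The behaviour of the routine quoted above can be checked directly through the public interface. This is a small sketch using std::char_traits<char>::compare and std::string's operator==; the helper names default_traits_compare and same_bytes are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: compares the first n bytes of two buffers using the
// default character traits, the same routine std::string relies on.
int default_traits_compare(const char* a, const char* b, std::size_t n) {
    return std::char_traits<char>::compare(a, b, n);
}

// Hypothetical helper: equality of two byte sequences held in std::string.
bool same_bytes(const std::string& a, const std::string& b) {
    return a == b;
}
```

With the default traits, buffers that differ in any byte, including bytes after an embedded 0, compare as different, and "a" is not equal to the two-byte UTF-8 encoding of 'á'.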