I have a vector<unsigned char> filled with binary data. I need to take, let's say, 2 items from the vector (2 bytes) and convert them to an integer. How can this be done in a way that is not C style?
views: 110
answers: 6
Or the other way around if we're talking little endian.
PigBen
2010-10-27 09:08:18
Won't work. `v[0]` is an unsigned char. You must first cast it to an integer so that the `* 0x100` doesn't overflow.
ereOn
2010-10-27 09:08:33
@ereOn -- It will work, because 0x100 is an int.
PigBen
2010-10-27 09:12:59
@PigBen: My bad. You're absolutely right.
ereOn
2010-10-27 09:15:08
v[0] will be implicitly promoted to signed int. C standard section 3.2.1.1.
2010-10-27 09:15:29
Is this necessarily any more C++ than C? And anyway, it's a suboptimal version of the bitwise operations (i.e. the above will *most likely* be optimized by the compiler to the bitwise operations)! If the OP is after a class (akin to ByteBuffer in Java nio), that's a different issue...
Nim
2010-10-27 09:51:03
Okay, okay, I'll give you a point.
2010-10-27 09:58:29
@user434507, it's not about points, I'm just trying to highlight to you that you shouldn't ignore part of the language just because it's perceived to be "C style" whatever that is... If you are explicitly after a class to serialize/deserialize data properly from a binary stream, look at boost's serialization library - it's very powerful.
Nim
2010-10-27 10:03:25
+6
A:
You may do:
vector<unsigned char> somevector;
// Suppose it is initialized and big enough to hold a uint16_t
int i = *reinterpret_cast<const uint16_t*>(&somevector[0]);
// But you must be sure of the byte order
// or
int i2 = (static_cast<int>(somevector[0]) << 8) | somevector[1];
// But you must be sure of the byte order as well
ereOn
2010-10-27 09:06:07
On some platforms, the first version will generate an exception if you try to do it at an odd offset. On many (including most modern Intels, I think), an odd offset will involve a performance penalty. If the performance is crucial, it will be up to the programmer to determine the faster way.
2010-10-27 09:21:17
To be honest, a reinterpret_cast is more "C style" than the bitwise operations, it's just hidden by a C++ template! ;) What's that expression about a turd? ;) I would always go with the bitwise operations, it's frankly clearer!
Nim
2010-10-27 09:46:38
+2
A:
What do you mean by "not in C style"? Using bitwise operations (shifts and ors) to get this to work does not imply it's "C style"!
What's wrong with: int t = v[0]; t = (t << 8) | v[1]; ?
Nim
2010-10-27 09:08:18
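As a rough illustration of the shift-and-or approach above, here is a minimal sketch covering both byte orders. The helper names `read_u16_be` and `read_u16_le` and the `pos` parameter are hypothetical, not something from this thread, and the code assumes the vector holds at least two bytes starting at `pos`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: treats v[pos] as the high byte (big endian).
std::uint16_t read_u16_be(const std::vector<unsigned char>& v, std::size_t pos = 0) {
    assert(pos + 1 < v.size());  // two bytes must be available
    // v[pos] is promoted to int before the shift, so no explicit cast is needed there.
    return static_cast<std::uint16_t>((v[pos] << 8) | v[pos + 1]);
}

// Hypothetical helper: treats v[pos] as the low byte (little endian).
std::uint16_t read_u16_le(const std::vector<unsigned char>& v, std::size_t pos = 0) {
    assert(pos + 1 < v.size());  // two bytes must be available
    return static_cast<std::uint16_t>(v[pos] | (v[pos + 1] << 8));
}
```

Because the byte order is spelled out in the expression itself, the result should not depend on the endianness of the machine running the code.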
+1
A:
If you don't want to care about big/little endian, you can use:
vector<unsigned char> somevector;
// Suppose it is initialized and big enough to hold a uint16_t
int i = ntohs(*reinterpret_cast<const uint16_t*>(&somevector[0]));
Benoit Thiery
2010-10-27 09:21:54
A:
Casts are a bad thing to do.
template <class T>
long extract(typename T::const_iterator & input) {
    typedef typename T::value_type value_type;
    union {
        long rv;
        value_type fields[2];
    } temp;
    temp.fields[0] = *input++;
    temp.fields[1] = *input++;
    return temp.rv;
}
Basilevs
2010-10-27 10:02:51
Writing to one field of a union and then reading from another is a bad thing to do: it's undefined behavior.
Daniel
2010-10-27 10:16:56
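For anyone who wants the raw host-order bytes without the cast or the union, one commonly suggested alternative (not proposed in this thread, so take it as an assumption) is `std::memcpy`. The sketch below uses a hypothetical helper name, `load_u16_host`, and assumes at least two bytes are available at `pos`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies two bytes into a uint16_t in HOST byte order.
std::uint16_t load_u16_host(const std::vector<unsigned char>& v, std::size_t pos = 0) {
    assert(pos + sizeof(std::uint16_t) <= v.size());  // two bytes must be available
    std::uint16_t value;
    // Copying bytes does not alias the buffer through another type and
    // imposes no alignment requirement on the source address.
    std::memcpy(&value, v.data() + pos, sizeof value);
    return value;
}
```

The value still depends on the machine's endianness, exactly like the `reinterpret_cast` version, so the shift-based approach remains the simpler choice when the byte order of the data is known.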
+3
A:
Please use the shift operator / bit-wise operations.
int t = (v[0] << 8) | v[1];
All the solutions proposed here that are based on casting/unions are AFAIK undefined behavior, and may fail on compilers that take advantage of strict aliasing (e.g. GCC).
Daniel
2010-10-27 10:13:56
Correct me if I'm wrong, but since `v[0]` is an unsigned char, won't the `<< 8` generate a warning if we don't cast it to `int` before?
ereOn
2010-10-28 07:19:43
Types smaller than `int` get promoted to int when used with arithmetic or bitwise operators, so the shift works fine without having to cast v[0] explicitly.
Daniel
2010-10-28 09:53:10
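The promotion described above can be checked at compile time. This is only a small sketch: the names `shift_promotes_to_int` and `combine` are made up for the example, and it assumes the usual case where `int` is wider than `unsigned char`.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// True when shifting an unsigned char yields an int, i.e. the operand was promoted.
constexpr bool shift_promotes_to_int() {
    return std::is_same<decltype(static_cast<unsigned char>(0) << 8), int>::value;
}

// Hypothetical helper: same expression as in the answer, with no explicit cast.
int combine(const std::vector<unsigned char>& v) {
    assert(v.size() >= 2);  // two bytes must be available
    return (v[0] << 8) | v[1];
}
```

Since the shift is performed on an `int`, the high byte is not truncated even when `v[0]` is 0xFF.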