views:

110

answers:

6

I have a vector<unsigned char> filled with binary data. I need to take, let's say, 2 items from the vector (2 bytes) and convert them to an integer. How can this be done in a way that is not C style?

+5  A: 

v[0]*0x100+v[1]

Or the other way around if we're talking little endian.
PigBen
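For illustration, the two byte orders mentioned in this answer could be sketched as below. The helper names `read_be16` and `read_le16` are made up for the example, and the caller is assumed to guarantee that two bytes are available at `pos`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers (names are illustrative, not from the thread).
// Both assume pos + 1 < v.size(); no bounds checking is done.

// First byte is the most significant one (big endian).
inline std::uint16_t read_be16(const std::vector<unsigned char>& v, std::size_t pos) {
    return static_cast<std::uint16_t>(v[pos] * 0x100 + v[pos + 1]);
}

// First byte is the least significant one (little endian).
inline std::uint16_t read_le16(const std::vector<unsigned char>& v, std::size_t pos) {
    return static_cast<std::uint16_t>(v[pos + 1] * 0x100 + v[pos]);
}
```

Which of the two is right depends only on how the data was written, not on the machine running the code.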
Won't work. `v[0]` is an unsigned char. You must first cast it to an integer so that the `* 0x100` doesn't overflow.
ereOn
@ereOn -- It will work, because 0x100 is an int.
PigBen
@PigBen: My bad. You're absolutely right.
ereOn
v[0] will be implicitly promoted to signed int. C standard section 3.2.1.1.
Is this necessarily any more C++ than C? And anyway, it's a suboptimal version of the bitwise operations (i.e. the above will *most likely* be optimized by the compiler to the bitwise operations)! If the OP is after a class (akin to ByteBuffer in Java nio), that's a different issue...
Nim
Okay, okay, I'll give you a point.
@user434507, it's not about points, I'm just trying to highlight to you that you shouldn't ignore part of the language just because it's perceived to be "C style" whatever that is... If you are explicitly after a class to serialize/deserialize data properly from a binary stream, look at boost's serialization library - it's very powerful.
Nim
+6  A: 

You may do:

vector<unsigned char> somevector;
// Suppose it is initialized and big enough to hold a uint16_t

int i = *reinterpret_cast<const uint16_t*>(&somevector[0]);
// But you must be sure of the byte order

// or
int i2 = (static_cast<int>(somevector[0]) << 8) | somevector[1];
// But you must be sure of the byte order as well
ereOn
You can use ntohs() to make it independent of big/little endian
Benoit Thiery
On some platforms, the first version will generate an exception if you try to do it at an odd offset. On many (including most modern Intels, I think), an odd offset will involve a performance penalty. If the performance is crucial, it will be up to the programmer to determine the faster way.
To be honest, a reinterpret_cast is more "C style" than the bitwise operations, it's just hidden by a C++ template! ;) What's that expression about a turd? ;) I would always go with the bitwise operations, it's frankly clearer!
Nim
+2  A: 

What do you mean by "not in C style"? Using bitwise operations (shifts and ors) to get this to work does not imply it's "C style"!

What's wrong with: int t = v[0]; t = (t << 8) | v[1]; ?

Nim
+1  A: 

If you don't want to care about big/little endian, you can use:

vector<unsigned char> somevector;
// Suppose it is initialized and big enough to hold a uint16_t

int i = ntohs(*reinterpret_cast<const uint16_t*>(&somevector[0]));
Benoit Thiery
But who says they are in network order?
Basilevs
A: 

Casts are a bad thing to do.

template <class T>
long extract(typename T::const_iterator & input) {
  typedef typename T::value_type value_type;
  union {
    long rv;
    value_type fields[2];
  } temp;
  temp.fields[0] = *input++;
  temp.fields[1] = *input++;
  return temp.rv;
}
Basilevs
Writing to one field of a union and then reading from another is a bad thing to do: it's undefined behavior.
Daniel
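A possible way to keep the iterator-advancing interface without the union is sketched below. The name `extract_be` and the `count` parameter are made up for the example; it assumes the bytes are stored most significant first, that `count` does not exceed `sizeof(unsigned long)`, and that the caller has checked that enough bytes remain.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical replacement for extract(): reads `count` bytes, most
// significant first, and advances the iterator past them.
// Assumes count <= sizeof(unsigned long) and that enough bytes remain.
template <class Iterator>
unsigned long extract_be(Iterator& input, std::size_t count) {
    unsigned long result = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Shift what was read so far and append the next byte.
        result = (result << 8) | static_cast<unsigned char>(*input++);
    }
    return result;
}
```

Because the iterator type is a deduced template parameter, this can be called as `extract_be(it, 2)` without naming the container type.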
+3  A: 

Please use the shift operator / bit-wise operations.

int t = (v[0] << 8) | v[1];

All the solutions proposed here that are based on casting/unions are AFAIK undefined behavior, and may fail on compilers that take advantage of strict aliasing (e.g. GCC).

Daniel
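If the bytes really should be reinterpreted in the machine's own byte order, one commonly used alternative to the pointer cast is `std::memcpy`, which does not run into aliasing or alignment problems. This is only a sketch: the helper name `load_u16_host` is made up, and it assumes at least two bytes are available at `pos`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies two bytes into a uint16_t without any
// pointer cast. The result is in HOST byte order, so it differs between
// little-endian and big-endian machines.
// Assumes pos + 2 <= v.size(); no bounds checking is done.
inline std::uint16_t load_u16_host(const std::vector<unsigned char>& v, std::size_t pos) {
    std::uint16_t value;
    std::memcpy(&value, v.data() + pos, sizeof value);
    return value;
}
```

Unlike the shift version, this still leaves the byte-order question open, so the shifts remain the simpler choice when the data has a fixed byte order.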
Correct me if I'm wrong, but since `v[0]` is an unsigned char, won't the `<< 8` generate a warning if we don't cast it to `int` first?
ereOn
This works perfectly with G++. Even with the vector iterators.
Hitman_99
Types smaller than `int` get promoted to int when used with arithmetic or bitwise operators, so the shift works fine without having to cast v[0] explicitly.
Daniel
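The promotion described in the comment above can be checked at compile time. The sketch below assumes a platform where `int` is wider than 16 bits and can represent every `unsigned char` value (true on mainstream platforms); the function name `combine` is made up for the example.

```cpp
#include <type_traits>
#include <vector>

// An unsigned char operand is promoted to int before << and * are applied,
// so the result type of both expressions is int (on the assumed platform).
static_assert(std::is_same<decltype(static_cast<unsigned char>(1) << 8), int>::value,
              "unsigned char << int yields int");
static_assert(std::is_same<decltype(static_cast<unsigned char>(1) * 0x100), int>::value,
              "unsigned char * int yields int");

// Hypothetical helper: combines the first two bytes, most significant first.
// Assumes v.size() >= 2 and that int is wider than 16 bits.
inline int combine(const std::vector<unsigned char>& v) {
    return (v[0] << 8) | v[1];
}
```

No explicit cast is needed for the two-byte case; a cast to a wider unsigned type only becomes necessary once the shifted value could exceed the range of `int`.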