Long:

  long res;
  char long_num[8];
  for(i = 0; i < 8; i++)
    long_num[i] = data[pos++];

  memcpy(&res, long_num, 8);

The values in long_num are as follows:

127 -1 -1 -1 -1 -1 -1 -1

res should be the maximum value of signed long, but is -129 instead.

EDIT: This one is taken care of. It was the result of a communication problem: for the person providing the data, a long is eight bytes; for my C compiler it's four.
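As a footnote, the fixed-width types from <stdint.h> (C99) sidestep that size mismatch. A minimal sketch using the byte values above (still endian-dependent, as the answers below explain):

  #include <stdint.h>
  #include <inttypes.h>
  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
      unsigned char data[8] = { 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF };
      int64_t res;                     /* always eight bytes, unlike long */
      memcpy(&res, data, sizeof res);  /* interpretation still depends on endianness */
      printf("%" PRId64 "\n", res);
      return 0;
  }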

Float:

  float *res;
  /* ... */
  char float_num[4];
  for(i=0; i<4; i++)
    float_num[i] = data[pos++];

  res = (float *)float_num;

The result is zero. The array values are:

62 -1 24 50

I also tried memcpy(), but it yields zero as well. What am I doing wrong?


My system: Linux 2.6.31-16-generic i686 GNU/Linux

+3  A: 

You are running the code on a little-endian system. Reverse the order of bytes in the array and try again:

signed char long_num[] = {-1, -1, -1, -1, -1, -1, -1, 127};
// ...
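
Alternatively (a sketch, not from the answer above), you can assemble the value with shifts from unsigned bytes, reusing the data and pos variables from the question. This yields the same result on any host, assuming the sender transmits the most significant byte first:

unsigned long long res = 0;
int i;
for (i = 0; i < 8; i++)
    res = (res << 8) | (data[pos++] & 0xFFu);  /* endian-independent */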
Mehrdad Afshari
+2  A: 

These are two questions, quite unrelated.

In the first one, your computer is little-endian. The sign bit is set in the long that you piece together, so the result is negative. It is close to zero because most of the high-order bits are set; in two's complement, a value whose top bits are all 1 is a small negative number.

In the second example, a violation of the strict aliasing rules could explain the weird behavior, although I am not sure. If you are using gcc, try a union instead: gcc guarantees what happens when you convert data this way through a union.
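
For illustration, a minimal sketch of the union approach (not from the answer itself), using the byte values from the question:

#include <stdio.h>

int main(void)
{
    /* Type-punning through a union; GCC documents that reading a member
       other than the one last written reinterprets the bytes. */
    union {
        float f;
        char  bytes[4];
    } conv = { .bytes = { 62, -1, 24, 50 } };

    printf("%e\n", conv.f);  /* a tiny positive value, not exactly zero */
    return 0;
}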

Pascal Cuoq
+1  A: 

Given this code:

#include <stdio.h>
#include <string.h>

int main(void)
{
    {
        long res;
        char long_num[8] = { 0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF };
        memcpy(&res, long_num, 8);
        printf("%ld = 0x%lX\n", res, res);
    }
    {
        float res;

        char float_num[4] = { 62, 0xFF, 24, 50 };
        memcpy(&res, float_num, 4);
        printf("%f = %19.14e\n", res, res);

    }
    return 0;
}

Compiling in 64-bit mode on Mac OS X 10.6.4 with GCC 4.5.1 gives:

-129 = 0xFFFFFFFFFFFFFF7F
0.000000 = 8.90559981314709e-09

This is correct for a little-endian Intel machine (well, the 'long' value is correct). Note that the float is not actually zero either: it is roughly 8.9e-09, which "%f" prints as 0.000000 - that is probably why your memcpy() attempt appeared to yield zero as well.

What you are trying to do is a little unusual and not recommended. It is not portable, not least because of endianness issues.

I previously wrote some related code on a SPARC machine (which is a big-endian machine):

#include <stdio.h>
#include <ctype.h>

/* Stand-in for the hex-dump helper called below; the original is not shown,
   so this is a minimal reimplementation with a compatible signature (the
   output formatting is approximated). */
static void image_print(FILE *fp, size_t offset, const char *data, size_t len)
{
    size_t i;
    fprintf(fp, "0x%04zX:", offset);
    for (i = 0; i < len; i++)
        fprintf(fp, " %02X", data[i] & 0xFF);
    fprintf(fp, "   ");
    for (i = 0; i < len; i++)
        putc(isprint((unsigned char)data[i]) ? data[i] : '.', fp);
    putc('\n', fp);
}

union u_double
{
    double  dbl;
    char    data[sizeof(double)];
};

union u_float
{
    float   flt;
    char    data[sizeof(float)];
};

static void dump_float(union u_float f)
{
    int exp;
    long mant;

    printf("32-bit float: sign: %d, ", (f.data[0] & 0x80) >> 7);
    exp = ((f.data[0] & 0x7F) << 1) | ((f.data[1] & 0x80) >> 7);
    printf("expt: %4d (unbiassed %5d), ", exp, exp - 127);
    mant = ((((f.data[1] & 0x7F) << 8) | (f.data[2] & 0xFF)) << 8) | (f.data[3] & 0xFF);
    printf("mant: %16ld (0x%06lX)\n", mant, mant);
}

static void dump_double(union u_double d)
{
    int exp;
    long long mant;

    printf("64-bit float: sign: %d, ", (d.data[0] & 0x80) >> 7);
    exp = ((d.data[0] & 0x7F) << 4) | ((d.data[1] & 0xF0) >> 4);
    printf("expt: %4d (unbiassed %5d), ", exp, exp - 1023);
    mant = ((((d.data[1] & 0x0F) << 8) | (d.data[2] & 0xFF)) << 8) | (d.data[3] & 0xFF);
    /* Caution: the right-hand operand of the first '|' below is an int
       expression; when its top bit is set, it is sign-extended on conversion
       to long long, which is visible in the 1e+126 mantissa in the output. */
    mant = (mant << 32) | ((((((d.data[4] & 0xFF) << 8) | (d.data[5] & 0xFF)) << 8) | (d.data[6] & 0xFF)) << 8) | (d.data[7] & 0xFF);
    printf("mant: %16lld (0x%013llX)\n", mant, mant);
}

static void print_value(double v)
{
    union u_double d;
    union u_float  f;

    f.flt = v;
    d.dbl = v;

    printf("SPARC: float/double of %g\n", v);
    image_print(stdout, 0, f.data, sizeof(f.data));
    image_print(stdout, 0, d.data, sizeof(d.data));
    dump_float(f);
    dump_double(d);
}


int main(void)
{
    print_value(+1.0);
    print_value(+2.0);
    print_value(+3.0);
    print_value( 0.0);
    print_value(-3.0);
    print_value(+3.1415926535897932);
    print_value(+1e126);
    return(0);
}

This is what I got on that platform. Note that there is an implicit '1' bit in the mantissa, so the value 3 (binary 1.1 x 2^1) shows only a single stored mantissa bit: the leading 1-bit is implied, not stored.

SPARC: float/double of 1
0x0000: 3F 80 00 00                                       ?...
0x0000: 3F F0 00 00 00 00 00 00                           ?.......
32-bit float: sign: 0, expt:  127 (unbiassed     0), mant:                0 (0x000000)
64-bit float: sign: 0, expt: 1023 (unbiassed     0), mant:                0 (0x0000000000000)
SPARC: float/double of 2
0x0000: 40 00 00 00                                       @...
0x0000: 40 00 00 00 00 00 00 00                           @.......
32-bit float: sign: 0, expt:  128 (unbiassed     1), mant:                0 (0x000000)
64-bit float: sign: 0, expt: 1024 (unbiassed     1), mant:                0 (0x0000000000000)
SPARC: float/double of 3
0x0000: 40 40 00 00                                       @@..
0x0000: 40 08 00 00 00 00 00 00                           @.......
32-bit float: sign: 0, expt:  128 (unbiassed     1), mant:          4194304 (0x400000)
64-bit float: sign: 0, expt: 1024 (unbiassed     1), mant: 2251799813685248 (0x8000000000000)
SPARC: float/double of 0
0x0000: 00 00 00 00                                       ....
0x0000: 00 00 00 00 00 00 00 00                           ........
32-bit float: sign: 0, expt:    0 (unbiassed  -127), mant:                0 (0x000000)
64-bit float: sign: 0, expt:    0 (unbiassed -1023), mant:                0 (0x0000000000000)
SPARC: float/double of -3
0x0000: C0 40 00 00                                       .@..
0x0000: C0 08 00 00 00 00 00 00                           ........
32-bit float: sign: 1, expt:  128 (unbiassed     1), mant:          4194304 (0x400000)
64-bit float: sign: 1, expt: 1024 (unbiassed     1), mant: 2251799813685248 (0x8000000000000)
SPARC: float/double of 3.14159
0x0000: 40 49 0F DB                                       @I..
0x0000: 40 09 21 FB 54 44 2D 18                           @.!.TD-.
32-bit float: sign: 0, expt:  128 (unbiassed     1), mant:          4788187 (0x490FDB)
64-bit float: sign: 0, expt: 1024 (unbiassed     1), mant: 2570638124657944 (0x921FB54442D18)
SPARC: float/double of 1e+126
0x0000: 7F 80 00 00                                       ....
0x0000: 5A 17 A2 EC C4 14 A0 3F                           Z......?
32-bit float: sign: 0, expt:  255 (unbiassed   128), mant:                0 (0x000000)
64-bit float: sign: 0, expt: 1441 (unbiassed   418), mant:      -1005281217 (0xFFFFFFFFC414A03F)

You'd have to do some diddling with the byte indexing to make this code work sanely on a little-endian (Intel) machine.
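
One way to do that diddling (a sketch, assuming only big- and little-endian hosts need to be handled) is to detect the host order at run time and index the bytes from the most significant end:

static int host_is_little_endian(void)
{
    unsigned one = 1;
    return *(unsigned char *)&one == 1;  /* first byte of 1 is 1 on LE */
}

/* Byte i of the object, counting from the most significant byte. */
static unsigned char msb_byte(const char *data, size_t size, size_t i)
{
    return (unsigned char)(host_is_little_endian() ? data[size - 1 - i] : data[i]);
}

dump_float() would then read msb_byte(f.data, sizeof(f.data), 0) wherever it currently reads f.data[0], and so on.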

Jonathan Leffler
A: 

If you are communicating over a network between different machines (as the update implies), you have to define your protocol to ensure that both ends know how to get the data accurately to the other end. That is not necessarily trivial; many schemes are in use around the world. Some of the standard approaches:

  • One standard method is to define a canonical byte order and a canonical size for each type. The canonical order is often called 'network byte order' (big-endian) when dealing with IPv4 addresses, for example. This pins down both the endianness of the data and whether a given value is sent as a 4-byte or an 8-byte quantity (see the sketch after this list).

  • Another technique is based on ASN.1, which encodes the data with a type, a length, and a value (TLV encoding). Each piece of data is sent with information that identifies what is being sent.

  • The DRDA protocol used by IBM's DB2 DBMS has a different policy: 'receiver makes right'. The sender identifies what sort of machine it is when the session starts and then sends the data in its own most convenient format; the receiver is responsible for converting what was sent. (This applies to both the DB server and the DB client: the client sends in its preferred notation and the server converts what it receives, while the server sends in its preferred notation and the client converts what it receives.)

  • Another extremely effective way of dealing with the problem is to use a textual protocol. The data is transmitted as the text form of the values, with a clear mechanism for identifying the different fields. This is much easier to debug than a binary encoding because you can dump the data and see what is going on, and it is not necessarily much less efficient - especially if the 8-byte integers you send typically contain single-digit values.
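
As a sketch of the first approach, the standard htonl()/ntohl() routines from POSIX <arpa/inet.h> convert 32-bit values to and from network byte order; pairing them with memcpy() avoids alignment and aliasing problems:

#include <arpa/inet.h>  /* htonl, ntohl */
#include <stdint.h>
#include <string.h>

/* Sender: place a 32-bit value on the wire in network (big-endian) order. */
static void put_u32(unsigned char *buf, uint32_t value)
{
    uint32_t wire = htonl(value);
    memcpy(buf, &wire, sizeof wire);
}

/* Receiver: rebuild the value regardless of host byte order. */
static uint32_t get_u32(const unsigned char *buf)
{
    uint32_t wire;
    memcpy(&wire, buf, sizeof wire);
    return ntohl(wire);
}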

Jonathan Leffler