The man pages for htonl() seem to suggest that you can only use it for up to 32 bit values. (In reality, ntohl() is defined for unsigned long, which on my platform is 32 bits. I suppose if the unsigned long were 8 bytes, it would work for 64 bit ints).

My problem is that I need to convert 64 bit integers (in my case, this is an unsigned long long) from big endian to little endian. Right now, I need to do that specific conversion. But it would be even nicer if the function (like ntohl()) would NOT convert my 64 bit value if the target platform WAS big endian. (I'd rather avoid adding my own preprocessor magic to do this).

What can I use? I would like something that is standard if it exists, but I am open to implementation suggestions. I have seen this type of conversion done in the past using unions. I suppose I could have a union with an unsigned long long and a char[8]. Then swap the bytes around accordingly. (Obviously would break on platforms that were big endian).

Thanks in advance :).

+3  A: 

To detect your endian-ness, use the following union:

union {
    unsigned long long ull;
    char c[8];
} x;
x.ull = 0x0123456789abcdef; // may need special suffix for ULL.

Then you can check the contents of x.c[] to detect where each byte went.

To do the conversion, I would use that detection code once to see what endian-ness the platform is using, then write my own function to do the swaps.

You could make it dynamic so that the code will run on any platform (detect once then use a switch inside your conversion code to choose the right conversion) but, if you're only going to be using one platform, I'd just do the detection once in a separate program then code up a simple conversion routine, making sure you document that it only runs (or has been tested) on that platform.

Here's some sample code I whipped up to illustrate it. It's been tested, though not thoroughly, but it should be enough to get you started.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TYP_INIT 0
#define TYP_SMLE 1
#define TYP_BIGE 2

static unsigned long long cvt(unsigned long long src) {
    static int typ = TYP_INIT;
    unsigned char c;
    union {
        unsigned long long ull;
        unsigned char c[8];
    } x;

    if (typ == TYP_INIT) {
        x.ull = 0x01;
        typ = (x.c[7] == 0x01) ? TYP_BIGE : TYP_SMLE;
    }

    if (typ == TYP_SMLE)
        return src;

    x.ull = src;
    c = x.c[0]; x.c[0] = x.c[7]; x.c[7] = c;
    c = x.c[1]; x.c[1] = x.c[6]; x.c[6] = c;
    c = x.c[2]; x.c[2] = x.c[5]; x.c[5] = c;
    c = x.c[3]; x.c[3] = x.c[4]; x.c[4] = c;
    return x.ull;
}

int main (void) {
    unsigned long long ull = 1;
    ull = cvt (ull);
    printf ("%llu\n",ull);
    return 0;
}

Keep in mind that this just checks for pure big/little endian. If you have some weird variant where the bytes are stored in, for example, {5,2,3,1,0,7,6,4} order, cvt() will be a tad more complex. Such an architecture doesn't deserve to exist, but I'm not discounting the lunacy of our friends in the microprocessor industry :-)
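If the big-endian value arrives as raw bytes (from a file or a socket, say), the detection step can be skipped altogether. Shifts work on values rather than on memory layout, so assembling the number byte by byte gives the same result on any host, odd byte orders included. A minimal sketch, assuming the input is an 8-byte buffer with the most significant byte first; load_be64 is a made-up name, not a standard function:

```c
#include <stdint.h>

/* Hypothetical helper: build a host-order 64-bit value from 8 bytes
 * stored most-significant-byte first. No endianness detection needed,
 * because only arithmetic shifts are used. */
uint64_t load_be64(const unsigned char *buf)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | buf[i];   /* append the next byte at the low end */
    return v;
}
```

The trade-off is that this reads from a byte buffer rather than converting an integer that is already in a variable, so it fits best at the point where the data is first read in.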

paxdiablo
"may need special suffix for ULL" - and neither C89 nor C++ defines one that's portable. However, you can do x.ull = (((unsigned long long) 0x01234567) << 32) + 0x89abcdef; provided that long long really is 64bit. (The extra parentheses matter: + binds tighter than <<.)
Steve Jessop
Thanks, onebyone, I ended up just using 0x01 and detecting that.
paxdiablo
Actually "return src" should be done for big-endian architectures, not little-endian. Also, a more concise way to do the conversion on a little-endian CPU would be to compute the upper 32 bits of the result by using htonl() on the lower 32 bits of src and the lower 32 bits of the result by using htonl() on the upper 32 bits of src (hope that makes some sense...).
Lance Richardson
That's not right, is it, Lance? The question asked for the value in little endian - that means leave it alone on little-endian systems and swap it on big-endian systems.
paxdiablo
Lance is right... I am on a little endian system and I am dealing with a big endian number. I like that suggestion, Lance. You should write it up as an answer so it can be upvoted, and possibly accepted :). I will make a decision soon on what to do, then post more and hand out reputation.
Tom
The htonl() and related functions are intended to convert from host CPU byte order to network byte order. Network byte order is big-endian, so a conversion has to be performed on little-endian machines. The question was consistent with this - it stated that the function should "NOT convert my 64 bit value if the target platform WAS big endian".
Lance Richardson
Tom - I'd be happy to have an up-voted comment (I'm not sure whether that impacts rep or not...)
Lance Richardson
I realize I made a mistake in my post... I meant to say ntohl, not htonl... ::sigh:: sorry for the confusion. But I think it was clear from the context what I was looking for. I'll update the post now.
Tom
Tom, if the code given should be changed based on your latest update, it's a simple matter of changing "if (typ == TYP_SMLE)" to "if (typ == TYP_BIGE)" - this will change when the swap is done.
paxdiablo
+5  A: 

Some BSD systems have betoh64, which does what you need.
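A rough sketch of how this might be used across platforms. The header location and even the spelling vary: as far as I know glibc and FreeBSD call it be64toh(), some older BSD releases call it betoh64(), and on Mac OS X the closest equivalent is OSSwapBigToHostInt64() from <libkern/OSByteOrder.h>. The platform checks below are assumptions and will likely need adjusting; read_be64 is a made-up wrapper name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* glibc: expose be64toh() under strict -std= modes */
#endif
#include <stdint.h>
#include <string.h>

#if defined(__APPLE__)
#  include <libkern/OSByteOrder.h>
#  define be64toh(x) OSSwapBigToHostInt64(x)
#elif defined(__FreeBSD__) || defined(__NetBSD__) || defined(__DragonFly__)
#  include <sys/endian.h>
#else
#  include <endian.h>            /* glibc and others; adjust for your platform */
#endif

/* Hypothetical wrapper: interpret 8 raw bytes as a big-endian number
 * and return it in host byte order. be64toh() is a no-op on
 * big-endian hosts and a byte swap on little-endian ones. */
uint64_t read_be64(const unsigned char *buf)
{
    uint64_t raw;

    memcpy(&raw, buf, sizeof raw);   /* avoids alignment and aliasing trouble */
    return be64toh(raw);
}
```

This does mean a little preprocessor work to pick the header, but the conversion itself is left to the system.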

Francis
Linux (glibc) too. It's found in the <endian.h> header.
ephemient
Hmm... I can't find the function in any of the endian.h headers. I am on my Intel Mac right now (running Leopard). I also need to get this to work on Linux machines at school. I'm not sure which distro is running, but I am fairly certain they are i386 machines, little endian, and sizeof(unsigned long long) == 8. Also, the function I would need is be64toh(). Any suggestions? I would prefer this solution to the other one.
Tom
My fault - what you want should be betoh64. On FreeBSD, it's in /usr/include/sys/endian.h. The man page is byteorder(9). According to the FreeBSD notes, these were originally from NetBSD, and appear on FreeBSD after 5.x. As far as I know, Mac OS X uses lots of FreeBSD files as its backend (Darwin) base, so there's a good chance that you may be able to use it.
Francis
@Francis: My sources indicate that it is present even in 4.3BSD. @Tom: Autoconf looks for endian.h, sys/endian.h, and machine/endian.h; you may have to use different include paths on different platforms.
ephemient
+1  A: 

I like the union answer, pretty neat. Typically I just bit shift to convert between little and big endian, although I think the union solution has fewer assignments and may be faster:

//note UINT64_C_LITERAL is a macro that appends the correct suffix
//for the literal on that platform
inline void endianFlip(unsigned long long& Value)
{
   Value=
   ((Value &   UINT64_C_LITERAL(0x00000000000000FF)) << 56) |
   ((Value &   UINT64_C_LITERAL(0x000000000000FF00)) << 40) |
   ((Value &   UINT64_C_LITERAL(0x0000000000FF0000)) << 24) |
   ((Value &   UINT64_C_LITERAL(0x00000000FF000000)) << 8)  |
   ((Value &   UINT64_C_LITERAL(0x000000FF00000000)) >> 8)  | 
   ((Value &   UINT64_C_LITERAL(0x0000FF0000000000)) >> 24) |
   ((Value &   UINT64_C_LITERAL(0x00FF000000000000)) >> 40) |
   ((Value &   UINT64_C_LITERAL(0xFF00000000000000)) >> 56);
}

Then, to detect whether you even need to flip without macro magic, you can do something similar to Pax's approach: a short assigned 0x0001 on one system will read as 0x0100 on a system of the opposite endianness.

So:

unsigned long long numberToSystemEndian
(
    unsigned long long In, 
    unsigned short SourceEndian
)
{
   if (SourceEndian != 1)
   {
      //from an opposite endian system
      endianFlip(In);
   }
   return In;
}

So to use this, you'd need SourceEndian to be an indicator to communicate the endianness of the input number. This could be stored in the file (if this is a serialization problem), or communicated over the network (if it's a network serialization issue).
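To make the marker idea concrete, here is a small sketch in plain C. It assumes a made-up message layout: the sender writes an unsigned 16-bit 1 in its own byte order, followed by the 64-bit value in that same byte order. The names flip64 and decode_with_marker are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 64-bit value using shifts only. */
static uint64_t flip64(uint64_t v)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xFF);   /* move the lowest byte of v onto r */
        v >>= 8;
    }
    return r;
}

/* Assumed layout: bytes 0-1 hold the marker (the sender's 1),
 * bytes 2-9 hold the value, both in the sender's byte order. */
uint64_t decode_with_marker(const unsigned char *msg)
{
    uint16_t marker;
    uint64_t value;

    memcpy(&marker, msg, sizeof marker);
    memcpy(&value, msg + sizeof marker, sizeof value);

    if (marker != 1)                 /* reads as 0x0100: opposite endianness */
        value = flip64(value);
    return value;
}
```

The receiver never needs to know its own endianness, only whether the sender's differs from it.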

Snazzer
A: 

Take a look at my reply at this other question

winden
+1  A: 

An easy way would be to use htonl/ntohl on the two halves separately:

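#include <stdint.h>
#include <arpa/inet.h>

/* The halves are uint32_t rather than unsigned long: on LP64 platforms
   unsigned long is 64 bits wide and the union would no longer line up. */
unsigned long long htonll(unsigned long long v) {
    union { uint32_t lv[2]; unsigned long long llv; } u;
    u.lv[0] = htonl((uint32_t)(v >> 32));           /* high half goes first */
    u.lv[1] = htonl((uint32_t)(v & 0xFFFFFFFFULL)); /* low half goes second */
    return u.llv;
}

unsigned long long ntohll(unsigned long long v) {
    union { uint32_t lv[2]; unsigned long long llv; } u;
    u.llv = v;
    return ((unsigned long long)ntohl(u.lv[0]) << 32) | (unsigned long long)ntohl(u.lv[1]);
}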
bdonlan
Your first function is htonll and uses ntohl() internally. Both functions are interchangeable, correct? If so, why are they implemented differently?
Marius
Oops, fixed. Strictly speaking, there are other options for endianness than big or little-endian - while you don't see them much anymore, on some very old systems, `htonl()` and `ntohl()` might behave differently.
bdonlan
A: 

How about:

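/* Only correct on little-endian hosts: on a big-endian host ntohl() is a
   no-op, so this would swap the two 32-bit halves instead of doing nothing. */
#define ntohll(x) ( ( (uint64_t)ntohl( (uint32_t)((uint64_t)(x) & 0xFFFFFFFFu) ) << 32 ) | \
    ntohl( (uint32_t)((uint64_t)(x) >> 32) ) )
#define htonll(x) ntohll(x)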
#defines are evil and should be avoided whenever possible. They have their place, but this is not one of them.
Marius
Who says defines are evil? I read this for the first time after doing C programming for over ten years. If you forbid defines tomorrow, pretty much no Linux software in the world will build any longer. There is nothing evil about defines and unlike an inline function, they guarantee inlining regardless of C compiler being used.
Mecki