I am trying to write a server that will communicate with any standard client that can make socket connections (e.g. a telnet client).

It started out as an echo server, which of course did not need to worry about network byte ordering.

I am familiar with the ntohs, ntohl, htons, and htonl functions. These would be great by themselves if I were transferring either 16-bit or 32-bit ints, or if the characters in the string being sent were multiples of 2 or 4 bytes.

I'd like to create a function that operates on strings, such as:

str_ntoh(char* net_str, char* host_str, int len)
{
    uint32_t* netp, hostp;
    netp = (uint32_t*)&net_str;
    for(i=0; i < len/4; i++){
         hostp[i] = ntoh(netp[i]);
    }
}

Or something similar. The above assumes that the word size is 32 bits. We can't be sure that the word size on the sending machine is not 16 bits or 64 bits, right?

For client programs, such as telnet, they must be using hton* before they send and ntoh* after they receive data, correct?

EDIT: For the people who think that because one char is one byte, endianness doesn't matter:

int main(void)
{
    uint32_t a = 0x01020304;
    char* c = (char*)&a;
    printf("%x %x %x %x\n", c[0], c[1], c[2], c[3]);

}

Run this snippet of code. The output for me is as follows:

$ ./a.out
  4 3 2 1

Those on PowerPC chipsets should get '1 2 3 4', but those of us on Intel chipsets should, for the most part, see what I got above.

+7  A: 

Maybe I'm missing something here, but are you sending strings, that is, sequences of characters? Then you don't need to worry about byte order. That is only for the bit pattern in integers. The characters in a string are always in the "right" order.

EDIT:

Derrick, to address your code example, I've run the following (slightly expanded) version of your program on an Intel i7 (little-endian) and on an old Sun SPARC (big-endian):

#include <stdio.h>
#include <stdint.h> 

int main(void)
{
    uint32_t a = 0x01020304;
    char* c = (char*)&a;
    char d[] = { 1, 2, 3, 4 };
    printf("The integer: %x %x %x %x\n", c[0], c[1], c[2], c[3]);
    printf("The string:  %x %x %x %x\n", d[0], d[1], d[2], d[3]);
    return 0;
}

As you can see, I've added a real char array to your print-out of an integer.

The output from the little-endian Intel i7:

The integer: 4 3 2 1
The string:  1 2 3 4

And the output from the big-endian Sun:

The integer: 1 2 3 4
The string:  1 2 3 4

Your multi-byte integer is indeed stored in different byte order on the two machines, but the characters in the char array have the same order.
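
To tie this back to htonl, here is a minimal sketch of what the conversion does to those four bytes. The helper names host_is_little_endian and wire_bytes_of_u32 are made up for this illustration, and <arpa/inet.h> assumes a POSIX system. It is only meant to show that htonl puts the most significant byte first on either kind of machine, while a char array has nothing comparable to rearrange.

```c
#include <arpa/inet.h>  /* htonl (POSIX; an assumption about the platform) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the host stores the least
   significant byte of an integer at the lowest address. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: converts value to network byte order and copies
   its four in-memory bytes to out[0..3], in the order they would be
   written to a socket. */
void wire_bytes_of_u32(uint32_t value, unsigned char out[4])
{
    const uint32_t net = htonl(value);
    memcpy(out, &net, sizeof net);
}
```

Calling wire_bytes_of_u32(0x01020304, out) should leave 1, 2, 3, 4 in out on both the i7 and the Sparc, because htonl swaps on the little-endian host and does nothing on the big-endian one.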

Thomas Padron-McCarthy
Strings are sequences of characters, yes. Sending this data over the network between two computers of the same endianness would not matter. However, if you did not do any byte-order conversions, and did something like char* str = "abcd"; and sent it from a little-endian machine, then received it on a big-endian one, when you addressed str[0] it would be d, and not a. http://stackoverflow.com/questions/526030/byte-order-with-a-large-array-of-characters-in-c
Derrick
@Derrick: No, that's wrong. With strings, the first character will always be in the first position, and so on. It's not like multi-byte integers.
Thomas Padron-McCarthy
@Derrick: To illustrate Thomas's point... Let's say you had an array of integers, { 0xaabb, 0xccdd }. Take this to a machine of different endianness and the order of bytes within each integer gets swapped, giving 0xbbaa, 0xddcc. However, the order of the integers within the array doesn't change. So it's { 0xbbaa, 0xddcc } and not { 0xddcc, 0xbbaa }. Now imagine these are 8-bit integers instead of 16-bit ones. If you had an array {0xaa, 0xbb, 0xcc, 0xdd}, within an array element (0xaa) there are no bytes to swap, because it is a single byte. And you wouldn't swap the individual bytes with each other, because that would change the order of the array.
asveikau
Really? Run this: int main(void){ uint32_t a = 0x01020304; char* c = (char*)&a; printf("%x %x %x %x\n", c[0], c[1], c[2], c[3]); } When I run it I get 4 3 2 1
Derrick
@Derrick - That's a 32-bit integer, not an array of bytes. If you declare: char a[] = {1,2,3,4}; it will always be in the same order.
asveikau
+2  A: 

If you'd like to send them as an 8-bit encoding (the fact that you're using char implies this is what you want), there's no need to byte swap. However, there is the unrelated issue of non-ASCII characters: so that the same character > 127 appears the same on both ends of the connection, I would suggest that you send the data in something like UTF-8, which can represent all Unicode characters and can be safely treated as ASCII strings. The way to get UTF-8 text from the default encoding varies by the platform and set of libraries you're using.

If you're sending a 16-bit or 32-bit encoding... You can include one character with the byte order mark, which the other end can use to determine the endianness of the characters. Or, you can assume network byte order and use htons() or htonl() as you suggest. But if you'd like to use char, please see the previous paragraph. :-)
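
For the "assume network byte order" option with a 16-bit encoding, a rough sketch could look like the following. The function names utf16_hton and utf16_ntoh are invented for this example, and <arpa/inet.h> assumes a POSIX system. Note that it swaps bytes within each 16-bit code unit only; the order of the code units in the array is left alone.

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX; an assumption about the platform) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n 16-bit code units from host order to
   network (big-endian) order before sending. */
void utf16_hton(const uint16_t *host, uint16_t *net, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        net[i] = htons(host[i]);
}

/* Hypothetical helper: the reverse conversion after receiving. */
void utf16_ntoh(const uint16_t *net, uint16_t *host, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        host[i] = ntohs(net[i]);
}
```

The sender would call utf16_hton just before writing the buffer to the socket, and the receiver would call utf16_ntoh on what it read. Both sides have to agree on this convention, since nothing in the data itself announces it.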

asveikau
Should be noted that UTF-8 isn't affected by byte order. UTF-16 and UTF-32 are.
Artelius
+3  A: 

With your function signature as posted, you don't have to worry about byte order. It accepts a char*, which can only handle 8-bit characters. With one byte per character, you cannot have a byte order problem.

You'd only run into a byte order problem if you send Unicode in UTF-16 or UTF-32 encoding and the endianness of the sending machine doesn't match that of the receiving machine. The simple solution for that is to use UTF-8 encoding, which is what most text is sent as across networks. Being byte-oriented, it doesn't have a byte order issue either. Or you could send a BOM.
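
If you go the BOM route, the receiver has to look at the first bytes it gets. A minimal sketch of that check is below; the enum and the function name detect_bom are made up for illustration. The byte values are the encodings of U+FEFF in UTF-8 and in the two UTF-16 byte orders. UTF-32 is deliberately left out to keep it short, so a UTF-32 little-endian BOM (FF FE 00 00) would be misreported as UTF-16 little-endian here.

```c
#include <stddef.h>

/* Hypothetical result type for this sketch. */
enum bom_kind { BOM_NONE, BOM_UTF8, BOM_UTF16_BE, BOM_UTF16_LE };

/* Hypothetical helper: inspect the first bytes of a received buffer and
   report which byte order mark, if any, it starts with.
   UTF-32 BOMs are not handled in this sketch. */
enum bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;      /* U+FEFF encoded as UTF-8 */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;  /* most significant byte first */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;  /* least significant byte first */
    return BOM_NONE;
}
```

Plain ASCII or UTF-8 text without a BOM comes back as BOM_NONE, which is the case where the bytes can simply be used in the order they arrived.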

Hans Passant
A: 

It seems to me that the function prototype doesn't match its behavior. You're passing in a char *, but you're then casting it to uint32_t *. And, looking more closely, you're casting the address of the pointer, rather than the contents, so I'm concerned that you'll get unexpected results. Perhaps the following would work better:

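#include <stdint.h>
#include <arpa/inet.h>  /* ntohl */

void arr_ntoh(const uint32_t* netp, uint32_t* hostp, int len)
{
    int i;
    for (i = 0; i < len; i++)
        hostp[i] = ntohl(netp[i]);  /* 32-bit network to host order */
}

I'm basing this on the assumption that what you've really got is an array of uint32_t and you want to run ntohl() on all of them. Note that len here is the number of 32-bit elements, not the number of bytes.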
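
If the data really does arrive in a char buffer (as it would from recv), one way to avoid the pointer cast altogether is to copy each group of four bytes into a uint32_t first. This is only a sketch under that assumption; the name decode_u32_array is invented for the example, and <arpa/inet.h> assumes a POSIX system.

```c
#include <arpa/inet.h>  /* ntohl (POSIX; an assumption about the platform) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode count 32-bit integers, sent in network
   byte order, from a raw receive buffer into host-order values.
   net_buf must hold at least count * 4 bytes. */
void decode_u32_array(const char *net_buf, uint32_t *host, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        uint32_t word;
        /* memcpy sidesteps alignment problems that casting
           a char* to a uint32_t* could cause. */
        memcpy(&word, net_buf + i * sizeof word, sizeof word);
        host[i] = ntohl(word);
    }
}
```

Given the eight bytes 00 00 00 01 01 02 03 04 in the buffer, this should produce the values 1 and 0x01020304 regardless of the endianness of the receiving host.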

I hope this is helpful.

Bob Jarvis