Our application is a C server (this problem is in the Windows port of said server) that communicates with a Windows Java client. In this particular instance we are sending data to the client; the message consists of a 7-byte header where the first 3 bytes each have a specific meaning (op type, flags, etc.) and the last 4 bytes contain the size of the rest of the message. For some reason I absolutely can't figure out, the third byte in the header is somehow changing: if I put a breakpoint on the send() I can see that the third byte is what I'm expecting (0xfe), but when I check in the client, that byte is set to 0. Every other byte is fine. I did some traffic capturing with Wireshark and saw that the byte was already 0 leaving the server, which I find even more baffling. The third byte is set via a define, à la:

    #define GET_TOP_FRAME   0xfe

Some testing I did that further confuses the issue:

  1. I changed the value from using the define to the literals 0x64, 0xff, and 0xfd in turn: all three came across to the client intact.
  2. I changed the value from using the define to the literal 0xfe itself: the value was zero at the client.
  3. I changed the value of the define itself from 0xfe to 0xef: the value was zero at the client.

Nothing about this makes a lick of sense. The code goes through several levels of functions, but here is most of the core code:

    int nbytes;                 /* message size in network byte order */
    static int sendsize = 7;
    unsigned char tbuffer[7];

    tbuffer[0] = protocolByte;
    tbuffer[1] = op;
    tbuffer[2] = GET_TOP_FRAME;
    nbytes = htonl(bytes);
    memcpy(tbuffer + 3, &nbytes, JAVA_INT);   /* JAVA_INT is the 4-byte size field */

    send(fd, (const char *)tbuffer, sendsize, 0);

Where fd is a previously put together socket, protocolByte, op, and bytes are previously set. It then sends the rest of the message with a very similar send command immediately after this one. As I mentioned, if I put a break point on that send function, the tbuffer contains exactly what I expect.
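The sending is actually wrapped in a loop that checks the byte count so we can guarantee everything goes out. A simplified sketch of that loop (names like `send_all`, `write_fn`, and the callback indirection are illustrative, not our real code; the callback stands in for a thin wrapper around `send()` so the loop can be shown without a live socket):

```c
#include <stddef.h>

/* Stand-in for a thin wrapper around send(); lets the loop be exercised
   without a connected socket. */
typedef long (*write_fn)(int fd, const unsigned char *buf, size_t len);

/* Keep calling the write function until the whole buffer has been sent
   or an error occurs. Returns 0 on success, -1 on failure. */
int send_all(int fd, const unsigned char *buf, size_t len, write_fn wr)
{
    size_t sent = 0;
    while (sent < len) {
        long n = wr(fd, buf + sent, len - sent);
        if (n <= 0)
            return -1;          /* error or peer closed the connection */
        sent += (size_t)n;
    }
    return 0;
}
```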

Anybody have any ideas here? I'm completely stumped; nothing about this makes sense to me. Thanks.

+1  A: 

It might be that there's a simple bug somewhere in your system, such as a buffer overflow, but it's hard to tell where given so little information. However, keep in mind:

TCP doesn't send messages: it's a stream. One send() call might take several recv() calls to receive, and one recv() call might receive only part of a "message" your application has defined.

Are you checking the return value of send()? And the return values of recv()? send() might send fewer bytes than you tell it to. recv() might receive fewer bytes than you ask for, or it might receive more data than one of your application's "messages" if you've given it a large enough buffer.
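To make the recv() side concrete, here's a minimal sketch of a read-exactly loop (names like `recv_exact` and the `read_fn` indirection are illustrative; the callback stands in for a thin wrapper around `recv()` so the loop can be shown without a live socket):

```c
#include <stddef.h>

/* Stand-in for a thin wrapper around recv(). */
typedef long (*read_fn)(int fd, unsigned char *buf, size_t len);

/* Read exactly len bytes, looping because a stream socket may deliver
   fewer bytes per call than requested. Returns 0 on success, -1 on failure. */
int recv_exact(int fd, unsigned char *buf, size_t len, read_fn rd)
{
    size_t got = 0;
    while (got < len) {
        long n = rd(fd, buf + got, len - got);
        if (n <= 0)
            return -1;          /* error or orderly shutdown */
        got += (size_t)n;
    }
    return 0;
}
```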

nos
Good points. We're actually calling `send()` in a loop checking the bytes sent so we can guarantee we are sending everything. The receive code on the client side is abstracted a bit so it's not obvious what it does, plus I stopped looking at it very heavily once I decided the problem had to be on the server due to the Wireshark data.
Morinar
A: 

Turns out something else was getting in the way: in addition to the C server, we have a Tomcat/Java server that sort of runs on top of everything; I'll be honest, I'm not involved in that piece of development so I don't understand it very well. As it turns out, some code is shared between that middle-tier portion and our client; this shared code was inspecting that particular byte as it left the server and, if it was set to 0xfe, replacing it with an uninitialized value (hence the zero). So by the time it got to the client, it was wrong. That's why the values made it to the other side when I set them to different things in the C code, and also why the behavior seemed inconsistent; I was toggling values in the Java quite a bit to see what was going on.

Thanks for the pointers all, but as it turns out it was merely a case of me not understanding the path properly.

Morinar