So this is a really strange problem. I have a Java app that acts as a server: it listens for and accepts incoming client connections, and then reads data (XML) off of the socket. Using my Java client driver, everything works great and I receive messages as expected. However, using my C++ client driver, on the first message only, the very first character is read as an ASCII 0 (it shows up like a little box). We're using the standard socket API in C++, sending in a char* (we've tried char*, std::string, and just text in quotes).

I used Wireshark to sniff the packet and sure enough, it's in there off of the wire. Admittedly, I haven't done the same on the client computer. My argument is that it really shouldn't matter, but correct me if that assumption is incorrect.

So my question: what the heck? Why does just the first message contain this extra prepended data, but all other messages are fine? Is there some little trick to making things work?

A: 

Not that I know of. It's time to binary-search the space of possible culprits.

I would run Wireshark on the client computer to make sure the problem really is originating there. Theoretically some misbehaving router or something could do this (very hard to believe though).

Then I would check the arguments to the socket APIs while the program is actually running, using a debugger.

At that point, if the program is definitely correct and the packets coming out of the computer are definitely wrong, you're looking at a misbehaving networking library or a bad driver.
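One cheap sanity check on the receiving side is to dump the first few raw bytes of each message before any decoding happens, so that a leading 0x00 is unmistakable. Here is a minimal sketch, assuming the message is already sitting in a byte array; the class and method names are hypothetical and not from the original code:

```java
final class HexPeek {
    // Render up to `limit` leading bytes as space-separated two-digit hex.
    static String firstBytesHex(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, data.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative byte values print as two hex digits.
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample: a stray zero byte followed by the start of an XML tag.
        byte[] msg = {0x00, '<', 'm', 's', 'g', '>'};
        System.out.println(firstBytesHex(msg, 4)); // prints "00 3c 6d 73"
    }
}
```

Comparing this output with what Wireshark shows on each machine should narrow down where the extra byte first appears.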

Jason Orendorff
Randolpho's explanation is a lot more plausible. Interoperability issues abound.
Vinko Vrsalovic
The most likely explanation is a bug in the C++ program.
Jason Orendorff
Yes, but checking for a router problem or a driver issue before checking interop issues is not a good way to attack the problem, IMO.
Vinko Vrsalovic
A good rule of thumb when debugging a difficult problem is: if a sanity check is easy to do, *do it*. That's why I suggest running Wireshark first. It would only take a second.
Jason Orendorff
+3  A: 

This is most likely an encoding issue. If you're just using char * for your C++ client, you're assuming ASCII encoding (at best), while Java uses Unicode (or UTF, I misremember which) internally and emits UTF-8 (IIRC) by default.

Either have your Java server emit 7-bit/character ASCII, or have your C++ client read the encoding Java is emitting.

If you've got your XML packed as a string, you can use getBytes() to do your encoding:

byte[] asciiEncodedBytes = myString.getBytes("US-ASCII");

EDIT: It's been a while since I've been in Java land, but it doesn't look like Java has any ASCII encoding streams in the core library. I did find this class out there which apparently will wrap an ASCII encoding stream for you. Thankfully it's in an open source project so you might be able to mine the class out of it for your server.
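For what it's worth, a third-party class may not be needed: java.io.OutputStreamWriter in the core library accepts a charset, so it can do the encoding while writing. A minimal sketch, assuming the XML is held in a String and the peer expects US-ASCII. The ByteArrayOutputStream only stands in for the socket's output stream, and the class and method names are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class AsciiWriteSketch {
    // Encode the text as US-ASCII while writing it to a byte stream.
    static byte[] encodeAscii(String xml) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for socket.getOutputStream()
        try (Writer w = new OutputStreamWriter(sink, StandardCharsets.US_ASCII)) {
            w.write(xml);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeAscii("<msg>hi</msg>");
        System.out.println(bytes.length); // prints 13: one byte per ASCII character
    }
}
```

The matching java.io.InputStreamReader takes a charset in the same way for the reading side.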

Randolpho
UTF is a Unicode encoding. Damn .NET framework naming convention
Vinko Vrsalovic
Ahhh. I'm going to have to spend some time curled up with Google by a fireplace to figure out how to match up the encoding, but that does give me something to go on. I'll probably need to change my Java encoding to match what C++ uses, since that matches the customer scenario. Anyone with a good link, additional info, or code snippet, please post.
Dopyiii
Yeah, I've been corrupted horribly. :)
Randolpho
Okay, so apparently Google and I are having a fight tonight, so no fireside action for me. The getBytes tip is pure gold. If the C++ encoding is ASCII (for sure), then you've likely just solved my problem.
Dopyiii
I don't understand how the problem could be on the Java side, if the erroneous zero byte is visible in Wireshark.
Jason Orendorff
Since the data is XML, it would be better to use an XML parser, which should detect the encoding for you, than to try and figure out the encoding issues yourself.
Jason Orendorff
I'm using Castor for data binding. The problem appears to come off the wire.
Dopyiii
A: 

So, the encoding thing didn't work. In the end, I simply did a substring(startIndex) call on the incoming message, using xmlMessage.indexOf("<") as the starting index. It may not be elegant, but it'll work. And the box will remain a mystery. I appreciate the insight that you three provided.
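Roughly, the workaround looks like the sketch below. The class and method names are made up for illustration, and the guard for a message with no "<" at all is an addition, since indexOf returns -1 in that case and substring(-1) would throw:

```java
final class LeadingJunk {
    // Drop anything before the first '<'; return the input unchanged
    // when it already starts with '<' or contains no '<' at all.
    static String stripBeforeFirstTag(String xmlMessage) {
        int start = xmlMessage.indexOf("<");
        return start > 0 ? xmlMessage.substring(start) : xmlMessage;
    }

    public static void main(String[] args) {
        String raw = "\0<msg>hi</msg>"; // made-up sample with a leading NUL character
        System.out.println(stripBeforeFirstTag(raw)); // prints "<msg>hi</msg>"
    }
}
```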

Dopyiii