I'm trying to write a chat client for a popular network. The original client is proprietary, and is about 15 GB larger than I would like. (To be fair, others call it a game.)

There is absolutely no documentation available for the protocol on the internet, and most search results only come back with the client's scripting interface. I can understand that, since used in the wrong way, it could lead to ruining other people's experience.

I've downloaded the source code of a couple of alternative servers, including the one I want to connect to, but those

  • contain no documentation other than install instructions
  • are poorly commented (judging from a superficial look)
  • are HUGE (the src folder of the target server contains 12 MB worth of .cpp and .h files), and grep didn't find anything related

I've also tried searching their forums and contacting the maintainers of the server, but so far, no luck.

Packet sniffing isn't likely to help, as the protocol relies heavily on encryption.

At this point, all my hope is my ability to chew through an ungodly amount of code. How do I start?

Edit: A related question.

+2  A: 

I'd say

  1. find the call that is used to send data through the socket (which call it is depends on the network library)
  2. find the references to this call and work outward from there. If you can modify and recompile the server code, that will help.

Along the way, you will be able to log decrypted (or, more likely, not yet encrypted) network activity.
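As a rough illustration of the logging idea, here is a minimal sketch that assumes the server ends up calling POSIX `send()`. The names `hex_dump` and `logged_send` are hypothetical and do not come from any real server; the actual send call depends on the network library in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <sys/types.h>
#include <sys/socket.h>

// Format a buffer as space-separated lowercase hex bytes.
std::string hex_dump(const void* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    static const char digits[] = "0123456789abcdef";
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        if (i != 0) out += ' ';
        out += digits[p[i] >> 4];
        out += digits[p[i] & 0x0f];
    }
    return out;
}

// Hypothetical replacement for the server's send() call sites (assumption:
// the server uses POSIX sockets). It logs the outgoing bytes to stderr and
// then forwards the call unchanged.
ssize_t logged_send(int fd, const void* buf, std::size_t len, int flags) {
    std::fprintf(stderr, "[send fd=%d len=%zu] %s\n",
                 fd, len, hex_dump(buf, len).c_str());
    return ::send(fd, buf, len, flags);
}
```

If the call sites are swapped to `logged_send` at a point before the encryption step, the log should show the payloads in plain form.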

Benoît
+1  A: 

IMO, the best answer is to read the source code of the alternative server. Try using a good C++ IDE to help you. It will make a lot of difference.

It is likely that the protocol-related material you need to understand is limited to a subset of the files. These will contain references to network sockets and similar calls. Start from there and work outwards as far as you need to.

Stephen C
+3  A: 

If the original code encrypts with a well-known library such as OpenSSL or Crypto++, it might be useful to write your own wrappers for the main entry points of that library, each one delegating the call to the actual library. If you make this substitution and the project still builds, you will be able to trace everything that goes out in plain text.

If the project does not use a third-party encryption library, it should still be possible to replace the encryption routines with wrappers that trace their input and then delegate the encryption to the actual code. The bet is that encryption is usually implemented in a separate, relatively small number of source files, so it should be easier to track input and output in those files.
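The wrap-and-delegate idea could look roughly like the sketch below. Everything in it is hypothetical: `toy_encrypt` is a stand-in for whatever routine the server really uses (it is not a real cipher), and `Bytes`, `EncryptFn`, `make_tracing_encrypt` and `trace_demo` are names made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
using EncryptFn = std::function<Bytes(const Bytes&)>;

// Stand-in for the server's real encryption routine (assumption: it maps
// a plaintext buffer to a ciphertext buffer). XOR is used only as a toy.
Bytes toy_encrypt(const Bytes& plain) {
    Bytes out(plain);
    for (auto& b : out) b ^= 0x5a;
    return out;
}

// Wrap an encryption routine: write the plaintext to `log`, then delegate
// to the real routine so the program's behaviour stays the same.
EncryptFn make_tracing_encrypt(EncryptFn real, std::ostream& log) {
    assert(real && "wrapper needs a routine to delegate to");
    return [real, &log](const Bytes& plain) {
        log << "plaintext (" << plain.size() << " bytes): ";
        for (std::uint8_t b : plain) {
            // Printable ASCII is shown as-is, everything else as '.'.
            log << static_cast<char>((b >= 0x20 && b < 0x7f) ? b : '.');
        }
        log << '\n';
        return real(plain);
    };
}

// Usage sketch: trace one message and return what was logged.
std::string trace_demo(const std::string& msg) {
    std::ostringstream log;
    EncryptFn traced = make_tracing_encrypt(toy_encrypt, log);
    traced(Bytes(msg.begin(), msg.end()));
    return log.str();
}
```

In a real code base the wrapper would take the place of the existing routine at its call sites, with a signature matching whatever that routine actually has.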

Good luck!

AlexKR
A: 

A viable approach is to tackle this as a crypto challenge. That makes it easy, because you control so much.

For instance, you can use a current client to send a known message to the server, and then search the server's memory for that string. Once you've found out in which object the string ends up, it also becomes possible to trace its ancestry through the code. Set a breakpoint on any non-const method of the object and collect the stack traces. This gives you a live view of how messages arrive at the server, and a list of core functions essential to message processing. From there you can find related functions (the callers and callees of the functions on your list).

MSalters