views:

1900

answers:

10

Sometimes you don't have the source code and need to reverse engineer a program or a black box. Any fun war stories?

Here's one of mine:

Some years ago I needed to rewrite a device driver for which I didn't have source code. The device driver ran on an old CP/M microcomputer and drove a dedicated phototypesetting machine through a serial port. (Wow, I'm showing my age here...) Almost no documentation for the phototypesetting machine was available to me.

I finally hacked together a serial port monitor on a DOS PC that mimicked the responses of the phototypesetting machine. I cabled the DOS PC to the CP/M machine and started logging the data coming out of the device driver as I feed data in through the CP/M machine. This enabled me to figure out the handshaking and encoding used by the device driver and re-create an equivalent one for a DOS machine.

+3  A: 

The most painful for me was for this product where we wanted to include an image on a Excel Spreadsheet (few years back before the open standards). So I had to get and "understanding" if such thing exists of the internal format for the docs, as well. I ended up doing some Hex comparison between files with and without the image to figure out how to put it there, plus working on some little endian math....

webclimber
Reverse Engineering Pre-Excel-2007 Data formats? Now, THAT is possibly one of the most painful things to do. +1 because i share your pain.
Michael Stum
+6  A: 

I once worked on a tool that would gather inventory information from a PC as it logged into the network. The idea was to keep track of all the PCs in your company.

We had a new requirement to support the Banyan VINES network system, now long forgotten but pretty cool at the time it came out. I couldn't figure out how to get the Ethernet MAC address from Banyan's adapter as there was no documented API to do this.

Digging around online, I found a program that some other Banyan nerd had posted that performed this exact action. (I think it would store the MAC address in an environment variable so you could use it in a script). I tried writing to the author to find out how his program worked, but he either didn't want to tell me or wanted some ridiculous amount of money for the information (I don't recall).

So I simply fired up a disassembler and took his utility apart. It turned out he was making one simple call to the server, which was an undocumented function code in the Banyan API. I worked out the details of the call pretty easily, it was basically asking the server for this workstations address via RPC, and the MAC was part of the Banyan network address.

I then simply emailed the engineers at Banyan and told them what I needed to do. "Hey, it appears that RPC function number 528 (or whatever) returns what I need. Is that safe to call?"

The Banyan engineers were very cool, they verified that the function I had found was correct and was pretty unlikely to go away. I wrote my own fresh code to call it and I was off and running.

Years later I used basically the same technique to reverse engineer an undocumented compression scheme on an otherwise documented file format. I found a little-known support tool provided by the (now defunct) company which would decompress these files, and reverse engineered it. It turned out to be a very straightforward Lempel-Ziv variant applied within the block structure of their file format. The results of that work are recorded for posterity in the Wireshark source code, just search for my name.

Tim Farley
+2  A: 

I've had to reverse engineer a video-processing app, where I only had part of the source code. It took me weeks and weeks to even work out the control-flow, as it kept using CORBA to call itself, or be called from CORBA in some part of the app that I couldn't access.

Sheer idiocy.

Marcin
+7  A: 

Once, when playing Daggerfall II, I could not afford the Daedric Dai-Katana so I hex-edited the savegame.

Being serious though, I managed to remove the dongle check on my dads AutoCAD installation using SoftICE many years ago. This was before the Internet was big. He works as an engineer so he had a legitimate copy. He had just forgotten the dongle at his job and he needed to do some things and I thought it would be a fun challenge. I was very proud afterwards.

mannu
+2  A: 

I recently wrote an app that download the whole content from a Domino Webmail server using Curl. This is because the subcontractor running the server asks for a few hundred bucks for every archive request.

They changed their webmail version about one week after I released the app for the departement but managed to make it working again using a GREAT deal of regex and XML

Eric
+6  A: 

Way back in the early 90s, I forgot my Compuserve password. I had the encrypted version in CIS.INI, so I wrote a small program to do a plaintext attack and analysis in an attempt to reverse-engineer the encryption algorithm. 24 hours later, I figured out how it worked and what my password was.

Soon after that, I did a clean-up and published the program as freeware so that Compuserve customers could recover their lost passwords. The company's support staff would frequently refer these people to my program.

It eventually found its way onto a few bulletin boards (remember them?) and Internet forums, and was included in a German book about Compuserve. It's still floating around out there somewhere. In fact, Google takes me straight to it.

RoadWarrior
+3  A: 

I wrote a driver for the Atari ST that supported Wacom tablets. Some of the Wacom information could be found on their web sites, but I still had to figure out a lot on my own.

Then, once I had written a library to access the wacom tables (and a test application to show the results) - it dawned on me that there was no API for the OS (GEM windowing system) to actually place the mouse cursor somewhere. I ended up having to hook some interrupts in something called the VDI (like GDI in windows), and be very very careful not to crash the computer inside there. I had some help (in the form of suggestions) from the developers of an accelerated version of the VDI (NVDI), and everything was written in PurePascal. I still occasionally have people asking me how to move the mouse cursor in GEM, etc.

+20  A: 

Read the story of FCopy for the C-64 here:

Back in the 80s, the Commodore C-64 had an intelligent floppy drive, the 1541, i.e. an external unit that had its own CPU and everything.

The C-64 would send commands to the drive which in turn would then execute them on its own, reading files, and such, then send the data to the C-64, all over a propriatory serial cable.

The manual for the 1541 mentioned, besides the commands for reading and writing files, that one would read and write to its internal memory space. Even more exciting was that one could download 6502 code into the drive's memory and have it executed there.

This got me hooked and I wanted to play with that - execute code on the drive. Of course, there was no documention on what code could be executed there, and which functions it could use.

A friend of mine had written a disassembler in BASIC. and so I read out all its ROM contents, which was 16KB of 6502 CPU code, and tried to understand what it does. The OS on the drive was quite amazing and advanced IMO - it had a kind of task management, with commands being sent from the communication unit to the disk i/o task handler.

I learned enough to understand how to use the disk i/o commands to read/write sectors of the disc. Actually, having read the Apple ]['s DOS 3.3 book which explained all of the workings of its disk format and algos in much detail, was a big help in understanding it all.

(I later learned that I could have also found reserve-eng'd info on the more 4032/4016 disk drives for the "business" Commodore models which worked quite much the same as the 1541, but that was not available to me as a rather disconnected hobby programmer at that time.)

Most importantly, I also learned how the serial comms worked. I realized that the serial comms, using 4 lines, two for data, two for handshake, was programmed very inefficiently, all in software (though done properly, using classic serial handshaking).

Thus I managed to write a much faster comms routine, where I made fixed timing assumtions, using both the data and the handshake line for data transmission.

Now I was able to read and write sectors, and also transmit data faster than ever before.

Of course, it would have been great if one could simply load some code into the drive which speeds up the comms, and then use the normal commands to read a file, which in turn would use the faster comms. This was no possible, though, as the OS on the drive did not provide any hooks for that (mind that all of the OS was in ROM, unmodifiable).

Hence I was wondering how I could turn my exciting findings into a useful application.

Having been a programmer for a while already, dealing with data loss all the times (music tapes and floppy discs were not very realiable back then), I thought: Backup!

So I wrote a backup program which could duplicate a floppy disc in never-before seen speed: The first version did copy an entire 170 KB disc in only 8 minutes (yes, minutes), the second version did it even in about 4.5 minutes. Whereas the apps before mine took over 25 minutes. (Mind you, the Apple ][, which had its disc OS running on the Apple directly, with fast parallel data access, did this all in a minute or so).

And so FCopy for the C-64 was born.

It became soon extremely popular. Not as a backup program as I had intended it, but as the primary choice for anyone wanting to copy games and other software for their friends.

Turned out that a simplification in my code, which would simply skip unreadable sectors, writing a sector with a bad CRC to the copy, did circumvent most of the then-used copy protection schemes, making it possible to copy most formerly uncopyable discs.

I had tried to sell my app and sold it actually 70 times. When it got advertised in the magazines, claiming it would copy a disc in less than 5 minutes, customers would call and not believe it, "knowing better" that it can't be done, yet giving it a try.

Not much later, others started to reverse engineer my app, and optimize it, making the comms even faster, leading to copy apps that did it even in 1.5 minutes. Faster was hardly possible, because, due to the limited amount of memory available on the 1541 and the C-64, you had to swap discs several times in the single disc drive to copy all 170 KB of its contents.

In the end, FCopy and its optimized successors were probably the most-popular software ever on the C-64 in the 80s. And even though it didn't pay off financially for me, it still made me proud, and I learned a lot about reverse-engineering, futility of copy protection and how stardom feels. (Actually, Jim Butterfield, an editor for a C-64 magazine in Canada, told its readers my story, and soon he had a cheque for about 1000 CA$ for me - collected by the magazine from many grateful users sending 5$-cheques, which was a big bunch of money back then for me.)

Thomas Tempelmann
Fcopy... THANKS! :)
Sergio
+13  A: 

I actually have another story:

A few years past my FCopy "success" story, I was approached by someone who asked me if I could crack a slot machine's software.

This was in Germany, where almost every pub had one or two of those: You'd throw some money in what amounts to about a US quarter, then it would spin three wheels and if you got lucky with some pattern, you'd then have the choice to "double or nothing" your win on the next play, or get the current win. The goal of the play was to try to double your win a few times until you'd get in the "series" mode where any succeeding win, no matter how minor, would get you a big payment (of about 10 times your spending per game).

The difficulty was to know when to double and when not. For an "outsider" this was completely random, of course. But it turned out that those German-made machines were using simple pseudo-randomized tables in their ROMs. Now, if you watched the machine play for a few rounds, you'd could figure out where this "random table pointer" was and predict its next move. That way, a player would know when to double and when to pass, leading him eventually to the "big win series".

Now, this was already a common thing when this person approached me. There was an underground scene which had access to the ROMs in those machines, find the tables and create software for computers such as a C-64 to use for prediction of the machine's next moves.

Then came a new type of machine, though, which used a different algorithm: Instead of using pre-calc'd tables, it did something else and none of the resident crackers could figure that out. So I was approached, being known as a sort of genius since my FCopy fame.

So I got the ROMs. 16KB, as usual. No information on what it did and how it worked whatsoever. I was on my own. Even the code didn't look familiar (I knew 6502 and 8080 by then only). After some digging and asking, I found it was a 6809 (which I found to be the nicest 8 bit CPU to exist, and which had analogies to the 680x0 CPU design, which was much more linear than the x86 family's instruction mess).

By that time, I had already a 68000 computer (I worked for the company "Gepard Computer" which built and sold such a machine, with its own developer OS and, all) and was into programming Modula-2. So I wrote a disassembler for the 6809, one that helped me with reverse engineering by finding subroutines, jumps, etc. Slow I got a an idea of the flow control of the slot machine's program. Eventually I found some code that looked like a mathmatical algorithm and it dawned on me that this could be the random generating code.

As I never had a formal education in computer sciences, up to then I had no idea how a typical randomgen using mul, add and mod worked. But I remember seeing something mentioned in a Modula-2 book and then realized what it was.

Now I could quickly find the code that would call this randomgen and learn which "events" lead to a randomgen iteration, meaning I knew how to predict the next iterations and their values during a game.

What was left was to figure out the current position of the randomgen. I had never been good with abstract things such as algebra. I knew someone who studied math and was a programmer too, though. When I called him, he quickly knew how to solve the problem and quabbled a lot about how simple it would be to determine the randomgen's seed value. I understood nothing. Well, I understood one thing: The code to accomplish this would take a lot of time, and that a C-64 or any other 8 bit computer would take hours if not days for it.

Thus, I decided to offer him 1000 DM (which was a lot of money for me back then) if he could write me an assembler routine in 68000. Didn't take him long and I had the code which I could test on my 68000 computer. It took usually between 5 and 8 minutes, which was acceptable. So I was almost there.

It still required a portable 68000 computer to be carried to the pub where the slot machine stands. My Gepard computer was clearly not of the portable type. Luckly, someone else I knew in Germany produced entire 68000 computers on a small circuit board. For I/O it only had serial comms (RS-232) and a parallel port (Centronics was the standard of those days). I could hook up some 9V block battieries to it to make it work. Then I bought a Sharp pocket computer, which had a rubber keyboard and a single-line 32 chars display. Running on batteries, which was my terminal. It had a RS-232 connector which I connected to the 68000 board. The Sharp also had some kind of non-volatile memory, which allowed me to store the 68000 random-cracking software on the Sharp, transfer it on demand to the 68000 computer, which then calculated the seed value. Finally I had a small Centronics printer which printed on narrow thermo paper (which was the size of what cash registers use to print receipts). Hence, once the 68000 had the results, it would send a row of results for the upcoming games on the slot machine to the Sharp, which printed them on paper.

So, to empty one of these slot machines, you'd work with two people: You start playing, write down its results, one you had the minimum number of games required for the seed calculation, one of you would go to the car parked outside, turn on the Sharp, enter the results, it would have the 68000 computer rattle for 8 minutes, and out came a printed list of upcoming game runs. Then all you needed was this tiny piece of paper, take it back to your buddy, who kept the machine occupied, align the past results with the printout and no more than 2 minutes later you were "surprised" to win the all-time 100s series. You'd then play these 100 games, practically emptying the machine (and if the machine was empty before the 100 games were played, you had the right to wait for it to be refilled, maybe even come back next day, whereas the machine was stopped until you came back).

This wasn't Las Vegas, so you'd only get about 400 DM out of a machine that way, but it was quick and sure money, and it was exciting. Some pub owners suspected us of cheating but had nothing against us due to the laws back then, and even when some called the police, the police was in favor of us).

Of course, the slot making company soon got wind of this and tried to counteract, turning off those particular machines until new ROMs were installed. But the first few times they only changed the randomgen's numbers. We only had to get hold of the new ROMs, and it took me a few minutes to find the new numbers and implement them into my software.

So this went on for a while during which me and friends browsed thru pubs of several towns in Germany looking for those machines only we could crack.

Eventually, though, the machine maker learned how to "fix" it: Until then, the randomgen was only advanced at certain predictable times, e.g. something like 4 times during play, and once more per the player's pressing of the "double or nothing" button.

But then they finally changed it so that the randomgen would continually be polled, meaning we were no longer able to predict the next seed value exactly on time for the pressing of the button.

That was the end of it. Still, making the effort of writing a disassembler just for this single crack, finding the key routines in 16KB of 8 bit CPU code, figuring out unknown algorithms, investing quite a lot of money to pay someone else to develop code I didn't understand, finding the items for a portable high-speed computer involving the "blind" 68000 CPU with the Sharp as a terminal and the printer for the convenient output, and then actually emptying the machines myself, was one of the most exciting things I ever did with my programming skills.

Thomas Tempelmann
You should drop by #ipodlinux again sometime. We could use some brilliant ROing. :)
Alex Fort
A: 

Not a personal story, but still an excellent read: From the Eye of a Legal Storm, Murdoch's Satellite-TV Hacker Tells All.

none