Without changing the build process or including the source files, can anyone think of a way to make two pieces of code that, when compiled, generate the same assembly but still execute differently? I already know why this should be impossible, so please don't bother to explain. There are definitely ways to do it, primarily by finding ways to hide information in the source code that ends up somewhere other than in the executable.

A rather boring and unfun example that can do this: Create a chunk of code that takes a long time to parse but gets optimized out during the compilation process (or just add so much white space that disk I/O slows down compilation). Make your program generate both an exe and a dll during the build process. Have the program behave differently depending on the difference between the creation timestamps of the dll and the exe. This is a pretty lame example, though. I wonder if someone can come up with anything more clever.
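As a rough sketch of the timestamp trick, the check itself could look something like the following. The function name, the labels it returns, and the one-second threshold are all made up for illustration, and it reads modification time rather than creation time because that is what `os.path.getmtime` exposes portably.

```python
import os

def pick_behavior(exe_path, dll_path, threshold_seconds=1.0):
    """Choose a behaviour from the gap between two build outputs.

    Hypothetical helper: a large gap is taken to mean the build of the
    second artifact was slowed down by the padded source file.
    """
    # Assumption: modification time stands in for creation time here.
    gap = abs(os.path.getmtime(exe_path) - os.path.getmtime(dll_path))
    return "slow build" if gap > threshold_seconds else "fast build"
```

Calling `pick_behavior("app.exe", "lib.dll")` with the two files produced by the build would return one label or the other depending on how far apart they were written.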

One could also somehow generate a different debugging output and have that change how the code runs, but that's kind of lame, too.

It is, of course, more impressive if your code doesn't seem to be inspecting itself or the output. If someone looking at your code would be shocked that the two versions behaved differently, it's a great answer.

+3  A: 

Here's a general example using Python as pseudocode:

text = input("type something!\n")  # Python 3; avoids shadowing the input builtin
if "a" in text:
    print("Great job!")
else:
    print("Ohh, too bad.")

If you wrote that in a compiled language and compiled it, the assembly would be exactly the same, but it could produce different behaviour each time you execute it! This is exactly what your examples are describing, by the way. Changing execution path based on external input is a pretty fundamental part of programming.

To expound: when the assembly of two executables is exactly the same, then the execution is going to be exactly the same. The behaviour might be different, but this is completely dependent on the input each receives when run. Making the input be the time of creation of the executable is valid, but essentially useless.

Sean Nyman
@Darth: I agree that making the input be creation time is pretty lame. I'm wondering if there is a more subtle trick one could use.
Brian
+3  A: 

There's also Self-Modifying Code

[S]elf-modifying code is code that alters its own instructions while it is executing

So in theory, you could use some code to look at a timestamp and have the code modify itself based on that. SO has an article: What are the uses of self modifying code?
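To make the idea concrete, here is a loose analogy in Python rather than real machine code: a function that overwrites its own bytecode after its first call. The function names are invented for this sketch, and swapping `__code__` is only a stand-in for patching instructions in memory.

```python
def patched_body():
    return "I am the patched version"

def chameleon():
    # The first call reports the original behaviour, then replaces this
    # function's own code object so later calls run patched_body's code.
    result = "I am the original version"
    chameleon.__code__ = patched_body.__code__
    return result
```

The first call to `chameleon()` returns the original message and every later call returns the patched one, even though the source of `chameleon` never changes on disk.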

Gavin Miller
A: 

Let's see if I understand. Suppose you claim that you can do this. You write two different programs that result in identical assemblies. You put each on a CD-ROM. The CD-ROMs are bit-for-bit identical. You label them A and B with a pen. I put one or the other into my computer and install it. If I installed the disk labeled "A", the program displays "I am program A." If I installed the disk labeled "B", the program displays "I am program B."

I can't see how this is possible.

There are certainly tricks you can do. If you can see which disk I put in, you can secretly press A or B on your cell phone, sending a message to your web server which is retrieved by the program.

Mark Lutton
I am not claiming the CD-ROMs are bit-for-bit identical. I am only stating that the *assembly* is bit-for-bit identical, and that the build process used to generate each program was the same.
Brian
Do you mean the machine language code? There could be an environmental difference; for instance the CPU's zero flag could initially be 0 or 1, so a "branch if zero" executes differently. On some old operating systems, allocated memory is not zeroed out, so if you compile and execute in one step the program could allocate a lot of memory and search in it for an artifact left by the compiler.
Mark Lutton
A: 

If you worked at it hard enough, you should be able to make the exact same chunk of machine instructions do two different things by jumping into the "middle" of one of the original instructions, in which case the CPU would see a completely different sequence of instructions. IIRC, the heavily space-optimized ROM code for the 8-bit computers of the early '80s did this quite a bit to save space.
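A toy illustration of the overlapping-instruction idea, using a made-up three-opcode instruction set rather than any real CPU: the same bytes decode into a different program depending on the offset where execution starts.

```python
# Hypothetical instruction set, invented for this sketch:
#   1 imm -> LOAD: acc = imm      (2 bytes)
#   2 imm -> ADD:  acc += imm     (2 bytes)
#   3     -> HALT: stop           (1 byte)
LOAD, ADD, HALT = 1, 2, 3

def run(code, start):
    """Interpret `code` beginning at byte offset `start`; return the accumulator."""
    acc, pc = 0, start
    while True:
        op = code[pc]
        if op == HALT:
            return acc
        imm = code[pc + 1]  # the operand byte follows the opcode
        if op == LOAD:
            acc = imm
        elif op == ADD:
            acc += imm
        else:
            raise ValueError(f"unknown opcode {op} at offset {pc}")
        pc += 2

CODE = bytes([1, 2, 2, 2, 3, 3])

from_start = run(CODE, 0)   # decodes as LOAD 2, ADD 2, HALT
from_middle = run(CODE, 1)  # decodes as ADD 2, ADD 3, HALT
```

Starting at offset 0, the second byte is an operand; starting at offset 1, that same byte is an opcode, so the two runs compute different results from identical bytes.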

Arthur Kalliokoski
This is perhaps the subtle and clever way you keep someone who looks at your code from catching on to what is happening. But actually causing the jump to happen without having a difference in the assembly is what I am asking about.
Brian
+1  A: 

If it is the same bit pattern in memory, and the program executes in the same starting state, the determinism of the individual instructions will force a deterministic answer. (Otherwise you wouldn't be able to debug anything).

Now, you may have instructions that produce different results based on external state ("cosmic ray hit CPU chip recently?"). One specific instruction with this effect on the x86 is RDTSC, "read time stamp counter", which reads the number of clock cycles the current CPU has counted since it was last reset. The counter keeps running while the OS interrupts your application, so different executions will see different RDTSC values at the same point in the code ("cosmic interrupt hit CPU chip recently").
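A small sketch of the same effect in Python, with `time.perf_counter_ns()` standing in for RDTSC (it is a high-resolution clock, not the actual cycle counter), and with invented function names:

```python
import time

def read_counter():
    # Stand-in for RDTSC: a monotonically increasing tick count that the
    # program does not control.
    return time.perf_counter_ns()

def branch_on_counter():
    # Identical code on every run, but the branch taken depends on
    # whatever the counter happens to read at this moment.
    ticks = read_counter()
    return "even tick" if ticks % 2 == 0 else "odd tick"
```

Which string comes back is effectively arbitrary from the program's point of view, which is the controllability problem described next.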

The real problem with this is that the outcome isn't controlled by the program, so it is hard to ensure that the program behaves differently in a controllable way.

What are you trying to accomplish with this?

Ira Baxter
It's purely an exercise in curiosity.
Brian
A: 

Well, it can't ever really be 'executed' differently. It only ever executes in the fashion that you describe (however obscure it is, it's still following the rules of execution for the given environment).

It sounds to me like you want your code to behave differently while completely avoiding the use of conditional statements.

This isn't possible, and for good reason: computers are logical, and even if you program one so that it'll behave differently in a different environment, it's still following the rules you laid out. And thank goodness, because if it were possible for the computer to decide, based on its own sensibilities, then we programmers would be in a bit of trouble ...

int k = 2;

// computer internally has decided k should = 5, because it is upset about a prior 
// argument, so you must actually write:

computer.pleaseWork();
int k = 2;
Noon Silk