static char yes[80]; 

int main(int argc, char *argv[])
{
    void (*point)();
    // ... sets yes[] = to input
    point = (void*) yes;
    (*point)();
}

So does this create a function and execute the commands that are in yes[]? How does it know to read the commands in yes? Do I type them in C, or do they have to be assembly?

+7  A: 

Just because something compiles does not mean it will produce valid, defined results.

You are creating an array of 80 chars, then casting it to a function pointer and trying to call it. Your program will likely crash, but its behavior is undefined.
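If the goal is just to let input choose what runs, a well-defined way is to map the input onto functions that already exist. Below is a minimal sketch of such a dispatch table. The command names and the `run_command` helper are made up for illustration and are not from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical commands; each one is an ordinary compiled C function. */
static int cmd_answer(void) { return 42; }
static int cmd_zero(void)   { return 0; }

struct command {
    const char *name;     /* what the user types */
    int (*run)(void);     /* the real function that gets called */
};

static const struct command commands[] = {
    { "answer", cmd_answer },
    { "zero",   cmd_zero   },
};

/* Look up `name` in the table and call the matching function.
   Returns -1 if no command has that name. */
int run_command(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++)
        if (strcmp(commands[i].name, name) == 0)
            return commands[i].run();
    return -1;
}
```

Here the function pointers only ever point at real functions, so calling through them is defined behavior.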

Brian R. Bondy
+2  A: 

You could probably make this code work as you want it to, but it will be very machine/compiler/system dependent. The stuff you put into yes will need to be machine instructions, not C or assembly. I'll try to work up an example.

Edit: My few quick tests didn't turn up anything useful. Too much memory protection/security/safety going on in Mac OS X I guess. Your code could certainly be made to work in a less-protected environment - embedded systems, BIOS, etc.

Carl Norum
+2  A: 

On Linux this won't work. To execute machine instructions the way you want, the memory they live in has to be mapped with the PROT_EXEC flag, for example by calling mmap.

Don't forget that code pointers and data pointers are supposed to be incompatible.
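To illustrate the pointer point: ISO C does not define the conversion between `void *` and a function pointer, although POSIX requires it to work so that `dlsym` is usable. One commonly seen way to avoid the cast is to copy the pointer bits with `memcpy`. This is a sketch that assumes both pointer kinds have the same size and representation, and the names `int_fn`, `fn_to_data`, `data_to_fn` and `round_trip_answer` are invented for the example.

```c
#include <assert.h>
#include <string.h>

typedef int (*int_fn)(void);

static int answer(void) { return 42; }

/* Assumption: function and data pointers have the same size.
   The build fails here on a platform where they do not. */
static_assert(sizeof(void *) == sizeof(int_fn),
              "function and data pointers differ in size");

/* Copy the bits of a function pointer into a data pointer. */
void *fn_to_data(int_fn fn)
{
    void *p;
    memcpy(&p, &fn, sizeof p);
    return p;
}

/* Copy the bits of a data pointer back into a function pointer. */
int_fn data_to_fn(void *p)
{
    int_fn fn;
    memcpy(&fn, &p, sizeof fn);
    return fn;
}

/* Round-trip a real function through void * and call the result. */
int round_trip_answer(void)
{
    return data_to_fn(fn_to_data(answer))();
}
```

The round trip only proves that the pointer survives the conversion. It says nothing about whether the memory behind an arbitrary data pointer is executable.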

Alexandre C.
Code and data pointers *can* be incompatible. Usually may be too strong.
dmckee
What is the downvote about?
Alexandre C.
@Alex: Wasn't mine.
dmckee
+3  A: 

As stated, this is not a useful thing to do, but it's not that far off from what just-in-time compilers do all day, or the OS's executable loader.

  • As Carl says, you need to put machine instructions into the buffer, not C or assembly.
    • That means you get to do all of the work of complying with the ABI yourself.
    • That also means you get no portability at all. Code that does this sort of thing has to be written N times - once for each CPU+OS combination you care about. Just-in-time compilers usually have a fallback to a byte-code interpreter for not-yet-supported platforms.
  • For security reasons, you can't just dump machine code into a buffer and jump to it; you have to tell the OS what you are doing.

As it happens I have an example lying around. This is only tested to work on (x86 or x86-64)/Linux. If you wanted to make it do something more interesting, you would need to replace the memset with code that filled in the buffer with the machine code for a more interesting operation.

It will not work on any other CPU, because it hardwires the x86 machine encoding of the return instruction. It probably won't work on any other x86 OS, either, because it ignores the portability minefield surrounding mmap and mprotect :-(

#define _GNU_SOURCE 1
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void __attribute__((noreturn))
perror_exit(const char *msg)
{
    perror(msg);
    exit(1);
}

int main(void)
{
    /* allocate one writable page */
    size_t pagesize = getpagesize();
    void *rgn = mmap(0, pagesize, PROT_READ|PROT_WRITE,
                     MAP_PRIVATE|MAP_ANON, -1, 0);
    if (rgn == MAP_FAILED)
        perror_exit("mmap");

    /* fill it with return instructions */
    memset(rgn, 0xC3, pagesize);

    /* now switch the page from writable to executable */
    if (mprotect(rgn, pagesize, PROT_READ|PROT_EXEC))
        perror_exit("mprotect");

    /* now we can call it */
    ((void (*)(void))rgn)();
    return 0;
}
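As a rough idea of what a slightly more interesting operation could look like, here is a sketch that builds a function returning a constant instead of filling the page with bare returns. It makes the same x86/Linux assumptions as the example above. The helper name `call_generated_constant` is made up, and on any other CPU, or if the OS refuses the mapping, it skips code generation and just returns the value, a crude stand-in for the interpreter fallback mentioned earlier.

```c
#define _GNU_SOURCE 1
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: generate "return value;" as machine code,
   call it once, and release the page again. */
int call_generated_constant(int32_t value)
{
#if defined(__x86_64__) || defined(__i386__)
    /* x86 encoding: B8 imm32 = mov eax, imm32 ; C3 = ret */
    unsigned char code[6] = { 0xB8, 0, 0, 0, 0, 0xC3 };
    /* x86 is little-endian, so the in-memory bytes of value
       are exactly the immediate operand. */
    memcpy(code + 1, &value, sizeof value);

    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return value;                 /* fallback: no code generated */
    size_t len = (size_t)pagesize;
    assert(sizeof code <= len);

    /* allocate one writable page and copy the instructions in */
    void *rgn = mmap(NULL, len, PROT_READ|PROT_WRITE,
                     MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    if (rgn == MAP_FAILED)
        return value;                 /* fallback: no code generated */
    memcpy(rgn, code, sizeof code);

    /* switch the page from writable to executable, then call it */
    int result = value;               /* used if mprotect is refused */
    if (mprotect(rgn, len, PROT_READ|PROT_EXEC) == 0)
        result = ((int (*)(void))rgn)();

    munmap(rgn, len);
    return result;
#else
    return value;                     /* unsupported CPU: fallback */
#endif
}
```

Calling `call_generated_constant(42)` should give back 42 either way. The only difference is whether the value came out of freshly generated code or out of the fallback path.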
Zack
+1 for an actual example. Now if it were just a "useful" example...
RBerteig
I wouldn't be able to fit a _useful_ example into the code box here -- the smallest thing I can think of that would qualify is a [subroutine threaded Forth interpreter](http://en.wikipedia.org/wiki/Threaded_code#Subroutine_threading), and that's still a whole lot of code.
Zack