I understand somewhat how "int a = b+abs(c)" can be translated to simple assembly instructions and then assembled into a binary blob. But how can that code be run and interacted with dynamically in memory?

-- edit --

I know C doesn't have an eval feature, but what it compiles to does. That is what makes JITs like Java's, and for that matter code-injection malware, possible, no? For instance, the abs() function is just a pointer, which can be called following the cdecl calling convention. New functions should be exposable by passing cdecl function pointers. What I don't understand is how this new code can be injected at runtime.

I'm asking this more out of longtime academic curiosity than to solve an actual problem as efficiently as possible.

-- example --

Say I have a piece of embedded Python code which is called from a native program a lot, and which also calls a native binding, notify():

def add(a, b):
    notify()
    return a+b

For this to be worth doing, the function should probably contain quite a bit more (and far more useful) code, but bear with me. A profiler (or hints from the C bindings) has also identified that every call uses integers for all parameters and for the return value. This matches:

int add(int a, int b) {
    notify();
    return a + b;
}

Which could be compiled into x86 cdecl code similar to this:

_add:
push ebp ;setting up scope
mov ebp, esp
call _notify ;or more likely by a pointer reference
mov eax, [ebp + 8]
mov edx, [ebp + 12]
add eax, edx
pop ebp
ret

This is then finally assembled into a binary string. Of course, one would have to implement a basic compiler for each platform to even get this far. But that problem aside, say I now have a char pointer to this valid binary x86 code. Is it somehow possible to obtain from it a cdecl function pointer that the native program can call?

Sorry that the intent of my question was unclear.

+1  A: 

C lacks any sort of "eval" feature, which is both a limitation and one of the things that allows it to be efficient. If all you need to be able to do is evaluate mathematical expressions with a small set of built-in math functions like abs() and not arbitrary C code, it's moderately easy to write such an expression evaluator.

Here's a link to a past SO thread on a similar topic: http://stackoverflow.com/questions/1465909/c-expression-evaluator
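
To give an idea of the size of such an evaluator, here is a minimal recursive-descent sketch. It assumes numeric literals only (no variables), supports + - * / with parentheses and a built-in abs(), and does no error reporting; the names eval, parse_expr, parse_term and parse_factor are made up for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Grammar handled by this sketch:
 *   expr   := term   (('+' | '-') term)*
 *   term   := factor (('*' | '/') factor)*
 *   factor := NUMBER | '-' factor | '(' expr ')' | "abs" factor
 */

static const char *p;              /* cursor into the input string */

static double parse_expr(void);

static void skip_ws(void)
{
    while (isspace((unsigned char)*p))
        p++;
}

static double parse_factor(void)
{
    skip_ws();
    if (*p == '-') {               /* unary minus */
        p++;
        return -parse_factor();
    }
    if (strncmp(p, "abs", 3) == 0) {   /* the only built-in function */
        p += 3;
        double v = parse_factor();     /* normally a parenthesised expr */
        return v < 0 ? -v : v;
    }
    if (*p == '(') {
        p++;
        double v = parse_expr();
        skip_ws();
        if (*p == ')')
            p++;
        return v;
    }
    char *end;
    double v = strtod(p, &end);    /* numeric literal */
    p = end;
    return v;
}

static double parse_term(void)
{
    double v = parse_factor();
    for (;;) {
        skip_ws();
        if (*p == '*')      { p++; v *= parse_factor(); }
        else if (*p == '/') { p++; v /= parse_factor(); }
        else return v;
    }
}

static double parse_expr(void)
{
    double v = parse_term();
    for (;;) {
        skip_ws();
        if (*p == '+')      { p++; v += parse_term(); }
        else if (*p == '-') { p++; v -= parse_term(); }
        else return v;
    }
}

/* Evaluate a whole expression string. */
static double eval(const char *s)
{
    assert(s != NULL);
    p = s;
    return parse_expr();
}
```

Calling eval("2 + abs(3 - 10) * 2") walks the string once and computes the value directly, without generating any machine code. Supporting variables such as b and c would need a lookup table in parse_factor.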

R..
For just mathematical expressions with moderate performance requirements, you'd be completely right. I've updated the original question with clarifications.
Imbrondir
Your question still isn't clear. Do you just want to add new functions that can be called, but still only support expression evaluation? Or do you want the full C language (loop constructs, etc.)? If the latter, the only portable solution is to write a full C interpreter, or a compiler that generates code to run on a virtual machine (which you also need to implement). If you're not looking for portability, and your platform has shared libraries/dynamic loading, you can run the C compiler and linker to build a new library then load it...
R..
So basically, the only way of running newly compiled machine code is to create a shared library and load it?
Imbrondir
+1  A: 

Assuming you have the code already compiled in a memory block, and you have the address of the block, it can be cast to a function pointer:

typedef int (*func)(int, int);
...
char * compiledCode = ...;
func f = (func) compiledCode;

and then the function can be called:

int x = f(2, 3);
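
As a small self-contained illustration of the cast itself, the sketch below uses an ordinary function as a stand-in for the compiled block, since such a function is guaranteed to sit in executable memory. The names add and as_function are made up for this example. Converting between void * and a function pointer is not defined by ISO C, though POSIX systems support it.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func)(int, int);

/* Stand-in for "already compiled code": an ordinary function, which
 * necessarily lives in executable memory. */
static int add(int a, int b)
{
    return a + b;
}

/* Hypothetical helper: reinterpret a raw address as a callable pointer.
 * The caller must guarantee that the address points at valid machine
 * code following the expected calling convention. */
static func as_function(void *compiledCode)
{
    assert(compiledCode != NULL);
    return (func)compiledCode;
}
```

With void *raw = (void *)add; the call as_function(raw)(2, 3) goes through the same mechanism as calling a block of generated code would. For bytes sitting in an ordinary char buffer, the cast compiles just the same, but the call only works if the memory holding them is marked executable.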
Kcats
Way simpler than I thought. Big thanks!
Imbrondir
+1  A: 

The program must be in a memory region that allows execution.

On Linux you can do this:

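#include <string.h>     /* memcpy */
#include <sys/mman.h>   /* mmap, PROT_*, MAP_* */

void *address;
functype proc;
int prot, flags;

/* Ask for memory that is readable, writable and executable. */
prot = PROT_READ|PROT_WRITE|PROT_EXEC;
flags = MAP_PRIVATE|MAP_ANONYMOUS;
address = mmap(NULL, length, prot, flags, -1, 0);
if (address == MAP_FAILED) error;   /* mmap signals failure with MAP_FAILED, not NULL */

memcpy(address, program, length);
/* Pseudocode, not the POSIX link(): patch external references
   (such as the call to notify) to their real addresses. */
link(address, program_info);

proc = (functype)address;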
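
A complete variant of the above might look like the sketch below. It rests on several assumptions: Linux on x86-64 (so the System V calling convention rather than 32-bit cdecl), hand-assembled bytes for an add function with no call to notify (so no linking step is needed), and a made-up helper name load_code. Unlike the fragment above, it maps the memory writable first and switches it to executable with mprotect afterwards, so the page is never writable and executable at the same time.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

typedef int (*functype)(int, int);

/* Machine code for int add(int a, int b).
 * Assumption: x86-64 System V ABI, where a arrives in edi, b in esi,
 * and the result is returned in eax. */
static const unsigned char add_code[] = {
    0x89, 0xF8,                  /* mov eax, edi */
    0x01, 0xF0,                  /* add eax, esi */
    0xC3                         /* ret          */
};

/* Hypothetical helper: copy length bytes of self-contained machine code
 * into fresh memory and make it executable. Returns NULL on failure. */
static functype load_code(const void *program, size_t length)
{
    assert(program != NULL && length > 0);

    /* Step 1: anonymous private mapping, readable and writable only. */
    void *address = mmap(NULL, length, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (address == MAP_FAILED)
        return NULL;

    /* Step 2: copy the code in while the mapping is still writable. */
    memcpy(address, program, length);

    /* Step 3: drop write permission and allow execution.
     * (On architectures other than x86, the instruction cache would
     * also have to be flushed before the first call.) */
    if (mprotect(address, length, PROT_READ | PROT_EXEC) != 0) {
        munmap(address, length);
        return NULL;
    }

    return (functype)address;
}
```

With functype proc = load_code(add_code, sizeof add_code); a non-NULL proc can then be called like any other function, for instance proc(2, 3). The mapping should be released with munmap once the function is no longer needed.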
Cheery