I want to override certain function calls to various APIs for the sake of logging the calls, but I also might want to manipulate data before it is sent to the actual function.

For example, say I use a function called getObjectName thousands of times in my source code. I want to temporarily override this function sometimes because I want to change the behaviour of this function to see the different result.

I create a new source file like this:

#include <apiheader.h>    

const char *getObjectName (object *anObject)
{
    if (anObject == NULL)
        return "(null)";
    else
        return "name should be here";
}

I compile all my other source as I normally would, but I link it against this function first before linking with the API's library. This works fine except I can obviously not call the real function inside my overriding function.

Is there an easier way to "override" a function without getting linking/compiling errors/warnings? Ideally I want to be able to override the function by just compiling and linking an extra file or two rather than fiddle around with linking options or altering the actual source code of my program.

+16  A: 

If it's only for your source that you want to capture/modify the calls, the simplest solution is to put together a header file (intercept.h) with:

#ifdef INTERCEPT
    #define getObjectName(x) myGetObjectName(x)
#endif

and implement the function as follows (in intercept.c which doesn't include intercept.h):

const char *myGetObjectName (object *anObject) {
    if (anObject == NULL)
        return "(null)";
    else
        return getObjectName(anObject);
}

Then make sure each source file where you want to intercept the call has:

#include "intercept.h"

at the top.

Then, when you compile with "-DINTERCEPT", all files will call your function rather than the real one and your function can still call the real one.

Compiling without the "-DINTERCEPT" will prevent interception from occurring.
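
As a rough single-file sketch of the scheme above: the object struct and the body of the real getObjectName are stand-ins I've assumed for illustration (in practice they come from apiheader.h and the API's library), and describe() is a hypothetical call site. In a real project the three marked parts would live in separate files.

```c
#include <assert.h>
#include <stddef.h>

/* --- stand-in for the API (assumed; normally from apiheader.h) --- */
typedef struct object { const char *name; } object;

const char *getObjectName(object *anObject) {
    assert(anObject != NULL);   /* the "real" function cannot cope with NULL */
    return anObject->name;
}

/* --- intercept.c: the macro is not visible here, so the inner call
       still reaches the real getObjectName --- */
const char *myGetObjectName(object *anObject) {
    if (anObject == NULL)
        return "(null)";
    return getObjectName(anObject);
}

/* --- intercept.h: INTERCEPT would normally come from -DINTERCEPT --- */
#define INTERCEPT 1
#ifdef INTERCEPT
    #define getObjectName(x) myGetObjectName(x)
#endif

/* --- a call site in your own source: it is written against the
       original name, but the preprocessor reroutes it to the wrapper --- */
const char *describe(object *anObject) {
    return getObjectName(anObject);
}
```

With the macro active, describe(NULL) goes through myGetObjectName and returns "(null)" instead of tripping the assertion in the stand-in real function. Removing the #define INTERCEPT line restores the direct call.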

It's a bit trickier if you want to intercept all calls (not just those from your source). This can generally be done with dynamic loading and resolution of the real function (with dlopen- and dlsym-type calls), but I don't think it's necessary in your case.

paxdiablo
Using a define is not a true polymorphic answer, but I'll agree for the use of ingenuity :P
Suroot
Thanks, that's a really good idea but like I said I want to try and avoid modifying the source code. If I can't find another way then I will have to do it like this I suppose. It needs to be easy to turn off the interception.
dreamlax
Use a compile-time flag for controlling interception (see updated answer). Again, this can be done at runtime as well, you just need to detect it in myGetObjectName() and always call getObjectName if the runtime flag is set (i.e., still intercept but change behavior).
paxdiablo
That does make the switching on and off easy, but it still requires modification to the source. I'm tempted to modify it anyway just because this way seems elegant enough for me, but it will be tedious to update so many files :(.
dreamlax
You don't have a tool to do search and replace across multiple files?
jmucchiello
I'd bite the bullet and do it. The alternative using dl* functions requires code changes as well and is a lot hairier to implement.
paxdiablo
@Pax, won't the getObjectName in the intercepted function be interpreted as a macro and expanded to myGetObjectName?
dreamlax
@dreamlax, you don't include intercept.h in intercept.c. That prevents the morphing. I should have made it clear that you only include intercept.h into those source files where you want the morphing to happen - intercept.c is not one of those places.
paxdiablo
What platform are you running on, @dreamlax? I could whip up a quick bash script or cmd script to update all your files in one hit.
paxdiablo
@Pax, thanks for the offer, but I was just working on a script just now. I figure worst case scenario, I can include this script with the source and say that I didn't modify the original source ;)
dreamlax
@Pax Good Answer
mahesh
You may use the option "-include file" to make GCC include the file automatically. you won't even have to touch any file then :)
Johannes Schaub - litb
though automatically including may cause problems when including the original API file. its declarations will be replaced too. not too easy :)
Johannes Schaub - litb
I like this solution the best of all proposed, as it works on any C compiler. It's the tried and true way used for decades too. Pretty simple.
Craig S
+2  A: 

There's also a tricky method of doing it in the linker involving two stub libraries.

Library #1 is linked against the host library and exposes the symbol being redefined under another name.

Library #2 is linked against library #1, intercepting the call and calling the redefined version in library #1.

Be very careful with link orders here or it won't work.

Joshua
Sounds tricky, but it does avoid modifying the source. Very nice suggestion.
dreamlax
I don't think you can force getObjectName to go to a specific library without dlopen/dlsym trickery.
paxdiablo
Any link-time operation that drags in the host library will result in a multiply-defined symbol.
paxdiablo
Example: you link against lib2, the linker requires l1getObj (the redefined name). But l1getObj requires getObj (which is already in l2 so the linker won't bring in the host objects) - this leads to infinite recursion.
paxdiablo
There's something particularly odd about the behavior when the libraries are dynamic rather than static that makes this work. I discovered it by accident.
Joshua
+2  A: 

You can define a function pointer as a global variable. The callers' syntax would not change. When your program starts, it could check if some command-line flag or environment variable is set to enable logging, then save the function pointer's original value and replace it with your logging function. You would not need a special "logging enabled" build. Users could enable logging "in the field".

You will need to be able to modify the callers' source code, but not the callee (so this would work when calling third-party libraries).

foo.h:

typedef const char* (*GetObjectNameFuncPtr)(object *anObject);
extern GetObjectNameFuncPtr GetObjectName;

foo.cpp:

const char* GetObjectName_real(object *anObject)
{
    return "object name";
}

const char* GetObjectName_logging(object *anObject)
{
    if (anObject == NULL)
        return "(null)";
    else
        return GetObjectName_real(anObject);
}

GetObjectNameFuncPtr GetObjectName = GetObjectName_real;

int main()
{
    GetObjectName(NULL); // calls GetObjectName_real();

    if (isLoggingEnabled) // e.g. set from a command-line flag or environment variable
        GetObjectName = GetObjectName_logging;

    GetObjectName(NULL); // calls GetObjectName_logging();
}
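
The startup check described above might look roughly like this. It is only a sketch: the object struct, the setInterception and initInterceptionFromEnv helpers, and the GETOBJECTNAME_LOGGING variable name are all assumptions made for illustration, not part of any real API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the API's object type (assumed for this sketch). */
typedef struct object { const char *name; } object;

typedef const char *(*GetObjectNameFuncPtr)(object *anObject);

static const char *GetObjectName_real(object *anObject) {
    assert(anObject != NULL);   /* the real function does not accept NULL */
    return anObject->name;
}

static const char *GetObjectName_logging(object *anObject) {
    if (anObject == NULL)
        return "(null)";
    return GetObjectName_real(anObject);
}

/* Callers keep writing GetObjectName(obj); only the target changes. */
GetObjectNameFuncPtr GetObjectName = GetObjectName_real;

/* Hypothetical helper: switch interception on or off at run time. */
void setInterception(int enabled) {
    GetObjectName = enabled ? GetObjectName_logging : GetObjectName_real;
}

/* Hypothetical helper: call once at startup. Interception is enabled
   when the (assumed) variable GETOBJECTNAME_LOGGING is set to "1". */
void initInterceptionFromEnv(void) {
    const char *flag = getenv("GETOBJECTNAME_LOGGING");
    setInterception(flag != NULL && strcmp(flag, "1") == 0);
}
```

A program would call initInterceptionFromEnv() at the top of main, so the same binary can be run with or without interception.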
cpeterso
I've considered this method but it does require modifying the source code, something I'm not really wanting to do unless I have to. Although this has the added benefit of switching during run-time.
dreamlax
+7  A: 

If you use GCC, you can make your function weak. Those can be overridden by non-weak functions:

test.c:

#include <stdio.h>

__attribute__((weak)) void test(void) { 
    printf("not overridden!\n"); 
}

int main() {
    test();
}

What does it do?

$ gcc test.c
$ ./a.out
not overridden!

test1.c:

#include <stdio.h>

void test(void) {
    printf("overridden!\n");
}

What does it do?

$ gcc test1.c test.c
$ ./a.out
overridden!

Sadly, that won't work for other compilers. But you can have the weak declarations that contain overridable functions in their own file, placing just an include into the API implementation files if you are compiling using GCC:

weakdecls.h:

__attribute__((weak)) void test(void);
... other weak function declarations ...

functions.c:

/* for GCC, these will become weak definitions */
#ifdef __GNUC__
#include "weakdecls.h"
#endif

void test(void) { 
    ...
}

... other functions ...

Downside of this is that it does not work entirely without doing something to the api files (needing those three lines and the weakdecls). But once you did that change, functions can be overridden easily by writing a global definition in one file and linking that in.

Johannes Schaub - litb
That would require modifying the API, wouldn't it?
dreamlax
your function name will be the same. also won't change the ABI or API in any way. just include the overriding file when linking, and calls will be made to the non-weak function. libc/pthread do that trick: when pthread is linked in, its threadsafe functions are used instead of libc's weak one
Johannes Schaub - litb
i've added a link. i don't know whether it suits your targets (i.e. whether you can live with that GCC thingy; if msvc has something similar, you could #define WEAK it though). but if on linux, i would use that one (maybe there is even a better way. i have no idea. look into versioning too).
Johannes Schaub - litb
+13  A: 

With gcc, under Linux you can use the --wrap linker flag like this:

gcc program.c -Wl,--wrap,getObjectName -o program

and define your function as:

const char *__wrap_getObjectName (object *anObject)
{
    if (anObject == NULL)
        return "(null)";
    else
        return __real_getObjectName( anObject ); // call the real function
}

This will ensure that all calls to getObjectName() are rerouted to your wrapper function (at link time). This very useful flag is however absent in gcc under Mac OS X.

Remember to declare the wrapper function with extern "C" if you're compiling with g++ though.

codelogic
that's a nice way. didn't know about that one. but if i'm reading the manpage right, it should be "__real_getObjectName( anObject );" which is routed to getObjectName by the linker. otherwise you will call __wrap_getObjectName recursively again. or am i missing something?
Johannes Schaub - litb
You're right it needs to be __real_getObjectName, thanks. I should've double checked in the man page :)
codelogic
A: 

You could use a shared library (Unix) or a DLL (Windows) to do this as well (would be a bit of a performance penalty). You can then change the DLL/so that gets loaded (one version for debug, one version for non-debug).

I have done a similar thing in the past (not to achieve what you are trying to achieve, but the basic premise is the same) and it worked out well.

[Edit based on OP comment]

In fact one of the reasons I want to override functions is because I suspect they behave differently on different operating systems.

There are two common ways (that I know of) of dealing with that, the shared lib/dll way or writing different implementations that you link against.

For both solutions (shared libs or different linking) you would have foo_linux.c, foo_osx.c, foo_win32.c (or a better way is linux/foo.c, osx/foo.c and win32/foo.c) and then compile and link with the appropriate one.

If you are looking for both different code for different platforms AND debug -vs- release I would probably be inclined to go with the shared lib/DLL solution as it is the most flexible.

TofuBeer
+8  A: 

You can override a function using the LD_PRELOAD trick (see man ld.so). Compile a shared library containing your function and start the binary with that library preloaded (you don't even need to modify the binary!), like LD_PRELOAD=./mylib.so myprog.

In the body of your function (in the shared library) you write something like this:

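#define _GNU_SOURCE   /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdio.h>

const char *getObjectName (object *anObject) {
  static const char * (*func)(object *);   /* pointer to the real function */

  if(!func)   /* look up the next definition of the symbol only once */
    func = (const char *(*)(object *)) dlsym(RTLD_NEXT, "getObjectName");
  printf("Overridden!\n");
  return(func(anObject));    // call original function
}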

You can override any dynamically linked function, even ones from the standard library or the libc wrappers around system calls, without modifying or recompiling the program, so you can use the trick on programs you don't have the source for. Isn't it nice?

qrdl
+2  A: 

It is often desirable to modify the behavior of existing code bases by wrapping or replacing functions. When editing the source code of those functions is a viable option, this can be a straight-forward process. When the source of the functions cannot be edited (e.g., if the functions are provided by the system C library), then alternative techniques are required. Here, we present such techniques for UNIX, Windows, and Macintosh OS X platforms.

There is a great PDF covering how this was done on OS X, Linux and Windows.

It doesn't have any amazing tricks that haven't been documented here (this is an amazing set of responses BTW)... but it is a nice read.

lattice.umiacs.umd.edu/files/functions_tr.pdf

Care to share what this PDF might be?
HanClinto
lattice.umiacs.umd.edu/files/functions_tr.pdf - Link Added