Hi,

I have a cross platform C++ application that is broken into several shared libraries and loads additional functionality from plugin shared libraries. The plugin libraries are supposed to be self contained and function by themselves, without knowledge of or dependency on the calling application.

One of the plugins contains code copied from the main application, so it contains symbol names that duplicate those in the engine. (Yes, I know that's generally a no-no, but at the time the plugin was written the engine was a monolithic binary and couldn't share libraries.) On Windows, everything runs fine. On Linux we were getting segfaults. Judging by the stack trace, the error occurred in the plugin when it called functions of a class whose name is duplicated. It appeared to be a result of the engine and plugin having slightly different versions of the shared code (some class functionality was commented out in the plugin). It was as if the plugin's symbols were being bound at runtime to the engine's definitions instead of its own. We "fixed" the issue by changing the dlopen call to dlopen(pFilepath, RTLD_LAZY | RTLD_LOCAL).
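For reference, a minimal sketch of that dlopen call. The helper name load_plugin is hypothetical; only the dlopen flags come from our actual code.

```cpp
#include <dlfcn.h>   // dlopen, dlerror (POSIX)
#include <cassert>
#include <cstdio>

// Hypothetical helper: opens a plugin with the flags described above.
// Returns the handle, or nullptr if the library could not be loaded.
void* load_plugin(const char* pFilepath) {
    assert(pFilepath != nullptr);
    // RTLD_LAZY:  resolve function symbols on first call.
    // RTLD_LOCAL: do not make the plugin's symbols available to
    //             libraries that are loaded afterwards.
    void* handle = dlopen(pFilepath, RTLD_LAZY | RTLD_LOCAL);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    }
    return handle;
}
```

As far as I understand it, RTLD_LOCAL only controls whether the plugin's symbols are offered to other libraries. It does not stop the plugin's own references from being bound to symbols that the executable and its dependencies already export.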

But when we rewrote the engine to be split into shared libraries (for the eventual purpose of reuse in the plugins), we got the segfault again. Looking at the stack trace, the calls go from the engine -> plugin -> engine.

Is there a way to tell the runtime linker not to map the plugin's symbols to the engine's (especially when they are defined in the plugin)?

Thanks! Matt


Edited 2009-12-3

I first tried to wrap the plugin's code in its own namespace. That didn't work because the plugin is statically linked to a library that is also linked into the engine. The two versions of the static library are different, so segfault!

Then I changed the build of the engine and its libraries to be statically linked. When I run it, I no longer have the issue. So it appears the problem was a result of the shared libraries exporting their symbols, which were then dynamically relocated into the plugin when it was opened. But when all of the engine's code is in a single executable, it doesn't export its symbols (so the runtime linker doesn't try to relocate the plugin's symbols into the engine).

I still have an issue though, as there is a parallelized version of the program (using Open-MPI) and that still gets the segfault. It appears that this build is still exporting the engine's symbols and relocating the plugin's. That might have to do with how Open-MPI executes the application.

Are there any linker flags that could be used on the plugin shared library that would tell it not to dynamically relocate the symbols at runtime? Or to hide its symbols so they don't get relocated? I've tried -s ("Omit all symbol information"), but that apparently didn't change the dynamic symbols (checked using nm -D <plugin>).
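To illustrate what I mean by "hiding" symbols, here is a rough sketch using GCC's visibility attribute. This is an assumption about what might help, not something I have verified on the real plugin. The names Foo, plugin_entry, PLUGIN_EXPORT and PLUGIN_LOCAL are made up for the example.

```cpp
// plugin.cpp (hypothetical file name)
#if defined(__GNUC__)
#  define PLUGIN_EXPORT __attribute__((visibility("default")))
#  define PLUGIN_LOCAL  __attribute__((visibility("hidden")))
#else
#  define PLUGIN_EXPORT
#  define PLUGIN_LOCAL
#endif

// Stand-in for a class copied from the engine. Hidden visibility keeps its
// symbols out of the plugin's dynamic symbol table, so references to them
// are resolved inside the plugin when it is linked.
class PLUGIN_LOCAL Foo {
public:
    int value() const { return 42; }
};

// The only symbol the engine needs to look up with dlsym.
extern "C" PLUGIN_EXPORT int plugin_entry() {
    Foo foo;
    return foo.value();
}
```

Compiling the plugin with -fvisibility=hidden would make hidden the default, so only the entry points need marking. Note that this only affects code compiled that way; a static library linked into the plugin keeps whatever visibility it was built with.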

+1  A: 

I agree with Glen - you aren't going to really solve this unless you modify the class names, possibly via namespaces. Even 36 files will probably take less time to modify than trying to reliably fix it without changing symbol names.

Start by identifying all the classes whose names need to be tweaked. Your linker probably lists them for you already. Then I would change the names of both sets of classes (from Foo to Engine::Foo and Plugin::Foo for example) at least temporarily. That way you can get the compiler to find all references to the problematic classes. Chug away at the plugin source until the plugin compiles with references to the correct new plugin class names. Once that is done, change the Engine:: classes back to their old names (unless you want to permanently modify engine source too, which it sounds like you don't). The plugin should now compile and link to the correct, uniquely named classes.
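A minimal sketch of the renaming step, with a made-up class Foo and a made-up version() member standing in for your real code:

```cpp
// Engine's copy, temporarily wrapped so that every unqualified use of Foo
// becomes a compile error that points at code needing attention.
namespace Engine {
class Foo {
public:
    static constexpr int version() { return 2; }
};
}  // namespace Engine

// Plugin's copy, with some functionality removed in the real code.
namespace Plugin {
class Foo {
public:
    static constexpr int version() { return 1; }
};
}  // namespace Plugin

// Plugin code now has to say which Foo it means.
constexpr int plugin_foo_version() { return Plugin::Foo::version(); }
```

Because the namespace is part of the mangled symbol name, the two classes no longer produce the same symbols, so the runtime linker has nothing to confuse.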

Darryl
While you are probably right about this, it's not a satisfying answer, so I'm going to leave the question unanswered for a little bit just in case. I'll modify the plugin code to add a namespace to the offending bits until I am able to rewrite it using the engine's shared libraries. Since I changed all the code in the engine, it shouldn't matter if I modify the copy.
CuppM
A: 

I would just wrap ALL of the plugin's code in a PluginX namespace. That should save you from these errors. It's a good and important practice anyway.

rmn
+1  A: 

I think I've found the solution: the linker flag -Bsymbolic. Essentially, this flag sets a marker in the shared library that tells the runtime linker to try to resolve symbol names within the library itself first. The engine was able to run with the plugin just fine in all cases (monolithic exe, exe w/ shared libs, plugin w/ and w/o wrapping the namespace) when the plugin was linked with that flag.
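A stripped-down sketch of the idea, assuming GCC with GNU ld. The file name and the function names shared_helper and plugin_run are placeholders, not the real plugin code.

```cpp
// plugin.cpp (hypothetical): a plugin whose helper has the same name as a
// function exported by the engine.
//
// Build sketch (file names are placeholders):
//   g++ -fPIC -shared -Wl,-Bsymbolic plugin.cpp -o libplugin.so
// Check that the flag was recorded in the dynamic section:
//   readelf -d libplugin.so | grep -i symbolic

// Exported with default visibility. Without -Bsymbolic, a same-named
// definition that the engine already exports could be picked instead
// when the plugin is loaded.
int shared_helper() { return 1; }

extern "C" int plugin_run() {
    // With -Bsymbolic, this reference is bound to the definition above,
    // inside the plugin itself.
    return shared_helper();
}
```

The -Wl, prefix is only there because the flag belongs to the linker and I'm invoking it through the compiler driver.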

There do seem to be a few detractors with warnings about -Bsymbolic:
http://www.technovelty.org/code/c/bsymbolic.html
http://software.intel.com/en-us/articles/performance-tools-for-software-developers-bsymbolic-can-cause-dangerous-side-effects/

But considering their warnings and what the intention of the plugin is, I think it's the right option for me. At least for now.

CuppM