I want to create a shared library that uses functions from a 3rd-party static library, for example foo and bar from libfoobar.a. I know that my main application also uses foo and will be exporting that symbol. So I simply want to link in bar to save code size and leave foo unresolved (as it will be provided by the main application). If I include libfoobar.a, the linker ld will include both functions in my shared library. If I don't include libfoobar.a, my library will not have access to function bar, because the application itself is not linking in bar. Questions:

  • Is there a way to tell ld to only resolve certain symbols when building the shared library?
  • Turn libfoobar.a into a shared library?
  • Extract file containing function bar from libfoobar.a and specify that on the linker line?
  • Don't worry about it, because the run-time loader will use foo from the application, so the copy of foo in the shared library will not be loaded?
A: 

I'm not the biggest expert on shared libraries, so I may be wrong here!

If I'm guessing right about what you're trying to do, just link your shared lib against libc.so. You don't want an extra copy of sscanf embedded in your library.

I answered your questions before I had quite figured out what you were getting at, in case you're interested in the answers.

Is there a way to tell ld to only resolve certain symbols when building the shared library?

Only extern (non-static) functions and variables go in the shared library's symbol table.

When you build your shared library, any symbols not found in objects on the linker command line will remain unresolved. If the linker complains about that, you probably need to link your shared lib against shared libc. You can have shared libs that depend on other shared libs, and ld.so can deal with the dependency chains.

If I had more rep, I'd ask this as a comment: Do you have a customized version of sprintf/sscanf, or would it be ok for your shared lib to use the implementation in -lc? If -lc is fine, then my answer probably solves your problem. If not, then you need to build your shared lib out of objects that only have the functions you need. i.e. don't link it against /usr/lib/libc.a.

Maybe I'm getting confused by your "libc.a (not actually the 'real' libc)" line. /usr/lib/libc.a is really glibc (on Linux). It's a statically linked copy of the same code in libc.so. Unless you're talking about your own libc.a (which is what I was thinking at first)...

Turn libc.a into a shared library?

You probably can, but don't, because it's probably not compiled as position-independent code, so it would require a lot of relocations by ld.so at run time.

Extract sscanf from libc.a and specify that on the linker line?

It may be possible. Run ar t /usr/lib/libc.a to list the contents. (ar's args are similar to tar's. tar was ar for tapes.... Old school Unix here.) It's probably not that easy, though, because sscanf probably depends on symbols in other .o files in the .a.

Peter Cordes
Sorry about the libc confusion. I simply meant any 3rd-party static library and used libc as an example. I'm going to modify my question to clarify this.
KlaxSmashing
A: 

Answering your revised, clearer question.

Keep in mind that normally the point of a shared lib is that multiple programs can link against it. So your optimization of using the main program's symbol for a function you need will only work if the main program always provides that symbol (via a static lib or otherwise). This is not usually what people want to do.

If it's just a couple of small functions, you should probably let it be. You'll probably end up with two copies of the code for the functions, one in your shlib and one in the main program. If they're small (or at least not huge), or not called often and not performance-critical, then the code-size / I-cache hit from having two copies isn't something to worry about. (Translation: I don't know how to avoid it off the top of my head, so I might not take the time to look it up and make a more complex Makefile to avoid it.)

See my other answer for some comments on messing around with ar to extract stuff from a static library. Summary: probably non-trivial, since you don't know the dependencies between the various .o files in the .a.

It may be possible to do what you're hoping for by having your shared library export the symbols that it pulls in from the static library. Then, when you link the main app, put your shared library before the static lib on the linker command line. ld will find "foo" in your shlib, and use that copy (if this re-exporting trick is possible), but for "bar" it will have to include a copy from the static lib.

ld --export-dynamic may be what you need to export all symbols in the dynamic symbol table. Try that. And search for "export" in the docs/man page. "export" is the jargon for making a symbol visible in a library. --export-all-symbols is in the i386 PE (Windows DLL) section, otherwise it would probably do the trick.

Peter Cordes
Noticed something in the ld man page: --just-symbols=filename: "Read symbol names and their addresses from filename, but do not relocate it or include it in the output. This allows your output file to refer symbolically to absolute locations of memory defined in other programs."
Peter Cordes
Since the shared library is a 'plugin' (i.e., not always loaded), it cannot provide symbols for other code, especially the main application. The easiest option is probably to turn the 3rd-party static library into a dynamic library.
KlaxSmashing
A: 

The following points attempt to answer the questions I had posed:

  • ld does not seem to allow you to omit linking in certain symbols from a static library. The usage of --just-symbols or --undefined (or the EXTERN linker script command) will not prevent ld from linking the symbols.
  • To convert a static library, libfoobar.a, into a shared one, libfoobar.so.1.0, exporting all visible symbols, use the command below. You can also use --version-script and other methods to export only a subset of symbols.

    ld -shared -soname libfoobar.so.1 -o libfoobar.so.1.0 --whole-archive libfoobar.a --no-whole-archive

  • It is better to delete archive members from a copy of your static library than it is to extract them because there may be internal dependencies you have to manage. For example, assuming you are exporting all symbols, you can generate a map file from your main executable. You can then grep for all the archive members that the executable pulled in from the copy of the static library and delete them from the copy. So when your DSO is linking in the static library, it will leave the same symbols unresolved.

  • It is possible to specify your main executable as a shared library for your DSO if you build the executable as a position-independent executable (the -pie option). Your DSO will link first to your executable if it precedes the static library in the link command. The caveat is that the main executable must be available via LD_LIBRARY_PATH or -rpath. Furthermore, using strace reveals that, since the executable is a dependency of your library, it is loaded again when your DSO loads.

    ld -shared -rpath '$ORIGIN' -L. -lc -ldl -o DSO.so DSO.o app libfoobar.a

  • The dynamic linker will use the executable's version of foo first unless you call dlopen() with the RTLD_DEEPBIND flag. Using strace reveals that the entire DSO is file-mapped into memory with mmap2(). However, Wikipedia claims that for mmap "The actual reads from disk are performed in "lazy" manner, after a specific location is accessed." If this is true, then the duplicate foo will not be loaded. Note that the override only happens if your DSO exports the function foo. Otherwise, the function foo that was statically linked into your DSO will be used whenever your DSO calls foo.

In conclusion, if mmap() uses a lazy read, then the best solution is to link your DSO in the normal manner and let the dynamic linker and Linux take care of the rest.

KlaxSmashing