The (for now hypothetical) situation: a user of my system hands it a chunk of C code, and my system has to compile it and run it in a chroot sandbox that is generated on the fly. I want the sandbox to contain as few files as possible. I'm only willing to play with compiler and linker settings (e.g. statically link everything I can expect to find) and to put some moderate restrictions on what the code can expect to use (e.g. it can't use arbitrary libs).

The question is: how simple can I make the sandbox? Clearly I need the executable, but what about an ELF loader and a .so for the system calls? Can I drop either of them, and is there anything else I'll need?

A: 

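There's no need for a separate ELF loader: the kernel loads a statically linked executable by itself, and the dynamic loader (ld.so) is only needed for dynamically linked ones. To check which dynamic libraries you need, run ldd <executable>. If you manage to compile everything statically, it won't need any .so. Beyond that, it's only about the data and directory structure your program might need.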
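If you'd rather not run anything at all to find out, one option is to look at the ELF program headers: a dynamically linked executable carries a PT_INTERP segment naming its loader, a static one doesn't. Below is a minimal sketch of that check, assuming Linux's <elf.h>, a 64-bit ELF file, and the same byte order as the host. The function name needs_dynamic_loader is made up for this example.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the 64-bit ELF file at `path` has a PT_INTERP segment
 * (it asks for a dynamic loader), 0 if it has none, and -1 on error or
 * if the file is not a 64-bit ELF. The file is only read, never run.
 * Assumption: the file's byte order matches the host's. */
int needs_dynamic_loader(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    Elf64_Ehdr eh;
    int result = -1;

    if (fread(&eh, sizeof eh, 1, f) != 1)
        goto out;
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        goto out;
    if (eh.e_phentsize != sizeof(Elf64_Phdr))
        goto out;

    result = 0;
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);

        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;    /* truncated or unreadable header table */
            break;
        }
        if (ph.p_type == PT_INTERP) {
            result = 1;     /* a loader path is recorded in the file */
            break;
        }
    }
out:
    fclose(f);
    return result;
}
```

A result of 0 for your compiled program suggests it can start without ld.so in the sandbox; it says nothing about files the program opens at run time.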

But all this applies only if you use the /usr/bin/chroot command. If you make your program call int chroot(const char *path); itself after making sure all dynamic libraries are loaded, then you won't need anything in the sandbox directory, not even the executable itself.
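A rough sketch of the "call chroot() yourself" approach, assuming Linux/glibc and that the process starts as root (chroot() needs CAP_SYS_CHROOT). The helper name enter_jail and its uid/gid parameters are invented for this example; dropping privileges afterwards is my addition, since a process that stays root can usually get out of a chroot again.

```c
#define _DEFAULT_SOURCE     /* for the chroot() declaration on glibc */
#include <sys/types.h>
#include <unistd.h>

/* Confine the calling process to `dir`, then switch to an unprivileged
 * uid/gid. Returns 0 on success and -1 on the first failing step
 * (errno is left as set by that step). */
int enter_jail(const char *dir, uid_t uid, gid_t gid)
{
    if (chroot(dir) != 0)
        return -1;
    if (chdir("/") != 0)        /* make sure the cwd is inside the jail */
        return -1;
    if (setgid(gid) != 0)       /* drop the group first, while still root */
        return -1;
    if (setuid(uid) != 0)
        return -1;
    return 0;
}
```

The program would call something like enter_jail("/path/to/empty/dir", some_uid, some_gid) at the top of main(), before touching any untrusted input. The directory can be completely empty because the executable and its libraries are already mapped by then.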

Edit: a different idea: use TCC (or rather, libtcc) to compile, link, load and run the given C chunk. Run the whole process inside an 'outer' chroot jail, dropping to an 'inner' (empty) one just before execution. (Of course, execute in a fork(), or you won't be able to get back from the 'inner' jail to the 'outer' one.) You might also take advantage of libtcc's bounds-checked execution.
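The fork() part of that idea might look roughly like this. It is only a sketch: the libtcc calls are left out, sample_job is a made-up stand-in for whatever entry point the compiled chunk exposes, and run_in_child is a name invented here. Passing NULL as the jail skips the chroot() so the code can be tried without root.

```c
#define _DEFAULT_SOURCE     /* for the chroot() declaration on glibc */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the entry point of the compiled user code. */
int sample_job(void)
{
    return 7;
}

/* Runs job() in a forked child. If inner_jail is non-NULL, the child
 * first chroots into it, so only the child is stuck in the inner jail.
 * Returns the child's exit status (0..255), or -1 if fork/wait failed
 * or the child was killed by a signal. */
int run_in_child(int (*job)(void), const char *inner_jail)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: enter the inner jail, run the job, never return. */
        if (inner_jail && (chroot(inner_jail) != 0 || chdir("/") != 0))
            _exit(127);
        _exit(job() & 0xff);
    }

    /* Parent: still in the outer jail, just collects the result. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent never calls chroot() for the inner jail, so it can go on to compile and run the next chunk.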

Javier
You **must not** run ldd on an untrusted executable! ldd actually *runs the executable*.
derobert
well, it's not an executable; it's C source code that he compiles himself. of course, that doesn't make it trusted, but he could insert the chroot() call and compile statically. any well-behaved source should run without anything on the jail; if it still needs anything else, i wouldn't call it well-behaved.
Javier
I was thinking of having the program get loaded post chroot.
BCS
@Javier: Calling `getprotobyname` requires /etc/protocols; `getservbyname` requires /etc/services; `getipnodebyname` requires /etc/hosts and /etc/resolv.conf or similar. All of them require /etc/nsswitch.conf.
derobert
@derobert: i don't see any of these in the question
Javier
@Javier: that was in response to your comment "any well-behaved source should run without anything on the jail; if it still needs anything else, i wouldn't call it well-behaved.". I'm just giving examples of files that a well-behaved program may well need.
derobert
@derobert: ah, i see. i was referring to the link requirements. of course, there can be lots of data requirements, such as those you mention.
Javier
+2  A: 

You don't need anything except the executable to run a statically-linked hello world. You will, of course, need a lot more to compile it.

You can test this fairly easily. I did so with the following trivial C code:

#include <stdio.h>
int main(void) {
    puts("Hello, world");  /* puts() appends the newline itself */
    return 0;
}

Compile it with gcc -static (for example, gcc -static -o hello hello.c, assuming the source is saved as hello.c). Then make a new directory (I called it "chroot-dir") and move the output ("hello") into it, so that the only file in the chroot is the executable. Then run chroot chroot-dir ./hello as root, and you'll get Hello, world.

Note that some things cannot be linked statically. For example, if your program does authentication (through PAM), PAM modules are always loaded dynamically. Also note that certain calls need various files in /etc: any of the getpw* and getgr* functions, the domain name resolution functions, etc. will require nsswitch.conf (and some shared objects, and maybe more config files, and sometimes even more executables, depending on the lookup methods configured). /etc/hosts, /etc/services, and /etc/protocols will probably be quite useful for any networking.
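To see the /etc dependency for yourself, a small probe like the following can be compiled statically and run both outside and inside the chroot. It's just a sketch; protocol_number is a name I made up, and what exactly happens in the jail depends on your libc and its NSS configuration.

```c
#include <netdb.h>

/* Returns the protocol number for `name` (e.g. 6 for "tcp" when the
 * lookup works), or -1 if the lookup fails. The lookup normally
 * consults /etc/protocols, via whatever nsswitch.conf selects. */
int protocol_number(const char *name)
{
    struct protoent *p = getprotobyname(name);
    return p ? p->p_proto : -1;
}
```

On an ordinary system with /etc/protocols present I'd expect protocol_number("tcp") to give 6. In a chroot that contains only the executable I'd expect -1, because the file isn't there to be read.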

One easy way to figure out what files a program uses is to run it under strace. You must trust the program first, of course.

derobert
Very nice answer! I wonder what it would take to normalize what can be used across systems: that only functions that are universally accessible are ever accessible.
BCS