views:

736

answers:

9

I want to create a Web app which would allow the user to upload some C code, and see the results of its execution (the code would be compiled on the server). The users are untrusted, which obviously has some huge security implications.

So I need to create some kind of sandbox for the apps. At the most basic level, I'd like to restrict access to the file system to some specified directories. I cannot use chroot jails directly, since the web app is not running as a privileged user. I guess a suid executable which sets up the jail would be an option.

The uploaded programs would be rather small, so they should execute quickly (a couple of seconds at most). Hence, I can kill the process after a preset timeout, but how do I ensure that it doesn't spawn new processes? Or if I can't, is killing the entire pgid a reliable method?

What would be the best way to go about this - other than "don't do it at all"? :) What other glaring security problems have I missed?

FWIW, the web app will be written in Python.

+1  A: 

There's a tool called strace - it monitors system calls made by a given process. You just need to watch out for specific calls suggesting 'illegal' function access. AFAIK, it's the method used in programming competitions to sandbox contestants' programs.

Mike Hordecki
There exist many ptrace-based sandboxes out there, like UMView. strace lets the program run normally, just with extra logging; that's not enough for sandboxing.
ephemient
@ephemient: You should write a proper answer - UMview in some form seems to fit the questioners needs perfectly.
Chris Kaminski
+3  A: 

I'd say this is extremely dangerous on many levels. You're essentially opening yourself up to any exploit that can be found on your system (whereas you're normally limited to the ones people can exploit remotely). I'd say don't do it if you can avoid it.

If you do want to do it, you might want to use some kind of virtual machine to run the user's code. Using something like KVM it's possible to set up a number of virtual machines using the same base image (you can even store a snapshot in an already-booted state, though I'm not sure how it will handle being cloned). You can then create the VMs on demand, run the user's code, return the results, and then kill off the VM. If you keep the VMs isolated from each other and the network, the users can wreak any havoc they want and it won't hurt your physical server. The only danger you're exposing yourself to under these conditions would be some kind of exploit that allows them to escape from the VM... those are extremely rare, and will be more rare as hardware virtualization improves.

rmeador
A: 

About the only chance you have is running a VirtualMachine and those can have vulnerabilities. If you want your machine hacked in the short term just use permissions and make a special user with access to maybe one directory. If you want to postpone the hacking to some point in the future then run a webserver inside a virtual machine and port forward to that. You'll want to keep a backup of that because you'll probably have that hacked in under an hour and want to restart a fresh copy every few hours. You'll also want to keep an image of the whole machine to just reimage the whole thing once a week or so in order to overcome the weekly hackings. Don't let that machine talk to any other machine on your network. Blacklist it everywhere. I'm talking about the virtual machine and the physical machine IP addresses. Do regular security audits on any other machines on your other machines on the network. Please rename the machines IHaveBeenHacked1 and IHaveBeenHacked2 and prevent access to those in your hosts lists and firewalls.

This way you might stave off your level of hackage for a while.

dude
+1  A: 

Although it is still in development, and not yet considered secure, you might check out the technology behind Google Native Client. It is designed to allow untrusted native code to be run in a web browser, but could probably be adapted for use on a web server. You might use something like this on top of other techniques such as a virtual machine, for additional security.

mark4o
+1  A: 

On Fedora 11, there is the SELinux Sandbox which seems to do exactly what you want (except perhaps limiting spawning new processes; the linked blog post does not mention that).

Of course, there is always the risk of kernel bugs; even with SELinux, parts of the kernel are still exposed to all processes.

CesarB
+1  A: 

The few details you provide imply that you have administrative control over the server itself, so my suggestion makes this assumption.

I'd tackle this as a batch system. The web server accepts an upload of the source file, a process polls the submission directory, processes the file, and then submits the result to another directory which the web application polls until it finds the result and displays it.

The fun part is how to safely handle the execution.

My OS of choice is FreeBSD, so I'd set up a pre-configured jail (not to be confused with a vanilla chroot jail) that would compile, run, and save the output. Then, for each source file submission, launch a pristine copy of the jail for each execution, with a copy of the source file inside.

Provided that the jail's /dev is pruned down to almost nothing, system resource limits are set safely, and that the traffic can't route out of the jail (bound to unroutable address or simply firewalled), I would personally be comfortable running this on a server under my care.

Since you use Linux, I'd investigate User Mode Linux or Linux-VServer, which are very similar in concept to FreeBSD jails (I've never used them myself, but have read about them). There are several other such systems listed here.

This method is much more secure than a vanilla chroot jail, and it is much more light-weight than using full virtualization such as qemu/kvm or VMware.

I'm not a programmer, so I don't what kind of AJAX-y thing you could use to poll for the results, but I'm sure it could be done. As an admin, I would find this a fun project to partake in. Have fun. :)

Geoff Fritz
The danger to jails or VServer is that a kernel bug (such as the splice vulnerability of a while back) still renders the host vulnerable. Likewise, all virtualization methods (including UML) have had jail-breaking security bugs too...
ephemient
Very true. However, exploits are an inevitable part of the game. Software can always be broken (eventually). This whole concept (allowing untrusted code to run on one's server) is not for the faint of heart. The OP stated that he wouldn't take "don't do it at all" for an answer, so I outlined one of several options.
Geoff Fritz
+2  A: 

Along with the other sugestions you might find this useful.

http://www.xs4all.nl/~weegen/eelis/geordi/

This is from http://codepad.org/'s about page.

Sweeney
Thanks. While googling around, I actually stumbled accross the same site in an earlier stackoverflow post:http://stackoverflow.com/questions/818402/letting-users-upload-python-scripts-for-executionI'd love to rip it off, but it doesn't seem that codepad is open source. So I indent to take a similar approach. Systrace/ptrace supervisor + chroot jail + <some sort of OS-level virtualization>.Also big thanks to everyone else for suggestions, very helpful indeed. Another interesting read:http://crypto.stanford.edu/cs155/lectures/06-sandboxing.ppt
oggy
A: 

See this page on Google Chrome's sandboxing methods for Linux. As you can see, there are plenty of methods, but none of them are great for a distributable application like Chrome because some distros might not include them. This is not really a problem for a web application though, because you can control what is installed on your server.

Personally, my favorite is Seccomp, because it has a very low overhead compared to other tools like ptrace (switch address spaces on every syscall!) or KVM (big memory hungry virtual machine), and it is incredibly simple compared to tools like SELinux (and therefore more likely to be secure).

Zifre
You could link the user-submitted C code into a library and load it before seccomp, but that's unsafe... you can't `exec` *after* seccomp, because that kills your process... maybe you could write your own linker that loads an image from FD 0 and jumps to it? Sadly, not so easy.
ephemient
@ephemient: Link library with C code. Control process starts. Control process forks and `exec`s program. Library runs first, opens message queue with control process, starts Seccomp.
Zifre
If the library contains an _init symbol or function marked with __attribute__((constructor)), it will be immediately when loaded. There's no way to "pause" the library loading, switch seccomp on, and then allow the library loading to continue.
ephemient
Hmm... I'll have to look at my old project (I used Seccomp for something similar a while ago, and I definitely don't recall setting it up being that complex...)
Zifre
s/immediately when/immediately run when/ You may not have realized that it was possible for libraries to run code just by being loaded, without being called back into or anything like that.
ephemient
A: 

I think your solutions must concentrate on analyzing the source code. I don't know any tools, and I think this would be pretty hard with C, but, for example, a Pascal program which doesn't include any modules would be pretty harmless in my opinion.

ilya n.