views:

51

answers:

2

Posit the following situation:

  • You have a large and complex system (distributed, concurrent, huge dataset) which supports many users. The code is sent to the data.
  • You want to allow mobile code in the system, i.e. untrusted code that will run within the same JVMs as the rest of the system, to take advantage of the locality of the data, avoid deserialization, etc.

You can put the code in a funny classloader, and use a customised security policy like the applet runner does. But there are still problems:

The system as a whole should be protected from malicious code, e.g. code spawning loads of threads, eating up all the CPU, or allocating too much memory.

The idea mooted at the beginning of the millennium was JSR-121. Isolates were meant to bring most of the benefits of process isolation: limits on CPU usage, thread spawning, heap usage, and resource allocation in general.

Given that this effort was seemingly abandoned by Sun, what is the closest we can currently get?

So far, my ideas are:

  • Rewrite the bytecode to insert allocation tracking. Google seem to have done something similar to this: http://code.google.com/p/java-allocation-instrumenter/ . It needs a bit of work as Google have (Joshua) Bloch-ed themselves into a corner and made all kinds of things package private...
  • Also ban calls to things the security manager can't restrict, e.g. Thread creation.
  • Insert (rare) interruption checks into loops and recursive functions so a monitor thread can watch (using ThreadMXBean), and if it takes too long, interrupt the offending thread. It might be simpler just to put limits on reentrancy - in any call into the user code, a basic block may only be entered n times before aborting.
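The last idea can be sketched in plain Java. This is only an illustration of what rewritten user code might look like, not an existing tool: `Budget`, `tick()`, `BudgetExceededException` and `untrustedLoop` are hypothetical names, and the `tick()` call is written by hand where a bytecode rewriter would insert it at loop back-edges and method entries. The monitor uses `ThreadMXBean.getThreadCpuTime` and falls back to wall-clock time if the JVM does not report per-thread CPU time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class BudgetSketch {
    /** Thrown when the untrusted code exceeds its budget or is interrupted (hypothetical). */
    static final class BudgetExceededException extends RuntimeException {
        BudgetExceededException(String msg) { super(msg); }
    }

    /** Hypothetical per-call step budget; tick() is what a rewriter would inject. */
    static final class Budget {
        private long remaining;
        Budget(long steps) { this.remaining = steps; }

        void tick() {
            if (--remaining < 0) {
                throw new BudgetExceededException("step budget exhausted");
            }
            // Rare interruption check: only every 1024 ticks, to keep overhead low.
            if ((remaining & 1023) == 0 && Thread.interrupted()) {
                throw new BudgetExceededException("interrupted by monitor");
            }
        }
    }

    /** Stand-in for user code after rewriting: an endless loop with an injected tick(). */
    static void untrustedLoop(Budget budget) {
        while (true) {
            budget.tick();
        }
    }

    /** Returns true if the code finished within its step budget, false if it was aborted. */
    static boolean runWithBudget(long steps) {
        try {
            untrustedLoop(new Budget(steps));
            return true;
        } catch (BudgetExceededException e) {
            return false;
        }
    }

    /** Runs the loop on a worker thread and interrupts it once it has used too much CPU time. */
    static String runMonitored(long cpuLimitNanos) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        String[] outcome = {"finished"};
        Thread worker = new Thread(() -> {
            try {
                untrustedLoop(new Budget(Long.MAX_VALUE));
            } catch (BudgetExceededException e) {
                outcome[0] = e.getMessage();
            }
        });
        worker.start();
        long wallStart = System.nanoTime();
        try {
            while (worker.isAlive()) {
                long cpu = mx.isThreadCpuTimeSupported() ? mx.getThreadCpuTime(worker.getId()) : -1;
                long used = cpu >= 0 ? cpu : System.nanoTime() - wallStart; // wall-clock fallback
                if (used > cpuLimitNanos) {
                    worker.interrupt();
                }
                Thread.sleep(5);
            }
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            worker.interrupt();
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println(runWithBudget(1_000));       // aborted by the step budget
        System.out.println(runMonitored(50_000_000L));  // aborted by the monitor after ~50 ms of CPU
    }
}
```

The step budget is deterministic and needs no second thread, which is why it may be the simpler of the two options; the monitor variant only works if the injected checks are frequent enough that an interrupt is noticed promptly.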

Are there any better or existing ways to do this?

+1  A: 

This is a complex question. My first thought was to create a domain-specific language that does what the 'mobile' users need. The DSL would have no capability to perform a dangerous operation.

Who are the people that would be uploading untrusted code? This sounds like a dubious idea to begin with. We spend a lot of effort making sure people can't run untrusted code ;-)

Tony Ennis
A DSL is a huge can of worms. If it's internal, you have these problems anyway. If it's external, you have to actually design the language - language design is hugely harder than bytecode manipulation! You also have to implement a good enough translation to bytecode to let HotSpot actually work. And anyway, designing a DSL which "can't" cause resource leaks is halting problem territory. This part of it has to be done at runtime. So I'd prefer to use an existing language that people know and that has decent tools, as a DSL solves basically none of the actual issues.
rjw
As to the "dubious idea" jibe, what on earth do you think the point of the security manager was in the first place? It was for running untrusted code.
rjw
A: 

The problem is that the only real way to isolate a process is to have a dedicated machine/hardware. Anything else you do, you will have to make compromises. Whether it is practical to share a JVM with that code depends on which compromises are acceptable.

This is not a problem you can solve for the general case in a trivial manner, because you want to protect against things you haven't thought of (which others might think of one day).

Peter Lawrey
This is a bit of an unsatisfying answer, basically an argument from personal incredulity. I think you need to reread the question, and find an actual reason why the approach wouldn't work... Your complaint, if it were valid, would prove that type systems, memory protection, capability based security, and all other protection mechanisms don't work. None of these systems protect against things going wrong that their authors haven't thought of. Tbh, this stuff isn't even new - DBMS have been resource constraining queries since the seventies. There's nothing magic going on here.
rjw
Yes, there are techniques to minimise the impact between well-behaved systems. However, to protect against "malicious code" you need a separate system. For example, I can write a simple program in Java which will write data to disk in a way that will bring a system to its knees; even ssh or smtp will find the system unusably slow. You can use the security manager to prevent this and many other operations until the malicious code cannot actually do anything, at which point it will have no impact. Basically, with malicious code, anything you allow it to do can be used against you.
Peter Lawrey
A simple way to slow a machine is to create objects endlessly and trigger lots of GCs. The thread's total memory need not be large, but this can slow your JVM dramatically. Say you disabled all object creation (making the code pretty unusable): the malicious code could still call methods like File.length(), which creates objects in the JNI call and, called often enough, can generate 400 MB/s of garbage. Note: this kind of activity can slow not only the JVM but also measurably impact the performance of the whole machine.
Peter Lawrey
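Allocation pressure of the kind described in the comment above can at least be observed per thread without bytecode rewriting. The sketch below assumes a HotSpot-style JVM that exposes `com.sun.management.ThreadMXBean`; that interface is JDK-specific, so the code returns -1 where it is unavailable. `AllocationProbe`, `allocatedBy` and `sink` are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;

public class AllocationProbe {
    // Volatile sink so the JIT cannot optimise the allocations away.
    static volatile Object sink;

    /**
     * Returns the bytes allocated by the current thread while running the task,
     * or -1 if this JVM does not expose a per-thread allocation counter.
     */
    static long allocatedBy(Runnable task) {
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return -1; // assumption: only HotSpot-style JVMs provide this extension
        }
        com.sun.management.ThreadMXBean mx = (com.sun.management.ThreadMXBean) base;
        if (!mx.isThreadAllocatedMemorySupported() || !mx.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = mx.getThreadAllocatedBytes(id);
        task.run();
        return mx.getThreadAllocatedBytes(id) - before;
    }

    public static void main(String[] args) {
        long bytes = allocatedBy(() -> {
            for (int i = 0; i < 1000; i++) {
                sink = new byte[1024]; // short-lived garbage, small live set
            }
        });
        System.out.println("allocated bytes: " + bytes);
    }
}
```

A monitor thread could poll the same counter for a worker thread's id and compare the allocation rate against a quota. This only measures allocation; it does not by itself stop the offending code.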
The real answer is to protect yourself from accidental or complacent programming using technical means, which I agree with @rjw is fairly easy to do and well understood, and to have legal or contractual protections against malicious code. (Again, common practice.)
Peter Lawrey