When I developed a piece of (academic) software using Java, I was forced to use an API that was rather badly implemented. This means that calls to this API for a certain set of input data would sometimes never return. This must have been a bug in the software as the algorithms that it offered were deterministic ones, and sometimes it would terminate on a set of data, sometimes it would run into an infinite loop on the same set of data...

However, fixing the API or reimplementing it was simply beyond scope. I even had the source, but the API relied heavily on other APIs that were undocumented, came without source, and had by that time vanished from the web (or had never been there?). On the other hand, this "bad" API was the only one out there that solved the specific problem I had, so I really had to stick with it.

The question is: what is the cleanest way of dealing with an API that behaves this, well, nastily? When I faced this problem I decided to put the calls to the API into a separate thread. Another thread would then occasionally check whether this thread had terminated. If a certain amount of time had passed, I would kill the processing thread using Thread#stop() and start the processing again, hoping that it would return the next time. Now, I know (and knew back then) that this method is deprecated and must not be used. But in this academic context it was acceptable to have the software potentially run into an undefined state instead of having it crash.

It was also not acceptable to just ignore the processing thread that had run into an infinite loop because it did some quite CPU-intensive operations that would slow down the user's machine significantly.

Another way which I didn't try is to start the processing in a separate process instead of a thread, because a sub-process can be killed cleanly without putting the software in an inconsistent state. Or could the new SwingWorker class (which wasn't yet available) have done the job? It has a cancel() method, but the docs say that it "Attempts to cancel execution of this task", so it doesn't look like a reliable approach either.

+8  A: 

I would recommend the use of a separate process. There is essentially no safe way for one thread to kill a second thread in Java unless the second thread is periodically checking to see if it has been interrupted.
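To illustrate the caveat about interruption, here is a minimal sketch, assuming a worker loop that we control ourselves. The names `CooperativeCancel` and `runWithTimeout` are made up for this example. `Future.cancel(true)` only sets the worker's interrupt flag; the loop ends because it polls that flag.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example class, not part of any API mentioned in the question.
final class CooperativeCancel {

    /**
     * Starts a busy loop that polls its interrupt flag, waits up to
     * timeoutMillis for it, then cancels it. Returns true if the worker
     * thread noticed the interrupt and terminated.
     */
    static boolean runWithTimeout(long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);

        Future<?> future = executor.submit(() -> {
            long counter = 0;
            // The worker cooperates: it checks the flag on every iteration.
            while (!Thread.currentThread().isInterrupted()) {
                counter++; // stand-in for one slice of CPU-bound work
            }
            sawInterrupt.set(true);
        });

        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // This does not kill the thread; it only sets its interrupt flag.
            future.cancel(true);
        }

        executor.shutdown();
        boolean terminated = executor.awaitTermination(2, TimeUnit.SECONDS);
        return terminated && sawInterrupt.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stopped cooperatively: " + runWithTimeout(200));
    }
}
```

The API from the question presumably does no such polling inside its loop, so the same `cancel(true)` call would leave it spinning. That is the reason for recommending a separate process.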

The ideal solution would be to use isolates. An isolate is essentially a private virtual machine that a Java app can create, manage, and communicate with. In particular, the parent app can safely kill an isolate and all its threads.

Reference: JSR-000121 Application Isolation API Specification - Final Release

The problem is finding a JVM that supports Isolates.

Stephen C
I accepted your answer because of the interesting reference to the Isolation API. Thanks for that!
Robert Petermeier
+4  A: 

I'm a big fan of separate processes for this kind of thing.

Spawn a sub process and wait for results.

If the API is non-deterministic, put the timer thread in a wrapper that makes the bad API into a main program.

That way, the subprocess always ends within the given time. It either produces a useful result or a system exit code that indicates failure.
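A rough sketch of the parent side of this approach, assuming Java 8 or later for `Process.waitFor(long, TimeUnit)` and `destroyForcibly()`. The class name `SubprocessRunner`, the `run` method, and the idea of passing results back through an output file are assumptions for illustration; the worker command that wraps the bad API is whatever main program you write for it.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, named for this example only.
final class SubprocessRunner {

    /**
     * Runs the command as a child process and waits up to timeoutMillis.
     * Returns the exit code, or an empty OptionalInt if the child had to
     * be killed because it did not finish in time.
     */
    static OptionalInt run(List<String> command, File outputFile, long timeoutMillis)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Send stdout and stderr to a file so the child never blocks
        // on a full pipe buffer while the parent is waiting.
        builder.redirectErrorStream(true);
        builder.redirectOutput(outputFile);

        Process process = builder.start();
        if (process.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return OptionalInt.of(process.exitValue());
        }

        // The child is still running: kill it and wait until it is gone.
        // Note that this targets the direct child only, not its descendants.
        process.destroyForcibly();
        process.waitFor();
        return OptionalInt.empty();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java SubprocessRunner <command> [args...]");
            return;
        }
        File out = File.createTempFile("worker", ".out");
        OptionalInt exit = run(Arrays.asList(args), out, 5_000);
        System.out.println(exit.isPresent()
                ? "exit code " + exit.getAsInt() + ", output in " + out
                : "timed out, child was killed");
    }
}
```

With this shape the caller can retry on an empty result, much like the restart-after-timeout loop described in the question, but without leaving the parent JVM in an inconsistent state.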

S.Lott
Thanks for your answer, I think I'll try that approach.
Robert Petermeier
+1  A: 

The best thing to do would be to re-implement the API in question. However, as you say, that's a very heavyweight and probably out-of-scope solution.

The next best thing would be to wrap the API if possible. Basically, if you can determine in advance what it is about the datasets that causes failure you could reject calls to guarantee determinism. It doesn't sound like this will work for you either, as you suggest that repeating a call with the same dataset will sometimes terminate when it had infinitely looped in a prior invocation.

Given that the options above aren't available:
I think your current thread solution is the best of the bad choices. Spinning up a process for a method call seems way too heavyweight to be acceptable from a performance point of view, even if it is safer than using threads. Thread.stop() is very dangerous, but if you religiously prevent any locking you can get away with it.

Kevin Montrose
I don't quite agree with you here. There would only be at most one sub-process at a time. Within this sub-process really CPU-intensive computation would take place so I don't see why creating another process is such a big deal. I suppose you are assuming that there would be many sub-processes, but my program would always need only one. There'd be some overhead for passing and retrieving data to and from it, though, but that wouldn't be too hard to do either. Well, thanks for your answer anyway.
Robert Petermeier
My concern is with the cost of __starting__ a process, not the number of live processes at any one time. Cost of process >> cost of thread in terms of startup time. I am kind of assuming that you're making repeated calls to this API, which I feel is fair given the question as written (though perhaps it isn't true of your actual situation). How CPU-intensive the API is isn't really relevant, as you pay that cost regardless of the fix.
Kevin Montrose
+1  A: 

Both @S.Lott's and @Stephen C's answers are spot on with regard to how to handle this type of situation, but I'd like to add that in a non-academic environment, you should also be looking to replace the API as soon as practical. In situations where we've been locked into a bad API, typically through choosing a vended solution for other reasons, I've worked to replace the functionality with my own over time. Your customers are not going to be as tolerant as your professor, since they actually have to use your software (or not!) instead of merely grading it.

There are certainly situations where using duct tape is an adequate choice to solve a problem. When it results in such poor behavior as you describe, though, it's best not to rely on it for too long and to start working on a real repair.

tvanfosson
Well, I have seen (rather expensive) business software that crashed frequently and was still accepted by everybody as it was the only available software for a specific purpose. A workaround existed, too, the same as with my app: kill the process and try again ;-) I think Joel talked about software like that in the SO podcast but I can't find the episode right now. I mean really nasty software, unreliable and with a horrible UI but which still is extremely successful because it is the only solution to a really complicated problem.
Robert Petermeier
When really crappy software is the best available AND people are willing to pay for it, that's called opportunity.
tvanfosson