private String indexPage(URL currentPage) throws IOException {
    String content = "";
    is = currentPage.openStream();
    content = new Scanner( is ).useDelimiter( "\\Z" ).next();
    return content;
}

This is the function I'm currently using to crawl webpages. The line that causes the problem is:

content = new Scanner( is ).useDelimiter( "\\Z" ).next();

If the webpage doesn't respond or takes a long time to respond, my thread just hangs at the line above. What's the easiest way to abort this function if it takes longer than 5 seconds to fully load the stream?

Thanks in advance!

+3  A: 

You can close the stream from another thread.
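
A rough sketch of that idea, assuming a scheduled executor is acceptable as the "other thread". The names StreamWatchdog, closeAfter and readIsAbortedByClose are made up for illustration, and the demo reads from a local socket whose peer never sends anything rather than from a real web server. Whether close() actually unblocks a pending read depends on the stream implementation; plain sockets do, other stream types may not.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class StreamWatchdog {

    // Hypothetical helper: schedules resource.close() on the timer thread after the delay.
    static ScheduledFuture<?> closeAfter(ScheduledExecutorService timer, Closeable resource,
                                         long delay, TimeUnit unit) {
        return timer.schedule(() -> {
            try {
                resource.close();
            } catch (IOException ignored) {
                // Nothing useful to do if closing fails.
            }
        }, delay, unit);
    }

    // Demo: connects to a local server that never sends data and reports
    // whether the blocked read was aborted by the watchdog closing the stream.
    static boolean readIsAbortedByClose(long timeoutMillis) throws IOException {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        InetAddress loopback = InetAddress.getByName("127.0.0.1");
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentServer.getLocalPort())) {
            InputStream in = client.getInputStream();
            ScheduledFuture<?> watchdog = closeAfter(timer, in, timeoutMillis, TimeUnit.MILLISECONDS);
            try {
                in.read();               // blocks until data arrives or the stream is closed
                watchdog.cancel(false);  // data arrived in time, so the watchdog is not needed
                return false;
            } catch (IOException e) {
                return true;             // the read was aborted because the stream was closed
            }
        } finally {
            timer.shutdownNow();
        }
    }
}
```

In the crawler you would schedule the close right after openStream() and cancel the watchdog once the Scanner has returned.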

Tom Hawtin - tackline
A: 

Try to interrupt the thread; many blocking calls in Java will continue when they receive an interrupt.

In this case, content should be empty and Thread.isInterrupted() should be true.

Aaron Digulla
Usually I/O is not interruptible like that. Some versions of the Sun JRE on Solaris used to do it, but I don't know of any others that enable it by default.
Tom Hawtin - tackline
+1  A: 

Have a look at FutureTask...
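
A minimal sketch of how that could look, assuming the page download is wrapped in a Callable. TimedCall and callWithTimeout are hypothetical names, not part of the JDK. Note that this only stops the caller from waiting: cancel(true) merely interrupts the worker, so a read that ignores interrupts may keep running in the background (hence the daemon thread).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedCall {

    // Runs the task on its own thread and waits at most the given time for the result.
    // Returns the fallback value if the task has not finished by then.
    static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit, T fallback)
            throws InterruptedException, ExecutionException {
        FutureTask<T> future = new FutureTask<>(task);
        Thread worker = new Thread(future, "timed-call");
        worker.setDaemon(true);  // a stuck worker must not keep the JVM alive
        worker.start();
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true);  // interrupts the worker; blocking I/O may not react to it
            return fallback;
        }
    }
}
```

For the question's function, the call might look like callWithTimeout(() -> indexPage(url), 5, TimeUnit.SECONDS, ""), with an empty string standing in for a page that timed out.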

pgras
+7  A: 

Instead of struggling with a separate watcher thread, it might be enough for you (although not exactly an answer to your requirement) if you enable connect and read timeouts on the network connection, e.g.:

URL url = new URL("...");
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setConnectTimeout(5000);
conn.setReadTimeout(10000);
InputStream is = conn.getInputStream();

This example will fail if it takes more than 5 seconds (5000 ms) to connect to the server, or if you have to wait more than 10 seconds (10000 ms) between any two chunks of content that are actually read. It does not, however, limit the total time needed to retrieve the page.
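
Applied to the function from the question, this could look roughly like the sketch below. The class name TimeoutFetcher, the two timeout parameters and the UTF-8 charset are assumptions, not something the question specifies. One detail worth knowing: Scanner swallows IOExceptions from the underlying stream, so a read timeout in the middle of the body would otherwise look like a normal end of input; the sketch checks ioException() to surface it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;

class TimeoutFetcher {

    // Variant of indexPage with connect and read timeouts (both in milliseconds).
    static String indexPage(URL currentPage, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        URLConnection conn = currentPage.openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        // Assumption: the page is UTF-8 encoded.
        try (InputStream is = conn.getInputStream();
             Scanner scanner = new Scanner(is, "UTF-8").useDelimiter("\\Z")) {
            // An empty response has no token, so guard against NoSuchElementException.
            String content = scanner.hasNext() ? scanner.next() : "";
            // Scanner hides I/O errors (including read timeouts); rethrow them here.
            IOException readError = scanner.ioException();
            if (readError != null) {
                throw readError;
            }
            return content;
        }
    }
}
```

A call such as indexPage(url, 5000, 10000) then throws a java.net.SocketTimeoutException instead of hanging when the server stalls, which the crawler can catch and treat as a failed page.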

jarnbjo
thanks, that was exactly what I was looking for!
ndee
+3  A: 

Google's recently released guava-libraries have some classes that offer similar functionality:

TimeLimiter:

Produces proxies that impose a time limit on method calls to the proxied object. For example, to return the value of target.someMethod(), but substitute DEFAULT_VALUE if this method call takes over 50 ms, you can use this code ...

matt b