My application processes URLs entered manually by users. I have discovered that some malformed URLs (like 'http:/not-valid') cause a NullPointerException to be thrown when the connection is opened. As I learned from this Java bug report, the issue is known and will not be fixed. The suggestion is to use java.net.URI, which is "more RFC 2396-conformant".

The question is: how do I use URI to work around the problem? The only thing I can do with URI is parse the string and generate a URL from it. I have prepared the following program:

import java.net.*;

public class Test
{
    public static void main(String[] args)
    {
       try {
           URI uri = URI.create(args[0]);
           Object o = uri.toURL().getContent(); // try to get content
       }
       catch(Throwable e) {
           e.printStackTrace();
       }
    }
}

Here are the results of my tests (with Java 1.6.0_20), which are not much different from what I get with java.net.URL:

sh-3.2$ java Test url-not-valid
java.lang.IllegalArgumentException: URI is not absolute
        at java.net.URI.toURL(URI.java:1080)
        at Test.main(Test.java:9)
sh-3.2$ java Test http:/url-not-valid
java.lang.NullPointerException
        at sun.net.www.ParseUtil.toURI(ParseUtil.java:261)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:795)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:726)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1049)
        at java.net.URLConnection.getContent(URLConnection.java:688)
        at java.net.URL.getContent(URL.java:1024)
        at Test.main(Test.java:9)
sh-3.2$ java Test http:///url-not-valid
java.lang.IllegalArgumentException: protocol = http host = null
        at sun.net.spi.DefaultProxySelector.select(DefaultProxySelector.java:151)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:796)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:726)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1049)
        at java.net.URLConnection.getContent(URLConnection.java:688)
        at java.net.URL.getContent(URL.java:1024)
        at Test.main(Test.java:9)
sh-3.2$ java Test http:////url-not-valid
java.lang.NullPointerException
        at sun.net.www.ParseUtil.toURI(ParseUtil.java:261)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:795)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:726)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1049)
        at java.net.URLConnection.getContent(URLConnection.java:688)
        at java.net.URL.getContent(URL.java:1024)
        at Test.main(Test.java:9)
+1  A: 

If I run your code with the type of malformed URI from the bug report, URI.create throws an IllegalArgumentException caused by a URISyntaxException. So the suggested fix does address the reported error.

$ java -cp bin UriTest http:\\\\www.google.com\\
java.lang.IllegalArgumentException
    at java.net.URI.create(URI.java:842)
    at UriTest.main(UriTest.java:8)
Caused by: java.net.URISyntaxException: Illegal character in opaque part at index 5: http:\\www.google.com\
    at java.net.URI$Parser.fail(URI.java:2809)
    at java.net.URI$Parser.checkChars(URI.java:2982)
    at java.net.URI$Parser.parse(URI.java:3019)
    at java.net.URI.<init>(URI.java:578)
    at java.net.URI.create(URI.java:840)

Your type of malformed URI is different, and does not appear to be a syntax error.
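
To see why the parser accepts this input, it can help to print the components that java.net.URI extracts from it. The sketch below is illustrative only: the class name UriPartsDemo, the describe helper, and the example.com URL are made up for this example.

```java
import java.net.URI;

public class UriPartsDemo {

    /** Formats the scheme, host and path that java.net.URI extracts from the input. */
    static String describe(String input) {
        URI uri = URI.create(input);
        return "scheme=" + uri.getScheme()
                + " host=" + uri.getHost()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Single slash: parsed as a scheme plus an absolute path, with no authority.
        System.out.println(describe("http:/url-not-valid"));
        // prints: scheme=http host=null path=/url-not-valid

        // Well-formed URL for comparison.
        System.out.println(describe("http://example.com/index.html"));
        // prints: scheme=http host=example.com path=/index.html
    }
}
```

For the single-slash input the scheme and path are populated and only the host is null, so the string is syntactically acceptable as a URI and no URISyntaxException is raised.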

Instead, catch the null pointer exception and recover with a suitable message.

You could try to be friendly and check whether the input starts with "http:/" followed by only a single slash, then suggest the correction to the user. Alternatively, you can check whether the hostname of the URI is non-null:

import java.net.*;

public class UriTest
{
    public static void main ( String[] args )
    {
        try {
            URI uri = URI.create ( args[0] );

            // avoid null pointer exception
            if ( uri.getHost() == null )
                throw new MalformedURLException ( "no hostname" );

            URL url = uri.toURL();
            URLConnection s = url.openConnection();

            s.getInputStream();
        } catch ( Throwable e ) {
            e.printStackTrace();
        }
    }
}
Pete Kirkham
I do not want to check for every possible problem with a URL manually, since my understanding is that this is exactly what URL/URI should do for me (if I am wrong here, that may be a hint about what I should do). Catching a runtime exception here is an ugly hack that I would consider only if everything else fails; in general the idea is bad, because it can hide other fatal errors that happen during the connection. I believe a better solution should exist.
Bartłomiej Kalinowski
@Bartłomiej Kalinowski URI *is* doing the check for you - if the host name is null (URI.getHost()==null when URL.getHost().equals("")) then it throws an NPE to signify that you are connecting to a null host. NPE can possibly hide other errors, but then what were you hoping to do in those cases? Whatever the error is, you probably need to ask the user to correct/retry/abort, so the distinction isn't massively important - possibly the difference between IO exceptions and other exceptions matters - you could retry automatically on IO error but not on the others.
Pete Kirkham
I need to detect that the URL is wrong and tell this problem apart from all other problems, such as I/O errors. The app does not interact with the user, so I cannot ask the user to retry or correct it, and I have to expect that a URL may be wrong (even if it was validated before).
Bartłomiej Kalinowski
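
One way to keep a bad URL distinguishable from I/O failures might be to validate the input before connecting, so that only the connection step can raise an IOException. This is a rough sketch under stated assumptions, not a definitive implementation: UrlCheckSketch, InvalidUrlException and validate are hypothetical names, and the null-host check assumes that every acceptable URL must name a server-based host.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class UrlCheckSketch {

    /** Hypothetical checked exception meaning "the input itself is unusable". */
    public static class InvalidUrlException extends Exception {
        public InvalidUrlException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Parses and sanity-checks the input without touching the network. */
    public static URL validate(String input) throws InvalidUrlException {
        final URI uri;
        try {
            // Unlike URI.create, the constructor reports syntax errors
            // as a checked URISyntaxException.
            uri = new URI(input);
        } catch (URISyntaxException e) {
            throw new InvalidUrlException("not a valid URI: " + input, e);
        }
        if (!uri.isAbsolute()) {
            // URI.toURL would throw IllegalArgumentException for this case.
            throw new InvalidUrlException("no scheme: " + input, null);
        }
        if (uri.getHost() == null) {
            // Covers inputs such as http:/not-valid, where no host is parsed.
            throw new InvalidUrlException("no hostname: " + input, null);
        }
        try {
            return uri.toURL();
        } catch (MalformedURLException e) {
            // For example, a scheme with no registered protocol handler.
            throw new InvalidUrlException("unsupported URL: " + input, e);
        }
    }

    public static void main(String[] args) {
        for (String input : args) {
            try {
                URL url = validate(input);
                System.out.println("looks usable: " + url);
            } catch (InvalidUrlException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```

With this split, a caller can treat InvalidUrlException as "the URL is wrong" and handle any IOException from the later openConnection or getInputStream calls separately, without catching NullPointerException at all.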
+1  A: 

You can use Apache Commons Validator:

import org.apache.commons.validator.UrlValidator; // package as of Validator 1.3.1

UrlValidator urlValidator = new UrlValidator();

boolean valid = urlValidator.isValid("http://google.com");

http://commons.apache.org/validator/

http://commons.apache.org/validator/api-1.3.1/

Shekhar
I will check that one. My feeling is: if using an external validator is the only way to use java.net.URL/URI properly, then I would say that the behavior I described is a bug that makes those classes more or less unusable, which would be really strange for a standard class. Maybe I just do not know some important detail or usage precondition?
Bartłomiej Kalinowski