Is the idea that after the first resolution it relies on OS caching? This still seems inefficient and, when multiple domains resolve to the same IP, incorrect. What am I missing?
Don't use java.net.URL. That's the simple answer to your question. Use java.net.URI instead, which won't do hostname resolution.
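As a rough illustration of that advice, here is a minimal sketch comparing two addresses through java.net.URI. The class name UriEquality and the helper sameUri are made up for this example; the example.com addresses are placeholders. URI.equals() compares the parsed components as strings (scheme and host case-insensitively), so no network access is involved.

```java
import java.net.URI;

public class UriEquality {
    // Hypothetical helper: compares two address strings as URIs.
    // URI.equals() works on the parsed components only; it never contacts DNS.
    public static boolean sameUri(String first, String second) {
        return URI.create(first).equals(URI.create(second));
    }

    public static void main(String[] args) {
        // Host names differ only in case, so the URIs are equal.
        System.out.println(sameUri("http://example.com/index.html",
                                   "http://EXAMPLE.com/index.html"));
        // Different host names are never equal, whatever they resolve to.
        System.out.println(sameUri("http://example.com/index.html",
                                   "http://example.org/index.html"));
    }
}
```

Note that the comparison is purely textual: two different host names are treated as different even if they happen to point at the same server.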
A lot of people think the hostname-resolving behavior of URL was a very bad idea (e.g. see here).
Here's some explanation from the Javadoc of URI. This question is also useful.
hashCode() is closely related to equals(). The explanation for this behavior is given in the docs for equals() as follows:
Two hosts are considered equivalent if both host names can be resolved into the same IP addresses; else if either host name can't be resolved, the host names must be equal without regard to case; or both host names equal to null.
Source: java.net.URL.equals() docs.
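To make the quoted rule concrete without touching the network, here is a small model of it. This is not the JDK's implementation: the class HostEquivalenceModel, the FAKE_DNS table, and the host names and addresses in it are all invented for illustration, with the table standing in for real DNS lookups.

```java
import java.util.Locale;
import java.util.Map;

public class HostEquivalenceModel {
    // Hypothetical stand-in for DNS: lower-case host name -> IP address.
    // Two of the names deliberately share one address (virtual hosting).
    static final Map<String, String> FAKE_DNS = Map.of(
            "blog.example", "192.0.2.10",
            "shop.example", "192.0.2.10",
            "other.example", "192.0.2.20");

    // Returns the fake IP for a host, or null if it "cannot be resolved".
    public static String resolve(String host) {
        return host == null ? null : FAKE_DNS.get(host.toLowerCase(Locale.ROOT));
    }

    // Models the documented rule: same IP if both resolve; otherwise
    // case-insensitive name comparison; two null hosts are equivalent.
    public static boolean hostsEquivalent(String h1, String h2) {
        if (h1 == null || h2 == null) {
            return h1 == null && h2 == null;
        }
        String ip1 = resolve(h1);
        String ip2 = resolve(h2);
        if (ip1 != null && ip2 != null) {
            return ip1.equals(ip2);
        }
        return h1.equalsIgnoreCase(h2);
    }

    public static void main(String[] args) {
        System.out.println(hostsEquivalent("blog.example", "shop.example"));   // true: same fake IP
        System.out.println(hostsEquivalent("blog.example", "other.example"));  // false: different fake IPs
        System.out.println(hostsEquivalent("down.example", "DOWN.example"));   // true: unresolved, names match ignoring case
    }
}
```

The first case is the one the question worries about: two unrelated sites served from one address are reported as equivalent.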
Why does java.net.URL’s hashcode resolve the host to an IP?
There are two reasons. The first is:
- The URL class's behavior was designed to model a URL as a locator of a network-accessible resource. Specifically, equals and hashCode() were designed so that two URL instances are equal if they locate the same resource. This requires that the DNS name be resolved to an IP address.
With the benefit of hindsight we know that:
- the equals method cannot determine if two URL strings are locators for the same resource, due to (for example) virtual hosting, HTTP 30x forwarding, and server internal mapping of URLs, and
- the IP resolution behavior of equals and hashCode is a trap for inexperienced Java programmers, even though it is clearly documented.
(When I say "cannot" above, I mean that it is theoretically impossible. Dealing with some of the more difficult cases would require changes to the HTTP protocol, for example. Even if a hypothetical HTTP 2.0 "fixed" the problem, we'd still be dealing with legacy HTTP 1.1 servers in 20 years' time.)
This brings us to the second, more important reason.
- The behavior of URL.equals(Object) was defined a LONG time ago, and it would be impossible to change now without breaking (possibly) millions of deployed Java applications. This rules out any possibility that Sun (now Oracle) will change it.
Maybe the designers of a (hypothetical) successor to the Java class library could fix this (and other things). Of course, backwards compatibility with existing Java programs would have to be thrown out of the window to achieve this.
And finally, the real answer for Java application developers is to simply use the URI class instead. (Real software engineering is about getting the job done as well as you can, not about complaining about the tools you have been provided with.)
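In practice, one way to follow this advice is to keep URI objects in hash-based collections and convert to URL only at the point where a connection is actually needed. The sketch below assumes that pattern; the class UriKeys, the helper distinct, and the example addresses are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UriKeys {
    // Hypothetical helper: de-duplicates links using URI equality.
    // Adding to the set calls URI.hashCode()/equals(), which never resolve hosts.
    public static Set<URI> distinct(List<String> links) {
        Set<URI> seen = new LinkedHashSet<>();
        for (String link : links) {
            // normalize() removes "." and ".." path segments before comparison.
            seen.add(URI.create(link).normalize());
        }
        return seen;
    }

    public static void main(String[] args) throws MalformedURLException {
        Set<URI> uris = distinct(List.of(
                "http://example.com/a/../b",
                "http://EXAMPLE.com/b",
                "http://example.org/b"));
        System.out.println(uris.size()); // 2: the first two links are the same URI

        for (URI uri : uris) {
            // Convert only when a URL is needed, e.g. to open a connection.
            URL url = uri.toURL();
            System.out.println(url);
        }
    }
}
```

Constructing the URL does not trigger a lookup by itself; it is putting URL objects into a HashSet or HashMap, or calling equals() on them, that does.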