Hi All,
I am wondering if there is a parser or library in Java for extracting the second-level domain (SLD) from a URL, or failing that, an algorithm or regex for doing the same. For example:
URI uri = new URI("http://www.mydomain.ltd.uk/blah/some/page.html");
String host = uri.getHost();
System.out.println(host);
which prints:
www.mydomain.ltd.uk
Now what I'd like to do is robustly identify the SLD ("ltd.uk") component. Any ideas?
Edit: I'm ideally looking for a general solution, so I'd match ".uk" in "police.uk", ".co.uk" in "bbc.co.uk" and ".com" in "amazon.com".
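To make the behaviour I'm after concrete, here is a rough sketch that returns the longest known suffix a host ends with. The class name SuffixSketch, the method suffixOf and the four-entry suffix set are all made up for illustration; a complete, maintained list of suffixes is exactly the part I don't have, so this is only a statement of the expected results, not a real solution.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only, not an existing library class.
class SuffixSketch {
    // Assumption: a tiny hand-written sample. A real list would be far longer.
    private static final Set<String> KNOWN_SUFFIXES =
            Set.of("com", "uk", "co.uk", "ltd.uk");

    // Returns the longest known suffix the host ends with, or null if none matches.
    static String suffixOf(String host) {
        String candidate = host.toLowerCase(Locale.ROOT);
        while (true) {
            if (KNOWN_SUFFIXES.contains(candidate)) {
                return candidate;
            }
            int dot = candidate.indexOf('.');
            if (dot < 0) {
                return null; // ran out of labels without a match
            }
            // Drop the leftmost label and try the shorter remainder.
            candidate = candidate.substring(dot + 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(suffixOf("police.uk"));   // uk
        System.out.println(suffixOf("bbc.co.uk"));   // co.uk
        System.out.println(suffixOf("amazon.com"));  // com
    }
}
```

Because labels are stripped from the left, the first match found is also the longest one, which is why "bbc.co.uk" gives "co.uk" rather than "uk".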
Thanks