Is there a third-party API whose sole purpose is to replace relative URLs in source HTML and CSS with absolute URLs, keeping in mind that the source contains a mix of relative and absolute URLs? For those thinking twice about this question: the String object's replaceAll() method has some shortcomings.
views: 49
answers: 1
+2
A:
You can use TagSoup to parse the HTML and then use standard XPath expressions to get all your links and img tags.
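A rough sketch of that idea, with some assumptions: the class and method names (AbsoluteLinks, parse, absolutize) and the example.com URLs are made up for illustration, and the JDK's built-in XML parser stands in for TagSoup so the snippet has no dependencies. That stand-in only accepts well-formed markup; for real-world HTML you would build the DOM from TagSoup's SAX parser instead. The XPath and URI.resolve parts should stay the same either way.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AbsoluteLinks {

    // Stand-in for TagSoup (assumption): the JDK parser needs well-formed input.
    static Document parse(String html) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(html)));
    }

    // Rewrites every href and src attribute so that it is absolute.
    // URI.resolve returns already-absolute URLs unchanged, so a mix of
    // relative and absolute links is handled in one pass.
    // baseUrl should include a path (at least a trailing "/").
    static void absolutize(Document doc, String baseUrl) throws Exception {
        URI base = URI.create(baseUrl);
        NodeList attrs = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//@href | //@src", doc, XPathConstants.NODESET);
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            attr.setValue(base.resolve(attr.getValue()).toString());
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<html><body>"
                + "<a href=\"img/a.png\">relative</a>"
                + "<img src=\"http://other.org/b.png\"/>"
                + "</body></html>");
        absolutize(doc, "http://example.com/dir/page.html");
        NodeList attrs = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//@href | //@src", doc, XPathConstants.NODESET);
        for (int i = 0; i < attrs.getLength(); i++) {
            System.out.println(attrs.item(i).getNodeValue());
        }
    }
}
```

This only touches href and src attributes. URLs inside CSS (url(...) values) are not covered and would need separate handling, and URI.create throws an IllegalArgumentException on values that are not valid URIs, such as ones containing spaces.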
chiborg
2010-03-25 10:41:58
+10! Quote: "a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild"
Karussell
2010-03-25 11:02:33
@chiborg I'm a total noob at XML parsing. Could you give me an example of how to parse HTML code contained in a String object, retrieve all the relative URLs, and then convert them into absolute URLs, where the host name is specified in a String object?
Catfish
2010-03-25 13:28:33