I'm attempting to do some screen scraping; however, the HTML being returned is causing an error, as there is no header (I think). Below is the code:
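import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.DefaultHttpClient;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class xpath
{
    private Document doc = null;

    public xpath()
    {
        HttpClient httpclient = new DefaultHttpClient();
        HttpGet httpget = new HttpGet("http://blah.com/blah.php?param1=value1&param2=value2");
        ResponseHandler<String> responseHandler = new BasicResponseHandler();
        try
        {
            // Fetch the page body as a String, then try to parse it into a DOM
            String responseBody = httpclient.execute(httpget, responseHandler);
            doc = parserXML(responseBody);
            visit(doc, 0);
        }
        catch(Exception error)
        {
            error.printStackTrace();
        }
    }

    // Recursively print every node below the given node
    public void visit(Node node, int level)
    {
        NodeList nl = node.getChildNodes();
        for(int i=0, cnt=nl.getLength(); i<cnt; i++)
        {
            System.out.println("["+nl.item(i)+"]");
            visit(nl.item(i), level+1);
        }
    }

    public Document parserXML(String file) throws SAXException, IOException, ParserConfigurationException
    {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file);
    }

    public static void main(String[] args)
    {
        new xpath();
    }
}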
It's throwing the exception "java.net.MalformedURLException: no protocol:"
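As far as I can tell, the message may have less to do with a missing header than with how the string is handed over: DocumentBuilder.parse(String) documents its argument as a URI to load, not as the markup itself. Below is a minimal sketch of that difference, assuming the body is well-formed XML. The class name ParseStringDemo, its two helper methods, and the sample markup are made up for illustration and are not part of my code above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of the original program.
class ParseStringDemo {

    // Parses markup that is already in memory by wrapping it in an InputSource,
    // so the parser reads the characters instead of trying to open a URI.
    static Document parseFromString(String markup)
            throws SAXException, IOException, ParserConfigurationException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(markup)));
    }

    // Returns true when parse(String) fails with an I/O error because it
    // tried to load the markup as if it were a URI.
    static boolean parseAsUriFails(String markup)
            throws SAXException, ParserConfigurationException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(markup); // the String is interpreted as a URI here
            return false;
        } catch (IOException e) {
            // MalformedURLException is a subclass of IOException
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String sample = "<root><item>hello</item></root>"; // made-up sample markup
        System.out.println("parse(String) failed: " + parseAsUriFails(sample));
        Document doc = parseFromString(sample);
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
    }
}
```

Even with this change, I would expect real-world HTML that is not well-formed XML to fail with a SAXException instead, which is why I am asking about a more forgiving parser.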
Is there a way of getting the parser to be a bit more forgiving?
Thanks