I'm building a Java application that downloads an HTML page from a website and saves it to my local file system. I can open the page's URL manually in a browser, but when I request the same URL from my Java program, the server returns a 503 error. Here's the scenario:
Sample URL: http://content.somesite.com/demo/somepage.asp
I can access the above URL in a browser, but the Java code below fails to download the page:
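// Needs java.io.BufferedReader and java.io.InputStreamReader.
// sourceUrl is a java.net.URL for the sample URL above.
StringBuilder data = new StringBuilder();
// try-with-resources closes the reader even if openStream() throws,
// which avoids a NullPointerException from calling close() on a null reader.
try (BufferedReader br = new BufferedReader(new InputStreamReader(sourceUrl.openStream()))) {
    String inputLine;
    while ((inputLine = br.readLine()) != null) {
        // readLine() strips the line terminator, so add it back.
        data.append(inputLine).append('\n');
    }
} catch (Exception e) {
    e.printStackTrace();
}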
So, my questions are:
Am I doing anything wrong here?
Is there a way for a server to block requests from programs/bots and allow only requests coming from browsers?
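
For reference, here is a minimal sketch of what I'm considering trying next. It assumes, and this is only a guess on my part, that the server treats requests differently based on the User-Agent header (by default Java's URLConnection identifies itself as "Java/<version>" rather than as a browser). The class name BrowserLikeFetch, the helper methods open and readBody, the placeholder User-Agent string, and the choice of UTF-8 are all my own assumptions, not anything confirmed about this site. It also prints the status code and reads the error body, so I can see what the server actually sends back with the 503.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class BrowserLikeFetch {

    // Hypothetical helper: prepares (but does not yet send) a GET request
    // that carries the given User-Agent header.
    static HttpURLConnection open(String address, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    // Hypothetical helper: sends the request and reads the whole body.
    // For 4xx/5xx responses the body is on the error stream, not the input stream.
    static String readBody(HttpURLConnection conn) throws IOException {
        int status = conn.getResponseCode();
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        if (in == null) {
            return ""; // the server sent no body
        }
        StringBuilder data = new StringBuilder();
        // Assumption: the page is UTF-8 encoded; adjust if the server says otherwise.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                data.append(line).append('\n');
            }
        }
        return data.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java BrowserLikeFetch <url>");
            return;
        }
        // Placeholder User-Agent; a real browser's string would be longer.
        HttpURLConnection conn = open(args[0], "Mozilla/5.0 (compatible; MyDownloader/1.0)");
        System.out.println("HTTP status: " + conn.getResponseCode());
        System.out.println(readBody(conn));
    }
}
```

I would run it as `java BrowserLikeFetch http://content.somesite.com/demo/somepage.asp`. If the status is still 503 with a browser-like User-Agent, then the header is probably not the cause and something else (cookies, rate limiting, or other headers) may be involved.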