A client has given me a spreadsheet of hundreds of domain names.

My task is to determine the following about each:

  • Which domains are connected to a web server / website.
  • Of those that are, which redirect to another site.
  • What server software each one is running (ASP, ASP.NET, Apache, etc.)

...and output the results in an organized fashion.

Is there a script, preferably in C#, that can help with this?

+1  A: 

Most of your requirements can be handled via the System.Net.WebClient class. The one sticky point is what server software the site uses. Even if you run something that queries the server directly, you can't reliably tell what server software it's using, because that software can usually be configured to lie to you and mimic the response of another common server brand. And while lying isn't common, it's not unheard of either (some consider it a best practice as a way to throw off crackers).

Joel Coehoorn
+2  A: 

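You could use the HttpWebRequest class to test the domain names. Based on the StatusCode property of the HttpWebResponse (a 3xx value such as 301 or 302) you can decide whether there is a redirect. Note that HttpWebRequest follows redirects automatically by default, so set AllowAutoRedirect to false if you want to see the redirect response itself.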

In some cases you might be able to find out the server software by looking at the headers sent with the response, but probably not all servers (perhaps only a few) send these headers.
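As a rough illustration of the redirect check described above, here is a minimal sketch. The class and method names (RedirectCheck, IsRedirect, RedirectsToOtherHost) are made up for this example, and it assumes you have already read the status code and the Location header from a response obtained with AllowAutoRedirect set to false.

```csharp
using System;
using System.Net;

public static class RedirectCheck
{
    // True when the status code is in the 3xx range and a Location header was sent.
    // (304 Not Modified is 3xx but carries no Location, so it is not counted.)
    public static bool IsRedirect(HttpStatusCode status, string location)
    {
        int code = (int)status;
        return code >= 300 && code < 400 && !string.IsNullOrEmpty(location);
    }

    // True when the redirect target has a different host than the requested URI.
    // A relative Location value is resolved against the requested URI first.
    public static bool RedirectsToOtherHost(Uri requested, string location)
    {
        Uri target;
        if (!Uri.TryCreate(requested, location, out target)) return false;
        return !string.Equals(requested.Host, target.Host, StringComparison.OrdinalIgnoreCase);
    }
}
```

Note that this simple host comparison treats example.com and www.example.com as different sites; whether that counts as "another site" is a judgment call for your report.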

M4N
Apache/PHP, IIS/ASP, and IIS/ASP.NET all include the header by default. That probably covers >95% of what's out there.
Joel Coehoorn
Yes, but isn't it a best practice to remove these headers? I might be wrong..
M4N
Indeed it is: removing them saves bandwidth and makes it just that much harder on crackers. But it's one of those things that so many people just never get around to...
Joel Coehoorn
+1  A: 

With respect to your second item:

  • Of those that are, which redirect to another site.

HttpWebRequest/Response and WebClient will catch most of the redirects but not all of them since there are pages that do the redirect via JavaScript. Since neither of them executes JavaScript, you'll not be able to detect these cases unless you use a WebBrowser control or something else capable of running JavaScript.
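If running a full browser control is too heavy for hundreds of domains, one cheap alternative is to scan the downloaded HTML for signs of a client-side redirect and flag those pages for manual review. The sketch below is only a heuristic under stated assumptions: the class name and regular expressions are invented for this example, it will miss obfuscated or externally loaded scripts, and it can report false positives. It also checks for meta refresh tags, which the answer above does not mention but which are likewise not followed by HttpWebRequest or WebClient.

```csharp
using System.Text.RegularExpressions;

public static class ClientSideRedirectHints
{
    // <meta http-equiv="refresh" ...> tags (an addition beyond the JavaScript case).
    private static readonly Regex MetaRefresh = new Regex(
        @"<meta[^>]+http-equiv\s*=\s*[""']?refresh",
        RegexOptions.IgnoreCase);

    // Assignments to location / location.href, or calls to location.replace/assign.
    // The [^=] guard skips comparisons such as "location == x".
    private static readonly Regex ScriptLocation = new Regex(
        @"\blocation(\.href)?\s*=[^=]|\blocation\.(replace|assign)\s*\(",
        RegexOptions.IgnoreCase);

    // Returns true when the HTML contains something that looks like a client-side redirect.
    public static bool LooksLikeClientSideRedirect(string html)
    {
        if (string.IsNullOrEmpty(html)) return false;
        return MetaRefresh.IsMatch(html) || ScriptLocation.IsMatch(html);
    }
}
```

The html argument could come from WebClient.DownloadString; anything this flags still needs a human (or a real browser) to confirm where it actually goes.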

Alfred Myers
+2  A: 

To do this, I used the following:

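// requires: using System.Net;
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(uri);
req.AllowAutoRedirect = false; // do not follow redirects, so a 3xx response can be observed
// note: GetResponse() throws WebException for 4xx/5xx responses and for connection failures
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
    int status = (int)resp.StatusCode; // a 3xx value indicates a redirect
    string location = resp.Headers["Location"]; // redirect target; null if the header is absent
    string server = resp.Headers["Server"]; // to track server software
    string poweredby = resp.Headers["X-Powered-By"]; // denotes ASP.NET, PHP, etc.
    string aspnetVersion = resp.Headers["X-AspNet-Version"]; // sent by ASP.NET, so typically only on IIS
}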

Some additional response headers that could be captured for more info are listed here:

http://en.wikipedia.org/wiki/List_of_HTTP_headers
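To get from the snippet above to "organized" output for a whole spreadsheet of domains, something along these lines might work. This is a sketch, not a tested tool: DomainResult, DomainSurvey, Probe and ToCsvLine are names invented for this example, the 10-second timeout and the plain http:// prefix are assumptions, and Probe needs network access. A 4xx/5xx response still counts as "has a web server", since something answered.

```csharp
using System;
using System.Net;

public class DomainResult
{
    public string Domain;
    public bool HasWebServer;
    public int StatusCode;
    public string RedirectTarget;
    public string Server;
    public string PoweredBy;
}

public static class DomainSurvey
{
    // Requests http://<domain>/ without following redirects and records the headers.
    public static DomainResult Probe(string domain)
    {
        var result = new DomainResult { Domain = domain };
        var req = (HttpWebRequest)WebRequest.Create("http://" + domain + "/");
        req.AllowAutoRedirect = false;
        req.Timeout = 10000; // milliseconds; an assumed value
        HttpWebResponse resp;
        try
        {
            resp = (HttpWebResponse)req.GetResponse();
        }
        catch (WebException ex)
        {
            // 4xx/5xx still carry a response; DNS or connection failures do not.
            resp = ex.Response as HttpWebResponse;
        }
        if (resp == null) return result; // nothing answered

        using (resp)
        {
            result.HasWebServer = true;
            result.StatusCode = (int)resp.StatusCode;
            result.RedirectTarget = resp.Headers["Location"];
            result.Server = resp.Headers["Server"];
            result.PoweredBy = resp.Headers["X-Powered-By"];
        }
        return result;
    }

    // Quotes a CSV field when it contains a comma, quote, or line break.
    private static string Escape(string field)
    {
        if (field == null) return "";
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // One CSV row: domain, has server, status, redirect target, Server, X-Powered-By.
    public static string ToCsvLine(DomainResult r)
    {
        return string.Join(",", new[]
        {
            Escape(r.Domain),
            r.HasWebServer ? "yes" : "no",
            r.StatusCode.ToString(),
            Escape(r.RedirectTarget),
            Escape(r.Server),
            Escape(r.PoweredBy)
        });
    }
}
```

Usage would be roughly: read the domains from a text export of the spreadsheet, call DomainSurvey.Probe on each, and write DomainSurvey.ToCsvLine(result) to a file that opens directly in Excel.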

frankadelic