views: 61
answers: 2

How can I use the WebClient class to call a web server, read the file/directory listings, and navigate through the directories recursively?

+1  A: 

Well, there are a lot of potential restrictions here, not the least of which is that nearly all web servers are configured to disallow what you are trying to do.

The only way this works is if the web server allows directory listing.

You might make more progress if you are a little clearer about exactly what you are trying to accomplish. Namely: is the server in question one you own, or are you trying to get directory listings from servers you don't own? Also, what tech stack are you using: C#, VB, PHP? Finally, what is the target server running?

If you are looking at creating your own search engine, for example: search engines work by traversing detected links in returned HTML, which is radically different from getting a list of all files on some server.

If you own the server in question, then you might investigate setting up an FTP server on it and communicating that way. FTP has a lot more built in to do exactly what you are asking.
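For reference, a minimal sketch of listing a directory with .NET's FtpWebRequest; the server address and credentials are placeholders:

    using System;
    using System.IO;
    using System.Net;

    class FtpListing
    {
        static void Main()
        {
            // Placeholder address and credentials; substitute your own.
            var request = (FtpWebRequest)WebRequest.Create("ftp://example.com/files/");
            request.Method = WebRequestMethods.Ftp.ListDirectory;
            request.Credentials = new NetworkCredential("user", "password");

            using (var response = (FtpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                // ListDirectory returns one entry per line.
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }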

Chris Lively
Ahh, directory browsing. :) That indeed makes sense...
Caspar Kleijne
Yes, the web server allows directory listing, and I am partially in control of it. I intend to do a directory listing to compare folders and files on the web server with folders and files in a local directory; the purpose is to download new files.
Unfortunately, FTP is not activated (only HTTP).
+1  A: 

You can use the WebClient's DownloadString(Uri) method to download the HTML source of a page. Keep in mind that if you hit a non-HTML page, you'll download that file's contents into the string, which may be undesired behaviour.
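A minimal sketch of that call; the URL is a placeholder for your own listing page:

    using System;
    using System.Net;

    class DownloadExample
    {
        static void Main()
        {
            using (var client = new WebClient())
            {
                // Grab the raw HTML of the directory-listing page.
                string html = client.DownloadString(new Uri("http://example.com/files/"));
                Console.WriteLine(html);
            }
        }
    }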

From there you need to extract URLs from the page. You can do this either with regular expressions (which is a bad idea) or by using a proper HTML parsing library. Alternatively, if the pages are XHTML, you can use LINQ to XML or the base XML libraries in .NET to parse them.
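As a sketch of the parsing-library route, assuming the third-party HtmlAgilityPack library is available, extracting the href values might look like this:

    using System.Collections.Generic;
    using HtmlAgilityPack;

    static class LinkExtractor
    {
        public static IEnumerable<string> ExtractLinks(string html)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(html);
            // SelectNodes returns null when the page contains no anchors.
            var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
            if (anchors == null) yield break;
            foreach (var a in anchors)
                yield return a.GetAttributeValue("href", string.Empty);
        }
    }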

Once you've collected all those links, simply call the function again with each new link.

Note: You should keep a record of all URLs you've already visited; that way, if you encounter cyclic links (A -> B -> A), you won't end up in an infinite loop.
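Putting the pieces together, here is a hedged sketch of the recursive walk. It reuses the hypothetical ExtractLinks helper from above, and it assumes the server's listing marks subdirectories with a trailing slash, which is common but not guaranteed:

    using System;
    using System.Collections.Generic;
    using System.Net;

    static class Crawler
    {
        public static void Crawl(WebClient client, Uri url, HashSet<string> visited)
        {
            // Add returns false if the URL was already seen; this breaks cycles (A -> B -> A).
            if (!visited.Add(url.AbsoluteUri)) return;

            string html = client.DownloadString(url);
            foreach (string href in LinkExtractor.ExtractLinks(html))
            {
                var next = new Uri(url, href); // resolves relative hrefs against the current page
                if (next.AbsolutePath.EndsWith("/"))
                    Crawl(client, next, visited);   // assumption: trailing slash marks a subdirectory
                else
                    Console.WriteLine(next);        // a file entry; compare or download it here
            }
        }
    }

You would kick it off with something like Crawl(new WebClient(), new Uri("http://example.com/files/"), new HashSet<string>()). The visited set also takes care of the "Parent Directory" links that most auto-generated listings include.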

Aren