I forgot the name of the technique where a web spider
first visits all the links it sees on the first level, then all the links it sees on the second level, and so on...
There is a name for this technique, but it escapes me.
Anyway, this approach is exhaustive and obviously inefficient. Is there a better way? To make it concrete, here is a rough sketch of the level-by-level crawl I mean.
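This is just my own minimal sketch of that crawl, not any particular crawler's implementation; the link extraction is a bare-bones stand-in and the seed URL and depth limit are made up for illustration:

```python
# Minimal sketch of the level-by-level crawl described above.
# The link extraction here is deliberately crude; a real crawler would do much more
# (robots.txt, politeness delays, deduplication by canonical URL, etc.).
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_links(url):
    """Download a page and return the absolute URLs it links to."""
    try:
        html = urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
    except Exception:
        return []
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(url, href) for href in parser.links]


def crawl_level_by_level(seed, max_depth=2):
    """Visit every link on level 1, then every link on level 2, and so on."""
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        url, depth = queue.popleft()
        print(f"level {depth}: {url}")
        if depth == max_depth:
            continue
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))


if __name__ == "__main__":
    crawl_level_by_level("https://example.com", max_depth=2)   # hypothetical seed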
I remember reading a paper last summer about efficiently crawling web pages (DSL or something like that, I don't know what it stands for). In summary, it discussed a method for determining which URLs are likely to hold relevant information and which URLs should be ignored, like "register" or "new account" links, etc.
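Purely as my own guess at what that filtering might look like (a toy heuristic I made up, not the paper's actual method, and the keyword lists are invented for the example):

```python
# Toy URL-filtering heuristic: penalize boilerplate links (register, login, ...)
# and reward URLs containing topic words, then keep only the promising ones.
IGNORE_PATTERNS = ("register", "signup", "login", "logout", "account", "privacy", "terms")
TOPIC_KEYWORDS = ("crawl", "spider", "search")   # hypothetical topic words


def url_score(url):
    """Crude relevance score: negative for boilerplate, +1 per topical word."""
    lowered = url.lower()
    if any(pat in lowered for pat in IGNORE_PATTERNS):
        return -1.0
    return sum(1.0 for word in TOPIC_KEYWORDS if word in lowered)


def filter_urls(urls, threshold=1.0):
    """Keep only URLs whose score meets the threshold, best first."""
    scored = [(url_score(u), u) for u in urls]
    return [u for score, u in sorted(scored, reverse=True) if score >= threshold]


if __name__ == "__main__":
    candidates = [
        "https://example.com/register",
        "https://example.com/articles/web-crawler-design",
        "https://example.com/blog/focused-spider-tips",
    ]
    print(filter_urls(candidates))
```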
I didn't read it in much detail, so if any of this rings a bell, please post a link.