+1  Q: 

Bot Web Quality

I am looking for a good open source bot that checks a site for the kind of quality problems that often matter for Google indexing.

For example

  • find duplicate titles
  • invalid links (JSpider does this, and I think plenty of other tools do too)
  • exactly the same page served under different URLs
  • etc, where etc equals google quality reqs.
+1  A: 

Your requirements are very specific so it's very unlikely there is an open source product that does exactly what you want.

There are, however, many open source frameworks for building web crawlers. Which one you use depends on your language preference.

For example:

Generally, these frameworks will provide classes for crawling and scraping pages of a site based upon the rules you give, but then it's up to you to extract the data you need by hooking in your own code.
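As a rough illustration of the "own code" you would hook in, here is a minimal sketch of two of the checks from the question: duplicate titles, and identical pages reachable under different URLs. It assumes the crawler has already handed you a mapping of URL to HTML; the names `extract_title` and `find_duplicates` are made up for this example and are not part of any framework. Only the Python standard library is used, and "same page" is taken to mean byte-identical HTML, which is a simplifying assumption.

```python
import hashlib
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title with whitespace normalised ('' if none)."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split())


def find_duplicates(pages):
    """pages: dict of url -> html (hypothetical crawler output).

    Returns (dup_titles, dup_bodies):
      dup_titles maps a title to the sorted URLs sharing it,
      dup_bodies lists groups of sorted URLs with identical HTML.
    """
    by_title = defaultdict(list)
    by_hash = defaultdict(list)
    for url, html in pages.items():
        by_title[extract_title(html)].append(url)
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        by_hash[digest].append(url)
    dup_titles = {t: sorted(u) for t, u in by_title.items() if len(u) > 1}
    dup_bodies = [sorted(u) for u in by_hash.values() if len(u) > 1]
    return dup_titles, dup_bodies
```

On dynamic pages the HTML often differs by a timestamp or session token, so in practice you would probably hash only the main content rather than the whole response.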

Ben James
I thought about a hand-made bot; I have built some with Scrapy... I think this is the best answer for the moment! Do you know of any ready-made bot for something like this?
llazzaro
A: 

Google Webmaster Tools is a web-based service rather than an on-demand bot, and it doesn't do everything you've asked for. It does cover some of it, plus a lot of things you haven't asked for, and, being from Google, it no doubt matches your odd "etc, where etc equals google quality reqs." better than anything else will.

Peter Boughton
Yes, I know, and my question was inspired by Webmaster Tools... but I want to avoid that. My website has a lot of pages and all of them are dynamic, so it's difficult to find duplicate titles, and I want to do it before Google finds out!
llazzaro