views:

56

answers:

2

I loaded about 15,000 pages (letters A and B of a dictionary) and submitted a text sitemap to Google. I'm planning to use Google's site search (with advertisements) as the mechanism for navigating my site. Google's Webmaster Tools accepted the sitemaps as valid but then did not index the pages. My index page has been indexed by Google, and at this point it does not link to any other pages.

So to get Google's search to work I need to get all my content indexed. It appears Google will not index from the sitemap alone, so I was thinking of adding pages that the spider can reach via links from the main index page. But I don't want to create a bunch of pages that programmatically link to all of the content without knowing whether this has a chance of working. Eventually I plan on having about 150,000 pages, each one defining a word or phrase; I wrote a program that pulls these from a dictionary database. I would like to show the content to anyone interested, to demonstrate the value of the dictionary in relation to the dictionary software that I'm completing. Any suggestions for getting the entire site indexed by Google so I can appear in the search results?

Thanks

A: 

Hi John,

Google crawling takes time, especially for new sites/domains. The sitemaps you submit are just "informational" for Google and don't necessarily mean that everything in them will be indexed.

Some notes:

  1. Make sure you display unique content, or Google will penalize you for duplicate content.
  2. Avoid pages with little text; Google likes pages with a reasonable amount of content.
  3. Keep adding content progressively, a couple of times a day if possible.

You should start to see your pages crawled hourly or daily (depending on your update frequency) and appearing in search engines in about two weeks to two months (mine usually take about a month to be considered "crawled"). Something you can do in the meantime is get backlinks from other websites and keep checking Webmaster Tools to see Google's crawl rate.
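One practical note on sitemap size for a site as large as yours: the sitemaps.org protocol caps each sitemap file at 50,000 URLs, so 150,000 pages need several files tied together by a sitemap index. Here is a minimal Python sketch of that split; the example.com domain, the define.php URL scheme, and the sitemap-N.xml file names are all made up for illustration:

```python
# Sketch: split a large URL list into sitemap files plus a sitemap index.
# The sitemaps.org protocol caps each file at 50,000 URLs, so a
# 150,000-page dictionary needs several files.
from xml.sax.saxutils import escape

SITEMAP_LIMIT = 50000  # max URLs per sitemap file (sitemaps.org protocol)

def build_sitemaps(urls, base="http://example.com"):
    """Return (index_xml, [sitemap_xml, ...]) for the given URL list."""
    chunks = [urls[i:i + SITEMAP_LIMIT]
              for i in range(0, len(urls), SITEMAP_LIMIT)]
    sitemaps = []
    for chunk in chunks:
        entries = "\n".join(
            "  <url><loc>%s</loc></url>" % escape(u) for u in chunk
        )
        sitemaps.append(
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            "%s\n</urlset>" % entries
        )
    # The index file points Google at each individual sitemap file.
    index_entries = "\n".join(
        "  <sitemap><loc>%s/sitemap-%d.xml</loc></sitemap>" % (base, n)
        for n in range(1, len(sitemaps) + 1)
    )
    index = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        "%s\n</sitemapindex>" % index_entries
    )
    return index, sitemaps

urls = ["http://example.com/define.php?word=w%d" % i for i in range(150000)]
index, files = build_sitemaps(urls)
print(len(files))  # 150,000 URLs -> 3 sitemap files
```

You would submit only the index file in Webmaster Tools; it points Google at the rest.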

Good luck! :)

XaviEsteve
+2  A: 

First off, you have to have legitimate, interesting content, and your site must not be some sort of attempt to scam or fool Google (see below).

Assuming that's true, Google has some guidelines for getting your site indexed. I highly recommend reading this document.

http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=35769

The first recommendation is:

Make a site with a clear hierarchy and text links. Every page should be reachable from at least one static text link.

This means that, yes, you have to have links to each page.
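One common way to satisfy that guideline for a large catalog is to generate static "browse" pages: a top-level A-Z index linking to paginated letter pages, each of which links directly to entries. A rough Python sketch of the grouping logic; the index-X-N.html page names and the define.php URL scheme are hypothetical:

```python
# Sketch: build a two-level hierarchy of static text links so a crawler
# can reach every dictionary entry from the home page.
from collections import defaultdict

def build_index_pages(words, per_page=100):
    """Group words by first letter, then paginate each letter's list.

    Returns {page_name: [entry_urls]}. A top-level index page would link
    to every page_name, and each page links directly to its entries, so
    every entry is at most a few clicks from the home page.
    """
    by_letter = defaultdict(list)
    for w in sorted(words):
        by_letter[w[0].upper()].append("/define.php?word=%s" % w)
    pages = {}
    for letter, links in by_letter.items():
        for n, i in enumerate(range(0, len(links), per_page), start=1):
            pages["index-%s-%d.html" % (letter, n)] = links[i:i + per_page]
    return pages

pages = build_index_pages(["apple", "aardvark", "banana", "berry"], per_page=2)
```

With 150,000 entries and 100 links per page, that's roughly 1,500 static pages, each reachable from a small A-Z index.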

My experience is with a bookseller with about 90,000 titles, some of which are rare and hard to obtain elsewhere. Humans access the book pages mostly through the search interface. Each book page has a fair amount of information: title, publisher, date published, etc. - about 20 fields in all.

The site and its ever-changing catalog have been live for more than a decade, but the titles were invisible to search engines because they could only be reached through the search interface. So Google indexed only the CMS pages, not the book catalog.

We started by adding a sitemap. That didn't get Google to index the book catalog. Then, after reading the guidelines, we added static text links to all the titles. Then we added accurate meta tags. (The URLs remained normal IIS/PHP cruft, i.e. book.php?id=123546.) There is no duplicate content.

At one point we got Google to index 17,000 pages, but since then the number of pages in Google's index has dropped to about 900 - all the non-book pages and a few books that are featured on the site.

Unfortunately the client didn't want to pay for us to keep trying to get Google to index the titles, so that's how it stands today.

Finally, it's not really clear from your post what you are trying to accomplish. If you are trying to put an existing, boring old dictionary up on the web and then get ad revenue, Google is going to ignore you.

Charles