Hello, I have a new blog (you can see it from my profile if you want), and Google's crawl results for it are just as new.

The results alarmed me. Apparently the 2 most common words on my site are "rss" and "feed", because I use text for links like "Comments RSS", "Post Feed", etc. These 2 words are present in every post, while other words are much rarer.
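To make the effect concrete, here is a minimal sketch of a naive word count over a few posts. The three sample posts and the `count_words` helper are made up for illustration; this is not how Google actually weighs words, only a way to see why link text repeated on every page ends up at the top of a raw frequency list.

```python
import re
from collections import Counter

# Hypothetical posts: different content, same technical links on each page.
PAGES = [
    "Installing Python on Windows. Comments RSS. Post Feed.",
    "A short review of my new keyboard. Comments RSS. Post Feed.",
    "Why I switched text editors. Comments RSS. Post Feed.",
]

def count_words(pages):
    """Count lowercase words across all pages (naive tokenizer)."""
    counts = Counter()
    for text in pages:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts

counts = count_words(PAGES)
# The link words appear once per page; content words appear once overall.
print(counts["rss"], counts["feed"], counts["python"])
```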

Is there a way to make these links disappear from Google's parsing? I don't want technical links to get indexed. I only want content, titles, and descriptions to get indexed. I am looking for something other than replacing this text with images.

I found some old discussions on Google from back in 2007. (In 3 years many things could have changed, hopefully this too.)

This question is not about robots.txt and how to make Google ignore pages. It is about making it ignore small parts of a page, or transforming those parts so that they are visible to humans but invisible to robots.

+1  A: 

The only control you have over the indexing robots is the robots.txt file. See this documentation, linked by Google on their page explaining the usage of the file.

Basically, you can prohibit certain links and URLs, but not keywords.
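As a rough illustration of that URL-level granularity, here is a small sketch using Python's standard `urllib.robotparser`. The rules and the `blog.example.com` paths are assumptions made up for the example, not taken from the asker's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: rules match URL paths, never words inside a page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /feed/
Disallow: /comments/feed/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(path, agent="Googlebot"):
    """Return True if `agent` may fetch `path` on the example host."""
    return parser.can_fetch(agent, "https://blog.example.com" + path)

print(allowed("/feed/"))               # False: the feed URL itself is blocked
print(allowed("/2010/06/some-post/"))  # True: the post is still crawled
```

The post page stays crawlable, and so does any "Post Feed" link text on it; only the feed URLs themselves are excluded.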

JYelton
Yes, I know about robots.txt. That's implemented. Russian search engines provide certain tags, like <noindex></noindex>, and anything in between gets ignored by the search engine. Yahoo provides something based on class names. Doesn't Google offer anything?
Alexander
+1  A: 

You have to manually detect "Googlebot" from the request's user agent and feed it slightly different content than you normally serve to your users.

iamgopal
That is horrible advice. It's a good way to get google-spanked.
Aaron Harun
I don't think it is that bad. What if you have a subscription-based site but you still want Google to index the content? I don't think you will get 'google-spanked'.
Chris Diver
@Aaron Harun, it's not black-hat SEO. It's completely white hat as long as you don't serve completely different content.
iamgopal
A: 

No, there really isn't anything like that. There are various server-side techniques, but if Google catches you serving up different text to its bot than you give to website visitors it will penalize you.

Charles
+1  A: 

Other than black-hat server-side methods, there is nothing you can do. You may want to look at why you have those words so often and remove some of them from the site.

It used to be that you could use JS to "hide" things from googlebot, but you can't now that it parses JS. ( http://www.webmasterworld.com/google/4159807.htm )

Aaron Harun
That is very interesting. So if I replace the text with a tool like Cufon, Googlebot will parse that JS, transform the text, and ignore it because it will then only be a canvas?
Alexander
No guarantees. Google is tight-lipped about what the bot can and cannot do, so it probably won't work. You can, however, start with the canvas rather than having Cufon do a replace.
Aaron Harun
+2  A: 

I work on a site with top-3 google ranking for thousands of school names in the US, and we do a lot of work to protect our SEO. There are 3 main things you could do (which are all probably a waste of time, keep reading):

  • Move the stuff you want to downplay to the bottom of your HTML and use CSS to place it where you want readers to see it. This won't hide it from crawlers, but they'll value it lower.
  • Replace those links with images (you say you don't want to do that, but don't explain why not)
  • Serve a different page to crawlers, with those links stripped. There's nothing black hat about this, as long as the content is fundamentally the same as a browser sees. Search engines will ding you if you serve up a page that's significantly different from what users see, but if you stripped RSS links from the version of the page crawlers index, you would not have a problem.
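The third option could be sketched roughly as below. Everything here is an assumption for illustration: the crawler token list, the `feed-link` class name, and the `render_for` helper are hypothetical, and a regex is only a stand-in for whatever templating the blog really uses. Note that other answers in this thread warn that serving bots different text can be penalized, so treat this as a sketch of the idea, not a recommendation.

```python
import re

# Assumed substrings that identify crawler user agents (not exhaustive).
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")

# Matches anchors carrying the hypothetical "feed-link" class.
FEED_LINK = re.compile(
    r'<a\b[^>]*\bclass="[^"]*\bfeed-link\b[^"]*"[^>]*>.*?</a>',
    re.IGNORECASE | re.DOTALL,
)

def is_crawler(user_agent):
    """Guess from the User-Agent header whether the client is a crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)

def render_for(user_agent, html):
    """Return the page unchanged for browsers, minus feed links for crawlers."""
    if is_crawler(user_agent):
        return FEED_LINK.sub("", html)
    return html

page = '<p>Post body</p><a class="feed-link" href="/feed/">Post Feed</a>'
bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(render_for(bot, page))
print(render_for("Mozilla/5.0 (Windows NT 6.1) Firefox/3.6", page))
```

The article content is identical in both versions; only the anchors marked as technical links are dropped for crawlers.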

That said, crawlers are smart, and yours is not the only site filled with permalinks and RSS links. They care about context, and look for terms and phrases in your headings and body text. They know how to determine that your blog is about technology and not RSS. I highly doubt those links have any negative effect on your SEO. What problem are you actually trying to solve?

If you want to build SEO, figure out what value you provide to readers and write about that. Say interesting things that will lead others to link to your blog, and crawlers will understand that you're an information source that people value. Think more about what your readers see and understand, and less about what you think a crawler sees.

chrispix
Thank you. It's just that I can make my blog appear in the top results if I search for a strange combination of category names and 2 post topics and add the "rss" and "feed" keywords. Without "rss" and "feed" it's way down at the end. I'll read the rules again and pay attention to the clauses about serving slightly different content to bots.
Alexander