I am wondering: is there any (programmatic) way to prevent any search engine from indexing the content of a website?

+2  A: 

Create a robots.txt file for your site. For more info, see this link.
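
For example, a minimal robots.txt served from the root of your site (example.com below is just a placeholder) that asks every crawler to stay out could look like this:

User-agent: *    # this rule applies to every crawler
Disallow: /      # do not crawl or index anything on the site

The file must live at the top level (e.g. http://example.com/robots.txt), since crawlers only look for it there.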

OMG Ponies
I accepted your answer because your link provides more info than the others' ;) Thanks!
israkir
+6  A: 

You can specify it in robots.txt

User-agent: *
Disallow: /
Chandra Patni
-1: Untargeted.
Charles Stewart
The OP asked specifically to block *any* search engine. +1
Pekka
@Pekka: not all web crawlers work for search engines. The answer below links to a widely used way of distinguishing the search engine indexers.
Charles Stewart
+4  A: 

As the other answers already say, robots.txt is the standard that every well-behaved search engine adheres to. This should be enough in most cases.

If you really want to programmatically block malicious bots that do not respect robots.txt, check out this question I asked a few months ago about how to tell bots apart from human visitors. You may find some good starting points there.
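
For instance, here is a minimal PHP sketch (PHP chosen only because another answer here mentions it) of one common starting point: verifying a visitor that claims to be Googlebot with a forward-confirmed reverse DNS lookup, the verification method Google itself documents. A bot that merely spoofs the user agent fails the check.

<?php
// Sketch: forward-confirmed reverse DNS check for a visitor claiming
// to be Googlebot. The hostname suffix and user-agent token are
// Google's documented values; the rest is illustrative.
$ip        = $_SERVER['REMOTE_ADDR'];
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

if (stripos($userAgent, 'Googlebot') !== false) {
    $host   = gethostbyaddr($ip);  // reverse lookup, e.g. "crawl-66-249-66-1.googlebot.com"
    $isReal = is_string($host)
           && preg_match('/\.googlebot\.com$/i', $host)
           && gethostbyname($host) === $ip;  // forward lookup must match the IP

    if (!$isReal) {
        // The user agent is spoofed; refuse to serve the page.
        header('HTTP/1.1 403 Forbidden');
        exit;
    }
}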

Pekka
+2  A: 

Most search engine bots identify themselves using a unique user agent.

You can block specific user agents using robots.txt.
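
For example, here is a sketch that turns away just two well-known crawlers, Googlebot and Yahoo's Slurp, while leaving all other bots unaffected:

User-agent: Googlebot
Disallow: /

User-agent: Slurp
Disallow: /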

Here is a list of some user agents.

instanceofTom
A: 

Since you did not mention a programming language, I'll give my input from a PHP perspective: there is a WordPress plugin called Bad Behavior which does exactly what you are looking for. It is configured via a code script listing an array of user-agent strings. When an agent crawls your site, the plugin checks the user-agent string and ID, or the IP address, against that array; if there is a match, it either rejects or accepts the agent.

It might be worth your while to have a peek at its code to see how it is done from a programmer's perspective.
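
As a rough illustration of that pattern (this is a plain-PHP sketch, not Bad Behavior's actual code; the blocklist entries are hypothetical):

<?php
// Illustrative blocklists -- not taken from the plugin.
$blockedAgents = array('BadBot', 'EvilCrawler');  // hypothetical user-agent substrings
$blockedIps    = array('192.0.2.10');             // hypothetical IP (TEST-NET range)

$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$ip    = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';

// Reject the request if either the user-agent string or the IP matches.
foreach ($blockedAgents as $bad) {
    if (stripos($agent, $bad) !== false) {
        header('HTTP/1.1 403 Forbidden');
        exit;
    }
}
if (in_array($ip, $blockedIps)) {
    header('HTTP/1.1 403 Forbidden');
    exit;
}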

If you are using a language other than PHP and this does not satisfy what you are looking for, then I apologize for posting this answer.

Hope this helps. Best regards, Tom.

tommieb75