views: 102

answers: 4

With millions of users searching for so many things on Google, Yahoo, and so on, how can the servers handle so many concurrent searches? I have no clue how they made it so scalable. Any insight into their architecture would be welcome.

+6  A: 

One element is DNS load balancing. If you reload the page several times, you'll see different machines responding.
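
A quick way to observe this is to resolve the hostname repeatedly (a minimal Python sketch; the hostname is only an example, and your resolver may cache results, so you won't necessarily see rotation on every lookup):

    import socket

    # Look up the same hostname several times. With DNS-based load
    # balancing, successive lookups may return different or rotated
    # A records for the same name.
    for attempt in range(3):
        infos = socket.getaddrinfo("www.google.com", 80,
                                   proto=socket.IPPROTO_TCP)
        addresses = sorted({info[4][0] for info in infos})
        print("lookup", attempt + 1, addresses)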

There are plenty of resources on Google's architecture; this site has a nice list:

The MYYN
+1 very helpful
mtasic
+2  A: 

I've gone searching for information about this topic recently, and Wikipedia's Google Platform article was the best all-around source of information on how Google does it. In addition, the High Scalability blog has outstanding articles on scalability nearly every day. Be sure to check out their Google architecture article too.

scott
+1  A: 

The primary concept behind most highly scalable applications is clustering.
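
To make that concrete, here is a minimal Python sketch of one common clustering technique, sharding an index across nodes by hashing the search term; every name in it (NODES, route_term) is hypothetical, not any real engine's API:

    import hashlib

    # Partition (shard) an inverted index across cluster nodes by hashing
    # each term; a stable hash keeps a term on the same shard every time.
    NODES = ["index-node-0", "index-node-1", "index-node-2", "index-node-3"]

    def route_term(term):
        digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
        return NODES[int(digest, 16) % len(NODES)]

    for term in ["scalability", "architecture", "clustering"]:
        print(term, "->", route_term(term))

Each query term then only touches one node, so adding nodes spreads both storage and lookup load.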

Here are some resources regarding the cluster architectures of different search engines.

You can also read interesting research articles at Google Research and Yahoo Research.

Ramesh Tabarna
+3  A: 

DNS Load Balancing is correct, but it is not really the full answer to the question. Google uses a multitude of techniques, including but not limited to the following:

  • DNS Load Balancing (as suggested)
  • Clustering - as suggested, but note the following
    • clustered databases (database storage and retrieval are spread over many machines)
    • clustered web services (analogous to DNSLB here)
    • an internally developed clustered/distributed file system (the Google File System)
  • Highly optimised search indices and algorithms, making storage efficient and retrieval fast across the cluster
  • Caching of requests (Squid), responses (Squid), and databases (in memory; see shards in the above article); a small caching sketch follows this list
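
As promised above, here is a minimal Python sketch of response caching with a TTL; it only illustrates the role that Squid or an in-memory layer plays, and all names in it are made up for the example:

    import time

    CACHE = {}           # query -> (timestamp, result)
    TTL_SECONDS = 60.0

    def expensive_search(query):
        time.sleep(0.1)  # stand-in for querying the clustered index
        return "results for " + query

    def cached_search(query):
        now = time.time()
        entry = CACHE.get(query)
        if entry is not None and now - entry[0] < TTL_SECONDS:
            return entry[1]              # cache hit: skip the backend
        result = expensive_search(query)
        CACHE[query] = (now, result)     # cache miss: store with timestamp
        return result

    print(cached_search("scalability"))  # slow: hits the backend
    print(cached_search("scalability"))  # fast: served from cache
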
Kurucu
+1 nice update to accepted answer
mtasic