Ask anyone in the internet business about Datacenters and they will be able to tell you what they are. In fact, I’ve met people outside of the internet business who know what Datacenters are. So everyone knows what a Datacenter is, but how do they really work?
A few weeks ago I was asked to create an application that had to work with Google Datacenters (UK-based Datacenters, to be more specific), and off I went to do some research. To my amusement I found that I didn’t know much about these centers and that, in actual fact, not many people do, because Google tries to keep it all under wraps. However, after all of my research I thought I would share some of my findings with you.
So let’s first look at what a Datacenter actually is. The technical definition: a Datacenter is a location owned by a Web site host that contains Web servers. So it is a building, office or computer room, located anywhere in the world, that contains Web servers or computers that perform certain tasks. In Google’s case, these machines contain and manage copies of the search results and online Web applications.
So how many of these Datacenters does Google have spread across the world? No one really knows. Some speculate hundreds, but those who are really close to Google have mentioned that the number is closer to 40. Yes, 40, but considering that each Datacenter has about 150 server racks, each hosting about 40 servers, it becomes clear that Google has over 200 000 Web servers worldwide (40 × 150 × 40 = 240 000).
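The arithmetic behind that estimate can be written out as a quick sketch. It assumes the figures are read as 150 racks per Datacenter and about 40 servers per rack; the constant and function names are my own, and the inputs are the rough estimates quoted above, not confirmed numbers.

```python
# Rough estimates quoted in this post; none of them are confirmed by Google.
DATACENTERS = 40            # estimated number of Datacenters
RACKS_PER_DATACENTER = 150  # estimated racks in each Datacenter
SERVERS_PER_RACK = 40       # assumed reading: about 40 servers per rack


def estimate_servers(datacenters, racks_per_datacenter, servers_per_rack):
    """Multiply the three estimates to get a worldwide server count."""
    return datacenters * racks_per_datacenter * servers_per_rack


total = estimate_servers(DATACENTERS, RACKS_PER_DATACENTER, SERVERS_PER_RACK)
print(f"Estimated servers worldwide: {total:,}")
```

Change any one of the three inputs and the total moves a lot, which is why the real figure is so hard to pin down from the outside.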
Also, taking into consideration that Google spends close to £1 billion a year on upgrading and increasing the number of Datacenters, the true number will never really be known to outsiders. Another thing that I should mention is that Google doesn’t rely on third-party vendors for the hardware or software used within their Datacenters. They build and program everything themselves.
One of the general misconceptions is that there are location-specific Datacenters across the globe. In reality, the majority of the Datacenters are located in the USA; it is their IP addresses, rather, that are specific to a country. For instance, most of Europe’s search results originate from the IP address 22.214.171.124, but the Datacenter itself is situated on the West coast of America. In fact, the only country besides America to have its own Google Datacenters is China, and that’s only due to certain constraints implemented by the Chinese government.
Another misconception is that each Datacenter has its own set of results. This is only half true. In actual fact, each Datacenter is a copy of the others, creating, in theory, one large Datacenter. Datacenters continuously update their information from each other, but a slow-updating Datacenter could create a situation where a person hitting one Datacenter gets an entirely different result set from a person hitting the next.
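To picture how a slow update leads to different results, here is a toy sketch. It is not Google’s replication mechanism, which isn’t public; the `Replica` class, the version numbers and the page names are all made up for illustration.

```python
class Replica:
    """A hypothetical Datacenter holding one copy of the search index."""

    def __init__(self, name):
        self.name = name
        self.version = 0   # version of the index this copy currently holds
        self.index = {}    # query -> ordered list of results

    def apply_update(self, version, index):
        """Accept an update only if it is newer than the current copy."""
        if version > self.version:
            self.version = version
            self.index = dict(index)

    def search(self, query):
        return self.index.get(query, [])


# Two made-up versions of the index for the same query.
index_v1 = {"restaurant": ["page-a", "page-b"]}
index_v2 = {"restaurant": ["page-c", "page-a", "page-b"]}

fast, slow = Replica("fast"), Replica("slow")
for replica in (fast, slow):
    replica.apply_update(1, index_v1)

# The fast copy receives version 2; the slow copy has not caught up yet.
fast.apply_update(2, index_v2)
results_differ = fast.search("restaurant") != slow.search("restaurant")

# Once the slow copy catches up, both return the same results again.
slow.apply_update(2, index_v2)
results_match = fast.search("restaurant") == slow.search("restaurant")
```

Between the two updates, two people sending the same query would see different result sets depending on which copy they happened to hit.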
So why don’t we just hit one default Datacenter? It’s quite simple actually, and it’s the reason for Google’s popularity (and efficiency) today. If you’re in England searching for something, Google automatically checks your location and then sends you to the nearest Datacenter, which returns your location-specific results. Firstly, this speeds up your search time, and secondly, it makes your results more relevant to your location (imagine searching for a restaurant: Google displays restaurants in your area and not some restaurant in Germany). Another benefit is that if a Datacenter goes down, you will simply be passed on to the next appropriate one without even noticing the difference.
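The “nearest Datacenter, with a fallback if it goes down” idea can be sketched as follows. This is only a toy illustration under my own assumptions: the Datacenter names and coordinates are invented, and picking by straight-line distance is a stand-in for whatever Google really does, which isn’t public.

```python
import math
from dataclasses import dataclass


@dataclass
class Datacenter:
    name: str
    lat: float
    lon: float
    healthy: bool = True


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def pick_datacenter(user_lat, user_lon, datacenters):
    """Return the closest Datacenter that is still up."""
    candidates = [dc for dc in datacenters if dc.healthy]
    if not candidates:
        raise RuntimeError("no healthy Datacenter available")
    return min(
        candidates,
        key=lambda dc: haversine_km(user_lat, user_lon, dc.lat, dc.lon),
    )


# Hypothetical Datacenters; names and coordinates are invented.
DATACENTERS = [
    Datacenter("dc-west", 45.6, -121.2),
    Datacenter("dc-central", 41.2, -95.9),
    Datacenter("dc-east", 33.0, -80.0),
]

london = (51.5, -0.13)
first_choice = pick_datacenter(*london, DATACENTERS)

# Simulate an outage: the user is passed on to the next nearest one.
first_choice.healthy = False
fallback = pick_datacenter(*london, DATACENTERS)
print(first_choice.name, "->", fallback.name)
```

The person searching never sees any of this; they just get an answer from whichever Datacenter is closest and still running.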
The last thing I would like to mention is how particular Google is when choosing locations for their Datacenters. Firstly, the Datacenter must have access to cheap and, as much as possible, renewable energy. Secondly, it must have access to plenty of cheap bandwidth. Pretty simple then, but Google has always pushed the envelope, and because of this they are starting to invest in self-powered Datacenters that will be situated in the ocean, generating power from the waves. Whether this is to help the planet or to save money in the long run, we don’t know, but it’s a win-win situation either way.
All in all, this information is only a drop in the ocean and is probably out of date by the time I’ve posted it. Hopefully it will help you better understand Google’s techniques in their continued domination of the Search market.