Project Natick -- Microsoft's Undersea Data Centers
Microsoft recently shared information on its efforts to develop an undersea distributed data center solution to power The Cloud in the future.
Data Center Challenges
Constructing and operating data centers for a global market is a challenging proposition. Imagine what Microsoft’s data center planners and operators go through on a daily basis. No matter how many regions are built, each with multiple data centers (50 Azure regions have been announced so far), there are never enough. They are never close enough to customers. The costs and complexities are never-ending.
The cost of land for one of these huge endeavors alone could be staggering. I know that in the USA, Microsoft typically builds out because land is relatively affordable. But in other locations such as Dublin, Ireland, Microsoft acquired a large tract of land when it was at its most expensive. In Japan, there is no such thing as affordable land.
To date in Europe, there are 8 Azure regions (2 being in Azure Germany), with two more planned for Germany and an additional two announced for Switzerland. But my bet is that companies in Luxembourg, Sweden, Liechtenstein, and more will all refuse to consider Azure because there isn’t a region close to them. No matter how much you build or announce, there will never be enough. It’s not like a data center is a building you just inflate overnight on a whim!
Power is also a challenge. A cloud region (a cluster of data centers) consumes the same power as a medium-sized city. Supplying that electricity is a problem! Here in Ireland, the local suppliers are unable to keep up with the growth of Microsoft’s footprint, spanning internal IT, Bing, Office 365, Azure, and more. Microsoft recently applied for planning permission to build its own power station and has signed a contract to consume all of the power generated by a large wind farm. With more plans to expand its physical footprint, Microsoft will need more power in the future. The Irish government has to consider whether the investment in infrastructure is worth the national income from these data centers. That’s a question that Microsoft understands and considers for its future plans.
Microsoft has a long history of doing research into all kinds of weird and wonderful things, a few of which turn into real, applicable technologies. Before Microsoft started building its cloud data centers, it challenged previous strongly held beliefs about cooling, air conditioning, and construction of computer rooms. With this, the server and storage design methods of the past were dispensed with, and software-defined infrastructure on commodity hardware became the norm.
Such researching and questioning of accepted beliefs continues with hardware, software, and building design. Nothing displays this more than Project Natick (pronounced Nay-Tick). This is a project to develop, in cooperation with a French company called Naval Group, undersea data center pods that are being run in the Orkney Islands off the northern coast of Scotland, a rugged, windy, cold, and remote location.
Project Natick is the second time that Microsoft has announced an undersea deployment; the first was last year when a pod called Leona Philpot was placed undersea successfully for 105 days. With that proof of concept complete, Microsoft started work on a solution complete with servers and storage.
As you can imagine, there are all kinds of challenges with deploying a data center under the sea. A foundation must be built, broadband must be supplied, electricity and water have a damaging way of mixing, maintenance could be a nightmare, and workers will have to fend off kraken attacks.
There are several reasons why all these obstacles are worth overcoming:
- Available “land”: There is an incredible amount of coastal seabed that isn’t being used. It’s effectively free.
- Tidal power: There’s a reason that Orkney was selected as the test location. It was already one of the hot centers of research in tidal power, a system by which the movement of the sea (under or over) can be harnessed to generate carbon neutral electricity.
- Cooling: The “hot aisle” is a huge cost for every data center. Microsoft uses free air cooling in many data center regions but this only works well in moderate climates. In more extreme locations, normal cold air fans are required. Seawater is cold and provides ample cooling if it can be pumped to the right places to transport the heat away.
- Bandwidth: The backbones of the Internet and every global cloud (Microsoft, AWS, Facebook, and so on) all reside on the ocean floor. Connecting to those pipes in the sea will probably reduce packet latency.
- Population distribution: 50 percent of the world’s population lives near a shoreline (sea or large water body). Most large cities are near the sea. It makes sense that data centers are placed as close to the population as possible but without the punitive property costs.
- Rapid re-deploy: A cloud-scale data center is specialized for its compute and storage payload. When that payload is being replaced, it is often more affordable to knock down and rebuild the facility. The container system that Microsoft promoted several years ago is not widely adopted in Azure because it requires too much expensive land. A pod system on the sea floor could be viable and allow Microsoft to change compute/storage generations more rapidly than on land.
The more I have thought about this concept, the more I am convinced that there will be another side effect that might (not definitely) have benefits. Environmentalists have campaigned that large sections of coastal waters need protection against over-harvesting of fish stocks, which can also damage the sea floor and the life that resides there. Once a foundation is built, Microsoft’s aqua-centers (I’m claiming that one) will need very little work – more on this in a moment. Trawling with huge nets would damage the system so Microsoft would need to negotiate with the local government and fishing companies not to harvest these areas – and thus could create wildlife reserves around the distributed data centers.
I mentioned the reduced need for maintenance. This would be essential because the only way to do work in these pods would be to raise them up onto a ship. How will someone replace disks or other components when they fail? That’s what we do in normal environments. If something breaks, we get an alert, open a call, and replace the component as soon as possible before there’s a service outage.
The likes of Azure, AWS, and Facebook are designed for failure (fault domains). Imagine the hundreds of millions of disks that are probably in Azure. Do you think that there’s someone running every time a disk light changes to red or amber? There’s a reason that there are huge levels of resilience. This sort of maintenance will be done in periodic batches. Someone is going into a compute room on a periodic schedule with a set list of maintenance tasks and they won’t be back again until that room comes up on the rota. One disk blinking is a non-event. So I can imagine that data/services would be spread across multiple pods. If one disk has an issue, it’s ignored until that pod is due to be lifted for scheduled maintenance.
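To make the "ignore it until the pod is lifted" idea concrete, here is a minimal sketch in Python. It is not how Azure or Natick actually works; the `Pod` class, the `is_available` helper, and the three-replica layout are all assumptions of mine for illustration. It shows only that data stays reachable while at least one replica sits on a healthy disk, and that failures simply accumulate until the scheduled maintenance visit.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical model of one undersea pod acting as a fault domain."""
    name: str
    failed_disks: set = field(default_factory=set)

    def report_failure(self, disk_id: int) -> None:
        # No call-out and no repair: the failure is only recorded.
        self.failed_disks.add(disk_id)

    def scheduled_maintenance(self) -> int:
        # The pod is lifted on its rota; every failed disk is replaced in one batch.
        replaced = len(self.failed_disks)
        self.failed_disks.clear()
        return replaced


def is_available(replicas, pods) -> bool:
    """Data is reachable if any replica lives on a disk that has not failed.

    replicas: list of (pod_name, disk_id) pairs
    pods: dict mapping pod_name -> Pod
    """
    return any(disk_id not in pods[pod_name].failed_disks
               for pod_name, disk_id in replicas)


# Assumed layout: one piece of data replicated across three pods.
pods = {name: Pod(name) for name in ("pod-a", "pod-b", "pod-c")}
replicas = [("pod-a", 1), ("pod-b", 4), ("pod-c", 2)]

pods["pod-a"].report_failure(1)   # one blinking disk is a non-event
pods["pod-b"].report_failure(4)   # even a second failure is tolerated
still_available = is_available(replicas, pods)

replaced_in_pod_a = pods["pod-a"].scheduled_maintenance()
```

The point of the sketch is the shape of the design: resilience comes from spreading copies across pods, so repair can be a slow, batched process instead of an urgent one.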
What Does It Look Like?
The Natick aqua-center is made up of one or more submarine pods. The pods are 12.2 m x 2.8 m cylinders (approximately the size of a shipping container) that are lowered onto a foundation. The pod is powered entirely by on-shore wind and tidal/wave power, with a consumption rate of 240 kW.
864 “standard” Microsoft data center servers with FPGA acceleration are loaded into 12 racks (72 servers each). There is also 27.6 petabytes of storage.
It takes less than 90 days to bring a pod into operation after it ships from the factory, and each pod has an expected life of up to 5 years.
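A little back-of-the-envelope arithmetic puts those numbers in perspective. The Python below uses only the figures quoted above; the per-server power figure is an upper bound that assumes the whole 240 kW budget goes to the servers (my assumption, since the split between IT load and everything else is not stated), and the storage figure assumes decimal units (1 PB = 1,000 TB) spread evenly across servers.

```python
# Figures quoted in the article
SERVERS = 864
RACKS = 12
POD_POWER_KW = 240
STORAGE_PB = 27.6

# Servers per rack (the article states 72)
servers_per_rack = SERVERS // RACKS

# Average power budget per rack
power_per_rack_kw = POD_POWER_KW / RACKS

# Upper bound per server: assumes all 240 kW is consumed by servers
power_per_server_w = POD_POWER_KW * 1000 / SERVERS

# Average storage per server, assuming 1 PB = 1,000 TB and an even spread
storage_per_server_tb = STORAGE_PB * 1000 / SERVERS

print(f"{servers_per_rack} servers per rack")
print(f"{power_per_rack_kw:.1f} kW per rack")
print(f"{power_per_server_w:.1f} W per server (upper bound)")
print(f"{storage_per_server_tb:.1f} TB per server (average)")
```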
Shooting for the Sea Floor
The project is still in early research. This phase is focused on operating a single pod with a low power usage effectiveness (PUE), an important measure of a data center’s economic and environmental impact. It is being deployed no more than 100 meters deep and no more than 12 nautical miles from land, with the intention to demonstrate lights-out operation for up to 5 years. The current phase (phase 2) is still using a land-based grid power supply. There is no estimate on when or if Project Natick will ever reach production.
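For anyone unfamiliar with the measure, PUE is the total energy drawn by the facility divided by the energy that actually reaches the IT equipment, so 1.0 is the theoretical ideal and lower is better. Here is a small Python sketch of the calculation. The function name and the sample readings are hypothetical and are not measurements from Project Natick.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical one-hour readings, for illustration only:
# 240 kWh drawn in total, of which 225 kWh reached servers and storage.
example_pue = pue(total_facility_kwh=240.0, it_equipment_kwh=225.0)

# The overhead share is everything that did not reach the IT equipment
# (cooling, power conversion, and so on).
overhead_share = 1 - 1 / example_pue

print(f"PUE: {example_pue:.2f}, overhead: {overhead_share:.1%}")
```

The appeal of cold seawater is that it should shrink the overhead term, which is where cooling normally lands.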
Moon shots like Project Natick are important because we learn a lot from them, and not just from the core goal. This project might go nowhere but it might lead to other data center innovations that Microsoft might share through the Open Compute Project. However, if Project Natick works technically, operationally and economically, it could revolutionize the cloud.