Most administrators don’t know or care about the real cost of servers: power. A single server costs much more than what you pay to Dell or HP. The power alone massively outweighs the purchase cost. It’s said that a typical server has the carbon footprint of a car. It seems inevitable that we’re going to see carbon taxes hitting businesses. Cloud computing and Software-as-a-Service mightn’t be for everyone, so businesses that keep running their own servers need a solution. Cloud providers need a solution to power issues too, because the biggest cost they have to pass on to customers is electricity.
I found this commentary by Chris Wolf talking about an experimental feature that was included in VMware VI3.5. The feature, called Distributed Power Management (DPM), is an interesting one, and it nearly had me swinging towards VMware instead of Hyper-V. VirtualCenter monitors how much of each host’s resources the VMs are using and, using DRS and memory over-subscription, consolidates the VMs onto fewer hosts. This allows idle hosts to be powered down or suspended. When resource consumption grows, the required idle hosts are powered back up using Wake-on-LAN (WOL). VMs can then be migrated using VMotion to ensure they get the CPU and RAM (and probably I/O) resources that they need.
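To make the consolidation idea concrete, here is a minimal Python sketch of how VMs might be packed onto fewer hosts so that the emptied hosts can be powered down. It is not VMware’s algorithm: the `Host` class, the `consolidate` function, the dictionary layout of a VM and the greedy first-fit strategy are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical model of a virtualisation host (not a VMware API)."""
    name: str
    cpu_capacity: float  # assumed unit, e.g. GHz
    ram_capacity: float  # assumed unit, e.g. GB
    vms: list = field(default_factory=list)  # each VM: {"name", "cpu", "ram"}

    def cpu_used(self):
        return sum(vm["cpu"] for vm in self.vms)

    def ram_used(self):
        return sum(vm["ram"] for vm in self.vms)

    def fits(self, vm):
        """True if the VM fits in the remaining CPU and RAM."""
        return (self.cpu_used() + vm["cpu"] <= self.cpu_capacity
                and self.ram_used() + vm["ram"] <= self.ram_capacity)


def consolidate(hosts):
    """Greedy first-fit: try to empty the least-loaded hosts.

    Returns the names of hosts left with no VMs, i.e. the candidates
    for being powered down or suspended.
    """
    # Visit lightly loaded hosts first; they are the cheapest to evacuate.
    for source in sorted(hosts, key=lambda h: h.cpu_used()):
        if not source.vms:
            continue
        # Only move VMs onto hosts that are still running something.
        targets = [h for h in hosts if h is not source and h.vms]
        moved = []  # (vm, target) pairs, so a partial evacuation can be undone
        for vm in list(source.vms):
            target = next((t for t in targets if t.fits(vm)), None)
            if target is None:
                break
            target.vms.append(vm)
            source.vms.remove(vm)
            moved.append((vm, target))
        if source.vms:
            # The host could not be fully emptied: put everything back.
            for vm, target in moved:
                target.vms.remove(vm)
                source.vms.append(vm)
    return [h.name for h in hosts if not h.vms]
```

A real scheduler would also weigh I/O, reserve headroom for spikes and limit how many migrations it triggers at once. The sketch only shows the basic shape: move the VMs, then see which hosts are left empty.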
The commentary talks about how people are wary of powering production servers down and up. That’s fair enough. In my opinion, however, that’s the wrong way to look at this. The production servers are the VMs, and in this scenario the VMs are never powered down. They’re offline for a few milliseconds as they VMotion across the cluster, something that VMware customers are well used to by now.
The hosts are just physical resources. When you’re dealing with virtualisation, the hardware is just an enabling layer, like electricity or the network. And just like those utilities, there’s fault tolerance at this layer, or there should be. A network that could realistically use DPM to save power will have a significant number of hosts. It should be running at least N+1 hosts, where N is the number it actually requires, and maybe even N+2. So what happens if there’s an occasional hardware failure? If you run an enterprise network then the hardware should be monitored and any faults should be responded to immediately.
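The N+1 reasoning above comes down to simple arithmetic. The Python sketch below is only an illustration: the function names and the idea of measuring demand and capacity in one combined unit are my own assumptions.

```python
import math


def hosts_to_keep_on(total_demand, host_capacity, spare_hosts=1):
    """Hosts needed to carry the demand (N) plus the spares (the +1 or +2)."""
    needed = math.ceil(total_demand / host_capacity)
    return needed + spare_hosts


def hosts_safe_to_power_down(total_hosts, total_demand, host_capacity,
                             spare_hosts=1):
    """How many hosts could be idle while still keeping N + spare_hosts on."""
    keep = hosts_to_keep_on(total_demand, host_capacity, spare_hosts)
    return max(0, total_hosts - keep)
```

For example, with 10 identical hosts of capacity 10 and a total demand of 45, N is 5. Keeping N+1 powered on leaves 4 hosts that could be shut down, and keeping N+2 leaves 3.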
Microsoft are currently taking a different approach to the power issue when it comes to Windows Server 2008 R2 and, logically, Hyper-V. MS are using Core and CPU Parking. The server monitors the demand on the CPU cores every X milliseconds. When a core is idle it is suspended, thus reducing its power consumption. The CPU core is the major draw on power in a server. It’s also the generator of heat, and cooling that heat is another major draw on power. Suspending idle cores reduces both of those power demands. If a core is required then it is snapped back online. The trick is in defining appropriate idle windows: you don’t want to suspend a core at millisecond 1 and find you’re always bringing it back online at millisecond 2. That’s wasteful. When all the cores in a CPU are idle, the CPU itself is parked, thus saving more power.
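The idle window idea is easier to see in code. This is a rough Python sketch of the principle only, not how Windows implements it: the function name, the 5% idle threshold and the window of three samples are all assumptions I have made up for illustration.

```python
def parking_decisions(utilisation_samples, idle_threshold=0.05, idle_window=3):
    """Decide, sample by sample, whether a core is parked.

    A core is parked only after `idle_window` consecutive idle samples,
    and it is brought back online as soon as demand appears.
    Returns a list of booleans (True means parked during that sample).
    """
    parked = False
    idle_streak = 0
    states = []
    for load in utilisation_samples:
        if load > idle_threshold:
            # Demand arrived: snap the core back online immediately.
            idle_streak = 0
            parked = False
        else:
            idle_streak += 1
            if idle_streak >= idle_window:
                parked = True
        states.append(parked)
    return states


def park_events(states):
    """Count how many times the core goes from online to parked."""
    return sum(1 for before, after in zip([False] + states, states)
               if after and not before)
```

Feeding the same bursty samples through with `idle_window=1` and then `idle_window=3` shows the trade-off: the shorter window parks the core more often, and each of those is a suspend that may be undone a moment later.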
I was at a power meeting/interview session with MS at TechEd EMEA and I brought up the VMware DPM approach. I don’t know if it’s something MS will look at or not. I hope they do look at it for the next release after Windows Server 2008 R2. Right now, I have to applaud VMware for trying to do something. They do see the hardware as just an enabling layer, not the production servers. I think that’s the right point of view to take. When DPM does go live I can see it saving VMware customers a good bit of money.