The speaker was Mark Russinovich, Microsoft Technical Fellow and co-founder of Winternals and Sysinternals.
This was a very exciting session with lots of news. Anyone attending was left quite geeked out by what’s happening with Microsoft Windows Server.
VM Migration
Live Migration is Microsoft's answer to VMware VMotion. At a high level it seems quite similar: it requires a shared cluster file system between hosts and gradually transfers memory before switching the VM over.
How it works:
- A connection is established between the source and target hosts.
- Transfer the VM configuration.
- Transfer RAM.
- Suspend the VM on the source and transfer any remaining state (CPU and maybe some RAM).
- Resume the VM on the target.
The goal is that the final suspend-and-switch (the last two steps above) completes in less than 20 milliseconds, so that clients never notice an outage.
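To make the sequence concrete, here's a toy Python sketch of those five steps. The Host and VM classes and all the names are invented stand-ins, not Hyper-V or WMI objects:

```python
# Toy stand-ins for the hosts and the VM; nothing here is a real Hyper-V API.
class Host:
    def __init__(self, name):
        self.name = name
        self.vms = {}

class VM:
    def __init__(self, name, config, ram):
        self.name, self.config, self.ram = name, config, ram
        self.suspended = False

def live_migrate(vm, source, target):
    # 1. connection established (implicit here: we hold references to both hosts)
    shadow = {"config": vm.config}   # 2. transfer the VM configuration
    shadow["ram"] = list(vm.ram)     # 3. transfer RAM (iteratively - see below)
    vm.suspended = True              # 4. suspend; ship CPU state and last pages
    shadow["cpu"] = "register state"
    target.vms[vm.name] = shadow
    del source.vms[vm.name]
    vm.suspended = False             # 5. resume on the target

src, dst = Host("hv01"), Host("hv02")
guest = VM("web01", config={"vcpus": 2}, ram=[0, 1, 2, 3])
src.vms[guest.name] = guest
live_migrate(guest, src, dst)
print(sorted(dst.vms))               # ['web01']
```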
The RAM transfer works as follows:
- The source host creates a "dirty" bitmap of the memory pages used by the VM.
- The pages are copied to the target host one by one. As this happens, they are marked as "clean".
- If the VM changes a memory page then it is marked as dirty, thus requiring that it be copied again.
- This iterative copying process is repeated on the remaining dirty pages, up to 10 times, attempting to get every page marked as clean.
- The process stops either when (1) all pages are clean (copied) or (2) all 10 passes have been completed.
At this point the VM is suspended (frozen). The remaining state (the CPU state and the few still-dirty pages) is copied to the target host, where the VM is reanimated.
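A minimal simulation of that pre-copy loop, assuming a made-up 20% chance that the guest re-dirties any copied page between passes:

```python
import random

MAX_PASSES = 10

def precopy(page_count=64):
    dirty = set(range(page_count))   # every page starts out dirty
    for _ in range(MAX_PASSES):      # stop condition (2): 10 passes completed
        if not dirty:                # stop condition (1): all pages are clean
            break
        copied = dirty               # copy this batch; the pages become clean
        # the guest keeps running, so some copied pages get written to again
        dirty = {p for p in copied if random.random() < 0.2}
    return dirty                     # the remainder ships during the blackout

leftover = precopy()
print(f"{len(leftover)} dirty pages left to copy while the VM is suspended")
```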
Clustered Shared Volumes (CSV)
- This was needed for Live Migration, just as VMFS is for VMotion. It allows multiple hosts to access a single volume to facilitate near-instant migration of VMs from one host to another.
- It simplifies storage configuration. We will no longer need 1 LUN for every VM.
- Large CSV LUNs containing many VMs will make the self-service portal (with quotas) much more acceptable for end-user usage.
How it works:
- One host owns the namespace (LUN), i.e. the directory structure and metadata.
- Any host can read/write (and lock) a file.
- Relatively rare operations such as create file, delete file and resize file are sent to the LUN owner.
- The host that owns a VM opens the VHD file for exclusive use.
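A toy model of that split, with rare namespace changes forwarded to the owning host while data I/O from any host goes straight to the shared LUN. The class and method names are mine, not the real CSV interfaces:

```python
class Host:
    def apply_metadata_change(self, volume, name):
        volume.metadata[name] = 0    # the owner applies the namespace change

class CsvVolume:
    def __init__(self, owner):
        self.owner = owner           # the one host that owns the namespace
        self.metadata = {}           # filename -> size (directory structure)
        self.blocks = {}             # filename -> data on the shared LUN

    def create_file(self, requester, name):
        # rare metadata operation: always routed through the owner
        # (in reality only non-owner requests need forwarding)
        self.owner.apply_metadata_change(self, name)

    def write(self, requester, name, data):
        # common data path: any host writes blocks directly, no forwarding
        self.blocks[name] = data

owner, other = Host(), Host()
vol = CsvVolume(owner)
vol.create_file(other, "web01.vhd")      # forwarded to the LUN owner
vol.write(other, "web01.vhd", b"data")   # direct to disk
print(vol.metadata, vol.blocks)
```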
Network Virtual Machine Queues
Right now, processing network packets for VMs requires:
- VLAN lookup to determine the destination VLAN.
- MAC lookup to determine the destination machine.
- Copying the packet over the VM Bus between the VM and its destination, whether that is another VM or the parent partition (for the physical network).
This requires 2 context switches by the host CPU.
Microsoft has worked with NIC hardware vendors to introduce Network VM Queues. The VMQ works as follows:
- The hardware participates in the process.
- The parent partition is removed from the process.
- The number of context switches is reduced.
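In software terms, the queue selection that the NIC now performs in hardware looks roughly like this (the table layout and MACs are invented for the example):

```python
# (VLAN id, destination MAC) -> the per-VM receive queue the NIC DMAs into
vm_queues = {
    (10, "00:15:5d:00:00:01"): "queue-web01",
    (10, "00:15:5d:00:00:02"): "queue-sql01",
}

def select_queue(vlan, dst_mac):
    # hit: the frame lands straight in the VM's buffers, no parent partition
    # miss: fall back to the default queue, handled in software as before
    return vm_queues.get((vlan, dst_mac), "default-queue")

print(select_queue(10, "00:15:5d:00:00:01"))  # queue-web01
print(select_queue(20, "00:15:5d:00:00:01"))  # default-queue
```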
NIC Embedded Switch:
- This is used for VM-to-VM traffic.
- The hardware handles the virtual switch's traffic.
- If VMs are on the same virtual switch then they communicate via the hardware only.
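The forwarding decision, again sketched in software with made-up MAC addresses:

```python
# MACs of VMs attached to the same NIC-embedded switch
local_ports = {"00:15:5d:00:00:01", "00:15:5d:00:00:02"}

def forward(dst_mac):
    if dst_mac in local_ports:
        # both VMs hang off the same hardware switch: loop the frame back
        # inside the NIC, never touching the VM Bus or the physical wire
        return "hardware loopback"
    return "out the physical port"

print(forward("00:15:5d:00:00:02"))  # hardware loopback
print(forward("00:aa:bb:cc:dd:ee"))  # out the physical port
```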
Hyper-V Power Management
Windows 7 (and 2008 R2) brings:
- Core Parking: A core is put to sleep when it has been idle for a predefined time frame, calculated from the cost of bringing the core back into operation. If all cores in a socket go to sleep then the whole socket goes to sleep. The sleep time might only be milliseconds, but this saves power throughout the day. (There is a sketch of the idea after this list.)
- Timer Coalescing: Today, Hyper-V wastes small amounts of CPU cycles servicing timers in every VM, one VM at a time on different schedules. In Windows Server 2008 R2, the timers of all VMs are synchronised to fire together, which gives Core Parking more opportunities to kick in.
- Hyper-V uses these technologies. However, VM CPU resource guarantees are still honoured.
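The core-parking decision, sketched with invented thresholds; the point is simply that a core is only parked when the expected saving beats the wake-up cost:

```python
WAKE_COST_MS = 2                       # assumed cost of unparking a core
PARK_THRESHOLD_MS = 10 * WAKE_COST_MS  # only park when the saving beats the cost

def cores_to_park(idle_ms):
    """idle_ms maps core id -> how long that core has been idle."""
    return [core for core, idle in idle_ms.items() if idle >= PARK_THRESHOLD_MS]

print(cores_to_park({0: 0, 1: 25, 2: 3, 3: 40}))  # [1, 3]
```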
VM Memory Management
- This uses new technology from both Intel (EPT) and AMD (RVI/NPT) that allows the CPU itself to maintain 2 levels of memory mapping.
- In W2008, Hyper-V maintains a shadow table to map VM RAM to the host's physical RAM. It is estimated that this accounts for 10% of CPU activity and consumes roughly 1MB of RAM per VM.
- Second Level Address Translation (SLAT) removes the need for a shadow table. The hypervisor has less to do: CPU utilisation drops to 2%, and the roughly 1MB of RAM per VM is freed up.
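The two levels of mapping, reduced to a pair of lookups (the addresses are arbitrary examples):

```python
# guest virtual -> guest physical: the guest OS maintains this page table
guest_page_table = {0x1000: 0x5000}
# guest physical -> host physical: under SLAT the CPU walks this table itself;
# without SLAT the hypervisor must maintain an equivalent shadow table
second_level = {0x5000: 0x9F000}

def translate(guest_virtual):
    return second_level[guest_page_table[guest_virtual]]

print(hex(translate(0x1000)))  # 0x9f000
```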
Native VHD
This is easily the most exciting development I’ve heard this week and will change server computing in the data centre over the next few releases of Windows Server.
Microsoft's aim is to increase usage of the published VHD format:
- Reduce format explosion (eventually replace WIM).
- Leverage existing tasks.
- Give a consistent experience for partners and administrators.
Remember that we have 3 types of VHD:
- Fixed: A static sized virtual disk file.
- Dynamic: A maximum size is defined but the file only consumes the disk space required by the data it contains.
- Differencing: This is an extension of a targeted fixed or dynamic disk. The idea is that you load a differencing disk and it stores any disk data that differs from the targeted VHD file.
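The differencing read path boils down to "child first, parent as fallback". A toy version with two blocks:

```python
parent = {0: b"base os", 1: b"base data"}  # blocks in the targeted (parent) VHD
child = {1: b"changed data"}               # blocks written since the diff was made

def read_block(n):
    return child.get(n, parent.get(n))     # child overrides, parent is fallback

print(read_block(0), read_block(1))        # b'base os' b'changed data'
```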
Notes on VHD:
- VHDs have a maximum size of 2TB.
- MS aims for VHD performance within 10% of raw physical disk. They have gotten within 2% of raw physical disk performance in large-scale lab tests.
- The term "surface" means mounting the VHD file as an accessible volume on the physical server.
With native VHD you can mount a VHD file directly on a physical server. You can use it as an ordinary volume, or you can use BCDEDIT to boot the physical machine from the VHD file. In this scenario there are two volumes: a small volume holds the minimum boot files and the paging file for the server, and a storage volume contains the VHD file that contains the operating system.
We got a demonstration of the disk management in operation. Mark is actually running his demonstration operating system from a differencing disk. The clean demonstration image is in VHD file 1. VHD file 2 is a differencing disk that points to VHD1 as its source. The machine boots from VHD2, and anything that writes to the disk stores its data in VHD2.
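For reference, the BCD side of boot-from-VHD looks something like the following. The bcdedit verbs shown (/copy, and /set with the vhd=[volume] path syntax) are the documented ones; the Python wrapper, the example path and the GUID parsing are just my illustration and are deliberately simplistic:

```python
import subprocess

VHD = r"[D:]\vhds\demo.vhd"  # the BCD store's volume-relative VHD path syntax

# clone the current boot entry to get a new one to point at the VHD
out = subprocess.run(
    ["bcdedit", "/copy", "{current}", "/d", "Boot from VHD"],
    capture_output=True, text=True, check=True).stdout
guid = out[out.index("{"): out.index("}") + 1]  # new entry's GUID (fragile parse)

# point both the boot device and the OS device at the VHD file
subprocess.run(["bcdedit", "/set", guid, "device", f"vhd={VHD}"], check=True)
subprocess.run(["bcdedit", "/set", guid, "osdevice", f"vhd={VHD}"], check=True)
```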
Requirements:
Hyper-V must be installed before you "surface" a VHD. You can only boot from a VHD if you surface it. The paging file must exist on the physical boot disk.
Here’s the strategy:
- This is the long term data centre strategy.
- There will be one image format from Microsoft. They want to figure out how to make VHD do everything that WIM can before WIM is killed off.
- This general image format will allow easy migration, e.g. physical to virtual, virtual to physical and virtual to virtual. Imagine not needing to worry about drivers when going from one generation of hardware to another. Consider automatically migrating VMs from a Hyper-V cluster to a dedicated physical server, or vice versa, when performance/resource requirements change.
- Reduced total cost of ownership (TCO).
- Patching will become a safer process: freeze the machine, create a differencing disk, boot from the differencing disk. If all is well then merge the disks. If there is a problem, remove the differencing disk and boot from the original VHD.
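That patching pattern, in the same parent/child terms as the differencing-disk sketch earlier (all names invented):

```python
def start_patch(parent):
    # freeze the parent and boot from a fresh differencing disk
    return {"parent": parent, "writes": {}}

def commit(child):
    child["parent"].update(child["writes"])  # all is well: merge the disks

def rollback(child):
    child["writes"].clear()                  # problem: discard, boot the original

os_disk = {0: b"unpatched os"}
session = start_patch(os_disk)
session["writes"][0] = b"patched os"         # the update lands on the child only
commit(session)                              # or rollback(session) on failure
print(os_disk[0])                            # b'patched os'
```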
How Is It Deployed?
- A boot agent is installed on the hardware onto a boot disk. The paging file resides here.
- A VHD is created and surfaced.
- An operating system is deployed to the VHD.
- You can now manage the server as you always have.
Limitations:
- The differencing disks currently must reside on the same physical LUN as the targeted VHD.
- Dynamic disks are pre-expanded by default – not sure what that means to be honest.
- The nesting depth for booting from VHD is limited to no more than 2 levels.
- Non-boot VHDs are not auto-mounted at the moment.
Other Hyper-V Improvements
- Hot add storage.
- Performance improvements "everywhere".
- Support for 32 logical processors.