I’ll be joining fellow Cloud and Datacenter Management (Hyper-V) MVP Andy Syrewicze for a webcast by Altaro on June 14th at 3PM UK/Irish time, 4PM CET, and 10AM Eastern. The topic: What’s new in Windows Server 2016 Hyper-V (and related technologies). There’s quite a bit to cover in this new OS that we expect to be released during Microsoft Ignite 2016. I hope to see you there!
I recently spoke at the excellent Cloud and Datacenter Management conference in Dusseldorf, Germany. There were 5 tracks full of expert speakers from around Europe, and a few Microsoft US people, talking Windows Server 2016, Azure, System Center, Office 365 and more. Most of the sessions were in German, but many of the speakers (like me, Ben Armstrong, Matt McSpirit, Damian Flynn, Didier Van Hoye and more) were international and presented in English.
Note: Azure Backup Server does have a cost for local backup that is not sent to Azure. You are charged for the instance being protected, but there is no storage charge if you don’t send anything to Azure.
I have deployed Technical Preview 5 (TP5) of Windows Server 2016 (WS2016) to most of the hardware in my lab. One of the machines, a rather old DL380 G6, is set up as a standalone host. I’m managing it using Remote Server Administration Tools (RSAT) for Windows 10 (another VM).
I enabled Hyper-V on that host. I then deployed 4 x Generation 2 VMs using Nano Server (domain pre-joined using .djoin files) – this keeps the footprint tiny and the boot times are crazy fast.
Hyper-V is enabled in the Nano VMs – thanks to the addition of nested virtualization. I’ve also clustered these machines. Networking-wise, I have given each VM 2 x vNICs, each with MAC spoofing (for nested VMs) and NIC teaming enabled.
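A minimal sketch of those host-side settings might look like the following. It is an illustration, not the exact commands used in this lab: the VM names (Nano1 to Nano4) and the switch name (External) are placeholders, and it assumes each VM is powered off and has Dynamic Memory disabled before nested virtualization is turned on.

```powershell
# Run on the physical host. VM and switch names are hypothetical placeholders.
$vmNames = "Nano1", "Nano2", "Nano3", "Nano4"

foreach ($vmName in $vmNames) {
    # Expose the CPU virtualization extensions to the guest (nested virtualization).
    # The VM must be off when this is changed.
    Set-VMProcessor -VMName $vmName -ExposeVirtualizationExtensions $true

    # Add a second vNIC so the guest has two uplinks to team
    Add-VMNetworkAdapter -VMName $vmName -SwitchName "External"

    # MAC spoofing lets nested VM traffic leave the guest;
    # AllowTeaming permits teaming of the vNICs inside the guest
    Get-VMNetworkAdapter -VMName $vmName |
        Set-VMNetworkAdapter -MacAddressSpoofing On -AllowTeaming On
}
```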
I launched PowerShell ISE then used Enter-PSSession to connect to the host from the admin PC. And from the host, I used Enter-PSSession -VMName to use PowerShell Direct to get into each VM – this gives me connectivity without depending on the network. That’s because I wanted to deploy Switch Embedded Teaming (SET) and provision networking in the Nano VMs. This script configures each VM with 3 vNICs for the management OS, connected to the vSwitch that uses both of the Nano VM’s vNICs as teamed uplinks:
# Last octet of this VM's IP addresses
$idx = 54

# Create the vSwitch with Switch Embedded Teaming across both vNICs
New-VMSwitch -Name External -NetAdapterName "Ethernet","Ethernet 2" -EnableEmbeddedTeaming $true -AllowManagementOS $false

# Add the three management OS vNICs
Add-VMNetworkAdapter -ManagementOS -Name "Management" -SwitchName External
Add-VMNetworkAdapter -ManagementOS -Name "SMB1" -SwitchName External
Add-VMNetworkAdapter -ManagementOS -Name "SMB2" -SwitchName External

# Give the new vNICs time to appear before configuring them
Start-Sleep -Seconds 10

# Assign the IP configuration
New-NetIPAddress -InterfaceAlias "vEthernet (Management)" -IPAddress 172.16.2.$idx -PrefixLength 16 -DefaultGateway 172.16.1.1
Set-DnsClientServerAddress -InterfaceAlias "vEthernet (Management)" -ServerAddresses "172.16.1.40"
New-NetIPAddress -InterfaceAlias "vEthernet (SMB1)" -IPAddress 192.168.3.$idx -PrefixLength 24
New-NetIPAddress -InterfaceAlias "vEthernet (SMB2)" -IPAddress 192.168.4.$idx -PrefixLength 24
Note: there’s no mention of RDMA because I’m working in a non-RDMA scenario – a test/demo lab. Oh yes; you can learn Hyper-V, Live Migration, Failover Clustering, etc on your single PC now!
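If you would rather not open an interactive session in each VM, the same PowerShell Direct connection can be driven from a loop on the host with Invoke-Command -VMName. The following is only a sketch: the VM names, the credential prompt, and the starting value of $idx are assumptions, and the body of the script block is where the switch and vNIC commands from the script above would go.

```powershell
# Run on the physical host. VM names and the starting octet are hypothetical.
$cred = Get-Credential
$idx  = 51

foreach ($vmName in "Nano1", "Nano2", "Nano3", "Nano4") {
    # PowerShell Direct talks to the guest over the VMBus, so no guest networking is needed yet
    Invoke-Command -VMName $vmName -Credential $cred -ArgumentList $idx -ScriptBlock {
        param($idx)

        # ... New-VMSwitch, Add-VMNetworkAdapter, and New-NetIPAddress commands go here ...

        # Confirm which adapters ended up in the embedded team
        Get-VMSwitchTeam -Name External
    }

    # Next VM gets the next address
    $idx++
}
```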
And in no time, I had myself a new Hyper-V cluster with a tiny physical footprint, thanks to 4 new features in WS2016.
Microsoft has released a new version of the integration components for Linux guest operating systems running on Hyper-V (2008, 2008 R2, 2012, 2012 R2, and 2016 Technical Preview, Windows 8, Windows 8.1, and Azure).
- Expanded Releases: now applicable to Red Hat Enterprise Linux, CentOS, and Oracle Linux with Red Hat Compatible Kernel versions 5.2, 5.3, 5.4, and 7.2.
- Hyper-V Sockets.
- Manual Memory Hot Add.
- SCSI WWN.
- Uninstallation scripts.
There is a scenario when you are using Azure Site Recovery and a VM somehow becomes orphaned, no longer controlled by ASR, but you cannot remove replication from the VM on the host. I had that situation this morning with a WS2012 R2 Hyper-V VM (no VMM present).
The situation leaves you in a position where you cannot disable replication on the VM using either the UI or PowerShell, because the host continues to believe that replication is managed by Azure, even if you remove the provider (agent) from the host or remove the host from ASR. In PowerShell, you get the error:
“Operation not allowed because the virtual machine ‘<name>’ is replicating to a provider other than Hyper-V”
Microsoft has guidance on how to clear this problem up for Hyper-V to Azure and VMM to Azure replication, which I found by accident after a difficult 30 minutes! The key to the solution for me was to run a small 4-line script that removes replication using WMI, found under the heading “Clean up protection settings manually (between Hyper-V sites and Azure)”. I copied that script into ISE (running with elevated admin rights) and replication was disabled for the VM.
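A sketch of that kind of WMI cleanup is shown below. Treat it as an approximation of the documented script rather than a verbatim copy, and check it against Microsoft’s guidance before running it on a production host. The VM name is a placeholder.

```powershell
# Run elevated on the Hyper-V host. "VM01" is a placeholder VM name.
$vmName = "VM01"

# Find the VM object in the Hyper-V WMI (v2) namespace
$vm = Get-WmiObject -Namespace "root\virtualization\v2" -Query "SELECT * FROM Msvm_ComputerSystem WHERE ElementName = '$vmName'"

# Get the replication service and ask it to drop the replication relationship for that VM
$replicationService = Get-WmiObject -Namespace "root\virtualization\v2" -Query "SELECT * FROM Msvm_ReplicationService"
$replicationService.RemoveReplicationRelationship($vm.__PATH)
```

Afterwards, Get-VM should show the replication state of the VM as disabled.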
In this post I’ll tell you about the cluster-in-a-box solution from DataON Storage that allows you to deploy a Hyper-V cluster for a small-mid business or branch office in just 2U, at lower costs than you’ll pay to the likes of Dell/HP/EMC/etc, and with more performance.
So you might have noticed on social media that my employers are distributing storage/compute solutions from both DataON and Gridstore. While some might see them as competitors, I see them as complementary solutions in our portfolio that are for two different markets:
- Gridstore: Their hyper-converged infrastructure (HCI) products remove fear and risk by giving you a pre-packaged solution that is easy and quick to scale out.
- DataON: There are two offerings, in my opinion. SMEs want HA but at a budget they can afford – I’ll focus on that area in this article. And then there are the scaled-out Storage Spaces offerings, that with some engineering and knowledge, allow you to build out a huge storage system at a fraction of the cost of the competition – assuming you buy from distributors that aren’t more focused on selling EMC or NetApp 🙂
There is a myth out there that the cloud has or will remove servers from SMEs. The category “SME” covers a huge variety of companies. Outside of the USA, it’s described as a business with 5-250 users. I know that some in Microsoft USA describe it as a company with up to 2,500 users. So, sure, a business with 5-50 users might go server-less pretty easily today (assuming broadband availability), but other organizations might continue to keep their Hyper-V (more likely in SME) or vSphere (less likely in SME) infrastructures for the foreseeable future.
These businesses have the same demands for applications, and HA is no less important to a 50 user business than it is for a giant corporation; in fact, SMEs are hurt more when systems go down because they probably have a single revenue operation that gets shut down when some system fails.
So why isn’t the Hyper-V (or vSphere) cluster the norm in an SME? It’s simple: cost. It’s one thing to go from one host to two, but throw in the cost of a modest SAS/iSCSI SAN and that solution just became unaffordable – in case you don’t know, the storage companies allegedly make 85% margin on the list price of storage. SMEs just cannot justify the cost of SAN storage.
I was at the first Build conference in LA when Microsoft announced Windows 8 and Windows Server 2012. WS2012 gave us Storage Spaces, and Microsoft implored the hardware vendors to invest in this new technology, mainly because Microsoft saw it as the future of cluster storage. A Storage Spaces-certified JBOD can be used instead of a SAN as shared cluster storage, and this could greatly bring down the cost of Hyper-V storage for customers of all sizes. Tiered storage (SSD and HDD) combines the speed of SSD with the economy of large hard drives (now up to 10 TB). With transparent and automatic demand-based, block-based tiering, economy doesn’t mean a drop in performance – it actually increases performance!
One of the sessions, presented by Microsoft Clustering Principal PM Lead Elden Christensen, focused on a new type of hardware solution that MSFT wanted to see vendors develop. A Cluster-in-a-Box (CiB) would provide a small storage or Hyper-V cluster in a single pre-packaged and tested enclosure. That enclosure would contain:
- Up to 2 or 4 independent blade servers
- Shared storage in the form of a Storage Spaces “JBOD”
- Built in cluster networking
- Fault tolerant power supplies
- The ability to expand via SAS connections (additional JBODs)
I loved this idea; here was a hardware solution that was perfect for a Hyper-V cluster in an SME or a remote office/branch office (ROBO), and the deployment could be really simple – there are few decisions to make about the spec, performance would be awesome via storage tiering, and deployment could be really quick.
DataON CiB-9112 V12
This is the second generation of CiBs that I have worked with from DataON, a company that specialises in building state-of-the-art and Microsoft-certified Storage Spaces hardware. My employers, MicroWarehouse Ltd. (an Irish company that has nothing to do with an identically named UK company), distribute DataON hardware to resellers around Europe – everywhere from Galway in west Ireland to Poland so far.
The CiB concept is simple. There are two blade servers in the 2U enclosure. Each has the following spec:
- Dual Intel® Xeon® E5-2600v3 (Haswell-EP)
- DDR4 Reg. ECC memory up to 512GB
- Dual 1G SFP+ & IPMI management “KVM over IP” port
- Two PCI-e 3.0 x8 expansion slots
- One 12Gb/s SAS x4 HD expansion port
- Two 2.5” 6Gb/s SATA OS drive bays
Networking-wise, there are 4 NICs per blade:
- 2 x LAN facing Intel 1 GbE NICs, which I team for a virtual switch with management OS sharing enabled (with QoS enabled).
- 2 x internal Intel 10 GbE NICs, which I use for cluster communications and SMB 3.0 Live Migration. These NICs are internal copper connections so you do not need an external 10 GbE switch. I do not team these NICs, and they should be on 2 different subnets for cluster compatibility.
You can use the PCI-e expandability to add more SAS or NIC interfaces, as required, e.g. DataON work closely with Mellanox for RDMA networking.
The enclosure also has:
- 12-bay 3.5”/2.5” shared drive slots (with caddies)
- 1023W (1+1) redundant power
Typically, the 12 shared drive bays are used as a single storage pool with 4 x SSDs (performance) and 8 x 7200 RPM HDDs (capacity). Tiering in Storage Spaces works very well. Here’s an anecdote I heard while in a pre-sales meeting with one of our resellers:
They put a CiB (6 Gb SAS, instead of 12 Gb as on the CiB-9112) into a customer site last year. That customer had the need to run a regular batch job that would normally take hours, and they had gotten used to working around that dead time. Things changed when the VMs were moved onto the CiB. The batch job ran so quickly that the customer was sure that it hadn’t run correctly. The reseller double-checked everything, and found that Storage Spaces tiering and the power of the CiB blades had greatly improved the performance of the database in question, and everything was actually fine – great actually!
And here was the kicker – that customer got a 2 node Hyper-V cluster with shared storage in the form of a DataON CiB for less than the cost of a SAN, let alone the cost of the 2 Hyper-V nodes.
How well does this scale? I find that CPU/RAM are rarely the bottlenecks in the SME. There are plenty of cores/logical processors in the E5-2600v3, and 512 GB RAM is more than enough for any SME. Disk is usually the bottleneck. With a modest configuration (not the max) of 4 x 200 GB SSDs and 8 x 4 TB drives you’re looking at around 14 TB of usable 2-way mirrored (like RAID 10) storage. Or you could have 4 x 1.6 TB SSDs and 8 x 8 TB HDDs and have around 32 TB of usable 2-way mirrored storage. That’s plenty!
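The arithmetic behind those figures can be sketched as below. The function name is made up for illustration, and it only halves the raw capacity for the 2-way mirror. The remaining gap to the “around 14 TB” and “around 32 TB” figures is presumably the free space that gets left in the pool plus formatting overhead, which is an assumption on my part and not something this calculation models.

```powershell
# Hypothetical helper: rough capacity of a mirrored pool, in decimal TB.
function Get-MirrorCapacityTB {
    param(
        [double]$SsdCount, [double]$SsdSizeTB,
        [double]$HddCount, [double]$HddSizeTB,
        [int]$Copies = 2   # a 2-way mirror stores two copies of every block
    )

    $rawTB = ($SsdCount * $SsdSizeTB) + ($HddCount * $HddSizeTB)

    [pscustomobject]@{
        RawTB      = $rawTB
        MirroredTB = $rawTB / $Copies
    }
}

# 4 x 200 GB SSD + 8 x 4 TB HDD: 32.8 TB raw, 16.4 TB mirrored
Get-MirrorCapacityTB -SsdCount 4 -SsdSizeTB 0.2 -HddCount 8 -HddSizeTB 4

# 4 x 1.6 TB SSD + 8 x 8 TB HDD: 70.4 TB raw, 35.2 TB mirrored
Get-MirrorCapacityTB -SsdCount 4 -SsdSizeTB 1.6 -HddCount 8 -HddSizeTB 8
```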
And if that’s not enough, then you can expand the CiB using additional JBODs.
My Hands-On Experience
Lots of hardware goes through our warehouse that I never get to play with. But on occasion, a reseller will ask for my assistance. A couple of weeks ago, I got to do my first deployment of the 12 Gb SAS CiB-9112. We got it out of the box, and immediately I was impressed. This design indicates that engineers had designed the hardware for admins to manage. It really is a very clever and modular design.
The two side-bezels on the front of the 2U enclosure have a power switch and USB port for each blade server.
On the top, you can easily access the replaceable fans via a dedicated hinged panel. At the back, both fault-tolerant power supplies are in the middle, away from the clutter at the side of a rack. The blades can be removed separately from their SAS controllers. And each of the RAID1 disks for the blades’ OS (the management OS for a Hyper-V cluster) can be replaced without removing the blade.
Racking a CiB is a simple task – the entire Hyper-V cluster is a single 2U enclosure so there are no SAN controllers, SAN switches, SAN cables, or multiple servers. You slide a single 2U enclosure into its rail kit, plug in power, networking, and KVM, and you’re done.
Windows Server is pre-installed and you just need to modify the installation type (from eval) and enter your product key using DISM. Then you prep the cluster – DataON pre-installs MPIO, Hyper-V, and Failover Clustering to make your life easy.
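As a rough illustration of that DISM step, the commands look something like this from an elevated prompt. The target edition and the product key shown are placeholders, and the server needs a restart afterwards.

```powershell
# List the editions that this evaluation installation can be converted to
DISM /Online /Get-TargetEditions

# Convert from eval to a licensed edition (edition name and key are placeholders)
DISM /Online /Set-Edition:ServerStandard /ProductKey:XXXXX-XXXXX-XXXXX-XXXXX-XXXXX /AcceptEula
```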
My design is simple:
- The 1 GbE NICs are teamed, connected to a weight-based QoS Hyper-V switch, and shared with the parent. A weight of 50 is assigned to the default bucket QoS rule, and 50 is assigned to the management OS virtual NIC.
- The 10 GbE NICs are on 2 different subnets.
- I enable SMB 3.0 Live Migration on both nodes in Hyper-V Manager.
- MPIO is configured with the Least Blocks (LB) policy.
- I ensure that VMQ is disabled on the 1 GbE NICs and enabled on the 10 GbE NICs.
- I form the cluster with no disks, and configure the 10 GbE NICs for Live Migration.
- A single clustered storage pool is created in Failover Cluster Manager.
- A 1 GB (it’s always bigger) 2-way mirrored virtual disk is created and configured as the witness disk in the cluster.
- I create 2 virtual disks to be used as CSVs in the cluster, with 64 KB interleaves and formatted with 64 KB allocation unit size. The CSVs are tiered with some SSD and some HDD … I always leave free space in the pool to allow expandability of one CSV over the other. HA VMs are balanced between the 2 CSVs.
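The design above could be scripted along the following lines. This is a sketch under stated assumptions, not the script I use: every adapter, team, switch, cluster, pool, and tier name is a placeholder, as are the IP address and tier sizes, and it assumes the storage pool (Pool1 here) has already been created in Failover Cluster Manager.

```powershell
# Sketch only: all names, addresses, and sizes below are placeholders.

# 1 GbE NICs: team them, then build a weight-based QoS switch shared with the management OS
New-NetLbfoTeam -Name "LAN-Team" -TeamMembers "LAN1", "LAN2" -TeamingMode SwitchIndependent
New-VMSwitch -Name "External" -NetAdapterName "LAN-Team" -MinimumBandwidthMode Weight -AllowManagementOS $true
Set-VMSwitch -Name "External" -DefaultFlowMinimumBandwidthWeight 50
Set-VMNetworkAdapter -ManagementOS -Name "External" -MinimumBandwidthWeight 50

# VMQ: disabled on the 1 GbE NICs, enabled on the 10 GbE NICs
Disable-NetAdapterVmq -Name "LAN1", "LAN2"
Enable-NetAdapterVmq -Name "Cluster1", "Cluster2"

# SMB Live Migration and the Least Blocks MPIO policy (run on both nodes)
Enable-VMMigration
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy LB

# Form the cluster without claiming any disks
New-Cluster -Name "CiB-Cluster" -Node "CiB-Node1", "CiB-Node2" -StaticAddress "10.0.0.50" -NoStorage

# Define the SSD and HDD tiers in the existing pool
$ssdTier = New-StorageTier -StoragePoolFriendlyName "Pool1" -FriendlyName "SSD_Tier" -MediaType SSD
$hddTier = New-StorageTier -StoragePoolFriendlyName "Pool1" -FriendlyName "HDD_Tier" -MediaType HDD

# A tiered, 2-way mirrored virtual disk with a 64 KB interleave, to become a CSV.
# Format the resulting volume with a 64 KB allocation unit size.
New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "CSV1" `
    -ResiliencySettingName Mirror -NumberOfDataCopies 2 -Interleave 64KB `
    -StorageTiers $ssdTier, $hddTier -StorageTierSizes 150GB, 4TB
```

The tier sizes are deliberately smaller than the pool so that free space remains for growing one CSV over the other later.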
What about DCs? If the customer is keeping external DCs then everything is done. If they want DCs running on the CiB then I always deploy them as non-HA DCs that are stored on the C: of each CiB blade. I know that since WS2012, we are supposed to be able to run DCs as HA VMs on the cluster, but I’ve experienced issues with that.
With some PowerShell, the above process is very quick, and to be honest, the slowest bit is always the logistics of racking the CiB. I’m usually done in the early afternoon, and that includes some show’n’tell.
If you want a tidy, quick & easy to deploy, and affordable HA solution for an SME or ROBO then the DataON CiB-9112 V12 is an awesome option. If I were doing our IT from scratch, this is what I would use (we had existing servers and added a DataON JBOD, and recently replaced the servers while retaining the JBOD). I love how tidy the solution is, and how simple it is to set up, especially with some fairly basic PowerShell. So check it out, and see what it can do for you.
How many times have you watched or read the news, seen some story about an earthquake, hurricane, typhoon, or some other disaster and thought “that will never happen here”? Stop kidding yourself; disasters can happen almost everywhere.
I’ve always considered Ireland to be relatively safe. We don’t have earthquakes (none that you’d notice), typhoons, or tornadoes; our cattle and sheep don’t need flying licenses. Our weather is dominated by the Gulf Stream, which keeps Ireland temperate. It doesn’t get hot here (we are quite northerly) and our winters consist of cloud, rain, and normally about half a day of snow. We get the tail end of some of those hurricanes that hit the east coast US, but there’s not much left by the time they reach us – some trees get knocked over, some tiles get knocked off our roofs, but it’s not too bad. Even when we look at our neighbours in England, we see how their more extreme climate causes them disasters that we don’t get. Natural disasters just don’t happen here. Or do they?
The last month or so has revealed that to be a lie. Ireland has been battered by 6 storms in the past month. The latest, Storm Frank, was preceded with warnings that the country was saturated. That means that the ground has absorbed all of the water that it can; any further rainfall will not be absorbed, and it will pool, flow, and flood.
This morning, I woke to these scenes:
Enniscorthy, Co. Wexford [Image source: Paddy Banville]
Graignamanagh, Co. Kilkenny [Image source: Graignamanagh G.A.A]
Midleton, Co. Cork [Image source: Fiona Donnelly]
Frank isn’t finished. It’s still blowing outside my office and more rain is sure to fall. There are stories of communities being evacuated to hotels, and the above photos are just the easy ones for the media to access.
This isn’t just a case of cows trapped in fields, stick a sandbag on it and you’re sorted, or somewhere far away. This is local. And Ireland is a relatively safe place – we’re not Oklahoma, a place that some deity has decided should be subject to cat 5 tornadoes every time you’re not looking. Dorothy, the point is, that disasters happen everywhere, including in the EU where we think it safe.
Let’s bring this back to business. Businesses have been put out of action by these floods. Odds are any computers or servers were either on the ground floor or in the basement. Those machines are dead. That means those businesses are dead. They might be lucky enough to have tapes (let’s leave that for another time) stored offsite but how reliable are they and will bare-metal restore work, or will it take forever? How much money will those businesses lose, or more critically, will those businesses survive loss of customers?
This is exactly why these businesses need a disaster recovery (DR) solution. There are several reasons why they don’t have one now:
- They assumed that disasters, natural or unnatural (such as fires), only happen somewhere else
- They couldn’t afford one
- The business owners didn’t think there was a need for one
- Some resellers didn’t think there was demand for one so they never brought it up with their customers
The need is there, as we can clearly see above. And thanks to Microsoft Azure, DR has never been so affordable. FYI, it comes in at a price that is a small fraction of the cost of solutions from the likes of Irish companies such as KeepITSafe – I’ve done the competitive pricing – and it opens that customer up to more technical opportunities with hybrid cloud solutions.
Microsoft Azure Site Recovery Services (ASR) is a disaster recovery-as-a-service (DRaaS) or cloud DR site offering from Microsoft. The beauty of it is that it’s there for everyone from the small business to the large enterprise. It works with Hyper-V, vSphere or physical machines, and it works with Windows or Linux as long as the OS is supported by Azure (W2008 R2 or later on the Windows side).
Note: There is a cost overhead for vSphere or physical machines to allow for on-premises conversion and forward and in-cloud management and storage, so you need a certain scale to absorb that cost. This is why I describe ASR as being perfect for SMEs with Hyper-V and mid-large companies with Hyper-V, vSphere or physical machines.
If I had ASR in place, and I had a business on the quayside in Cork, near the Slaney in Enniscorthy, or anywhere else where the rivers were close to bursting their banks, then I would perform a planned failover, requiring about 2 minutes of my time to start a pre-engineered and tested one-click failover. My machines would shut down in the desired order, flush the last bit of replication to Azure, and start up the VMs in the desired order in Azure, and my machines and data would be safe. I can fail back to new equipment or stay in Azure if the disaster wipes out my servers. And if that disaster doesn’t happen, I can easily fail back to my existing equipment, or choose to stay in Azure and not worry about local floods again.
Your virtual machines lost network connectivity.
Yeah, Aidan Smash … again.
READ HERE: I’m tired of having to tell people to:
Disable VMQ on 1 GbE NICs … no matter what … yes, that includes you … I don’t care what your excuse is … yes; you.
That’s because VMQ on 1 GbE NICs is:
- On by default despite the requests and advice of Microsoft
- Known to break Hyper-V networking
Here’s what I saw on a brand new Dell R730, factory fresh with a NIC firmware/driver update:
Now what do you think is the correct action here? Let me give you the answer:
- Change Virtual Machine Queues to Disabled
- Click OK
- Repeat on each 1 GbE NIC on the host.
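If you have more than a couple of hosts, the same change can be made in PowerShell. This is a minimal sketch: it assumes the 1 GbE NICs are connected and therefore report a LinkSpeed of “1 Gbps”. A disconnected NIC will not match the filter, so name those adapters explicitly instead. Changing the setting may briefly interrupt traffic on the NIC.

```powershell
# Show the current VMQ state of the NICs on this host
Get-NetAdapterVmq

# Disable VMQ on every connected 1 GbE physical NIC
Get-NetAdapter -Physical |
    Where-Object { $_.LinkSpeed -eq "1 Gbps" } |
    ForEach-Object { Disable-NetAdapterVmq -Name $_.Name }

# Confirm that Enabled is now False on those NICs
Get-NetAdapterVmq
```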
Got any objections to that? Go to READ HERE above. Still got questions? Go to READ HERE above. Got some objections? Go to READ HERE above. Want to comment on this post? Go to READ HERE above.
This BS is why I want Microsoft to disable all hardware offloads by default in Windows Server. The OEMs cannot be trusted to deploy reliable drivers/firmware, and neither can many of you be trusted to test/configure the hosts correctly. If the offloads are off by default then you’ve opted to change the default, and it’s up to you to test – all blame goes on your shoulders.
So what modification do you think I’m going to make to these new hosts? See READ HERE above 😀
FYI, basic 1 GbE networking was broken on these hosts when I installed WS2012 R2 with all Windows Updates – the 10 GbE NICs were fine. I had to deploy firmware and driver updates from Dell to get the R730 to reliably talk on the network … before I did what is covered in READ HERE above.
I had fun presenting at this Microsoft UK event in London. Here’s a recording of my session on Windows Server 2016 (WS2016) Hyper-V, featuring failover clustering, storage, and networking:
More sessions can be found here.
We’ve known since Ignite 2015 that Microsoft was going to have two kinds of containers in Windows Server 2016 (WS2016):
- Windows Server Containers: Providing OS and resource virtualization and isolation.
- Hyper-V Containers: The hypervisor adds security isolation to machine & resource isolation.
Beyond that general description, we knew almost nothing about Hyper-V Containers, other than to expect them in preview during Q4 of 2015 – Technical Preview 4 (TPv4), and that they are the primary motivation for Microsoft to give us nested virtualization.
That also means that nested virtualization will come to Windows Server 2016 Hyper-V in TPv4.
We have remained in the dark since then, but Mark Russinovich appeared on Microsoft Mechanics (a YouTube webcast by Microsoft) and he explained a little more about Hyper-V Containers and he also did a short demo.
Some background first. Normally, a machine has a single user mode running on top of kernel mode. This is what restricts us to the “one app per OS” best practice/requirement, depending on the app. When you enable Containers on WS2016, an enlightenment in the kernel allows multiple user modes. This gives us isolation:
- Namespace isolation: Each container sees its own file system and registry (the hives are in the container’s hosted files).
- Resource isolation: How much process, memory, and CPU a container can use.
Kernel mode is already running when you start a new container, which improves the time to start up a container, and thus its service(s). This is great for deploying and scaling out apps because a containerised app can be deployed and started in seconds from a container image with no long term commitment, versus minutes for an app in a virtual machine with a longer term commitment.
But Russinovich goes on to say that while containers are great for some things that Microsoft wants to do in Azure, they also have to host “hostile multi-tenant code” – code uploaded by Microsoft customers that Microsoft cannot trust and that could be harmful or risky to other tenants. Windows Server Containers, like their Linux container cousins, do not provide security isolation.
In the past, Microsoft has placed such code into Hyper-V (Azure) virtual machines, but that comes with a management and direct cost overhead. Ideally, Microsoft wants to use lightweight containers with the security isolation of machine virtualization. And this is why Microsoft created Hyper-V Containers.
Hyper-V provides excellent security isolation (far fewer vulnerabilities found than vSphere) that leverages hardware isolation. DEP is a requirement. WS2016 is introducing IOMMU support, VSM, and Shielded Virtual Machines, with a newly hardened hypervisor architecture.
Hyper-V containers use the exact same code or container images as Windows Server Containers. That makes your code interchangeable – Russinovich shows a Windows Server Container being switched into a Hyper-V container by using PowerShell to change the run type (container attribute RuntimeType).
The big difference between the two types, other than the presence of Hyper-V, is that Hyper-V Containers get their own optimized instance of Windows running inside of them, as the host for the single container that they run.
The Hyper-V Container is not a virtual machine – Russinovich demonstrates this by searching for VMs with Get-VM. It is a container, and is manageable by the same commands as a Windows Server Container.
In his demos he switches a Windows Server Container to a Hyper-V Container by running:
Set-Container -Name <Container Name> -RuntimeType HyperV
And then he queries the container with:
Get-Container -Name <Container Name> | fl Name, State, RuntimeType
So the images and the commands are common across Hyper-V Containers and Windows Server Containers. Excellent.
It looked to me that starting this Hyper-V Container is a slower operation than starting a Windows Server Container. That would make sense because the Hyper-V container requires its own operating system.
I’m guessing that Hyper-V Containers either require or work best with Nano Server. And you can see why nested virtualization is required. A physical host will run many VM hosts. A VM host might need to run Hyper-V containers – therefore the VM Host needs to run Hyper-V and must have virtualized VT-x instructions.
Russinovich demonstrates the security isolation. Earlier in the video he queries the processes running in a Windows Server Container. There is a single CSRSS process in the container. He shows that this process instance is also visible on the VM host (same process ID). He then does the same test with a Hyper-V Container – the container’s CSRSS process is not visible on the VM host because it is contained and isolated by the child boundary of Hyper-V.
What about Azure? Microsoft wants Azure to be the best place to run containers – he didn’t limit this statement to Windows Server or Hyper-V, because Microsoft wants you to run Linux containers in Azure too. Microsoft announced the Azure Container Service, with investments in Docker and Mesosphere for deployment and automation of Linux, Windows Server, and Hyper-V containers. Russinovich mentions that Azure Automation and Machine Learning will leverage containers – this makes sense because it will allow Microsoft to scale out services very quickly, in a secure manner, but with less resource and management overhead.
That was a good video, and I recommend that you watch it.