These are my notes from the recording of this session by Gaurav Daga at Microsoft Ignite 2015. In case you don’t know, I’ve become a fan of Azure Site Recovery (ASR) since it dropped SCVMM as a requirement for DR replication to the cloud. And soon it’s adding support for VMware and physical servers … that’s going to be a frakking huge market!
This technology currently is in limited preview (sign-up required). Changes will probably happen before GA.
Note: Replication of Hyper-V VMs is much simpler than all this. See my posts on Petri.com.
What is in Preview
Replication from the following to Azure:
- vSphere with vCenter
- ESXi
- Physical servers
Features
- Heterogeneous workload support (Windows and Linux)
- Automated discovery of vSphere ESXi VMs, with or without vCenter
- Manual discovery of physical machines (based on IP address)
- Near zero RPOs with Continuous Data Protection (they’ll use whatever bandwidth is available)
- Multi-VM consistency using Protection Groups, for consistent failover of n-tier applications
- A cold standby site in Azure, consuming storage but not incurring charges for running VMs
- Connectivity over the Internet, site-site VPN or ExpressRoute
- Secure data transfer – no need for inbound ports on the primary site
- Recovery Plans for single-click failovers and low RTOs
- Failback possible for vSphere, but not possible for physical machines
- Events and email notifications for protection and recovery status monitoring
Deployment Architecture
- An Azure subscription is required
- A Mobility Service is downloaded and installed onto all required VMware virtual machines (not hosts) and physical servers. This captures changes (data writes in memory before they hit the VMDK) and replicates them to Azure.
- A Process Server sits on-premises as a DR gateway. This compresses and caches traffic. It can be a VM or physical machine. If there is a replication n/w outage it will cache data until the connection comes back. Right now, the PS is not HA or load balanced. This will change.
- A Master Target runs in your subscription as an Azure VM. The changes are being written into Azure VHDs – this is how we get VMDK to VHD … in VM memory to VHD via InMage.
- The Config(uration) Server is a second Azure VM in your subscription. It does all of the coordination, fix-ups and alerts.
- When you fail over, VMs will appear in your subscription, attach to the VHDs, and power up, with 1 cloud service per failed-over recovery plan.
Demo
The demo environment is a SharePoint server running on vSphere (managed using vSphere Client) that will be replicated and failed over to Azure. He powers down the SP web tier and the SP website times out after a refresh in a browser. He’s using Azure Traffic Manager with 2 endpoints – one on-premises and one in the cloud.
In Azure, he launches the Recovery Plan (RP) – and uses the latest application-consistent recovery point (VSS snapshot). AD starts, then SQL, app tier, web tier, and then an automation script will open an endpoint for the Traffic Manager redirection. This will take around 40 minutes end-to-end with human involvement limited to 1 click. The slowness is the time it takes for Azure to create/boot VMs, which is considerably slower than Hyper-V or vSphere.
Later on in the session …
The SharePoint site is up and running thanks to the failed over Traffic Manager profile. What’s happened;
Now, back to setting this up:
First you need to create an ASR vault. Then you need to deploy a Configuration Server (the manager or coordinator running in an Azure VM). This is similar to the new VM dialogs – you pick a name, username/password, and a VNET/subnet (requires site-site n/w configuration beforehand). A VM is deployed from a standard template in the IaaS gallery (starts with Azure A3 for required performance and scale). You download a registration key and register it in your Configuration Server (CS). The CS should show up as registered. Then you need to deploy a Master Target Server (MTS). You need a Windows MTS to replicate VMs with Windows and you need a Linux MTS to replicate VMs with Linux. There are two choices: Std A4 or Std D14 (!). And you associate the new MTS with a CS. Again, a gallery image is deployed for you.
Next you will move on-premises to deploy a Process Server. Download this from the ASR vault quick start. It is an installation on WS2012 R2.
Are you going to use a VPN or not? The default is “over the Internet” via a public IP/port (endpoint to the CS). If you select VPN then a private IP address will be used.
Now you must register a vCenter server to the Azure portal in the ASR vault. Enter the private IP, credentials and select the on-premises Process Server. All VMs on vSphere will be discovered after a few minutes.
Create a new Protection Group in the ASR vault, select your source, and configure your replication policy:
- Multi-VM consistency: enable protection groups for n-tier application consistency.
- RPO Threshold: Replication will use what bandwidth is made available. Alerts will be raised if any server misses this threshold.
- Recovery Point Retention: How far back in time might you want to go during a failover? This retains more data.
- Application consistent snapshot frequency: How often will this be done?
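To make those four settings a bit more concrete, here is a minimal Python sketch of a replication policy and the kind of RPO threshold check that would raise an alert. This is not the ASR API: the class, the field names, the units, and the sample values are all hypothetical and only mirror the settings listed above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReplicationPolicy:
    # Hypothetical container for the four Protection Group settings.
    # Field names and units are assumptions made for this sketch.
    multi_vm_consistency: bool
    rpo_threshold_minutes: int
    retention_hours: int
    app_consistent_snapshot_minutes: int


def machines_breaching_rpo(policy: ReplicationPolicy,
                           current_rpo_minutes: Dict[str, int]) -> List[str]:
    """Return the machines whose current RPO exceeds the policy threshold."""
    return sorted(
        name
        for name, rpo in current_rpo_minutes.items()
        if rpo > policy.rpo_threshold_minutes
    )


# Illustrative values only; pick your own based on the workload.
policy = ReplicationPolicy(
    multi_vm_consistency=True,
    rpo_threshold_minutes=30,
    retention_hours=24,
    app_consistent_snapshot_minutes=60,
)
print(machines_breaching_rpo(policy, {"sql01": 45, "web01": 5, "app01": 30}))
```

In this sketch only sql01 would be flagged, because its RPO of 45 minutes is above the 30 minute threshold, while app01 sits exactly on it.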
Now VMs can be added to the Protection Group. There is some logic for showing which VMs cannot be replicated. The mechanism is guest-based so VMs must be powered on to replicate. Powered off VMs with replication enabled will cause alerts. Select the server, select a Process Server, select a MTS, and a storage account for the replicated VHDs. You then must enter credentials to allow you to push the Mobility Service (the replication agent) to the VMs’ guest OSs. Alternatively, use a tool like SCCM to deploy the Mobility Service in advance.
Monitoring is shown in the ASR events view. You can configure e-mail notifications here.
There’s a walk through of creating a RP.
Prerequisites
These Azure components must be in the same region:
- Azure VNET
- Geo-redundant storage account
- ASR vault
- Standard A3 Configuration Server
- Standard A4 or Standard D14 Master Target Servers
Source machines must comply with Azure VM requirements:
- Disk count: maximum of 32 disks per protected source machine
- Individual disk capacity of no more than 1023 GB
- Clustered servers not supported
- UEFI/EFI boot not supported
- BitLocker encrypted volumes not supported
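If you have an inventory of your source machines, the list above is easy to turn into a pre-flight check. The following is a rough Python sketch under my own assumptions: the `SourceMachine` record and its field names are hypothetical, and only the limits themselves (32 disks, 1023 GB per disk, no clusters, no UEFI/EFI boot, no BitLocker) come from the session.

```python
from dataclasses import dataclass
from typing import List

MAX_DISKS = 32       # maximum disks per protected source machine
MAX_DISK_GB = 1023   # maximum capacity of any single disk


@dataclass
class SourceMachine:
    # Hypothetical inventory record; field names are made up for this sketch.
    name: str
    disk_sizes_gb: List[int]
    clustered: bool = False
    uefi_boot: bool = False
    bitlocker: bool = False


def compliance_problems(machine: SourceMachine) -> List[str]:
    """Return a list of reasons why the machine cannot be protected."""
    problems = []
    if len(machine.disk_sizes_gb) > MAX_DISKS:
        problems.append(f"{len(machine.disk_sizes_gb)} disks (max {MAX_DISKS})")
    oversized = [gb for gb in machine.disk_sizes_gb if gb > MAX_DISK_GB]
    if oversized:
        problems.append(f"{len(oversized)} disk(s) larger than {MAX_DISK_GB} GB")
    if machine.clustered:
        problems.append("clustered server")
    if machine.uefi_boot:
        problems.append("UEFI/EFI boot")
    if machine.bitlocker:
        problems.append("BitLocker encrypted volume")
    return problems


sql = SourceMachine("sql01", [127, 2048], clustered=True)
print(sql.name, compliance_problems(sql))
```

An empty list means the machine passes these particular checks; it says nothing about the OS and network requirements further down.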
Make sure your Azure subscription can fire up enough virtual processors for a failover – the limit is quite low by default so you will probably have to open an Azure account support call (free as a part of your subscription).
On-premises you need VMware with:
- vCenter Server or ESXi 5.1/5.5 with latest updates
- VMs with VMware tools installed & running
- All vCenter Server resource names in English
The Process Server:
- WS2012 R2 physical or virtual machine
- Same network/subnet as the source VMs
- VMware vSphere CLI 5.5 for automatic discovery of VMs
Network Connectivity:
- Bandwidth – don’t you dare ask me how much! That’s what you’re paid to figure out.
- Firewall rules to allow outbound connections to Azure
- Upfront decision on whether you’re connecting via public Internet, site-site VPN, or ExpressRoute.
Windows Server Source (Physical or Virtual) Machine Requirements:
- WS2008 R2 SP1, WS2012, WS2012 R2
- No dynamic disk support
- Single or multiple NICs (Azure VM spec must support the number of NICs)
Linux Source (Physical or Virtual) Machine Requirements (64-bit only):
- CentOS 6.4/6.5/6.6
- Oracle Linux 6.4/6.5
- SLES 11 SP3
- File systems: EXT3/EXT4/ReiserFS/XFS
- Multipath software: Device Mapper-Multipath
- Volume Manager: LVM2
- Physical servers with HP CCISS controller storage not supported
- Only 1 NIC can be assigned to an Azure VNET – working to fix this.
Common Source Machine Requirements
- They must be turned on
- Remote Desktop or Secure Shell must be turned on if you need access after failover.
- English naming for: machine name, system path, mount points, device names, and file/folder names
Capacity Planning
In Azure:
- An A3 CS will protect around 750 source machine volumes – this assumes 3 volumes per source machine, which is around 250 source machines
- MTS: They offer a Std A4 (16 data disks) and a Std D14 (32 data disks).
One disk on every MTS reserved for retention. This leaves you with:
- Maximum protected disks per Std A4 = 15
- Maximum protected disks per Std D14 = 31
Better scale out with multiple A4 MTS’s. This means you can replicate VMs with 40 volumes to 3 x A4 MTSs. A single source machine cannot replicate to multiple MTS’s (N:1 replication only). Only use a D14 if a single source machine has more than 15 total disks. Remember: use Linux MTS for Linux source machines and Windows MTS for Windows source machines.
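The Azure-side sizing rules above can be sketched in a few lines of Python. This is my own back-of-the-envelope planner, not an ASR tool: the function names are hypothetical, the packing uses a simple first-fit-decreasing heuristic (not guaranteed to be optimal), and for simplicity every machine with more than 15 disks gets a D14 to itself. The figures (750 volumes per A3 CS, 15 usable disks per A4, 31 per D14, no splitting a machine across MTSs) come from the notes above.

```python
import math
from typing import Dict, List

CS_VOLUME_LIMIT = 750   # volumes one A3 Configuration Server can protect
A4_USABLE_DISKS = 15    # 16 data disks minus 1 retention disk
D14_USABLE_DISKS = 31   # 32 data disks minus 1 retention disk


def config_servers_needed(total_volumes: int) -> int:
    """Number of A3 Configuration Servers for a given volume count."""
    return max(1, math.ceil(total_volumes / CS_VOLUME_LIMIT))


def plan_master_targets(disk_counts: List[int]) -> Dict[str, List[List[int]]]:
    """Pack source machines (given as disk counts) onto Master Target Servers.

    A machine is never split across MTSs. Machines with more than 15
    disks each get their own D14; the rest are packed onto A4s.
    """
    a4_servers: List[List[int]] = []
    d14_servers: List[List[int]] = []
    for disks in sorted(disk_counts, reverse=True):
        if disks > D14_USABLE_DISKS:
            raise ValueError(f"{disks} disks will not fit on any single MTS")
        if disks > A4_USABLE_DISKS:
            d14_servers.append([disks])
            continue
        for server in a4_servers:
            if sum(server) + disks <= A4_USABLE_DISKS:
                server.append(disks)  # first A4 with enough room
                break
        else:
            a4_servers.append([disks])  # no room anywhere: add another A4
    return {"A4": a4_servers, "D14": d14_servers}


# Eight machines with 5 volumes each = 40 volumes in total.
plan = plan_master_targets([5] * 8)
print(len(plan["A4"]), "x A4,", len(plan["D14"]), "x D14")
```

With eight 5-volume machines this lands on 3 x A4, which lines up with the 40 volume example. Note that the answer depends on how the volumes are spread across machines: four machines with 10 volumes each would need 4 x A4, because a machine cannot span MTSs.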
Storage Accounts
- A single MTS can span multiple storage accounts – one for its OS and retention disks, one or more for replicated data disks
- ASR replication has approx. a 2.5x IOPS multiplier on the Azure subscription. For every source IO, there are 2 IOs on the replicated data disk and 0.5 IO on the retention disk.
- Every Azure Storage Account supports a max of 20,000 IOPS. Best practice is to have 1 SA (up to 100 in a subscription) for every 8,000-10,000 source machine IOPS – no additional cost to this because you pay for Azure Storage based on GB used (easy to predict) and transactions (hard to predict micropayment).
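Here is the storage account arithmetic as a small Python sketch, using the per-disk figures quoted above (2 IOs on the data disk plus 0.5 IO on the retention disk for every source IO). The function names are mine, and I have assumed the conservative end of the 8,000-10,000 range as the default; adjust to taste.

```python
import math

DATA_DISK_IO_PER_SOURCE_IO = 2.0   # IOs on the replicated data disk
RETENTION_IO_PER_SOURCE_IO = 0.5   # IOs on the retention disk
SOURCE_IOPS_PER_ACCOUNT = 8000     # assumed: low end of the 8,000-10,000 guidance


def azure_iops(source_iops: float) -> float:
    """IOPS landing on Azure Storage for a given source IOPS figure."""
    return source_iops * (DATA_DISK_IO_PER_SOURCE_IO + RETENTION_IO_PER_SOURCE_IO)


def storage_accounts_needed(source_iops: float,
                            per_account: int = SOURCE_IOPS_PER_ACCOUNT) -> int:
    """Storage accounts to provision for a given total of source IOPS."""
    return max(1, math.ceil(source_iops / per_account))


print(azure_iops(8000))                # IOPS hitting one account at 8,000 source IOPS
print(storage_accounts_needed(20000))  # accounts for 20,000 source IOPS
```

This also shows where the guidance comes from: 8,000 source IOPS works out to 20,000 IOPS in Azure, which is exactly the per-account limit, so one account per 8,000 source IOPS is the safe choice.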
On Premises Capacity Planning
This is based on your change rate:
Migration from VMware to Azure
Yup, you can use this tool to do it. Perform a planned failover and strip away replication and the on-premises stuff.