PowerPoint – E2EVC Copenhagen Microsoft Virtualisation Keynote

I recently did a presentation called “What’s New In Microsoft Virtualization” at the E2EVC event in Copenhagen, Denmark.  It was a 45-minute slot and there was so much to cover, so I had to be picky about what I presented on.  This is the deck that I used:

 

PowerPoint – Why Upgrade To Windows Server 2012

A few months ago Dave Northey (then a DPE with MSFT Ireland) and I did a road show around Ireland discussing the reasons that companies should upgrade to Windows Server 2012.  We deliberately excluded discussions of Hyper-V … mainly cos I’ve been beating people around the head on that topic :)  Here’s the deck we presented:

 

TechEd NA 2013: Some Sights From The Expo Floor

Today was the first time I really had a session gap and time to wander about the expo floor.  I had a chance to talk to some vendors.  I took the opportunity to take a few photos with a fab little Nikon Coolpix S9500 that I borrowed from the office.

In the Surface section was a Windows-sponsored “NASCAR” stock car.  It was a cracking-looking machine.  A few people tried to rock it … there was no give in the suspension.  I doubt it’s a comfy street machine :)

DSCN0090

Inside, you can see that this is not quite a family machine, and the stereo system and Bluetooth appear to be optional extras.

DSCN0092

DataOn are one of the big players in the Storage Spaces story of Windows Server 2012/R2.  They have certified JBODs for Storage Spaces.  This week they launched two Cluster-in-a-Box (CiB) appliances.

The first takes 70 drives.  See those loops at the front/bottom?  Those are easily removable backplanes – that makes all disk-related maintenance easy.  At the back are 2 blade servers, with IPMI BMCs and 2 * SFP+ iWARP NICs.  They’re dual E5 CPU powered with onboard RAID1 drives.

DSCN0093

There’s also a CiB for the SME.  It takes 12 drives.  There are 2 blades with IPMI and E5 CPUs, with 1 GbE networking.

DSCN0096

I also saw the new Dell PowerEdge VRTX CiB.  It comes with 2 blades by default, with 2 blade slots free.  I was told it does PCI RAID instead of Storage Spaces.  You can see that it takes 12 drives.

I also talked to the folks at F5.  They clarified their strategy for an NVGRE gateway.  The new software device is for their virtual Big-IP appliance.  Their long term strategy is to include the NVGRE gateway in an update to the physical Big-IP load balancer.  Why?  Because combining NVGRE with the NLB allows them to intelligently do load balancing for VMs in VM Networks.

You know what disappointed me with the Expo floor?  The lack of swag.  Why should I talk to any sponsor if they don’t have something to make me talk to them?  And what the hell am I meant to wear in the office now?  My MMS 2012 t-shirts are starting to fall apart!!!

I wrapped up the afternoon by hanging outside the Channel 9 studio while Mark Minasi was being interviewed by MSFT’s Joey Snow.  And I was chuffed that Mark gave me a plug.

DSCN0098


TechEd NA 2013 – Software Defined Storage In Windows Server & System Center 2012 R2

Speakers: Elden Christensen, Hector Linares, Jose Barreto, and Brian Matthew (last two are in the front row at least)

4-12 SSDs in a 60-drive JBOD.

Elden kicks off. He owns Failover Clustering in Windows Server.

New Approach To Storage

  • File based storage: high performance SMB protocol for Hyper-V storage over Ethernet networks.  In addition: the scale-out file server to make SMB HA with transparent failover.  SMB is the best way to do Hyper-V storage, even with backend SAN.
  • Storage Spaces: Cost-effective business critical storage

Enterprise Storage Management Scenarios with SC 2012 R2

Summary: not forgotten.  We can fully manage FC SAN from SysCtr via SMI-S now, including zoning.  And the enhancements in WS2012 such as TRIM, UNMAP, and ODX offer great value.

Hector, Storage PM in VMM, comes up to demo.

Demo: SCVMM

Into the Fabric view of the VMM console.  Fibre Channel Fabrics is added to Providers under Storage.  He browses to VMs and Services and expands an already deployed 1 tier service with 2 VMs.  Opens the Service Template in the designer.  Goes into the machine tier template.  There we see that FC is surfaced in the VM template.  It can dynamically assign or statically assign FC WWNs.  There is a concept of fabric classification, e.g. production, test, etc.  That way, Intelligent Placement can find hosts with the right FC fabric and put VMs there automatically for you.

Opens a powered off VM in a service.  2 vHBAs.  We can see the mapped Hyper-V virtual SAN, and the 4 WWNs (for seamless Live Migration).  In Storage he clicks Add Fibre Channel Array.  Opens a Create New Zone dialog.  Can select storage array and FC fabric and the zoning is done.  No need to open the SAN console.  Can create a LUN, unmask it at the service tier …. in other words provision a LUN to 64 VMs (if you want) in a service tier with just a couple of mouse clicks … in the VMM console.

In the host properties, we see the physical HBAs.  You can assign virtual SANs to the HBAs.  Seems to offer more abstraction than the bare Hyper-V solution – but I’d need a €50K SAN and rack space to test :)

So instead of just adding vHBA support, they’ve given us end-to-end deployment and configuration.

Requirement: SMI-S provider for the FC SAN.

Demo: ODX

In 30 seconds, only 3% of the BITS-based VM template deployment is done.  Using the same setup but with ODX, the entire VM can be deployed and customized much more quickly.  In just over 2 minutes the VM is started up.

Back to Elden

The Road Ahead

WS2012 R2 is cloud optimized … short time frame since last release so they went with a focused approach to make the most of the time:

  • Private clouds
  • Hosted clouds
  • Cloud Service Providers

Focus on capex and opex costs – both storage and availability costs.

IaaS Vision

  • Dramatically lowering the costs and effort of delivering IaaS storage services
  • Disaggregated compute and storage: independently manage and scale each layer.  Easier maintenance and upgrades.
  • Industry standard servers, networking and storage: inexpensive networks. inexpensive shared JBOD storage.  Get rid of the fear of growth and investment.

SMB is the vision, not iSCSI/FC, although they got great investments in WS2012 and SC2012 R2.

Storage Management Pillars

picture053

Storage Management API (SM-API)

DSCN0086

VMM + SOFS & Storage Spaces

  • Capacity management: pool/volume/file share classification.  File share ACL.  VM workload deployment to file shares.
  • SOFS deployment: bare metal deployment of file server and SOFS.
  • Spaces provisioning

Guest Clustering With Shared VHDX

See yesterday’s post.

iSCSI Target

  • Uses VHDX instead of VHD.  Can import VHD, but not create it.  Provision up to 64 TB and dynamically resize LUNs (see the sketch after this list)
  • SMI-S support built in for standards based management, VMM.
  • Can now manage an iSCSI cluster using SCVMM
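
A hedged sketch of the iSCSI Target cmdlets for the above – the paths, sizes, and target name are my own assumptions, not from the session:

# Create a VHDX-backed iSCSI virtual disk (up to 64 TB); path and target name are hypothetical
New-IscsiVirtualDisk -Path "E:\iSCSIVirtualDisks\Data01.vhdx" -SizeBytes 10TB

# Resize the LUN later without recreating it
Resize-IscsiVirtualDisk -Path "E:\iSCSIVirtualDisks\Data01.vhdx" -SizeBytes 20TB

# Present the virtual disk through an existing target
Add-IscsiVirtualDiskTargetMapping -TargetName "Target1" -Path "E:\iSCSIVirtualDisks\Data01.vhdx"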

Back to Hector …

Demo: SCVMM

Me: You should realise by now that System Center and Windows Server are developed as a unit and work best together.

He creates a Physical Computer Profile.  Can create a VM host (Hyper-V) or file server.  The model is limited to that now, but later VMM could be extended to deploy other kinds of physical server in the data centre.

Hector deploys a clustered file server.  You can use existing machine (enables roles and file shares on existing OS) OR provision a bare metal machine (OS, cluster, etc, all done by VMM).  He provisions the entire server, VMM provisions the storage space/virtual disk/CSV, and then a file share on a selected Storage Pool with a classification for the specific file share.

Now he edits the properties of a Hyper-V cluster, selects the share, and VMM does all the ACL work.

Basically, a few mouse clicks in VMM and an entire SOFS is built, configured, shared, and connected.  No logging into the SOFS nodes at all.  Only need to touch them to rack, power, network, and set BMC IP/password.

SMB Direct

  • 50% improvement for small IO workloads with SMB Direct (RDMA) in WS2012 R2.
  • Increased performance for 8K IOPS

Optimized SOFS Rebalancing

  • SOFS clients are now redirected to the “best” node for access
  • Avoids unnecessary redirection
  • Driven by ownership of CSV
  • SMB connections are managed by share instead of per file server.
  • Dynamically moves as CSV volume ownership changes … clustering balances CSV automatically.
  • No admin action.

Hyper-V over SMB

Enables SMB Multichannel (more than 1 NIC) and SMB Direct (RDMA – speed).  Lots of bandwidth and low latency.  Vacate a host really quickly.  Don’t fear those 1 TB RAM VMs :)

SMB Bandwidth Management

We now have 3 QoS categories for SMB:

  • Default – normal host storage
  • VirtualMachine – VM accessing SMB storage
  • LiveMigration – Host doing LM

Gives you granular control over converged networks/fabrics because 1 category of SMB might be more important than others.
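
A minimal sketch of those categories with the WS2012 R2 SMB bandwidth limit cmdlets – the byte values are arbitrary examples, not recommendations:

# The SMB Bandwidth Limit feature must be installed on the Hyper-V host
Add-WindowsFeature FS-SMBBW

# Cap each SMB traffic category independently on a converged fabric
Set-SmbBandwidthLimit -Category Default -BytesPerSecond 100MB
Set-SmbBandwidthLimit -Category VirtualMachine -BytesPerSecond 500MB
Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1GB

# Review what has been configured
Get-SmbBandwidthLimit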

Storage QoS

Can set Maximum IOPS caps and Minimum IOPS alerts per VHDX.  Cap IOPS per virtual hard disk, and get alerts when virtual hard disks aren’t getting enough IOPS – which could lead to an auto LM to another, better host.
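
A hedged sketch of those per-VHDX settings with the WS2012 R2 Hyper-V cmdlets – the VM name, controller location, and IOPS numbers are assumptions:

# Cap one virtual hard disk at 500 IOPS, and flag it if it cannot get at least 100 IOPS
# (IOPS are counted in normalized 8 KB units)
Set-VMHardDiskDrive -VMName "SQL01" -ControllerType SCSI -ControllerNumber 0 -ControllerLocation 1 -MaximumIOPS 500 -MinimumIOPS 100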

Jose comes up …

Demo:

Has a 2-node SOFS.  1 client: a SQL Server.  Monitoring via PerfMon, and both SOFS nodes are getting balanced network utilization from that 1 SQL Server.  Proof of connection balancing.  Can also see that the CSVs are balanced by the cluster.

Jose adds a 3rd file server to the SOFS cluster.  It’s just an Add operation of an existing server that is physically connected to the SOFS storage.  VMM adds roles, etc, and adds the server.  After a few minutes the cluster is extended.  The CSVs are rebalanced across all 3 nodes, and the client traffic is rebalanced too.

That demo was being done entirely with Hyper-V VMs and shared VHDX on a laptop.

Another demo: kicks off an 8K IO workload.  Single client talking to a single server (48 SSDs in a single mirrored space) and 3 InfiniBand NICs per server.  Averaging nearly 600,000 IOPS, sometimes getting over that.  Now he enables RAM caching and gets nearly 1,000,000 IOPS.  CPU becomes his bottleneck :)

Nice timing: a question on 32K IOs.  That’s the next demo :)  RDMA loves large IO.  500,000 IOPS, but now the throughput is 16.5 GIGABYTES (not Gbps) per second.  That’s 4 DVDs per second.  No cheating: real usable data going to a real file system, not 5Ks to raw disk as in some demo cheats.

Back to Elden …

Data Deduplication

Some enhancements:

  • Dedup open VHD/VHDX files.  Not supported with data VHD/VHDX.  Works great for volumes that only store OS disks, e.g. VDI.
  • Faster read/write of optimized files … in fact, faster than CSV Block Cache!!!!!
  • Support for SOFS with CSV

The Dedup filter redirects read requests to the chunk store.  Hyper-V does unbuffered IO that bypasses the cache, but Dedup does cache.  So Hyper-V reads of deduped files are cached in RAM, and that’s why dedup can speed up the boot storm.
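
Enabling the VDI-aware dedup mode looks something like this – a hedged sketch, with the volume path being an assumption:

# Enable deduplication on a volume that only stores VDI OS disks
# UsageType HyperV is the new WS2012 R2 option for open VHD/VHDX (VDI) volumes
Enable-DedupVolume -Volume "C:\ClusterStorage\Volume1" -UsageType HyperV

# Check optimization savings later
Get-DedupStatus -Volume "C:\ClusterStorage\Volume1"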

Demo: Dedup

A PM I don’t know takes the stage.  This demo will be how Dedup optimizes the boot storm scenario.  Starts up VMs… one collection is optimized and the other not.  Has a tool to monitor boot up status.  The deduped VMs start up more quickly.

Reduced Mean Time To Recovery

  • Mirrored spaces rebuild: parallelized recovery
  • Increased throughput during rebuilds.

Storage Spaces

See yesterday’s notes.  They heat-map the data and automatically (don’t listen to the block storage salesman’s BS) promote hot data and demote cold data through the 2 tiers configured in the virtual disk (SSD and HDD in the storage space).

Write-Back Cache: absorbs write spikes using SSD tier.

Brian Matthew takes the stage

Demo: Storage Spaces

See notes from yesterday

Back to Elden …

Summary

DSCN0087

TechEd NA 2013 – Application Availability Strategies for the Private Cloud

Speakers: Jose Barreto, Steven Ekren

Pre-session question … How far can a cluster stretch?  You can set the heartbeat timeout up to 1 minute.  They recommend no more than 10-20 seconds.  However there is a license mobility limit – it is pretty long distance, but it does exist.

Moving Physical to the Private Cloud (Virtual)

Many ways to P2V: rebuild, Disk2VHD, backup/restore, VMM, and on and on and on.

VMs can be HA on Hyper-V.  Cost reductions and mobility by virtualization.  Easier backup.  Easier deployment.  Easier monitoring.  Flexibility.  Self-service.  Measurability.  Per-VM/VHD VM replication is built in with Hyper-V Replica.  And on and on and on.

VM Monitoring added in WS2012 Failover Clustering

2 levels of escalated action in response to a failure trigger:

  1. Guest level HA recovery
  2. Host level HA recovery

DSCN0079

Off by default and requires configuration.  Watch for an alert, say from a service.  If the service fails, the cluster gets the alert and restarts the service.  If the cluster gets the same alert again within an hour, it’ll fail the VM over (shut down and restart) to another host.

Requires that the VM is WS2008 R2 or later and in the same domain as the hosting Hyper-V cluster.

DSCN0080

In the private cloud:

  • Guest OS admin configures the failure triggers
  • Recovery from host is configured by the cloud admin

The process works through the Hyper-V heartbeat integration component in the guest OS.  An “application critical flag” goes back to the parent partition via VMMS, and is escalated in the host via the VM resource in the cluster to the Cluster Service.

You can enable or disable VM Monitoring in WS2012 in the VM’s (cluster) properties, under Settings.  If disabled, the cluster will still get the signal configured in the guest OS, but it is ignored.  Basically, the cloud admin can disable the feature, and then the cluster ignores what the tenant does in their VM.

Event ID 1250 will be registered in System log with FailoverClustering source when the application critical flag is sent.

We can set up a trigger for a service failure or an event.

Add-ClusterVMMonitoredItem … Get-, Remove-, Reset- are run by a guest OS admin in the VM.
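
For example – a hedged sketch where the VM and service names are my own:

# Monitor the Print Spooler service inside a clustered VM named "SVR1"
Add-ClusterVMMonitoredItem -VirtualMachine "SVR1" -Service "Spooler"

# See what is currently monitored, and remove it again if required
Get-ClusterVMMonitoredItem -VirtualMachine "SVR1"
Remove-ClusterVMMonitoredItem -VirtualMachine "SVR1" -Service "Spooler"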

You can also hit Configure Monitoring action on a VM in Failover Cluster Manager on the cloud.  Assumes you have admin rights in the VM.

Guest Clustering

We can create guest OS clusters.  Protects against faults in the guest layer, e.g. BSOD, registry issue, etc.  Also allows preventative maintenance with high SLAs.

Can use: iSCSI, virtual fiber channel, or SMB 3.0 shared storage.

Guest Clustering and VM Monitoring

You can use both together.

Set cluster service restart action to none for 2nd and 3rd failure in the guest cluster node OS.  First failure is left at Restart the Service.

Then from the host site, enable VM monitoring for the guests’ Cluster Service.

Demo of virtual SOFS

Steven kills the cluster service on a SOFS node using Process Explorer.  The service restarts.  Video being streamed from the SOFS via that node pauses and resumes maybe 2-3 seconds later.  He kills the service a second time.  The host cluster shuts down the VM and fails it over.

The Thorough Resource Health Check Interval defaults to 1 minute in the VM properties in Failover Cluster Manager.  You can reduce this if you need to, maybe to 20 seconds.  Don’t make it too frequent, because the check runs a piece of code and that would be very inefficient.

Jose comes on stage.

Shared Virtual Disks

Before WS2012 R2, the only way we could do guest clustering was by surfacing physical/cloud storage up to the tenant layer, or by deploying virtual file servers/iSCSI targets.  The first is insecure and inflexible; the second is messy.  Hosting companies just won’t want to do it – and most will refuse.

With WS2012 R2, VMs can share a VHDX file as their shared data disk(s).  It is a shared SAS device from the VM’s perspective.  It is for data disks only.

There are 2 scenarios supported:

  • Using CSV to store the VHDX
  • Using SMB to store the VHDX

The storage location of the CSV must be available to all hosts that guest cluster nodes will be running on.

This solution isolates the guests/tenants from your hosts/cloud fabric. 

Deploying Shared VHDX

Use:

  • Hyper-V Manager
  • PowerShell
  • VMM 2012 R2

Think about:

  • Anti-affinity, availability sets in VMM service templates.  Keep the guests on different hosts so you don’t have a single point of failure.
  • Watch out for heartbeats being too low.

Deploy the data disk on the SCSI controller of the VMs.  Enable sharing in the Advanced features of the VHDX in the VM settings.

In the VM, you just see a shared SAS disk.  You can use an older version of Windows … 2012 and 2012 R2 will be supported.  This is limited by time to test older versions.

DSCN0081

DSCN0082

PowerShell:

  • New-VHD
  • Add-VMHardDiskDrive …. –ShareVirtualDisk < repeat this on all the guest cluster VMs
  • Get-VMHardDiskDrive … | ft VMName, Path, ControllerType, SupportPersistentReservations < the latter indicates that the disk is shared if set to True (a fuller sketch follows below).
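
Pulling those fragments together – a rough sketch; the paths and VM names are assumptions, and the –ShareVirtualDisk switch is as shown in the R2 preview, so check the final syntax at RTM:

# Create the shared data disk on a CSV (or SMB 3.0 share) that all hosts can reach
New-VHD -Path "C:\ClusterStorage\Volume1\GuestClusterData.vhdx" -SizeBytes 100GB -Dynamic

# Attach it to each guest cluster node on the SCSI controller, with sharing enabled
Add-VMHardDiskDrive -VMName "Node1" -ControllerType SCSI -Path "C:\ClusterStorage\Volume1\GuestClusterData.vhdx" -ShareVirtualDisk
Add-VMHardDiskDrive -VMName "Node2" -ControllerType SCSI -Path "C:\ClusterStorage\Volume1\GuestClusterData.vhdx" -ShareVirtualDisk

# Confirm it is shared: SupportPersistentReservations should be True
Get-VMHardDiskDrive -VMName "Node1", "Node2" | ft VMName, Path, ControllerType, SupportPersistentReservations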

In VMM service template tier properties, you can check Share The Disk Across The Service Tier in the VHDX properties.

Inside the VM, it just looks like a typical disk in Disk Management, just like in a physical cluster.

Tip: use different VHDX files for your different data volumes in the guest OS cluster.  It gives you more control and flexibility.  Stop being lazy and do this!

The hosts must be WS2012 R2.  The guests are WS2012 and WS2012 R2, with the latest integration components.

This is only VHDX – it uses the metadata feature of the disk to store persistent reservation information.  Can use fixed or dynamic, but not differencing.

Backup

Guest-based backup only.  Host based-backups and snapshots of the shared VHDX are not supported.  Same restrictions as with guest clusters using physical storage.

Storage Migration of Shared VHDX

This is not supported – it is being referenced by multiple VMs.  You can Live Storage Migrate the other VM files, but just not the shared data VHDX of the guest cluster.

You can Live Migrate the VMs.

Comparing Guest Cluster Options

DSCN0083

Troubleshooting

  • Performance counters: Added new counters to PerfMon
  • Event Viewer: Hyper-V-Shared-VHDX
  • Filter Manager (FLTMC.EXE): The Shared VHDX filter can be looked at – svhdxflt
  • Actual binaries of the filter: svhdxflt.sys and pvhdparser.sys

Online Resize

You can hot resize a non-shared VHDX in WS2012 R2.  You cannot hot resize a shared VHDX.

You can hot-add a shared VHDX.
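
A minimal sketch of the online resize – the path and new size are assumptions:

# Grow a running VM's non-shared data VHDX (attached to a SCSI controller) while the VM stays online
Resize-VHD -Path "C:\ClusterStorage\Volume2\VM01-Data.vhdx" -SizeBytes 200GB
# Then extend the volume inside the guest OS (Disk Management or Resize-Partition)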

Unsupported bonus scenario

DSCN0085

TechEd NA 2013 – Storage Spaces Performance

Speaker: Brian Matthew

Start was some metrics achieved stuff.  Summary: Lots of IOPS.

DSCN0067

DSCN0070

DSCN0068

Hardware

It’s simple and cost effective. Goes from basic to OLTP workloads.

Capabilities Overview

Storage Pools support ReFS

2 * JBODs.  We create a single storage pool to aggregate all 48 disks (2 * 24 in this example).  We create 1 * 2-way mirror space and 1 * parity space.

  • Flexible resilient storage spaces.
  • Native data striping maximizes performance
  • Enclosure awareness with certified hardware.
  • Data Integrity Scanner (aka “scrubber”) with NTFS and ReFS
  • Continuous Availability with Windows Server failover clustering – SOFS

Data is spread around the disks in the storage pool.  They parallelize the rebuild process.

8 * 3 TB disk test bed.  Test the failure of a disk.  Can rebuild in 50 minutes, with > 800 MB/s rebuild throughput.  The line is that a hot spare is no longer necessary in WS2012 R2.  Hmm.  Must look into that.

Scale-Out Example

Note: CSV scales out linearly

DSCN0072.

Match workload characteristics to drives

  • Capacity optimized drives have lower performance. Higher TB/$
  • High performance drives have lower capacity/host.  Higher IOPS/$

Can we seamlessly merge these?

Tiered Storage Spaces

A single virtual disk can use the best of both types of disk.  High capacity for colder slices of data.  High speed for hotter slices of data.

The most compelling ratio appears to be 4 to 12 SSDs in a 60 slot device, with the rest of the disks being HDDs.

In the background, the file system actively measures the activity of file slices.  Transparently moves hot slices to the SSD tier, and cold slices to the HDD tier.

Tiering (analysis and movement) is done daily.  The schedule is configurable (change time, do it more than daily).  The slices are 1 MB in size.  So tracking watches 1 MB slices, and tiering is done on 1 MB slices.

Administrators can pin entire files to specified tiers.  Example, move a VDI parent VHDX to the SSD tier.

DSCN0073

Write-Back Cache

Per virtual disk, persistent write cache.  It smoothens out write bursts to a virtual disk.  Uses the SSD capacity of the pool for increased IOPS capacity.  Configurable using PowerShell.  Great for Hyper-V which needs write-through, instead of battery powered write cache.

PowerShell Demo

Get-PhysicalDisk to list the disks that can be pooled – check the “CanPool” attribute.

$disks = Get-PhysicalDisk

New-StoragePool …. $disks

Get-StoragePool to see the disks.  Look at FriendlyName and MediaType attributes.

$SSD_Tier = New-StorageTier … list the SSDs

$HDD_Tier = New-StorageTier … list the HDDs

$vd1 = New-VirtualDisk …. –StorageTiers @($ssd_tier, $hdd_tier) –StorageTierSizes @(150GB, 1.7TB) ….

Now we have a drive with automated scheduled storage tiering.

Pins some files using Set-FileStorageTier

Optimize-Volume –DriveLetter E –TierOptimize  ….. this will force the maintenance task to run and move slices.
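
Stitching the demo commands together – a rough sketch; the pool, tier, and file names are mine, the tier sizes match the demo, and the write-back cache size is an arbitrary example:

# Pool every disk that is eligible for pooling
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "Pool1" -StorageSubSystemFriendlyName "Storage Spaces*" -PhysicalDisks $disks

# One tier per media type
$ssd_tier = New-StorageTier -StoragePoolFriendlyName "Pool1" -FriendlyName "SSD_Tier" -MediaType SSD
$hdd_tier = New-StorageTier -StoragePoolFriendlyName "Pool1" -FriendlyName "HDD_Tier" -MediaType HDD

# A mirrored, tiered virtual disk with a write-back cache
$vd1 = New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "TieredSpace1" -StorageTiers @($ssd_tier, $hdd_tier) -StorageTierSizes @(150GB, 1.7TB) -ResiliencySettingName Mirror -WriteCacheSize 1GB

# Pin a hot file (e.g. a VDI parent VHDX) to the SSD tier, then force the tiering task to run
Set-FileStorageTier -FilePath "E:\Parent.vhdx" -DesiredStorageTierFriendlyName "SSD_Tier"
Optimize-Volume -DriveLetter E -TierOptimize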

Demo: Write-Back Cache

He increases the write workload to the disk. A quick spike and then the SSD takes over.  Then increases again and again, and the write-back cache absorbs the spikes.

DSCN0075

Question: How many tiers are supported in WS2012 R2?  2.  But the architecture will allow MSFT to increase this in later releases if required.

Right now, certified clustered storage spaces from:

  • DataOn
  • RAID Incorporated
  • Fujitsu

Takeaways

  • WS2012 R2 is a key component in the cloud: cost efficient
  • Scalable data access: capacity and performance
  • Continuously available
  • Manageable from Server Manager, PoSH, and SCVMM (including SOFS bare metal deployment from template).

Q&A

No docs on sizing Write-Back Cache.  They want the WBC to be not too large.  Up to 10 GB is being recommended right now.  You can reconfigure the size of the WBC after the fact … so monitor it and change as required.

On 15K disks: Expensive and small.  Makes sense to consider SSD + 7.5K disks in a storage pool rather than SSD + 15 K in a storage pool.

He can’t say it, but tier 1 manufacturers are scared *hitless of Storage Spaces.  I also hear one of them is telling porky pies to people on the Expo floor re the optimization phase of Storage Spaces, e.g. saying it is manual.

Is there support for hot spares?  Yes, in WS2012 and R2.  But now MSFT is saying you should use spare capacity in the pool, with parallelized repair across all disks in the pool, rather than having a single repair point.
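
If you do go the spare-capacity route, replacing a failed disk is a short PowerShell job – a hedged sketch with assumed pool and disk names:

# Retire the failed/suspect disk so Spaces stops allocating to it
Set-PhysicalDisk -FriendlyName "PhysicalDisk12" -Usage Retired

# Kick off the (parallelized) repair of the affected virtual disk
Repair-VirtualDisk -FriendlyName "TieredSpace1"

# Once healthy, drop the retired disk from the pool and pull it from the JBOD
$bad = Get-PhysicalDisk -FriendlyName "PhysicalDisk12"
Remove-PhysicalDisk -PhysicalDisks $bad -StoragePoolFriendlyName "Pool1"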

DeFrag is still important for contiguous data access.

If I have a file on the SSD tier, and the tier is full, writes will continue OK on the lower tier.  The ReFS integrity stream mechanism can find best placement for a block.  This is integrated with tiered storage spaces.

On adding physical disks to the storage space: old data is not moved: instant availability.  New writes are sent to the new disks.

A feature called dirty region table protects the storage space against power loss caused corruption.

Should hard drive caches be turned off?  For performance: turn it off.  For resilience, turn it on.  Note, a cluster will bypass the disk cache with write-through.

There is some level of failure prediction.  There are PoSH modules for detecting issues, e.g. higher than normal block failure rates, or disks that are slower than similar neighbours.

Ah, the usual question: can the disks in a storage space span data centers?  The members of a storage pool must be connected to all nodes in a SOFS via SAS, which makes that impossible.  Instead, have 2 different host/storage blocks in 2 sites, and use Hyper-V Replica to replicate VMs.

Virtual Disk Deployment Recommendations

When to use Mirror, Parity, or Simple virtual disks in a storage space?

DSCN0076

A storage space will automatically repair itself when a drive fails – and then it becomes resilient again.  That’s quick thanks to parallelized repair.

Personal Comment

Love hearing a person talk who clearly knows their stuff and is very clear in their presentation.

Holy crap, I have over a mile to walk to get to the next storage session!  I have to get out before the Q&A ends.

TechEd NA 2013 – Upgrading Your Private Cloud From 2012 to 2012 R2

I am live blogging so hit refresh to see more.

Speakers: Ben Armstrong, Jose Barreto, Rob Hindman

Primary focus of the session is upgrading from Windows Server 2012 (WS2012) Hyper-V to Windows Server 2012 R2 (WS2012 R2) Hyper-V.  There are scale requirements.

Advice: deploy new designs with upgrades in mind – faster release cadence from Microsoft.

Fabric

  • System Management: System Center on Hyper-V
  • Compute: Hyper-V
  • Storage: Scale-Out File Server on block storage or Storage Spaces

picture051

Upgrade System Center First

It will manage the existing cloud/hosts and enable upgrades.

Question: will users notice if a given SysCtr component is offline for a brief period of time?

http://technet.microsoft.com/en-us/library/jj628203.aspx …. should be updated with WS2012 R2 upgrades.  Remember to turn on OpsMgr maintenance mode during upgrades!!!
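
A hedged example of doing that from the OpsMgr shell – the host name, window, and reason are assumptions:

# Put a Hyper-V host into maintenance mode for 2 hours before touching it
$hostInstance = Get-SCOMClassInstance -Name "hv01.demo.internal"
Start-SCOMMaintenanceMode -Instance $hostInstance -EndTime (Get-Date).AddHours(2) -Comment "WS2012 R2 upgrade" -Reason PlannedOther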

Upgrading SCVMM

  • Ensure that SCVMM is configured with a separate (preferably external) database server
  • Uninstall SCVMM 2012 SP1 – leave library/libraries and SCVMM database in place
  • Install SCVMM 2012 R2, and connect to existing database.

Your outage time is minutes.  Deploy SCVMM in a VM.  And deploy SCVMM as a HA cluster (pretty sensible in a true cloud where SCVMM is critical to self-service, etc.).

Up comes Jose Barreto …

You could do Compute upgrade next but ….

Upgrading Storage

Tools:

  • Storage migration
  • Copy Cluster Roles Wizard
  • Upgrade in place
  • PowerShell scripting

Options for storage upgrade

With extra hardware – no downtime: migrate storage (easiest).  Limited downtime: copy cluster roles (2nd favourite).

With limited extra hardware – no downtime: migrate pools (4th favourite).  Limited downtime: upgrade in place (3rd favourite).

Option 1 – Migrate Storage

  • Setup new 2012 R2 storage cluster
  • Configure access to new cluster
  • Storage migrate every VM (Live Storage Migration to new storage platform)

Easy and zero downtime.  Easy to automate.  Network intensive.  Needs new storage platform.
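
“Easy to automate” indeed – a minimal sketch using the Hyper-V cmdlets, where the host name and SOFS share are assumptions:

# Storage-migrate every VM's files from the old platform to the new WS2012 R2 SOFS share, with no downtime
Get-VM -ComputerName "HV01" | ForEach-Object {
    Move-VMStorage -ComputerName "HV01" -VMName $_.Name -DestinationStoragePath "\\SOFS2012R2\VMs\$($_.Name)"
}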

picture052

Option 2 – Copy Cluster Roles

Some downtime, but very quick.

  • Setup new 2012 R2 storage cluster.  Connect new cluster to existing storage.
  • Copy cluster roles.
  • Downtime begins: Offline roles on old cluster.  Online roles on new cluster
  • Down time end.

Limited downtime.  No data moved on the network.  Limited additional h/w.  Good for impatient admins. 

3 – Upgrade in place

1 – Prepare

  • HA degraded
  • Evict a node from the cluster
  • Upgrade/clean install evicted node
  • Create new cluster with evicted node

2 – Migrate …. do the previous Cluster Role Copy process.

3 – Rebuild the last remaining node in the old cluster, rejoin it to the domain, and add it to the new cluster.

You lose HA for a time.  You could buy 1 extra server if that’s an issue and recycle 1 old server when the process completes. 

4 – Move Pools

No downtime.  Moves data over the network.  Limited additional hardware.

1 – Split cluster

  • Evict node(s) on old cluster – if you have 4 nodes then you can evict 2 nodes and keep HA.
  • Upgrade evicted nodes to new version
  • Form a side-by-side cluster with shared access to the storage

2 – Migrate storage

  • Evacuate a pool of VMs using storage live migration
  • Evict pool from old cluster
  • Add pool to new cluster
  • Use storage live migration to move VMs to pool on new storage cluster
  • Repeat until complete

You need extra storage capacity to do this … you are moving VM files from pre-evicted pool to other pools in the older cluster, before moving them back to the pool in the new cluster.

Also have 1 pool (minimum) per node member in the storage cluster.

3 – Finalize

  • Destroy the old cluster
  • Rebuild idle nodes and join to new cluster

Why have 3 or 4 nodes …. you provide some cushion for upgrade/migration scenarios.

Note: you can use VMM for any LMs or storage LMs.

Back to Ben for the compute upgrade.

Cross-Version Live Migration

Provides simple zero-downtime way to move a VM across to a new platform.

You can use one of many methods to get a new WS2012 R2 cluster … evict/rebuild, brand new, etc.  Then you can do a Cross-Version Live Migration.

In the demo, Ben fires up the VMM 2012 R2 console (he can also do this using the built-in Server admin tools, e.g. Hyper-V Manager).  VMM is managing the WS2012 hosts and the WS2012 R2 hosts.  He can do a LM of the VM from the old hosts to the new hosts.  Here’s the benefit of upgrading System Center first.  It can manage the new platform and leverage the new WS2012 R2 features.

Another thing with SysCtr …. leverage your templates and logical networks to standardise hosts.  New hosts will have an identical config to the old hosts, e.g. the VM Network will have the same name so the VM won’t go “offline” when it has moved to the new hosts.

You can stage the upgrades: upgrade the storage first.  WS2012 R2 hosts use WS2012 R2 storage, and WS2012 hosts can also use WS2012 R2 storage.

Upgrade the Guest OS Integration Components

The world won’t end if you don’t …. some new features won’t work if they rely on the new ICs.  Start planning the upgrade around your next maintenance window or planned upgrade.  You can deploy the ICs without rebooting immediately – but the new version won’t work until you do reboot.

D:\support\amd64\setup.exe /quiet /norestart …. Aidan – add that as an app in ConfigMgr if you have a private cloud, and send the sucker out to a collection of Hyper-V VMs, with a predefined maintenance window.
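
If you would rather push it ad hoc than via ConfigMgr, here is a hedged sketch using PowerShell remoting – the VM names are assumptions, and the IC media must already be mounted as D: inside each guest:

# Silently install the WS2012 R2 integration services in a batch of VMs
Invoke-Command -ComputerName "VM01", "VM02", "VM03" -ScriptBlock {
    & "D:\support\amd64\setup.exe" /quiet /norestart
}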

Cluster Rebuild Options

If you have scale, you can do  2 nodes at a time to maintain HA.

If you are small then do 1 node at a time, but lose HA.

Buy some new hardware to act as the “seed” for a new cluster, and evict/rebuild the older cluster.  You maintain HA, but at a relatively small cost.  You can recycle the last 2 nodes in the old cluster.

For a small shop, take advantage of save state compatibility through:

  • In place upgrade
  • Virtual machine import

Funnily enough, a HUGE shop might also use that last option.  They could also:

  • Save state the VMs
  • Reconnect the storage to new hosts
  • Import/register the VMs

Cluster Validation

Will require downtime unless you are using Windows Server File Storage.  Note that a cluster is not supported until you have a passed cluster validation report.  Block storage will bring down the disks when validated.

Windows Server 2008 R2 to 2012 R2

Here comes Rob Hindman … who has the best job in the world, apparently, cos he works with Ben and Jose :)

Copy Cluster Roles Wizard

This will move the cluster roles from 2008 R2 to 2012 or 2012 R2.  Basically, it allows you to move cluster resources to a cluster from another cluster that is 2 levels back, e.g. 2008 R2 to 2012 R2.

  • You can test the copy without impacting production/customers
  • The process is reversible if you encounter issues
  • Assumes that your storage will be reused
  • Does not copy data … it remaps disks

You form a new cluster and connect it to the old storage.  You run the wizard against the old cluster.  You copy the roles.  Then you bring online the roles in the new cluster after off-lining them on the old cluster.  Then you can remove the old cluster.

Supports lots including:

  • Hyper-V VMs/VM configuration
  • SOFS
  • CSV
  • Storage pools/spaces

Does not do CAU or Task Scheduler Tasks.

PLEASE READ THE REPORT that the wizard creates.  There might be fix-up steps, e.g. network settings.

Demo:

Does a W2008 R2 – WS2012 R2 migration.  You have to migrate 1 LUN (CSV) at a time.  Make sure that your destination cluster can handle the VM workload that is on the CSV that you are migrating.  If it detects a VM workload, it’ll prompt you to select a destination virtual switch.  The copy is done … no downtime, yet.  Read the report, as advised.

The VM appears on the new cluster, but it’s showing as off.  So is the CSV.  On the original cluster, you take the resource offline – shutdown the VM.  Take the CSV disk offline.  Some customers prefer to unmask the CSV at this point from the old cluster.  Bring the CSV online in the new cluster.  Then power up the VMs on the new cluster.  Done!

Other than a MS IT VPN blip, the demo worked perfectly.

Summary

You can do the upgrade with no downtime if you have lots of resources.  More likely you’ll do it with few/no new resources and minimal downtime.

Q&A

Clarification: you are not abandoning CSV.  You are putting an active/active file server cluster (SOFS) and SMB 3.0 between the Hyper-V hosts and the CSVs.  This layer adds sooooo much and makes you very flexible.

Smaller deployments, such as 2 nodes, then you continue to direct attach your CSVs to your hosts, e.g. CiB Hyper-V deployment.

TechEd NA: Enabling On-Premises IaaS Solutions with the Windows Azure Pack

I am live blogging so hit refresh to see more.

Speakers: Mark Umeno and Eric Winner

I missed the first 10 minutes of the session.  This place is huge and it can take over 30 minutes to walk from one end to another.  Doesn’t look like I missed anything: user experience demo once again.

The IaaS Architecture

You can hook up an SPF call to an Orchestrator runbook.

picture043

Using VMM Roles

picture044

Service Admin Gallery

  • Import and manage gallery items – resource definition package
  • Publish/un-publish gallery items to tenants – immediate impact when un-publishing

Tenant VM Features:

Cloud OS virtual machine role

  • Scale out and in
  • update settings
  • upgrade to a new version
  • change networks
  • start/stop/shutdown/etc

Enabling console access.

VMs can be on isolated network and Windows/Linux or no OS.

Requires an RDP client that supports RDPTLSv2

picture045

Over to Eric Winner

Virtual Machine Role Deep Dive

picture046

The top line is conceptual.  Virtual machine role is the deployed instance.  The gallery is a catalogue of templates.  There are artefacts under each gallery item.  Main 3:

  • Role view – refers to UI
  • Role resource definition – refers to config
  • Role resource extension – refers to apps

picture047

A cloud service (hidden) is a VM role container.  In the container, is one or more VMs.

picture048

The viewdef is about the presentation layer.

Then stuff was shown and said – I’ll admit I was confused.  Lots of “JSOC” (or something) code.  It wasn’t pretty.

TechEd NA 2013: Building Cloud Services with Windows Server 2012 R2, Microsoft System Center 2012 R2 and the Windows Azure Pack

Speakers: Bradley Bartz, Nagender Vedula, and an army of others.

1 consistent cloud experience

picture039

Service Bus coming to WS2012 R2.  There are 2 UIs:

  • Admin
  • Consumer portal

Cloud OS Consistent Experiences.

Here’s Azure versus on-premises:

Continuity of experience and services being deployed.  Note that Windows Azure Pack portal is customizable.

picture042

The right hand side is powered by:

  • Windows Server
  • Hyper-V
  • System Center – VMM and Operations Manager
  • Service Provider Foundation
  • Windows Azure Pack

Service Consumers

People centric computing – self-service administration, acquire capacity on demand, empowered operations, predictable costs, get up and running quickly.

Difference between Azure and on-premise.  On-premise has limits of scalability.  So we set quote a limits to control how much resources the consumer can take.

Service Consumers:

  • Build highly scalable web apps
  • Iterate with integrated source control
  • Manage app with real-time telemetry
  • Use the languages and open source apps of your choice (supported by Azure pack)

Service Providers

Extreme focus on cost. Maximize per-customer profitability, hardware efficiency, automate everything, differentiate on SLAs.  All makes sense for the hoster.  What about the enterprise private cloud?  Same goals apply – IT needs to be efficient and effective.  Doubly so when doing cross-charging … and to be honest, IT doesn’t want to become more expensive than outsourced services!

Service Bus

  • Messaging service for cloud apps
  • Guaranteed message delivery
  • Publish-subscribe messaging patterns
  • Standard protocols (REST, AMQP, WS*)
  • Interoperability (.NET, JAVA/JMS, C/C++)
  • Now integrated with management portal

An elastic message queuing system.  A dev building a modern app in Azure will feel right at home on your WSSC 2012 R2 cloud.

Virtual Machines

  • Consistent with IaaS Azure
  • Roles: portable, elastic, gallery, Windows & Linux support
  • Virtual networks: site-site connectivity, tenant supplied IP address

Additional services in Windows Azure Pack

  • Identity: AD integration, ADFS federation, co-administrator – huge for on-premise
  • Database services: SQL Server and MySQL
  • Value add services from gallery – you can curate a set of add-ons that your customers can use.
  • Other shared services from provider
  • Programmatic access to cloud services – Windows Azure consistent REST APIs

There is a model for acquiring capacity.  There is a concept of offers and plans, and that dictates what’s being deployed.  A subscriber will get billed.  The concept of teams is supported with co-administration.  Teams can be large, and membership can change frequently.  With ADFS, you can use an AD group as the co-administrators of the subscription.

Demo

Azure supports ADFS – so he logs into the Azure portal using his MSFT corporate ID.  He deploys a new website, goes to a store in Azure, and installs a source code control app: Git.  Now there’s a dedicated Git repository for that website.  It’s the usual non-modified Git.  He adds a connection to the repository locally.  Then he pushes his source code up to the repository from his PC.  That’s done in around a minute.  The website launches – and there’s the site that he pushed up.

This is more than just an FTP upload.  It’s cloud so it scales.  Can scale out the number of website instances.  By default they run on a shared tier, basically the same web server/pool.  Can change that through the GUI.  Can scale the site easily with a slider, with content and load balancing.

Now he logs into the Katal portal.  Can sign in with an AD user account, an email account (ASP membership of email and password), or ADFS.  The same login experience appears as on the Azure portal.  Same end user experience (can be skinned).  Creates a web site.  Sets up Git source code control, as on Azure.  Basically repeats the same steps as on Azure – the customer is getting the same experience.

In Katal, scalability can be limited by the admins – it won’t have the same infinite resources as Azure.

Now he logs out, and Mark Umeno logs in as a co-admin.  He can see the resources that were just deployed by Bradley.  He can also see some other stuff that he owns. 

I get bored here … there’s no cloud building going on.  It’s turned into a user experience demo which does not match the title of the session.

TechEd 2013: How To Design & Configure Networking In VMM (Part 2)

Speaker: Greg Cusanza, Senior PM, MSFT (VMM) and Charlie Wen, PM (Windows).

This is a follow up to part 1.

Objective of this session: bring WS2012 R2, System Center 2012 R2 and Windows Azure together using hybrid networking.

Hybrid Network

Tenant thinks they have their own network, but it’s an abstracted network on hosting environment.  Can link to Internet and extend clients’ on-premise network into hosting network.  There is routing between the client network and the tenant network.

picture027

Can route between client site A, through client site B, to tenant network if Site A to tenant network link is down.

There is in-box capability for the gateway in WS2012 R2.

Hybrid Networking in WS2012 and SysCtr 2012 SP1

  • WS 2012 R2 adds HNV, RRAS, and IPAM
  • SC2012 SP1 – VM networks with single VPN.
  • 3rd party gateways: F5 (software solution out now), Huawei, IronNetworks
  • Introduced Windows Azure Services for Windows Server (Katal, vNext to be Windows Azure Pack).  Not a hybrid solution.

F5 solution is Windows Server based at the moment.  They are working on a hardware solution.

Benefits of Hybrid Networking

  • For hoster, internal IT, or enterprise customer. 
  • Must be cost effective
  • Capex cost per tenant must be low.  Multi-tenancy.
  • Gateways must be highly available – using clustering in WS2012 R2 gateway
  • Must support self-service
  • Enterprises: must be able to extend on-premise network.  Establish contract for average throughput for each connection.  Easily provision and configure site-site connection on the hoster side

picture029

Network Fabrication Configuration

  • Enabling network virtualization: WS2012 R2 no longer requires NV filter enablement
  • Configuring provider address space: must have static IP pool.  Must enable network virtualisation on logical network for provider addresses.
  • If mixing 2012 and 2012 R2 hosts, must have KB2779768 on 2012 hosts

Demo

Checked the Allow New VM Networks Created On This Logical …. in the settings of the tenant Logical Network – different tenant network than before – no VLAN stuff.

Enabling Hybrid Connectivity

  • you need a gateway
  • 3rd party gateways do exist
  • WS2012 R2 gateway will do for many customers.  3rd party solutions will probably offer extra features.

Charlie Wen (Mr. QoS in WS2012) comes on stage to talk about the WS gateway.

WS2012 Hybrid Connectivity

Limitations:

  • 1 VM per tenant
  • Static routing required on each tenant site
  • Manual provisioning
  • Internet connectivity back to remote site – no NAT for direct connectivity to VM networks.

picture030

WS2012 R2

  • Multi-tenant solution that requires far fewer VMs as gateways
  • Clustering for HA – this is an SLA business
  • BGP routing for dynamic routing
  • Multitenant NAT for direct Internet connectivity

picture031

Demo

Shows NAT in action on the gateway.  Client connects to VM in VM network using IE and public IP address.  Does it twice and does 2 downloads (long and still running).  Uses Get-NetCompartment to view tenant networks.  Moves the gateway role from one WS2012 r2 cluster member to another and it’s done in the blink of an eye.  The downloads do not get interrupted because the proactive failover of the gateway resource happens so quickly.  Good for maintenance.

Private Cloud with WS2012 R2

  • You could use HNV for lab, test networks, dev networks
  • Most services still on the physical network, e.g. AD, DNS, etc. 
  • That means the labs are isolated.  You can give connectivity with a forwarding gateway.
  • You can extend into a 3rd party site by connecting the forwarding gateway to the edge router.

Multi-tenant networking stack

picture034

Multi-tenant Site-to-Site

On-boarding: create a new tenant with a compartment in the gateway.  Incoming packets go into a default compartment.  The packet is inspected, sent to the correct tenant compartment, and onwards to the VM network.

Outbound packet, from the VM network, to the tenant compartment.  There is a routing table there and then it goes out to the right client on-premise site over the VPN.

Multi-tenant NAT

Each tenant compartment needs a unique IP.

Outbound packet into tenant compartment from VM network, then NATed before going out to the net.

For inbound packet, it comes into the gateway.  A NAT mapping sends it to the correct client compartment, and onwards to the VM network.

BGP Dynamic Route learning and Best Path Selection

BGP will select the best route.  Say the Site 1 – hoster link goes down.  BGP will auto re-route to hoster via site 2.

picture036

Guest Clustering for HA

  • A 1:1 redundant (active/passive) cluster is created from the VMM service template when deploying the WS2012 R2 gateway
  • Failure is detected immediately
  • Site-site tunnels are reconnected on the new active node
  • So quick that end-end TCP connections do not time out

Back to Greg and SCVMM …

Provisioning from VMM

  1. Build a host/cluster – this host/cluster is dedicated for the gateway VMs.  DEDICATED.  They are edge network, “untrusted” hosts.  VMM agent uses certificates.
  2. Deploy gateway VMs from the service template
  3. Add gateway to VMM
  4. Finalize the gateway configuration

Post-preview functionality configured from SCVMM, ie not in the preview and will be in RTM:

  • HA
  • Forwarding gateway for private cloud

Demo

Has the service template and deploys it to the untrusted host.

picture037

Has one already baked, and shows the service in his cloud view.  The host was marked as an HNV gateway host: Get-SCVMHost <hostname> … IsDedicatedToWnvGateway is set to true.  Set-SCVMHost –IsDedicatedToWnvGateway $true <hostname>.

Adds a Network Service in Fabric-Networking.  Selects a RunAs account.  Sets a network service connection string.  Reviews the certificates.  Tests the provider before exiting the wizard.  And then selects a host group – e.g. dedicate the gateway to a rack of servers.  Configures the front end and back end NICs: selects NICs and network sites for each of the two.  Done.  The g/w is added … but it takes a minute or so to set up the compartments …. watch out for that!

Goes into VM Networks.  Creates a new VM Network in the tenant logical network.  Enables HNV.  Sets the VM subnet.  Connects the VPN tunnel, with BGP.  Enables NAT.  Selects an IP Pool for the NAT connection.  Can add inbound access rules for specific ports, e.g. send inbound TCP 80 to 10.0.0.2 port 80.  That configures the compartment in the g/w.  Adds an IP pool to the HNV gateway.

Done!  Now you can add VMs to the VM Network and they can talk through the gateway, e.g. talk to an external network.

No configuration done in the gateway VMs or on the HNV hosts.

Enabling Tenant Self-Service

Using Windows Azure Services for Windows Server:

  • Tenants create their own networks
  • Consistent experience with Windows Azure
  • Configuration of topology and BGP
  • Reporting and chargeback

SPF provides REST API to enable hosters and private cloud providers to build their own portal if they want.

The client configures a VM network and VPN tunnel on the hoster portal.  That configures VMM and the gateway for the tenant.  The tenant must then configure their own VPN endpoint to complete the tunnel.

Demo of tenant self-service

Logs into the portal as a tenant.  Creates a new virtual network.  Selects IPv4.  Specifies DNS, and chooses to enable NAT and VPN.  Enter his tenant VPN endpoint info and enables BGP.  Adds an address space for the VM network.  Names the site-site VPN, enters the pre-shared key, and the address space for BGP to do initial routing for dynamic discovery.

Note: it is IBGP.  Add the BGP peers and ASN info.  Check the wizard and done.

Outbound NAT is enabled.  Inbound requires configuration.  Hosters can supply VPN configuration scripts that the tenant can download from the portal. 

Creates a new NAT rule for a web server.  Nice bit: can choose an already selected VM rather than entering an IP address.

And that’s that!