Satya Nadella & Brad Smith Speaking at Microsoft Ireland Tech Gathering

I attended today’s Microsoft Ireland Tech Gathering, a surprising event for Microsoft Ireland – they do very little in public anymore. What’s even more surprising is that Microsoft CEO, Satya Nadella, is in town to speak (here, an earlier CEO breakfast, and a later education event at Dublin City University). Nadella is doing the keynote. I’m in the 7th row, and I have a heavy camera to swing/throw if he talks about Cortana – which only works in 10 countries, and Ireland is not one of the ten Open-mouthed smile (just kidding, big security dudes!).

All photos in this post are the property of Aidan Finn and may not be used without my permission – just ask, it’s easy!

Claire Dillon

The group lead of the local DX (Developer Experience) team takes the stage. She explains what DX is: a team now focused on development (technical architects) and business (account managers) in the cloud, no longer the mix of devs and IT pros that DPE once was.

There’s a quick reminder of the last Microsoft year. And open source is highlighted.

image

The world is changing very rapidly. Mobile, cloud, data growth, machine learning, AI, augmented reality … Me: these aren’t endpoints, they are the start of a journey. Industries are changing, and cloud/mobile has set an expectation that goods/services are delivered immediately.

There’s an opportunity for small start-up companies in the cloud – they are flexible and can be disruptive to the larger incumbents. Microsoft Encarta killed Encyclopaedia Britannica’s 244 year old published product. But EB is more profitable than ever! They adapted and transformed to embrace the Internet for delivering their product. Wikipedia is a newer threat to EB. EB focused on a quality, fact-checked product, and on customers that required that: education, for example.

IT pros and developers are in for an exciting time. Things are changing, and resistance is futile. Some facts:

  • Outlook.com: 400 million active users.
  • Office on 340 million mobile devices.
  • Skype users using 3 billion minutes of calls. Skype Translator doing real-time comms in 8 languages.
  • 40% of Azure income coming from small businesses and start-ups. 1 in 3 Azure VMs is Linux. The data centres consume less than 50% of the power of traditional data centres. 80% of large enterprises are using the MSFT cloud.

Today will be all about the digital transformation.

  • Satya Nadella, Brad Smith, and the Irish MD will evangelize.
  • Then customers will talk about their journey, including some open source.

 

Cathriona Hallahan

MD of MS Ireland. Large breadth of people here: partners, bloggers, media, small customers and large.

Microsoft has transformed under Satya Nadella.

image

Satya Nadella

CEO of Microsoft.

Vision: to empower every person on the planet to do more. Every product that they make is shaped by this vision. People build institutions to outlast them, including software.

image

It’s not about MS tech, it’s about what happens with that technology when it’s in customers’ hands, and how they can transform.

Mobility is not about a device, it’s about our mobility across all the devices in our life. Seamless movement is only possible in the cloud. This is why cloud first, mobile first are happening at the same time. Cloud computing is not a single destination – it’s a distributed computing service.

Digital transformation that customers will achieve through this technology is what is important. Microsoft is building this out through a hyper scale global cloud. 6 regions in Europe. The North Europe region (Dublin) is expanding – there are planning applications/decisions in the local news every now and then.

Azure is being built out as the first AI supercomputer (SkyNet).

Every compute node in Azure has FPGAs now. You can distribute your AI across this fabric. N-Series NVIDIA chipsets provide great processing for AI too. But raw infrastructure is not enough. The magic is in software. Microsoft is state-of-the-art in speech and object recognition. Doing stuff with deep neural nets.

The Bot Framework was launched 6 months ago. 4,500 developers are building new kinds of apps on this framework. Graph gets a nod. Dynamics 365 is brought up – how can we think about business process as a continuum of productivity and comms, instead of putting it into a silo? Every company is becoming a digital company. You want to be able to empower every employee in your company with data, information, and analytics. Predictive and analytical power will be the new strength of a business – can you do it better and faster than your competitors and jump on opportunities? Can you predict service failures and proactively remediate? For example, a factory can shift from focusing on the thing they make to the service they offer.

He refers to a digital feedback loop – data coming in and coming back out as intelligence.

How is all this going to diffuse through the world? They see a broad spectrum of uses in Europe, and by European companies around the world. Access to the technology is critical. A Swiss company called Temenos has democratized access to banking s/w in Asia. They use the public cloud – there’s a video.

image

Some local Irish examples. He met with AIB and talked about their strategies. They are using the cloud and their data centers to transform customer banking. Office 365 is being rolled out to empower employees. Cubic Telecom is working with the automotive industry – to connect every car to a mobile phone network – s/w allows a car to move to any region and have network support without changing hardware. eHealth Ireland is connecting patients with doctors, providing information in patients’ most vulnerable moments.

In the future, this infinite cloud infrastructure and new types of devices (AR, VR, IoT) is what will transform every life and every industry. HoloLens is an infinite display – mixing realities. Another video.

When you change the way you see the world, you change the world you see.

It is incumbent on technology pros and government to ask if a tech is going to help everyone on the planet. MSFT is launching a book called A Cloud for Global Good.

image

Brad Smith

Chief legal man in Microsoft.

Started his career in MS France. Talks about the history of MS in Ireland – from manufacturing CDs, to eventually being involved in a global cloud issue. Their data center in Ireland led to litigation in the USA about the FBI demanding access to a mailbox in Dublin – Microsoft won, in case you didn’t know. It was good news for Microsoft, and great news for the cloud. Microsoft is touring Europe this week to talk about the globality of the cloud.

He reckons that the cloud is a new industrial revolution – a recap of what he presented at WPC earlier this year.

The cloud is powering all of the current digital transformations. How do we ensure that this cloud serves everyone, and not just the lucky few? We need to act with shared responsibility. The new book has 72 recommendations to ensure a cloud for global good.

image

We need a new set of cyber security rules. We need personal rights for data crossing borders.

We need more than just trust. We need a cloud that is responsible, and respects human rights and public safety.

We need to advance sustainability. MS data centers are already consuming the same power as a small US state. This is escalating. MS is committed to getting better every year on the use of renewable energy and to being transparent. By 2018, the aim is to hit 50% renewable or better, and 60% in the next decade … but they need help with supply.

We need laws to enable AI, but also laws to govern its ethics.

The cloud needs to be more inclusive for people around the world. From access, to digital literacy, to developing skills for the next generations.

To build a digital economy, you need to build a learning economy. We need to connect rural communities – the cloud can reduce distances. We need to think about people with disabilities – 300 million are visually impaired, and over 1 billion have some kind of disability. They have the potential to do great things, but face obstacles to adoption and achievement.

Ignite 2016 – Storage Spaces Direct

Read the notes from the session recording (original here) on Windows Server 2016 (WS2016) Storage Spaces Direct (S2D) and hyper-converged infrastructure, which was one of my most anticipated sessions of Microsoft Ignite 2016. The presenters were:

  • Claus Joergensen, Program Manager
  • Cosmos Darwin, Program Manager

Definition

Cosmos starts the session.

Storage Spaces Direct (S2D) is software-defined, shared-nothing storage.

  • Software-defined: Use industry standard hardware (not proprietary, like in a SAN) to build lower cost alternative storage. Lower cost doesn’t mean lower performance … as you’ll see Smile
  • Shared-nothing: The servers use internal disks, not shared disk trays. HA and scale are achieved by pooling disks and replicating “blocks”.

Deployment

There’s a bunch of animated slides.

  1. 3 servers, each with internal disks, a mix of flash and HDD. The servers are connected over Ethernet (10 GbE or faster, RDMA)
  2. Runs some PowerShell to query the disks on a server. The server has 4 x SATA HDD and 2 x SATA SSD. Yes, SATA. SATA is more affordable than SAS. S2D uses a virtual SAS bus over the disks to deal with SATA issues.
  3. They form a cluster from the 3 servers. That creates a single “pool” of nodes – a cluster.
  4. Now the magic starts. They will create a software-defined pool of virtually shared disks, using Enable-ClusterStorageSpacesDirect. That cmdlet does some smart work for us, identifying caching devices and capacity devices – more on this later.
  5. Now they can create a series of virtual disks, each of which will be formatted with ReFS and mounted by the cluster as CSVs (Cluster Shared Volumes). This is done with one cmdlet, New-Volume, which does all the lifting – see the sketch after this list. Very cool!
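For reference, here is a minimal PowerShell sketch of that deployment flow – the node and volume names are hypothetical, and you’d validate your hardware first:

    # Validate and build the cluster (no shared storage)
    $nodes = "Node1","Node2","Node3"
    Test-Cluster -Node $nodes -Include "Storage Spaces Direct", "Inventory", "Network", "System Configuration"
    New-Cluster -Name S2D-CLU -Node $nodes -NoStorage

    # Claim eligible local disks; caching and capacity devices are auto-detected
    Enable-ClusterStorageSpacesDirect

    # Create a resilient, ReFS-formatted volume that appears as a CSV
    New-Volume -StoragePoolFriendlyName S2D* -FriendlyName "Volume01" -FileSystem CSVFS_ReFS -Size 1TB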

image

There are two ways we can now use this cluster:

  • We expose the CSVs using file shares to another set of servers, such as Hyper-V hosts, and those servers store data, such as virtual machine files, using SMB 3 networking.
  • We don’t use any SMB 3 or file shares. Instead, we enable Hyper-V on all the S2D nodes, and run compute and storage across the cluster. This is hyper-converged infrastructure (HCI).

image

A new announcement: a 3rd supported scenario is SQL Server 2016. You install SQL Server 2016 on each node, and store database/log files on the CSVs (no SMB 3 file shares).

image

Scale-Out

So your S2D cluster was fine, but now your needs have grown and you need to scale out your storage/compute? It’s easy. Add another node (with internal storage) to the cluster. In moments, S2D will claim the new data disks. Data will be re-balanced over time across the disks in all the nodes.

Time to Deploy?

Once you have the servers racked/cabled, OS installed, and networking configured, you’re looking at under 15 minutes to get S2D configured and ready. You can automate a lot of the steps in SCVMM 2016.

Cluster Sizing

The minimum number of required nodes is an “it depends”.

  • Ideally you have a 4-node cluster. This offers HA, even during maintenance, and supports the most interesting form of data resilience that includes 3-way mirroring.
  • You could do a 3 node cluster, but that’s limited to 2-way mirroring.
  • And now, as of Ignite, you can do a 2-node cluster.

Scalability:

  • 2-16 nodes in a single cluster – add nodes to scale out.
  • Over 3 PB of raw storage per cluster – add drives to nodes to scale up (JBODs are supported).
  • The bigger the cluster gets, the better it will perform, depending on your network.

The procurement process is easy: add servers/disks.

Performance

Claus takes over the presentation.

1,000,000 IOPS

Earlier in the week (I blogged this in the WS2016 and SysCtr 2016 session), Claus showed some crazy numbers for a larger cluster. He’s using a more “normal” 4-node (Dell R730xd) cluster in this demo. There are 4 CSVs. Each node has 4 NVMe flash devices and a bunch of HDDs. There are 80 VMs running on the HCI cluster. They’re using an open source stress test tool called VMFleet. The cluster is doing just over 1 million IOPS – over 925,000 read and 80,000 write. That’s 4 x 2U servers … not a rack of Dell Compellent SAN!

Disk Tiering

You can do:

  • SSD + HDD
  • All SSD

You must have some flash storage. That’s because HDD is slow at seek/read. “Spinning rust” (7200 RPM) can only do about 75 random IOs per second (IOPS). That’s pretty pathetic.

Flash gives us a built-in, always-on cache. One or more caching devices (flash disks) are selected by S2D. Caching devices are not pooled. The other disks, capacity devices, are used to store data; they are pooled and dynamically (not statically) bound to a caching device. All writes up to 256 KB and all reads up to 64 KB are cached – random IO is intercepted, and later sent to the capacity devices as optimized IO.

Note the dynamic binding of capacity devices to caching devices. If a server has more than one caching device, and one fails, the capacity devices of the failed caching device are dynamically re-bound.

Caching devices are deliberately not pooled – this allows their caching capability to be used by any pool/volume in the cluster – the flash storage can be used where it is needed.

image

The result (in Microsoft’s internal testing) was that they hit 600+ IOPS per HDD … that’s how perfmon perceived it … in reality the caching devices were greatly boosting the apparent performance of the “spinning rust”.

NVMe

WS2016 S2D supports NVMe. This is a PCIe bus-connected form of very fast flash storage that is many times faster than SAS HBA-connected SSD.

Comparing costs per drive/GB using retail pricing on NewEgg (a USA retail site):

image

Comparing performance, not price:

image

If we look at the cost per IOP, NVMe becomes a very affordable acceleration device:

image

Some CPU assist is required to move data to/from storage. Comparing SSD and NVMe, NVMe leaves more CPU available for Hyper-V or SQL Server.

image

The highest IOPS number that Microsoft has hit, so far, is over 6,000,000 read IOPS from a single cluster, which they showed earlier in the week.

1 Tb/s Throughput (New Record)

IOPS are great. But IOPS are much like horsepower in a car – we care more about miles/KMs per hour, or the amount of data we can actually push in a second. Microsoft recently hit 1 terabit per second. The cluster:

  • 12 nodes
  • All Micron NVMe
  • 100 GbE Mellanox RDMA network adapters
  • 336 VMs, stress tested by VMFleet.

Thanks to RDMA and NVMe, the CPU consumption was only 24-27%.

1 terabit per second. Wikipedia (English) is 11.5 GB. They can move English Wikipedia 14 times per second.

Fault Tolerance

Soooo, S2D is cheaper storage, but the performance is crazy good. Maybe there’s something wrong with fault tolerance? Think again!

Cosmos is back.

Failures are not a failure mode – they’re a critical design point. Failures happen, so Microsoft wants to make them easy to deal with.

Drive Fault Tolerance

  • You can survive up to 2 simultaneous drive failures. That’s because each chunk of data is stored on 3 drives. Your data stays safe and continuously (better than highly) available.
  • There is automatic and immediate repair (self-healing: parallelized restore, which is faster than classic RAID restore).
  • Drive replacement is a single-step process.

Demo:

  1. 3 node cluster, with 42 drives, 3 CSVs.
  2. 1 drive is pulled, and it shows a “Lost Communication” status.
  3. The 3 CSVs now have a Warning health status – remember that each virtual disk (LUN) consumes space from each physical disk in the pool.
  4. Runs: Get-StorageSubSystem Cluster* | Debug-StorageSubSystem … this cmdlet for S2D does a complete cluster health check. The fault is found, devices are identified (including disk & server serials), the fault is explained, and a recommendation is made. We never had this simple a debug tool in WS2012 R2.
  5. Runs: $Volumes | Debug-Volume … returns health info on the CSVs, and indicates that drive resiliency is reduced. It notes that a restore will happen automatically.
  6. The drive is automatically marked as retired.
  7. S2D (Get-StorageJob) starts a repair automatically – this is a parallelized restore, writing across many drives instead of just to 1 replacement/hot-spare drive (the cmdlets appear in the sketch after this list).
  8. A new drive is inserted into the cluster. In WS2012 R2 we had to do some manual steps. But in WS2016 S2D, the disk is added automatically. We can audit this by looking at jobs.
  9. A rebalance job will automatically happen, to balance data placement across the physical drives.
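A hedged sketch of the in-box storage health cmdlets used in this demo, run from any cluster node:

    # Cluster-wide health check: finds the fault, identifies the device, recommends a fix
    Get-StorageSubSystem Cluster* | Debug-StorageSubSystem

    # Per-volume fault information (reduced resiliency, pending repairs)
    Get-Volume | Debug-Volume

    # Watch the automatic parallelized repair and rebalance jobs
    Get-StorageJob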

So what are the manual steps you need to do to replace a failed drive?

  1. Pull the old drive
  2. Install a new drive

S2D does everything else automatically.

Server Fault Tolerance

  • You can survive up to 2 node failures (4+ node cluster).
  • Copies of data are stored in different servers, not just different drives.
  • Able to accommodate servicing and maintenance – because data is spread across the nodes. So not a problem if you pause/drain a node to do planned maintenance.
  • Data resyncs automatically after a node has been paused/restarted.

Think of a server as a super drive.

Chassis & Rack Fault Tolerance

Time to start thinking about fault domains, like Azure does.

You can spread your S2D cluster across multiple racks or blade chassis. This is to create the concept of fault domains – different parts of the cluster depend on different network uplinks and power circuits.

image

You can tag a server as being in a particular rack or blade chassis. S2D will respect these boundaries for data placement, and therefore for disk/server fault tolerance.
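A minimal sketch of how that tagging might look, using the WS2016 fault domain cmdlets (the rack and node names are hypothetical):

    # Define a rack fault domain and place a node inside it
    New-ClusterFaultDomain -Name "Rack1" -Type Rack
    Set-ClusterFaultDomain -Name "Node1" -Parent "Rack1"

    # Review the resulting fault domain tree
    Get-ClusterFaultDomain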

Efficiency

Claus is back on stage.

Mirroring is Costly

Everything so far about fault tolerance in the presentation has been about 3-copy mirroring. And mirroring is expensive – this is why we encounter so many awful virtualization deployments on RAID5. 2-copy mirroring (like RAID 10) gives us only 1/2 of the raw storage as usable storage, and 3-way mirroring gives us only 1/3 – that’s too expensive.

2-way and 3-way mirroring give us the best performance, but parity/erasure coding/RAID5 give us the best usable storage percentage. We want performance, but we want affordability too.

image

We can do erasure coding with 4 nodes in an S2D cluster, but there is a performance hit.

image

Issues with erasure coding (parity or RAID 5):

  • To rebuild from one failure, you have to read every column (all the disks), which ties up valuable IOPS.
  • Every write incurs an update of the erasure coding, which ties up valuable CPU. Actively written data means calculating the encoding over and over again. This easily doubles the computational work involved in every write!

Local Reconstruction Codes

A product of Microsoft Research. It enables much faster recovery of a single drive by grouping bits. It XORs the groups and restores the required bits instead of an entire stripe. It reduces the number of devices that you need to touch to do a restore of a disk when using parity/erasure coding. This is used in Azure and in S2D.

image

This allows Microsoft to use erasure coding on SSDs, as many HCI vendors do, but also on HDDs.

The below depicts the levels of efficiency you can get with erasure coding – note that you need 4 nodes minimum for erasure coding. The more nodes that you have, the better the efficiencies.

image

Accelerated Erasure Coding

S2D optimizes the read-modify-write nature of erasure coding. A virtual disk (a LUN) can combine mirroring and erasure coding!

  • Mirror: hot data with fast write
  • Erasure coding: cold data – fewer parity calculations

The tiering is real time, not scheduled like in normal Storage Spaces. And ReFS metadata handling optimizes things too – you should use ReFS on the data volumes in S2D!

Think about it. A VM sends a write to the virtual disk. The write is done to the mirror and acknowledged. The VM is happy and moves on. Underneath, S2D is continuing to handle the persistently stored updates. When the mirror tier fills, the aged data is pushed down to the erasure coding tier, where parity is done … but the VM isn’t affected because it has already committed the write and has moved on.

And don’t forget that we have flash-based caching devices in place before the VM hits the virtual disk!

As for updates to the parity volume, ReFS is very efficient, thanks to its way of abstracting blocks using metadata, e.g. accelerated VHDX operations.

The result here is that we get the performance of mirroring for writes and hot data (plus the flash-based cache!) and the economies of parity/erasure coding.

If money is not a problem, and you need peak performance, you can always go all-mirror.

image

Storage Efficiency Demo (Multi-Resilient Volumes)

Claus does a demo using PoSH.

image

Note: 2-way mirroring can lose 1 drive/system and is 50% efficient, e.g. 1 TB of usable capacity has a 2 TB footprint of raw capacity.

  1. 12 node S2D cluster, each node with 4 SSDs and 12 HDDs. There is 500 TB of raw capacity in the cluster.
  2. Claus creates a 3-way mirror volume of 1 TB (across 12 servers). The footprint is 3 TB of raw capacity – 33% efficiency. We can lose 2 systems/drives.
  3. He then creates a parity volume of 1 TB (across 12 servers). The footprint is 1.4 TB of raw capacity – 73% efficiency. We can lose 2 systems/drives.
  4. 3 more volumes are created, with different mixtures of 3-way mirroring and erasure coding (one is sketched after this list).
  5. The 500 GB mirror + 500 GB dual parity virtual disk has 46% efficiency with a 2.1 TB footprint.
  6. The 300 GB mirror + 700 GB dual parity virtual disk has 54% efficiency with a 1.8 TB footprint.
  7. The 100 GB mirror + 900 GB dual parity virtual disk has 65% efficiency with a 1.5 TB footprint.
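A hedged sketch of creating one of those mixed (multi-resilient) volumes – this assumes the default Performance/Capacity tier names that Enable-ClusterStorageSpacesDirect creates:

    # 100 GB of 3-way mirror (hot/write tier) + 900 GB of dual parity (cold tier)
    New-Volume -StoragePoolFriendlyName S2D* -FriendlyName "MRV01" -FileSystem CSVFS_ReFS `
        -StorageTierFriendlyNames Performance, Capacity -StorageTierSizes 100GB, 900GB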

Microsoft is recommending that 10-20% of the usable capacity in “hybrid volumes” should be 3-way mirror.

If you went with the 100/900 balance for a light write workload in a hybrid volume, then you’ll get the same performance as a 1 TB 3-way mirror volume, but by using half of the raw capacity (1.5 TB instead of 3 TB).

CPU Efficiency

S2D is embedded in the kernel. It’s deep down low in kernel mode, so it’s efficient (fewer context switches to/from user mode). A requirement for this efficiency is Remote Direct Memory Access (RDMA), which gives us the ultra-efficient SMB Direct.

There’s lots of replication traffic going on between the nodes (east-west traffic).

image

RDMA means that:

  • We use less CPU when doing reads/writes
  • But we can also increase the amount of read/write IOPS because we have more CPU available
  • The balance is that we have more CPU for VM workloads in an HCI deployment

Customer Case Study

I normally hate customer case studies in these sessions because they’re usually an advert. But this quick presentation by Ben Thomas of Datacom was informative about real world experience and numbers.

They switched from using SANs to using 4-node S2D clusters with 120 TB usable storage – a mix of flash/SATA storage. Expansion was easy compared to compute + SAN – just buy a server and add it to the cluster. Their network was all Ethernet (even the really fast 100 Gbps Mellanox stuff is Ethernet-based) so they didn’t need fibre networks for the SAN anymore. Storage deployment was easy. With a SAN you create the LUN, zone it, etc. In S2D, 1 cmdlet creates a virtual disk with the required resilience/tiering, formats it, and it appears as a replicated CSV across all the nodes.

Their storage ended up costing them $0.04 / GB or $4 / 1000 IOPS. The IOPS was guaranteed using Storage QoS.

Manageability

Cosmos is back.

You can use PowerShell and FCM, but mid-large customers should use System Center 2016. SCVMM 2016 can deploy your S2D cluster on bare metal.

Note: I’m normally quite critical of SCVMM, but I’ve really liked how SCVMM simplified Hyper-V storage in the past.

If you’re doing an S2D deployment, you do a Hyper-V deployment and check a single box to enable S2D, and that’s it – you get an HCI cluster instead of a compute cluster that requires storage from elsewhere. Simple!

SCOM provides the monitoring. They have a big dashboard to visualize alerts and usage of your S2D cluster.

image

Where is all that SCOM data coming from? You can get this raw data yourself if you don’t have System Center.

Health Service

New in WS2016. S2D has a health service built into the OS. This is the service that feeds info to the SCOM agents. It has:

  • Always-on monitoring
  • Alerting with severity, description, and call to action (recommendation)
  • Root-cause analysis to reduce alert noise
  • Monitoring software and hardware from SLA down to the drive (including enclosure location awareness)

We actually saw the health service information in an earlier demo when a drive was pulled from an S2D cluster.

image

It’s not just health. There are also performance, utilization, and capacity metrics. All this is built into the OS too, and accessible via PowerShell or API: Get-StorageSubSystem Cluster* | Get-StorageHealthReport

DataON MUST

Cosmos shows a new tool from DataON, a manufacturer of Storage Spaces and Storage Spaces Direct (S2D) hardware.

If you are a reseller in the EU, then you can purchase DataON hardware from my employer, MicroWarehouse (www.mwh.ie) to resell to your customers.

DataON has made a new tool called MUST for management and monitoring of Storage Spaces and S2D.

Cosmos logs into a cloud app, must.dataonstorage.com. It has a nice bright colourful and informative dashboard with details of the DataON hardware cluster. The data is live and updating in the console, including animated performance graphs.

image

There is an alert for a server being offline. He browses to Nodes. You can see a healthy node with all its networking, drives, CPUs, RAM, etc.

image

He browses to the dead machine – and it’s clearly down.

Two things that Cosmos highlights:

  • It’s a browser-based HTML5 experience. You can access this tool from any kind of device.
  • DataON showed a prototype to Cosmos – a “call home” feature. You can opt in to get a notification sent to DataON of a h/w failure, and DataON will automatically have a spare part shipped out from a relatively local warehouse.

The latter is the sort of thing you can subscribe to get for high-end SANs, and it’s very nice to see in commodity h/w storage. That’s a really nice support feature from DataON.

Cost

So, controversy first: you need WS2016 Datacenter Edition to run S2D. You cannot do this with Standard Edition. Sorry, small biz that was considering this with a 2-node cluster for a small number of VMs – you’ll have to stick with a cluster-in-a-box.

Me: And the h/w is rack servers with RDMA networking – you’ll be surprised how affordable the half-U 100 GbE switches from Mellanox are – each port breaks out to multiple cables if you want. Mellanox price up very nicely against Cisco/HPE/Dell/etc, and you’ll easily cover the cost with your SAN savings.

Hardware

Microsoft has worked with a number of server vendors to get validated S2D systems in the market. DataON will have a few systems, including an all-NVMe one and this 2U model with 24 x 2.5” disks:

image

You can do S2D on any hardware with the right pieces, but Microsoft really wants you to use the right, validated and tested hardware. You know, you can put a loaded gun to your head, release the safety, and pull the trigger, but you probably shouldn’t. Stick to the advice, and use specially engineered & tested hardware.

Project Kepler-47

One more “fun share” by Claus.

2-node clusters are now supported by S2D, but Microsoft wondered “how low can we go?”. Kepler-47 is a proof-of-concept, not a shipping system.

These are the pieces. Note that the motherboard is mini-ITX; the key thing was that it had a lot of SATA connectors for drive connectivity. They installed Windows on a USB3 DOM. 32 GB RAM/node. There are 2 SATA SSDs for caching and 6 HDDs for capacity in each node.

image

There are two nodes in the cluster.

image

It’s still server + drive fault tolerant. They use either a file share witness or a cloud witness for quorum. It has 20 TB of usable mirrored capacity. Great concept for the remote/branch office scenario.

Both nodes together are 1 cubic foot, 45% smaller than 2U of rack space. In other words, you can fit this cluster into one carry-on bag on an airplane! Total hardware cost (retail, online), excluding drives, was $2,190.

The system has no HBA, no SAS expander, and no NIC, switch or Ethernet! They used Thunderbolt networking to get 20 Gbps of bandwidth between the 2 servers (using a PoC driver from Intel).

Summary

My interpretation:

Sooooo:

  • Faster than SAN
  • Cheaper than SAN
  • Probably better fault tolerance than SAN thanks to fault domains
  • And the same level of h/w support as high end SANs with a support subscription, via hardware from DataON

Why are you buying SAN for Hyper-V?

Ignite 2016 – Discover Shielded VMs And Learn About Real World Deployments

This post is my set of notes from the shielded VMs session recording (original here) from Microsoft Ignite 2016. The presenters were:

  • Dean Wells, Principal Program Manager, Microsoft
  • Terry Storey, Enterprise Technologist, Dell
  • Kenny Lowe, Head of Emerging Technologies, Brightsolid

This is a “how to” presentation, apparently. It actually turned out to be high-level information instead of a Level 300 session, with about 30 minutes of advertising in it. There was some good information (some nice insider stuff by Dean), but it wasn’t a Level 300 or “how to” session.

What The Heck Is A Shielded VM?

A new tech to protect VMs from the infrastructure and administrators. Maybe there’s a rogue admin, or maybe an admin has had their credentials compromised by malware. And a rogue admin can easily copy/mount VM disks.

Shielded VMs:

  • Virtual TPM & BitLocker: The customer/tenant can encrypt the disks of a VM, and the key is secured in a virtual TPM. The host admin has no access/control. This prevents non-customers from mounting a VHD/X. Optionally, we can secure the VM RAM while running or migrating.
  • Host Guardian Service: The HGS is a small dedicated cluster/domain that controls which hosts a VM can run on. A small subset of trusted admins run the HGS. This prevents anyone from trying to run a VM on a non-authorized host.
  • Trusted architecture: The host architecture is secure and trusted. UEFI is required for secure boot.

Shielded VM Requirements

image

Guarded Hosts

image

WS2016 Datacenter edition hosts only. A host must be trusted to get the OK from the HGS to start a shielded VM.

The Host Guardian Service (HGS)

image

 

An HA service that runs, ideally, in a 3-node cluster – this is not a solution for a small business! In production, this should use an HSM to store secrets. For PoC or demo/testing, you can run an “admin trusted” model without an HSM. The HGS gives keys to known/trusted/healthy hosts for starting shielded VMs.

Two Types of Shielding

image

  • Shielded: Fully protected. The VM is a complete black box to the admin unless the tenant gives the admin guest credentials for remote desktop/SSH.
  • Encryption Supported: Some level of protection – it does allow Hyper-V Console and PowerShell Direct.

Optionally

  • Deploy & manage the HGS and the solution using SCVMM 2016 – You can build/manage HGS using PowerShell. OpenStack supports shielded virtual machines.
  • Azure Pack can be used.
  • Active Directory is not required, but you can use it – required for some configurations.

Kenny (a customer) takes over. He talks for 10 minutes about his company. Terry (Dell) takes over – this is a 9 minute long Dell advert. Back to Kenny again.

Changes to Backup

The infrastructure admins cannot do guest-level backups – they can only back up VMs – and they cannot restore files from those backed-up VMs. If you need file/application level backup, then the tenant/customer needs to deploy backup in the guest OS. IMO, a secure cloud-based backup solution with cloud-based management would be ideal – this backup should be to another cloud, because backing up to the local cloud makes no sense in this scenario where we don’t trust the local cloud admins.

The HGS

This is a critical piece of infrastructure – Kenny runs it on a 4-node stretch cluster. If your hosting cloud grows, re-evaluate the scale of your HGS.

Dean kicks in here: There isn’t that much traffic going on, but that all depends on your host numbers:

  • A host goes through attestation when it starts to verify health. That health certificate lasts for 8 hours.
  • The host presents the health cert to the HGS when it needs a key to start a shielded VM.
  • Live Migration will require the destination host to present its health cert to the HGS to get a key for an incoming shielded VM.

MSFT doesn’t have at-scale production numbers for HGS (few have deployed HGS in production at this time) but he thinks a 3-node cluster (I guess 3 to still have HA during a maintenance cycle – this is critical infrastructure) will struggle at scale.

Back to Kenny. You can deploy the HGS into an existing domain or a new one. It needs to be a highly trusted and secured domain, with very little admin access. Best practice: you deploy the HGS into its own tiny forest, with very few admins. I like that Kenny did this on a stretch cluster – it’s a critical resource.

Get-HGSTrace is a handy cmdlet to run during deployment to help you troubleshoot the deployment.
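For example, a diagnostics run might look like this (a sketch – run it once the hosts/HGS are set up):

    # Diagnose guarded fabric/HGS configuration problems with detailed output
    Get-HgsTrace -RunDiagnostics -Detailed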

Disable SMB1 in the HGS infrastructure.

Customer Education

Very good points here. The customer won’t understand the implications of the security you are giving them.

  • BitLocker: They need to protect the key (cloud admin cannot) – consider MBAM.
  • Backup: The cloud admin cannot/should not back up files/databases/etc from the guest OS. The customer should back up to elsewhere if they want this level of granularity.

Repair Garage

The concept here is that you don’t throw away a “broken” fully shielded VM. Instead, you move the VM into another shielded VM (owned by the customer) that is running nested Hyper-V, reduce the shielding to encryption supported, console into the VM, and do your work.

image

Dean: There are a series of scripts. The owner key of the VM (which only the customer has) is the only thing that can be used to reduce the shielding level of the VM. You download the shielding policy, use the key (on premises) to reduce the shielding, and upload/apply it to the VM.

Dean: Microsoft is working on adding support for shielded VMs to Azure.

There’s a video to advertise Kenny’s company. Terry from Dell does another 10 minutes of advertising.

Back to Dean to summarize and wrap up.

Ignite 2016 – Discover What’s New In Windows Server 2016 Virtualization

This post is a collection of my notes from Ben Armstrong’s (Principal Program Manager Lead in Hyper-V) session (original here) on the features of WS2016 Hyper-V. The session is an overview of the features that are new, why they’re there, and what they do. There are no deep-dives.

A Summary of New Features

Here is a summary of what was introduced in the last 2 versions of Hyper-V. A lot of this stuff still cannot be found in vSphere.

image

And we can compare that with what’s new in WS2016 Hyper-V (in blue at the bottom). There’s as much new stuff in this 1 release as there was in the last 2!

image

Security

The first area that Ben will cover is security. The number of attack vectors is up, attacks are on the rise, and the sophistication of those attacks is increasing. Microsoft wants Windows Server to be the best platform. Cloud is a big deal for customers – some are worried about industry and government regulations preventing adoption of the cloud. Microsoft wants to fix that with WS2016.

Shielded Virtual Machines

Two basic concepts:

  • A VM can only run on a trusted & healthy host – a rogue admin/attacker cannot start the VM elsewhere. A highly secured Host Guardian Service must authorize the hosts.
  • A VM is encrypted by the customer/tenant using BitLocker – a rogue admin/attacker/government agency cannot inspect the VM’s contents by mounting the disk(s).

image

There are levels of shielding, so it’s not an all or nothing.

Key Storage Drive for Generation 1 VMs

Shielding, as above, requires Generation 2 VMs. You can also offer some security for Generation 1 virtual machines: Key Storage Drive. It is not as secure as shielded virtual machines or virtual TPM, but it does give us a safe way to use BitLocker inside a Generation 1 virtual machine – required for older applications that depend on older operating systems (older OSs cannot be used in Generation 2 virtual machines).

 

image

Virtual Secure Mode (VSM)

We also have Guest Virtual Secure Mode:

  • Credential Guard: protecting ID against pass-the-hash by hiding LSASS in a secured VM (called VSM) … in a VM with a Windows 10 or Windows Server 2016 guest OS! Malware running with admin rights cannot steal your credentials in a VM.
  • Device Guard: Protect the critical kernel parts of the guest OS against rogue s/w, again, by hiding them in a VSM in a Windows 10 or Windows Server 2016 guest OS.

image

Secure Boot for Linux Guests

Secure boot was already there for Windows in Generation 2 virtual machines. It’s now there for Linux guest OSs, protecting the boot loader and kernel against root kits.

image

Host Resource Protection (HRP)

Ben hopes you never see this next feature in action in the field Smile This is because Host Resource Protection is there to protect hosts/VMs from a DoS attack against a host by someone inside a VM. The scenario: you have an online application running in a VM. An attacker compromises the application (example: SQL injection) and gets into the guest OS of the VM. They’re isolated from other VMs by the hypervisor and hardware/DEP, so they attack the host using DoS, and consume resources.

A new feature, from Azure, called HRP will determine that the VM is aggressively using resources in certain patterns, and start to starve it of resources, thus slowing down the DoS attack to the point of being pointless. This feature will be of particular interest to:

  • Companies hosting external facing services on Hyper-V/Windows Azure Pack/Azure Stack
  • Hosting companies using Hyper-V/Windows Azure Pack/Azure Stack

image

This is another great example of on-prem customers getting the benefits of Azure, even if they don’t use Azure. Microsoft developed this solution to protect against the many unsuccessful DoS attacks from Azure VMs, and we get it for free for our on-prem or hosted Hyper-V hosts. If you see this happening, the status of the VM will switch to Host Resource Protection.

Security Demos

Ben starts with virtual TPM. The Windows 10 VM has a virtual TPM enabled and we see that the C: drive is encrypted. He shuts down the VM to show us the TPM settings of the VM. We can optionally encrypt the state and live migration traffic of the VM – that means a VM is encrypted at rest and in transit. There is a “performance impact” for this optional protection, which is why it’s not on by default. Ben also enables shielding – and he loses console access to the VM – the only way to connect to the machine is to remote desktop/SSH to it.
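A minimal sketch of the vTPM part of that demo, assuming a lab without a full HGS (a local key protector stands in for HGS-issued keys) – the VM name is hypothetical:

    # Create a local key protector for the VM, then enable its virtual TPM
    Set-VMKeyProtector -VMName "Win10-VM" -NewLocalKeyProtector
    Enable-VMTPM -VMName "Win10-VM"
    # BitLocker can now be enabled inside the guest OS using the vTPM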

Note: if he was running the full host guardian service (HGS) infrastructure then he would have had no control over shielding as a normal admin – only the HGS admins would have had control. And even the HGS admins have no control over BitLocker.

He switches to a Generation 1 virtual machine with Key Storage Drive enabled. BitLocker is running. In the VM settings (Generation 1) we see Security > Key Storage Drive Enabled. Under the hood, an extra virtual hard disk is attached to the VM (not visible in the normal storage controller settings, but visible in Disk Management in the guest OS). It’s a small 41 MB NTFS volume. The BitLocker keys are stored there instead of in a TPM – virtual TPM is only available in Generation 2 – using the same sorts of tech/encryption/methods to secure its contents. It cannot be as secure as virtual TPM, but it is better than not having BitLocker at all. Microsoft can make the same promises about data-at-rest encryption for Generation 1 VMs, but it’s still not as good as a Generation 2 VM with vTPM, or even a shielded VM (requires Generation 2).

Availability

The next section is all about keeping services up and running in Hyper-V, whether outages are caused by upgrades or infrastructure issues. Everyone has outages, and Microsoft wants to reduce the impact of these. Microsoft studied the common causes, and started to tackle them in WS2016.

Cluster OS Rolling Upgrades

Microsoft is planning 2-3 updates per year for Nano Server, plus there’ll be other OS upgrades in the future. You cannot upgrade a cluster node. And in the past we could only do cluster-cluster migrations to adopt new versions of Windows Server/Hyper-V. Now, we can:

  1. Remove cluster node 1
  2. Rebuild cluster node 1 with the new version of Windows Server/Hyper-V
  3. Add cluster node 1 to the old cluster – the cluster runs happily in mixed-mode for a short period of time (weeks), with failover and Live Migration between the old/new OS versions.
  4. Repeat steps 1-3 until all nodes are up to date
  5. Upgrade the cluster functional level – Update-ClusterFunctionalLevel (see below for “Emulex incident”)
  6. Upgrade the VMs’ version level (see the PowerShell sketch after this list)
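A hedged sketch of those final two steps, once every node is rebuilt on WS2016:

    # Irreversible – only run after you're happy with driver/firmware stability
    Get-Cluster | Update-ClusterFunctionalLevel

    # Raise the configuration version of the VMs (the VMs must be off)
    Get-VM | Update-VMVersion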

Zero VM downtime, zero new hardware – 2 node cluster, all the way to a 64 node cluster.

If you have System Center:

  1. Upgrade to SCVMM 2016.
  2. Let it orchestrate the cluster upgrade (above)

Support starts with WS2012 R2 to WS2016. Re-read that statement: there is no support for W2008/W2008 R2/WS2012. Re-read that last statement. No need for any questions now Smile

image

To avoid an “Emulex incident” (you upgrade your hosts, and a driver/firmware fails even though it is certified, and the vendor is going to take 9 months to fix the issue) you can actually:

  1. Do the node upgrades.
  2. Delay the upgrade to the cluster functional level for a week or two
  3. Test your hosts/cluster for driver/firmware stability
  4. Roll back the cluster nodes to the older OS if there is an issue – only possible if the cluster functional level is still on the older version.

And there’s no downtime because it’s all leveraging Live Migration.

Virtual Machine Upgrades

This was done automatically when you moved a VM from version X to version X+1. Now you control it (necessary for the rolling upgrade above to work). Version 8 is WS2016 host support.

image

Failover Clustering

Microsoft identified two top causes of outages in customer environments:

  • Brief storage “outages” – these crashed the guest OS of a VM when an IO failed. In WS2016, when an IO fails, the VM is put into a paused-critical state (for up to 24 hours, by default). The VM will resume as soon as the storage resumes.
  • Transient network errors – clustered hosts being isolated caused unnecessary VM failover (a reboot), even if the VMs were still on the network. A very common 30-second network outage will cause a Hyper-V cluster to panic up to and including WS2012 R2 – attempted failovers on every node and/or quorum craziness! That’s fixed in WS2016 – the VMs will stay on the host (in an unmonitored state) if they are still networked (see network protection from WS2012 R2). Clustering will wait (by default) for 4 minutes before doing a failover of that VM. If a host glitches 3 times in an hour, it will be automatically quarantined for 2 hours after resuming from the 3rd glitch (its VMs are live migrated to other nodes), allowing operator inspection. The relevant cluster settings appear in the sketch after this list.
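A sketch of the cluster properties behind those behaviours, with the defaults as described above:

    # Seconds an isolated node's VMs stay "unmonitored" before failover (default 240 = 4 minutes)
    (Get-Cluster).ResiliencyDefaultPeriod

    # Node failures per hour before a node is quarantined (default 3)
    (Get-Cluster).QuarantineThreshold

    # How long a quarantined node is kept out, in seconds (default 7200 = 2 hours)
    (Get-Cluster).QuarantineDuration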

image

Guest Clustering with Shared VHDX

Version 1 of this in WS2012 R2 was limited – it supported guest clusters, but we couldn’t do Live Migration, replication, or backup of the VMs/shared VHDX files. Nice idea, but it couldn’t really be used in production (it was supported, but functionally incomplete) instead of virtual fibre channel or guest iSCSI.

WS2016 has a new abstracted form of Shared VHDX – it’s even a new file format. It supports:

  • Backup of the VMs at the host level
  • Online resizing
  • Hyper-V Replica (which should lead to ASR support) – if the workload is important enough to cluster, then it’s important enough to replicate for DR!

image

One feature that does not work (yet) is Storage Live Migration. Checkpoints can be done “if you know what you are doing” – be careful!!!

Replica Support for Hot-Add VHDX

We could hot-add a VHDX file to a VM, but we could not add it to replication if the VM was already being replicated. We had to re-replicate the VM! That changes in WS2016, thanks to the concept of replica sets. A new VHDX is added to a “not-replicated” set, and we can move it to the replicated set for that VM.

image

Hot-Add/Remove VM Components

We can hot-add and hot-remove vNICs to/from running VMs. Generation 2 VMs only, with any supported Windows or Linux guest OS.

We can also hot-add or hot-remove RAM to/from a VM, assuming:

  • There is free RAM on the host to add to the VM
  • There is unused RAM in the VM to remove from the VM

This is great for those VMs that cannot use Dynamic Memory:

  • No support by the workload
  • A large RAM VM that will benefit from guest-aware NUMA

A nice GUI side-effect is that guest OS memory demand is now reported in Hyper-V Manager for all VMs.
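A minimal sketch of both features, with hypothetical VM/switch names (the vNIC operations require Generation 2 VMs):

    # Hot-add and hot-remove a vNIC on a running VM
    Add-VMNetworkAdapter -VMName "VM01" -SwitchName "External" -Name "Backup"
    Remove-VMNetworkAdapter -VMName "VM01" -Name "Backup"

    # Resize the RAM of a running VM that uses static memory
    Set-VMMemory -VMName "VM01" -StartupBytes 8GB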

Production Checkpoints

Referring to what used to be called (Hyper-V) snapshots, but were renamed to checkpoints to stop dumb people from getting confused with SAN and VSS snapshots – yes, people really are that stupid – I’ve met them.

Checkpoints (what are now called Standard Checkpoints) were not supported by many applications in a guest OS because they lead to application inconsistency. WS2016 adds a new default checkpoint type called a Production Checkpoint. This basically uses backup technology (and IT IS STILL NOT A BACKUP!) to create an application-consistent checkpoint of a VM. If you apply (restore) the checkpoint:

  • The VM will not boot up automatically
  • The VM will boot up as if it was restoring from a backup (hey dumbass, checkpoints are STILL NOT A BACKUP!)

For the stupid people, if you want to backup VMs, use a backup product. Altaro goes from free to quite affordable. Veeam is excellent. And Azure Backup Server gives you OPEX based local backup plus cloud storage for the price of just the cloud component. And there are many other BACKUP solutions for Hyper-V.

Now with production checkpoints, MSFT is OK with you using checkpoints with production workloads …. BUT NOT FOR BACKUP!

image

Demos

Ben does some demos of the above. His demo rig is based on nested virtualization. He comments that:

  • The impact of CPU/RAM is negligible
  • There is around a 25% impact on storage IO

Storage

The foundation of virtualization/cloud that makes or breaks a deployment.

Storage Quality of Service (QoS)

We had a basic system in WS2012 R2:

  • Set max IOPS rules per VM
  • Set min IOPS alerts per VM that were damned hard to get info from (WMI)

And virtually no-one used the system. Now we get storage QoS that’s trickled down from Azure.

In WS2016:

  • We can set reserves (that are applied) and limits on IOPS
  • Available for Scale-Out File Server and block storage (via CSV)
  • Metrics rules for VHD, VM, host, volume
  • Rules for VHD, VM, service, or tenant
  • Distributed rule application – fair usage, managed at storage level (applied in partnership by the host)
  • PoSH management in WS2016, and SCVMM/SCOM GUI

image

You can do single-instance or multi-instance policies (see the sketch after this list):

  • Single-instance: IOPS are shared by a set of VMs, e.g. a service or a cluster, or this department only gets 20,000 IOPS.
  • Multi-instance: the same rule is applied to a group of VMs, the same rule for a large set of VMs, e.g. Azure guarantees at least X IOPS to each Standard storage VHD.
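A hedged sketch of both policy types – in the shipping WS2016 cmdlets, “Aggregated” corresponds to single-instance and “Dedicated” to multi-instance (the policy and VM names are hypothetical):

    # One shared allowance across a group of VMs (single-instance)
    $dept = New-StorageQosPolicy -Name DeptShare -PolicyType Aggregated -MaximumIops 20000

    # The same min/max applied to every VHD individually (multi-instance)
    $gold = New-StorageQosPolicy -Name Gold -PolicyType Dedicated -MinimumIops 500 -MaximumIops 5000

    # Attach a policy to a VM's disks
    Get-VM -Name "VM01" | Get-VMHardDiskDrive | Set-VMHardDiskDrive -QoSPolicyID $gold.PolicyId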

image

Discrete Device Assignment – NVMe Storage

DDA allows a virtual machine to connect directly to a device. An example is a VM connecting directly to extremely fast NVMe flash storage.

Note: we lose Live Migration and checkpoints when we use DDA with a VM.

image

Evolving Hyper-V Backup

Lots of work was done here. WS2016 has its own block change tracking (Resilient Change Tracking), so we don’t need a buggy 3rd party filter driver running in the kernel of the host to do incremental backups of Hyper-V VMs. This should speed up the support of new Hyper-V versions by the backup vendors (except for you-know-who-yellow-box-backup-to-tape-vendor-X, obviously!).

Large clusters had scalability problems with backup. VSS dependencies have been lessened to allow reliable backups of 64 node clusters.

Microsoft has also removed the need for hardware VSS snapshots (a big source of bugs), but you can still make use of hardware features that a SAN can offer.

ReFS Accelerated VHDX Operations

ReFS is the preferred file system for storing VMs in WS2016. ReFS works using metadata which links to data blocks. This abstraction allows very fast operations:

  • Fixed VHD/X creation (seconds instead of hours)
  • Dynamic VHD/X expansion
  • Checkpoint merge, which impacts VM backup

Note, you’ll have to reformat WS2012 R2 ReFS volumes to get the new version of ReFS.

Graphics

A lot of people use Hyper-V (directly or in Azure) for RDS/Citrix.

RemoteFX Improvements

image

The AVC444 thing is a lossless codec – lossless 3D rendering, apparently … that’s gobbledegook to me.

DDA Features and GPU Capabilities

We can also use DDA to connect VMs directly to GPUs … this is what the Azure N-Series VMs are doing with high-end NVIDIA GFX cards.

  • DirectX, OpenGL, OpenCL, CUDA
  • Guest OS: Server 2012 R2, Server 2016, Windows 10, Linux

The h/w requirements are very specific and detailed. For example, I have a laptop that I can do RemoteFX with, but I cannot use it for DDA (SR-IOV is not supported on my machine).

Headless Virtual Machine

A VM can be booted without display devices. This reduces the memory footprint, and simulates a headless server.

Operational Efficiency

Once again, Microsoft is improving the administration experience.

PowerShell Direct

You can now remote PowerShell into a VM via the VMbus on the host – this means you do not need any network access or domain join (examples below). You can do either:

  • Enter-PSSession for an interactive session
  • Invoke-Command for a once-off instruction

Supports:

  • Host: Windows 10/WS2016
  • Guest: Windows 10/WS2016

You do need credentials for the guest OS, and you need to do it via the host, so it is secure.
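A couple of hedged examples from the host – the VM name is hypothetical, and you’ll be prompted for guest credentials:

    # Interactive session inside the guest OS, no network or domain join needed
    Enter-PSSession -VMName "VM01" -Credential (Get-Credential)

    # Once-off instruction
    Invoke-Command -VMName "VM01" -Credential (Get-Credential) -ScriptBlock { Get-Service }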

This is one of Ben’s favourite WS2016 features – I know he uses it a lot to build demo rigs and during demos. I love it too for the same reasons.

PowerShell Direct – JEA and Sessions

The following are extensions of PowerShell Direct and PowerShell remoting:

  • Just Enough Administration (JEA): An admin has no rights with their normal account to a remote server. They use a JEA config when connecting to the server that grants them just enough rights to do their work. Their elevated rights are limited to that machine via a temporary user that is deleted when their session ends. Really limits what malware/attacker can target.
  • Just-in-Time Administration (JITA): An admin can request rights for a short amount of time from MIM. They must enter a justification, and the company can enforce management approval in the process.

vNIC Identification

Name the vNICs and make those names visible in the guest OS. Really useful for VMs with more than 1 vNIC, because Hyper-V does not have consistent device naming (a sketch follows the image below).

image
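A minimal sketch of enabling this, with hypothetical names – the -DeviceNaming switch is what surfaces the adapter name to the guest:

    # Add a named vNIC with device naming enabled
    Add-VMNetworkAdapter -VMName "VM01" -SwitchName "External" -Name "Storage" -DeviceNaming On

    # Or enable it on an existing adapter
    Get-VMNetworkAdapter -VMName "VM01" -Name "Storage" | Set-VMNetworkAdapter -DeviceNaming On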

Hyper-V Manager Improvements

Yes, it’s the same MMC-based Hyper-V Manager that we got in W2008, but with more bells and whistles.

  • Support for alternative credentials
  • Connect to a host IP address
  • Connect via WinRM
  • Support for high-DPI monitors
  • Manage WS2012, WS2012 R2 and WS2016 from one HVM – HVM in Win10 Anniversary Update (The big Redstone 1 update in Summer 2016) has this functionality.

VM Servicing

MS found that the vast majority of customers never updated the Integration services/components (ICs) in the guest OS of VMs. It was a horrible manual process – or one that was painful to automate. So customers ran with older/buggy versions of ICs, and VMs often lacked features that the host supported!

ICs are updated in the guest OS via Windows Update on WS2016. Problem sorted, assuming proper testing and correct packaging!

MSFT plans to release IC updates via Windows Update to WS2012 R2 in a month, preparing those VMs for migration to WS2016. Nice!

Core Platform

Ben was running out of time here!

Delivering the Best Hyper-V Host Ever

This was the Nano Server push. Honestly – I’m not sold. Too difficult to troubleshoot and a nightmare to deploy without SCVMM.

I do use Nano in the lab. Later, Ben does a demo. I’d not seen VM status in the Nano console before, which Ben shows – the only time I’ve used the console is to verify network settings that I set remotely using PoSH Smile There is also an ability to delete a virtual switch on the console.

Nested Virtualization

Yay! Ben admits that nested virtualization was done for Hyper-V Containers on Azure, but those of us requiring labs or training environments can now run multiple working hosts & clusters on a single machine!
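A quick sketch of turning it on for a lab VM (the VM must be off; the name is hypothetical):

    # Expose virtualization extensions to the guest
    Set-VMProcessor -VMName "LabHost01" -ExposeVirtualizationExtensions $true

    # Usually needed so nested guests can reach the network
    Get-VMNetworkAdapter -VMName "LabHost01" | Set-VMNetworkAdapter -MacAddressSpoofing On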

VM Configuration File

Short story: it’s binary instead of XML, improving performance on dense hosts. Two files:

  • .VMCX: Configuration
  • .VMRS: Run state

Power Management

Client Hyper-V was impacted badly by Windows 8 era power management features like Connected Standby. That included Surface devices. That’s sorted now.

Development Stuff

This looks like a seed for the future (and I like the idea of what it might lead to, and I won’t say what that might be!). There is now a single WMI (Root\HyperVCluster\v2) view of the entire Hyper-V cluster – you see a cluster as one big Hyper-V server. It really doesn’t do much now.

And there’s also something new called Hyper-V sockets for Microsoft partners to develop on. An extension of the Windows Socket API for “fast, efficient communication between the host and the guest”.

Scale Limits

The numbers are “Top Gear stats” but, according to a session earlier in the week, these are driven by Azure (Hyper-V’s biggest customer). Ben says that the numbers are nuts and we normals won’t ever have this hardware, but Azure came to Hyper-V and asked for bigger numbers for “massive scale”. Apparently some customers want massive super computer scale “for a few months” and Azure wants to give them an OPEX offering so those customers don’t need to buy that h/w.

Note: Ben highlights a typo in max RAM per VM: it should say 12 TB max for a VM … what’s 4 TB between friends?!?!

image

Ben wraps up with a few demos.

Ignite 2016 – Extend the Microsoft RDS platform in Azure through Citrix solutions

This post is my set of notes from the session that shows us how Citrix is extending Azure functionality, including the 1st public demo of XenApp Express, which will replace Azure RemoteApp in 2017.

The speakers are:

  • Scott Manchester (main presenter), Principal Group Program Manager, Microsoft
  • Jitendra Deshpande, Citrix
  • Kireeti Valicherla, Citrix

RDS

A MSFT-only solution with multiple goals:

image

Two on-prem solutions:

  • Session-based computing
  • VDI

In the cloud:

  • Session-based computing: RDS in VMs or the deprecated Azure RemoteApp
  • VDI “on Windows 10” … Manchester alludes to some licensing change to allow Enterprise edition of the desktop to be used in cloud-based VDI, which is not possible in any way with a desktop OS right now (plenty do it, breaking licensing rules, and some “do it” using a Server OS with GUI).

RDS Improvements in WS2016

  • Increased performance
  • Enhanced scale in the broker
  • Optimized for the cloud – making it easier to deploy – some of this is Azure, some RDS, some licensing.

Azure N-Series

There are a set of VMs that are ideal for graphics intensive RDS/Citrix workloads. They use physical NVIDIA GPUs that are presented to the VM directly using Hyper-V DDA (as in WS2016 Hyper-V).

I skip some of the other stuff that is covered in other sessions.

Citrix

Kireeti from Citrix XenApp/XenDesktop takes the stage. He’s focused on XenApp Express, a new from-Azure service that will be out in 2017.

XenApp 7.11 has Day 1 support for WS2016:

  • Host WS2016 workloads
  • Host XenApp and XenDesktop infrastructure
  • Workload provisioning on ARM
  • Deliver new universal apps to any device
  • Accelerate app migration with AppDNA

XenApp/XenDesktop For N-Series VMs

HDX can be used with N-Series Azure VMs. This includes graphics professionals and designers on “single user Windows 10 CBB VMs” with multi-monitor NVENC H.264 hardware encoding.

Options for Azure Migration

Jitendra of Citrix takes over. He works on XenApp cloud and XenApp Express.

image

You can extend workloads to Azure, host workloads in Azure, or run on a Citrix-managed service in Azure. In the latter, the management is in Citrix Cloud, and your workload runs in Azure. Citrix seamlessly updates the management pieces and you just use them without doing upgrades.

These are the Citrix/Azure offerings today and in the future:

image

Back to Kireeti.

Next Generation Service for Remoting Apps

XenApp Express, out of the Azure Marketplace, will be the successor to Azure RemoteApp.

image

Citrix Cloud will provide the management – it’s actually hosted on Azure. You bring your own Windows Server images into XenApp Express, much like we do with Azure RemoteApp – it’s an image with the apps pre-installed.

Bad news: the customer must have RDS CALs with Software Assurance (Volume Licensing, and yes, SA is required for cloud usage) or RDS SALs (SPLA). The cost of Azure RemoteApp included the monthly cost of RDS licensing.

The VMs that are deployed are run in your Azure subscription and consume credit/billing there.

Management is done via another portal in Citrix Cloud. Yes, you’ll need to use Azure Portal and the Citrix Cloud portal.

image

Here is the release timeline. A technical preview will be some time in Q4 of this year.

image

Next up, a demo, by Jitendra (I think – we cannot see the presenters in the video). The demo is with a dev build, which will likely change before the tech preview is launched.

  1. You “buy” Citrix XenApp Express in the Azure Marketplace – this limits transactions to certain kinds of subscriptions, e.g. EA but not CSP.
  2. You start by creating an App Collection – similar to Azure RemoteApp. You can make it domain-joined or non-domain-joined. A domain should be available from your Azure VNet.
  3. Add your Azure subscription details – subscription, resource group (region), VNET, subnet.
  4. Enter your domain join details – very similar to Azure RemoteApp – domain, OU, computer account domain-join account name/password.
  5. You can use a Citrix image or upload your own image. Here you also select a VM series/size, configure power settings, etc, to control performance/scale/pricing.
  6. You can set your expected max number of simultaneous users.
  7. The end of the wizard shows an estimated cost calculator for your Azure subscription.
  8. You click Start Deployment
  9. Citrix reaches into your subscription and creates the VMs.
  10. Afterwards, you’ll need to publish apps in your app collection.
  11. Then you assign users from your domain – no mention if this is from a DC or from Azure AD.
  12. The user uses Citrix Receiver or the HTML 5 client to sign into the app collection and use the published apps.

The Best Way To Deliver Windows 10 Desktop From The Cloud

Cloud-based VDI using a desktop OS – not allowed up to now under Windows desktop OS licensing.

There are “new licensing changes” to move Windows 10 workloads to Azure. Citrix XenDesktop will be based on this.

image

  • XenDesktop for Windows 10 on Azure is managed from Citrix Cloud (as above). You manage and provision the service from here, managing what is hosted in Azure.
  • Windows 10 Enterprise CBB licensing is brought by the customer. The customer’s Azure subscription hosts the VDI VMs and your credit is consumed or you pay the Azure bill. They say it must be EA/SA, but that’s unclear. Is that EA with SA only? Can an Open customer with SA do this? Can a customer getting the Windows 10 E3 license via CSP do this? We do not know.

Timeline – GA in Q4 of this year:

image

Next up, a demo.

  1. They are logged into Citrix Cloud, which is first purchased via the Azure Marketplace – limited to a small set of Azure subscriptions, e.g. EA but not CSP at the moment.
  2. A hosting connection to an Azure subscription is set up already.
  3. They create a “machine catalog” – a bunch of machines.
  4. The wizard only allows a desktop OS (this is a Windows 10 service). The wizard allows pooled/dedicated VMs, and you can configure how user changes are saved (local disk, virtual disk, discarded). You then select the VHD master image, which you supply to Citrix. You can use Standard (HDD) or Premium (SSD) storage in Azure for storing the VMs. And then you select the quantity of VMs to create and the series/size (from Azure) to use – this will include the N-Series VMs when they are available. There's more that you can configure – like VM networking & domain join (they don't show this).
  5. He signs into a Windows 10 Azure VM from a Mac, brokered by Citrix Cloud.

That’s all folks!

Ignite 2016 – Microsoft Azure Networking: New Network Services, Features And Scenarios

This session (original here) from Microsoft Ignite 2016 is looking at new networking features in Azure such as Web Application Firewall, IPv6, DNS, accelerated networking, VNet Peering and more. This post is my collection of notes from the recording of this session.

The speakers are:

  • Yousef Khalidi, Corporate Vice President, Microsoft
  • Jason Carson, Enterprise Architect, Manulife
  • Art Chenobrov, Manager Identity, Access and Messaging, Hyatt Hotels
  • Gabriel Silva, Program Manager, Microsoft

A mix of Microsoft and non-Microsoft speakers. There will be a breadth overview and some customer testimonials. A chunk of marketing consumes the first 7 minutes. Then on to the good stuff.

High Performance Networking

A number of improvements have been made at no cost to the customer. Honestly, I’ve seen some by accident, and they ruined (in a good way) some of my demos Smile

  • Improved performance of all VMs, seeing VNet performance improve by 33% to 50%
  • More IOPS to storage – I saw IOPS increase in some demo/tests
  • For Linux and Windows VMs
  • The global deployment will be completed in 2016 – phased deployments across the Azure regions.

You don't have to do anything to get these benefits. I’m sure that Yousef said we’ll be able to get up to 21 Gbps down, depending on the VM SKU/size. Some of this is made possible thanks to making better utilization of NIC capacity.

Accelerated Networking

Azure now has SR-IOV (single-root IO virtualization), where a VM can connect directly to a physical NIC without routing traffic via the virtual switch in the host partition. The results are:

image

  • 10 x latency improvement
  • Increased packets per second (PPS)
  • Reduced jitter – great for media/voice

Now Azure has the highest bandwidth VMs in the cloud: DS15v2 and D15v2 can hit 25 Gbps (in preview). The competition can get up to 20 Gbps “on a good day”.

Performance sensitive applications will benefit. There is a 1.5x improvement for Azure SQL DB in memory OLTP transactions.

Microsoft is rolling this out across Azure over this and the next calendar year. Gabe (Gabriel) does a demo, doing a VM-to-VM latency and bandwidth test. You can enable SR-IOV in the Portal (Accelerated Network setting). The demo is done in the West Central US region. You can verify that SR-IOV is enabled for the vNIC in the guest OS – in Windows, look for a virtual function (VF) network adapter in Device Manager. Interestingly, in the demo, we can tell that the host uses Mellanox ConnectX-3 RDMA NICs. The first demo runs 100,000 pings between the VMs, and the latency is 10 times lower than current numbers. They run a network stress test between two VMs.
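Beyond the Portal setting, I'd expect the AzureRM cmdlets to expose the same switch – a sketch with placeholder names, assuming the current AzureRM module:

```powershell
# Create a NIC with Accelerated Networking enabled ($subnet is an existing
# subnet object; all names are placeholders).
$nic = New-AzureRmNetworkInterface -Name "accelnic1" -ResourceGroupName "rg1" `
    -Location "westcentralus" -SubnetId $subnet.Id -EnableAcceleratedNetworking

# Inside a Windows guest, the SR-IOV virtual function appears as its own adapter:
Get-NetAdapter | Where-Object InterfaceDescription -like "*Virtual Function*"
```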

image

They get 25 Gbps of connectivity between the 2 VMs:

image

This functionality will be coming “soon” to us.

Next there’s a demo with connection latency tests to a database, from a VM with SR-IOV and one without. We see that latency is significantly lower on the accelerated VM. They re-run the test to make the results more tangible. The un-accelerated machine can query 270 rows per second while the accelerated one is hitting 664. Same VMs – just SR-IOV enabled on one of them.

image

The subscription must be enabled for this feature first (still rolling it out) and then all of your VMs can leverage the feature. There is no cost to turning it on and using the feature.

Back to Yousef.

The Network Big Picture

The following is an old slide full of old features:

image

On to the new stuff.

VNet Peering (GA)

A customer can have lots of isolated VNets, each with duplicated effort.

image

Customers want to consolidate some of this. For example, can we:

  • Have one VNet that has load balancing and virtual appliance firewalls/proxies
  • Connect other VNets to this?

The answer is yes: you can now do this using VNet peering (limited to connections within a single region), which just went GA.

image

Note that VM connections across a VNet peering run at the speed of the VMs’ NICs.
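Setting up a peering is a couple of cmdlets, run once in each direction – placeholder names throughout:

```powershell
# Peer two VNets in the same region; the peering must be created in both directions.
$hub   = Get-AzureRmVirtualNetwork -Name "hub-vnet"   -ResourceGroupName "rg1"
$spoke = Get-AzureRmVirtualNetwork -Name "spoke-vnet" -ResourceGroupName "rg1"
Add-AzureRmVirtualNetworkPeering -Name "hub-to-spoke" -VirtualNetwork $hub -RemoteVirtualNetworkId $spoke.Id
Add-AzureRmVirtualNetworkPeering -Name "spoke-to-hub" -VirtualNetwork $spoke -RemoteVirtualNetworkId $hub.Id
```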

Azure DNS (GA)

You can host your records in Azure DNS or elsewhere. The benefit of Azure is that it is global and fast.

image
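Creating a zone and a record set takes a few cmdlets – placeholder names and IPs:

```powershell
# Host a zone in Azure DNS and add an A record set.
New-AzureRmDnsZone -Name "example.com" -ResourceGroupName "rg1"
New-AzureRmDnsRecordSet -Name "www" -ZoneName "example.com" -ResourceGroupName "rg1" `
    -RecordType A -Ttl 3600 `
    -DnsRecords (New-AzureRmDnsRecordConfig -Ipv4Address "13.74.0.10")
# AAAA records (see the IPv6 section below) work the same way, with -Ipv6Address.
```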

IPv6 for Azure VMs

We can create IPv6 IP addresses on the load balancer, and use AAAA DNS records (which you can host in Azure DNS if you want) to access VM services in Azure. This is supported for Linux and Windows. This is a big deal for IoT devices.

image

Load Balancing (Review)

Yousef reviews how load balancing can be done today in Azure. A Traffic Manager profile (based on DNS records and abstraction) does load balancing/failover between 2+ Azure deployments (across 1+ regions). A single deployment has an Azure Load Balancer, which uses Layer 4 LB rules to pass traffic through to the VNet. Within the VNet, Azure application gateways can proxy/direct/load balance Layer 7 traffic to web servers (VMs) on the VNet.

image

Web Application Firewall

The web application gateway is still relatively unknown, in my experience, even though it’s been around for 1 year. This is layer 7 handling of traffic to web farms/servers.

image

A preview for web application firewall (WAF) has been announced – an extension of the web application gateway.

image

WAF adds security to the WAG. In the current preview, it uses a fixed set of rules, but custom rules will be coming soon. MSFT hopes to GA it soon (it must be ready first).

WAF is an add-on SKU to the gateway. It can run in detection mode (great to watch traffic without intervening – try it out). When you are happy, you switch over to prevention mode so it can intervene.
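In AzureRM PowerShell terms, I'd expect the mode switch to look something like the sketch below – the WAF cmdlets are brand new with the preview, so treat this as an assumption:

```powershell
# Flip an existing application gateway's WAF from detection to prevention mode.
$gw = Get-AzureRmApplicationGateway -Name "wag1" -ResourceGroupName "rg1"
Set-AzureRmApplicationGatewayWebApplicationFirewallConfiguration -ApplicationGateway $gw `
    -Enabled $true -FirewallMode "Prevention"
Set-AzureRmApplicationGateway -ApplicationGateway $gw   # commit the change
```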

image

Multiple VIPS for Load Balancer

This is a cost-reduction improvement. For example, previously you needed to run multiple databases behind internal load balancers, with each DB pair requiring a unique VIP. Now we can assign multiple VIPs to an LB, and consolidate the databases onto a pair of VMs instead of multiple pairs of VMs.

image

Back end ports can also be reused to facilitate the above.
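Adding an extra VIP to an existing internal load balancer should be a one-liner plus a save – placeholder names and IPs, and $subnet is an existing subnet object:

```powershell
# Add a second front-end VIP to an internal load balancer.
$lb = Get-AzureRmLoadBalancer -Name "ilb1" -ResourceGroupName "rg1"
$lb | Add-AzureRmLoadBalancerFrontendIpConfig -Name "vip-db2" `
    -PrivateIpAddress "10.0.1.11" -Subnet $subnet | Set-AzureRmLoadBalancer
```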

NIC Enhancements

These improvements didn’t get mentioned in any posts I read or announcements I heard. MAC addresses were not persistent; they have been for a few months now. Also, NIC ordering in a VM is retained after VM start (important for NVAs) – there was a bug where the NICs weren’t in persistent order.

image

New virtual appliance scenarios are supported by adding functionality to additional NICs in a VM:

  • Load balancing
  • Direct public IP assignment
  • Multiple IPs on a single NIC

A marketing-heavy video is played to discuss how Hyatt Hotels are using Azure networking. I think that the gist of the story is that Hyatt went from a single data center in the USA, to having multiple PoPs around the world thanks to Azure networking (probably ExpressRoute). The speaker from Hyatt comes on stage.

Yousef is back on stage to talk about connecting to Azure. I was ready to skip this piece of the video, but Yousef did present some interesting stuff. The first is using the Azure backbone to connect disparate offices. Each office connects over “the last mile” to Azure using a secure VPN. Then Azure VNet-to-VNet VPNs provide the WAN. I’d never thought of this architecture – it’s actually pretty simple to set up with the new VPN UI in the Azure Portal. Azure provides low latency and high bandwidth connections – a very cheap way to network sites together.

image

Highly Available Connections to Azure

We can create more than 1 connection to Azure VPN gateways, solving a concern that people have over reliance on a single link/ISP.

image

Most people don’t know it, but the Azure gateway was an active/passive VM cluster behind the curtain. You can now run the gateway in an active/active configuration, giving you greater HA for your site-to-Azure connections. Additionally, you can aggregate the bandwidth of both VPN tunnels/links/ISPs.

image

If you are interested in the expensive ExpressRoute WAN option, then the PoP locations have increased to 35 around the world – more than any other cloud, with lots of partners offering WAN and connection relay options.

image

ExpressRoute has a new UltraPerformance gateway option: a 5x improvement over the 2 Gbps HighPerformance gateway – up to 10 Gbps through to VNets.

The ExpressRoute gateway SLA is increased to 99.95%.

More insights into ExpressRoute are being added: troubleshooting, BGP/traffic/routing statistics, diagnostics, alerting, monitoring, etc.

There’s a stint by the Manulife speaker to talk about their usage of Azure, which I skipped.

Monitoring And Diagnostics

Customers want visibility into the virtual networks that they are using for production and mission critical applications/services. So Microsoft has given us this in Azure:

image

More stuff will appear in PoSH, in log extractions (for 3rd parties), and in the Portal in the future. The session then moved on to a summary.

Ignite 2016 – Cloud Infrastructure with Jason Zander

These are my notes from watching the session online, presented by the man who runs Azure engineering. This high-level session is all about the hybrid cloud nature of Microsoft’s offerings. You choose where to run things, not the cloud vendor.

Data Centre

Microsoft expects that a high percentage of customers want to deploy hybrid solutions: a mixture of on-premises, with hosting partners, and in public cloud (like Azure, O365, etc). IDC reckons 80% of enterprises will run a hybrid strategy by 2017. This has been Microsoft’s offering since day 1, and it continues that way. Microsoft believes there are legitimate scenarios that will continue to run on-premises.

image

A lot of learning from Azure has fed back into Hyper-V over the years. This continues in WS2016:

  • Distributed QoS
  • Network Controller
  • Discrete device assignment (DDA)

Security

The threats have evolved to target all the new endpoints that modern computing has enabled. Security starts with the basics: patching. Once an attacker is in, they expand their reach. Threats come from all over – internal and external. Advanced persistent threats from zero days and organized & financed attackers are real. It takes only 24-48 hours for an attacker to get from intrusion to domain admin access, and then they sit there, undiscovered, stealing and damaging, for a mean of 150 days!

image

Windows Server philosophy is to defend in depth:

image

Shielded Virtual Machines

The goal is that even if someone gets physical access to a host (internal threat), they cannot get into the virtual machines.

  • VMs are encrypted at rest and in transit.
  • A host must be trusted by a secured, independent authority.

image

Jeff Woolsey, Principal Program Manager, comes on stage to do a demo. Admins can be bad guys! Imagine your SAN admin … he can copy VM virtual hard disks, mount them, and get your data. If that’s a DC VM, then he can steal the domain’s secrets very easily. That’s the easiest ID theft ever. Shielded VMs prevent this. The VMs are a black box that the VM owner controls, not the compute/networking/storage admins.

Jeff does a demo … easy mount and steal from un-shielded VHDX files. Then he goes to a shielded VM … wait, no, those are VMware VMDK files and “these guys don’t have shielded virtual machines”. He goes to the right folder, and mounting a VHDX fails because it’s encrypted using BitLocker, even though he is a full admin. He goes to Hyper-V Manager and tries a console connection to a shielded VM. There’s no console thumbnail and the console is not available – you have to use RDP. The shielded VM uses a unique virtual TPM chip to seal the unique key that protects the virtual disks. The VM processes are protected, which means that you cannot attach a debugger or inspect the RAM of the VM from the host.

This is a truly unique security feature. If you want secured VMs, then WS2016 Hyper-V (and it really requires VMM 2016 for real-world use) is the only way to do it – forget vSphere.

Software Defined

You get the same Hyper-V on-prem that Microsoft uses in Azure – no one else does that. Scalability has increased. Linux is a first class citizen – 30% of existing Azure VMs are Linux and they think that 60% of new virtual machines are Linux. The software defined networking in WS2016 came from Azure. Load Balancing is tested in Azure and running in the fabric for VMs in WS2016. VXLAN was added too. Storage has been re-invented with Storage Spaces Direct (S2D), lowering costs and increasing performance.

Management

System Center 2016 will be generally available with WS2016 in mid October (it actually isn’t GA yet, despite the misleading tweets). Noise control has been added to SCOM, allowing you to tune alerts more easily.

Application Platform

We have new cloud-first ways to deploy applications.

Nano Server is a refactored Windows Server with no UI – you manage this installation option (it’s not a license) remotely. You can deploy it quickly, it boots up in seconds, and it uses fewer resources than any other Windows option. You can use Nano Server on hosts, storage nodes, in VMs, or in containers. The application workload is where I think Nano makes the most sense.

image
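Building a Nano Server image is done offline with the NanoServerImageGenerator module from the WS2016 media – a sketch with placeholder paths:

```powershell
# Build a Nano Server VHDX for use as a guest VM (paths and names are placeholders).
Import-Module D:\NanoServer\NanoServerImageGenerator\NanoServerImageGenerator.psd1
New-NanoServerImage -Edition Standard -DeploymentType Guest `
    -MediaPath D:\ -BasePath C:\NanoBase -TargetPath C:\VMs\Nano1.vhdx `
    -ComputerName "Nano1"
```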

Containers are native in WS2016, managed by Docker or PowerShell. Deploying applications is extremely fast with containers. Coupled with orchestration, the app people stop caring about servers – all compute is abstracted away for them. Windows Server Containers are the same kind of containers that people might be aware of. Hyper-V Containers each get their own kernel, so there is no kernel sharing, and they are isolated by the Hyper-V hypervisor. Docker provides enterprise-ready management for containers, including WS2016. Anyone buying WS2016 gets the Docker Engine, with support from both Docker and MSFT.
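Once the Containers feature is installed, the choice between the two container types is just a Docker CLI switch – the image name here is illustrative:

```powershell
# One-time host setup (a reboot is required afterwards).
Install-WindowsFeature Containers

# Same image, two isolation levels:
docker run -it microsoft/nanoserver cmd                      # Windows Server Container (shared kernel)
docker run -it --isolation=hyperv microsoft/nanoserver cmd   # Hyper-V Container (own kernel)
```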

Ben Golub, CEO of Docker, comes out. Chat show time … we’ll skip that.

Azure

The tenets of Azure are global, trusted, and hybrid. Note that last one.

Global:

image

This is Amsterdam (West Europe). The white buildings are data centers. One is the size of an American Football field (120 yards x around 53 yards). This isn’t one of the big data centers.

image

1.6 million miles of fibre around the world, with a new mid-Atlantic one on the way. There are roughly 90 ExpressRoute (WAN connection) PoPs around the world. The platform is broad … CSP has over 5,000 line items in the price list. Over 600 new services and features shipped in the last 12 months.

Some new announcements are highlighted.

  • New H-Series VMs are live on Azure. H is for high performance or HPC.
  • L-Series VMs: Storage optimized.
  • N-Series (already announced): NVIDIA GPU enabled.

An in-demand application is SAP HANA. Microsoft has worked with SAP to create purpose-built infrastructure for HANA with up to 32 TB OLAP and 3 TB OLTP.

New Networking Capabilities

Field-programmable gate arrays (FPGAs) have gone live in Azure, enabling network acceleration up to 25 Gbps. IPv6 support has been added. Also:

  • Web application firewall (added to the web application gateway)
  • Azure DNS is GA

Compliance

Azure has the largest compliance portfolio in cloud scale computing. Don’t just look at the logo – look at what is supported by that certification. Azure has 50% more than AWS in PCI. 300% more than AWS in FedRAMP. 3 more certs were announced:

image

Azure was the first to get the EU-US Privacy Shield certification.

Hybrid

Microsoft means run it on-prem or in the cloud when they say hybrid (choice). Other vendors are limited to a network connection to hoover up all your systems and data (no choice).

image

SQL Server 2016 stretch database allows a table to span on-prem and Azure SQL. That’s a perfect example of hybrid in action.

Azure Stack Technical Preview 2 was launched. You can run it on-prem or with a partner service provider. Scenarios include:

  • Data sovereignty
  • Industrial: A private cloud running a factory
  • Temporarily isolated environments
  • Smart cities

The 2 big hurdles are software and hardware. This is why Microsoft is partnering with DellEMC, HPE, and Lenovo on solutions for Microsoft Azure Stack. We see behind the HPE rack – 8 x 2U servers with SFP+ networking. There will be quarter-rack stacks for getting started, and bigger solutions.

Azure + Azure Stack

Bradley Bartz, Principal Group Program Manager, comes out on stage. He’s talking through a scenario. A company in Atlanta runs a local data center. Applications are moving to containers. Dev/test will be done in Azure (pay as you go). Production deployment will be done on-prem. An Azure WS2016 VM runs as a container host. OMS is being used by Ops to monitor all items in both clouds. Ops use desired state configuration (DSC) to automate the deployment of OMS management to everything by policy. This policy also stores credentials in KeyVault. When devs deploy a new container host VM, it is automatically managed by OMS.

He now logs in as an operator in the Azure Stack portal. We are shown a demo of the Add Image dialog. A new feature that is coming is syndication of the Azure Marketplace from Azure to Azure Stack. Now when you create a new image in Azure Stack, you can draw down an image from the Azure Marketplace – this increases inventory for Azure Stack customers, and the market for those selling via the Marketplace. He adds the WS2016 with Containers image from the Marketplace. Now when the devs go into production, they can use the exact same template for production on-prem as they did for dev/test in Azure.

When a dev is deploying from Visual Studio, they can pick the appropriate resource group in Azure, in Azure Stack, or even to a hosted Azure Stack elsewhere in the world. With Marketplace syndication, you get a consistent compute experience.

Hybrid Cloud Management

There’s more to hybrid than deployment. You need to be able to manage the footprints, including others such as AWS, vSphere and Hyper-V, as one. Microsoft’s strategy is OMS working with or without System Center. New features:

  • Application dependency mapping allows you to link the components that make up your service, and identify failing pieces and their impacts.
  • Network performance monitoring allows you to see an application’s view of network bottlenecks or link failures.
  • Automation & Control. Patch management is coming to Linux. Patch management will also have crowd-sourced feedback on patches.
  • Azure Security Center will be converging with OMS “by the end of the year” – no mention if that was MSFT year or calendar year.
  • Backup and DR have had huge improvements over the last 6 months.

Jeff Woolsey comes back out to do an OMS demo. He goes into Alert Management (integration with SCOM) to see various kinds of alerts. He drills into an alert, and there are nice graphics that show a clear performance issue. Application Dependency Monitor shows all the pieces that make up a service. This is presented graphically, and one of the boxes has an alert. There is a SQL assessment alert. He drills into a performance alert. We see that there’s a huge chunk of knowledge, based on Microsoft’s interactions with customers. The database needs to be scaled out. Instead of doing this by hand, he makes a runbook to remediate the problem automatically (it was created from a Microsoft gallery item). He associates the runbook with the alert – the runbook will run automatically after an alert.

Three clicks into a new alert, and he allows an incident to be created in a third-party service desk. He associates the alert with another click or two. The problem can now auto-remediate, and an operator is notified and can review it.

He goes into the Security and Audit area and a map shows malicious outbound traffic, identified using intelligence from Microsoft’s digital crime unit. Notable issues highlight actions that IT need to take care of (missing updates, malware scanning, suspicious logins, etc). Re patching, he creates an update run in Updates to patch on-prem servers.

Note: Microsoft Ignite 2016 Keynote

I am live blogging this session. Press refresh to see more.

I am not attending Ignite this year. I’m saving my mileage for the MVP Summit in November. I have actually blocked out my calendar so I can watch live streams and recordings (Channel 9).

Pre-Show

There was a preamble with some media types speculating about the future. I had that on mute while listening to a webinar. A countdown kicks the show off, followed by some interview snippets about people’s attitudes to cloud. The theme of this keynote will be work habits (continuous learning) and cloud.

Julia White

The corporate VP is the host of the keynote. In the pre-show, she acknowledged the negative feedback on Ignite 2015:

  • Over-long keynote – this session will be just 90 minutes long, instead of the 180-minute behemoths of the past.
  • Not enough “general” sessions

image

IT stands for “innovation & transformation” these days. Those that refuse to learn, adapt, change, and evolve, become as populous as the T-Rex. Change can be daunting, but it’s exciting and leads to new opportunities. We should embrace new and different, to figure out what is possible.

Scott Guthrie

image

Today will focus on solutions that enable productivity and an intelligent cloud platform, enabling each of us to deliver transformational impact to our organizations and customers – making us IT heroes. Some blurby stuff now with business consulting terminology. I’ll wait for real things to be presented.

Some examples of BMW and Rolls-Royce (RR – aircraft engines) using Azure to transform their operations. Adobe is a cloud-first company (SaaS) and is moving all of its solutions to Azure. Out comes a speaker from Adobe with Satya Nadella for a chat. It’s chat show time. I’ll skip this.

image

Something was said about digital transformation, I think – I was reading tweets. Guthrie comes back on stage. Now for a video of more customers talking about going to the cloud.

There are now 34 unique Azure regions, each with multiple data centres, around the world, more than twice what AWS offers. Here comes a video to show us inside a region. This is North Europe in Dublin (I’ve never been able to say exactly where due to NDA):

image

Hybrid cloud is about more than connections. Hybrid is more than just infrastructure. It’s about consistency (psst, Microsoft Azure Stack). Use a common set of tools and skills no matter where you work. Microsoft leads in more magic quadrants than the competition combined, according to Gartner.

image

Guthrie starts to talk about Azure, what it is and its openness. You can use the best of the Linux and the best of the Windows eco-systems to build your solutions.

Donovan Brown

image

Demo time. Brown has a bunch of machines running in Azure and on-prem. He wants to manage them as a unified system. Azure Monitor is a new system for monitoring applications no matter where they are. Monitor in the nav bar shows us the items deployed in Azure and their resource usage. We can see Hyper-V and VMware resources too, using SCOM agent data (requires System Center). A lot of Monitor looks like duplication with Log Analytics (OMS). I’m … confused. We then see security alerts and recommendations in Azure Security Center.

Back to Guthrie.

Technical Preview 2 of Azure Stack is announced.

And we’re back to chat show time again. I’m completely tuning out this segment.

Windows Server 2016

This is a cloud platform, for a software defined data center. Just-in-time admin access, preventing DDOS attacks on a host by VMs, and Nano Server offering new density. There is built-in support for containers, and Docker Engine is going to be available to all WS2016 customers, free of charge. Windows Server 2016 and System Center 2016 general availability is announced. No dates mentioned, and no bits available on either MSDN or Azure. Yes, there’s an eval online, but that is meaningless.

I press pause here – I’ve a Skype Biz call I cannot get out of. Back later for more …

Conference call is over so I’m back watching the video on delay. Donovan is back on stage to talk DevOps – a big focus for MSFT, moving away from evangelizing IT pro stuff. He starts a demo with Team Services, and I check out Twitter. I fast forward past stuff that includes coding (!).

Yusuf Mehdi

image

There’s a video about Windows 10, with plenty of emphasis on ink/stylus and HoloLens/AR. It’s a classic Microsoft “future” video with things that are possible and others that are futuristic. By 2020, half the workforce will be millennials, so Microsoft needs to innovate on how people work. 44% of the workforce is expected to be freelancers (rubbish zero hour contracts?). There will be 5 billion devices shipping annually – IDC predicts that Windows Phone will …. no; I’ll stop joking Smile Data growth will continue to explode (44 zettabytes). More devices and capabilities introduce more security challenges.

Microsoft product progress:

  • Windows 10 is on 400 million “monthly active devices”.
  • There are over 70 million monthly commercial users of Office 365.
  • I heard (from a player in this market) that EMS crushes MDM competition on seat sales in the EU. Azure AD protects over 1 billion logins.

Cortana demo is shown. That must be cool for the 10 countries that can use Cortana. Then there’s some inking in Office. And they do some ink sums/equations in OneNote. More Cortana – I’m not going to blog about this because it’s not relevant to 90% of us. Outlook now has Delve Analytics – so I can see who read my email and when. My Analytics in O365 is like Fitbit for your Office activity (meetings, email, multitasking, etc). Surface Hub demo (fast forward).

Another video – security threats and risks. It takes an average of 200 days to detect a breach and 80 days to recover (Source: Ponemon Institute – The Post Breach Boom, 2013). The average cost is $12m per incident. We are in a never-ending battle because every end point is under attack. End users and weaknesses in IT processes are the vulnerable targets.

Microsoft is spending over $1 billion per year on security. Microsoft has the largest anti-malware system in the world, and they scan more email than anyone. They have the largest online services in the world (O365, Bing, Outlook.com, Azure, etc).

image

All this data gives Microsoft a great view of worldwide IT security, and the ability to innovate defences:

image

Microsoft announces protection for the browser (Edge first) in Windows Defender Application Guard. The Edge Browser is isolated using a hardware based container – that’s protection against malware, zero day threats, etc. So even if you browse an infected site, the infection cannot cross the security boundary to your machine, data, and network.

Ann Johnson

image

This section is all about security. Security is built from the base of the platform: Windows. Windows 10 (especially Enterprise edition) is the most secure version of Windows. We get a demo of Credential Guard. A Windows 7 machine (no CG) is running, as is a Windows 10 machine (with CG). The “hacker” launches attacks from both infected machines. The Windows 7 machine spits out password data to the hacker. The Windows 10 machine cannot retrieve passwords because they are secured in hardware by CG.

Next up is Windows Defender Application Guard. This uses the same hardware tech as CG. Two browsers, one protected and one not, are run. The unprotected browser hits a “dodgy” website and we see that all of the Windows security features are turned off thanks to a silently downloaded payload. On another machine, nothing happens when the protected browser hits the malicious website – that’s because Application Guard isolates the browser session behind a hardware boundary.

Attacks will happen, they happen to everyone, so Microsoft is engineering for this. Windows Defender Advanced Threat Protection uses behavioural analytics to detect attacks across your network. In a demo, an attack alert is opened. We can see how the attack worked. Outlook saved an attachment. This attachment is analysed. Office 365 Advanced Threat Protection is integrated with Defender ATP. We can see that there was an attempt to send this attachment to 2 users, but O365 blocked the attachment because it received a signal from Defender that the attachment was malicious. A source is identified – a user whose identity might be compromised. The user is clicked, and Microsoft Advanced Threat Analytics (ATA, a part of the EMS suite) tells us everything – the user’s ID was compromised from another user’s PC. So we have the full trace of the attack.

image

And other than more chat show, that was that. It was a keynote light on announcements. General sessions followed, so I guess that’s where all the news for techies will be.

Webinar Recording: What’s New in Windows Server 2016 Hyper-V?

The good marketing folks at MicroWarehouse have edited and posted the recording of my recent webinar that focused on the new features of Windows Server 2016 (WS2016) Hyper-V. Lots of info in slide form and some demos too. We have also shared the slides of the presentation.

image

Most of the demos were live, but I needed a recording for one after a lab failure. Thanks to Subhasish of Microsoft for the donation of a demo video so I could show Storage Resilience in action.

This webinar had our biggest live audience to date. We’ve already announced a follow-up which is seeing an even larger level of interest. On September 22nd, 14:00 UK/Ireland, 15:00 CET, and 09:00 EST (I think) I’ll be doing a webinar on what’s new in WS2016 Failover Clustering and storage. Expect lots of stretch clustering, storage replica, and Storage Spaces Direct. You can register here.

image

Webinar Recording: Defending Today’s Threats With Tomorrow’s Security By Microsoft

MicroWarehouse has posted the recording of our last webinar, which explained why the security solutions of the 1990s, which some companies are still relying on, are being easily defeated by attackers today.

The post explains what is happening now, based on 2015 survey information from multiple sources. And I explain how a number of cloud-based security services from Microsoft can protect both your on-premises and cloud infrastructures from these modern attack methods that your firewall and anti-malware scanning will let pass right through or never see.

In fact, I just saw a support request on a security issue that 2 of the solutions in this webinar would have prevented.

image

We have also shared the slides and a hand-out with some follow-up reading/watching.