MMS 2012: Deep Dive Into ConfigMgr 2012 DRS and SEDO

Speaker: Saud Al-Mishari, MSFT PFE – think he’s based in the UK

The session is on the new replication model: RCM, DRS, and SEDO.

Key Concepts

  • SQL replication in ConfigMgr 2012 has nothing to do with SQL Server transactional replication
  • Data Replication Service (DRS)

Terminology

  • Stored procedure: sproc
  • SSB: SQL Service Broker
  • Change Tracking: SQL Server Change Tracking

More:

  • RCM: Replication Configuration Management/Monitoring
  • Replication Pattern: a set of rules on what will replicate
  • Replication group: a set of tables that are monitored and replicated together
  • Replication Link: a replication connection between two SQL servers for a particular RG
  • Backlog: Unable to write data to the SQL Server DB after being received in the SSB Queue (usually SQL Server write performance)

New Replication Model

  • Global data is anything an admin creates and is replicated everywhere, e.g. collection rules
  • Site data is stuff like status, collection membership results, replicated up to parent site.

Client generates XML file and copies it to the management point.  MP copies MIF to the site server.  Site server processes it.  DRS replicates the changed data to the parent.  CAS contains the discovery data.

SQL Server Change Tracking

  • Change tracking allows an application to keep a record of rows in a table that have been changed: insert/update/delete
  • Does not track changed data – obtained directly each sync
  • Added in SQL Server 2008 … not to be confused with Change Data Capture
  • Is enabled at the DB level and at the table level.
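The key point in the bullets above is that change tracking records which rows changed, not what they changed to, and the row data is read fresh at sync time. Here is a toy Python model of that idea. It is only a sketch: the names (table, changed, sync) are made up for illustration and this is not how SQL Server implements the feature internally.

```python
# Toy model of change tracking: record WHICH rows changed, not WHAT changed.
# All names here are hypothetical; this is not SQL Server's implementation.
table = {}    # primary key -> current row data
changed = {}  # primary key -> last tracked operation ("I", "U" or "D")

def insert(key, row):
    table[key] = row
    changed[key] = "I"

def update(key, row):
    table[key] = row
    # A row inserted since the last sync still looks like an insert.
    changed[key] = "I" if changed.get(key) == "I" else "U"

def delete(key):
    table.pop(key, None)
    changed[key] = "D"

def sync():
    """Build the change set; row data is read from the table at sync time."""
    batch = [(op, key, table.get(key)) for key, op in sorted(changed.items())]
    changed.clear()
    return batch

insert(1, "collection A")
update(1, "collection A v2")  # only the latest data is sent at sync
insert(2, "collection B")
delete(2)
changes = sync()
```

Note that the intermediate value "collection A" never travels anywhere: only the keys and operations were tracked, and the data was picked up when sync() ran.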

DO NOT ALTER THIS SETTING ON A SITE DATABASE

SQL Service Broker

Messaging service:

  • Asynchronous queue based service
  • Guaranteed delivery (not infrastructural guarantee – developer guarantee)
  • Allows messages to be grouped into a conversation … messages processed in order, allows for multiple threads to process queue

Elasticity:

  • Allows scalability

Replication Patterns

  • Global data flows in both directions.  CAS and primaries all have the same data, e.g. collections and package meta data.
  • Site data flows up.
  • Global-proxy is admin and control data for secondary sites.  A primary and its secondary sites all have the same data.  Subset of global data that secondary sites need.  Leverages SQL 2008 R2 Express at the secondary site with 10 GB limit.

Select * from vReplicationData to find all RGs and their sync schedules

ID is the key field in here.

Provider Access

SMS_ReplicationGroup is a new WMI class that supports replication.  1 instance per RG.  The Status property allows you to determine the status of the RG.

What’s in an RG?

Select * from vArticleData where ReplicationID = XX  …. using ID from above query

How big is the RG?

EXEC spDiagGetSpaceUsed

If a site goes down for a week or two, how much data must you send across?  Use the above query to figure out how much data must be replicated by the RG.

Demo

In the SQL Management Studio.  Select * from vReplicationData. Can see all the patterns for global, site and global-proxy.  SyncInterval is the number of minutes between replications.  DRS runs every 5 minutes .. no control over that. 

Select * from vArticleData where ReplicationID = 7.  Looks like Endpoint Protection data being replicated here.

Runs spDiagGetSpaceUsed .. takes a while.  Returns the size of the tables.  Replication Pattern shows the amount of data to replicate if you lose a site for the 3 patterns (global, site, global_proxy).

DRS Architecture

  • RCM handles replication link setup, maintenance and monitoring – command and control.  It’s a thread of SMSEXEC.
  • SSB is the transmission engine of replication
  • The Sender still lives and is used for bulk copy for initialization and re-init.
  • 5 day limit on DRS for outages – Due to the need to retain changes.  It retains 5 days of data.  Try to expand this for a 30 day outage and ConfigMgr needs to maintain 30 days of data.  It’s 5 days to handle a long weekend apparently – site breaks at start of holiday, come back 4 days later and fix it. 
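The 5 day rule boils down to a simple decision. Here is a minimal Python sketch of it, assuming the retention window is exactly 5 days and that an outage of exactly 5 days can still catch up (the session did not say how the boundary is treated). The function name is hypothetical.

```python
from datetime import timedelta

RETENTION = timedelta(days=5)  # retention period quoted in the session

def recovery_action(outage: timedelta) -> str:
    """Decide how a site catches up after being offline for `outage`."""
    if outage <= RETENTION:
        # The tracked changes are still retained, so normal DRS can catch up.
        return "catch up from tracked changes"
    # Changes older than the retention window are gone; start from scratch.
    return "re-initialize"

long_weekend = recovery_action(timedelta(days=4))
two_weeks = recovery_action(timedelta(days=14))
```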

Initialisation:

  1. BCP: to extract table data
  2. Sender: SMSEXEC sender thread
  3. SMB/CIFS: copy data to the destination

On-going replication

  1. SQL Server Change Tracking
  2. DRS sprocs and SQLCLR
  3. SQL Server Service Broker
  4. XML

Demo – Break replication

SQL DBA has a bad day and disables dbo.ConfigMgrDRSQueue.  CMTrace is started from DVD.  Opens rcmctrl log on site server.  See that the queue not running causes an error.  We can see that ConfigMgr actually reached out into SQL and re-enabled the queue. 

In CMconsole , we have send demo.  The link is degraded in one direction but not the other under Database Replication.  Looks like TCP 1433 connectivity issue.

Site Initialisation

  1. Setup start
  2. Setup asks CAS for site number.  If you have more than 50,000 clients, then you need SQL Enterprise Edition to chunk up data in the DB and partition it.
  3. Setup finished and waits for replication to initialise.
  4. The replication configuration data is requested.  This group tells RCM as the primary how replication should be setup
  5. CAS receives the request, BCPs out the data, and sends it via the sender back to the primary
  6. Primary now requests the remaining Global Replication Groups.  CAS creates the BCP packages and sends them back to the primary.  Primary then applies the new data from the CAS.
  7. Primary site receives the BCP files and inserts all the data from the CAS.  The primary can now switch to normal replication.

DRS Message Replication

  • Provider executes query that modifies table
  • SQL Server writes entries into change tracking table
  • On DRS sync: changes are packaged up and inserted into the SQL Server message queue using a stored proc.
  • Message Broker transmits the message to the receiving site.
  • RCM monitors the queue launching activation stored procs to process
  • And more on receiving side to insert modifications on receiving side

WARNING: When A CAS Goes Offline

When the CAS goes offline for more than 5 days, don’t make changes on the Primary as a substitute for the CAS.  The CAS will re-initialise the primaries after more than 5 days outage, thus wiping the Primary’s changes.

DRS Troubleshooting

  • The Replication Link Analyzer (RLA) should be your first stop.  It’s predictable and can do some fairly complex remediation
  • RCM Log should be the follow up.  But this is just a summary of what has happened.
  • For transmission layer errors, the SSB queue is sometimes the most immediate source for error messages (of this type)

Views for Detailed Info

  • The main logging view: vLogs.  They log into the DB.  Select top 1000 * from vLogs order by LogTime desc.  Limit that number.  Do not select everything.  It will hammer the prod environment and compound the issue.
  • SMS_Replication_Configuration_Monitor registry key to configure logging

DRS Troubleshooting

  • Ensure that TCP 1433 exception is there for SQL Service and 4022 for SQL Broker.
  • SSB keys transmitted through setup – monitoring with Hman.
  • spDiagDRS will give you an overview of the state of DRS replication at the site.  SiteStatus (coded), Replication Group Initialization Status, DRSQueueStates, QueueLengths (ideally 0 and 0 or you have a backlog), Replication Group Status details the last time messages were sent

Demo: View Queues

Click on the queues in SQL under service broker under CM database.

Procedural troubleshooting of DRS DEMO

Turns off SQL Broker. Makes a change to Client Policy.

  1. Run spDiagDRS: EXEC spDiagDRS in SQL MS.  We see messages jammed in the outbound queue.
  2. SSB transmission_queue: 
  3. Service broker queues: We see connection failed errors.  Telnet to the port and we see it fails.
  4. vLogs: select * from vLogs ORDER BY LogTime DESC (beware * in real world … too much data)
  5. RCM_ReplicationLinkStatus

The Database Replication link in CM console will flip to degraded and then flip to fail after about 25 minutes.  Can run Replication Link Analyzer (RLA).  In the demo it shows that there’s a network connectivity issue.

Invoke-WmiMethod -Namespace root\sms\site_CAS -Path SMS_ReplicationGroup -Name InitializeData -ArgumentList "20", "CAS", "PR1" to reinitialize a RG.  RLA should do this for you if required.

SEDO – Why do we need a way of controlling changes?

  • As global data is replicated everywhere, a user on a primary site could change an object at the same time as a user on the CAS or another primary.
  • This is an unavoidable consequence of multi-master replicated data model – ask AD.
  • SEDO is the solution to this.

What is SEDO?

  • SEDO = Serialized Editing of Data/Distributed Objects
  • Provides a way to enforce single-user editing of an object at any one time.
  • A lock request round trip can take less than 200ms from Primary to CAS to Primary
  • Default Timeout is 5 minutes.
  • Only SEDO enabled objects require users to get a lock
  • Supports explicit and implicit lock handling.
  • This is all transparent to admins.  Important for devs building extensions to CM.
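For anyone building extensions, the behaviour described above is basically a per-object lock with a timeout. Below is a rough Python sketch of that pattern. It is not the SEDO API: the class, method names and object IDs are invented, and I am assuming that the 5 minute timeout means an unreleased lock can be taken over by another user once it expires.

```python
import time

LOCK_TIMEOUT = 5 * 60  # seconds; the default timeout quoted in the session

class SedoLocks:
    """Hypothetical per-object edit locks (a sketch, not the real SEDO API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}  # object id -> (owner, time acquired)

    def acquire(self, obj_id, user):
        holder = self._locks.get(obj_id)
        if holder:
            owner, acquired = holder
            # Someone else holds an unexpired lock: refuse the edit.
            if owner != user and self._clock() - acquired < LOCK_TIMEOUT:
                return False
        self._locks[obj_id] = (user, self._clock())
        return True

    def release(self, obj_id, user):
        holder = self._locks.get(obj_id)
        if holder and holder[0] == user:
            del self._locks[obj_id]

# Usage with a fake clock so the timeout can be demonstrated instantly.
now = [0.0]
locks = SedoLocks(clock=lambda: now[0])
first = locks.acquire("collection-42", "alice")   # alice gets the lock
second = locks.acquire("collection-42", "bob")    # bob is refused
now[0] += LOCK_TIMEOUT                            # alice's lock expires
third = locks.acquire("collection-42", "bob")     # bob can now take it
```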

MMS 2012: Automating Data Protection And Recovery With DPM and System Center 2012

Speakers: Orin Thomas and Mike Ressler

Replication is not the same as backup.  Lose it in site A = lose it in site B.  Backup is still required.  And backup provisioning in the private cloud is a challenge cos admins don’t know what’s being deployed.

DPM is a part of system center, a part of a holistic integrated solution.  Makes it perfect for provisioning in the private cloud.

How Will The Agent Get Deployed?

  • Make it part of image
  • GPO for an OU
  • Scripting or manually
  • Use Configuration Manager
  • And probably lots more options, e.g. a runbook fired off from Service Manager

Their solution is user goes to Service Manager, creates a request, and Orchestrator runs a runbook.  There is a DPM Integration Pack.  It’s a confusing IP apparently. 

  1. Initialize Data: Add parameters – ServerName, DatabaseName, and Type (3 types of protection group in DPM such as gold, silver, and bronze for recovery points, retention, etc).
  2. Get Data Source (renamed as Get Protection Group): Data Source Location set as protection group and select Type
  3. Get Data Source (get server ID) – choose protection server and select ServerName
  4. Get Data Source (renamed as Get Data Source ID) – DPM, Get protection server name and filter to DatabaseName to protect a single DB, could have said type = SQL to protect all DBs.
  5. Protect Data Source: Protection Group = Get Protection Group
  6. Create Recovery – Something.

Yup, it’s confusing.  Go look at the videos when the guys tweet the link.

Keep the self-service simple.  If there’s more than a few questions, the user won’t do it and they’ll blame you when data isn’t protected and it’s lost.

There’s a bunch of Service Manager stuff after this.

MMS2012 – System Center 2012 Monitoring and Operations Tips and Advice

Speaker: Gordon McKenna and Sean Roberts, Inframon

I’m live blogging this session so hit refresh to see more.

Private Cloud MOC and Certification

New exams and certifications.  70-246 Monitoring and Operating a Private Cloud.  70-247 Configuring and Deploying a Private Cloud.

  • MCSA + 70-246 + 70-247 = MCSE: Private Cloud
  • 70-640 + 70-642 + 70-646 = MCSA

The two training courses are available now.

10750 – Module 4: Monitoring Private Cloud Services

To do J2EE APM you download an opensource Java bean.  OpsMgr network monitoring is network monitoring for server guys. Existing solutions for network guys won’t be replaced.  OpsMgr network monitoring gives the server guys the tools to find a troublesome link/device and enable them to tell the n/w guys.  Port stitching figures out what ports your monitored servers are talking to and shows that to you.

MP Templates are a good starting point.  Check out the new Visio tool and the MP Authoring tool (latter requires significant time investment). 

Distributed Application Monitoring

A new distributed application monitoring tool.  3 types of line:

  • Reference relationship: no impact … dotted line
  • Hosted relationship, e.g. database hosted by database instance.  Health will roll up.
  • Containment: Group of servers.  With aggregate rollup monitor, server goes red, group goes red.

Note that default management pack is no longer there!  Forces you to save your authoring in a suitable MP.  Yay!

Health rolls up to 1 of 4 things:

  • Availability
  • Performance
  • Configuration
  • Security

We can configure the rollup to go up to a level of our choice, e.g. don’t roll up or roll up to top level of distributed application.

  • Presentation Tier – anything user sees
  • Business Tier: back or middle tiers.

Creates a service level dashboard for the new MP based on the distributed app model.  Add the OpsMgr dashboard viewer and adds the webpart into SharePoint.  Grab the URL of the dashboard link in OpsMgr and edit the web part properties to paste the Dashboard link.  Now the SLA dashboard appears in SharePoint.

Tips

  • Always build out service models in the DAD (Distributed Application Designer).  Good eye candy wins prizes!  I concur – have personal experience of that.
  • Use three tier service models that match your business functions
  • Use MP templates for true pro-active monitoring
  • Use APM to stop developer VS IT Pro arguments
  • Create a dedicated SharePoint portal for dashboard and reports

10750 – Automating Incident Creation, Remediation, and Change Requests

Orchestrator components:

  • Orchestration console on IIS (Silverlight)
  • Runbook server(s): usually local to servers
  • Management server running Runbook Designer and Deployment Manager
  • SQL DB

Download integration pack, register it with management server, deploy IP to runbook servers, open Runbook Designer to use it.

Install the OpsMgr R2 integration pack.  Define a connection to the OpsMgr server.  You then have the actions available to use.  Do the same for Service Manager.

Demo with web service crashing and auto remediation.  OpsMgr detects event.  Orchestrator waits for that event.  It tries to restart the event.  Creates ticket to auto restart IIS.  If that fails, it lodges a ticket in Service Manager for manual OK to reboot the server.

Opens up Runbook designer.  Browses into Runbooks and we see the book in question.  Runs the runbook tester, toggles break point, and runs it.  Now he stops the website.  The runbook kicks off, and they step through the actions.  We get into Service Manager where there’s a change request for a reboot.  That’s approved and the web server is rebooted.

Note: there is a maximum of 50 running runbooks on a Runbook Server.

When configuring a runbook

  • Handle failure and warning links
  • Replace the default strings
  • Change link colours
  • Limit the number of activities for each Runbook
  • Enable runbook logs to an external file

10750 – Module 7: Problem Management In The Private Cloud

Incident = one time occurrence that can be handled by an operator.  Problem is more complex, e.g. engineering issue that requires escalation.

Information stored in Problem Log in Service Manager.  Another demo of automated problem record creation.  An alert will come in in OpsMgr for a DB that goes offline.  The alert auto pipes in as an incident in Service Manager.  Many instances of it in the demo.  It’s a problem.  A problem record is manually created from these incidents.  He fills in information in the New Problem form. 

Now he kills the DB again. 

There’s a runbook that is looking for occurrences of that incident.  It’ll get the service details and the incidents for this service, output data to text file, count lines, if there’s more than X occurrences then it will create a problem based on the data in the file.  This workflow replaces the above manual task for this particular incident.
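The logic of that runbook (collect incidents per service, count them, raise a problem past a threshold) can be sketched in a few lines of Python. This is only an illustration of the counting step: the incident records, the field name "service" and the threshold of 3 are all made up, and a real runbook would use the Service Manager activities instead.

```python
from collections import Counter

THRESHOLD = 3  # hypothetical value for the "X occurrences" in the demo

def services_needing_problem(incidents, threshold=THRESHOLD):
    """Return services with more than `threshold` incidents, sorted by name."""
    counts = Counter(incident["service"] for incident in incidents)
    return sorted(name for name, n in counts.items() if n > threshold)

# Made-up incident data: the database keeps going offline.
incidents = (
    [{"service": "OrdersDB", "title": "Database offline"}] * 5
    + [{"service": "WebShop", "title": "Slow page"}] * 2
)
problems = services_needing_problem(incidents)
```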

Hints and tips

  • Target object and classes and use groups to override
  • Be aware of the inheritance for each class
  • Limit the size and activity of a runbook
  • Download and use the Cloud Processes Pack.  Create request driven processes for many cloud services functions such as project, capacity pools, and virtual machines.  Can introduce the concept of charge back billing.  Supplies cloud service runbooks.  Project = collection of capacity pools.

 

MMS2012 – I’ve Deployed OpsMgr 2012 Application Performance Monitoring (APM); Now What?

Speaker: Pete Zerger and a Dude Who Was With Avicode

APM was Avicode, and allows .NET and J2EE application monitoring from the inside.  Help IT isolate the issue.  Provide the app team with the info they need to fix the app.

Teams you might have involved in app troubleshooting:

  • Operations: Runs the infrastructure on a day-to-day basis
  • Support and development: writes it and fixes bugs
  • QA/Testing: tests it
  • DevOps: owns the production code

Processes

  • Troubleshooting
  • Daily/weekly app health analysis
  • Fixing top issues
  • Next application release scope
  • Improve monitoring configuration

Reports

Start with Top reports

Figure out how often to send reports, who to send them to, and what apps to include.

Problems distribution analysis is a good high level report of all apps.  Application status gives you a week-week report on app performance/health.  Run it weekly and send to an active/involved supervisor.  Application CPU utilization should be run weekly/monthly.

Make a note of http://dinnernow.codeplex.com/ for testing/demo.

Rules

Filter out noise, e.g. non-actionable alerts .. maybe fixed in next release, etc.  Use rules everyday.  Start with top level problems, create rules for exception events.

Using RegEx Sensitive Data Filters

You can use expressions to find and mask sensitive data that you don’t want out in the wild, e.g. social security number, credit card number, etc.
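To give a feel for what such a filter does, here is a small Python sketch that masks US social security numbers and 16 digit card numbers. The two patterns are my own simplified examples, not the ones that ship with APM, and real card number matching would need to be more careful.

```python
import re

# Simplified, illustrative patterns (assumptions, not APM's built-in filters).
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")

def mask(text):
    """Replace anything that looks like an SSN or card number with asterisks."""
    text = SSN.sub("***-**-****", text)
    return CARD.sub("****-****-****-****", text)

masked = mask("ssn=123-45-6789 card=4111 1111 1111 1111")
```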

 

There’s a lot more demo after this.  Best you watch the video when it’s made available in a few days.

MMS2012 – SC 2012 VMM: PowerShell Is Your Friend, And Here’s Why

Speakers: Hector Linares, Senior Program Manager and Susan Hill, Senior Technical Writer, MSFT

Went from 162 cmdlets in VMM 2008 R2 to 438 in VMM 2012.  They maintained backwards compatibility through aliases.  The cmdlets got renamed so they don’t conflict with the new Windows Server 2012 Hyper-V cmdlets.

POSH is the driving force for the UI.  Cmdlets are executed as jobs in VMM so there’s an audit trail.  Other partners, e.g. TFS or XenDesktop, integrates with VMM cmdlets for deployment.

Overview of VMM 2012 system

  • Infrastructure: HA VMM Server, PowerShell, Upgrade, Custom Properties
  • Fabric: Server lifecycle management, multiple hypervisors, network management, storage management, dynamic optimisation.
  • Clouds: An abstraction of fabrics.  Application owner usage, capacity and capability, delegation and quota.
  • Services: Service templates, application deployment, custom command execution, image-based servicing.

Cmdlet groups: 46 nouns

  • get-command –module VirtualMachinemanager –commandtype cmdlet
  • get-scvirtualmachine
  • Now you run read-SCvirtualmachine to do a refresh
  • Repair-scvirtualmachine will do the repair action.
  • Stop-scvirtualmachine takes more parameters, e.g. stop (cold), save state, or clean shutdown
  • Register-sCVMHost to register a bare metal host.
  • Restart-SCVMHost to reboot a host.
  • Test-SCVMHostCluster to run a cluster validation.

Domain Join for VM

You can use –DomainJoinOrganizationalUnit “ou=, dc=” to set where a new VM joins in a domain.

-AutoLogonCredential to set the autologon account and -AutoLogonCount to say how many times that will run.

These must be set at the same time.  You can clean up with disableautologon.

UnattendSettings

Looks like we can use this to customise an unattend.xml for Specialize (3) and OOBE (6) passes.  Use Add(key, value) to add settings.

  • $unattend.add
  • $unattend.remove

Your settings will override settings in GuestOSProfile or VMTemplate.  You have to commit the settings with set-scvmtemplate (I think – quick slides) to use them.

Demo

In the demo, he wants to override a template.  He gets the template.  Now he creates a new temporary template.  He sets the OU for it to join to.  He creates a runas account as the account he’ll use for building the VM.  He uses that for autologon.  He gets the unattend object.  Now he adds a bunch of overrides to the template using $unattend.add().  Set-SCVMTemplate -VMTemplate $template -UnattendSettings $unattend | Out-Null commits the overrides.  They create a $vmconfig using New-SCVMConfiguration -VMTemplate $template -Name ($vmNamePrefix + "_config") | fl Name. 

VMM still doesn’t have the ability to create differencing disks so you have to use WMI to do it instead.  Apparently this has been blogged. 

He sets the disk name and location.  This can be done on a per disk basis.  In this cmdlet he’s told it to use an existing VHD he just created using WMI. 

Virtual Machine Configuration

You can create a VM config so you can deploy very specific VM configs, different from the defaults.  Set $vhd with Get-SCVirtualHardDisk from the library.  Then set the $storageclass variable with Get-SCStorageClassification.  Now $computertier with Get-SCComputerTier.  Then $vmconfig with New-SCVMConfiguration and the $computertier variable.  Then $vhdconfig with Get-SCVirtualHardDiskConfiguration and $vmconfig.  Finally Set-SCVirtualHardDiskConfiguration with $vhdconfig, $vhd and $storageclass. 

Now $virtualnetworkadapterconfig = Get-SCVirtualNetworkAdapterConfiguration.  Set-SCVirtualNetworkAdapterConfiguration with $virtualnetworkadapterconfig.  And then more stuff.  Download the slide deck when it comes out in a few days.

Basically you build up a VM config and then you create a VM from that config.

There is a script on the net that will automatically sign the scripts in your VMM library.  It was written for 2008 R2.

We’re shown a demo where a script checks for expired (by date) VMs and stores them in the VMM library.

Hyper-V Data Exchange

Can read and set the KVPs in the VM.  Can read data from a VM without using the network via read.  Can pass in string values to a VM regardless of power state with Set.  A Key is a registry VALUE created to store DATA.  The value is the DATA.  And a KVPMAP is a hash table of one or more VALUEs or DATA.

Cool demo where Hector writes to the registry of the VM in different power states (on, off, paused, save state).

VDI

Jobs submitted to VMM using –RunAsynchronously from one or more runspaces.  Hundreds of parallel jobs.  Typically used in the morning bootstorm in VDI.

VMM 2012 has a concept of threadpools.  By default it handles 25 threads per core in the VMM server with a max of 150 (requires a monster VMM server).  High number of context switches can slow performance of the VMM server.  The WCF timeout is configurable (default of 120 seconds).  Monitor the performance of jobs if you increase threadpools.
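The thread numbers above work out as follows. This is just the arithmetic from the session in Python, assuming the cap is a simple "25 per core, but never more than 150" rule. The function name is made up.

```python
THREADS_PER_CORE = 25  # default quoted in the session
MAX_THREADS = 150      # ceiling quoted in the session

def vmm_thread_limit(cores):
    """Threads available on a VMM server with the given number of cores."""
    return min(THREADS_PER_CORE * cores, MAX_THREADS)

four_core = vmm_thread_limit(4)
eight_core = vmm_thread_limit(8)
```

On those assumptions the cap is reached at 6 cores, so adding more cores beyond that does not raise the thread limit.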

If you run asynchronously then query the job object for status.  For higher throughput, use multiple threads with multiple runspaces.

Make sure you tune the VMM refreshers in VDI, and also in very large static environments.  4000 VMs doing a light refresh every 2 minutes and a full refresh every 30 minutes will hammer the VMM server. 
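To put a number on "hammer", here is the back-of-envelope arithmetic in Python, using only the figures from the example above (4000 VMs, light refresh every 2 minutes, full refresh every 30 minutes). It assumes every VM is refreshed on every cycle and says nothing about how expensive each refresh actually is.

```python
def refreshes_per_hour(vm_count, interval_minutes):
    """Number of refresh operations per hour for a given refresh interval."""
    return vm_count * 60 // interval_minutes

VMS = 4000
light_per_hour = refreshes_per_hour(VMS, 2)
full_per_hour = refreshes_per_hour(VMS, 30)
total_per_second = (light_per_hour + full_per_hour) / 3600
```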

 

MMS2012 – Operations Manager And Orchestrator: Better Together

Speaker: Brian Wren, Principal Knowledge Engineer, Microsoft.

This is a packed room, and it’s one of the bigger rooms.  Obviously a very popular topic.  A quick poll by the speaker: Very few people in here with Opalis/Orchestrator knowledge.  Most of the audience are OpsMgr experienced.

Quick run through of the two products (skipping OpsMgr in my notes)

Orchestrator has Orchestrator Database and Runbook Server.  Runbook runs actions across applications.  Workflows process on runbook server.  Requires access to remote machines – very difficult for OpsMgr MP to do.  Relatively few complex workflows.  So OpsMgr monitors and Orchestrator does stuff.

We can integrate the two products.

Operations Manager Integration Pack

A runbook can use OpsMgr.  Standard activities:

  • Get Alert
  • Get Monitor
  • Create Alert
  • Update Alert
  • Monitor Alert
  • Monitor State
  • Start Maintenance Mode
  • Stop Maintenance Mode

The monitor action in a runbook causes the executing runbook to sit there waiting for something to happen, e.g. if an alert of certain criteria appears, then continue execution of the runbook.

There is also a Start SCOM Task action.  These are the actions you see in the Tasks pane in the OpsMgr console.

Orchestrator Management Pack

Allows OpsMgr to reach into Orchestrator.

Standard:

  • General health monitoring
  • Create Alert activity for runbook failures

Extend

  • Start a runbook
  • Get information about a runbook

This is made possible by a MP by Infront Consulting.

Extension and Automation Options

http://orchestrator.codeplex.com/releases/view/82959 has a library of cmdlets for Orchestrator because it doesn’t have any apparently.

Demo

Using his own MP instead of the Infront one.  He shows two runbooks that are being monitored by OpsMgr.  The Infront MP checks the last execution of a runbook for its health.  You can launch a runbook via an OpsMgr task. 

Scenarios

  • Working with alerts
  • Recoveries
  • Tasks and Runbooks

Monitor Alert

A runbook monitors OpsMgr for an alert(s).  When the alert comes in, Orchestrator does something.  For example, a critical alert comes in and Runbook can do some complex notification tasks, e.g. if nobody responds in 20 minutes, then do something. 

MSFT no longer investing in connectors for OpsMgr.  Instead they are investing in Integration Packs for Orchestrator to implement this functionality instead – allows more complex tasks.

Monitor State

Expected you won’t use it much.  Monitors the state of objects in OpsMgr.  One scenario: an error occurs to a DB in a distributed application.  With that rolled up state change, we can kick off a runbook in Orchestrator.

Demo

He forces an error to happen.  Now we look at the runbook.  We’re looking for new alerts that come from the MP that will soon detect the error and create an alert.  Now he detects who owns the faulting app.  This could be a query of the Service Manager CMDB.  It’s a SQL query.  He now automatically sets the owner of the alert in OpsMgr using that data and the ID of the alert.  He now checks what time of day it is, and then sends out the appropriate notification.  This is actually another runbook. 

He pulls in data from the 1st runbook.  It send an email.  If that fails, it will create an alert in OpsMgr to say that there’s a problem with the notification system.

In theory, you could then track the alert to see if anyone does work on it in a predefined time.  If not, you could escalate the alert.

Caution

Be careful of automated actions that do recoveries.  You don’t want to blindly reboot some machine every time it does X, e.g. bouncing machine, someone disables an app without maintenance mode, etc.  What if the thing autoresolves during the runbook execution?  The runbook will continue to run.
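One way to guard against the autoresolve case is to re-check the alert right before the destructive step. Here is a minimal Python sketch of that pattern. The function names and the "Closed" state string are placeholders of mine; in a real runbook the check would be a Get Alert activity against OpsMgr.

```python
def recover_if_still_open(alert_id, get_resolution_state, recover):
    """Run recover() only if the alert is still open when we get here."""
    if get_resolution_state(alert_id) == "Closed":
        # The problem went away while the runbook was running: do nothing.
        return "skipped"
    recover()
    return "recovered"

# Usage with stand-ins for the OpsMgr lookup and the recovery action.
states = {"alert-1": "New", "alert-2": "Closed"}
reboots = []
result_open = recover_if_still_open(
    "alert-1", states.get, lambda: reboots.append("alert-1"))
result_closed = recover_if_still_open(
    "alert-2", states.get, lambda: reboots.append("alert-2"))
```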

Automated recovery option 1

Runbook monitors for alert in OpsMgr.  Beware of having lots of concurrently running monitor runbooks because it won’t scale out that way.

Create the recovery in the OpsMgr MP to run the runbook in Orchestrator as required.  It’ll use the Orchestrator web service to start the runbook over the network.  It’s more difficult to set up than the monitor alert runbook.

Wow, this room is 95% full.  Very popular topic.

Demo

There’s a distributed app. Any time there’s an error, he wants to send a notification to his helpdesk.  He loses me here with XML MP authoring – a pity.  Interesting, he appears to check the health of the object in OpsMgr from the runbook before bouncing the failed service.  He then checks the state of the object after a 5 minute wait.  If not healthy … there’s more.

Running Tasks From a Runbook

Runbooks run across the network.  OpsMgr tasks run on the local agent.  Runbook couldn’t do IPConfig but a task could.  You can run an OpsMgr task from a runbook, wait for execution, and suck in the resulting data into the runbook.

Demo

Back to before.  Now the runbook is going to reset a cache in the app to fix the issue.  It’ll be done using an OpsMgr task.  The task is actually a POSH script. 

Elias Khnaser, Honorary vmLimited Ambassador, Talks About Shared Nothing Live Migration

Back in 2009, Elias Khnaser posted a very badly informed article on InformationWeek on why you shouldn’t deploy Hyper-V.  I gave it a good bashing, tearing down his points one by one with actual facts.  Well, just in time to be hired by Tad, Elias is back and at it again!

First off, let’s look at the title of the article:

Shared-Nothing Live Migration: Cool, But Not a Game-Changer

Hmm, I have to disagree.  No one else does this right now and it’s a real problem for some.  Think of a large data centre going through a hardware or network refresh.  They can’t afford down time while they export, carry, and import VMs.  They want to be able to move those VMs with the minimum of downtime, and maybe even eliminate downtime.  Shared Nothing Live Migration achieves this.

… entertain this scenario for me: a VM with 1 TB virtual disk … wait, Eli, you are not realistic, you might be thinking … fair enough — what about a VM with 500GB virtual disks? Moving that amount of data over a 1 GB Ethernet or even a 10 GB Ethernet is not quick or feasible in most environments …

Firstly, the vast majority of VMs are small.  And while Windows Server 2012 Hyper-V can support 64 TB VHDX, they will be the tiny minority.  And to be honest, not only will these size VMs be few and far between, but I’d expect them to run on clustered hosts with shared storage so Shared Nothing Live Migration wouldn’t be needed … except for well planned and scheduled migrations to another part of the data centre.

If you do have lots of 1 TB VMs to move around, then you’re a very large data centre and you’ll have lots of budget for big networking such as Infiniband with RDMA to speed things along.  For the rest of us:

  • DCB and converged fabrics
  • Everything from 1 GbE to Infiniband
  • RDMA
  • QoS
  • Many ways to architect our networking based on our needs

In my opinion, the market that will make most use of Shared Nothing Live Migration are the public clouds or hosting companies.  To keep costs down, many of them are using non-clustered hosts.  And from time to time, they want to replace hardware … a planned operation.  They can do this with this new Hyper-V feature.  And I can speak from experience: most hosted VMs are very small and would cause no issue to move, even over a 1 GbE network.

… this would only be used as a maintenance technique which is what it was slated for anyway.

Of course!  As a virtualisation expert, Elias, I’d trust that you know the difference between high availability (HA – normally reactive) and Live Migration (LM – normally proactive and planned).  But if one is stuck with working with vSphere standard, then one probably never gets to implement these features because of their high vTax.

To think that you will constantly be copying large virtual disks between hosts is not practical and is not scalable.

Really?  Elias have you even looked at why Shared Nothing Live Migration exists?  It’s not there for load balancing or as an alternative to Failover Clustering HA.  It is there to allow us to do strategic moves of virtual workloads from one cluster to another (or a standalone host), one standalone host to another (or a cluster), from one part of the data centre to another, or even from a private cloud to a public one.  If you’re doing this all of the time with all of your VMs then you need to take a long look at yourself and your planning.

… you cannot use high availability with this feature, and that makes sense since you need a point of reference for HA to work. How can you recover a VM when its files are on the host that failed?

You’re confused and you’re wrong:

  1. You don’t need Shared Nothing Live Migration within a cluster because, strangely enough, there is shared storage in a cluster. With the VM’s storage on a CSV, you don’t need to move the VHD(X) from one host to another, or from one storage device to another, to LM a VM from one host to another.
  2. You can use Shared Nothing Live Migration to move a workload from/to a cluster.

Live Migration (proactive VM move) is not HA (reactive failover).  Yes, LM has been separated from Failover Clustering, but they are certainly not mutually exclusive.  And anyone who sees Shared Nothing Live Migration as an alternative to HA needs to reconsider their career path.

I expect that when VMware does feature catch up with Windows Server 2012 Hyper-V, Elias might have a change of heart regarding Shared Nothing vMotion ;-)  But until then, I expect we need to beware of bogus nastiness and stay vmLimited.


Windows 8 Names, Editions, and Features

Microsoft confirmed the names of the Windows 8 desktop editions this week, as well as confirming the name of the server operating system (Windows Server 2012).

As expected, the desktop OS will be called Windows 8.  There will, in fact, be four editions of Windows 8, and not three as the media are proclaiming:

  • Windows 8: The home edition of the operating system.
  • Windows RT: Formerly known as Windows on ARM (WOA), this is the OEM-only OS that you will get prebuilt on an ARM tablet
  • Windows 8 Pro: The edition of Windows 8 that you will buy for a business
  • Windows 8 Enterprise: This is the edition with all the business features that is available to those who license their desktops with Software Assurance (SA)

A few thoughts:

  • Thankfully MSFT has consolidated Starter, Home Basic, and Home Premium into Windows 8, and they’ve done that with Ultimate into the Pro edition.
  • Windows 8 is an edition of Windows 8.  Really!?!?!  That won’t be confusing, not one bit! (there is some sarcasm in there)
  • What does Windows RT mean?  It’s named after the WinRT development subsystem in Windows 8 that enables Metro apps.  It would be like calling it Windows .Net or Windows Javascript.  To me, it’s a bit techie for the targeted consumer audience.

Microsoft gave us a table to compare the features of Windows 8, Windows RT, and Windows 8 Pro:

| Feature name | Windows 8 | Windows 8 Pro | Windows RT |
| --- | --- | --- | --- |
| Upgrades from Windows 7 Starter, Home Basic, Home Premium | x | x | |
| Upgrades from Windows 7 Professional, Ultimate | | x | |
| Start screen, Semantic Zoom, Live Tiles | x | x | x |
| Windows Store | x | x | x |
| Apps (Mail, Calendar, People, Messaging, Photos, SkyDrive, Reader, Music, Video) | x | x | x |
| Microsoft Office (Word, Excel, PowerPoint, OneNote) | | | x |
| Internet Explorer 10 | x | x | x |
| Device encryption | | | x |
| Connected standby | x | x | x |
| Microsoft account | x | x | x |
| Desktop | x | x | x |
| Installation of x86/64 and desktop software | x | x | |
| Updated Windows Explorer | x | x | x |
| Windows Defender | x | x | x |
| SmartScreen | x | x | x |
| Windows Update | x | x | x |
| Enhanced Task Manager | x | x | x |
| Switch languages on the fly (Language Packs) | x | x | x |
| Better multiple monitor support | x | x | x |
| Storage Spaces | x | x | |
| Windows Media Player | x | x | |
| Exchange ActiveSync | x | x | x |
| File history | x | x | x |
| ISO / VHD mount | x | x | x |
| Mobile broadband features | x | x | x |
| Picture password | x | x | x |
| Play To | x | x | x |
| Remote Desktop (client) | x | x | x |
| Reset and refresh your PC | x | x | x |
| Snap | x | x | x |
| Touch and Thumb keyboard | x | x | x |
| Trusted boot | x | x | x |
| VPN client | x | x | x |
| BitLocker and BitLocker To Go | | x | |
| Boot from VHD | | x | |
| Client Hyper-V | | x | |
| Domain Join | | x | |
| Encrypting File System | | x | |
| Group Policy | | x | |
| Remote Desktop (host) | | x | |

A few thoughts on the features:

  • BitLocker and BitLocker To Go are in the Pro edition for the first time.  Excellent!  This has been badly needed for years.  Now disk encryption will be built-in and available to all Windows 8 desktops and laptops in the business. 
  • However, BitLocker and BitLocker To Go are not in the ARM tablet (Windows RT).  That clearly says to me that Windows RT is not suitable for an organisation that needs encrypted mobile devices.  The ball is firmly in Intel’s court, where you’ll probably be able to run either the Pro or Enterprise editions.
  • Client Hyper-V is in the Pro edition.  You have no excuses not to try it now!  It’s great news for admins, developers, and testers.

Last night Microsoft gave us some more detail on Windows 8 Enterprise features.  This list is not complete:

  • Windows To Go: Boot Windows 8 on a USB 3.0 stick.  It gives you a portable operating system/environment that you can use at home, on the go, and enables BYOD with a corporate build on the stick.
  • DirectAccess: VPN-style connectivity without a VPN client, which enables a user to access Internet/local resources as well as securely using remote corporate resources.
  • BranchCache allows users’ PCs to cache files, websites, and other content from central servers, so content is not repeatedly downloaded across the wide area network (WAN). When used with Windows Server 2012, Windows 8 brings several improvements to BranchCache to streamline the deployment process, optimize bandwidth over WAN connections and ensure better security and scalability.
  • AppLocker can help mitigate issues by restricting the files and apps that users or groups are allowed to run.
  • VDI enhancements: Enhancements in Microsoft RemoteFX and Windows Server 2012 provide users with a rich desktop experience with the ability to play 3D graphics, use USB peripherals and use touch-enabled devices across any type of network (LAN or WAN) for VDI scenarios.
  • New Windows 8 App Deployment: Domain joined PCs and tablets running Windows 8 Enterprise will automatically be enabled to side-load internal, Windows 8 Metro style apps.  This will be important in an enterprise environment.

You can only access Windows 8 Enterprise if your existing Windows 8 Pro is covered by Software Assurance (included in some programs such as OVS).  Licensing benefits include:

  • Windows To Go Use Rights: With Windows To Go use rights under Software Assurance, an employee will be able to use Windows To Go on any company PC licensed with Windows SA as well as from their home PC. Additionally, through a new companion device license for SA, employees will be able to use WTG on their personal devices at work.
  • Windows RT Virtual Desktop Access (VDA) Rights: When used as a companion of a Windows Software Assurance licensed PC, Windows RT will automatically receive extended VDA rights. These rights will provide access to a full VDI image running in the datacenter which will make Windows RT a great complementary tablet option for business customers.  This is not new, but a continuation of an existing right.
  • Companion Device License: For customers who want to provide full flexibility for how employees access their corporate desktop across devices, MSFT are introducing a new Companion Device License for Windows SA customers. For users of Windows Software Assurance licensed PCs this optional add-on will provide rights to access a corporate desktop either through VDI or Windows To Go on up to four personally owned devices.

Service Manager 2012 “Service Ticketing”

Import Management Packs

  • Service Manager CMDB can become aware of your environment from OpsMgr if:
      • You import the MP in OpsMgr
      • AND import the MP in Service Manager
  • ConfigMgr data is pulled in, including primary devices for users
  • AD
  • Orchestrator runbooks are also importable: LOB and 3rd party management tools

Other options:

  • Import files
  • Write/buy 3rd party connectors

Some sets of data can come from multiple sources.  All that’s mapped into one object in the CMDB. 

Self Service Portal Features

Service Catalog, Silverlight web part hosted in SharePoint:

  • Role based access
  • Users fill forms to create service requests
  • Dynamic forms

Help Articles and more

Supported Configurations:

  • SharePoint site and WCS (web content server) co-located with SM management server
  • SharePoint site and/or WCS remote from SM management server

Can use SharePoint Foundation 2010 or Enterprise.  Can reuse existing SP farms.

Demo

A user wants access to an app and fills out a form requesting it and gives a business case.  A ticket is created, and awaits an approval/rejection.  The helpdesk admin can see the ticket with available actions in the portal.  Click approve and the automated activity does the work, in this case adding the requestor to a security group in AD.

He browses the now accessible web app.  But it crashes.  So now he opens an incident ticket. 

SLA Capabilities

  • Features calendars, business hours, holidays.  SLA metrics in the box.
  • Service level objectives are supported for all work items.  Specify target and warning thresholds.
  • Notifications when you are about to breach, or have already breached, SLAs.
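
As a rough illustration of how a target and a warning threshold combine, here is a minimal sketch that classifies a work item.  It is not Service Manager code: the function name is hypothetical, I am assuming the warning threshold means "this long before the target", and it ignores calendars, business hours, and holidays.

```python
from datetime import datetime, timedelta


def slo_status(created: datetime, now: datetime,
               target: timedelta, warning: timedelta) -> str:
    """Classify a work item as 'ok', 'warning', or 'breached'.

    target  -- time allowed before the SLA is breached
    warning -- assumed lead time before the target when a warning starts
    """
    elapsed = now - created
    if elapsed >= target:
        return "breached"
    if elapsed >= target - warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    opened = datetime(2012, 4, 18, 9, 0)  # illustrative timestamps only
    for hours in (1, 3.5, 5):
        state = slo_status(opened, opened + timedelta(hours=hours),
                           target=timedelta(hours=4), warning=timedelta(hours=1))
        print(f"{hours} h after opening: {state}")
```

A notification workflow would then fire when an item moves from "ok" to "warning" or from "warning" to "breached".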

Demo

He opens the previous incident.  We can see there is an SLO (service level objective) in the form of time left until SLA is breached.  This is defined in Administration, Service Level Management, Service Level Objectives. 

 

 

Visio Management Pack Designer (VMPD)

Speakers: Brian Wren and Baelson Duque, MSFT.

This is a new way to author management packs for System Center 2012 Operations Manager. 

Challenges

  • Creating MPs takes too long
  • Difficult to maintain best practices
  • Difficult to create a model to manage an app

The old R2 Authoring Console was a dog IMO.

Features

  • Create custom monitoring with minimal effort
  • Solution for offline management pack creation
  • Visual design tool

What the VMPD is Not For

  • Editing existing management packs
  • Deeply advanced customer scenarios

VMPD Shape Types

  • MP Modelling: Represent components of your app
  • MP Rollup: Connect components and monitors
  • MP Monitoring: Monitors and rules

Patterns:

  • MP modelling single server patterns: application components with a single type of server
  • MP modelling distributed patterns: Multiple types of server

Demo

Prereq: It requires Visio 2010 Premium edition.

You start off with a blank diagram with a management pack shape.  A shape data sheet gives you properties of the shape – visible when you click on the shape.  Here we can specify what versions of Windows the MP will support.  This is a discovery.

In MP modelling we have things like the server component (e.g. SQL Server Reporting Services) shape.  Its data sheet allows us to do discovery using “how to find”: registry key, value, Windows Server Role, and WMI query.  The Affect Computer Health setting allows you to roll the health of this server component up to the computer, e.g. the server role is red therefore the computer is red.  RunOn allows you to optionally schedule when the discovery runs.

Under a server role, you place a server component(s).  You can use lines/arrows to dictate health roll up, e.g. “worst of this component”. 
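
The “worst of” roll up is easy to picture in code.  This is only a sketch of the idea, not what the generated MP contains: the `Health` enum, the `worst_of` function, and treating an empty set of children as green are all my own assumptions.

```python
from enum import IntEnum
from typing import Iterable


class Health(IntEnum):
    """Health states ordered so that a higher value means worse health."""
    GREEN = 0
    YELLOW = 1
    RED = 2


def worst_of(states: Iterable[Health]) -> Health:
    """Roll up child health: the parent takes the worst child state.

    Assumption: a parent with no children is reported as GREEN.
    """
    return max(states, default=Health.GREEN)


if __name__ == "__main__":
    # Hypothetical components under one server role.
    components = {"web front end": Health.GREEN, "reporting": Health.RED}
    role_health = worst_of(components.values())
    # With Affect Computer Health enabled, the computer follows the role.
    computer_health = worst_of([role_health])
    print(role_health.name, computer_health.name)
```

One red component turns the role red, and the role in turn turns the computer red, which is the behaviour described for Affect Computer Health.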

A Windows Performance Counter Monitor is added.  You specify the object and counter as well as the instances of that counter.  You can alert or you can alert and collect data.  You can create a performance view for the console.  You can optionally save your data to the data warehouse.  And you can create a linked report!  This is nice. Me want now.  You can even set the monitor to only run on a schedule, e.g. why monitor LOB app performance during down hours?  You can copy/paste the monitors to quickly expand the MP.

An event monitor is created for an event ID and source.  You can set it to trigger after X occurrences in Y seconds. 
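
The “X occurrences in Y seconds” trigger is a sliding window count.  Here is a minimal sketch of that logic; the class and method names are hypothetical, it assumes timestamps arrive in non-decreasing order, and it says nothing about how OpsMgr implements the monitor internally.

```python
from collections import deque


class RepeatedEventMonitor:
    """Trigger when `occurrences` events fall within `window_seconds`."""

    def __init__(self, occurrences: int, window_seconds: float) -> None:
        if occurrences < 1 or window_seconds <= 0:
            raise ValueError("occurrences must be >= 1 and window_seconds > 0")
        self.occurrences = occurrences
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps of events still inside the window

    def record(self, timestamp: float) -> bool:
        """Record one matching event; return True if the monitor triggers."""
        self._times.append(timestamp)
        # Drop events that are older than the window relative to this event.
        while timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) >= self.occurrences


if __name__ == "__main__":
    monitor = RepeatedEventMonitor(occurrences=3, window_seconds=10)
    for t in (0, 4, 9, 30):
        print(t, monitor.record(t))
```

Events that are spread too far apart never accumulate, so a single stray error does not raise an alert.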

You can use patterns to create a composite shape: a set of shapes that you are frequently reusing.  You can add your own ones via a stencil.

You can then generate an MP and that does all the XML in the background for you.

Schedule

CTP very soon.