MMS2012 – System Center 2012 Monitoring and Operations Tips and Advice

Speaker: Gordon McKenna and Sean Roberts, Inframon

I’m live blogging this session so hit refresh to see more.

Private Cloud MOC and Certification

New exams and certifications.  70-246 Monitoring and Operating a Private Cloud.  70-247 Configuring and Deploying a Private Cloud.

  • MCSA + 70-246 + 70-247 = MCSE: Private Cloud
  • 70-640 + 70-642 + 70-646 = MCSA

The two training courses are available now.

10750 – Module 4: Monitoring Private Cloud Services

To do J2EE APM you download an opensource Java bean.  OpsMgr network monitoring is network monitoring for server guys. Existing solutions for network guys won’t be replaced.  OpsMgr network monitoring gives the server guys the tools to find a troublesome link/device and enable them to tell the n/w guys.  Port stitching figures out what ports your monitored servers are talking to and shows that to you.

MP Templates are a good starting point.  Check out the new Visio tool and the MP Authoring tool (latter requires significant time investment). 

Distributed Application Monitoring

A new distributed application monitoring tool.  3 types of line:

  • Reference relationship: no impact … dotted line
  • Hosted relationship, e.g. database hosted by database instance.  Health will roll up.
  • Containment: Group of servers.  With aggregate rollup monitor, server goes red, group goes red.

Note that default management pack is no longer there!  Forces you to save your authoring in a suitable MP.  Yay!

Health rolls up to 1 of 4 things:

  • Availability
  • Performance
  • Configuration
  • Security

We can configure the rollup to go up to a level of our choice, e.g. don’t roll up or roll up to top level of distributed application.

  • Presentation Tier – anything user sees
  • Business Tier: back or middle tiers.

Creates a service level dashboard for the new MP based on the distributed app model.  Add the OpsMgr dashboard viewer and adds the webpart into SharePoint.  Grab the URL of the dashboard link in OpsMgr and edit the web part properties to paste the Dashboard link.  Now the SLA dashboard appears in SharePoint.

Tips

  • Always build out service models in the DAD (distributed application developer).  Good eye candy wins prizes!  I concur – have personal experience of that.
  • Use three tier service models that match your business functions
  • Use MP templates for true pro-active monitoring
  • Use APM to stop developer VS IT Pro arguments
  • Create a dedicate SharePoint portal for dashboard and reports

10750 – Automating Incident Creation, Remediation, and Change Requests

Orchestrator components:

  • Orchestration console on IIS (Silverlight)
  • Runbook server(s): usually local to servers
  • Management server running Runbook designed and deployment manager
  • SQL DB

Download integration pack, register it with management server, deploy IP to runbook servers, open Runbook Designer to use it.

Install OpsMgr R2 integration pack  Define a connection to the OpsMgr server.  You then have the actions available to use.  Do the same for Service Manager.

Demo with web service crashing and auto remediation.  OpsMgr detects event.  Orchestrator waits for that event.  It tries to restart the event.  Creates ticket to auto restart IIS.  If that fails, it lodges a ticket in Service Manager for manual OK to reboot the server.

Opens up Runbook designer.  Browses into Runbooks and we see the book in question.  Runs the runbook tester, toggles break point, and runs it.  Now he stops the website.  The runbook kicks off, and they step through the actions.  We get into Service Manager where there’s a change request for a reboot.  That’s approved and the web server is rebooted.

Note: there is a maximum of 50 running runbooks on a Runbook Server.

When configuring a runbook

  • Handle failure and warning links
  • Replace the default strings
  • Change link colours
  • Limit the number of activities for each Runbook
  • Enable runbook logs to an external file

10750 – Module 7: Problem Management In The Private Cloud

Incident = one time occurrence that can be handled by an operator.  Problem is more complex, e.g. engineering issue that requires escalation.

Information stored in Problem Log in Service Manager.  Another demo of automated problem record creation.  An alert will come in in OpsMgr for a DB that goes offline.  The alert auto pipes in as an incident in Service Manager.  Many instances of it in the demo.  It’s a problem.  A problem record is manually created from these incidents.  He fills in information in the New Problem form. 

Now he kills the DB again. 

There’s a runbook that is looking for occurrences of that incident.  It’ll get the service details and the incidents for this service, output data to text file, count lines, if there’s more than X occurrences then it will create a problem based on the data in the file.  This workflow replaces the above manual task for this particular incident.

Hints and tips

  • Target object and classes and use groups to override
  • Be aware of the inheritance for each class
  • Limit the size and activity of a runbook
  • Download and use the Cloud Processes Pack.  Create request driven processes for many cloud services functions such as project, capacity pools, and virtual machines.  Can introduce the concept of charge back billing.  Supplies cloud service runbooks.  Project = collection of capacity pools.

 

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.