Planning Works Out

If you’ve ever seen the back of a server rack that I’ve cabled then you’d never let me even plug in a power lead to a kettle.  I am horrible at cabling.  Simply awful at it.  Those probably aren’t strong enough phrases to be honest.  That’s one of the reasons I like blade/SAN technology; there’s a minimal amount of cabling and it’s all usually done by an expert engineer who’s installing the blade chassis and the SAN.  When we put in our gear, I made sure it was!

The engineer did a nice job of labelling everything.  All lead placements were planned.  We’ve a network mesh going back to our access switches from the blade Ethernet virtual connects.  There’s a divergent path between the blade fibre virtual connects, the fibre switches and the SAN chassis units.  Each server has dual channel HBA mezzanine cards.  And power is split between circuit A and B in each rack.  That means we can lose a circuit and still be operational.  Adding servers doesn’t require more cabling; only adding a chassis does, and then I’ll get the engineer to do the work 🙂
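If you want to sanity check a layout like this on paper before anyone plugs anything in, a few lines of Python will do.  This is only a rough sketch: the device names and the `feeds` mapping are made up for illustration and are not our real rack inventory.  It simply asks, for each circuit, which devices would go dark if that circuit alone failed.

```python
from typing import Dict, List, Set

def devices_down(feeds: Dict[str, Set[str]], failed: Set[str]) -> List[str]:
    """Return the devices left with no power feed once `failed` circuits are lost."""
    return sorted(d for d, circuits in feeds.items() if not (circuits - failed))

def single_points_of_failure(feeds: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each circuit to the devices that go dark if that circuit alone fails."""
    circuits: Set[str] = set().union(*feeds.values()) if feeds else set()
    result: Dict[str, List[str]] = {}
    for circuit in sorted(circuits):
        down = devices_down(feeds, {circuit})
        if down:
            result[circuit] = down
    return result

# Hypothetical rack: every device is fed from both circuit A and circuit B.
rack = {
    "blade-chassis-1": {"A", "B"},
    "san-chassis-1": {"A", "B"},
    "fibre-switch-1": {"A", "B"},
}

# An empty result means no single circuit failure takes anything down.
print(single_points_of_failure(rack))
```

Anything fed from only one circuit would show up against that circuit in the result, and that is the lead you’d want re-cabled before go-live.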

Note: We went with Brocade mezzanine cards instead of the Emulex ones.  At my last job we had 128 HP BL460c blades with Emulex HBAs.  I’d say at least a quarter of the HBAs had to be replaced in the month before we went into production.  I spoke with an engineer from the reseller recently and he said they were still regularly failing.  We haven’t had any issues with the Brocade ones.

We put the power and fibre channel fault tolerance to the test today.  We needed to replace 2 Power Distribution Units (PDUs).  They have management boards on them that the data centre doesn’t use.  Instead, the data centre has an out-of-band management system.  The management boards faulted so we had annoying alarm lights and sirens.  We often bring people in for a show’n’tell during pre-sales so alarms are not good, even when they mean nothing, as was the case here.  The data centre power management system and our OpsMgr 2007 HP Management Packs would have told us if we had a real power issue.

We scheduled the replacement for this afternoon.  Outages are out of the question for the mission critical services we provide to our managed server hosting customers.  We swapped out the PDUs with the alarms.  Not a single flicker of a problem was seen.  I watched the OpsMgr console for alerts while I was logged into a few VMs (stored on the SAN) running tests.  The MPIO fault tolerance (Windows Server 2008 SP2) and the power fault tolerance of the SAN/Blades worked.

I was pretty confident there wouldn’t be an issue.  Everything was tested by the HP engineer when we did the installation last year.  All the hardware was looking healthy and the “board” was green in OpsMgr 2007.  This just shows how a little bit of planning before you plug things in and a little testing afterwards work in your favour.
