{"id":9133,"date":"2008-07-21T10:42:00","date_gmt":"1999-11-29T20:00:00","guid":{"rendered":"https:\/\/aidanfinn.com\/?p=9133"},"modified":"2008-07-21T10:42:00","modified_gmt":"1999-11-29T20:00:00","slug":"why-i-dislike-ibm-director","status":"publish","type":"post","link":"https:\/\/aidanfinn.com\/?p=9133","title":{"rendered":"Why I Dislike IBM Director"},"content":{"rendered":"<p>I inherited a number of IBM servers with this job.\u00a0 They perform a critical business service for our customers.\u00a0 Luckily, the architecture we use is <em>very<\/em> fault tolerant.\n<\/p>\n<p>Over the weekend we deployed updates in a staged manner to our production network &#8211; after testing of course.\u00a0 On Sunday morning, I woke up to an email from System Center Operations Manager 2007 (gotta love it!) saying that one of the servers we patched on Saturday night was not responding to agent heartbeat requests.\u00a0 Uh oh!\u00a0 This was one of those IBM boxes.\u00a0 We have triplicate redundancy so I knew I could let it wait until Monday morning.\u00a0 To be safe, I suspended updates for the remaining production boxes.\u00a0 I didn&#8217;t suspect an update but I wasn&#8217;t taking any chances.\n<\/p>\n<p>I came into the data centre this morning and found the server sitting on a BIOS prompt.\u00a0 Hmm.\u00a0 That&#8217;s not good.\u00a0 It had detected a problem with the external disk storage and was waiting for administrator approval to boot up.\u00a0 What?\u00a0 Hello?\u00a0 Note: the failure was nothing to do with the server-internal boot disks.\n<\/p>\n<p>I checked the Direct Attached Storage (DAS) and it was all green.\u00a0 I booted up the server and saw the DAS was not being connected.\u00a0 I shut down the server and powered down the DAS.\u00a0 I powered up the DAS and was greeted with beeping &#8230; non-stop beeping.\u00a0 The front panel now showed a chassis alert on the DAS and one of the disks in the RAID5 array was alerting as well.\u00a0 Huh!?!\u00a0 Why 
didn&#8217;t it tell me this when the server already knew there was a problem?\n<\/p>\n<p>I powered up the server.\u00a0 Now it didn&#8217;t prompt me.\u00a0 But it did tell me the external disk was degraded.\u00a0 Fine, the hardware knows there&#8217;s a problem.\n<\/p>\n<p>I logged in and found there were no hardware logs or any sort of interface into the IBM Director agent.\u00a0 Nothing.\u00a0 Sweet F.A.\u00a0 The consultants (before my time) who installed the hardware had set up an IBM Director console on another box for centralised monitoring.\u00a0 I logged into it and sure enough, there were no alerts.\u00a0 Hold on a *beep*ing minute; the hardware knows there&#8217;s a problem but the monitoring agent from the hardware vendor doesn&#8217;t have a clue?\n<\/p>\n<p>OK, maybe it was the central console at fault?\u00a0 I&#8217;ve never trusted it.\u00a0 I went on to the SCOM console but found no alerts or health degradation on the IBM Director monitors.\u00a0 That made it certain in my mind: the IBM Director agent was clueless.\n<\/p>\n<p>So here&#8217;s my summary of why I would recommend that people steer clear of IBM hardware in an enterprise deployment, based on this little story:<\/p>\n<ol>\n<li>The DAS failed to show an alert on the front panel or disk despite the server not being able to boot up because it detected a failure.\n<\/li>\n<li>The IBM Director agent failed to report an incident of any kind.\n<\/li>\n<li>There&#8217;s no user interface to the IBM Director agent on the server.\n<\/li>\n<li>A failure of a single disk in a RAID5 array in a DAS caused a server not to boot up.\u00a0 That&#8217;s just stupid.\n<\/li>\n<li>We&#8217;ve all heard that Lenovo are taking over the server and storage business.\u00a0 My experience of them with their support was awful &#8211; a call open for around 4 months, 2 months of that with the regional director taking a personal interest.<\/li>\n<\/ol>\n<p>I&#8217;m now left wondering how long I&#8217;ve had a failed 
disk on this server, considering it didn&#8217;t give any monitoring alert or visible notification until I reset the DAS chassis.\n<\/p>\n<p>How would HP handle this?<\/p>\n<ol>\n<li>The SIM agent would have alerted on this and shown it in the HP SIM log and in the SIM web page on the server.\n<\/li>\n<li>The HP SCOM management pack for SIM would have alerted and sent all of the required\/responsible administrators\/operators\/&quot;business owners&quot; a notification of the failure.\n<\/li>\n<li>The disk would have shown an alert light immediately.\n<\/li>\n<li>It&#8217;s unlikely that the server would have been prevented from booting up unless there was a complete failure of the boot disk.\n<\/li>\n<li>I would have had the storage back to a healthy state within 4 hours of opening a call with HP.<\/li>\n<\/ol>\n<p>That&#8217;s a very different experience and one you expect to have from enterprise-class servers and storage.\n<\/p>\n<p>EDIT\n<\/p>\n<p>As you can guess, I was concerned with the lack of h\/w monitoring that the IBM Director agent gave me.\u00a0 The horrid response from the MD was that we&#8217;d have to check that the logical disks in question were present on a daily\/manual basis.\u00a0 Yuk!\u00a0 I&#8217;d a better idea: let SCOM do the work for me.\u00a0 I&#8217;ve created a distributed application that includes all the dependencies I can think of for this service, including the presence and health of the logical disk in question.\n<\/p>\n<p>It was funny to see that the HP management pack allowed me to include discovered HP hardware objects but there were no classes for IBM hardware.\u00a0 Come on IBM; you gotta play better with others!\u00a0 Not everyone wants to buy consultancy-ware like Tivoli.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I inherited a number of IBM servers with this job.\u00a0 They perform a critical business service for our customers.\u00a0 Luckily, the architecture we use is very fault tolerant. 
Over the weekend we deployed updates in a staged manner to our production network &#8211; after testing of course.\u00a0 On Sunday morning, I woke up to an &hellip; <a href=\"https:\/\/aidanfinn.com\/?p=9133\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Why I Dislike IBM Director&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[9],"tags":[],"class_list":["post-9133","post","type-post","status-publish","format-standard","hentry","category-commentary"],"aioseo_notices":[],"jetpack_featured_media_url":"","amp_enabled":true,"_links":{"self":[{"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/posts\/9133","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=9133"}],"version-history":[{"count":0,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=\/wp\/v2\/posts\/9133\/revisions"}],"wp:attachment":[{"href":"https:\/\/aidanfinn.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=9133"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=9133"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aidanfinn.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=9133"}],"curies":
[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}