I’ve been using HP servers and storage most of the time since 2003. I’ve experienced their support through two channels: partner maintenance contracts and direct support contracts. Has it always been perfect? No, but I’ve gotten things sorted. Typically, the issue is resolved within 4 hours, which is perfect. What I love about HP hardware is how easy it is to manage. If you use their setup DVD, you can install your OS with all the HP management software and agents. This lets you configure every aspect of the hardware (with no sacrifices to the gods or black magic required) and the SIM agent will detect any hardware fault. You can use HP’s free software, their paid software or even their management pack for MOM 2005/OpsMgr 2007 to get alerts. Heck, the agent can be configured to directly send SMTP alerts. Each server has an HTTPS-based service running on TCP 2381 to allow you to inspect the exact hardware issue, part number and serial number. The HP event log also shows you a clear explanation of what’s happened.
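If you just want a quick sanity check that the agent’s web service on TCP 2381 is answering, something like the following Python sketch would do. The function name is my own and has nothing to do with HP’s tooling. All it tells you is whether something accepts a TCP connection on that port; it says nothing about hardware health.

```python
import socket


def port_is_open(host: str, port: int = 2381, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    2381 is the port mentioned above for the HP management web service;
    this only checks reachability, not what is listening there.
    """
    try:
        # create_connection resolves the name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `port_is_open("hp-server-01.example.local")` (a made-up hostname) returns `True` or `False`, which is handy for looping over a list of servers before you go clicking through each one in a browser.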
At work, we have 6 IBM X servers and 5 IBM DAS storage units. They were bought before I joined the company and installed by IBM and one of their Irish preferred partners (think of a shape to guess the name – no it isn’t a rhombus). The first thing I did was inspect the installation. I wasn’t familiar with IBM hardware or systems management so I didn’t know fully what to look for or expect.
Aside from the 13 configuration issues I found on this 6 server installation, e.g. the single domain controller on a mission critical service configured to use an external ISP’s DNS servers as its primary DNS server, I noticed I had no way of locally inspecting the health of the IBM hardware. I had no way of configuring disks. BTW, we spent endless hours fixing those 13 issues and I added Active Directory fault tolerance.
We use OpsMgr 2007 for health and performance monitoring. HP offers a management pack to integrate with the SIM agent on ProLiant servers. IBM claimed to offer a management pack for IBM Director agents. I searched high and low for it. A friend in Holland was also doing the same for a site he was in. Neither of us could find it. Every link on the IBM site was dead. I contacted a sales guy in Ireland. He sent me a link. It turns out IBM published it on their Intranet but not on the Internet. The link wouldn’t work. Eventually we got the MP after many emails.
One year ago we had a failed disk in one of the IBM DAS storage units. No worry; we had a support contract with IBM. Or I thought there should be no worry. I logged the call. Only after 1 week of stress, getting our directors involved and screaming at local IBM sales people, did we get our replacement disk. Here’s the really worrying bit. The IBM Director agent on the server connected to the DAS box did not pick up the failure. I discovered the failure when we rebooted the server and it hung on the POST to say there was an issue.
In the meantime we bought HP blades and a SAN. We had some memory board failures, etc. Each time, I got an alert from the SIM agent via OpsMgr 2007. We logged calls with HP via their portal and memory boards were replaced within 4 hours.
Back to April of this year. One of our 5 IBM DAS units went offline. One of our engineers logged the call. IBM support wanted DSA logs before they’d progress the call. Our engineer sent them in. IBM Support continued to ask for the logs for the following 2 months! In the meantime we had escalated the issue to local IBM staff. Every single person in IBM refused to send anyone out. We’d sent in the logs. We resorted to sending them to local staff members. After 2 months we finally got an engineer out. The MegaRAID controller firmware had a bug. I wanted it replaced so it was replaced. I wanted a complete resolution after 2 months of an outage caused by IBM hardware and “support”.
In the meantime, there was another memory degradation in an HP blade. An engineer from a reseller was sent out by HP within 3 hours. There was zero fuss or downtime (Hyper-V cluster).
A few weeks later we started updating firmware on the MegaRAID controllers to avoid this issue. The first one went OK. The second one failed. I got a message in POST about foreign configurations. I had a choice of importing or continuing. I didn’t know what to do – I’m not an IBM engineer. I googled but with no joy. Our storage was not visible on the server. I called IBM support expecting this to be a 1 minute conversation. Instead I was on the phone for 2 hours. The support engineer barely spoke English. He decided to have me go through a maze of POST configuration tools. It was clear from the delays in his instructions that he didn’t know the solution; he was searching for answers on an Intranet portal. I asked 3 times if he knew what he was doing. “Yes” was the answer. After the 3rd time I demanded to speak to his team leader. More excuses followed from him. If you know me, you can imagine what my temperament was like at this point. I actually got the guy flustered enough to get him to admit that no one on his team knew how to resolve this issue on the DAS unit. Stunning! I demanded an on-site engineer and one came out later that day. He pressed 1 button to import the foreign configurations in the POST and the issue was resolved. That’s all I wanted from the support desk ... do I press that button or not?
IBM said they’d come out to upgrade the rest of the firmware to make up for our experience. Fair enough. They did that within a few days. During the process a disk failed in one of the DAS units. Then I saw how my experience differed from theirs. They logged a call and a disk was out in a day.
A week later (last week) I was in the data centre to do some network engineering. I checked on the rack with the IBM gear. Uh-oh. Another disk failure in a DAS unit and another DAS chassis had an alert light. IBM Director picked up neither issue. I logged 2 calls; one for each issue. That was Thursday. A few hours later the IBM support desk called me. A replacement for the failed disk was not in stock in Ireland. We’d get a replacement 2 days later once it was shipped from Holland. What!!!! Our MD wasn’t happy. The MD went straight to IBM and complained. Suddenly the disk was going to be replaced the following morning. It seems to me that IBM was just delaying in some way to reduce shipment costs.
As for the alerting DAS chassis? It’s now the following Wednesday and IBM still hasn’t followed up. I sent in the DSA logs 17 minutes after IBM asked for them last Thursday. They started this rubbish about saying they hadn’t gotten them. Ah but boys, didn’t you see that I CC’d another of our engineers, our MD, and 3 people in IBM Ireland? I ain’t accepting the BS you’re using to delay action. According to an email from IBM, I should have used an FTP site to upload the logs as my first choice. I tried. Without logging in I had no access to the folder in question. I was given no credentials. So then I tried anonymous with my email address as my password (thank God for green screen education in college). I navigated to the folder but was refused permission to upload. The delaying rubbish about the logs continued up to Monday. Then an engineer called to ask for the logs. I exploded over the phone. I got onto his team leader (the same guy as before) and suggested that maybe the lot of them should be fired and that Lotus Notes was a pile of steaming ****. I wasn’t sending in logs again. It was done once and I told him he could get them from one of the 3 IBM people in Ireland that I’d CC’d. That went down well 🙂
30 minutes later the field service manager for IBM Ireland called me. More of the same. I really don’t care. “Would I go to a meeting to learn more about IBM?”. Why the f**k would I want to do that? I have no time for that BS. I don’t tolerate sales people; I don’t take their calls because I have no time for crap. JUST FIX THE DAMNED DAS BOX! He promised to forward the DSA logs to the support desk. That was 2 days ago. Nothing has happened since.
Oh sorry it has, that manager has tried to go above my head to our MD to get us out to talk about IBM. Oh you sad bugger. That was the wrong move. In fact, that pushed me over the edge. Trying to outmanoeuvre me while still not sending anyone out to fix the DAS unit is the sort of BS I don’t accept from anyone.
So here’s how I compare HP and IBM:
|Management of hardware
||Beyond awful – think BBC Watchdog