Ever have a really bad day? Or a bunch of bad days in a row? Me too. I’ve masked the names to protect the guilty.
For a very long time I have been aware that stress can take a toll on your health. But in my experience, it’s never been immediate like it has lately.
It started this past weekend with an on-call nightmare: I got two calls about Asterisk (not my specialty) on Saturday. One guy just needed it started. He had rebooted his machine and now it didn’t work. Easy fix. But the call ruined my lunch. By the time I was done, my frozen pizza was almost frozen again.
Later, just before a dinner I was really looking forward to, the shit hit the fan:
A major client experienced a catastrophe.
Here’s the timeline of events over the next hour:
- 6:27p Call from Answering Service - Client Engineer #1 from India
- 6:29p Call from Answering Service - Client Engineer #2 from India
- 6:43p Call from Answering Service - Client Engineer #1 from India
- 6:40p I called my backup - no answer - left message
- 6:43p I text my boss’ boss
- 6:45p I text my boss
- 6:51p I call my boss’ boss - left message
- 6:53p I call my boss - left message
- 7:02p Call from Answering Service - Client Engineer #1 from India
- 7:10p Call from Answering Service - Client Engineer #2 from India
- 7:12p I call our company president - told him I was working on it and can’t get a hold of anyone to help
It seems Client Engineer #3 (from New Jersey) pulled a cable at the Major Client data center and severed connectivity to over a dozen machines. I was being called by three different Major Client employees (and they were giving the answering service grief since they were all from India), and I was unable to contact them because one of the servers affected was the XMPP server.
I tried to call my backup to see if he could help me clear calls until I could figure out how to reach someone via Skype, but there was no answer. Then I escalated to my boss’ boss with a text that went unanswered. Then I tried my boss - since he knows the most about Major Client - but he wasn’t on-call - so he didn’t answer or call back either.
Finally, I called our company president in a panic. He wanted to know what I wanted him to do. I told him - nothing, I just wanted him to know there was a shitstorm brewing, and I was alone, in case somebody complained.
Long story short - my boss’ boss and I stayed up until 2am straightening out the mess.
And then Client Lead Engineer - who was on a plane coming home during all of this - called the answering service at 5:59am (AM!) to wake me and find out what happened. Then he kept me awake for an hour to watch him as he fixed the problem the way he knew how.
An hour later the pager went off to remind me I was on-call again. Sleep was a memory. Sunday was going to be shitty.
Monday wasn’t horrible and I got what I considered a proper amount of thanks for holding it together on Saturday. I was still a bit shaky and sleep-deprived, but I was back in the office.
Tuesday I had to be at our data center downtown to move our phone server from one rack to another. It should have been an easy task. I had planned all my cabling and showed up at the data center at 6:30a to make the switch at 7am.
Unfortunately, my stomach had other ideas. I barely made it to the bathroom at the data center when I arrived. I was nervous, running on fumes and had just chugged a Monster energy drink. The day wasn’t starting well at all.
So I get into the data center and start the move. Things are OK. In about 30 minutes I had the server moved and ready to power on. I pushed the power button and IM’d our office assistant. “All done. Please test.” She messaged back that she couldn’t get into the server. I double-checked my cables. I had link lights. Everything looked OK.
We waited. Still no access. Oh, shit. This is the phone server. It has to be up by 8am or the office will have no phones.
I triple-checked my cables against my documentation. I was at a loss. It was 8:15am. It was time to call my boss’ boss.
Now, I’m pretty sure I woke him up. But his first sentiment to me was “Chuck, this isn’t that hard.”
“Double-check your cables. Did you use new cables? Did you try to change cables?”
I felt insulted and stupid. Of course, I had tried different cables. Did he really think I tried nothing before calling him? After I admitted that I’d been working on it for an hour, he decided to stay on the line with me to help troubleshoot.
We went over everything. From start to finish.
“Did you drop the server?” No.
“Did you do anything that you suspect might have caused an issue?” No.
Again, I felt stupid and embarrassed. I told him that I was aware of how this situation makes me look, but I am certain that I did nothing that I didn’t plan to do. There were no obvious errors or detours in the plan I had made.
Eventually we discovered that the port I had plugged the public network cable into (which is how our office assistant would have reached the server) was not configured for the public network. Not my fault - his fault. After he fixed it and everything started working, he apologized. But the damage was done. I felt like jello and was ready to just leave.
Wednesday was another trip to the data center to move another server. This one went OK, but wasn’t without its moments of feeling belittled by my boss’ boss via phone. At least there were no problems, and we were able to get things back up quickly.
Finally, this morning, I had to run updates on the firewall at our data center at 7am, from home. OK, no problem. 7am hits and I press the button. Shortly thereafter I realize that I didn’t set enough downtime in our monitoring software. As the monitor lost communication with everything behind the firewall (because it was rebooting), I received 56 “down alerts”. 56 text message notification sounds. One after the other. The fucking phone wouldn’t stop.
I wasn’t sure if things had just gone to shit on me or not.
As I’m scrambling to try to not wake the house and shut up the phone at 7am and acknowledge the alerts, the box comes back up and I start to get “up alerts”. The connectivity was being restored. Another 56 text messages in a row.
At this point I am literally shaking like I’ve got fucking Parkinson’s. My hands are cold and I can’t keep my head still.
As things start to come back, I verify that things are OK now and send my email explaining why there were so many alerts.
Then I sit down in my chair.
I couldn’t decide if I wanted to shit, throw up, faint or die.
This was too much. Too many things in a row. Too many days in a row.
I can’t do this anymore.