A Year Ago Today…
Posted on Sunday, April 20th, 2008 under Employment
One year ago today, I was fired from a job as the Information Security Analyst at a local third party health plan administrator. This is my story…
I was working at a bank that had recently gone through a merger. In that merger, many of my friends and co-workers lost their jobs. It was a trying situation, but all of them eventually landed on their feet. One of them found work at a third party health plan administrator, and was in dire need of help. He set about recruiting me, and for many reasons, I took the bait.
When I arrived, I quickly realized how desperately in need of help he was. The situation was dire. The network was a mess. The hosts on the network were in horrific shape. End-of-life operating systems were still in production, some hidden away in closets. Win 95 (yes… 95) was hanging about. Likewise NT 4.0, of which some were original installs that had never been patched, even when those patches were readily available. Anti-virus installations were spotty at best, and of those installed, many were without up-to-date signatures. The firewalls had any-any-allow rules for both ingress and egress traffic. The built-in anti-malware modules were disabled, and logging was almost useless. While most users had actually been reduced to power user level, all users of the Citrix farm had local administrator accounts on the member servers. There were a lot more issues, but that’s enough for the story, and to list them all would be tedious, to say the least…
On March 12th, all the NT 4.0 servers and a few unpatched Win 2000 servers were infected with a backdoor trojan. My best guess and research pointed to W32/Rbot-AKU, though I was never able to confirm MD5 hashes or find adequate documentation. The only real clues were the presence of %SYSTEM%\system32\qtask.exe, %SYSTEM%\system32\drivers\Oreans.sys, and %SYSTEM%\system32\irdvxc.exe.
How the infection penetrated the network is still unknown. I suspect a home user logged into the Citrix farm as a local administrator brought it in, but I can’t be sure. The behavior of the trojan indicates that it could have simply been a remote infection by random scanning. The firewalls would have certainly allowed it.
This infection caused 3 days of network outage during my second week, when my primary technical contact/mentor was out of town. I was on my own trying to figure out firewalls I’d never worked on, a network about which I knew next to nothing, and business processes I’d not been trained on. There was, of course, no documentation. There was nothing in the way of traffic analysis software installed, and the firewalls had horrible logging and terrible connection management. I didn’t know what normal traffic was, or even if there was an issue with traffic. Several third party VPN connections were the source of the vast majority of management tension, so that’s where I focused my efforts for the first two days.
Finally, on the third day of complete outage, I realized that it was a virus. Not knowing the environment, I’d been troubleshooting the wrong thing. I finally found it, cleaned the infected servers I could clean, and powered off those I couldn’t, and the network came alive like magic. Though it wasn’t by design, the trojan on the infected servers was performing a DDoS attack on the firewalls. Their connection queue was constantly full, and they were dropping all traffic. They were simply unable to keep up with the huge volume of traffic being generated.
Years and years of never upgrading, never updating, never patching and pulling computers together from eBay and garage sales equaled infection by virus many and legion. Infection equaled downtime. Three days of downtime equaled hundreds of thousands of dollars lost and a major strike to the company reputation.
I bolstered the firewalls as best I could by tightening down the rule set, enabling the built-in IDS and IPS, and prioritizing VPN traffic. I set up scans to monitor the %SYSTEM% folder on all the Win32 devices that I could, but there were 6 domains with incomplete and misconfigured trusts, and I was unable to see all of them.
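For the curious, a check of that kind can be sketched in a few lines of Python. This is only an illustration, not the scan that was actually used: the function names are made up for the example, and it assumes each machine's Windows directory is reachable as an ordinary path (a mounted share, for instance). The three file names are the ones listed earlier in the story.

```python
from pathlib import Path

# Indicator files named in the post, relative to the Windows directory
# (what the post calls %SYSTEM%).
INDICATORS = (
    "system32/qtask.exe",
    "system32/drivers/Oreans.sys",
    "system32/irdvxc.exe",
)


def find_indicators(windows_root):
    """Return the indicator paths that exist under windows_root.

    Raises FileNotFoundError if the directory itself cannot be seen, so an
    unreachable machine is never mistaken for a clean one.
    """
    root = Path(windows_root)
    if not root.is_dir():
        raise FileNotFoundError(f"cannot see {root}")
    return [rel for rel in INDICATORS if (root / rel).is_file()]


def scan_hosts(roots):
    """Map each host label to its list of hits, or None if unreachable.

    `roots` is a hypothetical mapping of host label -> Windows directory path.
    """
    results = {}
    for host, root in roots.items():
        try:
            results[host] = find_indicators(root)
        except OSError:
            # Covers missing paths and access errors alike.
            results[host] = None
    return results
```

An empty list means none of the three files were found, while `None` means the machine could not be checked at all. Keeping those two outcomes apart matters in a situation like this one, where broken domain trusts left some machines out of view.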
Over the next few weeks, the IT team and I worked diligently to bring the anti-virus installations up to date and protect every machine on the network. We went in over the weekend and hit every machine by hand to make absolutely sure they were covered. Microsoft security patches were applied through this period as well, with some machines needing literally hundreds of patches. This had been done in large part prior to my arrival, but there was work yet to be done.
The trojan hit again on April 12th, but only briefly this time. One of the servers I had been unable to physically get to for cleaning, having been shut down in March, was brought back online, unbeknownst to me, for a software install. The outage lasted only an hour and a half, but it was enough to enrage the CEO, who informed us that anyone who touched a production server without approval would be fired on the spot. The good news was that our efforts to protect the network appeared to have worked. This was the only server with an infection that was affecting the network.
Tuesday, April 17th, I noticed another NT 4.0 server was infected. Though alarmed, I was relieved that no downtime had occurred as yet. The changes I’d made to the firewalls and the work we’d done to protect the machines on the network were still working. This, however, was an original install NT server, and patches simply did not exist to protect it. As far as I, or anyone else on site, knew, the trojan DoS symptoms could kick in at any time. I wrote my manager and the business owner of the server and informed them of the situation. I informed them that if I didn’t clean the server, I could not guarantee it wouldn’t bring the network down again. It had been infected for a few days so far with no effect, but given the lack of documentation, I couldn’t determine or predict how the trojan would behave. I concluded that we might be safe, but we might not be, and left the decision in their hands.
The business owner and IT Applications Manager gave me the go-ahead via e-mail to clean the server. I was in and out in 15 minutes. Following the cleaning, I scanned the network deeply with all the tools at my disposal to finally determine that the trojan was, once and for all, gone.
Wednesday, April 18th, my boss, the IT Infrastructure Manager, was fired by the CEO on direction from the Board for lack of confidence. The downtime was too much for them to tolerate… never mind the lack of patching, lack of updating, lack of anti-virus and end-of-life operating systems that he’d inherited from his predecessors. He was given a no-win situation, did the best he could, and was fired for his efforts. It’s ironic that just three months prior, he was lauded as a hero in the company for the progress he’d made and given a very healthy bonus.
Thursday, April 19th came and we were all still reeling from the loss of my boss, our friend. In an IT-wide meeting, we were assured that it was a one-time thing, and that no one was on the warpath.
Friday, April 20th, the CEO and HR arrived at my office door at 8:30am and informed me that my employment had been terminated. The IT Applications Manager had apparently told him he had to talk me down from shutting off a production server. I was too shocked to realize what he was talking about and simply took what was mine and left. Only later did I begin to process what had happened and come to realize what had been done to me. The IT App Manager sold me out, and misinformed the CEO regarding what happened with that last infected server. He neglected to mention the permission he gave me, and misrepresented our conversations to make it appear as if I was on a rampage and in a panic.
It is clear to me now that not cleaning that server could have easily resulted in further corporate-wide network downtime, and that downtime would have gotten me fired. On the other hand, taking offline with permission for 15 minutes a single NT 4.0 server that is used by 3 or 4 people 3 or 4 times a day in order to guarantee no corporate-wide network downtime did get me fired.
I was in a classic Catch-22.
In short, I believe both my boss and I were terminated for political reasons. We tried to effect real and positive technical and security improvements in an organization that had, for the past 20+ years, grossly underfunded IT, and given zero thought to information security. For the leaders of a company in the third party health plan administration industry to behave in this manner is disturbing. I believe the mindset was such that we simply did not fit, and excuses were found for the both of us to be fired.
Never have I regretted a job-related decision as much as my choice to leave a safe and comfortable job for that one. For that matter, there are very few decisions in any area of my life I’ve regretted more. Compound that with what I’ve learned recently… had I stayed, not only would I be making considerably more money, but I would have a decent chance of moving to Orlando, FL. Fah! *spit*
From what I’ve heard from friends who still work there, the situation isn’t much better. On the technology side, there have been improvements, and many of the safeguards I helped put in place remain and are maintained. On the political side, however, there is still plenty of uncertainty, upheaval and fear. All in all, though being fired had severe consequences in my personal life, a year later I can say without reservation that I am definitely better off.
