When Statistics Fail: Planning for Things You Can't Expect

Friday, April 27, 2012

Rafal Los


Recently, my team organized a meet-up of some of "down under's" technical minds and we sat around talking about cloud computing, the latest advancements in tech, and other new stuff until we were captivated by the news on the television behind us. 

Yet another 8.7 magnitude earthquake had struck along the same fault line that created the deadly tsunamis in the South Pacific. Those were still fresh in many of our memories, and our thoughts immediately turned to disaster planning and strategy.

Interestingly enough, the newscaster brought up Fukushima Daiichi and the disaster in planning that had occurred there.  What's odd is that before the accident that is now the worst nuclear event in 25 years, scientists and nuclear regulators from all over agreed that the plants were sufficiently protected against failure. 

There were systems that served as backup to the backup, and multiple safeguards were in place. They built the plant to what the newscaster referred to as the statistical probability requirements. Loosely translated into layman's terms I can understand, it means that according to statistics this plant should never catastrophically fail. Almost never.

It's that "almost never" that is now the subject of debate and finger-pointing in Japan's and the world's nuclear community. Hearing that, we looked at each other, and I immediately wondered how many of the organizations that have statistically sufficient defenses are just waiting to fall prey to the same type of event.

Now, I know that in the world of technology even the most catastrophic failures aren't likely to cause the mass destruction that a nuclear power plant will - but the lessons are applicable all the same.

The question remains - even if an event is statistically improbable in our eyes, should we at least consider it in our planning?  I guess it all depends on the criticality of the systems that are under consideration but generally the answer would be 'no'.  Let me be clear - we're not talking about being 'secure' to a certain level but rather prepared for an incident via response strategy for a certain type of event.

Let me make it more concrete, since I've been talking in hypotheticals here.  Let's say there is a medical facility which can be remotely managed for environmental support systems such as HVAC and fire suppression, etc. 

In the planning of incident response strategy and tactics, the Information Security team would plan for several different types of attempts to subvert those systems to cause mischief, chaos and potentially loss of life. We generally account for the mischief. We even most often account for the chaos. But who really thinks about the IT systems going all-out and causing loss of life? So "what if" becomes "probably won't happen" and we only half-heartedly plan for it.

Logically, we don't prepare for or try to work at edge cases or statistical improbabilities. But what if the sea level rises above your sea wall in that 0.1% probability event, and that causes everything else (including your backup, the backup to that, and the backup to that) to fail all at once because they're all virtually "under water"?
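To put rough numbers on that sea wall question, here is a minimal sketch in Python. The 0.1% figure is the one used above; the 1% per-layer failure rate, the three layers, and the function names are assumptions made up purely for illustration, not figures from any real facility.

```python
def independent_failure(p_each: float, layers: int) -> float:
    """Chance that every layer fails, assuming failures are unrelated."""
    return p_each ** layers


def common_cause_failure(p_each: float, layers: int, p_common: float) -> float:
    """Chance that every layer fails when one shared event (the flood)
    knocks out all layers at once; otherwise layers fail independently."""
    return p_common + (1.0 - p_common) * p_each ** layers


# Hypothetical inputs: three layers (primary, backup, backup to the backup),
# each failing 1% of the time on its own, plus a 0.1% shared event.
P_EACH = 0.01
LAYERS = 3
P_COMMON = 0.001

on_paper = independent_failure(P_EACH, LAYERS)            # roughly one in a million
in_practice = common_cause_failure(P_EACH, LAYERS, P_COMMON)  # roughly one in a thousand

print(f"independent layers: {on_paper:.2e}")
print(f"with a shared event: {in_practice:.2e}")
```

Under these assumed numbers, stacking backups makes the plan look like a one-in-a-million risk on paper, while the single shared event keeps the real figure near one in a thousand. The extra layers barely matter if the same wave reaches all of them.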

If your incident preparedness plan doesn't already have a chapter on 'worst case scenarios', maybe it's time to add one. I know lots of organizations already have these... but as a previous post of mine pointed out, many aren't even thinking about testing their own response, much less actually looking at the absolute worst case!

Let me give you another example, from my experience.  Let's say hypothetically that there is a very large global network of inter-connected systems and networks for a very large company.  Those systems and networks are all managed remotely, obviously.  In the event of a worst-case scenario or issue the routers, switches and remote networking components are all set up to allow for remote connection directly into the device via a special management network in case the primary network becomes unavailable. 

Now, say that at some point there is an Internet worm that saturates inter-office links so badly that nothing can get down the pipes. This includes remote management protocols. No worries, we have the backup network, right? Well... that's saturated too now... so we have a massive network completely dead in the water, and the only way to reach the remote links is to send people out. That may be possible in some locations, but in more remote ones like off-shore oil platforms and such... not so easy.

Ahh, but because the IT team planned for even the worst, worst-case scenario, each critical remote router has a dial-in modem attached to it. You can actually dial in over a POTS connection, so even if the network is super-saturated, odds are the phone lines are still working. Now we jump into those modems, connect to the routers and strategically shut off traffic routes until we can get a grip on global traffic volume. Issue under control, all because of really good planning.
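The order of fallback in that story (primary network, then management network, then dial-in) can be sketched as a simple ordered list of paths to try. This is only an illustration: the path names, the `first_reachable` helper, and the stubbed probes are all hypothetical, and a real probe would be whatever reachability check your own tooling provides.

```python
from typing import Callable, Optional, Sequence, Tuple

# A management path is a label plus a probe that reports whether it works.
ManagementPath = Tuple[str, Callable[[], bool]]


def first_reachable(paths: Sequence[ManagementPath]) -> Optional[str]:
    """Return the name of the first path whose probe succeeds, in order.

    A probe that raises OSError is treated as unreachable.
    Returns None when every path is down.
    """
    for name, probe in paths:
        try:
            if probe():
                return name
        except OSError:
            continue
    return None


# Stubbed probes standing in for the worm scenario: both IP networks are
# saturated, while the phone line still answers.
worm_scenario = [
    ("primary network", lambda: False),
    ("management network", lambda: False),
    ("POTS dial-in modem", lambda: True),
]

chosen = first_reachable(worm_scenario)
print(chosen)
```

The point of keeping the last entry is that it shares nothing with the first two. If every path in the list rides the same saturated links, the list is only one backup long no matter how many entries it has.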

The issue here is that those modems, and the POTS lines into them, were expensive to maintain for years of "just in case"... but when that "just in case" comes calling? Wow, were those network guys glad they were in place.

Are you this level of prepared?

Cross-posted from Following the White Rabbit
