A few months ago, I wrote about the cargo ship disaster in Baltimore. It was a good example of how companies can get into trouble by not paying attention to small factors that, while not consequential individually, can add up to create a significant failure.
Well, on July 19th, the world experienced another significant failure — 8.5 million Windows computers disabled at once, causing thousands of flight cancellations, hospital outages, and serious challenges in businesses and governments around the globe.
This time, however, it was not due to several small factors. Rather, it was the result of a single big one: Cybersecurity vendor CrowdStrike triggered the crash by including an error in its distribution of what should have been a routine update to its software. That error brought down every client using its software, a substantial proportion of many industries.
Increased Vulnerability
Companies are becoming increasingly vulnerable to these types of failure in their operations.
For example, over the past several years, just-in-time (JIT) manufacturing has become popular thanks to its effectiveness in squeezing costs out of supply chains by having parts delivered to assembly lines “just in time” for use.
It works just fine… provided suppliers can ensure reliable, on-time deliveries. But when a pandemic causes plant closings, or a ship gets wedged in the Suez Canal, everything can come to a screeching halt.
As a result, global companies have been rethinking their supply chains and considering the return of some form of “safety stock” or other buffers. They are accepting slightly higher costs as a hedge against supply chain disruption.
But it’s not just about supply chains and it’s not just a big company problem. Any organization of any size that depends on a particular, potentially vulnerable resource to remain operational is at risk of a single-point-of-failure mishap.
For example, I was recently working with a company in which the CFO was the only person in the organization who understood the complex ins and outs of its cash flow. Soon after he retired, his replacement dug into the data, only to discover they were about to become insolvent!
Another client (a very small one) kept much of its critical data on the laptop of the Executive Director. A new ED was hired and she inherited that laptop. When she resigned a few months later, she “accidentally” deleted a number of critical files, leaving the organization scrambling to rebuild as much of its data as possible from paper and memory.
These and countless other examples highlight a simple fact: single points of failure, of any kind, are risky. Here are some suggestions for how you can protect against them…
Suppliers
Early in my career, I ran the in-store bakeries of a supermarket chain. There were several ingredient distributors to choose from, any one of which could have given us all the product we needed. I contracted with two.
It may have cost us a little more by reducing our purchase size with either, but by keeping both engaged, I was protecting us from anything bad happening to a single source. Plus, it had the added benefit of keeping them competitive with each other.
Technology and Systems
Besides the obvious need to make sure critical data is backed up (frequently, automatically, and off-site), it’s essential that no piece of equipment critical to the organization is without a ready backup. (Even in my own small operation, I have a second computer waiting in the wings if needed at a moment’s notice.)
In my supermarket days, we had battery-operated calculators and flashlights near our cashiers, in case a power outage disabled the cash registers!
People
To the extent possible, avoid having critical business knowledge reside in the head of any one individual. Take explicit steps to cross-train as needed, making sure to include any essential skills, roles, or processes inside and outside the organization.
Some organizations even mandate blocks of vacation time for financial executives; they hope to deter embezzlement by ensuring more than one person works with the data.
Strategy
I know, this category is different since it is strategic rather than operational. But the concept still applies: you want to ensure your strategy and plans are not highly dependent on a single key assumption. Such assumptions may concern how a competitor will act (or react), the success of a particular product or service offering, or how the market will behave in the future.
Key to this is making sure you fully understand the critical assumptions behind your plans and spend time assessing what might happen in different scenarios. That way, if your assumptions turn out to be incorrect or situations change, you are prepared to respond as needed.
Organizations can get into trouble if they drive forward head-down, focused on executing their plans without paying attention to potential changes in the market or context.
Reflections
More than once I have heard investors and business leaders quip that while their eggs may all be in a single basket, it’s not a problem, since they watch that basket very carefully.
All well and good until the next unforeseeable calamity knocks the basket to the ground.