Facilities management is one of many industries that have grown increasingly digitized and interconnected over the last several years. Technology is now integral to ensuring smooth facility operations, maintaining safety, tracking assets in real-time using the Internet of Things (IoT), optimizing the utilization of resources, and so on.
As recent events have shown, when essential IT systems grind to a halt and resolution falls outside a facility’s scope of control, having backup plans is vital. Even the most robust and resilient IT systems and infrastructure can fall victim to unexpected outages stemming from a range of sources, profoundly disrupting operations.
To mitigate these risks, many organizations are recognizing the advantages of complete incident response and managed detection and response (MDR) solutions, which include capabilities to safeguard their IT and staff. However, it has to be asked whether this is enough and whether facilities managers can do more to keep operations afloat during periods of disruption and extended downtime.
The recent CrowdStrike-Microsoft IT outage serves as a stark reminder of how vulnerable interconnected facilities systems can be and also underscores how much we rely on them to keep supply chains moving. In light of this recent incident, let’s explore what can happen to facilities as a result of an IT outage and how you, as a facilities manager, can enhance system resilience and minimize downtime.
The CrowdStrike Incident: A Vital Reminder for Facilities Managers
On July 19, 2024, a software update from cybersecurity vendor CrowdStrike triggered a huge global IT outage, which some consider to be the largest in history.
Reportedly, 8.5 million Microsoft Windows systems and devices were directly affected by the botched update. The outage did not stem from a flaw in Microsoft’s software, but rather from a flaw in CrowdStrike’s Falcon platform, which is designed to help protect systems against cyber threats and minimize their risk exposure.
A logic flaw in a content update for Falcon sensor version 7.11 and above crashed affected Windows systems worldwide, producing the dreaded “blue screen of death.”
While the outage primarily affected Microsoft systems that received the update, its ripple effects were felt across numerous sectors, including facilities management, public transport, airlines, and healthcare.
While these types of widespread outages are few and far between, the incident above should serve as a vital reminder for facilities managers to never rest on their proverbial laurels. Looking at it from a broad perspective, there are some key vulnerabilities to point out:
- Many facilities rely on a complex network of interconnected systems. When one system fails, it can trigger a domino effect, impacting everything from environmental condition monitoring controls to security and CCTV systems.
- Many facilities nowadays rely on specialized software for routine tasks like asset tracking, maintenance scheduling, space management, and more. One flaw in a security tool could jeopardize operations across an entire infrastructure.
- The outage impacted millions of people and organizations worldwide, demonstrating how IT failures could affect the very infrastructure that facilities managers are responsible for maintaining.
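The domino effect described above can be made concrete with a small dependency map. The following is only a sketch: the system names and the `DEPENDENTS` mapping are hypothetical and would need to be replaced with an inventory of your own facility. It walks the map breadth-first to list everything downstream of a single failed component.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration):
# each key is a system, each value lists the systems that rely on it.
DEPENDENTS = {
    "endpoint_security_agent": ["bms_server", "cctv_recorder", "cmms_server"],
    "bms_server": ["hvac_controls", "lighting_controls"],
    "cctv_recorder": [],
    "cmms_server": ["maintenance_scheduling"],
    "hvac_controls": [],
    "lighting_controls": [],
    "maintenance_scheduling": [],
}


def affected_by(failed, dependents):
    """Return a sorted list of every system downstream of the failed one."""
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            # Skip systems already recorded and the failed system itself.
            if child not in seen and child != failed:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    for system in affected_by("endpoint_security_agent", DEPENDENTS):
        print(system)
```

In this made-up map, a fault in the single security agent reaches every other system, while a fault in the BMS server reaches only the HVAC and lighting controls. Mapping real dependencies this way is one possible method of spotting single points of failure before an outage exposes them.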
Impact on Facilities Management Systems
To understand how a severe IT outage can affect a facilities management operation, it’s vital to recognize the complex array of systems that make up modern facilities.
These can include (but are not limited to):
- CAFM (Computer-Aided Facilities Management) Systems: Outages could disrupt the process of space management and asset tracking, leading to miscalculations and lost inventory.
- CMMS (Computerized Maintenance Management Systems): Outages could result in missed maintenance checks and vital equipment safety assessments, leading to the increased risk of hazards stemming from malfunctions and failures.
- BMS (Building Management Systems): Outages could leave facilities managers powerless to control lighting, heating, ventilation, and other climate control systems which can affect staff and equipment.
- Security and CCTV Systems: Outages could leave facilities more prone to security breaches or unrestricted access.
- Energy Management Systems: Outages could cause facilities to lose sight of their true energy consumption, resulting in increased costs.
- IWMS (Integrated Workplace Management Systems): Outages here could impact multiple aspects of operations in facilities, from maintenance and employee management to automated controls.
Real-World Consequences of IT Outages
The CrowdStrike Falcon incident, like many similar outages before it, has demonstrated how IT failures can impact businesses across entire sectors.
Failures in security systems and other vital facilities management technology can pose a fundamental risk to employee and occupant safety. When equipment malfunctions, orders cannot be processed, assets cannot be tracked, and controls are locked down, facilities staff are left unsure how to proceed.
The resulting delays, errors, and decreased productivity can have a profound effect on supply chains, with deliveries unable to be processed or logged, logistics unable to proceed, and real-time asset visibility almost nonexistent.
Extended downtime can indirectly lead to a series of financial consequences stemming from reduced order fulfillment, equipment damage, and recovery costs. The longer an outage or disruption affects facilities responsible for keeping supply chains ticking over, the more costly it can be. The CrowdStrike incident alone could surpass $1 billion in costs, according to a recent CNN report.
Over time, if facilities are plagued by persistent IT outages, they risk losing the trust of their tenants, customers, stakeholders, and suppliers, particularly if these other parties are unaffected.
Strategies for Minimizing and Preventing Outages
While IT outages and disruption can never be eliminated entirely, facilities managers can take several steps to reduce how often they occur and minimize their impact on critical daily operations.
Consider the following:
- Create and update business continuity and disaster recovery plans to include manual workarounds for IT system downtime.
- Ensure critical systems have robust backup options to restore them to the last working version if needed.
- Patch, update, and test systems on non-critical servers to verify their effectiveness before rolling them out to live environments, where a faulty update could disrupt operations.
- Consider using various solutions for critical functions to reduce dependency on a single software vendor and thus reduce the risk of a single point of failure.
- Use real-time cyber monitoring tools to detect anomalies and potentially suspicious activity before they escalate into large-scale problems.
- Conduct regular risk assessments to identify infrastructure vulnerabilities and develop proactive mitigation and response strategies.
- Work closely with IT specialists to ensure robust, organization-wide cybersecurity measures are in place.
- Establish clear communication channels and protocols for when outages inevitably happen, ensuring stakeholders and partners are informed promptly.
- Ensure staff are trained and upskilled to operate systems manually if needed and to recognize nefarious activity, improving response and recovery times.
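The advice above about testing updates on non-critical servers before they reach live environments can be sketched as a staged (ring-based) rollout. This is a minimal illustration under stated assumptions, not a description of how any vendor’s tooling works: the host names, the `RINGS` list, and the `apply_update` and `is_healthy` callbacks are all hypothetical placeholders for whatever patching and health-check mechanisms your IT team actually uses.

```python
from typing import Callable, Iterable


def staged_rollout(
    rings: Iterable[tuple[str, list[str]]],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> tuple[bool, list[str]]:
    """Apply an update one ring at a time, halting at the first unhealthy ring.

    Returns (completed, updated_hosts).
    """
    updated: list[str] = []
    for ring_name, hosts in rings:
        for host in hosts:
            apply_update(host)
            updated.append(host)
        # Check the whole ring before any later ring is touched.
        failed = [host for host in hosts if not is_healthy(host)]
        if failed:
            print(f"Halting at ring '{ring_name}': unhealthy hosts {failed}")
            return False, updated
    return True, updated


# Hypothetical rings, ordered from least to most critical.
RINGS = [
    ("test", ["test-server-01"]),
    ("non-critical", ["office-pc-01", "office-pc-02"]),
    ("critical", ["bms-server", "cctv-recorder"]),
]

if __name__ == "__main__":
    # Simulate an update that breaks one office PC.
    completed, touched = staged_rollout(
        RINGS,
        apply_update=lambda host: print(f"Updating {host}"),
        is_healthy=lambda host: host != "office-pc-02",
    )
    print(completed, touched)
```

In this simulated run the rollout stops after the non-critical ring, so the building management and CCTV hosts never receive the faulty update. The design choice is simply to order rings by criticality and gate each step on a health check.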
As more facilities embrace site-wide digitization and interconnect with more tech-enabled suppliers and buyers, the ripple effect from IT outages grows increasingly severe.
Let this recent incident serve as a reminder of the potential vulnerabilities that exist in your facility’s infrastructure and incumbent systems. Familiarize yourself with the risks and preventative measures so that you can reduce the damage if an outage of this scale happens again.
Chester Avey has over a decade of experience in business growth management and cybersecurity. He enjoys sharing his knowledge with other like-minded professionals through his writing. You can connect with Chester by following him on X (formerly Twitter) @ChesterAvey.
The post CrowdStrike IT Outage: Why Facilities Managers Should Take Notice appeared first on Facilities Management Advisor.