
Microsoft’s massive global outage: UAE experts react to major IT outage that grounded flights, struck banks, businesses

UAE experts have shared their comments reacting to the massive Microsoft outage, calling the critical event “a wake-up call for businesses globally to reassess their IT infrastructure and the processes they have in place for software updates and security measures”


On Friday, businesses worldwide faced a major IT outage affecting Microsoft systems, causing disruptions across various sectors.

The outage coincided with a disruption at cybersecurity firm CrowdStrike, which occurred early Friday. CrowdStrike attributed the issue to a recent technology update.

Releasing a statement on X (formerly Twitter), Microsoft’s CEO Satya Nadella said: “Yesterday, CrowdStrike released an update that began impacting IT systems globally. We are aware of this issue and are working closely with CrowdStrike and across the industry to provide customers technical guidance and support to safely bring their systems back online.”


“The issue has been identified, isolated and a fix has been deployed. We are referring customers to the support portal for the latest updates and will continue to provide complete and continuous public updates on our blog,” CrowdStrike said.

UAE experts have reacted to the outage, calling the critical event “a wake-up call for businesses globally to reassess their IT infrastructure and the processes they have in place for software updates and security measures.”

James Maude, Field CTO, BeyondTrust

It appears an update from CrowdStrike caused the Windows OS to crash, creating global IT systems outages that have impacted almost every industry. Impacted systems present users with the dreaded “Blue Screen of Death” (BSOD), and in the worst cases, users are stuck in a crash and reboot loop. The fix appears to require physical intervention to rename or remove the update file responsible, making the recovery process time-consuming and complicated for remote systems.

While any piece of software can be unstable or have bugs, it is particularly an issue for security vendors such as CrowdStrike, as they have a very deep integration into the operating system in order to monitor and protect the endpoint. This means that any bugs or instability can cause the entire operating system to crash, which appears to be what we have unfortunately experienced in the past 24 hours.

There are a few strategies to mitigate the risks of unstable software updates, but ultimately it starts with the vendor conducting rigorous QA in test environments that are as representative of customer environments as possible. The next step is a phased deployment process: rolling out updates gradually, in stages, to groups of real users, to ensure the software is stable in real-world environments before deploying to all users. In this case, it appears that the vendor was confident in the update and deployed it at scale. In the coming days we should see a root cause analysis conducted to understand how this was able to happen and, most importantly, to ensure that it can’t happen again.
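The phased deployment idea described above can be illustrated with a minimal Python sketch. It is not CrowdStrike’s or any vendor’s actual process: the `Stage` type, the `phased_rollout` function, and the `deploy` and `is_healthy` callbacks are hypothetical names chosen for illustration, and a real rollout would also need monitoring windows, rollback and far richer health signals.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Stage:
    """A hypothetical rollout stage: a named group of hosts."""
    name: str
    hosts: Sequence[str]


def phased_rollout(
    stages: Sequence[Stage],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> Tuple[List[str], Optional[str]]:
    """Deploy stage by stage and stop at the first unhealthy stage.

    Returns the names of the stages that completed and the name of the
    stage that halted the rollout (None if every stage succeeded).
    """
    completed: List[str] = []
    for stage in stages:
        failures = 0
        for host in stage.hosts:
            deploy(host)                 # push the update to one host
            if not is_healthy(host):     # e.g. the host failed to boot
                failures += 1
        rate = failures / len(stage.hosts) if stage.hosts else 0.0
        if rate > max_failure_rate:
            # Later, larger stages never receive the update.
            return completed, stage.name
        completed.append(stage.name)
    return completed, None


if __name__ == "__main__":
    # Simulated fleet: one host in the second stage fails after the update.
    stages = [
        Stage("canary", ["c1", "c2"]),
        Stage("early-adopters", ["e1", "e2", "e3"]),
        Stage("everyone", ["a1", "a2", "a3", "a4"]),
    ]
    done, halted_at = phased_rollout(
        stages,
        deploy=lambda host: None,
        is_healthy=lambda host: host != "e2",
    )
    print("completed:", done, "halted at:", halted_at)
```

The point of the design is that a faulty update is caught while it has reached only a small group, so the largest stage is never touched.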

Microsoft have been investing heavily in their own native security tooling over the past few years, having had issues in the past with anti-virus vendors patching areas of the operating system and causing instability. In recent years this has resulted in increased stability in the operating system; however, this incident goes to show that we can’t be complacent. Microsoft need to ensure the OS remains stable in the event third-party software crashes and will need to work with security vendors to ensure stability on both sides.

Mark Jow, Security Evangelist EMEA at Gigamon, commented:

“This Microsoft IT outage demonstrates the need for more robust and resilient solutions so that when these issues do arise, they can be resolved quickly without causing such widespread customer chaos and security risk. Preparedness is key – every IT and security vendor must have a robust system in place across its software development lifecycle to test upgrades before they are rolled out to ensure that there are no security flaws within the updates.”

Alexey Lukatsky, Managing Director, Cybersecurity Business Consultant, Positive Technologies

This case reminds us of the importance of secure development. The most likely cause was a lack of update checking, both on the side of the manufacturer, CrowdStrike, and on the side of consumers who automatically installed every update that reached them, which led to a massive global outage. The exception was those countries that do not use infosec products from this American corporation.

In addition, this story shows how firmly information technologies have become embedded in people’s lives and in business processes, and how catastrophic the consequences of an accidental or unauthorized, malicious impact on IT infrastructure can be. In other words, businesses face the task of assessing the non-tolerable events with catastrophic consequences that could arise in their operations from an impact on their IT infrastructure.

This is not the only case on a similar scale. One example is the McAfee antivirus update in 2010. Similar problems have occurred with updates to the Windows operating system itself, as well as to its Microsoft Defender protections, which left users unable to perform normal functions. The problem is therefore of a general nature: it is not connected with the country of origin of any particular software, and it simply raises once again the question of how far the impact of IT infrastructure on business can lead to non-tolerable events.

At the moment, judging by the scale of the disaster and the way the incident manifested itself, the root cause appears to be a failure to follow safe development practices. But there is another version that cannot be ruled out. It has not yet found any confirmation, but we, as experts in the field of cybersecurity, cannot completely dismiss it: the intrusion of attackers into the software development process at CrowdStrike, which could have introduced malicious functionality into the latest update and ultimately led to this kind of massive failure.

Everyone remembers the story with SolarWinds, also an American company, which suffered from such an incident a couple of years ago when attackers penetrated the development process and introduced malicious functionality into an update that was rolled out to the computers of almost 20 thousand SolarWinds customers.

The one thing suggesting that this is unlikely to be the work of cybercriminals who intruded into the development process is that, in stories of this type, the attackers usually aim to remain undetected for as long as possible, so that they can penetrate the networks of companies where the software carrying the malicious payload is installed.

In this case, the update almost instantly made computers inoperable, which is not usually the goal of APT groups. Their task is not to disable systems but to obtain data that can then be sold, to blackmail the victim company, or to carry out other functions related to cyber espionage.

Darren Anstee, Chief Technology Officer for Security, NETSCOUT

“The worldwide IT outage currently affecting airlines, media, banks and much more appears to have been caused by a faulty software update which was automatically applied, and not a cyberattack. This is another demonstration of how dependent we are on both our IT infrastructure and the supply chains that deliver tightly integrated capabilities within it.

“There will undoubtedly be a huge fallout from this, with a lot of questions set to be raised around how to balance the need for regular security updates for defence, compliance, etc., with the risk of applying unqualified updates to systems. Most enterprise software goes through testing and controlled roll-out before it is pushed to a whole population, but this doesn’t seem to be the case in this instance.”
