AWS Outage Reveals Critical Vulnerabilities in Cloud Computing: Why Decentralization is Essential

A major AWS outage has affected thousands of organizations globally, highlighting the risks of over-reliance on centralized cloud computing. This analysis examines why the dominance of just three providers creates significant vulnerabilities, explores the problems of vendor lock-in and regulatory concerns, and offers solutions through multi-cloud and edge computing approaches that can enhance resilience for businesses dependent on cloud services.

Amazon Outage: Computer Scientist Explains Why 'Cloud' Needs To Change

The outage was triggered by a malfunction at an AWS data centre in Northern Virginia, US.

Melbourne:

Amazon Web Services (AWS), the world's largest cloud computing platform, has suffered a significant outage affecting thousands of organizations including banks, financial software platforms like Xero, and social media services such as Snapchat.

Beginning at approximately 6pm AEDT on Monday, the disruption originated from a malfunction at an AWS data centre in Northern Virginia, United States. While AWS claims to have resolved the underlying issue, some internet users continue to report service interruptions.

This event underscores the vulnerabilities associated with heavy reliance on cloud computing infrastructure. However, there are strategies to mitigate some of these inherent risks.

Cloud computing essentially provides on-demand delivery of various IT resources including computing power, database storage, and applications via the internet. Put simply, it involves renting rather than owning IT infrastructure.

The concept gained prominence during the dot-com boom of the late 1990s, when digital technology companies began delivering software over the internet. As organizations like Amazon developed their capability to offer "software as a service" online, they also began allowing others to rent their virtual servers for a fee.

This represented a compelling value proposition. Cloud computing enables a pay-as-you-go model similar to utility billing, eliminating the substantial upfront investment required for purchasing, operating and managing proprietary data centres.

Consequently, recent statistics indicate that over 94% of enterprises now utilize cloud-based services in some form.

The global cloud market is dominated by three major providers. AWS holds the largest share (approximately 30%), followed by Microsoft Azure (around 20%) and Google Cloud Platform (roughly 13%).

All three providers have experienced recent outages with significant impacts on digital service platforms. For instance, in 2024, an issue with third-party software severely affected Microsoft Azure, causing widespread operational failures for businesses globally.

Google Cloud Platform also suffered a major outage this year due to an internal misconfiguration.

The global internet's heavy dependence on just a few major providers (AWS, Azure, and Google Cloud) creates substantial risks for both businesses and everyday users.

Primarily, this concentration creates a single point of failure. As demonstrated in the recent AWS incident, a simple configuration error in one central system can trigger a cascading effect that instantly paralyzes vast segments of the internet.

Additionally, these providers often lock customers in. Companies find it prohibitively difficult and expensive to switch platforms because of complex data architectures and high fees for transferring large volumes of data out of the cloud (data egress costs). This effectively traps customers, leaving them subject to a single vendor's terms.

Furthermore, the dominance of US-based cloud service providers introduces geopolitical and regulatory concerns. Data stored in these massive systems falls under US laws and government demands, complicating compliance with international data sovereignty regulations such as Australia's Privacy Act.

Moreover, these companies possess the power to censor or restrict service access, giving them control over how firms operate.

Current best practice for mitigating these risks is to adopt a multi-cloud approach that decentralizes infrastructure. This means running critical applications across multiple vendors so that no single provider is a single point of failure.
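The idea of running the same application with more than one vendor can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the names `call_with_failover`, `primary_cloud` and `secondary_cloud` are hypothetical stand-ins, and a real deployment would also need health checks, data replication and traffic routing between providers.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when no provider in the list could serve the request."""


def call_with_failover(
    providers: Sequence[tuple[str, Callable[[], str]]],
) -> tuple[str, str]:
    """Try each provider in order; return (name, response) from the first that succeeds."""
    errors = []
    for name, handler in providers:
        try:
            return name, handler()
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for one application deployed with two vendors.
def primary_cloud() -> str:
    # Simulates an outage at the first vendor.
    raise ConnectionError("region unavailable")


def secondary_cloud() -> str:
    return "ok"


served_by, response = call_with_failover(
    [("primary", primary_cloud), ("secondary", secondary_cloud)]
)
print(served_by, response)
```

Here the request is served by the second vendor because the first one raises an error. If every vendor fails, the caller gets a single exception listing each failure.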

This strategy can be enhanced with "edge computing," which moves data storage and processing away from large central data centres toward smaller, distributed nodes (such as local servers) that firms can directly control.

Combining edge computing with a multi-cloud approach enhances resilience, improves speed, and helps companies meet strict data regulatory requirements while avoiding dependence on any single entity.
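A rough calculation shows why spreading a service across independent deployments can improve resilience. The 99.9% availability figure below is a hypothetical value chosen for illustration, not a number published by any provider, and the calculation assumes failures are independent. That assumption is optimistic: shared dependencies such as common third-party software can take down several deployments at once.

```python
def combined_availability(availabilities):
    """Probability that at least one deployment is up, assuming independent failures."""
    all_down = 1.0
    for availability in availabilities:
        all_down *= 1.0 - availability
    return 1.0 - all_down


# Hypothetical figure: each deployment is available 99.9% of the time.
single = combined_availability([0.999])
two_clouds = combined_availability([0.999, 0.999])
two_clouds_plus_edge = combined_availability([0.999, 0.999, 0.999])

print(f"one provider:         {single:.6f}")
print(f"two providers:        {two_clouds:.6f}")
print(f"two providers + edge: {two_clouds_plus_edge:.9f}")
```

Under these assumptions, each extra independent deployment multiplies the chance of a total outage by the failure probability of that deployment, which is why even one additional vendor makes a large difference on paper.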

As the old saying goes, don't put all your eggs in one basket.

(Author: Jongkil Jay Jeong, Senior Fellow, School of Computing and Information Systems, The University of Melbourne)

(This article is republished from The Conversation under a Creative Commons license. Read the original article.)

Source: https://www.ndtv.com/world-news/amazon-outage-computer-scientist-explains-why-cloud-needs-to-change-9490508