Amazon Web Services (AWS) says it has resolved a major outage that disrupted websites and applications around the world. The disruption, which began on Monday, affected more than 1,000 services, including well-known platforms such as Snapchat and banks such as Lloyds and Halifax.
Downdetector, a platform outage monitor, recorded over 11 million user reports during the outage. Experts noted that such widespread disruptions underscore the deep interdependence of our digital infrastructure, where issues with a single large cloud provider can have far-reaching consequences. Professor Alan Woodward of the University of Surrey commented, “Small errors, often human made, can have widespread and significant impact.”
The problems started at around 07:00 BST and affected a wide array of services, from online gaming platforms to language learning apps. The scale of the outage was evident early on, with initial reports far exceeding those typically seen on a regular weekday.
Amazon said the issue was primarily related to DNS resolution problems with the DynamoDB API endpoint in its US-EAST-1 region. The Domain Name System (DNS), often described as the internet’s phone book, translates website names into the numeric IP addresses that computers use to find one another. When it is disrupted, users can be unable to reach online content even if the servers hosting it are still running.
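For readers who want to see what a DNS lookup involves, the short Python sketch below asks the operating system's resolver to translate a name into IP addresses and treats a failed lookup as "no addresses found". It is only an illustration of the general mechanism, not a reconstruction of what happened inside AWS. The function name `resolve` and its `lookup` parameter are hypothetical, chosen for this example.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses for hostname.

    Returns an empty list when resolution fails. The `lookup` parameter is
    a hypothetical hook for this sketch so a stand-in resolver can be
    swapped in; by default the system resolver is used.
    """
    try:
        # Each record is (family, type, proto, canonname, sockaddr),
        # and sockaddr[0] holds the IP address as a string.
        records = lookup(hostname, None)
    except socket.gaierror:
        # The name could not be translated, so the caller has no address
        # to connect to, even if the server itself is healthy.
        return []
    return sorted({record[4][0] for record in records})
```

Calling `resolve("example.com")` on a machine with working DNS would return that site's current addresses. If the lookup fails, it returns an empty list, which is roughly the position applications were in when they could not find the address of the service they depended on.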
Cloudflare CEO Matthew Prince highlighted the power and vulnerability of cloud services, noting that while they enable scalability, outages can take down many essential services. Cori Crider, head of the Future of Technology Institute, likened the event to a bridge collapse, emphasizing the systemic risk posed by the concentration of cloud computing power in a few major providers like Amazon, Microsoft, and Google. She suggested a need for greater diversity in local services to build resilience against such shocks.
Information technology professor Mike Chapple noted that the situation could be exacerbated by “cascading failures” as systems attempt to recover, drawing a parallel to power outages where initial fixes might only address symptoms. He advised companies to invest in robust backup systems for critical applications hosted in the cloud to mitigate such risks.
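One common engineering pattern that fits this advice is to retry a failing service a limited number of times, with increasing and randomised pauses, before switching to a backup. The Python sketch below illustrates that idea under stated assumptions. It is not drawn from Chapple's comments or from AWS guidance, and the names `call_with_fallback`, `primary` and `backup` are hypothetical. The randomised pauses are intended to stop many clients from retrying at the same moment, which can otherwise add load to a system that is trying to recover.

```python
import random
import time


def call_with_fallback(primary, backup, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Try primary() up to `attempts` times, then fall back to backup().

    All names and default values here are illustrative assumptions.
    """
    for attempt in range(attempts):
        try:
            return primary()
        except ConnectionError:
            if attempt < attempts - 1:
                # Exponential backoff with jitter: wait a random time of up
                # to base_delay * 2**attempt seconds before the next retry.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
    # The primary service is still failing, so use the backup instead.
    return backup()
```

In practice, `primary` might be a function that queries a cloud database and `backup` one that reads from a replica hosted elsewhere. How many attempts to allow, and how long to wait between them, depends on the application.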