AWS Outage Disrupts Major Apps Worldwide
An incident in US-EAST-1 tied to DNS failures and a load-balancer health-monitoring subsystem triggered cascading failures across the internet
Amazon Web Services experienced a global outage on October 20 that took down scores of apps and websites. Amazon said services later “returned to normal operations,” though some workloads faced backlogs as systems recovered.
Pittsburgh, PA – October 22, 2025 (Updated: 10/30) — A widespread AWS disruption that began shortly after midnight Pacific on Monday rippled across the web, knocking popular apps and business services offline before Amazon reported full restoration later in the day. Reporting indicates the incident centered on US-EAST-1 (N. Virginia) and was the most significant internet disruption since last year’s CrowdStrike event.
Initial status updates pointed to DNS resolution issues preventing applications from reaching the DynamoDB API in US-EAST-1. Later, AWS said the root cause was a failure in an internal subsystem that monitors the health of network load balancers within the EC2 internal network, with DNS effects compounding the blast radius. By ~3:00 p.m. PT, Amazon said “all AWS services returned to normal operations,” while warning that some services would spend several hours clearing queued messages.
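From a client's perspective, the failure mode AWS described first surfaces as a DNS lookup error: the application cannot resolve the regional DynamoDB endpoint, so every request fails before a connection is even attempted. A minimal sketch of how a client might detect and ride out such a failure with retries and exponential backoff is below; the endpoint name and the `resolve_with_backoff` helper are illustrative, not part of any AWS SDK.

```python
import socket
import time

# Regional DynamoDB endpoint implicated in the incident (shown for illustration).
ENDPOINT = "dynamodb.us-east-1.amazonaws.com"

def resolve_with_backoff(host, attempts=3, base_delay=1.0):
    """Try to resolve a hostname, backing off between failures.

    During the outage, clients whose DNS lookups for the DynamoDB API
    failed would have seen socket.gaierror, the exception retried here.
    """
    for attempt in range(attempts):
        try:
            infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
            return [info[4][0] for info in infos]  # resolved IP addresses
        except socket.gaierror:
            if attempt == attempts - 1:
                raise  # out of retries; surface the DNS failure to the caller
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

During the incident, a call like `resolve_with_backoff(ENDPOINT)` would have exhausted its retries and raised `socket.gaierror`, which is why dependent apps failed even though their own infrastructure was healthy.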
The outage impacted a broad range of consumer and enterprise brands. Reports and company statements cited disruptions at Snapchat, Reddit, Roblox, Venmo, Zoom, Coinbase, Robinhood, and even Amazon’s own retail, Prime Video, and Alexa services. Ookla said over 4 million users reported issues worldwide, and at least a thousand companies were affected.
> “All AWS services returned to normal operations… Some services… continue to have a backlog of messages that they will finish processing over the next few hours,” Amazon said on Monday.
Key facts
- Timeline: Issues began shortly after midnight PT on Oct 20; AWS later reported full restoration the same day, with lingering backlogs.
- Where: US-EAST-1 (N. Virginia)—a region with prior, high-profile incidents—was identified as the locus.
- Root cause (AWS): Internal load-balancer health-monitor subsystem within the EC2 internal network; DNS problems impeded reaching DynamoDB endpoints.
- Scale: Millions of outage reports; 1,000+ companies impacted, spanning communications, gaming, finance, and retail.
The event underscores the internet’s dependence on a few hyperscale providers. As one expert noted, the episode highlights how “relatively fragile infrastructures” can cascade through everyday digital services.