Amazon services 'recovering' as Snapchat and banks among sites hit by outage

Tuesday, 21 October 2025 - 8:18

Many of the world's largest websites, including Snapchat, Reddit and Roblox, were knocked offline on Monday after a huge Amazon Web Services (AWS) outage.

More than 1,000 apps and websites - including banks such as Lloyds and Halifax - were impacted by problems at the heart of the cloud computing giant's operations in the US, according to platform outage monitor Downdetector.

It said user reports of problems globally had soared to more than 6.5 million during the outage on Monday morning.

Amazon later said it had fixed the underlying problem, but issues for some services persisted, and experts said the outage demonstrates the perils of many companies relying on a single, dominant provider.

"What this episode has highlighted is just how interdependent our infrastructure is," said Prof Alan Woodward of the University of Surrey.

"So many online services rely upon third parties for their physical infrastructure, and this shows that problems can occur in even the largest of those third-party providers.

"Small errors, often human made, can have widespread and significant impact."

The issues appear to have begun at around 07:00 BST on Monday, as users began to report problems accessing a slew of platforms.

This included a wide range of different sites and services, from massive online games like Fortnite to the language-learning app Duolingo.

Downdetector told the BBC it had seen more than four million reports from users across 500 sites within just a few hours - more than double the number it would see across an entire regular weekday.

These later peaked at more than six million, it said, as more services including Reddit and Lloyds Bank attempted to recover.

At around 22:00 BST, Amazon said many of its affected services had recovered, saying "we continue to make progress" even as it acknowledged that problems persisted.

A new series of "cascading failures" may have arisen after the initial outage, according to Mike Chapple, an information technology professor at the University of Notre Dame.

"It's like when you have a large-scale power outage," Chapple said. "Crews start working to try to bring it back on line. The power might flicker a few times," but it's possible "they'd only addressed the symptoms" and not the root cause.

What went wrong?

Amazon has not yet fully detailed what caused Monday's outage or issued an official statement regarding it.

It said in an update on its service status web page that the issue "appears to be related to DNS resolution of the DynamoDB API endpoint in US-EAST-1".

DNS, which stands for Domain Name System, is often likened to a phone book for the internet.

It effectively translates the website names people use (like bbc.co.uk) into the numerical IP addresses that computers use to find each other.

This process basically underpins the way we use the internet, and disruptions to it can leave web browsers unable to locate the content they are looking for.
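
For readers who want to see what such a lookup involves, here is a minimal Python sketch. It is an illustration only and does not show how AWS or any affected service works internally. The function name resolve and its lookup parameter are hypothetical names chosen for this example; socket.gethostbyname is the standard-library call that asks the system's resolver for an address.

```python
import socket
from typing import Callable, Optional


def resolve(
    hostname: str,
    lookup: Callable[[str], str] = socket.gethostbyname,
) -> Optional[str]:
    """Return an IPv4 address for hostname, or None if the name cannot be resolved.

    The lookup parameter is a hypothetical hook added for this sketch so that
    a DNS failure can be simulated without touching the network.
    """
    try:
        # Ask the resolver to turn the name into a numerical address.
        return lookup(hostname)
    except socket.gaierror:
        # Raised when name resolution fails; the caller then has no address to connect to.
        return None
```

Calling resolve("bbc.co.uk") on a working connection should return an address string. When resolution fails, the function returns None, which mirrors the situation described above: the name is known, but the software cannot find out where to send its request.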

Matthew Prince, chief executive of Cloudflare, told the BBC the AWS outage highlighted the power cloud services have over how the internet works.

"Everyone has a bad day, today Amazon had a bad day," he said.

"There are amazing things about the cloud, it allows you to scale… but if you have an outage like this it can take down a lot of services we rely on."

And Cori Crider, head of the Future of Technology Institute, told the BBC it was "a bit like a bridge collapsing".

"An essential part of the economy has fallen to pieces," she said.

And with so much of cloud computing relying on Amazon, Microsoft and Google - estimated at around 70% - she said the status quo was "unsustainable".

"Once you have a concentrated supply in a handful of monopoly providers, when something like this falls over, it takes a huge percentage of the economy out with it," she said.

"We should really look at trying to buy more local services, rather than relying on a handful of American monopoly platforms.

"That's a risk to our security, our sovereignty and our economy and we need to look at structural separations to make our markets more resilient to these kind of shocks."

One computer science expert says some of the responsibility rests with the companies that use AWS.

"Companies using Amazon haven't been taking enough adequate care to build protection systems into their applications," says Ken Birman, a computer science professor at Cornell University in New York.

Outages like the one on Monday occur frequently, although not always at this scale.

Birman tells the BBC that app developers should take care to invest in backing up mission-critical applications that live in the cloud.

"We know how to make these systems stronger, and we know how to do it securely," Birman says.

The question of responsibility could well land in the courts.

More than a year after the massive CrowdStrike outage, Delta Air Lines is still wrangling with the company to recover more than $500m in losses.

Even after CrowdStrike had fixed the issue, the airline said it had to manually reset 40,000 servers, leading to major flight delays over several days.

(Source: BBC)

