
SINGAPORE – Amazon Web Services (AWS) has yet to provide clarity on what caused its 16-hour-long global outage on Oct 20, which affected users in Singapore trying to access Zoom and Canva, among other services.
But experts have already compared the impact of such a massive outage to a coordinated cyber attack, and highlighted issues with an overreliance on old technologies.
AWS said on its website, which provides real-time system performance updates, that the problem was linked to domain name system (DNS) “resolution issues”.
DNS is like a phone book of the internet that matches domain names, such as those in website addresses, to internet protocol addresses, which are strings of numbers that identify devices on the internet.
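To make the phone book analogy concrete, the short Python sketch below asks the operating system's resolver for the addresses behind a name. It is only an illustration: the helper names `resolve` and `describe` are hypothetical, and nothing here reflects how AWS runs DNS internally. When name resolution breaks, an application would typically see the kind of lookup error handled in `describe`.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses reported for hostname."""
    # getaddrinfo asks the operating system's resolver, which typically
    # checks local host files first and then queries DNS servers.
    results = socket.getaddrinfo(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    return sorted({sockaddr[0] for *_, sockaddr in results})


def describe(hostname):
    """Return a one-line summary of the lookup, including failures."""
    try:
        addresses = resolve(hostname)
    except socket.gaierror as err:
        # Raised when the name cannot be resolved, for example during
        # a DNS outage or when the name does not exist.
        return f"{hostname}: lookup failed ({err})"
    return f"{hostname} -> {', '.join(addresses)}"


if __name__ == "__main__":
    # "localhost" is used so the sketch runs without internet access.
    print(describe("localhost"))
```

Replacing `"localhost"` with a public website name would show the addresses that name currently maps to; the exact output depends on the network the code runs on.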
When probed on whether the problem was due to hardware error, incorrect configuration, human error or a cyber attack, a spokesperson said a detailed post-event summary would be shared, but no timeline was provided.
“We can confirm increased error rates and latencies for multiple AWS Services in the US-East-1 region,” the spokesperson said.
The outage affected hundreds of online services globally, including those of banks and airlines overseas, online games like Roblox and Fortnite, and popular apps such as Snapchat and Reddit.
Thousands of online users in Singapore took to Downdetector, a website that tracks service disruptions, to report issues that started at around 3pm on Oct 20. The problem has since been resolved, with services fully restored.
According to AWS’ website, almost 16 hours passed from the time the first lapses of multiple services in the US-East-1 region were reported, to when all the services returned to normal operations.
“When an incident of this scale occurs, whether through technical failure or misconfiguration, the impact on global operations can be just as severe as a coordinated cyber attack,” said Mr Darren Guccione, chief executive and co-founder of cyber-security company Keeper Security.
The prolonged delay in service restoration was caused by a chain of unexpected events.
After resolving the DNS issue, AWS had subsequent impairments, including in an underlying internal subsystem responsible for monitoring the health of its network load balancers. There was also a backlog of internet traffic requests, leading to more delays.
Amazon is the world’s largest cloud provider, followed closely by other US giants, namely Microsoft’s Azure and Google’s Cloud Platform.
The worldwide disruption is the largest since July 2024’s CrowdStrike malfunction that crippled hospitals, airports and banks. A faulty software update issued by CrowdStrike crashed about 8.5 million Windows devices worldwide, including in Singapore.
In October 2023, more than 2.5 million payment and ATM transactions could not be completed by DBS Bank and Citibank customers in Singapore due to a fault in the cooling system of an Equinix data centre used by the banks.
Prompted by such outages, the Government has sought to shore up the digital infrastructure that underpins key services such as banking, e-commerce and telecommunications. The move is to mitigate the risk of damage to Singapore’s economy and digital way of life from infrastructure outages.
Singapore’s upcoming Digital Infrastructure Act, which will be tabled in Parliament in the coming months, aims to make major players such as cloud service providers and data centre operators accountable to higher security and resilience standards.
This latest AWS outage highlighted the internet’s dependence on the operations of a single company, and the need to fortify systems against such disruptions – both locally and globally.
Mr Brent Ellis, principal analyst at global research and advisory firm Forrester, said the outage exposed issues with cloud resilience that stem from an overreliance on services such as DNS, “which were not architected for cloud-era technology demands”.
“It’s a feature of a highly concentrated risk, where even small service outages can ripple through the global economy,” he added.