Future Tech

Alibaba Cloud waiting for hardware to dry out before trying to restore customer data

Tan KW
Publish date: Tue, 17 Sep 2024, 03:55 PM

A week after a fire broke out at a Singapore datacenter, Alibaba Cloud is waiting for some hardware to dry out before it restores services and customer data.

According to Alibaba Cloud's Monday update, the migration and restoration of affected hardware and machinery at the SIN11 Digital Realty datacenter is progressing as planned, with the remaining affected cloud products gradually being restored.

"Since some of the affected hardware and machineries are located in the dangerous and blocked area of the building where access is not allowed and some hardware and machineries require to be carefully dried in order to ensure data security, hence the restoration of some long tail machines and inventories may take a longer time," explained the Chinese cloud giant.

The fire broke out last Tuesday morning, evidently as a result of an explosion of lithium-ion batteries. Twenty people were immediately evacuated and Singapore's Civil Defence Force (SCDF) arrived on the scene to manage what it prophetically described as a likely "prolonged operation."

Local media reported SCDF was forced to deploy a firefighting robot to quench the blaze and cool the batteries - a potentially smart move as lithium-ion batteries are known to reignite or explode, and also release toxic fumes.

The batteries in question were located on the third floor of a four-storey building that opened its doors as a datacenter in 2016. The building is one of three Digital Realty datacenters in Singapore, and the only one redeveloped from existing infrastructure - the other two were greenfield sites.

By last Tuesday evening, according to Alibaba Cloud, the fire alarm was still sounding. Some network equipment in the datacenter had experienced abnormalities because of high temperatures, which disrupted the network connectivity of certain cloud products.

Customers were warned of possible outages across an entire Availability Zone - in this case Zone C. Migration of systems was under way.

A few hours later, at around 1:45am on Wednesday, Alibaba Cloud reported that water from the firefighting effort had begun to accumulate and leak in the server room, posing a risk of electrical short circuits.

Not all was lost though. Later Wednesday morning, "most of the cloud product services affected by the network issues [were] restored to normal," even though the fire department was still waiting for permission to enter the server building.

Then last Friday at around 6:30pm, the hardware on the first floor was reported as "under safe migration." Assessments for starting the second-floor migration began as well.

And that leads to Saturday, when migrated equipment was "under necessary preparation work for installation" - including the process of drying it.

At the time of writing, Alibaba Cloud Health Status reported 15 services that have experienced abnormalities since the fire: Elastic Compute Service, Object Storage Service, Apsara File Storage NAS, Log Service, Elastic Block Storage, ApsaraDB RDS for MySQL, AnalyticDB for MySQL, ApsaraDB for ClickHouse, Lindorm, ApsaraDB RDS for PostgreSQL, ApsaraDB RDS for SQL Server, ApsaraDB for MongoDB, Realtime Compute for Apache Flink, Hologres and Machine Learning Platform For AI.

ActionTrail was abnormal until yesterday, but is now being reported as normal.

Some service providers that presumably rely on these services - including Lazada and TikTok parent Bytedance - were reported as experiencing "significant disruptions."

The Reg has asked Bytedance and Lazada to detail the impact of the incident on their operations and will report if a substantial reply materializes. ®

 

https://www.theregister.com//2024/09/17/alibaba_cloud_singapore_fire_recovery/
