Future Tech

Uncle Sam opens probe into CrowdStrike turbulence at Delta Air Lines

Tan KW
Publish date: Thu, 25 Jul 2024, 08:10 AM

The US Department of Transportation (DoT) is investigating Delta Air Lines over its handling of the global IT outage caused by CrowdStrike's content update.

Delta has had a particularly rough time since Friday, consistently cancelling hundreds of flights a day. On Monday alone it cancelled 1,150 flights, and hundreds more have been delayed since last week, throwing travel plans into disarray.

"We launched an investigation today because what we've seen is a very different pattern from Delta than the other airlines," Secretary of Transportation Pete Buttigieg told media on Tuesday.

"Look, the entire global economy was affected on Friday and that's certainly included airlines around the country and around the world. But most of those airlines recovered and got back to normal within a couple of days. Delta, on the other hand, still not back to normal as of today. 

"We have received about 3,000 complaints from passengers… that is unacceptable. And we're concerned both about the delays and cancellations and about how hard it has been to get somebody on the phone.

"We launched a really new era of enforcement with the action that we took against Southwest [in 2023]. $140 million, which was a record penalty, designed to send a message to industry and to get accountability and some compensation for passengers. Of course, I can't prejudge an investigation that's just started today on the Delta side. But I will say that same high standard that we've set is going to guide us from now on."

In a xeet from his personal account, Buttigieg cited complaints from Delta customers who said they had to sleep on airport floors while waiting for updated flight information, with no access to customer service support.

Delta's most recent progress update on Tuesday was filled with "highlights" of its efforts to remediate the ongoing issues, including a 50 percent day-over-day decrease in flight cancellations, a 43 percent increase in Atlanta flight volumes compared to Monday, and a 75 percent clearance of a backlog of issues in its crew tracking system.

The airline described the days since the outage as "extraordinarily difficult" and said its "heroic" staff and around 1,500 volunteers have been working "tirelessly" to ensure service is restored in time for the upcoming weekend.

"Teams are working around the clock to reposition planes and people to where they need to be so we can return to normal operations by the end of the week," said John Laughter, chief of operations and president at Delta TechOps. "We're seeing solid day-over-day progress across operating metrics that the entire team should be proud of.

"With our collective focus, we will continue this momentum to be in good shape ahead of the busy weekend."

Delta was just one of the major US airlines severely affected by the global IT outage last Friday. Allegiant Air, American Airlines, Frontier Airlines, Spirit Airlines, and United Airlines were also among those grounded by the Federal Aviation Administration on the day, although most have largely recovered now.

(Reports that Southwest Airlines avoided the mess because it was using Windows 3.1 turned out to be based on a viral troll tweet; it's not using the 1992 Microsoft OS in this manner.)

How to recover

The early advice given to the IT admins tasked with repairing approximately 8.5 million endpoints that experienced Blue Screens of Death following the CrowdStrike content update was to boot affected machines in safe mode and delete the dodgy file that caused the mess.
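For admins scripting that clean-up step, a minimal sketch might look like the following. It is an illustration, not vendor tooling: the directory and the `C-00000291*.sys` file pattern are assumptions taken from the widely circulated workaround rather than from this article, the function names are hypothetical, and anything like it should be checked against CrowdStrike's current guidance before use. It defaults to a dry run that only lists matches.

```python
from pathlib import Path

# Assumptions (not stated in this article): the directory and file pattern
# below follow the publicly circulated workaround. Verify against the
# vendor's current guidance before relying on them.
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"


def find_faulty_channel_files(directory, pattern=PATTERN):
    """Return the files in `directory` whose names match `pattern`, sorted."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.glob(pattern) if p.is_file())


def remove_faulty_channel_files(directory, pattern=PATTERN, dry_run=True):
    """List matching files and delete them only when dry_run is False."""
    matches = find_faulty_channel_files(directory, pattern)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Dry run by default: print what would be deleted, change nothing.
    for path in remove_faulty_channel_files(DEFAULT_DIR):
        print(f"would delete: {path}")
```

The dry-run default is deliberate: on a machine already in safe mode, an admin can confirm the match list before passing `dry_run=False`.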

Microsoft Azure cloud users reported having to reboot their systems as many as 15 times before CrowdStrike's fix took effect. In any case, the methods were highly manual, which is why so many organizations have taken so long to recover.

CrowdStrike has since launched an opt-in program to have customers' endpoints restored automatically via the cloud, although the efficacy of this method varies substantially between customers.

In the past few hours, the vendor also published a preliminary postmortem of the incident. It's a fairly verbose read but, in essence, the faulty file wasn't caught because the component that's responsible for checking each detection logic template that's applied to the Falcon Sensor didn't flag one particular template before it was shipped. This meant that the issues the channel file caused weren't picked up before it was sent to customers.

Big money

Savanthi Syth, airline analyst for Raymond James, told CNN she expected Delta's costs to be in the region of $163 million as of Monday, a figure likely to rise as cancellations continue and additional payments are made to staff working overtime.

According to experts speaking to the Financial Times, many insurers are also bracing for a hefty series of payouts in the coming weeks to cover the costs of downtime, business continuity disruption, and hardware (property) damage.

Burns & Wilcox said the lower end of its estimates puts total costs at $1 billion, while others were less keen to put numbers on the damage at this early stage.

 

https://www.theregister.com//2024/07/24/transport_department_delta_probe/
