IT outage

Old 07-22-2024, 05:11 AM
  #141  
On Reserve
 
Joined APC: Jul 2024
Posts: 12
Default

Originally Posted by 170Till5
bro, you’re an idiot. The chief pilot office in ATL made an announcement over the loudspeaker that if you are willing to fly, come into the office. The coverage ladder is out the window with this. GA about to call in the National Guard to airlift people out of ATL 😂… that’s how awful Delta is doing.

But to each their own. I’m done with you all on this conversation. See what you want to see.
Sometimes you have to let it all burn down in order to build it back up stronger. I don't see Ed Bastian loading bags on the ramp or rebooking customers so don't expect me to go out of my way and do the job that someone else should be doing.
CloudMonkey is offline  
Old 07-22-2024, 05:26 AM
  #142  
Gets Weekends Off
 
Joined APC: Feb 2007
Position: Big ones
Posts: 774
Default

Originally Posted by CloudMonkey
Sometimes you have to let it all burn down in order to build it back up stronger. I don't see Ed Bastian loading bags on the ramp or rebooking customers so don't expect me to go out of my way and do the job that someone else should be doing.
And yet no trips are covered using the prefix C. Scheduling has the tools to rapidly fix things today in order to buy time to get better for tomorrow. I wonder why they’re not doing it.
tripled is offline  
Old 07-22-2024, 05:29 AM
  #143  
Gets Weekends Off
 
Joined APC: Jul 2010
Posts: 3,371
Default

Originally Posted by Hotel Kilo
That's not the solution they are looking at. A simple rules change to permit commuters to NOT have to get in contact with CS or PA to book a seat on a capacity flight is an easy fix.

DOT has declared this a "controllable" incident, which means airlines are on the hook.

I fail to see how they concluded this when a buggy software update from CrowdStrike was the causal factor. The updates are automatically pushed with no testing done prior to their deployment. That's a huge foul. We've got thousands of affected computers and servers that have required a lengthy manual reboot process to un f#@$& this mess that they (CrowdStrike) are responsible for.
Is CrowdStrike the issue here? Or how the company uses their application?

Everyone else recovered already.

The PS solution works. But the company doesn’t know where the crews are. More bases = more Gs can go out where people leave.
PilotJ3 is offline  
Old 07-22-2024, 05:36 AM
  #144  
Gets Weekends Off
 
Joined APC: Jan 2023
Posts: 1,520
Default

Originally Posted by PilotJ3
Is CrowdStrike the issue here? Or how the company uses their application?

Everyone else recovered already.

The PS solution works. But the company doesn’t know where the crews are. More bases = more Gs can go out where people leave.
CrowdStrike is the "third party vendor" referred to in prior news releases from the company. It wasn't until yesterday that we named them in our messaging.

CrowdStrike is a contractor to Microsoft. CrowdStrike routinely pushes updates to their piece of the Microsoft puzzle. The one they pushed in the early morning (here in the US) of July 19th is the culprit. It hit Europe first. We knew about it, but by the time we had to react the updates were already deployed to the Microsoft systems we employ.

So yes, the buggy update pushed automatically by CrowdStrike is what's responsible for bricking our computers and servers (gate agent computers, crew tracking and scheduling tools, etc.). All were affected.

To your PS statement: not really. The common scenario lately has been crew booked PS, then that flight cancelled. So now they are trying to book another PS flight at close range, but they can't because that flight is at capacity and travel net will not allow it. Now they have to reach out to CS or PA to get them to make the PS listing for them. This takes hours to do, and by then the flight they were trying to PS on has departed. Gate agents can't put you on there PS unless they get the OK from Mecca.

So, the simple solution is to remove the rules on booking PS on capacity flights when it's invoked during IROP. Or at least give gate agents the ability to list you at the gate.
Hotel Kilo is offline  
Old 07-22-2024, 05:42 AM
  #145  
Gets Weekends Off
 
Joined APC: Jul 2022
Posts: 930
Default

Originally Posted by Hotel Kilo
CrowdStrike is the "third party vendor" referred to in prior news releases from the company. It wasn't until yesterday that we named them in our messaging.

CrowdStrike is a contractor to Microsoft. CrowdStrike routinely pushes updates to their piece of the Microsoft puzzle. The one they pushed on the morning (here in the US) of July 19th at around midnight is the culprit. It hit Europe first. We knew about it, but by the time we had to react the updates were deployed to the Microsoft systems we employ.

So yes, the buggy update pushed automatically by CrowdStrike is what's responsible for bricking our computers and servers (gate agent computers, crew tracking and scheduling tools, etc.). All were affected.
It’s fair to blame CrowdStrike for Friday’s initial disruption. However, Saturday, Sunday, and now Monday fall squarely on Delta. Our infrastructure and recovery capability are severely lacking, as this management team has always been more focused on optics than substance.

Our competitors are blowing us away with their recovery efforts.
ancman is offline  
Old 07-22-2024, 05:47 AM
  #146  
Gets Weekends Off
 
Joined APC: Jan 2023
Posts: 1,520
Default

Originally Posted by ancman
It’s fair to blame CrowdStrike for Friday’s initial disruption. However, Saturday, Sunday, and now Monday fall squarely on Delta. Our infrastructure and recovery capability are severely lacking, as this management team has always been more focused on optics than substance.

Our competitors are blowing us away with their recovery efforts.
I think that's being a bit disingenuous. In order to "recover" we had to manually reboot thousands of computers (gate agent, load, crew tracking, scheduling, etc.) all over the world. The manual reboot process is lengthy. It requires IT folks to accomplish it one machine/server at a time.

That is why we have taken some time to get back on step.

Point fingers at the cause.

Now, can we improve comms between those out on the line and Mecca? Yes. This is like my 7th major meltdown and I see the same thing every time. We can do better at getting robust and timely comms to and from our personnel out on the line. RA even talked about it many years ago, but here we are.
Hotel Kilo is offline  
Old 07-22-2024, 05:50 AM
  #147  
Line Holder
 
Joined APC: Sep 2022
Posts: 95
Default

Originally Posted by Hotel Kilo
I think that's being a bit disingenuous. In order to "recover" we had to manually reboot thousands of computers (gate agent, load, crew tracking, scheduling, etc.) all over the world. The manual reboot process is lengthy. It requires IT folks to accomplish it one machine/server at a time.

That is why we have taken some time to get back on step.

Point fingers at the cause.

Now, can we improve comms between those out on the line and Mecca? Yes. This is like my 7th major meltdown and I see the same thing every time. We can do better at getting robust and timely comms to and from our personnel out on the line. RA even talked about it many years ago, but here we are.
Didn't UAL and AAL have to do the same fix/reset?
SoloPilot is offline  
Old 07-22-2024, 05:50 AM
  #148  
Line Holder
 
Joined APC: Feb 2020
Posts: 84
Default

Originally Posted by Hotel Kilo
CrowdStrike is the "third party vendor" referred to in prior news releases from the company. It wasn't until yesterday that we named them in our messaging.

CrowdStrike is a contractor to Microsoft. CrowdStrike routinely pushes updates to their piece of the Microsoft puzzle. The one they pushed in the early morning (here in the US) of July 19th is the culprit. It hit Europe first. We knew about it, but by the time we had to react the updates were already deployed to the Microsoft systems we employ.

So yes, the buggy update pushed automatically by CrowdStrike is what's responsible for bricking our computers and servers (gate agent computers, crew tracking and scheduling tools, etc.). All were affected.
This part is not correct. CrowdStrike is a corporate MDR solution (think antivirus on steroids). They have nothing to do with Microsoft or Windows updates. Corporate security/IT chose this particular piece of software because they are the market leader, with somewhere around 25% share. The problem was caused by CrowdStrike themselves, who pushed out a faulty update. How this happened we won't know until an official post-mortem report is written. Modern MDR/antivirus solutions run at the Ring 0 (kernel) level of the operating system, which is the highest level of access for Windows and thus the most dangerous when things go wrong. The software does, however, need to live there to operate sufficiently and do the things it needs to do to stop threat actors.

The preliminary information that has come out so far is that the file CrowdStrike pushed contained nothing but zeros. Because this file lived at such a low level in the operating system, it caused Windows to boot loop. How this file came to contain no information and why it was not caught in QA before being pushed is yet to be determined.
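
As a rough illustration of the kind of check that would flag a file like the one described in those preliminary reports, here is a minimal Python sketch that reads a file in chunks and reports whether every byte is zero. The function name `is_all_zero_bytes` and the chunk size are made up for this example; it is an assumption-laden sketch and says nothing about how CrowdStrike's actual build or QA pipeline works.

```python
import os
import tempfile


def is_all_zero_bytes(path, chunk_size=65536):
    """Return True if the file is empty or every byte in it is 0x00.

    Hypothetical helper for illustration only; not part of any vendor tool.
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file without seeing a non-zero byte.
                return True
            # Iterating over bytes yields ints, so any() is True
            # as soon as one byte is non-zero.
            if any(chunk):
                return False


if __name__ == "__main__":
    # Build two throwaway files: one zero-filled, one with real content.
    with tempfile.TemporaryDirectory() as tmp:
        zeros = os.path.join(tmp, "zeros.bin")
        data = os.path.join(tmp, "data.bin")
        with open(zeros, "wb") as f:
            f.write(b"\x00" * 4096)
        with open(data, "wb") as f:
            f.write(b"\x00" * 4095 + b"\x01")
        print(is_all_zero_bytes(zeros))  # True
        print(is_all_zero_bytes(data))   # False
```

A check like this only catches the single failure mode mentioned above. It says nothing about whether a file with non-zero contents is otherwise valid.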

The reason why this is so devastating is that it requires physical access to each machine in order to remove the faulty file, unless the machine has out-of-band management (Intel vPro/IPMI comes to mind, but this has vulnerability issues of its own). BitLocker (which is a Microsoft feature for Windows) comes into play in all this as a device encryption feature. It prevents someone from removing the drive and placing it into another computer, or booting from a USB drive and manipulating the underlying Windows OS (removing passwords, etc.). However, BitLocker is actually doing what it's designed to do in this case and isn't a culprit. The BitLocker recovery key is needed to get to a command prompt in recovery mode in order to remove the faulty CrowdStrike update.

TLDR: This is entirely on CrowdStrike. They are a third party vendor. Nothing to do with Microsoft for a change.

Originally Posted by SoloPilot
Didn't UAL and AAL have to do the same fix/reset?
Backend systems are different. Rumor on the block is Crew360 is the cause of our pain. I don't know much about the software or its backend database, but apparently this is our Achilles' heel at the moment.

Last edited by Transit; 07-22-2024 at 05:57 AM. Reason: Spelling
Transit is offline  
Old 07-22-2024, 05:51 AM
  #149  
Gets Weekends Off
 
Joined APC: Jul 2022
Posts: 930
Default

Originally Posted by Hotel Kilo
I think that's being a bit disingenuous. In order to "recover" we had to manually reboot thousands of computers (gate agent, load, crew tracking, scheduling, etc.) all over the world. The manual reboot process is lengthy. It requires IT folks to accomplish it one machine/server at a time.

That is why we have taken some time to get back on step.

Point fingers at the cause.

Now, can we improve comms between those out on the line and Mecca? Yes. This is like my 7th major meltdown and I see the same thing every time. We can do better at getting robust and timely comms to and from our personnel out on the line. RA even talked about it many years ago, but here we are.
United had the same problem. They’re WAY ahead of us with their recovery.

The continued abysmal recovery lies squarely on Delta. We lack system redundancy, IT personnel, OCC personnel.

None of that matters to management, as they believe that our customers will continue to pay a premium to fly on us if we wear hats and stand in the way saying goodbye during deplaning.
ancman is offline  
Old 07-22-2024, 06:11 AM
  #150  
Gets Weekends Off
 
Joined APC: Jan 2023
Posts: 1,520
Default

Originally Posted by SoloPilot
Didn't UAL and AAL have to do the same fix/reset?
They are not as fully infested with Microsoft as we are. That's why they got up and running quicker.
Hotel Kilo is offline  