John Lee, an IT manager at the University of Illinois’s Grainger College of Engineering, and his team responded admirably to last year’s CrowdStrike outage, strategically sending out help to get systems back online. Lee and his on-campus infrastructure and user services teams also met shortly after the incident to figure out what they could do better the next time they face 2,500 blue screens of death. In case you’ve been sleeping under a Mac for the last year, a faulty content update to CrowdStrike’s cybersecurity sensor on July 19, 2024, crashed millions of devices and disrupted organizations that require high availability, including airports, banks, and healthcare facilities. In Lee’s “after-action review,” he and fellow IT practitioners determined that they needed to establish clear incident-management roles, as well as a “command chain” for sharing information across all divisions of the organization, not just the IT department. “It took us some time to self-organize in the beginning, and we recognize that,” Lee said. How the 2024 outage shook things up.—BH