With ‘shadow’ data, breach costs get even darker

IBM’s report says “unmanaged” assets lead to breaches adding up to over $5 million in losses.

Image credit: Mark Edward Atkinson/Getty Images

August 5, 2024

Billy Hurley has been a reporter with IT Brew since 2022. He writes stories about cybersecurity threats, AI developments, and IT strategies.

IBM released its annual calculation of the average data-breach cost: $4.88 million in 2023—a 10% spike from the previous year’s figure. And this year, the company measured a murky metric that adds more moolah to the figure: shadow data.

“Shadow data is that data that an organization needs to keep track of and should be aware of, but isn’t,” Sam Hector, global strategy leader at IBM Security, told IT Brew, citing examples like uploads to unsanctioned cloud services and storage on personal drives and public repositories, such as GitHub.

When this unmanaged, invisible-to-IT data is involved in a breach, the cost rises to $5.27 million, 16.2% higher than the average cost without shadow data. Shadow data featured in 35% of breaches, according to IBM’s study of 604 organizations that suffered a data breach between March 2023 and February 2024.

Hector spoke with IT Brew about why costs rise when data goes dark.

Responses below have been edited for length and clarity.

What leads to “shadow data”?

Shadow data is primarily caused by the huge adoption of hybrid clouds. Companies are adopting public cloud services to gain cost efficiencies and to cope efficiently with spikes and drops in demand, as opposed to controlling all of the infrastructure themselves. But that causes the vast majority of organizations to have to deal with probably three different cloud providers. And if you look at the way most organizations are using technology now, they’ve not only got these cloud environments and on-premises environments, but they’ve also got SaaS applications like Slack, Microsoft Teams, or Workday, where sensitive data is proliferating across multiple platforms.

What is it about a shadow-data compromise that leads to higher costs?

If you have unmanaged data on platforms, and an attacker finds out about it and is able to take advantage of it and exfiltrate it, then if you’re not aware of it being sensitive, you’re often not aware of it being breached. So, if an employee has uploaded something to a public cloud platform, and it’s compromised and causes a data breach, it takes a lot longer to actually identify how that’s happened, when it’s happened, and what needs to happen now in order to recover from that breach.

How does one find one’s shadow data?

You effectively point a tool at all of your different environments that you are aware of. And then it goes out and crawls those environments and scans them for things like credit card information and Social Security numbers, and pulls back all of those locations where it’s stored, and says, “Actually: The ones that you need to pay attention to are there,” because it ranks them according to risk… That category of product is called DSPM (data security posture management).
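
For readers who want a concrete picture of the scan-and-rank idea Hector describes, here is a minimal sketch in Python. It is an illustration only, not how any particular DSPM product works: the regular expressions, the risk weights, and the names scan_locations and Finding are all assumptions made for this example, and real tools rely on far more than pattern matching.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for this sketch; production scanners use richer detection.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits, optional separators
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")      # US Social Security number format

# Assumed weights: how much each kind of hit adds to a location's risk score.
CARD_WEIGHT = 2
SSN_WEIGHT = 3


def luhn_valid(number: str) -> bool:
    """Return True if the digits pass the Luhn checksum used by payment cards."""
    digits = [int(c) for c in number if c.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return len(digits) >= 13 and checksum % 10 == 0


@dataclass
class Finding:
    location: str
    card_numbers: int
    ssns: int

    @property
    def risk_score(self) -> int:
        return self.card_numbers * CARD_WEIGHT + self.ssns * SSN_WEIGHT


def scan_locations(locations: dict[str, str]) -> list[Finding]:
    """Scan each location's text and return findings, highest risk first."""
    findings = []
    for name, text in locations.items():
        cards = sum(1 for m in CARD_RE.finditer(text) if luhn_valid(m.group()))
        ssns = len(SSN_RE.findall(text))
        if cards or ssns:
            findings.append(Finding(name, cards, ssns))
    return sorted(findings, key=lambda f: f.risk_score, reverse=True)


if __name__ == "__main__":
    # Made-up sample data standing in for crawled environments.
    sample = {
        "slack-export.txt": "lunch at noon?",
        "shared-drive/customers.csv": "Jane,123-45-6789,4111 1111 1111 1111\nBob,987-65-4321",
        "github/config.md": "test card 4111-1111-1111-1111",
    }
    for finding in scan_locations(sample):
        print(finding.location, finding.risk_score)
```

In this sketch, locations with no sensitive matches are dropped and the rest are sorted by score, mirroring the "ones that you need to pay attention to" ranking described above.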

What are some data recommendations to avoid shadow data breaches?

The maximum impact you can make, according to our report this year, is educating your staff on data handling and security practices and the regulations that they need to abide by and the policies they need to adhere to…Identity and access management is a key factor as well, in terms of controlling who can access that data and when they can access it.
