
The Data You Forgot Is the Data That Will Haunt You: Guardrails Against Silent Leaks

By Rob T. Lee, chief of research and head of faculty, SANS Institute

We talk a lot about backing up our data. But here’s the question no one likes to answer: where does data go when it’s no longer needed? It doesn’t just disappear.

What happens to the leftover data when companies upgrade systems, retire a product offering or application, or employees move on? Is it being truly deleted? Is there a record of all data employees have access to? If organizations don’t know where data is located, they can’t keep track of where it ends up.

Poor data health poses immense risks to modern organizations, and the potential costs are skyrocketing. The average cost of a data breach is $5.3 million, and 20% of organizations report paying a minimum of $250,000 in fees, sometimes just to get their data back.

To reduce potential risks, organizations need to treat data disposal with the same urgency as data protection. It starts with a few core practices.

Create a Full Data Inventory

Companies can’t secure or delete what they don’t know exists. To understand the full scope of data within an organization, start with a complete audit that tracks where data lives across cloud services, devices, backup systems, and even personal employee hardware.

Once the audit is complete, organizations can identify the next steps for organizing and managing the information appropriately. It’s also critical to know which standards or guidelines apply, whether company, government, or otherwise.
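As a rough illustration of what the first pass of such an audit might look like on a single file share, the sketch below walks a directory tree, records each file's path, size, and last-modified time, and flags files untouched for longer than a chosen threshold. The names `FileRecord`, `build_inventory`, and `stale_records` are hypothetical, and a real inventory would also need to cover cloud services, backups, and personal devices, which this does not attempt.

```python
import os
import time
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    modified_at: float  # POSIX timestamp of the last modification


def build_inventory(root):
    """Walk `root` and return one FileRecord per readable file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                info = os.stat(full_path)
            except OSError:
                # Skip files that vanish or cannot be read during the walk.
                continue
            records.append(FileRecord(full_path, info.st_size, info.st_mtime))
    return records


def stale_records(records, max_age_days, now=None):
    """Return records last modified more than `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return [r for r in records if r.modified_at < cutoff]
```

Calling `stale_records(build_inventory("/some/share"), 365)` would list files on that (assumed) path that nobody has modified in a year, which is a reasonable starting list for review rather than a deletion list.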

Understand Existing Data Regulations

While some U.S.-based organizations have policies for destroying data after employees leave, many of these guidelines are not enforced or lack clear standards. Part of the issue is that companies are not prepared to dispose of data as systematically as they create it, or they fear they may need to reference the information later.

The European Union has clear compliance requirements on this matter in the General Data Protection Regulation (GDPR), under which fines can reach up to 20 million euros or 4% of global annual revenue, whichever is higher. The United States does not have similar guidelines to protect individuals’ data, with California a notable exception: there, the California Consumer Privacy Act (CCPA) allows fines of up to $7,500 per intentional violation or $2,500 per unintentional violation.
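To make the scale of those ceilings concrete, the arithmetic can be sketched in a few lines. The function names are hypothetical, and the figures are simply the maximums quoted above, not a prediction of what a regulator would actually impose.

```python
def gdpr_max_fine_eur(global_annual_revenue_eur):
    """Upper bound quoted above: the higher of EUR 20 million or 4% of revenue."""
    four_percent = global_annual_revenue_eur * 4 / 100
    return max(20_000_000, four_percent)


def ccpa_max_fine_usd(intentional_violations, unintentional_violations):
    """Upper bound quoted above: $7,500 per intentional and $2,500 per
    unintentional violation."""
    return intentional_violations * 7_500 + unintentional_violations * 2_500
```

Under these assumptions, a company with 1 billion euros in global annual revenue faces a GDPR ceiling of 40 million euros, because 4% of revenue exceeds the 20 million euro floor. A company with 100 million euros in revenue still faces the 20 million euro figure.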

AI’s worth in an organization is rooted in how much it can learn from existing data. It is hungry for information, and many of the programs users interact with use our data for model training. The growth of AI therefore affects how organizations evaluate their existing data health and policies, while exposing gaps where their processes for securing that data fall short. Without government-led standards or guidelines to follow, it’s important to adopt an AI framework to ensure proper AI controls are in place.

Establish Data Disposal, Retention and Destruction Policies

Whether it’s an employee leaving or an entire system being decommissioned, it’s best to have a process to securely wipe or destroy data. This should be as routine as collecting a badge or laptop on an employee’s last day.

More and more, cybersecurity and HR teams collaborate on projects to ensure onboarding and offboarding are in sync. Within the greater organization, this ensures employees understand how their data is handled and improves clarity for all involved.

Similarly, organizations can use tools that enforce data retention rules to implement destruction policies and manage the associated workflows. Data that’s no longer needed shouldn’t stick around just in case. It’s important to automate secure deletion timelines and assign sensitive data a digital expiration date.
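One way to picture a digital expiration date is as a retention period attached to each category of data, checked on a schedule. The sketch below is a minimal illustration under that assumption: the categories and periods in `RETENTION_DAYS` are invented for the example, and the code only identifies what is due for deletion. Securely wiping the data is a separate step that depends on where it is stored.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days; real values come from policy and law.
RETENTION_DAYS = {
    "hr_record": 365 * 7,
    "marketing_export": 90,
    "debug_log": 30,
}


@dataclass
class DataAsset:
    name: str
    category: str
    created_on: date


def expiration_date(asset):
    """Date on which the asset's retention period ends."""
    return asset.created_on + timedelta(days=RETENTION_DAYS[asset.category])


def due_for_deletion(assets, today):
    """Return assets whose expiration date is on or before `today`."""
    return [a for a in assets if expiration_date(a) <= today]
```

Run daily against the data inventory, a check like `due_for_deletion(assets, date.today())` would produce the list that an automated deletion workflow acts on, so that nothing lingers simply because no one remembered it.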

As AI becomes more embedded in day-to-day operations, organizations need stronger data boundaries. Otherwise, outdated or forgotten information could shape important decisions, slip into training sets, or become the weak spot a threat actor uses to get into an organization. Prevent this by treating data lifecycle management as part of the cybersecurity posture, not just an IT checklist item.

Communicate Clearly to Relevant Stakeholders

As data practices or AI policies change, it’s critical to share new information with relevant stakeholders to keep them informed. Just last month, 23andMe declared bankruptcy, creating mass confusion and a flurry of panic among those who had shared DNA information with the company. For many, data sharing is already overwhelming and feels impossible to wrangle, which is why relaying information effectively is vital.

Of course, messaging may look different for each type of stakeholder. For instance, customers will want to receive information differently than employees or investors, so it’s wise to tailor updates with that in mind. If needed, bring in external parties who can provide guidance or support.

Forgotten Data is Out There – And It’s Very Much Recoverable

Back in the day, I used to recover data from used hard drives. What I found were full digital lives left behind. I would discover tax records, business plans, medical information – you name it. It was all still there, unsecured, and forgotten.

Today, this situation has worsened. In the age of AI, our data lives everywhere. It’s in the cloud, on laptops, phones, USBs, shared drives, and devices we barely think about. The risk isn’t limited to old hardware sitting in a closet. It’s the active and forgotten data sitting on systems we no longer track.

One-off days like World Backup Day or Data Privacy Day serve as timely reminders to follow good data practices, but these days are not enough. Secure data hygiene is an ongoing, day-to-day practice. After all, forgotten data is still out there, and it’s still very much recoverable – by you or bad actors.
