Collecting data allows businesses to glean essential information for improving processes, reducing costs, better serving customers and more. However, it also comes with risk: Every organization that collects data, no matter its size, is a target for hackers. Further, new and evolving government regulations, including the EU’s General Data Protection Regulation and the California Consumer Privacy Act, are placing new responsibilities and restraints on companies when it comes to the collection and use of data.
Organizations across industries are exploring data minimization initiatives that are focused not only on reducing the volume of data they already hold, but also on collecting less new data going forward. Below, 20 members of Forbes Technology Council share practical, effective data minimization strategies that can be leveraged by companies in a variety of industries and relate success stories they’ve overseen themselves.
1. Create A Data Map
To minimize the amount of data you hold, identify where critical and sensitive information is stored. A data map showing storage locations and security patterns helps determine what to retain and how to manage it under a retention plan. Watching one organization create overlapping data management plans for multiple regulators showed me the importance of cross-planning for regulatory compliance and future training. – Kathleen Hurley, Sage, Inc.
2. Adopt Data Mesh And Data Fabric Architectures
Data overload can be addressed with two emerging data architecture paradigms: data mesh and data fabric. Data mesh offers decentralized data management, so that each domain collects, processes and retains only the data it needs. Data fabric offers unified data access and management, which reduces the need for multiple copies of data and so minimizes duplication and storage overhead. – Suri Nuthalapati, Cloudera
3. Pare Down Nonessential Information
We helped a retail store clean up its customer data by keeping only essential information, such as names and contact details, while ensuring it’s all securely protected. This not only saved the store money on storing data, but also enhanced the security of customers’ personal information. By following strict rules on data usage, we improved customer service without compromising privacy. – Balasubramani Murugesan, Digit7
4. Keep Regulatory Requirements In Mind
With the introduction of the GDPR, I saw data volumes growing significantly in retail banking. Collaborating with data protection officers, we have led several simplification initiatives to strike a balance between meeting legal requirements and storing necessary data. We became very precise in our classifications, taking care to distinguish which types of data must be stored for a fixed period (for example, 10 years) rather than just keeping all of the information collected. – Luboslava Uram, Solvd Group
5. Leverage Causality Analysis
Leveraging causality analysis in our AI work has significantly reduced the amount of data required to surface intelligent recommendations regarding a range of decisions faced by a Fortune 100 company. With less data, we can still achieve useful and actionable insights regarding manufacturing operations, potential market opportunities, customer service optimization and culture improvement. – Pravir Malik, QIQuantum
6. Take A Multitiered Approach
I worked with a financial services company to reduce data overload. The idea is to minimize sensitive data collection, storage and processing to reduce breach risks and improve data management efficiency. We classified data, eliminated unnecessary collection points, enforced strict collection and retention policies, and applied data masking and anonymization, resulting in a 40% decrease in stored data. – Dutt Kalluri, Celsior Technologies
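As a rough illustration of the masking and generalization steps mentioned above, here is a minimal Python sketch. The function names and the choice of fields (an account number and a birth date) are assumptions made for the example, not details of the project described.

```python
def mask_account(number: str, visible: int = 4) -> str:
    """Hide all but the last `visible` digits of an account number."""
    digits = number.replace(" ", "").replace("-", "")
    hidden = max(len(digits) - visible, 0)
    return "*" * hidden + digits[-visible:]


def generalize_birth_date(iso_date: str) -> str:
    """Reduce a YYYY-MM-DD birth date to the year only."""
    return iso_date[:4]


masked = mask_account("1234-5678-9012-3456")
birth_year = generalize_birth_date("1987-04-23")
```

Masking keeps a value recognizable to staff without exposing it in full, while generalization stores a coarser value that is still useful for analysis.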
7. Reduce The Amount Of Personal Data Collected
We have significantly reduced the amount of personal data we collect from users as part of our data minimization strategy. This not only enhanced user privacy, but also improved data security, leading to the near elimination of incidents. The strategy also fostered greater trust among clients, aligned with broader regulatory compliance and improved overall customer satisfaction. – Michael Beygelman, Claro Analytics
8. Implement Schema-Based Data Generation
Implementing schema-based (structured) data generation from end-user devices has greatly decreased the compute power needed for processing and minimized the amount of derived data that must be stored before gold data can be made available to data engineers and scientists. This approach has reduced our storage costs by 40%, improved our GDPR and CCPA compliance and bolstered customers’ trust. – Ravi Bandlamudi, AtoB
9. Eliminate Duplicate And Outdated Data
Data cleansing is key for organizations looking to minimize the amount of data they’re storing. One life sciences company we worked with had accumulated diverse legacy systems. They needed to bring all their data together in one single SAP ECC instance. We helped them eliminate duplicate data and remove outdated and unnecessary data before migrating to the new, unified system. – Kevin Campbell, Syniti
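A pre-migration deduplication pass might look something like the following Python sketch. The record layout, the use of a normalized email address as the matching key, and the "keep the most recently updated row" rule are all assumptions for illustration.

```python
def dedupe_customers(rows):
    """Keep one row per normalized email, preferring the latest update."""
    latest = {}
    for row in rows:
        key = row["email"].strip().lower()
        # ISO-format dates compare correctly as plain strings.
        if key not in latest or row["updated"] > latest[key]["updated"]:
            latest[key] = row
    return list(latest.values())


rows = [
    {"email": "Ana@Example.com", "updated": "2019-02-01"},
    {"email": "ana@example.com ", "updated": "2023-07-15"},
    {"email": "li@example.com", "updated": "2021-11-30"},
]
clean = dedupe_customers(rows)
```

In practice the matching key is the hard part: real records often need fuzzy matching on names and addresses rather than a single exact field.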
10. Implement A Fixed-Term Retention And Purge Policy
Internally, we’ve recently implemented a seven-year data retention and purge policy. We’d had veteran employees with files from the 2000s! Our customer project folders were just as old. Keeping the old data made the risk of having to disclose any breach 300% higher than it is now. Users initially resisted, but with some change management principles, we purged all but some HR and financial data with no complaints. – Chris Stegh, eGroup | Enabling Technologies
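To make the idea concrete, a retention check along these lines could be sketched in Python as follows. The record fields, the category labels and the exemption for HR and finance records are hypothetical stand-ins; a real policy would come from legal and compliance teams.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7 * 365)  # roughly seven years; leap days ignored
EXEMPT_CATEGORIES = {"hr", "finance"}  # hypothetical labels for records kept longer


def select_for_purge(records, now=None):
    """Return records older than the retention window that are not exempt."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - RETENTION
    return [
        r for r in records
        if r["last_modified"] < cutoff and r["category"] not in EXEMPT_CATEGORIES
    ]


records = [
    {"path": "projects/2009-site-plan.doc", "category": "project",
     "last_modified": datetime(2009, 5, 1, tzinfo=timezone.utc)},
    {"path": "hr/2008-review.doc", "category": "hr",
     "last_modified": datetime(2008, 3, 1, tzinfo=timezone.utc)},
    {"path": "projects/2023-proposal.doc", "category": "project",
     "last_modified": datetime(2023, 1, 15, tzinfo=timezone.utc)},
]
to_purge = select_for_purge(records, now=datetime(2024, 6, 1, tzinfo=timezone.utc))
```

Separating selection from deletion lets the list be reviewed, for example by the file owners, before anything is actually removed.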
11. Substitute Sensitive Elements With Tokens
In digital advertising, data plays a crucial role, and the main thing is to collect only the data you need. I would also recommend that businesses substitute sensitive elements with tokens. Because a token has no mathematical relationship to the value it replaces, it cannot be reversed without access to the tokenization system, which makes the overall system more secure and benefits both an organization and its customers. – Roman Vrublivskyi, SmartHub
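For readers unfamiliar with tokenization, the following Python sketch shows the basic mechanism under simplifying assumptions. The `TokenVault` class is hypothetical and holds its mapping in memory; a production vault would be a hardened, access-controlled service.

```python
import secrets


class TokenVault:
    """Toy in-memory vault mapping random tokens to sensitive values."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)  # random, not derived from value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._token_to_value[token]


vault = TokenVault()
event = {"email": "jane@example.com", "campaign": "spring-sale"}
safe_event = {**event, "email": vault.tokenize(event["email"])}
```

Downstream systems can store and join on `safe_event` without ever seeing the email address; only code with access to the vault can recover it.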
12. Cull Fields On Intake Forms
We decided to cull some of the fields on certain intake forms and leads lists to enhance productivity. The results were lead lists that are easier and faster to parse and enhanced production across the board. Sometimes, simple is just better. – Michael Gargiulo, VPN.com
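One simple way to enforce a trimmed-down form on the server side is an allowlist, sketched here in Python. The field names are hypothetical examples, not the fields the author's team kept.

```python
ALLOWED_FIELDS = {"name", "email", "company"}  # hypothetical essential fields


def minimize(submission: dict) -> dict:
    """Drop every field that is not explicitly allowed."""
    return {k: v for k, v in submission.items() if k in ALLOWED_FIELDS}


raw = {
    "name": "Sam Lee",
    "email": "sam@example.com",
    "company": "Acme",
    "birthday": "1990-01-01",
    "home_address": "1 Main St",
}
lead = minimize(raw)
```

An allowlist fails safe: a new field added to the form is discarded until someone deliberately decides it is needed.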
13. Audit, Anonymize And Automate
At a financial company, I led a data minimization initiative that involved auditing and anonymizing data and automating data purging. This enhanced data security, lowered storage costs, improved compliance and customer trust, and boosted operational efficiency. The initiative protected the company from security threats and compliance issues, benefiting both the company and its customers. – Sumit Bhatnagar, JP Morgan Chase
14. Remember That Quality Trumps Quantity
AI requires huge amounts of data, but as always, quality trumps quantity. By looking hard at data quality, one computer vision company managed to reduce the number of images needed to train models by about 16%. This was done by removing similar and poor-quality images and increasing the annotation quality. The result was not only better performance, but also cheaper and faster AI training. – Erik Aasberg, eSmart Systems
15. Use ML To Filter Unnecessary Data
We conducted frequent data audits, redesigned our collection policies and implemented machine learning algorithms to filter unnecessary data. These efforts resulted in a 25% improvement in system performance, enhanced data security, increased client satisfaction and a 20% reduction in data management costs. – Ketan Anand, Suuchi
16. Break Down Department Silos
I worked with a government that wanted to consolidate spending from all departments into a centrally managed procurement dashboard. The issue was data proliferation: Each department maintained its own data, and there was no record management. By breaking down these silos so that all the data could be analyzed in one place, the government realized over $1 million in savings in the first three months. – Lewis Wynne-Jones, ThinkData Works
17. Foster A Culture Of Privacy And Responsibility
A data minimization initiative should include fostering a culture of privacy and responsibility within the company. Employees must become more aware of the importance of data privacy and their role in protecting sensitive information, as this will lead to better data-handling practices across all departments. Automation tools can streamline compliance checks and data management processes. – Roman Reznikov, Intellias
18. Think Of Data As A ‘Toxic Asset’
Our policy is to treat data as a “toxic asset.” This is a helpful concept to keep in mind when dealing with data. Just as we would with a toxic substance, we aim to minimize the handling of data, limit the number of people who come into contact with it, decrease the amount of time we retain it, and reduce exposure. – M. Nash, Integry
19. Focus On Collecting Data Only For Specific, Defined Purposes
We have benefited the most by optimizing the user analytics process. We revised our data collection practices by implementing a strategy of collecting only the essential data points needed for performance analytics and user experience improvements. This has reduced the amount of data stored and enhanced security, regulatory compliance and customer trust. – Phil Portman, Textdrip
20. Implement Edge Devices To Minimize Data Sent To The Cloud
Manufacturing generates 18PB of data annually. We have implemented edge devices to normalize, contextualize and detect anomalies, sending only useful data to the cloud. This reduces costs and avoids overprocessing—for example, when detecting anomalies in images or video, only selected files are sent for further analysis and model training. It has also significantly optimized data handling and cost efficiency. – Ravi Soni, Amazon Web Services
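The edge-filtering idea can be illustrated with a small Python sketch. It assumes a simple three-standard-deviation rule against a baseline of known-good sensor readings; this is a stand-in chosen for brevity, not the detection method described above, and the readings are invented.

```python
from statistics import mean, stdev


def make_filter(baseline, threshold=3.0):
    """Build a check that flags readings far from the baseline mean."""
    mu, sigma = mean(baseline), stdev(baseline)

    def is_anomalous(x):
        return abs(x - mu) > threshold * sigma

    return is_anomalous


# Baseline collected during normal operation (illustrative values).
baseline = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 19.8, 20.0, 20.2]
is_anomalous = make_filter(baseline)

incoming = [20.0, 20.4, 35.0, 19.9, 12.5]
to_upload = [x for x in incoming if is_anomalous(x)]  # only these leave the device
```

Here only two of the five readings would be sent to the cloud; the rest are handled or discarded locally.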