The AI revolution has many dimensions that matter for business. When much of the world’s focus is on how well AIs learn and what is being done to help them learn more, and faster, why are we talking about AI unlearning?
What Is AI Unlearning?
Unlearning in AI is when an AI model “forgets” or removes from its base of knowledge a particular data element. If unlearning is successful, the AI model will not leverage this data in any future predictions or results that it generates.
Do Humans Unlearn?
Before we examine why unlearning is hard for AI, it may be worthwhile to explore how humans unlearn. Although humans often forget, unlearning is not exactly the same as forgetting. Humans cannot forget specific elements on demand. How and what we forget is largely beyond our control. We have some limited ability to consciously remove consideration of specific information when making decisions, but even this is vague. What we have learned often affects our behavior in ways that we may not realize.
Why Unlearning Is Hard For AI
At a superficial level, unlearning is hard for AI for much the same reasons it is hard for humans. In an AI model, patterns from previously seen training data are baked into the model's coefficients (its parameters) in ways that are not obvious, particularly for the state-of-the-art models with billions to trillions of parameters. Extracting the impact of a particular data point from this tapestry of numbers is a challenge we have not yet mastered. As AI models acquire more information, they can “forget” or lose the impact of past data, but this process is not as specific or targeted as we may want. One way to unlearn perfectly is, of course, to retrain the AI from scratch with all of the data except the item we want to unlearn. This, however, is impractical given the enormous cost of AI training.
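To make the "retrain from scratch" idea concrete, here is a minimal sketch on a deliberately tiny model: a straight line fit to a handful of points. The data, the function names (`fit_line`, `unlearn_by_retraining`) and the record being forgotten are all hypothetical, chosen only for illustration. The point is that the fitted coefficients depend on every training record, so the only way this sketch removes a record's influence exactly is to drop it and fit again.

```python
# Toy illustration only: a one-feature least-squares line, fit in closed form.
# Real AI models have billions of coefficients and cannot be retrained this cheaply.

def fit_line(points):
    """Return (slope, intercept) of the least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def unlearn_by_retraining(points, to_forget):
    """Exact unlearning: drop the record and refit from scratch on what remains."""
    remaining = [p for p in points if p != to_forget]
    return fit_line(remaining)


# Hypothetical training records; the last one is the record we are asked to forget.
training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 20.0)]
record_to_forget = (4.0, 20.0)

original_model = fit_line(training_data)
retrained_model = unlearn_by_retraining(training_data, record_to_forget)

print("original  (slope, intercept):", original_model)
print("retrained (slope, intercept):", retrained_model)
```

In this toy case the refit takes microseconds. For a large model the same step means repeating the entire training run, which is why retraining is rarely a practical answer.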
A further hurdle to unlearning comes from one of AI's distinctive benefits: the ability to use one AI to create another. Through techniques such as transfer learning or model fine-tuning, an AI model that has already learned can serve as the base for a second model. While this is tremendously helpful in speeding up model creation and lowering model costs, it adds a complication: whatever the original model has not forgotten is now also in its derivative.
Why AI Unlearning Is Important For Business
Unlike humans, who rarely need to forget on demand, business applications often need to remove data cleanly and completely. Laws and regulations sometimes require the secure deletion of data records after a set period. Computer technologies from databases to storage devices have secure-deletion mechanisms designed specifically to meet regulatory and customer requirements for security and privacy. AI presents a challenge to these policies. Information that was once part of an AI's training data persists in that AI (and in any derivatives created from it) long after the information itself has been securely deleted.
Other concerns include the need to remove data that is found to be inappropriate in some way. For example, data found to be biased should be removed. In the case of Generative AI, businesses may need to remove training content if it is found to violate copyright restrictions or is determined to be fake.
What Can Your Business Do To Manage AI Unlearning?
Unlearning is a hard task. Businesses should stay on top of their AI and Data Governance to ensure that needs for unlearning are flagged and managed.
- Understand what data was used to train your AI models. If you bought your models from a vendor (or access them via a third-party API), ensure you understand who is liable for data ethics violations.
- Stay on top of the latest technology in AI unlearning. As products with unlearning capabilities roll out, this awareness will help your business understand how to integrate them into your AI pipelines.
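As one way to picture the first recommendation, the sketch below keeps a simple record of which data each model was trained on and which model it was derived from, then answers the question "if this record must be deleted, which models are affected?" Everything here is an assumption made for illustration: the `ModelRecord` structure, the `models_affected_by` helper, and the model and record names are hypothetical and do not come from any particular governance product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelRecord:
    """Hypothetical governance entry: what a model was trained on and its base model."""
    name: str
    training_record_ids: set = field(default_factory=set)
    derived_from: Optional[str] = None  # name of the base model, if fine-tuned from one


def models_affected_by(record_id, registry):
    """Return names of models that saw the record directly or inherit it from a base model."""
    affected = set()
    for name in registry:
        current = name
        visited = set()
        # Walk up the chain of base models until the record is found or the chain ends.
        while current is not None and current not in visited:
            visited.add(current)
            model = registry.get(current)
            if model is None:  # base model is not tracked in this registry
                break
            if record_id in model.training_record_ids:
                affected.add(name)
                break
            current = model.derived_from
    return sorted(affected)


# Hypothetical registry: "support-bot" was fine-tuned from "base-model".
registry = {
    "base-model": ModelRecord("base-model", {"cust-001", "cust-002"}),
    "support-bot": ModelRecord("support-bot", {"ticket-17"}, derived_from="base-model"),
    "forecast-model": ModelRecord("forecast-model", {"sales-2023"}),
}

print(models_affected_by("cust-002", registry))
```

Here a deletion request for `cust-002` flags both `base-model` and `support-bot`, even though the record was never in the fine-tuning data for `support-bot`. A registry like this does not perform any unlearning; it only tells you where unlearning would be needed.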