Why trust a model's explanation?
Do you just trust anyone at their word?
Then why trust a model's explanation? That question came up in a conversation recently.
I don't disagree. When I first considered explainability as an AI risk management control, it seemed hopeless.
Even for simpler machine learning models, established post-hoc methods like SHAP and LIME can be unstable. Unfaithful to what the model actually does. Sometimes outright misleading.
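To make that concrete, here's a minimal sketch of the instability, assuming scikit-learn and the `lime` package; the dataset and model are illustrative stand-ins, not anyone's production setup.

```python
# Sketch: explain the exact same prediction twice with LIME and compare.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Illustrative data and model (assumptions, not a real use case).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
explainer = LimeTabularExplainer(X, mode="classification")

# LIME fits a local surrogate on randomly sampled perturbations, so without
# a fixed seed the feature weights, and sometimes the rankings, can differ
# from run to run on the same instance.
for run in range(2):
    exp = explainer.explain_instance(X[0], model.predict_proba, num_features=3)
    print(f"run {run}:", exp.as_list())
```

Fixing LIME's `random_state` makes the output reproducible. But reproducible is not the same as faithful.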
While there are interpretable machine learning models, you don’t always get to choose.
And once we move to deep learning models, Generative AI, or AI agents, the black box looks more like a black hole.
But as time passed, I realized there was another way of looking at this.
Explainability isn't meant to stand alone.
No AI risk management control is, whether in ISO 42001, the NIST AI Risk Management Framework, or the Singapore AI risk management guidelines that I wrote.
Think about how you actually trust someone at work. You don't just take their word. You check if their reasoning makes sense for the decision at hand. You notice if they ignore evidence that contradicts them. You watch whether their judgment holds up over time.
It's the same with AI risk management. Explainability alone is not the be-all and end-all. Most guidelines have additional provisions that interlock with it.
The key ones for explainability (in my view):
1️⃣Fit for purpose.
An explanation isn't good or bad in the abstract. It depends on what you need. A fraud analyst needs something different from a customer asking why they got declined. AI used for internal process automation may not need any explanation at all. Same model, different audiences, different standards. Like how you'd explain a medical diagnosis differently to a fellow doctor versus your worried parent.
2️⃣Selected carefully.
When we choose a model or data for a problem, the appropriate explainability method is part of that same selection, right down to which features go into the data. You wouldn't design a building and think about the fire escape as an afterthought. It's part of the architecture. Same here. How to explain isn't an add-on. It's a design choice.
3️⃣Evaluated and tested.
Explainability is part of the system. You evaluate and test whether it actually works in your context, not just whether it produces output. A smoke detector that beeps isn't the same as one that detects smoke. You test the thing, not just that it makes noise.
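To make that concrete, one simple test is a faithfulness check: mask the feature the explainer ranks highest, and see whether the prediction moves more than when you mask a random feature. A minimal sketch, reusing `model`, `X`, and `explainer` from the earlier snippet; masking to the dataset mean and sampling 50 instances are illustrative choices, not a standard.

```python
import numpy as np

rng = np.random.default_rng(0)

def confidence_drop(x, idx):
    """Change in positive-class probability when feature `idx` is
    masked to its dataset mean."""
    masked = x.copy()
    masked[idx] = X[:, idx].mean()
    p = lambda v: model.predict_proba(v.reshape(1, -1))[0, 1]
    return abs(p(x) - p(masked))

top_drops, rand_drops = [], []
for i in range(50):
    exp = explainer.explain_instance(X[i], model.predict_proba, num_features=1)
    top_idx = exp.as_map()[1][0][0]  # index of the top-weighted feature
    top_drops.append(confidence_drop(X[i], top_idx))
    rand_drops.append(confidence_drop(X[i], rng.integers(X.shape[1])))

# If the explanations are faithful, masking the "most important" feature
# should move predictions clearly more than masking a random one.
print("mean drop, top feature:   ", np.mean(top_drops))
print("mean drop, random feature:", np.mean(rand_drops))
```

It's a crude check, not a certification. But "the explainer produced output" and "the output reflects the model" are different claims, and only the second earns trust.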
And there's more, such as having the right capability to interpret the explanations. But that's for another post, on human oversight, which also interlocks.
The black hole doesn't disappear. But you're no longer staring into the abyss.
What other AI risk controls seem hopeless in isolation? I’ll dive into them.
#AIRiskManagement #Explainability #AIGovernance


