Why I need car repairs, falling apples, and paracetamol to make the AI black box explainable.
People want to understand the world: that gives confidence and predictability. What we now call the “natural sciences” stands in a long tradition. In old stories you read about the spirit of the river, which has become angry and therefore caused the river to flood. And there is thunder: Thor strikes the clouds with his hammer.
After these first explanations, which we would nowadays call “magical thinking,” the scientific method emerged. Through trial and error this has become a fairly successful way of understanding nature. On the basis of that understanding, humans have built all kinds of technology: steam engines, smartphones, and air fryers. And AI.
With AI we have overplayed our hand a little: even though we invented it all ourselves, we actually don’t understand how it works. As a result, AI evokes doubt and sometimes fear. Just like the flooding river or the thunderstorm thousands of years ago. Can we still explain what happens inside the black box of AI? Can AI perhaps explain that itself?
Try explaining your own choices
Whether AI “thinks” or not, it is interesting to look at how people explain their own thoughts. Our own thoughts are not always explainable either. If someone asks me why I drive an Opel instead of a Hyundai, I am not going to say: “Neuron 6412 in my amygdala had a dopamine level that was 26% above the threshold, causing the ventromedial prefrontal cortex to receive a positive signal and the dorsolateral cortex to decide that Opel was the best choice.” Of course nobody does that. It is practically infeasible, but more importantly: it makes absolutely no sense.
What people often do is let their unconscious brain (the reptile brain, or better: “system 1” according to Daniel Kahneman) make the decision. Afterwards, they come up with a logical‑sounding explanation (“system 2”). The chatterbox in our brain is continuously busy rationalizing our intuitive choices after the fact. Most of the time that goes well, but in some pathological conditions it spirals out of control: this is then called confabulation, which is something chatbots also regularly suffer from.
Just like with other poorly understood phenomena around us, there is also a need for explanations of how AI works. Explainability of AI has become a full‑fledged research field.
Explaining in practice
What is an explanation, actually? Do you need to understand the entire system at the lowest level? Or is it sufficient to be able to explain, for a single outcome of a system, why that outcome occurred and not something else? And now that we think about it: to whom are you explaining the behavior of AI?
Let me take you to the garage. My Opel was making strange noises, I smelled something burning, and the engine kept overheating; another warning light lit up on the dashboard. I left the car at the garage, and the mechanic called me later. “Cause found, the repair will cost 300 euros.” I could have left it at that, but I still had a follow‑up question: “Can you explain that?”
The explanation of the diagnosis turned out to be that the carbon brush of the fan was worn out and the entire fan needed to be replaced. For some customers this is more than detailed enough, but the mechanic continued. The fan, he explained, is needed to cool the engine when driving does not provide enough airflow. And no, replacing just the carbon brushes was no longer possible.
And since we were already talking: there were some other major repairs coming up as well. The question was whether all of that was still worth it. The car is already a few years old, and the repairs cost more than its current market value. (Maybe time to look at that Hyundai.)
Different kinds of explanations
So there are different kinds of explanations, tailored to the knowledge level of the customer. One customer only wants to know what it costs; another wants to understand why it is so expensive and what the options are. Moreover, there is a difference between explaining this one outcome (“Forget that repair”) and explaining the entire model (“These are the standards, this is what needs to be done, these are the costs; and if those are much higher than the market value, it is no longer worth it”).
Parents know (and some fear) the moment their child asks: “Mom/dad, where do babies come from?” There are many possible answers to that question, all correct, but not all equally appropriate.
One fool can ask more questions than ten wise people can answer. If you keep asking long enough, it turns out that for almost everything there comes a point where we no longer know. Everyone knows that apples fall from trees, but physicists still do not really know, 350 years after Isaac Newton, what the nature of gravity is. Even something as logical and seemingly simple as arithmetic (1+1=2) can run into trouble. Bertrand Russell nearly went mad trying to give mathematics a watertight logical foundation, only to discover that mathematics cannot fully prove its own foundations. Fortunately, that does not bother us at all in daily life, and that is probably a good thing.
Explaining AI
Even for complex AI systems, there are methods to explain the outcome after the fact. A bit like the mechanic’s explanation: “If this had been a new car, I would have gone ahead with the repair, but for this old beast with all those other defects, it’s no longer worth it.” This is a commonly used way to explain an outcome in AI systems as well, often called a counterfactual explanation: you show what the outcome is, and indicate which small changes in the situation would have changed it.
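A counterfactual explanation like the mechanic’s can be sketched in a few lines of code. This is a minimal toy example: the decision rule (repair only if the cost stays below the car’s market value) and all numbers are invented for illustration, not real garage logic or a real XAI library.

```python
# Toy "is this repair worth it?" decision, explained counterfactually:
# the explanation points at the change that would flip the outcome.

def worth_repairing(repair_cost: float, market_value: float) -> bool:
    """Illustrative rule: repair only if it costs less than the car's value."""
    return repair_cost < market_value

def counterfactual(repair_cost: float, market_value: float) -> str:
    """Explain the outcome by naming the change that would reverse it."""
    if worth_repairing(repair_cost, market_value):
        return (f"Repair (cost {repair_cost} < value {market_value}). "
                f"Had the cost exceeded {market_value}, we would scrap instead.")
    return (f"Don't repair (cost {repair_cost} >= value {market_value}). "
            f"Had the car been worth more than {repair_cost}, we would repair.")

print(counterfactual(300, 900))   # the 300-euro fan repair: still worth it
print(counterfactual(1200, 900))  # the big upcoming repairs: forget it
```

Real systems search automatically for the smallest such change, but the shape of the answer is the same: “the outcome would have been different if X.”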
A more complex step is to indicate how much each factor contributes to the outcome. “The high costs of the expected additional repairs mainly come from replacing the timing belt, for which half the engine has to be dismantled. In addition, replacing the tires is another cost factor.” This type of explanation is also used in AI systems.
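The mechanic’s cost breakdown is essentially an attribution: split the total into per-factor contributions and rank them. The sketch below does exactly that with made-up repair items; attribution methods for AI models (such as SHAP) assign analogous contribution scores to model features rather than invoice lines.

```python
# Attribution-style explanation: decompose a total into ranked
# per-factor contributions. All cost items are hypothetical.

cost_items = {  # illustrative quote, in euros
    "timing belt (half the engine dismantled)": 850,
    "new tires": 400,
    "fan replacement": 300,
}

total = sum(cost_items.values())
for item, cost in sorted(cost_items.items(), key=lambda kv: -kv[1]):
    share = 100 * cost / total
    print(f"{item}: {cost} euros ({share:.0f}% of the total)")
```

The explanation is the ranking itself: the biggest contributor comes first, which is exactly what a customer (or a model user) wants to know.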
Is a black box problematic?
For AI systems where specialists know how things work under the hood, we get quite far. But what should we do with AI systems that we do not fully understand, but that nevertheless appear to work? That is the famous “black box.” In the media, this is usually mentioned in a disapproving tone, as something we should not want.
Language models contain billions of numbers that all interact and influence each other – that is no longer comprehensible for a human being. There are fascinating attempts to make some sense of all those interactions. But ultimately, we have created something that (usually) works and that we can steer. In essence, those systems are just as complex and inscrutable as an ant colony. We can understand the behavior of the ant colony as a whole, but what each individual ant does, and why, is a mystery – just like we cannot fathom the individual neurons in our own brains.
Surprisingly, even for one of the most widely used medicines in the world, paracetamol, it is not completely clear how it works. Even in the medical world, the argument is made that a black box should be usable if it is effective, even if it remains a black box.
It is not necessary at all to know exactly how a system is constructed under the hood. By experimenting and carefully tracking what does and does not work, we now understand quite well when and how much paracetamol to take.
The question is whether you should demand transparency if you do not actually want to know how something works.
Bismarck already knew this: “With laws and sausages, you don’t want to be there when they are made.” Sometimes we do not want that transparency at all. Nor is transparency always useful: when pharmacies in the Netherlands started itemizing their bills (7.50 euros for an “explanation consultation”), many people scratched their heads. If those costs had been included in the total price, nobody would have cared. Transparency and explainability are related, but different.
AI systems that we do not understand can often still explain to us why a particular outcome is presented.
The behavior of ordinary, everyday things like gravity, arithmetic, and paracetamol can be understood fairly well through testing and observation. Explaining how those mysterious phenomena behave is therefore feasible. Unfortunately, the problem is more difficult with AI chatbots and other AI systems that use language models. Gravity always works, whether it is warm or cold, even on the moon, and in the same way for everyone. The arithmetic sum 1+1=2 applies in the same way always, everywhere, and for everyone. With paracetamol, we had to experiment quite a bit to find out for which kinds of pain it worked or did not, and for which people problems might arise – but paracetamol remains paracetamol.
AI is not so stable. AI can go off the rails. An explanation that is correct ten times may turn out to be a hallucination the eleventh time.
A wrong explanation is disastrous.
A favorite, teasing test for chatbots for a long time was: “How many Rs are there in the word strawberry?” That almost always went wrong, so I wanted a screenshot of this epic mistake for a presentation. Copilot answered my question with “There are three Rs in strawberry.” Damn! The correct answer, so there went my hilarious example. But then Copilot overplayed its hand by also explaining the answer: “There are two Rs adjacent to each other, plus the word ends in ‘ry,’ which gives the third letter R.” So it had not seen the first R at all and had counted the last R twice! Aaaah!
What do you do with that explanation?
This means that the human user must always stay alert and keep thinking critically. An explanation should not serve as mere reassurance (“Ah, there is an explanation, so it must be fine”); it must actually convince on its merits. And we must be willing to make the effort to understand it.
Surprisingly little research has been done on the effect of AI explanations on the recipient. There is a beautiful result from the German Research Center for Artificial Intelligence (DFKI), which investigated whether explainable AI helps in performing a difficult task. In this experiment, the task was recognizing misinformation (fake news). The result: experienced users, such as journalists, did not benefit much from AI support and the accompanying explanations. But inexperienced news consumers did. They became almost as good as journalists at recognizing fake news, provided that the AI gave a good explanation.
Why this matters
I come across this phrase nowadays in almost every blog‑like article.
I have not generated this piece using AI! (Although I did use some of its translation capabilities to speed up the process.) It is starting to become a telltale sign of AI-generated texts: “why this matters” has become a kind of filler phrase of the average language model. Just like the famous em dash, which is also a sign that AI was used in writing.
Using AI requires alertness to potentially incorrect outcomes. Explainable AI can be a powerful tool in this, as long as you are willing to critically evaluate that explanation as well. Think of the explainability of AI a bit like the leaflet that comes with your paracetamol pill – but one that you have to read again every single time.