Trusting AI is dangerous. Not trusting AI, too.

(This blog is a co-production with Daan Di Scala, colleague at TNO and PhD Candidate at Utrecht University.)

An early application of AI that nearly every citizen unknowingly used in the early 90s was the automated reading of handwritten addresses on letters, which made the sorting process of the postal services considerably more efficient. The requirement for the system was to make no more than 0.4% read errors. The team working on the system did not think this was reasonable: the human input (done by large numbers of typists) had an error rate of 1.5%. When a system is just as accurate as a human, wouldn’t that be enough?
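
To make those percentages concrete, here is a back-of-the-envelope sketch in Python; the daily volume of one million letters is an invented figure, purely for illustration.

```python
# Hypothetical daily volume, invented for illustration.
daily_letters = 1_000_000

machine_errors = daily_letters * 0.004  # the 0.4% read-error requirement
typist_errors = daily_letters * 0.015   # the 1.5% human error rate

print(f"Machine at 0.4%: {machine_errors:,.0f} misread addresses per day")
print(f"Typists at 1.5%: {typist_errors:,.0f} misread addresses per day")
```

At that volume, the ‘unreasonable’ requirement still allows 4,000 misread letters a day, while the typists would produce 15,000.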

When a self-driving car causes a traffic casualty, it is breaking news; other traffic accidents are usually presented as statistics. The jury is still out on whether Tesla’s Autopilot causes more or fewer deaths per year, but even if self-driving cars end up causing structurally fewer fatal accidents, it remains an open question whether we want to trust such systems.

It’s clear: to err is human, and because a machine is not human, our requirements are different. Higher. Which is logical: a human error can be dramatic, but usually remains an individual case. When machines err, the scale grows very quickly, and a single flaw easily leads to a big mess. It’s understandable that this makes us uncomfortable. And then there’s the question of who is responsible when a mistake by an AI algorithm causes an accident. (A legal debate that deserves its own blog post.)

Trust in new technology

When Wikipedia emerged in 2001, it was seen as a very unreliable source. How could text that just about anyone could edit lead to accurate information? In 2005, Nature published an influential study showing that there wasn’t all that much difference between Wikipedia and the famous Encyclopaedia Britannica: Wikipedia had slightly more factual errors, while Britannica was a bit less clearly written. A later comparison showed that Wikipedia had more than closed the gap. By now, Wikipedia is considered an authority and is widely used even by scientists.

The wiki method works remarkably well.

Wikipedia’s way of working was completely new: anyone could make edits, which might sound fundamentally ‘unreliable,’ yet precisely because of that, it worked well. That doesn’t mean other wiki-like initiatives are automatically just as good. There’s Grokipedia, which receives a lot of criticism. There’s also Conservapedia, which reflects a rather specific worldview that not everyone would consider ‘reliable’.

Our judgment of what is reliable can change. We view something new with suspicion, but those who have used a certain technology from a young age don’t know any different. Each Technology Generation has its own starting point. To put it bluntly: technology that already existed when we were growing up is simply taken for granted. Technology that emerges around the age of 10 is enthusiastically embraced. Technology introduced after we turn 30 is approached with healthy skepticism.

Trust in AI 

Will LLMs follow the same trend? For anyone working with a chatbot today, this technology did not exist when they were born. So it is new, amazing, magical! Many critics, however, point out that LLMs are just a big, complex box of statistics that often provides answers that are merely ‘approximately right’. How concerning that we simply accept that!

The term ‘stochastic parrot’ may not be completely accurate, but it has a nice ring to it: chatbots reproduce, in a somewhat random (stochastic) way, what humans have written down before. Just as we once had to warn kids at school about using Wikipedia, we now have to teach students not to blindly copy information from ‘chat’.

Chatbots are trained to come across as convincing and human-like. That fosters a connection and builds trust. The examples of things going wrong, however, are countless, such as the travellers stopped at airport customs because ChatGPT had given them incorrect advice about visa requirements. So much for your vacation! It remains necessary to check a chatbot’s sources, because before you know it, your report is submitted with information pulled out of thin air. More examples emerge daily.

Too much trust is clearly problematic. Too little trust can also have disastrous consequences. Especially for applications where AI has already proven itself useful, it can be worthwhile to take the system’s advice seriously. In some medical diagnoses, AI delivers better results than humans; ignoring such advice can therefore be harmful. Unfortunately, there are plenty of examples of accidents caused by following one’s own ‘gut feeling’ instead of (non-human) advice. Think of Iran Air Flight 655, where the crew did not trust the properly functioning Aegis system, or the disaster involving the Costa Concordia, where the alarms of the ship’s navigation system were ignored.

Ignoring good advice is not unique to automated systems.

The examples above bring to mind the deadliest disaster in aviation history (Tenerife, 1977), which was partially caused by the aircraft’s captain disregarding a remark from the flight engineer. That event led to a structural shift in the procedures and culture of the aviation industry.

The right level of trust

While our perception of what can be trusted changes over time, there is another reason why something initially considered unreliable gradually loses that status: the technology itself improves. All those billions invested don’t just go towards the energy bills of data centers; real progress is being made! As the saying goes: “The AI you use today is the worst you’ll ever use.” The tricky part is that no bell rings the moment AI becomes so reliable that you can completely let go of your skepticism. And that’s just as well: you must always keep thinking critically.

And critical thinking is something you can learn! When a chatbot presents a fact, you can of course check it yourself. Many such examples have already appeared in the news, so a lot of people are alert to this by now. But other problems are harder to anticipate. The randomness of a stochastic parrot, for instance: ask a chatbot the same question again, and the answer might be different. Bias is another issue: models trained on outdated data will also reflect an outdated worldview. But does the average user even notice these things?
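
To see why asking the same question twice can yield different answers, here is a minimal Python sketch of temperature sampling, the mechanism most chatbots use to pick their next word. The candidate answers and their scores are invented for illustration; a real model samples from tens of thousands of tokens at every step.

```python
import math
import random

# Invented "scores" (logits) a model might assign to candidate answers
# for one prompt; the numbers are illustrative, not from a real model.
logits = {"Canberra": 2.0, "Sydney": 1.4, "Melbourne": 0.9}

def sample_answer(temperature: float, rng: random.Random) -> str:
    """Softmax-with-temperature sampling: the source of the 'randomness'."""
    weights = [math.exp(score / temperature) for score in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]

rng = random.Random()  # deliberately unseeded: every run may differ
for attempt in range(1, 6):
    answer = sample_answer(temperature=1.0, rng=rng)
    print(f"Ask {attempt}: The capital of Australia is {answer}.")

# As the temperature approaches 0, the most likely answer nearly always
# wins; higher temperatures make less likely (wrong) answers more common.
```

The same prompt, five rolls of the dice, and occasionally a confident wrong answer: that is the ‘stochastic’ half of the stochastic parrot.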

Trusting unreliable technology anyway

When talking about (un)reliability, it is worthwhile to look at another high-tech innovation that almost all of us use: medicines! And in particular, vaccinations! The similarity with AI is that these are (usually) not 100% reliable either. And, just like AI, medicines and vaccines can simply be controversial. Resistance to vaccines has existed for a long time, but the COVID-19 vaccines caused extra distrust because they seemed to be developed suspiciously quickly and had “something to do with DNA”. The development of AI is also moving very fast and feels almost magical, especially because everyone keeps emphasizing that it works like a kind of black box that no one truly understands.

Medicines can be black boxes, too.

It’s interesting to note that for one of the most widely used medicines in the world, paracetamol, no one can explain exactly how it works. But the explainability of AI is a topic for a later blog post.

The critical difference is that medicines and vaccines undergo strict and extensive testing, and all side effects are carefully recorded. Most medicines can only be prescribed by a professional. Pharmacies check for drug interactions. There are package inserts. All of this is necessary: only by watching this closely can you truly trust a medicine.

Laws, Labor, Literacy

So, how can we really start trusting AI? Three key points:

Laws are coming. They don’t yet come close to the regulations surrounding medicines, but the EU’s AI Act is at least trying to get there: mandatory documentation, required warnings to users, oversight, training…

What’s notable, however, is that this law is already being weakened before it has even fully come into effect.

For example, the requirement for companies to promote ‘AI literacy’ among their employees has now become a government responsibility. It’s no longer on the companies’ plate, which probably means the much-needed intuition won’t develop on its own. (All the more reason to continue with these blogs!)

Then, just as with medicine and Wikipedia, there is labor to be done: hard work on continuous improvement, new research, and further development remains necessary.

In the meantime, AI users ought to remain (or become) literate! They will have to gain experience and insight themselves, just as we’ve learned to do with other unreliable technologies. Who would’ve thought: we still have to think for ourselves!
