Bringing Explainable AI (XAI) to Medicine

20.07.08|Jonathan Rebane

In academia, industry, and socio-political circles, interest in Explainable AI (XAI) has been booming. Traditional black box models, which are non-transparent and untraceable and offer little or no insight into how and why an AI reaches its decisions, are increasingly criticized. In many domains that demand accountability and transparency, an AI is now expected to make accurate decisions while providing an understandable report of how the logic of the system operates and how it came to specific conclusions.

With AI becoming an ever-present force in our society, a cornerstone of establishing trust in these systems will be opening these black boxes further so we can satisfy our need to know what they are doing. There is perhaps no domain in which XAI is more needed than medicine, where life-and-death decisions and personal wellness have already found their way into the hands of AI-assisted tools.

An AI revolution in medicine has vast potential to assist clinician decision making, and the U.S. Food and Drug Administration has already approved multiple AI products, such as a diabetic retinopathy detection tool. Explainability methods in XAI such as LIME and SHAP have also been utilised as a means of providing complete, correct, and compact explanations, which are essential qualities for the integration of AI into clinical workflows. For example, in a recent publication, Scott Lundberg et al. from the University of Washington demonstrated the ability of SHAP to provide accurate explanations of AI predictions for chronic kidney disease progression, while also leveraging the individual explanations to provide insights into how the AI models behave in general.
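
To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a single prediction of a tiny, made-up risk score. It is an illustration only: the `toy_risk` model, its coefficients, the patient values and the reference values are all hypothetical, and the code does not use the SHAP library or reproduce the method of the publication mentioned above. It simply shows how a prediction can be split into one contribution per input feature.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    Features left out of a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features, so this is only
    practical for very small models.
    """
    n = len(instance)
    features = range(n)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in features]
        return predict(x)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical risk score with invented coefficients (illustration only).
def toy_risk(x):
    age, creatinine, systolic_bp = x
    return (0.01 * age + 0.3 * creatinine + 0.002 * systolic_bp
            + 0.005 * age * creatinine)


patient = [70, 2.0, 150]     # hypothetical patient
reference = [50, 1.0, 120]   # hypothetical "typical" patient used as baseline

phi = shapley_values(toy_risk, patient, reference)
for name, contribution in zip(["age", "creatinine", "systolic_bp"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions add up to the difference between the patient's score and the reference score, which is what allows an explanation of this kind to be read as a complete account of the prediction.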

Big tech companies are taking note. “What is vital is to make anything about AI explainable, fair, secure and with lineage, meaning that anyone could very simply see how any application of AI developed and why,” says Virginia Rometty, Executive Chairman of IBM. But many large tech companies have struggled with AI implementations in medicine. IBM’s “Watson for Oncology” underperformed because it was trained using a small number of imaginary cancer patients and did not undergo sufficient validation. Such issues in the earlier days of AI development have highlighted the need to further validate AI using XAI. IBM has since become a trendsetter in this regard, offering the AI Explainability 360 Open Source Toolkit for XAI, which can be used to develop more trusted solutions. “[AI] may well make care more efficient, more accurate and, if properly deployed, more equitable. But realizing this promise requires being aware of the potential for bias and guarding against it. It means regularly monitoring both the output of algorithms and the downstream consequences,” says Dhruv Khullar, MD, a physician at New York-Presbyterian Hospital.

Issues pertaining to AI bias can be more rigorously validated with XAI. For example, correct predictions or outliers can be used to highlight AI deficiencies when the provided explanations deviate greatly from human knowledge and logic. XAI can also help to highlight concept drift more quickly. Concept drift occurs when AI models trained on historical data do not adapt well to long-term or sudden shifts in patient demographics, as seen during the recent COVID-19 outbreak. It could be indicated by XAI if an AI that normally produces sensible explanations abruptly or gradually shifts to providing explanations vastly different from the norm. An example of extreme concept drift is highlighted in the recent article “The Impact Of COVID-19 On Machine Learning Models” by Charlie Isaksson, PhData.
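
One simple way to watch for such a shift in explanations is sketched below. It assumes that per-patient feature attributions (for instance from a method such as SHAP) are already available for a reference period and for a recent period, and it compares how the total attribution is shared between features in each period. The attribution values, the feature names and the alert threshold are all invented for illustration; a real deployment would need a threshold chosen and validated on its own data.

```python
def importance_profile(attributions):
    """Share of total absolute attribution carried by each feature."""
    n_features = len(attributions[0])
    totals = [sum(abs(row[j]) for row in attributions) for j in range(n_features)]
    grand_total = sum(totals)
    if grand_total == 0:
        # No attribution at all: treat every feature as equally important.
        return [1.0 / n_features] * n_features
    return [t / grand_total for t in totals]


def explanation_shift(reference, recent):
    """Total variation distance between two importance profiles (0 to 1)."""
    p = importance_profile(reference)
    q = importance_profile(recent)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical attributions; columns: age, creatinine, blood pressure.
reference_window = [[0.30, 0.60, 0.10], [0.25, 0.65, 0.10], [0.35, 0.55, 0.10]]
recent_window = [[0.70, 0.20, 0.10], [0.65, 0.25, 0.10], [0.75, 0.15, 0.10]]

THRESHOLD = 0.2  # assumed alert level, not an established standard

shift = explanation_shift(reference_window, recent_window)
drift_suspected = shift > THRESHOLD
print(f"explanation shift: {shift:.2f}, drift suspected: {drift_suspected}")
```

A score near 0 means the model is leaning on the same features as before, while a score near 1 means its explanations now rest on entirely different features. A high score is a prompt for human review rather than proof that the model has degraded.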

The growing digitalisation of healthcare has produced a vast wealth of longitudinal healthcare data, which has been used to build state-of-the-art AI models that can flag medical risks and automate diagnoses. In the future, this AI revolution promises to usher in a new era of precision medicine, where treatments are tailored to individuals to improve the efficiency and effectiveness of medical practice. “By augmenting human performance, AI has the potential to markedly improve productivity, efficiency, workflow, accuracy and speed, both for [physicians] and for patients … What I’m most excited about is using the future to bring back the past: to restore the care in healthcare,” says Eric Topol, MD, Director and Founder of Scripps Research Translational Institute.

However, if there is any hope for AI to become commonplace in clinical practice, it is not enough for an AI simply to outperform a doctor in diagnostic precision. If clinicians are not given good explanations that let them understand the logic and motivation behind an AI model's output, they have grounds to ignore its decisions outright. What could help are new strategies and frameworks for facilitating the integration of AI into medical practice. We at the AI Sustainability Center, for example, offer a practical AI Sustainability Framework to aid organisations in detecting and mitigating social and ethical AI risks. XAI methods play a key role in mitigating such risks in sensitive domains such as medicine, where transparency and user acceptance of AI decisions are vital.

There is no doubt XAI will play a key role in the sustainable integration of AI in medicine. It is important to remember that many clinicians still view AI-assisted practice as a threat, no doubt due to both misconceptions and the failures of prior digital implementations in medicine. A key role of XAI will be to provide the transparency practitioners need in order to trust AI systems. To correct the past wrongs of digitalisation in healthcare, it is advisable to approach the introduction of AI with the needs of clinicians and usability at the forefront. For example, what constitutes a ‘good explanation’ may vary according to the clinician’s current workload, or the degree to which the explanation contradicts the clinician’s opinion or the standard medical protocol. This is one reason AI should be externally validated in clinical settings rather than rushed to full deployment. With these considerations in mind, perhaps one day soon we will see a new era of clinician-machine collaboration.