Auditing the inference processes of medical-image classifiers by leveraging generative AI and the expertise of physicians
Alex J DeGrave, Zhuo Ran Cai, Joseph D Janizek, Roxana Daneshjou, Su-In Lee
Nat Biomed Eng. 2023 Dec 28. doi: 10.1038/s41551-023-01160-9. Online ahead of print.
The inferences of most machine-learning models powering medical artificial intelligence are difficult to interpret. Here we report a general framework for model auditing that combines insights from medical experts with a highly expressive form of explainable artificial intelligence. Specifically, we leveraged the expertise of dermatologists for the clinical task of differentiating melanomas from melanoma 'lookalikes' on the basis of dermoscopic and clinical images of the skin, and the power of generative models to render 'counterfactual' images to understand the 'reasoning' processes of five medical-image classifiers. By altering image attributes to produce analogous images that elicit a different prediction by the classifiers, and by asking physicians to identify medically meaningful features in the images, we found that the classifiers rely both on features used by human dermatologists, such as lesional pigmentation patterns, and on undesirable features, such as background skin texture and colour balance. The framework can be applied to any specialized medical domain to make the powerful inference processes of machine-learning models medically understandable.
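The counterfactual idea described above (altering an input until the classifier's prediction changes, then inspecting what changed) can be illustrated with a toy sketch. This is not the authors' method, which uses generative models to render counterfactual images. As a stated assumption, it swaps in a logistic classifier over a hypothetical three-element attribute vector, and the names `predict_proba` and `counterfactual`, the weights, and the attribute labels are all invented for illustration.

```python
import numpy as np

def predict_proba(x, w, b):
    """Probability of the positive class under a toy logistic classifier."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def counterfactual(x, w, b, step=0.1, max_iter=1000):
    """Nudge x along the logit gradient until the predicted class flips."""
    x_cf = x.astype(float).copy()
    original = predict_proba(x_cf, w, b) >= 0.5
    # Move toward the opposite class: against w if originally positive, along w otherwise.
    direction = -w if original else w
    direction = direction / np.linalg.norm(direction)
    for _ in range(max_iter):
        if (predict_proba(x_cf, w, b) >= 0.5) != original:
            break
        x_cf += step * direction
    return x_cf

# Hypothetical attributes: [lesion pigmentation, background skin texture, colour balance]
w = np.array([2.0, 1.0, -0.5])  # assumed weights, not taken from any real classifier
b = -1.0
x = np.array([1.0, 0.5, 0.2])

x_cf = counterfactual(x, w, b)
delta = x_cf - x  # which attributes had to change, and by how much, to flip the prediction
print(predict_proba(x, w, b), predict_proba(x_cf, w, b), delta)
```

Inspecting `delta` plays the role that expert review of counterfactual images plays in the framework: an attribute that changes substantially is one the classifier relies on, whether or not it is medically meaningful.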