RA1 & RA2 delivered four explanation paradigms that significantly advanced XAI research, complemented by the validation platforms of RA3 & RA4 and the ethical tools of RA5.
Rule-Based Factual and Counterfactual Explanations [1,2,9,69,109] Post-hoc, model-agnostic local methods explain individual black-box decisions by reconstructing the logic behind them. The core intuition is that decision boundaries, although globally complex, are locally simple and can therefore be approximated by interpretable models. LORE (LOcal Rule-based Explainer) (Fig. 4) pioneered this paradigm (Guidotti et al., arXiv:1805.10820). It generates a local neighborhood around the instance to be explained, labels that neighborhood with the black box, and trains a local decision tree on it. Crucially, LORE derives a dual explanation from the tree: factual rules (explaining "why") and counterfactual rules (explaining "what if"), in line with findings from cognitive psychology. LORE outperforms competitors thanks to its exploration of the decision boundary (using genetic algorithms) and its robustness (enforcing actionability constraints). Extensions include reasoning over explanations (REASONX), merging local rules into a global consensus (GLocalX), and natural language generation (MAINLE).
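The local-surrogate idea described above can be illustrated with a minimal, dependency-free sketch. It is not the LORE implementation and does not use the XAI-Lib API: the toy `black_box`, the feature names `x0` and `x1`, and all helper functions are hypothetical. Two simplifications are assumed: the neighborhood is drawn by Gaussian perturbation rather than by a genetic algorithm, and the local decision tree is reduced to a depth-1 tree (a stump), so the factual and counterfactual rules are simply the two sides of a single split.

```python
import random


def black_box(x):
    """Hypothetical opaque classifier with a curved decision boundary."""
    return int(x[0] + 0.2 * x[1] ** 2 > 3.0)


def sample_neighborhood(x, n=300, scale=1.0, seed=0):
    """Gaussian perturbations around x (assumption: LORE uses a genetic algorithm)."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, scale) for v in x] for _ in range(n)]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def majority(labels):
    """Most frequent 0/1 label (ties resolve to 1)."""
    return int(2 * sum(labels) >= len(labels))


def fit_stump(Z, y):
    """Depth-1 tree: pick the (feature, threshold) with the lowest weighted Gini."""
    best = None
    for j in range(len(Z[0])):
        values = sorted({z[j] for z in Z})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [yi for z, yi in zip(Z, y) if z[j] <= t]
            right = [yi for z, yi in zip(Z, y) if z[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    return best[1:]


def explain(x, names):
    """Return a factual rule (the side x falls on) and a counterfactual rule (the other side)."""
    Z = sample_neighborhood(x)
    y = [black_box(z) for z in Z]  # label the neighborhood with the black box
    j, t, left_label, right_label = fit_stump(Z, y)
    on_left = x[j] <= t
    own, other = (left_label, right_label) if on_left else (right_label, left_label)
    return {
        "feature": names[j],
        "threshold": t,
        "factual": f"{names[j]} {'<=' if on_left else '>'} {t:.2f} => class {own}",
        "counterfactual": f"{names[j]} {'>' if on_left else '<='} {t:.2f} => class {other}",
        "factual_class": own,
        "counterfactual_class": other,
    }


result = explain([2.0, 0.5], ["x0", "x1"])
print(result["factual"])
print(result["counterfactual"])
```

A full decision tree would yield multi-condition rules, and the counterfactual would be the path to a different-class leaf that requires the fewest changed conditions; the stump keeps the sketch short at the cost of that richness.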
Explanation by Example(s) [5,6,24,52,57,72,75] Based on Latent LORE (LLORE), this paradigm leverages latent feature spaces learned via autoencoders to handle complex data like images (ABELE) and time series (LAST). A local interpretable model filters plausible factual and counterfactual points in the latent space (enriched with black-box predictions), which are then mapped back to the original space as interpretable exemplars.
Domain-Informed Explanation [4,18,19,22,43] DoctorXAI, presented at ACM FAT* 2020, pioneered ontology-based explanations for sequential data. It adapts local explanations to the medical domain by combining medical ontologies with health records (sequences of events). The medical ontology graph is used both to generate synthetic neighborhoods and to produce meaningful explanations, addressing group unfairness and supporting trust measurement in user studies.
Post-Hoc Global Explanations [59,120] The Interpretable Latent Space method defines a linear encoding of features by learning a latent space on black-box labeled data. Originating from Bodria’s PhD thesis, this led to ILLUME, a global-to-local approach that sets the stage for a new ML design pipeline merging global meta-explainers with local instantiation.
(RA3, RA4): The XAI Library (github.com/kdd-lab/XAI-Lib) powers the Watchdog platform, which provides a workspace for benchmarking algorithms with quantitative evaluation measures. In RA4, we conducted a qualitative validation with more than 200 health professionals, using the Judge-Advisor System (JAS) to evaluate trust. This work received an Honorable Mention at ACM CHI 2022 [18].
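To give a sense of how trust can be quantified in a Judge-Advisor setting, the sketch below computes the weight of advice, a measure commonly used in the JAS literature: the fraction of the distance between a participant's initial estimate and the advisor's suggestion that the participant actually moves. This text does not state which measure the study used, so the choice of metric, the function name, and the example values are assumptions for illustration only.

```python
def weight_of_advice(initial, advice, final):
    """Share of the gap between the initial estimate and the advice that was adopted.

    0.0 means the advice was ignored, 1.0 means it was fully adopted.
    Returns None when the advice equals the initial estimate (measure undefined).
    """
    if advice == initial:
        return None
    return (final - initial) / (advice - initial)


# Hypothetical example: a clinician first estimates 40, the AI advises 80,
# and the clinician revises the estimate to 60.
print(weight_of_advice(40, 80, 60))
```

Comparing this quantity between conditions with and without an explanation is one way such a study could assess whether explanations change reliance on the advice.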
RA5 We intensively explored the interplay between privacy and fairness, the privacy risks of explainers, and broader trustworthiness issues, examining the practical implications of the European ethical and legal guidelines [45,32,95,92,117]. This led to new algorithms for auditing, assessing, and balancing the advantages of explainability against various risks [19,56,45,77,114,110].
The project had wide international impact, reaching a scientific audience of >30,000 people and producing ~150 publications (72 Open Access).
We organized 26 workshops, 3 conferences, and 7 tutorials. PIs delivered 42+ keynotes, and team members presented at 130+ venues.
Highlights include the "XAI Distinguished Seminars" (2021), a 15-team Hackathon (2024), and invitations to ESOF 2024 (Katowice) and TEDx (2023). Fourteen PhD students wrote their theses on XAI topics, and the PI launched an XAI course at SNS.