CORDIS - EU research results

PRE-ACT: Prediction of Radiotherapy side Effects using explainable AI for patient Communication and Treatment modification

Periodic Reporting for period 1 - PRE-ACT (PRE-ACT: Prediction of Radiotherapy side Effects using explainable AI for patient Communication and Treatment modification)

Reporting period: 2022-10-01 to 2024-03-31

The PRE-ACT project aims to deliver a framework, grounded in solid and novel human-interpretable AI concepts, to predict the risk of side effects following radiotherapy treatment for breast cancer patients and subsequently use it to inform patients about their optimal treatment.
•Leverage data from three multi-centre European patient cohorts to train AI models that predict the risk of side effects, with a primary focus on arm lymphedema.
•Homogenise and analyse the data. The data consist of various modalities and include patient medical records such as comorbidities, anatomy, demographics, as well as treatment data, radiotherapy dose distribution data, Computerized Tomography (CT) scans, auto-contouring of critical organs in CT scans, and genetic data.
•Enrich datasets from existing cohorts with additional data that will be collected on patients (e.g. genetic markers in CANTO, cardiac toxicity in REQUITE) to enable the design of integrated AI models with a richer set of features.
•Address fairness considerations in the AI algorithms, aiming to uncover potential biases in the data and explain their nature.
•Utilize AI models to analyse data and generate risk scores to enable early detection and intervention and potentially reduce healthcare costs.
•Utilize advanced explainable AI (XAI) algorithms that provide explanations both for individual data samples (local explanations) and for whole datasets (global explanations).
•Create a testbed within a controlled subnetwork of AUEB-RC's lab to implement and deploy various Federated Learning (FL) algorithms, simulating real-world scenarios in which hospitals around the world collaboratively train AI models while keeping their data separate.
•Utilize privacy preserving AI methods such as Federated Learning as a proof of concept to assess the quality of predictions when the data are private and decentralized.
•Assess the impact of explainability of the AI model in a clinical trial that comprises two arms, namely two disjoint subsets of recruited patients. In the first arm, the personalized risk prediction will be communicated to physicians and patients, while in the second arm, it will not.
•Assess the impact of communicating a personalized prediction of lymphedema risk and prescribing a prophylactic arm sleeve (in case of elevated risk) on the occurrence of arm lymphedema, the radiation treatment planning and the patients’ quality of life.
•Adopt a participatory co-creation and co-design process with the stakeholders involved (patients, physicians, radiation oncologists) to devise a reference framework of appropriate measures for trustworthy AI and usability, covering the process end-to-end: from user requirements elicitation to assessment of the communication package.
•Design and build a user-friendly platform that presents risk predictions and explanations in a clear and understandable way and facilitates communication and collaboration between patients and doctors.
•Develop back-end and front-end infrastructure that utilizes modern technologies, supports big data storage and AI, and provides efficient communication and computation services.
•Design the user interface to be visually appealing and user-friendly, especially for elderly or impaired users.
•Implement robust authentication techniques and encryption methods to ensure the privacy of patients’ data.
The main achievements of PRE-ACT in the first 18 months are:
1. Unifying data from three different breast cancer radiotherapy cohorts into a central database.
2. Re-analyzing the CT scans from all patients to mark the organs-at-risk in a consistent manner across the cohorts.
3. Carrying out a genetic analysis of SNPs that might predict arm lymphedema.
4. Using machine learning to build predictive models for arm lymphedema.
5. Developing algorithms for explaining complex AI models.
6. Developing a clinical trial proposal for implementation of the AI prediction.
7. Collecting the views of key stakeholders on the use of AI for prediction of radiotherapy side effects.
8. Building a web application called PRE-ACTOR that will be used to inform patients and doctors of the individual risk for arm lymphedema.
9. Generating a preliminary analysis of health economic factors important in the future clinical use of radiotherapy side effect prediction.
10. Identifying sensitive socio-economic and demographic attributes, such as income and education level, that could influence fairness and lead to bias in the AI models.
11. Extensively examining data availability and distribution with respect to sensitive attributes across the datasets used in the project.
Novel explainability algorithms, Fidex and FidexGlo, were introduced for local and global explanations, respectively. The Fidex algorithm optimizes fidelity by adding rule antecedents to the resulting rule set. FidexGlo obtains global explanations by progressively merging local explanations until they cover the target dataset, with an advantageous computational complexity compared to the state of the art. The average number of rules generated by FidexGlo is lower than that of previous algorithms, which matters because a smaller rule set makes it easier to analyze the knowledge embedded in a black-box model.
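
The idea of building a local rule by adding antecedents until fidelity is maximised, and then merging local rules into a small global rule set, can be illustrated with a minimal Python sketch. This is not the Fidex or FidexGlo implementation: the threshold-based rule format, the greedy search, the set-cover style merge, and all names (`local_rule`, `global_rules`, `rule_stats`, the toy `model` and `data`) are assumptions made purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]
# One antecedent: (feature index, threshold, True for ">=", False for "<").
Antecedent = Tuple[int, float, bool]
Rule = Tuple[List[Antecedent], int]  # (antecedents, predicted class)
Model = Callable[[Sample], int]


def covers(antecedents: List[Antecedent], x: Sample) -> bool:
    """True if the sample satisfies every antecedent of the rule."""
    return all((x[f] >= t) == ge for f, t, ge in antecedents)


def rule_stats(antecedents, label, data, model) -> Tuple[float, int]:
    """Return (fidelity, number of covered samples) for a candidate rule.

    Fidelity is the share of covered samples on which the black-box
    model outputs the same class as the rule.
    """
    covered = [x for x in data if covers(antecedents, x)]
    if not covered:
        return 0.0, 0
    agree = sum(model(x) == label for x in covered)
    return agree / len(covered), len(covered)


def local_rule(x: Sample, data: List[Sample], model: Model) -> Rule:
    """Greedily add the antecedent that most improves fidelity around x."""
    label = model(x)
    antecedents: List[Antecedent] = []
    best, _ = rule_stats(antecedents, label, data, model)
    while best < 1.0:
        # Candidate thresholds come from the data; the direction is chosen
        # so that x itself always satisfies the new antecedent.
        candidates = [(f, row[f], x[f] >= row[f])
                      for row in data for f in range(len(x))]
        scored = [(rule_stats(antecedents + [c], label, data, model), c)
                  for c in candidates]
        # Prefer higher fidelity, then larger coverage.
        (top_fid, _), top = max(scored, key=lambda s: s[0])
        if top_fid <= best:
            break
        antecedents.append(top)
        best = top_fid
    return antecedents, label


def explains(rule: Rule, x: Sample, model: Model) -> bool:
    """A rule explains x if it covers x and agrees with the model on x."""
    return covers(rule[0], x) and model(x) == rule[1]


def global_rules(data: List[Sample], model: Model) -> List[Rule]:
    """Merge local rules greedily until every sample is explained."""
    rules = [local_rule(x, data, model) for x in data]
    unexplained = set(range(len(data)))
    chosen: List[Rule] = []
    while unexplained:
        best_rule = max(
            rules,
            key=lambda r: sum(explains(r, data[i], model) for i in unexplained),
        )
        chosen.append(best_rule)
        unexplained -= {i for i in unexplained
                        if explains(best_rule, data[i], model)}
    return chosen


# Toy black-box model and dataset (hypothetical, for demonstration only).
model = lambda x: int(x[0] >= 0.5)
data = [(0.1, 1.0), (0.4, 0.0), (0.6, 1.0), (0.9, 0.0)]
rules = global_rules(data, model)
for antecedents, label in rules:
    print(antecedents, "->", label)
```

Each sample is always explained by its own local rule, so the merge loop terminates. Keeping the chosen set small is the point: fewer rules are easier for a person to inspect.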

Federated Learning has been enhanced by pre-training Generative Adversarial Networks (GANs) and then using them to generate data for each client. These synthetic data were subsequently integrated into the FL training procedure with ensemble learning. This approach yielded accuracy gains of more than 3% and 2.5% on the MNIST and CIFAR-10 datasets, respectively. The successful development and deployment of a controlled experimentation environment for FL will allow fundamental FL algorithms and state-of-the-art techniques to be evaluated while circumventing the logistical complexities associated with disparate physical infrastructures.
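
For readers unfamiliar with FL, the basic training loop can be sketched as follows. This is plain federated averaging on a toy linear model, not the GAN-augmented ensemble approach described above; the function names, the two toy client datasets, the learning rate and the number of rounds are all assumptions chosen for illustration. Each simulated client updates the model on its own data, and only the model parameters, never the data, are sent to the server for averaging.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one client


def local_update(w: float, b: float, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Tuple[float, float]:
    """Run a few epochs of gradient descent on one client's private data."""
    for _ in range(epochs):
        # Gradients of the mean squared error of the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_averaging(clients: List[Dataset],
                        rounds: int = 200) -> Tuple[float, float]:
    """Server loop: broadcast the model, collect updates, average them."""
    w, b = 0.0, 0.0
    total = sum(len(d) for d in clients)
    for _ in range(rounds):
        updates = [local_update(w, b, d) for d in clients]
        # Weight each client's parameters by its number of samples.
        w = sum(len(d) * uw for d, (uw, _) in zip(clients, updates)) / total
        b = sum(len(d) * ub for d, (_, ub) in zip(clients, updates)) / total
    return w, b


# Two hypothetical "hospitals" whose data follow y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
w, b = federated_averaging(clients)
print(f"w = {w:.3f}, b = {b:.3f}")
```

The server only ever sees `(w, b)` pairs, which is what lets the clients keep their records separate.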

Including the top 30 genetic variants, derived from genome-wide association studies within the REQUITE cohort, in the machine learning model for arm lymphedema produced only a modest improvement in predictive accuracy. The absence of genomics data for the HypoG cohort also makes it harder to develop and assess models containing genomics. Omitting genomics from the predictive models simplifies the workflow for the clinical trial, since rapid genetic testing and its associated expenses are no longer required. The result also shows that a limited number of genomic variants has little predictive value and that future models for lymphedema and other breast side effects should use a larger number of markers, e.g. through combined genetic risk scores. Although genotyping a large number of markers with a rapid turnaround is cost-prohibitive, we expect that future healthcare will include whole-genome genotyping or resequencing for cancer patients.
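
A combined genetic risk score is commonly computed as a weighted sum of risk-allele counts across many variants, with per-variant weights taken from association studies. The sketch below shows that calculation only. The variant names, weights and patient genotype are hypothetical placeholders, not values from PRE-ACT or REQUITE.

```python
from typing import Dict


def genetic_risk_score(dosages: Dict[str, float],
                       weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant)."""
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"genotype missing for variants: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)


# Hypothetical per-variant effect sizes and one hypothetical patient.
weights = {"variant_A": 0.12, "variant_B": -0.05, "variant_C": 0.30}
patient = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

score = genetic_risk_score(patient, weights)
print(f"combined genetic risk score: {score:.2f}")
```

The resulting single number can then enter a predictive model as one feature, instead of dozens of individual variants.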