CORDIS - EU research results

HetERogeneous sEmantic Data integratIon for the guT-bRain interplaY

Periodic Reporting for period 1 - HEREDITARY (HetERogeneous sEmantic Data integratIon for the guT-bRain interplaY)

Reporting period: 2024-01-01 to 2025-06-30

Europe’s move toward a cross-border European Health Data Space (EHDS) and the strict privacy demands of the GDPR highlight a pressing gap: sensitive medical records, genomics files, images, and clinical notes remain siloed, hampering early diagnosis, personalised treatment, and evidence-based policymaking. At the same time, neurodegenerative disorders and gut-microbiome-related diseases impose rising social and economic costs across the EU. HEREDITARY (Horizon Europe RIA 101137074, 2024-2028) answers this challenge by delivering a trustworthy digital framework that links, analyses, and explains multimodal health data without moving it from its source.

Overall objectives of the project:

- Federated data linkage & privacy-preserving AI: Build a scalable platform that lets hospitals and biobanks query and co-train models on distributed data while keeping patient information on-site.
- Semantics-aware analytics: Combine clinical, genomic, imaging, signal, and text streams to reveal new risk factors and therapy targets for gut–brain disorders such as Parkinson’s disease, ALS, and mental health conditions.
- User-centred decision tools: Provide visual analytics, open knowledge graphs, and multilingual terminology so clinicians, researchers, policymakers, and citizens can act on the insights with confidence.

Pathway to Impact
- Clinical and research gains: Multimodal datasets and FAIR knowledge-graphs will drive faster diagnosis, stratified trials, and precision medicine guidelines.
- Policy contribution: Federated architecture offers a blueprint for EHDS-compliant data sharing, supporting Europe’s Digital Decade and AI Act ambitions.
- Economic and societal value: Earlier detection and tailored interventions are expected to lower care costs and improve quality of life for millions affected by high-burden neuro- and metabolic diseases.

Role of Social Sciences and Humanities
SSH partners lead Health Social Laboratories for co-design with patients and clinicians, translate complex terminology into plain language, assess ethical-legal compliance (GDPR, AI Act), and train stakeholders in policy advocacy. Their work guarantees that the platform is socially robust, ethically sound, and broadly trusted—turning technical advances into real-world health and policy impact.

Technical and Scientific Highlights - M1 to M18

1. Foundations and Governance
– Published a FAIR-compliant Data Management Plan and an ethics playbook aligned with GDPR and the upcoming EHDS Regulation.
– Activated six specialist working groups and organised quarterly technical workshops on federated learning, privacy, ontology design and genomics.

2. Secure Federated Infrastructure and Clinical Data (WP 2)
– Federated network now running across four medical centres; first horizontal and vertical learning experiments on ALS data completed with latency control.
– Five multicentre clinical use cases defined (ALS prognosis, multi-disease diagnosis, Parkinson’s eye imaging, gut-brain phenotyping in healthy cohorts, gut-brain linkage in disease).
– Released two open-source libraries: SelfEEG for self-supervised EEG modelling and a BIOM-to-CSV converter for microbiome workflows.
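
As an illustration of the horizontal learning setting mentioned above, the following is a minimal sketch of one common scheme, federated averaging, in which each site trains on its own records and only model weights are shared. The report does not state which algorithm the project uses, so this is an assumption; the function names (`local_step`, `federated_average`, `federated_round`) and the toy linear model are hypothetical and are not part of the HEREDITARY platform.

```python
from typing import List, Sequence, Tuple

Weights = List[float]             # [slope, intercept] of a toy model y ~ w*x + b
Records = Sequence[Tuple[float, float]]  # (x, y) pairs held by one site


def local_step(weights: Weights, records: Records, lr: float = 0.05) -> Weights:
    """One gradient-descent step on mean squared error, using only local records."""
    w, b = weights
    n = len(records)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in records) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in records) / n
    return [w - lr * grad_w, b - lr * grad_b]


def federated_average(site_weights: Sequence[Weights], site_sizes: Sequence[int]) -> Weights:
    """Average the sites' weights, weighting each site by its number of records."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def federated_round(global_weights: Weights, sites: Sequence[Records], lr: float = 0.05) -> Weights:
    """One round: every site updates locally; only the weights leave the site."""
    updates = [local_step(list(global_weights), records, lr) for records in sites]
    sizes = [len(records) for records in sites]
    return federated_average(updates, sizes)


if __name__ == "__main__":
    # Hypothetical toy data: two sites whose records all follow y = 2x + 1.
    sites = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]]
    weights = [0.0, 0.0]
    for _ in range(1000):
        weights = federated_round(weights, sites)
    print(weights)
```

In this sketch the coordinator never sees the `(x, y)` records, only the per-site weights and record counts. Vertical learning, where sites hold different features for the same patients, needs a different protocol and is not covered here.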

3. Multimodal Semantic Integration Platform (WP 3)
– HERO Ontology version 2 published with coverage for phenoclinical, genomic and ophthalmic imaging data plus a live SPARQL explorer.
– Prototype polystore (Dremio + Ontop + GraphDB) deployed; in-database analytics tested on distributed VCF files.
– Compiled 9000 candidate terms and 1700 concepts for a multilingual medical terminology hub.
– Delivered PrivEval, a hands-on tool that assesses privacy leakage in synthetic datasets.

4. Advanced Analytics and AI (WP 4)
– Developed PubMiner AI: an LLM pipeline that converts PubMed articles into knowledge-graph triples; already powering an ALS biomarker demonstrator.
– Multimodal self-supervised models (PURPOISE, GANDALF, HEALNet) outperform single-modality baselines on internal EEG, MRI and omics benchmarks.
– Genomics hackathon produced first SNP-level transformer prototypes.

5. Visual Analytics and Interaction (WP 5)
– Launched OnSET for ontology exploration and ALviS for ALS time-series analysis.
– Introduced Droplets, a glyph technique that won the 2024 Bio+MedVis novelty award.
– Added natural-language “Talk-to-Your-Graph” querying to the visual stack.

6. Legal and Ethical Engine (WP 7)
– Completed a comprehensive inventory of legal, ethical and regulatory requirements covering GDPR, Data Governance Act, AI Act and cybersecurity.
– Ongoing deep-dive studies address data-quality obligations under the AI Act and privacy aspects of gut-microbiome data.

Results achieved
– Federated health-data network deployed in four hospitals; first horizontal and vertical learning runs on ALS data completed.
– HERO Ontology and a polystore prototype (Dremio + Ontop + GraphDB) give a unified semantic view of phenoclinical, genomic and imaging data.
– PubMiner AI converts PubMed papers into knowledge-graph triples; SelfEEG and PrivEval libraries released for EEG self-supervision and privacy testing.
– OnSET, ALviS and the award-winning Droplets glyph provide interactive visual exploration of ontologies, time series and high-dimensional cohorts.
– A comprehensive GDPR/EHDS/AI Act legal inventory and threat analysis feed privacy-by-design choices across all technical work.

Indicative impacts
Scientific: reusable FAIR terminological database, knowledge-graph triples and open-source analytics pipelines accelerate multimodal research and PhD training.
Clinical: earlier diagnosis and patient stratification for ALS, Parkinson’s and gut–brain disorders; privacy-preserving analytics model validates EHDS ambitions.
Industrial: five key exploitable software assets at TRL 5-6 open new revenue streams in federated AI, biomedical text mining and visual analytics.
Societal: Health Social Labs and multilingual terminology improve data-sharing trust and health literacy among citizens and clinicians.

HEREDITARY WP Organization