Periodic Reporting for period 3 - Solve-RD (Solving the unsolved Rare Diseases)

Reporting period: 2021-01-01 to 2022-06-30

Summary of the context and overall objectives of the project

“Solve-RD – solving the unsolved rare diseases” is a research project funded by the European Commission for five years (2018-2022). It echoes the ambitious goals set out by the International Rare Diseases Research Consortium (IRDiRC) to deliver diagnostic tests for most rare diseases by 2020. The current diagnostic and subsequent therapeutic management of rare diseases is still highly unsatisfactory for a large proportion of rare disease patients – the unsolved rare disease cases. For these unsolved rare diseases, we are unable to explain the etiology responsible for the disease phenotype, predict the individual disease risk and/or rate of disease progression, and/or quantify the risk of relatives developing the same disorder.

Our main ambitions are thus
• to solve large numbers of rare diseases for which a molecular cause is not yet known, using sophisticated combined omics approaches, and
• to improve diagnostics of rare disease patients through contribution to, participation in and implementation of a “genetic knowledge web” based on shared knowledge about genes, genomic variants and phenotypes.

Solve-RD fully integrates with the newly formed European Reference Networks (ERNs) for rare diseases.
Four ERNs (ERN-RND, -EURO-NMD, -ITHACA, and -GENTURIS) form the core of Solve-RD, but we will reach out to patient cohorts across all 24 ERNs as well as the undiagnosed disease programs in order to achieve our aims.

Solve-RD identified 3 main challenges and will deliver 7 implementation steps to address these challenges in work packages 1-3:
Challenge 1: Accessibility of unsolved rare disease cohorts with comprehensive genetic and phenotypic data
Challenge 2: New and improved approaches for the discovery of novel molecular causes
Challenge 3: Translate discoveries to patients’ lives and clinical practice

Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

From 01/2021 to 06/2022, Solve-RD continued to implement the activities addressing the three challenges.

We reached our goal and collected 21,348 datasets (phenotypic & exome/genome sequencing data) from unsolved RD patients and their family members. Standardised phenotypic information (HPO, ORDO & OMIM encoded) has been collected via the RD-Connect GPAP PhenoStore module.

We have continued producing and enriching the Rare Disease Case Ontology (RDCO). Up to now, RDCO has been populated with 412,500 similarity associations.

Re-analysis of data freezes 1 & 2 has been done by the Data Analysis Task Force (DATF) working groups. Results have been prioritized and interpreted by the Data Interpretation Task Forces (DITFs) of each ERN. This collaborative effort led to the diagnosis of 511 patients from 6,003 families. This is already an 8.5% additional diagnostic rate, although many of the analyses and evaluations are still ongoing. The new re-analysis approach pursued in Solve-RD, as well as the structures we established to ensure the best exchange of expertise, have been published in a series of papers in the EJHG in June 2021.

Service providers have been chosen for all novel omics technologies.
SOPs for biomaterials have been shared with all ERN partners via the DITFs. Sample shipment is progressing only slowly; however, >3,000 samples have been sent to the central lab in Nijmegen for QC and then further distributed to the respective service providers.

The RDMM-Europe brokerage service, which connects Solve-RD partners who discovered novel RD genes with model organism scientists who have the expertise to functionally validate these genes and variants, opened 10 calls for Connection Applications. 36 Seeding Grants have been awarded so far.

The co-designed models for the communication of genomic results for RD have been published. The main finding at both study sites, in the Czech Republic and the UK, was the identification of post-test care as the shared priority for improvement for both health professionals and families.

The conference ECOgenomics “European Conference on the Diffusion of Genomic Medicine: Health Economics & Policy” took place online from 26-28 May 2021. Plenary sessions and parallel thematic sessions brought together researchers from the human sciences (mainly health economics), but also researchers from other disciplines.

The Treatabolome database has been released, and an API connects it to the RD-Connect GPAP to make information about treatable genes and variants of RDs accessible and to improve the visibility of existing variant-specific treatment options to clinicians and their patients at the time of diagnosis. It includes data from 10 systematic literature reviews, of which six have been published in a special issue of the JND in May 2021.

The data flow system, which involves GPAP, the EGA, omics service providers, the sandbox and RD3, has continuously been adapted to the project’s needs. GPAP’s new Cohort App module facilitates the exploration and construction of cohorts based on experiment metadata and structured clinical data using standard ontologies, to improve the analysis of defined cohorts.
The FUSE client enables access to files stored at the EGA via the Sandbox and also supports real-time visualisation in a genome browser like IGV when analysing data in the GPAP.

Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

Solve-RD meets the biggest challenge of diagnosing patients with RD since the implementation of NGS technology. Despite extensive studies of WES in numerous RD patient cohorts, at least 50% of all patients remain unsolved. By applying comprehensive advanced bioinformatics algorithms in four major patient cohorts we anticipate increasing diagnostic yield by about 3-5%. So far, we have managed to solve 511 previously unsolved cases among the 6,003 cases included in freeze 1 & 2 for re-analysis. This is already an 8.5% additional diagnostic rate, although many of the analyses and evaluations are still ongoing (Laurie et al., manuscript in preparation).

The extension of DNA analysis from WES to WGS in >2,500 well characterized patients is expected to raise this sensitivity to about 60-70%. WGS bioinformatics developed together with RD-Connect will lead to a standardised analytical tool, not yet available anywhere in the world, for approaching genome data for structural variants. Also, no standardised multi-omics approach exists so far, neither at the experimental nor at the bioinformatics level. Solve-RD will develop specific strategies for the different patient cohorts to cope with these complex analyses while simultaneously addressing cost-effectiveness. The connection of sophisticated diagnostic approaches will only be successful with deep phenotyping of patient cohorts. The participating and the associated ERNs will select cohorts of >800 unsolved patients with highly peculiar (ultra-rare) phenotypes, increasing the chance to find novel disease genes and novel disease mechanisms.
We anticipate solving >2,000 cases, which will translate into a number of new genes and disease mechanisms to be discovered in the course of Solve-RD. For the first time in Europe, we have established a novel brokerage structure connecting clinicians, gene discoverers and basic researchers in a highly flexible and efficient way to quickly verify novel genes and disease mechanisms. To communicate genomic sequencing information to families, an evidence-based approach has been taken.
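
The additional diagnostic rate reported for data freezes 1 & 2 can be checked from the two counts given in this summary. The short sketch below assumes the rate is simply the number of newly solved cases divided by the number of re-analysed cases; the function name and constants are illustrative and not part of any Solve-RD software.

```python
def additional_diagnostic_rate(solved: int, total: int) -> float:
    """Return newly solved cases as a percentage of all re-analysed cases.

    Assumption: the reported rate is a plain ratio of the two counts.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * solved / total


# Counts taken from the report text (data freezes 1 & 2).
SOLVED_CASES = 511
REANALYSED_CASES = 6003

rate = additional_diagnostic_rate(SOLVED_CASES, REANALYSED_CASES)
print(f"Additional diagnostic rate: {rate:.1f}%")
```

Under this assumption the result rounds to the 8.5% stated in the report, which lies above the 3-5% increase anticipated from re-analysis alone.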