Improving Diagnoses of Mental Retardation in Children in Central Eastern Europe and Central Asia through Genetic Characterisation and Bioinformatics/-Statistics

Final Report Summary - CHERISH (Improving diagnoses of mental retardation in children in Central Eastern Europe and Central Asia through genetic characterisation and bioinformatics /-statistics)

Intellectual disability (ID) is a neuro-developmental disorder characterised by a substantial limitation in cognitive functioning that manifests before the age of 18 years. It is one of the main disabling conditions in children and adults, and is estimated to affect 1 - 3 % of the population. It is a highly heterogeneous disorder, and the majority of cases have a genetic aetiology. Specific diagnosis and prevention depend on the identification of the gene defects and chromosome abnormalities associated with this disorder. Recent advances in gene identification techniques have revolutionised the clinical approach to ID, and the CHERISH project contributed significantly to the diagnostic investigation of affected children in the countries involved. Patients had the chance to undergo advanced genetic testing and, at the same time, young researchers had the opportunity to be trained in the field of molecular cytogenetics. During the first months of the project, an interdisciplinary Eastern European and Central Asian (EECA) consortium of experts was established to lay the basis for a significant improvement in clinical, educational and laboratory diagnostics in the field of the genetics of ID. A standardised approach to the diagnosis of ID, with common criteria for patient selection, was defined and allowed the creation of a large collection of samples from patients with clinically well-defined ID. More than 1000 patients with undiagnosed ID were sampled and analysed during the project, aiding the detection of new genetic causes of neurodevelopmental disorders. Our activity improved the diagnosis of ID in children in Eastern European countries, and the results provided fundamental insights for genetic counselling and management for many families.

Project context and objectives:

The overall goal of the CHERISH project was to establish an interdisciplinary EECA consortium of experts to perform a research program of clinical, scientific and public activities for generation of new knowledge about genetic causes of ID. The main objectives of the project were:
1) the development of a standardised approach for clinical diagnosis of ID;
2) the creation of a large database and bio-bank of DNA samples from patients with clinically well-defined ID of unknown genetic aetiology;
3) the identification of cryptic chromosomal rearrangements by molecular cytogenetic analysis;
4) the identification of new mutations and genes responsible for ID by linkage analysis in familial cases, sequencing of known genes and next generation sequencing of the whole exome of patients with ID;
5) the dissemination of knowledge about the project and its results in scientific publications, meetings and workshops for researchers, clinicians and society.

The first months of the project focused on setting up clinical standards for patient inclusion through dedicated meetings. The consortium identified a web-based database tool (please see http://www.cartagenia.com online) for managing all the information on the phenotypic and genetic characteristics of the included patients. This provided a secure system for sharing data to which only the partners had access, and made sample anonymisation a routine process.

Partners of the consortium collected a blood sample from every patient enrolled in the project. For sporadic cases, samples from parents were collected whenever possible, while for familial cases, blood sample collection included patients and other family members who gave their consent. Before enrolment, patients underwent preliminary testing that included standard karyotyping and, where indicated, fragile X syndrome molecular analysis, FISH for known microdeletion / microduplication syndromes, FISH or MLPA analysis for subtelomeric rearrangements, and neuro-metabolic screening.

One of the main objectives of the global project, and of work package (WP) 3 specifically, was the discovery / definition of genomic copy number variants (CNVs) by identifying cryptic genomic rearrangements using molecular cytogenetic analysis (whole genome array-comparative genomic hybridisation (CGH)). Therefore, array-CGH experiments on part of the enrolled patients were performed in the laboratories of Alma Mater Studiorum - Università di Bologna (UNIBO), the Cyprus Institute of Neurology and Genetics (CING) and the Institute of Medical Genetics (IMG), Tomsk, Russia, from month 8 to month 36 of the project. In these patients, previously performed conventional karyotyping, targeted FISH, molecular tests and investigations for metabolic disorders had not revealed any causative anomaly. Array-CGH technology led to the identification of many cryptic rearrangements in the CHERISH cohort of patients, resulting in a better molecular diagnosis of ID, and proved fundamental for selecting the right patients for exome analysis.

In parallel, a large cohort of patients was selected to undergo CNV analysis through single nucleotide polymorphism (SNP)-array analysis. This analysis was mainly performed at the laboratory of the University of Tartu in Estonia (UTARTU). Selection criteria were substantially the same as for the array-CGH analysis. This technology has the advantage of a higher resolution than array-CGH, and allows the identification of stretches of homozygosity, which can suggest a region of uniparental disomy or of true homozygosity in the case of parental consanguinity. On the other hand, because of its resolution, SNP-array analysis makes interpretation of results more complex: a high number of CNVs are identified, most of them benign. In order to exclude such neutral variations, identified CNVs were compared with those recurrently present in the Database of Genomic Variants and in the databases of national general populations. The potential clinical significance of CNVs not present in normal individuals was evaluated using the OMIM and DECIPHER databases and peer-reviewed literature searches in the PubMed database. The genomic context of aberrant regions was studied using the Ensembl database version 54 (based on NCBI build 36). The presence of genomic aberrations of potential clinical relevance was confirmed by quantitative PCR.
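
The exclusion of neutral variation described above can be pictured with a minimal Python sketch. It assumes, purely for illustration, that a call is treated as benign when at least half of its length is covered by a catalogued benign CNV of the same type. The `CNV` class, the `exclude_benign` helper, the 50 % threshold and all coordinates are hypothetical and are not taken from the project's actual procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int  # first affected base pair
    end: int    # last affected base pair
    kind: str   # "del" or "dup"


def covered_fraction(call: CNV, reference: CNV) -> float:
    """Fraction of `call` covered by `reference`; 0.0 if chromosome or type differ."""
    if call.chrom != reference.chrom or call.kind != reference.kind:
        return 0.0
    shared = min(call.end, reference.end) - max(call.start, reference.start) + 1
    return max(shared, 0) / (call.end - call.start + 1)


def exclude_benign(calls, benign_catalogue, min_fraction=0.5):
    """Keep only calls that no catalogued benign CNV covers by `min_fraction` or more."""
    return [
        call for call in calls
        if all(covered_fraction(call, ref) < min_fraction for ref in benign_catalogue)
    ]


# Hypothetical data: one call inside a catalogued polymorphism, one novel call.
benign_catalogue = [CNV("8", 39_200_000, 39_400_000, "del")]
calls = [
    CNV("8", 39_250_000, 39_380_000, "del"),  # fully inside the benign region
    CNV("4", 10_000_000, 11_500_000, "del"),  # no catalogued counterpart
]
for call in exclude_benign(calls, benign_catalogue):
    print(call)
```

Only the second call survives the filter and would move on to the database and literature review. In practice the overlap rule and threshold are a matter of laboratory policy.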

Another main objective of the project was the identification of new genes involved in the development of ID. The classical approach to gene identification is based on the recognition of families with multiple affected members, linkage analysis through the study of short tandem repeat polymorphisms, and sequencing of candidate genes inside the regions showing positive linkage scores. Only a few families of sufficient size were collected by the CHERISH consortium. SNP-array analysis was performed to identify stretches of homozygosity in one consanguineous family during the first part of the project, while two further families underwent homozygosity mapping in the second part of the project.

Candidate genes for ID can also emerge from CGH / SNP-array analysis. Identified deletions are often very large and may contain a high number of genes, making gene prioritisation a difficult task, but smaller deletions can give interesting insights and suggest specific genes as causative for ID. In this regard, two genes involved in small interstitial deletions (CADPS2 on chromosome 7q31 and PCDH18 on chromosome 4q28) were considered excellent candidates and deserved further study: sequencing of CADPS2 coding exons was performed in further patients with ID and with autism spectrum disorders, in order to identify point mutations.

Technological development that took place after the beginning of the project made it possible to analyse almost all the known exons (protein-coding gene portions) of an individual in a single experiment at a reasonable cost. This technology, termed whole-exome sequencing (WES), has allowed the identification of disease-causing genes for various genetic disorders in recent years, and has also opened the possibility of speeding up the molecular diagnosis of heterogeneous genetic disorders such as ID. It revolutionised disease-causing gene identification, rapidly making the classical approach obsolete. The main issue with WES is the analysis of the hundreds of genetic variants identified in every single subject. Knowledge of linkage regions in a specific family (or of homozygosity stretches in consanguineous families) can nevertheless help the process of gene prioritisation, given the extremely high number of variants identified with WES.

Affected individuals from families with ID were selected for WES. Only families with multiple ID-affected members were selected: three families with an X-linked pattern of inheritance, one family with an autosomal dominant pattern, six families with an autosomal recessive pattern and two families with an unclear pattern (two affected brothers, compatible with either autosomal recessive or X-linked inheritance). Overall, we planned to perform WES in 21 affected subjects from 12 families.

Project results:

Identification of cryptic genomic rearrangements in patients with both syndromic and non-syndromic ID was one of the mainstays of the CHERISH project. The analyses were carried out with either CGH-array or SNP-array on different platforms.

Whole genome array-CGH experiments were performed at a relatively low resolution of about 100 kb (44 K, Agilent, United States (US)) in the laboratory of UNIBO in a total of 236 patients from different partners: Italy (104 patients), Russia (56), Ukraine (60), Lithuania (10) and Poland (6). In the laboratory of CING, 104 patients were analysed with a higher resolution array (105 K, Agilent, US): 98 from the Lithuanian cohort, and other patients from Ukraine and Armenia; later in the project, 22 additional patients from Cyprus and Ukraine were analysed with a custom-designed ultra-high resolution 400 K array. DNA samples from 15 Russian patients were analysed directly in the laboratory of the Institute of Medical Genetics (IMG) in Tomsk, using a new micro-array scanner 'SureScan Microarray' at an intermediate resolution (60 K arrays, Agilent). All data were validated with independent techniques. Overall, 377 unrelated patients underwent array-CGH analysis.

SNP-arrays were used to analyse a total of 461 patients from different cohorts at the laboratory of UTARTU: 227 individuals from the Estonian cohort, 115 individuals from the Armenian cohort, 36 individuals from Poland, 60 individuals from the Czech Republic and 23 individuals from Lithuania. Genomic rearrangements were screened by the Infinium HD whole-genome genotyping assay, partly with the HumanOmniExpress version 1.0 BeadChips (Illumina Inc., San Diego, California, US) and partly with the HumanCytoSNP-12 arrays (Illumina Inc.). A further 123 independent patients from the Czech Republic were analysed directly at the laboratory of Charles University in Prague (CUNI). Overall, 584 unrelated patients underwent SNP-array analysis.

Furthermore, a subset of patients with familial ID compatible with an X-linked pattern of inheritance was analysed at the CING laboratory with a custom-designed X-chromosome exon-specific array. Specifically, 52 patients from 50 independent families, the majority of which were from Poland (24), Lithuania (11) and Cyprus (7), were analysed with this array, which allows identification of X-chromosome imbalances at the resolution of a single exon.

Identified genomic variants were classified as pathogenic / likely pathogenic versus benign / likely benign based on common criteria, mainly whether the CNV overlapped a well-established microdeletion / microduplication syndrome, whether it was of de novo origin or inherited, whether it involved genes already associated with ID, and whether the aberration was large. There are several examples in the medical literature of exceptions to these rules, but the criteria retain strong value, at least as a first-line analysis.

The percentage of identified CNVs varies between 10 and 20 % in the different national cohorts, in line with studies published in recent years. Nevertheless, this was a fundamental goal of the project: it helped to improve the diagnostic process for children with ID in Eastern European countries, facilitated the spread of expertise among countries (many young scientists travelled between partner laboratories to gain knowledge of specific technologies) and allowed some of the partners' labs to become independent in CNV analysis. We provide the overall results of array analysis in the different cohorts of patients, with a focus on the most interesting results from a scientific perspective.

Lithuania - Department of Human and Medical Genetics, Vilnius University (DHMGVU)

During the project, 233 patients with ID and 342 healthy individuals were enrolled. Among these, 131 patients were selected for molecular karyotyping: 10 patients were investigated with low resolution array-CGH, 98 patients with high resolution array-CGH, and 23 patients with a SNP genotyping-array platform. Additionally, X-chromosome exon-specific array analysis was performed in 11 patients. Pathogenic chromosomal aberrations were detected by molecular karyotyping in 16 patients (diagnostic yield of 12.2 %). Some of these alterations were unique and contributed to the identification of novel candidate genes for ID.
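
As a quick check of the figures above, the diagnostic yield is simply the number of patients with a pathogenic finding divided by the number tested. The short Python sketch below reproduces the calculation from the counts given in the text; the function and variable names are illustrative only.

```python
def diagnostic_yield(n_positive: int, n_tested: int) -> float:
    """Percentage of tested patients in whom a pathogenic aberration was found."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return 100.0 * n_positive / n_tested


# Patients per platform in the Lithuanian cohort, as stated in the text.
patients_per_platform = {
    "low-resolution array-CGH": 10,
    "high-resolution array-CGH": 98,
    "SNP genotyping array": 23,
}
n_tested = sum(patients_per_platform.values())  # 131 patients
n_pathogenic = 16

print(f"{n_pathogenic}/{n_tested} = {diagnostic_yield(n_pathogenic, n_tested):.1f} %")
# prints "16/131 = 12.2 %"
```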

Patient LIT-001-030 has a complex de novo rearrangement, with a double balanced translocation involving four chromosomes, t(3;14) and t(6;20), and an interstitial deletion at 5p14.1-p14.3. He has severe developmental delay. Array-CGH analysis showed no dosage changes for the chromosomes involved in the translocations, confirming that they were both balanced, but revealed an additional 3.9 Mb de novo interstitial deletion at 5p14.1-p14.3 (position 23025478-26938536, NCBI build 36), in the region of 'Cri du chat' syndrome. Deletion on chromosome 5p leads to a variety of developmental defects, with most cases classified as 'Cri du chat' syndrome (MIM 123450). The patient's clinical phenotype was atypical, suggesting that one or more of the translocation breakpoints may have an additional effect on the patient's clinical picture. The present case describes an atypical form of 'Cri du chat' syndrome and stresses the need for both conventional karyotyping and array-CGH analysis in cases of genotype - phenotype inconsistency. Such an approach may contribute to more accurate diagnosis and genetic counselling.

An unexpected rearrangement was detected in patient LIT-001-054, a girl with developmental delay and a small de novo 264 kb interstitial duplication in the Sotos syndrome region at 5q35.3, in close proximity to the critical NSD1 gene. Phenotypic features of both prenatal and postnatal overgrowth, macrocephaly and developmental delay are present in the patient and are suggestive of Sotos syndrome (due to deletions / point mutations in NSD1), rather than the recently reported syndrome due to a reciprocal duplication. The duplication identified in this girl is located immediately downstream of the NSD1 gene, a region that appears critical for the expression of the gene: regulatory elements might be disrupted, or the expression of a critical gene that is not itself duplicated might otherwise be affected by the duplicated region. Thus, when evaluating identified CNVs, attention should be paid to the possible influence of a chromosomal rearrangement on distant genes. This case demonstrates that the size of a chromosomal alteration and its gene content are not sufficient for assessing a CNV's pathogenicity; the context of adjacent genes should be considered as well.

We evaluated a boy (LIT-001-025) with severe developmental delay, seizures, microcephaly, hypoplastic corpus callosum, internal hydrocephalus and dysmorphic features (narrow forehead, round face, deep-set eyes, blue sclerae, large and prominent ears, nose with anteverted nares, thin upper lip, small and wide-spaced teeth, hyperextensibility of the elbows, wrists, and fingers, fingertip pads, broad hallux, sacral dimple), carrying a 1.53 Mb interstitial deletion at 4q28.3. The deletion was detected by Agilent 105 K array and involves the genomic region between 137 417 338 and 138 947 282 base pairs on chromosome 4 (NCBI build 36). The alteration was inherited from the healthy mother. Only the PCDH18 gene is involved in the deleted region. The gene product - the cadherin-related neuronal receptor - is thought to play a role in the establishment and function of specific cell-cell connections in the brain, and haplo-insufficiency of the gene may play a role in the development of brain dysfunction and associated malformations. We consider this deletion a private inherited copy number variation associated with specific clinical findings in our patient, and the PCDH18 gene is a possible candidate gene for ID. This finding nonetheless deserves further study.

Patient LIT-001-150 is a 14-year-old girl with mild ID and facial dysmorphisms, including macrocephaly, ocular hypertelorism, low set ears and other features. A de novo 7p22.1 microduplication, 1 Mb in size, was detected by array-CGH analysis. Microduplication of the 7p22.1 region, 1.7 Mb in size, was recently reported as a specific syndrome associated with characteristic facial features and speech delay. The patient that we describe shows similar features but carries a smaller duplication that includes 15 RefSeq genes. Among them, the ACTB gene is a strong candidate gene for anomalies of craniofacial development. Further cases with similar duplications will contribute to the delineation of the 7p22.1 microduplication syndrome.

A previously undescribed de novo 17q21.33 microdeletion, 1.8 Mb in size, was detected in a hyposomic patient (LIT-001-013) with mild ID and dysmorphic features (microcephaly, long face, large beaked nose, thick lower lip, micrognathia). The deletion was detected by whole-genome SNP-array and involves the genomic region between 45 682 246 and 47 544 816 bp on chromosome 17 (NCBI build 36). Among the 24 RefSeq genes included in this deletion, the CA10 and CACNA1G genes are involved in brain development and neurological processes. A possible candidate gene for the prenatal and postnatal growth retardation is CHAD, whose product 'chondroadherin' is a cartilage protein with cell binding properties. These three genes may be partially responsible for the patient's phenotype.
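
The deletion sizes quoted for the Lithuanian patients follow directly from the reported NCBI build 36 coordinates. The Python sketch below recomputes them as a consistency check; the helper name `cnv_size_mb` is illustrative, and the size is taken as the simple difference between the end and start positions.

```python
def cnv_size_mb(start_bp: int, end_bp: int) -> float:
    """Approximate CNV size in megabases from its start and end positions."""
    return (end_bp - start_bp) / 1_000_000


# (patient, region, start, end, size in Mb as reported in the text)
deletions = [
    ("LIT-001-030", "5p14.1-p14.3", 23_025_478, 26_938_536, 3.9),
    ("LIT-001-025", "4q28.3", 137_417_338, 138_947_282, 1.53),
    ("LIT-001-013", "17q21.33", 45_682_246, 47_544_816, 1.8),
]

for patient, region, start, end, reported in deletions:
    size = cnv_size_mb(start, end)
    print(f"{patient} del {region}: {size:.2f} Mb (reported {reported} Mb)")
```

The computed sizes are about 3.91, 1.53 and 1.86 Mb, in agreement with the rounded values given in the text.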

Russia - Institute of Medical Genetics, Russian Academy of Medical Sciences (IMG)

A total of 206 families were recruited according to standardised consortium criteria for clinical examination. Conventional cytogenetic analysis was normal in all patients who underwent further testing. The most common ID-associated conditions, such as fragile-X, Prader-Willi and Angelman syndromes were excluded in selected patients.

DNA samples from 70 patients were analysed by array-CGH to identify cryptic chromosomal rearrangements. 56 DNA samples were studied at the University of Bologna (Italy) using a 44 K array (Agilent). DNA samples from 15 patients were analysed directly at IMG in 2012 using a new microarray scanner 'SureScan Microarray' (Agilent) and 60 K arrays. One patient was analysed independently with both 44 K and 60 K arrays. Confirmation studies by real-time PCR were performed in joint research at DHMGVU.

Known microdeletion / microduplication syndromes were diagnosed in 7 patients (10 %). They include del22q11.21, dup22q11 (2 patients), del15q24 (2 patients), del16p11.2 and dup1q25. The size of the rearrangements varied from 227.6 and 286 kb for dup22q11 syndrome to 6.14 Mb (for dup1q25 syndrome) and 8.1 Mb (for del15q24 syndrome). Both patients with microduplication 22q11 had a benign CNV in 8p11.22 (deletion of 127 kb) and, moreover, one of these patients carries an additional 2.3 Mb duplication of 15q11.1-q11.2 of unknown clinical significance. This duplication affects the region of a chromosomal breakpoint close to the Prader-Willi and Angelman syndrome region. The contribution of different CNVs in combination is a very promising field of research, especially for those aberrations, such as dup22q11, that show phenotypic variability and are often inherited from a healthy parent.

Benign CNVs or CNVs of unknown clinical significance were detected in 27 % and 40 % of patients using 44 K and 60 K arrays, respectively. This category of molecular cytogenetic results was classified according to information in the Database of Genomic Variants (please see http://projects.tcag.ca/variation/ online).

New regions of chromosomal imbalance, not previously described in the literature in relation to ID, were detected in six patients (8.6 %). They include del3p26.3, dup3p26.3, del5pter-p15.2 with del5q13.3, del11p13, dup14q11.22 and dup15q22. Among these, a 369 kb deletion and a 766 kb duplication affecting the same 3p26.3 region are notable: the affected region contains the contactin 6 gene (CNTN6), which may play a role in the formation of axon connections in the developing nervous system. It is possible that a copy number imbalance of CNTN6 is responsible for ID in patients with these aberrations.

Deletion of 11p13 (1.155 Mb) partially affects the region of the known WAGR syndrome (Wilms' tumour, aniridia, genitourinary anomalies, mental retardation). The deletion affects nine genes, some of which (SLC1A2, FJX1, TRIM44 and LDLRAD3) are important for central nervous system development and function.

The duplicated region 14q11.22 (1.9 Mb) contains two genes, SLC7A7 and MMP14, and was previously reported in association with lysinuria. In one patient, two deletions on chromosome 5 were diagnosed. The deletion of 5pter-p15.2 affects the proximal region of 'Cri du chat' syndrome, the features of which were not characteristic of our patient. The second deletion, 5q13.3, affects the gene responsible for Sandhoff disease (HEXB). The father in this family is an asymptomatic carrier of the same 5q13.3 deletion.

Czech Republic - Charles University Prague, Second Medical School and University (CUNI)

177 well-characterised Czech families with ID were collected. Pathogenic genomic CNVs were identified with CGH or SNP-array in 19 patients (10.7 % of total). Findings of uncertain clinical significance, for which at least some support exists that the identified variants could be causal for ID, were identified in 11 families (6.2 % of total). Finally, 28 families show findings of uncertain clinical significance, the causality of which is currently unclear.

Examples:
- Czech family Cze033, with the 2p15-p16.1 microdeletion syndrome and the shortest deletion among the families reported in the literature, allowed the candidate region to be narrowed down to three protein-coding genes.
- Czech family Cze073 with a de novo deletion of 2p14-p15 could help to define a novel microdeletion syndrome in 2p14.
- Czech family Cze013 with a de novo deletion of 12q13 including the whole HOXC gene cluster could help to define a novel microdeletion syndrome in 12q13.
- Czech families Cze061 and Cze072 with maternally transmitted deletions of 15q25 could help to define a novel microdeletion syndrome in 15q25.
- Runs of homozygosity (ROH) analysis yielded no remarkable results (no remarkably large ROHs which could indicate UPD, none of the patients had an excessively large number or extent of ROHs which could indicate inbreeding, no remarkable ROHs covered the loci of known MR genes, the ROH clusters in the patients were similar to those in normal controls and likely reflected only general population phenomena).

Estonia - University of Tartu, Institute of Molecular and Cell Biology (UTARTU)

During the entire project, 227 individuals from 169 independent families of the Estonian ID cohort were analysed on SNP arrays. Genotype and phenotype information from the respective general population was used as a reference data-set for CNV analysis. Based on data from 1000 unrelated samples from the Estonian Genome Centre of the University of Tartu (EGCUT) analysed with the same algorithms, an Estonian population-specific list of common CNV regions was generated. For data analysis, genotypes were called by BeadStudio software GT module v3.1 (Illumina Inc.). Log R Ratio (LRR) and B Allele Frequency (BAF) values produced by the BeadStudio software were formatted for further CNV analysis and break-point mapping with the Hidden Markov Model based software packages QuantiSNP (ver. 2.1) and PennCNV (ver. 2009aug27). In addition to LRR and BAF values, SNP marker allele frequency data from the Estonian general population were used as the reference in the PennCNV software. Parameters suggested by the authors were used in both QuantiSNP and PennCNV. Only samples with a call rate > 98 % that passed QuantiSNP quality control parameters were analysed. To minimise the number of false positive findings, only CNVs > 50 kb in size, detected by both algorithms and visually confirmed in BeadStudio GenomeViewer, were selected for further interpretation.
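
The false-positive filter described above (call rate > 98 %, CNVs > 50 kb, detection by both algorithms) can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's actual code: the `Call` record and `consensus_calls` function are hypothetical, "detected by both algorithms" is assumed to mean any overlap between calls of the same type in the same sample, and neither the QuantiSNP quality control step nor the visual confirmation in BeadStudio is modelled. The sketch also does not parse real QuantiSNP or PennCNV output files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    sample: str
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"


MIN_CALL_RATE = 0.98  # samples must exceed this call rate (from the text)
MIN_SIZE_BP = 50_000  # CNVs must exceed 50 kb (from the text)


def overlaps(a: Call, b: Call) -> bool:
    """Assumed matching rule: same sample, chromosome and type, intersecting intervals."""
    return (
        (a.sample, a.chrom, a.kind) == (b.sample, b.chrom, b.kind)
        and a.start <= b.end
        and b.start <= a.end
    )


def consensus_calls(quantisnp_calls, penncnv_calls, call_rates):
    """Return QuantiSNP calls that pass the filters and are matched by a PennCNV call."""
    kept = []
    for call in quantisnp_calls:
        if call_rates.get(call.sample, 0.0) <= MIN_CALL_RATE:
            continue  # sample fails the call-rate requirement
        if call.end - call.start <= MIN_SIZE_BP:
            continue  # CNV too small
        if any(overlaps(call, other) for other in penncnv_calls):
            kept.append(call)
    return kept


# Hypothetical example data.
call_rates = {"S1": 0.995, "S2": 0.970}
quantisnp = [
    Call("S1", "1", 1_000_000, 1_200_000, "del"),  # large, seen by both callers
    Call("S1", "2", 5_000_000, 5_030_000, "dup"),  # only 30 kb
    Call("S1", "3", 100_000, 400_000, "dup"),      # QuantiSNP only
    Call("S2", "1", 1_000_000, 1_200_000, "del"),  # sample call rate too low
]
penncnv = [
    Call("S1", "1", 1_050_000, 1_250_000, "del"),
    Call("S1", "2", 5_000_000, 5_030_000, "dup"),
    Call("S2", "1", 1_000_000, 1_200_000, "del"),
]
for call in consensus_calls(quantisnp, penncnv, call_rates):
    print(call)
```

Only the first call passes all three filters. Requiring agreement between two independent callers trades some sensitivity for a lower false-positive rate, which suits a setting where every retained CNV is reviewed manually.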

To exclude neutral variations, CNVs were also compared with those recurrently present in the database of genomic variants and in the databases of national general populations. The potential clinical significance of CNVs not present in normal individuals was evaluated using OMIM and Decipher databases and peer-reviewed literature searches in the PubMed database. The genomic context of aberrant regions was studied using the Ensembl database version 54 (based on NCBI build 36).

The presence of genomic aberrations of potential clinical relevance was confirmed by quantitative PCR. FISH analysis was performed according to the standard cytogenetic protocol in most individuals carrying duplications and in those in whom an unbalanced translocation was suspected.

DNA samples from 169 probands and 58 additional family members, including 18 affected and 40 unaffected individuals, were analysed.

Ukraine - Institute of Molecular Biology and Genetics of the National Academy of Sciences (IMBG)

280 blood samples from members of 99 ID families were collected. All patients underwent preliminary analysis and three families were excluded: one because of a diagnosis of fragile-X syndrome, and two because a clinical diagnosis of Prader-Willi syndrome was confirmed by specific molecular testing. Two patients had chromosomal aberrations on standard cytogenetic analysis, but were nonetheless selected for further studies in order to define the chromosome defect more precisely. Thus, 96 families (271 samples) proceeded to the further stages of the project and were stored in the biobank.

All array-CGH findings have been compared to the CNVs recorded in the Database of Genomic Variants (please see http://projects.tcag.ca/variation/ online) to exclude previously reported polymorphisms. Array-CGH findings obtained for 57 patients have been analysed using DGV and Decipher databases in order to find unique CNVs.

Comparison of the CNVs with previously reported polymorphisms revealed 18 cases that needed confirmation with an independent technique (FISH or real-time quantitative PCR) and characterisation after consulting specific databases (UCSC genome browser, Ensembl, NCBI) to identify potential candidate genes among those involved in the imbalances. Parental samples were investigated when necessary. Confirmation was performed in these 18 cases with pathogenic, probably pathogenic, probably benign and uncertain rearrangements using qPCR (16 cases) and FISH (2 cases). For the qPCR analysis, specific primers flanking the chromosomal aberration were designed and synthesised at IMBG. For the FISH analysis of balanced translocations in two cases, specific probes for the altered chromosomes were used.

In parallel with array-CGH, MLPA analysis for subtelomeric rearrangements was performed in 31 patients using the SALSA MLPA P070 Telomere-5 and SALSA MLPA P036 Telomere-3 kits. Subtelomeric rearrangements were identified in six patients.

Of 18 cases selected, 15 rearrangements were confirmed in the probands and the origin of the aberration was determined by analysing the parents. Four cases turned out to be de novo, and the remaining chromosomal aberrations were of parental origin.
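
Determining the origin of an aberration from parental testing follows a simple decision rule, sketched below in Python. The function `cnv_origin` and its labels are hypothetical and for illustration only; the sketch assumes that biological parentage is confirmed and that a negative parental result is reliable. A CNV is called de novo only when both parents have been tested and neither carries it.

```python
from typing import Optional


def cnv_origin(in_mother: Optional[bool], in_father: Optional[bool]) -> str:
    """Classify the origin of a proband's CNV; None means the parent was not tested."""
    if in_mother and in_father:
        return "present in both parents"
    if in_mother:
        return "maternal"
    if in_father:
        return "paternal"
    if in_mother is False and in_father is False:
        return "de novo"
    return "undetermined"


print(cnv_origin(False, False))  # de novo
print(cnv_origin(True, False))   # maternal
print(cnv_origin(False, None))   # undetermined: father not tested
```

The last case shows why a CNV absent in the mother but with an untested father can only be described as likely de novo, not confirmed.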

Eight cases deserve mention, as they were considered the most promising for candidate gene analysis because of either a de novo rearrangement or familial ID. In four of them, the chromosomal rearrangements were large (more than 6 Mb in size). In two cases, the identified probably pathogenic rearrangements involved 84 and 47 genes respectively. In two cases, a single gene was located in the region of the rearrangement.

In 2 of 15 cases, the CGH analysis results suggested the presence of an unbalanced translocation. A specific FISH analysis was performed in probands, their parents and other affected family members. In both families, healthy fathers were found to carry a balanced translocation.

For the remaining 39 patients without array-CGH data, MLPA analysis for microdeletion syndromes was performed using the SALSA MLPA probemix P245-A2 Microdeletion Syndromes-1 kit. The results showed that three patients have duplications in the MECP2 gene. Confirmation of this rearrangement and its characterisation in probands and their family members was performed with qPCR and revealed all three duplications to be of de novo origin. The MLPA analysis for microdeletion syndromes also revealed four cases with Williams syndrome, four cases with Smith-Magenis syndrome, two cases with 9q22.3 microdeletion syndrome, five cases with 17q21 microdeletion syndrome and three cases with 2p16 microdeletion syndrome. These data should be confirmed using independent techniques.

De novo cases

Patient UKR 031 is a five-year-old boy with moderate mental retardation (IQ 52). Parents are healthy.
Karyotype: 46,XY,del(5)(q15q22) or del(5)(q13q15). Metabolic screening was normal.
Clinical examination showed the presence of dysmorphic features: ptosis, strabismus, short upturned nose, thin upper lip, microretrognathia, large, dysplastic ears, pectus excavatum, muscular hypotonia, cryptorchidism, dry skin and appendages.

44 K array-CGH analysis of the proband revealed a novel del5q15-q22.1. The region of rearrangement involves more than 50 genes, none of which has been previously reported to be associated with ID, although several SNPs within the region have been associated with psychiatric disorders (bipolar disorder, Asperger syndrome, attention deficit disorder with hyperactivity, panic disorder).

Patient UKR 136 is a four-year-old girl with moderate mental retardation. Parents are mentally healthy; the mother has chronic pyelonephritis.
Karyotype: 46, XX.
Clinical examination showed: weight 13 kg, height 98 cm, head circumference 49 cm, neonatal jaundice, dysmorphisms (narrow forehead, epicanthic folds, up-slanted palpebral fissures, large ears, dysplastic ears, microstomia), umbilical hernia, horseshoe kidneys, tonic / clonic seizures, metatarsus valgus, attention deficit, speech delay. Array-CGH analysis revealed two rearrangements: del 2q37.1-qter of maternal origin and a de novo duplication 3q27.3-qter. Both regions contain a considerable number of genes not previously reported to cause ID.

Patient UKR 119 is a four-year-old girl with moderate mental retardation (IQ 62). Parents are healthy.
Karyotype: 46, XX. Metabolic screening was normal.
Clinical examination showed: weight 10 kg, height 105 cm, head circumference 52 cm, hypotonia, autistic features, speech delay. Array-CGH revealed a de novo deletion on chromosome 2 (del2q32.3-q33.1) containing 47 genes. The deletion overlaps with a rearrangement described in Decipher, but phenotypic features are different.

Patient UKR 112 is a two-year-old girl with moderate mental retardation. Parents are healthy.
Karyotype: 46, XX.
Clinical examination showed: weight 10 kg, height 89 cm, head circumference 45 cm, microcephaly, epileptic seizures (petit mal), hypotonia and autistic features. Array-CGH revealed a del Xq28 that was not of maternal origin. Considering that the father is healthy, this rearrangement is very likely to be of de novo origin. The only gene involved in the rearrangement is MECP2 (methyl CpG binding protein 2). Mutations in MECP2 are the cause of Rett syndrome, which in its classic form is a progressive neurodevelopmental disorder characterised by apparently normal psychomotor development during the first months of life, followed by regression in language and motor skills, stereotypic hand movements, autistic features, gait ataxia and apraxia, tremors, seizures, and acquired microcephaly. Previously published cases with partial MECP2 gene deletion have been described as classic Rett or Rett-like syndromes. The patient we describe shows some features of Rett syndrome, although her phenotype is not typical.

Selected variants of uncertain pathogenicity

Family 02 (patients UKR 003, UKR 004, mother UKR 005)

UKR 003 is a six-year-old boy with moderate mental retardation (IQ 52).
Karyotype: 46, XY.
Clinical examination: weight 24 kg, height 119 cm, head circumference 53 cm, facial dysmorphisms (prominent large forehead, prominent teeth with enamel abnormalities). Speech delay.

His sister, UKR 004, is a three-year-old girl with mild mental retardation (IQ 64) and similar clinical features.
Karyotype: 46, XX.

The father (UKR 006) is healthy; the mother (UKR 005) has speech delay and teeth with enamel abnormalities.
105 K array-CGH revealed a dup 12q24.33 of maternal origin in both the proband and his sister. The rearranged region involves only one gene, RIMS-binding protein 2 (RIMBP2). This gene is expressed in neurons in a region-specific manner, although its function is yet to be established. The presence of this duplication in a mildly affected mother and her children suggests the possibility that RIMBP2 is involved in speech development.

Family 08 (patient UKR 022, mother UKR 023)

UKR 022 is a two-year-old girl with moderate ID and hyperactivity. Her mother has mild ID and her father has hydrocephaly.
Karyotype: 46, XX.
She has normal growth and dysmorphic features: epicanthic folds, prominent forehead, heart defect (foramen ovale), high palate, down-slanting palpebral fissures, flat occiput.

The mother (UKR 023) has no dysmorphic features.
105 K array-CGH of the proband revealed a dup 7p21.1 of maternal origin. There are several candidate genes in the rearranged region:
- transcription factor A, mitochondrial pseudogene 1 (overlaps with Decipher syndrome, although the phenotype is different);
- ankyrin repeat and MYND domain containing 2 (ANKMY2) (function is not well studied yet, may be involved in the trafficking of signalling proteins).

Armenia - Center of Medical Genetics and Primary Health Care (CMG)

The analysis of molecular karyotyping data of 95 sporadic Armenian patients (48 non-syndromic and 47 syndromic cases), obtained at Utartu with the Infinium HD whole-genome genotyping assay on HumanCytoSNP-12 BeadChips (Illumina Inc.), revealed 24 aberrations among 23 patients (24 %).

This new approach allowed the detection of an extensive run of homozygosity (ROH) on chromosome 15 in two unrelated patients. The extent of the ROH (> 10 Mb) and its restriction to a single chromosome might be due to uniparental disomy associated with Angelman syndrome.

On average, 4.8 CNVs were detected per genome, with sizes of up to 11.3 Mb. Most of the detected CNVs had already been reported in the Database of Genomic Variants or were recurrently present in our cohort, and were considered as benign, probably benign, or potentially unreliable.

In the other 21 patients, 22 relevant structural aberrations were described; they were categorised as pathogenic (6 deletions and 1 duplication) or probably pathogenic (3 deletions and 5 duplications) and varied in size from 127 Kb to 11.3 Mb. Specifically, 7 microdeletions in genomic regions with established clinical significance were detected in 7 patients.

Novel aberrations associated with intellectual disability were detected in 14 syndromic and non-syndromic cases. Most importantly, 10 rare de novo CNVs (< 5 Mb) were considered to be clinically relevant, and 6 aberrations (> 5 Mb) were also considered to be of potential clinical significance. A manuscript including all these novel CNVs is currently being prepared for submission to international peer-reviewed journals.

The molecular karyotyping of six Armenian families with ID, performed at Utartu with the Infinium HD whole-genome genotyping assay on the HumanOmniExpress BeadChip (Illumina Inc.), revealed both a microdeletion and a microduplication syndrome within a single family (ARM104).

Two brothers (ARM104 001 and 002) with ID presented with features closely matching 3p-deletion syndrome, including developmental delay, muscular hypotonia, epicanthal folds, flat and long philtrum, micrognathia, dolichocephaly, microcephaly, and hypertelorism. The parents are non-consanguineous; in the family history, a paternal aunt showed ID, delayed speech, and obesity. Karyotype analyses showed the deletion of 3p not only in the affected sibs but also in their healthy father (ARM104 004). Analysis of the brothers showed an identical 3p25.3 terminal hemizygous deletion of 10.9 Mb. The deletion encompasses 49 RefSeq genes, including the proposed 1.5 Mb minimal terminal deletion region with the causative CRBN and CNTN4 genes, and two others proposed as major candidates for ID: CHL1, mapped at 3p26.3 distal to the minimal terminal region, and SRGAP3, mapped at 3p25.3 proximal to it. FISH mapping detected a balanced translocation 46,XY,t(3;8)(p25.3;p23.3) in the father and a partial 3p25.3 terminal trisomy in the paternal aunt. The duplicated region contains the GHRL and PPARG genes, which contribute to the obesity and behavioural problems present in the aunt. Despite controversial results related to candidate regions for 3p-deletion syndrome, this is the largest deleted region among the few familial cases reported, encompassing all candidate genes responsible for ID with apparent clinical consequence of 3p-deletion.

Cyprus - The Cyprus Institute of Neurology and Genetics (CING)

The Department of Cytogenetics and Genomics at CING analysed approximately 160 samples from consortium partners, providing two oligonucleotide microarray platforms for whole-genome array-CGH (105 K and 400 K) and a custom-designed X-chromosome exon-specific microarray for screening affected males in X-linked families.

Locally, CING recruited a total of 27 affected individuals (20 male, 7 female) from 15 families, of which 7 were screened using the X-chromosome array and 12 using an ultra-high resolution 400 K array. The strategy employed was to perform array-CGH on the most severely affected individual in the family (if multiplex) and, if a potentially pathogenic variant was identified, to then test all family members including affected sibs (e.g. by RT-PCR or MLPA). For 400 K array-CGH cases, several members of each family, including parents when possible, were screened to aid in the analysis.

Of the 12 Cypriot cases screened using the 400 K array, several aberrations were identified: a total of 4 duplications and 1 deletion, all of which were confirmed using either RT-PCR or MLPA. All aberrations were inherited from a parent; none appeared de novo. Further analysis identified two aberrations of particular interest. One was a 2.1 Mb duplication in 17p13.3, which overlaps the Miller-Dieker region (a region of chromosome 17 associated with incomplete brain development) and includes genes involved in neuronal myelination (ASPA) and neurotransmission (NMDA receptor function - SRR). The other was a 7 Mb duplication in 12q21.1-12q21.3, which includes genes involved in axon guidance (NAV3), neurotransmitter release (SYT-1) and function (potassium channel - KCNC2). Following analysis of the seven patients screened using the X-chromosome array, no unique aberrations were identified.

Poland - Department of Medical Genetics, Poznan University (PUMS)

Array studies were carried out on a subset of Polish patients at Utartu on the SNP-array platform. A PhD student from PUMS performed the experiments, with the supervision and support of local molecular cytogeneticists from Utartu. A total of 36 patients were analysed: 10 patients carried 12 pathogenic or potentially pathogenic changes, while no observable changes were identified in 26 patients. All 12 pathogenic or potentially pathogenic changes were confirmed by qPCR or MLPA in the PUMS laboratory. Of these, 11 changes occurred de novo, while the parental origin of the last change is currently undetermined.

DNA samples from 50 individuals from 25 Polish families were sent to Cyprus for analysis on the X-chromosome exon-specific array. Analysis revealed rearrangements in three probands from different families (POL 90, POL 145, POL 226). Interestingly, in two probands from unrelated families (POL 145 and POL 226), the same duplication (47659243-48101667) was identified. In the remaining case, three different rearrangements are present: two unique duplications (dupX 1199 Kb: 69 554 199 - 70 753 205 and dupX 684.2 Kb: 70 065 966 - 70 750 207) and one deletion (delX 8.5 Kb: 118 913 994 - 118 922 455). These results have been confirmed by qPCR.
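The sizes quoted for these X-chromosome rearrangements can be checked directly against the reported coordinates. The following is a minimal sketch, assuming that size is simply the end coordinate minus the start coordinate; the helper name `span_kb` and the dictionary labels are illustrative and not part of the project's analysis.

```python
# Illustrative check only: coordinates are copied from the text above;
# the helper name and the labels are assumptions made for this sketch.
def span_kb(start: int, end: int) -> float:
    """Return the distance between two genomic coordinates in kilobases."""
    return (end - start) / 1000.0


rearrangements = {
    "dupX shared by POL 145 and POL 226": (47_659_243, 48_101_667),
    "larger dupX in the remaining case": (69_554_199, 70_753_205),
    "smaller dupX in the remaining case": (70_065_966, 70_750_207),
    "delX in the remaining case": (118_913_994, 118_922_455),
}

for label, (start, end) in rearrangements.items():
    print(f"{label}: {span_kb(start, end):.1f} Kb")
```

Under this assumption, the three rearrangements in the remaining case come out at about 1199 Kb, 684.2 Kb and 8.5 Kb, in agreement with the sizes given, and the shared duplication spans about 442 Kb.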

A further 24 Polish patients were analysed at UNIBO by SNP-array. Raw data analysis was performed at PUMS. 68 pathogenic or potentially pathogenic changes were identified in 19 patients. Out of these 68 changes, 61 turned out to be false positives, while 7 changes were confirmed by qPCR. 3 out of these 7 changes were of de novo origin; in three families we have not performed this analysis yet.

Italy - Medical Genetics Unit, University of Bologna (UNIBO)

The UNIBO laboratory performed a total of 236 low-resolution array-CGH experiments. Of these, 104 were 'local' samples, among which a total of 86 CNVs were identified in 60 samples. Most of these variants were classified as 'benign'.

A deletion involving a single gene, CADPS2, was identified in patient ITA.004.01, affected by mild ID and generalised epilepsy with normal brain imaging. The deletion was also identified in his affected sister, who shows highly similar clinical features, but was absent in their father. It is thus likely to be maternally inherited: the mother died of breast cancer but was reported to show behavioural abnormalities. The deletion was characterised in detail by quantitative real-time PCR and maps between exon 4 and exon 28 of this large gene, resulting in a complete disruption of the gene product. Sequencing of CADPS2 was performed in the proband, under the hypothesis of a recessive disorder, but no point mutations were present. Several lines of evidence suggest that CADPS2 is a good candidate gene for ID and autism spectrum disorders (ASD), given that:
1) it is involved in the release of neurotrophins such as BDNF/NT3, as shown from studies on animal models (mice) where CADPS2 was 'knocked out'; these mice are also reported to show autistic traits;
2) alternative spliced forms lacking important domains have been reported to be more frequent in individuals with autism or with lower IQ compared to controls, although data from different groups are controversial;
3) CADPS2 maps to the AUTS1 locus, one of the few linkage intervals for autism susceptibility identified by many independent groups.

Therefore, we performed a mutation screening of CADPS2 in 120 sporadic ID / ASD patients, in order to identify other putative deleterious variants. All exons and exon-intron boundaries were analysed by direct sequencing of the PCR products. Six novel variants were identified in the coding region: four missense changes and two silent changes that may modify putative exonic enhancer sites in exon 1 and exon 6, respectively. All non-coding variants and polymorphisms were already present in public databases (dbSNP). All 6 variants were heterozygous changes and 5 of them were not found in 500 control chromosomes (250 healthy Italian individuals). The segregation of all variants was evaluated in the corresponding pedigrees and an excess of maternal transmission was identified. In particular, the change in exon 26 was inherited from the mother and was not found in an unaffected brother. Conversely, the change in exon 13, the only one also identified in control individuals, was of paternal origin. These data suggest possible parent-of-origin and / or imprinting effects of these coding changes. Considering that CADPS2 maps to chromosome 7q31, where a cluster of imprinted genes has already been reported, we are currently investigating the expression of the different alleles of CADPS2 in controls and affected individuals for whom blood RNA was available, and in different brain regions.

This takes us to the other main goal and last part of the project, namely the identification of new genes involved in the development of ID.

Whole exome sequencing (WES)

Whole exome analysis was performed with the Illumina TruSeq Exome Enrichment kit on the Illumina HiSeq-2000 platform at Utartu. A bioinformatic pipeline for the detection and evaluation of genetic variants emerging from exome data analysis was set up by Utartu and UNIBO.

We performed the following steps:
1. Reads quality check. Reads generated with the sequencing technology were checked using the FastQC program (please see http://www.bioinformatics.babraham.ac.uk/publications.html online).
2. Alignment. Raw reads were aligned with BWA (Li and Durbin, 2009) against the reference genome hg19, and a bam file containing the aligned reads was generated with SAMtools (Li, 2009).
3. Local realignment around indels. Aligned reads from every bam file were realigned locally with the GATK package, in order to transform regions with misalignments into clean reads containing a consensus indel suitable for variant discovery.
4. Duplicate reads removal. PCR duplicate reads were removed using the Picard MarkDuplicates utility (please see http://www.picartools.sourceforge.net online).
5. Quality score recalibration. Base quality scores of aligned reads from every bam file were recalibrated with the GATK package (DePristo, 2011; please see http://www.broadinstitute.org online), in order to obtain more accurate base quality scores.
6. Alignment and coverage metrics. Metrics of every sample's alignment were collected from bam files with SAMtools, in order to check whether the alignment process gave a consistent output. Metrics of the coverage obtained on the targeted exome were calculated using the GATK package, in order to obtain all the available information about the reliability of the variant calling results.
7. Variant calling. Variant positions with respect to the reference sequence hg19 were called in the targeted exome in order to obtain a vcf file with all the sample's variants.
8. Variant filtering. A filter tag was assigned to every variant in the vcf file in order to discriminate between good and bad quality variants.
9. Variant annotation. All the variants were annotated according to NCBI RefGene (please see http://www.ncbi.nlm.nih.gov online) and UCSC KnownGene (please see http://www.genome.ucsc.edu online) databases.
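
As an illustration of step 8, the sketch below assigns a filter tag to each variant in the style of the FILTER column of a vcf file ("PASS", or a semicolon-separated list of failed checks). The report does not state which criteria were actually applied, so the thresholds, the tag names and the `Variant` record used here are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the report does not give the criteria actually used.
MIN_QUAL = 30.0   # assumed minimum variant quality score
MIN_DEPTH = 10    # assumed minimum read depth at the variant site


@dataclass
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float
    depth: int


def filter_tag(variant: Variant) -> str:
    """Return 'PASS' or a semicolon-separated list of the checks that failed."""
    failed = []
    if variant.qual < MIN_QUAL:
        failed.append("LowQual")
    if variant.depth < MIN_DEPTH:
        failed.append("LowDepth")
    return ";".join(failed) if failed else "PASS"


# Toy records, invented for the example (not project data).
toy_variants = [
    Variant("chr1", 100, "A", "G", qual=250.0, depth=60),
    Variant("chr1", 200, "C", "T", qual=12.0, depth=4),
    Variant("chr2", 300, "G", "A", qual=80.0, depth=7),
]

for v in toy_variants:
    print(v.chrom, v.pos, v.ref, v.alt, filter_tag(v))
```

Tagging rather than discarding low-quality calls keeps every variant in the file, so the thresholds can be revisited later without repeating the variant calling.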

Results:

Alignment and coverage statistics
On average, 97 million paired-end reads were mapped to the reference genome (97 % of the total reads), and the vast majority of the total reads were properly paired (95 %). On average, 42.9 % of the total reads overlapped the Illumina TruSeq target regions. Across samples, we obtained an average mean coverage of 71X (range: 43.5X-100.8X) and an average median coverage of 61X (range: 35X-91X). On average, 91 % of the targeted bases were covered at more than 10X (range: 88.8-93.0 %), giving appropriate confidence for variant calling. The amount of target bases covered was between 0X and 100X.
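
The per-sample figures reported here (mean coverage, median coverage, and the share of target bases covered at more than 10X) can all be derived from a list of per-base depths over the target region. The following is a minimal sketch of that calculation; the function name and the toy depth values are assumptions for illustration and do not reproduce the GATK-based metrics used in the project.

```python
from statistics import mean, median


def coverage_summary(depths, threshold=10):
    """Summarise per-base read depths over the target region.

    Returns the mean depth, the median depth, and the fraction of bases
    whose depth is strictly greater than `threshold`.
    """
    if not depths:
        raise ValueError("no target bases supplied")
    above = sum(1 for d in depths if d > threshold)
    return {
        "mean": mean(depths),
        "median": median(depths),
        "fraction_above": above / len(depths),
    }


# Toy depths for ten target bases, invented for the example (not project data).
toy_depths = [0, 5, 12, 40, 75, 80, 90, 100, 64, 71]
summary = coverage_summary(toy_depths)
print(summary)
```

Reporting both mean and median is useful because a few very deeply covered regions can raise the mean without improving coverage over most of the target.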

We detected 18 000 - 19 000 exonic SNVs and 300 - 500 indels in each of the 21 sequenced individuals. The majority of the SNVs (> 98 %) and indels (> 70 %) were reported in dbSNP. The SNVs not present in dbSNP numbered on average 260 (refGene) and 270 (knownGene) per individual, while the indels not present in dbSNP numbered on average 105 (refGene) and 110 (knownGene), respectively.
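
The split between variants already reported in dbSNP and the remaining ones can be sketched as a simple partition on the presence of a dbSNP identifier. The record layout, the `dbsnp_id` field name and the placeholder identifiers below are assumptions for illustration; the project's own annotation output may be organised differently.

```python
def split_by_dbsnp(variants):
    """Partition variant records into (known, novel) by presence of a dbSNP ID.

    A record counts as known when its 'dbsnp_id' field holds an identifier;
    a missing value, None, or '.' (the vcf convention for "no ID") counts as novel.
    """
    known, novel = [], []
    for record in variants:
        rsid = record.get("dbsnp_id")
        if rsid and rsid != ".":
            known.append(record)
        else:
            novel.append(record)
    return known, novel


# Toy records with placeholder identifiers, invented for the example.
toy_variants = [
    {"chrom": "chr1", "pos": 1000, "ref": "A", "alt": "G", "dbsnp_id": "rs0000001"},
    {"chrom": "chr1", "pos": 2000, "ref": "C", "alt": "T", "dbsnp_id": "."},
    {"chrom": "chr2", "pos": 3000, "ref": "G", "alt": "GA", "dbsnp_id": None},
    {"chrom": "chrX", "pos": 4000, "ref": "T", "alt": "C", "dbsnp_id": "rs0000002"},
]

known, novel = split_by_dbsnp(toy_variants)
print(f"{len(novel)} of {len(toy_variants)} variants are absent from dbSNP")
```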

All partners received the output from WES analysis and are currently evaluating the most relevant changes in their corresponding families in order to understand whether the identified variants can be causes of ID. A manuscript describing these findings will be prepared.

Potential impact:

The CHERISH project had an obvious potential impact for the scientific community, as well as for the participating families and for society.

First of all, participation in the consortium provided opportunities for professional growth for young scientists, PhD students and clinicians, who were able to travel to the laboratories of the involved partners to gain specific expertise.

The participating countries have strongly profited from the project in several ways. The first period of the project, which involved training and increasing the awareness of ID, allowed the participation of young scientists and clinicians in courses that addressed different aspects of ID diagnosis and management. This knowledge was then further disseminated in the local scientific and clinical communities, and among the general public. Each team could profit from newly established scientific and clinical contacts, which were realised through mutual visits among partner laboratories. The project also supported the participation of the partner teams in scientific conferences where they could present their findings, learn about the progress in the field, further discuss the methods and interpretation of the results, and find new collaborations. Young researchers and clinicians were preferentially supported to attend scientific meetings.

During the stage of sample collection and clinical characterisation, many local professional collaborators were involved in each country, further increasing the outreach of the project and the public awareness of ID. This preliminary part of the project brought a clear benefit to some families who were identified to carry visible cytogenetic aberrations. These families were excluded from further studies, but nevertheless received a diagnosis explaining their child's phenotype.

The CHERISH project helped to introduce array-based analysis as the first-tier diagnostic test for patients with ID (and multiple congenital anomalies) in participating countries, as suggested by internationally recognized authoritative sources in the field of cytogenetics (e.g. ISCA consortium). Between 10 and 20 % of patients in the different cohorts received a specific diagnosis thanks to array analysis. Every single patient who receives such a diagnosis brings a potential benefit for his / her family, at least in terms of genetic counselling and evaluation of recurrence risks for family members. In other families, unknown variants potentially causative for ID were identified: the immediate clinical relevance is limited in these families, but the findings are of scientific interest as they can point to novel syndromes or novel candidate genes for ID.

While array analysis was the best tool for diagnosis of non-specific ID at the beginning of this project, research and diagnosis in medical genetics have undergone a real revolution in recent years with the advent of new technologies for DNA sequencing, allowing analysis of all the known genes of an individual in one single experiment (WES). This approach was the object of a specific amendment submitted during the last year of implementation of the project. We thus had the opportunity to introduce this new technology in Eastern European countries, contributing to its widespread use. The use of such advanced molecular diagnostic tools in the definition of genetic disorders will eventually open a new phase in the prevention of the manifestations of many monogenic as well as complex genetic disorders.

In order to raise awareness of ID in general and of the project's results specifically among the scientific community and the general public, the consortium developed a dissemination strategy based on the use of different channels: scientific publications, a scientific symposium, a CHERISH stand during the annual ESHG meeting, three newsletters, an international workshop, three targeted events for researchers, a 'meet the experts' symposium, a multilingual brochure for families and patients' associations, and a press conference.

The last newsletter, containing information concerning the consortium dissemination activities, scientific publications, portal updates and project meetings, was sent in July 2012 to 44 000 e-mail contacts. All CHERISH newsletters are freely available on the project website.

Educational brochures and informative leaflets were prepared in English, translated into the languages of participating countries and provided to a wide range of audiences including families, parent groups, health care professionals and the general public. The educational brochure was distributed at month 18 of the project and a new version was prepared in the final months.

Finally, all the partners were involved in dissemination of the project objectives and results at international meetings, such as the yearly conferences of the European Society of Human Genetics (ESHG), but also in the local organisation of meetings with families and patients' support groups. Along this line, towards the end of the project a 'Meet the experts' symposium addressed to patients, patients' families and associations was organised in Bologna. The CHERISH project was presented and patients and their representatives had a chance to discuss their personal experiences.

During the ESHG Annual Conference in Amsterdam, the Netherlands on 28-31 May 2011, the CHERISH project had a stand in the international exhibition area, where copies of dissemination materials (handouts, leaflets, posters) were distributed to more than 500 participants. During the meeting the EGF staff also recorded some interviews with consortium members. All the interviews are available on the project portal.

A series of full-day scientific sessions dedicated to the CHERISH project was organised in Bologna within the frame of three residential courses (basic and advanced course in genetic counselling in practice, course on molecular and statistical genetics of consanguinity, and the 25th course in medical genetics) planned by the European Genetics Foundation. The consortium had the opportunity to present the results of the project to a wide international audience (more than 100 students and faculty members) and to underline the strength of the project and the importance of creating international networks for sharing genetic practices in a cross-cultural setting.

Dissemination activities reached their climax on 24 May 2012, when UNIBO organised, in collaboration with the City Hall of Bologna and the European Genetics Foundation, a public genetics awareness event in the presence of Nobel Laureate Professor Mario Capecchi. The event was introduced by the Rector of the University of Bologna, Prof. Ivano Dionigi, and by the Vice-Mayor of Bologna, Prof. Silvia Giannini. The entire city was involved in the event and, during this unique occasion, the CHERISH partners briefly presented their work and introduced the new powerful diagnostic tools that medical genetics offers for the diagnosis of ID, just before Professor Capecchi's lecture for the general public. CHERISH leaflets were distributed to 200 participants. The event was also webcast and a recorded version of the partners' presentation is available on the project website.

List of websites:
Project website: http://www.cherishproject.eu

P1) Alma Mater Studiorum - University of Bologna (UNIBO) Unit of Medical Genetics - Prof. Giovanni Romeo (Coordinator)
Tel: +39-051-2088420
Fax: +39-051-2080416
E-mail: giovanni.romeo@unibo.it; romeo@eurogene.org

P2) University of Tartu (Utartu), Institute of Molecular and Cell Biology - Prof. Ants Kurg (Team leader)

P3) Vilnius University (DHMG-VU) Department of Human and Medical Genetics - Prof. Vaidutis Kucinskas (Team leader)

P4) Charles University Prague (CUNI), Department of Biology and Medical Genetics - Prof. Zdenek Sedlacek (Team leader)

P5) Poznan University of Medical Sciences (PUMS), Department of Medical Genetics - Prof. Anna Latos-Bielenska (Team leader)

P6) Institute of Molecular Biology and Genetics of the National Academy of Sciences of Ukraine (IMBG), Department of Human Genomics - Prof. Ludmila A. Livshits (Team leader)

P7) The Cyprus Institute of Neurology and Genetics (CING), Department of Cytogenetics - Dr Philippos C. Patsalis (Team leader)

P8) Institute of Medical Genetics (IMG), Russian Academy of Medical Sciences, Cytogenetics Laboratory - Prof. Igor Lebedev (Team leader)

P9) Center of Medical Genetics and Primary Health Care of Armenia (CMG), Department of Genetics and Cytology - Prof. Tamara Sarkisjan (Team leader)

P11) European Genetics Foundation (EGF) - Matteo Dutto (Team leader)

Project manager (UNIBO): Adriana Ecaterina Dascultu adriana.dascultu@unibo.it
Dissemination manager (UNIBO): Serena Paterlini serena.paterlini2@unibo.it
