European multidisciplinary ALS network identification to cure motor neuron degeneration

Final Report Summary - EURO-MOTOR (European multidisciplinary ALS network identification to cure motor neuron degeneration)

Executive Summary:
European multidisciplinary ALS network identification to cure motor neuron degeneration (EuroMOTOR)
The objective of the project (February 2011 – February 2016) was to discover new causative and disease-modifying pathways to pave the way for novel therapies in Amyotrophic Lateral Sclerosis (ALS). To achieve this goal, large-scale quantitative data sets were generated in order to integrate them and deliver an ALS computational model. Most importantly, EuroMOTOR has established that the genetic architecture of ALS is vastly different from what was thought 5 years ago: ALS is less polygenic than many other diseases, including Alzheimer’s disease and schizophrenia, and is instead characterized by a distinct rare variant architecture. This has had profound consequences for the ability to apply computational causal modelling in ALS.

Pillar (I) Data Generation
A shared and harmonized database was set up for which a total of 1818 cases and 3254 controls have been recruited in Ireland, Italy and the Netherlands. Blood samples were collected for ~omics studies, and clinical/environmental/lifestyle data for exposome studies, according to our newly generated European guidelines.
Pillar (II) Establishment of in vivo and in vitro ALS models for identification of diagnostic and prognostic biomarkers
Genomics- Rather than identifying only genomic loci through genome-wide association studies (GWAS), the rare variant architecture of ALS allowed for the identification of “risk genes” through a combination of GWAS and whole genome sequencing. In doing so, we have established that C21orf2 and NEK1 are novel ALS risk genes, and that the two proteins interact. Also, two additional GWAS loci that may harbour ALS risk genes were replicated.
Proteomics- For protein-protein analysis, a quantitative proteomics strategy was developed and employed to investigate the various facets of ALS. We revealed the composition of toxic protein aggregates and provided the first protein inventory of primary motoneurons. Furthermore, we identified pathological changes in patient-derived induced pluripotent stem cells (iPS cells) and differentiated motoneurons, and assessed the interactome of ALS-associated proteins.
Metabolomics- Through the metabolomics analysis of cellular and whole organism models and patient material we have identified common alterations in energy metabolism and other pathways that are associated with ALS and ALS-causing mutations.
Transcriptomics- Gene expression profiles (GEPs) of ALS patients were determined and evaluated, which established several disease biomarkers. These were cross-compared with cellular and animal models and with the previously described genetic, proteomic and metabolomic models.
Exposomics- To determine which potential lifestyle and environmental factors could define ALS susceptibility or phenotype, detailed information was collected and analyzed on clinical, environmental, and lifestyle exposures in the population-based cohorts of ALS patients and matched controls.
Functional studies- We developed and applied a methodology to integrate the different data sets that were generated. This provided insight into pathological disease mechanisms in ALS. We identified biological pathways and genes that are likely involved in ALS, most notably SNARE activity.
Pillar (III) Computational model generation
To replicate the above-mentioned findings, a number of new in vitro (primary and iPS cells) and in vivo (zebrafish and mouse) models were created and used to validate the results. Ephrin signalling, HDAC6, ELP3, lipid/oxygen metabolism and DNA repair were identified as potential new therapeutic targets for ALS.

Project Context and Objectives:
The main goal of the Clinical coordination (WP2) was to generate two population-based case-control cohorts for discovery (400 patients and 400 controls), replication (400 patients and 400 controls) and validation (final expected 1600 cases and 3200 controls) and to coordinate the collection of blood samples for all ~omics studies within the Euro-MOTOR project. In detail, the objectives for WP2 were:
1) To establish a European-wide network of standardized population-based registries representing a population of about 41 million people
2) To implement a simple, standardized and flexible European-wide database linking clinical data and biological samples for the ~omics analyses and integrative analysis
3) To generate a final guideline concerning the minimal requirements for collection and storage of clinical and epidemiological data throughout Europe
4) To generate a final guideline concerning the collection and storage of biological materials
5) To generate two population-based case-control cohorts to provide the basis for the work
6) To ensure an adequate coordination concerning the distribution of biological materials.
The main goals of work package Genomics (WP3) were to generate a new prospectively collected genomics dataset, and to combine this with the collected available retrospective genomics data to achieve maximum statistical power. Also, the goal was to perform whole exome/genome sequencing and resequencing to identify and validate new ALS risk genes. The following objectives were formulated:
1) To acquire cost-effective genome-wide variation data in the prospective cohorts
2) To acquire highly powered retrospective genome-wide variation data in ALS
3) To allow for focused, high throughput modern re-sequencing based on identified targets from the workpackages
4) To identify new genes that cause familial ALS through whole exome sequencing

For a comprehensive understanding of posttranslational pathologies contributing to cellular aggregates in ALS motor neurons, quantitative proteomics technologies had to be developed in the proteomics work package (WP4). Then the technologies were further applied to investigate ALS aggregate formation and composition in primary motor neurons as well as to identify novel pathological protein-protein interactions involved in ALS. In work package Proteomics (WP4), we concentrated on four major objectives:
1) To optimize and validate techniques for patient iPS cell derived motor neurons, using mouse models and isolated primary motor neurons from mouse models.
2) To investigate ALS aggregate formation and composition through quantitative proteomic techniques using iPS derived motor neurons.
3) To search for novel proteins involved in ALS pathology by quantitative interaction analysis.
4) To perform differential transcription factor binding analysis at SNPs associated with ALS.
During the course of the project we modified objective 4. Although we initially established the pipeline for SNP analyses, promising candidate loci were not available in time. The freed resources were reallocated to other objectives and to additional collaborations in the consortium. For instance, we developed label-free approaches and extended objective 1 to include patient and control fibroblasts and differentiated motoneurons, extended the number of protein-protein interaction screens and investigated highly relevant primary neurons for the analysis of protein aggregates.

Amyotrophic lateral sclerosis has long been associated with hypermetabolism, oxidative stress and other metabolic alterations. Some germline mutations causing ALS are in metabolic genes, such as Cu/Zn-superoxide dismutase (SOD1), and metabolism may also modify the effect of other causal mutations. Metabolomics is the characterization of the small-molecule (metabolite) content of biological samples and represents an important facet of systems biology, providing critical insight into metabolic phenotype. Work package Metabolomics (WP5) provided access to expertise in small-molecule profiling using NMR spectroscopy, GC-MS and LC-MS. Through the characterisation of the model systems available as part of the project, and in particular the analysis of prospectively collected biospecimens from a large European cohort, our main objectives were:
1) To identify metabolic biomarkers associated with ALS in a prospective study
2) To identify perturbed metabolic pathways associated with ALS phenotypes by systems biology approaches to molecular profile integration
3) To evaluate the translation of metabolic pathway effects between ALS models and man
4) To evaluate the predictive ability of metabolic biomarkers for ALS in a prospective manner

Gene expression profiling (GEP), or Transcriptomics (WP6), is a powerful method for determining which genes are being expressed within particular cells or tissues. By comparing the gene expression profiles (GEPs) of disease samples with those of control samples, it provides insights into the pathophysiology of the disease, identifies biomarkers and clusters samples with similar expression patterns, without being restricted to pre-existing hypotheses. The objectives for WP6 were to:
1) Improve our understanding of the mechanisms of disease using patient samples isolated from spinal cord motor neurons (MNs), muscle or induced pluripotent stem cell (iPSC) derived motor neurons or from primary MNs derived from mouse models of ALS
2) Identify clusters of fALS and sALS with similar gene expression profiles, reflecting a common aetiology
3) Establish GEPs associated with fast and slow disease progression using GEP from lymphoblastoid cell lines (LCLs)
4) Determine if genomic variants in ALS patients’ DNA correlate with gene expression patterns seen in the blood
5) Establish biomarkers in muscle that distinguish early ALS cases from other related diseases and controls, or correlate with disease severity
6) Interrogate cellular models of new genetic variants of ALS (including TARDBP and C9orf72), to understand their mechanisms of pathobiology

Work package Exposome (WP7) used a population-based approach for identification and validation of environmental risk factors for ALS that can be translated into molecular-biological pathways. We established the quantitative exposome in population-based cohorts of 1600 patients and 3200 controls and provided biological material associated with well characterised exposome profiles that can be integrated to higher level system analysis. The objectives for WP7 were as follows:
1) To establish the quantitative exposome in population-based cohorts
2) To provide biological material associated with well characterised exposome profiles that can be integrated to higher level system analysis

For work package Functional studies (in vivo and in vitro) (WP8), the first objective was to generate new in vitro and/or in vivo models for ALS related to mutations in Ataxin-2, DCTN1, TDP-43 and FUS. Different strategies were proposed to create mouse models, primary cultures expressing mutant genes as well as differentiated motor neurons derived from iPS cells from ALS patients and controls.
The second objective was to use these different models to investigate the role of different modifiers of the ALS disease process. In addition, this work package had to provide a collection of samples of the different models to the consortium for a multilevel ~omics analysis. The third objective was to use the models generated to validate the results obtained in the other work packages and to gain better insight into the exact contribution of these hits in the context of ALS. The final goal was to propose new potential therapeutic targets that could be translated into new therapies for ALS.
The specific objectives of WP8 were:
1) Full behavioural, histopathological and molecular characterisation of mice which overexpress mutant and wild type TDP-43 and FUS.
2) Development of cell culture assays to characterize specific cellular defects in motor neurons of these mice on the cellular level. These studies include comparison of primary motor neurons from mouse models with mouse and human and ES cell derived motor neurons. This development includes techniques for enrichment of ES cell derived motor neurons so that these cells can be used in WP2-6 for ~omics work.
3) Full behavioural, histopathological and molecular characterisation of mice which conditionally overexpress mutant and wild type TDP-43 and FUS. The transgene will be expressed ubiquitously and selectively in motor neurons.
4) To undertake a full characterisation and analysis of the pathology of the mouse strains with a mutation in the endogenous TDP-43 encoding gene by in vivo investigation of their neuromuscular physiology and histopathology in order to compare this to human neurodegenerative conditions particularly ALS, in parallel with the in vitro analysis of the molecular cell biology of these mouse mutants compared to normal animals. This will include the analysis of the transcriptome of these mice, as well as the study of the effects of the mutations on the normal cellular functions of this protein in primary motor neuron cultures.
5) To characterize the phenotype of a mutant and wild type TDP-43 overexpressing zebrafish model and to study the effect of factors known to affect TDP-43 location such as PGRN. To characterize the phenotype of mutant and wildtype FUS overexpressing zebrafish.
6) To establish in vitro models for selected mutations in the dynactin P150 subunit which are found in humans (position 34, 59, 63, 196, 1049, 1248 and others)
7) To determine the phenotype of mouse models over expressing the DCTN1 63 and 196 mutation and compare it to the mutant SOD1 phenotype
8) To elucidate whether tubulin over- or underacetylation affects neurodegeneration of motor neurons and in particular the motor neuron degeneration induced by mutant SOD1
9) To elucidate the molecular basis for the increased susceptibility to ALS that is associated with low Elp3 expression
10) To elucidate the molecular basis for the increased susceptibility to ALS that we identified to be associated with UNC13A polymorphisms
11) Provide a collection of samples from motor neuronal cell model of ALS, expressing G93A SOD1, and exposed to risk factors inducing changes of energy metabolism (hypoxia), and from transgenic SOD1G93A mice and rats fully phenotypically characterized and at different stages of the disease, to the consortium for multilevel ~omics analyses
12) Validation of biomarkers based upon metabolic and oxidative stress (nitroproteins) previously identified by proteomic studies in models in vitro and in vivo
13) To characterize the role of the PHDs in motor neuron degeneration. Moreover, as PHD enzymes are druggable targets, we will also evaluate the therapeutic potential of modulating PHD activity in preclinical models of ALS.
14) To delineate the contribution of altered lipid metabolism and SCD1 to ALS. We will study the effects of SCD1 loss on motor and metabolic phenotype. The identification of upstream factors responsible for SCD1 loss will be also undertaken, in order to isolate specific causes of lipid metabolism dysfunction in ALS muscle.
15) To develop therapeutic strategies based on lipid metabolism handling and SCD1 function. We will determine whether restoring SCD1 levels in skeletal muscle by nutritional, pharmacological and genetic means is sufficient to rescue mutant SOD1 mouse hypermetabolism and delay motor neuron death.
16) To perform a morpholino-based genetic screen of the mutant SOD1 zebrafish model we described previously
17) To validate the hits (WP8-1)
18) To validate the hits identified in the newly generated mutant TDP-43 model and in a mouse model for motor neuron degeneration (mutant SOD1)
19) To explore the therapeutic significance of the hits identified in the mutant SOD1 based model for ALS in the zebrafish
20) To validate the results of WP2-6 by modelling the factors identified in the zebrafish and to study the interaction of previously known ALS-causing or -modifying genes with the newly identified factors from these WPs
21) To generate induced pluripotent stem cell lines (iPS cells) from ALS patients and non-ALS controls for the in-vitro generation of glia and motor neurons
22) To perform high throughput small molecule- and siRNA screens to identify new targets for disease intervention
23) To use a comparative analysis approach to identify novel marker genes that could aid as diagnostic tools for early detection of ALS
24) To validate the factors identified in WP2-6 by studying them in vitro using cells derived from patients with sporadic and familial ALS
25) To validate the role of factors identified in WP2 to 6 in the biology of motor neurons in culture
26) To validate the effect of these factors in the zebrafish models described

For the integration of genetic and environmental data we delivered a robust causal and disease progression model of ALS within work package ALS model generation (WP9). We determined how best to integrate and exchange data, and developed strategies for integrating different ~omics datasets. For model building we developed infrastructure for conducting quantitative trait locus (QTL) mapping, noise reduction strategies for increasing the number of detectable QTLs, and tools to identify regulators of QTLs that map to the same locus, and we reported on the eQTLs, pQTLs and mQTLs that were detected in both human and mouse data. We then modelled the genes that underlie the QTLs and assessed the properties of the identified networks in order to obtain a model of ALS disease and disease progression.
We pursued our goals through the following objectives:
1) To achieve a computational infrastructure for the multiple data modalities
2) To perform analysis of rodent/cellular multilevel ~omics discovery data and to perform analysis of human multilevel ~omics discovery data
3) To build an integrative ALS computational model for disease and disease progression in discovery data.
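As a purely illustrative sketch of the QTL mapping step mentioned for WP9, the association between a genetic variant and a quantitative molecular trait (for example an expression level, giving an eQTL) can be estimated by regressing the trait on allele dosage. The report does not state which regression model or software was used; the function name `qtl_effect` and the toy values below are assumptions made for this example only.

```python
from statistics import mean

def qtl_effect(dosages, trait):
    """Least-squares slope and Pearson r of a trait on allele dosage (0, 1 or 2)."""
    mx, my = mean(dosages), mean(trait)
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in trait)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, trait))
    slope = sxy / sxx              # change in trait per extra copy of the allele
    r = sxy / (sxx * syy) ** 0.5   # strength of the linear association
    return slope, r

# Hypothetical toy data: six samples, two per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, r = qtl_effect(dosages, expression)
```

In a real analysis each variant-trait pair would be tested this way (with covariates and multiple-testing correction), which is where the noise reduction strategies described above become relevant.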

Project Results:
Clinical coordination (WP2)
Task 1: Coordination of recruitment of prospective population based discovery and replication cohorts
Patients of the discovery and replication cohorts were chosen from well-established population-based registries of EuroMOTOR partners that are also involved in EURALS: Piemonte/Valle d’Aosta, Italy (UNITO, Beneficiary 10); Puglia, Italy (UNITO, Beneficiary 10); Lombardia, Italy (IRFMN, Beneficiary 9); Ireland (TCD, Beneficiary 8) and The Netherlands (UMCU, Beneficiary 1). All ALS patients diagnosed from 1 January 2011 onwards with a possible, probable, probable laboratory-supported, or definite El Escorial (EE) category were eligible for inclusion. For each registered case, a control individual living in the case’s geographic area was selected from the lists of general practitioners, matched on gender and age (+/- 5 years).
The recruitment of patients and controls was completed on 31 July 2015 with a final number of 1818 cases and 3254 controls, which meets the planned 1600 cases and 3200 controls specified for the discovery, replication and validation cohorts. Table 2.1 provides the specific contribution of patient samples and controls from each beneficiary. Although the total number of enrolled controls meets the requirements, some of the centers involved in the project did not individually reach the planned 2:1 ratio between controls and patients (2 paired controls for each case).
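The matching rule for controls (same gender, same geographic area, age within 5 years) can be sketched as follows. This is a minimal illustration and not the consortium's SOP: the `Person` record, the `find_control` helper and the choice of the candidate closest in age are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Person:
    pid: str
    sex: str      # "F" or "M"
    age: int      # age in years
    region: str   # registry catchment area

def find_control(case: Person, candidates: List[Person],
                 max_age_gap: int = 5) -> Optional[Person]:
    """Return the eligible candidate closest in age to the case, or None."""
    eligible = [c for c in candidates
                if c.sex == case.sex
                and c.region == case.region
                and abs(c.age - case.age) <= max_age_gap]
    # Tie-breaking by smallest age gap is an assumption, not part of the report.
    return min(eligible, key=lambda c: abs(c.age - case.age), default=None)

# Hypothetical case and general-practitioner list entries.
case = Person("case-1", "F", 62, "Piemonte")
pool = [Person("gp-1", "F", 70, "Piemonte"),   # age gap too large
        Person("gp-2", "M", 62, "Piemonte"),   # different gender
        Person("gp-3", "F", 59, "Piemonte"),   # eligible
        Person("gp-4", "F", 62, "Ireland")]    # different area
match = find_control(case, pool)
```

Calling the helper twice per case, removing each selected control from the pool, would give the planned 2 paired controls for each case.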
The prospective population discovery cohort (400 patients, 400 controls) and replication cohort (400 patients, 400 controls), with blood samples and clinical/environmental/lifestyle data collections, have been completed matching the planned numbers as specified for the discovery and replication cohorts.
Sample collection for ~omics analyses in the different laboratories/centers involved in WP2 has been completed. After preliminary tests, it was decided to focus the analyses mainly on DNA and serum, so the final collection and shipment was specifically dedicated to these biological samples.
The total number of available samples (Table 2.4) largely exceeds the planned 800 patients and 800 controls. This was due to the periodic shipments of samples from the different centers to the ~omics laboratories, the need to allow for possible loss of samples (transport, treatment) and, in particular, the need to perform a matching procedure between each patient and control according to the specific SOP (part of the total samples collected consists of unpaired samples, both for cases and controls). Table 2.2 shows in detail the total and final number of each sample type (DNA, RNA, serum, plasma, and urine) collected by each partner.
A centralized electronic database (“Progeny”) has been developed for online registration of all information collected from patients and controls (clinical/environmental/lifestyle data from questionnaires). The electronic database also facilitates queries to identify specific subsets of patients and the locations where biological samples are stored (virtual biobank). The final contribution from each beneficiary is shown in Table 2.3.

Task 2: Drafting a guideline on collection and storage of clinical data and biological samples
The goal of task 2 was to draft a guideline on the collection and storage of clinical data and biological samples, covering the minimally required clinical data that are collected and stored throughout Europe. The European and national regulatory requirements of each country related to the sharing of clinical data were to be followed and incorporated in the guideline.
The guidelines and SOPs were prepared by partner 10 and approved by the other beneficiaries involved in WP2 (“Standard Operating Procedure for Collection of Blood and Urine Samples in the Euromotor Study” and “Standard Operating Procedure for Patients/Controls Recruitment and Questionnaires Administration in the Euromotor Study”).
The final guideline concerns the minimal requirements for collection and storage of clinical and epidemiological data and the collection, storage and distribution of biological materials. Both the guideline (SOPs) and the online database (Progeny) have been developed considering the legal requirements of each participating country and the regulatory requirements related to the sharing of clinical data were followed and incorporated in the guideline.
For each registered patient (and matched control), a blood sample and a questionnaire were taken after written informed consent had been given for the collection of the material required to comply with the study protocol.
Finally, to protect human subjects and their privacy, the centralized database shows only de-identified data, which are available to members of the consortium after entering an individual login and password.

Task 3: Expand prospective cohorts to 1600 patients and 3200 controls for clinical, environmental, lifestyle data
The purpose of this task is to continue to collect clinical/environmental/lifestyle data only for the exposome workpackage (WP7), in a population-based fashion through questionnaires or telephone interviews to arrive at a cohort of 1600 patients and 3200 controls. The collection of clinical/environmental/lifestyle data from additional patients and controls has followed on continuously from the data collection for the discovery and replication cohorts by UMCU, IRFMN, UNITO and TCD.
The prospective cohorts have been expanded to a total of 1600 patients and 3200 controls for environmental/lifestyle data collection. In the third reporting period, the last cohort (800 patients and 2400 controls) for the exposome work package (WP7) was completed by the groups involved in WP2, which collected only clinical and environmental/lifestyle data through questionnaires.
The total number of questionnaires collected from controls and entered in the electronic database is slightly below the target (3097 vs 3200). This is because a proportion (4.8%) of the enrolled controls did not provide the requested data (questionnaire), and because some of the centers involved in the project did not reach the planned ratio of 2 paired controls for each case (2:1).
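The figures quoted for the control cohort can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 4.8% refers to enrolled controls (3254) who did not return a questionnaire (3097 received); that reading is an inference, but it reproduces the reported percentage.

```python
enrolled_controls = 3254   # final number of recruited controls
questionnaires = 3097      # control questionnaires entered in the database
target = 3200              # planned number of controls
cases = 1818               # final number of recruited cases

missing = enrolled_controls - questionnaires       # controls without a questionnaire
missing_pct = 100 * missing / enrolled_controls    # share of enrolled controls
shortfall = target - questionnaires                # distance from the planned target
ratio = enrolled_controls / cases                  # overall controls per case

print(f"{missing} controls ({missing_pct:.1f}%) without questionnaire")
print(f"shortfall vs target: {shortfall}; controls per case: {ratio:.2f}")
```

The overall ratio of about 1.8 controls per case is consistent with the statement that not every center reached the planned 2:1 pairing.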

Task 4: Coordination of including retrospective samples for validation purposes
The validation cohort consists of patients’ samples retrospectively collected by EuroMOTOR partners and samples from new patients and controls. The advantage of the validation cohort is the huge increase in power compared to the prospective cohorts, especially with regard to DNA samples.
In the first and the second period the collection of blood and urine samples for ~omics studies was performed for almost every case/control recruited for the discovery and replication cohorts. In addition, retrospective samples were included. The collected biological samples (DNA, RNA, serum, plasma and urine) in the different laboratories (UMCU – partner 1, IRFMN – partner 9, UNITO – partner 10, TCD – partner 8) were periodically distributed to the respective ~omics laboratories for validation.
Biological samples were shipped according to the Standard Operating Procedure for the collection and storage of blood and urine samples in EuroMOTOR. DNA and RNA samples were sent from the three registers in Italy and from Ireland to partner 1 (UMCU), where they were used for Genomics (WP3) and Transcriptomics (WP6). Serum, plasma and urine samples were sent from the three registers in Italy, from Ireland and from The Netherlands to partner 15 (ICL), where they were used for Metabolomics (WP5). Shipment of additional samples was completed in March 2015.
The final number of samples collected and shipped to UMCU and ICL for each partner involved in WP2 is shown in Table 2.4.

Genomics (WP3)
The main objectives of WP3 were:
1) To acquire cost-effective genome-wide variation data in the prospective cohorts
2) To acquire highly powered retrospective genome-wide variation data in ALS
3) To allow for focused, high throughput modern re-sequencing based on identified targets from the workpackages
4) To identify new genes that cause familial ALS through whole exome sequencing

As indicated above, we showed that ALS is a disease characterized by a rare variant architecture rather than a complex, highly polygenic disease. As a consequence, a typical GWAS in ALS will not show as much of the statistical “inflation” that indicates polygenicity, which limits the computational modeling of epistasis and systems genetics analyses. Instead, it allows for the direct identification of distinct ALS risk genes that may interact and be active in similar pathways. This is what has been analyzed in WP9. The combination of GWAS and sequencing studies allowed us to identify ALS risk genes, which have a larger effect on biological systems than the individual SNPs that typically result from GWAS in complex diseases. During the later phases of the project, we were able to assemble an adequately powered GWAS combined with sequencing data and to identify robust novel ALS risk genes, C21orf2 and NEK1, as described below.

C21orf2 and novel GWAS loci
During the previous reporting periods, using GWAS in combination with whole genome sequencing data, we identified seven ALS risk loci (4 novel), including a new ALS risk gene, C21orf2. During the final reporting period we set out to replicate these novel findings.
We directly genotyped the newly discovered associated SNPs in nine independent replication cohorts, totaling 2,579 cases and 2,767 controls. In these cohorts we replicated the signals for the C21orf2, MOBP and SCFD1 loci (combined p = 3.08 × 10^-10, p = 4.19 × 10^-10 and p = 3.45 × 10^-8 for rs75087725, rs616147 and rs10139154 respectively, Supplementary Fig. 3.4). The association for rs7813314 at 8p23.2, however, could not be replicated.
There was no evidence for residual association within each locus after conditioning on the top SNP, indicating that all risk loci are independent signals. Finally, apart from the C9orf72, UNC13A and SARM1 loci, we found no evidence for associations previously described in smaller GWAS in ALS.
The associated low-frequency non-synonymous SNP in C21orf2 suggested that this gene could directly be involved in ALS risk. Indeed, we found no evidence that linkage disequilibrium of sequenced variants beyond C21orf2 explained the association within this locus.
To find additional evidence for a direct role of C21orf2 in ALS risk we investigated the burden of rare coding mutations in this gene as observed in whole genome sequencing data from 2,562 ALS cases and 1,138 controls. After quality control these variants were tested using a pooled association test for rare variants (T5 and T1 for 5% and 1% allele frequency, respectively). This revealed an excess of non-synonymous and loss-of-function mutations among ALS cases that persists after conditioning on rs75087725 (pT5 = 9.2 × 10^-5, pT1 = 0.01), which further supports the hypothesis that C21orf2 contributes to ALS risk.

C21orf2 rare variant burden. Summary of the rare (MAF < 0.05) non-synonymous and loss-of-function mutations in the canonical transcript of C21orf2: conditioning on the SNP found to be associated in the GWAS (rs75087725, p.V58L, colored grey), there was an increased burden of non-synonymous and loss-of-function mutations among ALS cases (pT5 = 9.2 × 10^-5, pT1 = 0.01). Odds ratios (calculated by counting alleles in cases and controls per stratum, unadjusted for PCs, and combined in a Cochran-Mantel-Haenszel test) are 1.63 and 1.48 for the T5 and T1 burden respectively. The two loss-of-function mutations observed in cases are colored red.
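For readers unfamiliar with stratified odds ratios, the sketch below shows the standard Mantel-Haenszel pooled estimator that usually accompanies a Cochran-Mantel-Haenszel test. It is an illustration only: the function name and the allele counts are hypothetical and are not the C21orf2 data.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio over 2x2 allele-count tables, one per stratum.

    Each stratum is a tuple (a, b, c, d):
      a = variant alleles in cases,    b = reference alleles in cases,
      c = variant alleles in controls, d = reference alleles in controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Hypothetical allele counts for two population strata.
toy_strata = [(12, 988, 6, 994), (9, 491, 5, 495)]
pooled_or = mantel_haenszel_or(toy_strata)
```

With a single stratum the estimator reduces to the ordinary odds ratio (a × d) / (b × c); with several strata it weights each table by its size, so the pooled value lies between the stratum-specific odds ratios.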

NEK1
Rare variant burden testing (RVB) compares the combined frequency of rare variants within each gene in a case-control cohort. Candidate associations are identified by significant differences after multiple test correction. We have assembled an exome sequencing dataset for 1,376 index FALS cases and 13,883 controls. Of these, 1,028 cases and 7,938 controls met all required data, inter-relatedness and ancestral quality control criteria. Within the case cohort, 881 FALS cases were devoid of rare variants in known ALS genes.
The successful detection of disease associations through RVB can depend heavily on the appropriate setting of test parameters. Since genetic loci often contain many alleles of no or low effect, prior filtering of variants based on minor allele frequency (MAF) and pathogenicity predictors can reveal disease signatures otherwise masked by normal human variability. As appropriate MAF or pathogenicity predictor settings may not be obvious in advance, comprehensive assessment of all pursuable analysis strategies is desirable but can in turn introduce excessive multiple test burden. To overcome these limitations, we performed 308 distinct RVB analyses of 10 well-established ALS genes using 44 functional and 7 MAF filters (Fig. 3.1a).
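The idea behind a burden test can be sketched in a simplified form: collapse all qualifying rare variants in a gene into a per-sample carrier status, then compare carrier frequencies between cases and controls. The example below uses a one-sided Fisher's exact test as a stand-in; it does not include the corrections for gene coverage and ancestral covariates that the actual analyses applied, and all names and data are hypothetical.

```python
from math import comb

def count_carriers(sample_variants, qualifying):
    """Number of samples carrying at least one qualifying variant in the gene."""
    return sum(1 for variants in sample_variants if qualifying & set(variants))

def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """P(at least this many carriers among cases) under the hypergeometric null."""
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    tail = 0
    for k in range(case_carriers, min(carriers, case_total) + 1):
        tail += comb(carriers, k) * comb(total - carriers, case_total - k)
    return tail / comb(total, case_total)

# Hypothetical toy cohort: each entry lists the rare variants seen in one sample.
qualifying = {"v1", "v3"}   # variants that pass the MAF and functional filters
case_samples = [["v1"], [], ["v2", "v3"], ["v1"]]
control_samples = [[], ["v2"], [], [], ["v3"], []]

p_value = fisher_one_sided(count_carriers(case_samples, qualifying), len(case_samples),
                           count_carriers(control_samples, qualifying), len(control_samples))
```

Repeating this per gene yields one p-value per gene, which is why multiple test correction is needed before declaring a candidate association.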
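The number of analyses follows from crossing every functional filter with every MAF filter (44 × 7 = 308). A minimal sketch of such a parameter grid is given below; the filter names are placeholders, since the report does not list the individual filters.

```python
from itertools import product

# Placeholder labels: the actual 44 functional and 7 MAF filters are not listed here.
functional_filters = [f"functional_{i:02d}" for i in range(1, 45)]
maf_filters = [f"maf_{i}" for i in range(1, 8)]

# One burden analysis per (functional filter, MAF filter) combination.
analyses = list(product(functional_filters, maf_filters))
print(len(analyses))
```

Restricting this exhaustive search to a small set of established ALS genes keeps the multiple test burden manageable before the best-performing setting is applied exome-wide.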
All tests included correction for gene coverage and ancestral covariates. Tests differed in their capacity to detect individual known ALS genes; however, the highest net sensitivity was achieved when analyses were restricted to variants with MAF < 0.001 and functional classifications of either nonsense, splice altering [4] or FATHMM deleterious [5]. Under these settings, 4 genes exhibited disease association at exome-wide (Bonferroni-corrected, P < 2.5 × 10^-6) significance (SOD1, TARDBP, FUS, UBQLN2), 3 achieved near exome-wide significance (TUBA4A, TBK1, VCP), and 3 displayed modest to nominal disease association (PFN1, VAPB, OPTN) (Fig. 3.1b). Genes exhibiting the strongest disease associations included those reported as major ALS genes in population-based studies, while those exhibiting weaker associations are believed to constitute rarer causes of disease.
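The exome-wide threshold of 2.5 × 10^-6 is what a Bonferroni correction gives for a family-wise error rate of 0.05 across roughly 20,000 genes. The gene count is an assumption made here to reproduce the threshold; the report does not state the exact number of genes tested.

```python
alpha = 0.05        # family-wise error rate
n_genes = 20_000    # assumed number of protein-coding genes tested

threshold = alpha / n_genes   # Bonferroni-corrected per-gene threshold

def is_exome_wide_significant(p_value: float) -> bool:
    """True if a gene-level p-value passes the Bonferroni threshold."""
    return p_value < threshold

print(threshold)
print(is_exome_wide_significant(5.3e-7))   # p-value reported for NEK1
print(is_exome_wide_significant(1e-4))     # hypothetical weaker signal
```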
Extension of the optimal known ALS gene parameters to all protein coding genes revealed one novel gene displaying exome-wide significant disease association (Fig. 3.1b).
The gene, NEK1 (OR = 7.6, P = 5.3 × 10^-7), encodes the serine/threonine kinase NIMA (never in mitosis gene-A) related kinase. Retesting of NEK1 under alternate analysis parameters revealed strong disease associations across most analysis strategies, particularly where loss of function (LOF, nonsense and predicted splice altering) variants were included. Introduction of higher stringency variant, genotype and ancestral control did not alter the association (P = 3.4 × 10^-7). No evidence was observed for systematic genomic inflation (λ = 1.05) or confounding related to sample ascertainment or case-control biases in NEK1 gene coverage.
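The genomic inflation factor λ is conventionally computed as the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null (about 0.455). The sketch below assumes this conventional definition and works from p-values using only the Python standard library; the input values are synthetic.

```python
from statistics import NormalDist, median

def genomic_inflation(p_values):
    """Median observed chi-square (1 df) over the expected null median.

    Each p-value must lie in (0, 1]; values that underflow to 0 are not supported.
    """
    normal = NormalDist()
    # A two-sided p-value p corresponds to chi-square = z^2 with z = inv_cdf(p / 2).
    chi2 = [normal.inv_cdf(p / 2) ** 2 for p in p_values]
    null_median = normal.inv_cdf(0.75) ** 2   # median of chi-square with 1 df
    return median(chi2) / null_median

# Synthetic example: evenly spaced p-values mimic a well-calibrated test.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(uniform_p)
```

Values close to 1, such as the reported λ = 1.05, indicate that test statistics are not systematically inflated by stratification or technical artefacts.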
In an independent line of research, whole genome sequencing was performed for 4 ALS patients from an isolated community in the Netherlands (population<25,000). High inbreeding coefficients were observed for each of the 4 patients confirming their high degree of relatedness.
Autozygosity mapping, allowing for genetic heterogeneity, identified 4 candidate disease variants occurring within detectable runs of homozygosity (ROH) (Fig. 3.2).
These variants included a p.R261H mutation of NEK1, the only non-synonymous variant in the exomes that passed our criteria for highly related samples (variants with a MAF below 0.01 that are both shared by all four patients and have multiple homozygous genotypes). In the ExAC database, no homozygosity is reported for this variant in 59,526 subjects. Two of the 4 SALS cases were homozygous for p.R261H while 2 were heterozygous, raising the possibility that p.R261H might represent a risk factor with additive as opposed to truly recessive risk effects. None of the other 3 candidate variants occurred in more than 2 patients. Analysis of the region revealed a shared p.R261H haplotype spanning 5.5 Mb in all 4 samples.
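Autozygosity mapping rests on finding runs of homozygosity (ROH) in each sample. The following minimal sketch shows the core idea on toy data. It is deliberately simplified: unlike dedicated tools, it tolerates no heterozygous or missing calls inside a run, and the thresholds, function name and genotypes are hypothetical rather than taken from the project.

```python
def runs_of_homozygosity(genotypes, positions, min_snps=3, min_length_bp=1_000_000):
    """Return (start_bp, end_bp, n_snps) for each stretch of consecutive homozygous calls.

    genotypes: alternate-allele counts per marker (0, 1 or 2); 1 means heterozygous.
    positions: base-pair coordinates in ascending order, same length as genotypes.
    """
    runs, start = [], None
    for i, g in enumerate(genotypes):
        if g != 1:                      # homozygous call extends or opens a run
            if start is None:
                start = i
        elif start is not None:         # heterozygous call closes the current run
            runs.append((start, i - 1))
            start = None
    if start is not None:               # run reaching the last marker
        runs.append((start, len(genotypes) - 1))
    return [
        (positions[a], positions[b], b - a + 1)
        for a, b in runs
        if b - a + 1 >= min_snps and positions[b] - positions[a] >= min_length_bp
    ]


# Toy data: ten markers spaced 500 kb apart.
genotypes = [0, 2, 2, 1, 2, 2, 2, 2, 1, 0]
positions = [i * 500_000 for i in range(10)]
print(runs_of_homozygosity(genotypes, positions))
# [(0, 1000000, 3), (2000000, 3500000, 4)]
```

Candidate variants would then be restricted to those falling inside runs detected in the affected individuals.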
To validate the risk effects of p.R261H we tested for disease association among 6,172 SALS cases and 4,417 matched controls from 8 countries. This cohort was either genotyped using the Illumina exome chip or whole genome sequenced, which allowed us to check for overlap with, or detectable relatedness to, the FALS case-control cohort; none was found. Meta-analysis of all independent population strata revealed a clear minor allele excess in cases, with a combined significance of P=3.5x10^-5 and OR=2.5. Disease association was also observed within the FALS case-control data (OR=2.6, P=2.4x10^-3) and in a meta-analysis of FALS, SALS and all controls combined (OR=2.5, P=1.2x10^-7).
DNA availability facilitated segregation analysis of only one NEK1 LOF mutation, a p.R550X mutation which was also detected in the affected mother of the identified proband. To validate the effect of LOF mutations observed in FALS and assess any potential contribution to sporadic disease, we analysed full sequencing data of the NEK1 coding region, available for 2,339 SALS cases and 1,072 controls. RVB confirmed a significant excess of LOF variants in cases (OR=22.0, P=1.7x10^-4). Meta-analysis of discovery and replication LOF analyses yielded a combined significance of P=1.5x10^-8 and OR=8.1.
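The report does not state which meta-analysis method was used to combine strata. One common choice is a fixed-effect, inverse-variance weighted combination of log odds ratios, sketched below under that assumption. The input odds ratios and standard errors are invented for illustration and are not project data.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(odds_ratios, standard_errors):
    """Inverse-variance weighted combination of per-stratum log odds ratios.

    standard_errors are those of the log odds ratios.
    Returns (combined_or, combined_se_of_log_or, two_sided_p).
    """
    weights = [1 / se ** 2 for se in standard_errors]
    log_ors = [math.log(o) for o in odds_ratios]
    combined_log_or = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    combined_se = math.sqrt(1 / sum(weights))
    z = combined_log_or / combined_se
    # Normal approximation; loses precision for extremely small p-values.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return math.exp(combined_log_or), combined_se, p


# Hypothetical strata (not values from the report).
combined_or, combined_se, p_value = fixed_effect_meta([2.0, 3.0], [0.4, 0.5])
print(f"OR={combined_or:.2f}, SE(log OR)={combined_se:.3f}, P={p_value:.2g}")
```

Strata with smaller standard errors receive more weight, so large cohorts dominate the combined estimate.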
Overall summary of final reporting period WP3: Extensive replication through direct genotyping of specific genetic variants and whole genic burden analyses in large follow-up cohorts has now established that C21orf2 and NEK1 are novel ALS risk genes. Interestingly, there is ample biochemical evidence that both proteins interact and are within the same biological pathway. Also, out of 3 additional novel GWAS loci, two replicated and are robust to case-control status; in total, there are now 4 GWAS loci besides the fine-mapped C9orf72 and C21orf2 loci.

Proteomics (WP4)
For the proteomics analysis (WP 4) the major objectives and results were:

Optimize and validate techniques for patient iPS cell derived motor neurons, using mouse models and isolated primary motor neurons from mouse models
We established a label free proteomics pipeline to investigate human cellular material without the need of metabolic labeling or introducing reference standards for accurate quantification (Cox et al., 2014). The reduced complexity (circumventing the combination of two samples, heavy and light) enables us to acquire much “deeper” proteomes (up to 6000 instead of 3000 proteins). Moreover, this approach allows multiple comparisons of cellular systems. A paper has been published (Hornburg et al., 2014) characterizing motor neurons and model cell lines in mice with Sendtner et al. (partner 7). The key findings of the proteomics analysis of primary motor neurons and motor neuronal cell lines are: (1) Individual protein and pathway analysis indicate substantial differences between motor neuron-like cell lines and primary motor neurons, especially for proteins involved in differentiation, cytoskeleton and receptor signalling, whereas common metabolic pathways were more similar. (2) The ALS-associated proteins themselves also showed distinct differences between cell lines and primary motor neurons, providing a molecular basis for understanding fundamental alterations between cell lines and neurons with respect to neuronal pathways with relevance for disease mechanisms. Overall, neuroblastoma cell lines are limited in their suitability to serve as a model system for motor neurons to study ALS relevant functional mechanisms, as they neither reconstitute global motor neuronal properties nor faithfully mimic motor neurons on the individual protein level. However, they retain neuronal character in comparison to other cell lines rendering them a valid choice for some biological investigations and high throughput screens.
We acquired quantitative proteomics data on 10 different fibroblast cell lines (5 ALS cases, 5 controls), 16 iPSC lines (8 ALS cases, 8 controls) and 9 differentiated motoneurons (5 ALS cases, 4 controls). All ALS cases harbor an ATXN2 mutation. We identified actin-binding protein TWF2 significantly downregulated in patient derived differentiated motoneurons. In line with this finding, we observed significant changes for annotations associated with the cytoskeleton. These findings strongly suggest an impairment of the cytoskeleton in motoneurons harboring ATXN2 mutations.

Investigate ALS aggregate formation and composition through quantitative proteomic techniques using primary motor neurons.
Aggregates are important pathological hallmarks in neurodegeneration. So far, the exact content of aggregates is unknown. Our goal was to develop a quantitative proteomics approach to determine the aggregate composition in the context of ALS. Standard proteome workflows allow the quantification of thousands of soluble proteins. Aggregation prone proteins tend to be insoluble and cannot be processed using the same protocol. Insolubility and protease resistant characteristics require adjusted sample preparation methods and altered bioinformatics.
Initially we proposed to use aggregates from iPS cells as biological starting material; however, the relevant model/source of aggregates was not available within the consortium as planned. We therefore established external collaborations in order to obtain suitable biological material. The cell culture model system (NSC34) that WP4 employs for determining the ALS interactome does not cause prominent protein aggregation but instead allows us to focus on interactions with soluble proteins. To investigate ALS relevant aggregate species in primary neurons, we teamed up with the group of Dieter Edbauer (LMU), who discovered novel c9orf72 derived aggregates.
We investigated poly dipeptide aggregates associated with ALS linked c9orf72 hexanucleotide expansions (Mori et al., 2013) and mapped the aggregate composition to identify novel proteins involved in ALS pathology. The majority of the proteins were involved in the ubiquitin/proteasome machinery. Furthermore, we identified and validated a novel disease mechanism in ALS by showing that Unc119 is sequestered into the aggregates whereas its overexpression rescues primary neurons from aggregate toxicity. We could confirm a loss of function by sequestration for Unc119 in primary neurons and similar deposition patterns in ALS patient brains. This highly cited study has been published in Acta Neuropathologica (May et al., 2014).

Search for novel proteins involved in ALS pathology by quantitative interaction analysis
Several proteins (such as SOD1, DCTN1, OPTN, FUS, TDP etc.) have been identified as being involved in ALS. We performed quantitative protein-protein interaction screens using these proteins as baits to identify novel interaction partners. The resulting candidates can be tested in functional models described in WP8 and will be modelled into pathways in WP9.
The proteomics workflow, which initially was SILAC based, was further improved and now allows a more comprehensive understanding of the underlying interaction network by multiple comparisons, higher depth and the possibility of supplementation with pull downs from tissue expressing epitope tagged ALS associated proteins.
We compared the interactomes of 3x 6 ALS associated wild type proteins and 3x 19 ALS associated mutant proteins against GFP and empty vector controls in the motoneuronal cell line NSC-34. Overall we were able to quantify more than 2000 proteins. Due to the quantitative nature of our LC-MS approach, we identified interacting proteins for the investigated baits that were significantly enriched against the background. We were able to validate the approach by identifying known interaction partners such as Tbk1 (Optn) or the Dctn family (Dctn1). In addition, our interaction screen revealed many so far unknown interacting proteins, for instance the splicing associated factor Snrpc or the Golgi associated protein Vps52 in the case of Optn. Our screen in NSC34 cells revealed common interaction partners for ALS associated mutant and WT proteins, allowing us to depict the overlaps in the ALS interactome.
Following up on this approach, we are currently establishing and applying a protein-protein interaction protocol that is suitable to investigate tagged proteins in vivo in collaboration with Sendtner et al. (partner 7). This will allow us to investigate highly relevant mouse models expressing the epitope tagged ALS-associated proteins of interest as an addition to the EuroMOTOR project.

Perform differential transcription factor binding analysis at SNPs associated with ALS
We have developed a quantitative mass spectrometry based pipeline to identify differential transcription factor binding at SNPs (Butter et al., 2012). To facilitate throughput, we have established a desthiobiotin linker system, which allows specific elution by biotin, reducing the complexity for MS analysis.
As of July 2014 there were only 2 SNP signals in ALS genetics, as reported extensively in WP3. The insight emerging from the ongoing ALS genetic studies is that common SNPs are not important as causative variants but are instead tags for rare variants nearby, which is in sharp contrast to many other diseases.
Since GWAS did not reveal promising SNP candidates, and SNPs are not causal per se in ALS (but tags for rare variants nearby), we were able to initiate and pursue additional projects within the consortium. For instance, in close collaboration with partner 7 we evaluated motorneuronal model systems with quantitative proteomics. This study not only provides the first unbiased and comprehensive characterization of two widely used neuronal cell lines (NSC34, N2a) but also presents the first proteome of primary motorneurons (Hornburg et al., 2014). We also further advanced the approaches for investigating insoluble aggregates in primary neurons (May et al., 2014) and increased the number of screened candidates. Together with partner 7 (UKW) we performed functional studies on the c9orf72 protein. The unique synergism with EuroMOTOR allowed us to propose a novel pathological mechanism of c9orf72 in ALS (Nature Neuroscience, under revision).

Metabolomics (WP5)
Defects in energy metabolism are potential pathogenic mechanisms in amyotrophic lateral sclerosis (ALS), a rapidly fatal disease with no cure. The mechanisms through which this occurs remain elusive and their understanding may prove therapeutically useful. We used metabolomics and stable isotope tracers to examine metabolic changes in a well-characterized cell model of familial ALS, the motor neuronal NSC-34 line stably expressing human wild-type Cu/Zn superoxide dismutase (SOD1wt) or the mutant G93A (SOD1G93A). Our findings (published in Valbuena et al. Molecular Neurobiology 2015) indicate that SOD1wt and SOD1G93A expression both enhanced glucose metabolism under serum deprivation. However, in wtSOD1 cells this phenotype increased supply of amino acids for protein and glutathione synthesis, while in G93ASOD1 cells it was associated with death, aerobic glycolysis and a broad dysregulation of amino acid homeostasis (Figure 5.1). Aerobic glycolysis was mainly due to induction of pyruvate dehydrogenase kinase 1. Our study thus provides novel insight into the role of deranged energy metabolism as a cause of poor adaptation to stress and a promoter of neural cell damage in the presence of mutant SOD1. Furthermore the metabolic alterations we report may help explain why mitochondrial dysfunction and impairment of the endoplasmic reticulum stress response are frequently seen in ALS.
NSC-34 cells and primary motor neurons share similar metabolic pathways, but results obtained in this single proliferating cell line have limited functional relevance because it cannot reproduce the cooperative metabolic processes between the different primary cell types in the nervous system. Neurons and surrounding astrocytes markedly differ in their metabolic functions and are organized as a functional unit with highly complex metabolic cross-talk. Several studies have indicated that ALS is a non cell-autonomous disease and in particular that astrocytes are likely to take part in motor neuron injury and contribute to disease progression.
To determine whether the expression of mutant SOD1 protein changed the metabolic interactions between astrocytes and spinal neurons and whether this might have a role in the progressive loss of motor neuron viability, we conducted metabolic profiling of a co-culture system made from astrocytes and spinal neurons obtained from SOD1-G93A mouse embryos. In this system primary motor neurons isolated from mutant SOD1-G93A embryos do not die when cultured alone but become selectively vulnerable if co-cultured with transgenic astrocytes (Tortarolo et al., 2015). Specifically, 1H NMR spectroscopy was used to analyse media from astrocyte-spinal neuron co-cultures, as well as from astrocytes in single culture.
Clear metabolic differences were observed based on genotype, which were dominated by the presence of SOD1G93A in astrocytes after 3 days culture and by the presence of SOD1G93A neurons after 6 days culture (Figure 5.2). Glucose uptake was increased with SOD1G93A, but while co-cultures with SOD1G93A neurons had lower lactate release, those with SOD1G93A astrocytes exhibited the reverse. Altered utilization of branched-chain amino acids was observed in co-cultures with SOD1G93A neurons alongside reduced release of glutamine and glutamate with both SOD1G93A astrocytes and neurons. Changes in TCA cycle intermediates and ketone bodies were also seen. Overall, the differing metabolic responses in the two cell types highlight the contribution of the astrocyte-motor neuron interaction in the resulting metabolic phenotype, requiring further examination of the specific metabolic changes that arise and their impact on motor neuron survival.
The progression of amyotrophic lateral sclerosis is highly variable between patients. However, we continue to be unable to predict the severity of disease in patients or identify the factors that contribute to the rate of progression. In one model of variable progression, SOD1G93A mice on the C57 and 129S genetic backgrounds have different rates of disease progression and lifespan that occur independently of transgene copy number or mutant SOD1 burden in the central nervous system. Alterations to metabolism in the thoracic and lumbar spinal cord from the slower progressing C57-G93A mice and the more rapidly progressing 129S-G93A mice at the presymptomatic, onset, and late stages were analysed by Gas Chromatography-Mass Spectrometry (GC-MS). Distinct metabolic responses to mutant SOD1 were found in the thoracic spinal cord, differentiating the two strains at the presymptomatic stage (Figure 5.3). Changes were less clear at onset, and were driven by a common effect of the SOD1 transgene at late stage. Metabolites with significant differences in the thoracic spinal cord function in neurotransmission, energy production, and antioxidant activity. Metabolic differences were less pronounced in the lumbar spinal cord, with differences at each timepoint dominated by SOD1 genotype rather than the strain. Alterations to levels of neurotransmitter metabolites were also observed, along with changes to nucleotide levels. These results indicate that the variation in the metabolic responses to ALS mutations is evident in early life, and that these early responses may play a key role in determining the severity of the disease phenotype.
While SOD1G93A is the most widely studied model of ALS, hexanucleotide repeat expansions in the C9ORF72 gene have been identified as the most common genetic cause of ALS and provide a previously unspecified link with cases of frontotemporal dementia. However, the function of the C9orf72 protein continues to be unclear, and while the repeat expansion has been shown to lead to the formation of dipeptide repeat proteins, a definitive pathogenic mechanism for the disease remains elusive. To examine potential impacts of the disease on metabolic processes in the CNS, we employed a metabolic profiling approach using GC-MS on cerebellum, frontal cortex, motor cortex, and thoracic spinal cord tissue from C9ORF72 ALS patients and compared profiles to tissue from sporadic ALS cases (sALS) and control individuals. We observed a characteristic metabolic profile for C9ORF72 ALS patients in the cerebellum after principal components analysis (Figure 5.4). Significant increases in amino acids and carboxylic acids as well as metabolites linked to energy metabolism, including lactate, were observed in C9ORF72 ALS compared to control and sALS. Weaker separation of C9ORF72 ALS cases in the first principal component was seen in frontal cortex tissue, while a small difference in the second principal component was observed for sALS cases compared to both control and C9ORF72 ALS. Our results suggest that the repeat expansion may lead to metabolic changes that contribute to the distinct profile observed in the cerebellum. Improved understanding of the metabolic alterations leading to the characteristic metabolomic phenotype observed may provide new insights on the processes involved in C9ORF72 ALS pathogenesis.
While these studies provide important new insight into the cellular metabolic pathways altered in ALS, we still understand little about metabolic changes associated with the disease at the systemic level. Minimally-invasive molecular biomarkers for ALS detectable in blood or urine could support diagnosis, prognosis, patient stratification or act as surrogate response markers in clinical trials. To define a metabolic signature of ALS in blood serum we used a targeted LC-MS/MS based metabolomics platform to profile >1600 samples from ALS patients and controls recruited from ALS registries in three different countries (Ireland, Italy and the Netherlands). The metabolomics analysis generated ~180 measurements which were subjected to multivariate analysis. From a discovery set of 808 samples (435 ALS cases) an orthogonal partial-least squares discriminant analysis (OPLS-DA) model was constructed to classify cases from controls. The predictive ability of the model was evaluated with a distinct replication set of 806 samples (434 ALS cases) (Figure 5.5). Overall the model could classify discovery set samples with 83% accuracy. The accuracy was similar for both cases and controls and the positive predictive value (PPV, the chance of having ALS given a positive test result) was 83.8%. The classification performance for the replication set was very comparable to that of the discovery set, with an overall accuracy of 80% and a PPV of 81.2%. This indicated that the model was not overfitted to the discovery set and that the predictive model is valid. No difference in classification success was observed between patients from different countries. The classification performance in the replication set translates to this metabolomic blood test having a positive likelihood ratio of ~4, i.e. a positive test result from the model increases the odds that a patient has ALS roughly fourfold.
From the model a set of metabolites has been identified that can potentially serve as an ALS biomarker panel for evaluation in future prospective studies.
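The likelihood ratio of about 4 quoted for the replication set can be related to accuracy and PPV with a short calculation. The sketch below assumes that sensitivity and specificity both equal the reported overall accuracy of 80% (the report states only that accuracy was similar for cases and controls) and uses the case fraction of the replication set as the pre-test probability.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = P(test positive | disease) / P(test positive | no disease)."""
    return sensitivity / (1 - specificity)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability through the odds form of Bayes' rule."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Assumption: sensitivity = specificity = 0.80 (the reported replication accuracy).
lr_plus = positive_likelihood_ratio(0.80, 0.80)
# Pre-test probability taken as the case fraction in the replication set (434 of 806).
ppv_estimate = post_test_probability(434 / 806, lr_plus)
print(round(lr_plus, 2), round(ppv_estimate, 3))  # 4.0 0.824
```

Under these assumptions the estimated PPV of about 82% is close to the reported 81.2%. In a population where ALS is rare the same likelihood ratio would yield a much lower post-test probability, which is why prospective evaluation matters.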

Transcriptomics (WP6)
The objectives and findings of the transcriptomics comprised:

To improve our understanding of the mechanisms of disease using patient samples isolated from a) spinal cord motor neurons (MNs), b) muscle or c) induced pluripotent stem cell (iPSC) derived MNs
a: GEP (Gene Expression Profiles) of Spinal Cord MNs
GEPs have been generated from a large cohort of post-mortem samples, including from 11 cases with mutations in known ALS genes C9orf72 and FUS, 11 SALS and 16 control samples. (GEPs from SOD1 and CHMP2B-ALS cases had been obtained previously.) Weighted gene correlation network analysis (WGCNA) of the differentially expressed genes in isolated MNs from C9orf72-ALS cases compared to controls highlighted that genes involved in RNA splicing were significantly enriched (p=7.45E-04) (Cooper-Knock et al, 2015, PLoS One 10: e0127376). Of these genes, 58.2% showed an increased expression, potentially reflecting the sequestration of RNA splicing factors which has been seen in C9orf72-ALS. In addition, there was a significant decrease in expression of genes involved in the cholesterol biosynthetic process (p<0.001) in C9orf72-ALS MNs. This is interesting given that increased endoplasmic reticulum (ER) stress has been implicated in ALS and is known to have a negative impact on cholesterol synthesis.
In contrast, differentially expressed genes in FUS-ALS MNs compared to controls were significantly enriched for genes related to axonal guidance, followed by focal adhesion. Comparing these enriched KEGG pathways with those of SOD1 (ECM-receptor interactions, focal adhesion) and CHMP2B (vesicle transport, autophagy, axon guidance) demonstrates that whilst there are some common pathogenic mechanisms occurring within the spinal cord MNs, there are also specific gene expression changes associated with the genetic sub-type. As well as improving our understanding of how these mutations cause motor neuronal cell death, these similarities and differences need to be taken into consideration when designing future therapeutic strategies.
b: GEP of Muscle
GEPs have been generated from muscle obtained from SALS cases, frontotemporal dementia (FTD), ALS/FTD, fibromyalgia, other neurodegenerative diseases (NDD: Parkinson’s, Alzheimer’s), myopathy, spinal muscular atrophy (SMA) and spinal bulbar muscular atrophy (SBMA) patients as well as controls. Further details of the results of this work are described under Objective 5: To establish biomarkers in muscle that distinguish early ALS cases from other related diseases and controls, or correlate with disease severity.
c: GEP of iPSC-derived MNs
As technology has evolved and costs reduced during the course of the Euro-MOTOR project, RNA sequencing, rather than microarrays, was used to obtain transcriptional profiles from iPSC-derived MNs from ALS patients with an intermediate expansion in ataxin 2 (ATXN2). CAG nucleotide repeats of 27-33, encoding the amino acid glutamine (Q), are a risk factor for ALS and the aim was to further our understanding of how ATXN2-polyQ contributes to ALS disease pathogenesis. RNA sequencing identified 142 differentially expressed genes in the ALS-polyQ samples, compared to control samples, of which genes involved in nucleosome assembly were the most significantly enriched biological process. Since ATXN2 plays a role in RNA processing and RNA sequencing allows multiple RNA molecules to be sequenced – not just messenger RNAs that encode proteins – the data allowed differential expression of further species of RNA, such as long non-coding RNAs (lncRNAs) to be investigated. This identified 82 differentially expressed lncRNAs in the ATXN2-polyQ samples, compared to controls.
It was important to establish whether these changes were specific to the MNs or were also seen in the cells from which the MNs are derived. RNA sequencing was performed on the original fibroblast cell lines from each of the ATXN2-polyQ cases and controls, as well as the iPSCs from which the MNs were derived. Results established that whilst there were genes differentially expressed between the ATXN2-polyQ cases and controls in each of these cell types, only one gene was consistently differentially expressed in all three cell types. Therefore, the dysregulation observed in nucleosome-related genes and in RNA processing is specific to the MNs, and these models will be used to further understand the contribution of ATXN2-polyQ to ALS.

To identify clusters of fALS and sALS with similar gene expression profiles, reflecting a common aetiology
Firstly, whole blood GEPs from 233 ALS cases and 508 controls were generated in order to identify potential biomarkers for the diagnosis of ALS. Comparative analysis identified 106 differentially expressed genes that could distinguish between ALS and controls. An additional set of 114 ALS and 87 controls supported the use of these 106 genes. To validate this prediction model, a set of 50 ALS cases and 50 controls, not previously included in the analyses, was used; 47 (94%) of the ALS patients were correctly classified as ALS whilst 35 (70%) of the controls were correctly classified as controls. Thus, blood GEPs showed a moderate ability to discriminate ALS from controls, though this is currently not sufficiently robust for a clinical screening protocol.
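The validation counts above (47 of 50 ALS cases and 35 of 50 controls correctly classified) determine the usual classification metrics directly. The short sketch below derives them; the function name and dictionary keys are illustrative choices.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics from the four cells of a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),   # cases correctly called ALS
        "specificity": tn / (tn + fp),   # controls correctly called control
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),           # positive calls that are true cases
    }


# Validation set from the text: 47/50 cases and 35/50 controls classified correctly.
metrics = classification_metrics(tp=47, fn=3, tn=35, fp=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The 15 misclassified controls give a false positive rate of 30%, which illustrates why the signature is described as insufficiently robust for clinical screening.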
GEPs from blood of ALS cases and controls were subsequently used to investigate whether DNA sequence variations in these samples correlated with levels of gene expression (termed expression quantitative trait loci, eQTLs). Please see objective 4 for further information.
Secondly, during the first year of the project, it was established that a GGGGCC repeat expansion in intron 1 of C9orf72 was a major cause of fALS and also contributed towards sALS. Cases positive for the C9orf72 expansion within the cohort of LCLs were identified and GEPs from these cases were compared to those from non-C9orf72 SALS cases and controls, in order to begin to elucidate the pathogenic mechanisms associated with the repeat expansion. We established that whilst there were enriched functional groups of differentially expressed genes in common between C9orf72 and non-C9orf72 cases compared to controls (including DNA metabolism and ribonucleoprotein complex biogenesis), those genes differentially expressed specifically in C9orf72-ALS were significantly enriched for genes involved in RNA processing and structural constituents of the ribosome. Some of these alterations also correlated with proteins that were found to bind to the repeat sequence (Cooper-Knock et al, 2014, Brain 137:2040-2051). It was therefore clear that the repeat expansion dysregulated RNA processing more so than in sALS and this was supported by analysis of RNA splicing within these cases. Comparative analyses showed that there was a reduction in the consistency of splicing in C9orf72 cases, compared to non-C9orf72 SALS and controls, and that this was greater in C9orf72 cases with a short disease duration (<2yrs) than in cases with a longer duration (>4yrs) (Cooper-Knock et al, 2015, PLoS One 10: e0127376).
Several of the fALS cases had known mutations in SOD1, TARDBP and FUS, the next most common genetic causes of ALS after C9orf72. Gene signatures from each of these three genetic variants were determined and used to cluster the uncharacterized fALS and SALS cases, to establish if there were any other mutations in known genes within the cohort. Four uncharacterized FALS cases were found to group with the SOD1-ALS cases. However, sequencing of the SOD1 gene did not identify any further mutations. No other significant clusters within the uncharacterized fALS cases were identified.
A large cohort of LCLs from over 200 SALS cases and 100 controls was used to identify those genes differentially expressed in SALS cases. Comparing the SALS GEPs with control GEPs established a group of 451 genes that were differentially expressed, and hierarchical clustering revealed 9 significant clusters of sALS patients among the cohort. However, none of these clusters were found to closely correlate with clinical phenotypes, such as age of onset, site of onset, response to riluzole or disease progression rate. Of the 9 clusters identified, in one cluster 5 of the 7 cases were riluzole naïve when the blood sample was taken, whilst another cluster had a tight disease duration time of 2.36yrs (+/-0.74yrs) despite containing cases with multiple different sites of onset (bulbar, limb and mixed). This did suggest, however, that levels of expression of particular genes may correlate with disease progression (see following section).

To establish GEPs associated with fast and slow disease progression (using GEP from LCLs)
Due to the heterogeneous nature of ALS, and the evident resulting variability in gene expression, the initial comparative analysis of LCL GEPs from individuals with fast and slow disease progression was performed on C9orf72-ALS cases. Partner 14 identified 628 differentially expressed genes in LCLs between fast (<2yrs) and slow (>4yrs) disease progression; 301 were increased in the slow survival cases and enriched genes were involved in regulation of transcription, negative regulation of apoptosis and vesicle mediated transport whilst 327 genes were increased in the fast survival cases and were enriched for genes involved in mitosis, cellular response to DNA damage and Golgi vesicle transport. This suggests that those individuals able to inhibit induction of apoptosis and mitosis, which in MNs would lead to cell death, survive longer.
A comparison of fast (<2yrs) and slow (>5yrs) disease progression was subsequently performed with the sALS cases and identified 496 differentially expressed genes, again showing enrichment of mitotic and cell cycle genes as increased in fast progressing disease. Hierarchical clustering of the data also identified a group of 8 sALS cases among the short survival group. Comparing back to the entire SALS cohort, these 8 cases actually grouped with a further 8 samples, and were distinct from both the rest of the SALS cohort and controls. The group also had a high incidence of bulbar onset (9/16). Interestingly, several ALS genes were seen to be differentially expressed within this group. In order to establish if there were any known or novel genetic variants within these samples that underlay the clustering, exome sequencing was performed and identified several hundred common variants amongst the samples. Further work is underway to evaluate, filter and validate the effect(s) of key variants.

To determine if genomic variants in ALS patients DNA correlate with gene expression patterns seen in the blood
The whole blood GEPs from 323 sALS and 413 controls were combined with the genome wide association study (GWAS) data obtained from the corresponding DNA samples within WP3 Genomics, in order to identify DNA variations that associated with levels of gene expression. Subsequently, this list was refined to those variants that were also associated more frequently with ALS. We identified 12 variants (eQTLs) that were not only found to be associated with ALS within the Euro-MOTOR GWAS data, but also within a larger GWAS study of over 3,000 patients and 10,000 controls. Further analysis of over 5,000 samples (including those from Euro-MOTOR) identified 346 trans-eQTLs (where the DNA variant affects the expression of a distant gene, typically on a different chromosome). These were enriched for functional elements and enhancer regions in lymphoid and myeloid tissue, which is perhaps to be expected for blood. Therefore, analysis was performed to look at the effect of eQTLs on the GEPs of blood as well as liver, adipose tissue (subcutaneous and visceral) and skeletal muscle from 85 individuals. In this case, 2072 cis-eQTLs (where the DNA variant affects the expression of a nearby gene on the same chromosome) were identified. Over 47% of the eQTLs showed variability in level of gene expression within a tissue, and in 4.4% there were opposing levels of gene expression. The variants were found in transcriptional regulatory elements enriched for tissue specific expression. Therefore, the originating tissue type is very important in eQTL analysis and as such, comprehensive eQTL mapping in whole blood is likely to have limited value for generating a systems biology/systems genetics model of disease in a neurodegenerative disease such as ALS.
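The cis/trans distinction used above is usually drawn by genomic distance. The sketch below follows a common convention, which may differ from the one applied in the project: an eQTL is called cis when the variant lies on the same chromosome within a fixed window of the gene's transcription start site, and trans otherwise. The 1 Mb window, the class and function names and the coordinates are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    chromosome: str
    position: int  # base-pair coordinate


def classify_eqtl(variant: Locus, gene_tss: Locus, cis_window_bp: int = 1_000_000) -> str:
    """Label a variant-gene pair 'cis' or 'trans' by chromosome and distance."""
    same_chromosome = variant.chromosome == gene_tss.chromosome
    if same_chromosome and abs(variant.position - gene_tss.position) <= cis_window_bp:
        return "cis"
    return "trans"


# Hypothetical coordinates, not project data.
print(classify_eqtl(Locus("4", 170_000_000), Locus("4", 170_400_000)))  # cis
print(classify_eqtl(Locus("4", 170_000_000), Locus("9", 27_500_000)))   # trans
```

Because trans effects must be tested against every gene in the genome rather than only nearby ones, they carry a far heavier multiple testing burden, which is one reason they require larger sample sizes to detect.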

To establish biomarkers in muscle that distinguish early ALS cases from other related diseases and controls, or correlate with disease severity
This objective was sub-divided into three separate aims: a) to identify gene expression changes in skeletal muscle that could reliably define the degree of disease severity in ALS patients, b) to determine muscle transcriptome alterations before the onset of ALS motor symptoms in FTD patients, and c) to establish specific dysregulations of gene expression that distinguish ALS from SMA and SBMA.
a) GEPs were obtained from deltoid muscle biopsies from 9 ALS cases and 10 healthy controls and the stage of muscle impairment in the ALS cases was determined by manual muscle testing, electrophysiology and the degree of myofibre atrophy. Data analyses identified 25 and 70 genes exclusively expressed in early and late stage atrophy muscle respectively (Pradat et al, 2012 Neurodegener Dis 9;38-52). Expression levels of 198 genes correlated with levels of muscle atrophy. Combining these lists, 155 transcripts could distinguish early signs of muscle impairment, whilst 9 genes predicted advanced muscle atrophy with 100% sensitivity and specificity. Specifically, levels of MYOG and two genes which MYOG regulates, CHRNA1 and CHRNG, were shown to be increased as the levels of muscle atrophy increased.
b) As is evident from the literature, ALS and FTD belong to the same pathological spectrum. Whilst in ALS up to 5% of cases may develop clinical FTD, in FTD about 15-20% of cases subsequently present with motor dysfunction. To capture this phenomenon, muscle biopsies were taken from FTD patients with no motor symptoms, who were then re-assessed two years later. FTD patients were then divided into three groups: 1) FTD/ALS [FTD and ALS with muscle denervation], 2) FTD(MND) [FTD with no muscle denervation at the time of biopsy, but with denervation afterwards] and 3) FTD(-) [FTD, no ALS]. Comparison of the gene expression profiles between these groups detected 67 genes commonly altered in FTD/ALS patients and FTD(MND) patients, who had not shown any motor symptoms at the time of biopsy but presented with signs of muscle denervation later. The results therefore indicate that FTD patients evolving towards ALS later in their disease process present, prior to muscle denervation, with a muscle transcriptome signature closer to that of ALS patients.
c) Multiple pair-wise comparisons of GEPs of ALS, SMA, SBMA and control muscle samples identified 84 differentially expressed genes that were specifically expressed in ALS muscle. Whilst these are potential biomarkers for ALS against the neurodegenerative disorders SMA and SBMA, further analyses were performed to establish those markers specific to ALS compared to myopathies. GEPs from fibromyalgia, other NDD and myopathy cases established 27 differentially expressed genes specific to ALS. Comparing these two lists, 13 differentially expressed genes were identified that were specific to ALS cases and would potentially allow the differentiation of ALS from neurodegenerative and myopathy disorders.
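The final step in c), comparing the two ALS-specific gene lists to find the genes present in both, amounts to a set intersection. The sketch below uses placeholder gene identifiers; the actual gene lists are not given in this summary, and the function name is an assumption.

```python
def shared_markers(*gene_lists: set[str]) -> set[str]:
    """Return the genes present in every supplied differential-expression list."""
    if not gene_lists:
        return set()
    shared = set(gene_lists[0])
    for genes in gene_lists[1:]:
        shared &= genes  # keep only genes also found in this list
    return shared


# Placeholder identifiers, not real results from the project.
als_vs_sma_sbma = {"GENE_A", "GENE_B", "GENE_C"}
als_vs_myopathy = {"GENE_B", "GENE_C", "GENE_D"}
candidates = shared_markers(als_vs_sma_sbma, als_vs_myopathy)
```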

To interrogate cellular models of new genetic variants of ALS (including TARDBP and C9orf72) to understand their mechanisms of pathobiology
Cellular models of the new genetic variants of ALS, TARDBP and C9orf72, were generated. Since overexpression of wild type (WT) or mutant TDP-43 (encoded by TARDBP) is toxic to the cell, an inducible model using the Flp-In system was constructed in the NSC34 mouse motor neuron-like cell line. In this model, the inserted gene is only expressed when the antibiotic tetracycline is added to the culture media. Expression of WT TARDBP and the p.Q331K mutant was induced in NSC34 cells. Characterization of these cell lines clearly showed nuclear localization of TDP-43 in the WT cells, and mislocalisation of TDP-43 to the cytoplasm and TDP-43 aggregates in the p.Q331K TDP-43 mutant. RNA sequencing analysis was subsequently performed, and preliminary findings indicate that genes involved in aerobic respiration, protein transport and RNA and protein catabolic processes are the most enriched amongst those differentially expressed.
The parental NSC34 Flp-In cell line was also used to model the repeat expansion of C9orf72-ALS. Cell lines containing 0, 10, 51 and 102 repeats were generated and shown to mirror the disease, with the number of RNA foci and cellular toxicity both increasing as the repeat size increased. The 0 and 102 repeat cell lines were used for microarray analysis to determine the effect of the presence of a large GGGGCC repeat in a neuronal-like cell line. Those genes differentially expressed in the 102 repeat expansion cells were enriched for genes involved in protein transport, phosphorylation and RNA metabolism, and specifically for genes in the PI3K/PTEN pathway, as has been seen previously in the transcriptional study of C9orf72 MNs (Cooper-Knock et al, 2015, PLoS One 10: e0127376).

Exposomics (WP7)
A research goal of the exposomic analysis was to collect information on the environmental exposures of European patient and control cohorts to allow in-depth analysis of the role of environmental factors in the lifetime risk of developing ALS. Information on a wide range of different exposures was collected, including physical activity, occupation, trauma and use of drugs/medication.
Population-based analyses by partners from Ireland, Italy and the Netherlands provided supportive hypotheses for this collaborative initiative. We provided class I evidence linking smoking to ALS risk, and suggested a protective role of alcohol. An increased risk of ALS was associated with leisure time physical activity, and a low pre-morbid BMI and high fat intake were also recognized as increasing ALS risk. Taken together, and coupled with a lower frequency of hypercholesterolemia premorbidly, these results support the hypothesis that a genetic profile or lifestyle promoting physical/vascular fitness increases ALS susceptibility, possibly caused by underlying mechanisms in whole-body or cellular metabolism.
We performed large population-based survival analyses across the prospective EuroMOTOR cohorts on a range of lifestyle factors in ALS, establishing that prior occupational exposure to diesel motor exhaust has a negative prognostic impact in ALS, and shedding new light on factors such as smoking, alcohol and pre-morbid obesity. We also performed the first survival analysis on the effect of exposure to extremely low frequency electromagnetic fields, which was shown not to be prognostic.
Furthermore, we have mapped the incidence of ALS in Ireland, providing greater spatial accuracy of ALS risk estimates, which in turn has led to the previously unreported finding of areas of statistically significantly low risk for ALS. Small area mapping did not show any significant associations with population density, social deprivation, soil minerals or toxins. We also studied environmental and spatial exposures. The excess risk in the Piedmont region (Italy) was particularly evident in rural areas of Cuneo, Alessandria and Vercelli, suggesting that environmental exposure to agricultural chemicals could possibly be linked to this excess risk.
In addition, we investigated the association between ALS and previous trauma and physical activity. Two studies suggest that prior, repeated and/or severe traumas may be risk factors for ALS, while lifetime physical activity did not increase the risk of developing ALS.
Further progression of these analyses is underway using the fully combined datasets from Ireland, Italy and the Netherlands and the detailed Job Exposure Matrices generated as part of the EuroMOTOR project.
At the time of writing, 886 (57%) of 1557 recruited patients were known to have died or reached an alternate endpoint. To facilitate survival analysis, patients were followed up over time to determine when they either a) died, or b) reached an alternate endpoint (tracheostomy or use of non-invasive ventilation for more than 23 hours per day). Preliminary analysis confirms the importance of known prognostic indicators and is in agreement with published characteristics of the populations from which EuroMOTOR patients were recruited. These data (from WP2) will be incorporated into further analyses of the exposomic datasets from WP7.
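Follow-up data of this kind, where each patient either reaches the composite endpoint (death, tracheostomy or near-continuous non-invasive ventilation) or is still under observation, are commonly summarised with a Kaplan-Meier (product-limit) estimate. The report does not state which estimator was used, so the sketch below is only an illustration; the function name and the toy data are assumptions.

```python
def kaplan_meier(times: list[float], events: list[bool]) -> list[tuple[float, float]]:
    """Product-limit survival estimate.

    times:  follow-up time per patient
    events: True if the composite endpoint was reached, False if censored
    Returns (time, survival probability) at each time with at least one event.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        endpoint_count = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            endpoint_count += int(data[i][1])
            leaving += 1
            i += 1
        if endpoint_count:
            survival *= 1 - endpoint_count / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


# Toy data: four patients, the second is censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
```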
Another goal of this research was to create and apply job exposure matrices (JEMs), tools to assess the exposure of the patient and control cohorts to occupational hazards during the course of their work. The JEMs we used examine exposure to electromagnetic fields, pesticides, solvents, inhaled compounds (e.g. dust, gases, etc.) and physical activity. JEMs link occupations to profiles of environmental exposures by providing quantitative assessments of various exogenous exposures for each occupation and leisure time activity. JEMs are less affected by recall bias because exposure classification is performed blinded to health outcome. The JEMs were created by three occupational exposure experts and have now been validated. Using the exposomics questionnaire, cases and controls have provided researchers with details of their lifetime occupational history, including military service and periods spent as a homemaker. All occupations are recorded with respect to the number of years and the hours per week employed. Information about education, cigarette smoking and alcohol consumption is also recorded.
Subsequently we determined the risk of amyotrophic lateral sclerosis (ALS) associated with lifetime occupational history in a population-based case-control study. The lifetime occupational history of 662 patients and 2,152 controls was obtained using the structured questionnaire, and coded according to the International Standard Classification of Occupations, 1968 version (ISCO-68). Main and last occupations, and whether participants were ever employed in each occupation, were compared between patients and controls, while controlling for potential confounders and adjusting for multiple comparisons. A last job in ISCO major group “agricultural” was associated with an increased risk of ALS (adjusted for age and gender: OR 1.9; 95% CI 1.2-2.9; adjusted for age, gender, smoking and alcohol: OR 1.6; 95% CI 1.0-2.5) and there was a trend towards a higher risk of ALS with a longer employment duration and a shorter time since last employment in the agricultural sector. From previously identified candidate occupations in ALS, only the association with occupation as an athlete, sportsman or related worker could be replicated (OR 2.2; 95% CI: 1.1-4.6). Agricultural work was thus associated with an increased risk for ALS. The fact that only one of the candidate occupations, i.e. athletes and sportsmen, could be replicated in the present large study indicates that most previously hypothesised candidate occupations may be chance findings due to analyses uncorrected for the number of comparisons made.
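For readers less familiar with the odds ratios quoted above, the following sketch computes a crude (unadjusted) odds ratio with a Wald 95% confidence interval from a 2x2 table. The ORs reported here were adjusted for covariates, which would normally require a regression model, so this is only an illustration of the basic quantity; the counts and function name are hypothetical.

```python
import math


def odds_ratio_ci(
    exposed_cases: int,
    unexposed_cases: int,
    exposed_controls: int,
    unexposed_controls: int,
    z: float = 1.96,  # normal quantile for a 95% interval
) -> tuple[float, float, float]:
    """Crude odds ratio with a Wald confidence interval on the log scale."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases
        + 1 / unexposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that includes 1.0 would indicate that the crude association is not statistically significant at the 5% level.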

Functional studies (in vivo and in vitro) (WP8)
a. Primary neuronal cultures
We established a series of new in vitro models which allowed the characterization of alterations in axonal mRNA content in motor neurons in which the levels of Smn, TDP-43 and other proteins with relevance for ALS and other forms of motor neuron disease were reduced. We developed a new compartmentalized cell culture system that allows quantitative analysis of the mRNA repertoire in the axonal compartment in comparison to the somatodendritic compartment, and new techniques that allow linear amplification of small amounts of transcripts isolated from the axons of isolated motor neurons. In addition, the phenotype of motor neurons in which Smn, TDP-43 and other ALS-related genes were suppressed was determined.
b. Induced pluripotent stem cells
We successfully created induced pluripotent stem (iPS) cell lines from different ALS patients. We reprogrammed 7 ALS iPS cell lines from patients carrying expanded Ataxin-2 repeats, and 7 healthy control cell lines. We derived the isogenic control iPS cell lines (which are then Ataxin-2 knock-out lines). In addition, 6 iPS cell lines with repeat expansions in C9orf72 were obtained. Reprogramming and characterization of the obtained iPS cell lines became efficient and were routinely performed. These cells were also successfully differentiated into motor neurons. Initially no screenable phenotype was found in motor neurons of ALS patients. Thorough analyses of all derived iPS cell lines and differentiated motor neurons subsequently yielded phenotypes on multiple levels, including morphological phenotypes (soma size), functional phenotypes (electrophysiology and survival) and molecular phenotypes (cell stress, e.g. Golgi fragmentation). In vitro maturation of neurons was further analyzed on a functional basis by electrophysiological experiments. Current and voltage clamp experiments showed that iPS-derived motor neurons were functional. We showed that ALS patient-derived motor neurons were either hypo- or hyperexcitable compared to control neurons, depending on the time point of analysis. Pathway analyses using transcriptomics, proteomics, and metabolomics were performed.

c. Zebrafish
Transient zebrafish models for FUS and TDP-43 comparable to the SOD1 zebrafish model were developed, and these models showed shorter motor axons as well as increased branching. While these differences were partially mutation-specific for TDP-43, this was not the case for FUS. These different models were used to validate different modifiers. Overexpression of both human wild type FUS and mutant FUS induced shorter axons and increased branching. These results suggest that the correct dosage of FUS is important for normal neuronal functioning; as overexpression of wild type TDP-43 also had negative effects, the same seems to be true for TDP-43. We deleted all functional domains from the wild type protein one by one and investigated the effect on toxicity. We were able to attribute the toxic properties of the full-length protein to the N-terminal, low complexity domain of FUS and to the C-terminal RNA/DNA binding region.

d. Mouse
New transgenic mouse models for ALS were created starting from the genetic defects in TDP-43 and FUS and the phenotypes of these new models were characterised.
Thy1.2 FUS mice: We overexpressed human FUS postnatally in neurons using the Thy1.2 promoter. These mice did not develop an overt phenotype. No differences were found in weight between control, wild type and R521H FUS mice. No motor phenotypes were identified.
Rosa FUS mice: We generated conditional overexpression mice. In these mice cDNA of wild type (WT) or mutant (R521H) FUS was inserted into the Rosa26 locus behind a floxed stop cassette and a β-actin promoter. The effect of generalized R521H FUS expression was investigated using PGK-Cre and CAGG Cre-ER lines. Homozygous overexpressing animals show two phases of lethality: the first immediately after birth, the second between P16 and P25. These animals display rapidly progressive paralysis of both forelimbs and hind limbs. Mutant FUS expressing animals gain less weight and perform worse in the hind limb suspension test and righting reflex. However, these mice do not show motor neuron loss. We also selectively expressed mutant FUS in different cell types. The different Cre-ER lines, Thy1 Cre-ER (neuronal), Connexin30 Cre-ER (glial) and PLP Cre-ER (oligodendroglial), were crossed into a homozygous R521H Rosa background. None of these mice showed a phenotype.
Conditional FUS knock-in mice: We generated a knock-out knock-in model mimicking the human situation. For the generation of conditional FUS knock-in mice, we introduced in the endogenous mouse FUS intron 14 a cassette which allows switching off mutant FUS in cells that undergo Cre-mediated recombination. In the heterogeneous situation, these mice die at birth.
ENU mouse models: To study the biology of TDP-43 and how mutations in TARDBP (the gene encoding TDP-43) lead to ALS, we generated and characterised a number of mouse lines carrying point mutations, induced by ENU mutagenesis, in the endogenous mouse Tardbp (TDP-43) gene, which is exquisitely dose-sensitive. We aimed to produce mice that mimic the human condition and express mutant TDP-43 at physiological levels. The first mutation was a nonsense mutation (Q101X); on full characterization, TardbpQ101X/+ mice develop a similar phenotype to heterozygous knockout mice. This strain was also used to test for any possible interactions between TDP-43 loss of function and SOD1-ALS.
The characterization of two other mouse lines, F210I and M323K, was completed. The F210I mutation is embryonic lethal in homozygosis, with heterozygous mice (TardbpF210I/+) showing non-motor phenotypes. At the molecular level, the F210I mutation affects the RNA binding capacity of TDP-43, diminishing TDP-43 function in splicing. We characterised this mutation at the molecular level to further understand the biology of TDP-43 and how splicing dysfunction leads to ALS. The M323K mutation lies within a molecular hot-spot for ALS mutations and, interestingly, positively affects TDP-43 splicing function. Together with the study of the F210I mutation, it allows for the dissection of the role of splicing dysregulation in the development of ALS.
In conclusion, the phenotypes of the different FUS and TDP-43 mouse models that were generated are either too aggressive or too mild. In addition, none of the models currently available shows clear motor neuron loss.

Other mouse models: We successfully created a mouse model for DCTN1 based on variants identified in sporadic ALS patients, and one of these mice showed a clear motor phenotype. Clinically, these transgenic mice showed distinct hind limb clasping and exhibited evidence of muscle weakness. Furthermore, these mice showed a severe gait disturbance and paralysis of the hind limbs between eight and nine months of age. Pathological analysis of these transgenic mice revealed evidence of severe neurogenic degeneration of the skeletal muscles (e.g. quadriceps), characterized by typical hallmarks such as angular muscle fibers and grouped muscle atrophies.
Use of different models to get insights into the disease mechanism
a. Disturbances in axonal outgrowth and transport
We showed that genetic deletion of histone deacetylase 6 (HDAC6) in ALS mice results in a slower disease process and in a significantly increased survival. In addition, we observed that pharmacological inhibition of HDAC6 reversed the symptoms in mouse models of hereditary neuropathies (CMT2 and distal HMN). HDAC6 plays an important role in the regulation of axonal transport. HDAC6 is one of the major tubulin deacetylating enzymes and is a member of the class IIb HDACs. It is the only one with tubulin deacetylating activity.
We performed an unbiased screen using the zebrafish model and discovered that EphA4 was an important modifier of the phenotype induced by the expression of different mutant ALS related genes. Knockdown of EphA4 rescued the axon abnormalities induced by three different mutations in SOD1. The rescuing effect of EphA4 knockdown in the zebrafish model was validated by investigating the genetic deletion of EphA4 in the mutant SOD1G93A mouse. A 50% decrease in the expression of EphA4 significantly increased motor performance and survival of mutant SOD1G93A mice. In addition, quantification of the number of ventral horn motor neurons and the percentage of fully innervated neuromuscular junctions at a given time indicated that deletion of one EphA4 allele slows motor neuron degeneration.
To evaluate the involvement of the EphA4 receptor in humans, EphA4 levels in ALS patients were measured, and low expression levels of EphA4 were found to be associated with later age of onset and longer survival, as predicted by the results obtained in fish and in mice. Furthermore, direct sequencing of the EphA4 gene in 180 patients with ALS revealed mutations in two of them. These mutations were shown to induce loss-of-function changes. As predicted by the results described above, these patients had an unexpectedly long survival. These findings suggest that EphA4-dependent signalling is a modifier of motor axon degeneration.
We investigated whether pharmacological blockade of the EphA4 receptor resulted in the same effect on motor neuron degeneration as genetic manipulation of the EphA4 gene. Treatment of mutant SOD1A4V overexpressing zebrafish embryos with an EphA4 antagonist completely rescued the mutant SOD1-induced axonopathy. To investigate the effect of EphA4 inhibition in a rodent model, mutant SOD1G93A rats were treated with the KYL-peptide, an antagonist peptide against EphA4, through ICV administration. This EphA4 antagonist delayed disease onset and prolonged survival in the mutant SOD1G93A rat model.
We investigated ELP3 as a modifier that may increase the susceptibility to neurodegeneration. Therefore, the mechanism through which ELP3 affects this aspect of motor neuron biology was studied. We discovered that ELP3 is not only present in the nucleus (where it exerts its effect on RNA synthesis), but also in the cytosol. As a consequence, it was hypothesized that ELP3 could target the cytosolic protein tubulin and enhance axonal transport, a mechanism known to be affected in ALS. Only very high overexpression of ELP3 increased tubulin acetylation. At physiological expression levels, ELP3 did not acetylate tubulin. In contrast, we found it to acetylate, at least in the fly, the active zone protein Bruchpilot (BRP). In a collaborative study with the group of Dr. P. Verstreken (Department of Human Genetics, KU Leuven), we discovered that in ELP3 mutants, presynaptic densities assembled normally, but that they showed morphological defects such that their cytoplasmic extensions covered a larger area, resulting in increased vesicle tethering as well as a more proficient neurotransmitter release. We propose a model in which ELP3-dependent acetylation of Bruchpilot at synapses regulates the structure of individual presynaptic densities and neurotransmitter release efficiency.
To study the effect of ELP3 deletion in the mouse, an ELP3 knockout mouse (ELP3-/-) was generated. In homozygous form, this mouse is not viable, dying around day 10 in utero like the ELP1-/- mouse. The heterozygous mouse (ELP3+/-) is phenotypically normal. Crossbreeding these mice with the mutant SOD1G93A mouse resulted in an earlier disease onset, while no significant effect on survival was found. To overcome the embryonic lethality of the ELP3-/- mouse, we conditionally knocked down ELP3 in adult mice, making use of the inducible Cre-ER system. Surprisingly, the general knock-down (90%) of ELP3 resulted in death within 40 days, as did the knock-down driven by the neuronal promoter Thy1.2.
The effect of ELP3 overexpression was also investigated in ALS models. In the mouse, partner 2 found evidence that ELP3 overexpression in the mutant SOD1G93A mouse is neuroprotective, as it delays disease onset and prolongs survival of these mice. ELP3 overexpression was achieved both by intrathecal delivery of AAV9-ELP3 viral particles in neonatal mice and by generation of an inducible (Cre-ER) transgenic HuELP3 mouse. AAV9-mediated overexpression of ELP3 in the spinal cord of SOD1G93A mouse pups extended survival by 9 days. In addition, ELP3 overexpression in adult mice (60 days old) delayed the onset of the disease and prolonged the survival of SOD1G93A mice by 9 days. In the zebrafish, partner 2 found a protective effect of ELP3 on the motor axonopathy induced by mutant SOD1, mutant TDP-43 and C9ORF72 ATG-morpholino. The neuroprotective effect of ELP3 in the SOD1A4V zebrafish was abrogated by mutations in the SAM domain (methylation domain), but not in the HAT domain (acetylation domain).
During this project, no evidence for synapse dysfunction was found. One example of a negative result was that we discovered that genetic removal of a gene playing an important role in neurotransmitter release at synapses (UNC13A) had no effect on the phenotype of the ALS mice.
b. Disturbances in energy and metabolism
We found that PHD1 knockdown resulted in beneficial effects on survival of the ALS mice and treatment of an acute model of neuronal damage with antisense oligos (ASOs) against PHD1 resulted in a beneficial effect. Moreover, we discovered that the absence of PHD1 resulted in a shift from glycolytic flux towards the pentose-phosphate pathway.
Alterations of energy and lipid metabolism were shown both in ALS patients and in mutant SOD1 mice. This led to the hypothesis that metabolic parameters are a major contributor to the initiation and progression of the disease. We analyzed how alterations in lipid metabolism are relevant to ALS. First, we observed in mutant SOD1 mice and in ALS patients an early loss in muscles of a key enzyme involved in lipid management: stearoyl-CoA desaturase-1 (SCD1), which regulates β-oxidation of fatty acids and synthesis of complex lipids. Therefore, the fatty acid composition of lipids in the circulation and in several tissues of mutant SOD1 mice was quantified. The observed differences provided the starting point for follow-up studies on the characterization of the fatty acid composition of lipids in serum samples from patients. We detected in ALS patients a characteristic monounsaturated and polyunsaturated fatty acid composition that has strong biomarker potential. Mechanistic studies on the role of SCD1 in mouse models of ALS (mutant SOD1 mice and an experimental denervation model) were also performed. The results showed that repressing SCD1 expression or reducing SCD-dependent enzymatic activity is beneficial for the recovery of motor function after nerve lesion. The importance of the metabolism of sphingolipids, another class of lipids, in ALS muscle pathology, particularly in the maintenance of the neuromuscular junction, was also reported. Finally, we focused on the therapeutic potential of manipulating SCD1 and the amounts of related fatty acids. However, we failed to show a clear beneficial effect on mutant SOD1 mice by targeting SCD1 either pharmacologically or genetically.
Validation of hits from the different screenings
a. Axonal and synaptic factors
The differentiated iPS cells generated were used for the different ~omics studies. This resulted in the generation of a highly relevant translational ALS model, perfectly suited for systems biology approaches. In the context of this project, these cells were used by partners 1, 6, 11, and 13 to apply a multi-omics systems biology approach (transcriptomics (RNAseq), proteomics, metabonomics, and subsequent integration). Partner 7 determined alterations in the transcriptome of Smn and TDP-43 deficient primary motor neurons. It was found that mainly the axon and the axon terminal, but not signaling pathways for neuronal survival were affected in such motor neurons.
b. DNA repair
NEK1 and C21orf2 were identified in WP3 as novel ALS risk genes. Both were characterized further, as NEK1 and C21orf2 interact and are required for efficient DNA repair. DNA damage is one of the earliest detectable events in neurodegeneration and receives a lot of attention. NEK1 encodes NIMA (never in mitosis gene a)-related kinase 1, a serine/threonine kinase involved in cell-cycle regulation. The encoded protein is found in a centrosomal complex with FEZ1, a neuronal protein that plays a role in axonal development. Defects in this gene are a cause of polycystic kidney disease (PKD). Several transcript variants encoding different isoforms have been found for this gene. Diseases associated with NEK1 include short-rib thoracic dysplasia 6 with or without polydactyly and short-rib thoracic dysplasia 3 with or without polydactyly. NEK1 phosphorylates serines and threonines, but also appears to possess tyrosine kinase activity. This enzyme is implicated in the control of meiosis and also seems to be involved in cilium assembly. In response to injury (including DNA damage), NEK1 phosphorylates VDAC1 to limit mitochondrial cell death. Not much is known about C21orf2 apart from the fact that four alternatively spliced transcript variants encoding four different isoforms have been found for this nuclear gene. All isoforms contain leucine-rich repeats. Three of these isoforms are mitochondrial proteins; the fourth lacks the target peptide and as a consequence is not located in mitochondria. The C21orf2 gene is down-regulated in brain from patients with Down syndrome (DS), which may represent mitochondrial dysfunction in DS patients.

Functional studies (WP9)
To integrate the different types of obtained data from the ~omics work packages we developed infrastructure for QTL mapping. We developed an open-source Java-based method to perform quantitative trait locus mapping (QTLMappingPipeline) that is publicly available along with extensive documentation at: https://github.com/molgenis/systemsgene3s/eqtl-mapping-pipeline.
It is a generic method (described in Westra et al, Nature Genetics 2013) that can:
- Identify effects of genetic variants on gene expression, methylation, metabonomic or protein levels
- Correlate large datasets with each other (e.g. gene expression with methylation data)
- Reduce unwanted and unknown noise in datasets
- Identify samples that have been accidentally mixed up in the lab
Step-by-step guides are provided for eQTL mapping and mQTL mapping.
We developed an open-source pathway enrichment method, called DEPICT (described in Pers et al, Nature Communications 2015), that is publicly available at http://www.broadinstitute.org/mpg/depict/. It integrates genetic and gene expression data and uses predicted gene functions (publicly available browser at http://www.genenetwork.nl/genenetwork/) to improve statistical power to identify significantly enriched pathways (described in Fehrmann et al, Nature Genetics 2015). We applied DEPICT to summary statistics of an ALS GWAS (>10,000 cases and >20,000 controls) and identified significant enrichment of genes involved in the ‘SNARE receptor complex’ and ‘SNAP pathways’ in ALS.
We subsequently obtained several brain eQTL datasets to get a comprehensive overview of the effect of the SNPs on gene expression in brain. We observed that brain eQTL SNPs were preferentially affecting gene expression levels of genes that are involved in the aforementioned GO: SNAP receptor activity and KEGG: SNARE interactions in vesicular transport pathways.
We used the Sherlock web-tool for analyzing the summary statistics of ALS GWAS results and integrating those with the previously published brain eQTL data. In this analysis we did not confine ourselves with the GWAS loci defined by DEPICT but used method for finding overlapping SNP-level associations for trait (ALS) and gene (corresponding cis- or trans-eQTL) and subsequently prioritizing putative disease-associated genes based on those overlaps. The most significant gene identified by Sherlock’s Genetic Signature Matching was MAPT (P=1.1×10-6) which was also under cis-eQTL effects of rs199452 (GWAS MLMA P=5.5×10-5; eQTL P=5.69×10-6) and rs4277389 (GWAS MLMA P=2.12×10-4; eQTL P=5.97×10-9). MAPT encodes the microtubule-associated protein Tau and the variation in this gene has previously been associated with neuropathologies, including ALS in Chinese and Guam populations.
We used immunoprecipitation data from D4.8 to construct protein-protein interaction (PPI) network including both, wild-type and mutated ALS-associated proteins (Fig. 9.1). After intersecting members of PPI network with genes positioning suggestive ALS GWAS loci and genes having cis-eQTL effect in brain tissues, we identified a number of genes which may serve as susceptibility genes for ALS pathogenesis and progression.
We systematically summarized the most informative data layers in an attempt to prioritize ALS-associated genes which can be serve as working hypotheses for future studies (Table 9. 1). By integrating multiple lines of evidence, we were able to find genes which have suggestive association with ALS-associated processes. For example, RNF10 gene under cis-eQTL effect in publicly available brain eQTL studies, it positions in the suggestive GWAS locus (P<10-4) and it also correlates significantly with the combined set of known ALS-associated genes.
One of the most suggestive examples among the prioritized genes was MAP1LC3B, which has, to our knowledge, not previously been implicated in ALS. This gene is under genetic regulation in brain, is located in a suggestive GWAS locus and correlates with the set of known ALS genes. The guilt-by-association phenotype predictions from an independent RNA-seq based network also suggested that this gene is involved in ALS (P=1.31×10⁻⁵; Table 9.1). This gene is also highly expressed in brain compared to other tissues.
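The prioritization logic described above, counting how many independent evidence layers support each gene, can be illustrated with a minimal sketch. This is not the project's actual pipeline: the layer names and the genes GENE_A, GENE_B and GENE_C are hypothetical placeholders, and only the placement of RNF10 and MAP1LC3B in all three layers follows the text.

```python
from collections import Counter

# Hypothetical evidence layers. GENE_A/B/C are placeholders, not real results;
# RNF10 and MAP1LC3B appear in all three layers, as described in the report.
evidence_layers = {
    "brain_cis_eqtl": {"RNF10", "MAP1LC3B", "GENE_A", "GENE_B"},
    "suggestive_gwas_locus": {"RNF10", "MAP1LC3B", "GENE_B", "GENE_C"},
    "coexpressed_with_known_als_genes": {"RNF10", "MAP1LC3B", "GENE_A"},
}


def prioritize(layers):
    """Rank genes by the number of evidence layers that support them."""
    counts = Counter(gene for genes in layers.values() for gene in genes)
    # Sort by descending support, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranked = prioritize(evidence_layers)
# Genes supported by every layer are the strongest working hypotheses.
top_genes = [gene for gene, n in ranked if n == len(evidence_layers)]

for gene, n in ranked:
    print(f"{gene}: supported by {n} of {len(evidence_layers)} layers")
```

A simple count treats all layers as equally informative; a real analysis would probably weight layers by their strength of evidence, for example by P-value.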
Taken together our model for ALS disease and disease progression is as follows. Checkmarks indicate the various genes on the model for ALS disease (green) and ALS disease progression (red):

Potential Impact:
Clinical data coordination (WP2)
As ALS is a relatively rare disease, the establishment of a Europe-wide network of standardized population-based registries representing a population of about 41 million people is valuable for current and further research, as this large cohort will increase the power of the analyses. In addition, the guidelines on clinical, biological and epidemiological data sampling will generate more homogeneous data across different institutes and countries in Europe.
Genomics (WP3)
The genetic landscape of ALS has changed dramatically during the EuroMOTOR project, for a large part because of novel discoveries resulting from this project.
At the start of the project and during the grant writing, sporadic ALS was hypothesized to be a complex disease where many genetic variants with individually relatively small effect would explain the majority of ALS. The 10% of patients with familial ALS were considered to be rare Mendelian forms of the disease.
However, the distinction between familial and sporadic ALS has blurred considerably because of several breakthrough discoveries in the past few years. We now know that the most common genetic mutation in familial ALS, the C9orf72 repeat expansion, discovered in 2011, is also present in 8-10% of sporadic ALS patients. Because of this, we were able to identify the 9p genomic region in a first GWAS in 2009, which was subsequently fine-mapped to this repeat expansion in 2011.
Our genetic studies during the EuroMOTOR project uncovered six ALS risk loci, revealed the genetic architecture of ALS in detail for the first time and showed strong evidence for a distinct and important role for rare variation. This disproportionately large role for low-frequency variants in ALS, including the newly uncovered C21orf2 and NEK1 risk genes, highlights ALS as a “simplex” disease: not a collection of Mendelian (“simple”) diseases, and not driven by thousands of low-effect SNPs (“complex”), but instead defined by genetic variation in ALS risk genes. These risk genes have odds ratios ranging from 1.5 to approximately 250. For example, the ATXN2 repeat expansion has an odds ratio of approximately 8, C21orf2 and NEK1 of 1.6, SMN1 of approximately 3.5 and C9orf72 of approximately 250. The concept of ALS risk genes and of ALS being a “simplex” disease has a large impact on several areas:
Impact on diagnostic testing and genetic counseling
The outcomes of our studies, especially the issue of reduced penetrance, will be of high relevance to diagnostic testing and genetic counseling: testing of patients and healthy relatives should take this reduced penetrance into account. It means that precise prediction in unaffected carriers of ALS mutations will be challenging, while the technological advancements in prenatal testing and embryo selection open up all sorts of new questions for female relatives of patients with ALS risk gene mutations.
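To illustrate why reduced penetrance makes prediction in carriers difficult, the sketch below converts an odds ratio into an approximate absolute risk. It is illustrative only and not a counseling tool: the baseline lifetime risk of 1 in 350 is an assumption introduced here (it is not a EuroMOTOR result), the population baseline is used as a stand-in for the risk in non-carriers, and the odds ratios are the approximate values quoted for the ALS risk genes in this report.

```python
def carrier_risk(baseline_risk, odds_ratio):
    """Convert a baseline risk and an odds ratio into an absolute risk.

    risk -> odds, multiply by the odds ratio, then odds -> risk.
    """
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    carrier_odds = odds_ratio * baseline_odds
    return carrier_odds / (1.0 + carrier_odds)


# ASSUMPTION: illustrative baseline lifetime risk, not taken from the report.
ASSUMED_BASELINE_RISK = 1 / 350

# Approximate odds ratios as quoted in the report.
odds_ratios = {
    "C21orf2 / NEK1": 1.6,
    "SMN1": 3.5,
    "ATXN2 repeat expansion": 8.0,
    "C9orf72 repeat expansion": 250.0,
}

for gene, odds_ratio in odds_ratios.items():
    risk = carrier_risk(ASSUMED_BASELINE_RISK, odds_ratio)
    print(f"{gene}: approximate absolute risk {risk:.1%}")
```

Under these assumptions, carriers of the lower-odds-ratio variants remain at a low absolute risk, and even the largest odds ratio does not translate into certainty. This is the practical meaning of reduced penetrance for counseling.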
Impact on clinical trial design
Current clinical trials do not apply any form of stratification (as if ALS were one disease), which hampers drug development and clinical efficacy testing. Elucidating the genetic architecture of ALS will allow better stratification of patients according to determinants of disease progression or disease susceptibility. Such novel stratification strategies can be used directly in upcoming clinical trials of experimental drugs that target, for example, human SOD1 mutations and C9orf72 repeat expansions.
The development of new, targeted treatments will also influence the drive to start testing all patients for ALS risk gene mutations.
Targets for further translational neuroscience experiments and the development of therapeutic targets
This concerns feeding forward results to both academia and industry. We had hoped to find more susceptibility loci during earlier phases of the project. However, the genetic architecture of ALS that has now been uncovered explains why GWAS with multiomics data in the sample sizes initially proposed in this grant will not provide the desired answers. Instead, large-scale whole genome sequencing is required, in combination with even larger multiomics data sets.
Nevertheless, the exciting discoveries of novel ALS risk genes that we made in the past few years now allow for many translational neuroscience experiments to further understand the etiopathogenesis of ALS.
Proteomics (WP4)
The technological pipelines we developed and the proofs of concept we provided will shape future research in ALS and beyond. For instance, identifying the protein Unc119 as a critical component of toxic ALS aggregates provides a novel drug target for pharmaceutical research. The generic nature of our approaches allows us to extend our future research to other neurodegenerative pathologies. We are currently using approaches developed in EUROMOTOR to analyze aggregates in other neurodegenerative disorders (compare Woerner et al., 2015), to map aggregation-prone proteins in the entire proteome and to identify common and discriminating features of protein aggregates. We envision that the unbiased proteomics data on ALS will significantly improve the understanding of this devastating pathology and pave the way for earlier diagnosis and treatment. Many of the collaborations within the consortium are ongoing beyond the financing period, underlining the unique opportunities established by the EC.
Metabolomics (WP5)
Our findings present novel insight into the metabolic alterations associated with the presence of ALS and specifically with mutations that cause ALS. They therefore indicate new potential pathways for therapeutic intervention and biomarkers for the disease. In particular, it is hoped that during follow-up of the EUROMOTOR cohort the metabolic signature associated with disease will be able to predict the rate of disease progression. This would provide a valuable additional tool for stratifying treatment and monitoring early response to therapy in clinical trials, thus addressing a key limiting factor and facilitating more testing of novel therapies. Given the clinical and genetic overlap between ALS and frontotemporal dementia (FTD), we believe that our work will also have relevance to more prevalent neurodegenerative disorders, widening the societal impact of the research.
Transcriptomics (WP6)
The gene expression profiling of lymphoblastoid cell lines (LCLs) has emphasized that RNA processing is a key mechanism in ALS pathogenesis and that it is dysregulated significantly more in C9orf72-ALS, probably due to the repeat expansion sequestering RNA binding proteins from their normal functions in the cytoplasm. This is demonstrated, not only in LCLs but also in individually isolated motor neurons, to have a significant impact on the processing of messenger RNA and its splicing. It is also evident that in individuals with fast disease progression, more cell cycle and mitotic genes were expressed. Further analysis of the data is required to establish whether there are key biomarkers among these genes whose expression levels correlate more closely with disease progression.
Gene expression profiling of cellular models and induced pluripotent stem cell (iPSC)-derived motor neurons has also allowed the pathogenic mechanisms involved in TARDBP-ALS, C9orf72-ALS and ATXN2-ALS to be elucidated; understanding these mechanisms will be essential for identifying potential gene and biological pathway targets for future therapeutic strategies.
Our gene expression profiling studies clearly demonstrate that muscle pathology exhibits some features specific to ALS that can be exploited to follow onset and progression of the disease. Major advantages of using skeletal muscle tissue to search for ALS biomarkers are that this tissue is easily accessible and is directly affected by the disease at an early stage that precedes motor neuron death. However, collecting muscle biopsies remains relatively invasive, and repeated testing of the same patient is sometimes not ethically feasible. A needle biopsy technique may offer the opportunity to obtain muscle specimens by a less invasive procedure.
Exposomics (WP7)
Analysis of the exposomics data has given insight into the potential lifestyle and environmental factors that determine ALS susceptibility or phenotype. Various risk factors have been proposed previously, such as excessive physical exercise (Chio et al, 2005) and exposure to organic pesticides (Sutedja et al, 2009b), but the evidence remained inconclusive. Previous research in this area has been confounded by lack of power and the difficulty of designing unbiased case-control studies in a disease with low prevalence.
This cohort study, the largest in the world with adequately matched cases and controls, has generated high-quality exposure data and allows for its integration with the genomics and metabolomics data produced by other workpackages. The analysis of the exposomic dataset provides insight into the environmental and lifestyle factors that increase the risk of developing ALS, which is important for our understanding of the disease and the development of methods to treat it. As these data are integrated with genomics, transcriptomics and metabolomics data, it is likely that subclusters will be identified that share common pathogenic mechanisms, thus driving novel therapeutic approaches.
In addition to what has been achieved to date, EuroMOTOR WP7 has generated a wealth of data, the mining of which will follow in the coming months to further investigate the influence of environmental factors on ALS risk, and has provided a model by which exposomics can be investigated in other populations across the world. In this context, EuroMOTOR partners are principal investigators on an application to the US Centers for Disease Control and Prevention call for ALS epidemiologic research, the aim of which is to replicate the work undertaken in WP7 in countries in South America.
With respect to the EuroMOTOR project, the following data mining projects have been agreed and are currently underway:
• A review of the selection and matching of the cases and controls providing gold standard multivariate studies using the largest cohort of population based patients and controls in the world.
• An analysis of the effect of smoking, physical exercise and BMI
• An analysis of the impact of comorbidity on ALS phenotype and progression. Factors include cardiovascular disease, hypertension and other comorbidities, use of medication (e.g. statins), and traumas.
• The impact of exposure to hormones and pregnancy on ALS risk, phenotype and survival
• A detailed analysis of the possible protective effects of alcohol consumption
• A socioeconomic analysis of ALS risk based on occupation and socioeconomic group.
• Job exposure matrix (JEM) 1: Solvents (and formaldehyde) and metals
• JEM2: Pesticides
• JEM3: Electromagnetic Frequency and shocks
• JEM4: Dust and Diesel
• A combined analysis of all risk factors

Model generation (WP8)
Primary motor neuron cultures as well as differentiated motor neurons from induced pluripotent stem (iPS) cells from patients with ALS-causing mutations were obtained, and these culture systems were used in the different -omics studies to find out which pathways were affected. From these results, it became apparent that mainly the axon and the axon terminal were extremely important in the disease process. This is in line with the notion that has emerged over the past few years that a dying-back mechanism is at play in ALS. The axons lose contact and degenerate before the cell bodies die, and protection of the cell body is not sufficient to halt the disease.
The concept that the axon is very important is further substantiated by results obtained from the in vivo studies. A non-biased screen in zebrafish pointed to the ephrin signalling system. These findings obtained in WP8 suggest that EphA4-dependent signalling is a modifier of motor axon degeneration and that this axon guidance system could become a new therapeutic target in ALS.
Deacetylation by HDAC6 also influences axonal function (axonal transport). HDAC6 has only moderate tubulin deacetylating activity at baseline conditions, and ubiquitinated proteins recruit HDAC6. Once recruited, HDAC6 could exert a double function. Ubiquitinated proteins will be translocated to the perinuclear region and processed for breakdown by HDAC6 (the “good” function). On the other hand, the deacetylating activity of HDAC6 will deacetylate α-tubulin, compromising axonal transport and clearance of damaged proteins and eventually leading to axonal loss and neuronal death (the “bad” function). In the context of this workpackage, it was discovered that HDAC6 causes axonal loss and neuronal death in ALS and that inhibition of the deacetylation function of HDAC6 could have a potential therapeutic effect.
ELP3 is another disease modifier and a potential new therapeutic target. From the results obtained in workpackage 8 it became clear that ELP3-dependent acetylation of Bruchpilot at synapses regulates the structure of individual presynaptic densities and has a positive effect on neurotransmitter release in fruit flies. In ALS models, overexpression of ELP3 has a protective effect while deletion of ELP3 has negative effects. This is in line with the patient data, which show that protective SNPs are associated with higher ELP3 expression. These results indicate that stimulation of ELP3 expression is an interesting new therapeutic strategy.
The metabolism results obtained in workpackage 8 are in line with the concept that ALS does not only affect the motor neurons but rather affects the body as a whole. In this respect, ALS is characterized by profound alterations of lipid metabolism. The discovery in WP8 that a characteristic fatty acid composition occurs in blood lipids of ALS patients will pave the way for the establishment of robust biomarkers that help to follow patients in clinical practice and in the design of clinical trials. Oxygen metabolism could also play an important role in ALS. It was discovered in this workpackage that inhibition of the oxygen sensor PHD1 could be an interesting new therapeutic option.
Last but not least, DNA repair was suggested to be involved in the ALS disease process, as NEK1 and C21orf2 were identified in the context of the EuroMOTOR project as novel ALS risk genes. NEK1 and C21orf2 interact and are essential for efficient DNA repair. DNA damage is one of the earliest detectable events in neurodegeneration. NEK1 encodes a serine/threonine kinase involved in cell-cycle regulation. The encoded protein is found in a centrosomal complex with FEZ1, a neuronal protein that plays a role in axonal development.
In conclusion, ephrin signalling, HDAC6, ELP3, lipid and oxygen metabolism, and DNA repair were identified as potential new therapeutic targets for ALS. As a consequence, the impact of this workpackage has to be seen in the context of the development of new therapies. Over time, this could lead to a new therapy for this dreadful disease, which has a large socio-economic impact in our aging society.
Functional studies (WP9)
Amyotrophic lateral sclerosis is a devastating disorder, but it remains largely unclear why people get this disease. It is known that some ALS patients have mutations in their DNA that cause ALS (so-called ‘familial ALS’ patients), and several of the genes involved have been found by now. The majority of ALS patients, however, do not have mutations in these genes (so-called ‘sporadic ALS’ patients), but often have a slightly increased genetic risk because of subtle changes in other genes (so-called genetic risk factors). Millions of healthy people also carry these subtle changes, so the risk factors do not explain why certain people get ALS while others do not. They do, however, provide very important entry points for better understanding ALS: by studying what these mutations and genetic risk factors do at many different levels, work package 9 has worked to get a better understanding of which processes get disrupted. This has been done by developing novel methods for integrating different types of ‘omics’ data (including genomics, metabolomics and proteomics) and applying these to both human and mouse data generated within the Euro-MOTOR project. We used data on mutations from familial ALS patients and data from the genome-wide association study on sporadic ALS patients (conducted within Euro-MOTOR) to define a set of ‘core ALS’ genes. We developed a new method to interpret these ALS genes and identified two biological pathways as enriched in ALS cases. Using expression data, we highlighted several ALS genes that showed differential expression between cases and controls and also identified several genes whose expression levels are influenced by the ALS genetic risk factors. We ascertained how protein interactions in mouse models differ between mutant and wild-type mice and built networks to identify differences in the wiring of these interactions.
The computational methods we developed are publicly available as open-source software, allowing researchers to apply them to other diseases as well. The pathways and genes that we identified as playing a role in ALS provide leads for follow-up research and avenues for future drug development.

List of Websites:
www.euromotorproject.eu
Contact details of the project coordinator:
Leonard van den Berg, Professor of Neurology,
UMC Utrecht Room F02.202 P.O. box 85500, 3584 CX Utrecht, the Netherlands
