CORDIS - EU research results

Structure and Dynamics of Low-Complexity Regions in Proteins: The Huntingtin Case

Periodic Reporting for period 4 - chemREPEAT (Structure and Dynamics of Low-Complexity Regions in Proteins: The Huntingtin Case)

Reporting period: 2021-03-01 to 2022-02-28

The main aim of chemREPEAT is to characterize structurally the protein huntingtin (Htt), the causative agent of Huntington’s Disease (HD), and to understand the structural bases of this pathology. The N-terminal region of Htt, the so-called exon1, contains a homorepeat (HR) region made up of a large number of consecutive glutamine residues. Individuals with more than 35 consecutive glutamines develop this fatal neurodegenerative disease. The structural characterization of Htt represents an enormous challenge due to its inherent flexibility, which precludes the use of X-ray crystallography, and its repetitive nature, which hampers the application of Nuclear Magnetic Resonance (NMR). Over the course of the project, we have developed robust biochemical and computational approaches to overcome these limitations and decipher the structural bases of HD.
A major development has been Site-Specific Isotopic Labelling, which enables the introduction of isotopically enriched amino acids at defined positions of the protein. By applying it to the glutamine tract of huntingtin, we have overcome the repetitive nature of the protein and studied it at the atomic level by NMR. The comparison of the structures of non-pathogenic and pathogenic versions of Htt Exon-1 has provided structural clues to what triggers aggregation. Concretely, we have discovered that the glutamine stretch is inherently helical and that the length and stability of this helix are correlated with the number of glutamines. This phenomenon is governed by the regions flanking the stretch, which propagate the formation of helical conformations. The increase in helical stability favours self-interaction and the formation of oligomers, which eventually aggregate into the amyloid fibrils found in HD patients’ brains. These results pave the way to novel therapeutic strategies based on selectively reducing the helix content of pathogenic Htt, strategies that could probably also be applied to other diseases linked to the abnormal expansion of glutamine tracts.
On the computational side, we have developed a large database of three-residue-long fragments derived from high-resolution experimental protein structures. Through a series of studies, we have demonstrated that these fragments are the minimal building blocks of proteins, as they contain the information needed to generate realistic protein models. We have exploited this database to generate large ensembles of proteins, including Htt, in agreement with experimental data, to identify partially folded regions in proteins, and to decipher folding pathways for small proteins. Our findings open new perspectives in the design of disordered proteins for biotechnological applications.
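The fragment-based building idea described above can be illustrated with a minimal sketch. It is not the project's implementation: the database `TOY_DB`, its dihedral values, and the function names are all hypothetical, and the sketch only samples backbone (phi, psi) angles for each central residue from its tripeptide context, without building coordinates or checking for steric clashes.

```python
import random
from typing import Dict, List, Tuple

Dihedrals = Tuple[float, float]  # (phi, psi) in degrees

# Hypothetical toy database: for each tripeptide, (phi, psi) values that the
# central residue might adopt. A real database would be derived from
# high-resolution structures; these numbers are illustrative only.
TOY_DB: Dict[str, List[Dihedrals]] = {
    "AQQ": [(-62.0, -41.0), (-70.0, 140.0)],
    "QQQ": [(-63.0, -43.0), (-65.0, -40.0), (-75.0, 145.0)],
    "QQP": [(-72.0, 150.0)],
}


def tripeptides(sequence: str) -> List[str]:
    """Return the overlapping tripeptides centred on each internal residue."""
    return [sequence[i - 1:i + 2] for i in range(1, len(sequence) - 1)]


def sample_conformation(sequence: str,
                        db: Dict[str, List[Dihedrals]],
                        rng: random.Random) -> List[Dihedrals]:
    """Pick one (phi, psi) pair per internal residue from its tripeptide entry."""
    angles = []
    for tri in tripeptides(sequence):
        if tri not in db:
            raise KeyError(f"tripeptide {tri!r} is missing from the database")
        angles.append(rng.choice(db[tri]))
    return angles


def build_ensemble(sequence: str,
                   db: Dict[str, List[Dihedrals]],
                   n_models: int,
                   seed: int = 0) -> List[List[Dihedrals]]:
    """Generate n_models independent dihedral-angle models of the sequence."""
    rng = random.Random(seed)
    return [sample_conformation(sequence, db, rng) for _ in range(n_models)]


if __name__ == "__main__":
    for model in build_ensemble("AQQQP", TOY_DB, n_models=3):
        print(model)
```

Sampling each residue from its own tripeptide entry keeps the sketch short; an actual builder would also need to convert the angles to atomic coordinates and reject models with overlapping atoms.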
Altogether, chemREPEAT has provided novel biochemical and computational methodologies to address crucial biological questions from a structural perspective. These methodologies have a broad range of applications in neurodegenerative and disorder-related diseases, protein design...
Here is a brief description of the work performed during chemREPEAT:
- Site-Specific Isotopic Labelling (SSIL): The key methodological development of chemREPEAT has been the ability to introduce isotopically enriched atoms at specific protein positions. We have shown that, by combining the tRNA suppression strategy with Cell-Free (CF) protein synthesis, these samples can be produced and studied by NMR. This approach was initially developed for glutamines (Urbanek et al., Angew. Chem. 2018) and subsequently for prolines (Urbanek et al., JACS 2020). During the last year, we have also developed SSIL for alanine. Technical details of the methodology have been described in Morató et al. (Biomolecules 2020) and are at the centre of a review (Urbanek et al., ChemBioChem 2020).
- Structural bases of the pathological threshold in HD: Using SSIL, we have investigated non-pathogenic, HttQ16 (Urbanek et al., Structure 2020), and pathogenic, HttQ46 and HttQ66 (ms in revision), versions of the protein. Comparing their structures yielded a novel perspective on the pathological threshold. Concretely, we assigned a prominent role to structure in the aggregation process. We have demonstrated that the stability of the helix is the main feature governing the aggregation kinetics and the structure of the resulting fibrils.
- Reduced cis population in poly-Proline (Poly-P): Using SSIL, we studied the cis/trans populations of individual prolines in Poly-P, using Htt as an example (Urbanek et al., JACS 2020). Our results unambiguously show that prolines placed in the inner positions of Poly-P have a reduced population of the cis conformation with respect to isolated ones.
- Site-specific incorporation of fluorinated amino acids: We have demonstrated that the aminoacyl tRNA synthetases (aaRS) used in SSIL to load tRNA also recognize non-natural amino acids with small modifications. We validated this feature by substituting hydrogen atoms with fluorine, which enables 19F-NMR experiments. We have applied this approach to fluoro-glutamine and three different fluoro-prolines (ms in preparation).
- Segmental labelling and SANS experiments: Taking advantage of the ability of CF to control the nature of the amino acids in a protein, we have undertaken a small-angle neutron scattering (SANS) study of Htt using segmentally deuterated samples in which we specifically deuterate or protonate Q and P.
- Bioinformatic analyses of homorepeat-containing proteins: In collaboration with the group of Miguel Andrade, we have studied the role of flanking regions in glutamine-rich proteins and their evolution (Mier et al., Comp Struct Biotech J. 2020). We found that leucine is highly enriched in the position preceding Poly-Q, whereas prolines are highly abundant in the C-flanking regions. We have also performed similar analyses for alanine-rich proteins (ms in revision).
- Robotics-inspired algorithms to study disordered proteins: Together with Juan Cortés, we have applied robotics-inspired algorithms to biomolecules. The main tool we use is a large database of tripeptide fragments derived from high-resolution structures. By concatenating these tripeptides, we built realistic ensembles of proteins that were in excellent agreement with experimental data (Estaña et al., Structure 2019). Moreover, we identified partially structured elements within disordered regions (Estaña et al., J Mol Biol 2020). By parallelizing the building algorithm (Estaña et al., Parallel Comput. 2018), we also used the database to decipher the folding pathways of small proteins (Estaña et al., Molecules 2019). We have also applied tailored statistical methods to robustly explore whether the conformation of a residue is influenced by its neighbours (ms in revision).
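The flanking-region analysis mentioned in the list above can be sketched in a few lines. This is an illustrative sketch only, not the published pipeline: the function names, the minimum tract length of four glutamines, and the five-residue C-flank window are assumptions chosen for the example.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def find_polyq_tracts(sequence: str, min_length: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) indices of runs of at least min_length glutamines.

    The sequence is expected in upper-case one-letter code; min_length is an
    assumed cut-off, not a value taken from the report.
    """
    pattern = re.compile(r"Q{%d,}" % min_length)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


def flank_composition(sequences: Iterable[str],
                      min_length: int = 4,
                      c_flank: int = 5) -> Tuple[Counter, Counter]:
    """Count the residue preceding each tract and the residues in its C-flank."""
    preceding: Counter = Counter()
    following: Counter = Counter()
    for seq in sequences:
        seq = seq.upper()
        for start, end in find_polyq_tracts(seq, min_length):
            if start > 0:
                preceding[seq[start - 1]] += 1
            # Slicing past the end of the string simply yields a shorter flank.
            following.update(seq[end:end + c_flank])
    return preceding, following


if __name__ == "__main__":
    # Made-up toy sequences, used only to exercise the functions.
    toy = ["MKLQQQQQPPAPG", "AALQQQQPPS", "GQQQA"]
    before, after = flank_composition(toy)
    print("preceding residue counts:", before)
    print("C-flank residue counts:", after)
```

A real enrichment analysis would compare these counts against background amino-acid frequencies over a full proteome; the sketch only tallies raw occurrences.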
Deciphering the structural bases that trigger pathogenicity in huntingtin is a breakthrough in the field. Our studies provide a structural perspective on the phenomenon of the pathological threshold in Huntington’s Disease. Concretely, we have identified unexpectedly persistent helical conformations in pathogenic huntingtin that act as a nucleation point for aggregation. Although it was well known that long poly-Q fragments induce the pathology, the fundamental aspects underlying this phenomenon were unknown. Overall, this result will change the way researchers understand Huntington’s Disease and the other Poly-Q-related diseases, and may eventually pave the way to new pharmacological avenues with the capacity to selectively reduce the helicity of the Poly-Q stretch.