Periodic Reporting for period 2 - IDPfun (Driving the functional characterization of intrinsically disordered proteins)
Reporting period: 2020-03-01 to 2023-08-31
Despite their importance, IDPs remain poorly characterized due to significant challenges encountered in both experimental and computational studies. Capturing the properties of dynamic systems is challenging at the experimental level, while modeling IDPs is difficult computationally due to the lack of standardized and integrated data. Computational methods have aided in identifying IDPs based on their amino acid sequences, and curated databases dedicated to IDPs have provided a starting point for their analysis. However, our functional understanding of IDPs remains limited.
Advancing our knowledge of IDPs has substantial societal impact, given their involvement in numerous diseases.
IDPfun is an international consortium focused on expanding our understanding of IDP functions. The consortium was established based on fruitful collaborations initiated during the NGP-net COST Action (2015-2019). Leveraging existing state-of-the-art computational tools and databases, many of which were developed by IDPfun members, the consortium provided a collaborative environment for researching novel approaches to model different aspects of IDP behavior and enhance the accessibility of IDP data.
The second significant achievement of IDPfun pertains to the standardization and integration of IDP data into public databases while adopting best practices. The IDPfun consortium developed the Minimum Information About Disorder Experiments (MIADE) guidelines in collaboration with the IDP working group of the Human Proteome Organization - Proteomics Standards Initiative (HUPO-PSI) and the ELIXIR IDP Community. It also extended the Evidence & Conclusion Ontology (ECO) with new terms for describing IDP experiments and created curation guidelines for community annotation of IDPs from the literature. The consortium organized the Critical Assessment of protein Intrinsic Disorder (CAID) challenge, which evaluates IDP prediction methods and promotes standardization of IDP software. CAID is globally recognized, connected with ELIXIR services, and coordinated with the Critical Assessment of protein Structure Prediction (CASP). In addition, IDPfun established ontological definitions for describing IDP functions. New terms were integrated into the Gene Ontology (GO), and IDPfun actively participated in the organization of the Critical Assessment of Function Annotation (CAFA), which evaluates function prediction methods.
The third major accomplishment of the IDPfun project involved the integration of IDP knowledge into public databases. New IDP models were incorporated into IDPfun-curated databases such as DisProt, PED, ELM, and FuzDB. Additionally, the output of the developed software tools was integrated into the MobiDB aggregation database. Standardizing the description of IDP function and the output format of prediction tools made data exchange possible between IDPfun resources and core resources maintained at the European Bioinformatics Institute (EBI), including InterPro, PDBe-KB, and UniProtKB. Moreover, the standardization of IDP ensemble structural data enabled the aggregation of PED records with experimental structural data from the Protein Data Bank (PDB) consortium in the 3D-Beacons system, providing a centralized interface for accessing all publicly available structural data.
The scientific and technological achievements that contributed to the overall mission of the IDPfun research project were disseminated through activities aimed at raising awareness of IDPs in the life sciences community: integrating IDP knowledge into major databases; publishing in high-impact journals; incorporating IDP-related challenges into international bioinformatics assessments (CAFA and CASP); organizing conferences, online seminars, and hackathons; maintaining an active presence on social networks; and establishing partnerships with other actions (PhasAGE, ML4NGP, etc.) and consortia (ELIXIR, GO, ECO, HUPO-PSI, InterPro, UniProtKB, PDB, etc.). The IDPfun project also trained researchers to use IDP resources through protocol articles, webinars, training schools, and bootcamps.
An important indicator of the impact of the IDPfun project is the total number of citations accumulated by articles associated with the project, which exceeds 3,000. This result is notable given that the majority of these works were published recently.
IDPfun has also enhanced the impact of European scientific advancements in Latin America and vice versa. This collaboration has resulted in the inclusion of Argentina in the list of target countries outside of Europe for new collaborations within the ELIXIR consortium.
Through its outreach activities, IDPfun has contributed to making research careers in the bioinformatics sector more attractive and has encouraged young individuals to pursue such careers in both Europe and Latin America. Additionally, the project has paved the way for the creation of innovative spin-off companies, opening up novel opportunities for entrepreneurship.