Final Report Summary - EBIODIP (The emergence of bioenergetic diversity in prokaryotes.)
Prokaryotes use a wide variety of bioenergetic pathways but the order of emergence of these pathways and their evolutionary relationships are still unresolved issues. The aim of the project was to examine how this metabolic diversity evolved, i.e. whether each pathway evolved independently, whether they all evolved from a common ancestral metabolic mode, or whether parts of pre-existing pathways were co-opted to evolve into new pathways. This analysis was expected to yield insights into the origin of key innovations and the evolutionary flexibility of electron carriers, with potential applications in microbial fuel cell technology.
Approach:
Most bioenergetic pathways are based on an electron transport chain (ETC) which generates a proton gradient across a bioenergetic membrane; the energetically-favourable movement of protons down the gradient is then coupled to ATP synthesis. The electron transport chains of disparate pathways have a similar general structure, being composed of protein complexes acting as electron donors and acceptors, with a central cytochrome bc-type complex and mobile electron carriers between them. Our analysis was based on comparative genomics and phylogenetics of the protein complexes in different pathways which have equivalent functions, in order to establish their homology and evolutionary relationships.
In parallel, we wanted to create a database to facilitate the comparison of bioenergetic pathways, and to examine metagenomic data for the identification of novel bioenergetic enzymes. Such enzymes are expected to have sequences that are divergent from, but homologous to, those of known enzymes, and may represent novel functions acquired through point mutations in crucial residues, insertions/deletions or domain shuffling. Sequence analysis of conserved domains, predicted structure, etc. can then be used to hypothesize a putative role for any novel proteins identified.
Objectives:
Objective (1): Analysis of the molecular evolution of the major protein complexes in electron transport chains (ATP synthase, cytochrome bc-type complexes, electron donors and electron acceptors).
Objective (2): Development of a database to facilitate the comparison of bioenergetic pathways.
Objective (3): Identification of novel bioenergetic enzymes from metagenomic data.
Results:
The molecular evolution analysis of the ATP synthase and the cytochrome bc-type complexes has been completed, resulting in two publications (of which one is currently under review). Using data from 272 species of fully sequenced bacteria and archaea, which represent the full diversity of prokaryotic lineages and multiple bioenergetic modes, we examined how ATP synthase complexes map onto the 16S rRNA tree: specifically, whether they group according to the type of ETC, or whether they follow taxonomy. The main conclusions are that:
- the distribution of bioenergetic modes is "patchy" (non-monophyletic), which suggests either frequent independent innovation or, more likely, gene transfer between unrelated species
- phylogenies based on all the ATP synthase subunits show species grouping according to 16S rRNA phylogeny, and not according to bioenergetic mode, indicating an ancient origin of this protein complex, and suggesting that no special modifications are needed for the ATP synthase to work with different electron transport chains
- examination of the ATP synthase genetic locus shows various gene duplications and rearrangements of the ATP synthase subunits in different lineages, which suggest further flexibility and robustness in the control of ATP synthesis.
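The notion of a "patchy" (non-monophyletic) distribution in the first conclusion above can be illustrated with a minimal sketch. It assumes a toy rooted tree written as nested tuples; the species names and mode labels are hypothetical placeholders, not project data, and this is not the phylogenetic software used in the project. A mode counts as monophyletic here if some clade contains exactly the species showing that mode.

```python
def leaves(node):
    """Return the set of leaf names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some clade in the tree contains exactly the species in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


def patchy_modes(tree, mode_by_species):
    """Return the modes whose species do not form a single clade."""
    species_by_mode = {}
    for species, mode in mode_by_species.items():
        species_by_mode.setdefault(mode, set()).add(species)
    return sorted(
        mode for mode, species in species_by_mode.items()
        if not is_monophyletic(tree, species)
    )


# Hypothetical species tree and mode labels, for illustration only.
species_tree = ((("sp_A", "sp_B"), "sp_C"), ("sp_D", "sp_E"))
mode_by_species = {
    "sp_A": "mode_1", "sp_B": "mode_1",   # sister species: one clade
    "sp_C": "mode_2", "sp_D": "mode_2",   # split across the root: patchy
    "sp_E": "mode_3",
}

print(patchy_modes(species_tree, mode_by_species))
```

In this toy case only `mode_2` is reported as patchy, because its two species sit on opposite sides of the root. A patchy result alone cannot distinguish repeated innovation from gene transfer; that distinction relies on the gene phylogenies discussed above.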
We then focused on the evolutionary relationships of different families of b-type cytochromes, which form part of a variety of bioenergetic enzymes (the cytochrome b6f complex, ubiquinol and menaquinol reductases, formate dehydrogenases, Ni/Fe-hydrogenases, and succinate dehydrogenase). Using the same species selection as for the ATP synthase analysis, we examined the distribution of these cytochromes across lineages, and asked whether sequences from different species group by cytochrome b family, by bioenergetic mode or by taxonomic group. We also re-examined data from previous studies using this expanded sample of organisms spanning the full diversity of prokaryotic lineages. The main conclusions are that:
- Species do not group based on bioenergetic mode.
- Different cytochrome b types are found in many lineages of the bacteria and archaea, and form distinct groups in phylogenetic analysis, which indicates an ancient origin for this complex, and diversification of different cytochrome b types before the diversification of lineages.
- In some cases, there is evidence of rampant horizontal gene transfer between different species of the bacteria and archaea.
- The results of the phylogenetic analysis allow the assignment of many "hypothetical" sequences to specific orthology groups.
- The b6 cytochrome of the b6f complex is not unique to photosynthetic organisms, and has experienced multiple fission events throughout evolution.
- Analysis of annotated and hypothetical prokaryotic cytochrome b561 sequences indicates that this protein is not restricted to eukaryotes, contrary to what was previously suggested.
The database is intended to serve as a reference point for available data on bioenergetic pathways, organized for quick visual comprehension. The main gap it addresses is that many bioenergetic pathways are not properly catalogued in other databases, such as KEGG, and are therefore not covered by routine automated analyses (e.g. metabolic analysis of metagenomic data through MG-RAST). We envision that the database may be made available in the future, in which case we will aim to accompany it with a publication.
The analysis of metagenomic data for the identification of novel enzymes ran into technical difficulties with the databases and tools currently available. We predict that our method can only be used to look for novel enzymes if the sequences are quite long (i.e. >300 amino acids) and belong to families with no known paralogs (i.e. genes which have not been duplicated in evolution). We also examined the taxonomic diversity of metagenomic datasets from the human gut microbiome in order to assess their bioenergetic diversity and potential, i.e. which prokaryotic metabolisms can be supported in such an environment. Although human gut microbiota are known to be dominated by prominent species of the Proteobacteria and Firmicutes, our analysis showed the presence of various species which have previously been reported only from extreme environments, opening up a new area of analysis. We are currently preparing a publication to report on these data.
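The two applicability criteria stated above (length above 300 amino acids, no known paralogs) can be expressed as a simple pre-filter. The sketch below is illustrative only: the `Candidate` record, its field names and the family labels are hypothetical, and it assumes that each sequence has already been assigned to a family by a separate homology search.

```python
from dataclasses import dataclass

MIN_LENGTH = 300  # amino acids; threshold stated in the report


@dataclass
class Candidate:
    seq_id: str    # hypothetical sequence identifier
    protein: str   # amino acid sequence
    family: str    # hypothetical family label from a prior homology search


def usable_candidates(candidates, families_with_paralogs):
    """Keep sequences longer than MIN_LENGTH whose family has no known paralogs."""
    return [
        c for c in candidates
        if len(c.protein) > MIN_LENGTH and c.family not in families_with_paralogs
    ]


# Made-up records for illustration; the sequences are placeholder strings.
records = [
    Candidate("read_1", "M" * 350, "fam_x"),  # long, single-copy family: kept
    Candidate("read_2", "M" * 120, "fam_x"),  # too short: dropped
    Candidate("read_3", "M" * 400, "fam_y"),  # family with paralogs: dropped
]
kept = usable_candidates(records, families_with_paralogs={"fam_y"})
print([c.seq_id for c in kept])
```

Applying the filter before any phylogenetic placement keeps the analysis restricted to cases where a divergent sequence can be assigned to an orthology group without ambiguity from duplicated genes.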
Relevance and Impact:
The results of the project relating to the molecular evolution of the core enzymes of electron transport chains add to existing knowledge and indicate that certain parts of the system are flexible, e.g. the ATP synthase, which can function regardless of the type of ETC. They also reveal cases of horizontal gene transfer, e.g. for the b-type cytochromes. While the picture of how different ETCs evolved is not yet complete, we believe that our data have contributed to the field.