Final Report Summary - MILESTONE (A putative mechanism coupling DNA replication and translation in archaea conserved in eukaryotes)
Living organisms are divided into three domains: bacteria, archaea and eukaryotes. Although archaea resemble bacteria in size, shape and in the absence of a nucleus, their central molecular mechanisms are more similar to those of eukaryotes. Until now, most studies on archaea have focused on systems already known and well described in eukaryotes or bacteria.
In the last few years, whole genome sequencing has delivered a vast amount of information, as well as many unexpected observations that still require experimental work to be fully understood. In particular, genome data mining now makes it possible to identify molecular systems shared by archaea and eukaryotes that are either poorly characterized or as yet unknown in eukaryotes. The main objective of this project was to test an idea that emerged from archaeal genome context analysis: the existence of a mechanism coupling DNA replication and protein synthesis (translation). This hypothesis is based on the discovery of conserved clusters of genes encoding specific sets of DNA replication proteins lying adjacent to genes encoding specific sets of proteins involved in translation. The presence of eukaryotic orthologue(s) to each of the archaeal proteins present in these clusters suggests that this mechanism is conserved between archaea and eukaryotes (Berthon et al., 2008, 2009).
Genome context analysis has already proved fruitful for identifying new biological functions. For instance, two new families of nucleases (the NurA family) and helicases (the HerA family) involved in DNA repair/recombination were discovered because the genes encoding these proteins (previously of unknown function) were frequently located in archaeal genomes close to the DNA repair/recombination genes mre11 and rad50 (Constantinesco et al., 2002, 2004). The rationale behind this approach is that, as in bacteria, archaeal genes encoding similar or coordinated functions are often clustered in operons (this is the case, for instance, for genes encoding several ribosomal proteins, large RNA polymerase subunits, or components of the exosome). Importantly, given the similarity between archaeal and eukaryotic molecular biology, the function of an unknown eukaryotic protein can sometimes be predicted from the genome context analysis of its archaeal orthologue.
The long-term interest of the host laboratory in DNA replication led them to study systematically the genome context of genes encoding DNA replication proteins in archaeal genomes. A number of genes encoding DNA replication proteins were found to be systematically associated in similar clusters, even within genomes of distantly related archaea, such as euryarchaea and crenarchaea (Berthon et al., 2008). Surprisingly, two of these clusters were found to be very often located in the immediate vicinity of conserved clusters grouping a specific set of genes encoding proteins a priori involved in translation, leading to the identification of two specific clusters grouping both DNA replication and translation genes (Figure 1).
The first of these two clusters includes seven genes that colocalize in numerous distantly related archaeal genomes. Three of them encode the DNA replication proteins PCNA (the clamp that tightly tethers several DNA replication and repair proteins to DNA), PriS (the small subunit of the DNA primase), and Gins15 (one of the two subunits of the essential replication GINS complex), whereas the four others encode the ribosomal proteins L44E and S27E, the alpha subunit of the translation initiation factor aIF2, and the protein Nop10, which is involved in ribosome biogenesis. The seven genes of this large cluster ('the PPsGLSIN cluster') are adjacent, always organized in the same order, and transcribed in the same direction in most genomes from crenarchaea and in a number of genomes from euryarchaea.
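The three criteria stated above (adjacent genes, same order, same direction of transcription) can be illustrated with a minimal sketch. It is not the pipeline used in the cited analysis: the gene identifiers, the toy genomes and the flanking genes are hypothetical, and the gene order is an assumption read off the cluster acronym.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    strand: str  # "+" or "-"


# Assumed transcription order, read off the acronym PPsGLSIN (an assumption).
PPSGLSIN = ["pcna", "priS", "gins15", "l44e", "s27e", "aif2alpha", "nop10"]


def has_conserved_cluster(genome, cluster):
    """Return True if the cluster genes are adjacent, all on one strand,
    and in the expected order relative to the direction of transcription."""
    n = len(cluster)
    for start in range(len(genome) - n + 1):
        window = genome[start:start + n]
        names = [g.name for g in window]
        strands = {g.strand for g in window}
        # Forward strand: genome order equals transcription order.
        if strands == {"+"} and names == cluster:
            return True
        # Reverse strand: genome order is the mirror image.
        if strands == {"-"} and names == cluster[::-1]:
            return True
    return False


def make(names, strand):
    """Build a run of genes that all lie on the same strand."""
    return [Gene(name, strand) for name in names]


# Hypothetical toy genomes (x1, x2, y1, z1 are made-up flanking genes).
genome_a = make(["x1"], "-") + make(PPSGLSIN, "+") + make(["x2"], "+")
genome_b = make(["y1"], "+") + make(PPSGLSIN[::-1], "-")
genome_c = make(["pcna", "priS", "z1", "gins15", "l44e", "s27e",
                 "aif2alpha", "nop10"], "+")  # adjacency broken by z1

genomes = {"A": genome_a, "B": genome_b, "C": genome_c}
conserved_in = [k for k, g in genomes.items()
                if has_conserved_cluster(g, PPSGLSIN)]
print(conserved_in)
```

A real analysis would work from annotated genome files and orthologue assignments rather than from gene names, and would tolerate partial clusters.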
The second cluster includes genes encoding the DNA replicative helicase MCM, the second subunit of the DNA replication GINS complex, Gins23, and the beta subunit of aIF2 (the GMI cluster). It is striking that, whereas the gene encoding the alpha subunit of aIF2 belongs to the PPsGLSIN cluster, the gene encoding its beta subunit belongs to the GMI cluster. One could have expected the genes encoding the alpha and beta subunits of aIF2 to be sometimes grouped together, but this is never the case. The existence and conservation across widely divergent species of the PPsGLSIN and GMI clusters strongly suggest the existence of an unknown regulatory mechanism coupling DNA replication and protein synthesis in archaea.
Interestingly, all proteins of the PPsGLSIN and GMI clusters have homologues in eukaryotes, further raising the possibility that the putative network predicted by the genome context analysis also exists in eukaryotes. In contrast, these proteins have no homologue in bacteria, with the exception of PCNA, which is distantly related to the bacterial beta-clamp.
Coupling protein synthesis with DNA replication would make sense as a way to coordinate DNA synthesis with the capacity of the cell to divide, which is directly related to the amount of nutrients available in the environment. Regulation of replication through the stringent response has indeed recently been shown to occur in Bacillus subtilis. Wang et al. (2007) have shown that the rate of DNA chain elongation in bacteria is coupled to protein synthesis via the alarmone of the stringent response, ppGpp. The key element of this regulatory system is the inhibition of the primase DnaG by a direct interaction with ppGpp. However, the archaeal/eukaryotic primase is evolutionarily unrelated to DnaG, and archaea and eukaryotes (apart from plants) do not produce ppGpp (Silverman and Atherly, 1979), suggesting that any putative regulatory network coupling DNA replication and translation in these two domains of life must be very different. Berthon and Forterre have proposed that this putative network does indeed exist and that the PPsGLSIN and GMI clusters revealed by the in silico analysis are part of it.
Several studies have lent some support to this hypothesis. In particular, the presence of MCM in the GMI cluster can be related to a recent observation made in the host laboratory using ChIP-Chip analysis, which revealed that the MCM protein of the hyperthermophilic archaeon Pyrococcus abyssi binds preferentially to the replication origin of this organism in exponential growth phase, but is relocated to the ribosomal operons in stationary phase (Matsunaga et al., 2007). It is thus possible that MCM is somehow linked to the newly synthesized ribosomes still bound to the rDNA genes via the ribosomal RNA. This also correlates with an observation of Du and Stillman (2002), who reported that ORC and MCM associate in yeast into a complex with proteins involved in ribosome biosynthesis.
The presence of the ribosomal protein S27E in the PPsGLSIN cluster is also especially interesting. S27E indeed has two human homologues, S27E and S27L, which both play a significant role in cancer development. This suggests that, besides their roles as ribosomal proteins, these proteins have a regulatory function in cell proliferation. Human S27E was first identified as a growth factor–induced gene and called metallopanstimulin 1 (MPS-1). It is considered an oncogene and was reported to mediate cellular proliferation in response to various growth factors and other environmental signals. Wang et al. (2006) showed that its inactivation inhibited growth and tumorigenesis and led to an increase in spontaneous apoptosis in gastric cancer cells. Moreover, eukaryotic S27E binds single- and double-stranded DNA, and could be involved in DNA and/or RNA transaction processes such as repair or double-strand break recognition.
The other human homologue, S27L, was found to be the first ribosomal protein transcriptionally regulated by the tumor suppressor p53 (He and Sun, 2007; Li et al., 2007), and has a key position in determining cell fate (cell cycle arrest or apoptosis) through its ability to activate p21, which is itself an inhibitor of proapoptotic effectors. Through this activity, it also regulates cellular checkpoints, DNA repair, and chromosome stability, and “could be a promising pharmacologic target for modulating sensitivity to DNA-damaging chemotherapeutic agents”.
The aim of the project was therefore to study experimentally the hints given by bioinformatics analysis, using a number of approaches:
- In vitro, the aim was to detect any direct interactions between the proteins encoded by the gene clusters and to determine their effect on the activity of the interacting partners.
- In vivo, the aim was to understand the transcriptional regulation of the genes: first, by determining whether they are transcribed as operons from a single promoter; second, by analysing the transcriptional response after either inhibiting replication or creating a situation of amino acid depletion (inhibition of translation). The final aim was then to identify the actors and the mechanism of such regulation.
Results
I started the project by analysing direct interactions between the proteins encoded in the clusters. This was done by cloning each of the genes (gins15, gins23, priS, priL, PCNA, MCM, nop10, aIF-2alpha, aIF-2beta, S27E and L44E) of the hyperthermophile Pyrococcus furiosus into expression plasmids (containing either a his-tag or a flag-tag fusion sequence) and co-expressing them, in pairs, in Escherichia coli. Co-purification was then attempted using one tag or the other, and the purified proteins were analysed by electrophoresis to assess the extent of interaction.
Using this method, I was able to confirm previously known interactions, for example between the proteins Gins15 and Gins23, or Gins23 and MCM. However, I could not demonstrate any new interaction between the translation and replication proteins.
To test whether multi-partner interactions were needed, some of the genes were cloned as polycistrons, and therefore expressed as is probably the case in vivo, but no new interaction could be seen.
We therefore concluded that direct interactions between the proteins encoded by the two clusters may not occur in vivo, or that these interactions are too transient to be detected by this approach.
We next wanted to study the transcriptional regulation of the clusters. To check whether they were transcribed as polycistrons, RT-PCR experiments were carried out on total RNA extracted from P. furiosus and Sulfolobus acidocaldarius. Primers were designed so that the PCR fragments overlapped, and we thus showed that the clusters were indeed transcribed as operons in both organisms. Another study, using Sulfolobus solfataricus, also confirmed that the clusters observed in silico are transcribed as operons (Wurtzel et al., 2010).
For these two model organisms, the structures of the transcription units are shown in Figure 2.
In addition, two other genes were shown to be part of the operons: minD and pace12. MinD is a protein known to be involved in cell shape and cell division through the regulation of the positioning of the Z-ring, where the septum is formed. PACE12 is a "Protein from Archaea without assigned function that is Conserved in Eukarya" and has been predicted to be involved in cell division or chromosome regulation (Matte-Taillez et al., 2000). This confirms once more that proteins involved in the same processes are encoded together.
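The reasoning behind the overlapping RT-PCR fragments can be sketched as follows: if a product spanning the junction between two neighbouring genes is amplified from RNA, the two genes lie on the same transcript. The gene list and the detection results below are hypothetical placeholders, not the experimental data of this project.

```python
def transcription_units(genes, junction_detected):
    """Group consecutive genes into transcription units.

    genes: gene names in genome order.
    junction_detected: maps (left_gene, right_gene) to True when an RT-PCR
    product spanning that junction was observed. Missing pairs count as
    not detected.
    """
    if not genes:
        return []
    units = [[genes[0]]]
    for left, right in zip(genes, genes[1:]):
        if junction_detected.get((left, right), False):
            units[-1].append(right)   # same transcript as the previous gene
        else:
            units.append([right])     # no spanning product: new unit
    return units


# Hypothetical example: three co-transcribed genes followed by an
# unrelated neighbour ("geneX" is a made-up name).
genes = ["mcm", "gins23", "aif2beta", "geneX"]
results = {
    ("mcm", "gins23"): True,
    ("gins23", "aif2beta"): True,
    ("aif2beta", "geneX"): False,
}

units = transcription_units(genes, results)
print(units)
```

This only captures the inference step; a missing product could also reflect a failed reaction, so negative junctions would need to be interpreted with care.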
We wanted to further characterize the regulation of these clusters when protein synthesis is inhibited. To mimic this situation, we used pseudomonic acid, which is an inhibitor of isoleucyl-tRNA synthetase. It mimics a lack of isoleucine by preventing the charging of tRNA with this amino acid. In bacteria, uncharged tRNAs bind the ribosome and activate what is known as the stringent response (Potrykus and Cashel, 2008), which stops protein synthesis.
RT-qPCR experiments were designed to study gene expression under our conditions of interest. For these, S. acidocaldarius was used for practical reasons: in contrast to P. furiosus, this organism can grow at a moderate temperature (75°C), aerobically, with a generation time of 4 h.
Two conditions were used: a normal growth condition, in rich medium, and a condition where pseudomonic acid was added after one generation; samples were taken every 2 h over 3 generations.
Before trying to interpret the results, reference genes had to be defined for the qPCR experiments. Based on a transcriptomic study (Wurtzel et al., 2010), I tested 9 genes that seemed to be stably expressed throughout growth (agl, glcV, rnaseP, rnhB, saci0839, saci0609, saci0957, slaA and sodF). Unfortunately, none of them seemed stable enough in both conditions tested. This would have required further study, but I did not have time to finish those experiments.
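One simple way to screen reference gene candidates is to look at the spread of their quantification cycle (Cq) values across all samples from both conditions. The sketch below ranks candidates by the standard deviation of Cq. The Cq values are invented for illustration, the 0.5-cycle cut-off is an arbitrary assumption, and this is not the criterion that was actually applied in the project; dedicated tools such as geNorm or NormFinder use more elaborate stability measures.

```python
from statistics import stdev


def rank_reference_candidates(cq_by_gene, max_sd=0.5):
    """Rank candidate reference genes by the standard deviation of Cq.

    cq_by_gene: maps gene name to Cq values pooled over all samples
    (both growth conditions, all time points).
    max_sd: assumed acceptance threshold, in cycles (arbitrary).
    Returns (gene, sd, accepted) tuples, most stable first.
    """
    scored = sorted((stdev(values), gene)
                    for gene, values in cq_by_gene.items())
    return [(gene, sd, sd <= max_sd) for sd, gene in scored]


# Hypothetical Cq values: gene names come from the candidate list above,
# the numbers are made up.
cq_values = {
    "slaA": [18.0, 18.2, 17.9, 18.1],
    "sodF": [20.0, 21.5, 22.8, 24.0],
}

ranking = rank_reference_candidates(cq_values)
for gene, sd, accepted in ranking:
    print(f"{gene}: SD = {sd:.2f} cycles, accepted = {accepted}")
```

Raw Cq spread also reflects differences in RNA input between samples, so a candidate rejected here is not necessarily unstable.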
Looking in more detail at the mechanism described in eukaryotes for amino acid starvation, I could however identify a few genes and proteins worth studying. In yeast, general amino acid control is triggered by the detection of uncharged tRNAs, through the phosphorylation of the alpha subunit of the translation initiation factor eIF2 (Figure 3) by the kinase GCN2. Phosphorylated eIF2alpha then stably binds eIF2B, which can no longer recycle eIF2-GDP into eIF2-GTP, thereby preventing eIF2 from rebinding the initiator tRNA.
This blocks new translation of proteins, and at the same time activates the transcription regulator GCN4, which in turn activates the transcription of amino acid synthesis genes. At the same time, the cell cycle is arrested by an unknown mechanism (Hinnebusch, 2005; Grallert and Boye, 2007).
Unfortunately, GCN2 and GCN4 do not seem to have any homologues in Archaea. The subunits eIF2alpha, beta, gamma, and the factor eIF2B have archaeal homologues, although the function of aIF2B is unknown.
However, one study showed that a kinase (PH0512) can phosphorylate aIF2alpha in P. horikoshii (Tahara et al., 2004). I therefore checked whether this kinase is conserved among archaea; its homologue in S. acidocaldarius is Saci0965. A protein alignment across representative Archaea clearly shows that not only is this putative kinase conserved, but its gene is also systematically clustered with two other genes encoding RNA-related proteins (Figure 4): an "RNA processing protein" and another translation initiation factor, aIF1alpha. As can be seen in Figure 4, the gene encoding aIF2beta is also sometimes found in this cluster, specifically in genomes where it is not located close to mcm and gins23 (S. acidocaldarius or N. maritimus, for instance). This result seems to support the hypothesis that Saci0965 phosphorylates one or more translation initiation factors.
To confirm this, I started to clone and purify S. acidocaldarius aIF2alpha and aIF1alpha in order to assay their phosphorylation by Saci0965. My aim was then to study the effect of this phosphorylation on the stability of the aIF2 complex, on its interaction with aIF2B, and on tRNAi and GTP binding.
Unfortunately, I did not have time to continue this promising study, but we are nevertheless working on a way to publish a communication on these results.
References
Berthon J, Cortez D, Forterre P. (2008) Genomic context analysis in Archaea suggests previously unrecognized links between DNA replication and translation. Genome Biol 9:R71.
Berthon J, Fujikane R, Forterre P. (2009) When DNA replication and protein synthesis come together. Trends Biochem Sci 34:429-34.
Constantinesco F, Forterre P, Elie C. (2002) NurA, a novel 5'-3' nuclease gene linked to rad50 and mre11 homologs of thermophilic Archaea. EMBO Rep 3:537-42.
Constantinesco F, Forterre P, Koonin EV, Aravind L, Elie C. (2004) A bipolar DNA helicase gene, herA, clusters with rad50, mre11 and nurA genes in thermophilic archaea. Nucleic Acids Res 32:1439-47.
Du YC, Stillman B. (2002) Yph1p, an ORC-interacting protein: potential links between cell proliferation control, DNA replication, and ribosome biogenesis. Cell 109:835-48.
Grallert and Boye (2007)
He H, Sun Y (2007) Ribosomal protein S27L is a direct p53 target that regulates apoptosis. Oncogene 26:2707-16.
Hinnebusch AG (2005) Translational regulation of GCN4 and the general amino acid control of yeast. Annu Rev Microbiol 59:407-450.
Li J, Tan J, Zhuang L, Banerjee B, Yang X, Chau JF, Lee PL, Hande MP, Li B, Yu Q. (2007) Ribosomal protein S27-like, a p53-inducible modulator of cell fate in response to genotoxic stress. Cancer Res 67:11317-26.
Matsunaga F, Glatigny A, Mucchielli-Giorgi MH, Agier N, Delacroix H, Marisa L, Durosay P, Ishino Y, Aggerbeck L, Forterre P. (2007) Genomewide and biochemical analyses of DNA-binding activity of Cdc6/Orc1 and Mcm proteins in Pyrococcus sp. Nucleic Acids Res 35:3214-22.
Matte-Taillez O, Forterre P, Zivanovic Y. (2000) Mining archaeal proteomes for eukaryotic proteins with novel functions: the PACE case. Trends Genet 16:533-536.
Perrochia L*, Crozat E* , Hecker A, Zhang W, Bareille J, Collinet B, van Tilbeurgh H, Forterre P, Basta T (2013) In vitro biosynthesis of a universal t6A tRNA modification in Archaea and Eukarya. Nucleic Acids Res 41:1953-64.
Potrykus K and Cashel M (2008) (p)ppGpp: still magical? Annu Rev Microbiol 62:35-61.
Silverman RH, Atherly AG (1979) The search for guanosine tetraphosphate (ppGpp) and other unusual nucleotides in eucaryotes. Microbiol Rev 43:27-41.
Tahara M, Ohsawa A, Saito S, Kimura M (2004) In vitro phosphorylation of initiation factor 2 alpha (aIF2α) from hyperthermophilic archaeon Pyrococcus horikoshii OT3. J Biochem 135:479-485.
Toffano-Nioche C, Ott A, Crozat E, Nguyen NA, Zytnicki M, Leclerc F, Forterre P, Bouloc P, Gautheret D. RNA at 92°C: the non-coding transcriptome of the hyperthermophile Archaea Pyrococcus abyssi; submitted.
Wang JD, Sanders GM, Grossman AD. (2007) Nutritional control of elongation of DNA replication by (p)ppGpp. Cell 128:865-75.
Wang YW, Qu Y, Li JF, Chen XH, Liu BY, Gu QL, Zhu ZG. (2006) In vitro and in vivo evidence of metallopanstimulin-1 in gastric cancer progression and tumorigenicity. Clin Cancer Res 12:4965-73.
Wurtzel O, Sapra R, Chen F, Zhu Y, Simmons BA, Sorek R. (2010) A single-base resolution map of an archaeal transcriptome. Genome Res 20:133-141.