Periodic Reporting for period 1 - INGENE (Integrating Nutrient economy in phytoplankton GENomics and Evolution)
Reporting period: 2021-06-01 to 2023-05-31
Objectives 2 and 4 of INGENE aimed at analyzing the long-term signatures of N and P scarcity on phytoplankton genomes. I compiled a dataset of the N and P content of complete genomes and protein-coding genes for prokaryotic and eukaryotic species. I also compiled cellular P and N content from a literature survey and estimated these values for 5 additional species from measurements performed in the laboratory. I joined both datasets and found an association between the P content of the genome sequence and the total P content of the cell. Moreover, I estimated that the P content of the genome represents up to 57% of the total P content of the cell in phytoplankton species. This share is higher than for N: the N content of the genome represents up to 9% of the total N content of the cell. As a consequence, selection on DNA mutations that change the P or N content of the genome may have a non-negligible impact on genome evolution. I investigated this hypothesis by analyzing a mathematical model linking the growth rate of phytoplankton to the P requirement of a cell. This approach enabled me to explore the range of sizes at which deletion mutations, which decrease genome P content, are likely to be selected for in natural populations.
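To illustrate how the N and P content of a genome sequence can be counted, here is a minimal Python sketch. It is not the project's actual pipeline. It assumes one phosphate group per nucleotide (chromosome ends are ignored), a single double-stranded copy of the genome, and the number of N atoms given by the molecular formula of each base (A: 5, G: 5, C: 3, T: 2). The function names `genome_np_atoms` and `genome_share` are hypothetical.

```python
from collections import Counter

# Nitrogen atoms per DNA base, from the molecular formulas of the bases.
N_PER_BASE = {"A": 5, "C": 3, "G": 5, "T": 2}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def genome_np_atoms(seq):
    """Return (n_atoms, p_atoms) for one double-stranded copy of `seq`.

    Assumption: one phosphate per nucleotide on each strand.
    Symbols other than A, C, G, T (e.g. 'N') are skipped.
    """
    counts = Counter(seq.upper())
    n_atoms = 0
    p_atoms = 0
    for base, k in counts.items():
        if base not in N_PER_BASE:
            continue
        # Each position contributes the base and its complement.
        n_atoms += k * (N_PER_BASE[base] + N_PER_BASE[COMPLEMENT[base]])
        p_atoms += 2 * k
    return n_atoms, p_atoms


def genome_share(genome_atoms, cell_atoms):
    """Fraction of a cell's total N or P atoms that is held in the genome."""
    return genome_atoms / cell_atoms


if __name__ == "__main__":
    # Toy sequence only; real input would be a complete genome.
    print(genome_np_atoms("ATGC"))
```

Under these assumptions, an A-T pair carries 7 N atoms, a G-C pair carries 8, and every base pair carries 2 P atoms, so any deletion lowers genome P content in proportion to its length. Passing the genome count and a measured cellular total (both in atoms per cell) to `genome_share` gives the kind of fraction reported above.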
I also investigated the N economy in the RNA and proteins of phytoplankton. Amino acids, the molecules that form proteins, are coded by messenger RNA sequences of three nucleotides called codons. Some amino acids can be coded by more than one codon, and codons that code for the same amino acid are called synonymous codons. A higher frequency of N-cheap codons, that is, synonymous codons containing fewer N atoms, has previously been reported in plants and bacteria. The effect of changes in synonymous codon usage on the N requirements of messenger RNA scales with the number of mRNA copies, so we expect substantial gene-to-gene variation, as there is a difference of four orders of magnitude between lowly and highly expressed genes. I investigated whether selection for N content was detectable in highly transcribed genes and found evidence that codon usage is more biased towards N-poor codons in highly expressed genes. I will complete these analyses in the coming months.
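The N cost of synonymous codons can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than the analysis used in the project: it counts only the N atoms of the bases in the mRNA (A: 5, G: 5, C: 3, U: 2), and the function names and the example copy numbers are hypothetical.

```python
# Nitrogen atoms per RNA base, from the molecular formulas of the bases.
N_PER_RNA_BASE = {"A": 5, "C": 3, "G": 5, "U": 2}


def codon_n_atoms(codon):
    """Number of N atoms in the three bases of a codon (DNA or RNA letters)."""
    rna = codon.upper().replace("T", "U")
    return sum(N_PER_RNA_BASE[base] for base in rna)


def cheapest_codons(synonymous):
    """Return the codon(s) with the fewest N atoms among synonymous codons."""
    costs = {codon: codon_n_atoms(codon) for codon in synonymous}
    lowest = min(costs.values())
    return sorted(codon for codon, cost in costs.items() if cost == lowest)


def transcript_pool_n_saving(codon_from, codon_to, occurrences, mrna_copies):
    """N atoms saved across all mRNA copies of a gene by a codon substitution."""
    per_codon = codon_n_atoms(codon_from) - codon_n_atoms(codon_to)
    return per_codon * occurrences * mrna_copies


if __name__ == "__main__":
    glycine = ["GGU", "GGC", "GGA", "GGG"]  # synonymous codons, standard code
    print(cheapest_codons(glycine))
    # Same substitution in a lowly vs. a highly expressed gene
    # (copy numbers are illustrative, not measured values).
    print(transcript_pool_n_saving("GGG", "GGU", 10, 1))
    print(transcript_pool_n_saving("GGG", "GGU", 10, 10_000))
```

Because the saving is multiplied by the number of mRNA copies, the same synonymous change is worth four orders of magnitude more N in a gene with 10,000 transcripts than in a gene with a single transcript, which is why a stronger bias is expected in highly expressed genes.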
The third objective was to estimate the effect of N and P on phytoplankton mutations. In collaboration with Dr. Krasovec, a CNRS research fellow from the host laboratory, I performed mutation accumulation (MA) experiments using phytoplankton cultures grown at low N and P levels. We hypothesize that low N and P levels might promote the occurrence of mutations, as these low nutrient levels can be stressful for phytoplankton. Some of the mutation accumulation lines produced during the MA experiments have already been sequenced and the rest are currently being sequenced. I am currently analyzing the data.
INGENE integrates genomics, bioinformatics, and ecophysiology. The multidisciplinary framework provided by this project can be adapted to other resources (e.g. organic compounds) and organisms (e.g. yeast, bacteria, plants, animals). In this way, INGENE can contribute to investigations carried out in other model organisms and scientific fields.