Identification and Characterisation of the Sex Locus in the Dioecious Moss Ceratodon purpureus

Final Report Summary - CERATOSEX (Identification and Characterisation of the Sex Locus in the Dioecious Moss Ceratodon purpureus)

The original proposal described four objectives:
Objective 1: Identification of genes expressed specifically during differentiation of male and female sex organs.
Objective 2: Identification of X- and Y-chromosome localised sex-related genes
Objective 3: Evolutionary analysis between male and female sex locus genes
Objective 4: Functional characterisation of X- and Y-chromosome localised sex-related genes
Objectives 1 and 2: Identify sex-related genes and sex-linked genes
These objectives rely on NGS-derived transcriptome analyses of male and female plants of C. purpureus.
Work performed and results obtained:
Transcriptome data have been generated by RNA-seq analysis of gametophore tissue from the female and male strains (GG1 and R40) in the laboratory of our collaborator, Dr Stuart McDaniel at the University of Florida. These data are currently being analysed to identify genes differentially expressed between the sexes.
To identify sex-linked genes, we intended to use an NGS-based transcriptome analysis of bulked male and female segregants derived from a GG1 x R40 cross produced in Dr McDaniel's laboratory (McDaniel SF et al. (2007) Genetics 176, 2489-2500). Unfortunately, this mapping population suffered severe microbial contamination and was consequently not available for this project.
To circumvent this setback, two approaches were taken: (i) generation of a new GG1 x R40 population in the Leeds laboratory, and (ii) sampling of a naturally derived population by collecting spores from independent sporogonia obtained from a UK isolate. The second approach was undertaken both as an insurance policy against failure to obtain a new GG1 x R40 mapping population, and as a strategy for investigating genetic diversity and gene flow in a natural population of C. purpureus. This precaution proved necessary: the GG1 strain was extraordinarily refractory to fertilisation under laboratory conditions, and the attempt to generate another GG1 x R40 population was not successful.
For the natural population, RNA-seq libraries were generated and sequenced for 16 individuals: a set of four siblings (2 male and 2 female) from each of four different sporophytes. Using Illumina 150-base paired-end sequencing, we have obtained 6.14 × 10^8 sequence reads from these libraries. These sequence data are now being analysed in Leeds and in Florida through a series of integrated approaches:
(i) The sequence reads are being aligned with the genomic sequence data generated by the US DoE Joint Genome Institute as part of the C. purpureus genome programme, in order to undertake structural and functional annotation of this genome.
(ii) In addition to alignment with the genomic sequence scaffolds, the sequence reads are being assembled de novo to generate an independent transcriptome assembly (to ensure that genetic variation between the genome sequences of the UK isolate and those of the sequenced GG1 and R40 isolates does not confound the genome annotation).
(iii) Sequence polymorphisms between the 16 members of the sampled population are being identified in order to measure the extent of genetic variation within this population, especially in relation to the extent of natural gene flow between male and female plants.
(iv) Sequence read count data are being used to obtain a genome-wide analysis of transcript abundance for the individuals in this population.
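As a rough illustration of how read count data can be turned into a comparison of transcript abundance between the sexes, the sketch below scales raw counts to counts per million (CPM) and computes a log2 female/male ratio per gene. It is a minimal example under stated assumptions, not the analysis pipeline used in this project: the gene names, the counts and the pseudocount value are hypothetical, and a real analysis would use replicate libraries and a dedicated statistical package.

```python
import math


def counts_per_million(counts):
    """Scale one library's raw read counts to counts per million reads."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_change(female_cpm, male_cpm, pseudocount=1.0):
    """Return log2(female/male) per gene.

    The pseudocount (an assumed value) avoids division by zero for
    genes with no reads in one sex.
    """
    genes = set(female_cpm) | set(male_cpm)
    return {
        g: math.log2(
            (female_cpm.get(g, 0.0) + pseudocount)
            / (male_cpm.get(g, 0.0) + pseudocount)
        )
        for g in genes
    }


# Hypothetical toy counts for one female and one male library.
female_counts = {"geneA": 900, "geneB": 100, "geneC": 0}
male_counts = {"geneA": 450, "geneB": 100, "geneC": 450}

female_cpm = counts_per_million(female_counts)
male_cpm = counts_per_million(male_counts)
lfc = log2_fold_change(female_cpm, male_cpm)

for gene in sorted(lfc):
    print(gene, round(lfc[gene], 2))
```

Positive values indicate higher relative abundance in the female library and negative values higher abundance in the male library; a gene with reads in only one sex (geneC here) gives a large-magnitude ratio.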
Expected final results, potential impact and use.
(1) The alignment of transcriptomic data to the genomic sequence data will enable the accurate structural annotation of the C. purpureus genome, through the identification of splice junctions and the prediction of transcription initiation and termination sites. This will substantially enhance the accuracy of the genome annotation.
(2) Analysis of the transcriptome data (e.g. by gene ontology analysis) will enable a better understanding of the functional properties and annotation of the C. purpureus genome (e.g. Szovenyi et al. (2011) Molec. Ecol. Res. 15: 203-215). Together, the structural and functional annotations will provide a basis for the comparative genomic analysis of three sequenced moss genomes from evolutionarily diverged clades: Sphagnum fallax, Ceratodon purpureus and Physcomitrella patens.
(3) Deep sequencing of the transcriptome will enable an accurate analysis of the relative levels of gene expression over ca. 5 orders of magnitude.
(4) Even for a limited segregating population (16 individuals), sufficient sequence polymorphisms will be sampled to identify haplotypic blocks of linked genes that will aid the physical assembly of the C. purpureus genome scaffolds into larger contigs, and contribute significantly to the ultimate aim of assembling the genome sequence into pseudochromosomes.
(5) We expect to be able to identify genes differentially expressed between male and female segregants, and genes present in one sex and absent in the other, through the identification of sex-specific/sex-linked sequence polymorphisms. This will underpin the later analysis of rates of evolution of these genes, based on sampling of additional natural populations. Combined with the functional annotation of the transcriptome, this should enable the completion of objective 3 and contribute significantly toward achieving objective 4. The data we obtain will provide support for further research directed toward understanding the evolution of sexual differentiation in haploid-dominant organisms.
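The idea of a sex-linked polymorphism in point (5) can be illustrated with a short sketch. Because the sampled plants are haploid, each individual carries a single allele at each site, so a candidate sex-linked site is one where all females share one allele and all males share a different one. The code below is a minimal sketch under that assumption; the site names, individuals and allele calls are hypothetical, and it ignores missing calls and sequencing error, which a real analysis would have to handle.

```python
def sex_linked_sites(genotypes, sexes):
    """Return sites whose alleles co-segregate perfectly with sex.

    genotypes: {site: {individual: allele}} for haploid individuals
    sexes:     {individual: "F" or "M"}
    """
    linked = []
    for site, calls in genotypes.items():
        female_alleles = {a for ind, a in calls.items() if sexes[ind] == "F"}
        male_alleles = {a for ind, a in calls.items() if sexes[ind] == "M"}
        # One allele fixed in each sex, and the two alleles differ.
        if (
            len(female_alleles) == 1
            and len(male_alleles) == 1
            and female_alleles != male_alleles
        ):
            linked.append(site)
    return linked


# Hypothetical toy data: two female and two male siblings.
sexes = {"F1": "F", "F2": "F", "M1": "M", "M2": "M"}
genotypes = {
    "site1": {"F1": "A", "F2": "A", "M1": "G", "M2": "G"},  # co-segregates with sex
    "site2": {"F1": "A", "F2": "G", "M1": "A", "M2": "G"},  # segregates independently
    "site3": {"F1": "C", "F2": "C", "M1": "C", "M2": "C"},  # not polymorphic
}

candidates = sex_linked_sites(genotypes, sexes)
print(candidates)
```

With only a few individuals, some autosomal sites will match this pattern by chance, which is why the number of segregants sampled matters for the confidence of any candidate list.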