Large-scale identification of secondary metabolites, metabolic pathways and their genes in the model tree poplar

Periodic Reporting for period 2 - POPMET (Large-scale identification of secondary metabolites, metabolic pathways and their genes in the model tree poplar)

Reporting period: 2021-01-01 to 2022-06-30

Poplar is an important woody biomass crop and at the same time the model of choice for molecular research in trees. Although there is steady progress in resolving the functions of unknown genes, the identities of most secondary metabolites in poplar remain unknown. The lack of metabolite identities in experimental systems is a true gap in information content, and impedes 1/ obtaining deep insight into the complex biology of living systems and 2/ valorizing these metabolites. The main reason for the lack of metabolite identities is that metabolites are difficult to purify because of their low abundance, which hinders their structural characterization and the discovery of their biosynthetic pathways. In this project, we will use CSPP, an innovative method recently developed in my lab, to systematically predict the structures of metabolites along with their biosynthetic pathways in poplar. The CSPP method is based on a combination of metabolomics and informatics. In a next step, the CSPP tool will be combined with two complementary genetic approaches based on resequencing data from 750 poplar trees to identify the genes encoding the enzymes in the predicted pathways. First, genome-wide association studies (GWAS) will be performed to identify SNPs in the genes involved in the metabolic conversions. Second, rare defective alleles will be identified for these genes in the sequenced population. Genes identified by both approaches will then be studied further, either by crossing natural poplars that are heterozygous for the defective alleles or by CRISPR/Cas9-based gene editing in poplar. The functional studies will be further underpinned by enzyme assays. Given our scarce knowledge of the structures of most secondary metabolites and their metabolic pathways in poplar, this large-scale identification effort will lay the foundation for systems biology research in this species, and will shape opportunities to further develop poplar as an industrial wood-producing crop.
We have optimized a protocol for high molecular weight (HMW) DNA extraction from poplar that is suitable for long-read genome sequencing with Oxford Nanopore Technologies (ONT) platforms. Subsequently, woody cuttings from 750 poplar genotypes were grown in triplicate in a greenhouse and pure HMW DNA was prepared. We obtained a draft genome assembly of Populus nigra cv. ‘BDG’ using the ONT MinION with 44X coverage and an N50 contig size of 496 kb. So far, 192 genotypes have been sequenced (~25X coverage, N50 ~23 kb); the sequencing data are being stored in a data repository and analyzed with our in-house bioinformatics pipeline. Of these 192 sequenced trees, 120 genomes have been assembled and polished. The second objective was to establish the optimal harvesting stage, tissue and extraction method for metabolite profiling. To this end, metabolite profiles of leaves at three developmental stages from 10 genotypes were generated by LC-MS. The metabolome-wide average heritability was similar across the three stages, yet the first fully mature leaf, at leaf plastochron index 5, generated the most informative metabolite spectrum. Furthermore, metabolites extracted from leaf material of one poplar genotype were hydrolyzed to obtain their aglycones, and a subset of these aglycones is ready for purification and structural identification by the VIB Metabolomics Core facility. In addition to the fifth leaf of all genotypes, debarked stems were also harvested and frozen for metabolic profiling. A high-throughput metabolite profiling method has been established and we are ready to start metabolite profiling of the leaf samples.
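For readers unfamiliar with the assembly statistics quoted above, the following is a minimal sketch of how an N50 contig size and a mean coverage are conventionally computed. It is not the project's in-house pipeline; the function names and the toy numbers are illustrative assumptions only.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def mean_coverage(total_sequenced_bases, genome_size):
    """Mean sequencing depth: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_sequenced_bases / genome_size


# Toy example with hypothetical contig lengths (not project data).
toy_contigs = [80, 70, 50, 40, 30, 20]   # 290 bases in total
print(n50(toy_contigs))                  # 80 + 70 = 150 >= 145, so N50 is 70
print(mean_coverage(4400, 100))          # 44.0, i.e. 44X on a 100-base toy genome
```

A higher N50 indicates a more contiguous assembly, which is why the value is reported alongside coverage for both the reference assembly and the resequenced genotypes.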
In addition to our first reference genome for Populus nigra, a DynLib database has been established for poplar metabolites. It already includes 209 metabolites that were characterized based on CSPP networks and mass spectrometry. The concept of the DynLib database is new and has been published in Desmet et al., CSBJ (2021). The project aims at uncovering the structures of many more secondary metabolites, the abundances of which will be used as traits in GWAS involving 750 P. nigra accessions. We will specifically search for associations in genes that encode enzymes predicted to be involved in the biosynthetic pathways of these metabolites. Overall, the project aims at the large-scale discovery of metabolites along with their biosynthetic pathways and genes.