
Microbial biogeography of the gastrointestinal tract: Towards a better understanding of the drivers of oral and colorectal cancer development.

Periodic Reporting for period 1 - TransVivome (Microbial biogeography of the gastrointestinal tract: Towards a better understanding of the drivers of oral and colorectal cancer development.)

Reporting period: 2015-04-01 to 2017-03-31

The human microbiome plays essential roles in modulating human health and disease. Based on preliminary data as well as previously published work, I had hypothesised that 1) the oral cavity acts as a reservoir of microbes which continuously seed the gut and that 2) specific aberrations of the microbiota along the gastrointestinal tract (GIT) are linked to the development of colorectal cancer (CRC). My aim was to test these hypotheses through the integration of metagenomic and metatranscriptomic data generated from microbial consortia in the colons/recta as well as oral cavities of patients with CRC, and to compare these to healthy individuals. The work set out to address two questions: which microbial strains survive passage from the oral cavity into the gut, and does the mouth act as a reservoir for strains driving CRC development?

Shotgun metagenomic sequences were generated from a cohort of coupled saliva and stool samples from 26 CRC patients and 16 controls. We found that the transmissibility of oral microbes to the gut via swallowing of saliva is not the same for all species. In many cases, subspecies are restricted to one of the two body sites. Due to the small size of the cohort, results are currently inconclusive with regard to disease-specific strain complements in either the oral cavity or lower gastrointestinal tract. Using a much larger cohort (300 individuals) of publicly available saliva-stool coupled sequences from Fiji, China, France and Germany, we were able to comment on oral-to-gut transmission. We showed that few species overlap between the oral and gut communities and that, of these species, approximately half comprised subspecies specialised to the separate environments. This was further expanded on with other oral microenvironments, such as plaque, tongue and buccal mucosa. Using novel methods for low-resolution SNP detection and gene clustering, we were able to reconstruct the peripheral genome of many of these species and determine potential localised adaptations which restrict subspecies to specific microenvironments along the gastrointestinal tract.
Due to major setbacks with data acquisition it was not possible to address questions relating to the metatranscriptome of OC or CRC patients. Work was performed using a different cohort to that named in the grant, for which 42 saliva-stool couples (16 control and 26 CRC) had been previously collected. For these samples, DNA was extracted, paired-end libraries were constructed and DNA sequences were acquired using the Illumina HiSeq 2000 platform. Human reads were removed and samples were processed using the MOCAT metagenomics analysis pipeline (Kultima et al 2012; Bork group). All subsequent sequences were of high quality. The metaSNP pipeline (Costea and Muench et al in submission; Bork group) was used to call single nucleotide polymorphisms (SNPs) against a reference genome database of 1753 gut microbial species. With these data it was possible to address in part or in whole the following hypotheses set out in the original grant:

1) Does the oral cavity act as a reservoir of microbes which continuously seed the gut?
To address this hypothesis, SNP profiles per species were compared between the coupled oral and stool samples of an individual, and against the oral and stool samples of other individuals. Many of the species found in both saliva and stool are in low abundance in the stool, which suggests that oral microbes do not tend to dominate the gut microbial community. As a consequence, many overlapping microbes had very low-resolution SNP profiles in stool sequences and very few positions comparable between environments. To overcome this limitation, novel statistical methods for determining the significance of recovering each individual SNP were developed. This increased the number of species on which we could comment from tens to hundreds. The methods developed are a significant advance over previous methods, focusing explicitly on strain retention over time and transmission between environments. Interestingly, not all species showed the same strain distribution patterns between the oral cavity and gut: for around 50% of species one strain appears to colonise both the oral cavity and distal gut, whereas for the other 50% there is strain specificity. The main impact of these findings is that, depending on the species being studied, we may be dealing with specialist or generalist strains; i.e. it may be incorrect to implicate an oral species in a distal gut disease like CRC, as the strains are different and may therefore have different disease-causing phenotypes.
The results of this work are being prepared for publication.
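The within-individual versus between-individual comparison described for hypothesis 1 can be illustrated with a minimal Python sketch. It is not the metaSNP pipeline or the statistical method developed in the project: the distance measure (mean absolute allele-frequency difference over shared positions), the `min_shared` threshold, the function names and the toy profiles are all assumptions made for illustration.

```python
from itertools import product
from statistics import mean


def allele_distance(profile_a, profile_b, min_shared=3):
    """Mean absolute allele-frequency difference over positions covered in both.

    Each profile maps a genome position to the frequency of one allele.
    Returns None when fewer than min_shared positions are comparable
    (an assumed threshold, reflecting the low-coverage problem in stool).
    """
    shared = profile_a.keys() & profile_b.keys()
    if len(shared) < min_shared:
        return None
    return mean(abs(profile_a[p] - profile_b[p]) for p in shared)


def within_vs_between(saliva, stool, min_shared=3):
    """Split saliva-stool distances into same-individual and other-individual sets."""
    within, between = [], []
    for saliva_id, stool_id in product(saliva, stool):
        d = allele_distance(saliva[saliva_id], stool[stool_id], min_shared)
        if d is None:
            continue  # too few comparable positions for this pair
        (within if saliva_id == stool_id else between).append(d)
    return within, between


# Hypothetical profiles for one species in two individuals (A and B).
saliva = {
    "A": {101: 0.9, 250: 0.1, 377: 1.0, 512: 0.0},
    "B": {101: 0.1, 250: 0.8, 377: 0.0, 512: 1.0},
}
stool = {
    "A": {101: 1.0, 250: 0.0, 377: 0.9},
    "B": {101: 0.0, 250: 1.0, 377: 0.1, 512: 0.9},
}

within, between = within_vs_between(saliva, stool)
print("within-individual:", within)
print("between-individual:", between)
```

In this toy case the saliva and stool profiles of the same individual are much closer than those of different individuals, which is the pattern expected when a single strain colonises both sites. A species with site-specific strains would instead show within-individual distances as large as the between-individual ones.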

2) Are specific aberrations of the microbiota along the gastrointestinal tract (GIT) linked to the development of colorectal cancer (CRC)?
No single species, SNP profile, community composition or community perturbation in the oral cavity was found to be associated with CRC cases. This is most likely due to the limited number of samples. This hypothesis will be revisited in the near future as more samples are added to the study.

Additional hypothesis addressed:
Methods developed after the submission of the current MC-IF allowed work to begin on the following hypothesis:
3) Do we find specific ecotypes along the GIT?
As mentioned for hypothesis 1, approximately half of the species studied showed strain localisation along the GIT. With the inclusion of additional data from the Human Microbiome Project (HMP; over 1200 sequences) it was possible to look at these patterns within the oral cavity, specifically between saliva, tongue, buccal mucosa, supra- and subgingival plaque as well as keratinized gingival plaque and stool. In brief, SNP profiles were converted into an allele frequency distance, and statistical tests performed between sequences grouped by body site of isolation allowed the identification of site-specific ecotypes. In some cases an ecotype would colonise several sites in the oral cavity and be distinct from that of the gut. More interestingly, there were frequently ecotypes specific to subgingival plaque and stool that were very different from those found to colonise the tongue. Using canopy clustering it was possible to reconstruct the peripheral genomes of these ecotypes and comment on adaptations which localise these strains within the GIT. Future work will pursue the question: “Where in the oral cavity do the ecotypes implicated in CRC development originate from?” A better understanding of where exactly these strains come from will have major implications for CRC diagnostics and targeted therapeutics.
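As a rough illustration of testing allele-frequency distances by body site, the sketch below uses a label-permutation test. The report does not say which statistical test was applied, so the permutation test, the separation statistic, the function names and the toy profiles are all assumptions and should not be read as the project's method.

```python
import random
from itertools import combinations
from statistics import mean


def mean_abs_distance(a, b):
    """Mean absolute allele-frequency difference between two equal-length profiles."""
    return mean(abs(x - y) for x, y in zip(a, b))


def site_separation(dist, labels):
    """Mean between-site distance minus mean within-site distance."""
    within, between = [], []
    for i, j in combinations(range(len(labels)), 2):
        (within if labels[i] == labels[j] else between).append(dist[i][j])
    return mean(between) - mean(within)


def permutation_p_value(dist, labels, n_perm=2000, seed=0):
    """Fraction of label shufflings with separation at least as large as observed."""
    rng = random.Random(seed)
    observed = site_separation(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if site_separation(dist, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical allele frequencies at four positions of one species.
profiles = [
    [0.9, 0.1, 1.0, 0.0], [1.0, 0.0, 0.9, 0.1],  # tongue
    [0.8, 0.2, 1.0, 0.0], [0.9, 0.0, 0.9, 0.1],  # tongue
    [0.1, 0.9, 0.0, 1.0], [0.0, 1.0, 0.1, 0.9],  # stool
    [0.2, 0.8, 0.0, 1.0], [0.1, 1.0, 0.1, 0.9],  # stool
]
labels = ["tongue"] * 4 + ["stool"] * 4

dist = [[mean_abs_distance(a, b) for b in profiles] for a in profiles]
observed, p_value = permutation_p_value(dist, labels)
print(f"separation={observed:.3f}, p={p_value:.3f}")
```

A large separation with a small p-value would be consistent with site-specific ecotypes for that species, while a separation near zero would suggest a single generalist strain across the sites compared. With real data, positions would first need filtering for coverage in every sample, and p-values would need correcting for the number of species tested.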
As a result of the current MC-IF project, several new methods for identification of strain specificity have been developed and tested. In addition, methods for processing and analysing metatranscriptomic data have been developed, which may be used both for addressing biological questions and for developing novel diagnostics.