A major contribution of MetaPlat is the integration of metagenomics with network analysis to reveal the full extent of microbial gene diversity and complex microbial interactions. One MetaPlat study investigated the rumen microbial community in cattle by combining metagenomic and network-based approaches. A key advance beyond the state of the art is a random matrix theory-based approach that automatically determines the correlation threshold used to construct the co-abundance network associated with methane emission. The resulting network exhibits a clear modular structure, with certain trait-specific genes highly over-represented in particular modules. More specifically, all 20 genes previously identified as associated with methane emissions are found in a single module (hypergeometric test, p < 10^-11). One third of genes are involved in methane metabolism pathways.
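The over-representation statistic quoted above can be illustrated with a minimal sketch of a one-sided hypergeometric test. The network size and module size used here are hypothetical placeholders, since the text reports only the 20 trait genes and the resulting p-value; the function name `hypergeom_tail` is likewise an illustrative choice and not part of the MetaPlat pipeline.

```python
from fractions import Fraction
from math import comb


def hypergeom_tail(population, successes, draws, observed):
    """Return P(X >= observed) for a hypergeometric variable X.

    population: total number of genes in the network
    successes:  number of trait-associated genes in the network
    draws:      number of genes in the module under test
    observed:   number of trait-associated genes found in that module
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the exact probability mass from `observed` up to the maximum
    # possible overlap, using integers to avoid rounding error.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return float(Fraction(favourable, total))


if __name__ == "__main__":
    # Hypothetical sizes: 1000 genes in the network, 250 in the module.
    # Only the count of 20 trait genes comes from the text.
    p_value = hypergeom_tail(population=1000, successes=20, draws=250, observed=20)
    print(f"one-sided p-value: {p_value:.3e}")
```

With these placeholder sizes, finding all 20 trait genes in one module is extremely unlikely by chance, which is the kind of evidence the reported test provides; the actual p-value depends on the real network and module sizes.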
MetaPlat uses an asynchronous, high-throughput computing queueing system that lends itself to both scaling up (making processing nodes more powerful) and scaling out (adding processing nodes in parallel). Such queueing systems have a number of advantages. Firstly, their asynchronous nature keeps resource usage efficient: long-running jobs do not needlessly hold on to I/O resources and their related threads. Secondly, loose coupling between queues and their consumers permits the creation of multiple consumers without significant impact on the functioning of the queue itself; the queue does not need to 'know' about or manage its consumers. Scaling therefore becomes a relatively simple matter of adding more processes on a multi-core node, or adding more nodes in a distributed system. Some data processing is more complex, in that it needs to recombine the results of parallel and distributed processes. For such cases, going forward we will implement an Actor Model (as exemplified by Akka or the Erlang language), effectively implementing a queueing system at a more fine-grained level.
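The loose coupling between a queue and its consumers can be sketched in a few lines. This is a single-process illustration using Python's standard `asyncio.Queue`, not the MetaPlat implementation; the job payload (squaring an integer) and the names `worker` and `run_jobs` are assumptions made purely for the example.

```python
import asyncio


async def worker(name, queue, results):
    """Consume jobs until cancelled; the queue holds no reference to this worker."""
    while True:
        job = await queue.get()
        try:
            # Stand-in for awaiting I/O: control returns to the event loop,
            # so a waiting job does not block the other consumers.
            await asyncio.sleep(0)
            results.append((name, job, job * job))
        finally:
            queue.task_done()


async def run_jobs(jobs, n_workers):
    """Process all jobs with n_workers consumers and return the results."""
    queue = asyncio.Queue()
    results = []
    for job in jobs:
        queue.put_nowait(job)

    # Scaling out in this sketch is just a matter of changing n_workers.
    workers = [
        asyncio.create_task(worker(f"worker-{i}", queue, results))
        for i in range(n_workers)
    ]
    await queue.join()  # wait until every queued job has been marked done
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    output = asyncio.run(run_jobs(range(6), n_workers=3))
    print(sorted(value for _, _, value in output))
```

The number of consumers is a parameter of the caller rather than of the queue, which mirrors the point made above: consumers can be added without changing the queue itself. A distributed deployment would replace the in-process queue with a message broker, but the shape of the producer and consumer code stays broadly similar.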
The project has also produced visualisations of the metagenomic data, which are crucial for understanding microbial diversity in the gut. As part of MetaPlat, visualisation tools are incorporated into the metagenomics pipeline to display microbial data as principal coordinates analysis (PCoA) plots, bar charts and bubble plots. The use of metagenomics in this project has provided unprecedented insight into the form and function of heterogeneous communities of microorganisms and their vast biodiversity, without the need to isolate and culture particular organisms in the laboratory. Microbial communities affect human and animal health, support the growth of plants, are critical components of all terrestrial and aquatic ecosystems, and can be exploited to produce fuels or chemicals. Metagenomics thus pervades a number of hugely important industries central to economic growth and employment. MetaPlat will allow us to better understand the microbial mechanisms behind efficient food production, better meat quality and methane production.