Periodic Reporting for period 2 - MetaPlat (Development of an Easy-to-use Metagenomics Platform for Agricultural Science)
Reporting period: 2017-12-01 to 2019-11-30
Dairy livestock can convert tough plant material such as grass into high-quality, high-protein products for human nutrition through fermentation by microbes in their digestive tracts, but a by-product of this process is substantial methane production. Methane is a greenhouse gas with a greater heat-trapping capacity than CO2 and is produced in vast quantities by livestock worldwide. The dairy and beef industry has significant economic, nutritional, and cultural value, so it is not feasible to demand that everybody stop drinking milk and eating meat. Any strategy that aims to mitigate greenhouse gas emissions in agriculture must therefore also maintain the efficiency of cattle in food production, which requires investigating the action of microbes in livestock. With this in mind, we are creating an easy-to-use, high-performance machine learning platform to enable the rapid analysis of large metagenomic datasets, in order to better understand the microbial mechanisms behind efficient food production, better meat quality and methane production. The project goal has been broken down into the following core objectives:
• Sample collection, preparation, and sequencing
• Curation of the reference databases (new phylogeny-aware classification, including of previously unclassified sequences, using machine learning)
• Development of accurate classification algorithms
• Real-time or time-efficient comparison analyses
• Production of statistical and visual representations conveying more useful information
• Platform Integration
• Provision of insights into probiotic supplement usage, methane production and feed conversion efficiency in cattle
This EU-funded research is also giving valuable insight into how humans can harness these microbes to convert grass and other plant material into biofuels.
Specialist ‘transfer of knowledge’ workshops have been delivered to research fellows in both molecular biology and cloud computing. Two international research conferences, in which all partners participated, have also been organised around MetaPlat, namely:
CERC 2016 http://www.cerc-conference.eu/.
CERC 2017 http://www.cerc-conference.eu/.
A number of workshops were organised for the project, including an international Workshop on Data Analytics in Metagenomics (http://scm.ulster.ac.uk/~e10267487/DAM2017/index.html) held in November 2017 in conjunction with the IEEE BIBM 2017 conference in Kansas City, USA.
MetaPlat has thus far addressed the following key objectives: sample collection, preparation, and sequencing; the development of accurate classification algorithms; time-efficient comparison analyses; visualisation; platform integration; and the provision of insights into probiotic supplement usage, methane production and feed conversion efficiency in cattle.
MetaPlat uses an asynchronous queueing system for high-throughput computing that lends itself both to scaling up (making processing nodes more powerful) and to scaling out (adding multiple processing nodes in parallel). Such queueing systems have a number of advantages. Firstly, their asynchronous nature keeps resource usage as efficient as possible: long-running jobs do not hold onto I/O resources and their related threads needlessly. Secondly, loose coupling between queues and their consumers permits the creation of multiple consumers without significant impact on the functioning of the queue itself. The queue does not need to 'know' about or manage its consumers. Scaling becomes a relatively simple matter of adding more processes on a multi-core node, or adding more nodes in a distributed system. Some data processing is more complex, in that it needs to recombine the results of parallel and distributed processes. To handle this, going forward we will implement an Actor Model (as exemplified by Akka or the Erlang language), which effectively implements a queueing system at a more fine-grained level.
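The loose coupling between a queue and its consumers can be illustrated with a minimal sketch. This is not the MetaPlat implementation: the function names (`gc_content`, `consumer`, `run_pipeline`), the in-process `queue.Queue`, and the use of a GC-content calculation as a stand-in for a long-running analysis job are all assumptions made purely for illustration.

```python
import queue
import threading


def gc_content(sequence):
    """Hypothetical stand-in for an analysis job: fraction of G/C bases."""
    if not sequence:
        return 0.0
    return sum(base in "GC" for base in sequence.upper()) / len(sequence)


def consumer(jobs, results, lock):
    """Pull jobs until a None sentinel arrives.

    The consumer only knows about the queue, not about other consumers.
    """
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this consumer
            break
        name, sequence = job
        value = gc_content(sequence)
        with lock:  # protect the shared results dictionary
            results[name] = value


def run_pipeline(samples, n_consumers):
    """Process all samples with n_consumers workers and return the results."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()
    workers = [
        threading.Thread(target=consumer, args=(jobs, results, lock))
        for _ in range(n_consumers)
    ]
    for worker in workers:
        worker.start()
    for item in samples.items():  # the producer never addresses a consumer
        jobs.put(item)
    for _ in workers:  # one sentinel per consumer
        jobs.put(None)
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    samples = {"sample_a": "GGCC", "sample_b": "ATAT", "sample_c": "ATGC"}
    print(run_pipeline(samples, n_consumers=1))
    print(run_pipeline(samples, n_consumers=4))
```

Changing `n_consumers` is the only difference between the two calls; the producer code and the queue are untouched, which is the property the paragraph above describes. In this sketch the consumers are threads in one process, so it shows the structure and not a speed-up; a real deployment would more plausibly use separate processes or a message broker shared between nodes.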
The project has also produced visualisations of the metagenomic data, which are crucial for understanding the microbial diversity in the gut. As part of MetaPlat, visualisation tools are incorporated into the metagenomics pipeline to display microbial data as PCoA plots, bar charts and bubble plots. The use of metagenomics in this project has provided unprecedented insight into the form and function of heterogeneous communities of microorganisms and their vast biodiversity, without the need for isolation and lab culture of particular organisms. Microbial communities affect human and animal health, support the growth of plants, are critical components of all terrestrial and aquatic ecosystems and can be exploited to produce fuels or chemicals. Metagenomics thus pervades a number of hugely important industries central to economic growth and employment. MetaPlat will allow us to better understand the microbial mechanisms behind efficient food production, better meat quality and methane production.
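The PCoA plots mentioned above are built from a matrix of pairwise dissimilarities between samples. The report does not say which dissimilarity measure MetaPlat uses; the sketch below assumes Bray-Curtis, a common choice for abundance data, and the function names and the toy abundance counts are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:  # two empty samples are treated as identical
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def dissimilarity_matrix(samples):
    """Return sorted sample names and the square matrix of pairwise values."""
    names = sorted(samples)
    matrix = [
        [bray_curtis(samples[row], samples[col]) for col in names]
        for row in names
    ]
    return names, matrix


if __name__ == "__main__":
    # Hypothetical taxon counts per sample (columns are the same taxa).
    abundances = {
        "cow_1": [6, 4, 0],
        "cow_2": [4, 6, 0],
        "cow_3": [0, 0, 10],
    }
    names, matrix = dissimilarity_matrix(abundances)
    for name, row in zip(names, matrix):
        print(name, [round(value, 2) for value in row])
```

A PCoA would then take this matrix, double-centre it and plot the leading eigenvectors, typically with a numerical library; that step is omitted here to keep the sketch dependency-free.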