Periodic Reporting for period 1 - BIOINDUSTRY 4.0 (RI services to promote deep digitalization of Industrial Biotechnology - towards smart biomanufacturing)
Reporting period: 2023-01-01 to 2023-12-31
Supporting the digitalisation of industrial biotechnology (IB), the BIOINDUSTRY 4.0 project will create new services delivered by European research infrastructure. These will address several challenges, focusing on the improvement of bioprocesses and the acceleration of development pipelines. BIOINDUSTRY 4.0 will develop data and metadata standards to generate high-quality, interoperable, multiscale bioprocess datasets, together with methods to share them. By developing data-driven approaches and exploiting AI to empower novel decision support systems and digital twins (DT), the project will make it possible to extract new knowledge from data. This knowledge can be used to better design bioprocesses, exploit the potential of microbial biodiversity and provide the means to develop real-time, autonomous control of bioprocesses.
Core work has focused on the collection of use case legacy datasets pertaining to different microorganisms cultivated in various bioprocess conditions. The datasets contain heterogeneous data describing physiological behaviours, omic-level characterisation (e.g. genomic data) and dynamics (e.g. fluxomic data), physico-chemical parameters (pH, temperature, gas flow analysis, substrate consumption), and bioreaction performance (titers, productivities, yields) measured at different scales. The collection and identification of data and metadata have been facilitated by the use of a template designed for the purpose, and all legacy data available so far has been uploaded to the Yoda data management platform of Wageningen University (yoda.wur.nl).
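As an illustration of the bioreaction performance indicators mentioned above, the following minimal sketch computes a titer, a volumetric productivity and a product yield for a single batch run. The record layout, field names and units are assumptions made for this example only; they are not taken from the project's data template.

```python
from dataclasses import dataclass


@dataclass
class BatchRun:
    """Hypothetical summary of one batch cultivation (field names are illustrative)."""
    duration_h: float                  # total run time in hours
    product_start_g_per_l: float       # product concentration at inoculation
    product_end_g_per_l: float         # product concentration at harvest
    substrate_consumed_g_per_l: float  # substrate consumed over the run


def titer(run: BatchRun) -> float:
    """Final product concentration (g/L)."""
    return run.product_end_g_per_l


def volumetric_productivity(run: BatchRun) -> float:
    """Product formed per unit volume and time (g/L/h)."""
    if run.duration_h <= 0:
        raise ValueError("duration_h must be positive")
    return (run.product_end_g_per_l - run.product_start_g_per_l) / run.duration_h


def product_yield(run: BatchRun) -> float:
    """Product formed per unit of substrate consumed (g/g)."""
    if run.substrate_consumed_g_per_l <= 0:
        raise ValueError("substrate_consumed_g_per_l must be positive")
    return (run.product_end_g_per_l - run.product_start_g_per_l) / run.substrate_consumed_g_per_l


# Made-up values, for demonstration only.
example = BatchRun(
    duration_h=24.0,
    product_start_g_per_l=0.0,
    product_end_g_per_l=12.0,
    substrate_consumed_g_per_l=40.0,
)
```

With these made-up values, the run would have a titer of 12 g/L, a productivity of 0.5 g/L/h and a yield of 0.3 g of product per g of substrate.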
Other work on data has centred on microbial collections. Specifically, the nature and quantity of data available in the principal European microbial bioresource collections were analysed, and work on a new microbial strain data standard was carried out. In relation to this work, a test website (www.strainsbook.org) was launched. In parallel, a survey was launched to better understand industry needs and expectations regarding the development of new microbial workhorses.
To develop the data framework and compute environment for bioprocess digital twins, preliminary work focused on a series of targets. The first relates to data harmonisation and sharing, both of which are requirements for the operation of DTs. Work has been performed on defining the specifications for a data fabric and on minimum information models. Likewise, work has been done on mapping requirements for the compute environment, and on creating and testing prototypes that execute containerised workflows and stream data.
The first steps towards creating DTs involved the creation of modelling scaffolds, the first one being built using an Escherichia coli bioreaction use case. Several modelling platforms (gPROMS, Modelica, Python, etc.) are being employed to ensure wide coverage and offer different options to future users. Other work was directed towards the development of an ontology to model experimental data and metadata.
Because real-time autonomous control of bioreactors requires the deployment of high-performance sensing devices to provide continuous or at least frequent measurements of process variables, BIOINDUSTRY 4.0 also addresses this challenge. In the preliminary project phase, work was begun on a new-generation Raman-based detector and an optical device equipped with an offset probe.
Finally, to ensure that the outputs of BIOINDUSTRY 4.0 will be of interest to a wide R&D user community, a stakeholder engagement strategy is being developed. At the beginning of the project, a database of stakeholder contacts was compiled, presentation documents were prepared and fifteen interviews with a varied stakeholder subgroup were performed.
Presently, it is not possible to run a single query across multiple microbial resource databases. BIOINDUSTRY 4.0 aims to make this feasible, providing a unified platform that will allow users to query all of the MIRRI-ERIC collections, as well as that of DSMZ. The prototype portal, www.strainsbook.org, already provides a glimpse of how this will work. In the long term, this work will provide powerful querying capabilities and offer industry a novel tool to mine and exploit the potential of microbial biodiversity.
Building the control loop between the physical and digital twin involves the deployment of so-called process analytical technology (PAT) devices. In the first part of BIOINDUSTRY 4.0, important groundwork has been laid (draft device assembly and structural design plans) to facilitate the development of novel instruments that will eventually provide the means to access bioprocess parameters that are currently inaccessible or difficult to measure.