CORDIS - EU research results
Content archived on 2024-05-27

An interactive protein engineering portal with validated software and database facilities

Final Report Summary - NEWPROT (An interactive protein engineering portal with validated software and database facilities)

Executive Summary:
Protein engineering is employed in fields as diverse as animal nutrition, aquaculture, crop production, biofuels, biopharmaceuticals, food and beverages, household care, leather and textile, waste water solutions, and through chemical biology in the production of high-value chemical compounds. In recent years chemical biology has gained prominence. Chemical biology is the branch of general protein engineering that is concerned with the use of modified enzymes for the production of interesting, high-value chemical compounds.
Protein engineering has been a research topic for more than 25 years, and throughout this period theory and application have gone hand in hand. The NewProt partners set out to continue this tradition by integrating and developing protein engineering software and databases in a regime that includes rigorous and continuous experimental validation. The NewProt project responded to the KBBE.2011.3.6-01 call "Increasing the accessibility, usability and predictive capacities of bioinformatics tools for biotechnology applications". The partners brought together the combined skills of two prominent academic protein engineering labs and eight SMEs with a high profile in a wide variety of complementary protein engineering branches to produce a self-service portal (SSP) for protein engineering software and databases. Predictive technologies were experimentally validated. Software and databases were made free or open source as far as possible. The NewProt team was carefully selected to consist of groups that provide the highest quality work in their area of expertise, and the areas of expertise that come together are highly complementary. All partners (academic and SME alike) benefit from a well-functioning SSP. The academic partners are accustomed to giving public access to their results, and in NewProt all SMEs also have business models that flourish through openness.
Two academic groups and eight SMEs who all work on different aspects of protein engineering collaborated and produced a website, called SSP, that combines a portal with an interactive workbench that provides users a self-service system for in silico protein engineering. SSP gives users access to a wide variety of protein engineering facilities and information systems, and allows them to interactively work with a comprehensive set of well-integrated computational protein engineering tools. The databases provided are curated and the software and protocols are experimentally validated. The portal is solidly based in modern information technology and its usability for protein engineering and for use by software providers was thoroughly validated.

Project Context and Objectives:
The three main, measurable NewProt objectives were 1) the design and implementation of the actual portal with workbench; 2) the comprehensive suite of fully interoperable, well-documented software, databases, and information system; and 3) the experimental results of the mutation and design experiments that were executed to validate the software.
• Objective 1 was the design and implementation of the SSP, a portal with an embedded interactive workbench. The SSP gives users interactive access to the in silico protein engineering facilities. The SSP is simple to install and simple to maintain, yet solidly rooted in modern information technology. The SSP can be provided as a virtual machine that can be installed at the push of a button by the partner SMEs and by other industries too. It is also possible to embed SSP elements in commercial and academic third-party portal systems.
• Objective 2 involved the installation at the SSP of a comprehensive set of fully interoperable, well-documented, and easy-to-use software and data facilities for in silico protein engineering.
• Objective 3 was the experimental validation of all aspects of the SSP. Mutation-prediction facilities were experimentally validated. The validation results were used both to improve the quality of the software and protocols and to document the possibilities and limitations of the software methods.

The validation experiments additionally enabled the protein engineering of synthetically useful enzymes and some engineered biocatalysts were marketed by the SMEs Ingenza and Enzymicals. NewProt thus provided both SMEs with highly efficient tools to ensure an excellent and competitive standing in their business areas.
The final product of the project is a series of experimentally validated well-documented protein engineering software facilities and protocols that are available for interactive use at a portal with an embedded workbench, called SSP, that is solidly rooted in modern information technologies. The SSP is hosted at FluidOps, and is freely accessible. The whole SSP can be obtained from FluidOps for hosting and for in-house usage by third parties.
The SSP is freely available for protein engineers and other interested scientists and educators around the world. This means that the NewProt products are likely to get the widest possible dissemination, just as planned five years ago, when the NewProt proposal was written.

Project Results:
S & T results / foreground
The Foreground IP / S & T results discussion is divided into three sections:
1. IP rights and ownership
2. Obtained tangible results
3. Obtained intangible results

1) IP rights and ownership
The NewProt partners have used a very simple IP concept that follows these rules:
1. All IP stays with the SME that obtained it or did the research for it.
2. For foreground IP of the academic partners, it is discussed up front, case by case, which of the SME partners will obtain that IP (for free). Experiments that could lead to foreground IP that a partner would not want to share (or, if obtained by an academic partner, would not want to give up) should not be performed within the NewProt project.
3. All background IP stays with the partner who owns it.

This scheme was so simple that the partners decided not to worry about the details of the IP contract and simply used one provided by the EU. In retrospect this was a wise choice: very little time was lost on formal contract negotiations, and the scientists could spend all their time on what they like most and are best at: innovating.

Some of the foreground IP is paramount to the business model of the SME that owns it, and thus cannot or will not be made freely available through the portal. For these cases the CMBI has produced software that does 'roughly the same' as the commercial software (but in all cases clearly not as well, in as much detail, or as extensively as the commercial package it emulates). This CMBI emulator software is freely available to portal users who do not have a licence for the commercial package.

2) Obtained tangible results
The final, tangible results of the NewProt project are the improvements made to the products of the SME partners, the scientific progress made by all, the new proteins that have been produced in the experimental validation rounds, and, last but not least, the SSP portal that is freely available for all protein engineers and other interested scientists and educators alike. Here we focus on the SSP portal. It is difficult to describe an interactive portal in static words, so we opted for a video presentation consisting of about a dozen videos that explain the functionality of the portal step by step and take potential users by the hand.
Each video is represented by the start image of the video itself, a title that explains which portal aspect is covered in the video (they are listed here in a logical sequential order), and the YouTube address of that video.
Cookies must be allowed for YouTube to work well, and the browser should be a mainstream one and should not be too old.

“Getting Started – Create a Project using the Accession Code”

“Getting Started – Create a Project using the Fasta sequence”

“Getting Started – Create a Project using the Text Search Option”

“Getting Started – Import PDB files”

“Getting Started – Upload a Homology Model”

“Getting Started – Import a model from PMP”

“Other – Upload Resources”

“Calculations & Visualizations – Visualizations with YASARA”

“Calculations & Visualizations – HotSpot Wizard”

“Calculations & Visualizations – Other Computational Services”

“HOPE – How to generate a HOPE report”

“How to work with YASARA – Basic Moves”

“How to work with YASARA – The 7 scene-styles”

3) Obtained intangible results
The intangible results of the NewProt project are mainly described in the impact section, where several of the partner SMEs summarize how they, purely from the point of view of their own management, look at the NewProt project now that it has ended. These very upbeat messages from the SME partners are clearly the most important intangible (or sometimes hard-to-quantify or unquantifiable) results, but they are not the only ones. The NewProt collaboration has also resulted in a large number of friendships and a small but powerful network of like-minded scientists who will certainly collaborate again in some national or international, applied or fundamental project.

Potential Impact:
The largest impact of NewProt consists of intangible results obtained by the SME partners who all have strengthened their market positions thanks to the collaborations and the transfer of knowledge and skills that took place in the many, intensive collaborations that have characterized the NewProt project. Some of the SME partners have explicitly written how NewProt helped them stay ahead in an ever more competitive environment. Despite the crisis all SME partners grew during the NewProt period.
Lead Pharma:
Lead Pharma aims to discover and develop first or best in class small molecule drugs for the treatment of autoimmune diseases and cancer. Lead Pharma's drug discovery engine combines medicinal, structural, and computational chemistry with molecular pharmacology, cell and tissue-based pharmacology to select and advance the most promising molecules. Structure-based drug design is an important element within the drug discovery process at Lead Pharma. This requires the availability of crystal structures, preferably at high resolution, of the drug targets of interest bound to small molecule modulators. Obtaining high-resolution protein crystal structures can be very challenging due to difficulties in expression, purification, crystallisation or diffraction. A possible approach to address these issues is to engineer these proteins to improve their expression, their stability during purification or storage, or their behaviour in crystallographic studies. During the course of the NewProt project Lead Pharma has successfully applied several of the tools and servers developed within the project to support these ongoing efforts in our structural biology group.

Figure 1: Electron density (1.2 Å resolution) of crystal structure solved at Lead Pharma during the course of the NewProt programme.

In February 2015 Lead Pharma announced that it entered into a research collaboration and license agreement with Sanofi to discover, develop and commercialize small-molecule therapies directed against the nuclear hormone receptors called ROR gamma t to treat a broad range of autoimmune disorders, including rheumatoid arthritis, psoriasis and inflammatory bowel disease, which are among the most common.
Bio-Prodict:
Six databases have been generated for the NewProt consortium. These six databases were used as templates for generating new databases and were licensed to commercial companies. By the end of 2015 they were in use by nine customers who had obtained a paid license. The table below shows the exploitation results per database.
Database name                  Commercial licenses
2-methylcitrate dehydratase    1
Alanine racemase               1
α/β-Hydrolase                  9
Cytokines                      0
D-amino acid PLP               1
PLP-dependent transferase      1

The commercial licenses added approximately 120K euro to the Bio-Prodict turnover over the four-year period of the NewProt project.

The real added value of the NewProt project in euros is higher, because we were able to sell other databases (besides the six listed above) more easily due to:
1. Visibility of Bio-Prodict due to the NewProt program.
2. Improved functionalities of the 3DM software.
It is hard to put exact numbers on the impact of these two points, but since the start of the NewProt project Bio-Prodict has grown from:
1. 100K euro yearly turnover to 650K euro turnover (2014)
2. 1 employee to 7 full-time permanent staff
3. 32K euro profit to 265K euro profit over 2014.
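
The growth figures in the list above can be turned into simple growth factors. The following is a minimal sketch that uses only the numbers quoted in the list; the dictionary keys and the helper name `growth_factor` are illustrative choices, and the report does not state the exact start year, so the factors describe the whole period up to 2014 rather than an annual rate.

```python
# Figures quoted in the report; monetary values are in thousands of euro.
# Each entry is (value at the start of NewProt, value for 2014).
growth = {
    "yearly turnover (K euro)": (100, 650),
    "permanent staff": (1, 7),
    "profit (K euro)": (32, 265),
}


def growth_factor(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    return end / start


# Compute one growth factor per reported metric.
factors = {name: growth_factor(start, end) for name, (start, end) in growth.items()}

for name, factor in factors.items():
    print(f"{name}: x{factor:.1f}")
```

On these figures, turnover grew 6.5-fold, staff 7-fold, and profit a little more than 8-fold over the period.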

Ingenza:
We have derived significant benefits from NewProt, in particular through the interaction with BIOP’s 3DM, in accelerating the demonstration of new bioprocess feasibility and in enhancing rate-limiting bioprocess steps. We have also found that 3DM can be highly relevant in ways not originally anticipated, through its synergy with phylogenetic and other informatics analyses to assist the discovery of new biocatalytic activities.
The ongoing value to Ingenza of collaborating with BIOP and continuing to develop specific 3DM databases is illustrated by our participation in a new ERA-IB project “IPCRES” and a third project, funded by IBioIC, that will also make use of 3DM databases for the decarboxylation of novel substrates. In the IPCRES project Ingenza has engaged BIOP as a subcontractor and has provided the first target of a number of 3DM databases to be generated. The agreement of Ingenza to engage BIOP as a subcontractor in fact enabled BIOP to participate in the project at all, due to limited national funding in the ERA-IB programme. We believe that during IPCRES, participation in the IBioIC/BIOP collaboration and in future relationships with 3DM we can provide increasing feedback to help develop both the inter-company relationship and the utility of the 3DM technology to Ingenza and users everywhere.

fluidOps:
The process of explaining the atomic cause of a genetic disorder requires an all-out gathering of data and thus belongs in the Big Data arena. The experts in the NewProt project were asked whether they could deliberately modify proteins so that they can be used to produce chemicals, process food, clean up the environment, etc. For that purpose they had to invert the logic of the machinery built to explain genetic problems. They managed to do that using a Self-Service Portal that they designed and implemented together with fluid Operations™ (fluidOps). The Self-Service Portal is based on Information Workbench™ from fluidOps.
Information Workbench delivers a work environment for the portal which allows for the interactive use of all the created resources. Leveraging Information Workbench's semantic technologies, the Self-Service Portal ensures the semantic interoperability of data deriving from different protein engineering data sources, from freely available data sources on the Web, and from other integrated software solutions in a central place. The portal uses standards (RDF and Linked Data) and relevant domain-specific ontologies.
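
To illustrate why RDF helps with this kind of integration, here is a minimal sketch in plain Python. RDF describes data as subject-predicate-object triples, and when two sources use the same URI for the same entity, combining them is a set union. All URIs, predicate names, and values below are hypothetical placeholders; they are not taken from the Self-Service Portal or from Information Workbench, whose actual data model the report does not describe.

```python
from collections import defaultdict

# Hypothetical identifier: not a real NewProt or portal URI.
PROTEIN = "http://example.org/protein/P00001"

# Source A: a sequence-oriented resource (illustrative values only).
source_a = {
    (PROTEIN, "http://example.org/vocab/name", "example hydrolase"),
    (PROTEIN, "http://example.org/vocab/length", 312),
}

# Source B: a structure-oriented resource describing the same protein.
source_b = {
    (PROTEIN, "http://example.org/vocab/hasStructure", "http://example.org/structure/1ABC"),
    (PROTEIN, "http://example.org/vocab/name", "example hydrolase"),
}


def merge_graphs(*graphs):
    """Merge triple sets by union; identical triples collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Collect every predicate and its objects for one subject."""
    properties = defaultdict(set)
    for s, p, o in graph:
        if s == subject:
            properties[p].add(o)
    return dict(properties)


merged = merge_graphs(source_a, source_b)
description = describe(merged, PROTEIN)

for predicate, objects in sorted(description.items()):
    print(predicate, sorted(map(str, objects)))
```

Because both sources share the subject URI, the merged description carries the name, length, and structure link together, and the duplicated name triple appears only once. A production system would use a triple store and agreed ontologies rather than Python sets.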
In addition, eCloudManager™ from fluidOps is in use. This app provides an integrated view of the entire data center, including storage systems, physical and virtual infrastructures, networks, applications, and business resources such as enterprise-relevant data, technical documentation, knowledge bases, and reports. Leveraging eCloudManager’s ability to deliver entire application landscapes, the Self-Service Portal is designed so that all partners can instantiate private copies on demand. To this end, the Self-Service Portal is made available as a virtual machine image that can be instantiated on private infrastructure, e.g. VMware, or, if needed, in public clouds. The portal is easy to maintain, and project partners as well as other companies will be able to install it with a mouse click.
The requirements raised in NewProt significantly improved the overall functionality of Information Workbench, in areas such as:
• Handling of Jobs
• Integration of external web services
• Stylability of the product

The NewProt App as such is a very important asset to gain customers in the life science industry. This industry is very relevant for fluidOps, since this industry is a first mover in semantic technologies. The partnerships that fluidOps established will be instrumental in building up a network of application development partners which deliver domain specific applications on top of Information Workbench. This is very important for fluidOps, since the partners' domain know-how is essential in gaining traction in this domain.

List of Websites:
The SSP portal is available at:

In the early days of the project a more informal web site was also maintained, but due to the total lack of interest (it was visited from outside the consortium only a handful of times) this web site was discontinued after one and a half years. It will remain available at its original address for a few months after the end of the NewProt project.