Community Research and Development Information Service - CORDIS

Final Report Summary - COSMOS (Integrated In Silico Models for the Prediction of Human Repeated Dose Toxicity of Cosmetics to Optimise Safety)

Executive Summary:
The cosmetics industry is vitally important to the economy of the European Union (EU) and its products are innovative as well as being essential to improve the quality of life. The safety, to consumers, of the final products, which may include varied combinations of many different ingredients and materials, is paramount; potentially harmful cosmetics products could have a deleterious effect not only on individual consumers, but also on the commercial viability of the company that produces them. Traditionally, the safety of chemicals has been assessed from the results of testing on animals combined with a knowledge of the level of exposure to that chemical. However, legislation within the EU coupled with, and as a response to, public opinion, means the testing of cosmetics ingredients on animals is no longer possible. This change in policy has come at the same time as revolutions in computational, molecular and biological sciences along with a modern perspective on toxicology, i.e. that we are able to move away from testing relatively high doses on animals, to considering more realistic exposures to human pathways, cells or even tissues. What is termed “21st Century Toxicology” relies, therefore, on alternative approaches that utilise computational modelling and results from in vitro tests.
The COSMOS Project (2011-2015) was initiated to support the computational modelling of toxicity and specifically to address the assessment of safety of cosmetics ingredients. The COSMOS Project was co-ordinated by Liverpool John Moores University and brought together a further twelve partners from the EU and two from the USA. Specifically the partners brought together expertise in toxicity data compilation and evaluation, threshold of toxicological concern (TTC), modelling of toxicity and pharmacokinetics and relevant informatics. The philosophy of the COSMOS Project was to ensure a reliable database and robust computational workflows would be freely available to aid safety assessment.
The proper harvesting, curation and presentation of toxicological data and information is a pre-requisite for modelling. The COSMOS Database (DB) is a freely available legacy from the COSMOS Project that incorporates an inventory of over 5,000 chemical structures which are known cosmetics ingredients, together with toxicity data for over 1,600 chemical substances, including the results of over 12,500 toxicity studies. These data proved to be a rich source of information to mine, enabling extraction of chemically based toxicological knowledge. In addition, safety data (the concentration at which no significant adverse effect was observed), derived from the repeated dose toxicity test results, enabled a thorough reanalysis of pre-existing TTC levels. The TTC approach was supported by a scheme that allows for route-to-route extrapolation (i.e. from oral to dermal exposure) of toxicological information. Models to predict activity of chemicals, solely from their structure, were developed for effects such as binding to DNA, protein and nuclear receptors, as well as predicting effects associated with liver toxicity. The modelling approaches were developed in concert with on-going research into relevant adverse outcome pathways (AOPs). Pharmacokinetics and the distribution of compounds in humans, following dermal and oral exposure, were also modelled. The full suite of models for pharmacokinetics and toxicity prediction is freely available as a series of KNIME workflows with accompanying web-tutorials.
There is a significant legacy from the COSMOS Project, which includes not only COSMOS DB, KNIME workflows, webinars, web-tutorials, a large number of publications and conference presentations but also the development of a new approach to assessing cosmetic safety.

Project Context and Objectives:
Society expects high quality, safe products. To ensure the safety of cosmetics ingredients, as well as many other types of chemicals, testing on animals has normally been performed. The effects seen in animals were extrapolated to those that may occur in humans and a decision made regarding the safety of a particular ingredient. However, public opinion has for decades been against animal testing of finished cosmetics products and their individual ingredients. This stimulated changes to legislation in Europe and elsewhere, and ultimately resulted in the European Union’s Cosmetics Regulation which required cessation of all animal testing, for toxicological purposes, of cosmetic ingredients marketed in the EU from 2013 onwards.
Whilst the European cosmetics industry is well aware of its ethical responsibilities with regard to animal use, it is also an enormous employer in the EU and generates considerable wealth. If no alternatives to the traditional use of animals for toxicity testing are available, the ban on testing for new ingredients could be perceived as an inhibitor of innovation and hence profitability. This stressor, combined with other pressures such as the EU REACH legislation for chemicals registration, has placed the burden on science to develop and take up advances in alternatives to toxicity testing. These advances are commensurate with the promotion of what is termed “21st Century Toxicology”; this aims to improve toxicological assessment by an understanding of the means by which compounds cause harmful effects, the amount required to cause that effect and if this harmful effect would be realised following a realistic exposure scenario – with all this information being obtained in systems directly relevant to humans. This new paradigm in toxicological assessment is based around in vitro measurement (i.e. using cells, mechanistic pathways etc) and appropriate computational modelling.
To assess the safety of cosmetics ingredients, knowledge is required about their effects following long term use. To determine this experimentally, repeated dose experiments were performed. The goal therefore is to be able to predict the effects from such an experiment, without recourse to using animals. Prediction of repeated dose toxicity has posed a real challenge to computational modelling to provide a viable alternative to animal testing. To understand this challenge, it must be understood that toxicity resulting from long-term exposure to a chemical is a consequence of perturbing a biological system at the cellular, tissue and organ (or multiple organ) level. Several factors can influence the outcome including toxicokinetics and physiological adaptive response mechanisms. Repeated dose toxicity testing provides a No Observable (Adverse) Effect Level (NO(A)EL) which is used in quantitative risk assessment of chemicals; data from 28-day or 90-day rodent oral toxicity assays are typically used. Because of these factors, added to the fact that there are many potential mechanisms and may be multiple (interacting) organ systems involved in eliciting the toxicity, in silico models have previously been considered too simplistic to model such complex interactions.
Whilst it is generally acknowledged that repeated dose toxicity is not readily amenable to “classical” correlative computational modelling such as quantitative structure-activity relationships (QSARs), recent advances in in silico modelling, such as read-across methods, can provide a solution. It is important to challenge the preconception that computational models cannot be sufficiently sophisticated to provide useful predictions as read-across, supported by mechanistic understanding and kinetics predictions, may provide a solution.
There are a number of requirements for the development of models for repeated dose toxicity. It is fundamental that models are based on reliable information. Thus, access to high quality toxicity data, on which to extract knowledge and develop models, is required. These data need to be stored in a manner that allows relevant details (such as experimental protocol, organ level effects, pathology, etc.) to be readily accessed. There is currently no single source of repeated dose toxicity data; in addition, many data may be available within industry with no means of extracting them. What is required is a comprehensive, flexible and reliable database, tailored for repeated dose toxicity, from which useful predictive models can be developed.
Toxicology is based on the premise of dose-response, i.e. that, whatever it is, a compound will only be harmful if it reaches sufficient concentration at the active site. The Threshold of Toxicological Concern (TTC) approach identifies an exposure threshold below which no adverse effect on humans is anticipated. This approach has been adopted by the US Food and Drug Administration (US FDA) to determine safe levels for food contact substances. However, to be used confidently to assess the safety of cosmetics ingredients the database on which the current TTC approach is developed needs to be expanded to be more representative of cosmetics ingredients.
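As a minimal sketch of how a TTC screen works, the following Python compares an estimated exposure with a class-specific threshold. The thresholds are the widely cited Munro (1996) non-cancer values per Cramer class in µg/person/day, not the COSMOS-derived ones. The function and variable names are illustrative assumptions and are not part of any COSMOS tool.

```python
# Munro (1996) non-cancer TTC values by Cramer class, in µg/person/day.
# These are the pre-existing thresholds, not values derived by COSMOS.
MUNRO_TTC_UG_PER_DAY = {"I": 1800.0, "II": 540.0, "III": 90.0}


def below_ttc(exposure_ug_per_day: float, cramer_class: str) -> bool:
    """Return True if the estimated exposure is below the class threshold."""
    return exposure_ug_per_day < MUNRO_TTC_UG_PER_DAY[cramer_class]


# Hypothetical exposure estimates, for illustration only.
print(below_ttc(50.0, "III"))
print(below_ttc(600.0, "II"))
```

An exposure below the threshold indicates that no adverse effect is anticipated at that exposure; it does not replace a substance-specific assessment where one is required.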
The complexity of modelling repeated dose toxicity endpoints is due, in part, to the lack of definition and (in some cases) understanding of the mechanisms of action of chronic toxicity. This has meant that in many cases it was not possible to determine the relationships that linked the chemistry behind the process to the toxic effect. This has now resulted in more concerted attempts to model individual pathways which lead to effects, rather than modelling the effect itself. Linked to this is the increase in interest in Adverse Outcome Pathways (AOPs) associated with global initiatives, such as the OECD AOP development programme. Thus there is an increased possibility to utilise the knowledge from AOPs to drive toxicity prediction.
The application of category formation and read-across has become more acceptable to address regulatory needs. Read-across is an in silico method that enables the activity of a compound of interest to be inferred using information that is known about similar compounds. A rational method is used to group chemicals together into categories of similar compounds. Data for known compounds within the category need not be restricted to a definitive toxic endpoint (e.g. ED50 value) but can include information that is indicative of toxic effect (e.g. potency in an associated in vitro assay) and thus is applicable to the prediction of repeated dose toxicity.
In addition, there is a need to link computational modelling to the outcome of relevant in vitro assays to gain better understanding of toxicity pathways in vivo. Current initiatives, such as the Tox 21 program, are subjecting a range of compounds of interest to a vast array of in vitro tests. The results from such assays can provide useful insight into mechanisms of toxic effects but require further development and insight. One of the key factors in predicting the overall biological activity of a compound is correlating toxic effect to the dose that is received by the target organ. This can differ considerably from the dose administered to the animal due to biokinetic factors (absorption, distribution, metabolism and excretion). Traditional approaches, such as QSAR, that correlate dose administered to biological activity can produce misleading models if such factors are not accounted for. Biokinetic models are currently under-developed. Improvements are required to allow for the extrapolation of information from toxicity pathways and in vitro assays to organ level effects in humans.

Project Objectives

It is clear that, to meet the needs of 21st Century Toxicology, computational modelling must move away from being a retrospective, and often academic, analysis of data and information which has often been performed in isolation. There is now a paradigm-shift towards developing models based on an understanding of the underlying mechanisms involved in eliciting an adverse effect. Development of new models will require the integration of several approaches, such as category formation and read-across, in vitro methods, in vitro to in vivo extrapolation (IVIVE) models, etc. The aim is to develop flexible and usable models that may achieve the ultimate goal of predicting NO(A)EL values using entirely non-test alternative methods. Such an aspiration requires a multinational effort to pool expertise and resources into a unique project entirely focused on addressing this challenge. The COSMOS Project was developed to meet several of these challenges by tackling some of the problems outlined above.
Specifically, the aim of the COSMOS project was to develop tools for the retrieval of data and to support the prediction of repeated dose toxicity to humans for cosmetics-related chemicals. COSMOS was at the centre of efforts to integrate reliable and open access toxicity data, greater application of the TTC approach, grouping for read-across, (Q)SARs and modelling of biokinetics, with the opportunities offered by informatics and the toxicity pathway approach. This was in line with the current paradigm-shift in toxicology towards developing models based on an understanding of the underlying mechanisms involved in eliciting an adverse effect (AOPs). The tools developed include adaptable workflows that form a set of building blocks allowing users to incorporate their own data and search existing data compilations.
The specific objectives of the COSMOS project were:
• Objective 1: to collate and curate toxicological data with an emphasis on repeated dose exposure. The COSMOS database of toxicological information for cosmetic ingredients (and beyond) was intended to provide the backbone to the development of alternative models and forms a robust platform to collect, organise and mine highly curated and quality assured in vivo and in vitro toxicity data. It was designed to have the capability of contributing to the development of alternatives in the other SEURAT-1 projects by providing access to high quality data as well as to the SEURAT-1 case studies

• Objective 2: to create an inventory of known cosmetic ingredients and associated quality controlled chemical structures. The COSMOS Cosmetics Inventory was developed to define the chemical space of cosmetics ingredients within an informatics environment. This enabled the analysis of the chemical space of cosmetics ingredients and provided a basis on which to determine the need, or otherwise, to extend the existing TTC (Munro) dataset.

• Objective 3: to extend the TTC approach and assess its applicability to cosmetics. The COSMOS Project developed TTC approaches better suited to classes of cosmetic ingredients in order to support efficient safety assessment. The TTC approaches have updated current knowledge and data sets and involved considerable quality assurance by external experts. Further analysis and effort were also placed into providing better route-to-route extrapolation, facilitating the application of a TTC value, based on oral dosing, to dermal exposure.
• Objective 4: to develop innovative toxicity prediction strategies based on chemical categories, read-across and (Q)SARs for organ level toxicity and relate these to key events in adverse outcome pathways (AOPs). The COSMOS Project aimed to provide a number of innovative computational tools for organ-level toxicity prediction, which were built around the COSMOS database and Cosmetics Inventory. In particular, chemical categories were developed from knowledge derived from AOPs. These were extended into more quantitative approaches to predicting toxic potency, e.g. (Q)SARs and refined to incorporate kinetic and metabolic studies to permit quantitative interpretation of results in terms of consumer risk. The AOP approach provided a transparent link from chemistry to toxicological effect. COSMOS supported the development and promotion of AOPs, in particular by providing a framework for organising the chemistry involved in the processes.
• Objective 5: to develop a multi-scale modelling approach including cell-based and physiologically-based kinetic (PBK) models to predict target organ concentrations and extrapolate from in vitro to in vivo exposure scenarios. Models for toxicodynamics and toxicokinetics were developed within the COSMOS Project which extended capabilities for in vitro – in vivo extrapolation (IVIVE), allowing for the better application of results from cell based assays to perform human safety assessment. Research included kinetics modelling (e.g. through physiologically-based kinetic (PBK) models); a better understanding of the effect of the test system (e.g. sorption) and chemicals’ (e.g. volatility, stability) properties relating to extrapolation; and modelling and prediction of metabolism. The intention is that these models could be used to determine the internal exposure (dose at target organ level) necessary to elicit the effect.
• Objective 6: to use KNIME technology to integrate access to data and modelling approaches into adaptable and flexible computational workflows that would be publicly accessible, providing transparent methods for use in safety assessment of cosmetics. The COSMOS Project aimed to utilise the highly adaptable KNIME computational workflow technology as a platform to create and distribute models. The KNIME web-server was used as a means to distribute the models to stakeholders.

Project Results:
The COSMOS project was structured around five Research and Technological Development (RTD) work packages (WP1-5), along with further work packages on knowledge management, dissemination and training (WP6) and administrative and financial project management (WP7). This report summarises the research results obtained within the COSMOS Project at the work package level.


The objective of work package 1 was to provide data for COSMOS partners, as well as the scientific community at large, to overcome the lack of oral repeated dose toxicity data suitable for use in safety assessment.

The main achievements of WP1:

The COSMOS Cosmetics Inventory

The COSMOS Cosmetics Inventory is a compilation of cosmetics-related ingredients incorporating information from the European Commission’s Cosmetic Ingredients (CosIng) database and the US Personal Care Products Council (PCPC) lists, including over 19000 unique International Nomenclature of Cosmetics Ingredients (INCI) names and over 9000 unique CAS numbers.

The COSMOS Database (DB)

Linked to the Cosmetics Inventory is the COSMOS DB. The COSMOS DB is a chemo-centric system which provides chemical and toxicological data to support the data needs of the COSMOS project, as well as safety assessors in public and private organisations. COSMOS DB is a high quality web-based database, completely based on open source technology, which links chemical structures to repeated dose toxicity, skin permeability and other endpoint data. In total, COSMOS DB contains more than 12,000 toxicity studies across 27 endpoints for over 1,600 compounds and more than 80,000 chemical records with more than 40,000 unique structures.

a. Data content

COSMOS DB integrates data from various sources into a unified data model:

1) Chemistry
Chemistry data have been collected from various sources. Special emphasis was put on cosmetics related chemicals. The European Union Inventory of Cosmetic Substances and Ingredients (CosIng), and the USA Personal Care Products Council (PCPC) inventories have been parsed into COSMOS DB (forming the COSMOS Cosmetics Inventory). COSMOS partners have harvested oral toxicity data from sources including the US FDA (Food and Drug Administration) PAFA (direct food additives and colourants) and OFAS (food contact substances) databases, US EPA (Environmental Protection Agency) ToxRefDB, European Commission SCCS (Scientific Committee on Consumer Safety) opinions and the scientific literature. In addition, skin permeability data were donated from the EDETOX Database and COSMOS partners.

2) Toxicity Information
The COSMOS DB contains 12,538 toxicological studies for 1,660 compounds. Within US FDA PAFA there are 12,198 studies for 27 endpoints. The oral repeat dose toxicity database (oRepeatToxDB) contains 340 studies for five endpoints.
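The study counts quoted above are mutually consistent, as the following quick check shows. It uses only the figures given in the text; the variable names are illustrative.

```python
# Study counts as quoted in the text.
studies = {"US FDA PAFA": 12_198, "oRepeatToxDB": 340}
reported_total = 12_538

total = sum(studies.values())
print(total, total == reported_total)
```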

b. Components of COSMOS DB

1. oRepeatToxDB
A subset of the COSMOS DB with full-dose level toxicity information, dermal absorption/skin permeability data, and metabolism information forms the oral repeated dose toxicity database (oRepeatToxDB). The oRepeatToxDB contains 230 cosmetics-related chemicals, for which 340 oral studies (sub-acute (duration ≥ 28 days), sub-chronic, chronic, carcinogenicity (non-neoplastic lesions), reproductive-developmental toxicity, neurotoxicity, and immunology studies) were harvested from available regulatory and literature sources (such as SCCS opinions, NTP reports) as well as primary literature publications. The oRepeatToxDB includes an ontology for phenotypic effects at each dose level using a controlled vocabulary. Toxicity effects observed at target organ sites have been organised hierarchically to relate organs to tissues to cells. Species were limited to rat/mouse for target organs, rat/mouse/rabbit for reproductive-developmental toxicity, and dog/monkeys for all studies except neoplastic effects.

2. Threshold of Toxicological Concern (TTC) Database
The COSMOS TTC DB was extracted from the oRepeatToxDB by applying a set of study selection criteria relating to the following parameters: study type, species, duration, route of exposure, dose levels and range, effects and references.
Briefly, the COSMOS TTC DB contains 552 chemicals with critical information, including critical study information, critical effects and Point of Departure (POD) values. This new database consists of No Observed Adverse Effect Level (NOAEL) / Lowest Observed Adverse Effect Level (LOAEL) data for cosmetics-related chemicals along with information on oral toxicity studies selected as “critical” by the COSMOS criteria and the review sessions (more details on the TTC database are available in the next section for WP2).

3. COSMOS Safety Evaluation Database
The COSMOS Safety Evaluation Database was constructed in order to make the new TTC information available in a database format. Information includes: study design, NOAEL/LOAEL (values and owners), critical NOAEL, critical effects at LOAEL, quality control (rationales and quality control (QC) owners).
The COSMOS Safety Evaluation Database includes all updated Margin of Safety data from SCCNFP/SCCP/SCCS as well as Margin of Exposure (MOE) and Acceptable Daily Intake (ADI) available from other regulatory bodies.

Points of departure:
• NOAEL or Benchmark Dose (Lower Confidence Limit) (BMDL) decisions
• Critical study and effects/sites
Evaluation methods:
• Margin of Safety (SCCNFP/SCCP/SCCS)
• Margin of Exposure (EU EFSA)
• Oral Reference Dose, RfD (US EPA IRIS)
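The evaluation methods listed above all relate a point of departure to an exposure or to an uncertainty factor. The sketch below illustrates the arithmetic only. It assumes the customary definitions (margin of safety as point of departure divided by systemic exposure dose, reference dose as point of departure divided by a composite uncertainty factor) and a default factor of 100. All names and numerical values are hypothetical and do not come from the COSMOS Safety Evaluation Database.

```python
def margin_of_safety(pod_mg_kg_day: float, sed_mg_kg_day: float) -> float:
    """Ratio of the point of departure (e.g. NOAEL) to the systemic exposure dose."""
    if sed_mg_kg_day <= 0:
        raise ValueError("exposure must be positive")
    return pod_mg_kg_day / sed_mg_kg_day


def reference_dose(pod_mg_kg_day: float, uncertainty_factor: float = 100.0) -> float:
    """Point of departure divided by a composite uncertainty factor (assumed 100)."""
    return pod_mg_kg_day / uncertainty_factor


# Hypothetical values, for illustration only.
noael = 50.0  # mg/kg bw/day
sed = 0.25    # mg/kg bw/day
mos = margin_of_safety(noael, sed)
print(mos, mos >= 100.0)
print(reference_dose(noael))
```

A margin of exposure is computed in the same way as the margin of safety, with the relevant point of departure (for example a BMDL) in the numerator.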

c. Data Curation

A data curation strategy was developed in order to provide oral repeated-dose toxicity data. The process includes both manual harvesting and the merging of existing data.
High quality of chemical records/structures and toxicity data within COSMOS DB was assured via formal QC and Quality Assurance (QA) procedures. Toxicity data were assessed for their quality and this information is also available in COSMOS DB. The data record reliability was rated objectively by applying COSMOS MINImum Study (MINIS) criteria which represent the set of required and recommended experimental parameters. Data acceptance was assessed by toxicologists using Klimisch scores.
Moreover, the COSMOS Data Entry System (DES) has been developed to enter chemical information into the COSMOS Database and to enable quality control work. The COSMOS DES has two modules:
1) Chemistry DES: Chemical information entry and quality control
2) Study DES: Toxicity data entry and quality control

The Chemistry DES provides annotation fields for verifications of:
• Structure: connection table, structure connection table source, stereochemistry, double bond geometry, structure representation (exact or modified), quality score,
• Compound: molecular formula, material type, composition type, inventory sources
• Registry numbers/IDs/Names: types, values and source
• Product use function/Category: CosIng use type, PAFA use types

d. Searching and exporting data from COSMOS DB

COSMOS DB supports data retrieval via a user friendly web interface, which allows querying by chemical, toxicological or both types of data. The chemical search can be carried out by name, registry numbers (CAS RNs) or other identifiers (e.g., COSMOS IDs, DSSTox IDs) provided for a single structure or for a list of chemicals, as well as by structure (sketched or provided as SMILES string). Exact, substructure and similarity structure search types are possible. The scope of the toxicological queries can be defined in detail by the users with respect to the endpoint (study type) and endpoint-specific parameters (e.g., species, strain, sex, route of administration, cells/cell lines, test calls, target sites) as well as data source. The toxicological data of interest can be retrieved for all relevant compounds included in the database (if no query chemicals are defined) or retrieved just for the specified compounds of interest. The search can be also carried out by inventories, e.g., cosmetics inventories (CosIng + PCPC).
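To illustrate what a similarity structure search does in principle, the sketch below ranks library entries by the Tanimoto coefficient between sets of structural features. The report does not state which fingerprints or similarity metric COSMOS DB uses, so the metric, the threshold, the feature sets and all names here are assumptions made for illustration.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of structural features."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_search(query, library, threshold=0.7):
    """Return (name, score) pairs scoring at or above the threshold, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(n, s) for n, s in hits if s >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Hypothetical feature sets; real fingerprints come from a cheminformatics toolkit.
library = {
    "compound_A": {"aromatic_ring", "hydroxyl", "methyl"},
    "compound_B": {"aromatic_ring", "hydroxyl", "methyl", "ether"},
    "compound_C": {"alkyl_chain", "carboxylic_acid"},
}
query = {"aromatic_ring", "hydroxyl", "methyl"}
print(similarity_search(query, library))
```

An exact search corresponds to a score of 1.0 on a complete structure representation, while a substructure search instead tests whether the query features are contained in the candidate.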

COSMOS DB also includes a web-based application allowing the export of data and predefined datasets. The workflow “Export structures and data” allows exporting of single or multiple structures in an SD file or saving compound-related data in a flat file. The workflow “TTC export” allows downloading of the pre-defined COSMOS non-cancer and Munro TTC datasets.
The content of the first version of COSMOS DB v1.0 was also made downloadable as SD file and Excel tables.

e. Public release and dissemination of COSMOS DB

COSMOS DB version 1.0 was made publicly available in December 2013. A webinar was held to explain the use and application of the database; the recording and short user guidance are available from the COSMOS website.
The second, final version, COSMOS DB v2.0, was released in December 2015. COSMOS DB was widely disseminated through presentations and posters at international conferences and meetings, as well as through other international projects.

f. Sustainability and Legacy of COSMOS DB

COSMOS DB will remain available after the end of the project and access to COSMOS DB will remain free upon registration with an email address. The COSMOS DB effort will be continued and the database will be updated by an international public data-sharing initiative led by LJMU along with other partners from the COSMOS consortium. COSMOS DB v2.0 will continue the effort toward one centralised database for public projects. COSMOS Data SharePoint will house the v2.0 content, and will be maintained and managed by Molecular Networks and Altamira.

g. COSMOS Space

COSMOS DB is supported by COSMOS Space which facilitates sharing of predictive toxicology resources (data sets, models, workflows, documentation, meta-data as wikis editable by the data owners). COSMOS Space is a publicly available resource based on a free-registration process with minimum user information requirements. It was designed, implemented, tested within the COSMOS Consortium and with external users and released publicly during this reporting period.
COSMOS Space manages the user interface to COSMOS Share (a public pool of resources within the COSMOS Space community), COSMOS DB and the COSMOS KNIME WebPortal. Resources (data sets, models, workflows, documentation, meta-data) are shared as uploads within the user’s own account. A basic scoring system has been introduced to follow up community feedback on shared resources, serving as a flag of validation and interest. Additionally, COSMOS Space facilitates links to COSMOS DB and supports the dissemination of the COSMOS KNIME workflows. In particular, COSMOS Space links to all COSMOS online resources, makes the documentation of the COSMOS KNIME workflows publicly available to external users, and provides a list of all available workflows as well as a standardised online template for workflow developers to compile the documentation.

Results Highlights within WP1:

• Constructing the freely available COSMOS DB with more than 12,000 toxicity studies across 27 endpoints for over 1,600 compounds and more than 80,000 chemical records with more than 40,000 unique structures
• Providing high quality of chemical records/structures and toxicity data within COSMOS DB which was assured via formal Quality Control (QC) and Quality Assurance (QA) procedures
• COSMOS Chemistry DES (Data Entry System) has been developed to support structure entry and quality control
• COSMOS DB “Export structures and data” functionality allows exporting of single or multiple structures in an SD file or saving compound-related data in a flat file
• COSMOS DB is supported by COSMOS Space which facilitates sharing of predictive toxicology resources (data sets, models, workflows, documentation, meta-data as wikis editable by the data owners)
• The COSMOS DB will be updated by an international public data-sharing initiative led by LJMU along with other volunteers of the COSMOS consortium


The purpose of work package 2 was to develop methods to improve and adapt the TTC concept to cosmetics ingredients, as part of the overall project objective to establish a Threshold of Toxicological Concern (TTC) database for assessing the safety of cosmetics-related chemicals for repeated-dose toxicity endpoints relevant to human health.

The main achievements of WP2:


As the basis for the evaluation of the TTC approach to cosmetics-related chemicals, a new oral repeated dose toxicity database, oRepeatTox DB (included in COSMOS DB), was compiled to be used as a resource to build the new COSMOS non-cancer TTC database of NO(A)ELs, enriched with cosmetics ingredients. This general oral repeat-dose toxicity database was the key in the process of compiling the appropriate studies to be used in construction of the TTC dataset. These studies provided underlying data for evaluation for appropriate NOAEL and LOAEL decisions. It was also important to house the NOAEL/LOAEL values in a separate simpler database with critical effects as well as the sources of the decisions and their rationales. Therefore a database with reliability scores for methods and results was built.
The final TTC database contains 556 chemicals with critical information, including critical study information, critical effects and POD values. Most of these chemicals (495) belong to the Cosmetics Inventory.

a. Curation strategy of COSMOS TTC database

To ensure a reliable database with transparency, much effort was devoted to the selection of data for chemicals and toxicity studies. Chemicals were selected when their toxicity data originated from regulatory sources including EU SCC, SCF, FDA, EPA, and JECFA.

The curation process for the COSMOS TTC dataset involved three stages:
1) Application of study inclusion criteria to all studies included in the toxicity database
• oral repeat dose (>=28 days) toxicity studies
• reproductive and developmental toxicity studies
2) Implementation of data reliability and NOAEL selection criteria
• reliability criteria
• chronic NOAELs preferred
• lowest NOAEL with clear LOAEL selected
• free standing NOAELs excluded whenever possible
3) Final assessment by toxicological experts of the remaining data in terms of:
• toxicological meaningfulness
• human relevance
• known mechanism
As it was not possible to evaluate all studies (around 500), only 25% of the existing studies underwent the QC process. During that process, nine cycles of dataset evaluation and four separate QC sessions were undertaken.
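The first two curation stages described above lend themselves to a simple filter-and-rank procedure, sketched below. The order in which the preferences are applied, the field names and the example records are assumptions made for illustration. Stage 1 is simplified to the duration criterion, and the expert assessment of stage 3 is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Study:
    study_id: str
    duration_days: int
    noael: float            # mg/kg bw/day
    loael: Optional[float]  # None marks a free-standing NOAEL
    chronic: bool
    reliable: bool


def select_critical_study(studies: List[Study]) -> Optional[Study]:
    # Stage 1: oral repeated-dose studies of at least 28 days.
    pool = [s for s in studies if s.duration_days >= 28]
    # Stage 2: keep only studies meeting the reliability criteria.
    pool = [s for s in pool if s.reliable]
    # Exclude free-standing NOAELs whenever possible.
    with_loael = [s for s in pool if s.loael is not None]
    if with_loael:
        pool = with_loael
    # Prefer chronic studies when any remain.
    chronic = [s for s in pool if s.chronic]
    if chronic:
        pool = chronic
    # Select the lowest NOAEL among the remaining candidates.
    return min(pool, key=lambda s: s.noael) if pool else None


# Hypothetical records, for illustration only.
studies = [
    Study("S1", 90, 100.0, 300.0, False, True),
    Study("S2", 730, 150.0, 450.0, True, True),
    Study("S3", 730, 50.0, None, True, True),
    Study("S4", 14, 10.0, 30.0, False, True),
]
print(select_critical_study(studies).study_id)
```

In this example S4 is dropped for its short duration and S3 for its free-standing NOAEL, after which the chronic study S2 is preferred over the sub-chronic S1.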

b. Profiles of the Chemicals

The majority of chemicals (85%) in the COSMOS TTC database are found in the Cosmetics Inventory defined by the COSING or PCPC databases. The biologically active lipid-soluble vitamins and essential amino acids were excluded from the TTC database.
The largest sources of the COSMOS Non-Cancer TTC database are EU Scientific Committee of Consumer Safety (EU SCC), US FDA PAFA and CFSAN documents, EPA ToxRefDB, and Munro. There were many overlaps between sources, although the critical studies and the point of departure decisions may vary widely.
“Munro” is the database from which the current non-cancer TTC thresholds were derived (Munro 1996). There are 190 test substance names (178 unique chemical structures) overlapping between the COSMOS and Munro Non-Cancer TTC databases. Analysis of the Cramer classification using the Toxtree v2.6 application showed that a greater fraction of chemicals belongs to Cramer Class I in the COSMOS TTC database (37%) than in the Munro dataset (less than 25%). Conversely, Cramer Class III is more widely represented in the Munro dataset (73%) than in the COSMOS TTC database (55%). Some chemical classes, such as nutrients, were removed from the COSMOS TTC database, as the TTC concept is not applied to this group of compounds by regulatory bodies. However, retinol and phenylalanine are still present in the Munro dataset. Review of the Cramer Classes showed 31 cases where the Munro and Toxtree assignments were in disagreement. These conflicts were manually reviewed by COSMOS chemistry partners to assign the Cramer Class for each case.
Chemical space is better characterised by a new categorisation method using ToxPrint chemotypes than by Cramer Classes. ToxPrint chemotypes group chemicals based on types of atom, bond, ring, function and connectivity, which are coded in the Chemical Substructure Representation Mark-up Language (CSRML) format. The comparative analyses using both Cramer Classes and ToxPrint chemotypes also confirmed that the chemical space of the new COSMOS Non-Cancer TTC database is significantly different from that of the current Munro dataset. The COSMOS TTC dataset is missing natural products (steroids), whereas the Munro dataset lacks organosilicon compounds and cationic surfactants. This confirmed the enrichment of the TTC database with cosmetics-relevant substances as compared to the original Munro dataset: the chemical space of surfactants, organosilicon and hair dye compounds was significantly increased.

c. Toxicity study profile

Although the COSMOS Non-cancer TTC database used chronic NOAELs by preference, the most abundant studies in the database are rat subchronic / short-term studies (≥28 days). Therefore, duration adjustment factors of 3-fold and 6-fold were applied to subchronic and short-term (≥28 days) studies, respectively, to derive the point of departure for chronic NOAEL values. In addition, for studies giving rise to LOAEL values alone (i.e. the LOAEL is the lowest dose tested), an adjustment factor of 3-fold was applied. No duration adjustment factors were applied to reproductive/developmental studies. These factors were applied to the original NOAEL or LOAEL values to calculate the point of departure of the chemicals for this database.
Moreover, the chemicals in the COSMOS TTC database are expected to be less potent than those in the current Munro database, since the content was enriched with cosmetics-related chemicals. The median NOEL for the whole Munro database is 20.5 mg/kg-bw/day with a mean of 222 mg/kg-bw/day (for 613 test substances), whilst the median of the whole COSMOS TTC database is two times higher.
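The adjustment rules described above can be sketched in Python as follows. This is a minimal illustration and not the COSMOS implementation: it assumes that a factor is applied by dividing the reported value, and that the LOAEL factor and the duration factor combine multiplicatively when both apply. The function name and the study-type labels are hypothetical.

```python
# Duration adjustment factors as described in the text. The dictionary keys are
# hypothetical labels, not COSMOS DB field names.
DURATION_FACTORS = {
    "chronic": 1.0,
    "subchronic": 3.0,
    "short_term": 6.0,   # studies of >= 28 days
    "repro_dev": 1.0,    # no duration adjustment applied
}
LOAEL_TO_NOAEL_FACTOR = 3.0


def point_of_departure(value_mg_kg_day, study_type, is_loael=False):
    """Return an adjusted POD (mg/kg-bw/day) from a reported NOAEL or LOAEL.

    Assumption: factors are applied by division and combine multiplicatively.
    """
    if study_type not in DURATION_FACTORS:
        raise ValueError(f"unknown study type: {study_type!r}")
    factor = DURATION_FACTORS[study_type]
    if is_loael:
        factor *= LOAEL_TO_NOAEL_FACTOR
    return value_mg_kg_day / factor
```

Under these assumptions a subchronic NOAEL of 30 mg/kg-bw/day and a short-term NOAEL of 60 mg/kg-bw/day both give a POD of 10 mg/kg-bw/day.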

d. Data distribution and thresholds

To calculate the point of departure (POD), the following Duration Adjustment Factors were applied:
• uncertainty factor of three for extrapolating a subchronic NOAEL to a chronic NOAEL
• uncertainty factor of two for extrapolating a short-term NOAEL to a subchronic NOAEL
• no uncertainty factor was applied to reproductive, developmental, or multigeneration studies
This comparison of POD values between the COSMOS and Munro TTC databases demonstrated that although the chemicals in the COSMOS TTC database are less potent, they are diverse and still broadly cover critical effects important in safety evaluations. The POD values in the COSMOS TTC database span six orders of magnitude.
Analysis of the data in the COSMOS TTC dataset revealed that the 5th percentile in the cumulative probability distribution of NOAEL values for Cramer Class I and III cosmetics is higher than the corresponding 5th percentile in the Munro dataset. Conversely, the 5th percentile in the cumulative probability distribution of NOAEL values for Cramer Class II in the COSMOS TTC database is much lower than the corresponding 5th percentile in the Munro dataset. This is caused by the presence of two significant outliers: allyl heptanoate and 3,5-di-tert-butyl-4-hydroxyhydrocinnamate.
In turn, the analysis of the data in the combined (COSMOS + Munro) dataset showed that the 5th percentiles in the cumulative probability distribution of NOAEL values for the Cramer Classes are similar to those of the Munro dataset, especially for Cramer Class III (Munro: 0.15 mg/kg-bw/day). In the case of Cramer Class I, the 5th percentile in the cumulative probability distribution of NOAEL values in the combined dataset is slightly higher than in the Munro dataset (Munro: 3.0 mg/kg-bw/day). For Cramer Class II, the 5th percentile in the combined dataset is slightly lower (Munro: 0.91 mg/kg-bw/day).
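For orientation, the step from a 5th percentile NOAEL to a human exposure threshold can be written out as below. The sketch assumes the conventions commonly attributed to the Munro (1996) derivation, namely a 100-fold safety factor and a 60 kg body weight. Neither value is stated in this report, and the function name is illustrative.

```python
def ttc_ug_per_person_day(p5_noael_mg_kg_day, safety_factor=100.0, body_weight_kg=60.0):
    """Convert a 5th percentile NOAEL (mg/kg-bw/day) into a threshold in ug/person/day.

    Assumption: 100-fold safety factor and 60 kg body weight (not given in the report).
    """
    return p5_noael_mg_kg_day / safety_factor * body_weight_kg * 1000.0


# Munro 5th percentile values (mg/kg-bw/day) by Cramer Class.
MUNRO_P5 = {"I": 3.0, "II": 0.91, "III": 0.15}
thresholds = {cls: ttc_ug_per_person_day(p5) for cls, p5 in MUNRO_P5.items()}
```

With the Munro inputs this gives roughly 1800, 546 and 90 µg/person/day, close to the thresholds usually quoted for Cramer Classes I, II and III. The same function could be applied to 5th percentiles from the COSMOS or combined datasets.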

Tiered Decision Tree

In order to address the application of TTC values derived from oral data to use cases relevant to cosmetics, a tiered decision-tree approach was developed to assess the chemicals’ bioavailability, which takes into account the absorption/permeability via dermal or oral routes as well as metabolism differences between skin and liver. Dermal absorption was calculated using an established predictive algorithm (Potts and Guy equation) to derive the maximum skin flux adjusted to the actual ‘dose’ applied. The predicted systemic availability (assuming no local metabolism) can then be ranked against the oral TTC for the relevant structural class. The predictive approach has been evaluated by deriving the experimental/prediction ratio for systemic availability for 22 cosmetic chemical exposure scenarios. These emphasise that estimation of skin penetration may be challenging for penetration-enhancing formulations, short application times with incomplete rinse-off, or significant metabolism. While there were a few exceptions, the experiment-to-prediction ratios mostly fell within a factor of 10 of the ideal value of 1. It can be concluded, therefore, that the approach is fit-for-purpose when used as a screening and prioritisation tool. Details of the development of this tiered decision tree have been published. The algorithms for the tiered decision tree and the skin permeability models were coded into KNIME workflows.
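The dermal absorption step can be illustrated with a short Python sketch. It uses the Potts and Guy regression in its commonly cited form (log Kp in cm/s as a function of log Kow and molecular weight) and takes the maximum flux as Kp multiplied by water solubility, which is a common convention but an assumption here. Capping the absorbed amount at the applied dose is one reading of "adjusted to the actual dose applied". None of this reproduces the COSMOS KNIME workflow, and all function names are hypothetical.

```python
def potts_guy_kp_cm_per_h(log_kow, mol_weight):
    """Permeability coefficient Kp (cm/h) from the Potts and Guy regression.

    Commonly cited form: log10 Kp [cm/s] = -6.3 + 0.71 * log Kow - 0.0061 * MW
    """
    log_kp_cm_per_s = -6.3 + 0.71 * log_kow - 0.0061 * mol_weight
    return 10.0 ** log_kp_cm_per_s * 3600.0  # convert cm/s to cm/h


def max_flux_mg_cm2_h(kp_cm_per_h, water_solubility_mg_per_cm3):
    """Maximum flux Jmax (mg/cm2/h), assumed to be Kp times water solubility."""
    return kp_cm_per_h * water_solubility_mg_per_cm3


def systemic_dose_mg(jmax_mg_cm2_h, area_cm2, hours, applied_dose_mg):
    """Upper-bound absorbed amount, capped at the dose actually applied (assumption)."""
    return min(applied_dose_mg, jmax_mg_cm2_h * area_cm2 * hours)
```

The resulting systemic dose, expressed per day, could then be compared with the oral TTC for the relevant Cramer Class.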

Software Tools

The non-cancer TTC databases of COSMOS and Munro have been successfully imported into the COSMOSDB DataShare Point v2.0. A web-based workflow service was established to provide a set of software tools to facilitate use of the TTC method. Three workflow tools have been implemented:

1) “Export structure and data”
Structures and other compound data can be exported from the database. A text file containing CMS ID, names, or CAS numbers can be selected and searched within this workflow.

2) “TTC Export”
The full COSMOS TTC and Munro datasets, as well as a combination of both as the “Munro expanded by COSMOS”, can be exported as an MS Excel file containing the following columns: chemistry; study info: test substance; study info: study design; study results; POD calculations; critical effects; data quality – QC results; data quality; study info: data quality; study info: references and citations.

3) “TTC workflow” to run decision tree
The user enters a compound (as SMILES or by drawing the molecule) along with the exposure value and retrieves the assessment from the system based on the Cramer Classification (using Toxtree implemented in COSMOSDB) and the current thresholds. Although COSMOS was well aware of the problems encountered in Toxtree results for Cramer Classifications, it was decided not to create another source of potential problems by implementing yet another Cramer Tree application until the problems could be solved. For this reason, the user is warned when the application encounters known problems in Toxtree.
When more than one study is reported with different NOEL/LOEL values, an algorithmic selection of studies with minimum NOEL values was performed. These methods can be applied to select studies from the US FDA PAFA or US EPA ToxRefDB databases. This functionality is provided as “export structure and data”, not as a TTC export tool.
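The decision step of the “TTC workflow” and the minimum-NOEL selection described above can be sketched as follows. The Cramer Class is taken as an input because the Toxtree classification itself is not reproduced here. The threshold values are the commonly cited Munro thresholds and are assumed to be the "current thresholds" the text refers to. The function names, record keys and warning text are hypothetical.

```python
# Commonly cited Munro thresholds (ug/person/day) by Cramer Class; assumed here
# to correspond to the "current thresholds" used by the workflow.
TTC_UG_PER_DAY = {"I": 1800.0, "II": 540.0, "III": 90.0}


def ttc_assessment(cramer_class, exposure_ug_per_day, toxtree_warning=False):
    """Compare an exposure with the TTC for the given Cramer Class."""
    if cramer_class not in TTC_UG_PER_DAY:
        raise ValueError(f"unknown Cramer Class: {cramer_class!r}")
    threshold = TTC_UG_PER_DAY[cramer_class]
    return {
        "cramer_class": cramer_class,
        "threshold_ug_per_day": threshold,
        "below_threshold": exposure_ug_per_day <= threshold,
        # Mirrors the warning shown when a known Toxtree problem is encountered.
        "warning": "known Toxtree classification issue, review manually"
        if toxtree_warning else None,
    }


def select_minimum_noel(studies):
    """Return the study record with the lowest NOEL among several for one chemical."""
    return min(studies, key=lambda study: study["noel_mg_kg_day"])
```

For example, an exposure of 50 µg/person/day for a Cramer Class III compound falls below the assumed threshold of 90 µg/person/day, whereas 120 µg/person/day does not.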

Results Highlights within WP2:

• The COSMOS non-cancer TTC database of NO(A)ELs, enriched with cosmetics ingredients was constructed
• The TTC database has been carefully curated during QC processes
• More than 25% of the database has been reviewed for data acceptance by experts in safety assessment through study review sessions
• Evaluation of Cramer classes
• Chemical space of COSMOS non-cancer TTC database has been expanded to reflect cosmetics-related chemicals.
• “Export TTC” functionality allows exporting full COSMOS TTC and Munro datasets as well as a combination of both as the “Munro expanded by COSMOS” as MS Excel file
• A tiered decision-tree approach was developed to address dermal exposure to cosmetics-related chemicals as a guide to estimate systemic exposure when applying the oral TTC


Along with the overall project objective to develop innovative strategies to use existing and novel in silico approaches to predict toxicity, the purpose of Work Package 3 was to optimise in silico methods, such as (Q)SAR and read-across, for the purpose of long-term toxicity prediction of cosmetic ingredients.

The main achievements of WP3:

Computational Tools for Toxicity Prediction

In order to develop new models for toxicity prediction, an evaluation of QSAR models to predict the chronic toxicity endpoints that are important for TTC thresholds, e.g. repeated dose toxicity and selected target organ/tissue toxicities was undertaken. Then a number of computational (so-called in silico) models were developed and evaluated within the COSMOS Project.

a. Modelling of the binding to nuclear receptors (NRs) involved in the development of liver steatosis

Within the COSMOS project, alternative in silico (non-testing) methods were developed to support toxicity prediction in the MoA framework. More specifically, different in silico methodologies, including (Q)SAR and molecular modelling, have been employed and integrated for the evaluation of potential binding to NRs involved in the development of liver steatosis. A variety of nuclear receptors could play a role in liver steatosis, including LXR (liver X receptor), PXR (Pregnane X Receptor), AhR (Aryl hydrocarbon receptor), ER (estrogen receptor), PPARα and PPARγ (peroxisome proliferator-activated receptor isoforms α and γ). The focus of the COSMOS research in silico studies has been mainly on LXR and PPARγ receptors.

1) PLS-DA classification models for LXR binding prediction

A QSAR classification model for the prediction of potential LXR binders was developed based on the PLS-DA (Partial Least Squares – Discriminant Analysis) classification method using the commercial in-house software SIMCA and MOSES molecular descriptors. The model is based on three latent variables derived from seven MOSES 2D descriptors which encode basic electronic properties, hydrophobicity, molecular shape and complexity. Moreover, two related PLS-DA models based on freely available molecular descriptors, namely PaDEL and RDKit descriptors, were developed. The PLS-DA classification models implemented in KNIME showed similar classification performance and applicability domains as compared to the original QSAR model developed in SIMCA.

2) Molecular Modelling methods for LXR binding prediction

Available crystal structures of human LXR (α and β isoforms) complexes from the Protein Data Bank (PDB), in addition to the available experimental data on LXR binding affinity and activation from literature and available databases (e.g., ChEMBL), were collected. Different molecular modelling (MM) approaches, including both ligand- and structure-based methods (i.e., ensemble docking, e-pharmacophore, and fingerprints-based similarity), were used in order to characterise the ligand binding domain of LXR and to define the essential features leading to LXR binding. A validation dataset, including known LXR binders, was assembled to assess the ability of the developed MM methods to identify LXR binders. The MM approaches developed were then integrated by means of data fusion/consensus modelling with the aim of optimising the predictive performances by taking advantage of the different modelling methodologies.

3) Nuclear receptor ligand screening

Structural and physico-chemical features of nuclear receptor ligands (e.g., ligands of RAR, RXR, LXR and PPAR) have been examined using data (e.g. Ki and EC50) from the ChEMBL database and data (i.e. ligand-protein interactions) from the PDB database. The gathered information was used to select relevant chemical features of potential NR ligands. For each NR, the information on relevant substructures (SMARTS strings of scaffolds and functional groups) and physico-chemical features, e.g. ranges for molecular weight, log P, vertex adjacency information magnitude, topological polar surface area, number of hydrogen bond donors and rotatable bonds, is checked and, if suitable, the compound is assigned as a potential ligand. For NR ligands which are associated with hepatosteatosis, an additional filter based on structural alerts for liver steatosis is applied.

b. Models for gastrointestinal absorption (GIA) and skin permeability prediction

The estimation of bioavailability after oral and dermal administration is of key importance in the prediction of the chronic toxicity of cosmetic-related ingredients. In fact, whilst cosmetics are usually applied dermally, the majority of available repeated dose toxicity data are obtained from oral administration. The extrapolation of chronic toxicity data (e.g. NOEL/NOAEL) from oral to dermal exposure routes, i.e. the “oral-to-dermal extrapolation”, is one of the issues addressed within the COSMOS Project in order to extend the current TTC approaches to cosmetic substances.
To support bioavailability estimation after oral and dermal exposure, several scenarios for use of NOAEL data have been proposed and in silico models predicting GIA and skin permeability were developed.

1) PAMPA permeability prediction
The PAMPA (parallel artificial membrane permeation) assay is a high-throughput in vitro assay for the prediction of oral passive absorption. Permeability constants obtained from the double-sink PAMPA assay were used to develop a multiple linear regression (MLR) model, as an improvement of an existing model. The new QSAR model implemented in KNIME uses log D and the TPSA/MW ratio as descriptors and a data set of 276 compounds from the Database of Double-Sink PAMPA logP0, logPm at pH = 6.5 and logPm at pH = 7.4. Because of the lack of freely available tools for logD estimation, two implementations of the model were produced based on logD estimations readily obtainable through free online services, one calculated by ACD/Labs tools and one calculated by ChemAxon tools.

2) Skin permeability prediction
Potts and Guy’s QSPR to predict skin permeability was modified by incorporating a larger dataset and a statistical tool to assess data quality, namely confidence scoring (CS). The QSAR model consists of a multivariate linear regression based on two CDK descriptors, i.e. molecular volume (MV) and lipophilicity (XLogP), using CS as weights.
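A regression of this kind, with per-compound weights, can be illustrated with a weighted least-squares fit. The sketch below is a generic pure-Python implementation for two descriptors and is not the COSMOS model: the report gives no coefficients, and how confidence scores were converted into weights is not stated, so the weights are simply passed in. The function name is hypothetical.

```python
def weighted_least_squares(rows, targets, weights):
    """Fit y = b0 + b1*x1 + b2*x2 by minimising sum(w * residual**2).

    rows:    (x1, x2) descriptor pairs, e.g. (molecular volume, XLogP)
    targets: observed values, e.g. log Kp
    weights: one non-negative weight per observation, e.g. a confidence score
    Returns [b0, b1, b2].
    """
    n_coef = 3
    # Weighted normal equations (X^T W X) b = X^T W y as an augmented matrix.
    aug = [[0.0] * (n_coef + 1) for _ in range(n_coef)]
    for (x1, x2), y, w in zip(rows, targets, weights):
        x = (1.0, x1, x2)
        for i in range(n_coef):
            for j in range(n_coef):
                aug[i][j] += w * x[i] * x[j]
            aug[i][n_coef] += w * x[i] * y
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: collinear descriptors or too few points")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n_coef):
            if r != col:
                ratio = aug[r][col] / aug[col][col]
                for c in range(col, n_coef + 1):
                    aug[r][c] -= ratio * aug[col][c]
    return [aug[i][n_coef] / aug[i][i] for i in range(n_coef)]
```

Observations with a higher weight pull the fitted coefficients more strongly, which is the intended effect of using a data-quality score as the weight.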

c. Chemical Space Analysis

A computational procedure for explorative analysis of the chemical space of datasets/chemical inventories was developed combining multiple approaches and descriptors. The procedure for chemical space analysis includes three main analyses:

1) physico-chemical space analysis
The physico-chemical space of the dataset is analysed by means of distribution analysis of individual physico-chemical properties and Principal Component Analysis (PCA) based on physico-chemical properties. A sub-set of descriptors accounting for physico-chemical and electronic properties were calculated using the PaDEL-Descriptors software and used for the physico-chemical space analysis as part of the Chemical Space Analysis Workflow.

2) structural space analysis
The structural space of the dataset is analysed by means of Principal Component Analysis based on 1D-2D molecular descriptors. A sub-set of molecular descriptors accounting for mono- and bi-dimensional structural features were calculated using the PaDEL-Descriptors software and used for the structural space analysis as part of the Chemical Space Analysis Workflow.

3) functional groups profiling
The functional group profiling of the dataset is performed by means of frequency rate analysis based on defined molecular features (e.g., functional groups). Substructure fingerprints calculated using the PaDEL-Descriptors software were used for the functional group profiling as part of the Chemical Space Analysis Workflow.

d. Structural alerts

The fundamental requirement for the development of a category suitable for predicting toxicological endpoints is the ability to group chemicals together based on a common molecular initiating event (MIE). The MIE is the interaction between a chemical and a biological system that results in the initiation of the biological cascade leading to an adverse outcome. Structural alerts define the key features of a molecule that are required for commencing the MIE and provide the mechanistic knowledge about the MIE. A collection of structural alerts that induce the same MIE is considered to be an in silico profiler.
Four in silico profilers have been developed:

1) In silico profiler for covalent DNA binding
It is well known that for both mutagenicity and carcinogenicity one of the fundamental steps is the formation of a covalent bond between nucleophile and electrophile. The mechanistic importance of this chemical reaction makes the mechanistic alert approach the natural choice for developing the profiler for DNA binding. Moreover, the assessment of the mechanistic domain overlap between corresponding structural alerts in the literature compilations has been investigated. This analysis ensured that for a given structural alert the maximum mechanistic information (and thus domain) was extracted. A total of 111 structural alerts crossing six broad organic chemistry mechanisms (domains): acylation (AC), Michael addition (MA), Schiff base formation (SB), unimolecular aliphatic nucleophilic substitution (SN1), bimolecular aliphatic nucleophilic substitution (SN2) and radical mechanism (Rad) have been created and defined as SMARTS patterns.

2) In silico profiler for covalent protein binding
The in silico profiler for covalent protein binding was developed based on the review of current scientific knowledge on structural alerts relating to a number of toxicity endpoints such as: skin sensitisation, respiratory sensitisation and aquatic toxicity. There are several publications in which structural alerts for direct acting and indirect acting electrophiles have been published. The existing structural alerts have been mapped in terms of their relationships with mechanistic organic chemistry (i.e. identify alerts from the published compilations related to covalent protein binding). The mapping was performed to achieve maximum overlap and usability whilst restricting redundancy in the alerts, and to ensure that the alerts are related to the molecular initiating event of covalent protein binding.
As a result, a total of 108 structural alerts covering five broad organic chemistry mechanisms (domains): acylation (AC), Michael addition (MA), Schiff base formation (SB), bimolecular aliphatic nucleophilic substitution (SN2) and aromatic nucleophilic substitution (SNAr) have been created. The identified mechanistic structural alerts were defined as SMARTS patterns. Also the detailed mechanistic chemistry reaction associated with each of the alerts has been compiled. The mechanistic information is intended to outline how the alert can act as a direct electrophile or how it can be converted into an electrophile. Therefore, the important consideration within the mechanistic chemistry framework is the inclusion of potential metabolic activation.

3) In silico profiler for hepatotoxicity
The ability of a compound to cause adverse effects to the liver is one of the most common reasons for drug development failures and the withdrawal of drugs from the market. However, the complexity and diversity of hepatotoxicity, the limited (if any) mechanistic insight, the lack of large high quality liver toxicity datasets and the role of metabolism, ensure that hepatotoxicity is one of the most difficult endpoints to model.
The in silico profiler for hepatotoxicity was developed by generating 16 chemical categories and then developing structural alerts able to identify potential hepatotoxins. This was achieved by grouping chemicals based upon their structural similarity; the mechanism(s) by which these compounds cause hepatotoxicity were then investigated and a mechanistic rationale was proposed, where possible, to yield mechanistically supported structural alerts. Alerts of this nature have the potential to be used in the screening of compounds to highlight potential hepatotoxicity. The identified structural alerts were further defined as SMARTS patterns and implemented in KNIME.
However, taking into account the complexity of hepatotoxicity, the structural alerts are not limited to a single mechanism of action. Indeed, it is probable that many hepatotoxic compounds elicit their toxicities via multiple mechanisms of action. Therefore, it is also possible that the alerts within this profiler may possess the ability to initiate multiple adverse outcome pathways leading to hepatotoxicity.

4) In silico profiler for mitochondrial toxicity
The ability to predict organ-level toxicity is becoming increasingly important to the long term goal of replacing animal use in determining a Lowest Observed (Adverse) Effect Level (LO(A)EL). The toxicity induced by mitochondrial dysfunction has been linked to a variety of organ toxicities within kidney, liver and nervous tissues. Therefore, there is an urgent need to develop models able to detect potential mitochondrial toxicants at an early stage. Structural alerts offer this ability to identify chemicals which can disrupt mitochondrial function.
Twenty-one structural alerts for mitochondrial toxicity were developed based around clearly defined mechanistic information. This was achieved by grouping chemicals based upon their structural similarity, followed by a literature search to elucidate mechanistic information for the chemicals in categories associated with toxicity to mitochondria. The identified structural alerts were defined as SMARTS patterns and implemented in KNIME.

Development of metabolic profilers

Two metabolic profilers have been developed, based on literature research, publicly available databases and the evaluation of metabolism datasets:

• The first profiler deals with liver metabolism, including first-pass effects via phase I (functionalisation, e.g. hydroxylation) and phase II (conjugation, e.g. glucuronidation) reactions.

• The second profiler is for skin metabolism, based on knowledge of enzymes present in the skin and reactions they perform, e.g. alcohol dehydrogenase or esterase.

These two different profilers account for the variation in the oral administration route (i.e., first-pass metabolism in liver) and the dermal administration route (i.e., first-pass metabolism in skin). First-pass effects reduce systemic availability and increase localised activity of the metabolites in the skin. On the other hand, physicochemical properties of compounds resulting in rapid penetration significantly reduce the potential for first-pass dermal metabolism during percutaneous penetration. For example, glycol ethers could be metabolised in the basal layer of the epidermis, but are known to rapidly penetrate the skin.
An investigation of metabolism in liver and skin confirmed that there are significant differences between the two sites for cosmetic ingredients.

The most important metabolic reactions in skin are:
• Carboxylesterases (24% of enzymatic transformations) - Phase I
• Alcohol dehydrogenases (18%) - Phase I
• Glutathione-S-transferases (12%) - Phase II
By contrast, the top three metabolic reactions in liver are:
• Hydroxylases (26%, mostly by CYPs) - Phase I
• Glucuronyltransferases (14%) - Phase II
• Sulfotransferases (9%) - Phase II

Models should also accommodate temporal scales to investigate time dependent concentrations in different tissues. This is an important aspect to evaluate whether those concentrations are beyond acceptable toxic effect levels in any tissue at particular time points. For this reason, models for the prediction of total and hepatic clearance (CLt and CLh) were also developed.

Ranking methods and ranking models

A priority setting procedure was developed by means of innovative in silico approaches and chemometric tools to allow for the screening and ranking of chemicals according to their toxicity profiles. Ranking is equivalent to sorting chemicals according to their relative levels of concern, thus providing the basis for analysing trends across multiple endpoints; additionally ranking may lead to the identification of different profiles of toxicological behaviour, which might also be regarded as different subcategories.
Different methodologies (e.g., consensus modelling, data fusion and total order ranking methods) have been employed, compared and integrated to develop an in silico procedure for ranking chemicals. With the aim of exemplifying the use of different ranking approaches that combine results from different sources, a number of case-studies were carried out:
1) ranking chemicals based on their potential binding to Liver X Receptor (LXR)
2) ranking chemicals across multiple endpoints, including LXR binding potential and liver toxicity potential.
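As a simple illustration of ranking across multiple endpoints, the sketch below orders chemicals by their mean rank over several endpoint scores. Mean-rank aggregation is used here only as a stand-in: the report names consensus modelling, data fusion and total order ranking but does not specify the algorithms, and the function names, endpoint labels and score convention (higher score means higher concern) are assumptions.

```python
def rank_scores(scores):
    """Map chemical -> rank for one endpoint (1 = highest concern).

    Tied scores share the average of the ranks they occupy.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for name in ordered[i:j + 1]:
            ranks[name] = average
        i = j + 1
    return ranks


def consensus_ranking(endpoint_scores):
    """Order chemicals by mean rank across endpoints (highest concern first).

    endpoint_scores: {endpoint name: {chemical: score}}, with the same
    chemicals scored for every endpoint. Ties are broken alphabetically.
    """
    per_endpoint = [rank_scores(s) for s in endpoint_scores.values()]
    chemicals = set(per_endpoint[0])
    mean_rank = {c: sum(r[c] for r in per_endpoint) / len(per_endpoint) for c in chemicals}
    return sorted(chemicals, key=lambda c: (mean_rank[c], c))
```

A call such as `consensus_ranking({"lxr_binding": {...}, "liver_toxicity": {...}})` would return the chemicals sorted from highest to lowest combined concern under these assumptions.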

Implementation of the Models into KNIME Workflows

When feasible, the (Q)SAR models and computational methods developed within WP 3 (described above) were implemented as KNIME workflows to use with the free KNIME software. The KNIME Analytics Platform is a pipeline tool for data analysis, data manipulation, data visualisation, and reporting (more details in WP5 section). By means of graphical workflows, data are read from various data sources and subsequently transformed into suitable formats for model building and/or visual analysis. The KNIME technology integrates access to databases and modelling approaches into flexible computational workflows that are being made publicly accessible and provide a transparent method for use in the safety assessment of cosmetics. Thus, the implementation of the developed computational models/methods into KNIME workflows allows for model transparency, availability and reproducibility. End-users will be able to adapt or extend the workflows in the KNIME Analytics Platform (which was formerly known as KNIME Desktop) to their needs.
In addition, the workflows have been implemented as COSMOS KNIME WebPortal versions. The WebPortal versions of the workflows allow for an even easier execution of the workflows since they can be run in a web browser without installation of the software (or knowledge of the software), using a wizard-like execution where selected configuration options are requested from the user. Some modifications were required to adapt the KNIME desktop versions for their execution in the WebPortal. In particular, modifications were applied to adapt the workflows for the “input” windows (e.g., to allow the user to select the type of input data), as well as for the workflows output, i.e. the download and reporting of results. Concerning the reporting, standardised COSMOS formats were used.

Documentation of computational models

Documentation templates have been developed within the COSMOS project to provide key information on the workflows, including information on workflow development, model background, encoded algorithms, required parameters and user guidance. Such documentation is provided for the execution of both the desktop and WebPortal versions of the workflows and is available through COSMOS Space.
Additionally, web tutorials have been recorded guiding users step by step through the execution of the workflows in the COSMOS KNIME WebPortal and are available on YouTube (see WP 6 section).
With regard to the (Q)SAR models developed for LXR binding prediction, detailed documentation is also provided according to the JRC (Q)SAR Model Reporting Format (QMRF). The QMRF is a harmonised template for summarising and reporting key information on (Q)SAR model development, validation and applicability domain.
In addition, the COSMOS models translated into KNIME workflows have also been documented in the EURL ECVAM DataBase service on Alternative Methods to animal experimentation (DB-ALM) as well as in the ToxBank Warehouse.

Supporting SEURAT-1 Level 2 and 3 Case Studies

Molecular Modelling (MM) studies performed within COSMOS WP3 for the prediction of LXR and PPARγ binding (and activation) were proposed for SEURAT-1 Proof of Concept Case Studies. This Level 2 case study was a proof of concept that molecular modelling (MM) methodologies can be employed in predictive toxicology as part of an integrated strategy. MM methodologies were used to address specific molecular initiating events (MIEs) involved in the development of liver steatosis, i.e. LXR binding and PPARγ binding (and activation). To predict LXR binding potential, several MM methodologies were used, including both ligand- and structure based methods (i.e., ensemble docking, e-Pharmacophore and fingerprints-based similarity). The MM approaches developed were then integrated by means of data fusion/consensus modelling. To predict PPARγ binding, a virtual screening protocol was assembled by first applying molecular docking, and then by filtering the generated poses with a pharmacophore model which was generated based on the X-ray complexes of the three most active PPARγ full agonists. 3D QSAR (CoMSIA) models were also developed to predict PPARγ activation.
The usage of MM approaches to predict receptor binding/activation provided hints for the characterisation of molecular mechanisms that trigger further downstream events and promote the development of liver toxicity. In the AOP framework, MM may thus play an important role by working in synergy with other in silico (QSAR, chemotypes, alerts) and in vitro approaches.
Moreover, the in silico procedure developed, which integrates a variety of in silico approaches (e.g., molecular modelling, QSAR and structural alerts) and chemometric tools in an innovative way, was also employed for the screening and ranking of chemicals to support the Read-across and Ab initio SEURAT-1 Level 3 case studies.

Results Highlights within WP3:

• An extensive review of the state of the art of in silico methods to predict chronic toxicity is provided. This confirms the increasing role of category formation and read-across in toxicity prediction.
• Development of a number of computational methods predicting chronic toxicity endpoints that are important for TTC thresholds
• Incorporation of kinetic and metabolic studies in computational toxicology workflows. Specifically, models were built for hepatic clearance and metabolism.
• Development of an in silico procedure for the integration of different modelling approaches by means of consensus modelling and data fusion methods supporting SEURAT-1 Case Studies
• Implementation of the models and data analysis methods into flexible and user-friendly KNIME workflows with related documentation and user guidance


The purpose of WP4 was to bring together a series of innovative methods and models, including PBPK models, to predict target organ doses. In particular, the extrapolations from information from in silico and in vitro systems to in vivo effects were addressed.

The main achievements of WP4:

Development of PBPK models

Physiologically-based toxico-kinetic (PBTK) models are crucial models to extrapolate from in vitro measurements to in vivo predictions. In the context of the ban of animal testing in cosmetics hazard assessment, these models have to be calibrated based on in vitro data and mathematical models.
The following tools to predict repeat exposure and simulate kinetic and toxic effects were built within WP4:

a. The Virtual Cell Based Assay (VCBA)

The purpose of this model is to simulate the fate of a chemical and the intracellular concentration leading to cell perturbation. The VCBA comprises five interconnected sub-models:
1) The fate and transport model that takes into consideration evaporation, partitioning of chemicals from the dissolved phase to serum proteins and lipids, adsorption onto the plastic, and also degradation and metabolism
2) The cell partitioning model that considers partitioning of the chemical between four compartments: one aqueous fraction (intracellular water) and three non-aqueous fractions (proteins, lipids and mitochondria)
3) The cell growth and division model that is based on the four cell cycle phases
4) The toxicity and effects model that simulates the direct effects of a chemical concentration on cell dynamics
5) The experimental set-up model that takes into account the surface area, size and shape of the well
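
To make the first sub-model more concrete, the sketch below estimates the freely dissolved fraction of a chemical in the well from a generic equilibrium mass balance between the medium, serum proteins, lipids and the plastic wall. It is an illustration under stated assumptions, not the VCBA equations themselves: all parameter names and values are hypothetical, and evaporation, degradation and metabolism are ignored.

```python
def free_fraction(k_protein, c_protein, k_lipid, c_lipid,
                  k_plastic, area_per_volume):
    """Freely dissolved fraction at equilibrium (assumed linear partitioning).

    k_protein, k_lipid : partition coefficients to serum protein and lipid
    c_protein, c_lipid : protein and lipid content of the medium
    k_plastic          : plastic/water distribution coefficient
    area_per_volume    : wetted plastic area divided by medium volume
    Units must be chosen so that each product is dimensionless.
    """
    bound_terms = (k_protein * c_protein
                   + k_lipid * c_lipid
                   + k_plastic * area_per_volume)
    return 1.0 / (1.0 + bound_terms)


# Hypothetical example: the well geometry enters through area_per_volume,
# which links this estimate to the experimental set-up sub-model.
nominal_conc = 50.0  # nominal concentration, arbitrary units
fraction = free_fraction(k_protein=2.0, c_protein=0.5,
                         k_lipid=10.0, c_lipid=0.02,
                         k_plastic=0.1, area_per_volume=3.0)
print("free concentration:", nominal_conc * fraction)
```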

b. BK/TD models for mixtures

WP4 aimed to develop biokinetic/toxicodynamic (BK/TD) models to predict HepaRG viability over time following repeated exposure to cosmetic mixtures. Long-term cell viability was studied through impedance measurements of HepaRG cells exposed repeatedly, every 2 to 3 days for 4 weeks, to mixtures of three hepatotoxic cosmetic ingredients: coumarin, isoeugenol and benzophenone-2 (BP2). Based on preliminary analyses of cell viability following exposure to the single compounds, a mixture BK/TD model was built to predict real-time cell viability following repeated exposures.
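
A repeated-exposure BK/TD model of this kind can be sketched as follows. The structure is an assumption made for illustration (first-order loss of each compound from the medium, medium renewal at every dosing, and additive hazards acting above a per-compound threshold); it is not the published COSMOS model, and all parameter values are hypothetical.

```python
def simulate_mixture_viability(compounds, n_doses, interval_h, dt=0.01):
    """Explicit Euler simulation of viability under repeated mixture dosing.

    Each compound is a dict with hypothetical keys:
      dose      : concentration set at every medium renewal (e.g. uM)
      k_loss    : first-order loss rate from the medium (1/h)
      k_kill    : killing rate per unit concentration above threshold
      threshold : concentration below which no effect is assumed
    Returns the viable fraction (1.0 = fully viable) after the last interval.
    """
    viability = 1.0
    concentrations = [0.0 for _ in compounds]
    steps_per_interval = int(round(interval_h / dt))
    for _ in range(n_doses):
        # Medium renewal: every compound is reset to its nominal dose.
        concentrations = [c["dose"] for c in compounds]
        for _ in range(steps_per_interval):
            # Biokinetics and toxicodynamics: sum the hazards of all compounds.
            hazard = 0.0
            for i, c in enumerate(compounds):
                excess = max(concentrations[i] - c["threshold"], 0.0)
                hazard += c["k_kill"] * excess
                concentrations[i] -= c["k_loss"] * concentrations[i] * dt
            viability -= hazard * viability * dt
    return viability


# Hypothetical two-compound mixture dosed every 48 h for 4 weeks (14 doses).
mixture = [
    {"dose": 100.0, "k_loss": 0.05, "k_kill": 1e-4, "threshold": 10.0},
    {"dose": 20.0, "k_loss": 0.02, "k_kill": 2e-4, "threshold": 5.0},
]
print(simulate_mixture_viability(mixture, n_doses=14, interval_h=48.0))
```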

c. Physiologically Based Kinetic (PBK) models

PBK models represent the body as a set of interconnected compartments described by mathematical (differential) equations that facilitate a quantitative description of the Absorption, Distribution, Metabolism and Excretion of a chemical/metabolite, taking into account physiological, physicochemical and biochemical parameters. These models were built in order to simulate relevant dose and time-concentration profiles of a cosmetic ingredient and its relevant metabolites during absorption, distribution, metabolism and excretion within the body. PBK models can be used to extrapolate from high to low doses, from route to route and from in vitro to in vivo.
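
As an illustration of the compartmental idea, the sketch below integrates a deliberately small PBK-style model (gut, blood and liver, with metabolism in the liver) after a single oral dose. The compartment structure, the parameter values and the explicit Euler integration are assumptions chosen for brevity; they do not correspond to any specific COSMOS model.

```python
def simulate_pbk(dose_mg, hours, dt=0.001):
    """Simulate amounts (mg) in gut, blood and liver after an oral dose."""
    # Hypothetical parameter values, for illustration only.
    ka = 1.0        # gut absorption rate constant (1/h)
    q_liver = 90.0  # liver blood flow (L/h)
    v_blood = 5.0   # blood volume (L)
    v_liver = 1.8   # liver volume (L)
    p_liver = 2.0   # liver:blood partition coefficient
    cl_int = 20.0   # intrinsic hepatic clearance (L/h)

    gut, blood, liver, eliminated = dose_mg, 0.0, 0.0, 0.0
    for _ in range(int(round(hours / dt))):
        c_blood = blood / v_blood
        c_liver_out = liver / v_liver / p_liver  # concentration leaving the liver
        absorbed = ka * gut                           # gut -> liver (mg/h)
        exchange = q_liver * (c_blood - c_liver_out)  # net blood -> liver (mg/h)
        metabolised = cl_int * c_liver_out            # hepatic metabolism (mg/h)
        # Mass balance: every flux leaving one compartment enters another.
        gut -= absorbed * dt
        blood -= exchange * dt
        liver += (absorbed + exchange - metabolised) * dt
        eliminated += metabolised * dt
    return {"gut": gut, "blood": blood, "liver": liver, "eliminated": eliminated}


state = simulate_pbk(dose_mg=100.0, hours=24.0)
print({name: round(amount, 3) for name, amount in state.items()})
```

Because the fluxes cancel pairwise, the total amount across the four state variables stays equal to the dose, which is a useful sanity check when extending the model with further compartments or exposure routes.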

d. 1-D and 2-D Liver models

Slightly more complex models describe the liver as a 1D tube representing the liver sinusoid. This modelling strategy incorporates the transport of substances coupled to the hepatocytes metabolism and allows a simple representation of a zonated liver, where substance metabolism depends on the position of each hepatocyte in the sinusoid.
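
The zonation idea can be sketched with a steady-state plug-flow tube divided into hepatocyte-sized segments, each with its own metabolic rate. The sketch assumes first-order metabolism, a rate that varies linearly from the periportal to the pericentral end, and no back-mixing; these are simplifications made for the example, and all parameter values are hypothetical.

```python
import math


def sinusoid_profile(c_inlet, k_periportal, k_pericentral,
                     length, velocity, n_cells):
    """Concentration at the inlet and after each hepatocyte along the sinusoid.

    Solves dC/dx = -k(x) * C / velocity segment by segment, with k(x)
    evaluated at the midpoint of every segment (linear zonation assumed).
    """
    dx = length / n_cells
    conc = c_inlet
    profile = [conc]
    for i in range(n_cells):
        x_mid = (i + 0.5) * dx
        k_local = k_periportal + (k_pericentral - k_periportal) * x_mid / length
        conc *= math.exp(-k_local * dx / velocity)
        profile.append(conc)
    return profile


# Hypothetical sinusoid: 20 hepatocytes, metabolism faster at the pericentral end.
profile = sinusoid_profile(c_inlet=10.0, k_periportal=0.5, k_pericentral=2.0,
                           length=1.0, velocity=1.0, n_cells=20)
print("outlet / inlet:", profile[-1] / profile[0])
```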

Additionally, a 2D liver model was developed which includes mechanisms for cell necrosis and cell proliferation. This model allows the effect of compound accumulation on hepatocyte viability and detoxification capacity to be analysed.
A case study on acetaminophen toxicity demonstrated that repeated dosing of acetaminophen is more dangerous than a single high dose, with a critical threshold at ~4 g/d administered over 7 days.

Development of Human Bioaccumulation model

Traditionally, bioaccumulation potential has been assessed in aquatic or terrestrial organisms, but not directly in humans. To address this shortcoming, an approach for predicting human bioaccumulation potential based on a simple PBTK model was developed and coded in the Simcyp software. This generic PBTK model was able to predict the bioaccumulation potential of a chemical, expressed as the human bioconcentration factor (hBCF), based on selected physicochemical parameters, in vitro human liver metabolism and plasma-protein binding data, minimal renal excretion and a constant exposure scenario.
The PBTK model was re-written in R, and re-implemented as an open source KNIME workflow. This model was designed to incorporate not only the chemical properties of the compounds, but also the processes that tend to decrease the concentration of the compound, such as metabolism.

IVIVE Extrapolation to Target Organ Level

Two independent approaches for extrapolating in vitro toxicity effects to the target organ level have been developed:

1) The first approach models long-term toxicity data using impedance metrics. Because long-term toxicity testing is costly, acute toxicity data were also used to extrapolate to chronic toxicity. The model couples dynamic descriptions of the major in vitro kinetic processes involved with a simple model of viability loss. The model was applied to describe HepaRG cell viability loss following exposure to three cosmetics-related substances: coumarin, isoeugenol and benzophenone-2.

2) In the second approach the PBPK model was coupled with a cell compartment by using the Virtual Cell Based Assay (VCBA) dynamic outputs and simulating cell viability and mitochondrial membrane potential. This approach was used to perform in vitro to in vivo extrapolation (IVIVE) for two case study compounds, caffeine and coumarin. The time profile curves as well as the dose response curves were simulated in order to perform forward as well as backward extrapolation. The combined models have been automated and implemented in a KNIME workflow which is able to predict internal concentrations. By applying real case scenarios, it was shown how the tool can be used to calculate the so-called Margin of Internal Exposure, which forms the basis of a risk characterisation.
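
The forward and backward extrapolation and the Margin of Internal Exposure can be illustrated with a short sketch. It assumes linear kinetics, so that the internal concentration scales with the external dose through a single factor that would in practice come from a PBK simulation, and it takes the Margin of Internal Exposure as the ratio of an internal point-of-departure concentration to the internal concentration under consumer exposure. Function names and numbers are hypothetical.

```python
def forward_concentration(external_dose, conc_per_unit_dose):
    """Forward extrapolation: internal concentration for a given external dose."""
    return external_dose * conc_per_unit_dose


def reverse_dose(target_concentration, conc_per_unit_dose):
    """Backward extrapolation: external dose giving a target internal concentration."""
    if conc_per_unit_dose <= 0:
        raise ValueError("conc_per_unit_dose must be positive")
    return target_concentration / conc_per_unit_dose


def margin_of_internal_exposure(internal_pod, internal_exposure):
    """Ratio of the internal point of departure to the internal exposure."""
    if internal_exposure <= 0:
        raise ValueError("internal_exposure must be positive")
    return internal_pod / internal_exposure


# Hypothetical values: 0.5 concentration units per unit of external dose.
factor = 0.5
internal = forward_concentration(external_dose=2.0, conc_per_unit_dose=factor)
print("internal concentration:", internal)
print("dose for in vitro effect level 10:", reverse_dose(10.0, factor))
print("MoIE:", margin_of_internal_exposure(internal_pod=10.0,
                                           internal_exposure=internal))
```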

Implementation of the Models into User-Friendly KNIME Tools

The models developed within WP4 that rely on mathematical algorithms, such as the Physiologically-Based Kinetic and Dynamic (PBK/D) models and the Virtual Cell Based Assay (VCBA) models, were translated into automated, user-friendly KNIME workflows. Depending on the chemical information, the KNIME workflow can be simple (only one R script) or complex (multiple R scripts). The KNIME workflows are available as desktop and WebPortal versions. For both versions a user guide and appropriate documentation are available through COSMOS Space.
Additionally, web tutorials have been recorded that guide users step by step through the execution of the workflows in the COSMOS KNIME WebPortal; these are available on YouTube.

Results Highlights within WP4:

• Development of tools for simulating the long-term (repeat exposure) toxic effects of chemicals in in vitro systems and the kinetics in humans
• Proofs of concepts to support in vitro to in vivo and route-to-route (oral to dermal and inhalation to dermal) extrapolations based on PBTK models. A set of methods and multiscale models (from cells to the whole body) for better assessments of cosmetic ingredients safety without animal experiments
• Development of approaches for IVIVE Extrapolation to Target Organ Level
• Easy to use and freely available tools (KNIME workflows) supporting risk assessment
• Supporting experiments and model predictions for the SEURAT-1 ab initio case study


Aligned with the overall project objective to integrate open source and open access modelling approaches into adaptable and flexible in silico workflows, the purpose of Work Package 5 was to enhance KNIME, an open source modular integration platform to enable all other partners to reuse, deploy, and archive their data processing, analysis, and prediction workflows.

The main achievements of WP5:

Improvements of KNIME Server and KNIME Analytics Platform

KNIME is the modular integration platform for the database and computational toxicity prediction methods which are being developed in the COSMOS project. By means of graphical workflows, data are read from various data sources and subsequently transformed into suitable formats for model building and/or visual analysis. The KNIME technology integrates access to databases, data processing and analysis, as well as modelling approaches into flexible computational workflows that will be adaptable and form a set of building blocks allowing users to incorporate their own data and search existing data compilations.
KNIME provides a simple extension application programming interface which allows for easy integration of new methods which are usually represented by so-called nodes. Since KNIME is open source it is a suitable platform for developing and deploying the computational methods that are being developed in the different COSMOS working areas.
The KNIME Analytics Platform is a desktop program which runs locally on a computer and uses a directory on that computer to store the workflows. In the context of the COSMOS project it is desirable that workflows can easily be shared by all groups during development and that, once they are ready for the public, they are easily accessible and usable even for non-experts in KNIME. The KNIME Server offers central storage for workflows that can be accessed directly from within the KNIME Analytics Platform client. To support the implementation and dissemination of the computational models developed in the COSMOS project, the KNIME Server as well as the KNIME Analytics Platform desktop application have been extended and improved in many ways. The following extensions and improvements were triggered by the COSMOS project:
• An interactive integration of the R programming language. In some workflows R scripts are used and the interactive R nodes make it easy to develop them directly inside the workflow.
• A set of nodes for calculating common statistics and hypothesis testing, such as t-tests, Cronbach's alpha, or the Kruskal-Wallis test.
• The ability to select columns in node dialogs based on exact names, on patterns, or on the column type. Especially when user-supplied data are used, column names often cannot be determined in advance or are misspelled. In such cases the new column selection components, which are now used in most nodes, offer more flexibility.
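
The three column selection modes in the last item can be mimicked outside KNIME with a few lines of Python. This is only a conceptual sketch, not KNIME's API: the function name, the wildcard syntax (taken from Python's `fnmatch` module) and the example table are assumptions, and here the three criteria are combined as a union.

```python
from fnmatch import fnmatchcase


def select_columns(columns, names=None, pattern=None, dtype=None):
    """Select columns by exact name, wildcard pattern, or data type.

    `columns` maps column name -> Python type of the column's values.
    A column is kept if it matches any of the criteria that were given.
    """
    selected = []
    for name, column_type in columns.items():
        by_name = names is not None and name in names
        by_pattern = pattern is not None and fnmatchcase(name, pattern)
        by_type = dtype is not None and issubclass(column_type, dtype)
        if by_name or by_pattern or by_type:
            selected.append(name)
    return selected


# Hypothetical user-supplied table whose column names are not known in advance.
table_columns = {"SMILES": str, "logP": float, "logKow": float,
                 "MW": float, "n_rings": int}

print(select_columns(table_columns, names={"SMILES"}))
print(select_columns(table_columns, pattern="log*"))
print(select_columns(table_columns, dtype=float))
```

Pattern- and type-based selection is what keeps a workflow usable when a user uploads a file whose descriptor columns are named differently from those used during development.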

KNIME WebPortal

The COSMOS workflows were also transferred into user-friendly WebPortal versions. The COSMOS KNIME WebPortal is freely accessible and is supported by COSMOS Space. The KNIME WebPortal allows users to access the KNIME Server and execute workflows through a web interface from any recent web browser, without installing the software locally and without knowledge of the KNIME workflows as such. Thus, whereas the KNIME desktop application is mainly intended for workflow developers, end users who only want to run a workflow with custom data can access the server via the KNIME WebPortal. The WebPortal allows for step-by-step execution. Each step asks for user input, such as files or model parameters, potentially also dependent on previous inputs. After all inputs have been provided, the workflow is executed and the results can be downloaded as files and/or summarised in graphical reports.

a. Strategy for Legacy and Maintenance

The COSMOS WebPortal, including the workflows, is currently hosted on a server at UNIBRAD. As long as the hardware is maintained, KNIME will ensure that the COSMOS KNIME Server keeps running (e.g. extending licenses, updating with bugfixes, etc.). Later on, the workflows could also be copied onto KNIME’s Public Example server so that they remain available and downloadable by users.


Computational models developed in the COSMOS Project have been coded using the open access KNIME workflow technology. KNIME workflows were developed in the local desktop application (which is completely Open Source). Every workflow can easily be uploaded to a KNIME Server and is available for download (depending on the user permissions). Moreover, every workflow can also be executed directly on the server and the fully executed workflow including the results can be downloaded to the client.
The COSMOS KNIME Server has two different areas for storing workflows. The “private” area is only accessible to workflow developers, whereas the “public” area is visible to everyone. The group information is taken from COSMOS Space and can be changed by an administrator.

The public area contains the following workflows:

• Absorption

o PAMPA permeability estimation
o Skin permeability estimation

• Biokinetics

o Bioaccumulation
o In-vitro to in-vivo extrapolation for…
- Caffeine
- Estragol
- Styrene
o Physiological based kinetics for…
- Caffeine
- Coumarin
- Estragol
- Ethanol
- Hydroquinone
- Isopropanol
- Methyliodide
- Nicotine
- Styrene
o Virtual Cell-based Assay

• Chemical Space Analysis

• Molecular Initiating Events
o Covalent DNA Binding Alerts
o Covalent Protein Binding Alerts
o Hepatotoxicity Alerts
o Mitochondrial Toxicity Alerts

• Nuclear Receptor Binding
o LXR binding potential (based on PaDEL or RDKit)
o Nuclear receptor ligand binding

• Ranking

Each workflow is supported by extensive documentation available in COSMOS Space. The workflow documentation describes the purpose of the workflow and its structure and, most importantly, provides user guidance. Specifically, the user guidance explains in an easy-to-follow, step-by-step manner how to execute the workflow.
The workflow documentation can be downloaded as pdf from COSMOS Space or can be directly printed.
Additionally, for most workflows short tutorial videos were created and are available on YouTube.
These web tutorials are also accessible directly from the following links:
• Tutorial for “Physiologically Based Kinetic models”:
• Tutorial for “Virtual Cell Based Assays”:
• Tutorial for “Skin Permeability Estimation”:
• Tutorial for “PAMPA logPm predictor”:
• Tutorial for “LXR Binding Prediction”:
• Tutorial for “Potential Nuclear Receptor ligands and alerts towards hepatosteatosis”:

Moreover, the COSMOS models translated into KNIME workflows were also documented in the EURL ECVAM DataBase service on Alternative Methods to animal experimentation (DB-ALM). The DB-ALM is a public, factual database service that provides evaluated information on the development and application of advanced and alternative methods to animal experimentation in biomedical sciences and toxicology, both in research and for regulatory purposes. Currently, three models developed by the COSMOS project are recorded in DB-ALM: “Physiologically Based Kinetic models”, “Virtual Cell Based Assays” and “The Human Bio-Accumulation Model”. Documentation for the remaining models has been submitted to DB-ALM and should be published soon.

Results Highlights within WP5:

• Various usability improvements and new nodes in KNIME (e.g. opening via double-click on server; Python integration; a node for data and model validation)
• Implementation of the models developed within WP3 and WP4 both as adaptable KNIME workflows for use in local software and as user-friendly WebPortal versions
• Public release of KNIME workflows in WebPortal supplemented by detailed documentation
• Support of KNIME workflows through web tutorials


In line with the overall project objective to fully engage stakeholders through the efficient dissemination of results, ensuring strong impact and integration with companion projects inside and outside the SEURAT-1 cluster, the purpose of WP6 was to ensure full and adequate dissemination of the COSMOS databases, models and workflows as well as efficient integration with other projects.

The profile of the COSMOS Project was raised and its results were disseminated through:
• A dedicated web-site, general project leaflet and additional leaflets to highlight specific topics
• Interaction with stakeholders
• Webinars and training events
• Publications and conference presentations

In addition to the general website the following COSMOS components are available at related URLs:
COSMOS Database


COSMOS WP 7 led by the co-ordinator LJMU was established to ensure the smooth operation of the project and its overall management.

Potential Impact:
The COSMOS Project has provided a high quality database of toxicological information and computational models which will assist in the prediction and assessment of the hazards associated with cosmetics ingredients. These outputs will directly support the safety/risk assessment of the use of existing, and potentially novel, ingredients in cosmetics. This research undertaken in the COSMOS Project supports the area of computational modelling as it is being implemented in the vision of 21st Century Toxicology. Overall, this enables more relevant and reliable information relating to human safety to be obtained; it contributes to the reduction of animals for toxicological assessment; and it assists in the development of cheaper and greener products.
Provision of data is crucial for the assessment of toxicity. The COSMOS Project has provided a database which will greatly improve data availability for repeated-dose toxicity assessment of cosmetics-related chemicals. This has been achieved by COSMOS partners working with regulatory authorities, industry and other trade organisations to obtain donations of data. These have been supplemented by direct data harvesting within the COSMOS Project in order to extract data from publicly available sources such as the Opinions of the (European Union’s) Scientific Committee on Consumer Safety (SCCS). Further, these data were supplemented by those for other toxicities. Another significant data gathering exercise compiled a large amount of information on the absorption of chemicals through the skin e.g. % absorbed, maximal flux and permeability coefficient, resulting in the largest publicly available database of information relating to dermal penetration. There is the potential for a considerable impact of COSMOS DB on regulatory agencies, cosmetics industry, research institutes, universities, and small/medium enterprises who may use this information to make safety decisions regarding new and future ingredients and impurities. The data held within such a database are also vital for efforts to develop novel computational models and to support read-across predictions of toxicity to fill data gaps. The COSMOS Project has also provided strategies for the curation and harvesting of chemical structures and toxicity data. This is vital to ensure the reliability of data held within such databases. COSMOS supports information exchange and feedback enabling use of, and confidence in, project outcomes through access to the freely available public resources: COSMOS DB, COSMOS Space and KNIME WebPortal. 
COSMOS DB has also provided the platform to encourage interactions and collaborative approaches between SEURAT-1 partners, notably with the ToxBank Project, to enable mining of data and information underpinning prediction of target organ toxicity. As a legacy of the COSMOS DB, the integrity of the data has been assured by the availability of a “flat” file that may be read into a spreadsheet, as well as the establishment of the COSMOS Share Point to maintain the database and provide a means of collecting further data in an openly available format.
The Threshold of Toxicological Concern (TTC) approach has had a considerable impact on risk assessment of chemicals in many sectors e.g. pharmaceuticals, foods. Specifically, TTC has the potential to have an enormous impact on the safety/risk assessment of cosmetics-related substances for which chemical-specific toxicity data are lacking. Provided that reliable exposure data are available, the TTC approach provides a relative assessment of the safety of exposure to a substance without recourse to any testing. As such, it provides a potential solution to the safety/risk assessment of many cosmetics ingredients and impurities, or any material that cannot be tested. Traditional approaches to TTC have been derived from data and chemicals not necessarily representative of cosmetics ingredients. The COSMOS Project has provided a new database for the application and extension of the TTC approach. The database has enriched, and greatly expanded, the existing (Munro) dataset by the inclusion of No Observed (Adverse) Effect Levels (NO(A)ELs) for cosmetics ingredients. Importantly, the new non-cancer TTC database is transparent and open (available via COSMOS DB) and incorporates recommendations for how the data should be used. As part of the development of the new COSMOS TTC database, considerable effort was placed into quality assurance, essentially providing a blueprint for new studies in this area with regard to assessing the quality, or otherwise, of published toxicity studies. Since TTC is a pragmatic method recommended by the European Food Safety Authority (EFSA), as well as by the SCCS, for the safety/risk assessment of chemicals found in food, cosmetics or consumer products, this new TTC database will have broad economic impact on the cosmetics industry and is expected to be widely applied. In order to apply the TTC values derived from oral data to use cases relevant to cosmetics, the COSMOS Project has extended the current TTC approach to cosmetics-related chemicals.
Among the strategies to address this issue, a tiered approach has been developed for the assessment of chemicals’ bioavailability, which takes into account the absorption/permeability via dermal or oral routes. This provides novel models indicating possible systemic bioavailability following dermal or oral absorption. The findings of the TTC approach have been presented to the SCCS (November 2015) and have been commented upon in the Opinions of the SCCS.
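
How a TTC check with a bioavailability adjustment might look is sketched below. The threshold values are the commonly cited Munro thresholds for Cramer Classes I to III (1800, 540 and 90 µg/person/day), not the values derived from the COSMOS TTC database. The single absorbed-fraction adjustment is a simplifying assumption and does not reproduce the tiered COSMOS approach, and the exposure figures are hypothetical.

```python
# Munro et al. thresholds per Cramer class, in ug/person/day.
MUNRO_TTC_UG_PER_PERSON_DAY = {"I": 1800.0, "II": 540.0, "III": 90.0}


def systemic_exposure(external_ug_per_day, fraction_absorbed):
    """Scale an external (e.g. dermal) exposure by the fraction absorbed."""
    if not 0.0 <= fraction_absorbed <= 1.0:
        raise ValueError("fraction_absorbed must be between 0 and 1")
    return external_ug_per_day * fraction_absorbed


def below_ttc(exposure_ug_per_day, cramer_class):
    """True if the exposure does not exceed the threshold for the Cramer class."""
    return exposure_ug_per_day <= MUNRO_TTC_UG_PER_PERSON_DAY[cramer_class]


# Hypothetical ingredient: Cramer Class III, 200 ug/day applied dermally,
# with 40 % assumed to reach the systemic circulation.
applied = 200.0
internal = systemic_exposure(applied, fraction_absorbed=0.4)
print("applied dose below TTC:", below_ttc(applied, "III"))
print("absorbed dose below TTC:", below_ttc(internal, "III"))
```

The example shows why the bioavailability step matters: the applied dose exceeds the Class III threshold, while the absorbed dose under the assumed 40 % absorption does not.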

Novel models for the prediction of toxicity of cosmetics ingredients have been developed within the COSMOS Project. These models are openly and freely available and allow the user to make a prediction from chemical structure alone. The models make predictions for the following properties that are important for risk / safety assessment of cosmetics ingredients: absorption of a compound following oral and dermal administration; binding to nuclear receptors (important receptors, interaction with which can be a pre-cursor to toxicity) that may be indicative of harmful effects, e.g. steatosis, to the liver; ranking of chemicals in terms of toxic potential; and structural alerts (i.e. specific fragments of molecules) associated with binding to DNA and protein, general hepatotoxicity and specific effects relating to mitochondrial toxicity. These models require the input of chemical structure only; as such they are increasingly seen as viable alternatives to animal testing. Results from the COSMOS Project, along with efficient use of data and information from COSMOS DB will support the increasing trend for using “read-across” following the formation of categories of similar compounds for organ level toxicity prediction. These techniques will be supplemented by the optimisation of molecular modelling methodologies for their application in predictive toxicology, particularly assisting with the implementation of the mechanistic (Adverse Outcome Pathway - AOP) framework for toxicity prediction. Studies within the COSMOS Project have been performed to generate knowledge and set the foundation for future development of models and applications. The impact of these models will be through the area of computational toxicity prediction with particular significance to industry (cosmetics and beyond) as well as international efforts to improve this area of science.
The findings and models from the COSMOS Project will also support the overall area of in vitro to in vivo extrapolation (IVIVE). The project has provided models and methodologies for route-to-route extrapolations with an emphasis on oral-to-dermal extrapolation. This is important for safety/risk assessment when only oral data are available. The extension of models for IVIVE has improved the relevance of in vitro testing. Further, the calibration of Physiologically-Based ToxicoKinetic (PBTK) models based on in vitro and in silico data has provided tools to assist in the prediction of target organ concentrations following a specific topical exposure. These models have been supported by a 2D liver model that can be coupled to PBTK models to refine the assessment of kinetics and hepatotoxic effects, as well as by QSAR models to predict metabolites and the rate of disappearance of the parent compound (total and hepatic clearance). Overall, these models will have a significant impact on safety/risk assessment of the organ level toxicity to humans of cosmetics and related ingredients.
The data compilation and modelling efforts in the COSMOS Project have been supported by appropriate open access informatics and workflow approaches through the KNIME technology. This has enabled data and models to be accessed and utilised by all stakeholders and has provided a platform, through the KNIME WebPortal, for the development of further tools beyond the lifespan of COSMOS. The impact of this approach has been to make the models publicly available, ensuring their uptake by industry, regulators and academia. The models are transparent and can be developed further by the user e.g. by incorporating new data or knowledge.
The COSMOS Project had considerable impact with cross-(SEURAT-1) cluster activities. For instance, the COSMOS Project contributed data curation and organisation to a number of high level case studies demonstrating the use of read-across as a technique to predict toxicity and fill data gaps. Further, COSMOS partners were instrumental in developing read-across arguments and understanding. The impact of these activities has been to realise a strategy and framework to document and support read-across which is relevant to industry users when making regulatory submissions.
COSMOS successfully implemented a dissemination plan targeting key stakeholders. The public face of the COSMOS Project was the web-site, which was updated regularly with news items, project results, publications and links to COSMOS DB and the models. Newsletters were distributed electronically and a series of project leaflets (general information, COSMOS DB, computational models) were created and distributed widely at over twenty key conferences and symposia. General information on the COSMOS project (and on the TTC approach) was presented at meetings of the SCCS. A large number of conference presentations were made, including the organisation of specific sessions (for example at the Society of Toxicology annual meeting in 2014); the COSMOS legacy also benefitted from a dedicated COSMOS dissemination day, contributions to training events and the final SEURAT-1 symposium in Brussels. In addition to the resources (COSMOS DB and KNIME workflows) made available on the internet, a large number of training videos and tutorials have been developed. The project also provided a permanent legacy of its work through publication in a number of key journals.
In summary the COSMOS Project will provide a legacy of data compilation, strategies for the further use of informatics to support computational toxicology and models for toxicity prediction. There will be a very broad impact to all stakeholders involved in the safety/risk assessment of cosmetics ingredients as well as in other sectors. The public will ultimately benefit in the development of safer and cheaper products, with reduced reliance on information from animal studies.

List of Websites:
COSMOS web site:
COSMOS Database:


Reported by

United Kingdom