Community Research and Development Information Service - CORDIS

H2020

KConnect Report Summary

Project ID: 644753
Funded under: H2020-EU.2.1.1.4.

Periodic Reporting for period 1 - KConnect (Khresmoi Multilingual Medical Text Analysis, Search and Machine Translation Connected in a Thriving Data-Value Chain)

Reporting period: 2015-02-01 to 2016-01-31

Summary of the context and overall objectives of the project

The healthcare sector consists of many stakeholders, including the pharmaceutical and medical products industries, healthcare providers, health insurers, clinicians and patients. Each stakeholder generates pools of textual data, which have typically remained disconnected, and the amount of information to analyse in the health sector is growing rapidly. Two types of medical text are of particular interest in KConnect: published scientific papers and Electronic Health Records (EHR). According to Medline Trend, 1,120,070 papers were published in Medline in 2013, almost double the number published in 2003 (591,637). Making sense of the knowledge contained in this volume of complex unstructured text can only be done rapidly enough through the use of (semi-)automated text analysis techniques.

A hospital with 250,000 active patients generates one terabyte of text data per year. It is essential to process this data for Comparative Effectiveness Research, to predict which treatments work best for which patients; for Predictive Modeling, to flag patients with potential negative developments (e.g. potentially suicidal psychiatric patients); and for Quality Control of the healthcare system. As increasing numbers of medical establishments realise the potential of EHR analysis, and the cost of not doing this analysis in terms of inefficiency and unnecessary loss of life, demand for such solutions will increase significantly in the coming years.
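As a rough check on the figures quoted above, the following Python sketch works out the growth in Medline publications and the average volume of text per patient. The variable names are illustrative, and two assumptions are not taken from the report: growth is treated as steady compound growth over the ten years, and one terabyte is taken as 10^12 bytes.

```python
# Figures quoted in the text: Medline paper counts and hospital text volume.
papers_2003 = 591_637
papers_2013 = 1_120_070

# Ratio of 2013 to 2003 publications ("almost double").
growth_factor = papers_2013 / papers_2003

# Assumption: steady compound growth across the ten-year interval.
annual_growth = growth_factor ** (1 / 10) - 1

# Assumption: one terabyte is 10**12 bytes (decimal definition).
bytes_per_year = 10**12
active_patients = 250_000
bytes_per_patient = bytes_per_year / active_patients

print(f"Growth factor 2003-2013: {growth_factor:.2f}")
print(f"Implied annual growth: {annual_growth:.1%}")
print(f"Text per active patient per year: {bytes_per_patient / 10**6:.1f} MB")
```

Under these assumptions the publication count grew by a factor of about 1.89, and the hospital figure corresponds to roughly 4 MB of text per active patient per year.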

KConnect builds on the multilingual medical text processing technologies developed in the very successful Khresmoi FP7 project. The core of the exploitation involves making the technologies available as cloud-based services (on public clouds and installable on private clouds) that are straightforwardly extensible to new languages. The companies involved have commercial interests in exploiting these technologies to provide vertical search solutions in the medical domain, innovate their medical search portals and install EHR analysis solutions. They will also create momentum for uptake of the technology by building a Professional Services Community of companies able to consult on and install the solutions.

The overall objective of the KConnect project is to create a medical text Data-Value Chain with a critical mass of participating companies using cutting-edge commercial cloud-based services for multilingual Semantic Annotation, Semantic Search and Machine Translation of Electronic Health Records and medical publications.

To achieve this overall objective, the KConnect project has six sub-objectives:
1. Facilitate straightforward end-user adaptation of KConnect’s multilingual medicine-specific Semantic Annotation, Semantic Search and Machine Translation technologies to new languages, by making available language adaptation toolkits.
2. Productise multilingual medicine-specific Semantic Annotation, Semantic Search and Machine Translation services through a cloud-based market and as installable packages on private clouds.
3. Facilitate integration of multilingual medicine-specific Semantic Annotation, Semantic Search and Machine Translation technologies into online health portals and vertical search solutions through two routes: the cloud-based market and locally installed as part of a private cloud solution.
4. Expand the multilingual medicine-specific Semantic Annotation, Semantic Search and Machine Translation technologies to the analysis of patient records, to allow straightforward implementation of innovative solutions within hospitals.
5. Develop pricing models and business models to exploit both the cloud-based market and customised vertical search solution approaches.
6. Ensure impact and take-up through the effective dissemination and communication of project results, in particular through the creation of a KConnect Professional Services Community.

Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

The annotation pipeline adaptation toolkit was created and evaluated, with Swedish and Hungarian components integrated. The toolkit for training machine translation systems, Eman Lite v1, was developed. Tools for classification and log analysis have also been developed.

The first version of the Cloud Market was deployed to Amazon Web Services. It includes services for semantic annotation, semantic search and machine translation. The adaptation for local installation of the services has also been designed.

The KConnect machine translation service is already implemented in the production system of the TRIP search engine. Early prototypes of the PREC and HON systems are already available, and experiments with further components on the TRIP data are underway.

Early semantic annotation pipelines and semantic indices for patient records were developed and are available for both English and Swedish.

Extensive consultation with potential clients was carried out to determine the value proposition of KConnect services. Initial versions of the pricing and business models have been designed.

Extensive dissemination and communication of project results took place in the first year of the project. The highlight was the project booth at the ICT 2015 event in Lisbon in October 2015.

Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

Progress beyond the state of the art took place in a number of areas. For semantic annotation, the main advances were new methods for treating data in medical records, in particular temporal data. The toolkits for straightforward adaptation to new languages are also a step beyond the state of the art for semantic annotation and machine translation. The use of annotation to enhance classification and search log analysis in the medical domain also provided useful new results.

Through the KConnect cloud market, we expect to encourage many companies to adopt the KConnect technologies, which should deliver the high impact planned for the project.

Related information

Record Number: 190125 / Last updated on: 2016-11-08