Periodic Reporting for period 2 - SIFRm (Semantic Indexing of French Biomedical Data Resources - mobility)
Reporting period: 2018-09-01 to 2019-08-31
The Semantic Indexing of French Biomedical Data Resources (SIFR/SIFRm, www.lirmm.fr/sifr) project investigates the scientific and technical challenges of building ontology-based services that leverage biomedical ontologies and terminologies for indexing, mining and retrieval of biomedical data. Our main goal is to enable straightforward use of ontologies, freeing health researchers from knowledge engineering issues so that they can concentrate on the biological and medical challenges.
Within SIFR, we build an ontology-based indexing workflow (the SIFR Annotator) similar to what exists for English resources but specialized for other EU languages, starting with French. This service is available within a portal of ~30 French biomedical ontologies/terminologies, which reuses the NCBO BioPortal technology developed at Stanford University. The SIFR BioPortal was released in June 2015 (http://bioportal.lirmm.fr) and has been actively used and improved since then. Recently, the SIFR Annotator has been enriched to process clinical data and contextualize the annotations (negation, temporality, experiencer). We now offer, for both English and French, a unique open ontology-based annotation service that recognizes ontology concepts and contextualizes them, allowing users who are not natural language processing experts to annotate and contextualize medical conditions in clinical notes.
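The idea of recognizing concepts and then contextualizing them can be illustrated with a deliberately small sketch. This is not the SIFR Annotator's implementation: the vocabulary, the concept identifiers (TOY:...), the cue list and the function name annotate are all hypothetical, and the negation check is a simplified trigger-word lookup loosely in the spirit of NegEx-style approaches. Temporality and experiencer detection are left out.

```python
import re

# Hypothetical toy vocabulary: surface term -> made-up concept identifier.
# Real ontology concepts and identifiers would come from an ontology repository.
TOY_VOCABULARY = {
    "fever": "TOY:0001",
    "diabetes": "TOY:0002",
    "chest pain": "TOY:0003",
}

# Simplified negation cues (an assumption for illustration, not a real lexicon).
NEGATION_CUES = ("no", "denies", "without")


def annotate(text, window=3):
    """Find vocabulary terms in text and flag those preceded by a negation cue.

    A term counts as negated when one of the `window` words before it,
    within the same sentence, is a negation cue.
    """
    lowered = text.lower()
    annotations = []
    for term, concept_id in TOY_VOCABULARY.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            # Restrict the context to the current sentence (split on '.').
            sentence_start = lowered.rfind(".", 0, match.start()) + 1
            context = lowered[sentence_start:match.start()]
            preceding_words = re.findall(r"\w+", context)[-window:]
            negated = any(cue in preceding_words for cue in NEGATION_CUES)
            annotations.append(
                {
                    "concept": concept_id,
                    "term": term,
                    "start": match.start(),
                    "end": match.end(),
                    "negated": negated,
                }
            )
    # Return annotations in reading order.
    return sorted(annotations, key=lambda a: a["start"])


if __name__ == "__main__":
    note = "Patient denies chest pain. History of diabetes."
    for annotation in annotate(note):
        print(annotation)
```

In this sketch, the mention of chest pain would be flagged as negated because "denies" precedes it in the same sentence, while the mention of diabetes would not. A production service additionally has to handle term variants, multiple languages and much richer context rules.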
In addition, we are abstracting and generalizing our results to agronomy by offering an ontology repository for agronomical ontologies called AgroPortal. The AgroPortal project is a community effort started by the Montpellier scientific community (LIRMM, IRD, CIRAD, INRA, Bioversity International) to build an ontology repository for agronomy and related domains (food, plant sciences and biodiversity). Our goal is to encourage the adoption of metadata and semantics to facilitate open science. By enabling straightforward use of ontologies, we expect data managers and researchers to focus on their tasks, without requiring them to deal with the complex engineering work needed for ontology management.
SIFR/SIFRm (2013-2019) is a collaborative action between LIRMM and BMIR, previously funded by the French ANR Young Researcher program and currently by the EU H2020 Marie Sklodowska-Curie Program (2016-2019). Dr. Clement Jonquet, SIFR’s principal investigator, is an assistant professor at University of Montpellier & LIRMM, and was previously a visiting scholar at Stanford BMIR, within Prof. Mark Musen’s team.
• We deployed, customized and maintain an ontology repository for French biomedical ontologies/terminologies, the SIFR BioPortal (http://bioportal.lirmm.fr), which hosts 30 terminologies and ontologies and offers multiple ontology-related services to the community.
• We developed a proxy web service for the NCBO Annotator (http://bioportal.lirmm.fr/ncbo_annotatorplus) that gives access, for English data, to new features that have been investigated and implemented within SIFR. These features are now also included in the original NCBO BioPortal.
• We worked on automatic detection of emotions in public health forums using text mining techniques and built a patient vocabulary out of public patient-written resources (http://bioportal.lirmm.fr/ontologies/MUEVO).
• We develop, enhance and maintain the AgroPortal platform prototype (http://agroportal.lirmm.fr), whose goal is to offer a reference ontology repository for the agronomic/plant domain. This is a major outcome of SIFRm, and it has now become an independent project.
• We produced 24 open access scientific publications or communications (with explicit acknowledgement of SIFRm) including: 8 international articles in journals such as Bioinformatics (Oxford), Web Semantics (Springer), Data Semantics (Springer), Biomedical Semantics (BMC); 1 dissemination journal; 7 international conferences or workshops; 3 national conferences or workshops.
In collaboration with the ANR PractiKPharma project, we are investigating the challenges of processing clinical text data and semantically annotating Electronic Health Records of the G. Pompidou Hospital to extract pharmacogenomics data. In addition, the results of the project are not limited to French (they also cover English and Spanish), and we are transferring our results to the agronomic domain in the context of the AgroPortal project (http://agroportal.lirmm.fr).
AgroPortal is a core component of D2KAB (www.d2kab.org), a new ANR-funded project started in mid-2019. This project gathers 10 French partners for 4 years and aims to create a framework to turn agronomy and biodiversity data into knowledge (semantically described, interoperable, actionable, open) and to investigate the scientific methods and tools to exploit this knowledge for applications in agriculture and biodiversity sciences.
Most of SIFRm’s journal publications are gold open access and the project developments are all open source: https://github.com/sifrproject and https://github.com/agroportal