CORDIS - EU research results

Biodiversity Community Integrated Knowledge Library

Periodic Reporting for period 2 - BiCIKL (Biodiversity Community Integrated Knowledge Library)

Reporting period: 2022-11-01 to 2024-04-30

At present, the seamless exchange of biodiversity data is limited by technical and organisational barriers. A lack of standards and inefficient exchange of data among Research Infrastructures (RIs) impair research progress, increase costs and limit the possibilities for innovation. At the same time, the demand for biodiversity research is growing in response to urgent societal needs, such as mass species extinction, loss of vital ecosystem services, the discovery and threat of new diseases, and the development of bio-based materials, technologies and energy, among others.

BiCIKL catalyses a culture change in the way biodiversity data are identified, linked, integrated and re-used. BiCIKL pursues its mission by enabling 15 key biodiversity infrastructures to exchange data among themselves and with external users. To make this possible, the RIs adopt the FAIR principles so that their data become Findable, Accessible, Interoperable and Reusable. The objectives and tasks of BiCIKL converge on these principles.

BiCIKL delivers global-level access to data and tools along the entire biodiversity research cycle: collecting specimens > extracting molecular sequences > identifying species > analysing and publishing > constructing a biodiversity knowledge graph > re-using data for new scientific discoveries and other societal needs.
During its lifetime, BiCIKL made significant progress towards its four key objectives:

Find: Ensure seamless discoverability of data through globally unique identifiers from each participating infrastructure and across data domains. Examples of standards for the use of persistent identifiers (PIDs) that BiCIKL developed and adopted, and of related results, are:

• Improved data discovery at large, including taxon names, specimens, genetic sequences and literature at each participating RI and across RIs, including a central data discovery tool.
• Aligned best practices and standards for use of PIDs for different data classes and their implementation in several BiCIKL RIs.
• Improved bi-directional links between the RIs through the use of PIDs and automated link discovery and validation services.
• A Pan-European system for assigning Digital Object Identifiers (DOIs) to digital specimens in collections, in collaboration with global stakeholders.
• Recommendations and best practices for use of PIDs in the biodiversity literature.
• Recommendations to infrastructures to use data brokers to link PIDs where competing systems exist.
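
The bi-directional linking and link validation mentioned above can be illustrated with a minimal sketch. It is not a BiCIKL service: the function name `find_one_way_links` and the identifiers are hypothetical, and the links are assumed to be available as a simple mapping from each PID to the PIDs it points to. The check reports every link that is not reciprocated by its target.

```python
from typing import Dict, List, Set, Tuple


def find_one_way_links(links: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return (source, target) pairs where the target does not link back."""
    missing = []
    for source, targets in links.items():
        for target in targets:
            # A link is bi-directional only if the target also lists the source.
            if source not in links.get(target, set()):
                missing.append((source, target))
    return sorted(missing)


# Hypothetical PIDs, invented for illustration only.
links = {
    "specimen:S1": {"sequence:Q1", "taxon:T1"},
    "sequence:Q1": {"specimen:S1"},
    "taxon:T1": set(),
}

print(find_one_way_links(links))
# → [('specimen:S1', 'taxon:T1')]
```

In this toy data the specimen and the sequence reference each other, while the taxon record lacks the reverse link to the specimen, so only that pair is flagged.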

Access: Provide, facilitate, support and scale up open access to FAIR interlinked data, from literature, natural history collections, sequence archives and taxonomic nomenclature in both human-readable and machine-actionable formats. BiCIKL enhances transnational and virtual access to data via:

• Improved access to FAIR biodiversity data at each RI and across RIs, through 16 newly developed or improved tools and workflows.
• Access to interlinked data through bi-directional and multi-directional linking between RIs.
• Access to Linked Open Data (LOD) through the biodiversity knowledge graphs created or enhanced in BiCIKL.
• Support for access to data and services for Open Call projects proposed by international teams of researchers, and for many unnamed users worldwide.
• Development and standardisation of APIs for programmatic access to data at each RI.

Interoperate: Harmonisation of the existing standards, metadata, policies and technologies for the provision and ingestion of FAIR data was pursued through joint research and technical development and through community engagement, and resulted in:

• Recommendations, best practices and guidelines for interoperability and compatible data standards between RIs, for both human and machine-interpretable use (APIs).
• New or improved APIs following the guidelines on technical compliance.
• Guidelines on various aspects of production and use of interlinked FAIR data implemented by several BiCIKL RIs.
• Efficient bi-directional and multi-directional linking mechanisms between specimens, sequences, taxon names, and literature.
• Two policy briefs with recommendations for increased interoperability of FAIR biodiversity data.
• A best practice manual for findability, re-use and accessibility of RIs.
• Recommendations and best practices included in the BiCIKL training program.

Reuse: Optimisation of the reusability and reproducibility of complex datasets, assembled from different biodiversity-related domains to generate new knowledge, was advanced through:

• A globally unique, automated workflow for liberation, annotation, dissemination and re-use of data from the biodiversity literature.
• A semantic-based journal production workflow for publication and re-use of FAIR biodiversity data.
• An automated workflow for real-time RDF conversion of full-text articles into Linked Open Data and a biodiversity knowledge graph.
• Open Call projects and published articles that demonstrate the usability of enhanced interlinked data.
• Community engagement in human-in-the-loop methods of data curation through workbench and clearing house tools.
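
As a rough illustration of what converting extracted statements into RDF involves, the sketch below serialises a few subject-predicate-object statements as N-Triples lines. It is an assumption-laden toy and not the BiCIKL workflow: the helper names (`iri`, `literal`, `to_ntriples`), the `example.org` URIs and the taxon name are all invented for illustration, and only basic literal escaping is handled.

```python
def iri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(triples) -> str:
    """Serialise (subject, predicate, object, object_is_iri) tuples."""
    lines = []
    for subject, predicate, obj, obj_is_iri in triples:
        rendered = iri(obj) if obj_is_iri else literal(obj)
        lines.append(f"{iri(subject)} {iri(predicate)} {rendered} .")
    return "\n".join(lines)


# Hypothetical statements such as might be extracted from an article.
triples = [
    ("https://example.org/article/1", "https://example.org/vocab/mentions",
     "https://example.org/taxon/T1", True),
    ("https://example.org/taxon/T1", "https://example.org/vocab/scientificName",
     'Aus bus "type"', False),
]

print(to_ntriples(triples))
```

Each output line is one statement that a triple store could load, which is the basic building block of Linked Open Data and knowledge graphs.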

The workflows and tools created under the joint research activities of BiCIKL are tested in real time through the Open Call projects performed by research groups throughout the world, thus supporting another key objective of BiCIKL and its funding program: building a new community of users who will be able to address societal challenges through data-driven, next-generation research.
BiCIKL connects data from different, previously fragmented domains, including data liberated from the vast biodiversity literature, into a large FAIR data pool that is seamlessly available to researchers, public authorities and business to foster innovation in science, nature conservation and the digital economy. The main BiCIKL results towards the project’s objectives are:

• A new vibrant community of users equipped with novel research tools for search and access to data interlinked across domains.
• Interlinked corpora of knowledge used by research groups through newly developed bi- and multi-directional data linking.
• Data locked in tens of thousands of pages of biodiversity literature, now extracted, annotated, published and converted into FAIR Linked Open Data (LOD).
• Fifteen novel services, powered by advanced technologies (e.g. language models, RDF), for better access to and linking of biodiversity data, freely available to anyone through the Biodiversity Knowledge Hub (BKH).
• The BKH, which serves as a single knowledge broker for interlinked FAIR data, both human- and machine-readable, connecting specimens, genomics, taxonomy and literature.
• Added value for the new community beyond the sum of the previously existing services, delivered through BiCIKL's innovative approach, its newly developed or enhanced tools and services, and capacity building.

Beyond research, BiCIKL adds significant value to society: the combined use of data from different domains provides evidence, for example on biological invasions or the historical dynamics of biodiversity and ecosystems, that supports modelling and informed policy decisions. This serves the key goals of the 2022 UN Biodiversity Conference (COP15) in Montreal: (1) protect and restore 30% of the world’s land and seas by 2030, and (2) reduce the extinction rate tenfold for all species by 2050. The BiCIKL results will be a direct contribution to achieving these life-saving goals!
Data life cycle in biodiversity science