Semiautomatic Methods for the Creation and Maintenance of Links between Scientific Resource Aggregations on the Web

Final Report Summary - SCILINK (Semiautomatic Methods for the Creation and Maintenance of Links between Scientific Resource Aggregations on the Web)

The goal of the SciLink project was to reassess the traditional way of publishing scientific results on the Web and to develop solutions that allow scientists to publish the building blocks of scholarly communication as dereferenceable Web resources.

A main research focus was on developing strategies for maintaining links and synchronizing Web resources based on existing Web standards. The outcome of this research activity is a Web-based resource synchronization framework called ResourceSync, which builds on the Sitemaps protocol and allows third-party systems to remain synchronized with a server's evolving resources. ResourceSync was approved as a NISO standard in 2014 (Z39.99-2014) and is now being implemented for arxiv.org, which is one of the most important open access repositories for scholarly communication.
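To illustrate how a Sitemaps-based document can drive synchronization, the following minimal sketch parses a Resource List and works out which resources a client would need to fetch. The sample document, the example.org URLs, and the helper names parse_resource_list and resources_to_fetch are assumptions made for illustration; this is not the reference implementation, and it covers only a small part of what the standard defines.

```python
import xml.etree.ElementTree as ET

# Namespaces of the Sitemaps protocol and of the ResourceSync extension elements.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Hypothetical Resource List; URLs and timestamps are invented for this sketch.
RESOURCE_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="resourcelist" at="2013-01-03T09:00:00Z"/>
  <url>
    <loc>http://example.org/res1</loc>
    <lastmod>2013-01-02T13:00:00Z</lastmod>
  </url>
  <url>
    <loc>http://example.org/res2</loc>
    <lastmod>2013-01-02T14:00:00Z</lastmod>
  </url>
</urlset>"""


def parse_resource_list(xml_text):
    """Map each resource URI in the list to its last-modification timestamp."""
    root = ET.fromstring(xml_text)
    resources = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        resources[loc] = lastmod
    return resources


def resources_to_fetch(remote, local):
    """Return URIs that are new, or whose timestamp differs from the local copy."""
    return sorted(uri for uri, lastmod in remote.items() if local.get(uri) != lastmod)


if __name__ == "__main__":
    remote = parse_resource_list(RESOURCE_LIST)
    # The client already holds an up-to-date copy of res1 only.
    local = {"http://example.org/res1": "2013-01-02T13:00:00Z"}
    print(resources_to_fetch(remote, local))
```

Comparing timestamps for inequality rather than ordering keeps the sketch independent of clock skew between client and server; a real client would likely also handle deletions and the other capability documents.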

The SciLink project also investigated new forms of digital annotation in scholarly communication, contributed to the development of the W3C Open Annotation specification, and delivered a showcase (Maphub) demonstrating how annotations on historical maps can be shared and can link resources on the Web. This showcase received the 2013 Open Humanities Award from the Open Knowledge Foundation and inspired further developments, such as the Annotorious library, which enables annotation functionality on any kind of (scholarly) Web resource. We also investigated a new tagging technique, called Semantic Tagging, which associates digital resources with Web resources instead of strings. We studied that technique in an in-lab user experiment and found that semantic tagging does not affect the annotation outcome, while linking tags to well-defined concept definitions. The standards (Open Annotation), tools (Maphub), and techniques (Semantic Tagging) resulting from this research play a major role in providing open, digital annotation within scholarly communication and open access repositories such as arxiv.org.
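The difference between a string tag and a semantic tag can be made concrete with a small sketch that builds both for the same target. The oa: terms below follow one reading of the Open Annotation vocabulary (namespace http://www.w3.org/ns/oa#) and should be checked against the specification; the target URI and the function names are hypothetical, and this is not how Maphub serializes its annotations.

```python
import json

# Prefix mapping for the Open Annotation namespace.
OA_CONTEXT = {"oa": "http://www.w3.org/ns/oa#"}


def make_string_tag(target_uri, label):
    """Conventional tagging: the tag is an uninterpreted string."""
    return {"target": target_uri, "tag": label}


def make_semantic_tag_annotation(target_uri, concept_uri):
    """Semantic tagging: the tag is a Web resource identified by a URI."""
    return {
        "@context": OA_CONTEXT,
        "@type": "oa:Annotation",
        "oa:motivatedBy": {"@id": "oa:tagging"},
        # The body is the concept itself, not a label for it (assumed typing).
        "oa:hasBody": {"@id": concept_uri, "@type": "oa:SemanticTag"},
        "oa:hasTarget": {"@id": target_uri},
    }


if __name__ == "__main__":
    target = "http://example.org/maps/1"  # hypothetical historical map
    print(make_string_tag(target, "Vienna"))
    annotation = make_semantic_tag_annotation(
        target, "http://dbpedia.org/resource/Vienna"
    )
    print(json.dumps(annotation, indent=2))
```

The string "Vienna" is ambiguous on its own, whereas the concept URI can be dereferenced to obtain a definition, which is the property the paragraph above attributes to semantic tagging.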

Since social media is currently encouraging new forms of scholarly interaction, we also made two contributions in that direction. First, we addressed the problem of disambiguating named entities in short, user-generated texts on the social Web (e.g. annotations) and developed an approach that handles this challenge by modeling user interest with respect to personal knowledge contexts (e.g. Wikipedia). Second, we developed a machine-learning-based technique that identifies useful user comments on social media platforms.

Creating and sharing controlled vocabularies, such as thesauri and taxonomies, is another basic building block in scholarly communication, and the Simple Knowledge Organization System (SKOS) has become the de facto Web-based exchange standard. However, SKOS vocabularies often differ in terms of quality, which reduces their applicability across system boundaries. SciLink contributed to investigating how taxonomists can be supported in improving SKOS vocabularies by pointing out quality issues that go beyond the integrity constraints defined in the SKOS specification. We identified potential quantifiable quality issues and formalized them into computable quality checking functions that can find affected resources in a given SKOS vocabulary. We implemented these functions in the qSKOS quality assessment tool, conducted experimental evaluations on 15 existing vocabularies, and found possible quality issues in all of them.
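A quality checking function of the kind described above can be sketched as a function that takes a vocabulary and returns the affected resources. The two checks below (concepts without a preferred label, and concepts without any semantic relation) are illustrative assumptions and are not claimed to match the definitions used in qSKOS; the vocabulary is a hypothetical list of (subject, predicate, object) triples.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SEMANTIC_RELATIONS = {SKOS + "broader", SKOS + "narrower", SKOS + "related"}


def concepts(triples):
    """All resources explicitly typed as skos:Concept."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == SKOS + "Concept"}


def unlabeled_concepts(triples):
    """Concepts that have no skos:prefLabel."""
    labeled = {s for s, p, o in triples if p == SKOS + "prefLabel"}
    return concepts(triples) - labeled


def orphan_concepts(triples):
    """Concepts that take part in no broader/narrower/related statement."""
    linked = set()
    for s, p, o in triples:
        if p in SEMANTIC_RELATIONS:
            linked.update((s, o))
    return concepts(triples) - linked


# Hypothetical vocabulary with one unlabeled and one orphan concept.
EX = "http://example.org/vocab/"
VOCABULARY = [
    (EX + "animals", RDF_TYPE, SKOS + "Concept"),
    (EX + "animals", SKOS + "prefLabel", "Animals"),
    (EX + "cats", RDF_TYPE, SKOS + "Concept"),
    (EX + "cats", SKOS + "prefLabel", "Cats"),
    (EX + "cats", SKOS + "broader", EX + "animals"),
    (EX + "dogs", RDF_TYPE, SKOS + "Concept"),
    (EX + "dogs", SKOS + "broader", EX + "animals"),
    (EX + "misc", RDF_TYPE, SKOS + "Concept"),
    (EX + "misc", SKOS + "prefLabel", "Miscellaneous"),
]

if __name__ == "__main__":
    print("unlabeled:", sorted(unlabeled_concepts(VOCABULARY)))
    print("orphans:", sorted(orphan_concepts(VOCABULARY)))
```

Returning the affected resources, rather than a pass/fail flag, is what lets a taxonomist act on the report; neither check is an integrity constraint of SKOS itself, so a vocabulary can be valid and still be flagged.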

Since one of the goals of SciLink was to contribute results back to Europeana, we developed the Europeana Linked Open Data pilot exposing open metadata on approximately 2.4 million texts, images, videos and sounds gathered by Europeana. All metadata are released under Creative Commons CC0 and therefore dedicated to the public domain. The metadata follow the Europeana Data Model, which is a derivative of OAI-ORE, and clients can access data either by dereferencing URIs, downloading data dumps, or executing SPARQL queries against the dataset.
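Two of the access paths mentioned above, dereferencing a URI and querying via SPARQL, amount to ordinary HTTP requests. The sketch below only constructs such requests and does not send them; the example.org URIs, the endpoint address, and the function names are placeholders, not the actual addresses of the Europeana pilot.

```python
from urllib.parse import urlencode
from urllib.request import Request


def dereference_request(resource_uri, media_type="application/rdf+xml"):
    """Build a GET request asking for an RDF representation of a resource."""
    return Request(resource_uri, headers={"Accept": media_type})


def sparql_request(endpoint, query):
    """Build a GET request carrying a SPARQL query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    # Placeholder URIs; substitute the real resource and endpoint addresses.
    item = dereference_request("http://example.org/item/123")
    print(item.get_method(), item.full_url, item.get_header("Accept"))

    query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
    search = sparql_request("http://example.org/sparql", query)
    print(search.get_method(), search.full_url)
```

Passing either request to urllib.request.urlopen would perform the actual call; content negotiation through the Accept header is what lets the same URI serve both human-readable pages and machine-readable metadata.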
