Linked Open Data for environment protection in Smart Regions

Final Report Summary - SMARTOPENDATA (Linked Open Data for environment protection in Smart Regions)

Executive Summary:
The SmartOpenData project has created a Linked Open Data (LOD) infrastructure that extends throughout Europe. Five pilots are among the most visible outcomes of this research infrastructure, and the tools and applications behind them are grounded in several data modelling and data harmonization processes.
The SmartOpenData information environment has been fed, in most cases, by existing public and freely available sources for biodiversity and environment protection and research in rural and European protected areas, National Parks and tourist locations. In some cases, however, SmartOpenData has gone further by publishing datasets that were not completely public or freely available to citizens.
Accordingly, the SmartOpenData infrastructure is based, in the first instance, on existing software applications and datasets. Nevertheless, the project has also developed its own tools and has made public several new datasets using innovative publication tools and techniques, at least in regard to geospatial information systems.
Given the uneven state of public data registries and publication across Europe, SmartOpenData has carried out general tasks related to information processing and publication. In each project scenario, the required datasets and the owners of that information were sought out. As the original formats were varied and heterogeneous, the SmartOpenData partners selected the most suitable tools for refining, transforming and publishing the data in each specific case or, in some cases, developed completely new applications.
However, in the context of LOD, data publication is just one of the results that can be obtained. End users demand more than just the information and data. Consequently, the project has made publicly available all of the data models and ontologies that describe the generated datasets published as LOD.
SmartOpenData has therefore defined mechanisms and strategies for searching, acquiring, adapting and publishing Open Data provided by existing sources on biodiversity and environment protection in rural and European protected areas and National Parks. The LOD obtained in this way has then been used to answer specific semantic queries.
As a secondary but no less important outcome, in addition to the technical results, SmartOpenData has helped to reduce the gap between the geospatial community and the Semantic Web movement led by international standards bodies and universities. Indeed, SmartOpenData has focused on how LOD can be explained to and disseminated within the geospatial community and applied generally to spatial data resources.
At this time, several issues regarding GIS systems and datasets have already been properly answered, but there are still open questions in the context of spatial data as LOD. For example, the encoding of small geometries and the topological functions have been implemented correctly, but those processes could be improved.
The vision of the SmartOpenData project was widely confirmed: there are many different environmental information sources and their level of openness varies. As a direct consequence, the economic value of the datasets can be greatly improved by exposing them widely and properly to the public.
As a final objective, SmartOpenData contacted several SMEs and stakeholders to offer them the results of the project, with the conviction that the power of Linked Open Data will foster innovation within the environmental sector.

Project Context and Objectives:
The SmartOpenData project was funded under the Call ENV.2013.6.5-3 “Exploiting the European Open Data Strategy to mobilise the use of environmental data and information” and it has been carried out with the conviction that opening up public sector data and information for re-use has significant potential to act as an engine for innovation, growth and transparent governance. This was the original vision of the project, and its results have shown that exploiting Europe's Open Data Strategy contributes to decision-making in policy areas, fosters the participation of citizens in environmental governance and generates new innovative products and services.
The SmartOpenData project has enabled wide access to scientific data using Linked Open Data paradigms and Semantic Web tools. Its main original data sources were open, readily accessible and freely available Earth Observation data and information, although in some specific cases SmartOpenData has gone further by publishing datasets that were not completely public or freely available to citizens.
SmartOpenData has therefore discovered, transformed and published several biodiversity and environmental data sources. The datasets to be used were collected, and requested from their owners if they were not open. In some cases, they were completely transformed before publication. Publishing those datasets gave SMEs, the general public, policy makers and other relevant stakeholders full access to this useful information.
The analysis of Semantic Technologies allowed the project to understand Linked Open Data (LOD) technologies and disseminate them within the geospatial community. The dialogue has been enriching in both directions: the general principles of geographic information systems and geospatial data were also explained to the technical partners, who now have a better understanding of the natural, environmental and biodiversity contexts.
This approach has allowed researchers in different domains, especially GIS and Data Modelling and Processing, to collaborate on the same data sets, to ensure seamless interoperability of data catalogues, to engage in entirely new forms of scientific research and to explore correlations between research results.
Orientation of results towards SMEs has been not only one of our additional eligibility criteria but also one of the main guidelines of the project, given the SmartOpenData Consortium and the business orientation of the project.

The main objectives of the project during its lifetime were:
Creation of a sustainable Linked Open Data infrastructure in order to promote environmental protection data sharing among public bodies in the European Union
The SmartOpenData infrastructure has made environmental open data easy to access and use. This has been achieved through the use of Linked Open Data technologies for modelling, acquiring, harmonising and using data provided by external sources from existing catalogues and open public data portals. The sources have been diverse, with varying levels of openness. For example, not all of the datasets collected were public on the Internet; in those cases, they were requested from public administration bodies and data providers.
SmartOpenData has built 5-star Linked Open Data datasets according to Sir Tim Berners-Lee's classification, preserving their original quality. Data quality assurance and intellectual property rights issues have been taken into account.
The achievement of this objective is shown by the input datasets accumulated by the Portuguese-Spanish pilot from different data sources that have been transformed to and published as a final 5-star LOD endpoint: Soil Chemical Characteristics, Climatology, Forestry maps, Administrative regions from Portugal and Spain, Water and Drinking water and many others. This specific case highlights a very good example of the power of Linked Open Data: through use of these technologies it is possible to query and use data from different European countries.
By the end of the project, the Slovakian pilot had transformed and published 13 datasets (National parks and protected landscape areas, Protected natural monuments, Special protection areas - Bird directive, Biosphere reserves, UNESCO world nature heritage sites, Protected landscape elements and many others) into 5 triplestores, where SK stands for Slovakian and LD for Linked Data: SK LD Protected sites, SK LD Land cover, SK LD Contaminated sites, SK LD Biogeographical regions and SK LD Species distribution.

For its part, the Italian pilot published and processed the following 5 datasets into one 5-star triplestore:
● Natura2000 database - http://www.eea.europa.eu/data-and-maps/data/natura-5
● EEA Waterbase - Lakes - http://www.eea.europa.eu/data-and-maps/data/waterbase-lakes-10 ; ARPA extraction (enriched with coordinates), sheets “StationsLakes” and “HazSubstLakes_Agg”
● EEA Waterbase - Rivers - http://www.eea.europa.eu/data-and-maps/data/waterbase-rivers-10 ; ARPA extraction, sheets “StationsRivers” and “HazSubstRivers_Agg”
● EEA Unit of measurement of Hazardous Substances - http://dd.eionet.europa.eu/dataelements/48239
● EEA code list of determinands - http://dd.eionet.europa.eu/datasets/latest/Groundwater/tables/HazSubstGW_Disagg/elements/DeterminandCode
The goal of the Czech pilot has been to transform the NFI (National Forest Inventory) data from a relational database into RDF/XML or Turtle and to publish the result as LOD on the web. All transformations were done using an r2rml-parser, which can also create a triple store with a dynamic connection to the database; at this moment, however, only a one-time transformation has been done. The final RDF/XML or Turtle files are available from http://nil.uhul.cz/lod/ns/* , where * represents the name of a vocabulary, e.g. http://nil.uhul.cz/lod/nfi/forest_cover/ or http://nil.uhul.cz/lod/nfi/forest_cover.ttl .
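As an illustration of the kind of relational-to-RDF mapping described for the Czech pilot, the following is a minimal sketch and not the r2rml-parser configuration used by the project. The table `forest_cover`, its columns, the sample rows and the vocabulary namespace `http://example.org/nfi-vocab#` are all assumptions made for the example; only the base URI is quoted from the report. The output is N-Triples, which is also valid Turtle.

```python
import sqlite3

BASE = "http://nil.uhul.cz/lod/nfi/forest_cover/"  # namespace quoted in the report
VOCAB = "http://example.org/nfi-vocab#"            # hypothetical vocabulary namespace
XSD = "http://www.w3.org/2001/XMLSchema#"


def escape_literal(value):
    """Escape backslashes and double quotes for use inside an N-Triples literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(conn):
    """Map each row of the hypothetical forest_cover table to two triples."""
    lines = []
    query = "SELECT plot_id, region, cover_pct FROM forest_cover ORDER BY plot_id"
    for plot_id, region, cover_pct in conn.execute(query):
        subject = f"<{BASE}{plot_id}>"
        lines.append(f'{subject} <{VOCAB}region> "{escape_literal(region)}" .')
        lines.append(
            f'{subject} <{VOCAB}coverPercent> "{cover_pct}"^^<{XSD}decimal> .'
        )
    return lines


# Hypothetical sample data standing in for the relational NFI database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE forest_cover (plot_id INTEGER, region TEXT, cover_pct REAL)"
)
conn.executemany(
    "INSERT INTO forest_cover VALUES (?, ?, ?)",
    [(1, "Vysocina", 30.5), (2, "Zlin", 39.7)],
)

triples = rows_to_ntriples(conn)
print("\n".join(triples))
```

A one-time transformation like this produces a static file; a dynamic connection would instead evaluate the mapping against the database on each request.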
In the Irish pilot case, the primary input data sets have been the Irish Record of Monuments and Places (RMP), and Logainm, the official dataset of Irish Placenames. As a result, the transformation of the Monuments dataset to RDF was completed using the OpenRefine Tools. This allowed the data to be mashed together with the Linked Logainm source to produce the National Monument locations linked with the definitive Irish placenames of those locations. The exercise has ensured that both the Logainm and National Monuments teams will collaborate more closely in the future, and help to ensure the wider use of both.
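The linking step of the Irish pilot can be pictured with a small sketch. It assumes that monuments and placenames are matched on a normalised townland name; the report does not state the matching rule used with OpenRefine, and all URIs, field names and the `locatedIn` predicate below are hypothetical.

```python
def normalise(name):
    """Lower-case a name and collapse whitespace so near-identical spellings match."""
    return " ".join(name.lower().split())


def link_monuments(monuments, placenames, predicate):
    """Return (monument_uri, predicate, place_uri) triples for matching names."""
    index = {normalise(place["name"]): place["uri"] for place in placenames}
    links = []
    for monument in monuments:
        place_uri = index.get(normalise(monument["townland"]))
        if place_uri is not None:
            links.append((monument["uri"], predicate, place_uri))
    return links


# Hypothetical records; real RMP and Logainm data have richer structure.
monuments = [
    {"uri": "http://example.org/monument/1", "townland": "Kilkee  Upper"},
    {"uri": "http://example.org/monument/2", "townland": "Unknown Place"},
]
placenames = [{"uri": "http://example.org/place/100", "name": "kilkee upper"}]

links = link_monuments(
    monuments, placenames, "http://example.org/vocab#locatedIn"
)
print(links)
```

Monuments without a matching placename are simply left unlinked, which keeps the published links conservative.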

The next SmartOpenData project objective was: Enhancing Linked Open Data with semantic support by integrating semantic technologies built upon connected Linked Open Data catalogues, aiming at building sustainable, profitable and standardised environment protection and climate change surveillance services
This objective covers the use of semantic technologies to build a new paradigm of environment protection services through the extensive use of Linked Open Data, significantly improving their accuracy, power and scope and reducing implementation costs making them affordable and sustainable for the first time.
As explained in the previous objective, SmartOpenData integrated semantic results and technologies with public open data sources made available by Data Providers or with datasets requested on demand. For example, the achieved 5-star level implies connections and links with external datasets such as DBpedia or GeoLinkedData.
SmartOpenData implemented the paradigms of Linked Open Data with respect to geospatial information. The open publication of SmartOpenData datasets has been understood as an end in itself but, within this area, an important point has been the participation of some of our datasets and tools in GEO-GEOSS AIP calls. In addition, the evaluation results of Work Package 6 have highlighted the interest in SmartOpenData results within the geospatial and environmental community and among relevant stakeholders.
The achievement of the 5-star datasets previously highlighted means, automatically, the achievement of this objective.
SmartOpenData has defined business models specially focused on SMEs and based on innovative services as new opportunities to align research results, previous work and projects, tackling active involvement of the whole value chain in Smart Regions at policy, industry and society levels
In this regard, SmartOpenData has defined a community Governance Model particularly sensitive to SMEs in order to promote their participation. The Governance Model has allowed the creation of long-term relationships between stakeholders, policy makers, citizens and companies. The SmartOpenData business models focused on SMEs are defined in our Market Analysis and Exploitation Report.
During the project lifetime, the links with several external initiatives were fully developed and operative. Regarding SME engagement, many SMEs have been involved in SmartOpenData workshops and in evaluation activities of our services, tools and datasets.
Demonstrate the impact of sharing and exploiting data and information from many varied resources in rural and European protected areas, by providing public access to the data and developing demonstrators that will show how services can provide high quality results in regional development working with semantically integrated resources
Demonstration of impact has been carried out through the implementation of SmartOpenData pilots, driven by strategic partners in Spain, Portugal, Italy, Ireland, Czech Republic and Slovakia. Through these pilots, the project has been able to offer several services for environment and biodiversity protection. Other areas of interest of our developed tools and services are protected/invasive species, sustainable exploitation of natural resources and climate change.
Through the evaluation processes of the SmartOpenData final results, the project has been able to show the achieved impact of the services developed by the project.
A summary description of the pilots, which focus on different segments, domains and European areas, is available at the SmartOpenData web page (http://www.smartopendata.eu/pilots).

Project Results:
The first results of the project were focused on the analysis and definition of clear use cases that properly explain the power of Linked Open Data tools inside the context of geospatial environmental and biodiversity data. During that early phase, the required datasets were requested or obtained from several sources.
During the first year, the project defined a complete list of requirements that served as the starting point for the definition of the pilots. A complete study of the semantic tools and architectures that could be used in a geospatial environment was also carried out. Based on these, the common SmartOpenData architecture was defined.

SmartOpenData general architecture
The modularity of the SmartOpenData infrastructure and solutions allows each stakeholder to define the technical platform most suitable for their needs while publishing datasets in a common format that is readable by external actors and parties not directly involved in the SmartOpenData project.
After answering the questions “Why does SmartOpenData want to transform data to Linked Open Data?” and “What data does SmartOpenData want or need to use?”, the next question raised and solved by the project was “How are the data transformed?” The process of refining, transforming and publishing data is not immediate and was a big challenge.
Consequently, and as one of the main foundations of the project, SmartOpenData has completed the definition of a powerful and flexible vocabulary (published by W3C at http://www.w3.org/2015/03/inspire/smod#) that covers the core of issues common to the pilots. For this SmartOpenData data model, the project has used INSPIRE as far as possible as the basis for the data structures in each pilot. Using this central core, the pilots then extended the SmartOpenData vocabulary to take into account their own particularities.
Once the data model was defined, a general data harmonization process was carried out jointly by the partners responsible for pilots and the technical partners. The final visible result of this harmonization process was the publication of the Linked Open Datasets. No less important, a conceptual modelling cycle supported by discussions between domain experts and data modellers lies behind this result.

SmartOpenData harmonization methodology
SmartOpenData discovered, transformed and published several biodiversity and environmental data sources. The datasets to be used were collected, and requested from their owners if they were not open. In some cases, they were completely transformed before publication. Publishing those datasets gave SMEs, the general public, policy makers and other relevant stakeholders full access to this useful information.
The analysis of Semantic Technologies allowed the project to understand Linked Open Data (LOD) technologies and disseminate them within the geospatial community. The dialogue has been enriching in both directions: the general principles of geographic information systems and geospatial data were also explained to the technical partners, who now have a better understanding of the natural, environmental and biodiversity contexts.
On this LOD publication layer, software tools developed by the project deliver the full potential of linked information. Those tools have been:
SIREn indexing infrastructure as a highly scalable open-source full-text search engine especially suited for nested and schema-less data.
Sefarad is a web application developed to explore linked data by executing SPARQL queries at a chosen endpoint without writing code. Thus, it provides a semantic front-end to Linked Open Data that allows the user to configure his/her own dashboard with many different widgets to visualize, explore and analyse graphically the different characteristics, attributes and relationships of the queried data.
The Administration and Notification Service provides common facilities for exploiting environmental data. This refers to the fact that many solutions to the exploitation of environmental data are not interoperable, and any change to the data source stops these solutions working. Also, this service improves environmental data searchability thanks to its integration of big data infrastructure for structured and semi-structured search facilities.
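Front ends such as Sefarad hide the mechanics of querying an endpoint from the user. As a hedged sketch of what such a request looks like, the code below builds a SPARQL Protocol GET URL with the Python standard library. It is not Sefarad's own code. The endpoint is the Slovak one listed in this report and may no longer be online, so the sketch only constructs the URL and does not send the request; the query itself is a generic example.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Endpoint listed in this report; its current availability is not guaranteed.
ENDPOINT = "http://data.sazp.sk/parliament/sparql"

# Generic example query: fetch any ten triples from the endpoint.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
""".strip()


def build_query_url(endpoint, query):
    """Build a SPARQL Protocol GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(url)
print(decoded == QUERY)
```

A client would send this URL with an Accept header such as application/sparql-results+json and feed the result to a dashboard widget.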
The SmartOpenData pilots, as showcases of the tasks developed through the previously mentioned processes, demonstrate data modelling, data processing, data publication and technical tools. The first and final iterations of pilots have shown practical examples of the SmartOpenData architecture and implementation of the SmartOpenData data models.
The pilots, applications, tools and infrastructure were evaluated from the user groups’ perspective. The goals of this evaluation were to assess the quality of the technical results, to verify whether they meet the users’ specified requirements and to collect feedback on the developed tools via interaction with the internal and external user groups represented by the relevant stakeholders. Two evaluation phases were carried out in the project: an internal evaluation, involving internal user groups drawn from the main project partners, and an external evaluation, involving external user groups made up of pilot users.
Contribution to standards has been one of the main results of the project and, owing to the creation of the W3C-OGC working group, the SmartOpenData project has been very active in this process.
Besides these technical results, the project contributed to many other economic and administrative results. For example, the project highlighted greater transparency in public administrations, within our scope, through the improved visibility of environmental information. The external evaluation process demonstrated the interest of citizens and citizens' associations, SMEs and other relevant stakeholders in the published datasets and in the applications developed using those technical tools, which enable them to contribute to environmental governance processes in the domains of transparency, knowledge management, accountability and responsiveness.
As an intangible, but no less important, result, SmartOpenData has achieved the development of a common European approach to LOD environmental data. In the first instance, this common approach was confined to the project consortium, but it has served, in many countries and between several bodies and organizations, as the starting point for working groups that will continue the strategies and methodologies developed by SmartOpenData. Indeed, cooperation with other national and international research initiatives has been one of the main results of the project. For example, the project developed strong links with the GEOSS community through its participation in the AIP calls.
All of the planned technical, administrative and economic results of the project were satisfactorily achieved. The final general results and impacts can be summarized in the following list:
Data shared by public bodies and final users: This is a major result given that the level of data openness across European countries is very uneven. As a final result, a large amount of top quality shared linked open data has been published. The SmartOpenData data providers supplied a sound initial base on which to build the publication architecture, and from that starting point the Governance Model has served as a mechanism to attract new public bodies and allow new participants.
User engagement: User engagement is a key factor to get real impact. In order to promote it, SmartOpenData set up user groups in each area of interest and geographical location that provided very useful feedback to complete the final stages of the project.
Alignment with European environment protection trends and standards: The SmartOpenData data model and infrastructure were built upon existing geographic, spatial, LOD and semantic standards. In addition, SmartOpenData participated in the standardisation process through the W3C-OGC working group, promoted by this project, and others.
The project created an open source ecosystem that, through the corresponding dissemination strategy and liaison plan, has been offered to third parties and their customers so that they can participate in and benefit from the SmartOpenData outcomes.
The following pages provide a clear and concise overview of the SmartOpenData outcomes. This section is actually a standalone document that was used to explain the results of the project to external users during the second iteration of the SmartOpenData evaluation process. This document is also available at: http://www.smartopendata.eu/news/smartopendata-summary

SmartOpenData components main outcomes

Modeling framework (This component represents the Linked Data model, composed of a set of vocabularies based on the European Union's INSPIRE data specifications):
SmartOpenData INSPIRE vocabularies: http://www.w3.org/2015/03/inspire/
Additional vocabularies:
Contaminated sites: http://data.sazp.sk/vocab/contaminated-sites

Linked Open Data (Repository providing links to the Linked Open Data created and made available by the SmartOpenData project partners): http://ckan.sazp.sk/group/smartopendata

Software - Semantic Front End Facilities (Software components providing common facilities for exploiting environmental data):
Distributed Semantic Indexing infrastructure:
Siren Intro video: https://www.youtube.com/watch?t=19&v=-KiZsx8GYtc
Siren Info: http://www.sindicetech.com/siren.html
Siren Demo: http://siren.solutions/kibi
Visualisation Framework:
SEFARAD Intro video: https://www.youtube.com/watch?v=AaEEantOFVc
SEFARAD Info: https://github.com/gsi-upm/Sefarad
SEFARAD Demo:
http://demos.gsi.dit.upm.es/smartopendata/index.html#/sparql/slovakiaPolygonsDemo
Administration and Notification Service:
Intro video: https://youtu.be/n4Wqj7Qh_xE
Info: https://bitbucket.org/dsanchezderivera/sod-queries
Demo: http://ns3000566.ip-37-59-3.eu:5000/
Data transformation, publication and analytics tools and services:
Grafitizer - deployed as part of DataGraft: https://datagraft.net/
OpenDataNode: http://opendatanode.org/
Kibi - Data analytics tool: http://siren.solutions/kibi-a-kibana-fork-for-data-intelligence/
Pilots (Demonstrating the utilisation of the previous components and impact of the sharing and exploiting of data and information from many varied resources, in rural and European protected areas by providing public access to the data):
An overview: http://www.smartopendata.eu/pilots/
ES&PT Agroforestry management pilot:
Link to video: https://www.youtube.com/watch?v=YSn0NKaEqPg
Link to pilot: http://map.tragsatec.es/SMODGeoportal/geoportal/SMOD.html
IE Environmental research, Biodiversity pilot.
General video at: www.youtube.com/watch?v=ssJVUxS5Vjc
SmartOpenData enabled European Tourism Indicator System (ETIS) Webservice & Apps. Link to pilot: http://geoparks.cloudapp.net
SmartOpenData enabled App to Ground-Truth potential Protected Monument sites for the Irish Heritage Council. Link to pilot: http://heritagevault.cloudapp.net
IT Water monitoring pilot:
Link to pilot: for end users “sites”, “stations” and “observations” dashboards are of interest
CZ Forest sustainability pilot:
Link to video: https://www.youtube.com/watch?v=9YneundZiRU
Link to pilot: http://nil.uhul.cz
SK Environmental data reuse:
Meta-search enhanced OGC services crawler: Link to video: https://youtu.be/1y-Ba60_41E
HTML List of OGC services: http://smartopendata.sazp.sk/en/pilots/metasearch-enhanced-ogc-crawler#tabs-1
Geocatalogue of metadata harvested from OGC services discovered on Google SE: http://smartopendata.sazp.sk/en/pilots/metasearch-enhanced-ogc-crawler#tabs-2
Publishing of INSPIRE metadata as RDF data via TripleGeoCSW API: http://smartopendata.sazp.sk/en/pilots/metasearch-enhanced-ogc-crawler#tabs-3
Biodiversity MashUp: Link to video: https://youtu.be/Af6CijjMoeQ
SK Linked Open Data http://ckan.sazp.sk/organization/sazp
Human readable API: http://data.sazp.sk/parliament/
Machine readable API: http://data.sazp.sk/parliament/sparql

Regarding non-technical and intangible results, the SmartOpenData business model has been applied during all of the project lifetime to define liaisons with user associations and other projects.
SmartOpenData has been involved in the development and tasks of the W3C and Open Geospatial Consortium (OGC) working group. This group was created to improve interoperability and integration of spatial data on the Web. Spatial data is integral to many of our human endeavours and so there is a high value in making it easier to integrate that data into Web based datasets and services.
Likewise, one of the main objectives of the SmartOpenData project has been the reuse of public information in order to make it more useful and more visible to citizens and SMEs across Europe. For example, the following image showed, at the beginning of the project, why more and deeper efforts regarding the development of public datasets were necessary.

Public open data across Europe. Soft colours mean a low density of datasets. Source: publicdata.eu, 2014

As can be seen in the previous figure, the United Kingdom and Italy were in a very good initial situation. However other countries in the Consortium presented a different situation (for example, Lithuania, Portugal, Spain and Slovakia). SmartOpenData has contributed to improving the situation of Open Data and Linked Open Data in its activity area.
Beyond that, the obvious next step was to establish tools for linking different datasets in order to obtain better answers to complex queries. In this context, SmartOpenData has concentrated its efforts on key aspects of the problem: linking data using geographical aspects (transboundary datasets) and linking data using the logical content of the datasets.
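To make the idea of linking by geographical aspects concrete, the following sketch pairs records from two datasets whose bounding boxes overlap. It is an illustrative simplification rather than the project's method: real implementations would normally rely on the spatial functions of a triple store or GIS library, and the URIs and coordinates here are invented.

```python
def bboxes_intersect(a, b):
    """Return True if two (min_lon, min_lat, max_lon, max_lat) boxes overlap."""
    a_min_lon, a_min_lat, a_max_lon, a_max_lat = a
    b_min_lon, b_min_lat, b_max_lon, b_max_lat = b
    return (
        a_min_lon <= b_max_lon
        and b_min_lon <= a_max_lon
        and a_min_lat <= b_max_lat
        and b_min_lat <= a_max_lat
    )


def candidate_links(dataset_a, dataset_b):
    """Pair every URI in dataset_a with each URI in dataset_b that overlaps it."""
    return [
        (uri_a, uri_b)
        for uri_a, box_a in dataset_a.items()
        for uri_b, box_b in dataset_b.items()
        if bboxes_intersect(box_a, box_b)
    ]


# Hypothetical protected sites on either side of a border.
spanish_sites = {
    "http://example.org/es/site/1": (-7.5, 41.8, -6.9, 42.1),
    "http://example.org/es/site/2": (-3.0, 40.0, -2.5, 40.5),
}
portuguese_sites = {
    "http://example.org/pt/site/9": (-7.2, 41.6, -6.7, 41.9),
}

links = candidate_links(spanish_sites, portuguese_sites)
print(links)
```

Bounding boxes only yield candidates; an exact geometry test would still be needed before asserting a link between two sites.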
As a corollary, the SmartOpenData datasets have been published and registered in the GEOSS system, using this global platform as a showcase of our results.
Following this introduction, the next sections will explain the main results achieved.
Architecture of SmartOpenData infrastructure
The SmartOpenData reference infrastructure model and the related high-level technical specifications were defined as a cornerstone during the early stages of the project. This reference infrastructure includes the description of the main components and the connection points to other tools and systems. The RM-ODP (Reference Model of Open Distributed Processing) methodology was used to define a SmartOpenData reference architecture that meets the established technical and user requirements. These address interoperability and multilingualism aspects, the metrics engine and interfaces. The reference architecture defines platform-neutral components and also provides suggestions for concrete implementation. The aim was not to design a monolithic solution for all Linked Open Data, but to define the basic architecture components of the Linked Open Data chain and potential solutions for the concrete problems of the SmartOpenData project.
The RM-ODP divides all processes of architecture design into five generic and complementary “viewpoints” (Enterprise, Information, Computational, Engineering and Technology) of the system and its environment. The conclusions of each viewpoint for SmartOpenData were as follows:
Enterprise viewpoint: this focused on the analysis of pilot scenarios and the definition of a limited number of generic use cases, which were implemented to support the basic functionalities required by several scenarios, as well as the processing of data and metadata. This viewpoint concluded that the SmartOpenData LOD functionalities required by the pilots and cross-functional themes were as follows:
Required generic LOD functionalities Pilots Cross functional
ES IE IT CZ SK Business Tourism
Transformation (Relational -> RDF) X X X X X
Transformation (GML -> RDF) X
Transformation (GMD -> RDF) X
Storage X X X
Search X X X X X
Federated querying X X X X
Visualization of LOD with other data X X X X X
Visualization of LOD using conventional GI tools X X X X X
Upgrading X
Publishing data from existing SQL databases as RDF X X X
Publishing data from existing CSV databases as RDF X
SmartOpenData LOD functionalities
The Information viewpoint describes the way that SmartOpenData stores, collects, updates, manipulates, manages, and distributes information. The major issues which were analysed and assessed are (i) the Basic data types used in SmartOpenData, (ii) Ontologies and vocabularies and (iii) information structure and content with a clear focus on the metadata and data models.
SmartOpenData used the INSPIRE registry together with DCAT defined classes for the mapping with existing data models. This INSPIRE profile is very important because it acts as a bridge between INSPIRE and other European portals.
The Computational viewpoint is focused on generic components, which could be reused for more scenarios and which are basic elements of the infrastructure. Various basic components were identified in various areas of the SmartOpenData system, including local data, Server side applications (Relational storage, Data publication on the web), Transformation (GML to RDF, relational data to RDF, metadata GMD to RDF), Publishing data from existing SQL databases as RDF, RDF Storage and Client side applications.
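One of the transformation components named above, GML to RDF, can be sketched as follows. This is a minimal example under stated assumptions and not the project's transformation tool: it handles only a single GML 3.2 point, assumes the coordinates are given in longitude-latitude order (CRS84), and uses an invented geometry URI. The output predicate and datatype come from the OGC GeoSPARQL vocabulary.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"
GEO = "http://www.opengis.net/ont/geosparql#"


def gml_point_to_triple(gml_xml, geometry_uri):
    """Convert a gml:Point with a gml:pos child into one GeoSPARQL asWKT triple."""
    root = ET.fromstring(gml_xml)
    pos = root.find(f"{{{GML_NS}}}pos")
    if pos is None or not pos.text:
        raise ValueError("expected a gml:Point element with a gml:pos child")
    x, y = pos.text.split()
    wkt = f"POINT({x} {y})"
    return f'<{geometry_uri}> <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .'


# Hypothetical input; coordinates are assumed to be longitude then latitude.
sample = (
    '<gml:Point xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="p1" '
    'srsName="http://www.opengis.net/def/crs/OGC/1.3/CRS84">'
    "<gml:pos>17.1 48.15</gml:pos>"
    "</gml:Point>"
)

triple = gml_point_to_triple(sample, "http://example.org/geometry/p1")
print(triple)
```

Inputs in a different coordinate reference system would need reprojection or an explicit CRS in the literal, which this sketch does not attempt.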
The Engineering viewpoint focused on the mechanisms and functions required to support distributed interactions among objects in the system. It defines the SmartOpenData conceptual Architecture to address the distribution of processing performed by the system to manage the information and provide the functionalities.

Component diagram of SmartOpenData infrastructure
The Technology viewpoint describes the technological specifications for the physical deployment of the system implementation, including the choice of technology in the system, how specifications are implemented, and the specification of relevant technologies and support for testing.
Many potential tools and solutions to address the requirements of the SmartOpenData infrastructures were identified for reuse from other European projects such as HABITATS, Plan4Business, GeoKnow, LOD2 and SemGrow, along with use of the DCAT-AP and CKAN metadata profiles.
SmartOpenData Data model
SmartOpenData drew on the experience of previous work done by the GeoKnow project and under the ARE3NA project within the JRC to produce the beginnings of an RDF vocabulary that mirrors the INSPIRE Data Model. Importantly, the model does not try to replicate the whole of the INSPIRE model, rather, it offers a model that is suitable for use in Linked Data structures, one that is amenable to linking geospatial and environmental data to other sources of data on the Web.
The vocabularies were installed in the highly stable w3.org namespace and cover the aspects needed for the pilots conducted within SmartOpenData: The Generic Concept Model that underpins INSPIRE, Protected Sites, Land Use, Administrative Units, Bio-geographical Units, Species Distribution, Corine Land Cover and Environmental Monitoring Facilities. These may be extended by other projects wishing to cover further INSPIRE themes.
SmartOpenData has proved the usefulness of Linked Data to solve a number of different problems related to the rural economy and environment. The SmartOpenData pilots are very different in terms of the kind of problems being tackled and so the common aspects are limited. The initial expectation was that the project would develop a core data model of its own, that this would be updated towards the end of the project, and that each pilot would be harmonised using the core model as a basis. In practice, it was decided that a better approach would be to make maximum use of the INSPIRE data model as the basis for interoperability. The end result is that SmartOpenData has defined a set of RDF classes and properties that make use of the aspects of the INSPIRE data model relevant to the pilots and a separate vocabulary that is specific to the SmartOpenData pilots only. This last point is explained in the following section.
The final model is completely in line with 'Linked Data thinking' and no longer attempts to recreate the full scope of INSPIRE in RDF. There are two principal motivations for this: (i) Experience: when creating Linked Data for use in the pilots, a slavish following of INSPIRE proved burdensome and unhelpful and (ii) The publication of the Study on RDF and PIDs for INSPIRE by Diederik Tirry and Danny Vandenbroucke under ISA Action 1.17: A Reusable INSPIRE Reference Platform (ARE3NA). This report summarised work by three experts: Clemens Portele, Linda van den Brink and Stuart Williams. Some of this work was shared with the SmartOpenData project before publication.
It is this combination of factors that is behind the final model being at once simpler and much more comprehensive than the initial one in its coverage of the INSPIRE themes. In addition to the basics, the initial model covers just the Protected Sites and one relationship from the Land Usage theme. The final model covers more themes but with the same number of classes. Note in particular that the three classes associated specifically with Geographical Names have disappeared altogether.

The initial SmartOpenData model tried to capture the full complexity of the INSPIRE model

The Final SmartOpenData Model, simpler than the initial version despite the addition of many more INSPIRE themes.
SmartOpenData Data harmonisation process
This process, explained in the SmartOpenData document 3.5-Final Data Harmonisation, covers the operational aspects of the data harmonisation task that concern data transformations from input data structures to RDF. Three different approaches were identified based on the pilots' requirements: CSV-to-RDF, XML-to-RDF and RDBMS-to-RDF (Czech pilot).
The previously mentioned data model is based on several INSPIRE themes that were chosen to represent domains of the pilots conducted within the project. As the following table shows, there is an intersection between the domains of the pilots, for example, in the topic of protected sites, which is relevant to most of the pilots. However, there are INSPIRE topics used by one pilot only, such as Environmental Monitoring Facility and Cadastral Parcel.
Vocabulary Italian Pilot Portuguese-Spanish Pilot Czech Pilot Slovak Pilot Irish Pilot
SmartOpenData Protected Site
SmartOpenData Land Use
SmartOpenData Administrative Units
SmartOpenData Bio-Geographical Regions
SmartOpenData Species Distribution
SmartOpenData Corine Land Cover
SmartOpenData Environmental Monitoring Facility
SmartOpenData Cadastral Parcels
SmartOpenData Custom Vocabulary
Vocabularies of third parties
Own vocabularies
Vocabulary usage by pilot
In all of the pilots, however, the model was extended with custom terms. These custom terms were aggregated into the SmartOpenData custom vocabulary http://www.w3.org/2015/03/inspire/smod#. Overall, our observation is that a common data model based on the existing INSPIRE standards facilitated the process of data harmonisation to a greater or lesser extent depending on the settings and requirements of each pilot. Pilots with INSPIRE-compliant datasets, such as the Slovak pilot, not only used the model as a target schema for RDF transformations, but also took advantage of the transformation tools that exist for INSPIRE-compliant datasets. Other pilots, such as the Portuguese-Spanish pilot, used only a small fragment of the model compared to the domain extension they required.

Workflow of OpenRefine-based data harmonisation
The SmartOpenData project has worked actively with the RDF plugin for OpenRefine to perform CSV-to-RDF transformations and compared it to Grafterizer, a tool that is under active development. On the one hand, the rich functionality of OpenRefine allowed SmartOpenData to perform various data pre-processing steps and prepare data for RDF mappings. On the other hand, the GUI of the RDF plugin enabled intuitive and interactive construction of RDF skeletons for the data. We reported on several challenging cases, but overall we managed to perform all of the required transformations. Perhaps the weakest side of this approach is its scalability. Although we did not hit this limitation, we are aware of the existing issues with memory usage by OpenRefine. Grafterizer is presented as an alternative solution that already provides several features not available in OpenRefine, such as reusing utility functions within one transformation, changing the order of operations and editing transformation operations.
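As an illustration of what a CSV-to-RDF transformation involves, the following minimal sketch maps each CSV row to a typed resource and emits N-Triples. It is not the OpenRefine or Grafterizer workflow used in the project; the namespaces, the `ProtectedSite` class, the column names and the sample rows are all hypothetical.

```python
import csv
import io

# Hypothetical namespaces used only for this illustration.
BASE = "http://example.org/site/"
VOCAB = "http://example.org/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text: str) -> list[str]:
    """Map each CSV row to a typed resource with one triple per non-empty column."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['id'].strip()}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{VOCAB}ProtectedSite> .")
        for column, value in row.items():
            value = (value or "").strip()
            if column == "id" or not value:
                continue  # the id is already in the subject URI; skip empty cells
            triples.append(f'{subject} <{VOCAB}{column}> "{escape_literal(value)}" .')
    return triples


# Made-up sample input; the second row has an empty area_ha cell.
SAMPLE = "id,name,area_ha\n1,Example Reserve,120\n2,Another Site,\n"

for line in csv_to_ntriples(SAMPLE):
    print(line)
```

The pre-processing steps mentioned above (trimming values, dropping empty cells) happen before the mapping, which mirrors the split between data preparation and RDF skeleton construction.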
SmartOpenData Distributed Semantic Indexing Infrastructure - SIREN
For this specific task of the SmartOpenData project, the challenge was to push state-of-the-art big data "information retrieval" techniques (typically associated with "search" problems) to solve a problem that has a large amount of "structure" in the data.
While information retrieval systems have been enormously successful in delivering real time experience to users when collections of unstructured documents are involved, less usual has been the experience of seeing these techniques operating seamlessly on content that is both unstructured and structured (databases, records etc.) – in a way that truly leverages the relational information across the records.
Systems like Solr/Elasticsearch have acquired enormous popularity over the years as enterprise grade platforms that can deliver near real time results over large collections of documents.
The challenge for the project was: can we provide the same interactive experience when relational data aspects are involved? For example, not only "find documents that contain the words Quercus Coccifera" (the Mediterranean Oak) but instead "find legislation documents that deal with areas in which the Quercus Coccifera is naturally found". The name "Quercus Coccifera" might not appear at all in those documents, but because of the relationship between an entity mentioned in the document and the database knowledge, the document becomes relevant and must be found by the system, which must also explain its relevancy.
On top of the above “relational search” core feature, we have also obtained features like: (i) real time analytics – allowing us to have nice visual interactive information about “aggregates” of the data we’re looking at –, (ii) powerful integration with aggregate Map tools and (iii) the ability to operate quickly on very diverse data – that is as much as possible in a “schema-less” fashion.
In regards to the SIREn indexing infrastructure, the core technical achievement has been allowing structured data to be efficiently indexed by information retrieval systems such as Solr and Elasticsearch. This was made possible by the advances we made in the "Semantic Information Retrieval Engine" (or SIREn).

An overview of the SIREn Plugin Architecture
The SIREn "relational faceted browser" illustrates how the SIREn system is used to create an end user experience that allows real-time restrictions on relational data sets. This methodology uses a considerable level of precomputation (or denormalization) to create a number of indexes, which the user interface coordinates seamlessly by rewriting queries as the user navigates across "Entities".
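The idea of answering a relational question through a precomputed index can be sketched as follows. This is a toy illustration of denormalization in general, not SIREn's indexing method; the areas, documents and species assignments are made up.

```python
from collections import defaultdict

# Hypothetical toy data: which species occur in which areas,
# and which areas each legislation document deals with.
SPECIES_IN_AREA = {
    "area-1": ["Quercus coccifera", "Pinus halepensis"],
    "area-2": ["Fagus sylvatica"],
}
DOCUMENTS = {
    "doc-a": {"text": "Grazing rules for the coastal reserve", "areas": ["area-1"]},
    "doc-b": {"text": "Logging limits in the northern forest", "areas": ["area-2"]},
}


def build_index(documents, species_in_area):
    """Precompute an inverted index from species name to (document, area) pairs.

    The join document -> area -> species is done once at indexing time,
    so answering a query is a plain lookup.
    """
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for area in doc["areas"]:
            for species in species_in_area.get(area, []):
                index[species.lower()].add((doc_id, area))
    return index


def search(index, species):
    """Return matching documents with the area that makes each one relevant."""
    hits = sorted(index.get(species.lower(), set()))
    return [{"doc": doc_id, "because_area": area} for doc_id, area in hits]


INDEX = build_index(DOCUMENTS, SPECIES_IN_AREA)
print(search(INDEX, "Quercus coccifera"))
```

The document text never mentions the species; it is found through the area relationship, and the `because_area` field stands in for the explanation of relevancy.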
One of the goals of this project was to demonstrate that the infrastructure could scale in a multinational/cross location wide manner. For this it is paramount that the data can be distributed not only among more computers, in the same physical location but also in different physical locations e.g. across datacenters. For this, we extended the SolrCloud capabilities to obtain “cross data centre replication”.
Within the SmartOpenData scope, all tasks and processes related to data gathering and transformation have been very important. Therefore, SIREn has its own ETL (Extract-Transform-Load) system. This complement has been implemented to produce the Relational Faceted Browser "effect".
The user interface and interaction system is internally called the KnowledgeBrowser. This is a highly interactive search/analytics user interface powered by SIREn and the relational faceted browser methodology.

The KnowledgeBrowser
SmartOpenData Visualization Framework – Sefarad
One of the most promising fields for Geographical Information Systems (GIS) is the use of Linked Open Data visualization tools. This approach refers to a technique for visualizing published, connected and structured information on the web according to linked data principles. Many linked data visualization tools are used exclusively for custom-tailored applications and cannot be parameterized to adapt them to cases beyond the one originally studied. Thus, they are inaccessible to users with more general requirements. Many visualization tools are also accompanied by underlying assumptions that are unknown to the domain experts or cannot be explicitly characterized for a particular model.
To contribute to the improvement of the above problems, a semantic visualization tool named SEFARAD was developed by SmartOpenData. This tool provides mechanisms for enabling non skilled users to visualize linked data sources with a high level description of classes, properties and relations. By using a map faceted navigation capability users are able to go beyond data exploration to map visualization filtering features. Furthermore, SEFARAD provides a utility for creating and processing human-readable SPARQL queries which can be used for representing linked data information.

Main Layout - Dashboard.
Sefarad is a web application developed to explore linked data by executing SPARQL queries against a chosen endpoint without writing code. Thus, it provides a semantic front-end to Linked Open Data. It allows users to configure their own dashboard with many different widgets to visualize, explore and graphically analyse the different characteristics, attributes and relationships of the queried data. Sefarad is developed in HTML5 and follows a Model-View-ViewModel (MVVM) pattern implemented with the Knockout framework. This JavaScript library allows us to create responsive and dynamic interfaces which are automatically updated when the data changes. The different parts of the UI are connected to the data model by declarative bindings. Sefarad consists of two different tabs: dashboard and control panel. The first tab allows the user to perform faceted search on the data accessed, so that users can explore a collection of information by applying multiple filters. In the control panel tab, statistics about the dataset are visualized.
The great potential of Sefarad for the SmartOpenData project lies in the ability to easily create custom widgets. Thanks to the Knockout framework, a widget developer need not worry about obtaining the filtered data or updating the widget when a new facet is selected. For this purpose, the application specifies how to create a new JavaScript file containing a JavaScript object that uses the D3.js framework. This feature is used to develop geographic widgets.
One of the main purposes of this project was to show the queried semantic data. To this end Sefarad includes a large library of widgets to display many kinds of information: filtering widgets, slider widget, graphic widgets (bars, wheels, donuts, etc.). To visualize geographic information, the most important widgets are: results table, filtering widgets (tagcloud and selector), Openlayers map for GeoJSON and Openlayers map for GeoServer shapefiles.

Available widget templates in SEFARAD
Technically, SEFARAD is an HTML5 Framework to query, manage and represent Geo Linked Data. Therefore, we need a SPARQL Engine to edit and execute the queries to the endpoint we want and retrieve the data. Non-technical users can edit their queries by using a SPARQL editor which helps them with recommendations and corrections while editing the query and advanced users can edit their own queries. The application allows the users to query any semantic repository with a corresponding SPARQL endpoint or any local dataset within a local database such as Fuseki and Virtuoso or Geo Server. In order to get the data in a proper format for Sefarad, the Geo Proxy module handles the conversion of the data when it is needed.
Once the data is retrieved, the Search and filtering module provides the necessary tools to manage the queried data enabling Faceted search, Keyword search and Geo filtering. The application automatically indexes, sorts and classifies the information, obtaining the different facets and values of the data. All of the filtering services are provided to the user in an intuitive graphical interface by providing multiple widgets, a search box and a sortable table of results. To handle the changes in the filtered data due to the different filters selected by the user, the Model View View-Model module uses the features offered by the Knockout framework. Every time a new search or filter criteria is included, the data model is automatically updated with the results that meet the new conditions and all the widgets displayed in the layout are redrawn using updated final results.
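The faceted and keyword filtering described above can be sketched in a few lines. Sefarad itself is a JavaScript application built on Knockout; this Python sketch only illustrates the filtering logic, and the records, field names and function names are hypothetical.

```python
from collections import Counter

# Hypothetical records standing in for the results of a SPARQL query.
RECORDS = [
    {"name": "Site A", "type": "National Park", "region": "North"},
    {"name": "Site B", "type": "Nature Reserve", "region": "North"},
    {"name": "Site C", "type": "National Park", "region": "South"},
]


def apply_filters(records, selected, keyword=""):
    """Keep records matching every selected facet value and the keyword (if any)."""
    keyword = keyword.lower()
    return [
        r for r in records
        if all(r.get(facet) in values for facet, values in selected.items())
        and keyword in r["name"].lower()
    ]


def facet_counts(records, facets):
    """Count how many records carry each value of each facet."""
    return {f: Counter(r[f] for r in records if f in r) for f in facets}


# Selecting one facet value narrows the results; the remaining facets are
# recomputed from the filtered set, as a widget redraw would require.
selected = {"type": {"National Park"}}
results = apply_filters(RECORDS, selected)
print([r["name"] for r in results])
print(facet_counts(results, ["region"]))
```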

General Architecture.
For the security and administration tasks, we need a User management module. This module includes a Security: authentication and authorization sub-module based on PHP and MongoDB. The user's username and password (encoded in MD5 hash) are stored in a MongoDB collection named users. When a user wants to log in, the application checks his user credentials with a PHP5 script. In case of success, depending on the users permissions (admin, basic user, etc.), the different tools and options are shown or hidden (i.e. add, configure and delete widgets). Furthermore, the user preferences and settings are stored in another MongoDB collection, so that when a user logs in the application is configured using the last configuration saved by the user. This is managed by MongoDB: settings and preferences sub-module.
Finally, the Setup module provides a graphical installer for an easy deployment of the application in any computer running a Linux operating system, installing everything needed to run Sefarad. This module includes two sub-modules: a Custom installer, which allows the user to select which modules to include in its installation; and an automation module, to automate certain repetitive tasks with a single command.
SmartOpenData administration and notification service
For an understanding of the developed system, the functional design of the notification system deserves special attention. It has been decoupled into two processes: the process by which notifications are generated, and the way these notifications reach users.
The notification system includes the mechanism whereby the users get a notification if a query result has changed in any manner. This query has to be provided and registered in the system.
The notification system manages all users' queries, which can target different database endpoints and, when complex, may take a long time to run. The results are therefore sent by asynchronous communication, using a dedicated system for transmitting information between the different subsystems. The technique is known as the publish/subscribe mechanism. To make the system complete, the Administration and Notification Service has been separated into two distinct functionalities: generating new queries, and continuously monitoring the various databases for subsequent notification.
Therefore the functional design of the notification system can be represented as shown in the following figure:

Functional description of relative systems
This mechanism makes it possible to separate the execution of queries against the various databases from the visualization of results, so users do not have to wait for a query to complete before continuing their task. The system warns the user when data is received or the query is complete.
The Notification provider is managed by the user and supported by a Node.js server. This server is publicly accessible and provides a web interface. The DB operator, on the other hand, queries the database. It runs continuously as a Node.js server without web access and takes care of delivering new data to the notification provider using the publish/subscribe mechanism explained above. Node.js was chosen primarily for the ease with which JavaScript handles web requests to the REST endpoints of various databases.
To set a publish/subscribe mechanism, the existence of a "broker", which controls the delivery of the published messages and manages new data subscriptions, is required. All communication between the DB operator and notification provider necessarily passes through the broker.

System distribution with pub/sub mesh
All publish/subscribe capable systems communicate with each other by exchanging messages categorized into different topics. The topic makes it possible to distinguish which action a received message requires, at either end of the system. An example of a communication flow, in which the user sets a new query to the database, consists of these steps:
1. User makes a query in the notification provider and activates it.
2. Notification provider sends the corresponding information to the relevant DB operator through the publish/subscribe mechanism.
3. DB operator saves the query data and performs the query.
4. First results are sent back to the notification provider when the query finishes.
5. DB operator will check continuously if data changes in the database.
6. Notification provider receives the data by the publish/subscribe mechanism and saves the results so that the user can review them.
7. If the user is connected, a real time notification will be sent through a web socket and will be visible in the screen.
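The flow in steps 1 to 6 can be sketched with an in-memory broker. The actual service runs on Node.js servers with a separate broker; the class names, topic names and the `run_query` callable below are hypothetical, and the web socket notification of step 7 is left out.

```python
from collections import defaultdict


class Broker:
    """Minimal in-memory broker: delivers each message to the topic's subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


class DbOperator:
    """Stores registered queries, runs them, and publishes results when they change."""

    def __init__(self, broker, run_query):
        self.broker = broker
        self.run_query = run_query  # hypothetical callable: query text -> result list
        self.queries = {}
        self.last_results = {}
        broker.subscribe("query/new", self.on_new_query)

    def on_new_query(self, message):
        self.queries[message["id"]] = message["query"]  # step 3: save the query
        self.check(message["id"])                       # steps 3-4: run and report

    def check(self, query_id):
        results = self.run_query(self.queries[query_id])
        if query_id not in self.last_results or results != self.last_results[query_id]:
            self.last_results[query_id] = results
            self.broker.publish("query/results", {"id": query_id, "results": results})

    def poll(self):
        """Step 5: re-run every stored query to detect changes."""
        for query_id in self.queries:
            self.check(query_id)


class NotificationProvider:
    """Registers user queries and stores the results it receives (step 6)."""

    def __init__(self, broker):
        self.broker = broker
        self.inbox = []
        broker.subscribe("query/results", self.inbox.append)

    def register(self, query_id, query):
        self.broker.publish("query/new", {"id": query_id, "query": query})  # steps 1-2


database = {"rows": ["a"]}  # stand-in for a remote database endpoint
broker = Broker()
operator = DbOperator(broker, run_query=lambda q: list(database["rows"]))
provider = NotificationProvider(broker)

provider.register("q1", "SELECT * WHERE { ?s ?p ?o }")
operator.poll()                # nothing changed, so nothing is published
database["rows"].append("b")
operator.poll()                # change detected, new results published
print(provider.inbox)
```

Because both sides only talk to the broker, the provider never waits on a query; it simply receives a message whenever the operator has something new.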
SmartOpenData Pilots
Portugal & Spain
The results of the pilot have been really useful for both partners, DGT (Portugal) and TRAGSA (Spain), despite the several important challenges that were faced. The first of them was the use of unfamiliar technology to learn about and create semantic data, work with new datasets, query and display them using this new approach, and finally link these datasets with external data. To achieve those results, the pilot stakeholders' requirements were framed as questions to be answered: this helped the technical staff to understand the needs to be covered when linked data are used. Moreover, the guidelines followed were defined by the tools and technological possibilities currently existing. To get to know them, TRAGSA was supported by the invaluable guidance of technical partners such as SINTEF and SpazioDati.
The second challenge was the geographically disjoint pilot’s working areas (Tagus basin in Portugal and Galicia region in Spain) and therefore the inhomogeneous layers and datasets.

A screenshot of the viewer’s tools composition
This inhomogeneity was mainly caused by logical differences in the value categories the datasets contained or by different sets of intervals. Therefore, an initial homogenization process was carried out in those business cases in which there were data from both areas.
A third challenge was a technological limitation regarding the drawing of the geometric entities obtained, since it is only possible to draw a limited number of geometries. Several solutions were proposed but, in the end, the WKT format was selected.
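For readers unfamiliar with WKT (Well-Known Text), the following sketch serialises a simple polygon. It is only an illustration of the format, not the pilot's tooling; the function name and the coordinates are hypothetical.

```python
def polygon_to_wkt(ring):
    """Serialise one exterior ring of (x, y) pairs as a WKT POLYGON, closing it if needed."""
    if len(ring) < 3:
        raise ValueError("a polygon ring needs at least three points")
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])  # WKT rings must end on their starting point
    coords = ", ".join(f"{x} {y}" for x, y in points)
    return f"POLYGON (({coords}))"


# Hypothetical coordinates (longitude, latitude), for illustration only.
print(polygon_to_wkt([(-8.5, 42.0), (-8.0, 42.0), (-8.0, 42.5)]))
```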
A huge amount of environmental information was published and made available to anyone. In addition, those new datasets were linked to external data, achieving the goal that was set at the beginning of the project: obtaining Linked Open Data for information and actual use cases.
Ireland
To address European protected areas and its National Parks, the Irish pilot focused on the Burren National Park of Ireland. The pilot aimed to demonstrate the value of SmartOpenData in helping Decision Makers and Researchers to better manage, preserve, sustain and use this unique ecosystem. The pilot’s primary objective was to create the following sustainable services that would continue beyond the life of the project.
1. Tourism for Conservation. SmartOpenData enabled the European Tourism Indicator System (ETIS) Webservice for the Burren and European GeoParks Network.
2. Protecting Heritage Sites. SmartOpenData enabled App to Ground-Truth potential Protected Monument sites
The ETIS service is a survey-based generation tool used to provide real-time statistical information on the GeoPark's performance in relation to the performance criteria defined by the GeoPark's management. However, establishing concrete links between the ETIS model and the SmartOpenData data model raised some problems. Such links enable the service to be operational in many GeoParks across Europe, and the use of a common data model will enable potential eco-tourists to benchmark, compare and contrast the progress of various sustainable destinations in achieving their objectives before deciding which to visit.

ETIS Mobile App User Interface
The SmartOpenData-enabled service to Ground-Truth actual and potential Protected Monument sites enables Monument Field Officers, eco-tourists and other people interested in their local heritage to seek out and ground-truth potential Monument sites.
● To mobilise a very motivated community of stakeholders, including
● Experts – Field Monuments Advisors and Researchers
● Citizens – visitors and people interested in their local heritage
● Public bodies – Heritage Council, National Monuments Service, Irish Government Department of Arts, Heritage and the Gaeltacht.
● Enterprises, Companies and SMEs – Farmers.
● The Crowdsourcing/Voluntary GI Ground Truthing process of gathering data in the field enables users to either complement or dispute remotely collected data.

Crowdsource Ground Truthing – Mobile App
The Ground Truthing Service is built around its Heritage Vault Portal, which is a webservice at http://heritagevault.cloudapp.net where Monument Field Officers and other experts check the ground-truth reports from the crowdsourced mobile App.
The two services were operated and evaluated (in WP6) with both internal and external stakeholders and users, involving
1. User engagement – as a first step in validating the value of the Irish Pilot’s services to its intended users.
2. Direct user interaction with the open data access process – as the next step in user involvement with the GI/LOD sources.
3. Co-design of innovative “demand pull” services –- the ultimate engagement of the stakeholders to evolve the Irish Pilot’s service beyond the project, and use the SmartOpenData platform to create new opportunities, and in turn sustain the platform.
The innovative aspects of the services include:
• Both services are based on Linked Data principles and Linked Open Data sources, use the SmartOpenData data models, tools and approaches, and came out of real needs that were identified by working closely with the relevant external stakeholders.
• ETIS is the first implementation of the standard that the EU hopes will be used Europe-wide for all Sustainable Destinations.
• The Monuments Ground-Truthing Service is expected to totally change how the Monument Field Officers operate in Ireland.
• If the experience with Crowdsourced Ground Truthing proves positive – the Irish Heritage Council plan to integrate it into their National Monuments Service and dataset.
The planned ETIS and Ground Truthing services were successfully implemented and achieved the planned service levels. However, usage and take-up were slower than planned. Meetings and presentations with users are therefore continuing, as there is a very positive reaction to and interest in the services. The services will be continued and expanded, and it is expected that the original targets will be far exceeded in time, as there is a real need for both services. In addition, it is expected that extensions of the ETIS and Ground-Truthing Services will result in further services.

Slovakia
Identifying the available geospatial resources via a Metasearch-enhanced OGC crawler showed that a significant number of datasets and services can be identified with the Google search engine, and allowed this information to be exposed to three distinct layers of the internet: the mainstream web, spatial data infrastructures (SDI/geospatial web) and the semantic web.
During the implementation of this WebCrawler sub-pilot, various challenges were identified, from the quality of the identified metadata to the issues related with delivery of the geospatial information and knowledge from semantic web to the SDI communities and mainstream web stakeholders.

Snaps of the interfaces for the various layers of the internet
A second set of activities, undertaken via the Biodiversity mashup sub-pilot, focused on understanding the new paradigm of linking data into knowledge within the semantic web. SAZP was one of the first public sector bodies in Slovakia willing to identify and deliver linked open geo data on a national level. Together with the project consortium members, other related projects (COMSODE, SDI4Apps) and user communities, it contributed to the establishment of the SmartOpenData INSPIRE vocabularies as well as domain-specific vocabularies, which were subsequently used in transforming the SK INSPIRE and other datasets into harmonised linked open geo data. This set of resources has been offered together with other linked open geo datasets and related resources via the LENGTH (LinkEd opeN Geo daTa Hub) platform, with the aim of supporting knowledge sharing and further use and re-use by external user groups and related stakeholders. Visualisation of the published linked open geo data, an important means of communicating the information made available via semantic technologies, was supported by a set of applications allowing display and basic queries via a web browser interface.

Snaps of the Biodiversity mash-up sub pilot outcomes
Delivery of the outcomes of this second sub-pilot raised a set of questions on aspects such as clarifying the added value that linked data provides to the harmonised data published via INSPIRE, issues with the visualisation of large datasets, and projections from national coordinate systems.
The main lessons learnt related to understanding the potential behind semantic web technologies by working hands-on with real data, with the possibility of creating links to relevant third-party datasets (EUNIS, Natura 2000). At the same time, the SmartOpenData project allowed an understanding of the challenges related to the new way of thinking, taking into consideration the open world assumption driving the concepts behind the Web of Data, and of the need to invest in and stimulate sufficient expertise in this field. SAZP is planning to continue the activities initiated by the SmartOpenData project to ensure their sustainability and the development of new added value. This will be done at a practical, internal (company) level by identifying possibilities to publish further linked open geo data, as well as at the level of interaction with the stakeholders' user communities via events such as conferences, workshops or hackathons, to stimulate the development of new links, services, applications and other products.
The Czech Republic
The NFI is a project with support in 32 European countries, and the UHUL FMI also devotes great emphasis and resources to it. In the context of the SmartOpenData project, the UHUL FMI developed a web page presenting the Czech NFI outcomes. Publishing an NFI data access point for the stakeholders has been one of the main results of this pilot.
Within the LOD environment, the Czech Pilot sees potential to ensure responsibility over the information using URL (http://nil.uhul.cz/) which is going to represent the access point for the data. The webpage was made publicly available during the SmartOpenData project and allows a user to find direct access to open data, which contains estimates of forest attributes covering the whole Czech Republic.
The open data comes from observations of so called plots, which are randomly generated locations in special networks/grids covering the whole Czech Republic. Some of these locations are accessed by field workers. With the current number of fieldworkers it is possible to observe approximately 25,000 plots in four years. However current technology allows us to make the numbers even denser using stereo-photo interpretation of a plot.
The observations are used to estimate forest characteristics for regions in the Czech Republic. With the current density of the plots, we were able to estimate values down to NUTS 3 level. When aerial photo and remote sensing interpretation are used, it is possible to address even NUTS 4 level with reasonable statistical error. NUTS regions are well known across the EU, and a NUTS region therefore became the base spatial representation of the Czech NFI for interoperability and cooperation within the LOD.
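The link between plot density and statistical error can be illustrated with a simple per-region estimate. The sketch assumes simple random sampling within each region and uses the sample mean with its standard error; the NFI's actual estimators are not described here, and the region codes and values are made up.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

# Made-up plot observations: (region code, growing stock in m3/ha). Not NFI data.
PLOTS = [("R1", 200.0), ("R1", 260.0), ("R1", 230.0), ("R2", 150.0), ("R2", 170.0)]


def estimate_by_region(plots):
    """Return the plot count, mean and standard error of the mean for each region."""
    by_region = defaultdict(list)
    for region, value in plots:
        by_region[region].append(value)
    estimates = {}
    for region, values in by_region.items():
        n = len(values)
        # The standard error shrinks with the square root of the number of plots.
        se = stdev(values) / math.sqrt(n) if n > 1 else float("nan")
        estimates[region] = {"n": n, "mean": mean(values), "se": se}
    return estimates


for region, est in estimate_by_region(PLOTS).items():
    print(region, est)
```

Smaller regions contain fewer plots, so their standard error grows; adding photo-interpreted plots increases `n` and is what makes finer regional levels feasible under this simplified model.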

Location of a URL to the LOD version of data
During the SmartOpenData project, the Czech National Forest Inventory became involved in the creation of linked open data results. UHUL FMI sees great potential in this technology, because the NFI results are completely open, interoperable with other inventories and other European data sources, and can also be defined in the wider internet environment. Therefore, the results have also been published in linked open data formats, as Turtle and RDF/XML files. The Turtle and RDF/XML files have the same content but different serialisations. A Turtle file uses an indented structure and has been recommended by the W3C. An RDF/XML file is more familiar to XML-oriented programmers and is therefore implemented more often. These formats were supplemented by raw HTML, which can be read directly in the browser by humans. Every results page has links to the linked open data formats at the top of the document, as shown in the picture above.
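To show what "same content, different serialisation" means, the sketch below writes one resource in both Turtle and RDF/XML using only the Python standard library. The vocabulary namespace, property names, subject URI and values are hypothetical and do not reproduce the published NFI files.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/nfi#"  # hypothetical vocabulary, for illustration only

# One resource with two plain-literal properties (made-up values).
SUBJECT = "http://example.org/nfi/estimate/1"
PROPERTIES = {"region": "REGION-X", "forestCover": "34"}


def to_turtle(subject, properties):
    """Write the resource as Turtle: one subject, indented predicate-object pairs."""
    lines = [f"@prefix ex: <{EX_NS}> .", "", f"<{subject}>"]
    items = list(properties.items())
    for i, (name, value) in enumerate(items):
        end = " ." if i == len(items) - 1 else " ;"
        lines.append(f'    ex:{name} "{value}"{end}')
    return "\n".join(lines)


def to_rdfxml(subject, properties):
    """Write the same triples as RDF/XML: one rdf:Description with child elements."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject})
    for name, value in properties.items():
        ET.SubElement(desc, f"{{{EX_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


print(to_turtle(SUBJECT, PROPERTIES))
print(to_rdfxml(SUBJECT, PROPERTIES))
```

Both outputs describe the same two triples; the sketch does no literal escaping, so it is only suitable for simple values like the ones shown.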
Cross-border cooperation was also a very important feature of this Pilot. During the first phase of the SmartOpenData project, Czech Partners dealt with colleagues from the Slovakian National Forest Inventory represented by the NLC (http://www.nlcsk.sk/). Common data interpretation was needed before we could visualise the NFI together. The UHUL FMI and NLC focused on following estimates, which were estimated by acceptably same methods, therefore they can be visualised together. Moreover the estimates are partly processed by ENFIN (http://www.nlcsk.sk/) which makes it possible to involve more forest inventories. The selected estimates were: Average growing stock per hectare, Forest cover, Total forest area and Total growing stock. Both inventories have been processed the same way, they were uploaded to the same relational model in a database and transformed to the LOD.
Italy
The demonstrator is publicly available at http://kb.spaziodati.eu/. In general, one can create and configure their own dashboards and visualisations on the data loaded into KiBi. Three dashboards were created in order to address user queries of the pilot: “Protected Sites” dashboard, “Water sampling stations” dashboard, “Observations” dashboard.
For example, for the “Protected Sites” dashboard, the Italian Pilot loaded data about 223 protected sites in Sicily into KiBi. The dashboard is meant to explore this data and is illustrated in the next figure.

Italian Pilot Demonstrator “Protected Sites” dashboard
This dashboard offers five different visualisations:
1. “sites map” - a map with protected sites. Each circle indicates protected sites found in the given area: the bigger the circle, the more sites are present in the area.
2. “sites table” - a table view of the data available for each protected site:
a. “site” - a URI of a site
b. “siteName” - the name of a site
c. “location” - latitude and longitude of a site
d. “description” - textual description of a site
e. “ecoQuality” - description of the site in terms of ecological quality
3. “Sites Sizes” - a bar plot of sizes of protected sites (in Ha).
4. “Tags eco quality sito” - a cloud of high frequency words extracted from “ecoQuality”
5. “Tags descrizione sito” - a cloud of high frequency words extracted from “description”
Similar tools are shown by the remaining dashboards.
Word clouds and bar visualisations are interactive. For example, on the “Protected Sites” dashboard one can click on the bar that groups sites with a size of “2000” (Ha). All other visualisations on the dashboard are then updated to show only the protected sites that are 2000 Ha in size. Entity URIs shown in table views are also clickable: they resolve to HTML representations served by the Virtuoso faceted browser. In addition to the visualisations mentioned above, there are relational buttons. For example, “see closest stations” on the “Protected Sites” dashboard connects protected sites to the nearest monitoring stations. These buttons are important for implementing the user queries of the pilot. One example of the semantic queries answered by this pilot is the question “Which protected site or areas of a protected site are more or less subject to pollution?”
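A relation such as “see closest stations” can be thought of as a nearest-neighbour lookup over coordinates. The sketch below is an assumption about how such a link could be computed, not a description of how KiBi implements it; the site and station names and their coordinates are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_station(site, stations):
    """Return the station nearest to the given site."""
    return min(
        stations,
        key=lambda s: haversine_km(site["lat"], site["lon"], s["lat"], s["lon"]),
    )


# Hypothetical records; not taken from the pilot's datasets.
site = {"siteName": "Example protected site", "lat": 38.10, "lon": 13.50}
stations = [
    {"name": "Station A", "lat": 38.12, "lon": 13.36},
    {"name": "Station B", "lat": 37.50, "lon": 15.09},
]

nearest = closest_station(site, stations)
print(nearest["name"])
```

Joining each site to its nearest station in this way is what allows a pollution reading at a station to be related to a protected site, as in the example query above.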

Italian Pilot Demonstrator: “Protected Sites” affected by high concentration of Arsenic
An important element of the Italian pilot since the outset has been the engagement of stakeholders external to the project, both to validate the impact of the LOD demonstrator (as well as the SmartOpenData infrastructure in general) and also engage actors in the co-creation of a regional Open Data platform and ensure the long-term sustainability of the project’s results.

Potential Impact:
The market and economic impact of GI services and Open Data is enormous. The former is estimated at €110-200B annually (with 30% growth), while the direct impact of Open Data on the EU27 economy was estimated at €32B in 2010, with a projected annual growth rate of 7%, i.e. €42B in 2015. Global estimates for the former are much higher still, at US$3-5 trillion a year and US$13 trillion cumulative over the next 5 years in the G20 countries. Research has found that GeoSpatial OD has the widest applicability across the economy in both the UK and Spain, and likely across Europe. These figures indicate that the SmartOpenData GI/LOD platform could address over half of the OD market, a potential impact of over €20B in the EU alone.
User engagement is a key factor in achieving real impact. To promote it, SmartOpenData set up and maintained stakeholder groups that provided frequent feedback from early in the project, establishing a customer discovery process that allows solutions to be built that really fit the needs of the end users (public bodies, SMEs, researchers, citizens) of the infrastructure and services. In order to achieve the potential impact of SmartOpenData, several crucial steps were taken. Besides the ambitious technical aspects of realising the services and user interfaces, and the data harmonisation, processing and analysis components, the following issues were addressed as well:
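The Open Data projection can be checked with a compound growth calculation. This is only a sketch: it assumes annual compounding at the stated 7% from the €32B 2010 base, a convention the report does not spell out, and the function name `project` is hypothetical.

```python
def project(base, annual_rate, years):
    """Compound a base value at a fixed annual growth rate."""
    return base * (1 + annual_rate) ** years


base_2010 = 32.0  # EUR billion, EU27 direct impact in 2010 (from the text)
rate = 0.07       # projected annual growth rate (from the text)

for years in (4, 5):
    print(years, round(project(base_2010, rate, years), 1))

# Half of a EUR 42B market, the lower bound of "over half".
print(0.5 * 42.0)
```

Four compounding steps give about €41.9B and five give about €44.9B, so the quoted €42B is of the right order, and half of it is consistent with the "over €20B" figure.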
(1) Ensuring that the SmartOpenData platform meets the current and potential future needs of its diverse target user communities. It was of central importance to involve members of those communities as early as possible in the research and development process of the platform. User feedback was used in turn to adapt and extend the proposed service levels, interfaces, license models and the customisable analysis and processing facilities as needed.
(2) As a side effect, the involvement of the community was a starting point for promoting the SmartOpenData platform and its benefits over currently existing offerings and practices to the communities.
(3) Gathering an initial set of data from various European countries and the development of schema and data format mappings to integrate these data sets and thus actually provide useable content for the platform.
(4) Verifying the actual resource requirements of the SmartOpenData platform, in order to improve the estimates of the future traffic and computing power required as the numbers of users, data and custom processing and analysis conducted on the platform increase.
This framework has defined the main guidelines of our dissemination activities during project lifetime. They have been many and varied but, nevertheless, some of them can be highlighted:
Linking Geospatial Data Workshop
Our Partner ERCIM-W3C organized this workshop on behalf of the SmartOpenData project. This event was also co-organized by the UK Government, Ordnance Survey, the OGC and Google. All information about this workshop is available at http://www.w3.org/2014/03/lgd/.
The Linking Geospatial Data Workshop was billed primarily as a joint exercise between the Open Geospatial Consortium (OGC) and W3C as part of its role within the SmartOpenData project. It was supported and encouraged by the UK Government's Department for the Environment, Food and Rural Affairs, the Ordnance Survey and Google Maps who hosted the event. It came about through a desire to make better use of the Web as a platform for sharing and linking Geospatial Information (GI) alongside but not instead of existing GI systems.
The title of the event was carefully chosen, particularly using the word 'linking' not 'linked.' The chairs did not want this to be a specifically Linked Data event; rather, it was to be an event about how different GI datasets can be linked and accessed on the Web. As it turned out, many presentations came directly from the Linked Data community, and it is clear that it has a lot to offer in the GI field, but this was not a given before the event.

Google's Campus London was used to full capacity
In numbers: the workshop attracted 72 papers, 106 participants (with several more on the waiting list), 38 presentations, 16 panellists and 8 bar camp pitches.
The final plenary session of the workshop allowed participants to reflect on what had been said and to draw conclusions. There are many relevant standards in existence. There are other 'standards' that, although massively implemented, are not formalized which presents a problem for some government bodies: GeoJSON is top of that list.
In short: there is work to do to tidy up some existing standards and to provide guidance on how developers and publishers should proceed. The lack of coherent advice to publishers is such that a lot of data isn't published (at least in a linkable form) and where it is published it is less attractive to developers than it could be. OGC and W3C represent different communities and therefore a joint working group was required to create or recommend standards that work across those communities. OGC and W3C committed to work together towards establishing such a group.
GEO and GEOSS Initiatives
The project sent a response to the Call for Participation in the 8th Phase of GEOSS Architecture Implementation Pilot (AIP), following up on our participation in the 7th Phase. This proposal reflected the project aim to be more involved in GEOSS initiatives, and specifically in this Architecture Implementation, in order to share its technical results and outcome data sources. The project has maintained weekly audio-meetings regarding these tasks.
As a consequence of this participation, SmartOpenData proposed, and it was accepted, that Mr. Bart De Lathouwer, responsible at OGC for planning and managing interoperability initiatives, participate in the Open Data Cluster meeting in September 2014.
Regarding GEOSS, the Linked Open Geo Data Hub platform LENGTH, published by SAZP, is used to centralize the SmartOpenData results, pilots and datasets. This central point has been registered in the GEOSS infrastructure. This “common” GEOSS registration will not prevent the individual registration of the datasets.

Geospatial World Forum
The INSPIRE-Geospatial World Forum (GWF), held in Lisbon on 25-29 May 2015, brought together experts and enthusiasts with an interest in the potential use and exchange of spatial data across the INSPIRE and Linked Data environments. On Tuesday the OGC/W3C Spatial Data on the Web workshop provided an overview of the possibilities for collaborating on improving interoperability and integration for spatial data on the Web. A joint workshop organised on Friday with the support of the related projects (SmartOpenData, SDI4Apps, COMSODE, BOLEGWEB, GeoKnow) addressed recent activities as well as challenges associated with exposing INSPIRE data and metadata to the semantic web. A combination of the two subsections provided the latest update on the standardisation activities; information about the INSPIRE-related vocabularies, which aim to make INSPIRE concepts linkable rather than duplicated; examples of INSPIRE linked data and options for sharing metadata, including the possible risks and benefits; and options for identifying and supporting relevant stakeholder communities. The second part of the event was dedicated to the presentation of the transformation methodology and tools for data and metadata developed in the GeoKnow project, including practical examples. Discussions mainly addressed possible options for linking reusable properties, ways of modelling possible extensions, RDF versioning, and possibilities for ontology modelling courses and knowledge exchange. As stressed by Phil Archer (W3C-ERCIM), linked geo data cannot solve all open issues or support every use case, but it can contribute significantly in certain domains and applications. Although most of the presented activities were still “work in progress”, the workshops allowed participants to get an overview of the latest directions and provided space for further networking.
At the same time organisers invited participants to follow the work on the projects, provide feedback and help to disseminate the outcomes, where appropriate. Further information is shown below and also two pictures of the event:
• Spatial Data on the Web Presentation
• INSPIRE & Linked Data: Bridging the gap workshop description: Session I presentation and Session II (presentation slides by Athena RC)

Linked Open Data: Connecting Open Data Sources to Understand Environmental Impacts
This Workshop in Palermo (IT) was held on 7th July 2015 with circa 40 participants. It was arranged following the final SmartOpenData general meeting, also held in Palermo.
Following the welcome addresses, the first session of the workshop involved members of the SmartOpenData consortium presenting the project and its overall issues and objectives, followed by presentations of the Italian pilot, including a live demonstration of progress to date. The final presentation dealt with the issue of engaging local communities in Open Data and presented the Palermo Declaration, a declaration of commitment and principles of open data for environmental stewardship that provides a common ground for local pilot stakeholders to commit to.
The second session involved different Sicilian stakeholders presenting their interest in participating in a local partnership for Linked Open Data through the Palermo Declaration. These included the University of Palermo, which had recently launched a Memorandum of Understanding with local stakeholders in the village of Aspra (bordering the ARPA pilot area); Andrea Borruso of the OpenDataSicilia network (as well as the Panoptes SME); and Luca Tripoli, the Alderman for Territorial Planning as well as Innovation and Smart City of the City of Bagheria (the municipality in which the ARPA pilot is located). Round table participants included further SME participation, notably Antonino Galante of Telebit Consulting, Patti (a city in Eastern Sicily) and Luigi Grasso of Etnahitech Srl, Catania, as well as Ciro Spataro of the City of Palermo. The discussion focused on the interest in the role that ARPA was playing in SmartOpenData and its willingness to collaborate towards promoting LOD strategies in Sicily.
Follow-up from the workshop has included: discussions with Etnahitech on the possibility of opening a free-access portal for Sicilian environmental Open Data; possible Memoranda of Understanding with the cities of Palermo, Bagheria and Syracuse; and further engagement with the SMEs participating in the ODS network, including collaboration in capturing opportunities to promote app development through e.g. the H2020 ODINE project, FI-WARE Accelerator initiatives, etc. A SmartOpenData session was foreseen on 4 Sep 2015 at the ODS2015 Summer Session, and a joint SmartOpenData-ODINE workshop was foreseen for Oct 2015.

Round Table. Center: Luca Tripoli, Bagheria's Alderman for Planning.
Danubehack
SAZP, CCSS, FMI and HSRS worked with others to organise a large scale hackathon in Bratislava. Running from 15-17 October, 2015, Danubehack was designed to encourage use of open data, particularly geospatial data, including SmartOpenData linked open data, models and infrastructure. SAZP is amongst those committed to sustaining their support for Linked Open Data and so the hackathon was part of a long term strategy. Many innovative projects and ideas were presented and the winner received €1500 and a virtual server for a year to help continue their work that seeks to turn unused land in the Bratislava region into productive farmland.

DanubeHack in fragments
The main ambition of this event was to create a space where people could present what can be done with open (and where relevant geo) data resources, technologies, ideas and knowledge. Another even more important dimension of this event was the willingness to provide space where people representing various types of stakeholders (from producers to the users) could meet and exchange their experience and knowledge. Based on that, the event was run in two parallel sessions, where the hackathon part was dedicated to the coding and development of new apps, services or data resources based on the list of identified possible resources (Data, Catalogues and Tools) originally collected by the SmartOpenData consortium members. Out of 14 ideas introduced at the beginning of the hackathon part, 9 projects managed to present their results after two days of intensive work. The second part was dedicated to the workshops, which were defined by the organizers together with the participants, including the “Open (Geo) Data in my country” panel session. The workshops presented latest data and technology resources, examples of successful Open Data projects, practical guidelines on how to present open data projects, or data sharing related presentation with interesting discussions. The results of both tracks are available via the event website and the feedback provided by the 75 participants was the strongest satisfaction organizers could receive.
Linked Open Data at GIS Ireland, October 2015
During GIS Ireland 2015, on 2nd October, our Partner John O'Flaherty, from MAC, explained the project and its ties with Geospatial Linked Open Data for the GeoTechnology of the Future. The workshop addressed technological developments and changes that could have significant ramifications for the Geographic Information (GI) and GeoTechnologies (GT) environments.

Big Data, Linked Data, Internet of Things, Open Data, 3D/4D, positioning inside and outside buildings, UAVs, cloud computing, Artificial Reality and Intelligence, Networks of Networks, Ontologies, 5G, etc. are just some of the more specific areas within which this tsunami of change is occurring, all of which can and will impact on the GI/GT environment in the emerging future.
“El valor de la información forestal y su activación socioeconómica” workshop at INIA
The SmartOpenData project results have been particularly significant in the context of the Spanish public administration. In addition, during the development of the Spanish-Portuguese pilot, the active participation of four SMEs and local authorities in the Galician region of Maceda was very important.
In order to disseminate the project results more widely among one of its most important user groups (SMEs), TRAGSA began planning a technical workshop in June 2015. The talks in this workshop were split into four main groups:
• Introduction: Technical explanation of LOD and administrative justification. Online connection with the Project Officer regarding GEO-GEOSS initiatives.
• Forestry and environmental management: public and private aspects
• Public and Private Data
• Demonstrations, projects and applications.

The workshop "The value of forestry information and its socio-economic activation" (in Spanish) was jointly organized by the National Institute for Agricultural and Food Research and Technology (INIA) and the Tragsa Group on 29 October 2015.
The event brought together key players in the forestry sector and rural development such as MAGRAMA (the Spanish Ministry of Agriculture), the Government of the La Rioja region, companies and groups of companies (COSE, CESEFOR, FORA) and academic and research institutions (University of Santiago de Compostela, University of Valladolid, INIA), pushing forward joint strategies and dialogue among different areas of knowledge.
The meeting also featured an online connection with DG Research - European Commission that highlighted the main guidelines of GEOSS (Global Earth Observation System of Systems) – GEO initiatives and the supporting strategies to Open Data and Linked Open Data by the European Commission.
T-Systems Big Data Challenge
SmartOpenData was invited to participate in the T-Systems Big Data Challenge 2015 and its proposal reached the final elimination stage.
This call was particularly interested in solutions that offer benefits to European citizens and their public administrations, which face challenges in meeting increased demand for mobility, urban logistics, tourism and more, while simultaneously reducing negative environmental impacts. The most professional implementations that use satellites in combination with other data sources on the big data platform were considered for the prize.

Official “Copernicus Masters Finalist label”
External cooperation
SmartOpenData developed strong links with several projects, institutions and working groups during its lifetime. The following list points out the principal amongst them.
OGC/W3C Spatial Data on the Web Working Group (SDWWG)
The OGC Spatial Data on the Web Working Group (SDWWG) is constituted as a subgroup of the OGC Geosemantics DWG. It operates in collaboration with a parallel group in W3C of the same name. The mission of the SDWWG is to clarify and formalize the relevant standards landscape for spatial data on the web. In particular:
• to determine how spatial information can best be integrated with other data on the Web;
• to determine how machines and people can discover that different facts in different datasets relate to the same place, especially when 'place' is expressed in different ways and at different levels of granularity;
• to identify and assess existing methods and tools and then create a set of best practices for their use;
• where desirable, to complete the standardization of informal technologies already in widespread use.
All working group proceedings are available via W3C at http://www.w3.org/2015/spatial/
Both the World Wide Web Consortium (W3C) and the Open Geospatial Consortium (OGC) launched working groups devoted to the task. They pledged to coordinate their activities closely and to publish joint recommendations. SmartOpenData played a critical role in establishing this group and in contributing to its work. The SmartOpenData project, the W3C in partnership with the OGC and the OGC GeoSPARQL Standards Working Group, the UK Government Linked Data Working Group, Google and Ordnance Survey organised an initial workshop, the Linking Geospatial Data Workshop described above, with the target groups to share experiences, successes and observations in using geo and location information.
GEOSS
The Global Earth Observation System of Systems (GEOSS) is a coordinating and integrating network of Earth observing and information systems, contributed on a voluntary basis by Members and Participating Organizations of the intergovernmental Group on Earth Observations (GEO). The vision for GEOSS is to realize a future wherein decisions and actions for the benefit of humankind are informed by coordinated, comprehensive and sustained Earth observations and information.
GEOSS will achieve comprehensive, coordinated and sustained observations of the Earth system, in order to improve monitoring of the state of the Earth, increase understanding of Earth processes, and enhance prediction of the behaviour of the Earth system. The GEOSS Architecture Implementation Pilot (AIP) develops and deploys new process and infrastructure components for the GEOSS Common Infrastructure (GCI) and the broader GEOSS architecture. OGC leads the AIP using the OGC Interoperability Program policy and procedures.
The AIP aims to increase the use of GEOSS resources by end-users, in applying in situ and remotely sensed data, by further developing results from previous GEO developments through integration with the GEOSS Common Infrastructure (GCI).
The AIP's goals are to:
1. Increase Societal Benefit Area (SBA) use of GEOSS Resources for end-users
2. Increase availability of GEOSS Resources
3. Focus on benefits and usability for Developing Countries
4. Solidify previous GEO results and technical achievements

SmartOpenData actively contributed to GEOSS AIP-7 and GEOSS AIP-8, mainly by introducing principles of Linked Open Data and by sharing and offering its results to AIP partners.
Other Projects
The COMSODE project addressed the topic of OpenData from the perspective of supporting software and methodology development together with the publication of related datasets. Close cooperation was established with SmartOpenData, particularly in the use of the OpenDataNode software framework as well as via utilisation of the methodology framework supporting the public authorities with their effort to open their datasets. In addition, the COMSODE project contributed to the development of the SmartOpenData modelling framework represented by INSPIRE vocabularies. SmartOpenData contributed to the activities of the COMSODE through involvement of the User Board via SAZP representation as well as in the awareness raising and dissemination activities (e.g. Open Data In Action Workshop at ICT 2015 Lisbon).
SmartOpenData influenced three projects from the last call of CIP ICT PSP work programme “2013 COMPETITIVENESS AND INNOVATION FRAMEWORK PROGRAMME (CIP)”: OpenTransportNet, SDI4Apps and FOODIE, which are now extending some basic ideas of SmartOpenData.
OpenTransportNet is a project designed to revolutionize the way that transport related services are created across Europe. By bringing together open geo-spatial data within City Data Hubs and enabling it to be viewed in new easy-to-understand ways, OpenTransportNet enables:
• Anyone to have fun with data, by viewing data mash-ups in maps and graphs and being able to use and embed these maps in their own websites
• Public Sector users to gain insights from linking and visualising different data sets and be able to make better public service decisions based on the findings
• Businesses and entrepreneurs to use the data to enhance existing services and build new transport-related services
• The wider open community to benefit from the project outputs and findings to advance geospatial data standards such as INSPIRE
OTN further developed work started in SmartOpenData on the implementation of the INSPIRE profile as an extension of DCAT. DCAT is the metadata format proposed for European portals, based on principles of the semantic web. In the framework of the ISA Action (ARe3NA), an alignment exercise was carried out between INSPIRE metadata and DCAT-AP. OTN implemented this alignment as part of the OTN solution.
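An alignment of this kind is essentially a mapping from metadata elements in one schema to properties in another. The sketch below shows the idea only; the element names, the three element-to-property pairs and the sample record are illustrative assumptions and are not copied from the ARe3NA alignment or from the OTN implementation.

```python
# Illustrative element-to-property pairs (assumed, not the official alignment).
INSPIRE_TO_DCAT = {
    "resource_title": "dct:title",
    "resource_abstract": "dct:description",
    "keyword": "dcat:keyword",
}


def to_dcat(record):
    """Rename known INSPIRE-style keys to DCAT-style property names.

    Returns the mapped description and any elements left unmapped,
    so that gaps in the alignment stay visible.
    """
    mapped = {"rdf:type": "dcat:Dataset"}
    unmapped = {}
    for key, val in record.items():
        if key in INSPIRE_TO_DCAT:
            mapped[INSPIRE_TO_DCAT[key]] = val
        else:
            unmapped[key] = val
    return mapped, unmapped


# Hypothetical metadata record.
record = {
    "resource_title": "Example road network",
    "resource_abstract": "An invented dataset description.",
    "keyword": ["transport"],
    "lineage": "Digitised from an invented source.",
}

mapped, unmapped = to_dcat(record)
print(mapped)
print(unmapped)
```

Returning the unmapped elements separately is a deliberate choice in this sketch: it makes it easy to see which parts of a source record the chosen profile does not yet cover.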
SDI4Apps is an EU-funded project managed by the University of West Bohemia in the Czech Republic. The project is being implemented with the concerted effort of 18 organizations across Europe. SDI4Apps seeks to build a cloud-based framework with an open API for data integration focusing on the development of six pilot applications. The project draws on INSPIRE, Copernicus and GEOSS and aspires to build a WIN-WIN strategy for building a successful business for hundreds of SMEs on the basis of European spatial data infrastructures.
SDI4Apps used the experience of SmartOpenData to introduce principles of Linked Open Data into tourism as part of their smart Point of Interest concept.
The SDI4Apps team developed a seamless open database of POIs, which will be distributed as 5 star Linked Open Data accessible to all users. The essential core of the model was extended by several attributes which are integral components of some of the original data and could be helpful for tourist purposes. The SDI4Apps Points of Interest data set is a seamless and open resource of POIs that is available for other users to download, search or reuse in applications and services. Its principal target is to provide information for cycling as Linked Data, together with another data set containing the road network. The added value of the SDI4Apps approach in comparison to other similar solutions lies in the implementation of linked data, the use of standardized and respected datatype properties, and the development of a completely harmonized data set with a uniform data model and common classification.
The key point of the FOODIE project is to create a platform hub on the cloud where spatial and non-spatial data related to the agricultural sector are available for agri-food stakeholders groups and are interoperable. It will offer an infrastructure for building an interactive and collaborative network; the integration of existing open datasets related to agriculture; data publication and data linking of external agriculture data sources, providing specific and high-value applications and services for the support of planning and decision-making processes. FOODIE uses experience from SmartOpenData mainly for the integration of Linked Open Data coming from Eurostat and the FAO.
During the course of the project, SmartOpenData has had continuous contact and collaboration, through SINTEF, with the FP7 DaPaaS project (DaPaaS - A data-and-platform-as-a-service approach to efficient open data publication and consumption; http://dapaas.eu) which SINTEF coordinated. DaPaaS created and operated the DataGraft platform (https://datagraft.net) – a cloud-based service for data transformation and data access. The collaboration materialized in the fact that the SmartOpenData results were reused in DaPaaS, and in turn SmartOpenData reused DaPaaS results. More specifically, Jarfter – the software developed by SINTEF in SmartOpenData for packaging data transformations was reused and integrated with the DataGraft platform, and in turn, SmartOpenData used DataGraft for creating and hosting data transformations and publishing the resulting data of transformation on DataGraft for the Spanish-Portuguese and Italian pilots. The cooperation between SmartOpenData and DaPaaS was beneficial for both projects.
The proDataMarket project (Enabling the property data marketplace for novel data-driven business products) is a relatively recent H2020 innovation action project led by SINTEF, with both TRAGSA and SpazioDati as partners. It is fair to say that the cooperation of those three partners in SmartOpenData created the opportunity for them to continue working together. SINTEF and SpazioDati are reusing the technical results and expertise they gained in SmartOpenData, and TRAGSA is reusing the data it published through SmartOpenData. SmartOpenData is thus taken further in proDataMarket, a good success story of reusing and building upon cooperation and results from SmartOpenData.

List of Websites:
http://www.smartopendata.eu

WP - WP Leader - Beneficiary
WP1 - Mariano Navarro - TRAGSA
WP2 - John O'Flaherty - MAC
WP3 - Phil Archer - W3C
WP4 - Tomás Robles - UPM
WP5 - Karel Charvat - HSRS
WP6 - Loris Bozzato - FBK
WP7 - Mariano Navarro - TRAGSA

Participants:

Full Name - Email address
Mariano Navarro de la Cruz - mnc@tragsa.es
Jesús Estrada - jmev@tragsa.es
María Eugenia García de Garayo Millán - mggm@tragsa.es
Gregorio Urquía Osorio - guo@tragsa.es
Ramón Baiguet - rbl@tragsa.es
Tomas Robles - trobles@dit.upm.es
Roberto Prieto - international.research@upm.es
John O'Flaherty - j.oflaherty@mac.ie
Renaud Delbru - renaud@sindicetech.com
Giovanni Tummarello - giovanni@sindicetech.com
Majella O'Brien - majellaobrien@mwra.ie
Giovanni Vacante - gvacante@arpa.sicilia.it
Giovanni Tummarello - tummarello@fbk.eu
Umberto Silvestri - eu-projects.admin@fbk.eu
Michele Mostarda - mostarda@fbk.eu
Anna Maria DallaSerra - dallaser@fbk.eu
Gabriele Antonelli - amministrazione@spaziodati.eu
Michele Barbera - barbera@spaziodati.eu
Stanislav Holy - standa@hsrs.cz
Karel Charvat - kch@bnhelp.cz
Otakar Cerba - ota.cerba@gmail.com
Stepan Kafka - kafka@email.cz
Marek Mlcousek - mlcousek.marek@uhul.cz
Jan Bojko - bojko.jan@uhul.cz
Zbynek Krivanek - krivanek@ccss.cz
Josef Fryml - fryml@ccss.cz
Dumitru Roman - dimitru.roman@sintef.no
Arne Berre - Arne.J.Berre@sintef.no
Maris Alberts - alberts@latnet.lv
Peteris Bruns - peteris.bruns@gmail.com;peteris.bruns@gmail.lv
Maria Vale - mvale@dgterritorio.pt
Rui Reis - rui.reis@dgterritorio.pt
Martin Tuchyna - martin.tuchyna@sazp.sk
Philippe Rohou - philippe.rohou@ercim.eu
Phil Archer - phila@w3.org
