
Scalable Linking and Integration of Big POI data

Periodic Reporting for period 2 - SLIPO (Scalable Linking and Integration of Big POI data)

Reporting period: 2018-07-01 to 2019-12-31

Locations that exhibit a certain interest or serve a certain purpose are commonly referred to as Points of Interest (POIs). The concept of a POI is quite broad, encompassing anything from a shop, restaurant or museum to an ATM or bus stop. POI data are the cornerstone of any application, service, and product even remotely related to our physical surroundings. The creation, update, and provision of POI datasets constitute a multi-billion, cross-domain and cross-border industry, with a value chain natively incorporating most domains of our economy, from mobility and tourism to logistics and manufacturing. Advances in the timely and accurate provision of POIs result in significant direct and indirect gains throughout our economy: productivity gains, optimization of value chains, matching of consumers with providers of goods and services, and new value-added products are just a few examples. POI data are truly one of the foundations and value multipliers of our Digital Economy.

The value and impact of POIs are reflected in the complex, expensive and labor-intensive effort required for their production and maintenance, which inherently involves stakeholders and users throughout their value chain. Their production involves field work, constant monitoring of their evolution and accuracy, integration of user-feedback mechanisms for reporting errors, quality assurance of new data, and roll-out across a plethora of services and products. In the POI market, the competitive advantages of data providers are clear and measurable: the greater the size, timeliness, richness, and accuracy of data, the better. The value chain of POI data has changed rapidly, with new data sources of even greater volume and heterogeneity introducing opportunities for growth but also complexity, and intensifying the challenges for the quality-assured integration, enrichment, and sharing of POI data.

POI data are by nature semantically diverse and spatiotemporally evolving, representing different entities and associations depending on their geographical, temporal, and thematic context. Due to their use in various domains and contexts, POI data are typically found in diverse, heterogeneous sources, from which bits and pieces of information need to be combined and assembled to increase value. However, this is hindered by the lack of common identifiers and data sharing formats. Even the means by which we typically identify and share POIs are inherently ambiguous. As a result, the integration of POI data remains labor-intensive and scalable only for domain-specific or small-scale efforts, leading to loss of information and thus lost value.

SLIPO’s objective is to deliver the missing technologies for addressing the data integration challenges of POI data in terms of coverage, timeliness, accuracy, and richness. In SLIPO, we argue that Linked Data technologies can address the limitations, gaps and challenges of the current landscape in integrating, enriching, and sharing POI data. Our goal is to transfer the research output generated by our work in project GeoKnow, to the specific challenge of POI data, introducing validated and cost-effective innovations across their value chain.
SLIPO has completed its final period with the successful release of the SLIPO system, the first comprehensive cloud-based platform for the quality-assured world-scale integration of Big POI data assets. SLIPO reduces the effort, time, and complexity of POI data integration, providing POIs of increased size, coverage, richness and timeliness at a fraction of the cost. The SLIPO system enables non-experts in Linked Data technologies to import, link, fuse, and enrich heterogeneous proprietary and open POI data, regardless of their original format, schema, or identifiers. SLIPO integrates and extends leading open source Linked Data software to specifically address the requirements of world-scale POI integration.

Already validated in a pre-commercial setting, SLIPO delivers integrated POI assets with quality comparable to that of manual data integration. SLIPO allows users to securely manage and store their geospatial data assets, graphically design complex data integration workflows, fully automate data integration, track the provenance of their POI assets, implement strict QA policies, and export their data integration results to third-party systems and products. Further, SLIPO provides a series of integrated analytics that extract added value from POI data assets to feed decision making. Finally, the entire SLIPO platform, including its data assets, workflows, and analytics, is available through Python-based Jupyter notebooks, further supporting industrial data scientists and allowing the direct exploitation of SLIPO in existing business workflows. SLIPO's main features are:
• World-scale POI data integration over heterogeneous geospatial data assets
• Fully automated as well as expert-driven definition of data integration workflows
• Secure management and provision of POI assets, integration workflows, and users
• Integrated QA services, curation, and provenance tracking
• Scalable out-of-the-box value-added analytics for POI data assets
• Support for integration with Python-based Jupyter notebooks

SLIPO's output is relevant to all sectors of the economy where POI data are applied. SLIPO's customer benefits are:
• Increase value, richness, quality and timeliness of your POI data assets
• Achieve integration results practically identical to those of expert-driven manual integration
• Integration at a fraction of the effort and cost
• Leverage proprietary and open/public geospatial data assets
• Expand products, services and workflows across the EU and the world
• Cloud-based, low-cost, and pay-as-you-go pricing models
SLIPO reduces the effort, time and cost required to produce POI data of high quality, and allows non-expert POI producers and consumers to easily transform, interlink, fuse, enrich and assess the quality of Big POI data. Overall:
• TripleGeo was extended to support practically all industrial geospatial data formats and standards, gained support for user-defined and custom mappings and for hierarchical classification schemes, and increased its performance by orders of magnitude.
• LIMES increased its scalability and effectiveness for POI data by optimizing its spatial interlinking approaches, introducing new hybrid similarity functions and configurable weighting, as well as class-expression-specific specifications for tuning proximity functions on POIs.
• FAGI was enhanced with several new fusion operators and strategies for spatial and thematic properties, metrics to assess metadata similarity and quality, and performance improvements.
• DEER has been extended with POI-specific enrichment functions, pro-active enrichment strategies, and enhancements in the execution of complex non-linear enrichment pipelines.
• SANSA has been improved with core functionalities for input data support, querying and inferencing, rule mining, and clustering.
• LOCI, a new framework for large-scale geospatial analytics over POI data, has been developed, tested, and integrated as a value-added service.
• The SLIPO Workbench, a cloud-based application enabling the ad hoc integration of Big POI data assets, has been delivered, extensively tested and validated in a real-world setting.
• The SLIPO system is in production operation and commercially applied by the project partners and industrial stakeholders.
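
To make the idea of a hybrid similarity function with configurable weighting more concrete, the following is a minimal sketch of how two POI records from different sources might be compared by combining name similarity with spatial proximity. It is not the LIMES API or the SLIPO implementation: the function names, the record layout (`id`, `name`, `lat`, `lon`), the weights, the 200 m distance cut-off, and the 0.8 acceptance threshold are all assumptions chosen for illustration.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """String similarity in [0, 1] on lower-cased, trimmed names."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def spatial_similarity(distance_m, max_distance_m=200.0):
    """Linear decay from 1 (same place) to 0 (at or beyond the cut-off)."""
    return max(0.0, 1.0 - distance_m / max_distance_m)


def hybrid_similarity(p, q, w_name=0.6, w_geo=0.4, max_distance_m=200.0):
    """Weighted combination of name and spatial similarity (illustrative weights)."""
    d = haversine_m(p["lat"], p["lon"], q["lat"], q["lon"])
    return w_name * name_similarity(p["name"], q["name"]) + w_geo * spatial_similarity(d, max_distance_m)


def link(source, target, threshold=0.8):
    """Naive all-pairs linking; returns (source_id, target_id, score) above the threshold."""
    links = []
    for p in source:
        for q in target:
            score = hybrid_similarity(p, q)
            if score >= threshold:
                links.append((p["id"], q["id"], score))
    return links


# Hypothetical records from two sources describing (possibly) the same place.
source = [{"id": "a1", "name": "Cafe Central", "lat": 48.2104, "lon": 16.3655}]
target = [
    {"id": "b1", "name": "Café Central", "lat": 48.2105, "lon": 16.3656},
    {"id": "b2", "name": "Central Station", "lat": 48.1850, "lon": 16.3760},
]
links = link(source, target)
```

The all-pairs loop is quadratic and only suitable for small inputs; a scalable linker would first restrict candidate pairs, for example with a spatial index or grid blocking, before scoring them.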