Periodic Reporting for period 2 - IoTCrawler (IoTCrawler)
Reporting period: 2019-08-01 to 2021-04-30
- The heterogeneity of various data sources limits cross-domain applications.
- Lack of metainformation limits exploitation across platforms.
- Missing security and neglected privacy present a major concern in most domains and a challenge for constrained IoT resources.
- The large-scale, distributed and dynamic nature of IoT resources requires new methods for crawling, discovery, indexing, geolocation and ranking.
- New search engines are required for new IoT applications, such as bots that automatically initiate search based on users’ context, requiring machine intelligence.
IoTCrawler was a 39-month Horizon 2020 project that set out to develop adaptive and scalable crawling, indexing, and semantic data/service search and integration, with a special focus on privacy and security, combined with real-world, large-scale enablers and products driven by innovative use-case scenarios and new business models. It accomplished all the main goals set in the Description of Work, and some of the targets were greatly exceeded. The project delivered not only valuable tools for dealing with the complexities of modern IoT use cases, but also experience with new scenarios and challenges and with the new business models that will drive the future of IoT.
IoTCrawler offers distributed crawling and indexing mechanisms, enabling real-time (or near real-time) discovery and search of massive real-world IoT data streams in a secure, privacy-aware and trust-aware framework. It also provides quality analysis of the data streams and information abstraction, and develops query, search, ranking and selection enablers to respond to spatial and multi-modal data queries in future communication networks. This creates a scalable search engine for the future Internet. Just as Web search engines and, in particular, Google’s PageRank algorithm changed the way people find and access information on the Web (by deep crawling/indexing and utilising the links between Web pages and documents), IoTCrawler changes the way data (especially new forms such as IoT data) can be published and accessed in large-scale distributed networks, paving the way for new applications and services that rely on ad-hoc and dynamic data/service query and access.
To generate impact and ensure a wider uptake of the project's solutions and ideas, a dissemination plan was followed: establishing the goals, generating awareness and engagement, and finally generating impact via different routes and activities, such as video demonstrations and tutorials that reach a larger online audience, webinars, and other events and presentations. Over the span of the project, a total of 44 publications were completed, exceeding our initial goal.
Business exploitation and commercialisation were the ultimate goals of IoTCrawler. After careful planning, partners identified individual and joint exploitation assets, using LEAN Canvases to drive the development of the business plans, which resulted in the creation of Action Plans, Memorandums of Understanding, an Elevator Pitch and finally an Investor Presentation. Additionally, a selection of assets was identified and presented to different potential partners, some of whom are contemplating the possibility of commercialisation.
IoTCrawler is an EU H2020 project that addresses the above challenges by proposing efficient and scalable methods for crawling, discovery, indexing and ranking of IoT resources in large-scale cross-platform and cross-disciplinary systems and scenarios. It develops enablers for secure and privacy-aware discovery and access to the resources, and monitors and analyses QoS and QoI to rank suitable resources and to support fault recovery and service continuity.
The project aims to create scalable and flexible IoT resource discovery by using metadata and resource descriptions in a dynamic data model. This means that a search may return results that are not strictly optimal but that still fit the user's expectations. To achieve this, the system should understand the user's priorities (requests are often machine-initiated queries and searches) and provide results accordingly by using adaptive and dynamic techniques.
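As an illustration only, ranking resources according to user priorities can be sketched as a weighted sum over normalised criteria. This is a minimal sketch, not IoTCrawler's actual ranking component: the `Resource` fields, the `score` and `rank` functions, the weight names and the sample values are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # All fields are hypothetical, chosen only for this illustration.
    name: str
    qos: float          # quality of service, normalised to [0, 1]
    qoi: float          # quality of information, normalised to [0, 1]
    distance_km: float  # distance from the location the query refers to


def score(resource, weights, max_distance_km=10.0):
    """Weighted sum of normalised criteria; a higher score is better."""
    # Map distance to a proximity value in [0, 1]; anything at or
    # beyond max_distance_km contributes 0.
    proximity = max(0.0, 1.0 - resource.distance_km / max_distance_km)
    return (weights["qos"] * resource.qos
            + weights["qoi"] * resource.qoi
            + weights["proximity"] * proximity)


def rank(resources, weights, max_distance_km=10.0):
    """Return the resources ordered from best to worst for these weights."""
    return sorted(resources,
                  key=lambda r: score(r, weights, max_distance_km),
                  reverse=True)


# Made-up sample data.
resources = [
    Resource("sensor-a", qos=0.9, qoi=0.5, distance_km=8.0),
    Resource("sensor-b", qos=0.6, qoi=0.9, distance_km=1.0),
    Resource("sensor-c", qos=0.7, qoi=0.7, distance_km=4.0),
]

# Two hypothetical priority profiles for the same set of resources.
nearby_first = {"qos": 0.2, "qoi": 0.2, "proximity": 0.6}
reliable_first = {"qos": 0.8, "qoi": 0.1, "proximity": 0.1}

print([r.name for r in rank(resources, nearby_first)])
print([r.name for r in rank(resources, reliable_first)])
```

The same three resources come back in a different order under each profile, which is the point of the paragraph above: which result is "best" depends on the priorities attached to the request rather than on a single fixed measure.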