CORDIS - EU research results

Integrating Big Data, Software and Communities for Addressing Europe’s Societal Challenges

Periodic Reporting for period 2 - BigDataEurope (Integrating Big Data, Software and Communities for Addressing Europe’s Societal Challenges)

Reporting period: 2016-07-01 to 2017-12-31

The BigDataEurope project’s mission was twofold:

i) Coordination: to establish a networking platform and align the efforts of big data producers, consumers and practitioners along the seven societal challenges of the H2020 framework programme.

ii) Support: to understand the requirements and challenges faced by each of the seven communities in order to design a big data architecture that takes full advantage of existing technology without reinventing the wheel, and to implement instances of this architecture as flexible big data solutions that are able to support stakeholders in all domains.

Through the realisation of both actions, the project achieved its overarching objective of lowering the technical entry barriers for the use and combination of existing big data technologies in real-world applications. The results of the project can be classified under three themes:

● Establishing Societal Networks: The communities brought together by the project, most notably through the seven yearly face-to-face workshops and the online webinars and hangouts, benefited from opportunities to strengthen existing collaborations and exchange information about their personal experiences (projects, efforts, problems and solutions) relating to big data management, processing and exploitation. Network activities are foreseen to outlast the project.

● Big Data Platform and Tools: The Big Data Integrator (BDI) platform is a free-to-use integrated stack of tools to manipulate, publish and use large-scale data resources in customised data processing chains, and it requires only minimal knowledge of the technologies involved in order to be used effectively. The BDI fills a gap as a plug-and-play platform that can integrate existing components in both an architectural sense (stacking components, defining workflows) and a usability sense (one-stop configuration, same look and feel). BDI was realised to meet the identified stakeholder requirements, minimise disruption to current workflows, and maximise opportunities to exploit the latest developments in big data harvesting, processing, analytics and visualisation.

● Seven Pilot Demonstrators: The BDI was successfully instantiated as distinct reference implementations for seven real-world societal use cases. The demonstration of these pilots within the relevant communities validates the technical results achieved by the project, and provides attainable examples of how BDI can be customised to provide an integrated solution to a wide variety of big data challenges faced in any domain.
The consortium supported each societal community through an appointed domain representative and a technical representative, ensuring that project proceedings were delivered to the community for the highest impact. Community building activities carried out in the first half of the project diminished gradually towards the end of the project, and efforts were instead invested in intensive dissemination of project results.

Community activities provided a rich source of input for the support action throughout the project’s duration. In the first half of the project, stakeholder engagement was planned around the requirements elicitation that shaped the architecture behind the BDI platform. In the second half, community activities were heavily centred around the pilots showcasing the applicability of BDI. The development of the platform was extended to the project's end (final release in November 2017) to incorporate further stakeholder feedback.

Throughout the project, progress and results were frequently disseminated through various communication channels, most importantly via the website (and its seven community-focussed subpages), periodic newsletters, and social media. Apart from the 21 face-to-face workshops, the 33 hangouts (societal focus) and 5 technical webinars (cross-sectoral interest) proved to be a very popular means of dissemination, with the added value that a majority of them were also recorded for posterity (YouTube). Event material and presentations are openly accessible on the project’s SlideShare account. To date, the project has issued six newsletters and current subscribers exceed 1,000.

In the first half of the project, experts in the consortium conducted a survey of the state of the art in big data technology, in search of the most suitable existing open-source components and methods to address the identified stakeholder challenges. Via the chosen deployment strategy, the ‘plug-and-play’ Big Data Integrator supports the stacking of alternative components, thus retaining flexibility while recommending various setups for different Big Data requirements. An instance of BDI with a predefined list of core components that meet the elicited stakeholder requirements has been deployed for each societal domain. In parallel, the consortium set out to demonstrate how BDI can be easily deployed and used for a wide range of real-world use cases.

In the second half of the project, technical efforts shifted focus to the seven societal pilots selected. Pilots were implemented in three phases, each time increasing their breadth of applicability while at the same time taking the opportunity to disseminate intermediate results and gather feedback from stakeholders.
The generic Big Data Integrator, the seven domain-specific instances, and the seven realised pilot solutions are all provided as open-source solutions, offering interested parties the opportunity to reproduce or customise the solutions remotely. Technical specifications, how-tos and demonstrators are linked through the project’s website and are maintained on the project’s GitHub account.
Rather than replicating existing Big Data solutions, the project has facilitated the custom integration (plug-and-play) of existing components for different applications. As an additional innovation, the BDI stack comprises a semantic layer, which supports the mapping and integration of heterogeneous data, thus particularly addressing the variety dimension of Big Data. The results enable market players of any size to apply technology which was previously out of reach due to a lack of the necessary in-house skills or budget to set up similar customised solutions. The implemented pilots demonstrate how the BDI platform and its reference implementations can ingest data from a variety of sources, and offer technologies that can be flexibly tailored to target innovative applications in various domains.

Project workshops, hangouts and technical webinars focused on all seven societal communities, and helped disseminate results broadly. The BDI platform attracted the interest of a large domain-independent technical audience, thus succeeding in contributing to the state of the art amongst top Big Data specialists in Europe and worldwide. Impact and uptake were maximised by establishing a wide range of partnerships and participating in joint activities with other H2020 CSAs, such as EuDEco, the European Science Academy, and initiatives like the Big Data Value PPP and the related Association. In addition, the project gained wide visibility at sponsored events, including the Apache Big Data Europe (2016) and the Semantics (2016) conferences, the joint BDVA/European Data Forum (2017), and the International Semantic Web Conference (2017).